Analytics Engineer Python & PySpark

Schwarz Unternehmenskommunikation GmbH & Co. KG

Berlin, Germany

2 days ago

Role details

Contract type

Permanent contract

Employment type

Part-time / full-time

Working hours

Regular working hours

Languages

English

Job location

Berlin, Germany

Tech stack

Query Performance

Java

API

Artificial Intelligence

Data analysis

Cloud Computing

Code Review

Computer Security

ETL

Digital Technology

Dimensional Modeling

E-Business

Python

Standard SQL

SQL Databases

Data Streaming

Git

Data Layers

PySpark

Git Flow

Integration Tests

Optimization Algorithms

Data Pipelines

Databricks

Job description

Schwarz Digits creates the technological foundation for digital sovereignty in Europe. As the IT and digital division of the Schwarz Group, we develop and manage the IT infrastructures for the retail divisions Lidl and Kaufland, as well as Schwarz Production and PreZero. At the same time, we operate as an independent provider in the external market to support companies across Europe in their digital transformation. We bundle our core services in the areas of Cloud, Cyber Security, Data & AI, Communication, and Workspace.

Join us and contribute to digital sovereignty in Europe. With us, you will work at the intersection of agility and security: You will benefit from fast decision-making processes, enjoy genuine creative freedom in your projects, and be able to build upon the stable foundation of the Schwarz Group.

As a digital system provider for Lidl's online business, we at Schwarz Digits create efficient, scalable digital products and services and develop them for the future. Our divisions are Commerce Platforms, Data Tech & Enablement, New Digital Business Models, and SCRM Loyalty Platforms.

Play your part in our team's success. You will join the Article Intelligence and Finance team, which is responsible for building and running data products. We are committed to delivering high-quality data products to our stakeholders and to building collaboration across all the teams we work with.

Your tasks

Scalable, production-ready APIs using Python and, optionally, Java
ETL and CI/CD pipelines
Data product templates
Monitoring dashboards for usage of our data products

Requirements

Do you have experience in SQL?

A passion for diving into complex datasets to extract insights.
A proactive approach to optimizing data flows and improving overall data quality.
Ability to bridge the gap between technical infrastructure and business needs, translating "messy" requirements into clean data products.
Expert-level knowledge of SQL, including complex window functions, CTEs, and query performance tuning.
Deep understanding of dimensional modeling and practical experience implementing SCD Type 1, Type 2, and Type 3 logic.
Practical experience with PySpark, specifically focusing on optimization techniques such as broadcasting, salting, and partitioning.
Proven experience designing and implementing data layers (Bronze, Silver, Gold) within a Medallion Architecture.
Experience writing and maintaining unit and integration tests for data pipelines to ensure reliability.
Mastery of Git, including branching strategies, pull request workflows, and collaborative code reviews.
Experience using CI/CD pipelines to automate data deployments.
Experience with Databricks.
