Analytics Engineer Python & PySpark

Schwarz Unternehmenskommunikation GmbH & Co. KG
Bad Friedrichshall, Germany
2 days ago

Role details

Contract type
Permanent contract
Employment type
Part-time / full-time
Working hours
Regular working hours
Languages
English

Job location

Bad Friedrichshall, Germany

Tech stack

Query Performance
Java
API
Data analysis
Big Data
Code Review
Information Engineering
ETL
Software Debugging
Dimensional Modeling
Python
Open Source Technology
Standard SQL
SQL Databases
Data Streaming
Git
Data Layers
PySpark
Git Flow
Integration Tests
Optimization Algorithms
Data Pipelines
Databricks

Job description

  • In this role, you will work hands-on building:
      • scalable, production-ready APIs using Python and, optionally, Java
      • ETL and CI/CD pipelines
      • data product templates
      • monitoring dashboards for usage of our data products
  • As part of a large data community, you will:
      • provide debugging support for data teams
      • deploy open-source tools on cloud platforms
      • support and learn from our data community
      • support our team in establishing data engineering and analysis best practices

Requirements

  • Experience with SQL and a passion for diving into complex datasets to extract insights.
  • A proactive approach to optimizing data flows and improving overall data quality.
  • Ability to bridge the gap between technical infrastructure and business needs, translating "messy" requirements into clean data products.
  • Expert-level knowledge of SQL, including complex window functions, CTEs, and query performance tuning.
  • Deep understanding of dimensional modeling and practical experience implementing SCD Type 1, Type 2, and Type 3 logic.
  • Practical experience with PySpark, specifically focusing on optimization techniques such as broadcasting, salting, and partitioning.
  • Proven experience designing and implementing data layers (Bronze, Silver, Gold) within a Medallion Architecture.
  • Experience writing and maintaining unit and integration tests for data pipelines to ensure reliability.
  • Mastery of Git, including branching strategies, pull request workflows, and collaborative code reviews.
  • Experience using CI/CD pipelines to automate data deployments.
  • Experience with Databricks.
  • Ability to write/configure CI/CD pipelines.
  • Optional but highly valued:
      • Experience building or using data quality monitoring systems.
      • Experience with dqx, Great Expectations, or Soda.

About the company

Schwarz Digits creates the technological foundation for digital sovereignty in Europe. As the IT and digital division of the Schwarz Group, we develop and manage the IT infrastructures for the retail divisions Lidl and Kaufland, as well as Schwarz Production and PreZero. At the same time, we operate as an independent provider in the external market to support companies across Europe in their digital transformation. We bundle our core services in the areas of Cloud, Cyber Security, Data & AI, Communication, and Workspace.

Join us and contribute to digital sovereignty in Europe. With us, you will work at the intersection of agility and security: you will benefit from fast decision-making processes, enjoy genuine creative freedom in your projects, and be able to build upon the stable foundation of the Schwarz Group.

As a digital system provider for Lidl's online business, we at Schwarz Digits create efficient, scalable digital products and services and develop them for the future. Our divisions are Commerce Platforms, Data Tech & Enablement, New Digital Business Models, and SCRM Loyalty Platforms.

Play your part in our team's success. You will join the Article Intelligence and Finance team, which is responsible for building and running data products. We are committed to delivering high-quality data products to our stakeholders and to building collaboration across all the teams we work with.
