TG
Not available until 01.04.2024

Last updated: 13.03.2024

Data Architect & Data Engineer

Degree: M.Sc. Physics
Languages: German (native) | English (business fluent) | Croatian (native)

Attachments

2023-Resume-TG_130323.pdf

Skills

Programming Languages: Python, Go, JavaScript
Data Engineering: Databricks, Hadoop (HDFS, YARN), Spark, SQL, BigQuery, Vertica, Postgres, Elasticsearch, Azure Data Factory, Synapse, MongoDB, Scylla, Celery, Redis, bash scripting, Airflow, Jenkins, Docker, Flask, Django
Data Science: TensorFlow, Jupyter, gensim, spaCy, Hugging Face, Pandas, NumPy
BI: Tableau
DevOps: Azure, GCP, AWS, OpenShift, Kubernetes, Debian, Ubuntu, Nginx, Terraform

Project history

06/2023 - present
Data Architect

- Led consulting on a new data architecture (Data Warehouse vs. Data Lakehouse) for a German manufacturing company and facilitated the adoption of a DLH architecture on Azure + Databricks
- Planned and developed a proof-of-concept logical data warehouse on Azure (Synapse, ADLS)

01/2022 - present
Solo Founder


01/2023 - 10/2023
Data Engineer and Airflow Architect
ECC AG/Deutsche Börse AG (banking and financial services)

  • Automated Kibana monitor and Grafana dashboard generation as part of a monitoring and alerting migration to OpenShift
  • Set up Airflow on Kubernetes/OpenShift

06/2020 - 11/2021
Data Engineer
Zattoo

Zattoo is one of the leading TV streaming providers in Europe and was acquired by TX Ventures.
* Designed and built a data mart for subscriber activity metrics as part of a company-wide effort to consolidate company success metrics (with Airflow and BigQuery).
* Contributed to a GCP-based data warehouse redesign, introducing Kafka, a data lake, and BigQuery.
* Set up Airflow as the new main orchestration tool, established best practices, and built a Docker-based development environment.

04/2018 - 10/2019
Data Engineer
Motionlogic

Motionlogic was a Deutsche Telekom-owned startup offering traffic & location reports.
* Designed and implemented a query engine using PySpark, HDFS, Redis & MongoDB to produce individually billable reports, which significantly expanded the product line.
* Collaborated closely with the Data Science team on quality and performance improvements of the core business algorithm for trip/activity extraction from movement chains, ensuring product satisfaction for some of Europe's largest telecommunications companies. Implemented in PySpark, processing more than 3 TB per day on an on-premise cluster with >1,000 cores, 60 servers, and >10 TB RAM.

04/2015 - 10/2015
Quality Control Analyst Intern
i4i

i4i is a VC-funded startup providing structured content apps for the life sciences industry to address compliance requirements.
* Implemented extensive integration and regression testing in Python and maintained documentation.

Certifications

Microsoft Certified: Azure Data Engineer Associate
2022
Databricks Certified Developer: Apache Spark with Python
Databricks training
2019

Willingness to travel

Available remote only

Other information

I am a Data Architect and Data Engineer with over 8 years of experience in designing, implementing, and optimizing large-scale data infrastructure solutions. My expertise includes defining architectural patterns, establishing standards, and setting up best practices for data processing, storage, and serving. Clearly communicating complex technical concepts to stakeholders is crucial to me for fostering successful collaboration between technical and non-technical teams.