Learn and practice each part of the data engineering process, then apply those skills to build an end-to-end data pipeline from the ground up.
This course consists of modules, workshops, and a final project that lets you apply the concepts and tools covered along the way. The syllabus is structured to guide you step by step through the world of data engineering.
- Module 1: Containerization and Infrastructure as Code
- Module 2: Workflow Orchestration
- Workshop 1: Data Ingestion
- Module 3: Data Warehouse
- Module 4: Analytics Engineering
- Module 5: Batch Processing
- Module 6: Streaming
- Project
Module 1: Containerization and Infrastructure as Code
- Course overview
- Introduction to GCP
- Docker and docker-compose
- Running Postgres locally with Docker
- Setting up infrastructure on GCP with Terraform
- Preparing the environment for the course
- Homework
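
Much of this module comes down to running Postgres in a Docker container and loading data into it from a Python script. The sketch below shows only the loading step and assumes a local Postgres reachable on port 5432 (for example, one started with `docker run -e POSTGRES_USER=root -e POSTGRES_PASSWORD=root -e POSTGRES_DB=ny_taxi -p 5432:5432 postgres:13`); the credentials, file name, and table name are placeholders, not values prescribed by the course.

```python
# Minimal sketch: load a CSV into a Postgres database running in Docker.
# Requires pandas, sqlalchemy, and a Postgres driver such as psycopg2.
# All connection details below are placeholders -- adjust to your setup.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("postgresql://root:root@localhost:5432/ny_taxi")

# Read the source file in chunks so large files do not exhaust memory,
# appending each chunk to the target table.
for chunk in pd.read_csv("yellow_tripdata_2021-01.csv", chunksize=100_000):
    chunk.to_sql("yellow_taxi_data", con=engine, if_exists="append", index=False)

print("load finished")
```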
Module 2: Workflow Orchestration
- Data Lake
- Workflow orchestration
- Workflow orchestration with Kestra
- Homework
Workshop 1: Data Ingestion
- Reading from APIs
- Building scalable pipelines
- Normalising data
- Incremental loading
- Homework
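
The workshop topics above follow a common pattern: page through an API, yield records, and remember where the last run stopped so the next run only loads new data. The sketch below is a generic illustration of that pattern; the endpoint and query parameters are hypothetical, not the workshop's actual API.

```python
# Generic sketch of paginated API reading with incremental loading.
# The endpoint and parameters are hypothetical; real APIs vary, but the
# pattern (paginate, yield, resume from a cursor) is the same.
import requests

BASE_URL = "https://example.com/api/rides"  # placeholder endpoint


def fetch_pages(since_id: int = 0, page_size: int = 1000):
    """Yield pages of records with an id greater than `since_id`."""
    offset = 0
    while True:
        resp = requests.get(
            BASE_URL,
            params={"since_id": since_id, "limit": page_size, "offset": offset},
            timeout=30,
        )
        resp.raise_for_status()
        page = resp.json()
        if not page:
            break  # no more data to read
        yield page
        offset += page_size


# Incremental run: only request records newer than the last one we saw.
last_seen_id = 0  # in practice this comes from stored pipeline state
for page in fetch_pages(since_id=last_seen_id):
    last_seen_id = max(last_seen_id, max(r["id"] for r in page))
    print(f"loaded {len(page)} records, cursor now at {last_seen_id}")
```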
Module 3: Data Warehouse
- Data Warehouse
- BigQuery
- Partitioning and clustering
- BigQuery best practices
- Internals of BigQuery
- BigQuery Machine Learning
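
Partitioning and clustering in BigQuery are usually set up with DDL. The sketch below runs such a statement from Python using the google-cloud-bigquery client; the dataset, table, and column names are placeholders only, and your own project will differ.

```python
# Minimal sketch: create a partitioned and clustered BigQuery table from Python.
# Requires the google-cloud-bigquery package and default GCP credentials.
# Dataset, table, and column names are placeholders.
from google.cloud import bigquery

client = bigquery.Client()

ddl = """
CREATE OR REPLACE TABLE my_dataset.trips_partitioned
PARTITION BY DATE(pickup_datetime)
CLUSTER BY vendor_id AS
SELECT * FROM my_dataset.trips_raw
"""

client.query(ddl).result()  # .result() blocks until the job completes
print("table created")
```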
Module 4: Analytics Engineering
- Basics of analytics engineering
- dbt (data build tool)
- BigQuery and dbt
- Postgres and dbt
- dbt models
- Testing and documenting
- Deployment to the cloud and locally
- Visualizing the data with Google Data Studio and Metabase
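
dbt models themselves are SQL files plus YAML configuration rather than Python, but recent dbt-core versions (1.5 and later) expose a programmatic runner, which is one way the deployment topic above can be scripted. A minimal sketch, assuming an existing dbt project in ./my_dbt_project with a valid profile (both names are placeholders):

```python
# Minimal sketch: invoke dbt programmatically (dbt-core >= 1.5).
# Assumes an existing dbt project in ./my_dbt_project with a configured profile.
from dbt.cli.main import dbtRunner

runner = dbtRunner()
result = runner.invoke(["run", "--project-dir", "my_dbt_project"])
print("success:", result.success)
```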
Module 5: Batch Processing
- Batch processing
- What is Spark
- Spark DataFrames
- Spark SQL
- Internals: GroupBy and joins
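
The Spark topics in this module (DataFrames, Spark SQL, groupBy and joins) can be previewed with a few lines of PySpark. In the sketch below, the input path and column names are placeholders:

```python
# Minimal PySpark sketch: DataFrame API and Spark SQL doing the same aggregation.
# The input path and column names are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("batch-example").getOrCreate()

df = spark.read.parquet("data/trips/*")  # placeholder path

# DataFrame API: trips and revenue per pickup zone per day.
daily = (
    df.withColumn("pickup_date", F.to_date("pickup_datetime"))
      .groupBy("pickup_date", "pickup_zone")
      .agg(F.sum("total_amount").alias("revenue"),
           F.count("*").alias("trips"))
)
daily.show(5)

# The same aggregation expressed in Spark SQL.
df.createOrReplaceTempView("trips")
spark.sql("""
    SELECT to_date(pickup_datetime) AS pickup_date,
           pickup_zone,
           SUM(total_amount)        AS revenue,
           COUNT(1)                 AS trips
    FROM trips
    GROUP BY to_date(pickup_datetime), pickup_zone
""").show(5)
```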
Module 6: Streaming
- Introduction to Kafka
- Schemas (Avro)
- Kafka Streams
- Kafka Connect and KSQL
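
The Kafka topics in this module start from producing and consuming messages. The sketch below uses the kafka-python client with plain JSON; the broker address and topic name are placeholders, and the course may rely on a different client library or Avro-serialized messages instead.

```python
# Minimal sketch: produce and consume JSON messages with Kafka,
# using the kafka-python client. Broker and topic names are placeholders.
import json
from kafka import KafkaProducer, KafkaConsumer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
producer.send("rides", {"ride_id": 1, "pickup_zone": 42})
producer.flush()

consumer = KafkaConsumer(
    "rides",
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    consumer_timeout_ms=5000,  # stop iterating if no new messages arrive
)
for message in consumer:
    print(message.value)
```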
Project
Putting everything we learned into practice:
- Weeks 1 and 2: working on your project
- Week 3: reviewing your peers' projects