Data Engineering of the Future
- Next start date : 15 September 2025
- Study Mode : Hybrid (Johannesburg Campus + Virtual)
- Campuses : Johannesburg
Overview: Design and deploy modern data pipelines with Spark, Kafka, and cloud-native warehousing. Build the infrastructure that powers analytics and machine learning at scale.
Programme Duration: 29 Weeks
Key Learning Outcomes
✔︎ Build ETL pipelines using Apache Airflow and Spark
✔︎ Stream data in real time with Kafka and Flink
✔︎ Manage scalable storage with Delta Lake and cloud warehouses
✔︎ Implement data governance and lineage frameworks
Core Modules
Module | Key Topics | Weeks |
1. Data Engineering Fundamentals | Batch vs. stream, ETL, data modelling | 5 |
2. Distributed Systems | Spark, Hadoop, fault tolerance | 6 |
3. Real-Time Data | Kafka, Flink, event-driven architectures | 6 |
4. Cloud Data Platforms | Snowflake, BigQuery, Azure Synapse | 6 |
5. DataOps & Governance | CI/CD, testing, metadata, lineage | 6 |
Certification Pathways
Certification | Body | Duration | Prerequisites |
Azure Data Engineer | Microsoft | 3 weeks | Modules 1–4 |
AWS Data Analytics | AWS | 3 weeks | Modules 1–4 |
Databricks Certified Engineer | Databricks | 2 weeks | Modules 2–4 |
Capstone Experience
IoT pipeline for real-time sensor data analytics
Data lakehouse implementation with Spark and Delta Lake
Technology Stack
◉ Pipelines: Apache Airflow, Spark
◉ Messaging: Kafka, MQTT
◉ Storage: Delta Lake, S3, BigQuery
Career Acceleration
Roles:
Data Engineer (Avg. salary: R850k)
Cloud Data Architect (Avg. salary: R1.05M)
Industry Demand: 39% increase in cloud data engineering roles (LinkedIn 2024)