Data Engineering
Overview
The curriculum for this course is designed to provide a comprehensive introduction to the field of data engineering. This program covers the essential skills and knowledge needed to begin a career as a data engineer, starting with Python programming fundamentals and extending to more advanced topics like distributed systems, big data analytics, and ETL processes.
The program ends with a graduation project, where students apply what they’ve learned in a practical, real-world scenario. This project is an opportunity to demonstrate your skills and understanding of data engineering principles.
Course Details
Program Length: 8 weeks
Class Length: 2 hours
Original Price: 1000 €
With Growth Labs Academy Scholarship: 499 €*
Outcomes
This course will provide you with:
- A solid foundation in data engineering
- Preparation to tackle real-world data challenges
- A path to a successful career in data engineering
Please make sure you have a computer that meets the following requirements:
- Supported operating systems: macOS, Linux, or Windows (Pro edition required).
- Latest OS version, fully up to date.
- All security updates installed.
- At least 100GB of free space on the hard drive.
- At least 16GB of RAM; 32GB is strongly preferred.
- Support for video conferencing and screen-sharing, with a reliable webcam and microphone.
Python
- Intro to Python: basic syntax, data types, variable declarations, conditions, loops and strings
- Data structures: lists, tuples and ranges
- Dictionaries and sets
- Functions and modules
- Libraries for data processing and data visualization in Python (NumPy, Pandas, Matplotlib and Seaborn)
- Virtual Environments
Data Formats
- JSON
- XML
- CSV
Introduction to Big Data
- Definition and Characteristics
- Challenges and Opportunities
- Use Cases
- Big Data Enabling Technologies
Storage
- Relational databases
- SQL
- NoSQL databases
- Vector databases
- Graph databases
- Amazon S3
- Azure Blob Storage
Data Modeling
- Data design and modeling
- Normalization
Data
- Data discovery
- Data integration
- Transformation and enrichment
Data Warehousing
- Introduction to Data Warehousing
- The role of Data Warehousing
Data Lakes
- Introduction to Data Lakes
- The role of Data Lakes
ETL processes
- Basics of ETL processes and tools.
- Building ETL processes.
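As a rough illustration of what the ETL topics above refer to, here is a minimal sketch of an extract-transform-load flow using only the Python standard library. It is not taken from the course materials: the sample data, the function names (`extract`, `transform`, `load`) and the choice of CSV as input and JSON as output are all assumptions made for the example.

```python
import csv
import io
import json

# Hypothetical sample data; a real pipeline would read from a file or a database.
RAW_CSV = """name,amount
Ana,10.50
Boris,
Ana,4.50
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: skip rows with a missing amount and total the rest per name."""
    totals = {}
    for row in rows:
        if not row["amount"]:
            continue  # incomplete record, dropped in this sketch
        totals[row["name"]] = totals.get(row["name"], 0.0) + float(row["amount"])
    return totals


def load(totals):
    """Load: serialize the result as JSON (a stand-in for writing to a store)."""
    return json.dumps(totals, sort_keys=True)


result = load(transform(extract(RAW_CSV)))
print(result)
```

Keeping the three stages as separate functions makes each one easy to test or replace on its own, for example by swapping `load` for a function that writes to a database.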
To pass the class, students need to earn at least 90% of the available points. We’ve created a flexible environment designed to give you the best possible learning experience.
Punctuality, participation in discussions, completion of assignments, and demonstration of professional courtesy to others are required, in accordance with our Code of Conduct. Attendance will be taken at the beginning of every class. Passing requires at least 90% attendance. Students should always contact the instructors ahead of time if they are unable to attend all or part of the published class/lab hours.