I passed the Google Cloud Professional Data Engineer exam on the first attempt on 27th September. Let me share my preparation with you all for your reference.
Table of Contents
- My Previous Experience with GCP
- Current Work with GCP
- My Preparation for Exam
- Practice Test
- One important Tip
My Previous Experience with GCP
I started my GCP journey in 2018 when I was working on Kafka and Kubernetes. I was so happy to see my first distributed software installed, with its nodes communicating with each other. (It is a pleasure to deploy distributed software like Kafka in a cluster environment.)
During that time I learned about streaming technology, GCP Compute Engine, Docker, and Kubernetes.
Current Work with GCP
My current work is to build a data platform on GCP. I explored various products like BigQuery, Data Studio, Cloud Spanner, Cloud Bigtable, Google Cloud Storage, Cloud Composer, KMS, Dataflow, and Apache Beam. I built proofs of concept (POCs) to understand each product and finalize the design for the business requirements.
My Preparation for Exam
Although I had worked with many data analytics products on GCP, there were some products I hadn't explored, like Pub/Sub, Dataproc, Data Fusion, Dataprep, and machine learning products like Kubeflow and Machine Learning Platform.
Hence I needed some online courses that could give me a basic understanding of each product.
I watched many video courses on Pluralsight and Coursera.
1: Video Courses I Completed
List of courses I completed on Pluralsight:
Data Storage
https://app.pluralsight.com/library/courses/google-bigtable-architecting-big-data-solutions/table-of-contents
https://app.pluralsight.com/library/courses/google-cloud-spanner-creating-administering-instances/table-of-contents
https://app.pluralsight.com/library/courses/google-cloud-sql-creating-administering-instances
https://app.pluralsight.com/library/courses/google-cloud-platform-firestore-leveraging-realtime-database-solutions
Data Processing
https://app.pluralsight.com/library/courses/google-cloud-platform-pubsub-architecting-stream-processing-solutions/table-of-contents
https://app.pluralsight.com/library/courses/building-blocks-hadoop-hdfs-mapreduce-yarn/table-of-contents
https://app.pluralsight.com/library/courses/google-dataflow-architecting-serverless-big-data-solutions/table-of-contents
https://app.pluralsight.com/library/courses/google-cloud-functions-getting-started/table-of-contents
Pipeline Orchestration
https://app.pluralsight.com/library/courses/google-cloud-platform-composer-building-pipelines-workflow-orchestration/table-of-contents
Machine Learning
https://app.pluralsight.com/library/courses/applying-machine-learning-data-gcp
https://app.pluralsight.com/library/courses/smart-analytics-machine-learning-ai-gcp/table-of-contents
https://app.pluralsight.com/library/courses/tensorflow-understanding-foundations
Overall Data Platform
https://app.pluralsight.com/library/courses/modernizing-data-lakes-data-warehouses-gcp/table-of-contents
The above courses were very helpful in building a foundation for understanding data engineering on GCP.
2: Product Documentation
The next step I took was to go through the documentation of each product. This was helpful in understanding the concepts in detail. Going through the entire documentation looked like too much initially, but it is very important in order not to miss important concepts. I decided to use 1 hour of my day to go through the documentation. https://cloud.google.com/docs
3: GCP Qwiklabs
I practiced the data engineering quest on GCP Qwiklabs, which is a good way to get hands-on experience. https://bit.ly/35v0TZ1
4: Final Steps
- I found this blog very useful during the last week of preparation, to check that I was not missing any concepts.
https://deploy.live/blog/google-cloud-certified-professional-data-engineer/
- This course from GCP on Pluralsight was also good for the last week of preparation.
https://app.pluralsight.com/library/courses/preparing-google-cloud-professional-data-engineer-exam
Practice Test
- Google Cloud provides sample questions, which I practiced before the exam: https://cloud.google.com/certification/sample-questions/data-engineer
- Exam Topics: https://www.examtopics.com/exams/google/professional-data-engineer/view/4/
- Test prep training: https://www.testpreptraining.com/google-cloud-certified-professional-data-engineer-free-practice-test
One important Tip
Understanding the open-source alternatives to each GCP product is very helpful in the exam, since questions include comparisons with open-source alternatives.
For example, I studied GCP Pub/Sub, but I also learned about Apache Kafka. This is important because in the exam you are often asked to compare products and choose among alternatives. Hence, learning about Google Dataflow is mandatory to pass the exam, but understanding Apache Spark and Flink is also important to become a data engineer.
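One way to keep these comparisons organized while studying is a small lookup table. The sketch below is only a study aid, not an official mapping: the Pub/Sub and Dataflow pairings come from the tip above, the Dataproc, Cloud Bigtable, and Cloud Composer rows are my own additions based on common pairings, and the names `OPEN_SOURCE_COUNTERPARTS`, `counterparts`, and `products_related_to` are made up for illustration.

```python
from __future__ import annotations

# Study aid (assumed pairings, not an official Google list):
# each GCP product mapped to open-source tools it is commonly compared with.
OPEN_SOURCE_COUNTERPARTS: dict[str, list[str]] = {
    "Pub/Sub": ["Apache Kafka"],
    "Dataflow": ["Apache Beam", "Apache Spark", "Apache Flink"],
    "Dataproc": ["Apache Hadoop", "Apache Spark"],
    "Cloud Bigtable": ["Apache HBase"],
    "Cloud Composer": ["Apache Airflow"],
}


def counterparts(product: str) -> list[str]:
    """Return the open-source tools listed for a GCP product (empty if none)."""
    return OPEN_SOURCE_COUNTERPARTS.get(product, [])


def products_related_to(tool: str) -> list[str]:
    """Return the GCP products (sorted by name) whose list includes the tool."""
    return sorted(
        product
        for product, tools in OPEN_SOURCE_COUNTERPARTS.items()
        if tool in tools
    )
```

For example, `counterparts("Pub/Sub")` returns the Kafka entry, and `products_related_to("Apache Spark")` lists both Dataflow and Dataproc, which is exactly the kind of overlap worth reviewing before the exam.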
Thanks for your time!