Session - 18 BigQuery Spotify case study and assignments
Session - 19 BigQuery working with nested and repeated fields for better optimization
Session - 20 BigQuery views, materialized views and authorized views
Session - 21 Dataproc Hadoop and Spark Introduction, What is Dataproc, Cluster Types and Cluster Creation
Session - 22 Dataproc PySpark, PySpark jobs, extract step - how to read data from multiple data sources (CSV, JSON, TXT, Avro, Parquet, MySQL, BigQuery)
Session - 23 Dataproc load step - how to load data to multiple data sinks (CSV, JSON, TXT, Avro, Parquet, MySQL, BigQuery)
Session - 24 Dataproc transform step - how to perform various transformations-1
Session - 25 Dataproc transform step - how to perform various transformations-2, and assignments
Session - 26A Dataproc creating a PySpark job and submitting it to a Dataproc cluster
Session - 26B Dataproc end-to-end batch pipeline (Dataproc, BigQuery, GCS)
Session - 27 Dataflow Introduction, Apache Beam pipeline Introduction, difference between Dataproc and Dataflow
Session - 28 Dataflow extract step - how to read data from multiple data sources (CSV, JSON, TXT, Avro, Parquet, MySQL, BigQuery)
Session - 29 Dataflow load step - how to load data to multiple data sinks (CSV, JSON, TXT, Avro, Parquet, MySQL, BigQuery)
Session - 30 Dataflow transform step - how to perform various transformations (Map, Filter, ParDo, GroupByKey, CombinePerKey, ...) and assignments
Session - 31 Dataflow creating a Beam pipeline, creating a Beam pipeline from templates (GCS to BigQuery), Pub/Sub
Session - 32 Dataflow end-to-end streaming pipeline creation (Pub/Sub, Dataflow, GCS, BigQuery)
Session - 33 Composer/Airflow introduction, what is a DAG, how to create a DAG, Composer environment creation, cron job format