Redshift Jobs
Need to create a pipeline: Amazon Redshift - Talend orchestration (ETL) - Power BI, using Jenkins.
Data set: transactions data with 2.5 million records.
Task 1: Using Sqoop, load the data into HDFS (I have already done this).
Task 2: From HDFS, use PySpark to create 4 dimension tables and 1 fact table.
Task 3: Copy those tables to the Redshift cluster.
Task 4: Analyse the data using Redshift queries (8 queries).
At every step there is a hint document against which we need to compare results.
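For Task 3, one possible approach is sketched below. It assumes the PySpark output is first staged in S3 as Parquet files and then loaded with Redshift COPY statements. The table names, bucket path, and IAM role ARN are hypothetical placeholders, since the brief does not name them. The script only builds the SQL text and does not connect to a cluster.

```python
# Sketch only: builds Redshift COPY statements for Parquet files staged in S3.
# All table names, the S3 prefix, and the IAM role ARN are hypothetical placeholders.

DIMENSION_TABLES = ["dim_customer", "dim_product", "dim_store", "dim_date"]  # hypothetical
FACT_TABLE = "fact_transactions"  # hypothetical

S3_PREFIX = "s3://example-bucket/warehouse"  # hypothetical staging location
IAM_ROLE_ARN = "arn:aws:iam::123456789012:role/ExampleRedshiftCopyRole"  # hypothetical


def build_copy_statement(table: str, s3_prefix: str, iam_role_arn: str) -> str:
    """Return a COPY statement that loads one table from its Parquet folder."""
    return (
        f"COPY {table}\n"
        f"FROM '{s3_prefix}/{table}/'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS PARQUET;"
    )


def build_all_copy_statements() -> list[str]:
    """List the four dimension loads first, then the fact load."""
    tables = DIMENSION_TABLES + [FACT_TABLE]
    return [build_copy_statement(t, S3_PREFIX, IAM_ROLE_ARN) for t in tables]


statements = build_all_copy_statements()

if __name__ == "__main__":
    for sql in statements:
        print(sql)
        print()
```

The generated statements could then be run from any SQL client connected to the cluster, or from a Jenkins or Talend step. The target tables need to exist in Redshift with matching column definitions before COPY is run.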