synapse pyspark delta lake merge scd type2 without primary key
₹12500-37500 INR
Closed
Posted 5 months ago
Paid on delivery
I am looking for a skilled professional who can help me with a project titled "synapse pyspark delta lake merge scd type2 without primary key". The ideal candidate should have experience and expertise in the following areas:
Desired Outcome:
- The desired outcome of the merge process is to update existing records and insert new records.
Data Quality:
- The result must have high data integrity: no duplicate rows and full accuracy.
Handling Historical Data:
- There is a specific requirement to keep track of historical changes to the data.
Skills and Experience:
- Proficiency in Synapse, PySpark, and Delta Lake
- Experience with SCD Type 2 implementation
- Strong understanding of data integrity and accuracy
- Ability to handle historical data changes
Scenario:
**Problem:** I receive a set of rows from an upstream process, and these rows have no primary key. The candidate composite key columns are themselves subject to change, so they cannot serve as a stable composite key; the only thing that makes a row unique is the whole row (all key columns and all value columns). I need to implement SCD Type 2 on this data. The environment is Synapse PySpark, using the Delta Lake MERGE command and more.
What I tried (row hash): Without a primary or composite key, the challenge is identifying which rows have been changed or updated. Any updated value changes the row hash, so the updated row is treated as a new row.
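To make the row-hash behaviour concrete, here is a minimal sketch in plain Python (not Synapse or Delta Lake code). It assumes each load is a full snapshot of the source; the column names `region`, `product`, `amount` and the helpers `row_hash` and `apply_snapshot` are hypothetical, made up for illustration. It shows that a changed value yields a different hash, and one possible way to still keep history: close any current version whose hash is missing from the new snapshot, and insert any hash not seen among current versions.

```python
import hashlib
from datetime import date


def row_hash(row: dict) -> str:
    """Hash every column of the row, in sorted column order."""
    # A unit separator keeps ("ab", "c") distinct from ("a", "bc").
    payload = "\x1f".join(f"{k}={row[k]}" for k in sorted(row))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def apply_snapshot(history: list, snapshot: list, load_date: date) -> list:
    """One SCD Type 2 step, using the whole-row hash as the only key.

    Assumption: `snapshot` is a FULL extract of the source, so a hash
    that is absent from it means the row was changed or deleted.
    """
    # Keying by hash also collapses exact duplicate rows in the snapshot.
    incoming = {row_hash(r): r for r in snapshot}
    current = {h["row_hash"] for h in history if h["is_current"]}

    result = []
    for h in history:
        if h["is_current"] and h["row_hash"] not in incoming:
            # The row no longer exists in the source: close this version.
            h = {**h, "is_current": False, "valid_to": load_date}
        result.append(h)

    for hsh, r in incoming.items():
        if hsh not in current:
            # A hash with no current version: open a new version.
            result.append({**r, "row_hash": hsh, "valid_from": load_date,
                           "valid_to": None, "is_current": True})
    return result


day1 = [{"region": "EU", "product": "A", "amount": 10},
        {"region": "US", "product": "B", "amount": 20}]
day2 = [{"region": "EU", "product": "A", "amount": 15},  # amount changed
        {"region": "US", "product": "B", "amount": 20}]  # unchanged

history = apply_snapshot([], day1, date(2024, 1, 1))
history = apply_snapshot(history, day2, date(2024, 1, 2))
```

After the second load, `history` holds three versions: the old EU row is closed, the new EU row is open, and the US row is untouched. Note the limitation: without a stable key, the closed row and the new row cannot be linked as "the same record", so an update is recorded as an end-dated row plus a new row. With incremental (non-snapshot) loads this approach cannot tell which old row to close.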
Please suggest how this problem can be solved. If you have any questions, please write back.
If you have the skills and experience mentioned above, please submit your proposal for this project.