New ADF Change Data Capture Capabilities
Entertainment
Mark walks through a demo of the new top-level ADF resource for CDC (Change Data Capture). Build a Change Data Capture process in #Azure #DataFactory in minutes without a pipeline, data flow, or a trigger!
Comments: 18
This is an amazing feature to add. With this in place, it comes down to comparing the performance of CDC, which needs compute (in most cases a merge) to update existing records with the new or updated records brought across, against a simple copy statement that uses PolyBase (in the case of Synapse dedicated pools or Azure SQL DW). This should also be added to the options in Synapse pipelines (via Synapse Studio), considering that is intended to be the home for future data development.
What happens if there are multiple CSV files under that folder in the source? Would it be able to monitor all of them, or would we need to pick one?
Can we enable native SQL CDC if we are connecting to DB2?
Can we deduplicate transactions with the same primary key within the 15-minute interval before loading the target?
Are there plans to add full load capabilities directly within the CDC Resource? Currently, I have to establish a distinct pipeline / data flow for full loads. Incorporating this feature would greatly simplify the process.
Hi - I have set up an Azure SQL Database source and a SQL Managed Instance target, but it only captures inserts, not deletes or modifications to rows. So it just replicates inserts.
How do we connect from an on-premises SQL Server to Azure SQL Database? We are unable to see the integration runtime.
If I am not wrong, we can't use this when ingesting data from external sources such as other RDBMS servers like Oracle, or API-based ingestion like Salesforce?
@MSDataFactory
1 year ago
Correct, this is only for CDC sources. Use pipelines with data flows to build a more complex ETL pipeline that combines CDC with other source types.
If I am using CDC with a frequency of 15 minutes, any idea how much it will cost me? (I am using all basic options, nothing premium.)
Hi, I am using a Parquet source and an Azure SQL sink, and it is unable to detect deletions between files. I have applied a key to the ID and it still cannot detect them. Is this something that will be resolved soon? I think many businesses ingesting from third-party sources face a similar issue, where capturing incremental changes, including deletions, is of huge importance.
@MSDataFactory
8 months ago
For file sources (like Parquet), ADF CDC only supports update and creation events, not deletes.
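Since deletes are not surfaced for file sources, one possible workaround outside the CDC resource is to compare the key sets of two consecutive snapshots yourself. A minimal sketch, assuming each snapshot has already been loaded into a list of dicts and that `id` is the key column; `find_deleted_keys` is a hypothetical helper, not an ADF feature:

```python
from typing import Any, Hashable, Iterable, Mapping, Set


def find_deleted_keys(
    previous: Iterable[Mapping[str, Any]],
    current: Iterable[Mapping[str, Any]],
    key: str = "id",  # assumed key column name
) -> Set[Hashable]:
    """Return keys present in the previous snapshot but missing from the current one."""
    previous_keys = {row[key] for row in previous}
    current_keys = {row[key] for row in current}
    # Anything that disappeared between snapshots is treated as a delete.
    return previous_keys - current_keys
```

The resulting keys could then drive a delete against the sink in a separate pipeline step. This only works when each file is a full snapshot rather than an incremental extract.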
Hello, how do you capture rows that should be deleted in SQL DB in case they've been deleted from the source?
@MSDataFactory
8 months ago
You will need to enable CDC on the SQL source DB in order to capture deletes.
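On a SQL Server or Azure SQL source, native CDC is typically switched on with the system stored procedures `sys.sp_cdc_enable_db` and `sys.sp_cdc_enable_table`. A minimal sketch that only builds the T-SQL text; the helper name `build_enable_cdc_sql` and the `dbo.Customers` example are hypothetical, and executing the statements (with sufficient permissions) is left to whichever SQL client you use:

```python
from typing import List


def build_enable_cdc_sql(schema: str, table: str) -> List[str]:
    """Return T-SQL statements that enable CDC on the current database and one table.

    Hypothetical helper for illustration: it builds strings only and does not
    connect to any database.
    """
    # Reject empty names and names that would break out of the N'...' literal.
    for name in (schema, table):
        if not name or "'" in name:
            raise ValueError(f"unsupported identifier: {name!r}")
    return [
        "EXEC sys.sp_cdc_enable_db;",
        "EXEC sys.sp_cdc_enable_table "
        f"@source_schema = N'{schema}', "
        f"@source_name = N'{table}', "
        "@role_name = NULL;",
    ]
```

Calling `build_enable_cdc_sql("dbo", "Customers")` yields the two statements to run against the source database, database-level first and then table-level.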
Can we parameterize this?
@MSDataFactory
1 year ago
Not at this time
Can these CDCs be executed on a self-hosted integration runtime?
@MSDataFactory
1 year ago
In order to talk to VNet, private, or on-prem data sources/sinks using CDC in ADF, use the managed VNet option of the Azure IR.