
Oracle Database On-Premises to Azure Data Lake Storage for Microsoft Fabric: Real-Time Data Ingestion, Migration, and Streaming using Oracle GoldenGate

Author: Ozwen Coelho | 6 min read | February 11, 2025

Nowadays, organizations need to migrate and move data from Oracle Database to various data lakes, such as Microsoft Fabric and Snowflake. These data lakes are used in data analytics and by data scientists, helping organizations process vast amounts of data in real time or in batch, derive insights, and make data-driven decisions.

Oracle GoldenGate is a very powerful tool for capturing and replicating data, and for converting it into various formats such as JSON and Parquet. Let's dive into using Oracle GoldenGate to extract data, configure the replication process for Distributed Applications and Analytics, and convert the data into JSON/Parquet format.

Environment Detail (Source): RHEL 8.10 || Oracle Database 19c (19.25) || GoldenGate binaries for Oracle (Extract process) and GoldenGate binaries for Distributed Applications and Analytics (Replicat process)
Environment Detail (Target): Azure Cloud || ADLS || Microsoft Azure Blob Storage

(Step 1) Download GoldenGate Binaries for Linux and for Distributed Applications and Analytics

Oracle GoldenGate Downloads

(Step 2) Install GoldenGate Binaries for Linux and for Distributed Applications and Analytics

/u01/app/oracle/product/GG/23_db || GG Binaries for Linux 
/u01/app/oracle/product/GG/23_az || GG Binaries for Distributed Applications and Analytics

(Step 3) Apply Patches to the GoldenGate Binaries to Be on the Latest Version

Applied patch#37071319 using the command "opatch apply" on the GG binaries for Linux (/u01/app/oracle/product/GG/23_db)
Applied patch#37108354 using the command "opatch apply" on the GG binaries for Distributed Applications and Analytics (/u01/app/oracle/product/GG/23_az)

(Step 4) Configure Az Tool to Test Connectivity from On-premises to Azure Portal for Storage Account
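The connectivity test can be sketched with a few Azure CLI commands. The storage account and container names below are placeholders, and `--auth-mode login` assumes the signed-in identity has data-plane permissions on the account:

```shell
# Sign in interactively (or use a service principal for automation)
az login

# List containers to confirm the storage account is reachable from on-premises
az storage container list --account-name <storage_account> --auth-mode login -o table

# Upload a small test blob to confirm write access
echo "GG connectivity test" > /tmp/gg_test.txt
az storage blob upload --account-name <storage_account> --container-name <container> \
  --name gg_test.txt --file /tmp/gg_test.txt --auth-mode login
```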

(Step 5) Perform Prep Work on Oracle Database

Enable archive log mode || alter database archivelog;
Enable force logging || alter database force logging;
Enable supplemental logging || alter database add supplemental log data;
Enable GoldenGate replication || alter system set enable_goldengate_replication=true;
Create GG user || create user oggadmin identified by "*********";
Grant required permissions || exec dbms_goldengate_auth.grant_admin_privilege('oggadmin');
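The prep work above can be sanity-checked from SQL*Plus before moving on; this verification query is not part of the original steps, just a quick confirmation:

```sql
-- Run as SYSDBA; the columns should show ARCHIVELOG, YES, and YES respectively
SELECT log_mode, force_logging, supplemental_log_data_min
FROM   v$database;

-- Should return TRUE
SHOW PARAMETER enable_goldengate_replication
```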

(Step 6) Configure GoldenGate 23ai Service Manager and Create a Deployment

/u01/app/oracle/product/GG/23_db/bin/oggca.sh

(Step 7) Prep Work for Extraction – Add a Database Connection, Trandata, Checkpoint Table, and Heartbeat Table; Add and Register the Extract Group

Add an Online Extract
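In the Admin Client, the prep work and extract addition in this step look roughly like the following; the credential alias, trail name, and schema/table names (hr.*) are placeholders for your environment:

```
-- Log in to the source database using the stored credential alias
DBLOGIN USERIDALIAS oggadmin

-- Enable table-level supplemental logging, then create checkpoint and heartbeat tables
ADD TRANDATA hr.*
ADD CHECKPOINTTABLE oggadmin.ggs_checkpoint
ADD HEARTBEATTABLE

-- Create and register the integrated extract, then attach a local trail
ADD EXTRACT ext1, INTEGRATED TRANLOG, BEGIN NOW
REGISTER EXTRACT ext1 DATABASE
ADD EXTTRAIL ea, EXTRACT ext1
```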

(Step 8) Configure the Extraction Process for the Table (Initial Load + Continuous Replication)

Initial load

Continuous replication
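A minimal extract parameter file for the continuous-replication side might look like this; the group name, credential alias, trail name, and table list are assumptions, not values from the original setup:

```
EXTRACT ext1
USERIDALIAS oggadmin
EXTTRAIL ea
TABLE hr.*;
```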

(Step 9) Configure Deployment for Distributed Applications and Analytics

/u01/app/oracle/product/GG/23_az/bin/oggca.sh






(Step 10) Install Dependencies (Required to Write Replicated Data in Parquet Format)

https://docs.oracle.com/en/middleware/goldengate/big-data/23/gadbd/qs-realtimeparquet-ingestion-azure-data-lake-storage-goldengate-daa.html#GUID-75E81FD0-0DF4-418A-9643-8F63243DB74C

[oracle@******* DependencyDownloader]$ which java 
/usr/bin/java 
[oracle@******* DependencyDownloader]$ pwd 
/u01/app/oracle/product/GG/23_az/opt/DependencyDownloader 
[oracle@******* DependencyDownloader]$
 
./parquet.sh 1.15.0 || https://search.maven.org/artifact/org.apache.parquet/parquet-hadoop
./hadoop.sh 3.4.1 || https://search.maven.org/artifact/org.apache.hadoop/hadoop-azure
./azure_blob_storage.sh 12.29.0 || https://search.maven.org/artifact/com.azure/azure-storage-blob

Copy the downloaded files from above to /u01/app/oracle/product/GG/oggjar/

(Step 11) Configure Replicate Process to Push Data to Azure Blob Storage

Properties files:

#The File Writer Handler – no need to change
gg.handlerlist=filewriter
gg.handler.filewriter.type=filewriter
gg.handler.filewriter.mode=op
gg.handler.filewriter.pathMappingTemplate=./dirout
gg.handler.filewriter.stateFileDirectory=./dirsta
gg.handler.filewriter.fileRollInterval=7m
gg.handler.filewriter.inactivityRollInterval=5s
gg.handler.filewriter.fileWriteActiveSuffix=.tmp
gg.handler.filewriter.finalizeAction=delete
### Avro OCF – no need to change
gg.handler.filewriter.format=avro_row_ocf
gg.handler.filewriter.fileNameMappingTemplate=${groupName}_${fullyQualifiedTableName}_${currentTimestamp}.avro
gg.handler.filewriter.format.pkUpdateHandling=delete-insert
gg.handler.filewriter.format.metaColumnsTemplate=${optype},${position}
gg.handler.filewriter.format.iso8601Format=false
gg.handler.filewriter.partitionByTable=true
gg.handler.filewriter.rollOnShutdown=true
#The Parquet Event Handler – no need to change
gg.handler.filewriter.eventHandler=parquet
gg.eventhandler.parquet.type=parquet
gg.eventhandler.parquet.pathMappingTemplate=./dirparquet
gg.eventhandler.parquet.fileNameMappingTemplate=${groupName}_${fullyQualifiedTableName}_${currentTimestamp}.parquet
gg.eventhandler.parquet.writeToHDFS=false
gg.eventhandler.parquet.finalizeAction=delete
#TODO Select ABS Event Handler – no need to change
gg.eventhandler.parquet.eventHandler=abs
#TODO Set ABS Event Handler - please update as needed
gg.eventhandler.abs.type=abs
gg.eventhandler.abs.bucketMappingTemplate=******
gg.eventhandler.abs.pathMappingTemplate=ogg/data/fullload/${fullyQualifiedTableName}
gg.eventhandler.abs.accountName=*********
#TODO: Edit the Azure storage account key if access key is used
gg.eventhandler.abs.accountKey=************
#TODO: Edit the Azure shared access signature (SAS) if SAS is used.
#gg.eventhandler.abs.sasToken=
#TODO: Edit the tenant ID, client ID, and client secret of the application if Microsoft Entra ID (Azure AD) authentication is used.
#gg.eventhandler.abs.tenantId=
#gg.eventhandler.abs.clientId=
#gg.eventhandler.abs.clientSecret=
#TODO Set the classpath to the directory the dependency jars were copied to in Step 10
gg.classpath=/u01/app/oracle/product/GG/oggjar/*
jvm.bootoptions=-Xmx512m -Xms32m
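The properties file above is referenced from the replicat parameter file. A minimal sketch, assuming the replicat group is named rabs, the properties are saved as dirprm/rabs.properties, and hr.* is the mapped schema:

```
REPLICAT rabs
TARGETDB LIBFILE libggjava.so SET property=dirprm/rabs.properties
REPORTCOUNT EVERY 10 MINUTES, RATE
MAP hr.*, TARGET hr.*;
```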

Want to connect with the author or explore other database resources? Visit Datavail.com for more.
