Executing Tasks using OCI CLI in Oracle Cloud Infrastructure Data Integration

Here we will use the Oracle Cloud Infrastructure CLI to execute an OCI Data Integration task that has been published to an application. To execute a task you’ll need the OCID of the workspace, the application key and the task key. Everything else takes a default value, which you can override when needed:

Run a task from the OCI Console

Tasks can have parameters for overriding the data asset, connection, schema and entity of any source or target operator. Tasks can also have parameters for filter conditions and join conditions. Below you can see what happens in the console when you run a task with parameters: the console prompts you to override the defaults if you wish.

Run the task and pass runtime parameters.

In the code example below we will see how to execute with default values and then how to execute and pass a different object name to be used during execution.

oci data-integration task-run create --workspace-id YOUR_WORKSPACE_OCID --application-key YOUR_APPLICATION_KEY --registry-metadata '{"aggregator-key":"YOUR_TASK_KEY"}'

We can execute this from within the OCI Console as below:

Execute the create task run command

You can then refresh the runs and see the new task run in the OCI Console.

When you want to override parameters, pass the values in the config provider property. This is a JSON snippet that holds the bindings for the parameters. If it’s a database table you can simply pass in the table name; if it’s a file you’ll also need to pass in the data format, compression and so on. For example, below you can see dataFormat: the format attribute passes in a model type of CSV_FORMAT, the encoding (UTF-8), the delimiter (,), the quote character ("), the timestamp format (yyyy-MM-dd HH:mm:ss.SSS) and the escape character (\):

Example of a CSV data file parameter named INVENTORY_DATA

If the file were JSON, the data format is much simpler: you just need the model type JSON_FORMAT and an encoding (UTF-8, for example):

Example of a JSON data file parameter named INVENTORY_DATA.

If it’s a database table it’s even simpler. In this example the parameter is named SRCDATAPARAM:

In each of these you can see that the key has a special format:

dataref:connectionKey/bucketName/FILE_ENTITY:fileName

or

dataref:connectionKey/schemaName/TABLE_ENTITY:tableName
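The two key formats above can be assembled with a small helper. This is only a sketch: `dataref_key` is a hypothetical function name, and the sample values are the ones used in the CSV example in this post.

```shell
# Hypothetical helper: build a dataref key from its four parts.
# $1 = connection key, $2 = bucket or schema name,
# $3 = FILE_ENTITY or TABLE_ENTITY, $4 = file or table name.
dataref_key() {
  printf 'dataref:%s/%s/%s:%s\n' "$1" "$2" "$3" "$4"
}

# Sample values taken from the CSV example in this post.
file_key="$(dataref_key 0a5ae9ef-5b74-447a-9d5b-49511f3b7600 a_supplierx_stage FILE_ENTITY supplierx_inventory2.csv)"
echo "$file_key"
```

For a table you would pass the schema name as the second argument and TABLE_ENTITY as the third.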

Below is an example where the file and its format are passed in as a parameter to the execution. The JSON is supplied through the --config-provider option.

oci data-integration task-run create --workspace-id YOUR_WORKSPACE_OCID --application-key YOUR_APPLICATION_KEY --registry-metadata '{"aggregator-key":"YOUR_TASK_KEY"}' --config-provider '{"bindings":{"INVENTORY_DATA":{"rootObjectValue":{"modelType":"ENRICHED_ENTITY","entity":{"key":"dataref:0a5ae9ef-5b74-447a-9d5b-49511f3b7600/a_supplierx_stage/FILE_ENTITY:supplierx_inventory2.csv","name":"supplierx_inventory2.csv","modelType":"FILE_ENTITY","resourceName":"FILE_ENTITY:supplierx_inventory2.csv"},"dataFormat":{"formatAttribute":{"modelType":"CSV_FORMAT","encoding":"UTF-8","delimiter":",","quoteCharacter":"\"","hasHeader":true,"timestampFormat":"yyyy-MM-dd HH:mm:ss.SSS","isFilePattern":false,"escapeCharacter":"\\"},"type":"CSV","compressionConfig":{"codec":"NONE"},"isFilePattern":false}}}}}'
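A long one-line JSON argument is easy to break with shell quoting. One alternative, sketched here, is to keep the same bindings in a file and hand it to the CLI with the `file://` prefix that the OCI CLI accepts for complex-type options. The function name `run_inventory_task` and the use of a temporary file are illustrative choices, not part of the service.

```shell
# Write the bindings to a file. The quoted 'EOF' keeps the backslashes literal,
# so the JSON is stored exactly as typed.
config_file="$(mktemp)"
cat > "$config_file" <<'EOF'
{
  "bindings": {
    "INVENTORY_DATA": {
      "rootObjectValue": {
        "modelType": "ENRICHED_ENTITY",
        "entity": {
          "key": "dataref:0a5ae9ef-5b74-447a-9d5b-49511f3b7600/a_supplierx_stage/FILE_ENTITY:supplierx_inventory2.csv",
          "name": "supplierx_inventory2.csv",
          "modelType": "FILE_ENTITY",
          "resourceName": "FILE_ENTITY:supplierx_inventory2.csv"
        },
        "dataFormat": {
          "formatAttribute": {
            "modelType": "CSV_FORMAT",
            "encoding": "UTF-8",
            "delimiter": ",",
            "quoteCharacter": "\"",
            "hasHeader": true,
            "timestampFormat": "yyyy-MM-dd HH:mm:ss.SSS",
            "isFilePattern": false,
            "escapeCharacter": "\\"
          },
          "type": "CSV",
          "compressionConfig": { "codec": "NONE" },
          "isFilePattern": false
        }
      }
    }
  }
}
EOF

# Hypothetical wrapper: the same command as above, reading the bindings from the file.
run_inventory_task() {
  oci data-integration task-run create \
    --workspace-id YOUR_WORKSPACE_OCID \
    --application-key YOUR_APPLICATION_KEY \
    --registry-metadata '{"aggregator-key":"YOUR_TASK_KEY"}' \
    --config-provider "file://$config_file"
}
```

Call `run_inventory_task` once the placeholders have been replaced with your own workspace OCID, application key and task key.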

You can then see the task running in the console:

Monitor the tasks running from within the OCI Console

It’s quite simple to execute commands using the CLI, and you can put them in a cron job or your favorite scheduler. There are some other useful capabilities too.
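As a rough sketch of the cron approach, the script below wraps the default-value command in a function and only runs it when called with `--run`. The script name `run_task.sh`, the `--run` switch, the schedule and every path are assumptions for illustration.

```shell
#!/bin/sh
# run_task.sh (hypothetical name): start a published task with its default values.
# cron runs with a minimal environment, so make sure the directory holding the
# oci executable is on PATH, or call oci by its full path.

run_published_task() {
  oci data-integration task-run create \
    --workspace-id YOUR_WORKSPACE_OCID \
    --application-key YOUR_APPLICATION_KEY \
    --registry-metadata '{"aggregator-key":"YOUR_TASK_KEY"}'
}

# Only start the task when explicitly asked to.
if [ "${1:-}" = "--run" ]; then
  run_published_task
fi

# Example crontab entry (placeholder paths), starting the task every day at 02:00:
# 0 2 * * * /path/to/run_task.sh --run >> /path/to/run_task.log 2>&1
```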

Process Directories/Folders

Data Integration also supports processing directories of objects in a bucket, based on the naming convention for the objects. If the objects are named ‘20200707/meterx10934.csv’, ‘20200707/meterx10935.csv’ and so on, then using ‘20200707/’ as the file entity name when executing as above will process all objects with that prefix, that is, everything in that logical directory.

Process File Patterns

Data Integration also supports processing patterns of objects, so you can process ‘supplierx*json’ for example. Enter the pattern as the resource name and set isFilePattern to true in the config provider JSON shown above.
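A sketch of what such a binding might look like follows. The field layout mirrors the CSV example in this post. Putting the pattern into the key and name as well as the resource name, the `"type":"JSON"` value, and placing isFilePattern at the dataFormat level are assumptions, and YOUR_CONNECTION_KEY and YOUR_BUCKET are placeholders.

```shell
# Assumption-laden sketch: a JSON file pattern bound to the INVENTORY_DATA parameter.
pattern='supplierx*json'

# The unquoted EOF lets ${pattern} expand; the * is not globbed inside a here-document.
config_provider=$(cat <<EOF
{"bindings":{"INVENTORY_DATA":{"rootObjectValue":{"modelType":"ENRICHED_ENTITY",
"entity":{"key":"dataref:YOUR_CONNECTION_KEY/YOUR_BUCKET/FILE_ENTITY:${pattern}",
"name":"${pattern}","modelType":"FILE_ENTITY","resourceName":"FILE_ENTITY:${pattern}"},
"dataFormat":{"formatAttribute":{"modelType":"JSON_FORMAT","encoding":"UTF-8"},
"type":"JSON","isFilePattern":true}}}}}
EOF
)
echo "$config_provider"

# Then pass it to the task run, for example:
# oci data-integration task-run create ... --config-provider "$config_provider"
```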

Process Zipped Data

Data Integration also supports processing zipped objects, so you can process ‘supplierx*csv.gz’ for example. Enter this as the resource name and set the compression algorithm to use. GZIP, BZIP2, DEFLATE, LZ4 and SNAPPY are supported; set the value in the codec property within the compressionConfig property:

The compression options are defined when selecting the entity.
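If you generate the config provider JSON from a script, it can help to check the codec name before sending it. The helper below is a hypothetical sketch that emits the compressionConfig value for the codecs listed above, plus NONE as used in the CSV example, and rejects anything else.

```shell
# Hypothetical helper: print a compressionConfig value for a known codec name.
compression_config() {
  case "$1" in
    NONE|GZIP|BZIP2|DEFLATE|LZ4|SNAPPY)
      printf '{"codec":"%s"}\n' "$1"
      ;;
    *)
      echo "unsupported codec: $1" >&2
      return 1
      ;;
  esac
}

compression_config GZIP
```

The printed fragment goes in place of the `"compressionConfig":{"codec":"NONE"}` value in the CSV example.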

These capabilities let you design a small task once and process much more data by parameterizing these properties.

I hope you found this post useful. It was a quick flyover of using the Oracle Cloud Infrastructure CLI to execute a task that has been published to an application.

Check out Oracle Cloud Infrastructure here, along with details of the Oracle Cloud Infrastructure Data Integration service:

https://www.oracle.com/middleware/data-integration/oracle-cloud-infrastructure-data-integration

Architect at @Oracle developing cloud services for data. Connect on Twitter @i_m_dave