Check Last Run State in OCI Data Integration
This blog post shows a useful REST task that you can use in your pipelines to check the last run state of a task in OCI Data Integration. It's an interesting task not only functionally but also in its design: it leverages system parameters, so the same task can be used in any region, tenancy, workspace, or application.
The system parameters used are:
- SYS.REGION: the region string of the current workspace, so you don't have to hard-wire it
- SYS.WORKSPACE_ID: the workspace OCID of the currently executing task
- SYS.APPLICATION_KEY: the application key of the currently executing task
The task that checks the status uses a public OCI Data Integration API, ListTaskRuns. It filters with the nameStartsWith parameter, orders the results in descending order by date, and limits the result to 1; this is how we get the last run. The task then checks that the REST call itself succeeded and that the last run's status matches the expected value. The value to check is passed as a parameter, so you can check for SUCCESS or ERROR, for example, by calling the task with different parameter values. There is also a CONDITION boolean parameter that controls whether the task always returns success or actually checks the status. This is useful when you sometimes want to run everything (say, on a schedule) and at other times want to run manually, skipping tasks in the pipeline that succeeded last time and processing only the failed ones.
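To make the "filter by prefix, sort descending, take one" idea concrete, here is a minimal local sketch in Python. It does not call the service; the sample records, their field layout, and the helper name last_run are invented for illustration only.

```python
from datetime import datetime

def last_run(task_runs, name_prefix):
    """Return the most recently updated run whose name starts with name_prefix."""
    # Equivalent of nameStartsWith: keep only runs with the given name prefix.
    matching = [r for r in task_runs if r["name"].startswith(name_prefix)]
    # Equivalent of sortBy=timeUpdated with sortOrder=DESC: newest first.
    matching.sort(key=lambda r: datetime.fromisoformat(r["timeUpdated"]), reverse=True)
    # Equivalent of limit=1: keep only the first result, if there is one.
    return matching[0] if matching else None

# Hypothetical sample data; real task run records carry many more fields.
sample_runs = [
    {"name": "LOAD_ORDERS_001", "timeUpdated": "2024-01-01T10:00:00", "status": "SUCCESS"},
    {"name": "LOAD_ORDERS_002", "timeUpdated": "2024-01-02T10:00:00", "status": "ERROR"},
    {"name": "LOAD_CUSTOMERS_001", "timeUpdated": "2024-01-03T10:00:00", "status": "SUCCESS"},
]

latest = last_run(sample_runs, "LOAD_ORDERS")
print(latest["name"], latest["status"])
```

In the REST task the service does this filtering and sorting for you; the sketch only shows why the single returned item can be treated as the last run.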
Using this task, I can create a pipeline that runs in different modes: always running the task, or skipping it if it was successful last time.
I initially configure the check last run state task with the following parameter values (check for a task status of ERROR, with run all set to true):
When I run the pipeline, I can pass true for the condition to run the task regardless, or false to skip the task if it succeeded last time (it is then executed only if the last execution ended in error).
Let's look at how
The task uses the current workspace and application; you pass in the task name prefix and the task run status you want to check. You also pass a boolean condition that lets you choose between running everything and skipping tasks that succeeded. For example, my pipeline here has a parameter RUNALL and checks for the ERROR task run status, so the task runs if either RUNALL is true or the last task run status is ERROR.
The URL is defined using the system parameters as shown below; the task also exposes TASK_TYPE and TASK_NAME as parameters:
https://dataintegration.${SYS.REGION}.oci.oraclecloud.com/20200430/workspaces/${SYS.WORKSPACE_ID}/applications/${SYS.APPLICATION_KEY}/taskRuns?limit=1&sortOrder=DESC&sortBy=timeUpdated&aggregatorType=${TASK_TYPE}&nameStartsWith=${TASK_NAME}
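As a rough illustration of how the placeholders in that URL resolve, the sketch below substitutes them in Python. OCI Data Integration does the substitution itself at run time; the resolve helper, the URL-encoding step, and every sample value (region, OCID, application key, task type, task name) are assumptions made up for this example.

```python
import re
from urllib.parse import quote

URL_TEMPLATE = (
    "https://dataintegration.${SYS.REGION}.oci.oraclecloud.com/20200430"
    "/workspaces/${SYS.WORKSPACE_ID}/applications/${SYS.APPLICATION_KEY}"
    "/taskRuns?limit=1&sortOrder=DESC&sortBy=timeUpdated"
    "&aggregatorType=${TASK_TYPE}&nameStartsWith=${TASK_NAME}"
)

def resolve(template, values):
    """Replace each ${NAME} placeholder with its URL-encoded value."""
    def lookup(match):
        # A missing key raises KeyError, which surfaces unset parameters early.
        return quote(str(values[match.group(1)]), safe="")
    return re.sub(r"\$\{([A-Za-z_.]+)\}", lookup, template)

# Hypothetical values, for illustration only.
values = {
    "SYS.REGION": "us-ashburn-1",
    "SYS.WORKSPACE_ID": "ocid1.disworkspace.oc1..example",
    "SYS.APPLICATION_KEY": "example-app-key",
    "TASK_TYPE": "INTEGRATION_TASK",
    "TASK_NAME": "Load Customers",
}

url = resolve(URL_TEMPLATE, values)
print(url)
```

Because the region, workspace, and application come from system parameters, only TASK_TYPE and TASK_NAME need to be supplied by whoever uses the task.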
You can see that here within the task editor;
The success condition is defined as follows:
SYS.RESPONSE_STATUS >= 200 AND SYS.RESPONSE_STATUS < 300 AND (CAST(JSON_PATH(SYS.RESPONSE_PAYLOAD, '$.items[0].status') as String) == '${TASK_STATUS}' OR ${CONDITION})
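The logic of that expression can be mirrored in a few lines of Python, which may help when reasoning about the different modes. This is a sketch, not how the service evaluates the condition; the function name is hypothetical, and treating an empty items list as "no match" is an assumption.

```python
import json

def success_condition(response_status, response_payload, task_status, condition):
    """Mirror of the REST task success condition, for illustration only."""
    # SYS.RESPONSE_STATUS >= 200 AND SYS.RESPONSE_STATUS < 300
    if not (200 <= response_status < 300):
        return False
    # JSON_PATH(SYS.RESPONSE_PAYLOAD, '$.items[0].status')
    items = json.loads(response_payload).get("items", [])
    # Assumption: when no run is found, the status is treated as not matching.
    last_status = items[0].get("status") if items else None
    # ... == '${TASK_STATUS}' OR ${CONDITION}
    return bool(last_status == task_status or condition)

error_payload = '{"items": [{"status": "ERROR"}]}'
success_payload = '{"items": [{"status": "SUCCESS"}]}'

# Checking for ERROR with the condition off: succeed only if the last run failed.
print(success_condition(200, error_payload, "ERROR", False))    # True
print(success_condition(200, success_payload, "ERROR", False))  # False
# With the condition on, the check succeeds regardless of the last status.
print(success_condition(200, success_payload, "ERROR", True))   # True
```

Note that the condition flag never overrides a failed REST call: a response outside the 2xx range fails the check either way.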
You can see that here;
Get the library here
You can find the task in this collection of REST tasks, in the dataintegration folder.
Summary
This blog post showed a useful REST task that you can use in your pipelines to check the last run state of a task in OCI Data Integration. It's an interesting task not only functionally but also in its design: it leverages system parameters, so the task can be used in any region, tenancy, workspace, or application. Check the OCI Data Integration documentation here: https://docs.oracle.com/en-us/iaas/data-integration/home.htm
Thanks for reading!