Running scripts in the Oracle Cloud

This post shows how to execute Python scripts stored in OCI Object Storage inside a serverless container in OCI's Container Instances service. You can then run them via REST, via any supported OCI SDK, or from within OCI Data Integration alongside your other jobs. This is a cloud-native, serverless approach that is secure: you control your scripts, their dependencies (software, networking, and authorization), and where they run.


The example in this post defines one Docker image that can run different scripts, so you can reuse it for different purposes. A driver script reads the scripts from OCI Object Storage and executes them. The driver is controlled by three environment variables that are passed when the container is run:

NAMESPACE - the OCI Object Storage namespace containing the bucket where the scripts to execute are stored

BUCKET_NAME - the bucket where the Python scripts are stored

STEPS - a comma-separated list of the Python scripts to execute, in order

The scripts have the bundled Python libraries and OCI Object Storage available to use. Store the driver code in a Python file named driver.py; it uses a resource principal (shown below) to authenticate.
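Below is a minimal sketch of what driver.py could look like; the structure and error handling are assumptions, not the author's exact code. It downloads each step from the bucket and runs it with the same interpreter:

```python
import os
import subprocess
import sys
import tempfile

import oci


def main():
    # Authenticate with the container's resource principal (no API keys needed).
    signer = oci.auth.signers.get_resource_principals_signer()
    client = oci.object_storage.ObjectStorageClient(config={}, signer=signer)

    namespace = os.environ["NAMESPACE"]
    bucket = os.environ["BUCKET_NAME"]
    steps = [s.strip() for s in os.environ["STEPS"].split(",") if s.strip()]

    workdir = tempfile.mkdtemp()
    for step in steps:
        # Download the step script from Object Storage...
        body = client.get_object(namespace, bucket, step).data.content
        path = os.path.join(workdir, step)
        with open(path, "wb") as f:
            f.write(body)
        # ...and run it as a child process, failing fast on any error.
        print(f"Running step: {step}")
        subprocess.run([sys.executable, path], check=True)


if __name__ == "__main__":
    main()
```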

The Dockerfile is defined as below:
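A minimal sketch of such a Dockerfile; the base image and layout are assumptions:

```dockerfile
# Small Python base image; the version choice is illustrative.
FROM python:3.9-slim

WORKDIR /app

# Install the dependencies first so the layer caches well.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# The driver downloads and executes the step scripts at runtime.
COPY driver.py .

ENTRYPOINT ["python", "driver.py"]
```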

The requirements.txt file is defined as:
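At a minimum it needs the OCI Python SDK used by the driver; anything your step scripts import goes here too:

```text
oci
```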

This lets us create a Docker image that, when pushed to OCI Container Registry, can be used to execute any Python script. Since I am executing via OCI Data Integration and using a workspace resource, I will add a resource principal policy statement to allow Data Integration to manage the compute container family:
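A sketch of the statement, with placeholder compartment and workspace OCID (the disworkspace principal type is the documented pattern for Data Integration workspace resource principals):

```text
allow any-user to manage compute-container-family in compartment <your-compartment>
  where ALL {request.principal.type = 'disworkspace',
             request.principal.id = '<your-workspace-ocid>'}
```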

You then need policies for any OCI resources your scripts may use. The script below creates a bucket and uploads an object, so I have added those permissions too. Here we are controlling permissions for this script, so users can't just run whatever they want; they must be granted permission:
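One way to grant this is a dynamic group that matches the container instances, plus policy statements for it. The group name and rule below are illustrative:

```text
Dynamic group rule (assumed group name: script-runners):
  ALL {resource.type = 'computecontainerinstance',
       resource.compartment.id = '<your-compartment-ocid>'}

Policy statements:
  allow dynamic-group script-runners to manage buckets in compartment <your-compartment>
  allow dynamic-group script-runners to manage objects in compartment <your-compartment>
```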

Now let's look at the example. The step/Python script below creates a bucket and loads a file into it (we granted those permissions above); it uses the OCI resource principal to authenticate with OCI:
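A sketch of such a step script; the bucket name and compartment OCID are placeholders:

```python
import oci

# Authenticate with the resource principal available inside the container.
signer = oci.auth.signers.get_resource_principals_signer()
client = oci.object_storage.ObjectStorageClient(config={}, signer=signer)

namespace = client.get_namespace().data
bucket_name = "my-demo-bucket"              # illustrative name
compartment_id = "<your-compartment-ocid>"  # placeholder

# Create the bucket (raises if it already exists).
client.create_bucket(
    namespace,
    oci.object_storage.models.CreateBucketDetails(
        name=bucket_name,
        compartment_id=compartment_id,
    ),
)

# Upload a small text object into the new bucket.
client.put_object(namespace, bucket_name, "hello.txt",
                  b"Hello from a container instance!")
```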

Although I said any Python script, strictly it's any script whose library requirements are met by what is in the requirements.txt file. That's the great part here: if you depend on any other libraries, simply add them to the requirements.txt file.

You can then use the OCI Data Integration REST Task to orchestrate the script: include it in a pipeline, schedule it, set parameters, and so on. See the OCI Data Integration documentation for information on the Container Instance REST Task.
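For context, the REST Task ultimately calls the Container Instances API to create an instance from the image and pass in the environment variables the driver expects. A sketch of the equivalent call using the OCI Python SDK, where all OCIDs, the shape, and the image path are placeholders:

```python
import oci

# Local config file for illustration; inside OCI you could use a signer instead.
config = oci.config.from_file()
client = oci.container_instances.ContainerInstanceClient(config)

details = oci.container_instances.models.CreateContainerInstanceDetails(
    compartment_id="<compartment-ocid>",
    availability_domain="<availability-domain>",
    shape="CI.Standard.E4.Flex",
    shape_config=oci.container_instances.models.CreateContainerInstanceShapeConfigDetails(
        ocpus=1, memory_in_gbs=4,
    ),
    containers=[
        oci.container_instances.models.CreateContainerDetails(
            image_url="<region>.ocir.io/<tenancy-namespace>/script-runner:latest",
            # The three variables the driver script reads.
            environment_variables={
                "NAMESPACE": "<object-storage-namespace>",
                "BUCKET_NAME": "my-scripts",
                "STEPS": "step_create_bucket.py",
            },
        )
    ],
    vnics=[
        oci.container_instances.models.CreateContainerVnicDetails(
            subnet_id="<subnet-ocid>",
        )
    ],
)
client.create_container_instance(details)
```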

Unload from Redshift?

If you wanted to unload data from AWS Redshift, it's easy to do with this approach: simply use the Python connector from your Python script, and add the dependency for the module like this:
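The Amazon Redshift connector is published on PyPI as redshift_connector, so requirements.txt becomes:

```text
oci
redshift_connector
```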

I can then write steps that use the Redshift connector, for example to unload data into AWS S3 as Parquet:
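A sketch of such a step; the cluster endpoint, credentials, table, S3 path, and IAM role are all placeholders (in practice, read credentials from a vault rather than hard-coding them):

```python
import redshift_connector

# Connect to the Redshift cluster; placeholders throughout.
conn = redshift_connector.connect(
    host="<cluster>.redshift.amazonaws.com",
    database="dev",
    user="<user>",
    password="<password>",
)

cursor = conn.cursor()
# UNLOAD runs server-side in Redshift and writes Parquet files to S3.
cursor.execute(
    """
    UNLOAD ('SELECT * FROM my_schema.my_table')
    TO 's3://my-bucket/exports/my_table_'
    IAM_ROLE 'arn:aws:iam::<account-id>:role/<redshift-unload-role>'
    FORMAT AS PARQUET
    """
)
conn.close()
```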

There are plenty more examples. It's very simple, and it makes integration very flexible and very easy.

Summary

Hope you found this useful. Docker is a fantastic way of sandboxing your scripts: OCI policies secure what they can do, and Docker gives you a portable environment you can easily run anywhere. There are many possibilities and frameworks for building smart solutions on top of this, with more on that to come. Check out the documentation for OCI Data Integration. Send me comments, questions, and ideas; I would love to hear them.
