Running scripts in the Oracle Cloud

David Allan
Jan 24, 2023


Here we see how to execute Python scripts stored in OCI Object Storage inside a serverless container in OCI's Container Instance service. You can then run them via REST, via any supported OCI SDK, or from within OCI Data Integration alongside your other jobs. This is a cloud-native, serverless approach that is secure: you stay in control of your scripts, their dependencies (software, networking and authorization) and where they run.


The example in this post defines a single Docker image that can run different scripts, so you can reuse it for different purposes. A driver script reads the scripts from OCI Object Storage and executes them. The driver is controlled by three environment variables that are passed when the container is run (a sample invocation showing how they are passed follows the list):

NAMESPACE — OCI Object Storage namespace where the bucket and steps to execute are stored

BUCKET_NAME — bucket name where the Python scripts are stored.

STEPS — comma separated list of python scripts to execute
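
For illustration, a local run of the image (the image name script-runner:latest is just a placeholder) would pass the three variables like this. Note that the resource principal signer used by the driver only resolves when the container runs inside OCI, so a purely local run would need a different signer; the point here is simply how the variables are supplied:

docker run --rm \
  -e NAMESPACE=<your_namespace> \
  -e BUCKET_NAME=<your_bucket> \
  -e STEPS="step1.py,step2.py" \
  script-runner:latest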

The scripts will have the Python libraries and OCI Object Storage available to use. Store the driver below in a Python file named driver.py; it uses a resource principal to authenticate.

import oci
import os

# Download each step script from Object Storage and execute it
def execute_steps(rps, namespace, bucket_name, step_list):
    object_storage = oci.object_storage.ObjectStorageClient({}, signer=rps)
    for step in step_list:
        get_obj = object_storage.get_object(namespace_name=namespace, bucket_name=bucket_name, object_name=step)
        try:
            exec(get_obj.data.content)
        except Exception as e:
            print(e)

# Authenticate with the container's resource principal
rps = oci.auth.signers.get_resource_principals_signer()

# Read the configuration passed as environment variables
namespace = os.environ.get('NAMESPACE')
bucket = os.environ.get('BUCKET_NAME')
steps = os.environ.get('STEPS')

step_list = [x.strip() for x in steps.split(',')]
execute_steps(rps, namespace, bucket, step_list)

The Dockerfile is defined as below:

FROM python:3
WORKDIR /usr/src/app
RUN pip3 --version
COPY requirements.txt ./
RUN pip3 install --no-cache-dir -r requirements.txt
COPY driver.py .
CMD [ "python3", "./driver.py" ]

The requirements.txt file is defined as:

oci
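
Building the image and pushing it to OCI Container Registry might look like the sketch below; the region key, tenancy namespace and repository name are placeholders, and the login typically uses <tenancy-namespace>/<username> with an auth token as the password:

docker build -t <region-key>.ocir.io/<tenancy-namespace>/script-runner:latest .
docker login <region-key>.ocir.io
docker push <region-key>.ocir.io/<tenancy-namespace>/script-runner:latest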

This lets us create a Docker image that, once pushed to OCI Container Registry, can be used to execute any Python script. Since I am executing via OCI Data Integration and using a workspace resource, I will add a resource principal policy statement to allow the workspace to manage the compute container family:

allow any-user to manage compute-container-family in compartment <compartment-name> where ALL {request.principal.type = 'disworkspace', request.principal.id='<workspace_ocid>'} 

You then need policies for any OCI resources your scripts may use. The script below uses a bucket and uploads an object, so I have added those permissions too. This is how we control what a script is allowed to do: users can't just run whatever they want, they must be granted permission:

allow any-user to use buckets in compartment <compartment-name> where ALL {request.principal.type='disworkspace', request.principal.id='<workspace_ocid>'}
allow any-user to manage objects in compartment <compartment-name> where ALL {request.principal.type='disworkspace',request.principal.id='<workspace_ocid>'}

Now let's look at the example. The step script below uploads a file to a bucket (we granted permissions above), and it uses the OCI resource principal to authenticate with OCI:

import oci

# Upload a small data file to the given bucket using the resource principal signer
def create_data_in_os(rps, namespace, bucket_name, object_name):
    data = b'print("Hello")'
    object_storage = oci.object_storage.ObjectStorageClient({}, signer=rps)
    try:
        object_storage.put_object(namespace_name=namespace, bucket_name=bucket_name, object_name=object_name, put_object_body=data)
        print("Data file " + object_name + " uploaded to bucket " + bucket_name)
    except Exception:
        print("INFO: object already exists")

# Authenticate with the container's resource principal
rps = oci.auth.signers.get_resource_principals_signer()

namespace = "NAMESPACE"
bucket = "BUCKET"
entity = "output/step1.csv"

create_data_in_os(rps, namespace, bucket, entity)
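
To make this step available to the driver, upload it to the bucket and reference its object name in STEPS; for example, assuming the script is saved locally as create_data.py (a hypothetical name):

oci os object put --namespace <your_namespace> --bucket-name <your_bucket> --name create_data.py --file create_data.py

The container would then be run with STEPS=create_data.py.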

Although I said any Python script, strictly it's any script whose library requirements are met by what is in the requirements.txt file. That's the great part here: if you depend on any other libraries, simply add them to requirements.txt.

You can then use an OCI Data Integration REST Task to orchestrate the script: include it in a pipeline, schedule it, set parameters and so on. See the documentation for information on the Container Instance REST Task.
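
Outside of Data Integration, the same container can also be launched programmatically. Below is a rough sketch using the Container Instances client in the OCI Python SDK; the shape, subnet, image path and the exact model and field names are assumptions you should verify against the current SDK reference:

import oci

# Config-file auth for brevity; a resource principal signer works the same way
config = oci.config.from_file()
client = oci.container_instances.ContainerInstanceClient(config)

# Describe one container built from the image we pushed, with the three
# environment variables the driver expects
details = oci.container_instances.models.CreateContainerInstanceDetails(
    compartment_id="<compartment_ocid>",
    availability_domain="<availability_domain>",
    shape="CI.Standard.E4.Flex",
    shape_config=oci.container_instances.models.CreateContainerInstanceShapeConfigDetails(
        ocpus=1, memory_in_gbs=8),
    containers=[oci.container_instances.models.CreateContainerDetails(
        image_url="<region-key>.ocir.io/<tenancy-namespace>/script-runner:latest",
        environment_variables={
            "NAMESPACE": "<your_namespace>",
            "BUCKET_NAME": "<your_bucket>",
            "STEPS": "create_data.py"})],
    vnics=[oci.container_instances.models.CreateContainerVnicDetails(
        subnet_id="<subnet_ocid>")])

client.create_container_instance(details)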

Unload from Redshift?

If you wanted to unload from AWS Redshift, it's easy to do using this approach: simply use the Python connector from your script and add the dependency to requirements.txt like this:

oci
redshift_connector

I can then write steps that use the Redshift connector, for example to unload data into AWS S3 as Parquet:

import redshift_connector

# Connect to the Redshift (serverless) endpoint
conn = redshift_connector.connect(
    host='YOURHOST.redshift-serverless.amazonaws.com',
    database='YOUR_DATABASE',
    port=YOUR_PORT,
    user='YOUR_USER',
    password='YOUR_PASSWORD'
)

# Create a cursor and run the UNLOAD to S3 as Parquet
cursor = conn.cursor()
cursor.execute("unload ('select * from YOURTABLE') to 's3://YOUR_BUCKET/YOUR_FILE.txt' iam_role 'arn:aws:iam::YOUR_ROLE' parquet;")

There are plenty more examples. It's very simple, and it makes integration very flexible.

Summary

Hope you found this useful. Docker is a fantastic way of sandboxing your scripts: OCI policies secure what they can do, and the container gives you a portable environment that you can easily run anywhere. There are many more possibilities and frameworks for building smart solutions on top of this, more on that to come. Check out the documentation for OCI Data Integration. Send me comments, questions and ideas; I would love to hear them.
