Zipping Objects in OCI
There is often a need to create a zip archive from many other files, whether on a laptop, an on-premises compute node or in the cloud. Here we will see how to use an OCI Function to create a zip archive from a list of objects. The function takes an input payload that defines the names of the objects to zip, the bucket they are in, the bucket to write the zip to and the name of the zip file.
The function streams the objects from Object Storage (using tips from useful Python articles like this), creates the zip as a stream (using the Python module stream-zip) and then uploads the parts into Object Storage using the Object Storage multipart upload APIs;
The initial code for this can be found below; it uses the OCI Object Storage APIs to create a multipart upload (see here);
After creating the function, you can invoke it from the command line to test it.
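Separately from the function's own code, one step of this flow can be illustrated with a minimal sketch: a zip stream produces byte chunks of arbitrary size, while a multipart upload works on numbered parts, so the chunks need regrouping. The helper names `iter_parts` and `numbered_parts` and the 10 MiB part size are assumptions made for illustration; they are not taken from the function. Only the standard library is used, so no OCI or stream-zip calls appear here.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical part size chosen for illustration (10 MiB); not from the original function.
PART_SIZE = 10 * 1024 * 1024


def iter_parts(chunks: Iterable[bytes], part_size: int = PART_SIZE) -> Iterator[bytes]:
    """Regroup arbitrary-sized byte chunks into parts of exactly part_size bytes.

    The last part may be smaller. At most one part plus one incoming chunk
    is held in memory at a time, so the whole archive is never buffered.
    """
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    buffer = bytearray()
    for chunk in chunks:
        buffer.extend(chunk)
        # Emit as many full parts as the buffer currently holds.
        while len(buffer) >= part_size:
            yield bytes(buffer[:part_size])
            del buffer[:part_size]
    if buffer:
        # Whatever is left becomes the final, possibly shorter, part.
        yield bytes(buffer)


def numbered_parts(chunks: Iterable[bytes], part_size: int = PART_SIZE) -> Iterator[Tuple[int, bytes]]:
    """Pair each part with a part number starting at 1."""
    return enumerate(iter_parts(chunks, part_size), start=1)


if __name__ == "__main__":
    # Stand-in for a zip stream: three small chunks regrouped into 4-byte parts.
    for number, part in numbered_parts([b"abc", b"defgh", b"i"], part_size=4):
        print(number, part)
```

In a real function each `(number, part)` pair would be handed to the Object Storage upload-part call, and the returned part identifiers collected for the final commit. Using a generator keeps memory use bounded regardless of how large the archive grows.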
| fn invoke distools difunctionzipper
The above specifies that the source files headerdata.csv and linedata.csv will be read from the bucket sourcedata and zipped into a new zip archive named targetdata.zip and stored in bucket targetdata.
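To make the shape of such a payload concrete, here is a minimal sketch of how a request with those four pieces of information could be parsed and validated. The JSON field names (`sourceBucket`, `sourceFiles`, `targetBucket`, `zipName`) and the `ZipRequest` and `parse_zip_request` names are hypothetical and chosen only for illustration; the function's real payload keys may differ.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class ZipRequest:
    """Illustrative container for the zip request; field names are assumptions."""
    source_bucket: str
    source_files: List[str]
    target_bucket: str
    zip_name: str


def parse_zip_request(raw: str) -> ZipRequest:
    """Parse a JSON payload and check that the expected (hypothetical) keys exist."""
    body = json.loads(raw)
    required = ("sourceBucket", "sourceFiles", "targetBucket", "zipName")
    missing = [key for key in required if key not in body]
    if missing:
        raise ValueError("missing payload fields: " + ", ".join(missing))
    files = body["sourceFiles"]
    if not isinstance(files, list) or not files:
        raise ValueError("sourceFiles must be a non-empty list of object names")
    return ZipRequest(
        source_bucket=body["sourceBucket"],
        source_files=list(files),
        target_bucket=body["targetBucket"],
        zip_name=body["zipName"],
    )


# Example payload built from the bucket and object names used in this post.
example_payload = json.dumps({
    "sourceBucket": "sourcedata",
    "sourceFiles": ["headerdata.csv", "linedata.csv"],
    "targetBucket": "targetdata",
    "zipName": "targetdata.zip",
})

if __name__ == "__main__":
    print(parse_zip_request(example_payload))
```

Validating the payload up front means a missing bucket or an empty file list fails immediately with a clear message, rather than partway through a multipart upload.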
One of the useful things about OCI Functions is that every function has an endpoint you can invoke it from, so you can integrate it with other tools in your ecosystem, such as OIC (Oracle Integration Cloud) or OCI Data Integration. Here we will see how to integrate with OCI Data Integration.
From OCI Data Integration you can create a REST Task to execute the function on a schedule, in a pipeline or wherever you like;
The REST Task uses the OCI Function endpoint URL. You can copy this from your function; make sure the endpoint is reachable from OCI DI. The REST Task above has the request payload as a parameter, so you can modify the payload when the task is executed;
Now that the function and the DI task are defined, it is easy to incorporate them into your data pipelines.
The above example illustrates a data pipeline that loads data into object storage and then zips the objects if successful, otherwise pushes a notification.
Above we’ve seen how we can zip objects in Object Storage and integrate with other tools such as OCI Data Integration and Oracle Integration Cloud.
As you can see, the same approach is useful for incorporating all kinds of other REST activities. OCI Functions is a great way of adding custom business logic, and functions are automatically exposed with REST endpoints. On top of that, as you have seen here, one of the most useful tasks in the OCI Data Integration service is the REST Task, which lets you extend data integration to call all kinds of activities. Hope you found this post useful.