AWS S3 Batch Operations: Beginner’s Guide

Let’s get going.

Accessing the Preview

If you don’t have access to the S3 Batch Operations preview, fill in the form on this page.

It took a couple of days before I got an answer from AWS, so arm yourself with patience.

Getting Started

Now that you have access to the preview, you can find the Batch Operations tab on the side of the S3 console:

[Screenshot: Access Batch operations from the S3 console]

Once you have reached the Batch Operations console, let’s talk briefly about jobs.

Jobs

Central to S3 Batch Operations is the concept of a Job.

In a nutshell, a Job determines:

- In which buckets your objects are located
- What operation to do on the objects
- Which objects to run the operations on

We’ll soon create our first job.

But first, let’s create a test bucket, just to experiment a little with Batch Operations.

Creating the Test Bucket

Before you create your first job, create a new bucket with a few objects.

I created a new S3 bucket named “spgingras-batch-test” in which I uploaded 3 files (file1.jpg, file2.jpg, file3.jpg):

[Screenshot: Contents of my bucket]

I know, it’s quite small, but for demonstration purposes it’s going to be just fine.

Next you’ll need to create a CSV file that contains 2 columns (bucket name, object name), with one row for each object you want the job to operate on.
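For more than a handful of objects, you probably won't want to type the manifest by hand. Here is a minimal sketch of generating it with Python's standard `csv` module. The helper name `build_manifest` is my own, and I'm assuming the manifest has no header row and exactly the two columns described above. Keys containing special characters may need URL-encoding, so check the S3 documentation before relying on this for such keys.

```python
import csv
import io


def build_manifest(bucket, keys):
    """Return CSV text with one 'bucket,key' row per object (no header row)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for key in keys:
        writer.writerow([bucket, key])
    return buffer.getvalue()


# The three demo objects from this article.
manifest_text = build_manifest(
    "spgingras-batch-test",
    ["file1.jpg", "file2.jpg", "file3.jpg"],
)
print(manifest_text, end="")
```

Writing `manifest_text` to a file named manifest.csv gives you something you can upload to the bucket like any other object.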

In my case, I want the job to operate on all 3 files, so my CSV file looks like this:

[Screenshot: Contents of the manifest file]

Now, save the CSV and upload it to your bucket. I named the file “manifest.csv”:

[Screenshot: manifest.csv is now in my bucket]

Before we can create our first job, we must create an IAM role that Batch Operations can assume.

This role will allow Batch Operations to read your bucket and modify the objects in it.

Creating the IAM Role for Batch Operations

Here, I’m assuming you are familiar with creating IAM roles.

I won’t give screenshots for all steps required to create the IAM role.

From the IAM console, create a new IAM role.

Choose any service to use the role (it’s not important, as we’ll soon overwrite the trust policy for this role):

[Screenshot: Choose any service]

Here, I chose EC2, but it can be any other (Lambda, S3, etc.).

Don’t choose any specific permissions for this role yet.

Once the role is created, update the role’s Trust Relationship to:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "batchoperations.s3.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

For permissions, create a new inline policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObjectTagging",
        "s3:PutObjectVersionTagging"
      ],
      "Resource": "arn:aws:s3:::spgingras-batch-test/*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:GetObjectVersion",
        "s3:GetBucketLocation"
      ],
      "Resource": [
        "arn:aws:s3:::spgingras-batch-test/manifest.csv"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetBucketLocation"
      ],
      "Resource": [
        "arn:aws:s3:::spgingras-batch-test/*"
      ]
    }
  ]
}

Be sure to replace “spgingras-batch-test” with your own bucket’s name.

Now save the policy. The IAM role is ready to be used.
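If you prefer scripting over clicking, the same role can be sketched in Python. The two functions below build the trust policy and the inline policy from this article for any bucket name. The last function assumes boto3 is installed and AWS credentials are configured, and the policy name "batch-operations-tagging" is a placeholder of mine. Treat it as a starting point rather than the official way to do it.

```python
import json


def trust_policy():
    """Trust policy that lets S3 Batch Operations assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "batchoperations.s3.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def permissions_policy(bucket, manifest_key="manifest.csv"):
    """Inline policy from the article, parameterized by bucket name."""
    all_objects = f"arn:aws:s3:::{bucket}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:PutObjectTagging", "s3:PutObjectVersionTagging"],
                "Resource": all_objects,
            },
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:GetObjectVersion", "s3:GetBucketLocation"],
                "Resource": [f"arn:aws:s3:::{bucket}/{manifest_key}"],
            },
            {
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:GetBucketLocation"],
                "Resource": [all_objects],
            },
        ],
    }


def create_batch_role(role_name, bucket):
    """Create the role and attach the inline policy (assumes boto3 + credentials)."""
    import boto3  # imported here so the policy builders work without boto3

    iam = boto3.client("iam")
    iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(trust_policy()),
    )
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName="batch-operations-tagging",  # placeholder name
        PolicyDocument=json.dumps(permissions_policy(bucket)),
    )


# Print the inline policy for the demo bucket without calling AWS.
print(json.dumps(permissions_policy("spgingras-batch-test"), indent=2))
```

Because the trust policy is passed directly to `create_role`, there is no "choose any service, then overwrite" step in this version.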

We’re now set to create our first job.

Creating our First Job

What we want Batch Operations to help us with is adding a tag to every object in the bucket.

If you have ever done this before, you’ll know that it can be a pain in the butt to update tags on millions of S3 objects.

Thankfully, it can be done in a snap using Batch Operations.

From the Batch Operations console, click the “Create Job” button:

[Screenshot: Go ahead, just click it]

In the first step, choose “CSV” (1) as the Manifest format.

Also, enter the path to your manifest file (2) (mine is s3://spgingras-batch-test/manifest.csv):

[Screenshot: The first screen]

Then, click “Next”.

On the second screen you will decide what operation to run on the S3 objects.

Choose “Replace all tags” (1), and add new tags to the list (2).

I chose to add the “type” and “environment” tags, but you can choose anything you want:

[Screenshot: Here, you decide which tags to apply to the S3 objects]

Note that this will replace all tags on all objects in the manifest.

Also, it’s pretty cool that at some point in the future, you’ll be able to invoke Lambda functions on your S3 objects!

Once you’re done, click “Next”.

On the following screen, you will have to choose the IAM role you have created previously.

Remember, this role will be used by Batch Operations to play with your bucket.

For this example, I have named the IAM role simply “batch-role”.

Uncheck “Generate completion report” (1) (you don’t need that for the demo) and pick the IAM role from the dropdown (2):

[Screenshot: Uncheck “Generate completion report” and select the previously created IAM role]

Now, click “Next”.

On the following screen, review the details to make sure everything is OK, and click “Create job”.

The job is now created, and we can run it.
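The console steps above map fairly directly onto a single API request. As a rough sketch, here is how the same job could be described for the `create_job` call of boto3's `s3control` client. I'm assuming boto3 is installed and credentials are configured. The account ID, ETag, and tag values below are placeholders, not values from my setup, and the function names are my own. The manifest's ETag can be read from the object's properties after upload.

```python
import uuid


def build_tagging_job_request(account_id, bucket, manifest_key,
                              manifest_etag, role_arn, tags):
    """Build keyword arguments for s3control.create_job (replace-all-tags job)."""
    return {
        "AccountId": account_id,
        # Keep the manual confirmation step, as in the console flow.
        "ConfirmationRequired": True,
        "Operation": {
            "S3PutObjectTagging": {
                "TagSet": [{"Key": k, "Value": v} for k, v in tags.items()]
            }
        },
        # Matches unchecking "Generate completion report".
        "Report": {"Enabled": False},
        "Manifest": {
            "Spec": {
                "Format": "S3BatchOperations_CSV_20180820",
                "Fields": ["Bucket", "Key"],
            },
            "Location": {
                "ObjectArn": f"arn:aws:s3:::{bucket}/{manifest_key}",
                "ETag": manifest_etag,
            },
        },
        "Priority": 10,
        "RoleArn": role_arn,
        "ClientRequestToken": str(uuid.uuid4()),
    }


def submit_job(request):
    """Send the request to AWS and return the new job's ID (assumes boto3)."""
    import boto3  # imported here so the builder works without boto3

    return boto3.client("s3control").create_job(**request)["JobId"]


# Placeholder values for illustration only.
request = build_tagging_job_request(
    account_id="123456789012",
    bucket="spgingras-batch-test",
    manifest_key="manifest.csv",
    manifest_etag="example-etag",
    role_arn="arn:aws:iam::123456789012:role/batch-role",
    tags={"type": "example", "environment": "example"},
)
print(request["Manifest"]["Location"]["ObjectArn"])
```

Keeping the request builder separate from `submit_job` makes it easy to inspect the request before anything is sent to AWS.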

Running Your Job

Now that the job is created, it’s time to run it.

From the Batch Operations console, click on the job’s ID:

[Screenshot: Find your job in the Batch operations console and click on the job’s ID]

In the job’s description screen, click on the “Confirm and run” button:

[Screenshot: Hit that button to start the magic]

In the next screen, confirm the details and click “Run job”.

Now, go back to the Batch Operations console.

Wait until your job’s status (1) is “Complete”.
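If you would rather not watch the console, the wait can be scripted too. Below is a small polling sketch. The `describe_job` argument is expected to behave like the `describe_job` method of boto3's `s3control` client, which returns the status under `["Job"]["Status"]`. The function name, the polling defaults, and the choice of which statuses count as final are my own assumptions.

```python
import time

# Statuses after which the job will not change anymore (my assumption).
TERMINAL_STATUSES = {"Complete", "Failed", "Cancelled"}


def wait_for_job(describe_job, account_id, job_id,
                 poll_seconds=10, max_polls=60, sleep=time.sleep):
    """Poll describe_job until the job reaches a terminal status, then return it."""
    for _ in range(max_polls):
        response = describe_job(AccountId=account_id, JobId=job_id)
        status = response["Job"]["Status"]
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"Job {job_id} did not finish after {max_polls} polls")
```

With boto3 installed, a call could look like `wait_for_job(boto3.client("s3control").describe_job, account_id, job_id)`. Passing the function in as an argument also lets you try the loop with a fake before pointing it at AWS.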

Spam that refresh button (2) if needed:

[Screenshot: Refresh your job’s status until it's marked as Complete]

Now that the job is completed, go back to your bucket.

Open the Properties pane of one of the objects:

[Screenshot: Tags are all set!]

You’ll notice that all tags of the object have been updated.

The same will be true for every other object you included in the manifest file.

Wrap Up

Using S3 Batch Operations, it’s now pretty easy to modify S3 objects at scale.

Simply select files you want to act on in a manifest, create a job and run it.

No servers to create, no scaling to manage.

With this new feature of S3, here are some ideas of tasks you could run:

- copy S3 objects in bulk from one bucket to another
- send media files to Elastic Transcoder
- retroactively update tags on old S3 objects
