September 2010 Archives

I never paid much attention to the HTTP Vary header.  In fact, I've been fortunate enough to avoid it until now.  Well, it turns out that when you're configuring a high-performance reverse proxy, understanding the Vary header is absolutely crucial: it tells a cache which request headers (Accept-Encoding, for example) must match before a stored response can be reused, so it directly shapes your reverse proxy caching policies.

Here's an interesting problem I recently solved that dealt with Squid, Apache, and that elusive Vary response header ...

My first experience using Amazon Web Services for a production-quality project was fun and deeply interesting.  I've played with AWS a bit on my own time, but I recently had a chance to really sink my teeth into it and implement production-level code that uses AWS as a real platform for an upcoming web and mobile application.

Perhaps the most interesting, and frustrating, part of this project involved storing hundreds of thousands of objects in an AWS S3 bucket.  If you're not familiar with S3, it's Amazon's online storage web service.  The concept is simple: you create an S3 "bucket" then shove "objects" into the bucket, creating folders where necessary.  Of course, you can also update and delete objects.  If it helps, think of S3 as a pseudo online file system that's theoretically capable of storing an unlimited amount of data.  Yes, I'm talking exabytes of data ... theoretically ... if you're willing to pay Amazon for that much storage.

In any event, I created a new S3 bucket and eventually placed hundreds of thousands of objects into it.  S3 handled this with ease.  The problem, however, came when it was time to delete this bucket and all of the objects inside it.  Turns out, there is no native S3 API call that recursively deletes an S3 bucket, or renames it for that matter; a bucket has to be empty before S3 will delete it.  I guess Amazon leaves it up to the developer to implement such functionality?

That said, if you need to recursively delete a very large S3 bucket, you really have two options: use a tool like s3funnel, or write your own tool that deletes multiple objects concurrently.  Note that I say concurrently; otherwise you'll waste a lot of time waiting for a single-threaded delete to remove objects one at a time, which is horribly inefficient.  Well, this sounds like a perfect problem for a thread pool and, wouldn't you guess it, even a CountDownLatch!
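
To make the thread pool plus CountDownLatch idea concrete, here is a minimal sketch in Java.  It is not the code from the full post, and it does not talk to S3 at all: the `ConcurrentBucketCleaner` class, the `ObjectDeleter` interface, and the `deleteAll` method are hypothetical names invented for illustration, with `ObjectDeleter` standing in for whatever S3 client call removes a single object.  The sketch also assumes the full list of keys has already been fetched.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class ConcurrentBucketCleaner {

    /** Hypothetical stand-in for the S3 client call that deletes one object. */
    public interface ObjectDeleter {
        void delete(String key) throws Exception;
    }

    /**
     * Deletes every key using a fixed-size thread pool and blocks until all
     * delete tasks have finished. Returns the number of deletes that failed.
     */
    public static int deleteAll(List<String> keys, ObjectDeleter deleter, int threads)
            throws InterruptedException {
        final ExecutorService pool = Executors.newFixedThreadPool(threads);
        // One count per object; the latch opens once every task has reported in.
        final CountDownLatch latch = new CountDownLatch(keys.size());
        final AtomicInteger failures = new AtomicInteger();
        try {
            for (final String key : keys) {
                pool.submit(() -> {
                    try {
                        deleter.delete(key);
                    } catch (Exception e) {
                        failures.incrementAndGet();
                    } finally {
                        // Count down even on failure so await() cannot hang.
                        latch.countDown();
                    }
                });
            }
            // Block the calling thread until all deletes have been attempted.
            latch.await();
        } finally {
            pool.shutdown();
        }
        return failures.get();
    }
}
```

Counting down in a `finally` block is the important design choice here: a failed delete still releases the latch, so the caller learns about failures from the return value instead of waiting forever.  Once `deleteAll` returns zero, the now-empty bucket itself can be deleted.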

Continue reading for the code ...

Twitter (@markkolich)

About this Archive

This page is an archive of entries from September 2010 listed from newest to oldest.