A Closer Look: Amazon S3
One of the most mature of Amazon's web services, Amazon Simple Storage Service (S3) provides virtually unlimited data storage. That's right: you can upload as much data as you'd like, and it will be held on their machines with all the network capacity you could ever want and with redundancy built in. Hard drive failures are among the most common causes of server downtime, and Amazon has taken on the burden of managing all those devices and the failures that go with them. As the name implies, the service is designed for simple access, so you can't do funky things like mount the virtual filesystem directly.
I've been using S3 for over a year and I haven't had any reliability issues with it. Others have had brief outages but they were mostly when the service was first introduced. I'm quite happy with S3 but there's one missing feature that keeps it from being the ultimate simple storage service: range-PUT.
Suppose I've got a file on S3 and I want to update a small part of it. Without range-PUT, I have to transfer the entire file again using the HTTP PUT method to store it on the remote host. With the Content-Range header, I could specify just the range of bytes that changed within the file and transfer only that portion. This feature would save a lot of bandwidth (and, consequently, money) when files are frequently modified in part.
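To make this concrete, here's a rough sketch of what such a request might look like in Python. To be clear, S3 does not accept Content-Range on PUT today; the host and key are placeholders, and I've left out S3's authentication headers:

# Hypothetical: S3 does not actually accept Content-Range on PUT.
# Host/key are placeholders and the S3 auth headers are omitted.
import http.client

def range_put(host, key, data, offset):
    """Overwrite len(data) bytes of the object at `key`, starting at `offset`."""
    conn = http.client.HTTPSConnection(host)
    end = offset + len(data) - 1
    conn.request("PUT", "/" + key, body=data, headers={
        # "bytes START-END/*": the byte span being replaced; '*' means
        # the client makes no claim about the total object size.
        "Content-Range": "bytes %d-%d/*" % (offset, end),
    })
    return conn.getresponse()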
Of course, supporting Content-Range opens up a can of worms. What happens if the file doesn't exist and the start of my range isn't offset 0? What if the file does exist but the start offset is beyond the end of the file (i.e. not a simple append)? I can think of two solutions that seem reasonable: either return an error, or create the file if it doesn't exist and zero-pad any holes. The former would be easier to implement, while the latter would behave like Linux sparse files.
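Here's the zero-pad behaviour in miniature, as I'd imagine a server implementing it; this is a sketch of the semantics, not anything S3 actually does:

def apply_range_put(existing, offset, data):
    # Write `data` at `offset`, zero-filling any hole past the old
    # end-of-file (the sparse-file behaviour described above).
    if offset > len(existing):
        existing += b"\x00" * (offset - len(existing))
    return existing[:offset] + data + existing[offset + len(data):]

# Writing at offset 8 of a 4-byte object leaves a 4-byte hole of zeros:
assert apply_range_put(b"abcd", 8, b"xy") == b"abcd\x00\x00\x00\x00xy"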
There are two major classes of application that range-PUT would suit. The first is applications that only ever append to the end of a file. Log files fit into this category but, more importantly, so does resuming broken transfers. When uploading large files (S3 supports objects of up to 5GB), I've found that my connections often get dropped, so if I could just append to an existing file, I could write an upload tool that auto-resumes. The second class is applications that update only part of a file, most often to change some file metadata. For example, if I modify the metadata for my MP3 file, I'd rather upload the few changed bytes than the whole MP3 again. The music is the same; it's just the metadata that has changed. The problem is even worse with video files.
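With range-PUT in hand, an auto-resuming uploader becomes almost trivial. This sketch reuses the hypothetical range_put() from above; again, authentication is omitted, and the host, key, and helper names are all my own placeholders:

import http.client

CHUNK = 1 << 20  # upload 1 MiB per request

def remote_size(host, key):
    # HEAD tells us how many bytes of the object already made it to S3.
    conn = http.client.HTTPSConnection(host)
    conn.request("HEAD", "/" + key)
    resp = conn.getresponse()
    return int(resp.getheader("Content-Length", "0")) if resp.status == 200 else 0

def resume_upload(host, key, path):
    offset = remote_size(host, key)
    with open(path, "rb") as f:
        f.seek(offset)
        while True:
            chunk = f.read(CHUNK)
            if not chunk:
                break
            resp = range_put(host, key, chunk, offset)  # hypothetical call from above
            if resp.status >= 300:
                raise IOError("upload failed at offset %d" % offset)
            offset += len(chunk)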
S3 is a fantastic storage service. It's reliable, it's cheap, and it takes away the hassle of managing your own hardware or creating a highly-available, redundant persistent store. If S3 supported range-PUT, it would save a huge amount of bandwidth resulting in an even lower cost of operation.
5 Comments:
Does S3 support content-range GET?
By Charlie, At April 5, 2008 2:08 PM
Thanks for the post, Charlie. Yes, S3 supports range-GET.
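It's the standard HTTP Range request header on GET. A quick sketch in Python; the bucket and key are made up, and auth headers are omitted, so as written this only works against a public object:

import http.client

# Range-GET works today: S3 honours the standard HTTP Range header on GET.
conn = http.client.HTTPSConnection("mybucket.s3.amazonaws.com")
conn.request("GET", "/bigfile.bin", headers={"Range": "bytes=0-1023"})
resp = conn.getresponse()   # expect "206 Partial Content"
first_kb = resp.read()      # only the first 1024 bytes come back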
By sharvil, At April 5, 2008 3:24 PM
I completely agree with this, although the relatively high cost of a PUT compared to a GET (10x the price) needs to come down for it to be effectively used by logs.
Have you suggested this at all to the Amazon people?
By Janakan, At June 9, 2008 3:51 PM
There's a thread on the Amazon developer forums about this feature from June 2006 but it never got anywhere. I've posted again to check on the status so let's keep our fingers crossed and hope that we can see some movement in this area.
The URL of the discussion is: http://developer.amazonwebservices.com/connect/thread.jspa?messageID=92534
By sharvil, At June 13, 2008 3:28 PM
How do you reckon getdropbox.com does it? I have come across other backup solutions built on top of S3 that claim differential backups.
Are they simply storing the entire file plus a difference for each subsequent update and recreating the updated file from those? There has to be a better way.
By aleemb, At July 12, 2008 5:10 AM
I don't have all the details on getdropbox.com since I don't have an account there but from the video I would assume that they're using EC2, S3, and a client application to do differential backups.
It probably works like a standard revision control system with binary diffs, using xdelta or rsync. When the client application notices a change in the filesystem, it computes a diff and sends it to the EC2 node along with the revision number.
The EC2 instance acts as a mediator in a publish/subscribe model: on startup, a client subscribes to an account and listens for updates. When the EC2 instance receives an update from a client, it stores the diff in S3 and sends all listeners the new revision number and the filename to fetch from S3.
There's probably another process running on EC2 that rolls up the old diffs if there are too many revisions (e.g. if there are more than 100 revisions, apply the first 90 to the file and delete those diffs).
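Concretely, the flow might look something like this toy, in-memory sketch. The "S3 store" is just a dict, a "diff" is a list of (offset, bytes) patches standing in for a real xdelta/rsync delta, and every name here is my invention:

def apply_diff(data, diff):
    # Apply each (offset, replacement_bytes) patch in order. A real system
    # would decode an xdelta- or rsync-style binary delta instead.
    for offset, patch in diff:
        data = data[:offset] + patch + data[offset + len(patch):]
    return data

class Mediator:
    """Plays the EC2 node: stores diffs, bumps revisions, notifies listeners."""
    def __init__(self, store):
        self.store = store      # stand-in for S3
        self.listeners = []     # callbacks registered by subscribed clients
        self.revision = {}      # filename -> latest revision number

    def subscribe(self, callback):
        self.listeners.append(callback)

    def publish(self, filename, diff):
        rev = self.revision.get(filename, 0) + 1
        self.revision[filename] = rev
        self.store[(filename, rev)] = diff
        for notify in self.listeners:
            notify(filename, rev)   # listeners fetch (filename, rev) from the store

def roll_up(base, diffs, keep=10):
    # Fold the oldest diffs into the base copy so only `keep` recent
    # revisions remain as separate diff objects.
    while len(diffs) > keep:
        base = apply_diff(base, diffs.pop(0))
    return base, diffs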
This is just a guess as to how it's actually done, but it seems like a reasonable approach. I'm not sure how conflicts are resolved in the case of concurrent writes, but this looks like a utility for the "average user", so concurrent writes to the same file are unlikely.
By sharvil, At July 16, 2008 1:01 PM