I'm in a bit of a pickle here. I currently have a website hosted in a shared hosting environment by a third party hosting provider. As such, I do not have root access to the IIS server that this website is on.

I currently have a directory on the site, such as:

mysite.com/myfiles

The "myfiles" directory currently has a lot of files in it. So many files that it is putting me over the disk space allotment at my host. There are a couple things to now consider:

  1. The host does not have a more generous plan for us to upgrade to. They are only willing to charge us (very high) overage fees.

  2. We need to remain with this host until the end of our contract with them, which is almost a year away.

I would like to take the contents of this directory and put it on Amazon S3, which would relieve the disk space strain on the hosting account. The only issue is that the URLs to the files need to remain the same!

So for instance, if an external website links to mysite.com/myfiles/image.jpg, I want the image on that site to continue working without a hitch.

Is there any possible way to achieve this?

A: 

You may find what you need in the Virtual Hosting of Buckets documentation for Amazon S3. You can customise the hostname used to access your S3 files: for example, a bucket named s3.mysite.com can be served from http://s3.mysite.com/ once a DNS CNAME record points that hostname at s3.amazonaws.com.

Greg Hewgill
+4  A: 

I'd recommend creating an S3 bucket (and maybe a CloudFront distribution that sits over it), filled with a folder layout that corresponds to your existing site (for future migration). Then create a CNAME entry in your DNS to give your bucket/distribution a friendly name (e.g. s3.my.domain).

Then add a URL rewriter to your existing site that forwards requests for `http://my.domain/myfiles/xxx` to the matching S3 URL, e.g. `http://s3.my.domain/myfiles/xxx`.
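To make this concrete, here is a minimal sketch of such a rewriter, assuming the shared host runs ASP.NET and allows a Global.asax; the `s3.my.domain` hostname and `/myfiles/` prefix are the examples from above:

```csharp
// Global.asax.cs -- minimal sketch only. Assumes ASP.NET handles the request;
// s3.my.domain is the CNAME described above.
using System;
using System.Web;

public class Global : HttpApplication
{
    protected void Application_BeginRequest(object sender, EventArgs e)
    {
        string path = Request.Url.AbsolutePath;

        // Forward anything under /myfiles/ to the S3-backed hostname.
        if (path.StartsWith("/myfiles/", StringComparison.OrdinalIgnoreCase))
        {
            // 301 so browsers, caches, and crawlers learn the new location.
            Response.StatusCode = 301;
            Response.AddHeader("Location", "http://s3.my.domain" + path);
            Response.End();
        }
    }
}
```

One caveat, discussed in the last answer below: on many shared IIS 6 hosts, requests for static files such as .jpg never reach the ASP.NET pipeline at all, so this only helps if the host routes all requests through ASP.NET.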

When your hosting contract ends, change DNS to point your root and www entries at the bucket/distribution, or at another host as required.

I'm currently using CloudFront to geographically cache static content for one of my businesses and it works great; zero downtime so far (more than 12 months).


Update, June 2010: CloudFront has been excellent, and much cheaper than the previous hosting arrangement. We're currently serving around 2.5M requests per month (~750GB) for only US$120.

devstuff
A: 

First question: why is it important for third-party sites to have unbroken URLs to your site? Are they paying you for content? If they're not paying you for content, is there some benefit that you get by giving them content for free? This is a business decision, and perhaps you come to the conclusion that it's worth paying the overage fees to provide that content.

Second question: how many URLs are actually linked from third-party sites? You could take the time to keep those URLs available, and switch everything else to S3 hosting. A Google "link:" query can help answer this.

So, on to solutions. The first works if you have the ability to create 301 redirects: simply set up a redirect for every URL that you want to move.

The second solution is a reverse proxy, in which URLs on mysite.com are mapped to mysite.s3.amazonaws.com. I'm not sure that this is really a good solution: you'll be paying for the bandwidth to proxy the files twice over. Plus, if you have the ability to set up a reverse proxy, you have the ability to create 301 redirects.
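For illustration of that trade-off, here is a rough sketch of such a proxy as an ASP.NET handler (the `mysite` bucket name is hypothetical, and this assumes ASP.NET sees the request in the first place). Note that every byte travels twice, S3 to host and host to client, which is the bandwidth cost mentioned above:

```csharp
// S3ProxyHandler.cs -- sketch of the reverse-proxy option only.
using System.IO;
using System.Net;
using System.Web;

public class S3ProxyHandler : IHttpHandler
{
    public bool IsReusable { get { return true; } }

    public void ProcessRequest(HttpContext context)
    {
        // Map /myfiles/image.jpg to the same key in a hypothetical S3 bucket.
        string s3Url = "http://mysite.s3.amazonaws.com"
                     + context.Request.Url.AbsolutePath;

        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(s3Url);
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (Stream source = response.GetResponseStream())
        {
            context.Response.ContentType = response.ContentType;

            // Stream the object through to the client; this bandwidth is
            // paid for twice (S3 -> host, then host -> client).
            byte[] buffer = new byte[8192];
            int read;
            while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
            {
                context.Response.OutputStream.Write(buffer, 0, read);
            }
        }
    }
}
```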

Third solution: move your site en masse, and have the domain name map to S3 (via CNAME mapping). Yes, you end up paying for a hosting service that you no longer use. And if you use dynamic content, it's not going to work (but perhaps then you pay the $30/month for S3).

And finally, not a solution but a path forward: use a distinct domain name for additional static content. That domain name can be mapped via CNAME to an Amazon bucket.

kdgregory
It is important for third-party sites to have unbroken URLs to the site because a number of sites have linked back to our site, and some have actual images embedded in their pages (with our permission) that request the image from our server. We want to maintain these links and prevent these images from breaking. Why? To be considerate, I guess. The answer to the second question is that we have no way of knowing exactly where all of these external references are; I'm not satisfied with the scope or accuracy of Google's "link:" query.
Toilet Overflow
A: 

Move the files and set up a 404 handler on the directory. The 404 handler can transparently redirect the client to the S3 URL with a 301 or 302 status. The other recommendations to use URL rewriting likely won't work for non-.aspx files: you're on a shared hosting provider, and they typically don't support wildcard ASP.NET mappings.
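A sketch of this approach, assuming the host's control panel lets you point the IIS custom 404 page at an .aspx file, and assuming IIS 6's default behaviour of appending the originally requested URL to the query string after a `404;` marker (the bucket name is hypothetical):

```csharp
// NotFound.aspx.cs -- sketch of the 404-redirect trick. Assumes IIS passes
// the original URL as "?404;http://mysite.com/myfiles/image.jpg".
using System;
using System.Web.UI;

public partial class NotFound : Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        string query = Request.Url.Query; // e.g. "?404;http://mysite.com/myfiles/image.jpg"

        int marker = query.IndexOf("404;", StringComparison.Ordinal);
        if (marker >= 0)
        {
            // Recover the path of the URL the client actually asked for.
            Uri original = new Uri(query.Substring(marker + 4));
            if (original.AbsolutePath.StartsWith("/myfiles/",
                    StringComparison.OrdinalIgnoreCase))
            {
                Response.StatusCode = 301;
                Response.AddHeader("Location",
                    "http://mysite.s3.amazonaws.com" + original.AbsolutePath);
                Response.End();
                return;
            }
        }

        Response.StatusCode = 404; // genuine 404 for everything else
    }
}
```

Because the redirect is driven by IIS's own error handling rather than by an ASP.NET mapping, it works for .jpg and other static files even without wildcard mappings.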

Chris