I have over 10 million small static HTML files (about 10 KB each), and I am a bluehost.com user. Bluehost has a limit of 50,000 files. They sent me an email warning that if I didn't delete the files within 30 days, they would disable my account.

So I am looking for a free service to host my files. I considered Google App Engine, but it has an even stricter limit: no more than 1,000 files (each no larger than 1 MB). It also seems that I could upload the files to code.google.com, which provides a free project hosting service.

Any good suggestions? I would prefer a free or cheap service, and it should have a programming interface for uploading and downloading files. Thank you in advance.

A: 

You can certainly do this on a Linode (cheapest plan $20/month). But I think that might be overkill. Amazon S3 charges by the gigabyte, so you wouldn't pay very much for that. As far as straight web hosting goes, you've got me - I don't know of a provider that will let you do that to their poor directories.

Borealid
10 million x 10 KB is about 100 GB, so I think that Linode is out of the question, and S3 sounds like the best choice. (It would be about $15 for a month of storage, plus $0.01 per 1,000 requests.)
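
As a quick check of that arithmetic, here is a minimal sketch. The function names are made up for illustration, and the 0.15 figure is not a quoted price: it is only the per-GB rate that the $15 estimate implies for 100 GB.

```python
# Back-of-the-envelope sizing for the numbers quoted in this thread.
FILE_COUNT = 10_000_000       # "over 10 million" files
FILE_SIZE_BYTES = 10 * 1000   # "10K each", taken here as 10 KB (decimal)


def total_gigabytes(file_count: int, file_size_bytes: int) -> float:
    """Total size in decimal gigabytes (1 GB = 10**9 bytes)."""
    return file_count * file_size_bytes / 10**9


def monthly_storage_cost(gigabytes: float, usd_per_gb_month: float) -> float:
    """Storage cost for one month at a flat per-GB rate."""
    return gigabytes * usd_per_gb_month


# ASSUMPTION: rate inferred from "about $15 for 100 GB", not an official price.
IMPLIED_USD_PER_GB_MONTH = 0.15

total_gb = total_gigabytes(FILE_COUNT, FILE_SIZE_BYTES)
print(f"total size: {total_gb:.0f} GB")
print(f"storage per month: ${monthly_storage_cost(total_gb, IMPLIED_USD_PER_GB_MONTH):.2f}")
```

Request charges come on top of this and depend on traffic, so they are left out of the sketch.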
Regent
Oh, 10K each. I read that as 10B, which would make only 100MB. Heh.
Borealid
+4  A: 

I would consider converting all the files into a database and coding a small server side script to retrieve the data, then use some rewriting rules to redirect the visitor to the script.

Most web hosts nowadays offer some sort of server-side language and a database of some sort. Many also allow you to use .htaccess files to put your rewrite rules in.
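
To make the idea concrete, here is a minimal sketch in Python using SQLite, so that all pages live in one database file instead of millions of small files. The table name `pages` and the helper names are hypothetical, and a shared host may well offer PHP rather than Python; the shape of the script would be the same.

```python
import sqlite3
from typing import Optional


def open_store(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the page store. Use a file path for a real site."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages ("
        " path TEXT PRIMARY KEY,"   # the URL path the visitor requested
        " body BLOB NOT NULL)"      # the HTML, stored as UTF-8 bytes
    )
    return conn


def put_page(conn: sqlite3.Connection, path: str, html: str) -> None:
    """Insert or overwrite one page."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO pages (path, body) VALUES (?, ?)",
            (path, html.encode("utf-8")),
        )


def get_page(conn: sqlite3.Connection, path: str) -> Optional[str]:
    """Return the stored HTML for a path, or None if it does not exist."""
    row = conn.execute(
        "SELECT body FROM pages WHERE path = ?", (path,)
    ).fetchone()
    return None if row is None else row[0].decode("utf-8")


if __name__ == "__main__":
    store = open_store()
    put_page(store, "articles/1.html", "<html><body>Hello</body></html>")
    print(get_page(store, "articles/1.html"))
```

The rewrite rule would then hand the requested path to a script that calls something like `get_page` and returns a 404 when it gets `None`.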

Tangrs
This is a fine idea. +1. Every hosting company is probably going to cringe when hearing of 10 million files, and this is an elegant way around it.
Pekka
Except that most hosting companies have a limit on the size of the database, generally much more restrictive than the amount of hosting space.
Matt Ellen
Isn't the size of the database usually included in the disk usage?
Tangrs
If not, maybe the database could be a SQLite database which is stored on the disk
Tangrs
@Tangrs, I find that for cheap or free hosting options the size of the database is not related to the disk space offered. Of course, if the hosting company lets you install your own database, then that wouldn't matter, but I can't see a cheap/free hosting company doing that.
Matt Ellen
As said before, the OP can choose a web host that offers SQLite support. A SQLite database is stored on the disk as a single file, and it's supported by PHP and many other server-side scripting engines.
Tangrs
+1  A: 

Use zip and a frontend that unpacks the files (if needed).

10 million files means generated content. Don't store it. Just create the pages on demand.

[edit] Then you don't need to store the pages at all. There are data structures that let you reconstruct the original page while remaining quickly searchable and using little extra storage space.
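
For the zip-plus-frontend variant, here is a minimal sketch in Python using the standard `zipfile` module. The helper names are hypothetical, and the archive is built in memory only to keep the example self-contained; a real frontend would open an archive file on disk.

```python
import io
import zipfile
from typing import Dict, Optional


def build_archive(pages: Dict[str, str]) -> bytes:
    """Pack pages (name -> HTML) into one deflate-compressed zip archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, html in pages.items():
            zf.writestr(name, html)  # str data is encoded as UTF-8
    return buf.getvalue()


def read_page(archive_bytes: bytes, name: str) -> Optional[str]:
    """Return one page from the archive without unpacking the others."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        try:
            # Only the requested member is located and decompressed.
            return zf.read(name).decode("utf-8")
        except KeyError:  # raised when the name is not in the archive
            return None


if __name__ == "__main__":
    archive = build_archive({
        "1.html": "<html><body>first</body></html>",
        "2.html": "<html><body>second</body></html>",
    })
    print(read_page(archive, "2.html"))
```

A zip archive keeps a directory of its members, so the frontend can look up a single entry and decompress just that one.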

Stephan Eggermont
But if the site had heavy traffic, wouldn't that have a huge performance penalty?
Tangrs
No. It would provide a huge boost in performance. Remember, you can mostly return zipped results, and you will not be killed as fast by the disk access.
Stephan Eggermont
@Stephan Eggermont: What if you only need to download one of these files (10 kB), but are forced to download a ZIP (50 GB) containing them all? What if every user (10 per minute) needs to do this? (If I wanted to read one article on Wikipedia, I would become rather "pissed-off" if I had to download a ZIP of the entire encyclopedia in order to get my article.)
Andreas Rejbrand
@Andreas: Don't be silly. You can return parts of a zip file. That's why you need the frontend.
Stephan Eggermont
OK. I did not understand that this was what you meant. (I did not downvote, btw.)
Andreas Rejbrand
+2  A: 

10 million files made by hand? If they were made by a program, try moving that program into a dynamic web language like PHP.

PeterMmm
A: 

Do you need access to all the files all the time?

You could archive them into zip files, as text tends to compress quite well, and maybe that would give you the required space saving.
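
A rough way to check how much you would save before committing to this is to compress a sample of your pages and compare sizes. This is a minimal sketch using Python's `zlib`; the function name is made up, and the sample page is synthetic and very repetitive, so real pages will likely compress less well.

```python
import zlib


def compression_ratio(data: bytes, level: int = 6) -> float:
    """Compressed size divided by original size (lower is better)."""
    if not data:
        return 1.0  # nothing to compress
    return len(zlib.compress(data, level)) / len(data)


# ASSUMPTION: a stand-in page; replace it with a few of your own files.
sample = (
    "<html><body>"
    + "<p>Some repeated paragraph text.</p>" * 200
    + "</body></html>"
).encode("utf-8")

print(f"original: {len(sample)} bytes, ratio: {compression_ratio(sample):.3f}")
```

Note that this only reduces the space used. The file count limit is only helped if many pages go into each archive.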

Matt Ellen