I'm building multi-server support for a file upload site I run. When images are uploaded, they are thumbnailed and stored on the main front-end server until cron runs (every 10 minutes) and moves them to storage servers. So for up to the first 10 minutes, they reside on, and are served from, the main front-end server.

When a file is uploaded, users are given embed codes: a thumbnail URL plus a link to the full-size image, which is an HTML page. So it might be something like http://www.domain.com/temp_content/filename.jpg, which links to http://www.domain.com/file-ID

Except that in 10 minutes, http://www.domain.com/temp_content/filename.jpg won't exist; it will be http://server1.domain.com/thumbs/filename.jpg

If the user grabbed the original code, the thumbnail will be broken.

I COULD move the file to its destination without cron, but that takes time and will lag the script until the move is complete. I also don't like having users trigger commands like that; I'd rather have the server run them at regular intervals.

Anything else I can do?

+3  A: 

You could use a mod_rewrite command in your .htaccess to check if the file in temp_content exists, and if it doesn't, have it redirect them to the new location.

Eric Petroelje
+1  A: 

Have you considered storing the image name and location in a database, and using a generic PHP script to serve the images based on those details?

Jonathan Sampson
So every thumb that's embedded in a page will have to query the DB to get its location? That would be disastrous.
Yegor
No. You should couple this with caching, of course.
Jonathan Sampson
Performance might not be that bad - this is what databases are designed for, and they are therefore generally faster than file-system access. From an engineering perspective, go with the simplest design and then tackle performance issues if they actually arise.
RichAmberale
A: 

Really, given your situation, the only option I can see is to give your users the actual URLs, since I'm sure you will be able to know them. You will then need to notify the users that they cannot actually use the link for 10 minutes.

In an ideal world, I would see you putting this file directly in its final resting place, given your need to allow users access to the link.

Mitchel Sellers
If I have 10 servers, I don't know where the file will go. Cron decides that at time of execution.
Yegor
A: 

How about some JavaScript?

<img src="http://www.domain.com/temp_content/filename.jpg"
     onerror="this.onerror=null; this.src='http://server1.domain.com/thumbs/filename.jpg'">
Detect
I won't know if it's server1, server16, or server9 until the file is moved.
Yegor
A: 

I don't see how this will be an issue. If you're having users upload files, their locations must be stored in a record somewhere, correct? And your pages must be checking these records when you generate them, right? Why not just add another field that specifies which server they're on? Then just make sure the cron task updates the record once it's done moving the file.
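A minimal sketch of that idea, assuming a hypothetical `uploads` table with a nullable `server` column (the table, column, and function names are made up for illustration; SQLite is used only to keep the example self-contained):

```python
import sqlite3

# Hypothetical schema: one row per upload; `server` stays NULL until cron moves the file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE uploads (filename TEXT PRIMARY KEY, server TEXT)")

def register_upload(filename):
    """Called by the upload script: the thumbnail still lives on the front-end server."""
    conn.execute("INSERT INTO uploads (filename, server) VALUES (?, NULL)", (filename,))

def mark_moved(filename, server):
    """Called by the cron task after the file has been moved to its storage server."""
    conn.execute("UPDATE uploads SET server = ? WHERE filename = ?", (server, filename))

def thumb_url(filename):
    """Build the thumbnail URL from the record when a page is generated."""
    row = conn.execute(
        "SELECT server FROM uploads WHERE filename = ?", (filename,)
    ).fetchone()
    if row is None:
        raise KeyError(filename)
    if row[0] is None:
        # Not moved yet: serve from the front-end server.
        return "http://www.domain.com/temp_content/" + filename
    return "http://" + row[0] + ".domain.com/thumbs/" + filename
```

This only covers pages you generate yourself; an embed code that was already copied elsewhere would still point at the temporary URL and would need a redirect on top.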

Spencer Ruport
I would suggest using a database, if one is available, rather than a file.
Raj More
A: 

This is the kind of thing that begs for a network hack. Basically, a custom routing layer that the domain points to and that forwards packets to the appropriate host. It's ugly.

In all honesty though, this is a great question for ServerFault, because they're the folks who would know how to set up the topology to enable this kind of thing.

Alex Gartrell
A: 

Use a database to map image names to locations. Modern databases do caching. If you find that performance is bad, you could also keep a simple hash-table cache in memory. With this you could store ~2 million name->location mappings in 500MB of RAM (assuming ~256 bytes per mapping).

The script that serves the files could either:

  1. Actually serve the file (read bytes from the location and send the bytes out to the client).
  2. Redirect the client to the actual location on a different server.
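As a rough illustration of the in-memory cache and the size estimate above, here is a sketch; `LocationCache` and its `lookup` callable are hypothetical names, and the lookup stands in for whatever database query you would actually run:

```python
# Back-of-the-envelope check of the estimate above (500 MB, ~256 bytes per mapping).
BYTES_PER_MAPPING = 256
CACHE_BYTES = 500 * 1024 * 1024
MAX_MAPPINGS = CACHE_BYTES // BYTES_PER_MAPPING  # 2,048,000, i.e. roughly 2 million

class LocationCache:
    """Hypothetical in-memory cache in front of a slower lookup (e.g. a database query)."""

    def __init__(self, lookup):
        self._lookup = lookup  # callable: filename -> location URL, or None if unknown
        self._cache = {}

    def get(self, filename):
        if filename in self._cache:
            return self._cache[filename]
        location = self._lookup(filename)
        # Only cache known locations; a file that has not been moved yet
        # may get a location later, so a miss must not be remembered.
        if location is not None:
            self._cache[filename] = location
        return location
```

The serving script would call `get()` and then either stream the file from that location or redirect the client to it.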
RichAmberale
A: 

I currently move all uploaded media on our website to Amazon S3/CloudFront after about 10 minutes, and do a combination of two things to redirect users to the new location.

For public assets (thumbs, etc.), we cache the parent item's definition with the new location of the media (e.g. server1.site.com/media/1.jpg).

For private assets, requests are made to a script that checks authentication, then issues a 302 redirect to the auth'd S3 URL.

Long story short: store the new location in memcache, have mod_rewrite pipe to a script on a 404 for the original file, then 302 redirect to the new location.
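The decision that fallback script makes could look roughly like the sketch below. It is not the setup described above: a plain dict stands in for memcache, and `resolve_thumb`, `temp_dir`, and `moved_locations` are illustrative names.

```python
import os

def resolve_thumb(filename, temp_dir, moved_locations):
    """Return (status, location) for a request to /temp_content/<filename>.

    `moved_locations` stands in for memcache: a dict of filename -> final URL
    that the move job fills in after it has relocated a file.
    """
    name = os.path.basename(filename)  # drop any directory components
    local_path = os.path.join(temp_dir, name)
    if os.path.isfile(local_path):
        return 200, local_path             # still on the front-end server
    if name in moved_locations:
        return 302, moved_locations[name]  # moved: redirect to the new home
    return 404, None                       # never uploaded, or no record of the move
```

In practice the web server would handle the 200 case itself and only hand the request to a script like this when the file is missing.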

Jason
A: 

Cron does not "decide" anything about the load balancing of the files across servers. Cron executes a command at an appointed time.

Where is the logic that actually does the load balancing of images across servers? Can the load balancing portion of that logic be done by the script that handles the upload? The cron job can still handle the actual moving of the image from the upload server to the final server.

This way the user could be presented with a "temp" location to verify the upload. After confirming the upload, the user would get a "final" link to the image, and a message about "may need to wait 10 minutes for the image to be 'live' at that location."
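A sketch of that split, assuming the upload script is allowed to pick the destination. The hash-based choice below is only a stand-in for whatever load-balancing logic you really use, and `SERVERS`, `assign_server`, and `handle_upload` are hypothetical names:

```python
import hashlib

# Hypothetical server list; the real choice might use load data instead of a hash.
SERVERS = ["server1", "server2", "server3"]

def assign_server(filename, servers=SERVERS):
    """Pick the destination at upload time so the final URL is known immediately."""
    digest = hashlib.sha256(filename.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]

def handle_upload(filename, pending_moves):
    """Queue the move for cron and return the (temp, final) thumbnail URLs."""
    server = assign_server(filename)
    pending_moves.append((filename, server))  # cron later moves the file to `server`
    temp_url = "http://www.domain.com/temp_content/" + filename
    final_url = "http://" + server + ".domain.com/thumbs/" + filename
    return temp_url, final_url
```

The cron job then only drains `pending_moves`; it no longer decides anything.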

semiuseless
The cron script acts as a load balancer: it looks at some pollers and decides the appropriate server that the file should be loaded on.
Yegor
+1  A: 

Very simple:

I don't like the paths for your files, so I changed them. ^_^

Create the link so it points to the main storage immediately: http://www.domain.com/file/filename.jpg

On the main server, use a ruleset like so:

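RewriteEngine On
# If /file/<name> is not on disk here, redirect to the copy in /temp_content/.
# (In server/vhost context, test %{DOCUMENT_ROOT}%{REQUEST_URI} instead.)
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^/?file/(.+)$ /temp_content/$1 [R=302,L]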

dallas marlow