tags:

views:

86

answers:

5

Hello,

I have a website where users can create albums and then add images to those albums. I then have a section on the page where I display all images added by the users. You can see it here: http://www.mojvideo.com/slike

The problem is that we have a few users who upload a lot (anything between 20 to 100) of images at once and they kinda "spam" the first few pages (especially as some images are almost identical). I'm now struggling to find a better way to display these images so that content from more users is shown. My database structure looks like this:

foto_albums: id | title | user_id

foto_images: id | foto_albums_id (reference to the album the image is in) | user_id | title

(the reason user_id is duplicated in foto_images is that I have a "special" personal album that isn't stored in the database, so if a user uploads an image to that album the foto_albums_id get's a value of 0, just thought I'd mention that to avoid any confusion)

I'm looking for ideas and some help implementing them...

+1  A: 

Why don't you randomize the pictures?

[WARNING Horrible performance!]Something like ORDER BY RAND()?[/WARNING]

soulmerge
randomizing isn't acceptable. This page serves as a browser and I need to display the latest images first...
Jan Hancic
A: 

is the issue only on the front page? can you display only the latest few images from each user on that page, using select top(x) and then display all of the user images on their individual page?

Jason
This page is a "browse" page so displaying only a couple of the latest images per user isn't acceptable.
Jan Hancic
+1  A: 

The dreaded is_private column :)

I don't know that there's an efficient way of doing what you're looking for without adding some extra columns to your schema. You could assign a random number to each image, then order by that (faster than ORDER BY RAND(), since it's not computed each time). You could add an upload timestamp, then order by (say) seconds * 60 + hours; although that looks strange, it should prevent clustering of same-upload images (supposing that parsing an image takes at least 0.5s, which is a somewhat reasonable assumption given current upload speeds and image file sizes). You could invert the index column (as in, string reversal) and order by that.

None of these fundamentally change the concept, but to get them to work quickly you'd need to alter your schema.

Have you considered modifying your presentation? Maybe a sample of a user's most recent (highest id?) images, followed by the necessary "read more below the fold." Something like:

[img_1/kyle]  [img_2/kyle]      [img_3/kyle]      "See all of Kyle's 50 pix"
[img_1/jan]   [img_1/soulmerge] [img_2/soulmerge] [img_1/jason]
[img_2/jason] [img_3/jason]     "See all of Jason's 72 pix"

Just a thought.

Edit: based on your comments, if you'd like to display the most recent on a per-album basis, well, it's highly database-dependent; I think some databases make this easy, but MySQL definitely does not. Here's an example of a way to accomplish this (in MySQL 5.0+):

SELECT
  i.id,
  i.user_id,
  i.title
FROM
  foto_images i,
  (
    SELECT
      foto_albums_id,
      MAX(id) AS image_id
    FROM foto_images
    WHERE foto_albums_id <> 0
    GROUP BY foto_albums_id
  ) tmp
WHERE
  i.foto_albums_id <> 0
  AND i.foto_albums_id = t.foto_albums_id
  AND (
    SELECT COUNT(*)
    FROM foto_image
    WHERE
      foto_albums_id = i.foto_albums_id
      AND id BETWEEN i.id AND t.image_id
  ) < @NUMBER_OF_PREVIEW_IMAGES_TO_SELECT
ORDER BY i.id;

Note that, somewhat dependent on your indexing and amount of data, this query can be highly inefficient. Possibly consider warehousing the information? An extra column, display_on_main_page, defined at upload termination?

kyle
Randomizing isn't an option, because latest images need to be listed first. The problem is that I don't really know what I wan't :) You'r last suggestion would work if it worked on a per album basis. But I have no idea how to write a query that would accomplish this...
Jan Hancic
What I mean by "per album basis" is that if one user uploads 50 images in one album and 50 in another, then there would be X images displayed from one album and X from the other. Not just X from one album.
Jan Hancic
I will try this and see how our server copes with it. That extra column sounds like a great idea, I will think about it.
Jan Hancic
Actually -- I can't believe I missed this -- display_on_main_page should be something more like album_index. Then you can order the images within an album based on that column, as well as eliminate all the subqueries in the SQL.
kyle
A: 

If you really want to display all images sorted by upload time and you don't want a single user's images to take up too much space in a single page, you need to limit the amount of images they are allowed to upload. Just disable the upload link when a user has already uploaded X images the last 24 hours (and also display a warning saying "you already uploaded X images the last 24 hours" if the user tries to bypass that rule)

tehvan
This could work. But it isn't practical. If you take one average album, a user uploads his images that he took on some event for instance, and that is usually around 50-60 images. I don't wan't to force this user to upload only half of those images and the other half the next day.
Jan Hancic
+1  A: 

i think you need to clarify what is your problem, and what you hope to accomplish as all of the responses thus far haven't been acceptable.

what exactly do you want to happen?

what exactly is happening now?

Jason
Check out dude's link, it's a nice addendum to his text (I was confused, too). It looks like the data is live, so you may not see it at firt, but paginate down a little bit and it will become apparent.
kyle