I want to build a site similar to eBay (a mini version of it, probably on a LAMP stack, unless you guys suggest something else), and I'm wondering how they built their system. The part I understand least is how they manage their categories. They presumably have one piece of code for search, one for posting items for sale, and one for displaying the items. But how do they create and store the template for each category? And what is the database structure behind their setup? Finally, they have so many categories and sub-categories. Let's say somebody posts an item inside the following category (which is most likely how eBay ended up adding categories in the first place): Motors > Parts & Accessories > Racing Parts

A few days later, people request more sub categories under "Racing Parts":

  • Accessories
  • Auto Racing Parts
  • Fasteners, Fluids & Gaskets
  • Kart Racing Parts
  • Safety Equipment
  • Other

So now they have a new level under Racing Parts that looks like this:

  • Motors > Parts & Accessories > Racing Parts > Accessories
  • Motors > Parts & Accessories > Racing Parts > Fasteners, etc.

What happens to the existing listings that were posted before the new sub-categories were added? Do they get moved to a sub-category? Does eBay force new items to be listed in the sub-categories and remove the old posting form for "Racing Parts"? If they do, what happens when the sub-category a user needs is missing? The user may get confused and not post, and then eBay loses money. And if they don't remove the general Racing Parts posting form, users will post in a category that is too generic, and it becomes difficult to use the "Refine search" option, because all the forms have different fields that eBay could filter by.

If you guys have any ideas, please let me know. I'm really confused about how they do it and would really like to understand :)

+3  A: 

Here is Randy Shoup on eBay's Architecture

He mainly talks about scalability, availability, manageability, etc. The schema is something that you have to devise on your own, based on your specific requirements. Slides

From his chat transcript

"Is it even a relational database or is it really different?

It is very different. It's a search engine like Google or Yahoo! developed by the same people that developed the AltaVista search engine, and as with many search engines, it's developed on similar principles, which is that it's an inverted index. There's a set of documents with IDs, keywords are indexed into those documents, and query operations happen by intersecting lists or vectors of those keywords, very simply, and there's a lot more detail about how that works. The challenge for -- just as an aside, the challenge for an eBay-style search engine is that our users expect the search engine to be updated in essentially near real-time. When somebody bids on an item that changes the price, and price is a filter that people are very interested in querying on. So it actually means that the style -- the sort of classic web search engine style of "you build the index in a kind of batch mode and then upload it to the search engine" is something that doesn't really work for us. It needs to be a lot more real-time. So I will talk a little bit about how that real-time system works in my asynchrony section, but anyhow, to finish the thought on scalability for search, the idea is that the search engine can be horizontally split. So there is this overall search index of whatever size it is. We divide it up into chunks of ten or twenty or sixty or hundred, and divide the infrastructure that way. And then we have an aggregator piece, which now does do scatter/gather over all those different parts of the index. So somebody queries for "iPod" or "Mickey Mouse" or "Wii" and the aggregator sends the query to each one of the different splits or shards and gets the results back and aggregates them and sends them back to the user."
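To make the quote a bit more concrete, here is a rough Python sketch of the two ideas he describes: an inverted index queried by intersecting keyword lists, and an aggregator doing scatter/gather over shards. This is only an illustration, not eBay's actual code. The class names (`Shard`, `Aggregator`), the modulo sharding rule, the in-memory dictionaries and the sequential loop over shards are all my own assumptions; a real system would query the shards in parallel over the network.

```python
from collections import defaultdict


class Shard:
    """One horizontal slice of the index (hypothetical, in-memory)."""

    def __init__(self):
        self.postings = defaultdict(set)  # keyword -> set of item IDs
        self.prices = {}                  # item ID -> current price

    def index(self, item_id, title, price):
        self.prices[item_id] = price
        for word in title.lower().split():
            self.postings[word].add(item_id)

    def update_price(self, item_id, price):
        # Applied immediately, so the next query sees the new price
        # (the "near real-time" requirement from the quote).
        self.prices[item_id] = price

    def search(self, query, max_price=None):
        words = query.lower().split()
        if not words:
            return []
        # Intersect the posting lists of all query keywords.
        lists = [self.postings.get(w, set()) for w in words]
        hits = set.intersection(*lists)
        return [
            (item_id, self.prices[item_id])
            for item_id in hits
            if max_price is None or self.prices[item_id] <= max_price
        ]


class Aggregator:
    """Sends each query to every shard and merges the results."""

    def __init__(self, num_shards):
        self.shards = [Shard() for _ in range(num_shards)]

    def _shard_for(self, item_id):
        # Assumed sharding rule: item ID modulo number of shards.
        return self.shards[item_id % len(self.shards)]

    def index(self, item_id, title, price):
        self._shard_for(item_id).index(item_id, title, price)

    def update_price(self, item_id, price):
        self._shard_for(item_id).update_price(item_id, price)

    def search(self, query, max_price=None):
        results = []
        for shard in self.shards:  # scatter (sequential here)
            results.extend(shard.search(query, max_price))
        # Gather: merge and sort by price, then by item ID.
        return sorted(results, key=lambda r: (r[1], r[0]))


agg = Aggregator(num_shards=3)
agg.index(1, "Apple iPod nano 8GB", 60.0)
agg.index(2, "Apple iPod classic 80GB", 120.0)
agg.index(3, "Mickey Mouse watch", 25.0)
agg.index(4, "iPod nano case", 5.0)

print(agg.search("ipod nano"))                   # [(4, 5.0), (1, 60.0)]
agg.update_price(1, 130.0)                       # a new bid raises the price
print(agg.search("apple ipod", max_price=125))   # [(2, 120.0)]
```

Note that the price lives next to the index instead of being baked into a batch-built file, which is what lets a bid show up in price-filtered searches right away.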

CodeToGlory
great, thanks a lot! Sounds really interesting.
Kentor