Can anyone explain how YouTube stores video-related information in its tables? What would the table structure be, what columns would the tables have, and how would they relate to each other?
Thanks in advance
There are only two ways to find out:
1) Hack into YouTube and find out yourself
2) Get a job at Google and hope you get to be on the YouTube-design team.
As for getting others to do this and tell you: I doubt that anyone has been able to pull (1) off. (Sure, there have been smaller hacks, but AFAIK nothing that would reveal what you ask for.) And anyone doing (2) is probably not allowed to tell you.
What I do wonder is why you want to know. Even if you knew, I doubt there would be a way to make use of it on YouTube itself, so the only use I can think of is rebuilding it on your own website. If that is indeed your goal, you will have to think of a database design yourself, I am afraid.
I believe they store files themselves on disk and keep track of them in a group of Excel files. So when a user requests a page, the appropriate Excel file is parsed, links to files are extracted and displayed along with the properties to the user.
Using Excel offers advanced functions like getting reports on load count and other usage statistics on files. For example, you can build graphics to show you the geographical distribution of daily reach for particular videos based on IP addresses of those requests.
Excel files themselves have a relatively simple structure and can be processed at very high speed, compared to the ubiquitous database solutions, where every single request requires many services to be invoked and to interoperate. That causes response lag and a reduced processing rate, which can be noticeable on high-load sites with millions of requests coming in.
Mastermind clearly has the more correct answer (I've voted him up), but for interest...
YouTube has an interesting database architecture that in an odd way reflected their eventually being taken over by Google.
As everyone knows, Google has an odd take on making reliable servers: instead of buying expensive high-reliability servers (with features like redundant power supplies), they use many cheap commodity machines and combine them with a software and storage architecture designed for failure.
YouTube mirrors that with their database system. They use MySQL and its infamous MyISAM tables for speed. Alone, this would be a recipe for disaster, as YouTube would end up posting the "sorry the database got corrupted you have to recreate your account" message more frequently than any other PHP-powered forum out there.
Instead, YouTube created a layer to duplicate data across several databases - not mirroring in the traditional sense, but more like a kind of redundant load balancing or striped RAID setup, where a record is redundantly stored in some but not all of the databases. This not only allows individual MySQL databases to crash (it's trivial to automate the deletion and recreation of those databases with MySQL), but also allows them to scale their database backend in ways monolithic databases cannot - they can simply add extra machines and let the system populate them with the excess data.
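To make the "stored in some but not all of the databases" idea concrete, here is a minimal sketch of one way such a layer could decide where a record lives. It is not YouTube's actual code: the function `replica_nodes`, the node names, and the choice of rendezvous hashing are all assumptions made for illustration.

```python
import hashlib


def replica_nodes(record_key: str, nodes: list[str], copies: int = 2) -> list[str]:
    """Pick `copies` distinct database nodes for a record (hypothetical helper).

    Uses rendezvous (highest-random-weight) hashing: every node gets a
    score derived from hashing the node name with the record key, and the
    record is stored on the top-scoring nodes.
    """
    def score(node: str) -> int:
        digest = hashlib.sha256(f"{node}:{record_key}".encode()).hexdigest()
        return int(digest, 16)

    # Highest scores first; keep only as many nodes as we want copies.
    return sorted(nodes, key=score, reverse=True)[:copies]


# Hypothetical node names; the video ID is the one quoted in this thread.
nodes = ["db1", "db2", "db3", "db4"]
print(replica_nodes("GHa93n0GjBU", nodes))
```

With this kind of scheme, losing one node still leaves a copy of each record on another node, and adding a machine only moves the records for which the new node happens to score highest.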
The videos are stored on disk and given unique IDs ('GHa93n0GjBU' is an Avatar clip). Then they'll have a simple table that contains details such as:
- Unique video ID
- User ID
- Date uploaded
- Length
- Description
- Maybe a list of related video IDs, stored in some sort of comma-separated list
Probably then a second table for comments, which links to the video by its unique ID.
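As a rough sketch of that guess (not YouTube's real schema), the two tables could look like the following, using Python's built-in sqlite3 module. All table and column names are hypothetical, and the sample row values are made up. One deliberate change from the guess above: related videos go in a separate link table instead of a comma-separated column, since that is easier to query and index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default

# Hypothetical schema; every name here is an assumption for illustration.
conn.executescript("""
CREATE TABLE videos (
    video_id       TEXT PRIMARY KEY,   -- e.g. 'GHa93n0GjBU'
    user_id        INTEGER NOT NULL,   -- uploader
    uploaded_at    TEXT NOT NULL,      -- ISO date string
    length_seconds INTEGER NOT NULL,
    description    TEXT
);
CREATE TABLE comments (
    comment_id INTEGER PRIMARY KEY,
    video_id   TEXT NOT NULL REFERENCES videos(video_id),
    user_id    INTEGER NOT NULL,
    posted_at  TEXT NOT NULL,
    body       TEXT NOT NULL
);
CREATE TABLE related_videos (
    video_id   TEXT NOT NULL REFERENCES videos(video_id),
    related_id TEXT NOT NULL REFERENCES videos(video_id),
    PRIMARY KEY (video_id, related_id)
);
""")

# Made-up sample data, only to show how the tables join.
conn.execute(
    "INSERT INTO videos VALUES (?, ?, ?, ?, ?)",
    ("GHa93n0GjBU", 1, "2010-01-15", 120, "Avatar clip"),
)
conn.execute(
    "INSERT INTO comments (video_id, user_id, posted_at, body) VALUES (?, ?, ?, ?)",
    ("GHa93n0GjBU", 2, "2010-01-16", "Nice clip"),
)

rows = conn.execute(
    "SELECT v.description, c.body FROM videos v "
    "JOIN comments c ON c.video_id = v.video_id WHERE v.video_id = ?",
    ("GHa93n0GjBU",),
).fetchall()
print(rows)
```

The comments table points back at the video through `video_id`, which is the "links to the video by its unique ID" relation described above.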