I have 6 servers with an aggregate storage capacity of 175TB, all hosting different sorts of media. A lot of the media is duplicated, with copies stored in different formats, so what I need is a library or something similar that I can use to read the tags in the media and decide whether a given file is the best copy available. For example, some of my media is Japanese content for which I have DVD rips and now Blu-ray rips. This content sometimes has "hardsubs", i.e. subtitles that are encoded into the video, and sometimes "softsubs", which are subtitles rendered on top of the raw video when it plays. I would like to be able to find all copies of a given title and compare them by resolution, by whether or not they have softsubs, and by audio format and quality.
Therefore, can anyone suggest a library I can incorporate into my program to do this?
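To make the comparison concrete, here is a rough sketch of the ranking I have in mind. It is written in Python only for brevity, and everything in it is an assumption on my part: the `MediaCopy` fields, the `AUDIO_RANK` preference order, and the priority of resolution over softsubs over audio are all hypothetical. The actual tag values would have to come from whatever library gets suggested.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MediaCopy:
    """One file on disk plus the tags I would want to read from it (hypothetical fields)."""
    path: str
    width: int
    height: int
    has_softsubs: bool
    audio_codec: str
    audio_bitrate_kbps: int


# Hypothetical preference order for audio formats; higher is better.
AUDIO_RANK = {"flac": 3, "dts": 2, "ac3": 2, "aac": 1, "mp3": 0}


def quality_key(copy: MediaCopy) -> tuple:
    """Sort key: pixel count first, then softsubs, then audio format, then bitrate."""
    return (
        copy.width * copy.height,
        copy.has_softsubs,
        AUDIO_RANK.get(copy.audio_codec.lower(), -1),  # unknown codecs rank last
        copy.audio_bitrate_kbps,
    )


def best_copy(copies: list) -> MediaCopy:
    """Return the copy with the highest quality key among copies of the same title."""
    return max(copies, key=quality_key)


# Made-up example data for three copies of the same episode.
copies = [
    MediaCopy(r"D:\dvd\show_ep01.avi", 720, 480, False, "mp3", 128),
    MediaCopy(r"E:\bd\show_ep01.mkv", 1920, 1080, True, "flac", 900),
    MediaCopy(r"F:\xbox\show_ep01.avi", 1280, 720, False, "ac3", 384),
]
print(best_copy(copies).path)
```

The part I am missing is the step that fills in those fields from the actual files, which is what I need the library for.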
EDIT: I forgot to mention, the distribution server mounts the other servers as drives and is running Windows Server, so I will probably code the solution in C#. Also, all the media is for my own legal use; I have so many copies because some of it is in other formats for other players. For example, I have some of my Blu-rays re-encoded to Xvid for my Xbox, since it can't play Blu-ray.
When this is done, I plan to open-source the code, since there doesn't seem to be anything like this already and I'm sure it can help someone else.