I've been looking at the Freebase project for storing data. It seems to be a great place to store concrete, objective data like names, locations, and dates. Is it also a good place to store subjective data like opinions or ratings? Is there another or better open, semantic data store or strategy for storing and querying this kind of information?

Additionally, since the data is subjective, I can be sure that others will not agree with my opinion. How would I store the opinions of others alongside mine so that the crowd opinion is represented better?

Is Freebase the right place to store this type of data?

For example: a restaurant rating or a movie rating. The movie rating would probably be less time-sensitive than the restaurant rating. Any non-identifying information about the person who entered the data would be interesting for determining other factors and relationships.

A: 

Data is data. What you want to do is label the data as what it is: an opinion or a rating. A "fact" that could be inferred from such data, I suppose, is that most people held subjective opinion x about the topic.
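To make the labelling idea concrete, here is a minimal sketch. The `Rating` class, its field names, and the 1-5 scale are assumptions made up for illustration; nothing here is Freebase's schema or API. Each record is explicitly tagged as an opinion, and the aggregate "fact" is derived from the records rather than stored as if it were objective.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Rating:
    """One person's opinion about one subject (hypothetical record layout)."""
    subject: str           # what is being rated
    score: int             # subjective value on an assumed 1-5 scale
    kind: str = "rating"   # explicit label: this is an opinion, not a fact


def summarize(ratings):
    """Derive per-subject aggregates from records labelled as ratings."""
    by_subject = defaultdict(list)
    for r in ratings:
        if r.kind == "rating":
            by_subject[r.subject].append(r.score)
    return {
        subject: {"count": len(scores), "mean": mean(scores)}
        for subject, scores in by_subject.items()
    }


ratings = [
    Rating("Movie A", 4),
    Rating("Movie A", 5),
    Rating("Restaurant B", 2),
]
summary = summarize(ratings)
```

The individual opinions stay intact, so a different summary (a median, a per-year breakdown, and so on) can be computed later without re-collecting the data.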

altCognito
Thanks for the reply! Makes sense, but I'd really like to know if Freebase is the best place to store that data. Just as Wikipedia may not be the best place for movie reviews, Freebase may not be the best place for that data either. Also, subjective data could vary by multiple factors like locale, time, gender, age, etc., so a simple count or percentage may not capture the full picture of a crowd opinion.
Kevin Williams
A: 

I find designing/selecting data formats is very hard without an understanding of the questions I will be asking using that data. What purpose do you expect the data to be used for? Come up with some use cases and that may guide your search.

Storing attributed data is an open research topic, with development in (among other places) the intelligence community: these users obviously need to keep track of where information came from and who has added to it along the way, both to verify its reliability and to do things like track whether Secret information has been included by accident. That may be a good place to look.

Alex Feinman
Thanks for the response. I've added a couple of simple use cases (movie/restaurant reviews). I could build a relational database of my own, but my motivation is to do so in an 'open' and reusable format.
Kevin Williams
+1  A: 

The Semantic Web is more or less a variant of first-order logic, so the important part is to have a clear understanding of what each of your predicates means. This idea is very simple but applies to a wide variety of meaning representations; for example, it is behind the entity model of databases.

There should be no problem representing the information you mentioned in a Semantic Web representation. Just be sure to have a clear definition of what each of your predicates denotes, so that the meaning doesn't shift over time and leave you with an inconsistent representation.
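As a rough sketch of what "clearly defined predicates" could look like in practice, the snippet below stores a review as subject-predicate-object triples and rejects any predicate that has no written definition. The `ex:` names, the `PREDICATES` table, and the helper functions are all hypothetical; a real project would more likely use an existing vocabulary and an RDF library.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical vocabulary: every predicate carries a fixed, written definition.
PREDICATES = {
    "ex:reviewOf": "the subject is a review; the object is the item reviewed",
    "ex:hasScore": "the subject is a review; the object is its score, 1 to 5",
    "ex:reviewedOn": "the subject is a review; the object is an ISO 8601 date",
}


def add(store, subject, predicate, obj):
    """Append a triple, refusing predicates that have no definition."""
    if predicate not in PREDICATES:
        raise ValueError(f"undefined predicate: {predicate}")
    store.append(Triple(subject, predicate, obj))


def objects(store, subject, predicate):
    """Return every object recorded for the given subject and predicate."""
    return [t.obj for t in store
            if t.subject == subject and t.predicate == predicate]


store = []
add(store, "ex:review1", "ex:reviewOf", "ex:MovieA")
add(store, "ex:review1", "ex:hasScore", "4")
add(store, "ex:review1", "ex:reviewedOn", "2009-06-01")
```

Because the opinion is a node of its own (`ex:review1`), many reviews of the same item can coexist, and attributes such as the date attach to the review rather than to the movie.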

Genesereth's book is old but good if you are interested in reading about this in more detail. I think a lot of people who worked on the Semantic Web were involved in Douglas Lenat's Cyc project, which gradually shifted to a logic-based meaning representation over time.

http://www.amazon.com/Logical-Foundations-Artificial-Intelligence-Genesereth/dp/0934613311

The site for Cyc:

http://www.cyc.com/

Larry Watanabe
Thanks, I'll check out the book.
Kevin Williams