Hi all: If I am storing news articles in a DB with different categories such as "Tech", "Finance", and "Health", would a distributed database work well for this system compared to an RDBMS? Each news item would have the article itself attached, as well as a few other items. Mainly, I am wondering whether querying would be faster with a distributed database.
Let's say I never have more than a million rows, and I want to grab the latest tech articles (published within the last 5 hours). I imagine that would be a map-reduce: first "Give me all tech articles" (possibly 10,000), then keep only the ones whose timestamp falls inside that 5-hour window.
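For comparison, here is a minimal sketch of how I picture the same query in a plain RDBMS, using Python's built-in sqlite3 module. The table name, column names, index name, and the `latest_articles` helper are all made up for illustration, and I am assuming that a composite index on (category, published_at) would let the database go straight to the recent rows instead of scanning every tech article.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def make_db():
    """Create an in-memory database with a hypothetical articles table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE articles (
            id           INTEGER PRIMARY KEY,
            category     TEXT    NOT NULL,
            published_at INTEGER NOT NULL,  -- Unix seconds, UTC
            title        TEXT    NOT NULL,
            body         TEXT    NOT NULL
        )
        """
    )
    # Assumption: this composite index serves "category = ? AND published_at >= ?".
    conn.execute(
        "CREATE INDEX idx_articles_category_time "
        "ON articles (category, published_at)"
    )
    return conn


def latest_articles(conn, category, now, hours=5):
    """Return titles in `category` published within `hours` of `now`, newest first."""
    cutoff = int((now - timedelta(hours=hours)).timestamp())
    rows = conn.execute(
        "SELECT title FROM articles "
        "WHERE category = ? AND published_at >= ? "
        "ORDER BY published_at DESC",
        (category, cutoff),
    ).fetchall()
    return [row[0] for row in rows]


# Made-up sample data, with a fixed "now" so the result is reproducible.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
conn = make_db()
samples = [
    ("Tech", now - timedelta(hours=1), "New chip announced"),
    ("Tech", now - timedelta(hours=9), "Old tech story"),
    ("Finance", now - timedelta(hours=2), "Markets rally"),
    ("Tech", now - timedelta(hours=4), "Browser update"),
]
conn.executemany(
    "INSERT INTO articles (category, published_at, title, body) "
    "VALUES (?, ?, ?, ?)",
    [(cat, int(ts.timestamp()), title, "") for cat, ts, title in samples],
)

print(latest_articles(conn, "Tech", now))
# → ['New chip announced', 'Browser update']
```

In this sketch the category filter and the time filter happen in one query, so there is no separate "fetch all tech articles, then filter" step.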
Am I approaching the problem in the right way, and would a distributed database even be the best solution? In a few years there might be 5 million items, but even then....