Hi guys,
Looking for some guidance. In a nutshell, I have a requirement to pull article content from specific sources for data analysis. We need to fetch the latest articles and store them in our database for later processing.
I'm not really sure of the best approach. Our current news retrieval code (for a newsfeed provider) is written in C and runs on UNIX. It basically uses cURL to fetch the feed and parses the XML for storage in a database.
But the solution I need now is different, since every website is obviously structured differently. Basically, I just want a cron job that calls something to fetch the latest articles from the relevant websites as required.
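For illustration, here is a minimal sketch of what such a cron-driven job might look like. It is written in Python purely for brevity, not because it matches the existing C code. Everything in it is an assumption: the `SITES` table, the `example.com` URL, the `headline` class, the `articles` table, and the SQLite storage are all hypothetical placeholders. The idea is just that each site gets its own small extraction rule, and already-seen URLs are skipped on later runs.

```python
import sqlite3
from html.parser import HTMLParser
from urllib.request import urlopen

# Hypothetical per-site rules: which tag/class marks an article link.
# Each real website would need its own entry here.
SITES = {
    "example": {
        "index_url": "https://example.com/news",
        "tag": "a",
        "cls": "headline",
    },
}


class LinkExtractor(HTMLParser):
    """Collect href values from <tag class="... cls ..."> elements."""

    def __init__(self, tag, cls):
        super().__init__()
        self.tag = tag
        self.cls = cls
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == self.tag and self.cls in classes and attrs.get("href"):
            self.links.append(attrs["href"])


def extract_links(html, rule):
    """Apply one site's rule to its index page and return article links."""
    parser = LinkExtractor(rule["tag"], rule["cls"])
    parser.feed(html)
    return parser.links


def store_new(conn, site, links):
    """Insert links not seen before; return only the new ones."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS articles (site TEXT, url TEXT PRIMARY KEY)"
    )
    new = []
    for url in links:
        cur = conn.execute(
            "INSERT OR IGNORE INTO articles (site, url) VALUES (?, ?)",
            (site, url),
        )
        if cur.rowcount == 1:  # 0 means the URL was already stored
            new.append(url)
    conn.commit()
    return new


def run_once(db_path="articles.db"):
    """One cron invocation: fetch each index page and record new links."""
    conn = sqlite3.connect(db_path)
    for site, rule in SITES.items():
        with urlopen(rule["index_url"], timeout=30) as resp:
            html = resp.read().decode("utf-8", errors="replace")
        print(site, store_new(conn, site, extract_links(html, rule)))
    conn.close()
```

A cron entry would then call `run_once()` on whatever schedule is needed. Fetching and storing the full article body for each new link would be a further per-site step that this sketch leaves out.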
Any ideas appreciated. I'm also currently looking at AutomationAnywhere as a possible quick solution, if it works for us.
Thanks!
Manoj