I have to write a parser in C for online blogs, along with various word-manipulation features.

I know how to parse/tokenise strings in C, but how would I, at run time, download a page's content as an HTML file to a local /tmp directory, so that I can read the information (the blog posts) into a string using file I/O?

Or, just grab the block of text directly from the page I am viewing...

My system could be either Ubuntu or Windows 7, so I don't think wget will cut it. Please help.

+7  A: 

Take a look at libcurl:

libcurl is a free and easy-to-use client-side URL transfer library, supporting [...] HTTP, HTTPS, [...]

libcurl is highly portable, it builds and works identically on numerous platforms, including [...] Linux, [...] Windows, [...]
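
For example, here is a minimal sketch of fetching a page into /tmp with libcurl's easy interface; the URL and output path are placeholders and error handling is kept to a minimum:

    #include <stdio.h>
    #include <curl/curl.h>

    int main(void) {
        curl_global_init(CURL_GLOBAL_DEFAULT);

        CURL *curl = curl_easy_init();
        FILE *out = fopen("/tmp/blog.html", "wb");   /* placeholder output path */
        if (!curl || !out)
            return 1;

        curl_easy_setopt(curl, CURLOPT_URL, "http://example.com/blog"); /* placeholder URL */
        curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L);
        /* With no write callback set, libcurl fwrite()s the response body
         * to the FILE* passed as CURLOPT_WRITEDATA. */
        curl_easy_setopt(curl, CURLOPT_WRITEDATA, out);

        CURLcode res = curl_easy_perform(curl);
        if (res != CURLE_OK)
            fprintf(stderr, "download failed: %s\n", curl_easy_strerror(res));

        fclose(out);
        curl_easy_cleanup(curl);
        curl_global_cleanup();
        return 0;
    }

Compile with something like `gcc fetch.c -lcurl`. Once the file is on disk you can read it back into a string with the usual fopen()/fread() routine, or register your own CURLOPT_WRITEFUNCTION to capture the text directly into memory.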

Georg Fritzsche
Sounds like this is what I'm looking for. After re-reading my own spec, it sounds like I'm building an RSS reader app. Are there any pitfalls I need to be aware of with C/libcurl and graphics libraries if I take this direction?
JB87
@JB87: I'm not sure what libcurl has to do with graphics libraries. If you are going for a cross-platform GUI using GTK+ or something similar, it might be easier to use that toolkit's own HTTP facilities instead.
Georg Fritzsche
+1  A: 

Alternatively, you can use system() to execute wget.
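
A quick-and-dirty sketch, assuming wget is on the PATH; the URL and output path are placeholders:

    #include <stdlib.h>

    int main(void) {
        /* -q: quiet, -O: write the downloaded page to the given file */
        return system("wget -q -O /tmp/blog.html http://example.com/blog");
    }

Note that this only works where wget is actually installed, which is why the questioner ruled it out for Windows 7.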

codaddict
A: 

And there is libsoup too.
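
A rough sketch using libsoup's synchronous API (this assumes libsoup 2.4 and a placeholder URL; build with `gcc $(pkg-config --cflags --libs libsoup-2.4) fetch.c`):

    #include <stdio.h>
    #include <libsoup/soup.h>

    int main(void) {
        SoupSession *session = soup_session_sync_new();
        SoupMessage *msg = soup_message_new("GET", "http://example.com/blog");

        /* Blocks until the transfer finishes. */
        soup_session_send_message(session, msg);

        if (SOUP_STATUS_IS_SUCCESSFUL(msg->status_code))
            printf("%s\n", msg->response_body->data);

        g_object_unref(msg);
        g_object_unref(session);
        return 0;
    }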

Praveen S
A: 

MSDN: URLDownloadToFile
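
On Windows this can be as short as the sketch below (link against urlmon.lib; the URL and output path are placeholders). Being a Windows-only API, it covers only one half of the Ubuntu/Windows requirement:

    #include <windows.h>
    #include <urlmon.h>

    int main(void) {
        /* Downloads the URL straight to a local file; returns S_OK on success. */
        HRESULT hr = URLDownloadToFileA(NULL,
                                        "http://example.com/blog",
                                        "C:\\Temp\\blog.html",
                                        0, NULL);
        return SUCCEEDED(hr) ? 0 : 1;
    }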

Johann Gerell