I want to parse HTML, PDF, CSV, and plain text files. Which of these file types is the easiest and most efficient to parse?
I am asking because I would like to parse all four types with common parsing code, if possible.
Suppose HTML turns out to be the easiest and most efficient to parse. In that case:
I would write the parsing code for HTML and convert PDF files to HTML (if that is possible), so that the same code also handles PDF.
In the same way, I would convert CSV and text files to HTML, so that a single HTML parser covers HTML, PDF, CSV, and text files.
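To make the idea concrete, here is a minimal sketch of the plan, assuming Python and only its standard library. The names `csv_to_html`, `text_to_html`, `TextCollector`, and `parse_html` are hypothetical and only for illustration. PDF is left out because the standard library has no PDF reader, so that step would need a separate converter.

```python
import csv
import html
import io
from html.parser import HTMLParser


def csv_to_html(csv_text):
    """Render CSV text as a simple HTML table (one <tr> per CSV row)."""
    rows = csv.reader(io.StringIO(csv_text))
    body = "".join(
        "<tr>" + "".join(f"<td>{html.escape(cell)}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return f"<table>{body}</table>"


def text_to_html(text):
    """Wrap each non-empty line of plain text in a <p> element."""
    return "".join(
        f"<p>{html.escape(line)}</p>"
        for line in text.splitlines()
        if line.strip()
    )


class TextCollector(HTMLParser):
    """The single 'common' parser: collects the text found between tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Skip whitespace-only runs between tags.
        if data.strip():
            self.chunks.append(data.strip())


def parse_html(markup):
    """Parse HTML produced by any of the converters (or a real HTML file)."""
    parser = TextCollector()
    parser.feed(markup)
    parser.close()
    return parser.chunks


if __name__ == "__main__":
    print(parse_html(csv_to_html("name,city\nAnn,Oslo\n")))
    print(parse_html(text_to_html("hello\n\nworld")))
```

Note that this sketch throws away the row and column structure that the CSV already had, which is one reason a dedicated CSV reader may be simpler than going through HTML.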
So: (1) Which type of file is the easiest and most efficient to parse (PDF, CSV, HTML, or text)? (2) Is it possible to convert these file types (PDF, text, HTML, CSV) to each other? For example, if HTML is the easiest to parse: PDF to HTML, text to HTML, and CSV to HTML.