The situation:
Each page I scrape has <input> elements with a title= and a value= attribute.
I don't know in advance which titles are going to be on a given page.
I want to have all my collected data in a single table at the end, with a column for each title.
So basically, I need each row of data to line up with all the others, and if a row doesn't have a certain element, then it should be blank (but there must be something there to keep the alignment).
For example:
First page has: {animal: cat, colour: blue, fruit: lemon, day: monday}
Second page has: {animal: fish, colour: green, day: saturday}
Third page has: {animal: dog, number: 10, colour: yellow, fruit: mango, day: tuesday}
Then my resulting table should be:
animal | number | colour | fruit | day
cat | none | blue | lemon | monday
fish | none | green | none | saturday
dog | 10 | yellow | mango | tuesday
It would also be good to keep the order of the title/value pairs, which I know dictionaries won't do.
So basically, I need to generate columns from all the titles (kept in order but somehow merged together).
What would be the best way of going about this without knowing all the possible titles and explicitly specifying an order for the values to be put in?
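To make the goal concrete, here is a minimal sketch of the kind of merge described above, not a definitive solution. It assumes each page has already been parsed into an ordered list of (title, value) tuples; the names merge_columns and build_table are hypothetical. A title that has not been seen before is inserted right after the title that preceded it on its own page, which is one simple way to merge the orders and happens to reproduce the table above.

```python
def merge_columns(pages):
    """Merge the title order of every page into one column list."""
    columns = []
    for page in pages:
        prev = None
        for title, _value in page:
            if title not in columns:
                # New title: place it right after the title that came
                # before it on this page (or at the front if it was first).
                pos = columns.index(prev) + 1 if prev is not None else 0
                columns.insert(pos, title)
            prev = title
    return columns


def build_table(pages, blank="none"):
    """Return (columns, rows), padding missing values with `blank`."""
    columns = merge_columns(pages)
    rows = []
    for page in pages:
        values = dict(page)  # a repeated title on one page keeps its last value
        rows.append([values.get(col, blank) for col in columns])
    return columns, rows


# Example data from the question, as ordered (title, value) pairs.
pages = [
    [("animal", "cat"), ("colour", "blue"), ("fruit", "lemon"), ("day", "monday")],
    [("animal", "fish"), ("colour", "green"), ("day", "saturday")],
    [("animal", "dog"), ("number", "10"), ("colour", "yellow"),
     ("fruit", "mango"), ("day", "tuesday")],
]

columns, rows = build_table(pages)
print(" | ".join(columns))
for row in rows:
    print(" | ".join(row))
```

This heuristic depends on the order in which pages are processed, and two pages that disagree about the relative order of titles will simply keep whichever order was seen first.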