Hi all, I have around 8000 XML files that need to be converted into text files. Each text file must contain the title, description and keywords of the corresponding XML file, without the tags and with all other elements and attributes removed. In other words, I need to create 8000 text files, one per XML file, each containing that file's title, description and keywords. I need code that does this systematically. Any help would be greatly appreciated. Thanks in advance.

Hey all, thank you so much for your replies. Here's a sample of what my XML looks like:

<?xml version="1.0"?>
<metadata>
<identifier>43productionsNightatthegraveyard</identifier>

<title>Night at the graveyard</title>

<collection>opensource_movies</collection>
<mediatype>movies</mediatype>
<resource>movies</resource>
<upload_application appid="ccPublisher" version="2.2.1"/>
<uploader>[email protected]</uploader>

<description>una noche en el cementerio (terror)</description>

<license>http://creativecommons.org/licenses/by-nc/3.0/</license>
<title>Night at the graveyard</title>
  <format>Video</format>
<adder>[email protected]</adder>
<licenseurl>http://creativecommons.org/licenses/by-nc/3.0/</licenseurl>
<year>2007</year>

<keywords>Night,at,the,graveyard,43,productions</keywords>

<holder>43 productions</holder>
<publicdate>2007-04-11 19:52:28</publicdate>
</metadata>

And this would be the output:

una noche en el cementerio (terror)

Night at the graveyard

Night,at,the,graveyard,43,productions

Each output needs to be saved under the same name as its XML file, but as a text file. Thanks so much, everyone; any more suggestions would be much appreciated.

A: 

This seems like a fairly straightforward XPath query to pull out the description, title and keywords elements. Since you didn't mention which programming language you're using, I can't offer much more than that and the general process below:

  1. Load the XML document and run an XPath query for the title (like /metadata/title)
  2. Repeat for the description and keywords elements
  3. Take the XML filename, drop the .xml extension, write the above 3 values into a text file with that name, and close it
  4. Rinse and repeat 8000 times. :)
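
For illustration only, the steps above might be sketched as follows. Python is an assumption here, since the question does not name a language, and the helper names `extract_fields` and `convert_directory` are hypothetical. The sketch uses the standard library's ElementTree, whose `findtext` accepts a simple path relative to the root `<metadata>` element and returns the first match, so the repeated `<title>` in the sample is written only once. The output order (description, title, keywords) follows the sample output in the question.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Order taken from the sample output in the question (an assumption).
FIELDS = ("description", "title", "keywords")


def extract_fields(xml_path):
    """Return the text of the first description, title and keywords elements."""
    root = ET.parse(xml_path).getroot()  # root is <metadata> in the sample
    # findtext returns the first matching child's text, or the default if absent.
    return [(root.findtext(name, default="") or "").strip() for name in FIELDS]


def convert_directory(src_dir, dst_dir=None):
    """Write one .txt file per .xml file in src_dir; return the paths written."""
    src = Path(src_dir)
    dst = Path(dst_dir) if dst_dir else src
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    # Note: this pattern matches lowercase ".xml" only on case-sensitive filesystems.
    for xml_path in sorted(src.glob("*.xml")):
        try:
            values = extract_fields(xml_path)
        except ET.ParseError as err:
            # Skip files that are not well-formed instead of aborting the batch.
            print(f"skipping {xml_path.name}: {err}")
            continue
        out_path = dst / (xml_path.stem + ".txt")  # same name, .txt extension
        out_path.write_text("\n\n".join(values) + "\n", encoding="utf-8")
        written.append(out_path)
    return written
```

Calling `convert_directory("xml_files")`, where `xml_files` is a placeholder directory name, would process every `.xml` file in that folder in one pass. Files that fail to parse are reported and skipped rather than stopping the run, which seems a reasonable choice for a batch of 8000 files.
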
GrayWizardx
Is there any way to do this systematically? Are there any websites I could go to? I believe XSLT may help me, but I'm still unsure how to find all the XML files in the same directory and convert them into text files. Thanks for your reply.
Jason