Assuming I'm starting a project from scratch that does not depend on any other project, I would like to use a format such as XML to store feeds. Since XML is not the only available format of its kind, I would like to know: why should I choose one over the rest?

I will be using Perl.

A 'feed' is a description of a product (name, price, type, and a short description of up to 120 words).

+5  A: 

Depends on your needs. For small, lightweight apps I personally think XML is overkill: http://www.codinghorror.com/blog/2008/05/xml-the-angle-bracket-tax.html

I prefer YAML in that case. For interaction with JavaScript, use JSON. If you truly need to define your own grammar (read: schema), then XML is it. It is very powerful, but you have to decide what you are trying to do; otherwise your question is too broad to give a definitive answer.

Joe
XML is almost always overkill. :)
Ether
XML is like violence: if it doesn't solve your problem, you aren't using enough of it.
Jimmy
+8  A: 

Each will do the job.

JSON has the advantage of super-easy parsing in JavaScript.

XML has the advantage of more common usage across languages, which is useful for the storage you mention, and it is also valuable for passing data between different systems.

YAML has libraries for all languages, but is somewhat less commonly used, so you may have to introduce libraries.

Joshua Fox
err, if there's executable code, it isn't JSON, which is limited to just data
ysth
Thanks, I edited it accordingly.
Joshua Fox
You'll have to introduce libraries for either YAML or JSON in any language, I think. But Perl's YAML libraries are good - YAML came out of the Perl world.
Tom Anderson
+5  A: 

I agree with Joe. For example, if it's a JavaScript app, JSON would be a strong candidate. Personally, I'd go with JSON for just about anything, but only because that is the one I'm most comfortable with.

orolo
agreed - I avoid XML until forced to use it. Key-value pairs work just fine for most usages.
Joe
+3  A: 

JSON would be my pick. JSON and YAML are lightweight and easy to get started with (no formal Schema required). JSON is more widely used and more compatible with various other technologies than YAML. For example, PHP has a built-in function to decode or encode JSON, not YAML. JavaScript of course just loves JSON, considering it’s a strict subset of valid JavaScript.
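As a rough illustration of how little ceremony JSON needs, here is a minimal sketch using Python's standard `json` module (the asker is using Perl, where the idea is the same). The field names and values are made up to match the question's description of a feed; they are assumptions, not a prescribed layout.

```python
import json

# Hypothetical product feed; field names follow the question's description.
feed = {
    "name": "Example widget",
    "price": 9.99,
    "type": "gadget",
    "description": "A short description of up to 120 words.",
}

text = json.dumps(feed, indent=2)  # encode the dict as a JSON string
restored = json.loads(text)        # decode the string back into a dict

print(text)
```

No schema or class definition is involved: the decoded value is an ordinary dictionary again.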

Alan H.
YAML: YAML Ain't (lightweight) Markup Language
tsee
+2  A: 

Choose XML if you need to interoperate with systems you don't control (XML Schema is invaluable here), if you will be transforming the data extensively into text, HTML, or XML (haters notwithstanding, XSLT is peerless), if your data includes a lot of text markup, if your data needs to be human-editable (haters notwithstanding, editable XML that's validated against a schema is a pretty good tool for a lot of jobs), and/or if you need to interoperate with any of the myriad tools and technologies that work with XML.

Choose JSON if you really can't be bothered with any of the above.

Choose YAML if you're working in an environment that's got a lot of YAML support.

Robert Rossney
+8  A: 

We can't really answer that without knowing a lot more. Even if you're not currently dependent on any other projects, are you likely to interact with them at some point in the future? If so, what technologies do they prefer? At the BBC, we've had some projects that were "JSON-only", only to find out that Java developers who wanted to access our API were begging us to provide a simple XML API simply because they have so many tools built around XML. They didn't even care about namespaces, attributes, or anything else; they just wanted those angle brackets.

As for "storing feeds", I'm also not sure what you mean there. You explain the data in the feed, but what are you then going to do with those feeds? Parse them? Cache and re-serve them? Write them out to cuneiform tablets? :)

It sounds like what you actually want is a database: persist the data there and later make it serialisable as JSON/YAML/XML or whatever your desired format is. What I'd recommend is to be able to pull the data out into a Perl data structure and then have "formatters" which know how to serialise that data structure to the desired output. That way you can serialise to, say, JSON, and later, if that's not good enough, easily switch to YAML or something else. In fact, if others need your data (one-way data tends not to be useful), they can ask for JSON, YAML, XML or whatever. You have more flexibility and aren't tied into a decision that you made up front.
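The "formatters" idea might look roughly like the sketch below. It is written in Python with only the standard library rather than Perl, and every name in it (`FIELDS`, `to_json`, `to_xml`, `to_csv`, `FORMATTERS`, `serialise`) is hypothetical, chosen purely for illustration. YAML is left out only because Python's standard library has no YAML module.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical field list, taken from the question's description of a feed.
FIELDS = ["name", "price", "type", "description"]

def to_json(feeds):
    """Serialise a list of feed dicts as a JSON array."""
    return json.dumps(feeds)

def to_xml(feeds):
    """Serialise each feed as a <feed> element with one child per field."""
    root = ET.Element("feeds")
    for feed in feeds:
        item = ET.SubElement(root, "feed")
        for key in FIELDS:
            ET.SubElement(item, key).text = str(feed[key])
    return ET.tostring(root, encoding="unicode")

def to_csv(feeds):
    """Serialise the feeds as CSV with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(feeds)
    return buffer.getvalue()

# The rest of the program only ever sees plain data plus a format name.
FORMATTERS = {"json": to_json, "xml": to_xml, "csv": to_csv}

def serialise(feeds, fmt):
    return FORMATTERS[fmt](feeds)

feeds = [{"name": "Example widget", "price": 9.99,
          "type": "gadget", "description": "Short text."}]

for fmt in FORMATTERS:
    print(serialise(feeds, fmt))
```

Adding another output format then means adding one function and one dictionary entry; the storage layer and the in-memory structure stay untouched.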

That being said, I don't know your system, so it's tough to say what the right thing to do is. Also, note that JSON and YAML aren't exactly interchangeable with XML. Subtle differences can and will trip you up.

Ovid
A: 

I think XML is for big data, and JSON is for small, not-too-complex data that does not need multidimensional arrays. I might be wrong. ^^ I have only seen YAML in Google App Engine, where it appears to be quite suitable for storing an application's preferences and data.

wizztjh
A: 

If the data's not hierarchical and isn't going to have data interspersed in, e.g., the description (This product is great for <targetDemo/> who love its <featureSet/>), you may want to consider comma-separated values (CSV) or some other format such as tab-separated values.

It's old school, but it gets the job done without weighing your file down with a bunch of describing text. For example, in XML, you'd have the following non-value data for each feed:

<feed name="" price="" type="" description=""/>

...contrasted with CSV:

"", , "", ""

If you want, you can add a header row at the top for documentation purposes.

There's also plenty of tooling around CSV, from command line utilities like awk to GUIs such as Excel.
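To show what the header row buys you, here is a small sketch using Python's standard `csv` module (Perl has comparable CSV modules). The sample data is invented for illustration. Note that the quoted description contains a comma, which a proper CSV parser handles but a naive split on commas would not.

```python
import csv
import io

# Hypothetical feed file contents: a header row followed by one product.
data = (
    "name,price,type,description\n"
    '"Example widget",9.99,"gadget","Small, handy widget."\n'
)

# DictReader uses the header row as the keys for every following row.
rows = list(csv.DictReader(io.StringIO(data)))

for row in rows:
    print(row["name"], row["price"], row["description"])
```

One thing to keep in mind: CSV carries no type information, so `price` comes back as the string `"9.99"` and has to be converted by the reader.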

Another alternative, if you don't really need the data to be editable via a text editor but don't want to deploy a more robust database service, would be SQLite, which allows you to perform RDBMS-style CRUD operations on a single binary file.
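A minimal sketch of those CRUD operations, using Python's standard `sqlite3` module, could look like this. The table name `feed` and its columns are assumptions based on the question's description, and the in-memory database is used only to keep the example self-contained; passing a file path instead would persist the data.

```python
import sqlite3

# ":memory:" keeps the sketch self-contained; use a path such as "feeds.db" to persist.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE feed (name TEXT PRIMARY KEY, price REAL, type TEXT, description TEXT)"
)

# Create
conn.execute(
    "INSERT INTO feed VALUES (?, ?, ?, ?)",
    ("Example widget", 9.99, "gadget", "Short text."),
)

# Update
conn.execute("UPDATE feed SET price = ? WHERE name = ?", (7.5, "Example widget"))

# Read
price_after_update = conn.execute(
    "SELECT price FROM feed WHERE name = ?", ("Example widget",)
).fetchone()[0]

# Delete
conn.execute("DELETE FROM feed WHERE name = ?", ("Example widget",))
remaining = conn.execute("SELECT COUNT(*) FROM feed").fetchone()[0]

conn.close()
print(price_after_update, remaining)
```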

steamer25
A: 

In the absence of interoperability concerns, I don't think there's much in it. There are good libraries for all of them in all languages; some of them are built-in, some aren't. Your interface to those libraries will be narrow - just in data-access code - so if one has a painful API, even that doesn't matter much.

JSON is, for me, the most pleasant to edit by hand, which is a small plus.

YAML can handle non-tree data structures using the &/* notation (anchors and aliases). Neither XML nor JSON has a built-in way to do that. Your use doesn't need it, though.
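To make the JSON half of that concrete, here is a small sketch using Python's standard `json` module; the variable names are illustrative. It only demonstrates JSON's behaviour, since Python's standard library has no YAML parser. In YAML, the shared value could be marked once with an anchor (`&`) and referred to again with an alias (`*`).

```python
import json

# One dict referenced from two places: a graph, not a tree.
shared = {"type": "gadget"}
data = {"first": shared, "second": shared}

# JSON writes the shared dict out twice, so the link is lost on the way back.
restored = json.loads(json.dumps(data))
same_before = data["first"] is data["second"]
same_after_round_trip = restored["first"] is restored["second"]

# A structure that refers to itself cannot be encoded at all.
cyclic = {}
cyclic["self"] = cyclic
try:
    json.dumps(cyclic)
    cycle_error = None
except ValueError as exc:
    cycle_error = exc

print(same_before, same_after_round_trip, cycle_error)
```

The values survive the round trip, but the two entries come back as independent copies, and the self-referencing structure is rejected outright.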

Tom Anderson