Why do so many projects use XML as a configuration file language?

+6  A: 

XML is a well-developed and widely adopted standard, which makes it easier to read and understand than proprietary configuration formats.

Also, it's worth understanding that XML serialization is a common tool available in most languages that makes saving object data extremely easy for developers. Why build your own way of saving a hierarchy of complex data when someone else has already done the work for you?

.NET: http://msdn.microsoft.com/en-us/library/system.xml.serialization.aspx

PHP: http://us.php.net/serialize

Python: http://docs.python.org/library/pickle.html

Java: http://java.sun.com/developer/technicalArticles/Programming/serialization/
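As a rough illustration of how little code such a round trip can take, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. The `config` dictionary and both helper functions are invented for this example, and the sketch only handles nested dictionaries of strings, not arbitrary objects.

```python
import xml.etree.ElementTree as ET


def dict_to_element(tag, value):
    """Recursively convert a nested dict of strings into an XML element."""
    element = ET.Element(tag)
    if isinstance(value, dict):
        for key, child in value.items():
            element.append(dict_to_element(key, child))
    else:
        element.text = str(value)
    return element


def element_to_dict(element):
    """Inverse of dict_to_element: leaf elements become strings."""
    children = list(element)
    if not children:
        return element.text or ""
    return {child.tag: element_to_dict(child) for child in children}


# Hypothetical configuration data, made up for this example.
config = {"server": {"host": "localhost", "port": "8080"}}

# Serialize to an XML string, then parse it back into a dict.
xml_text = ET.tostring(dict_to_element("config", config), encoding="unicode")
restored = element_to_dict(ET.fromstring(xml_text))
```

Note that every value comes back as a string; a real serializer would also have to record types, lists, and repeated element names.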

Robert Venables
+2  A: 

Because parsing XML is relatively easy and, if your schema is clearly specified, any utility can easily read information from it and write information to it.

Stefano Borini
+6  A: 

Because XML sounds cool and enterprisey.

Edit: I didn't realize my answer was so vague, until a commenter requested the definition of enterprisey. Citing Wikipedia:

[...] the term "enterprisey" is intended to go beyond the concern of "overkill for smaller organizations", to imply the software is overly complex even for large organizations and simpler, proven solutions are available.

My point is that XML is a buzzword and as such is being overused. Despite other opinions, XML is not easy to parse (just look at libxml2: its gzipped source package is currently over 3MB). Due to the amount of redundancy, it is also annoying to write by hand. For example, Wikipedia lists XML configuration as one of the reasons for the decline in popularity of jabberd in favor of other implementations.

avakar
Why the downvotes? Despite being a sarcastic answer, it is a fairly valid reason why XML is so popular. Not the only reason, certainly, but an important one.
Chris Lutz
Without defining what "enterprisey" means, this answer is not useful.
Greg Hewgill
Thank you for the support, Chris.
avakar
It's not easy to parse, but maybe more usable and expandable than just plain text. Personally, I never use XML willingly.
teh_noob
Seems it's also popular to bash XML. What would you rather use in its place, e.g. for app configuration or data transfer? People rarely offer alternatives. It may be a little verbose for your specific use in your specific situation. But I'll take that hit in exchange for having tools to parse it, to transform it, to query it and to validate it.
JonoW
JonoW, XML certainly has valid uses. I'm strictly talking about config files. I believe that in most cases, `key=value` is not only sufficient, but also more readable and easier to parse.
avakar
avakar, your idea works for simple settings, but I would argue that XML is better because it's more expandable. Say, for example, you have a list of objects with the same name (say, a list of directories) -- would you simply repeat the same key many times over? Append a number? What I'm getting at is that XML is a standard that can be followed easily.
Nazadus
+4  A: 

One other point: if you have an XSD (schema file) to describe your configuration file, it is trivial for your application to validate the configuration file.

JonnyBoats
Any schema will bring you this advantage. XSD is not the only schema language.
bortzmeyer
+11  A: 
  1. XML is easy to parse. There are several popular, lightweight, featureful, and/or free XML parsing libraries available in most languages.
  2. XML is easy to read. It is a very human-readable markup language, so it's easy for humans to write as well as for computers to write.
  3. XML is well specified. Everyone and his dog knows how to write decent XML, so there's no confusion about the syntax.
  4. XML is popular. Somewhere along the way, some Important People™ started pushing the idea that XML was the "future", and a lot of people bought it.

As a side note, I'm not trying to defend XML. It has its uses, and I will be using it in a project whenever I get back to that. In many cases, though, and especially for configuration files, the only advantage it has is that it's a standardized format, and I think this is far outweighed by numerous disadvantages (e.g. it's too verbose). However, my personal preferences don't matter - I was merely answering why some people might choose to use XML as a configuration file format. I personally never will.

Chris Lutz
I think all points are valid advantages of XML in general; however, I don't see the relation to configuration files. In many cases a simple ini file would do for configuration, which would also be easy to parse, and much easier to write and read than XML.
0xA3
@divo - INI files are very Windows-centric, and better supported on Windows than on other platforms. Unix has its own Unix-centric configuration file formats, which are better supported on Unix than on other platforms. XML has the advantage of being equally well supported everywhere. Plus, XML allows more hierarchical structure - INI files don't allow deep nesting like XML.
Chris Lutz
I agree with points 1, 3, and 4. But "XML is easy to read, it is a very human-readable markup language, so it's easy for humans to write"; surely you jest? Is this some different definition of the word "human" that I'm currently unaware of?
bignose
It never fails to amaze me how hard <property attribute="foo">bar</property> is for some people to read. It's just like the damned metric system all over again!
annakata
Also: 5). XML has a number of readily available tools at various levels: from XPath, XSD, and XSLT to IDE support.
annakata
Actually I think that XML is hard to parse. I also don't see it as a human readable format.
elcuco
@Chris: My point was that the advantages you mention for XML config files hold for ini-like files as well, or even more so. The advantage of XML comes into play when there is a need to store more complex or hierarchical structures and you need validation.
0xA3
Readability is highly subjective, but XML has the advantage of largely being in English and making use of very few symbols. @elcuco - Full XML is a little difficult to parse, but full C or Perl or Ruby is much harder to parse. The basics of XML are very easy to parse, and then it's just a matter of adding on the other stuff. @divo - I agree, and my current project's configuration files will certainly not be in XML, but a lot of people prefer to just use XML for whatever reason. Perhaps see point 4.
Chris Lutz
@annakata, if your configuration file is that simple, what's the benefit in using XML instead of just key-value pairs? Contrariwise, if your configuration is complex enough that you actually need XML, why claim it's going to be easy for a human to visually parse?
bignose
+1  A: 

Well, XML is a general-purpose specification that can hold descriptions, nested information and data about something. And there are many APIs and software packages that can parse and read it.

So it's much easier to describe something in a formal way that is understood across platforms and applications.

Saleh Al-Zaid
Wow! 5 answers while I was writing mine :S Impressive.
Saleh Al-Zaid
A: 

It's because XML allows you to basically make your own semantic markup, which can be read by a parser built in virtually any language. An added benefit is that a configuration file written in XML can be used on projects where you are using two or more languages. If you were to make a configuration file where everything was defined as variables for a specific language, it would only work in that language, obviously.

teh_noob
+10  A: 

This is an important question.

Most alternatives (JSON, YAML, INI files) are easier to parse than XML.

Also, in languages like Python -- where everything is source -- it's easier to simply put your configuration in a clearly-labeled Python module.

Yet, some people will say that XML has some advantage over JSON or Python.

What's important about XML is that the "universality" of XML syntax doesn't really apply much when writing a configuration file that's specific to an application. Since portability of a configuration file doesn't matter, some Python folks write their configuration files in Python.
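A minimal sketch of what "configuration as a Python module" can look like, using only the standard library. The setting names and values are hypothetical, and the temporary file exists only to keep the example self-contained; in a real project the settings would live in an ordinary `settings.py` that the application imports.

```python
import runpy
import tempfile
from pathlib import Path

# Hypothetical contents of a settings file; all names are invented.
SETTINGS_SOURCE = '''
HOST = "localhost"
PORT = 8080
LOG_DIRS = ["/var/log/app", "/tmp/app"]
'''

with tempfile.TemporaryDirectory() as tmp:
    settings_path = Path(tmp) / "settings.py"
    settings_path.write_text(SETTINGS_SOURCE)
    # run_path executes the file and returns its global namespace as a dict.
    namespace = runpy.run_path(str(settings_path))

# Keep only the upper-case names, a common convention for settings.
settings = {name: value for name, value in namespace.items() if name.isupper()}
```

Values arrive as real Python objects (an `int` port, a list of directories), so no separate parsing or type-conversion step is needed.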


Edit

Security of a configuration file does not matter. The "configuring a Python program in Python is a security risk" argument seems to ignore the fact that Python is already installed and running as source. Why work up a complex hack in a configuration file when you have the source? Just hack the source.

I've heard folks say that "someone" could hack your app via the configuration file. Who's this "someone"? The sysadmin? The DBA? The developer? There aren't a lot of mysterious "someone"s with access to the configuration files.

And anyone who could hack up the Python configuration file for nefarious purposes could probably install keyloggers, fake certificates or other more serious threats.

S.Lott
good points! +1
0xA3
Portability of a configuration file doesn't matter, true. But it is important that the domain of the configuration language be restricted for security reasons; a full-blown general-purpose language is far too broad for most configuration needs, and is in those cases an unnecessary security risk.
bignose
+2  A: 

Hi guys, thanks for your answers. This question, as naive as it may seem at first glance, was not so naive :)

Personally, I don't like XML for configuration files. I think it's hard for people to read and change, and it's hard for computers to parse because it's so generic and powerful.

INI files or Java properties files are fine only for the most basic applications, those that do not require nesting. Common solutions for adding nesting to those formats look like:

level1.key1=value
level1.key2=value
level2.key1=value

Not a pretty sight: a lot of redundancy, and it's hard to move things between nodes.
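For reference, turning such dotted keys back into a nested structure takes only a few lines. This is a minimal Python sketch, not how any particular properties library does it; `parse_properties` is a made-up helper, and it assumes a key is never used both as a value and as a parent of other keys.

```python
def parse_properties(text):
    """Parse 'a.b.c=value' lines into nested dictionaries."""
    root = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        *parents, leaf = key.strip().split(".")
        node = root
        for name in parents:
            # Walk down the tree, creating intermediate levels as needed.
            node = node.setdefault(name, {})
        node[leaf] = value.strip()
    return root


sample = """
level1.key1=value
level1.key2=value
level2.key1=value
"""

nested = parse_properties(sample)
```

The parsing is easy; the redundancy in the file itself remains, since every line still has to repeat its full path.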

JSON is not a bad language, but it's designed to be easy for computers to parse (it's valid JavaScript), so it's not widely used for configuration files.

JSON looks like this:

{"menu": {
  "id": "file",
  "value": "File",
  "popup": {
    "menuitem": [
      {"value": "New", "onclick": "CreateNewDoc()"},
      {"value": "Open", "onclick": "OpenDoc()"},
      {"value": "Close", "onclick": "CloseDoc()"}
    ]
  }
}}

In my opinion, it's too cluttered with commas and quotes.
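On the parsing side, at least, JSON asks very little of the application. A minimal Python sketch with the standard `json` module, loading the same menu sample (the variable names are just for illustration):

```python
import json

# The menu sample from above, embedded as a string for the example.
menu_json = """
{"menu": {
  "id": "file",
  "value": "File",
  "popup": {
    "menuitem": [
      {"value": "New", "onclick": "CreateNewDoc()"},
      {"value": "Open", "onclick": "OpenDoc()"},
      {"value": "Close", "onclick": "CloseDoc()"}
    ]
  }
}}
"""

# json.loads turns the text into plain dicts, lists and strings.
config = json.loads(menu_json)
labels = [item["value"] for item in config["menu"]["popup"]["menuitem"]]
```

Here `labels` ends up as the three menu item names, with nesting and lists handled by the parser rather than by naming conventions.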

YAML is good for configuration files; here is a sample:

invoice: 34843
date   : 2001-01-23
bill-to: &id001
    given  : Chris
    family : Dumars

However, I don't like its syntax too much, and I think that using whitespace to define scopes makes things a bit fragile (think of pasting a block at a different nesting level).

A few days ago I started to write my own language for configuration files; I dubbed it Swush.

Here are a few samples. As simple key-value pairs:

key:value
key:value2
key1:value3

Or as a more complex, commented example:

server{
    connector{
         protocol : http // HTTP or BlahTP
         port : 8080     # server port
         host : localhost /* server host name*/
    }

    log{
        output{
             file : /var/log/server.log
             format : %t%s
        }
    }
}

Swush supports strings in the simple form above, or in quotes, which allows whitespace and even newlines inside strings. I am going to add arrays soon, something like:

name [1 2 b c "Delta force"]

There is a Java implementation, but more implementations are welcome. :) Check the site for more information (I covered most of it here, but the Java API provides a few interesting features, like selectors).

Omry
If you're trying to do a generic configuration parser, maybe you could change the colon to an equal sign. That way you'll be able to parse lots of other existing config files as well.
avakar
That's a good point, and I thought about it, but I think key:value is more readable than key=value. Consider: connector{ protocol : http // HTTP or BlahTP port : 8080 # server port host : localhost /* server host name*/ } vs: connector{ protocol = http // HTTP or BlahTP port = 8080 # server port host = localhost /* server host name*/ } I somehow like the first one better; what do you think?
Omry
Ouch, comments are really bad for code samples.
Omry
I suppose it depends on what you're accustomed to. I'd personally prefer `key=value`. You can always support both.
avakar
A: 

If you can answer why NOT to use XML for configuration files, then you probably can answer why as well.

JRL