Is it appropriate to use XML tags (element names) written in non-ASCII natural languages? The XML spec allows it (see Names and Exceptions), but I couldn't find any best practices about this at W3C and related pages.
What I'm looking for is practical advice regarding which tools support this, whether important XML-related technologies such as XSLT and XForms may have problems with it, etc.
I think Andrey and Tomalak are missing the point. XML is not read only by programmers; it is read by many different professionals. So the arguments comparing it to source code don't necessarily apply.
Let me clarify: I mean a Bulgarian legal domain, where many terms are specific to the Bulgarian legal process, and may not even have exact English translations. Translating them would be laborious, imprecise and impractical. Transliterating to ASCII is suboptimal.
So back to the question: what tool limitations would I face? (Eclipse supports UTF-8, so writing XPath expressions wouldn't be a problem.)
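To make the question concrete, here is a minimal sketch of the kind of check I have in mind, using only the JDK's built-in JAXP parser and XPath engine. The element names (`дело`, `ищец`, `ответник`) and the sample data are made up for illustration, and the source file is assumed to be saved and compiled as UTF-8. It only shows that one particular stack accepts Cyrillic names; it says nothing about other tools.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class CyrillicXPathDemo {
    // Hypothetical sample document: a case ("дело") with a plaintiff ("ищец")
    // and a defendant ("ответник"). The names are illustrative only.
    static final String XML =
        "<дело><ищец>Иван Петров</ищец><ответник>Мария Георгиева</ответник></дело>";

    // Parses the sample document and evaluates an XPath expression against it,
    // returning the string value of the first matching node.
    static String evaluate(String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(XML)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expression, doc);
    }

    public static void main(String[] args) throws Exception {
        // The XPath step names are written in Cyrillic, just like the tags.
        System.out.println(evaluate("/дело/ищец"));
    }
}
```

Reading the document from a `String` sidesteps byte-encoding issues; with files, the XML declaration's `encoding` and the actual file encoding would have to agree.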
To get people started in the technical direction that I'd like: in several systems we've used generation techniques to ensure perfect correspondence between XML schemas, Java beans and database schemas.
- Java: this article says that Unicode is OK in identifiers
- Oracle: identifiers can contain only alphanumeric characters from your database character set
- I'd still have to check the tooling we use (JiBX, Dozer, Hibernate, JXPath...)
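For the Java side of such a generator, a rough sketch of a pre-check could look like the following. The class and method names are hypothetical, and it relies only on `Character.isJavaIdentifierStart` and `Character.isJavaIdentifierPart` from the JDK. It does not reject Java keywords, so a real generator would need an additional check for those. It is useful mainly because XML names may contain characters such as `-` and `.` that Java identifiers may not, regardless of the script used.

```java
class IdentifierCheck {
    // Returns true if the name could be used as a Java identifier as-is.
    // Note: this sketch does not exclude reserved words such as "class".
    static boolean isJavaIdentifier(String name) {
        if (name.isEmpty()) {
            return false;
        }
        int first = name.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) {
            return false;
        }
        // Walk the remaining code points (not chars) to handle
        // supplementary characters correctly.
        for (int i = Character.charCount(first); i < name.length(); ) {
            int cp = name.codePointAt(i);
            if (!Character.isJavaIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        // Illustrative element names; both are hypothetical.
        System.out.println(isJavaIdentifier("ищец"));
        System.out.println(isJavaIdentifier("съдебно-решение"));
    }
}
```

A Cyrillic element name passes this check, while a hyphenated one needs mapping to something else before it can become a bean property.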