What technology problems arise from creating a markup language for email?

I am wondering what technology problems arise from associating a markup language with email. Without examining the language itself, let us assume a hypothetical markup language exists that meets the following conditions:

It meets all possible user-agent needs for properly structuring and defining content in email.
It properly sanctions communications in a single document to allow multiple author contributions in representation of an email thread.
It properly associates RFC 5322 similar header data to each instance of communication in the document using markup conventions.
It solves all possible problems associated with accessibility, semantics, and other issues confined solely to the markup technology itself.
It solves all possible security conditions with regard to application layer processing and solves absolutely no problems associated with transmission.
The language may or may not be written in some derivative of XML and is immediately available to XML-derived technologies.
The language instances require validation from the user-agent before they are allowed to be transmitted as email.

With that being said, what technology problems are associated with such a project? Will this present programming problems to user-agents? Would such a project prove incompatible with RFC 5322 format email, where the content is to be only 7bit ASCII? Would such a technology prove harmful to email servers? Are there additional security problems associated with such a project? What are your other technology-specific thoughts about such a project? Please keep answers and responses as technology/programming focused as possible. I will down vote any comments related to business opinions or adoption.
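To make the 7bit question concrete, here is a minimal sketch of one way a non-ASCII markup document can travel inside an ordinary message: wrap it as a MIME body with a base64 content transfer encoding, so that only ASCII bytes go over the wire. This uses Python's standard `email` package. The `<thread>` markup, the `wrap_markup` helper, and the addresses are made up for illustration, and the `text/xml` media type is an assumption, not something the hypothetical language defines.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def wrap_markup(xml_text, sender, recipient, subject):
    """Wrap a markup document in a MIME message whose wire form is pure ASCII."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # base64 keeps the body 7bit-safe even when the markup holds non-ASCII text.
    msg.set_content(xml_text, subtype="xml", charset="utf-8", cte="base64")
    return msg.as_bytes()


# Hypothetical markup instance; the element and attribute names are invented.
SAMPLE = '<thread><message author="José">Olá</message></thread>\n'

wire = wrap_markup(SAMPLE, "a@example.com", "b@example.com", "markup test")

# The receiving side decodes the transfer encoding and recovers the markup.
parsed = message_from_bytes(wire, policy=policy.default)
recovered = parsed.get_content()
```

If this holds up, the 7bit restriction would constrain the transfer encoding rather than the markup language itself, at the cost of a body that is no longer human-readable in transit.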

Speaking as a semantics wonk, one of the most troublesome problems is adequate constraints over the meaning of the data. Assume you have infinite buy-in and cooperation, as your final sentence implies. :-) I contend that XML alone still does not adequately enforce the semantics you will ultimately desire. Regrettably, I can't bring examples to bear quickly, but speaking off the cuff:

"the contents of the GOK tag must be an integer no larger than the total number of BREEP tags contained inside the preceding FLOIT tag"

is a rule I don't see you enforcing through XML validation any time soon. (I could be wrong; it's early in the day for me.)
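For what it's worth, a rule like that is easy to check procedurally once it lives outside the grammar, which is rather the point: the constraint ends up in code, not in the language definition. A rough sketch using Python's standard `xml.etree.ElementTree` follows. GOK, BREEP and FLOIT are the made-up tags from above, `gok_violations` is a hypothetical helper, and I am assuming "preceding FLOIT" means the most recent FLOIT start tag in document order.

```python
import xml.etree.ElementTree as ET


def gok_violations(root):
    """Return one message for every GOK element that breaks the rule."""
    problems = []
    breep_count = None  # BREEP total of the most recent FLOIT seen so far
    for elem in root.iter():  # iter() walks the tree in document order
        if elem.tag == "FLOIT":
            # Count every BREEP descendant of this FLOIT.
            breep_count = sum(1 for _ in elem.iter("BREEP"))
        elif elem.tag == "GOK":
            text = (elem.text or "").strip()
            try:
                value = int(text)
            except ValueError:
                problems.append(f"GOK content {text!r} is not an integer")
                continue
            if breep_count is None:
                problems.append(f"GOK {value} has no preceding FLOIT")
            elif value > breep_count:
                problems.append(f"GOK {value} exceeds BREEP count {breep_count}")
    return problems


BAD = ET.fromstring(
    "<doc><FLOIT><BREEP/><BREEP/></FLOIT>"
    "<GOK>2</GOK><GOK>3</GOK><GOK>many</GOK></doc>"
)
GOOD = ET.fromstring("<doc><FLOIT><BREEP/></FLOIT><GOK>1</GOK></doc>")

bad_report = gok_violations(BAD)    # two problems: "3" is too large, "many" is not an integer
good_report = gok_violations(GOOD)  # empty list
```

Every user-agent would need to carry an equivalent of this function for each such rule, and they would all need to agree on the reading of "preceding".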

This isn't fatal, but it's hard, and not only that, it's deceptively hard. In a nutshell, the semantics you'll eventually need to enforce in any endeavor will quickly require the equivalent of first-order logic (if not second-order), which requires a robust description language. Such languages exist (Common Logic springs immediately to mind, as does OWL Full)... but then you need a kick-ass reasoning engine to enforce the rules.

I say it's deceptively hard for reasons which unfortunately creep close to your verboten business opinions, but it still bears mentioning if only for the sake of human factors. That is, in my experience, users are so used to the limitations imposed by a relational-algebra-ish way of approaching data modeling that they tend to stick naturally to very crude rules like "this field must be an integer, that one must be a string", and assume subconsciously that real semantics will be enforced by human eyeballs. In other words, it will be hard to see the need for anything more than effectively mere syntactic enforcement... but that's just my experience; yours might be quite different.

I understand adoption will be difficult, which is why I would prefer not to have that repeated to me to the point where it becomes the only thing of value anybody points out. I think most of the semantic limitations you mentioned could be solved using schema definitions, including quantity expectations and type definitions for certain tags. Relationship constraints would require an assisting technology like OWL, as you mentioned, but I am not sure that is necessary for the language itself. I am wondering whether there is harm manifest outside the language that I have not thought of.

2009-08-27 14:27:36

I suppose it would depend on where you draw the line between "language" and "other". Eventually, the semantics behind those markups have to be expressed, formally (i.e., not as a comment) and publicly. Even if that expression isn't part of the language, it would still need to be made as visible as the language, wouldn't you say?

Paul Brinkley 2009-08-27 15:10:30

I would say that. In my opinion semantics are half declarative and half structural, assuming the meta-language used to write the markup language is structurally self-aware, even though structural interpretation from the language itself would be limited to moving up and down the tree without help from assisting technologies. Whether the assisting technologies are made immediately available to the user is, I think, purely a design consideration of the processing user-agent, as opposed to a functional demand of the language.

2009-08-27 15:27:17
