I am trying to write a Thunderbird extension that lets you compose a message but processes the message text before sending it. To do that, I need access to the plain-text content of the email body.

Here is what I have so far, as test code in the Extension Developer JavaScript console.

var composer = document.getElementById('msgcomposeWindow');
var frame = composer.getElementsByAttribute('id', 'content-frame').item(0);
if(frame.editortype != 'textmail') {
  print('Sorry, you are not composing in plain text.');
  return;
}

var doc = frame.contentDocument.documentElement;

// XXX: This does not work because newlines are not in the string!
var text = doc.textContent;
print('Message content:');
print(text);
print('');

// Do a TreeWalker through the composition window DOM instead.
var body = doc.getElementsByTagName('body').item(0);
var acceptAllNodes = function(node) { return NodeFilter.FILTER_ACCEPT; };
var walker = document.createTreeWalker(body, NodeFilter.SHOW_TEXT | NodeFilter.SHOW_ELEMENT, { acceptNode: acceptAllNodes }, false);

var lines = [];

var justDidNewline = false;
while(walker.nextNode()) {
  if(walker.currentNode.nodeName == '#text') {
    lines.push(walker.currentNode.nodeValue);
    justDidNewline = false;
  }
  else if(walker.currentNode.nodeName == 'BR') {
    if(justDidNewline)
      // This indicates back-to-back newlines in the message text.
      lines.push('');
    justDidNewline = true;
  }
}

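// Use an indexed loop: for-in on an array leaks a global and can visit extra properties.
for(var i = 0; i < lines.length; i++) {
  print(i + ': ' + lines[i]);
}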

I would appreciate any feedback as to whether I'm on the right track. I also have some specific questions:

  • Does doc.textContent really not have newlines? How stupid is that? I'm hoping it's just a bug in the JavaScript console, but I suspect not.
  • Is the TreeWalker correct? I first tried NodeFilter.SHOW_TEXT but it did not traverse into the <SPAN>s which contain the quoted material in a reply. Similarly, it seems funny to FILTER_ACCEPT every node and then manually cherry-pick it later, but I had the same problem where if I rejected a SPAN node, the walker would not step inside.
  • Consecutive <BR>s break the naive implementation because there is no #text node in between them. So I manually detect them and push empty lines on my array. Is it really necessary to do that much manual work to access the message content?
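
On the consecutive <BR> problem, here is a minimal sketch of a different bookkeeping scheme from the code above: treat every <BR> as a line terminator and accumulate text into a buffer, so no justDidNewline flag is needed. The helper collectLines and the text/el builders are hypothetical names made up for illustration. The sample tree is built from plain objects standing in for DOM nodes so it can run outside Thunderbird; it has not been checked against a real compose window.

```javascript
// Hypothetical helper: walk a DOM-like tree and rebuild the lines of text.
// Nodes only need nodeName, nodeValue and childNodes.
function collectLines(root) {
  var lines = [];
  var current = '';

  function visit(node) {
    if (node.nodeName === '#text') {
      // Text may be split across several nodes on one line, so append.
      current += node.nodeValue;
    } else if (node.nodeName === 'BR') {
      // A <BR> ends the current line, even if that line is empty.
      lines.push(current);
      current = '';
    }
    var children = node.childNodes || [];
    for (var i = 0; i < children.length; i++) {
      visit(children[i]); // descend into SPANs and other containers
    }
  }

  visit(root);
  if (current !== '') {
    lines.push(current); // trailing text with no final <BR>
  }
  return lines;
}

// Plain-object stand-ins for DOM nodes (an assumption for this example).
function text(value) {
  return { nodeName: '#text', nodeValue: value, childNodes: [] };
}
function el(name, children) {
  return { nodeName: name, nodeValue: null, childNodes: children || [] };
}

var sampleBody = el('BODY', [
  text('Hello'), el('BR'), el('BR'),
  el('SPAN', [text('> quoted'), el('BR')]),
  text('Bye')
]);
var sampleLines = collectLines(sampleBody);
```

With this sample tree, the two back-to-back <BR>s produce one empty line between 'Hello' and '> quoted', and the text inside the SPAN is picked up because the walk descends into every element.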
+1  A: 

Well, don't everybody chime in at once!

I posted this as a mozilla.dev.extensions thread and there was some fruitful discussion. I've been playing around in Venkman and the solution is to throw away my DOM/DHTML habits and write to the correct API.

var editor = window.gMsgCompose.editor;

// 'text/html' works here too
var text = editor.outputToString('text/plain', editor.eNone);

Now text has the plaintext version of the email body being composed.
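
Since the goal was to process the text before sending, here is a rough sketch of wrapping that call and splitting the result into lines. getBodyLines is a hypothetical helper, and fakeEditor is a stub that only mimics the two members used above (outputToString and eNone) so the snippet can run outside Thunderbird. Inside a compose window you would pass window.gMsgCompose.editor instead.

```javascript
// Hypothetical helper: fetch the plain-text body and split it into lines.
function getBodyLines(editor) {
  var text = editor.outputToString('text/plain', editor.eNone);
  // Normalize CRLF and lone CR to LF before splitting.
  return text.replace(/\r\n?/g, '\n').split('\n');
}

// Stub standing in for the real editor object (an assumption for this example).
var fakeEditor = {
  eNone: 0,
  outputToString: function (mimeType, flags) {
    return 'Hello\r\n\r\n> quoted\r\nBye';
  }
};

var bodyLines = getBodyLines(fakeEditor);
```

Unlike the TreeWalker approach, blank lines come through as empty strings without any special handling, because the newlines are already in the returned string.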

jhs