




Does anyone know of any VBScript / JavaScript implementations of an RTF to HTML converter?

I have seen some in VB / C#, but cannot find any reference to a scripted version. Before I start to write one - does anyone know of an existing open source project that deals with this?

How comfortable are you with PHP? This class seems to do the trick, so you could either use it as is, convert it to JavaScript, or just use it as a guideline.

Thank you - perfect place for me to start, much appreciated Gausie!
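For a rough idea of what a script-only converter has to do, here is a minimal sketch in plain JavaScript. It is not a port of the PHP class mentioned above, and the function names (rtfToHtml, escapeHtml) are made up for illustration. It assumes a very small subset of RTF: groups, \b, \i, \par and the escaped characters \\, \{ and \}. Everything else (font tables, colour tables, \'hh hex escapes, Unicode) is left unhandled, so treat it as a starting point rather than a working converter.

```javascript
// Sketch only: handles groups, \b, \i, \par and escaped \\ \{ \}.
// All other control words are silently ignored; header groups such as
// font tables are NOT skipped, so their text would leak into the output.
function escapeHtml(ch) {
  if (ch === "&") return "&amp;";
  if (ch === "<") return "&lt;";
  if (ch === ">") return "&gt;";
  return ch;
}

function rtfToHtml(rtf) {
  var html = "";
  var run = "";                                  // text collected under the current formatting
  var state = { bold: false, italic: false };    // current character formatting
  var stack = [];                                // saved formatting, one entry per open group

  // Emit the collected text wrapped in tags that match the current state.
  function flush() {
    if (run === "") return;
    var piece = run;
    if (state.italic) piece = "<i>" + piece + "</i>";
    if (state.bold) piece = "<b>" + piece + "</b>";
    html += piece;
    run = "";
  }

  var i = 0;
  while (i < rtf.length) {
    var ch = rtf.charAt(i);
    if (ch === "{") {
      // A group inherits the current formatting; remember it for the matching "}".
      stack.push({ bold: state.bold, italic: state.italic });
      i++;
    } else if (ch === "}") {
      flush();
      if (stack.length > 0) state = stack.pop();
      i++;
    } else if (ch === "\\") {
      var next = rtf.charAt(i + 1);
      if (next === "\\" || next === "{" || next === "}") {
        run += escapeHtml(next);                 // escaped literal character
        i += 2;
        continue;
      }
      // Control word: letters, optional numeric parameter, optional one-space delimiter.
      var m = /^([a-zA-Z]+)(-?\d+)? ?/.exec(rtf.substring(i + 1));
      if (!m) { i += 2; continue; }              // unhandled control symbol: skip it
      i += 1 + m[0].length;
      var off = m[2] === "0";                    // e.g. \b0 switches bold off
      if (m[1] === "b") { flush(); state.bold = !off; }
      else if (m[1] === "i") { flush(); state.italic = !off; }
      else if (m[1] === "par") { flush(); html += "<br>"; }
    } else if (ch === "\r" || ch === "\n") {
      i++;                                       // raw line breaks carry no meaning in RTF
    } else {
      run += escapeHtml(ch);
      i++;
    }
  }
  flush();
  return html;
}
```

Calling rtfToHtml("{\\rtf1 Hello {\\b bold} and {\\i it}\\par A < B}") returns "Hello <b>bold</b> and <i>it</i><br>A &lt; B". Wrapping each text run separately keeps the generated tags balanced even when bold and italic are switched on and off in different groups.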

You didn't specify which platform you are targeting. However, the fact that you mentioned VBScript as well as JavaScript suggests you are at least using a Windows-based machine. If so, and you have access to a version of Word, you could write a script that automates the conversion, using Word as an out-of-process server. Even then, you didn't really say whether this is meant to be done from a Windows session or via a web server.

If you wanted to do this from a Windows session, you could use the following VBScript, run by the Windows Script Host:


Option Explicit

' Converts a document that Word can open (for example an RTF file) to HTML,
' saving the result next to the original with an .htm extension.
Private Sub ConvertToHtml(documentFileName)
Const wdFormatHTML = 8
Dim fso
Dim wordApplication
Dim newDocument
Dim htmlFileName

    On Error Resume Next

    Set fso = WScript.CreateObject("Scripting.FileSystemObject")

    documentFileName = fso.GetAbsolutePathName(documentFileName)

    If Not fso.FileExists(documentFileName) Then
        WScript.Echo "The file '" & documentFileName & "' does not exist."
        Exit Sub
    End If

    Set wordApplication = WScript.CreateObject("Word.Application")

    If Err.Number <> 0 Then
        Select Case Err.Number
        Case &H80020009
            WScript.Echo "Word not installed properly."
        Case Else
            ShowDefaultErrorMsg
        End Select
        Exit Sub
    End If

    ' The second argument (ConfirmConversions) is False so that Word does not prompt.
    Set newDocument = wordApplication.Documents.Open(documentFileName, False)

    If Err.Number <> 0 Then
        ShowDefaultErrorMsg
        wordApplication.Quit
        Exit Sub
    End If

    ' Construct a file name which is the same as the original file, but with a different extension.
    htmlFileName = Left(documentFileName, InStrRev(documentFileName, ".")) & "htm"

    newDocument.SaveAs htmlFileName, wdFormatHTML

    If Err.Number <> 0 Then
        ShowDefaultErrorMsg
    End If

    ' Close the document and quit Word so that no hidden Word process is left running.
    newDocument.Close False
    wordApplication.Quit

End Sub

Private Sub Main
Dim arguments

    Set arguments = WScript.Arguments

    If arguments.Count = 0 Then
        WScript.Echo "Missing file argument."
    Else
        ConvertToHtml arguments(0)
    End If

End Sub

Private Sub ShowDefaultErrorMsg
    WScript.Echo "Error #" & CStr(Err.Number) & vbNewLine & vbNewLine & Err.Description
End Sub

' Entry point: run the conversion for the file named on the command line.
Main


If you want to use this from a web server, things are a little different. You could adapt the VBScript for an ASP page, or convert it to an ASP.NET page. In either case, you will have to replace the WSH objects with the appropriate intrinsic objects. However, be warned: whilst it is possible to use an out-of-process server from IIS, it is generally a bad idea unless you know this is going to be an extremely low-volume server. Even then, the fact that Word uses GUI elements makes this risky, because Word might show a dialogue in some error condition, and on a server nobody is there to dismiss it.

In this case, it may be better to decouple the two processes: have the server script shell out to the Windows Script Host script, and immediately return a page that does a client-side pull after an appropriate delay.
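A rough sketch of the two pieces the server script would need, written as plain JavaScript string builders so it stays independent of any particular server object model. The function names, the script name convert.vbs and the result URL are all hypothetical. The only real pieces relied on are cscript (the console host of Windows Script Host, with //nologo suppressing its banner) and the HTML meta refresh tag for the client-side pull. It also assumes the paths and URL are trusted values, since nothing is escaped.

```javascript
// Builds the command line the server script would shell out to.
// scriptPath and rtfPath are assumed to be trusted, already-validated paths.
function buildConvertCommand(scriptPath, rtfPath) {
  return 'cscript //nologo "' + scriptPath + '" "' + rtfPath + '"';
}

// Builds the page returned straight away; the browser requests resultUrl
// again after delaySeconds, by which time the conversion may have finished.
function buildWaitPage(resultUrl, delaySeconds) {
  return '<html><head>' +
    '<meta http-equiv="refresh" content="' + delaySeconds + ';url=' + resultUrl + '">' +
    '</head><body>Converting, please wait...</body></html>';
}
```

The server script would start the command without waiting for it to finish, then send the wait page. If the HTML file is not there yet when the browser comes back, it can simply be served the wait page again.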

Mark Bertenshaw
Hi Mark, amazingly thorough and helpful answer! I should have specified that I need to do this just in script - but smashing response!
"just in script" = without depending on external objects