hi guys

I am testing my Jython program, which does some neat [".xls", ".doc", ".rtf", ".tif", ".tiff", ".pdf" files] -> pdf (intermediate file) -> tif (final output) conversion using Open Office. We moved away from MS Office because of the problems we had with automation. We have now knocked down most of the show-stopper errors, with one remaining: OO hangs after a while.

It happens at the line marked '<<<<<<<<<<<<' in the code.

What is the correct way to handle a stalled Open Office process? Could you please provide useful links and suggest a good way out?
I also have a couple more questions.

Sum up:
* How do I handle a stalled Open Office instance?
* How do I run the conversion headless through the Java API, so I don't have a GUI popping up all the time and wasting memory?
* Any general suggestions on code quality, optimizations and coding standards will also be most appreciated.


Traceback (innermost last):
File "dcmail.py", line 184, in ?
File "dcmail.py", line 174, in main
File "C:\DCMail\digestemails.py", line 126, in process_inbox
File "C:\DCMail\digestemails.py", line 258, in _convert
File "C:\DCMail\digestemails.py", line 284, in _choose_conversion_type
File "C:\DCMail\digestemails.py", line 287, in _open_office_convert
File "C:\DCMail\digestemails.py", line 299, in _load_attachment_to_convert
com.sun.star.lang.DisposedException: java.io.EOFException
at com.sun.star.lib.uno.bridges.java_remote.java_remote_bridge$MessageDispatcher.run(java_remote_bridge.java:176)

com.sun.star.lang.DisposedException: com.sun.star.lang.DisposedException: java.io.EOFException

Just to be clear, this exception is only thrown when I kill the Open Office process. Otherwise the program just waits indefinitely for Open Office to complete.


The Code (with non functional code tags)

[code]

#ghost script handles these file types
GS_WHITELIST=[".pdf"]
#Open Office handles these file types
OO_WHITELIST=[".xls", ".doc", ".rtf", ".tif", ".tiff"]
#whitelist is used to check against any unsupported files.
WHITELIST=GS_WHITELIST + OO_WHITELIST

def _get_service_manager(self):
    try:
        self._context=Bootstrap.bootstrap();
        self._xMultiCompFactory=self._context.getServiceManager()
        self._xcomponentloader=UnoRuntime.queryInterface(XComponentLoader, self._xMultiCompFactory.createInstanceWithContext("com.sun.star.frame.Desktop", self._context))
    except:
        raise OpenOfficeException("Exception Occurred with Open Office")

def _choose_conversion_type(self,fn):
    ext=os.path.splitext(fn)[1]
    if ext in GS_WHITELIST:
        self._ghostscript_convert_to_tiff(fn)
    elif ext in OO_WHITELIST:
        self._open_office_convert(fn)

def _open_office_convert(self,fn):
    self._load_attachment_to_convert(fn)
    self._save_as_pdf(fn)
    self._ghostscript_convert_to_tiff(fn)

def _load_attachment_to_convert(self, file):
    file=self._create_UNO_File_URL(file)
    properties=[]
    p=PropertyValue()
    p.Name="Hidden"
    p.Value=True
    properties.append(p)
    properties=tuple(properties) 
    self._doc=self._xcomponentloader.loadComponentFromURL(file, "_blank",0, properties)  # <<<<<<<<<<<<<<< here is line 299

def _create_UNO_File_URL(self, filepath):
    try:
        file=str("file:///" + filepath)
        file=file.replace("\\", "/")
    except MalformedURLException, e:
        raise e
    return file

def _save_as_pdf(self, docSource):
    dirName=os.path.dirname(docSource)
    baseName=os.path.basename(docSource)
    baseName, ext=os.path.splitext(baseName)
    dirTmpPdfConverted=os.path.join(dirName + DIR + PDF_TEMP_CONVERT_DIR)
    if not os.path.exists(dirTmpPdfConverted):
        os.makedirs(dirTmpPdfConverted)
    pdfDest=os.path.join(dirTmpPdfConverted + DIR + baseName + ".pdf")
    url_save=self._create_UNO_File_URL(pdfDest)
    properties=self._create_properties(ext)
    try:
        try:
            self._xstorable=UnoRuntime.queryInterface(XStorable, self._doc);
            self._xstorable.storeToURL(url_save, properties)
        except AttributeError,e:
            self.logger.info("pdf file already created (" + str(e) + ")")
            raise e
    finally:
        try:
            self._doc.dispose()
        except:
            raise

def _create_properties(self,ext):
    properties=[]
    p=PropertyValue()
    p.Name="Overwrite"
    p.Value=True
    properties.append(p)
    p=PropertyValue()
    p.Name="FilterName"
    if   ext==".doc":
        p.Value='writer_pdf_Export'
    elif ext==".rtf":
        p.Value='writer_pdf_Export'
    elif ext==".xls":
        p.Value='calc_pdf_Export'
    elif ext==".tif":
        p.Value='draw_pdf_Export'
    elif ext==".tiff":
        p.Value='draw_pdf_Export'
    properties.append(p)
    return tuple(properties)

def _ghostscript_convert_to_tiff(self, docSource):
    dest, source=self._get_dest_and_source_conversion_file(docSource)
    try:
        command = ' '.join([
            self._ghostscriptPath + 'gswin32c.exe',
           '-q',
           '-dNOPAUSE',
           '-dBATCH',
           '-r500',
           '-sDEVICE=tiffg4',
           '-sPAPERSIZE=a4',
           '-sOutputFile=%s %s' % (dest, source),
           ])
        self._execute_ghostscript(command)
        self.convertedTifDocList.append(dest)
    except OSError, e:
        self.logger.info(e)
        raise e
    except TypeError, (e):
        raise e
    except AttributeError, (e):
        raise e
    except:
        raise

[/code]

+1  A: 

OpenOffice.org has a "-headless" parameter to run it without a GUI. I'm not sure this actually frees up all the resources that would otherwise be spent on the GUI. Here's how I run my server-side headless instance:

soffice -headless -accept="socket,port=1234;urp" -display :25

I can't tell what's causing the stalling problems for your Python script, but you might want to check out PyODConverter and see what that script does differently; it may help you catch the error causing your trouble.
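
If you do start the headless instance yourself from a script, here is a rough sketch of how the command above could be built and how to wait until the listener is up. It is plain Python, not tested against a real OpenOffice install. The helper names headless_command and wait_for_port are made up for illustration; the flags are only the ones from the command above. Note that the quotes around the -accept value are for the shell, so they are dropped when the arguments are passed as a list.

```python
import socket
import time


def headless_command(port, display=None):
    """Build the argument list for the headless soffice command shown above.

    Hypothetical helper: it only reproduces the flags from the answer.
    """
    argv = ["soffice", "-headless", "-accept=socket,port=%d;urp" % port]
    if display is not None:
        # -display is only meaningful where an X display is used
        argv.extend(["-display", display])
    return argv


def wait_for_port(host, port, timeout_seconds, poll_interval=0.2):
    """Return True once something accepts TCP connections on host:port.

    Returns False if nothing is listening before timeout_seconds elapse.
    """
    deadline = time.time() + timeout_seconds
    while time.time() < deadline:
        try:
            connection = socket.create_connection((host, port),
                                                  timeout=poll_interval)
        except socket.error:
            # Not listening yet (refused or timed out); retry after a pause
            time.sleep(poll_interval)
        else:
            connection.close()
            return True
    return False
```

A caller would pass the list to subprocess.Popen(headless_command(1234, ":25")) and then call wait_for_port("localhost", 1234, 30) before trying to connect. The 30 second limit is an arbitrary choice.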

Alexander Malfait
Well, actually, as I am using Jython and the whole conversion is handled by the Java API (I can't be bothered with PyUNO and its headaches), I don't need to start an instance myself; it opens one automatically. Anyway, thanks for the ideas.
Setori
What I want to know is how to disable the GUI during this automatic process. I am already familiar with this command from when I was working with PyUNO (soffice -headless -accept="socket,port=1234;urp" -display :25).
Setori
+1  A: 

The icky solution is to have a monitor for the OpenOffice process. If your monitor knows the PID and has privileges, it can get CPU time used every few seconds. If OO hangs in a stalled state (no more CPU), then the monitor can kill it.

The easiest way to handle this is to have the "wrapper" that's firing off the open office task watch it while it runs and kill it when it hangs. The parent process has to do a wait anyway, so it may as well monitor.
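
A minimal sketch of that wrapper idea, in plain Python. It uses a wall-clock deadline rather than the CPU-time sampling described above, since reading another process's CPU time needs platform-specific calls or a third-party library. The name run_with_watchdog and its parameters are invented for this example, and it only applies when your own code is what starts the process.

```python
import subprocess
import sys
import time


def run_with_watchdog(argv, timeout_seconds, poll_interval=0.1):
    """Run argv as a child process and kill it if it outlives the deadline.

    Returns a (exit_code, timed_out) tuple. This is a wall-clock watchdog,
    not a CPU-usage monitor.
    """
    process = subprocess.Popen(argv)
    deadline = time.time() + timeout_seconds
    # The parent has to wait anyway, so it polls and checks the clock
    while process.poll() is None:
        if time.time() >= deadline:
            process.kill()
            process.wait()  # reap the killed child
            return process.returncode, True
        time.sleep(poll_interval)
    return process.returncode, False


if __name__ == "__main__":
    # Stand-in for a hung conversion: a child that sleeps far too long
    hung = [sys.executable, "-c", "import time; time.sleep(60)"]
    print(run_with_watchdog(hung, timeout_seconds=1.0))
```

The demo at the bottom uses a sleeping Python child as a stand-in for a stalled office process; the actual command line to run is up to you.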

If OpenOffice hangs in a loop, then it's tougher to spot. CPU usually goes through the roof, stays there, and the priority plummets to the lowest possible priority. Processing or hung? Judgement call. You have to let it hang like this for "a while" (pick a random duration, 432 seconds (3 dozen dozen) for instance; you'll always be second-guessing yourself.)
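
When the office process is started for you by the library (as Bootstrap.bootstrap() appears to do in the question) and not by your own wrapper, one option is to put the "a while" deadline around the blocking call itself. This is a rough sketch in plain Python under that assumption; call_with_timeout and CallTimedOut are hypothetical names, not part of any OpenOffice API. Timing out does not stop the hung call. The caller still has to kill the office process, after which, going by the traceback in the question, the blocked call fails with a DisposedException.

```python
import threading
import time


class CallTimedOut(Exception):
    """Raised when the wrapped call does not return before the deadline."""


def call_with_timeout(func, args, timeout_seconds):
    """Run func(*args) in a worker thread and wait at most timeout_seconds.

    Returns the call's result, re-raises its exception, or raises
    CallTimedOut if it is still running when the deadline passes.
    """
    outcome = {}

    def worker():
        try:
            outcome["value"] = func(*args)
        except Exception as error:
            outcome["error"] = error

    thread = threading.Thread(target=worker)
    thread.daemon = True  # a stuck worker must not keep the program alive
    thread.start()
    thread.join(timeout_seconds)
    if thread.is_alive():
        raise CallTimedOut("no result after %s seconds" % timeout_seconds)
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]


if __name__ == "__main__":
    try:
        # time.sleep stands in for a document load that never returns
        call_with_timeout(time.sleep, (60,), 1.0)
    except CallTimedOut as error:
        print("giving up: %s" % error)
```

In the question's code the wrapped call would be the load at the marked line. The caller catches CallTimedOut, kills the office process, and moves on to the next file.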

S.Lott
Thanks mate for the idea
Setori