Hi,
The background of my question is associated with Tesseract, the free OCR engine (1985-1995 by HP, now hosting in Google). It specifically requires an input file and an output file; the argument only takes filename (not stream / binary string), so in order to use the wrapper API such as pytesser and / or python-tesser.py, the OCR temp files must be created. I, however, have a lot of images need to OCR; frequent disk write and remove is inevitable (and of course the performance hit). The only choice I could think about is changing the wrapper class and point the temp file to RAM disk, which bring this problem up.
If you have better solution, please let me know.
Thanks a lot.
-M