Hi,

The background to my question is Tesseract, the free OCR engine (developed at HP from 1985 to 1995, now hosted by Google). It requires an input file and an output file; its arguments accept only filenames (not streams or binary strings), so wrapper APIs such as pytesser and python-tesser.py must create temporary files for every OCR run. I have a lot of images to OCR, however, so frequent disk writes and deletes are inevitable, along with the performance hit. The only option I can think of is to change the wrapper class so that it points the temp files at a RAM disk, which is what brings this question up.

If you have a better solution, please let me know.

Thanks a lot.

-M

A: 

Are you on Linux? You could try sending the file to the program through a pipe and referring to /dev/fd/0, which is the pathname of standard input for the current process. It should work as long as the application does not seek() through it.
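
A minimal sketch of the pipe idea, assuming a Linux system where /dev/fd/0 exists. The helper name run_with_stdin_as_file is hypothetical, and `cat` stands in for any program that only accepts a filename; whether Tesseract itself can read its input this way is not confirmed here and depends on whether it seeks through the file.

```python
import subprocess


def run_with_stdin_as_file(argv_prefix, data):
    """Run a filename-only program, passing /dev/fd/0 as the "file".

    argv_prefix: the command and its options, without the input filename.
    data: bytes to feed to the child process on standard input.
    Hypothetical helper for illustration; Linux only.
    """
    result = subprocess.run(
        list(argv_prefix) + ["/dev/fd/0"],  # child opens its own stdin by path
        input=data,                         # written to the child's stdin pipe
        stdout=subprocess.PIPE,
        check=True,                         # raise if the child exits non-zero
    )
    return result.stdout


if __name__ == "__main__":
    # `cat` is only a stand-in for a tool that insists on a filename argument.
    print(run_with_stdin_as_file(["cat"], b"image bytes would go here"))
```

Nothing is written to disk on the input side; the output file would still need separate handling.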

Marco Mariani
I am on Windows, which is why I asked about WMI. I will, however, try the Linux version of Tesseract and hope that resolves it. Thanks.
Ming Xie
A: 

Searching Google, I found a possible solution (it does not involve WMI, but you can drive it through subprocess):

Download the devcon utility, a kind of command-line device manager. Then you can use something like:

subprocess.call( ("path_to_devcon\\devcon.exe", "INSTALL", "ramdisk.inf", "ramdisk") )
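
If it helps, here is a rough sketch of running such a command from Python while capturing its exit code and text output, so nothing is lost when a console window closes. The helper name run_and_capture is hypothetical, and the demo runs the Python interpreter itself rather than devcon, whose actual output is not shown here.

```python
import subprocess
import sys


def run_and_capture(argv):
    """Run a command and return (exit_code, combined stdout/stderr text).

    Hypothetical helper for illustration; argv is a list such as the
    devcon command line above.
    """
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # fold error messages into the same text
        universal_newlines=True,    # decode bytes to str
    )
    return proc.returncode, proc.stdout


if __name__ == "__main__":
    # Stand-in command: the current Python interpreter printing one line.
    code, text = run_and_capture([sys.executable, "-c", "print('ok')"])
    print(code, text)
```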

I hope this gives you a start.

ΤΖΩΤΖΙΟΥ
I tried running it on the command line first, but it quickly pops up another DOS window that disappears before I can see the text. Is there any way to fix this? Thanks. -M
Ming Xie