Hi

I have had this problem for a while and cannot figure out how to start on it with Python. My OS is Windows XP Pro. I need a script that moves the entire text (100% of it) from one .doc file to another. But it is not as easy as it sounds. There can be many target .doc files, not just one. All the target .doc files are always in the same folder (same path), but they do not all have the same name. There is only one source .doc file, the one I want to move the text from; it is always in the same folder (same path) and always has the same file name. The target names are similar to each other but, as I said, not identical. Here is the point of the whole script: the target .doc files are named HD1.doc, HD2.doc, HD3.doc, HD4.doc and so on.

What I would like is to have the entire text (really all of it, 100%) moved into the .doc file with the highest (!) number. The target .doc files will always start with "HD" and always look like the examples above. It is possible that there is only one target file, HD1.doc. In that case "1" is the maximum number and the text is moved into that file. Sometimes the target file is empty, but usually it won't be. If it is not empty, the text should be added at the end of the existing text, on the first new line (no empty lines in between). So, for example, suppose the target file with the highest number in its name contains the following text:

a

b

c

The file I want to move the text from contains:

d

This means the target file should end up containing:

a

b

c

d

But no empty lines anywhere.

I have found this (it shows three different pieces of code):

http://paste.pocoo.org/show/169309/

But none of them makes any sense to me. I know I need to begin by finding the correct target file (the HDX file where X is the highest number; again, all HD files are and always will be in the same folder), but I have no idea how to do this.
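The file lookup described above can be sketched with the standard library alone. This is a minimal sketch, not a definitive solution: the function name `find_highest_hd` and the pattern `HD_PATTERN` are made up for illustration, and it assumes every target file is named exactly HD, then a number, then .doc.

```python
import os
import re

# Matches names such as HD1.doc or HD12.doc and captures the number.
# Assumption: nothing else follows the number before the extension.
HD_PATTERN = re.compile(r"^HD(\d+)\.doc$", re.IGNORECASE)


def find_highest_hd(folder):
    """Return the full path of the HD<number>.doc file with the largest
    number in the folder, or None if the folder holds no such file."""
    best_number = -1
    best_path = None
    for name in os.listdir(folder):
        match = HD_PATTERN.match(name)
        if match is None:
            continue  # not an HD file, ignore it
        number = int(match.group(1))  # compare as integers, so 10 > 9
        if number > best_number:
            best_number = number
            best_path = os.path.join(folder, name)
    return best_path
```

Comparing the numbers as integers rather than as text matters once there are ten or more files, because as text "HD9.doc" would sort after "HD10.doc".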

I mean Microsoft Office Word .doc files. They contain "pure text". By pure text I mean that I can also read them in Notepad (like .txt files). But I need to work with the .doc extension. I want Python because I need this as an automated system, so that I don't even have to open any file. Why exactly Python and not another programming language? Because I have recently started learning Python and I need this script for my work. Python is the "only" programming language I am interested in, and that is why I would like to write this script in it. By "really 100%" I mean that the entire text (everything in the source file, every single line, whether there are 2 or several thousand) should be moved to the correct target file (which one is correct is described in my first post). I cannot move the whole file: the source file will always be the same file, but the content of the text will always be different (different words in the lines), and I need the text in the correct .doc file with the correct name, together with (meaning inside the same file as) any text that already exists in the target file. The correct target file may also be empty.
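If the files really are plain text that merely carries a .doc extension, as described above, the move itself might look like the sketch below. That is a big assumption: a genuine binary Word document would be corrupted by this approach. The name `move_text` is hypothetical. The sketch rewrites the target so that no empty lines remain anywhere, then empties the source so the text is moved rather than copied.

```python
def read_nonempty_lines(path):
    """Return the lines of a text file without line endings, skipping blank lines."""
    with open(path, "r") as handle:
        return [line.rstrip("\r\n") for line in handle if line.strip()]


def move_text(source_path, target_path):
    """Append every non-empty line of the source to the target, then empty the source.

    Assumption: both files are plain text, whatever their extension says.
    """
    new_lines = read_nonempty_lines(source_path)
    old_lines = read_nonempty_lines(target_path)

    # Rewrite the target: existing text first, moved text directly after it.
    with open(target_path, "w") as handle:
        for line in old_lines + new_lines:
            handle.write(line + "\n")

    # Truncate the source so its text now exists only in the target.
    open(source_path, "w").close()
```

With a target containing a, b, c and a source containing d, the target ends up as the four lines a, b, c, d, and an empty target simply receives the source lines.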

If someone could suggest anything, I would really appreciate it.

Thank you, best wishes.

A: 

Yes, I updated my post with additional information.

Andro
This is not an answer. Please delete this non-answer. This is more information that should be added to the question. Please update the question and delete this non-answer.
S.Lott
This is not an answer. This is status or something. Please DELETE this non-answer. We don't need status, we need a complete, clear question. Delete this, please.
S.Lott
+1  A: 

So you want to take the text from a doc file, and append it to the end of the text in another doc file. And the problem here is that these are MS Word files. It's a proprietary format, and as far as I know there is no module to access them from Python.

If you are on Windows, you can access them via the COM API, but that's pretty complicated. Still, look into it. Otherwise I recommend that you not use MS Word files. The above sounds like some sort of logging facility, and it sounds like a bad idea to use Word files for this; it's too fragile.

Lennart Regebro
Indeed, using simple text files would be so easy, if formatting is not crucial.
Morlock
A: 

Yes Lennart, I have heard about the COM API already but honestly, I don't know what to do with it. I'm not a programmer at all, I have just begun learning. I have been trying to write this script for a while and got tired (that doesn't mean I have stopped looking for the solution) without figuring anything out. I already have one script that works. It does a completely different task, but it successfully writes inside the .doc file. Unfortunately for me, it doesn't move any text anywhere. Would it be helpful if I uploaded the code somewhere?

Andro
This is not an answer. Please delete this non-answer. This appears to be a comment on another answer. This may also be information that should be added to the question. Please update the question (or comment on another answer) and DELETE this non-answer. Please do not replace this with "status". DELETE it, please.
S.Lott
+3  A: 

OpenOffice ships with full Python scripting support, have a look: http://wiki.services.openoffice.org/wiki/Python

Might be easier than trying to mess around with MS Word and COM APIs.

anteatersa
Thanks for the nice pointer!
Morlock
A: 

I have tried asking on the OpenOffice forum but they don't answer. From what I have seen, the code could be something like this:

  import win32com.client

  # Start Word in the background through COM (needs the pywin32 package).
  wordApp = win32com.client.Dispatch('Word.Application')
  wordApp.Visible = False

  # Open the document once and keep the returned object.
  HD1 = wordApp.Documents.Open('C:\\test.doc')  # HD1 Word document as an object.
  HD1.Content.Copy()  # Copies the entire document content to the clipboard.

But I have no idea what that means. Also, I cannot use the .doc file like that, because I never know the correct file name in advance (HDX.doc where X is the maximum integer; all HD files are in the same directory), so I cannot hard-code its name. The script should find the correct file. Also, "filename" = wordApp.Documents.Open... would surely give me a syntax error. :-(
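On the file name concern: `Documents.Open` takes an ordinary string, so the path can sit in a variable that the script fills in at run time rather than being typed into the code. The sketch below is untested and rests on several assumptions: Word and the pywin32 package are installed, the function name `append_via_word` is made up, and both arguments are paths worked out elsewhere in the script. It appends the source text to the end of the target but does not remove blank lines or clear the source.

```python
import os


def append_via_word(source_path, target_path):
    """Append the text of one Word document to the end of another via COM.

    Both arguments are plain strings, so the target can be whatever
    path the script has determined at run time.
    """
    # Imported here so this module still loads on machines without pywin32.
    import win32com.client

    word = win32com.client.Dispatch("Word.Application")
    word.Visible = False
    try:
        # Word expects absolute paths.
        source = word.Documents.Open(os.path.abspath(source_path))
        text = source.Content.Text  # the whole document body as a string
        source.Close(False)  # close without saving changes

        target = word.Documents.Open(os.path.abspath(target_path))
        target.Content.InsertAfter(text)  # add the text at the end
        target.Save()
        target.Close()
    finally:
        # Always shut down the hidden Word instance, even after an error.
        word.Quit()
```

A variable name must not be wrapped in quotes on the left side of `=`; writing `target = word.Documents.Open(path)` is valid, while `"filename" = ...` is what triggers the syntax error.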

Andro
A: 

Hello? I haven't got any help on the OpenOffice forum, so if anyone could please help with this issue...

Andro