Understand that almost any answer to this question will depend on the constraints of the .doc files you are using.
That being said, the first option I'd consider is converting them to a more easily parsed format. RTF is a great example, and if you can get them into this format, the RTF Pocket Guide from O'Reilly is a great resource for understanding the structure of the files. Converting the files is pretty simple if you can install AbiWord on the Linux machine. From a command line, you'd just run:
abiword --to=rtf some_file_name.doc
Of course, in Ruby you'd just wrap these commands.
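As a rough idea of what that wrapping could look like, here is a minimal sketch. The method names (abiword_command, converted_path, convert_doc) are made up for illustration, and it assumes abiword is on the PATH and writes its output next to the input file with the extension swapped, which you should check on your own machine.

```ruby
require "open3"

# Build the argument list for one AbiWord conversion. Using an array
# instead of a single string keeps odd file names away from the shell.
def abiword_command(path, format)
  ["abiword", "--to=#{format}", path]
end

# Assumption: AbiWord writes the result beside the input file,
# with the extension replaced by the target format.
def converted_path(path, format)
  path.delete_suffix(File.extname(path)) + ".#{format}"
end

# Run the conversion and return the path of the file we expect to exist.
def convert_doc(path, format: "rtf")
  _out, err, status = Open3.capture3(*abiword_command(path, format))
  raise "abiword failed for #{path}: #{err}" unless status.success?
  converted_path(path, format)
end
```

You'd then call something like convert_doc("some_file_name.doc") for each file and collect the returned paths.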
It's the merging that is more complicated, and it will depend on your files. You'll have to decide whether you're going to combine the stylesheets in each individual doc, the font tables, and so on. The content just sits in the middle of the RTF file, but it's all the semantic and style data that you'll have to make choices about. There is no 'one way' here, simply because it depends on what you want on the other side. Here is where the RTF Pocket Guide is a great help: you'll want to use it to understand the structure of your RTF files and decide what you do and don't want.
Otherwise, if you just want the content with none of the semantics, you could always convert them to txt files and then concatenate them. The command is very similar:
abiword --to=txt some_file_name.doc
This is dead simple: it will just spit out the text, and you can concatenate it and be done with it. But again, you'll lose all formatting of any sort.
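For the plain-text route, the whole thing could be sketched in Ruby roughly like this. The method names are hypothetical, and it again assumes abiword is on the PATH and writes some_file_name.txt next to some_file_name.doc.

```ruby
# Join already-converted text files into one output file, in the order given.
def concat_text_files(txt_paths, out_path, separator: "\n")
  File.open(out_path, "w") do |out|
    txt_paths.each_with_index do |path, i|
      out.write(separator) unless i.zero?
      out.write(File.read(path))
    end
  end
  out_path
end

# Convert each .doc to .txt with AbiWord, then concatenate the results.
def docs_to_single_text(doc_paths, out_path)
  txt_paths = doc_paths.map do |doc|
    system("abiword", "--to=txt", doc) or raise "abiword failed for #{doc}"
    # Assumption: the .txt file lands beside the .doc file.
    doc.delete_suffix(File.extname(doc)) + ".txt"
  end
  concat_text_files(txt_paths, out_path)
end
```

Something like docs_to_single_text(Dir.glob("*.doc").sort, "merged.txt") would then give you one text file, with the order of the content decided by the sort.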