views:

367

answers:

2

In a Ruby web application I want users to be able to upload documents. If the user uploads a Microsoft Word file (.doc) I want Ruby to count the number of pages in the file. It would be even slicker to get the number of words, but the number of pages will do.

How would I do that? Is there a Ruby library/gem that can do that for me? Is it even possible, given the DOC-format?

+2  A: 

In Ruby, you can open a Word file through the win32ole library (this needs Windows with Word installed):

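require 'win32ole'

# start Word through OLE automation
word = WIN32OLE.new('Word.Application')
word.visible = true

# open/create new document
word.documents.add

# or open an existing file (use an absolute path)
word.documents.open(path_to_file)

# number of documents currently open
word.documents.count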

(source: http://www.ruby-forum.com/topic/99742#214485)

See http://www.perlmonks.org/?node_id=614609 for an algorithm that gets the proper/expected word count (note: the algorithm is in Perl).
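If you only need a rough figure, a whitespace-based count in plain Ruby may be enough. This is a minimal sketch and not a port of the PerlMonks algorithm; `count_words` is a hypothetical helper name, and it assumes you already have the document text as a string. Expect the result to differ somewhat from what Word itself reports.

```ruby
# Rough word count: every run of non-whitespace characters counts as one word.
# Hypothetical helper, not part of any library.
def count_words(text)
  text.scan(/\S+/).size
end

sample = "Hello,  world!\nThis is a test."
puts count_words(sample)
```

Punctuation stays attached to the neighbouring word here, so "world!" counts once.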

Then close the document without saving changes and quit Word:

word.activedocument.close( false )
word.quit
Jonathan Fingland
The win32ole gem only runs on Windows. He's talking about a web application, so the likelihood that he's running this on Windows is vanishingly small. Of course, if there were people crazy enough to try it, they'd be on Stack Overflow...
Sarah Mei
Okay, is there any way to do this while NOT running on Windows?
avdgaag
+3  A: 

Call the ComputeStatistics() method on the document's Range object:

require 'win32ole'

# values from Word's WdStatistic enumeration
WdStatisticWords = 0
WdStatisticPages = 2

# attach to a Word instance that is already running with a document open
word = WIN32OLE.connect('Word.Application')
doc = word.ActiveDocument

word_count = doc.Range.ComputeStatistics(WdStatisticWords)
page_count = doc.Range.ComputeStatistics(WdStatisticPages)
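For an uploaded file you would probably open the document yourself rather than rely on ActiveDocument. Here is a rough sketch of how the pieces could fit together. `document_stats` and the `WD_STATISTIC_*` constant names are my own, not part of Word or win32ole, and the Word application object is passed in so the method itself does not load win32ole. It still only works on Windows with Word installed.

```ruby
# Values from Word's WdStatistic enumeration (constant names are my own).
WD_STATISTIC_WORDS = 0
WD_STATISTIC_PAGES = 2

# Hypothetical helper: opens the file in the given Word application object,
# reads the word and page counts, and always closes the document again.
# `path` should be an absolute path.
def document_stats(word, path)
  doc = word.Documents.Open(path)
  begin
    range = doc.Range
    {
      words: range.ComputeStatistics(WD_STATISTIC_WORDS),
      pages: range.ComputeStatistics(WD_STATISTIC_PAGES)
    }
  ensure
    doc.Close(false) # false: do not save changes
  end
end

# Usage on Windows (the path is made up):
#   require 'win32ole'
#   word = WIN32OLE.new('Word.Application')
#   begin
#     p document_stats(word, 'C:\\uploads\\report.doc')
#   ensure
#     word.Quit
#   end
```

The `ensure` blocks are there so a failed upload does not leave a document open or a stray Word process running on the server.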

You'll find various articles on automating Word with Ruby here.

David Mullet
+1 for an excellent ref
Jonathan Fingland