views:

232

answers:

3

(Followup to this question)

After surviving the first wave of incoming shipments (9 hours of copy/paste), I now believe I have all the requirements.

Here is the updated workflow:

  • Monkey collects email attachments (4 Excel spreadsheets, 1 PDF)
  • Monkey creates central database, does complex calculations (right now this is also an Excel spreadsheet)
  • Monkey sends data to two bosses, who set the retail prices independently; first one to reply wins
  • Monkey sends order form to our other warehouses, also Excel
  • Monkey sends spreadsheets to VIP customers, carefully sanitized and formatted (4 different discount categories)
  • Jurily enters the data into the accounting system. I've given up on automating this part, there's too much business logic involved, and the database is a pile of sh^W legacy

My question: What technologies would you use for a quick and dirty solution? I'm mostly sold on C#, but coming from a Linux/C++ background, I'm horribly confused about my choices in Microsoft-land.

For bonus points: How would you redesign the whole system from the ground up?

Clarification: I'm looking for basically anything that has the potential to get me reading the right things, just give me the keywords and a short description. Google will guide me from there.

P.S. in case you were wondering, my job title is System Administrator.

+2  A: 

C#. Linus may b nice, but let me assume you run windows in the company anyway. THe main problem will be (1) (the attachments, PDF) - anything you do here will rely on them having a specific form, and I bet there is a monkey on the other end sending them that rgegularly changes the layout ;)

Anyhow, I would go with C# 4.0 totally. Depending on what you use server side... i may even bypass excel and go exchange direct to read the emails.

TomTom
+1 I didn't think of setting up an Exchange server, thank you.
Jurily
As far as I know, all the invoices are computer-generated; at least there's no structural variance between last month's invoices. Even if they're not, fixing the parser still sounds much better than doing the work by hand. As for the PDF, I'm hoping I can just ask the people on the other end to send us a spreadsheet as well.
Jurily
+2  A: 

You can interact with excel via Com Interop. See this article for how to do this with C#. This is an ugly solution in the sense that you must run it on a system that has Excel. You are not creating spreadsheets, you are programmatically telling Excel to create spreadsheets. This has its own advantages and disadvantages.

It's reasonably easy to create spreadsheets yourself (Excel can open spreadsheetml files), though this makes it tougher in some instances; the easiest way to generate an xls file is to tell Excel to save one (treating excel as a database is pretty easy too, but that doesn't give you access to all Excel functionality).

Brian
Note: I consider Com Interop to be the quick and dirty solution.
Brian
"I consider Com Interop to be the quick and dirty solution", yes, but it can grow from there. You can use automation via C# interop to control Outlook to get your email attachments, then open them within Excel, do what you need to there, etc. Databases can be controlled via ADO.NET. The PDFs are an issue, but all the Microsoft Office products can be automated quite nicely, esp. if using C# 4.0 on VS 2010. Learning the object models will take some work, but the VBA macro recorder (at least for Excel, I don't know about Outlook) can really help out with this.
Mike Rosenblum
Not quick and dirty - the PIA (Primary Interop Asemblies) Are the defines interface for using .NET with Office. Sort of the official way ;)
TomTom
@TomTom/Mike: I agree that Com Interop is the official way to interact with Excel, but I consider interacting with Excel to be the quick and dirty way to create Excel files, since it tends to be slower, requires Excel, and often random instances of Excel get left behind stupidly deciding not to close. Still, as a tool to run locally to automate work it should be perfectly fine nonetheless. A slow but nice way would probably be spreadsheetml. As an aside, turning XML files into SpreadsheetML files via XSLT is incredibly easy.
Brian
@Brian: after reading [this article](http://www.joelonsoftware.com/items/2008/02/19.html), and having bad experiences with OpenOffice in the past, I'm not quite convinced there _is_ another way to create Excel files. At least, not with the kind of reliability I need. These things are sent out to our biggest customers.
Jurily
@Jurily: The two ways I mention in my post both worked when I tested them, but both have disadvantages. SpreadsheetML works really beautifully, but requires Excel 2003 or greater. In my own experience, 2007 was more reliable; Excel 2003 sometimes choked on large SpreadsheetML files. Treating Excel as a database (via ADO) is also very reliable, and I have used it with great success. However, ADO is for databases, so you're just sticking in data directly. If you need formulas, this might be insufficient. Sometimes you can fake it by pre-writing an excel file and inserting into that.
Brian
@Jurily: I should point out that the reason both the methods I mention are in any way successful is that they are blessed by MS. SpreadsheetML is complicated enough that creating an application which reliably reads them when created by Excel is probably tough, but that isn't the problem you are trying to solve. As for ADO, pretty much all it lets you do is insert raw rows into your Excel worksheets. But you can get .xls files as output in case your clients don't believe in buying new versions of Office.
Brian
A: 

actually I would do this in Outlook and Excel VBA, and in Access. Don't see why can't scrape the data out of the pdf

alastair