Hi all,

I am faced with the task of building a new component to be integrated into a large existing C codebase. The component is essentially a kind of compiler, and will be complicated enough that I would like to write it in OCaml (for reasons along the lines of those given here). I know that OCaml-C interaction is possible (as per the manual and this tutorial), but it looks somewhat painful.

What I'd like to know is whether others here have attempted large-scale integration of OCaml and C code, what were some of the unexpected gotchas they found, and whether at the end of the day they concluded that they would have been better off just writing the new code in C.

Note, I'm not trying to start a debate about the merits of functional versus imperative programming: let's just say we assume that OCaml happens to be the right tool for the job I have in mind, and the potential difficulty in integration is the only issue. I also don't have the option of rewriting the rest of the codebase.

To give a little more detail about the task: the component I need to implement is a certain kind of query optimizer that incorporates some research ideas my group at UC Davis is working on, and will be integrated into PostgreSQL so that we can run experiments. (A query optimizer is, essentially, a compiler.) The component would be invoked from C code, would function mostly independently but would make a certain number of calls to other PostgreSQL components to retrieve things like system catalog information, and would construct a complex C data structure (representing a physical query plan) as output.

Apologies for the somewhat open-ended question, but I'm hoping the community might be able to save me a little trouble :)

Thanks,

TJ

A: 

My rule of thumb is to stick with the language / model / style used in the existing code-base, so that future maintenance developers inherit a consistent and understandable set of application code.

The only way I could justify something like what you are suggesting would be if:

  1. You are an Expert at OCaml AND a Novice at C (so you'll be 20x as productive)
  2. You have successfully integrated it with a C library before (apparently not)

If you are at all more familiar with C than OCaml, you've just lost any "theoretical" gain from OCaml being easier to use when writing a compiler. Plus, it seems as though you will have more peers around you who are familiar with C than with OCaml.

That's my "grumpy old coder" 2 cents (which used to only cost a penny!).

Ron Savage
I don't understand why you make the assumption that one must be an expert at OCaml and a novice at C in order to be more productive using OCaml in an area that is precisely its strong suit. If there were a Web component, would you write all of that in C rather than, say, Python+HTML+Javascript?
Chuck
Question: Is Ron Savage's answer actually based on any knowledge/specifics regarding OCaml or even C?
Domingo Galdos
@Chuck - Because getting the feature done in 1 month vs 5 months might make it worth the additional complexity, maintenance headaches and risk of introducing a completely new language into an existing project and team of C developers. 1 month vs 2 months? Not worth it.
Ron Savage
@Domingo - No knowledge of OCaml, knowledge of C and many other languages, plus way too much experience inheriting ramshackle conglomerate code bases with no architectural or design discipline.
Ron Savage
@Ron: I think the general rule of thumb you state is right on for production code, but my project has to do with writing research prototype code, where the maintenance and personnel issues are somewhat less relevant. (My question wasn't super clear about this.) In the best scenario, the prototype goes great, and someone would have to rewrite the code in C to put it in the real codebase. But even if we'd written it in C to begin with, it would probably be better to rewrite it from scratch at that point anyway. ("Plan to throw one away" ...)
tjgreen
@Tjgreen - Yep, that's true - my comments were more geared towards a production app. team. :-)
Ron Savage
@tjgreen: Good logic there. I will often prototype in OCaml, make sure I get the correct results, then work on a C implementation if I plan on doing one. That way, I have a method to verify results when bugs do come up in the C version.
nlucaroni
@Ron Understood-- a very good point on the topic of project management; just seemed pretty off-topic to me when the question was about OCaml/C specifics.
Domingo Galdos
@tjgreen: I would be careful about saying "It's prototype code, not production code." Prototype code can turn into production code with very little warning.
TwentyMiles
+7  A: 

Great question. You should be using the better tool for the job.

If in fact your intention is to use the better tool for the job (and you are sure lex and yacc are going to be a pain), then I have something to share with you: it's not painful at all to call OCaml from C, and vice versa. Most of the time I've been writing OCaml that calls C, but I have written a few calls the other way. They've mostly been debug functions that don't return a result. Calling back and forth is really about packing and unpacking the OCaml value type on the C side. That tutorial you mention covers all of that, and very well.

I disagree with Ron Savage's remark that you have to be an expert in the language. I recall starting out where I work and, within a few months, without knowing what the fuck a functor was, being able to call C and writing a thousand lines of C for numerical recipes and abstract data types. There were some hiccups (not with unpacking types, but with garbage collection of abstract data types), but it wasn't bad at all. Most of the inner loops in the project are written in C, taking advantage of SSE, external libraries (LAPACK), tighter optimized loops, and some inlined, hand-optimized assembly.

I think you might need to be experienced with designing a large project and demarcating functional and imperative sections. I would really assess how much OCaml you are going to be writing, and what kind of values you want to pass to C. I'm saying this because I'd be fearful of recommending that someone pass a recursive data structure from OCaml to C: it would mean lots of unpacking of tuples and their contents, and thus a lot of room for confusion and bugs.
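To make the "unpacking" concern concrete, here is a minimal, self-contained sketch of what walking an OCaml-style int list from C looks like. It is a toy model, not the OCaml runtime: every `model_*` name is hypothetical, and the encoding only imitates the spirit of the real macros (`Val_int`, `Int_val`, `Is_long`, `Field`) declared in `<caml/mlvalues.h>`. It assumes immediate integers carry a low tag bit of 1 and that a cons cell is a two-field block (head, tail). Block headers and garbage collection are deliberately left out, and in real stubs GC root registration is where much of the care goes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the runtime's value type: one machine word
   that is either a tagged immediate integer or a pointer to a block. */
typedef intptr_t model_value;

/* Immediate integers are stored shifted left by one with the low bit set. */
static model_value model_val_int(intptr_t n) {
    return (model_value)(((uintptr_t)n << 1) | 1u);
}

/* Undo the tagging. This sketch only uses non-negative integers, so the
   right shift is well defined on every platform. */
static intptr_t model_int_val(model_value v) {
    return v >> 1;
}

/* Word-aligned block pointers always have a low bit of 0. */
static int model_is_int(model_value v) {
    return (v & 1) != 0;
}

/* Read field i of a block. */
static model_value model_field(model_value block, int i) {
    return ((model_value *)block)[i];
}

/* Sum an int list: [] is the immediate 0, and x :: tl is a two-field
   block holding (x, tl). Every step is a manual unpack. */
static intptr_t model_sum_list(model_value list) {
    intptr_t total = 0;
    while (!model_is_int(list)) {
        model_value head = model_field(list, 0);
        assert(model_is_int(head)); /* nothing checks this for us in C */
        total += model_int_val(head);
        list = model_field(list, 1);
    }
    return total;
}

/* Build the list [1; 2; 3] in caller-provided storage, back to front. */
static model_value model_build_123(model_value cells[3][2]) {
    model_value tail = model_val_int(0); /* [] */
    for (int i = 2; i >= 0; i--) {
        cells[i][0] = model_val_int(i + 1);
        cells[i][1] = tail;
        tail = (model_value)cells[i];
    }
    return tail;
}
```

Calling `model_sum_list(model_build_123(cells))` with a local `model_value cells[3][2]` walks three cons cells. Even for a flat list, each access is an untyped field read; a nested tree of tuples multiplies that, which is the source of the confusion mentioned above.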

nlucaroni
Very good answer. I should point out, though, that this depends strongly on the amount of OCaml work to be done compared with the C integration pain (API size). That applies to integrating any new language into any existing project.
Hynek -Pichi- Vychodil
Not only is this a good answer, it's even the one I wanted to hear :)
tjgreen
A: 

I once wrote a reasonably complex OCaml-C hybrid program. I was frustrated by what I found to be inadequate documentation, and I ended up spending too much time dealing with garbage collection issues. However, the resulting program worked and was fast.

I think there is a place for OCaml-C integration, but make sure it is worth the hassle. It might be simpler to have the programs communicate over a socket (assuming such IO operations won't eliminate the performance you want). It might also be more sane to just write the whole thing in C.
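As a rough illustration of the socket option, the sketch below shows the C side sending a request to a separate process and reading back a reply, using POSIX `socketpair` and `fork`. Everything specific here is an assumption made for the example: the function names `ask_optimizer` and `fake_optimizer`, the `plan(...)` reply format, and the fact that the peer is a forked C child rather than an actual OCaml program listening on a socket.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the separate optimizer process. It reads the
   whole request (until the peer closes its write side), then answers. */
static void fake_optimizer(int fd) {
    char buf[256];
    size_t used = 0;
    ssize_t n = 0;
    while (used < sizeof buf - 1 &&
           (n = read(fd, buf + used, sizeof buf - 1 - used)) > 0)
        used += (size_t)n;
    buf[used] = '\0';

    char out[300];
    int len = snprintf(out, sizeof out, "plan(%s)", buf);
    if (len < 0 || write(fd, out, (size_t)len) != len)
        _exit(1);
    close(fd);
    _exit(0);
}

/* Send `query` to the peer process and copy its reply into `reply`
   (NUL-terminated, at most cap - 1 bytes). Returns the reply length,
   or -1 on failure. */
static int ask_optimizer(const char *query, char *reply, size_t cap) {
    assert(query != NULL && reply != NULL);
    int fds[2];
    if (cap == 0 || socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {              /* child: plays the other program */
        close(fds[0]);
        fake_optimizer(fds[1]);  /* never returns */
    }

    close(fds[1]);
    size_t qlen = strlen(query);
    int ok = write(fds[0], query, qlen) == (ssize_t)qlen;
    shutdown(fds[0], SHUT_WR);   /* signal end of request */

    size_t used = 0;
    ssize_t n = 0;
    while (ok && used < cap - 1 &&
           (n = read(fds[0], reply + used, cap - 1 - used)) > 0)
        used += (size_t)n;
    reply[used] = '\0';
    close(fds[0]);

    int status = 0;
    waitpid(pid, &status, 0);
    if (!ok || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return (int)used;
}
```

With a `char reply[64]`, `ask_optimizer("SELECT 1", reply, sizeof reply)` fills `reply` with the child's answer. The trade-off is that anything structured, such as a query plan, has to be serialized into bytes and rebuilt on the C side, so this swaps garbage collection worries for a wire format.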

Andrew Cone