Sentence boundary disambiguation (SBD) is a central problem in NLP. Unfortunately, the SBD tools I've found and used in the past aren't written in C, as it's not the favourite language for string-based tasks unless speed is a major issue.
Pipeline
If at all possible I'd create a simple pipeline. On a Unix system this shouldn't be a problem, and even on Windows a scripting language should let you fill in the gaps. This means the SBD step can be the best tool for the job, not merely the only SBD tool you could find in language Z. For example,
./pdfconvert | SBD | my_C_tool > ...
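Filled in with concrete tools, that might look like the following (a minimal sketch, assuming `pdftotext` from Poppler and the OpenNLP command-line launcher are installed; `my_c_tool` stands in for your own program):

    # pdftotext writes plain text to stdout ("-"), OpenNLP splits it into
    # one sentence per line, and your C tool consumes that stream.
    pdftotext paper.pdf - | opennlp SentenceDetector en-sent.bin | ./my_c_tool > out.txt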
This is the standard way we do things in my work, and unless you have stricter requirements than you've stated it should be fine.
Tools
As for the tools you can use:
- I'd suggest MXTERMINATOR, an SBD tool based on maximum entropy modelling, as my supervisors used it in their own work recently. According to them it did miss a few sentence splits, but those were easily fixed by a sed script (a hypothetical example of such a fix follows this list). They were doing SBD on astronomical papers. The main site appears to be down at the moment, but there is an FTP mirror available here.
- OpenNLP has a reimplementation of the same maximum entropy algorithm in Java (JavaDoc); it is more up to date and has a seemingly stronger community behind it.
- Sentrick and many others also exist. For more, there is an older list here that may be of use.
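To illustrate the sed-script fix mentioned above: if the tool outputs one sentence per line, a missed split leaves two sentences on one line, and known failure patterns can be repaired afterwards. The pattern below is invented purely for illustration (GNU sed):

    # Hypothetical cleanup: insert a line break where a sentence ending in ")."
    # runs straight into the next one, a split the model missed.
    sed 's/)\. \([A-Z]\)/).\n\1/g' split.txt > fixed.txt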
Models and Training
Now, some of these tools may give you good results out of the box, and some may not. OpenNLP ships with a pretrained model for English sentence detection, which may work for you. However, if your domain is significantly different from the one the tools were trained on, they may not perform well. For example, a tool trained on newspaper text may be very good at that task but horrible at letters.
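A quick way to check whether the stock English model suits your text is to feed it a tricky example (a sketch; `en-sent.bin` is the pretrained model OpenNLP distributes):

    # Output is one sentence per line; a good model should not split
    # on the abbreviations "Dr." or "p.m.".
    echo "Dr. Smith arrived at 5 p.m. He left at once." | opennlp SentenceDetector en-sent.bin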
As such, you may want to train the SBD tool by giving it examples. Each of the tools should document this process, but I will warn you: it may be a bit of work. It involves running the tool on document X, going through and manually fixing any incorrect splits, and giving the correctly split document X back to the tool to train on. Depending on the sizes of the documents and the tool involved, you may need to do this for one or a hundred documents before you get a reasonable result.
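With OpenNLP, for example, that retraining loop looks roughly like this (file names here are placeholders; the training data format is one corrected sentence per line, with an empty line between documents):

    # train.sent holds your hand-corrected output, one sentence per line.
    opennlp SentenceDetectorTrainer -model my-domain-sent.bin -lang en \
        -data train.sent -encoding UTF-8
    # Then point the detector at the new model instead of the stock one:
    opennlp SentenceDetector my-domain-sent.bin < next-document.txt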
Good luck, and if you've any questions feel free to ask.