What methods, practices and conventions do you know of to modularize C code as a project grows in size?

A: 

There are directories and files, but no namespaces or language-level encapsulation. You can compile each module to a separate object file and link them together (optionally packaged as libraries).

Scott Whitlock
what about creating DLLs containing related functionality?
Mitch Wheat
A DLL is a type of library.
Michael Aaron Safyan
+1  A: 

Breaking the code up into libraries of related functions is one way of keeping things organized. To avoid name conflicts you can also prefix your function names, which lets you reuse a name that already exists elsewhere, though with good names I've never really found conflicts to be much of a problem. For example, if you wanted to develop your own math routines but still use some from the standard math library, you could prefix yours with some string: xyz_sin(), xyz_cos().

Generally I prefer the convention of one function (or set of closely related functions) per source file and one header file per source file. Breaking files into directories, where each directory represents a separate library, is also a good idea. You'd generally have a system of makefiles or build files that lets you build all or part of the system, following the hierarchy of libraries and programs.

tvanfosson
+1, but please use a single Makefile that includes Makefile.inc files from the subdirectories (with make's include directive), since the traditional recursive makefile can subtly break build dependencies (google "Recursive Make Considered Harmful")
j_random_hacker
+11  A: 

Create header files which contain ONLY what is necessary to use a module. In the corresponding .c file(s), make anything not meant to be visible outside (e.g. helper functions) static. Use prefixes on the names of everything externally visible to help avoid namespace collisions. (If a module spans multiple files, things become harder, as you may need to expose internal things and not be able to hide them with "static".)
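
A minimal sketch of that layout, assuming a hypothetical module called xyz_stats (the file names, the xyz_ prefix and both functions are made up for illustration). The header and the source file are shown in one listing, separated by comments:

```c
/* ---- xyz_stats.h: only what callers need ---- */
#ifndef XYZ_STATS_H
#define XYZ_STATS_H

#include <stddef.h>

/* Mean of n values; returns 0.0 when n is 0. */
double xyz_mean(const double *values, size_t n);

#endif /* XYZ_STATS_H */

/* ---- xyz_stats.c ---- */
/* #include "xyz_stats.h" would be the first line of the real file */
#include <assert.h>

/* Helper with internal linkage: other files cannot call it, and its
   short name cannot collide with a sum() defined elsewhere. */
static double sum(const double *values, size_t n)
{
    double total = 0.0;
    size_t i;

    for (i = 0; i < n; i++)
        total += values[i];
    return total;
}

/* The only externally visible symbol, so it carries the prefix. */
double xyz_mean(const double *values, size_t n)
{
    assert(values != NULL || n == 0);
    if (n == 0)
        return 0.0;
    return sum(values, n) / (double)n;
}
```

A caller that includes xyz_stats.h can use xyz_mean(), while any attempt to call sum() from another file fails at link time.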

(If I were to try to improve C, one thing I would do is make "static" the default scoping of functions. If you wanted something visible outside, you'd have to mark it with "export" or "global" or something similar.)

smcameron
+1. Using static functions for "private" behaviour limits the potential for unnecessary coupling.
j_random_hacker
Make sure that the header works in isolation by using it as the first header listed in the implementation file. If that doesn't work, the header is incomplete.
Jonathan Leffler
Breton
maybe that's just for variables and not for functions..
Breton
I never could understand how static data storage could affect data encapsulation in C.
Luca Matteis
@Luca: Static data "works" for encapsulation provided you only ever want a single instance of that data. (But if your project/library is successful, you'll eventually find some occasion when more than one instance is needed... so it's better to avoid this path at the outset, and go with opaque pointer handles instead.)
j_random_hacker
@j_random_hacker: isn't your comment true of any file-scope variable, regardless of whether it's static or not?
Steve Melnikoff
Mostly, I meant use static with function prototypes. A library should have as little data of its own as possible, pushing it instead into the client of the library, unless there's a good reason not to (e.g. a hardware queue that can have only one consumer, where your library is that consumer and fans the data out to its clients, or something like that).
smcameron
@Steve: Yes, it's a good idea to avoid both global ("public") and static ("module-private") variables, because in both cases a later decision to allow multiple "instances" may require changes to the interface exposed by the module. Static variables are still safer (and therefore arguably more justifiable) as external code cannot depend on them by name, so changes to them inside the module are less likely to break external code.
j_random_hacker
+8  A: 

OO techniques can be applied to C code; they just require more discipline.

  • Use opaque handles to operate on objects. One good example of how this is done is the stdio library -- everything is organised around the opaque FILE* handle. Many successful libraries are organised around this principle (e.g. zlib, apr)
  • Because all members of structs are implicitly public in C, you need a convention + programmer discipline to enforce the useful technique of information hiding. Pick a simple, automatically checkable convention such as "private members end with '_'".
  • Interfaces can be implemented using arrays of pointers to functions. Certainly this requires more work than in languages like C++ that provide in-language support, but it can nevertheless be done in C.
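
The opaque-handle idea from the first bullet could be sketched as follows. This is an illustration under stated assumptions, not code from any of the libraries named above: xyz_counter and all of its functions are hypothetical, and the header and source parts are shown in one listing.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- xyz_counter.h: callers see only an incomplete type ---- */
typedef struct xyz_counter xyz_counter;

xyz_counter *xyz_counter_create(int start);  /* NULL on allocation failure */
void xyz_counter_increment(xyz_counter *c);
int  xyz_counter_value(const xyz_counter *c);
void xyz_counter_destroy(xyz_counter *c);

/* ---- xyz_counter.c: the layout is known only here ---- */
struct xyz_counter {
    int value;
};

xyz_counter *xyz_counter_create(int start)
{
    xyz_counter *c = malloc(sizeof *c);

    if (c != NULL)
        c->value = start;
    return c;
}

void xyz_counter_increment(xyz_counter *c)
{
    assert(c != NULL);
    c->value++;
}

int xyz_counter_value(const xyz_counter *c)
{
    assert(c != NULL);
    return c->value;
}

void xyz_counter_destroy(xyz_counter *c)
{
    free(c);  /* free(NULL) is a no-op, so NULL is accepted */
}
```

Because client code only ever holds an xyz_counter pointer, it can create as many independent instances as it needs, and the struct layout can change without touching the callers.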
j_random_hacker
Opaque handles are a good one. And structs are implicitly public only if you put their definitions in the header. In a user header for a library you can say, for example:
[code]
struct opaque_foo;
extern int my_func(struct opaque_foo *f);
[/code]
The library's internal code would of course spell out what is in opaque_foo, but there is no need to expose that to the code that uses the library.
smcameron
+2  A: 

Pidgin (formerly Gaim) takes the approach of defining a Plugin struct. Each plugin populates a struct with callbacks for initialization and teardown, along with a bunch of other descriptive information. Pretty much everything except the struct is declared static, so only the Plugin struct is exposed for linking.

Then, to handle loose coupling of the plugin communicating with the rest of the app (since it'd be nice if it did something between setup and teardown), they have a signaling system. Plugins can register callbacks to be called when specific signals (not standard C signals, but a custom extensible kind [identified by string, rather than set codes]) are issued by any part of the app (including another plugin). They can also issue signals themselves.

This seems to work well in practice: different plugins can build upon each other, but the coupling is fairly loose, with no direct invocation of functions. Everything goes through the signaling system.
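
A rough sketch of the general pattern, not Pidgin's actual API: the struct plugin type, signal_connect(), signal_emit(), the "user-joined" signal and the greeter plugin are all hypothetical names chosen for illustration. In a real project the core and the plugin would live in separate files.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* ---- core: plugin descriptor and a string-keyed signal registry ---- */
typedef void (*signal_handler)(const char *signal, void *data);

struct plugin {
    const char *name;
    int  (*init)(void);      /* returns 0 on success */
    void (*teardown)(void);
};

#define MAX_HANDLERS 16

static struct {
    const char *signal;
    signal_handler fn;
} handlers[MAX_HANDLERS];
static size_t handler_count;

/* Register fn for a named signal; returns -1 when the table is full. */
int signal_connect(const char *signal, signal_handler fn)
{
    assert(signal != NULL && fn != NULL);
    if (handler_count == MAX_HANDLERS)
        return -1;
    handlers[handler_count].signal = signal;
    handlers[handler_count].fn = fn;
    handler_count++;
    return 0;
}

/* Call every handler registered for the signal; returns how many ran. */
int signal_emit(const char *signal, void *data)
{
    size_t i;
    int called = 0;

    for (i = 0; i < handler_count; i++) {
        if (strcmp(handlers[i].signal, signal) == 0) {
            handlers[i].fn(signal, data);
            called++;
        }
    }
    return called;
}

/* ---- a plugin: everything is static except the descriptor ---- */
static void on_user_joined(const char *signal, void *data)
{
    int *seen = data;  /* assumed payload: a counter owned by the emitter */

    (void)signal;
    if (seen != NULL)
        (*seen)++;
}

static int greeter_init(void)
{
    return signal_connect("user-joined", on_user_joined);
}

static void greeter_teardown(void)
{
    /* nothing to release in this sketch */
}

const struct plugin greeter_plugin = {
    "greeter", greeter_init, greeter_teardown
};
```

The application only needs to know about greeter_plugin. The plugin in turn never calls the emitter directly, so either side can be replaced without changing the other.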

rampion
+1. This is a very extensible way to design a system. Interfaces (tables of callbacks) keep coupling to a minimum, making it easier to swap in new/improved implementations for different components *separately* in the future.
j_random_hacker
+3  A: 

The High and Low-Level C article contains a lot of good tips. Especially, take a look at the "Classes and objects" section.

Standards and Style for Coding in ANSI C also contains good advice of which you can pick and choose.

Judge Maygarden
+1 for the articles!
cschol
+3  A: 
  1. Don't define variables in header files; instead, define the variable in the source file and add an extern statement (declaration) in the header. This will tie into #2 and #3.
  2. Use an include guard on every header. This will save so many headaches.
  3. Assuming you've done #1 and #2, include everything you need (but only what you need) for a certain file in that file. Don't depend on the order of how the compiler expands your include directives.
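
The three points can be sketched together in one listing, assuming a hypothetical app_config module (all names are made up for illustration). The guarded header text appears twice to mimic what happens when the header is included twice; the second copy compiles to nothing.

```c
/* ---- app_config.h ---- */
#ifndef APP_CONFIG_H
#define APP_CONFIG_H

struct app_config {
    int verbosity;
};

extern struct app_config app_config;     /* declaration only, no storage */
void app_config_set_verbosity(int level);

#endif /* APP_CONFIG_H */

/* A second #include of the same header expands to the same text again.
   The guard skips it, so struct app_config is not defined twice. */
#ifndef APP_CONFIG_H
#define APP_CONFIG_H

struct app_config {
    int verbosity;
};

extern struct app_config app_config;
void app_config_set_verbosity(int level);

#endif /* APP_CONFIG_H */

/* ---- app_config.c ---- */
/* #include "app_config.h" would come first in the real file */
#include <assert.h>  /* this file calls assert(), so it includes it itself */

struct app_config app_config = { 1 };    /* the single definition */

void app_config_set_verbosity(int level)
{
    assert(level >= 0);
    app_config.verbosity = level;
}
```

Removing the guard from this listing turns the second copy into a struct redefinition error, which is the headache point 2 avoids.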
Ryan Fox
+1 for #include guards.
j_random_hacker
smcameron
+2  A: 

A function should do one thing and do this one thing well.

Lots of little functions used by bigger wrapper functions help to structure code from small, easy to understand (and test!) building blocks.

Create small modules with a couple of functions each. Only expose what you must, keep anything else static inside of the module. Link small modules together with their .h interface files.

Provide getter and setter functions for access to static file-scope variables in your module. That way, the variables are only actually written to in one place. This also helps with tracing access to these static variables: put a breakpoint in the function and inspect the call stack.
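
A small sketch of this, assuming a hypothetical logger module whose level ranges from 0 to 3 (the names and the range are illustrative only):

```c
#include <assert.h>

/* File-scope state with internal linkage: no other file can name it. */
static int log_level = 1;

int logger_get_level(void)
{
    return log_level;
}

void logger_set_level(int level)
{
    assert(level >= 0 && level <= 3);
    log_level = level;  /* the only write site: set a breakpoint here */
}
```

Every change to log_level now passes through logger_set_level(), so a single breakpoint shows who modified it and from where.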

One important rule when designing modular code is: Don't try to optimize unless you have to. Lots of small functions usually yield cleaner, well structured code and the additional function call overhead might be worth it.

I always try to keep variables at their narrowest scope, also within functions. For example, indices of for loops can usually be kept at block scope and don't need to be visible throughout the entire function. C is not as flexible as C++ about defining a variable where you use it (before C99, declarations had to come at the start of a block), but it's workable.
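
For instance, assuming a C99 or later compiler, a hypothetical sum_positive() function can keep both the loop index and a per-iteration temporary out of the function's outer scope:

```c
#include <assert.h>
#include <stddef.h>

int sum_positive(const int *values, size_t n)
{
    int total = 0;

    assert(values != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {  /* i exists only inside the loop */
        const int v = values[i];      /* v is scoped to one iteration */

        if (v > 0)
            total += v;
    }
    return total;                     /* neither i nor v is visible here */
}
```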

cschol