+2  Q: 

The C vs. C++ way


So I have to write a program that will iterate through an image and record the pixel locations corresponding to each color pixel that appears in it. For example, given

I need to find the coordinates of all yellow, purple, dark pink, etc pixels.

In C++ I would use a hash table to do this. I would iterate through the image, check each pixel's value, look up that value and either add to a vector of pixel coordinates if it were found or add a new entry to the table if the value were not already there.

The problem is that I may need to write this program in pure C instead of C++. How would I go about doing this in C? I feel like implementing a hash table would be pretty obnoxious and error-prone: should I avoid doing that?

I'm pretty inexperienced with C and have a fair amount of C++ experience, if that matters.



You should be able to find well tested implementations of hash tables and linked lists for C.

Just out of curiosity, why would you be required to use that language?

+1  A: 

You might consider using uthash, which will allow you to store structures within a hash table. Then just define a struct to store pixel/count information and you are good to go.

Or you could roll your own HashTable in C, although it would be mostly as a learning exercise.

Justin Ethier
Thanks, uthash looks pretty good for this.
+6  A: 

There is no algorithm/datastructure you can implement in C++ that you can't implement in C. Sometimes it's arguably more elegant in C++, but it is never impossible in C.

Here are some C hash table implementations:

You may also be interested in this OOP in C vs C++ link:

In general, where you'd use classes in C++, you can use structs+functions in C.

Well...I doubt you'd have much luck implementing a compile-time dimensional analysis typing system in C.
Noah Roberts
@Noah: That's true. However in my answer I stated there is no _algorithm/datastructure_ you can implement in C++ and not C; I didn't say there's _nothing at all_ you can do in C++ and not C. There's no reason you couldn't implement dimensional analysis typing in C, so long as it did its checking at runtime.
Ergo there is an *algorithm/datastructure* that C++ can do that C can't.
Noah Roberts
@Noah: I disagree. An algorithm refers to a method for solving a problem, and a datastructure refers to a structure which organizes/holds data. Algorithms and datastructures used in a runtime implementation of a system could be the same as those used in a compiletime implementation.
That's certainly an interesting way of looking at it since a data structure that manipulates data during compilation behaves inherently different from one that does so during runtime. Further, an algorithm that does some or all of its manipulations before a program is ever run is also inherently different from one that does so after. In other words, the method is not the same and cannot be made the same.
Noah Roberts
@Noha: there's no inherent *theoretical* difference between compile-time and run-time with regard to algorithms - it's an implementation detail; think about using a preprocessor (templates or macros) to create a linked list structure for different types vs. using `void *` or even implementing your own dynamic type systen; although the performance characteristics will be different, all implementations will still be linked lists
A Turing machine can do anything C++ can do too - not that I'd advocate trying it.
Mark Ransom
@Christoph: +1 for "implementation detail". That's a good way of describing it.

For standard functions check the search.h header file and functions it provides. They can be cumbersome to use, but sometimes nevertheless are useful.

In particular: man hsearch and man tsearch


The answers to this SO question have pointers to a lot of options for container libraries in C that you might find useful:

Michael Burr

Try this.

Allocate a second image, with each element having the size of one coordinate. Allocate another array with the size of the maximum number of colors. Initialize that array to invalid coordinates (say, (-1,-1) ). Iterate through the image. For each color you find, find the corresponding entry in the array. Store the value in the array at the corresponding position in the second image, and store the current coordinate at that place in the array.

Once you're done, your array will form a faux-linked-list to pixel coordinates.

(Don't know if my explanation makes sense, complain if it doesn't).

+5  A: 

Here's an alternate algorithm that uses a bit of extra memory, but is easy to implement.

Go through the image and for each pixel, add it to a new array along with its coordinates. Where the pixel value is (R,G,B) put the values (R,G,B,X,Y) into the new array.

Use qsort to sort the new array. Now all pixels of the same color will be grouped together.

Mark Ransom
actually, this will most likely cosume less memory than any hashtable-based solution
+1 Nice. And counting bits for R,G,B,X,Y: 8+8+8+12+12, typically, which is less than 64, so this is a very neat solution.
Joseph Quinsey
Wouldn't an insertion sort be better since you can just add new colors in order as you find them?
Judge Maygarden
@Judge: using insertion sort would only be of benefit if you didn't store the color value with each coordinate pair, eg if you'd build a linked list of coordinates for each color; imo, this just adds unnecessary complexity - it's easier and most likely more efficient to do a linear scan of the sorted array after the `qsort()`