Honestly? This is incredibly non-trivial.
PDF rendering is done through programs (content streams) that describe, in sequence, what will be rendered. As the program runs, a graphics state accumulates the changes it makes, and the painting operators use that state to mark the page.
There are a number of different ways that colors can be set. Hopefully your PDF documents only use the operators RG and rg which set RGB colors for stroking and non-stroking operations. This means that color operations will be in the form:
rf gf bf RG
where rf, gf, and bf are floating point numbers representing color channel intensities from 0.0 to 1.0.
It would be a matter of rewriting all the RG and rg operators to use K and k, respectively, which will use 4 channel CMYK.
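As a rough sketch of that rewrite: the code below assumes the content stream has already been decompressed to text and contains only simple whitespace-separated tokens (no strings, comments, or inline images). It also assumes the naive, profile-free RGB-to-CMYK formula is good enough, which it often is not for print work. The names `rgb_to_cmyk`, `fmt`, and `rewrite_rgb_ops` are made up for illustration.

```python
def rgb_to_cmyk(r, g, b):
    """Naive, profile-free RGB -> CMYK conversion; all values in 0.0-1.0."""
    k = 1.0 - max(r, g, b)
    if k >= 1.0:  # pure black: avoid dividing by zero
        return 0.0, 0.0, 0.0, 1.0
    c = (1.0 - r - k) / (1.0 - k)
    m = (1.0 - g - k) / (1.0 - k)
    y = (1.0 - b - k) / (1.0 - k)
    return c, m, y, k


def fmt(x):
    """Format a number compactly for a content stream (at most 4 decimals)."""
    return f"{x:.4f}".rstrip("0").rstrip(".") or "0"


def rewrite_rgb_ops(stream):
    """Replace 'r g b RG' with 'c m y k K' and 'r g b rg' with 'c m y k k'.

    Assumption: the three tokens before RG/rg are plain numbers.
    """
    out = []
    for tok in stream.split():
        if tok in ("RG", "rg") and len(out) >= 3:
            r, g, b = (float(v) for v in out[-3:])
            del out[-3:]  # drop the RGB operands already copied to the output
            out.extend(fmt(v) for v in rgb_to_cmyk(r, g, b))
            out.append("K" if tok == "RG" else "k")
        else:
            out.append(tok)
    return " ".join(out)


print(rewrite_rgb_ops("1 0 0 RG 0 0 m 200 200 l S"))
```

This collapses all whitespace to single spaces, which is harmless for operators and numbers but is one more reason the sketch must not be run on streams containing strings or inline image data.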
This, in itself, is challenging: you would have to read in the document/page that you want, parse the content stream, and write a new one that replaces the old one (again, possible but not trivial - PDF allows you to replace individual objects like the content stream by appending a new version of the object to the end of the file). Don't even think about using sed. PDF is file-layout dependent, and changing something inline without maintaining the same length will break the PDF.
The real problem will happen if the file uses the CS and cs operators. Consider this sequence of operations:
/DeviceRGB CS 1 0 0 SC 0 0 m 200 200 l S 200 200 m 200 0 l 0 1 0 SC S
This means set the color space to DeviceRGB, set the color to red, move to (0, 0), line to (200, 200), stroke (in red), move to (200, 200), line to (200, 0), set color to green and stroke.
This is not so simple - if you wanted to change RGB red to CMYK yellow, you could do this:
/DeviceCMYK CS 0 0 1 0 SC 0 0 m 200 200 l S 200 200 m 200 0 l 0 1 0 SC S
which will work for yellow, but will break the attempt to set green, since SC in the DeviceCMYK color space now demands 4 operands and only gets 3.
What you need to do is interpret the content stream, keeping track of what the current color space is. When an SC command comes in that sets the color you want to change, you need to replace it with /DeviceCMYK CS c m y k SC, and then the next r g b SC command needs to change to /DeviceRGB CS r g b SC.
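That bookkeeping can be sketched roughly as follows. This is only an illustration under heavy assumptions: it handles just the stroking operators CS and SC with the device color spaces, expects a decompressed stream that can be split on whitespace, and ignores q/Q save/restore, which a real interpreter would have to honor. `replace_stroke_color` and its parameters are hypothetical names.

```python
# Number of SC operands expected in each device color space.
COMPONENTS = {"/DeviceGray": 1, "/DeviceRGB": 3, "/DeviceCMYK": 4}


def replace_stroke_color(stream, target_rgb, new_cmyk):
    """Swap one RGB stroke color for a CMYK one, restoring RGB afterwards."""
    out = []
    declared = None  # color space the original program thinks is current
    actual = None    # color space in effect in the rewritten output
    for tok in stream.split():
        if tok == "CS":
            declared = out[-1]  # the operand precedes the operator
            actual = declared
            out.append(tok)
        elif tok == "SC" and declared in COMPONENTS:
            n = COMPONENTS[declared]
            operands = out[-n:]
            del out[-n:]
            color = tuple(float(v) for v in operands)
            if declared == "/DeviceRGB" and color == tuple(target_rgb):
                # Switch to CMYK just for this color.
                out += ["/DeviceCMYK", "CS"]
                out += [format(v, "g") for v in new_cmyk]
                actual = "/DeviceCMYK"
            else:
                # Switch back if an earlier replacement left us in CMYK.
                if actual != declared:
                    out += [declared, "CS"]
                    actual = declared
                out += operands
            out.append("SC")
        else:
            out.append(tok)
    return " ".join(out)


src = "/DeviceRGB CS 1 0 0 SC 0 0 m 200 200 l S 200 200 m 200 0 l 0 1 0 SC S"
print(replace_stroke_color(src, (1, 0, 0), (0, 0, 1, 0)))
```

Run on the example above, the red SC becomes /DeviceCMYK CS 0 0 1 0 SC and the later green SC gets /DeviceRGB CS put in front of it. The original /DeviceRGB CS at the start is left in place; it is redundant but harmless.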
This doesn't take into account how to handle ICC-based color spaces, grays, Lab, n-channel, colormapped (indexed), patterns, forms, etc.
PDF was not made for editing.
If I were tasked with making this happen, I would do the following:
- If it was for fewer than 10 files, I would open them up in Illustrator, change the colors and export as PDF.
- If it was for 10 or more and less than 1000, I would hire a temp worker to do what I described in the first option.
- If it was 1000 or more and less than 10000, I would write a program to script Illustrator to make those changes, if possible.
- If it was 10000 or more and ongoing, I'd have a serious talk with management about document production, because if changes like this need to be made to a terminal file format and the files can't be regenerated correctly, something is wrong upstream.