ansaurus

Question

How can I delete all /* */ comments from a C source file?

Answer 1

+11 A:

See perlfaq6. It's quite a complex scenario.

$/ = undef;
$_ = <>;
s#/\*[^*]*\*+([^/*][^*]*\*+)*/|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#defined $2 ? $2 : ""#gse;
print;

A word of warning: once you've done this, do you have a test scenario to prove to yourself that you've removed only the comments and nothing valuable? If you're running such a powerful regexp, I'd make sure there is some sort of test (even if you simply record the behaviour before and afterwards).
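
For readers who want to experiment with the same idea outside Perl, here is a rough Python port of the perlfaq6 pattern, offered as a sketch rather than a vetted tool. The names `C_COMMENT_OR_TEXT` and `strip_c_comments` are made up for this illustration, and the inner groups are made non-capturing so that group 1 is the "keep this text" alternative (it is $2 in the Perl version). The two assertions are the sort of before/after check suggested above.

```python
import re

# First alternative: a /* ... */ comment (dropped).
# Second alternative (group 1): a string literal, a character literal,
# or any other run of text (kept as-is).
C_COMMENT_OR_TEXT = re.compile(
    r'''/\*[^*]*\*+(?:[^/*][^*]*\*+)*/|("(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*'|.[^/"'\\]*)''',
    re.DOTALL,
)

def strip_c_comments(source):
    # Comments match with group 1 unset, so they are replaced by "".
    return C_COMMENT_OR_TEXT.sub(lambda m: m.group(1) or "", source)

assert strip_c_comments("int a; /* gone */ int b;") == "int a;  int b;"
assert strip_c_comments('puts("/* kept */");') == 'puts("/* kept */");'
```

Because string and character literals are matched as whole units before the scanner can see a /* inside them, text that merely looks like a comment is left alone.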

Brian Agnew 2009-11-11 11:19:20

Just check that the binaries created by compiling are identical (modulo timestamps or other build identification).

ephemient 2009-11-11 14:34:54

That may well be the simplest solution

Brian Agnew 2009-11-11 14:44:01

Agreed, I would never do this on code I cared about unless I had unit tests in place to verify its correctness after filtering it.

Ether 2009-11-11 17:48:36

Answer 2

A:

A very simplistic example using gawk. Please test it thoroughly before relying on it. Note that it does not handle the other comment style, // (C++ and C99).

$ more file
int matrix[20];
/* generate data */
for (index = 0 ;index < 20; index++)
matrix[index] = index + 1;
/* print original data */
for (index = 0; index < 5 ;index++)
/*
function(){
 blah blah
}
*/
float a;
float b;

$ awk -vRS='*/' '{ gsub(/\/\*.*/,"")}1' file
int matrix[20];


for (index = 0 ;index < 20; index++)
matrix[index] = index + 1;


for (index = 0; index < 5 ;index++)


float a;
float b;
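
To spell out what the one-liner does: with RS='*/' every record ends where a comment closes, and the gsub deletes everything from the first /* to the end of that record. A rough Python equivalent shows both the idea and its blind spot. The name `strip_by_record` is invented here, and unlike awk this sketch does not append a newline after each record.

```python
import re

def strip_by_record(source):
    # Split where comments close, mimicking awk's RS='*/'.
    records = source.split("*/")
    # In each record, drop everything from the first "/*" onwards.
    return "".join(re.sub(r"/\*.*", "", rec, flags=re.DOTALL) for rec in records)

assert strip_by_record("a; /* x */ b;") == "a;  b;"
# Blind spot: the "*/" inside the string literal is swallowed.
assert strip_by_record('puts("*/");') == 'puts("");'
```

The second assertion is the reason to test carefully: any */ in the input counts as a record separator, whether or not it closes a comment.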

ghostdog74 2009-11-11 11:22:30

For some reason this is not working on my machine :( The input, from `cat test`, is `int matrix[20];/* generate data */for (index = 0 ;index < 20; index++)matrix[index] = index + 1;/* print original data */` and the output of `awk -vRS='*/' '{ gsub(/\/\*.*/,"")}1' test` is `int matrix[20];/ generate data/for (index = 0 ;index < 20; index++)matrix[index] = index + 1;/ print original data/`

Vijay Sarathi 2009-11-11 11:46:09

As I already indicated, this uses gawk. Do you have gawk?

ghostdog74 2009-11-11 11:50:54

Sorry, the comment is so messed up that I didn't notice you had included output. I see you still have /generate data/ and /print original data/ in yours. As you can see from my output, it works for me.

ghostdog74 2009-11-11 11:53:30

If you still can't get it to work, there's the Perl solution below you can try.

ghostdog74 2009-11-11 11:54:53

Answer 3

+2 A:

Try this on the command line (replacing 'file-names' with the list of files that need to be processed):

perl -i -wpe 'BEGIN{undef $/} s!/\*.*?\*/!!sg' file-names

This program changes the files in-place (overwriting the original file with the corrected output). If you just want the output without changing the original files, omit the '-i' switch.

Explanation:

perl -- call the perl interpreter
-i      switch to 'change-in-place' mode.
-w      print warnings to STDERR (if there are any)
 p      read the files and print $_ for each record; like while(<>){ ...; print $_;}
 e      process the following argument as a program (once for each input record)

BEGIN{undef $/} --- process whole files instead of individual lines.
s!      search and replace ...
  /\*     the starting /* marker
  .*?     followed by any text (non-greedy search)
  \*/     followed by the */ marker
!!      replace by the empty string (i.e. remove comments)
  s     treat newline characters \n like normal characters (remove multi-line comments)
   g    repeat as necessary to process all comments.

file-names   list of files to be processed.
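
The same substitution can be written in Python, which also makes it easy to see where it is only approximate. This is a sketch, and `strip_naive` is a name chosen for the example. Here re.DOTALL plays the role of the s modifier, and re.sub replaces every match, like g.

```python
import re

def strip_naive(source):
    # Non-greedy match from "/*" to the nearest "*/", across newlines.
    return re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)

assert strip_naive("x = 1; /* multi\nline */ y = 2;") == "x = 1;  y = 2;"
# The pattern knows nothing about string literals, so this is damaged:
assert strip_naive('puts("/* not a comment */");') == 'puts("");'
```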

Yaakov Belch 2009-11-11 11:25:22

See the perlfaq to understand why this is so very wrong.

brian d foy 2009-11-11 18:15:47

@brian Accepted: This is only an approximate solution.

Yaakov Belch 2009-11-12 10:16:44

Answer 4

+6 A:

Take a look at the strip_comments routine in Inline::Filters:

sub strip_comments {
    my ($txt, $opn, $cls, @quotes) = @_;
    my $i = -1;
    while (++$i < length $txt) {
        my $closer;
        if (grep {my $r=substr($txt,$i,length($_)) eq $_; $closer=$_ if $r; $r}
            @quotes) {
            $i = skip_quoted($txt, $i, $closer);
            next;
        }
        if (substr($txt, $i, length($opn)) eq $opn) {
            my $e = index($txt, $cls, $i) + length($cls);
            substr($txt, $i, $e-$i) =~ s/[^\n]/ /g;
            $i--;
            next;
        }
    }
    return $txt;
}
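
The routine above walks the text one position at a time, skips over quoted literals, and overwrites each comment with spaces while keeping newlines, so line numbers stay the same. A Python sketch of that scanning approach follows. It is not a port of Inline::Filters: `blank_comments` and this `skip_quoted` are written for the illustration, and the real `skip_quoted` in the module may behave differently. One deliberate difference is that the search for the closer starts after the opener, so /*/ is not taken for a complete comment.

```python
def skip_quoted(txt, i, quote):
    """Return the index just past the quoted literal that starts at txt[i]."""
    j = i + 1
    while j < len(txt):
        if txt[j] == "\\":      # backslash escape: skip the escaped character
            j += 2
        elif txt[j] == quote:   # closing quote found
            return j + 1
        else:
            j += 1
    return len(txt)             # unterminated literal: stop at end of text

def blank_comments(txt, opn="/*", cls="*/", quotes=('"', "'")):
    out = list(txt)
    i, n = 0, len(txt)
    while i < n:
        if txt[i] in quotes:
            i = skip_quoted(txt, i, txt[i])
        elif txt.startswith(opn, i):
            end = txt.find(cls, i + len(opn))
            end = n if end == -1 else end + len(cls)
            for j in range(i, end):
                if out[j] != "\n":   # keep newlines so line numbers survive
                    out[j] = " "
            i = end
        else:
            i += 1
    return "".join(out)

assert blank_comments("a/* x\ny */b") == "a    \n    b"
assert blank_comments('puts("/* kept */");') == 'puts("/* kept */");'
```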

Sinan Ünür 2009-11-11 12:14:11

Answer 5

+4 A:

Consider:

printf("... /* ...");
int matrix[20];
printf("... */ ...");

In other words: I wouldn't use regex for this task, unless you're doing a replace-once and are positive that the above does not occur.

Bart Kiers 2009-11-11 14:27:55

Answer 6

+21 A:

Why not just use the C preprocessor to do this? Why are you confining yourself to a home-grown regex?

[Edit] This approach also handles Bart's printf(".../*...") scenario cleanly.

Example:

[File: t.c]
/* This is a comment */
int main () {
    /* 
     * This
     * is 
     * a
     * multiline
     * comment
     */
    int f = 42;
    /*
     * More comments
     */
    return 0;
}


$ cpp -P t.c
int main () {







    int f = 42;



    return 0;
}

Or you can remove the blank lines and condense everything:

$ cpp -P t.c | egrep -v "^[ \t]*$"
int main () {
    int f = 42;
    return 0;
}

No use re-inventing the wheel, is there?

[Edit] If you do not want included files and macros to be expanded by this approach, cpp provides flags for this. Consider:

[File: t.c]

#include <stdio.h>
int main () {
    int f = 42;
    printf("   /*  ");
    printf("   */  ");
    return 0;
}


$ cpp -P -fpreprocessed t.c | grep -v "^[ \t]*$"
#include <stdio.h>
int main () {
    int f = 42;
    printf("   /*  ");
    printf("   */  ");
    return 0;
}

There is a slight caveat: macro expansion is avoided, but the original definition of each macro (its #define line) is stripped from the source.

ezpz 2009-11-11 14:38:06

Yes, this is what I'd use!

Bart Kiers 2009-11-11 14:58:56

The preprocessor has a (potentially undesirable) "side effect": it also processes macros, pulls in included files, and so on...

RaphaelSP 2009-11-11 14:59:57

You can avoid macro expansion with `-fpreprocessed`. I'll update the answer to mention this.

ezpz 2009-11-11 15:04:36

-1 again. That is not a *slight* caveat if you expect the source code to compile after removing comments.

Sinan Ünür 2009-11-11 15:16:07

This caveat can be fixed: perl -wpe 's/^\s*#define/#include#define/' your-file.c | cpp -P - -fpreprocessed | perl -wpe 's/#include#define/#define/' ---- this turns any #defines into (somewhat invalid) #includes that pass through the preprocessor, to be converted back to correct #defines later. (If you agree, please add this to the answer itself.)
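
The two perl steps in this pipeline amount to a protect/restore pair around the cpp call. A small Python sketch of just those two steps is below. The names `MARKER`, `protect_defines` and `restore_defines` are invented for the example, and whether cpp really passes the marker lines through untouched is the commenter's claim, which this sketch does not verify. The cpp step in between is left out so the block runs without a compiler installed.

```python
import re

# Placeholder that disguises a #define as an (invalid) #include line.
MARKER = "#include#define"

def protect_defines(source):
    # Rewrite the start of every #define line, one line at a time.
    return re.sub(r"^[ \t]*#define", MARKER, source, flags=re.MULTILINE)

def restore_defines(source):
    # Undo the disguise after the preprocessor has run.
    return source.replace(MARKER, "#define")

src = '#include <stdio.h>\n#define MSG "Hello World"\nint x;\n'
protected = protect_defines(src)
assert protected == '#include <stdio.h>\n#include#define MSG "Hello World"\nint x;\n'
assert restore_defines(protected) == src
```

As in the perl version, any indentation in front of a #define is dropped by the first step and not put back.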

Yaakov Belch 2009-11-12 10:15:08

Answer 7

+4 A:

Please do not use cpp for this unless you understand the ramifications:

$ cat t.c
#include <stdio.h>

#define MSG "Hello World"

int main(void) {
    /* ANNOY: print MSG using the puts function */
    puts(MSG);
    return 0;
}

Now, let's run it through cpp:

$ cpp -P t.c -fpreprocessed


#include <stdio.h>



int main(void) {


    puts(MSG);
    return 0;
}

Clearly, this file is no longer going to compile.

Sinan Ünür 2009-11-11 15:04:45

Well, not after you add the `-fpreprocessed` flag, anyway.

Hasturkun 2009-11-11 18:27:30

@Hasturkun and if you don't add -fpreprocessed, `#include <stdio.h>` will be expanded.

Sinan Ünür 2009-11-11 20:50:53

I tried this: perl -wpe 's/^\s*#define/#include#define/' your-file.c | cpp -P - -fpreprocessed | perl -wpe 's/#include#define/#define/' ---- this turns any #defines into (somewhat invalid) #includes that pass through the preprocessor, to be converted back to correct #defines later.

Yaakov Belch 2009-11-12 10:11:32
