How can I filter a string in C? I want to remove anything that isn't [a-z0-9_].
int main(int argc, char ** argv) {
char* name = argv[1];
// remove anything that isn't [a-z0-9_]
printf("%s", name);
}
The C standard library doesn't supply any support for Regular Expressions.
You'll either need to use a regex library for C (a very common one is PCRE), or do this in a loop, which is easier in the case at hand: the expression matches single characters only, so no backtracking is needed.
The loop approach would look something like:
#include <ctype.h>
#include <stdio.h>

int main(int argc, char ** argv) {
    if (argc < 2)
        return 1;   // nothing to filter
    char* name = argv[1];
    // remove anything that isn't [a-z0-9_]
    char strippedName[200];
    size_t iIn = 0, iOut = 0;   // subscripts into name and strippedName respectively
    // stop one short of the buffer size to leave room for the terminator
    while (name[iIn] != '\0' && iOut < sizeof(strippedName) - 1) {
        unsigned char c = (unsigned char)name[iIn];
        // islower() rather than isalnum(), so uppercase letters are dropped
        // as the OP's [a-z0-9_] requires
        if (islower(c) || isdigit(c) || c == '_')
            strippedName[iOut++] = name[iIn];
        iIn++;
    }
    strippedName[iOut] = '\0';
    printf("%s\n", strippedName);
    return 0;
}
The same filter can also be applied in place, with no second buffer, by walking a read pointer and a write pointer over name:
char *src, *dst;
for (src = name, dst = name; *src; src++) {
if ('a' <= *src && *src <= 'z'
|| '0' <= *src && *src <= '9'
|| *src == '_') *dst++ = *src;
}
*dst = '\0';
EDIT: Multiple small revisions. I hope to have the bugs out now.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
int main(int argc, char ** argv)
{
char *name, *inp, *outp;
if (argc < 2)
{
fprintf(stderr, "Insufficient arguments.\n");
return 1;
}
inp = argv[1];
name = malloc(strlen(inp) + 1);
outp = name;
if (!name)
{
fprintf(stderr, "Out of memory.\n");
return 2;
}
while (*inp)
{
if (islower((unsigned char)*inp) || isdigit((unsigned char)*inp) || *inp == '_')
*outp++ = *inp;
inp++;
}
*outp = '\0';
puts(name);
free(name);
return 0;
}
If you just want to strip those unwanted characters out of the first argument, there's no need for memory allocation: just walk through the input string character by character. And if you know you'll be working in an ASCII environment (or any other that has contiguous a through z), you could even replace the function calls with faster versions that check the character ranges.
But I can't see the increase in speed being enough to justify non-portable code.
#include <stdio.h>
#include <ctype.h>
int main(int argc, char ** argv) {
    char *p;
    if (argc > 1) {
        for (p = argv[1]; *p != '\0'; p++) {
            // cast to unsigned char: passing a negative char to the ctype functions is undefined
            if (islower((unsigned char)*p) || isdigit((unsigned char)*p) || *p == '_') {
                putchar (*p);
            }
        }
        putchar ('\n');
    }
    return 0;
}