+2  Q: 

C-Strings in C++

I'm learning C++ for one of my CS classes, and for our first project I need to parse some URLs using C strings (i.e. I can't use the C++ std::string class). The only approach I can think of is iterating through the string (since it's a char[]) and using some switch statements. For someone more experienced in C++: is there a better approach? Could you point me to a good online resource? I haven't found one yet. Thanks.

+6  A: 

Weird that you're not allowed to use C++ language features i.e. C++ strings!

There are some C string functions available in the standard C library.

e.g.

strdup - duplicate a string
strtok - break a string into tokens. Beware: this modifies the original string.
strcpy - copy a string
strstr - find a string within a string
strncpy - copy up to n bytes of a string
etc.

There is a good online reference here with a full list of available c string functions for searching and finding things.

http://www.cplusplus.com/reference/clibrary/cstring/

You can walk through strings by accessing them like an array if you need to.

e.g.

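#include <cstring>
#include <iostream>

int main() {
  const char* url = "http://stackoverflow.com/questions/1370870/c-strings-in-c";
  std::size_t len = std::strlen(url);
  // Walk the string one character at a time, like any other array.
  for (std::size_t i = 0; i < len; ++i) {
    std::cout << url[i];
  }
  std::cout << std::endl;
}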

As for actually how to do the parsing, you'll have to work that out on your own. It is an assignment after all.

Matt H
`strdup` is not in the standard library; it is defined by POSIX.
Evan Teran
If he doesn't have strdup(), it would be a nice little part of the assignment to provide it. Bootstraps!
Michael Burr
A: 

You can use C functions like strtok, strchr, strstr etc.

Dmitriy
+2  A: 

You might want to refer to an open source library that can parse URLs (as a reference for how others have done it -- obviously don't copy and paste it!), such as curl or wget (links are directly to their url parsing files).

Mark Rushakoff
For some reason I doubt that's what his instructor is looking for.
Michael Burr
@Michael: I thought the same as you until I realized he might mean for the questioner to use the source for ideas.
Sean Nyman
Fair enough... Now I wonder if someone who's unaware of C library basics will be able to keep his head from asploding reading through that code?
Michael Burr
+5  A: 

There are a number of C standard library functions that can help you.

First, look at the C standard library function strtok. This allows you to retrieve parts of a C string separated by certain delimiters. For example, you could tokenize with the delimiter / to get the protocol, domain, and then the file path. You could tokenize the domain with delimiter . to get the subdomain(s), second level domain, and top level domain. Etc.

It's not nearly as powerful as a regular expression parser, which is what you would really want for parsing URLs, but it works on C strings, is part of the C standard library and is probably OK to use in your assignment.

Other C standard library functions that may help:

  • strstr() Finds a substring within a string, similar to std::string::find()
  • strspn(), strchr() and strpbrk() Find a character or characters in a string, similar to std::string::find_first_of(), etc.

Edit: A reminder that the proper way to use these functions in C++ is to include <cstring> and use them in the std:: namespace, e.g. std::strtok().

Tyler McHenry
strtok is pretty nasty since it modifies the string. I am a big fan of const so I'd recommend avoiding strtok.
Zan Lynx
IMO, strtok is quite useful and a lot less painful than hand-coding everything when it comes to parsing strings using only the C standard library. But yes, you do have to beware of its gotchas including the string modification and its non-reentrancy (although POSIX provides a re-entrant version called strtok_r)
Tyler McHenry
+1  A: 

I don't know what the requirements are for parsing the URLs, but if this is CS level it would be appropriate to use (very simple) BNF and a (very simple) recursive descent parser.

This would make for a more robust solution than direct iteration, e.g. for malformed URLs.

Very few string functions from the standard C library would be needed.

Peter Mortensen
A: 

Many of the runtime library functions that have been mentioned work quite well, either in conjunction with or apart from the approach of iterating through the string that you mentioned (which I think is time honored).

John Lockwood