I have a URL like this:
http://192.168.0.1:8080/servlet/rece
I want to parse the URL to get the values:
IP: 192.168.0.1
Port: 8080
page: /servlet/rece
How do I do that?
Write a custom parser, or use one of the string replace functions to replace the ':' separators and then use sscanf().
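To make the "custom parser" option concrete, here is a minimal sketch using only the standard C library. The function name split_url, its parameters, and the default port of 80 are my own assumptions for illustration, not part of any library. It only handles the plain scheme://host[:port][/path] shape and does not cope with user info or bracketed IPv6 hosts.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from any library): splits
 * "scheme://host[:port][/path]" into its parts.
 * Returns 0 on success, -1 on malformed input or a too-small buffer. */
int split_url(const char *url, char *host, size_t host_len,
              int *port, char *path, size_t path_len)
{
    const char *p = strstr(url, "://");
    if (p == NULL)
        return -1;
    p += 3;                                   /* skip "://" */

    /* host[:port] runs up to the first '/' (or the end of the string) */
    size_t authority_len = strcspn(p, "/");
    const char *path_start = p + authority_len;
    const char *colon = memchr(p, ':', authority_len);

    size_t hlen = colon ? (size_t)(colon - p) : authority_len;
    if (hlen == 0 || hlen >= host_len)
        return -1;
    memcpy(host, p, hlen);
    host[hlen] = '\0';

    *port = 80;                               /* assumed default when no port is given */
    if (colon != NULL) {
        char *end;
        long n = strtol(colon + 1, &end, 10);
        /* the number must be non-empty and end exactly where the path begins */
        if (end == colon + 1 || end != path_start || n < 1 || n > 65535)
            return -1;
        *port = (int)n;
    }

    if (*path_start == '\0')
        path_start = "/";                     /* no path given: treat it as "/" */
    if (strlen(path_start) >= path_len)
        return -1;
    strcpy(path, path_start);
    return 0;
}
```

Called as split_url(url, host, sizeof host, &port, path, sizeof path) on the URL from the question, this should fill in the host, the port, and the page with its leading slash.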
With a regular expression if you want the easy way. Otherwise use FLEX/BISON.
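For the regular expression route, one possibility in C is the POSIX <regex.h> interface (regcomp, regexec, regfree). Note that this is POSIX rather than ISO C, so it may not be available everywhere. The sketch below is only an illustration: the function name regex_split_url, the pattern, the http-only restriction, and the default port of 80 are my assumptions.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parses "http://host[:port][/path]" with a POSIX
 * extended regular expression. Returns 0 on success, -1 otherwise. */
int regex_split_url(const char *url, char *host, size_t host_len,
                    int *port, char *path, size_t path_len)
{
    /* group 1 = host, group 3 = port digits, group 4 = path */
    static const char pattern[] = "^http://([^/:]+)(:([0-9]+))?(/.*)?$";
    regex_t re;
    regmatch_t m[5];

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;
    int rc = regexec(&re, url, 5, m, 0);
    regfree(&re);
    if (rc != 0)
        return -1;                            /* no match */

    size_t hlen = (size_t)(m[1].rm_eo - m[1].rm_so);
    if (hlen >= host_len)
        return -1;
    memcpy(host, url + m[1].rm_so, hlen);
    host[hlen] = '\0';

    *port = 80;                               /* assumed default when no port is given */
    if (m[3].rm_so != -1) {
        long n = strtol(url + m[3].rm_so, NULL, 10);
        if (n < 1 || n > 65535)
            return -1;
        *port = (int)n;
    }

    /* the path group, when present, runs to the end of the string */
    const char *p = (m[4].rm_so != -1) ? url + m[4].rm_so : "/";
    if (strlen(p) >= path_len)
        return -1;
    strcpy(path, p);
    return 0;
}
```

The pattern is deliberately simple, so it rejects anything that is not a plain http URL, including bracketed IPv6 hosts.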
You could also use a URI parsing library.
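I wrote a simple program using sscanf. I wanted a basic way to parse it.
cat urlparse.c
#include <stdio.h>

int main(void)
{
    const char text[] = "http://192.168.0.2:8888/servlet/rece";
    char ip[100];
    int port = 80;
    char page[100];

    /* The '/' after the port is matched literally, so page is stored
       without its leading slash. sscanf returns the number of fields filled. */
    if (sscanf(text, "http://%99[^:]:%d/%99[^\n]", ip, &port, page) != 3) {
        fprintf(stderr, "could not parse \"%s\"\n", text);
        return 1;
    }
    printf("ip = \"%s\"\n", ip);
    printf("port = \"%d\"\n", port);
    printf("page = \"%s\"\n", page);
    return 0;
}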
./urlparse
ip = "192.168.0.2"
port = "8888"
page = "servlet/rece"
Personally, I steal the HTParse.c module from the W3C (it is used in the Lynx web browser, for instance). Then you can do things like:
strncpy(hostname, HTParse(url, "", PARSE_HOST), size)
The important thing about using a well-established and debugged library is that you do not fall into the typical traps of URL parsing (many regexps fail when the host is an IP address, for instance, especially an IPv6 one).
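To illustrate the IPv6 trap: in a URL an IPv6 literal is written in square brackets, e.g. http://[::1]:8080/, so a parser that splits the host at the first ':' gets it wrong. Here is a rough sketch of handling just that case. The function name authority_host is hypothetical, and it expects the part of the URL after "://".

```c
#include <string.h>

/* Hypothetical helper: copies the host out of "host[:port][/path]" or
 * "[ipv6]:port/path" (the text that follows "://").
 * Returns 0 on success, -1 on malformed input or a too-small buffer. */
int authority_host(const char *authority, char *host, size_t host_len)
{
    const char *start = authority;
    size_t len;

    if (*authority == '[') {
        /* bracketed IPv6 literal: the host is everything up to ']' */
        const char *close = strchr(authority, ']');
        if (close == NULL)
            return -1;
        start = authority + 1;
        len = (size_t)(close - start);
    } else {
        /* hostname or IPv4 address: the host ends at ':' or '/' */
        len = strcspn(authority, ":/");
    }

    if (len == 0 || len >= host_len)
        return -1;
    memcpy(host, start, len);
    host[len] = '\0';
    return 0;
}
```

This does not validate the address itself, which is one more reason to prefer a tested library when you can.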
If you're using the CLR, you may want to consider using the System.Uri class. I don't know C, so here's an example in C#:
using System;

class Program
{
    static void Main(string[] args)
    {
        string url = "http://192.168.0.1:8080/servlet/rece";
        Console.WriteLine("Original URL: {0}", url);
        Console.WriteLine();

        Uri uri = new Uri(url);
        Console.WriteLine("Server: {0}", uri.Host);
        Console.WriteLine("Port: {0}", uri.Port);
        Console.WriteLine("Page: {0}", uri.PathAndQuery);
        Console.ReadLine();
    }
}
It produces this output:
Original URL: http://192.168.0.1:8080/servlet/rece
Server: 192.168.0.1
Port: 8080
Page: /servlet/rece
If you're looking for a standards-compliant, performant way to parse a URL or URI, you can have a look at some code from a PHP class provided here:
http://andreas-hahn.com/en/parse-url
It can parse URLs as well as URIs, URNs, and even IRIs, according to RFC 3986 and RFC 3987.