J2EE has ServletRequest.getParameterValues().

On non-EE platforms, URL.getQuery() simply returns a string.

What's the normal way to properly parse the query string in a URL when not on J2EE?

+3  A: 

For a servlet or a JSP page you can get querystring key/value pairs by using request.getParameter("paramname")

String name = request.getParameter("name");

There are other ways of doing it, but that's the way I do it in all the servlets and JSP pages that I create.

ChadNC
HttpServletRequest is part of J2EE, which he doesn't have. Also, using getParameter() is not really parsing.
Mr. Shiny and New
Please take the time to read the comment in which I asked for clarification of his question. This answer is in response to his reply to that comment, in which he stated, "I'm trying to do this on Android, but all answers on all platforms would be useful answers that might give pointers (also to others who might come across this question) so don't hold back!" I answered his question based on that comment. If you don't have anything useful to add, don't add anything.
ChadNC
Don't be too upset. "This doesn't answer the question" is useful to add, IMO.
Mr. Shiny and New
A: 

You say "Java" but "not J2EE". Do you mean you are using JSP and/or servlets but not a full J2EE stack? If that's the case, then you should still have request.getParameter() available to you.

If you mean you are writing Java but you are not writing JSPs nor servlets, or that you're just using Java as your reference point but you're on some other platform that doesn't have built-in parameter parsing ... Wow, that just sounds like an unlikely question, but if so, the principle would be:

xparm=0
word=""
loop
  get next char
  if no char
    param_value[xparm]=word
    exit loop
  if char=='='
    param_name[xparm]=word
    word=""
  else if char=='&'
    param_value[xparm]=word
    word=""
    xparm=xparm+1
  else if char=='%'
    read next two chars
    word=word+interpret the chars as hex digits to make a byte
  else
    word=word+char

(I could write Java code but that would be pointless, because if you have Java available, you can just use request.getParameter().)

Jay
Watch out for the character encoding when URL-decoding the hex digits.
Mr. Shiny and New
It's Android, hence Java but not J2EE.
Andrzej Doyle
I forgot to mention: You also need to check for "+", which should be translated to a space. Embedded spaces are illegal in a query string.
Jay
+10  A: 

Make use of String#split().

First split on ? to get the query string part from the URL, then split on & to separate the parameters, then split each parameter on = to get the key/value pairs and finally use URLDecoder#decode() with UTF-8 to decode them. If you want to collect them, use Map<String, List<String>>.

Map<String, List<String>> params = new HashMap<String, List<String>>();
String[] urlParts = url.split("\\?");
if (urlParts.length > 1) {
    String query = urlParts[1];
    for (String param : query.split("&")) {
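        // Limit of 2 keeps any '=' inside the value intact
        String[] pair = param.split("=", 2);
        String key = URLDecoder.decode(pair[0], "UTF-8");
        // A parameter without '=' (e.g. "flag") has no value; use "" instead of failing
        String value = pair.length > 1 ? URLDecoder.decode(pair[1], "UTF-8") : "";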
        List<String> values = params.get(key);
        if (values == null) {
            values = new ArrayList<String>();
            params.put(key, values);
        }
        values.add(value);
    }
}
BalusC
This works but does not account for url-encoded keys or values.
Mr. Shiny and New
Added URLDecoder.
BalusC
+1, this is the approach I would take.
Andrzej Doyle
Hi BalusC, I think we may need to return Map<String, String[]> to match the return type of ServletRequest#getParameterMap().
Mohammed
@Daziplqa: filling and accessing `List<String>` is much easier than `String[]`. The `ServletRequest` API was designed in the early Java days, before `java.util.List` existed. But if you want to change this, I won't stop you; it's only going to be more verbose :)
BalusC
Thanks for your comments, you always teach us :)
Mohammed
A: 

Parsing the query string is a bit more complicated than it seems, depending on how forgiving you want to be.

First, the query string is ASCII bytes. You read in these bytes one at a time and convert them to characters. If the character is ? or & then it signals the start of a parameter name. If the character is = then it signals the start of a parameter value. If the character is % then it signals the start of an encoded byte. Here is where it gets tricky.

When you read in a % char you have to read the next two bytes and interpret them as hex digits. That means the next two bytes will be 0-9, a-f or A-F. Glue these two hex digits together to get your byte value. But remember, bytes are not characters. You have to know what encoding was used to encode the characters. The character é does not encode the same in UTF-8 as it does in ISO-8859-1. In general it's impossible to tell from the bytes alone which encoding was used. I always use UTF-8 because my web site is configured to always serve everything using UTF-8, but in practice you can't be certain. Some user-agents will tell you the character encoding in the request; you can try to read that if you have a full HTTP request. If you just have a URL in isolation, good luck.
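
To make the é point concrete, here is a small sketch using only java.net.URLEncoder and java.net.URLDecoder. The class name EncodingDemo and its two helper methods are made up for illustration; they just wrap the checked exception.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class EncodingDemo {

    // Hypothetical helper: URL-encode with a named charset, no checked exception.
    static String encode(String s, String charset) {
        try {
            return URLEncoder.encode(s, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Hypothetical helper: URL-decode with a named charset, no checked exception.
    static String decode(String s, String charset) {
        try {
            return URLDecoder.decode(s, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String eAcute = "\u00e9"; // é
        System.out.println(encode(eAcute, "UTF-8"));      // %C3%A9 (two bytes)
        System.out.println(encode(eAcute, "ISO-8859-1")); // %E9    (one byte)
        // Decoding with the wrong charset yields two characters
        // (U+00C3 and U+00A9) instead of the original one.
        System.out.println(decode("%C3%A9", "ISO-8859-1").length()); // 2
    }
}
```

The same escaped string therefore means different things depending on which charset the receiver assumes.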

Anyway, assuming you are using UTF-8 or some other multi-byte character encoding, now that you've decoded one encoded byte you have to set it aside until you capture the next byte. You need all the encoded bytes that are together because you can't url-decode properly one byte at a time. Set aside all the bytes that are together then decode them all at once to reconstruct your character.
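
A minimal sketch of that "set the bytes aside, then decode them together" idea, assuming the caller already knows the charset. The class name PercentDecoder and its methods are hypothetical, and this is a simplified illustration rather than a replacement for URLDecoder.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;

class PercentDecoder {

    /** Decodes %XX escapes (and '+' as a space) in one query-string component. */
    static String decode(String s, Charset charset) {
        StringBuilder out = new StringBuilder();
        // Consecutive escaped bytes wait here until the run ends.
        ByteArrayOutputStream pending = new ByteArrayOutputStream();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '%') {
                if (i + 2 >= s.length()) {
                    throw new IllegalArgumentException("Truncated escape at index " + i);
                }
                int hi = Character.digit(s.charAt(i + 1), 16);
                int lo = Character.digit(s.charAt(i + 2), 16);
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException("Bad hex digits at index " + i);
                }
                pending.write((hi << 4) | lo); // glue the two hex digits into one byte
                i += 3;
            } else {
                flush(pending, out, charset);  // the run of escaped bytes has ended
                out.append(c == '+' ? ' ' : c);
                i++;
            }
        }
        flush(pending, out, charset);
        return out.toString();
    }

    // Turns all pending bytes into characters in one go, then clears the buffer.
    private static void flush(ByteArrayOutputStream pending, StringBuilder out, Charset charset) {
        if (pending.size() > 0) {
            out.append(new String(pending.toByteArray(), charset));
            pending.reset();
        }
    }

    public static void main(String[] args) {
        // Prints: café au lait
        System.out.println(decode("caf%C3%A9+au+lait", Charset.forName("UTF-8")));
    }
}
```

Decoding each byte on its own would turn %C3 and %A9 into two separate (wrong) characters; buffering the run lets the charset decoder see both bytes at once.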

Plus it gets more fun if you want to be lenient and account for user-agents that mangle URLs. For example, some webmail clients double-encode things, or double up the ?&= chars (for example: http://yoursite.com/blah??p1==v1&&p2==v2). If you want to try to gracefully deal with this, you will need to add more logic to your parser.
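
One possible way to tolerate doubled delimiters, sketched under some loud assumptions: runs of ?, & and = are treated as a single delimiter, the last value wins for a repeated name, and no percent-decoding is done here. The class name LenientSplit is hypothetical. Note the trade-off: a value that legitimately begins with an unescaped = would lose it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LenientSplit {

    /** Splits a raw query string, collapsing repeated '?', '&' and '=' delimiters. */
    static Map<String, String> split(String query) {
        Map<String, String> out = new LinkedHashMap<String, String>();
        String q = query.replaceFirst("^\\?+", "");  // drop any leading run of '?'
        for (String pair : q.split("&+")) {          // "&&" acts like a single "&"
            if (pair.isEmpty()) {
                continue;                            // e.g. the query started with '&'
            }
            String[] kv = pair.split("=+", 2);       // "==" acts like a single "="
            out.put(kv[0], kv.length > 1 ? kv[1] : "");
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints: {p1=v1, p2=v2}
        System.out.println(split("??p1==v1&&p2==v2"));
    }
}
```

Names and values still need URL-decoding afterwards, and double-encoded input would need a separate decision about whether to decode twice.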

Mr. Shiny and New
That does not explain how to parse or retrieve querystring parameter values
ChadNC
Right, but a bit cumbersome. For that we already have URLDecoder.
BalusC
@ChadNC: the third sentence tells you how to parse: read in one byte at a time and convert to chars. The fourth sentence warns you of special chars. Etc. Maybe you didn't read the answer?
Mr. Shiny and New
@BalusC: URLDecoder works but it has some failure modes if you are trying to be more lenient in what kind of URL you accept.
Mr. Shiny and New
+1  A: 

I don't think there is one in the JRE. You can find similar functions in other packages like Apache HttpClient. If you don't use any other packages, you just have to write your own. It's not that hard. Here is what I use:

import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class QueryString {

    private final Map<String, List<String>> parameters;

    public QueryString(String qs) {
        parameters = new TreeMap<String, List<String>>();

        // Parse query string
        String[] pairs = qs.split("&");
        for (String pair : pairs) {
            String name;
            String value;
            int pos = pair.indexOf('=');
            try {
                // for "n=", the value is "", for "n", the value is null
                if (pos == -1) {
                    // decode the bare name too, so "a%20b" and "a%20b=1" share a key
                    name = URLDecoder.decode(pair, "UTF-8");
                    value = null;
                } else {
                    name = URLDecoder.decode(pair.substring(0, pos), "UTF-8");
                    value = URLDecoder.decode(pair.substring(pos + 1), "UTF-8");
                }
            } catch (UnsupportedEncodingException e) {
                // Not really possible, throw unchecked
                throw new IllegalStateException("No UTF-8");
            }
            List<String> list = parameters.get(name);
            if (list == null) {
                list = new ArrayList<String>();
                parameters.put(name, list);
            }
            list.add(value);
        }
    }

    public String getParameter(String name) {
        List<String> values = parameters.get(name);
        if (values == null)
            return null;

        if (values.size() == 0)
            return "";

        return values.get(0);
    }

    public String[] getParameterValues(String name) {
        List<String> values = parameters.get(name);
        if (values == null)
            return null;

        return values.toArray(new String[values.size()]);
    }

    public Enumeration<String> getParameterNames() {
        return Collections.enumeration(parameters.keySet());
    }

    public Map<String, String[]> getParameterMap() {
        Map<String, String[]> map = new TreeMap<String, String[]>();
        for (Map.Entry<String, List<String>> entry : parameters.entrySet()) {
            List<String> list = entry.getValue();
            String[] values;
            if (list == null)
                values = null;
            else
                values = list.toArray(new String[list.size()]);
            map.put(entry.getKey(), values);
        }
        return map;
    }
}
ZZ Coder
What's the way to do it with the Apache classes?
Will
You can use the parse() method: http://hc.apache.org/httpcomponents-client/httpclient/apidocs/org/apache/http/client/utils/URLEncodedUtils.html
ZZ Coder
Please put the apache commons link in its own answer so I can vote it up.
itsadok
+6  A: 

On Android, the Apache libraries provide a Query parser:

http://developer.android.com/reference/org/apache/http/client/utils/URLEncodedUtils.html and http://hc.apache.org/httpcomponents-client/httpclient/apidocs/org/apache/http/client/utils/URLEncodedUtils.html

Will
This helped me too, thanks!
alexanderblom
Hi Will, I think the Apache class is still in alpha release (4.1)
Mohammed
@Daziplqa I'm not sure I follow? It's been in the Android platform from the beginning and it works for me :)
Will
A: 

Based on the answer from BalusC, I wrote some example Java code:

    if (queryString != null)
    {
        final String[] arrParameters = queryString.split("&");
        for (final String tempParameterString : arrParameters)
        {
            // Limit of 2 so a value containing '=' is not cut short
            final String[] arrTempParameter = tempParameterString.split("=", 2);
            if (arrTempParameter.length >= 2)
            {
                final String parameterKey = arrTempParameter[0];
                final String parameterValue = arrTempParameter[1];
                //do something with the parameters
            }
        }
    }
Andreas
+1  A: 

On Android, you can use the Uri.parse static method of the android.net.Uri class to do the heavy lifting. If you're doing anything with URIs and Intents you'll want to use it anyways.

Patrick O'Leary
A: 

Here is BalusC's answer, but it compiles and returns results:

public static Map<String, List<String>> getUrlParameters(String url)
        throws UnsupportedEncodingException {
    Map<String, List<String>> params = new HashMap<String, List<String>>();
    String[] urlParts = url.split("\\?");
    if (urlParts.length > 1) {
        String query = urlParts[1];
        for (String param : query.split("&")) {
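            // Limit of 2 keeps any '=' inside the value intact
            String[] pair = param.split("=", 2);
            String key = URLDecoder.decode(pair[0], "UTF-8");
            // A parameter without '=' (e.g. "flag") has no value; use "" instead of failing
            String value = pair.length > 1 ? URLDecoder.decode(pair[1], "UTF-8") : "";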
            List<String> values = params.get(key);
            if (values == null) {
                values = new ArrayList<String>();
                params.put(key, values);
            }
            values.add(value);
        }
    }
    return params;
}
dfrankow