Hi, what would be the easiest way to parse the date format below in Java?

2010-09-18T10:00:00.000+01:00

I read the DateFormat API in Java and could not find a method that takes a string in this date format as a parameter to parse. By "parse" I mean that I want to extract the date (day, month and year), the time and the timezone into separate String objects.

Thanks in advance

+1  A: 

You should be using SimpleDateFormat.

Find examples here and here to get started.

johnbk
+2  A: 

This date is in ISO 8601 format. Here's a link to a parser specific to this format that uses the Java SimpleDateFormat parsing APIs internally.

orangepips
Hi, I think I didn't explain myself properly. I get a string in the same format as the one I posted. What I want to do is separate the string into "date (year, month and day)", "time" and "timezone".
jonney
Once you call parse() on the aforementioned API you will have a java.util.Date object. That should be enough for you to figure out how to do what you want. You'll also need to review the java.util.Calendar class API.
orangepips
@jonney -- in this case you *could* separate the string using either a regex or simply string slices. But that would be very brittle. I think using a calendar implementation will treat you much better.
andersoj
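A minimal sketch of the Date-plus-Calendar route suggested in the comments above. It assumes Java 7 or later (for the "XXX" pattern letter, which accepts offsets written as +01:00) and assumes the input always ends in a six-character ±hh:mm offset. The class and method names are hypothetical.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

class CalendarFields {
  // Returns {date, time, offset} as three separate strings.
  static String[] split(String input) {
    Date instant;
    try {
      instant = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSXXX").parse(input);
    } catch (ParseException e) {
      throw new IllegalArgumentException("Unexpected format: " + input, e);
    }
    // A Date is only an instant and does not remember the offset, so the
    // zone is rebuilt from the last six characters ("+01:00" -> "GMT+01:00").
    String offset = input.substring(input.length() - 6);
    Calendar cal = new GregorianCalendar(TimeZone.getTimeZone("GMT" + offset));
    cal.setTime(instant);
    String date = String.format(Locale.ROOT, "%04d-%02d-%02d",
        cal.get(Calendar.YEAR),
        cal.get(Calendar.MONTH) + 1,   // Calendar months are zero-based
        cal.get(Calendar.DAY_OF_MONTH));
    String time = String.format(Locale.ROOT, "%02d:%02d:%02d",
        cal.get(Calendar.HOUR_OF_DAY),
        cal.get(Calendar.MINUTE),
        cal.get(Calendar.SECOND));
    return new String[] { date, time, offset };
  }

  public static void main(String[] args) {
    for (String part : split("2010-09-18T10:00:00.000+01:00")) {
      System.out.println(part);
    }
  }
}
```

The Calendar has to be given the original offset explicitly; otherwise the fields come back in the JVM's default time zone rather than the one in the string.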
A: 

Use SimpleDateFormat, with the pattern yyyy-MM-dd'T'HH:mm:ss.SSSZ


Update: SimpleDateFormat won't work with the ISO 8601 date format. Use JodaTime instead; it provides ISOChronology, which complies with ISO 8601.

A brief example can be found on SO.

The Elite Gentleman
That won't work. SimpleDateFormat uses RFC 822 to format time zones, while the example has a time zone String according to ISO 8601. These are not compatible.
jarnbjo
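To illustrate the incompatibility described in the comment above: the "Z" pattern letter expects an RFC 822 offset such as +0100, while the input carries +01:00. One possible workaround, sketched here under the assumption that the offset is always the last thing in the string, is to drop the colon before parsing. The class name is hypothetical.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

class Rfc822Workaround {
  static Date parse(String input) {
    // Turn a trailing "+01:00" into "+0100"; anything else is left untouched.
    String rfc822 = input.replaceAll("([+-]\\d{2}):(\\d{2})$", "$1$2");
    try {
      return new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSZ").parse(rfc822);
    } catch (ParseException e) {
      throw new IllegalArgumentException("Unexpected format: " + input, e);
    }
  }

  public static void main(String[] args) {
    // Prints the instant as milliseconds since the epoch.
    System.out.println(parse("2010-09-18T10:00:00.000+01:00").getTime());
  }
}
```

This yields only the instant; the offset itself is not retained in the resulting Date.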
+1  A: 

If you are doing anything non-trivial with dates and times, I recommend JodaTime. See this extensive SO discussion, which includes ISO 8601. See also "Should I use native date/time...".

Here's an example code snippet, taken from this example, if you want to use JDK SimpleDateFormat.

// 2004-06-14T19:GMT20:30Z
// 2004-06-20T06:GMT22:01Z

// http://www.cl.cam.ac.uk/~mgk25/iso-time.html
//    
// http://www.intertwingly.net/wiki/pie/DateTime
//
// http://www.w3.org/TR/NOTE-datetime
//
// Different standards may need different levels of granularity in the date and
// time, so this profile defines six levels. Standards that reference this
// profile should specify one or more of these granularities. If a given
// standard allows more than one granularity, it should specify the meaning of
// the dates and times with reduced precision, for example, the result of
// comparing two dates with different precisions.

// The formats are as follows. Exactly the components shown here must be
// present, with exactly this punctuation. Note that the "T" appears literally
// in the string, to indicate the beginning of the time element, as specified in
// ISO 8601.

//    Year:
//       YYYY (eg 1997)
//    Year and month:
//       YYYY-MM (eg 1997-07)
//    Complete date:
//       YYYY-MM-DD (eg 1997-07-16)
//    Complete date plus hours and minutes:
//       YYYY-MM-DDThh:mmTZD (eg 1997-07-16T19:20+01:00)
//    Complete date plus hours, minutes and seconds:
//       YYYY-MM-DDThh:mm:ssTZD (eg 1997-07-16T19:20:30+01:00)
//    Complete date plus hours, minutes, seconds and a decimal fraction of a
// second
//       YYYY-MM-DDThh:mm:ss.sTZD (eg 1997-07-16T19:20:30.45+01:00)

// where:

//      YYYY = four-digit year
//      MM   = two-digit month (01=January, etc.)
//      DD   = two-digit day of month (01 through 31)
//      hh   = two digits of hour (00 through 23) (am/pm NOT allowed)
//      mm   = two digits of minute (00 through 59)
//      ss   = two digits of second (00 through 59)
//      s    = one or more digits representing a decimal fraction of a second
//      TZD  = time zone designator (Z or +hh:mm or -hh:mm)
public static Date parse( String input ) throws java.text.ParseException 
{
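  // NOTE: SimpleDateFormat's "z" pattern expects the zone as GMT[-+]hh:mm,
  // so the ISO 8601 designator has to be rewritten before parsing.
  // The pattern has no fractional-seconds field; inputs with milliseconds
  // (e.g. 10:00:00.000) need ".SSS" added after "ss".
  SimpleDateFormat df = new SimpleDateFormat( "yyyy-MM-dd'T'HH:mm:ssz" );

  // A trailing "Z" means UTC, so replace it with an explicit GMT offset.
  if ( input.endsWith( "Z" ) ) {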
    input = input.substring( 0, input.length() - 1) + "GMT-00:00";
  } else {
    int inset = 6;

    String s0 = input.substring( 0, input.length() - inset );
    String s1 = input.substring( input.length() - inset, input.length() );    

    input = s0 + "GMT" + s1;
  }

  return df.parse( input );        
}
andersoj
A: 

You can use javax.xml.datatype.DatatypeFactory#newXMLGregorianCalendar(String lexicalRepresentation) (API docs). The returned XMLGregorianCalendar gives you access to all the separate fields.
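A short sketch of how that call might be used; the class name and the wrapping of the checked exception are illustrative choices, not part of the API.

```java
import javax.xml.datatype.DatatypeConfigurationException;
import javax.xml.datatype.DatatypeFactory;
import javax.xml.datatype.XMLGregorianCalendar;

class XmlCalendarFields {
  static XMLGregorianCalendar parse(String input) {
    try {
      return DatatypeFactory.newInstance().newXMLGregorianCalendar(input);
    } catch (DatatypeConfigurationException e) {
      throw new IllegalStateException("No DatatypeFactory implementation available", e);
    }
  }

  public static void main(String[] args) {
    XMLGregorianCalendar cal = parse("2010-09-18T10:00:00.000+01:00");
    // getMonth() is 1-based here, unlike java.util.Calendar.
    System.out.println("Date: " + cal.getYear() + "-" + cal.getMonth() + "-" + cal.getDay());
    System.out.println("Time: " + cal.getHour() + ":" + cal.getMinute() + ":" + cal.getSecond());
    // getTimezone() reports the offset from UTC in minutes.
    System.out.println("Offset (minutes): " + cal.getTimezone());
  }
}
```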

jarnbjo
+1  A: 

Another answer, since you seem to be focused on simply tearing the String apart (not a good idea, IMHO.) Let's assume the string is valid ISO8601. Can you assume it will always be in the form you cite, or is it just valid 8601? If the latter, you have to cope with a bunch of scenarios as these guys did.

The regex they came up with to validate 8601 alternatives is:

^([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])
 (-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])
 ((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?
 ([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?$ 

Figuring out how to tease out the correct capture groups makes me woozy. Nevertheless, the following will work for your specific case:

import java.util.regex.Matcher;
import java.util.regex.Pattern;


public class Regex8601
{
  static final Pattern r8601 = Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2})T((\\d{2}):"+
                               "(\\d{2}):(\\d{2})\\.(\\d{3}))((\\+|-)(\\d{2}):(\\d{2}))");


  //2010-09-18T10:00:00.000+01:00

  public static void main(String[] args)
  {
    String thisdate = "2010-09-18T10:00:00.000+01:00";
    Matcher m = r8601.matcher(thisdate);
    if (m.lookingAt()) {
      System.out.println("Year: "+m.group(1));
      System.out.println("Month: "+m.group(2));
      System.out.println("Day: "+m.group(3));
      System.out.println("Time: "+m.group(4));
      System.out.println("Timezone: "+m.group(9));
    } else {
      System.out.println("no match");
    }
  }
}
andersoj
I call a web service that sends me a calendar event date in ISO 8601 format, and I am then expected to show the time, date and timezone separately. A pain, I know; why couldn't they blooming well do this on their own side if this is how they want the information displayed?
jonney
Also: I was working on a regex myself using these: private static final String REG_EX_DATE = "(.*)T"; private static final String REG_EX_TIME = "T(.*)+"; private static final String REG_EX_TIMEZONE = "+|-(.*)";
jonney
I'm going to try the above code anyway. Cheers for that.
jonney
Nope, the code above didn't work. You can try it at http://www.regexplanet.com/simple/index.html if you like. I will try the first regex you posted.
jonney
I don't know anything about that regexplanet thing. The code above compiled and ran fine on my JDK, giving me: Year: 2010, Month: 09, Day: 18, Time: 10:00:00.000, Timezone: +01:00
andersoj
And in any case, I suggest using a `SimpleDateFormat` and a Calendar (or the JodaTime equivalent) to extract the relevant bits.
andersoj
@jonney: If the above code didn't work, can you please show me the output you saw?
andersoj
Hi, I got it to work somehow. I ran it on my machine one more time and it worked. I am able to parse the date using the first regex you posted, taken from that website.
jonney
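For readers on Java 8 or later (an assumption; nothing in this thread uses it), the java.time package offers another way to get the three strings the question asks for, since OffsetDateTime.parse accepts this format directly. This is a sketch of that alternative rather than any of the approaches proposed above, and the class name is hypothetical.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;

class JavaTimeSplit {
  private static final DateTimeFormatter TIME =
      DateTimeFormatter.ofPattern("HH:mm:ss.SSS");

  // Returns {date, time, offset} as three separate strings.
  static String[] split(String input) {
    OffsetDateTime odt = OffsetDateTime.parse(input);
    return new String[] {
        odt.format(DateTimeFormatter.ISO_LOCAL_DATE), // yyyy-MM-dd
        odt.format(TIME),                             // local time with milliseconds
        odt.getOffset().getId()                       // offset such as +01:00
    };
  }

  public static void main(String[] args) {
    for (String part : split("2010-09-18T10:00:00.000+01:00")) {
      System.out.println(part);
    }
  }
}
```

Unlike java.util.Date, OffsetDateTime keeps the offset it was parsed with, so no string slicing is needed to recover the timezone part.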