ansaurus

Question

Answer 1

+2 A:

If you only want the numbers, you can use a regular expression, for example the following:

(\d+).*?(\d+).*?(\d+%)

A quick test in PowerShell shows that it does work at least for your input data:

PS Home:\> function test ($re) {
>>   $a -match $re; $Matches
>>   $b -match $re; $Matches
>> }
>>
PS Home:\> $a = "Lo: 46°F. Hi: 67°F. Chance of precipitation: 20%"
PS Home:\> $b = "Niedrig: 46°F. Höchst: 67°F. Niederschlag %: 20%"
PS Home:\> test "(\d+).*?(\d+).*?(\d+%)"
True

Name                           Value
----                           -----
3                              20%
2                              67
1                              46
0                              46°F. Hi: 67°F. Chance of precipitation: 20%
True
3                              20%
2                              67
1                              46
0                              46°F. Höchst: 67°F. Niederschlag %: 20%

However, it will stop working if any locale puts digits in the description text itself.

You can add other constraints, like requiring a colon before every match:

: (\d+).*?: (\d+).*?: (\d+%)

This should deal with spurious numbers elsewhere in the string. But the best way overall would be to get your data from a source that provides it for machine reading, not for human consumption.
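For reference, here is a minimal C# sketch of the colon-anchored pattern above. The class and method names (`ColonAnchoredExample`, `Extract`) are made up for illustration, and the "Day 3" prefix in the sample string is an invented stray number, not something from the actual feed.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ColonAnchoredExample
{
    // Pattern from the answer: every captured number must follow ": ".
    private static readonly Regex Pattern =
        new Regex(@": (\d+).*?: (\d+).*?: (\d+%)");

    // Returns the three captured strings, or null when the input does not match.
    public static string[] Extract(string forecast)
    {
        Match m = Pattern.Match(forecast);
        if (!m.Success)
        {
            return null;
        }
        return new[] { m.Groups[1].Value, m.Groups[2].Value, m.Groups[3].Value };
    }

    public static void Main()
    {
        // "Day 3" is a hypothetical stray number; it is skipped because no ": " precedes it.
        string[] parts = Extract("Day 3 Lo: 46°F. Hi: 67°F. Chance of precipitation: 20%");
        Console.WriteLine(parts == null ? "no match" : string.Join(", ", parts));
    }
}
```

The third group keeps the percent sign, as in the original pattern; move the `%` outside the parentheses if you only want the digits.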

Joey 2009-09-21 10:58:59

The regex worked. I combined your and Tor Haugen's answers to get my problem solved. Thanks!

Vijay 2009-09-21 12:03:52

Answer 2

A:

Use a regex (but I don't know the regex formula ;) )

You can also loop over the sentence and check whether each char is a digit. Each time you encounter one, append it to a string. When you find something other than a digit, parse the string to an int, and voilà. Do this 3 times.
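A rough sketch of that loop in C#, in case it helps. The names `DigitScanExample` and `ReadNumbers` are hypothetical, and it assumes every run of digits fits in an `int`. Instead of stopping after three numbers, it collects every run it finds and leaves it to the caller to take the first three.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class DigitScanExample
{
    // Collects every run of consecutive digits in the text as an int.
    public static List<int> ReadNumbers(string text)
    {
        var numbers = new List<int>();
        var digits = new StringBuilder();
        foreach (char c in text)
        {
            if (c >= '0' && c <= '9')
            {
                digits.Append(c);
            }
            else if (digits.Length > 0)
            {
                // A non-digit ends the current run: parse it and start over.
                numbers.Add(int.Parse(digits.ToString()));
                digits.Clear();
            }
        }
        // Handle a number that runs to the very end of the string.
        if (digits.Length > 0)
        {
            numbers.Add(int.Parse(digits.ToString()));
        }
        return numbers;
    }

    public static void Main()
    {
        List<int> values = ReadNumbers("Lo: 46°F. Hi: 67°F. Chance of precipitation: 20%");
        Console.WriteLine(string.Join(", ", values));
    }
}
```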

PoweRoy 2009-09-21 10:59:16

Certainly! I can do this. But, is that the best solution?

Vijay 2009-09-21 11:04:32

I would use a regex like Johannes did. Definitely cleaner, but harder to read.

PoweRoy 2009-09-21 11:21:42

By the way, there's no best solution. Every solution has its pros and cons. Regex pros: clean, smaller, and probably faster. Regex cons: harder to read, and regex is hard to master.

PoweRoy 2009-09-21 11:22:46

Ok. Let me check it out. Thanks :)

Vijay 2009-09-21 11:51:54

Answer 3

A:

It's quite weird that you are not getting XML with the values in separate nodes, which would make more sense to me (then you could pick which values to use for different locales).

But if you want to extract data from the given strings and are not a fan of regex, try this or something similar:

string dataUS = "Lo: 46°F. Hi: 67°F. Chance of precipitation: 20%";
string dataDE = "Niedrig: 46°F. Höchst: 67°F. Niederschlag %: 20%";
// Split into at most 4 parts; parts 1..3 each start with a number.
string[] stringValues = dataUS.Split(new string[] {": "}, 4, StringSplitOptions.None);
List<int> values = new List<int>();
for (int i = 1; i < 4; i++)
{
 StringBuilder sb = new StringBuilder();
 foreach (char c in stringValues[i].Trim())
 {
  if (!Char.IsDigit(c))
  {
   break; // stop at the first non-digit
  }
  sb.Append(c);
 }
 // Adding after the loop also covers a number that ends the string.
 if (sb.Length > 0)
 {
  values.Add(Convert.ToInt32(sb.ToString()));
 }
}

(I'm splitting on ": " instead of searching for digits)

Mike Nowak 2009-09-21 11:18:22

Answer 4

A:

I suggest using a separate regex for each value, chosen according to the UI culture: for example "(Lo|Niedrig):\s*(\d+)" for the low temperature, "(Hi|Höchst):\s*(\d+)" for the high temperature, another one for the chance of precipitation, and so on. In each of these, the number is in the second group of the match.

Beatles1692 2009-09-21 11:21:21

You can also use non-capturing parentheses for the literal parts: `(?:Lo|Niedrig)` to avoid having groups you don't do anything with.
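A small C# sketch of the per-label approach with a non-capturing group, so the number ends up in group 1. The helper name `ReadLabeled` is invented for this example, and the `\s*` after the colon is an assumption added so that the space in the sample strings is accepted.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LabeledValueExample
{
    // labels: alternatives such as "Lo|Niedrig". They go inside (?:...) so they are not captured.
    // Returns the number following the label, or null when the label is not found.
    public static int? ReadLabeled(string text, string labels)
    {
        Match m = Regex.Match(text, "(?:" + labels + @"):\s*(\d+)");
        if (!m.Success)
        {
            return null;
        }
        return int.Parse(m.Groups[1].Value);
    }

    public static void Main()
    {
        string de = "Niedrig: 46°F. Höchst: 67°F. Niederschlag %: 20%";
        Console.WriteLine(ReadLabeled(de, "Lo|Niedrig"));
        Console.WriteLine(ReadLabeled(de, "Hi|Höchst"));
        Console.WriteLine(ReadLabeled(de, "Chance of precipitation|Niederschlag %"));
    }
}
```

The labels are inserted into the pattern unescaped, so this only works for labels without regex metacharacters; `Regex.Escape` on each alternative would be safer for arbitrary text.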

Joey 2009-09-21 11:49:42

Answer 5

+2 A:

You should consider always fetching the RSS using the same culture. That way, you'll have an easier task parsing the content. If you'll only be using the numbers, it shouldn't stop you from emitting culture-specific content to the end user.

So if you go for the en-US version, you could do it like this:

// Dots are escaped so they match a literal "." rather than any character.
Regex re = new Regex(@"Lo: (\d+)°F\. Hi: (\d+)°F\. Chance of precipitation: (\d+)%");
var match = re.Match(forecast);
if (match.Success)
{
    var groups = match.Groups;
    lo = int.Parse(groups[1].Value);
    hi = int.Parse(groups[2].Value);
    prec = int.Parse(groups[3].Value);
}

Tor Haugen 2009-09-21 11:30:16

I combined your and Johannes Rössel's answers to get my problem solved. Thanks!

Vijay 2009-09-21 12:02:49

Answer 6

+1 A:

The following should extract the two temperatures and the chance of precipitation, as well as the units that are used (for culture-dependent units).

(?<lo>\d+°.).*?(?<hi>\d+°.).*?(?<precipitation>\d+)

If you don't want units extracted, then you can use

(?<lo>\d+)°.*?(?<hi>\d+)°.*?(?<precipitation>\d+)
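As an illustration, the named groups from the second pattern can be read by name in C#. This is only a sketch; `NamedGroupExample` and `Parse` are hypothetical names, and the result is returned as a plain array in the order lo, hi, precipitation.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedGroupExample
{
    // Second pattern from the answer: numbers only, no units.
    private static readonly Regex Pattern =
        new Regex(@"(?<lo>\d+)°.*?(?<hi>\d+)°.*?(?<precipitation>\d+)");

    // Returns { lo, hi, precipitation }, or null when the text does not match.
    public static int[] Parse(string forecast)
    {
        Match m = Pattern.Match(forecast);
        if (!m.Success)
        {
            return null;
        }
        return new[]
        {
            int.Parse(m.Groups["lo"].Value),
            int.Parse(m.Groups["hi"].Value),
            int.Parse(m.Groups["precipitation"].Value)
        };
    }

    public static void Main()
    {
        int[] v = Parse("Niedrig: 46°F. Höchst: 67°F. Niederschlag %: 20%");
        Console.WriteLine(v == null ? "no match" : string.Join(", ", v));
    }
}
```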

ICR 2009-09-21 11:30:19


Reading numbers from string in C#
