views:

102

answers:

5

Hello,

I have a pretty easy question (using C).

In a sentence such as

In this document, there are 345 words and 6 figures

How can I scan 345 and 6 while ignoring everything in between?

I tried `fscanf(FILE *pointer,"%d %d",&words,&figs);` but it only gets the first value ...

What am I doing wrong?

EDIT

I'm sorry, I forgot to mention that the statement is always fixed: In this document, there are # words and # figures

+1  A: 

I don't think scanf/fscanf will do what you need in this case if you don't know the exact format of the input string.

A better approach might be to parse the input line until you hit a whitespace, period, or comma (or some other separator), and then see if what you have so far consists solely of digits. If so, then you have a number, otherwise, you have a word (assuming here the sentence is well formed). You could then store that number in an array or whatever data structure you desire.
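The token-by-token idea described above could be sketched roughly as follows. This is only an illustration: `collect_numbers` and its separator set are made-up names and choices, not something from the question, and it assumes the numbers fit in a `long`.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: stores up to max_out all-digit tokens of line in out.
   Returns how many numbers were stored. The input string is not modified. */
size_t collect_numbers(const char *line, long *out, size_t max_out)
{
    const char *seps = " \t\r\n.,";    /* assumed separator set */
    const char *p = line;
    size_t count = 0;

    while (*p != '\0' && count < max_out) {
        p += strspn(p, seps);            /* skip any separators */
        size_t len = strcspn(p, seps);   /* length of the next token */
        if (len == 0)
            break;                       /* only separators were left */

        /* A token counts as a number only if every character is a digit. */
        size_t i = 0;
        while (i < len && isdigit((unsigned char)p[i]))
            i++;
        if (i == len)
            out[count++] = strtol(p, NULL, 10);

        p += len;                        /* continue after this token */
    }
    return count;
}
```

For the sentence in the question this should store 345 and 6 and return 2; tokens such as `abc123` are skipped because they are not made of digits only.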

However, if the sentence structure is always in exactly the same format, you could use an approach like this:

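    #include <stdio.h>

    int main(void) {
      const char *buff = "In this document, there are 345 words and 6 figures";
      char extra1[5000];
      char extra2[5000];
      int a, b;
      /* Each %[...] is a character set, not a literal string: it consumes any run of the listed characters. */
      if (sscanf(buff, "%4999[In this document, there are ]%d%4999[ words and ]%d", extra1, &a, extra2, &b) == 4)
        printf("%d %d\n", a, b);
      return 0;
    }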
dcp
Thanks for the reply :) What if the statement was fixed?
ZaZu
If the statement is fixed, then you could use something like what I have above. If not, you need a different approach like the "better" approach I mentioned above.
dcp
Georg Fritzsche
+2  A: 

I think that the way to do this is to combine strpbrk with strtol.

It would look kind of like:

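long int n;
const char *p = str;
char *end;   /* strtol needs a char **, so p itself cannot be passed */
while( (p = strpbrk(p, "-0123456789")) ) {
    n = strtol(p, &end, 0);
    if (end == p) { p++; continue; }   /* lone '-': nothing converted, move on */
    handle(n);
    p = end;
}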

Update:
Depending on what you want, it may be better to pass 10 as the base to strtol: in a test I just ran with base 0, the input `Testing0x100what happens if I use base16 hex` really was converted into 256 and 16, because `0x100` was read as hexadecimal.
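To see the effect of the base, here is a small sketch that wraps the loop in a function taking the base as a parameter. The name `extract_longs` and the output-array interface are assumptions made for the example; it also skips a `-` that is not followed by digits, so the loop always makes progress.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: stores up to max_out integers found in str,
   converted with the given base, and returns how many were stored. */
size_t extract_longs(const char *str, int base, long *out, size_t max_out)
{
    const char *p = str;
    size_t count = 0;

    while (count < max_out && (p = strpbrk(p, "-0123456789")) != NULL) {
        char *end;
        long n = strtol(p, &end, base);
        if (end == p) {      /* a '-' with no digits after it */
            p++;
            continue;
        }
        out[count++] = n;
        p = end;             /* resume right after the converted text */
    }
    return count;
}
```

With the test string from the update, base 0 should yield 256 and 16, while base 10 should yield 0, 100 and 16, since the `0` in front of the `x` is then converted on its own.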

Zan Lynx
+1  A: 

You need to tokenize the string and check each word in sequence. The following code is modified from a C++ reference, but strtok itself is standard C.

/* strtok example */
#include <stdio.h>
#include <string.h>

int main ()
{
  char str[] ="- This, a sample 9876 string.";
  char * pch;
  printf ("Splitting string \"%s\" into tokens:\n",str);
  pch = strtok (str," ,.-");
  while (pch != NULL)
  {
    if (pch[0] >= '0' && pch[0] <= '9')
    {
        // Token starts with a digit, so treat it as a number
    }
    pch = strtok (NULL, " ,.-");
  }
  return 0;
}
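If checking only the first character is too loose, one possible refinement is to let strtol convert each token and accept it only when the whole token was consumed. The helper name `numbers_from_tokens` is invented for this sketch, and it uses the same delimiter set as the example above.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: tokenizes text and stores up to max_out tokens that
   are entirely numeric. strtok writes into text, so it must be a writable
   array, not a string literal. */
size_t numbers_from_tokens(char *text, long *out, size_t max_out)
{
    size_t count = 0;
    char *tok = strtok(text, " ,.-");

    while (tok != NULL && count < max_out) {
        char *end;
        long n = strtol(tok, &end, 10);
        /* Accept the token only if strtol consumed all of it. */
        if (end != tok && *end == '\0')
            out[count++] = n;
        tok = strtok(NULL, " ,.-");
    }
    return count;
}
```

Because `-` is one of the delimiters, negative numbers lose their sign with this approach, and a token like `12b` is rejected rather than read as 12.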
jfawcett
+2  A: 

The problem with your format string is that a space in it only skips white-space in the input. It does not skip the words between the two numbers, so the second %d fails to match.

I don't think it's possible to do it using only scanf() if there might be no second numerical value before the next linebreak, because `%*[^0-9]` would then keep reading past the linebreak. You would also be vulnerable to arbitrary input lengths. But an fgets()/sscanf() combination should do fine:

int a=0, b=0;
char buf[255];
fgets(buf, sizeof(buf), stdin);
sscanf(buf, "%*[^0-9]%d%*[^0-9]%d", &a, &b);

If, however, you know that there are always two separate numerical values and the input is of a reasonable length, the following should do it:

int a=0, b=0;
scanf("%*[^0-9]%d%*[^0-9]%d", &a, &b);
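As a sketch of the fgets()/sscanf() combination with the return values checked, something along these lines might work. The function names `parse_two_counts` and `read_two_counts` are made up for the example.

```c
#include <stdio.h>

/* Hypothetical helper: extracts the first two integers from one line of text.
   Returns 1 if both were found, 0 otherwise. */
int parse_two_counts(const char *line, int *a, int *b)
{
    /* %*[^0-9] skips a non-empty run of non-digits without storing it.
       sscanf returns how many values it stored, so anything but 2 is a failure. */
    return sscanf(line, "%*[^0-9]%d%*[^0-9]%d", a, b) == 2;
}

/* Hypothetical helper: same, but first reads the line from fp with a bounded fgets. */
int read_two_counts(FILE *fp, int *a, int *b)
{
    char buf[255];

    if (fgets(buf, sizeof buf, fp) == NULL)
        return 0;   /* EOF or read error */
    return parse_two_counts(buf, a, b);
}
```

One caveat: a scanset has to match at least one character, so a line that begins directly with a digit makes the first `%*[^0-9]` fail and nothing is read. For the sentence in the question that is not an issue, since it always starts with text.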
Georg Fritzsche
This is the simplest way here, `scanf("%d%*[^0-9]%d", ` but it isn't working for me ... why?
ZaZu
@Zazu: Does it give more info like an identifier (i.e. variable or function name)?
Georg Fritzsche
@Georg: This is my exact code `fscanf(file,"%*[^0-9]%d%*[^0-9]%d",` ... It's not working :(
ZaZu
Are the variables `word` and `doc` defined somewhere?
Georg Fritzsche
They were, as `int doc,word;`. Now I made them `int doc=0,word=0;` and it worked, but when I print their values they are 0 :S It didn't scan the document.
ZaZu
Have you read/fseek()d to the right line in the file before trying to read them in?
Georg Fritzsche
Well, the statement is in the first line, which makes the process even easier. I fopen the file, then if file==NULL, return 0, else `int doc=0,word=0; fscanf(file,"%*[^0-9]%d%*[^0-9]%d",`
ZaZu
@ZaZu: Hm, you could test or compare to [this version](http://pastebin.com/rkE2VuaU) or add a small sample that reproduces the problem to your question.
Georg Fritzsche
Thanks for the link, I'll check it in a sec. Using the method you mentioned before, I did: `fscanf(file,"%*[In this document, there are ]%d%*[words and ]%d",` Well, I'm glad that words show numbers, but doc = 0 !! :( Why is that?
ZaZu
`%*[abc]` is for character sets. If you want to put fixed strings in, simply don't surround them with anything, e.g. `"Have %d waffles."`.
Georg Fritzsche
Doesn't the star mean ignore what's inside? `%*[abc]`
ZaZu
The asterisk means *"don't store what matches the following"*. `[ab]` specifies a character set which matches strings consisting only of 'a's and 'b's. Also, anything in the format that is not a conversion specifier (`%foo`) is matched against the input but not stored anywhere.
Georg Fritzsche
Ooooh, thanks. I'll keep it for future needs, this code will be really helpful !!
ZaZu
+2  A: 

This is because functions of the scanf() family are meant to read strings written by a printf()-like function with the same format. Since that is the case here, there is no need to resort to string parsing and conversions to integers:

const char *format = "In this document, there are %d words and %d figures";

int n = fscanf(fp, format, &words, &figs);
if (n != 2) //--- not recognized ...

Of course, the format has to be exactly the same, at least before the values that are read, so it's safer to keep it in one place, following the Once and Only Once principle. It is also necessary to test the fscanf() return value.
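To illustrate keeping the format in one place, here is a minimal sketch where the same format drives both writing and reading. It works on strings with snprintf()/sscanf() instead of a file so it is easy to try out; `REPORT_FORMAT`, `write_report` and `read_report` are names invented for the example.

```c
#include <stdio.h>

/* Single definition of the sentence layout, shared by writer and reader. */
#define REPORT_FORMAT "In this document, there are %d words and %d figures"

/* Hypothetical helper: writes the sentence into dst, returns snprintf's result. */
int write_report(char *dst, size_t size, int words, int figs)
{
    return snprintf(dst, size, REPORT_FORMAT, words, figs);
}

/* Hypothetical helper: returns 1 only if both values were read back. */
int read_report(const char *src, int *words, int *figs)
{
    return sscanf(src, REPORT_FORMAT, words, figs) == 2;
}
```

Note that the return value only counts stored values, so a mismatch in the text after the last `%d` goes unnoticed, while a mismatch before it makes the read fail.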

philippe
That was simple !! Exactly what I needed, thanks :)
ZaZu