Hi,

I have a text file that I am reading data from, but I can't seem to get it right.

Here are two lines from the text file as an example (these aren't real people, don't worry):

Michael    Davidson     153 Summer Avenue        Evanston        CO 80303
Ingrid     Johnson      2075 Woodland Road       Aurora          IL 60507

Here is the code I have to load the text file and put the data into a struct. I am still new to C++ (obviously) and I'm having a hard time using get and >> together. The code below works fine until I get to the "state", and then something goes wrong. Thanks for the help!

#include <cstdlib>
#include <fstream>
#include <iostream>
using namespace std;

//constants
const int FIRST_NAME_LEN = 11;
const int LAST_NAME_LEN = 13;
const int ADDRESS = 25;
const int CITY_NAME_LEN = 16;
const int STATE_LEN = 3;

//define struct data types
struct CustomerType {
    char firstName[FIRST_NAME_LEN];
    char lastName[LAST_NAME_LEN];
    char streetAddress[ADDRESS];
    char city[CITY_NAME_LEN];
    char state[STATE_LEN];
    int zipCode;
};

//prototype function
ifstream& getInfo(CustomerType& CT_Struct, ifstream& infile);

int main() {

    //declare struct objects
    CustomerType CT_Struct;

    ifstream infile("PGM951_customers.txt");
    if(!infile) {
     cerr << "Could not open the input file." << endl;
     exit(1); //terminates the program
    }

//call the function
getInfo(CT_Struct, infile);

return 0;
}

ifstream& getInfo(CustomerType& CT_Struct, ifstream& infile) {

    while(infile) {
     infile.get(CT_Struct.firstName, sizeof(CT_Struct.firstName));
     infile.get(CT_Struct.lastName, sizeof(CT_Struct.lastName));
     infile.get(CT_Struct.streetAddress, sizeof(CT_Struct.streetAddress));
     infile.get(CT_Struct.city, sizeof(CT_Struct.city));
     infile.get(CT_Struct.state, sizeof(CT_Struct.state));
     infile >> ws;
     infile >> CT_Struct.zipCode; 

     cout << CT_Struct.firstName << " | " << CT_Struct.lastName << " | " << CT_Struct.streetAddress  
      << " | " << CT_Struct.city << " | " << CT_Struct.state  << " | " << CT_Struct.zipCode << endl;
    }

return infile;

}

=== edit =========== Reading in the state at 8 char was just me messing around and then I forgot to change it back...sorry.

A: 

If I were you I would start again from scratch. I would:

  • use std::string instead of character arrays for your data
  • read a line at a time from the file using std::getline
  • parse the line up using a stringstream
  • avoid mixing formatted and unformatted input
anon
The reason this is tricky is that names sometimes have 3, 4 or even 5 "words". Street names and cities are similarly variable.
dicroce
I assumed this was a learning exercise where the questioner controlled the input data. If not, he should obviously investigate formats such as XML or CSV for input.
anon
Using a fixed-size field format is the easiest way to do this. If you use a field separator character, then you need to worry about escaping the separator within a field, or alternatively encode sizes into the format, or go with a heavyweight defined format like XML.
Martin York
A: 

My approach to this would be the following:

1) Read each line into a null-terminated buffer.
2) Use a split() function that you're going to have to write. This function should take a string and a separator as its input and return a list. The separator in this case is ' '.
3) Iterate over the list carefully. Are there never middle names? What about one-word or three-word street names? Since many of these columns vary in the number of words, and you have no separator other than whitespace, this may prove a fairly tough task. If you NEVER have middle names, you could assume the first two columns are the first and last name. You know for sure what the last two are. Everything between them could be assigned to a single address field.

dicroce
+2  A: 

The problem is that istream::get() breaks on streetAddress, which has spaces in it.

One way is to first tokenize the input line into, say, a vector of strings, and then, depending on the number of tokens, convert these to the appropriate fields of your CustomerType:

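vector<string> tokenize(const string& line, char delim=' ') {
      vector<string> tokens;
      size_t spos = 0, epos = string::npos;
      // Search from spos each time; searching from 0 would find the same delimiter forever.
      while ((epos = line.find(delim, spos)) != string::npos) {
          if (epos > spos)   // skip empty tokens produced by runs of delimiters
              tokens.push_back(line.substr(spos, epos - spos));
          spos = epos + 1;   // continue just past the delimiter
      }
      if (spos < line.size())
          tokens.push_back(line.substr(spos));   // last token has no trailing delimiter
      return tokens;
}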

I'd rather write a stream extraction operator for CustomerType:

struct CustomerType  {
   friend istream& operator>>(istream& i, CustomerType& c);
   string firstName, lastName, ...;
   // ...
};

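// Assumes one-word names and cities, and exactly three words in the street address.
istream& operator>>(istream& i, CustomerType& c) {
   i >> c.firstName >> c.lastName;
   string s1, s2, s3;
   i >> s1 >> s2 >> s3;
   c.streetAddress = s1 + ' ' + s2 + ' ' + s3;  // keep the spaces between the words
   i >> c.city >> c.state >> c.zipCode;
   return i;
}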
dirkgently
I agree that's generally niftier, but do we want to present the poor guy with operator overloading right off? :-)
Charlie Martin
This is what I'd suggest. Only issue I can think of is that there may be a variable number of items in the address field, but if so then a different character would be needed to split up fields (tab or | would be my ideas) and then you could just use get() with a different separator :)
workmad3
I thought get reads until count, end of line, or the delimiter (which is not specified here), so it would not break on whitespace, whereas the >> operator always breaks on whitespace.
crashmstr
And now the problem is that he will probably get buffer overruns.
anon
@Charlie Martin: There's another answer floating where someone introduced another (apparent) newbie to Boost::Tokenizer. What'd you know of my troubles to keep my fingers off of typing that ;-)
dirkgently
I bet. Gotta read up on Boost.
Charlie Martin
+1  A: 

You're getting 8 characters for State, which includes all your zipcode, and is larger than your field.

It'd also be tempting to use the skipws manipulator:

infile >> skipws >> CT_Struct.firstName
       >> CT_Struct.lastName 
       >> ... ;

(Update: that's what I get for doing that from memory. This is more closely approximating correct.)

Charlie Martin