Okay, so I'm using Perl to read in a file that contains some general configuration data. The data is organized into sections, each introduced by a header that says what its entries mean. An example follows:

[vars]

# This is how we define a variable!
$var = 10;
$str = "Hello thar!";


# This section contains flags which can be used to modify module behavior
# All modules read this file and if they understand any of the flags, use them
[flags] 
  Verbose =       true; # Notice the errant whitespace!

[path]
WinPath = default; # Keyword which loads the standard PATH as defined by the operating system. Append additional values to it.
LinuxPath = default;

Goal: Taking the first line, "$var = 10;", as an example, I'd like to use Perl's split function to create an array that contains the strings "$var" and "10" as elements. Using another line as an example:

    Verbose    =         true;
    # Should become [Verbose, true] aka no whitespace is present

This is needed because I will be outputting these values to a new file (which a different piece of C++ code will read) to instantiate dictionary objects. Just to give you a little taste of what it might look like (just making it up as I go along):

define new dictionary
name: [flags]
# Start defining keys => values
new key name: Verbose
new value val: true
# End dictionary

Oh, and here is the code I currently have along with what it is doing (incorrectly):

sub makeref($)
{
    my @line = (split (/=/)); # Produces ["Verbose", "    true"];
}

To answer the question of why I am not using Config::Simple: I originally did not know what my configuration file would look like, only what I wanted it to do. I made the format up as I went along, choosing what seemed sensible to me, and used Perl to parse the file.

The problem is I have some C++ code that will load the information in the config file, but since parsing in C or C++ is :( I decided to use Perl. It's also a good learning exercise for me since I am new to the language. So that's the thing: this Perl code is not really a part of my application, it just makes it easier for the C++ code to read the information. It is also more readable (both the config file and the generated file). Thanks for the feedback, it really helped.

+2  A: 

Seems like you've got it. Strip the whitespace before splitting.

sub makeref($)
{
    s/\s+//g;
    my @line = (split(/=/)); # gets ["verbose", "true"]
}
daniel
Ahh, it's so obvious now. Thanks, I'm new to Perl and it's a really cool language.
Tommy Fisk
You're welcome. Hope it helps.
daniel
Oddly, chomp does not chomp whitespace!
Tommy Fisk
you're right, I thought about it and the regex solution is better, go with that.
daniel
Please note the messed up syntax highlighting.
Svante
@Tommy Read `perldoc -f chomp` to find out what `chomp` does.
Sinan Ünür
Hope your parameters and values don't have spaces: `background color = dark red`
mobrule
+1  A: 

This code does the trick (and is more efficient without reversing).

for (@line) {
    s/^\s+//;
    s/\s+$//;
}
Tommy Fisk
you could add a 'g' to the end of the regex to get it to replace more than one extra whitespace appearance. i.e. `s/^\s+//g;`
daniel
Please note the messed up syntax highlighting.
Svante
@Tommy There are many modules that handle configuration sections, continuation lines, variables with multiple values etc etc on CPAN. Use one of them once you are done learning. I like `Config::Std`. @FM pointed out `Config::Simple`.
Sinan Ünür
+3  A: 

split splits on a regular expression, so you can simply put the whitespace around the = sign into its regex:

split (/\s*=\s*/, $line);
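As a minimal sketch of that call, here it is applied to the sample line from mobrule's comment. The limit of 2 is my addition rather than part of the answer; it keeps any later `=` inside the value.

```perl
use strict;
use warnings;

my $line = 'background color = dark red';

# Whitespace around '=' is consumed by the separator itself.
# The limit of 2 (an assumption, not in the answer) splits on the first '=' only.
my ($key, $value) = split /\s*=\s*/, $line, 2;

print "[$key] [$value]\n";   # prints "[background color] [dark red]"
```

Whitespace inside the key and the value survives. Leading whitespace on the line would still end up in the key, which is what the trimming discussed next takes care of.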

You obviously do not want to remove all whitespace, or you would end up with a line like this (the whitespace inside the string is gone):

$str="Hellothar!";

I guess that only removing whitespace from the beginning and end of the line is sufficient:

$line =~ s/^\s*(.*?)\s*$/$1/;

A simpler alternative with two statements:

$line =~ s/^\s+//;
$line =~ s/\s+$//;
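A small sketch combining the two trimming statements with the split above. The helper name `trim` is made up for illustration and is not from the answer.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a copy with leading and trailing whitespace removed.
sub trim {
    my ($text) = @_;
    $text =~ s/^\s+//;
    $text =~ s/\s+$//;
    return $text;
}

my $line = "  Verbose =       true;  \n";
my ($key, $value) = split /\s*=\s*/, trim($line), 2;

print "[$key] [$value]\n";   # prints "[Verbose] [true;]"
```

Note that the trailing `;` is still part of the value here; removing it would be a separate step.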
Svante
Please note the messed up syntax highlighting.
Svante
@Svante That's why I tend to use `s{...}{...}` when posting on SO.
Sinan Ünür
`s/^\s+//` is slightly more efficient.
Sinan Ünür
Yes, if there are zero whitespace characters, `s/^\s*//` replaces nothing with nothing--why bother?
Alan Moore
Yeah, yeah, I'm editing already.
Svante
+6  A: 

If you're doing this parsing as a learning exercise, that's fine. However, CPAN has several modules that will do a lot of the work for you.

use Config::Simple;
Config::Simple->import_from( 'some_config_file.txt', \my %conf );
FM
Yeah, I've really got to ask why the OP is using what is very similar to a standard config file format, but not using the standard config file reader modules that are readily available and very well-tested. (YAML is another good one to look at if Config::Simple doesn't quite match the desired format.)
Ether
95% of the time this is what would be wanted. I have some reasons (learning, and not all of my code is in Perl) that make it easier to do it differently.
Tommy Fisk
A: 

You probably have it all figured out, but I thought I'd add a little. If you write

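sub makeref
{
   my ($text) = @_;
   my @line = split(/=/, $text, 2);   # split on the first '=' only
   foreach (@line)
   {
      s/^\s+//;   # trim leading whitespace (no /g needed with an anchor)
      s/\s+$//;   # trim trailing whitespace
   }
   return @line;
}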

then you will remove the whitespace before and after both the left and right side. That way something like:

 this is a parameter         =      all sorts of stuff here

will not have crazy spaces.
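For what it's worth, here is a rough sketch of how the pieces in this thread might fit together for the whole file, sections included. The name `parse_config` and the handling of `#` and `;` are my assumptions about the format shown in the question, not something it spells out. In particular, a `#` inside a quoted value would be cut off by this.

```perl
use strict;
use warnings;

# Hypothetical helper: parse config text into { section => { key => value } }.
sub parse_config {
    my ($text) = @_;
    my (%conf, $section);
    for my $line (split /\n/, $text) {
        $line =~ s/#.*$//;              # assumption: '#' starts a comment
        $line =~ s/^\s+//;              # trim leading whitespace
        $line =~ s/\s+$//;              # trim trailing whitespace
        next if $line eq '';
        if ($line =~ /^\[(\w+)\]$/) {   # section header such as [flags]
            $section = $1;
            next;
        }
        next unless defined $section;   # ignore pairs before any header
        $line =~ s/\s*;$//;             # assumption: ';' ends a statement
        my ($key, $value) = split /\s*=\s*/, $line, 2;
        $conf{$section}{$key} = $value if defined $value;
    }
    return \%conf;
}

my $conf = parse_config(<<'END');
[flags]
  Verbose =       true; # Notice the errant whitespace!
[path]
WinPath = default;
END

print "$conf->{flags}{Verbose}\n";   # prints "true"
```

Quoted values such as `"Hello thar!"` keep their quotes; whether to strip them depends on what the C++ side expects.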

!!Warning: I probably don't know what I'm talking about!!

plor