I'm trying to write a bit of regex to do the following. I have some data I need to split; the sample data looks like this:

Brand Name - Product Name
Another Brand - Shoe Laces
Heinz - Bakes Beans

I want to be able to select the brand name or the product name, but I can't seem to do it without catching the " - " part in the regex. Can anyone tell me what I'm missing? My regex is pretty basic.

EDIT: I'm exporting a database to a spreadsheet, formatting it, and importing it into a new system through a CSV. The old system used a "brand name - product name" format as above, whereas the new one uses two separate fields. Ideally I wanted to sneak some regex into the spreadsheet formula, but now I think it's going to be easier to just handle this with a script. Likely PHP, although JavaScript isn't ruled out.

A: 

You don't need regex for this task. Just find the index of the substring "-". Everything before it is the brand name, and everything after it is the product name.
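
A minimal Python sketch of this index-based approach. It assumes the separator is the first " - " (space, hyphen, space) on the line rather than a bare hyphen, so hyphenated names survive; the helper name split_entry is hypothetical and only for illustration.

```python
def split_entry(line, sep=" - "):
    """Split 'Brand - Product' at the first occurrence of sep."""
    idx = line.find(sep)  # index of the separator, or -1 if it is absent
    if idx == -1:
        return None  # no separator on this line
    brand = line[:idx].strip()  # everything before the separator
    product = line[idx + len(sep):].strip()  # everything after it
    return brand, product


print(split_entry("Heinz - Bakes Beans"))
print(split_entry("Coca-Cola - Diet Coke"))
```

Passing sep="-" instead would reproduce the bare-hyphen search described above, at the cost of splitting inside hyphenated brand names.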

KennyTM
+1  A: 

You won't need a regex for that - a simple split would be sufficient.

Example in Python:

#!/usr/bin/env python3

s = """
Brand Name - Product Name
Another Brand - Shoe Laces
Heinz - Bakes Beans
"""

for line in s.splitlines():
    # Skip blank lines and lines without a separator.
    if '-' not in line:
        continue
    # Split on the first hyphen only, then trim whitespace from both parts.
    brand, product = (part.strip() for part in line.split('-', 1))
    print('Brand:', brand, '| Product:', product)

Yields:

Brand: Brand Name | Product: Product Name
Brand: Another Brand | Product: Shoe Laces
Brand: Heinz | Product: Bakes Beans

PHP version:

<?php

$s = <<<EOM
Brand Name - Product Name
Another Brand - Shoe Laces 
Heinz - Bakes Beans
EOM;

// split() was removed in PHP 7; explode() does the same job for a literal separator.
foreach (explode("\n", $s) as $line) {
    list($brand, $product) = explode("-", $line, 2);
    echo "Brand: " . trim($brand) . " | Product: " . trim($product) . "\n";
}

?>

Ruby version:

#!/usr/bin/env ruby

s = "
Brand Name - Product Name
Another Brand - Shoe Laces 
Heinz - Bakes Beans
"

s.split("\n").each { |line| 
  brand, product = line.split("-").map{ |item| item.strip }
  puts "Brand: #{brand} | Product: #{product}" if brand and product
}
The MYYN
Surely you'll want to apply the regex line by line... `'Product Name Another Brand'` doesn't look right at all.
Michał Marczyk
Question was edited after I posted the answer; corrected now ...
The MYYN
A: 

If you know the data to be well-formatted, and in particular that the string " - " (one space, one hyphen, one space) will only occur as the separator in the middle, you can use (.*) - (.*) to retrieve the brand name in the first group and the product name in the second.
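
As a rough illustration, here is how that pattern might be applied line by line in Python. The function name split_with_regex is hypothetical, and stripping the line before matching is an assumption added to cope with trailing whitespace.

```python
import re

# The pattern from the answer: two capture groups around " - ".
PATTERN = re.compile(r"(.*) - (.*)")


def split_with_regex(line):
    """Return (brand, product), or None if the line has no ' - ' separator."""
    match = PATTERN.fullmatch(line.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)


print(split_with_regex("Another Brand - Shoe Laces"))
print(split_with_regex("A - B - C"))
```

Because the first (.*) is greedy, a line containing more than one " - " is split at the last one, so "A - B - C" yields "A - B" and "C". Changing the first group to (.*?) would split at the first separator instead.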

Michał Marczyk
+1  A: 

If your data is structured like that, the simplest way is to use whatever split method your language has and split on "-". For example, in Python:

"Heinz - Bakes Beans".split("-")

There's no need for a complicated regex.

So if your data is in a file:

with open("file") as f:
    for line in f:
        # Split on the first hyphen only, so the unpacking always gets two parts.
        brand, product = line.rstrip().split("-", 1)
        print(brand, product)

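If you work with PHP, you can use explode:

$f = fopen("file", "r");
if ($f) {
    // fgets() returns false at end of file, which ends the loop.
    while (($line = fgets($f)) !== false) {
        // Limit of 2: split on the first hyphen only.
        list($brand, $product) = explode("-", rtrim($line), 2);
        echo "$brand - $product\n";
    }
    fclose($f);
}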
ghostdog74
+1  A: 

Assuming that there won't be any stray hyphens (-) in the string, and that the brand names etc. contain only word characters (letters, digits, underscores) and spaces, you can use the following regex. To allow other symbols, add them to the character classes [].

^([\w\s]+?)\s*-\s*([\w\s]+)$

The result object will look like:

$1 Brand Name
$2 Product Name
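
A small Python sketch of both points, offered as an illustration rather than a definitive implementation. STRICT is the regex above unchanged; EXTENDED is a hypothetical variant that adds '&', the apostrophe, and '.' to the character classes, and split_entry is a made-up helper name.

```python
import re

# The regex from the answer above, unchanged.
STRICT = re.compile(r"^([\w\s]+?)\s*-\s*([\w\s]+)$")
# Hypothetical variant: '&', apostrophe and '.' added to both character classes.
EXTENDED = re.compile(r"^([\w\s&'.]+?)\s*-\s*([\w\s&'.]+)$")


def split_entry(pattern, line):
    """Return (brand, product) from the two capture groups, or None."""
    match = pattern.match(line)
    if match is None:
        return None
    # strip() removes whitespace a group may capture at the edges of the line.
    return match.group(1).strip(), match.group(2).strip()


print(split_entry(STRICT, "Brand Name - Product Name"))
print(split_entry(STRICT, "Ben & Jerry's - Cookie Dough"))
print(split_entry(EXTENDED, "Ben & Jerry's - Cookie Dough"))
```

The strict pattern rejects the second line because '&' and the apostrophe fall outside [\w\s]; the extended pattern accepts it. Neither accepts a hyphen inside a name, which matches the "no stray hyphens" assumption stated above.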

Amarghosh