tags:

views:

140

answers:

3

Hi, having trouble coming up with search parameters for this issue, so I can't find an answer on my own.

      Column X             |      
Message (info 1)           |
Message (info 2) (info 1)  |

Above is the contents of one column I need to handle. The result of the query should be the part INSIDE the parentheses only. Problem is, there's one program that saves two sets of information in parentheses, in which case the LATTER (info 1) is the one we want in the first column, in addition to which we must add a second column for info 2.

So I'm imagining I need to combine an if clause with a variable I can depend on to count how many left parentheses there are, for example. If left_parentheses = 2, Then .... Else If left_parentheses = 1, Then ....

But I don't know how to do that in SQL, and I also don't know how to separate between info 1 / 2 in the example.

The result from the example would look like this:

Column 1 | Column 2
 Info 1  |
 Info 1  |  Info 2

As usual, I'll try to look for the answer while waiting for tips here. Thanks!

+4  A: 

Look at the builtin functions charindex, patindex, and substring.

charindex finds the positon of a specified character, patindex of a pattern, substring returns a portion of a string by position.

My advice would be to write a view over the table column X is in, that uses the above functions to provide two calculate columns. Then you could insert into result table select info1, info2 from columnX'stable;.

At least the calculated column info2 will involve a case statement to handle the case when there is only one parenthesized "info" in the source, something along these lines:

case when [some test using patindex too check for two parenthesized infos]
then [some combination of patidex and substring to extract the second info]
else null;

In particular, patindex returns zero when a pattern is not found so:

patindex('%(%)%(%)%', columnX)

would return zero for your first example but not your second example.

You'll also need to consider how you want to handle erroneous data, specifically 1) rows with no parentheses, 2) with unequal numbers of open and close parentheses, 3) with additional text between two parenthesized "infos", 4) with additional text after the closing parenthesis.

I'd encourage you to add examples of all these possibilities, as well as correctly formatted columnXes, to your test data, and then test that the view does what you want in all cases.

tpdi
Thanks for the tip, I played around with the patindex function for a while, didn't get it to work as intended but it's definitely something I'll remember for future reference, I can see its uses already.
Zan
+1  A: 

Here's my go at it in SQL 2005 syntax using a Common Table Expression. I make no claims as to it's correctness or it's efficiency and I've made some assumptions about how you wanted it to work.

WITH BracketIndeces AS
(
  SELECT 
    ColumnX AS ColVal,
    CHARINDEX('(', ColumnX) as first_open_bracket,
    CHARINDEX('(', ColumnX, CHARINDEX('(', ColumnX)+1) as second_open_bracket,
    CHARINDEX(')', ColumnX) as first_close_bracket,
    CHARINDEX(')', ColumnX, CHARINDEX(')', ColumnX)+1) as second_close_bracket
  FROM SomeTable
)
SELECT
  CASE
    WHEN second_close_bracket = 0 THEN
     SUBSTRING(ColVal, first_open_bracket+1, first_close_bracket - first_open_bracket-1)
    ELSE
     SUBSTRING(ColVal, second_open_bracket+1, second_close_bracket - second_open_bracket-1)  
  END AS Column1,
  CASE
    WHEN second_close_bracket = 0 THEN
     NULL
    ELSE
     SUBSTRING(ColVal, first_open_bracket+1, first_close_bracket - first_open_bracket-1)
  END AS Column2
FROM BracketIndeces
WHERE first_open_bracket <> 0
AND first_close_bracket <> 0
AND first_open_bracket < first_close_bracket
AND (
  (second_open_bracket = 0 AND second_close_bracket = 0) 
  OR
  (second_open_bracket < second_close_bracket 
    AND second_open_bracket > first_close_bracket
  )
)

The where clause at the bottom is just to filter out any columns that either contain no brackets or contain brackets in a weird order and it uses NULL in Column2 when only one set of brackets are there.

Wolfbyte
Can't thank you enough for this. I seriously had to bang my head against the wall trying to do this myself but the day's coming to an end and I couldn't cut it in time. Your solution worked like a charm! Thanks again. :)
Zan
+2  A: 

It will take ages if you have any decent amount of data but I doubt there are many better alternatives using SQL.

DECLARE @Table TABLE (TableID INT PRIMARY KEY, ColumnX VARCHAR(32))

INSERT INTO @Table VALUES (1, '(Info 1) (Info 2)');
INSERT INTO @Table VALUES (2, '(Info 1)');
INSERT INTO @Table VALUES (3, '(Info 10) (Info 20)');
INSERT INTO @Table VALUES (4, '(Info1')
INSERT INTO @Table VALUES (5, '(Info1) (Info2')
INSERT INTO @Table VALUES (6, '(Info1) Info2)')
INSERT INTO @Table VALUES (7, 'Info1')
INSERT INTO @Table VALUES (8, 'Info1)')
INSERT INTO @Table VALUES (9, NULL);

SELECT 
  TableID
  , [Column1] = CASE WHEN PATINDEX('%(%)%', ColumnX) = 1 
                THEN SUBSTRING(ColumnX
                               , CHARINDEX('(', ColumnX) + 1
                               , CHARINDEX(')', ColumnX) 
                                 - CHARINDEX('(', ColumnX) - 1
                             ) 
                ELSE NULL END
  , [Column2] = CASE WHEN PATINDEX('%(%)%(%)%', ColumnX) = 1 
                THEN SUBSTRING(ColumnX
                               , CHARINDEX('(', ColumnX, CHARINDEX('(', ColumnX) + 1) + 1
                               , CHARINDEX(')', ColumnX, CHARINDEX(')', ColumnX) + 1) 
                                 - CHARINDEX('(', ColumnX, CHARINDEX('(', ColumnX) + 1) - 1
                              )
                ELSE NULL END
FROM @Table
Lieven