views:

54

answers:

2

Excel returns a reference of the form

=Sheet1!R14C1R22C71junk

("junk" won't normally be there, but I want to be sure that there's no extraneous text.)

I would like to 'split' this into a VB array, where

a(0)="Sheet1"
a(1)="14"
a(2)="1"
a(3)="22"
a(4)="71"
a(5)="junk"

I'm sure it can be done easily with a regular expression, but I just can't get the hang of it. Is there a kind soul who could help me?

Thanks

+1  A: 
=([^!]+)!R(\d+)C(\d+)R(\d+)C(\d+)(.*)

should work.

[^!]+ matches a sequence of non-exclamation-point characters.

\d+ matches a sequence of digits.

.* matches anything.

So, in VB.NET:

Dim a As Match
a = Regex.Match(SubjectString, "=([^!]+)!R(\d+)C(\d+)R(\d+)C(\d+)(.*)")
If a.Success Then
    ' matched text: a.Value
    ' backreference n text: a.Groups(n).Value
Else
    ' Match attempt failed
End If
Tim Pietzcker
Thanks for the quick reply. Hm, tricky this regex stuff. But that would also accept "=Sheet1!xx14yyyy1zz22w71" would it not? I want to be sure it's Rs and Cs. Also "junk" could be rubbish of any length, starting with a non-digit, not just a word. Possible?
smirkingman
Yes, that should work. The dot will match anything, and the previous `\d+` has already used up all the digits. I've also changed the regex to accomodate the literal R and C.
Tim Pietzcker
A(4) finishes at the first non-digit
smirkingman
(off-topic) I'm a beginner here. How does get notified as quickly as you do?
smirkingman
Sorry, I edited my comment as you were answering to it - I had misread your previous comment. You get a notification about new comments when you refresh the page or visit a different page on Stack Overflow (little mailbox icon lights up, and the top left corner "Stack Exchange" symbol shows a number with your new comments.
Tim Pietzcker
Wow, I'm gob-struck, that's a gem. Thank you very much!
smirkingman
You're very welcome. Notifications work very well on SO; if you want to trigger a notification to a third person (say, a commenter on your answer to someone else's question), just add an `@` in front of her name (@smirkingman), and she will get notified that you directed a comment to her. But you only need to do that for someone who isn't the author of the question or the answer that you're commenting on.
Tim Pietzcker
+2  A: 

A straightforward String.Split would work, provided the "junk" text wasn't there:

Dim input As String = "=Sheet1!R14C1R22C71"
Dim result = input.Split(New Char() { "="c, "!"c, "R"c, "C"c }, StringSplitOptions.RemoveEmptyEntries)
For Each item As String In result
    Console.WriteLine(item)
Next

The regex gets a little tricky since you will need to go through the Groups and Captures of the nested portions to get the proper order.

EDIT: here's my regex solution. It accepts multiple occurrences of R's and C's.

Dim input As String = "=Sheet1!R14C1R22C71junk"
Dim pattern As String = "=(?<Sheet>Sheet\d+)!(?:R(?<R>\d+)C(?<C>\d+))+"

Dim m As Match = Regex.Match(input, pattern)
If m.Success Then
    Console.WriteLine(m.Groups("Sheet").Value)
    For i = 0 To m.Groups("R").Captures.Count - 1
        Console.WriteLine(m.Groups("R").Captures(i).Value)
        Console.WriteLine(m.Groups("C").Captures(i).Value)
    Next
End If

Pattern explanation:

  • "=(?Sheet\d+)" : matches an = sign followed by "Sheet" and digits. Uses named group of "Sheet"
  • "!(?:R(?\d+)C(?\d+))+" : matches the exclamation mark followed by at least one occurrence of the *R*xx*C*xx portion of the text. Named groups of "R" and "C" are used.
  • "(?:...)+" : this portion from the above portion matches but does not capture the inner pattern (i.e., the R/C part). This is to avoid unnecessarily capturing them while we are actually capturing them with the named groups.
Ahmad Mageed
+1 for a smart, simple solution, even if it doesn't catch the junk. Thanks
smirkingman
@smirkingman thanks. I've added a regex solution now too that will handle the "junk" case.
Ahmad Mageed