views:

918

answers:

10

I have a list of all email ids which I have copied from the 'To' field, from an email I received in MS Outlook. These values (email ids) are separated by a semicolon. I have copied this big list of email ids into Excel. Now I want to find the number of email ids in this list; basically by counting the number of semi colons.

One way I can do this is by writing C code. i.e. store the big list as string buffer, and keep comparing the chars to ";" in a while(char == ';') loop. But I want to do it quickly.

Is there any quick way to find that out using either:

1.) Regular expression (I use powergrep for processing the regexps)

2.) In excel itself (any excel macro/plugin for that?)

3.) DOS script method

4.) Any other quick way of getting it done?

+2  A: 

If counting the number of semicolons is good enough for you, you can do it in Perl using this solution: Perl FAQ 4.24: How can I count the number of occurrences of a substring within a string

Igor Oks
+7  A: 

I believe the following should work in Excel:

= Len(A1) - Len(Substitute(A1, ";", "")) + 1

/EDIT: if you've pasted the email addresses over several cells, you can count the cells with the following function:

= CountA(A1:BY1)

CountA counts non-empty cells in a given range. You can specify the range by typing =CountA( into a cell and then selecting your cell range with the mouse cursor.

Konrad Rudolph
This will be off by one. Say you have 0 semicolons, it'll return = x - x + 1, or if you have 1 simicolon it will return x - (x-1) + 1 and so on.
Malfist
Hello Konrad, When i paste the big list of email ids in Excel they are pasted in a row. They are not in one cell, they are spread across many cells(columns) in that row.When i tried using the formula u mentioned, didnt get correct result. Should i be pasting all the entries in one cell itself?
goldenmean
@Malfist: no. If you've got one address, you've got zero semicolons, right? @goldenmean: yes, I thought you had pasted them in one cell. For several cells, why not just count the cells? I've updated my posting.
Konrad Rudolph
Ah yes, that is correct. My bad.
Malfist
A: 

looping over it with a while loop and counting the ';' is probably going to be the fastest, and the most readable.

Consider Konrad's suggestions, that too will loop through the string and check every char and see if it is a simicolon, and then in modifies the string (may or may not be mutable, I don't know with excel), and then it counts the length between it and the original string.

Malfist
+1  A: 

If you are using Excel you can use this code and expose it.

Public Function CountSubString(ByVal RHS As String, ByVal Delimiter As String) As Integer
    Dim V As Variant
    V = Split(RHS, Delimiter)
    CountSubString = UBound(V) + 1
End Function

If you have .NET you can make a little command line utility

Module CountSubString

    Public Sub Main(ByVal Args() As String)

        If Args.Length <> 2 Then
            Console.WriteLine("wrong arguments passed->")
        Else
            Dim I As Integer = 0
            Dim Items() = Split(Args(0), Args(1))
            Console.WriteLine("There are " & CStr(UBound(Items) + 1) & " 
        End If

    End Sub
End Module
RS Conley
+1  A: 

Load the list in your favorite (not Notepad!) editor, replace ; by \n, see in the status bar how many lines you have, remove the last line if needed.

PhiLho
Hi Philhotried doing this in Textpad. The \n is taken as a literal character and its special control meaning is not interpreted and instead of each entry going to new line, all it did was put \n instead of semi colons. Or Am i missing something?
goldenmean
Alternative: replace “;” by the form feed character (= new page), print the whole document and count the number of pages that the printer produced. OR: count how much ink is left, subtract from full cartridge load, and divide by the amount of ink used to print an average e-mail address. :-p
Konrad Rudolph
@goldenmean Using notepad++ you can replace using extended charset. If you use literal replace, as you did, it won't create a newline.
To do it with Notepad, replace semicolons with tabs (write a tab in Notepad, copy it, and paste it in the Find/Replace dialog), and then copy them to Excel. Once pasted, selecting all cells will show you the count in the status bar.
Hosam Aly
@Konrad: Good solutions, but not ecologically correct! ;-)
PhiLho
@goldenmean: I don't know Textpad, but in SciTE, for example, there is an option to transform \x expressions to control chars.
PhiLho
Typically, in most text editors, you have to select a "regular expression" option in the search-and-replace dialog bog.
dangph
+2  A: 

Bash/Cygwin One-Liner

$ echo "[email protected];[email protected];[email protected]" | sed -e 's/;/\n/g' | wc -l
3

If you already have Cygwin installed it's effectively instant. If not, cygwin is worth installing IMHO. It basically provides a Linux bash prompt overlaid over your Windows system.

As an aside, stuff like this is why I prefer *nix over Windows for work stuff, I can't live on a windows box without Cygwin since bash scripts are so much more powerful than batch scripts.

Trampas Kirk
+1  A: 

C# 3.0 with LINQ would make this easy if it is an option for you over C

myString.ToCharArray().Count(char => char = ';')

MattH
+2  A: 

PowerShell:

> $a = 'blah;blah;blah'
> $a.Split(';').Count  
3
dangph
+2  A: 

3) if you don't have neither cygwin, nor powershell installed try this .cmd

@echo off
set /a i = 0
for %%i in ([email protected];[email protected];[email protected]) do set /a i = i + 1
@echo %i%
jonny
+1  A: 

If awk, echo is awailable (and it is, even on windows):

echo "addr1;addr2;addr3...." | awk -F ";" "{print NF}"
Zsolt Botykai