I want to implement a version of Benford's law (http://en.wikipedia.org/wiki/Benford%27s_law), which requires extracting the first nonzero digit of each number so that the distribution of those digits can be analyzed. For example:

1934 ---> 1
0.04 ---> 4
-56  ---> 5

How do you do this in MATLAB?

+2  A: 

A few ways you can do this...

  • Using REGEXP:

    wholeNumber = 1934;                      %# Your number
    numberString = num2str(wholeNumber,16);  %# Convert to a string
    matches = regexp(numberString,'[1-9]','match');  %# Find matches
    firstNumber = str2double(matches{1});  %# Convert the first match to a double
    
  • Using ISMEMBER:

    wholeNumber = 0.04;                      %# Your number
    numberString = num2str(wholeNumber,16);  %# Convert to a string
    isInSet = ismember(numberString,'123456789');  %# Find numbers that are
                                                   %#  between 1 and 9
    numberIndex = find(isInSet,1);           %# Get the first number index
    firstNumber = str2double(numberString(numberIndex));  %# Convert to a double
    

EDIT:

Some discussion of this topic has arisen on one of the MathWorks blogs. Some interesting additional solutions are provided there. One issue that was brought up was having vectorized solutions, so here's one vectorized version I came up with:

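    numberVector = [1934 0.04 -56];                         %# Your numbers
    numberStrings = cellstr(num2str(numberVector(:),16));   %# One string per number
    firstIndices = regexp(numberStrings,'[1-9]','once');    %# Index of first nonzero digit
    firstNumbers = cellfun(@(s,i) s(i)-'0',numberStrings,firstIndices);  %# Digit characters to doubles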
gnovice
+4  A: 
    function res = first_digit(number)
        number = abs(number);  %# Ignore the sign
        %# Divide by the largest power of 10 that does not exceed the number
        res = floor(number / (10 ^ floor(log10(number))));
    end

It works for nonzero real numbers in general (zero gives NaN; see gnovice's comment below for another extreme case where it fails).
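These edge cases can be sidestepped by reading the digit off a scientific-notation string instead of dividing. The following is a minimal sketch, not part of the original answer: the names `firstChar` and `firstDigitSci` are hypothetical, the `'%.15e'` format is chosen here to match the 16 significant digits used in the string-based answer, and finite input is assumed.

```matlab
% Sketch only: firstChar and firstDigitSci are hypothetical helper names.
firstChar = @(s) s(1);                          % first character of a string
firstDigitSci = @(v) firstChar(sprintf('%.15e', abs(v))) - '0';

x = [1934 0.04 -56 eps(realmin)];               % includes the extreme case
digits = arrayfun(firstDigitSci, x);            % one leading digit per element
disp(digits)
```

With this approach zero comes back as 0, because its scientific-notation string starts with the character '0'. NaN and Inf are not handled and would need a separate check.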

Yassin
Whoops, promise I didn't copy your answer, Yassin! You must have posted while I was figuring out my solution. +1 for thinking of using abs() to cover those pesky negatives.
Doresoom
+1: I knew there was probably some way to do it mathematically, but the first solutions I thought of were string-based.
gnovice
@gnovice: Thanks. Corrected it!
Yassin
Actually, I just found an extreme case where this solution fails but the string solution works. Try `number = eps(realmin)`. Admittedly, it's not a *likely* scenario you'll run into. I was just seeing how the solutions performed on extreme values.
gnovice
@gnovice: Nice one, +1!
Yassin
+1  A: 

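Using the built-in functions log10 and floor,

    floor(x./10.^floor(log10(x)))

returns the first digit of every element of x, so it works on arrays as well as scalars. It assumes x is positive; apply abs first to handle negative values.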
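To tie this back to the Benford analysis in the question, the leading digits can be tallied and compared with the frequencies Benford's law predicts, P(d) = log10(1 + 1/d). The following is only a sketch under stated assumptions: the sample data (powers of two) and all variable names are illustrative and do not come from the thread.

```matlab
% Sketch only: sample data and variable names are illustrative assumptions.
x = 2.^(1:100);                          % sample data: the first 100 powers of two
x = abs(x(x ~= 0));                      % drop zeros and ignore signs
d = floor(x ./ 10.^floor(log10(x)));     % leading digits, computed as above

observed = accumarray(d(:), 1, [9 1]) / numel(d);   % observed share of digits 1..9
expected = log10(1 + 1 ./ (1:9)');                  % Benford's predicted shares

disp([(1:9)' observed expected])         % columns: digit, observed, expected
```

Zeros are removed first because they have no leading nonzero digit and would otherwise produce NaN.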

Doresoom
