views:

495

answers:

8

I have an ordered sequence like {1, 3, 5, 6, 8, 9} I want to get first missing element(2 in the example) or max() if sequence contains no missing elements. Now I'm doing it like this:

public static int GetRegisterNumber<T>(this IQueryable<T> enumerable, Func<T, bool> whereFunc, Func<T, int?> selectFunc)
{
    var regNums = enumerable.OrderBy(selectFunc).Where(whereFunc).ToArray();

    if (regNums.Count() == 0)
    {
        return 1;
    }

    for (int i = 0; i < regNums.Count(); i++)
    {
        if (i + 1 != regNums[i])
        {
            return regNums[i].Value + 1;
        }
    }

    return regNums.Last().Value + 1;
}

But i think there are much faster methods. Any suggestions?

+4  A: 

Suggestion: run your code through a profiler. Then you'll know where it is slow. Intuitively, the OrderBy is the slowest thing in this program. But intuitions about the slowest thing are often very, very wrong. Use a profiler.

Of course, you should also eliminate the massive inefficiencies in this program. Remember, Count() counts the sequence by re-enumerating it. Count() does not know that you have not changed the sequence since the last time you counted it! You probably want to store the count rather than recalculating it every time, or use Length, since you have an array.

Eric Lippert
.OrderBy is anyway executed in the DB, so I cannot get rid of it
xumix
You can use Any() instead of comparing Count() against 0 too
Samuel Carrijo
@xumix you may not need to reorder in this specific case.
Samuel Carrijo
in most cases(99.999%), the sequence will contain elements, so .Any or .Count() will make additional DB query...
xumix
Why would it query the database? You've dumped the contents into an array and are querying against the array now.
Eric Lippert
@Eric Lippert yes, but if i use Any() to remove OrderBy() Call it will query the DB first
xumix
+5  A: 

I'd probably look at something like below; the Where can be done outside (as can the selector to be honest):

If you expect to start at 1:

public static int GetRegisterNumber<T>(this IEnumerable<T> enumerable,
    Func<T, int> selectFunc)
{
    int expected = 1;
    foreach (T item in enumerable) {
        if (selectFunc(item) != expected) return expected;
        expected++;
    }
    return expected;
}

To start with the first item in the list:

public static int GetRegisterNumber<T>(this IEnumerable<T> enumerable,
    Func<T, int> selectFunc)
{
    bool first = true;
    int prev = -1;
    foreach (T item in enumerable)
    {
        int val = selectFunc(item);
        if(first) {
            prev = val;
            first = false;
        } else if (val != prev + 1) {
            return prev + 1;
        }
        prev = val;
    }
    return first ? 1 : prev + 1;
}

It isn't clear how you wanted to handle nulls, so I didn't. Note that this only iterates once, and doesn't buffer everything.

Marc Gravell
thx, i'll try this one
xumix
A: 

As mentioned, first use a profiler to find where it is slow. If the sequence is really big, and the ordering is slow, you can use radix sort, that is O(kn), where k is the max number of digits, and n is the number of elements in the sequence. Sorting algorithms based on comparison are usually O(n logn).

This way the whole algorithm will be O(kn), which depending on n, is asymptotically faster, and consequently more scalable.

Samuel Carrijo
A: 

Assuming that the incoming sequence of values is already sorted, how about:

var upperBoundValue = values.Last() + 1;
var firstMissingItem = Enumerable.Range(1, upperBoundValue).Except(values).First();

If you are performing this operation iteratively, you can optimize the process by storing an index to the last place that you were in the sequence where you found a gap, and start the next iteration from there.

LBushkin
+3  A: 

Why not do something like a binary search?

Say you have a list 10 elements long. Read the first element. Then read the fifth element. If the fifth element isn't first element + 4 then you know there's a missing number, otherwise you know there isn't. Then just iterate like this until you find the first missing element, or reach the end of the list.

This of course assumes you know the size (which wasn't explicitly mentioned in the question), but you've converted to an array, so you should know.

O(log N) instead of O(n)

patros
but for ordering, you won't take O(logn). By what he says, the vector isn't ordered, so a binary search won't help (unless you sort it first)
Samuel Carrijo
Yes, the final step is O(logN) instead of O(n), but the ordering is still presumably O(n*logN).
patros
The title and first sentence of the post both say that the sequence is ordered.
mbeckish
+3  A: 

Assuming that your OrderBy and Where have already been applied:

int firstMissing = collection.TakeWhile((x, i) => x == ++i).LastOrDefault() + 1;
LukeH
A good answer for the question text. Just note, there is a LinqToSql tag - TakeWhile does not translate into Sql.
David B
A: 

You put a LinqToSql tag in your question. I presume you are looking for a "first available" id, so that you can create a new record using this id. Consider turning IDENTITY on in the database instead.

David B
+2  A: 

Edit: I just noticed that enumerable is IQueryable<T> but selectFunc and whereFunc are of type Func<T, _>. This will cause the Enumerable versions of OrderBy and Where to be called, rather than using database calls. You probably want to switch them to Expression<Func<T, _>> instead.

If you don't want to order regNums first, here's a O(n) golf-style solution:

var max = regNums.Max(i => (int?)i) ?? 0;
return Enumerable.Range(1, max + 1)
                 .Except(regNums)
                 .Min();

By line:

  1. By casting to int?, Max will return null if regNums is empty, coalesced to 0.

  2. Build a sequence of all possible registers, including our next value if full.

  3. Subtract the current set of registers.

  4. Pick the lowest.

dahlbyk
That's nice one
xumix
about Expression<Func<.... here is the question :)http://stackoverflow.com/questions/1098341/linq-does-not-lazy-load-if-invoked-via-generic-method
xumix