ansaurus

Question

Best way to find recurring values in a set

Answer 1

A:

SELECT  value
FROM    table
GROUP BY
        value
ORDER BY
        COUNT(*) desc
LIMIT 1

Quassnoi 2009-04-28 17:02:49

Well, I know how to do it as an SQL query. My question is: if I have to write pseudocode for an algorithm that does this in one pass (one tuple at a time), how would I do it? Assume that the available memory is no greater than log(m).

2009-04-28 17:05:12

Then why do you reference SQL at all in your question? It sounds like you're asking for a general-purpose algorithm for an arbitrary data structure representing a relation.

Dave Costa 2009-04-28 17:11:46

OK, sorry for the confusion. Yes, that's what I am asking. Thank you.

2009-04-28 17:13:20

So I just wasted my time trying to help?

DForck42 2009-04-28 17:22:51

Well, it's not wasted. You also answered something I did not know.

2009-04-28 17:24:30

You should put all requirements into your question, Nimesh. Especially the memory restriction seems important. On the other hand, this somehow smells like disguised homework.

Svante 2009-04-28 17:31:36

Jhonny D. Cano -Leftware- 2009-04-28 17:34:59

Well, no, it is not homework. It's part of the project that I am working on. Thank you for your comments.

2009-04-28 17:41:35

Answer 2

+2 A:

SELECT value, COUNT(*) frequency
FROM table
GROUP BY value
ORDER BY COUNT(*) DESC

Jhonny D. Cano -Leftware- 2009-04-28 17:03:56

Thank you for your answer.

2009-04-28 17:25:06

Answer 3

+1 A:

Store them in a hash table, with a count of how many times each one was stored (O(n)).
Then loop through the buckets to find the entry with the highest count (O(n)).

Mike Dunlavey 2009-04-28 18:04:13
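
The two steps in this answer can be sketched roughly as follows. This is only an illustration in Python, not the answerer's own code: the function name `most_frequent` is made up for the example, and `collections.Counter` stands in for the hash table. Ties go to whichever value was seen first.

```python
from collections import Counter


def most_frequent(values):
    """Return (value, count) for the most frequent item, or None if empty.

    Hypothetical helper for illustration only.
    """
    counts = Counter()
    # Step 1: one pass over the input, counting each value in a hash table.
    for v in values:
        counts[v] += 1

    # Step 2: scan the distinct values and keep the one with the highest count.
    best = None
    for value, count in counts.items():
        if best is None or count > best[1]:
            best = (value, count)
    return best


if __name__ == "__main__":
    print(most_frequent([3, 1, 3, 2, 3, 1]))
```

Note that the table holds one counter per distinct value, so memory grows with the number of distinct values rather than staying within the log(m) bound mentioned in the comments.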

Thank you, Mike, for your comment.

2009-04-28 18:11:45

You're welcome. I like the easy ones.

Mike Dunlavey 2009-04-28 18:14:56

What about O(log n)? Is there a way to get O(log n) instead of O(n)?

2009-04-28 18:25:48

@Nimesh: How could you possibly get below O(n)? You have to look at all n numbers -- that alone takes O(n) time!

j_random_hacker 2009-04-28 18:59:50

Well, theoretically there's an O(1) method, if the list is finite. Just use the list as an index into a rather big array, and fetch the answer :-)

Mike Dunlavey 2009-04-28 19:10:41

@Mike: Sure, but that approach doesn't leave many algorithms that *aren't* O(1)... :)

j_random_hacker 2009-04-28 19:41:30

@j: Ain't CS wonderful? All we gotta do is wait for the hardware to catch up! :-)

Mike Dunlavey 2009-04-28 20:09:41

Answer 4

A:

By definition, a set contains only unique values. Thus, the answer should be the set itself, which can be "computed" in constant time. :-)

Seriously though, assuming that you're actually working with a heap, list, vector or some other data structure which allows duplicates, probably the fastest way to solve the problem is the answer from Mike Dunlavey, which is to use a hashtable. There are also some techniques using trees you could use which employ successively more refined estimates. I think such an approach would be O(n log n) (not as good as the hashtable solution), though perhaps it could be as low as O(log n) if you permit some statistical error.

Daniel Spiewak 2009-04-28 18:17:29
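
For the small-memory, approximate setting touched on here, one known one-pass technique is the Misra-Gries frequent-items summary. To be clear, this is a swapped-in technique, not the tree-based approach the answer alludes to, and it is deterministic rather than statistical. It keeps at most k - 1 counters and guarantees that any value occurring more than n/k times is still present at the end; the stored counts can undercount by up to n/k, so it does not give the exact most frequent value in general. The function name and the parameter `k` are assumptions made for this sketch.

```python
def misra_gries_summary(stream, k):
    """One-pass Misra-Gries summary using at most k - 1 counters.

    Hypothetical helper for illustration. Any value that occurs more than
    n/k times in the stream is guaranteed to be a key of the result.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    counters = {}
    for x in stream:
        if x in counters:
            # Already tracked: just bump its counter.
            counters[x] += 1
        elif len(counters) < k - 1:
            # Room for a new counter.
            counters[x] = 1
        else:
            # No room: decrement every counter and drop those that hit zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


if __name__ == "__main__":
    print(misra_gries_summary([1, 2, 1, 3, 1, 4, 1, 5, 1], k=2))
```

With k = 2 this reduces to a single counter, which is essentially the majority-vote idea: it finds the value only if that value makes up more than half of the input. A second pass is needed to confirm the true count of any candidate.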

Thank you for your comment. The values are not unique; the table will have duplicates. The values just come from the domain {1, 2, 3, ..., m}.

2009-04-28 18:21:38
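
Since the values are said to come from 1..m, a plain array indexed by value can replace the hash table. The following is a minimal sketch under that assumption; the function name is hypothetical, and it assumes m >= 1. It uses O(m) counters, so it does not meet the log(m) memory bound asked about earlier.

```python
def most_frequent_in_domain(values, m):
    """Return (value, count) for the most frequent value, given values in 1..m.

    Hypothetical helper for illustration. Ties go to the smallest value;
    an empty input yields (1, 0).
    """
    counts = [0] * (m + 1)  # index 0 is unused so that counts[v] matches v
    for v in values:
        if not 1 <= v <= m:
            raise ValueError(f"value {v} is outside the domain 1..{m}")
        counts[v] += 1  # single pass, one tuple at a time

    # Scan the m counters for the largest one.
    best = max(range(1, m + 1), key=counts.__getitem__)
    return best, counts[best]


if __name__ == "__main__":
    print(most_frequent_in_domain([2, 5, 2, 1, 5, 2], m=5))
```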
