views: 1195

answers: 18

Hi,

I've been thinking recently about using object-oriented design in a sorting algorithm. However, I have not been able to find a way to even come close to making a sorting algorithm that does the sorting in O(n) time.

OK, here is what I've been thinking about for a week. I have a set of input data, and I will assign a mass to each item (assume each item has a Mass and is a Sphere; if all objects are perfectly spherical, with size proportional to mass, the heaviest one touches the ground first). I will place all my input data in space, all at the same distance from the Earth, and let them free-fall. According to gravitational law, the heaviest one hits the ground first, and the order in which they hit will give me the sorted data. This is funny in some way, but underneath I feel that this should be possible using the OO concepts I have learnt to date.

Is it really possible to make a sorting technique that uses a gravitational-pull-like scenario, or am I stupid/crazy?

Edit: It turns out all objects hit the ground at the same time, hence I introduced the spherical-object concept.

+7  A: 

You've just restated the problem. Calculating the order in which the gravitational effects occur will have, at best, the same big-O as the sort algorithms you're trying to beat.

bmargulies
I agree that you're correct, but stating this without any support isn't much help.
Beska
+2  A: 

Once you've calculated all the times they take to hit the ground, you'll still have to sort those values. You're not really gaining anything; you're just sorting different numbers after performing some extra, unnecessary calculation.

EDIT: Whoops. Forgot physics 101. Of course they'll all hit at the same time. :)

Any sort of modeling like this just transforms one sorting problem into another one. You won't gain anything.

Seth
Why can't a "hit ground" operation notify a receiver which collects the data? That way you don't have to sort the times.
Bragboy
You still have to model (simulate) time to get the order of the hits.
osgx
I don't understand. Are you thinking that you would simulate the falling objects? That doesn't really sound "object oriented". Regardless: as others have pointed out, gravity doesn't really work that way, and even if you used some other method of physical modeling, you'd still wind up with the same sorting problem.
Seth
+5  A: 

Gravity computes for free only in the real world. In a program you have to model it, which adds at least another factor of n to the complexity.

osgx
+4  A: 

The idea might seem simple, but the difference between the real world and the modelled one in this case is that in the real world everything happens in parallel. To model the gravitational sort as you describe it, you would have to run each object on a separate thread, with a thread-safe way to add them to a list in the order they finish (see the sketch below). Realistically, in terms of sorting performance, you would probably just use a quicksort on the times, or, since they all start at the same distance, on the rate of falling. And if your formula is proportional to the mass, you'd skip all that and just sort by mass.
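For illustration, a minimal sketch of that thread-per-object model in Java (hypothetical names; the technique is sometimes called "sleep sort"): each value sleeps for a time proportional to itself and then appends itself to a thread-safe queue.

import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class SleepSort {
    public static void main(String[] args) throws InterruptedException {
        int[] values = {5, 1, 4, 2, 3};
        Queue<Integer> landed = new ConcurrentLinkedQueue<>(); // thread-safe "ground"
        Thread[] threads = new Thread[values.length];
        for (int i = 0; i < values.length; i++) {
            final int v = values[i];
            threads[i] = new Thread(() -> {
                try {
                    Thread.sleep(v * 100L); // "fall time" proportional to the value
                } catch (InterruptedException ignored) { }
                landed.add(v);              // record the order of arrival
            });
            threads[i].start();
        }
        for (Thread t : threads) t.join();
        System.out.println(landed);         // most likely [1, 2, 3, 4, 5]
    }
}

Scheduler jitter makes this unreliable for values that are close together, which is exactly the accuracy problem raised elsewhere in this thread.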

JamesB
+3  A: 

On a fictitious "gravitational computer", this kind of sorting would take O(1) to solve. But on real computers as we know them, your lateral thinking won't help.

Jader Dias
The computer would have to be pretty big to handle an unbounded number of items all at once.
Steve314
Fictitious? Ha! http://www.youtube.com/watch?v=vZaRgMOo-Ig
Jeremy Friesner
@Jeremy Excellent!
Jader Dias
+5  A: 

General-purpose sorting is provably at best O(n log n). To improve on this, you have to know something about the data other than just how to compare values.

If the values are all numbers, radix sorting gives O(n), assuming you have upper and lower bounds for the numbers. This approach can be extended to handle any number, since ultimately all data in a computer is represented using numbers. For example, you can radix-sort strings by sorting on one character per pass, as if it were a digit.

Unfortunately, handling variable sizes of data means making a variable number of passes through the radix sort. You end up with O(n log m), where m is the largest value (since k bits gives you values up to (2^k)-1 unsigned, half that signed). For example, if you are sorting the integers from 0 to m-1, you've actually got O(n log n) again.
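As a concrete illustration, here is a minimal LSD radix sort for non-negative ints in Java (a sketch, not a tuned implementation): four counting-sort passes, one per byte, so the number of passes grows with the key width rather than with n.

import java.util.Arrays;

public class RadixSort {
    // Sorts non-negative ints with four stable counting-sort passes, one per byte.
    static int[] radixSort(int[] input) {
        int[] a = input.clone(), b = new int[input.length];
        for (int shift = 0; shift < 32; shift += 8) {
            int[] count = new int[257];
            for (int x : a) count[((x >>> shift) & 0xFF) + 1]++;    // histogram
            for (int i = 0; i < 256; i++) count[i + 1] += count[i]; // prefix sums
            for (int x : a) b[count[(x >>> shift) & 0xFF]++] = x;   // stable scatter
            int[] t = a; a = b; b = t;                              // swap buffers
        }
        return a;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(radixSort(new int[]{170, 45, 75, 90, 2, 802})));
    }
}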

Translating one problem to another can be a very good approach, but sometimes it's just adding another layer of complexity and overhead.

Steve314
Back in the day, when I was a college prof, I used to assign my best students the problem of proving that sorting routines that compare and swap elements in place are at best O(n log n). Proving it for "general-purpose sorting" would require a definition of what that means.
Jive Dadson
+2  A: 

Ignoring all the flaws that everyone else has mentioned, this essentially boils down to an O(n^2) algorithm, not O(n). You'd have to iterate over all the "spheres", find the "heaviest" or "biggest" one, and push it onto a list, then rinse and repeat. Finding the one that hits the ground first takes O(n), the second takes O(n-1), and so on. Worse than that, you have to perform a bunch of mathematical operations each time just to calculate some useless "time" value, when you could have sorted on the value you were interested in in the first place.
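In code, that procedure is just selection sort in disguise; a minimal sketch (hypothetical names) of "repeatedly pick the sphere that lands first":

public class DropSort {
    // Repeatedly find the "heaviest" sphere still in the air and record it
    // as the next to land: this is exactly selection sort, hence O(n^2).
    static double[] dropSort(double[] masses) {
        double[] m = masses.clone();
        for (int landed = 0; landed < m.length; landed++) {
            int heaviest = landed;
            for (int i = landed + 1; i < m.length; i++) // scan objects still falling
                if (m[i] > m[heaviest]) heaviest = i;
            double tmp = m[landed]; m[landed] = m[heaviest]; m[heaviest] = tmp;
        }
        return m; // first to land (heaviest) first
    }
}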

Mark
+15  A: 

The thing is, though one of the ideas of OOP may be to model the real world, that doesn't mean there's a direct correspondence between how long something takes in the real world and how long it would take to simulate it with a computer.

Imagine the actual steps required for your procedure:

  1. An object has to be constructed for every item in your set of data. On most modern hardware, this alone would require iteration and would therefore make your strategy O(n) at best.
  2. The effect of gravity would need to be simulated, for each object. Again, the most obvious, straightforward way to implement this would be to iterate.
  3. The time that each object lands on the surface of the "Earth" in your programming model would have to be captured and, via some implementation-specific mechanism, the corresponding object would need to be inserted into an ordered list as a result.

Considering the problem further introduces additional complications. Ask yourself this: how high up do you need to position these objects to start? Obviously high enough so that the largest one actually falls, i.e., farther from the Earth than the radius of the largest object. But how do you know how far that is? You'd need to first determine the largest object in the collection, which would (probably) require iterating yet again.

Also, one might imagine multithreading the simulation to mimic the real-time behavior of objects actually falling; but then you would find yourself adding items to a collection (an operation that obviously takes a non-zero amount of time) potentially at the same time that new collisions are being detected, so this would create threading issues as well.

In short, I have trouble imagining how this idea could be properly implemented using OOP alone, without special hardware. That said, it really is a good idea. It reminds me, in fact, of Bead Sort: an algorithm which, though not the same as your idea, is also a sorting solution that takes advantage of the very physical concept of gravity and, not surprisingly, requires specialized hardware. A rough software emulation is sketched below.
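A minimal software emulation of Bead Sort for non-negative integers, as a sketch: beads "fall" down columns, and reading off the row widths yields the sorted values. (Hardware does the falling in O(1); in software we pay for it with loops.)

public class BeadSort {
    static int[] beadSort(int[] values) {
        int max = 0;
        for (int v : values) max = Math.max(max, v);
        int[] beads = new int[max];                  // beads[j] = beads stacked in column j
        for (int v : values)
            for (int j = 0; j < v; j++) beads[j]++;  // drop v beads, one per column
        int[] sorted = new int[values.length];
        for (int row = 0; row < values.length; row++) {
            int width = 0;                           // beads resting on this row
            for (int j = 0; j < max; j++)
                if (beads[j] > row) width++;
            sorted[values.length - 1 - row] = width; // bottom rows hold the largest values
        }
        return sorted;                               // ascending order
    }
}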

Dan Tao
Thank you very much for your response, it's very insightful. The reason I asked this question is the way people learn from examples. A stack or a queue, for example, has a real-world counterpart. Even with sorting techniques like quicksort/merge sort and their divide-and-conquer concept, I can imagine to a certain extent how easy/fast things get when applied programmatically. In this specific case, however, I was wrong all along. Well, how else to learn? :)
Bragboy
@Bragaadeesh: Definitely not *wrong*! Like I said, it's actually a very clever concept. The problem is just how much work would need to be done to set up the model and run the simulation that would produce your results.
Dan Tao
Heck, lots of my more innovative ideas turn out not to work. I just try to come up with more.
David Thornley
+1 I was going to mention bead sort, but you mentioned it first.
Brian
+3  A: 

To answer the ultimate question:

am I stupid/crazy?

The answer is no, you are inquisitive, that's all. Nothing wrong with that.

MPelletier
Being inquisitive is in no way an indicator of a lack of craziness. In fact, inquisitiveness can be taken to the point of obsession and so be the first stage of insanity!
Jim Leonardo
+1  A: 

I love the idea! It is clever. While what others are saying is in general correct, that O(n log n) is a provable lower bound on the sorting problem in general, we need to keep in mind that this lower bound holds only for comparison-based models. What you are proposing here is a different model altogether, so it does deserve further thought.

Also, as James and Matrix have pointed out, a heavier object doesn't fall faster than a lighter one; you obviously need to adapt the model to something where the heavier object (number) does travel faster/farther (or slower/less far), so that you can somehow distinguish the numbers (a centrifuge comes to mind as well).

Requires more thought, but your idea is sharp!

(Edit) Now looking at Enrique's Phys2D idea, I think it makes a whole lot more sense.

One thing I would suggest is to definitely ignore the efficiency aspect for now (I know, I know, that was the entire goal). This is a new model, and we first need to bridge the gap between the idea and its implementation. Only then can we understand what time complexity even means for this model.

Amrinder Arora
Thank you very much!
Bragboy
+1  A: 

This should definitely be possible, but only with proper hardware support for it. Otherwise it sounds like a very cool idea. I hope someone presents an IEEE paper that makes this crazy dream a reality.

Sean Claypole
+1  A: 

Some weeks ago I was thinking about the same thing.

I thought of using the Phys2D library to implement it. It may not be practical, but it would be fun. You could assign negative weights to objects to represent negative numbers: with Phys2D you can define the gravity so that objects with negative weight rise to the roof while objects with positive weight fall to the floor. Arrange all the objects in the middle, halfway between the floor and the roof.

Suppose you have a result array r[] whose length is the number of objects. Every time an object touches the roof, you add it at the beginning of the array, at r[0], and increment a counter; the next object to touch the roof goes at r[1], and so on. Every time an object reaches the floor, you add it at the end of the array, at r[r.length-1]; the next one goes at r[r.length-2]. At the end, the objects that didn't move (stayed floating in the middle) can be added in the middle of the array; these objects are the 0's.

This is not efficient, but it could help you implement your idea.
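A library-free sketch of the bookkeeping described above (no Phys2D; all names hypothetical): the per-frame "speed" is simply the signed weight, negatives drift to the roof, positives to the floor, the result array fills from both ends, and the zeros that never move fill the middle at the end.

public class GravityFill {
    static int[] gravitySort(int[] values) {
        final double H = 1_000_000.0;        // distance from the middle to roof/floor
        int n = values.length;
        int[] r = new int[n];
        int roof = 0, floor = n - 1;         // next free slots at each end of r
        double[] pos = new double[n];        // 0 = middle; -H = roof, +H = floor
        boolean[] landed = new boolean[n];
        int inFlight = 0;
        for (int v : values) if (v != 0) inFlight++;
        while (inFlight > 0) {
            for (int i = 0; i < n; i++) {
                if (landed[i] || values[i] == 0) continue;
                pos[i] += values[i];         // per-frame motion proportional to weight
                if (pos[i] <= -H)     { r[roof++]  = values[i]; landed[i] = true; inFlight--; }
                else if (pos[i] >= H) { r[floor--] = values[i]; landed[i] = true; inFlight--; }
            }
        }
        while (roof <= floor) r[roof++] = 0; // the floaters are the zeros
        return r;                            // ascending: most negative ... most positive
    }
}

Objects that cross a boundary in the same frame land in scan order, which is the same accuracy issue raised in the step-size answers below; a larger H makes such ties rarer.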

Enrique
Thanks Enrique! It's great to see that there is a library for this!
Bragboy
+2  A: 

Hmmmm. Gravity sort.

Ignoring the fact that your specific model of gravity is wrong, let's see where this idea takes us.

Physical reality has 10^80 processors.

The actual lower bound of sorting is known to be O(log N) if you have N/2 processors to sort N objects.

If you have several CPU cores available there's no reason you can't run mergesort on several threads.

Joshua
+1  A: 

I think the problem can be simplified like this: you want to put the bottoms of the spheres at different heights, so that when they are dropped under identical gravity the largest hits the ground first, the second largest second, and so on. This is essentially equivalent to using lines rather than spheres, in which case you can just say heightOffTheGround = MAX_VALUE - mass.

You also don't need to worry about acceleration over time: since you don't care how fast they are going, or about realistic physics, you can give them all the same initial velocity x and go from there.

The problem is that we've then essentially just restated the problem and solved it like this (pseudocode):

int[] sortedArray;   // empty
int[] unsortedArray; // full of stuff
int iVal = MAX_INT_VALUE;
while (unsortedArray is not empty)
{
    foreach (currentValue in unsortedArray)
    {
        if (iVal == currentValue)
        {
            append currentValue to sortedArray
            remove currentValue from unsortedArray
        }
    }
    iVal--
}

The problem is that to run the physics engine, you've got to iterate over each time unit, and that is O(1)... with a very large constant: the number of delta steps the system runs through. The drawback is that for the vast majority of those delta steps, you get no closer to the answer: on any given iteration, it is very likely that all the spheres/lines/whatever will have moved... but none will have hit.

You could try to say: well, let's skip a lot of the intermediary steps and just jump ahead until one hits! But that means you have to know which one is the largest... and you're back to the sorting issue.

Beska
+1  A: 

I'll adapt your idea a little. We still have our objects, but they don't differ in weight, they differ in speed. So, at the beginning, all objects are aligned at the starting line, and on the starting shot they all move with their respective velocities towards the finish.

Clear enough: the first object to finish will emit a signal saying that it is there. You catch the signal and write down on the results paper who it was.

Okay, so you’ll want to simulate that.

We take the length of your field to be L = 1. With step size ∆t, each of your N objects moves a length of vᵢ∙∆t at a time. That means each object needs sᵢ = L / (vᵢ∙∆t) steps to reach the finish.

The point is that, in order to distinguish between two objects with very close speeds, the step size must be small enough that every object takes a different number of steps to finish.

Now, in the best case, this means that object 1 finishes in one step, object 2 in two, and so on. The total number of steps is then S = 1 + 2 + … + N = N∙(N + 1)/2, which is of order N².

If it's not the best case, and some velocities are much closer together than others, you'll have to lower the step size and, in effect, iterate many more times.

Debilski
+1  A: 

If a computer were built that sorted objects based on some physical criterion (which is not utterly ridiculous to think about), then I believe the order of complexity would have nothing to do with the number of objects, but rather with the local acceleration due to gravity. Assuming the model is Earth, the complexity would be O(g0), where g0 is:

g0 = G∙M/r² ≈ 9.8 m/s², where G is the gravitational constant and M and r are the mass and radius of the Earth.

The simple reasoning is that the number of spherical objects (n) is irrelevant if their centers are aligned at the start. Moreover, the acceleration due to gravity has a much bigger impact than the height itself. For instance, if we increase the height of all objects by 10x, it wouldn't take them 10x as long to hit the ground, but only about √10 ≈ 3.2x as long, since free-fall time grows as the square root of the height (t = √(2h/g)). This involves some negligible approximations, since the acceleration actually depends on the distance between the two objects, but that can be ignored, as we are more interested in the bigger picture than in a specific value.

Brilliant idea nevertheless.

Also, I love the coin-sorting video posted by @Jeremy. And finally, object-orientedness would be the least of my concerns in building such a machine. Thinking more about it, here's my stupid two cents on building such a machine:

O 0 o O o

. . . . .
. . . . .
. . . . .
= = = = =

All objects are of varying sizes, proportional to the criteria we want to sort by. They are initially held at the same height by a thin rod that runs through the center of each sphere. The = at the bottom are specially designed to record a value (optionally their position) somewhere as soon as a sphere collides with them. After all the spheres have collided, the recorded values give us the sorted order.

Anurag
+1  A: 

http://www.youtube.com/watch?v=5C5_dOEyAfk

"How 'bout that?"
+1  A: 

from Debilski's answer:

I'll adapt your idea a little. We still have our objects, but they don't differ in weight, they differ in speed. So, at the beginning, all objects are aligned at the starting line, and on the starting shot they all move with their respective velocities towards the finish.

Clear enough: the first object to finish will emit a signal saying that it is there. You catch the signal and write down on the results paper who it was.

I'd simplify it a step further and say these objects are random positive integers, and we want to sort them in ascending order as they approach zero: each one has a distance from zero, d, which is initially equal to the integer itself.

Assuming the simulation takes place in discrete time steps, i.e. frames, every object's new distance in each frame would be d = d - v, and an object gets added to the list when its d ≤ 0. That is one subtraction and one conditional, two discrete operations per object, so each frame seems to be O(n): linear.

The catch is that it is linear for one frame only. The cost is multiplied by the number of frames f required to finish, so the simulation itself is O(nf): quadratic.

However, if we treat the duration of a frame, t, as a parameter, we can affect the number of frames f in an inversely proportional manner. We can increase t to reduce f, but this comes at the cost of accuracy: the more we increase t, the higher the probability that two objects finish in the same frame and are therefore listed as equivalent even though they are not. So we get an algorithm with adjustable accuracy (as in most finite-element simulation contexts).

We can further refine this by turning it into an adaptive+recursive algorithm. In human code it would be:

function FESort(OriginalList, Tolerance):
  define an empty local list: ResultList

  while OriginalList has elements
    define an empty local list: FinishedList
    iterate through OriginalList
      decrement the distance of each object by Tolerance
      if the distance is less than or equal to zero,
        restore the distance (add Tolerance back) and
        move the object from OriginalList to FinishedList

    check the number of elements in FinishedList
      when zero
        double Tolerance
      when one
        append the only element in FinishedList to ResultList
      when more
        append the result of FESort(FinishedList, half Tolerance) to ResultList

  return ResultList
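Here is one possible Java rendering of the pseudocode above (hypothetical names; positive integers and an initial tolerance of at least 1 assumed), with one extra guard: exact duplicates are emitted together once the tolerance reaches 1, since they would otherwise re-race forever.

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class FESort {
    private static final class Ball {           // one object per input value
        final int value; int d;                 // d = remaining distance to zero
        Ball(int v) { value = v; d = v; }
    }

    public static List<Integer> sort(List<Integer> input, int tolerance) {
        List<Ball> balls = new ArrayList<>();
        for (int v : input) balls.add(new Ball(v));
        List<Integer> out = new ArrayList<>();
        race(balls, tolerance, out);
        return out;
    }

    private static void race(List<Ball> remaining, int tolerance, List<Integer> out) {
        while (!remaining.isEmpty()) {
            List<Ball> finished = new ArrayList<>();
            for (Iterator<Ball> it = remaining.iterator(); it.hasNext(); ) {
                Ball b = it.next();
                b.d -= tolerance;               // one frame
                if (b.d <= 0) {
                    b.d += tolerance;           // restore the distance for the re-race
                    it.remove();
                    finished.add(b);
                }
            }
            if (finished.size() == 1) {
                out.add(finished.get(0).value);
            } else if (finished.size() > 1) {
                if (tolerance == 1) {           // residuals all equal: exact duplicates
                    for (Ball b : finished) out.add(b.value);
                } else {
                    race(finished, tolerance / 2, out); // re-race ties with finer frames
                }
            } else {
                tolerance *= 2;                 // nothing landed: coarser frames
            }
        }
    }
}

For example, FESort.sort(List.of(5, 3, 8, 3, 1), 4) returns [1, 3, 3, 5, 8].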

I'm wondering if there's a real implementation along these lines that works for someone.

Interesting subject indeed :)

PS. The pseudocode above is my idea of pseudocode; please feel free to rewrite it in a more legible or conformant way if there is one.

edgerunner
+1 Wow!!! What an excellent view.
Bragboy
Thanks, though I doubt it's useful :) I'd say it depends a lot on finding a good tolerance value.
edgerunner