views:

134

answers:

4

I'm currently playing with implementing various sorting algorithms in Java, mostly for fun, but I'm struggling with how to do it 'right'. That is, I want the user to be able to call the sorting algorithm of choice on anything that is comparable - ints, longs, Strings, booleans (actually, are these comparable in Java?), their own classes; whatever. The question is how to do this.

I was thinking of using a class to represent the sorting algorithm, and therefore store the things to be sorted inside using a generic list or whatever (List<E>). This would also allow me to use multiple constructors and thus allow the user to pass in the data in various forms - Lists, Arrays, whatever. Is this the correct way to do it? My current problem is that I don't wish for the user to have to create a class when they want to sort something, I'd rather it was able to be called much like System.out.println or the like.

// Example:

int[] myInts = {5,4,3,2,1};

// This is what I do *not* want.
InsertionSort mySort = new InsertionSort();
int[] sortedInts = mySort.sort(myInts);

// This is more like what I want.
int[] sortedInts = Sorting.insertionSort(myInts);

I apologise with what may seem like a basic question, but I am just learning my way with programming languages. Which is a bit ridicolous for a 2nd year Computing student working at a software company for his summer job, but you'd be surprised at how little programming knowledge is required for most of my work... it's usually more design knowledge.

EDIT:

For clarity, my three main question are:

  • Is it better to have the user create a class to do the sorting, or to have a static method in a class the user imports?
  • Is it possible to deal with both primitive data types and generic objects easily? Since I want to be able to handle any generic object that implements comparable (or likewise), this then causes problems with primitives (as they don't implement anything ;) ).
  • What is the best way to handle generic input - what should I check for before I try to sort them (implementing Comparable, for example)?
+1  A: 

You could take as example the way Collections provides the binarySearch operation ... And indeed, the

int[] sortedInts = Sorting.insertionSort(myInts);

is more java-way, even if I would personnally prefer

public class Sorting {
     public static <DataType extends Comparable> Iterable<DataType> insertionSort(Iterable<DataType> data);
}
  • <DataType> ensure output data is of the same type than input
  • Iterable<DataType> data input data is an iterable, to ensure maximum compatibility. Obviously, using a List would be by far simple, as it allows internal item reordering. Howeve"r, using an iterable ensure the implementor of this method will have to re-create the list in order to modify it, guaranteeing that the input list is left unchanged, and that the output list is another one.

Since I just saw you edit your question, let me reply to it point by point (and consider chosing an answer after that, as it is easier to add new questions than to edit existing ones endlessly - unless you make your question a community wiki, like I do of this reply)

Is it better to have the user create a class to do the sorting, or to have a static method in a class the user imports?

To my mind, using a static method in this case is preferable, as you here have to manipulate objects you didn't create, in a quite "basic" fashion.

Is it possible to deal with both primitive data types and generic objects easily? Since I want to be able to handle any generic object that implements Comparable (or likewise), this then causes problems with primitives (as they don't implement anything ;) ).

Have you heard about autoboxing ? It's a feature of Java 5 which makes primary types "equivalents" of objects. That's to say int are automatically converted into Integer, which, as you know, implement Comparable.

What is the best way to handle generic input - what should I check for before I try to sort them (implementing Comparable, for example)?

Notice that, due to my method declaration (the ), checking that input data implements Comparable is not done by you, but by the Jav compiler, allowing your IDE to show you mistakes.

Riduidel
it's probably not a good idea to sort "Iterable" into "Iterable". List would be better
unbeli
Thank you for the link to Collections Riduidel - it has been very helpful. If you can, could you please explain how exactly your prefered way - `public static <DataType extends Comparable> Iterable<DataType> insertionSort(Iterable<DataType> data);` - works? That is, what does `<DataType extends Comparable> Iterable<DataType>` mean? XD. Unbeli, which one do you mean it would be better to use List for?
Stephen
I'm guessing it means "define DataType as a type extending Comparable, and then use it in two places". So instead of `public static Iterable<DataType extends Comparable> insertionSort(Iterable<DataType extends Comparable> data);`, which would be ambiguous.
Eric
+1  A: 

I think you answered your own question? If you want to expose a static method instead of making the user create and object and call an instance method, then just do that. Sorting.insertionSort() looks fine to me.

Internally that can dispatch to whatever you like. Inside, if you want to implement this with classes and polymorphism and whatnot, go ahead. It does seem like a bit overkill though. I am not sure inheritance and polymorphism help a lot here.

Sean Owen
A: 

I'm not sure if this is what you are struggling with ... but it is next to impossible to implement an algorithm that works both for reference types and (real) primitive types. The reason is the Java type system does not have a notional universal type that has the primitive types and Object as subtypes.

The normal workaround for this is to wrap the primitive types using their corresponding wrapper classes; e.g. Integer for int, Boolean for bool and so on. This allows you to implement (for example) a sorting algorithms that for for any Collection<T> or any <T>[].

This approach has performance / memory usage issues when applied to large arrays of (say) integers. Either you wear the performance hit, or you implement the algorithm and its supporting classes separately for each primitive type.

(I said next to impossible, because it is possible to abstract the comparison of a pair of array elements and swapping of a pair of array elements in a way that doesn't expose the actual element type in the interface; e.g.

public interface ArraySortAdapter {
  public abstract int compareElements(Object array, int pos1, int pos2);
  public abstract void swapElements(Object array, int pos1, int pos2);
}

and provide different implementations for different array types; e.g.

public class IntArraySortAdapter implements ArraySortAdapter {
  public int compareElements(Object array, int pos1, int pos2) {
      int[] intArray = (int[]) array;
      if (intArray[pos1] < intArray[pos2]) {
          return -1;
      } else if (intArray[pos1] > intArray[pos2]) {
          return +1;
      } else {
          return 0;
      }
  }

  ...
}

However, this is cumbersome and inefficient, to say the least ...)

Stephen C
A: 

The normal way of implementing a sorting algorithm would be to implement a static method, e.g. take a look at the source code for Arrays.sort(). You can overload this method with diferent implementations for different parameter types (e.g. objects that implement comparable vs. provide your own comparator vs. primitive arrays etc.)

Here's one I wrote earlier:

public static <T> void swap(T[] a, int x, int y) {
    T t=a[x];
    a[x]=a[y];
    a[y]=t;
}

public static <T extends Comparable<? super T>> void mergeInOrder(T[] src, T[] dst, int p1, int p2, int p3, int p4) {
    if (src[p2].compareTo(src[p3])<=0) return; // already sorted!

    // cut away ends
    while (src[p1].compareTo(src[p3])<=0) p1++;
    while (src[p2].compareTo(src[p4])<=0) p4--;

    int i1=p1;
    int i3=p3;
    int di=p1;
    while(di<p4) {
        if (src[i1].compareTo(src[i3])<=0) {
            dst[di++]=src[i1++];
        } else {
            dst[di++]=src[i3++];
            if (i3>p4) {
                System.arraycopy(src,i1,dst,di,p2-i1+1);
                break;
            }
        }
    }

    System.arraycopy(dst, p1, src, p1, (p4-p1)+1);
}

public static <T extends Comparable<? super T>> void mergeSort(T[] src, T[] dst, int start, int end) {
    if (start+1>=end) {
        if (start>=end) return;
        if (src[start].compareTo(src[end])>0) {
            swap(src,start,end);
        }
        return;
    }

    int middle=(start+end)/2;
    mergeSort(src,dst,start, middle);
    mergeSort(src,dst,middle+1, end);
    mergeInOrder(src,dst,start,middle,middle+1,end);
}

private static ThreadLocal<Comparable<?>[]> mergeSortTemp=new ThreadLocal<Comparable<?>[]>();

@SuppressWarnings("unchecked")
public static <T extends Comparable<? super T>> void mergeSort(T[] src) {
    int length=src.length;
    Comparable<?>[] temp=mergeSortTemp.get();
    if ((temp==null)||(temp.length<length)) {
        temp=new Comparable[length*3/2];
        mergeSortTemp.set(temp);
    }
    mergeSort(src,(T[])temp,0,length-1);
}

However, I can think of two good reasons to implement a sorting algorithm as a class where you generate your own instance:

  • It lets you polymorphically pass around instances of sorting algorithms - this could be useful if e.g. you were creating a collection of sorting algorithms and wanted to run lots of benchmarks on them for example.
  • You can have private state in the sorter instance - this is useful for some sorting algorithms, e.g. having some pre-allocated arrays for temporary storage, and it makes sense to put it in a class instance if you want to be able to simultaneously use different sort instances from multiple threads - a static method implementation would need some form of synchronisation (e.g. see the use of the ThreadLocal in the code above).
mikera