views:

190

answers:

3

I was writing a class that contained some functional-esque methods. First I wrote them using List as parameters and return types. Then I thought "Hey, you could also use a more generic type!" so I replaced the Lists with Seq, hoping that I could make my stuff faster one day by feeding them something else than lists.

So which general purpose stack-like data-structure shall I write my methods and algorithms for? Is there a general pattern I can stick to? All this is because the methods might need to get optimized in the future in case they will form a bottle-neck.

Update

I'll try to be a bit more precise: Given you know which operations you are using, like reversing, .tail, direct element access, or for comprehensions. Can I choose a type that will force efficiency on those operations?

Update 2

I'm quite aware of the performance of concrete data structures for various tasks. What I'm not aware of is which data structure might appear as a sub-class of some super type.

For example shall I use TraversableOnce or IndexedSeq instead of List or Array? Will it buy me anything?

Additional Question

What is your default List-like data-structure signature? Do you write

a(b: List[A]): List[A]

or

a(b: TraversableOnce[A]): TraversableOnce[A]?

Can you explain why?

+5  A: 

List is the default implementation of LinearSeq, which in turn is the default implementation of Seq, which in turn is the default implementation of Iterable, which in turn is the default implementation of Traversable.

See the diagrams here and choose the most general type as per your requirements.


alt text


This document might also help.

missingfaktor
+4  A: 

For background info on the collection library, have a look at the Scala 2.8 Collections API article.

If you have specific operations in mind then look in particular at the Performance Characteristics section.

On the design choice on whether to use a specific type of a more general trait, I would say that it depends on what you do in the implementation. For instance if a method accepts a List, it can count on fast prepend and could use this in its implementation. So accepting a more general trait could have unwanted performance results. Also, you'll have to worry about what type you get back.

scala> def a[A](t:TraversableOnce[A]): TraversableOnce[A] = t
a: [A](t: TraversableOnce[A])TraversableOnce[A]

scala> a(List(1,2))
res0: TraversableOnce[Int] = List(1, 2)

scala> res0.tail
<console>:8: error: value tail is not a member of TraversableOnce[Int]
       res0.tail

If you want to write something general, you will probably want to preserve the types. See http://stackoverflow.com/q/3225675/257449 for a taste of the issues you'll encounter and some of the solutions.

huynhjl
+2  A: 

I think, in general, you should use Seq for your parameters and design your methods to work efficiently with List. This way your methods will work ok with most Seq implementations and you will not have to convert your seqs prior to use your methods.

Edit

Your question has many questions inside.

  1. So which general purpose stack-like data-structure shall I write my methods and algorithms for?
    • I think the answer here is List. It's a stack and it's very fast
  2. Can I choose a type that will force efficiency on those operations?
  3. For example shall I use TraversableOnce or IndexedSeq instead of List or Array? Will it buy me anything?
    • Some abstractions have performance characteristics defined, some others don't. For example IndexedSeq scaladoc says "Indexed sequences support constant-time or near constant-time element access and length computation". If you have an IndexedSeq parameter and someone passes an IndexedSeq implementation that does not have "near-constant time element access", then that someone is breaking the contract and it's not your problem.
  4. What is your default List-like data-structure signature?
    • Seq
german1981
This is basically what I have done in my project. This is the kind of answer I was hoping for, but now I'm thinking that maybe there doesn't exist a good all-round answer.
ziggystar
One problem is that your question has many questions inside, I have updated my answer to try to address each question more specifically
german1981