views:

222

answers:

13

Dear all:

I was interested to know about parameters other than space and time when analysing the effectiveness of an algorithm. For example, we can focus on the effectiveness of the trapdoor function while developing encryption algorithms. What other things can you think of?

+5  A: 

Stability - some algorithms may "blow up" with certain test conditions, e.g. take an inordinately long time to execute, or use an inordinately large amount of memory, or perhaps not even terminate.

Paul R
The original version of quicksort was a good example. It was `O(NlogN)` on average, but `O(N^2)` in pathological cases. And one such pathological case was when the input list was already sorted!
Stephen C
Of course, "stability" in a sorting algorithm means something else, which might be an answer on its own. I think the `O(N^2)` was understood though, so it's more like a known trade-off.
Larry
+2  A: 

Worst case and best case are also interesting, especially when linked to certain conditions on the input. If your input data has some property, an algorithm that takes advantage of that property may perform better than another algorithm which performs the same task but does not use it.

For example, many sorting algorithms perform very efficiently when the input is partially ordered in a specific way that minimizes the number of operations the algorithm has to execute. (If your input is mostly sorted, an insertion sort will fit nicely, while you would never use that algorithm otherwise.)
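
For illustration, a minimal insertion sort sketch in Python (not part of the original answer); on a nearly sorted input the inner loop barely runs, so the cost is close to linear:

```python
def insertion_sort(xs):
    """In-place insertion sort: each element is shifted left only past the
    elements it is out of order with, so a nearly sorted input needs very
    few shifts and the running time approaches O(n)."""
    for i in range(1, len(xs)):
        current = xs[i]
        j = i - 1
        while j >= 0 and xs[j] > current:
            xs[j + 1] = xs[j]   # shift the larger element one slot right
            j -= 1
        xs[j + 1] = current
    return xs

nearly_sorted = [1, 2, 3, 5, 4, 6, 7, 8]   # only one adjacent pair is out of order
print(insertion_sort(nearly_sorted))        # [1, 2, 3, 4, 5, 6, 7, 8]
```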

Adrien Plisson
+5  A: 

Complexity. Two algorithms being the same in all other respects, the one that's much simpler is going to be a much better candidate for future customization and use.

Ease of parallelization. Depending on your use case, it might not make any difference or, on the other hand, make the algorithm useless because it can't use 10000 cores.

Tomislav Nakic-Alfirevic
The term "complexity", when applied to algorithms, does not usually mean the opposite of simplicity: it normally refers to time complexity rather than to how complicated an algorithm is.
Mitch Wheat
You mean *code* complexity (“complexity” is too general a term). Which can be measured differently, e.g. by SLOC (source lines of code) or cyclomatic complexity.
Konrad Rudolph
Fair enough. :) I would then say "understandability" or "maintainability". Would that fit the bill better?
Tomislav Nakic-Alfirevic
About your second point: these belong under the headlines of “work efficiency” and “step efficiency” of an algorithm. Unfortunately, Wikipedia’s output on these topics is zilch.
Konrad Rudolph
+1  A: 

What you are interested in aren’t parameters; rather, they are intrinsic properties of an algorithm.

Anyway, another property you might be interested in, and analyse an algorithm for, concerns heuristics (or rather, approximation algorithms), i.e. algorithms which don’t find an exact solution but rather one that is (hopefully) good enough.

You can analyze how far a solution is from the theoretical optimal solution in the worst case. For example, an existing algorithm (forgot which one) approximates the optimal travelling salesman tour by a factor of two, i.e. in the worst case it’s twice as long as the optimal tour.

Another metric concerns randomized algorithms, where randomization is used to prevent unwanted worst-case behaviour. One example is randomized quicksort: quicksort has a worst-case running time of O(n^2), which we want to avoid. By shuffling the array beforehand we avoid the worst case (i.e. an already sorted array) with very high probability. Just how high this probability is can be important to know; this is another intrinsic property of the algorithm that can be analyzed using probability theory.
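
For illustration, a minimal sketch of the shuffling idea in Python (using the standard `random` module); this is not a tuned implementation:

```python
import random

def randomized_quicksort(xs):
    """Quicksort on a pre-shuffled copy of the input, so that an already
    sorted input is no longer a pathological case."""
    xs = list(xs)
    random.shuffle(xs)   # destroys any adversarial ordering
    return _quicksort(xs)

def _quicksort(xs):
    if len(xs) <= 1:
        return xs
    pivot = xs[0]        # a fixed pivot is fine once the input has been shuffled
    smaller = [x for x in xs[1:] if x < pivot]
    larger  = [x for x in xs[1:] if x >= pivot]
    return _quicksort(smaller) + [pivot] + _quicksort(larger)

print(randomized_quicksort([5, 3, 8, 1, 2, 2, 9]))   # [1, 2, 2, 3, 5, 8, 9]
```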

Konrad Rudolph
+3  A: 

One important parameter that is frequently measured in the analysis of algorithms is the number of cache hits and cache misses. While this is a very implementation- and architecture-dependent issue, it is possible to generalise somewhat. One particularly interesting property is being cache-oblivious, which means that the algorithm will use the cache optimally on multiple machines with different cache sizes and structures, without modification.
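
As a rough sketch of the idea (Python, illustrative only), a cache-oblivious matrix transpose recursively splits the larger dimension; some level of the recursion fits whatever cache the machine has, without the code ever naming a cache size:

```python
def co_transpose(a, b, r0, r1, c0, c1):
    """Copy the block a[r0:r1][c0:c1] into b, transposed, by divide and
    conquer. No cache size appears anywhere, yet each cache level is used
    well because the recursion eventually produces blocks that fit in it."""
    rows, cols = r1 - r0, c1 - c0
    if rows * cols <= 16:                      # small base case: transpose directly
        for i in range(r0, r1):
            for j in range(c0, c1):
                b[j][i] = a[i][j]
    elif rows >= cols:                         # split the taller block in half
        mid = (r0 + r1) // 2
        co_transpose(a, b, r0, mid, c0, c1)
        co_transpose(a, b, mid, r1, c0, c1)
    else:                                      # split the wider block in half
        mid = (c0 + c1) // 2
        co_transpose(a, b, r0, r1, c0, mid)
        co_transpose(a, b, r0, r1, mid, c1)

a = [[1, 2, 3], [4, 5, 6]]                     # 2 x 3 matrix
b = [[0] * 2 for _ in range(3)]                # 3 x 2 destination
co_transpose(a, b, 0, 2, 0, 3)
print(b)                                       # [[1, 4], [2, 5], [3, 6]]
```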

Il-Bhima
+3  A: 

Time and space are the big ones, and they seem so plain and definitive, yet they should often be qualified (1). The fact that the OP uses the word "parameter" rather than, say, "criteria" or "properties" is somewhat indicative of this (as if a big-O value on time and on space were sufficient to frame the underlying algorithm).
Other criteria include:

  • domain of applicability
  • complexity
  • mathematical tractability
  • definitiveness of outcome
  • ease of tuning (may be tied to the "complexity" and "tractability" mentioned above)
  • ability of running the algorithm in a parallel fashion

(1) "qualified": As hinted in other answers, a -technically- O(n^2) algorithm may be found to be faster than say an O(n) algorithm, in 90% of the cases (which, btw, may turn out to be 100% of the practical cases)

mjv
+8  A: 

First and foremost there's correctness. Make sure your algorithm always works, no matter what the input. Even for input that the algorithm is not designed to handle, you should print an error message rather than crash the entire application. If you use greedy algorithms, make sure they truly work in every case, not just the few cases you tried by hand.

Then there's practical efficiency. An O(N^2) algorithm can be a lot faster than an O(N) algorithm in practice. Do actual tests and don't rely on theoretical results too much.
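
A sketch of such a test in Python, using `timeit` from the standard library (the two functions here are hypothetical examples; which one wins at a given size depends on constants, language, and hardware, which is exactly why you measure):

```python
import timeit

def has_duplicate_quadratic(xs):
    # O(n^2): compare every pair, no auxiliary data structure
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            if xs[i] == xs[j]:
                return True
    return False

def has_duplicate_linear(xs):
    # O(n): hash-set membership test, but with per-element overhead
    seen = set()
    for x in xs:
        if x in seen:
            return True
        seen.add(x)
    return False

small = list(range(8))   # tiny, duplicate-free input (worst case for both)
for fn in (has_duplicate_quadratic, has_duplicate_linear):
    elapsed = timeit.timeit(lambda: fn(small), number=100_000)
    print(fn.__name__, round(elapsed, 4))
```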

Then there's ease of implementation. You usually don't need the best intro sort implementation to sort an array of 100 integers once, so don't bother.

Look for worst cases in your algorithms and if possible, try to avoid them. If you have a generally fast algorithm but with a very bad worst case, consider detecting that worst case and solving it using another algorithm that is generally slower but better for that single case.
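
A minimal sketch of that idea in Python, loosely in the spirit of introsort: run quicksort, but bail out to an O(n log n) heapsort if the recursion goes suspiciously deep (a sketch, not a production implementation):

```python
import heapq
import random

def hybrid_sort(xs, depth=None):
    """Quicksort that detects its own worst case (too-deep recursion) and
    falls back to heapsort, keeping the overall worst case at O(n log n)."""
    xs = list(xs)
    if depth is None:
        depth = 2 * max(1, len(xs)).bit_length()   # allow roughly 2*log2(n) levels
    if len(xs) <= 1:
        return xs
    if depth == 0:                                 # worst case detected: heapsort
        heapq.heapify(xs)
        return [heapq.heappop(xs) for _ in range(len(xs))]
    pivot = random.choice(xs)
    smaller = [x for x in xs if x < pivot]
    equal   = [x for x in xs if x == pivot]
    larger  = [x for x in xs if x > pivot]
    return hybrid_sort(smaller, depth - 1) + equal + hybrid_sort(larger, depth - 1)

print(hybrid_sort([9, 1, 8, 2, 7, 3, 7]))   # [1, 2, 3, 7, 7, 8, 9]
```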

Consider space and time tradeoffs. If you can afford the memory in order to get better speeds, there's probably no reason not to do it, especially if you really need the speed. If you can't afford the memory but can afford to be slower, do that.

If you can, use existing libraries. Don't roll your own multiprecision library if you can use GMP for example. For C++, stuff like boost and even the STL containers and algorithms have been worked on for years by an army of people and are most likely better than you can do alone.

IVlad
Thanks for your description.
sul4bh
+4  A: 

For algorithms that perform floating point operations, the accumulation of round-off error is often a consideration.
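
A small Python illustration: naive summation versus Kahan (compensated) summation of the same list, with `math.fsum` from the standard library as a reference:

```python
import math

def naive_sum(xs):
    total = 0.0
    for x in xs:
        total += x           # each addition may round; the errors accumulate
    return total

def kahan_sum(xs):
    """Compensated (Kahan) summation: track the rounding error of each
    addition and feed it back in, greatly reducing the accumulated error."""
    total = 0.0
    c = 0.0                  # running compensation for lost low-order bits
    for x in xs:
        y = x - c
        t = total + y
        c = (t - total) - y
        total = t
    return total

xs = [0.1] * 10_000_000
print(naive_sum(xs))         # drifts visibly away from 1,000,000
print(kahan_sum(xs))         # essentially exact
print(math.fsum(xs))         # accurately rounded reference from the stdlib
```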

ChrisH
a.k.a. accuracy of the result.
phkahler
+2  A: 

If we're talking about algorithms in general, then (in the real world) you might have to think about CPU, filesystem (read/write operations), and bandwidth usage.

True, they are way down the list of things you need to worry about these days, but given a massive enough volume of data and cheap enough infrastructure, you might have to tweak your code to ease up on one or the other.

Manos Dilaverakis
+4  A: 

Power consumption, for embedded algorithms (think smartcards).

Walter H
+6  A: 

Stability (sorting) - Does the algorithm maintain the relative order of equal elements? (See the sketch at the end of this answer.)

Numeric Stability - Is the algorithm prone to error when very large or small real numbers are used?

Correctness - Does the algorithm always give the correct answer? If not, what is the margin of error?

Generality - Does the algorithm work in many situations (e.g. with many different data types)?

Compactness - Is the program for the algorithm concise?

Parallelizability - How well does performance scale when the number of concurrent threads of execution is increased?

Cache Awareness - Is the algorithm designed to maximize use of the computer's cache?

Cache Obliviousness - Is the algorithm tuned for particular cache sizes / cache-line sizes, or does it perform well regardless of the cache's parameters?
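
To make the stability point concrete, a small Python sketch (the records are hypothetical); Python's built-in sort happens to be stable:

```python
# Records of (name, grade); several records share the same grade.
records = [("alice", 2), ("bob", 1), ("carol", 2), ("dave", 1)]

by_grade = sorted(records, key=lambda r: r[1])   # sorted() is a stable sort
print(by_grade)
# [('bob', 1), ('dave', 1), ('alice', 2), ('carol', 2)]
# Within each grade the original order (bob before dave, alice before carol)
# is preserved; an unstable sort gives no such guarantee.
```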

Peter Alexander
Thanks, your list was rather exhaustive :)
sul4bh
+1  A: 

For numeric algorithms, there's also the property of continuity: that is, whether a slight change in the input produces only a slight change in the output. See also Continuity analysis of programs on Lambda The Ultimate for a discussion and a link to an academic paper.

For lazy languages, there's also strictness: f is called strict if f _|_ = _|_ (where _|_ denotes the bottom (in the sense of domain theory), a computation that can't produce a result due to non-termination, errors etc.), otherwise it is non-strict. For example, the function \x -> 5 is non-strict, because (\x -> 5) _|_ = 5, whereas \x -> x + 1 is strict.

Another property is determinism: whether the result of the algorithm (or its other properties, such as running time or space consumption) depends solely on its input.
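
Toy Python sketches (hypothetical function names) for the continuity and determinism points; the strictness point is specific to lazy languages and is not shown here:

```python
import random

def step(x):
    # Discontinuous: inputs just below and just above 0 give outputs far apart.
    return 0.0 if x < 0 else 1.0

def smooth(x):
    # Continuous: a small change in x produces only a small change in the output.
    return 0.5 * x + 1.0

def pick(xs):
    # Non-deterministic in the sense above: the result depends not only on the
    # input but also on the state of the random number generator.
    return random.choice(xs)

print(step(-1e-9), step(1e-9))       # 0.0 1.0  -- a jump for a tiny input change
print(smooth(-1e-9), smooth(1e-9))   # two nearly identical values
print(pick([1, 2, 3]))               # may differ from run to run
```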

jkff
A: 

All these things in the other answers about the quality of various algorithms are important and should be considered.

But time and space are two things that vary at some rate with the size of the input (n). So what else can vary according to n?

There are several that are related to I/O. For example, the number of writes to a disk is an important one, which may not be directly shown by space and time estimates alone. This becomes particularly important with flash memory, where the number of writes to the same memory location is the significant metric in some algorithms.

Another I/O metric would be "chattiness". A networking protocol might send shorter messages more often, adding up to the same space and time as another networking protocol, but some aspect of the system (perhaps billing?) might make minimizing either the size or the number of the messages desirable.

And that brings us to cost, which is sometimes a very important algorithmic consideration. The cost of an algorithm may be affected by both space and time in different amounts (consider the separate costing of server storage space and gigabits of data transfer), but cost is the thing you wish to minimize overall, so it may have its own big-O estimate.

Jeffrey L Whitledge