views:

872

answers:

12

I wonder how many of you have implemented one of computer science's "classical algorithms" like Dijkstra's algorithm or data structures (e.g. binary search trees) in a real world, not academic project?

Is there a benefit to our dayjobs in knowing these algorithms and data structures when there are tons of libraries, frameworks and APIs which give you the same functionality?

+5  A: 

Well, someone has to write the libraries. While working at a mapping software company, I implemented Dijkstra's, as well as binary search trees, b-trees, n-ary trees, bk-trees and hidden markov models.

Besides, if all you want is a single 'well known' algorithm, and you also want the freedom to specialise it and optimise it if it becomes critical to performance, including a whole library seems like a poor choice.

Nick Johnson
+4  A: 

In my previous workplace, which was an EDA company, we implemented versions of Prim and Dijsktra's algorithms, disjoint set data structures, A* search and more. All of these had real world significance. I believe this is dependent on problem domain - some domains are more algorithm-intensive and some less so.

Having said that, there is a fine line to walk - I see no business reason for re-implementing STL or Java Generics. In many cases, a standard library is better than "inventing a wheel". The more you are near your core application, the more it may be necessary to implement a textbook algorithm or data structure.

Yuval F
+15  A: 

Is there a benefit to our dayjobs in knowing these algorithms and data structures when there are tons of libraries, frameworks and APIs which give you the same functionality?

The library doesn't know what your problem domain is and won't be able to chose the correct algorithm to do the job. That is why I think it is important to know about them: then YOU can make the correct choice of algorithms to solve YOUR problem.

kigurai
Learning those algorithms does to your brain what going to the gym does to your muscles.
Gastoni
+8  A: 

Is there a benefit to understanding your tools, rather than simply knowing that they exist?

Yes, of course there is. Taking a trivial example, don't you think there's a benefit to knowing what the difference is List (or your language's equivalent dynamic array implementation) and LinkedList (or your language's equivalent)? It's pretty important to know that one has constant random access time, while the other is linear. And one requires N copies if you insert a value in the middle of the sequence, while the other can do it in constant time.

Don't you think there's an advantage to understanding that the same sorting algorithm isn't always optimal? That for almost-sorted data, quicksort sucks, for example? Naively just calling Sort() and hoping for the best can become ridiculously expensive if you don't understand what's happening under the hood.

Of course there are a lot of algorithms you probably won't need, but even so, just understanding how they work may make it easier for yourself to come up with efficient algorithms to solve other, unrelated, problems.

jalf
While I agree in general, most libraries' Sort() facility caters for quicksort worst cases in one manner or another.
Nick Johnson
They mostly just avoid the O(n^2) cases. With nearly-sorted data, any library qsort implementation will be O(n log n), whereas insertion sort is O(n). "nearly sorted", adj: every element is within O(1) places of its proper sorted place.
Steve Jessop
And in practice, the super-linear component of algorithms like quicksort is very small. It's (almost) never worth worrying about.
Nick Johnson
+2  A: 

If you never work with performance-critical code, consider yourself lucky. However, I consider this scenario unrealistic. Performance problems could occur anywhere. And then it's necessary to know how to fix that problem. Obviously, merely knowing a few algorithm names isn't enough here – unless you want to implement them all and try them out one after the other.

No, knowing (at least some of) the inner workings of different algorithms is important for gauging their strengths and weaknesses and for analyzing how they would handle your situation.

Obviously, if there's a library already implementing exactly what you need, you're incredibly lucky. But let's face it, even if there is such a library, using it is often not completely straightforward (at the very least, interfaces and data representation often have to be adapted) so it's still good to know what to expect.

Konrad Rudolph
+2  A: 

A* for a pac man clone. It took me weeks to really get but to this day I consider it a thing of beauty.

Nick
+2  A: 

I've had to implement some of the classical algorithms from numerical analysis. It was easier to write my own than to connect to an existing library. Also, I've had to write variations on classical algorithms because the textbook case didn't fit my application.

For classical data structures, I nearly always use the standard libraries, such as STL for C++. The one time recently when I thought STL didn't have the structure I needed (a heap) I rolled my own, only to have someone point out almost immediately that I didn't need to do that.

John D. Cook
+4  A: 

We use a home grown implementation of a p-random number generator from Knuth SemiNumeric as an aid in some statistical processing

EvilTeach
+8  A: 

Knowing, or being able to understand these algorithms is important, these are the tools of your trade. It does not mean you have to be able to implement A* in an hour from memory. But you should be able to figure out what the advantages of using a red-black tree as opposed to a normal unbalanced tree are so you can decide if you need it or not. You need to be able to judge the fitness of an algorithm for solving your problem.

This might sound too school-masterish but these "classical algorithms" were not invented to give college students exam questions, they were invented to solve problems or improve on current solutions, just like the array, the linked list or the stack are building blocks to write a program so are some of these. Just like in math where you move from addition and subtraction to integration and differentiation, these are advanced techniques that will help you solve problems that are out there.

They might not be directly applicable to your problems or work situation but in the long run knowing of them will help you as a professional software engineer.

To answer your question, I did an implementation of A* recently for a game.

Harald Scheirich
+1  A: 

Classical algorithms I have used in actual work:

  • A topological sort

  • A red-black tree (although I will confess that I only had to implement insertions for that application and it only got used in a prototype). This got used to implement an 'ordered dict' type structure in Python.

  • A priority queue

  • State machines of various sorts

  • Probably one or two others I can't remember.

As to the second part of the question:

An understanding of how the algorithms work, their complexity and semantics gets used on a fairly regular basis. They also inform the design of systems. Occasionally one has to do things involving parsing or protocol handling, or some computation that's slightly clever. Having a working knowledge of what the algorithms do, how they work, how expensive they are and where one might find them lying around in library code goes a long way to knowing how to avoid reinventing the wheel poorly.

ConcernedOfTunbridgeWells
I don't think anyone doubts your coding skills, so I don't think you need to provide us with code...
kigurai
+1  A: 

Classical algorithms are usually associated with something glamorous, like games, or Web search, or scientific computation. However, I had to use some of the classical algorithms for a mere enterprise application.

I was building a metadata migration tool, and I had to use topological sort for dependency resolution, various forms of graph traversals for queries on metadata, and a modified variation of Tarjan's union-find datastructure to partition forest-like structured metadata to trees.

That was a really satisfying experience. Most of those algorithms were implemented before, but their implementations lacked something that I would need for my task. That's why It's important to understand their internals.

kohomologie
+1  A: 

I use the Levenshtein distance algorithm to help implement a 'Did you mean [suggested word]?' feature in our website search.

Works quite well when combined with our 'tagging' system, which allows us to associate extra words (other than those in title/description/etc) with items in the database. \

It's not perfect by any means, but it's way better than most corporate site searches, if I don't say so myself ; )

David HAust