What are your favorite programming-related / cs academic published papers?

It could be a functional pearl, a programming-language paper like those often cited on, etc. Really, anything vaguely related to programming. Please also explain why you like the paper.

+13  A: 

I have in mind a somewhat forgotten paper: Structured Programming with go to Statements, Donald Knuth, ACM Computing Surveys, Vol 6, No. 4, Dec. 1974.

It's actually a rebuttal to the "GOTO considered harmful" creed (saying that "GOTO is sometimes okay, really"), a view that is now out of fashion (to say the least!), but it also says some important things about the practice of programming. Most famously, it is the source of the quote "Premature optimization is the root of all evil". :-)
[It's also instructive to read the quote in its original context, to see what Knuth did not mean.]

Quite related is Knuth's Computer Programming as an Art, Knuth’s Turing Award lecture (1974), printed in Communications of the ACM, Volume 17, Issue 12, Dec. 1974, which is available e.g. here.

I just found out that the excellent Mark Jason Dominus (mjd, also considers this paper (Structured programming with go to Statements) "my single all-time favorite computer science paper":
+9  A: 

Oh there are so many fantastic papers. Purely Functional Data Structures is great. I'm still reading it, but so far it has been an excellent read. The implementations of the data structures are short and beautiful.

Chris turned his thesis into a book. It's a great book.
Norman Ramsey
+3  A: 

By far: Functional Reactive Animation by Conal Elliott and Paul Hudak. It was the paper that got me interested in functional reactive programming, but more importantly it is a fantastic exposition of the design methodology called Semantic Design, which has had a profound impact on the way I think about software.

Pickler Combinators is another one of my favorites, for its minimal elegance in solving a practical problem.

To be clear, functional reactive animation is since outdated with much research in the FRP community, but it is still my favorite paper on the subject.
The FRP paper was voted the most influential paper from ICFP 1997, I believe.
Norman Ramsey
+8  A: 


While not purely academic Edsger Dijkstra's paper "The Humble Progammer", his Turing Lecture from 1972, is an excellent paper to read. My favourite quote is

The competent programmer is fully aware of the strictly limited size of his own skull; therefore he approaches the programming task in full humility, and among other things he avoids clever tricks like the plague.




Rob Wells
+1 Dijkstra's archive @ utexas is worth combing through slowly
just somebody
+1  A: 

It might not seem programming related, but it really is something that programmers should be aware of: ("Unskilled and Unaware of It: How Difficulties in Recognizing One's Own Incompetence Lead to Inflated Self-Assessments")

Oftentimes, we only see our own knowledge, and the corresponding ignorance of someone else in our own field. We mock the "PhD Professor" who can't remember how to get their email. What we fail to see is our own ignorance in their and other fields. Once we approach other people, our customers, knowing that they have knowledge that we don't, and we have knowledge that they don't, and that our goal is to help them do their jobs better, we can communicate without arrogance and condescension.

Interesting read.
Paul Nathan
A very good paper to put (anonymously) in someones inbox... Pure evil.
Sure, but they probably won't understand why it was in their box!
+8  A: 

Big Ball of Mud


While much attention has been focused on high-level software architectural patterns, what is, in effect, the de-facto standard software architecture is seldom discussed. This paper examines this most frequently deployed of software architectures: the BIG BALL OF MUD. A BIG BALL OF MUD is a casually, even haphazardly, structured system. Its organization, if one can call it that, is dictated more by expediency than design. Yet, its enduring popularity cannot merely be indicative of a general disregard for architecture.

These patterns explore the forces that encourage the emergence of a BIG BALL OF MUD, and the undeniable effectiveness of this approach to software architecture. What are the people who build them doing right? If more high-minded architectural approaches are to compete, we must understand what the forces that lead to a BIG BALL OF MUD are, and examine alternative ways to resolve them.

A number of additional patterns emerge out of the BIG BALL OF MUD. We discuss them in turn. Two principal questions underlie these patterns: Why are so many existing systems architecturally undistinguished, and what can we do to improve them?

Why do I like it? It's an irreverent but useful look at the software development world as it actually exists.

+12  A: 

Here are the papers that come to mind:

+1  A: 

The page rank citation ranking system

It describes how Google works.

Google uses it for web documents, but it's useful for pretty much any directed graph structure. For non directed graph's it's subject to manipulation, but if you "trust" the nodes in the graph, then it will work for non directed graphs too.

One interesting application of it to programing would be to index libraries in source code as a ranking system in a code / api documentation search.

It would be really cool, for example, to build an index of something like source forge (or code plex), using "page rank" between types and methods as a relevance metric.

The easiest language to do it for would be Java (because Java namespaces use reverse domain names by convention).

I wrote a prototype of something like this when I worked at Microsoft. It indexed VB source code, api documentation, and compiled .net assemblies. It also did stemming of identifier names, recogonzing camel casing, pascal casing, and underscores, and split things up into multiple groups.

It got pretty good results.

I was pretty new though (I had only been there a couple weeks), I worked on a different team (the VB team), so I wasn't very successful in convincing the MSDN help folks to adopt it. They ended up just using Windows Live Search.

In any case....

Another use for it would be for Stack Overflow.

You could, for example, treat stack overflow as a large graph, with edges from people to documents, and from documents to people. If you then weighted the edges based on reputation, you could compute page rank over the graph, and then use the page rank to sort search results.

This would produce different results than what Google does (because the edges in normal page rank are not weighted), but I think for the purposes of searching Stack Overflow questions it would yield better results than Google.

I'm willing to bet that most links into stack overflow from the "outside" are to the main page, not to individual questions.

This means from Google's point of view, most stack overflow questions and answers are pretty much equivalent.

But, if you calculated page rank, using "reputation" to weight edges, you would get relevance results that reflected the values of the Stack Overflow community.

In any case... they paper is really good. You should read it.

Scott Wisniewski
Most google searches don't go to the main page. I'm consistantly finding SO questions popping up in search results
Elizabeth Buckwalter
I didn't say most Google Searches go to the main page.I was saying that the Google Results may not be as good as they could be, because Google doesn't take reputation into account.They results may very well be "good enough".I was just expressing a way that I thought they could be "better".
Scott Wisniewski
+7  A: 

Gotta throw in Leslie Lamport's Time, Clocks, and the Ordering of Events in a Distributed System. Not so much for the clock sync part, but for the distributed event ordering. First paper I know of to talk about what causality really means in a distributed system. None of today's big webapps would really work without that :-).

+3  A: 

of course, the answer would be any computer generated scientific paper!

but joking aside, I'm (like many other probably) really intrigued by the whole P != NP problem.

The first time I heard about (this was in my first bachelor year at VUB) it had a huge influence on me. I still can remember how the problem got my attention for several weeks... I kept reading and looking up information about, even foolishly tried to come up with some clever algorithms on my own. Good times it were back then!

here is an example of such a paper, but there are many, many others just like it

+2  A: 

I really enjoyed A History of Erlang by Joe Armstrong, Erlang's creator. It's a really interesting inside look at the creation of a new (and exciting) programming language.

+18  A: 

For this audience I think I have to go with John Hughes's seminal paper Why Functional Programming Matters because it is densely packed with great ideas and because the topic of the paper is programming. I like the paper because any programmer can pick it up and read it and most will come away excited by the ideas. Also, there are lots of examples in the paper, so you can immediately pick up some kind of Haskell implementation and start trying the ideas for yourself right away.

For the academics in the crowd, I'll also give a shout-out to Tony Hoare's great paper Proof of Correctness of Data Representations. It's intensely mathematical, but it is the canonical reference for a critically important and poorly understood technique: abstract data types. If we all understood about abstraction functions and representation invariants, the world would be a far better place. Heck, I'd just settle for getting people to document their representation invariants and leaving me to figure out the abstraction functions for myself. (Some of Tony's best papers have been collected into a book called Essays in Computing Science. Tony is one of my heroes and this collection has some fabulous chapters including Tony's wonderful Turing lecture on 'The Emperor's New Clothes; a talk he gave to a lay audience on what programming is; a proof of a simple program, finding the kth largest element in an array; and many other nice things. Even if you're not mathy you can pick up this book and skip the mathy chapters and still enjoy it a lot.)

Norman Ramsey
The John Hughes paper link is dead. Here's a new one: [Why Functional Programming Matters (PDF)]( "Why Functional Programming Matters by John Hughes")
+7  A: 

Don't mean to get a little too classic on you, but I've gotta say Turing's On Computable Numbers, with an Application to the Entscheidungsproblem has got to be, at least I would consider, one of the most significant (even if only historically) papers in CS. And Charles Petzold's The Annotated Turing does an excellent job of illustrating the significance, as well as making the mathematics understandable, even to a mildly-mathematical programmer such as myself. In fact, I'm pretty sure Jeff mentioned it in one of the Stackoverflow podcasts.

+3  A: 

Trace Trees [PDF] by Andreas Gal, Michael Bebenita, Mason Chang, Michael Franz

Used by Mozilla Tracemonkey javascript interpreter, and contributed to the "javascript interpreter wars."

see also:

+7  A: 

I like Google's MapReduce paper. If you've ever wondered how they process all that data then this is the place to go. It touches on functional programming and distributed processing, which are two of the big topics these days.

+1, easily digestible paper
just somebody
+2  A: 

It would be hard to not include "No Silver Bullet" by Fred Brooks.

+9  A: 

I can't believe nobody has included the (possibly) most referenced paper on this site:
Go To Statement Considered Harmful - Edsger W. Dijkstra

Some of my favorites:
What Every Computer Scientist Should Know About Floating-Point Arithmetic - David Goldberg
A Personal Computer For Children of All Ages - Alan Kay (written in 1972, also a Wired interview with Kay)
Skip Lists: A Probabilistic Alternative to Balanced Trees - William Pugh

Graphics Noob
+1 for the skip lists paper. I was so pleased when I ran across that one a few years ago.
Jason S
+1  A: 

The IEEE Software's 25th-Anniversary is a great selection:

But one of my all time favorites is The Humble Programmer by Dijkstra

Francisco Garcia
+5  A: 

C.A.R. Hoare's 1980 Turing Award Lecture: The Emperor's Old Clothes

Adam Crossland
I had never seen this before, just read it... absolutely brilliant. It's good to see someone so smart/famous stating pretty much every principle of programming I've learned (the hard way) over the years.
@rmeador: I was awed when I first read this paper, and whenever I revisit it, I am impressed by Tony Hoare and his at-times almost painful honesty. Anyone who contemplates managing a software project needs to read it, and sleep on it, and read it again.
Adam Crossland
I found it very interesting that - after a major project failure - he adopted what we would today call an Agile approach - way back in 1965!
Tom Bushell
+3  A: 

A case against the GO TO statement, by E.W. Dijkstra

Any of Dijkstra's papers really, they influenced so much of modern computer science.

Whats funny is that at the machine level, GOTO is essential; however, I agree that in higher level languages it shouldn't be used.
Nate Bross
@Nate IMHO never using goto is a bad idea. Breaking out of nested loops without goto leads to unreadable code. Understanding why goto shouldn't be used in the majority of cases is the best idea.
@Yacoby: it would be interesting to take a survey to see how many programmers would claim to never use GOTO and who also think nothing of using BREAK to get out of loops.
Adam Crossland
What I meant is that in high level languages there are better ways than using a `GOTO` If you find your self using `GOTO` alot, you may want to restructure your code. I'm not against using `break;` or `continue;` but lots of VB6 code is chuck full of `GOTO`s when simply changing the structure of the code would have made it more clear.
Nate Bross
I think the argument is really structured-goto vs unstructured-goto. `break` is an example of the former, the `goto` keyword is in most languages the latter.
+1  A: 

Why Functional Programming Matters, by John Hughes. It's a dense read but really goes into the details of what advantages FP has.

Gabe Moothart
There's also a great article by Reginald Braithwaite about why that paper is *still* relevant after all those years. It's called: Why Why Functional Programming Matters Matters.
Jörg W Mittag
+4  A: 

The one that I find I refer people to most often when trying to encourage sound OO fundamentals is Robert C. Martin's "The Open-Closed Principle".

+6  A: 

Ken Thompsons's Reflections on Trusting Trust gets me every time I read it.

Matt Ball
+3  A: 

MapReduce: Simplified Data Processing on Large Clusters

Google's paper on a technique for massively distributed computation. This kind of thing will become a lot more important as we enter the multicore era.

Gabe Moothart
+1. Studied it in database course as personnal project. Great article :-).
+13  A: 

How to teach yourself programming in 10 years - Peter Norvig

That's not an academic paper.
Paul Nathan
The question asks for *published* papers.
Norman Ramsey
+1  A: 

It's a slide show, which I don't usually go for, but I found it to be a great reference for database design: Join-fu: The Art of SQL Tuning for MySQL

Elizabeth Buckwalter
Great resource, not a paper. but certainly a nice presentation : ). Thank you Elizabeth ("even bears practice Joinfu" hahhaah)+ 1
+3  A: 

My link was introduced to me from Coding Horror. Having been a TA of an introductory Computer Science course I found it fascinating.

The Camel Has Two Humps

+7  A: 

Fred Brook's No Silver Bullet — Essence and Accidents of Software Engineering is probably my personal favorite.

Edit: Changed the link to point to the actual paper

Tom Bushell
When ever I am being interviewed for a new job, one of the questions that I ask my prospective manager is which texts have most influenced his or her attitudes towards managing software projects. If Mythical Man-Month and No Silver Bullet aren't mentioned, I am very likely to pass on the position.
Adam Crossland
+4  A: 

Things You Should Never Do, Part I by Joel Spolsky

Josh Stodola
+3  A: 

How To Become A Hacker by Eric S.Raymond

+2  A: 

A rational design process: How and why to fake it by DL Parnas (IEEE Transactions on Software Engineering. Volume 12 , Issue 2, February 1986). Old but still relevant today.

We had an engineering manager print out copies of this and hand it out. Very good read.
Tony Arkles
+1  A: 

Big Ball of Mud
I changed my mind, Big Ball of Mud is good, but this is probably my favorite:
Portrait of a N00b

Jens Granlund
+2  A: 

Six ways to write more comprehensible code
aka How to keep your code from destroying you
by Jeff Vogel

Summary: As a developer, time is your most valuable resource. These six tips on how to write maintainable code are guaranteed to save you time and frustration: one minute spent writing comments can save you an hour of anguish.


  1. Comment like a smart person.
  2. Define constants a lot. No, a LOT.
  3. Don't use variable names that will mock you.
  4. Do error checking. You make errors. Yes, you.
  5. "Premature optimization is the root of all evil." - Donald Knuth
  6. Don't be too clever by half.

Mostly for intermediate or those moving into professional environments, but a good read regardless.

+2  A: 

Structured Programming with go to Statements by Donald Knuth.

God save the goto.

+2  A: 

The Real Reason Why Software Engineers Need Math [pdf]

Explains why mathematics is important in the education of a software engineer.

+1  A: 

Knuth: On the Translation of Languages from Left to Right

Dijkstra: Goto considered harmful

Towards a Mathematical Science of Computation

Recursive Functions of Symbolic Expressions and their Computation by Machine (Part I)

Tim Schaeffer

I particularly enjoy with whatever related to computer vision. Face detection, OCR, perspective projection... There are a lot of papers explaining strategies and algorithms and I practically find them all damn interesting.

+1  A: 

My two "forgotten gems" are:

Engineering a sort() function - Bentley & McIlroy -

Why programming is a good medium for expressing poorly understood and sloppily-formulated ideas - Minsky -

Especially the second paper.

+1  A: 

Claude Shannon - "A Mathematical Theory of Communication"

Remember some stuff about Shannon from school.. but I never read the paper; maybe I should.
+1  A: 

Selected Papers in Digital Typography by D. Knuth. Actually I remember a very early issue of DDJ with an article by DK that related why he was unhappy with the state of the art of textbook typography. I believe that TeX followed not long after.

Hugh S. Myers