views:

951

answers:

12

I think I've become quite good at the basics of programming (for a variety of languages). I can write a good* line of code. I can write a good method. I can write a good class. I can write a good group of classes. I can write a good small or medium application.

I do not however know how to build a good large application. Particularly in the case where multiple technologies are involved and more are likely to become involved with time. Say a project with a large web front-end, a large server back-end that connects to some other integration back-end and finally a large and complex database. Oh, I've been involved in a few of these applications and I could build one I'm sure. I'm not so sure however that it could qualify as "good".

My question is thus for a reference to a book or other good source of reading where I could learn how to distribute and organize code and data for general large projects. For example, would I want to layer things very strictly, or would I want to encapsulate it in independent units instead? Would I want to try to keep most of the logic in the same pool, or should it just be distributed as seems most logical when adding whatever feature I'm adding?

I've seen lots of general principles on these issues (e.g. no spaghetti code, meatball code...) and read a few excellent articles that discuss the matter, but I've never encountered a source which would lead me to concrete practical knowledge. I realize the difficulty of the question, and so I'd be happy to just hear about the readings that others have found to help them in their quest for such knowledge.

As always, thank you for your replies.

*Given the debated nature of the definition of "good" code, the term "good" in this context won't be defined (it means whatever you think it ought to mean).

+11  A: 

Borrowed from tvanfosson:

Start with a small application and say yes every time someone wants a new feature added.

carrier
Upvoted for being against the current.
Mr Grieves
HAH!!! Funny. That's what my boss wants me to do. HAH!!
Jerry
I've done it that way, it's not fun. The client hands you 2x4s one at a time and tells you to build a house, the catch being that with every 2x4 comes a different idea of how the house should look... OK if you are billing by the hour...
EJB
E.J., I assume this is getting points for "Humorous", because you're exactly right--I just think it was assumed to be obvious.
Bill K
@EJB: I love your comment. I will use it and claim it as my own :)
j_random_hacker
+9  A: 
Scottie T
I don't think that this book answers his question: this book is fairly specific to C++, and is much more about the tactics (minutiae of information-hiding in header files, to reduce recompilation times) than it is about strategy (architecture/big picture).
ChrisW
He asked for the practical as opposed to general. There are a ton of resources on large software projects in general. This is a very practical book.
Scottie T
It's true that this book talks more about layering and about acyclic dependency than any other book I've seen.
ChrisW
The question does not deal directly with project management, but rather architecture and hence re-tagged.
Totophil
+4  A: 
  1. Decide which features are most important; forget the rest
  2. Decide which of those features are most important and forget the rest
  3. Implement them (should take a couple weeks, otherwise repeat steps 1 and 2)
  4. Launch
  5. See which features work, which don't and which are missing
  6. Go back to step 1
davetron5000
I agree with 1 and 2 certainly, but I think the question wants to know about how to do (3) most effectively. Which surely comes down to how modular you make it...
JeeBee
Excellent point on the feature-weeding. That reminds me, I need to do another round of that right about now. Perfect timing!
lc
The point of (1) and (2) is to reduce (3) to what the questioner already knows how to do - build a medium application.
Arkadiy
Arkadiy - yes that is what I was getting at
davetron5000
+3  A: 

Make it extensible using design patterns that mean that you aren't going to have to change everything to wedge in new functionality.

Decide what you need to build and build that.

Break it up into modules that perform the tasks separately.

Plan plan plan plan, know what you are building before you start, and build that and nothing else.

Only write in the features you need to. Don't add things you think might be useful, but... leave it flexible enough to be able to add anything that you might need to add.

Omar Kooheji
These are the kind of general principles that I find more or less useful. You can't plan which part of your software will become successful. Software adapts to markets and you usually end up somewhere completely different from where you started. You can't make everything flexible. It takes too much time.
Mr Grieves
A: 

Well, you could take a look at the Rational Unified Process. Check the essential parts and select the artifacts you think you'll need. Make a list of all the features you want and organize them in a requirements list. Also, plan your software architecture carefully, so you don't have to change it later. With some of those tips, developing a large app will be relatively easier.

Danmaxis
+4  A: 

Large applications are not created in one night. Enterprise apps start with small pieces that are then put together. If you design your apps in such a way that they can be scaled up, it will be easier to integrate with all the surrounding factors like databases, third-party tools, etc. If you go to infoq.com you will find a lot of great case studies and materials about scaling and architectures, covering MySpace, Amazon and many others. Nothing but experience will lead you to developing great large apps.

Oscar Cabrero
+1 for fixing the spelling and punctuation.
Scottie T
+3  A: 

Incrementally, using Test Driven Design

Noel Walters
+10  A: 

As programmers, we like to believe we are smart people, so it's hard to admit that something is too big and complex to even think about all at once. But for a large-scale software project it's true, and the sooner you acknowledge your finite brain capacity and start coming up with ways to simplify the problem, the better off you'll be.

The other main thing to realise is that you will spend most of your time changing existing code. Building the initial codebase is just the honeymoon period -- you need to design your code with the idea in mind that, 6 months later you will be sitting in front of it trying to solve some problem without a clue how this particular module works, even though you wrote it yourself.

So, what can we do?

Minimise coupling between unrelated parts of your code. Code is going to change over time in ways you can't anticipate -- there will be showstopper problems integrating with unfamiliar products, requirements changes -- and those will cause ripple-on changes. If you have established stable interfaces and coded to them, you can make any changes you need in the implementation without those changes affecting code that uses the interface. You need to spend time and effort developing interfaces that will stand the test of time -- if an interface needs to change too, you're back to square one.
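To make the "stable interface" point concrete, here is a minimal sketch in Python. Every name in it (`OrderStore`, `InMemoryOrderStore`, `ListBackedOrderStore`, `apply_discount`) is made up for illustration and is not from any real library. The calling code depends only on the abstract interface, so the storage implementation can be swapped without touching the caller.

```python
from abc import ABC, abstractmethod


class OrderStore(ABC):
    """Hypothetical stable interface: callers depend on this, not on a backend."""

    @abstractmethod
    def save(self, order_id, total):
        """Record the total for an order."""

    @abstractmethod
    def total_for(self, order_id):
        """Return the most recently saved total for an order."""


class InMemoryOrderStore(OrderStore):
    """First implementation: a plain dictionary."""

    def __init__(self):
        self._totals = {}

    def save(self, order_id, total):
        self._totals[order_id] = total

    def total_for(self, order_id):
        return self._totals[order_id]


class ListBackedOrderStore(OrderStore):
    """Second implementation with a different internal layout (append-only rows)."""

    def __init__(self):
        self._rows = []

    def save(self, order_id, total):
        self._rows.append((order_id, total))

    def total_for(self, order_id):
        # Scan newest-first so the latest saved total wins.
        for row_id, total in reversed(self._rows):
            if row_id == order_id:
                return total
        raise KeyError(order_id)


def apply_discount(store, order_id, percent):
    """Business logic written against OrderStore only."""
    new_total = round(store.total_for(order_id) * (1 - percent / 100), 2)
    store.save(order_id, new_total)
    return new_total
```

Passing either store to `apply_discount` gives the same result, which is the property you want to preserve when the implementation behind the interface changes later.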

Establish automated tests that you can use for regression testing. Yes, it's a lot of work up front. But it will pay off in the future when you can make a change, run the tests, and establish that it still works without that anxious feeling of wondering if everything will fall over if you commit your latest change to source control.
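As a rough illustration of what such a regression suite can look like, here is a small sketch using Python's standard `unittest` module. The function under test, `parse_price`, and the "previously fixed bug" mentioned in the comments are invented for the example.

```python
import unittest


def parse_price(text):
    """Hypothetical function under test: parse a string like '12.50' into cents."""
    whole, _, frac = text.strip().partition(".")
    frac = (frac + "00")[:2]  # pad or truncate to exactly two decimal digits
    return int(whole) * 100 + int(frac)


class ParsePriceRegressionTest(unittest.TestCase):
    def test_whole_number(self):
        self.assertEqual(parse_price("12"), 1200)

    def test_two_decimals(self):
        self.assertEqual(parse_price("12.50"), 1250)

    def test_single_decimal_digit(self):
        # Pins down an imagined old bug where "12.5" was read as 1205 cents.
        self.assertEqual(parse_price("12.5"), 1250)

    def test_surrounding_whitespace(self):
        self.assertEqual(parse_price(" 3.07 "), 307)


def run_suite():
    """Run the regression tests and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(ParsePriceRegressionTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Running `run_suite()` before each commit (or from a CI job) is one way to get the "make a change, run the tests" loop described above.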

Lay off the tricky stuff. Every now and then I see some clever C++ template trick and think, "Wow! That's just what my code needs!" But the truth is, the decrease in how readable and readily understandable the code becomes is often simply not worth the increased genericity. If you're someone like me whose natural inclination is to try to solve every problem in as general a manner as possible, you need to learn to restrain it until you actually come across the need for that general solution. If that need arises, you might have to rewrite some code -- it's no big deal.

j_random_hacker
Oh man. I get trapped in that whole generalization thing all the time. I'm always having to pull myself down to a more practical level. +1
Jeremy Powell
+3  A: 

As I have mentioned elsewhere, large applications are not just bigger, they are different. So much so that we speak of programming in-the-small and in-the-large. There is a major qualitative shift that occurs in the nature of the problems and their solutions when you are programming in-the-large. The line is very fuzzy, and there are numerous specific issues that can force you across that line.

Some of those issues include:

  • size (such as a database that simply won't fit on a single hard drive)
  • complexity (from all-in-one application to multiple subsystems)
  • concurrency (from zero to thousands/millions of simultaneous users)
  • availability (from 9% uptime to 99.999% uptime)
  • reliability (from daily failures to several years MTBF)
  • speed (from hours down to milliseconds in response time)
  • productization (from your pet project to a sellable commodity)
  • etc.

How to deal with all that? Learn and use every valuable technique you can, and learn to evaluate which ones are actually valuable--that will take a while, and there is no quick answer.

However, there is one technique that is easy, obvious, and one-size-fits-all: divide and conquer. Isolate each major piece of functionality, each subsystem, each external dependency, so that your main system only touches them at its outside edge. When you can change each of those by simply tweaking a thin interface in a very short timeframe, then you have accomplished something. That will take you a long way.
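One possible shape for that "thin interface at the outside edge", sketched in Python. Everything here is an assumption made for illustration: `RateSource`, `VendorRateAdapter`, `FixedRates`, `invoice_total`, and the vendor response layout `{"data": {"mid": ...}}` do not describe any real service.

```python
from typing import Callable, Protocol


class RateSource(Protocol):
    """Thin, hypothetical interface between the main system and a rate provider."""

    def rate(self, currency: str) -> float:
        ...


class VendorRateAdapter:
    """The only class that knows the (invented) vendor call and response format."""

    def __init__(self, fetch: Callable[[str], dict]):
        self._fetch = fetch  # stand-in for a vendor SDK or HTTP call

    def rate(self, currency: str) -> float:
        payload = self._fetch(currency)
        return float(payload["data"]["mid"])


class FixedRates:
    """Drop-in replacement for tests or for running without the vendor."""

    def __init__(self, rates):
        self._rates = dict(rates)

    def rate(self, currency: str) -> float:
        return self._rates[currency]


def invoice_total(amount_usd: float, currency: str, source: RateSource) -> float:
    """Core logic: touches the external dependency only through RateSource."""
    return round(amount_usd * source.rate(currency), 2)
```

If the vendor changes its response format, or is replaced altogether, only the adapter needs to change; `invoice_total` and everything built on it stay as they are.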

Best wishes.

Rob Williams
I interpret what you're saying here as, "Everything is easy for small n." But it's going from small n to large n that is the hard part, since it's hard to see in advance what particular problems we will face when that happens.
j_random_hacker
@j_random_hacker: I agree, except that most of the particular problems have been seen before by someone, and many of those someones have published their experiences. So, going from small to large is predictable to a large degree, but not obvious to the uninitiated.
Rob Williams
@Rob: Whoops, upon rereading my comment it maybe sounds like I was disagreeing with you, though in fact I was agreeing! Totally agree with your comment too, experience makes all the difference.
j_random_hacker
+3  A: 

It's really interesting to note how many of these comments say that blind iteration is the only way.

Iteration is critical (I'm a huge fan), but there are people who can plan out huge projects--it's just that few of us have ever met one.

Think of it as all of us playing basketball in our driveways. We're pretty good, we can get most baskets and actually have a great fun game in the park.

Just because we've never met professional players, however, doesn't mean they don't exist and can't kick each and every one of our butts up and down the court all day long.

The only thing is that there are no pro games of programming--maybe if there were we'd see them a bit more.

Bill K
A great observation.
Mr Grieves
Interesting insight. But you should check out the Algorithms competition on www.topcoder.com, where you'll find *crazy* smart programmers who can solve impossible problems in less time than it takes you or me to say "Holy Foo!" Search for tomek and Petr.
j_random_hacker
A: 

As someone in charge of a large app I would say

  • Use a non-invasive framework such as Spring
  • Reduce coupling
  • Create immutable objects wherever possible - they're thread-friendly
  • Accept that your application might need to be split into separate processes to scale better and plan for that.
  • Build a solid toolset and learn the tools.
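For the immutable-objects point, a minimal sketch using Python's standard `dataclasses` module. The `Money` class is a made-up example; the same idea applies to final fields in Java.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Money:
    """Hypothetical immutable value object: fields cannot be reassigned."""

    amount_cents: int
    currency: str

    def add(self, other):
        """Return a new Money instead of modifying this one."""
        if other.currency != self.currency:
            raise ValueError("currency mismatch")
        return Money(self.amount_cents + other.amount_cents, self.currency)


def try_to_mutate(money):
    """Show that assignment to a field is rejected; returns True if it was."""
    try:
        money.amount_cents = 0
    except FrozenInstanceError:
        return True
    return False


price = Money(100, "USD")
total = price.add(Money(50, "USD"))    # price itself is left unchanged
in_euros = replace(price, currency="EUR")  # copy with one field changed
```

Because an instance never changes after construction, it can be handed to several threads without locking; any "change" produces a new object.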

DON'T PANIC

Fortyrunner
A: 

Some thoughts on product and vendor lock-in:

  • Try to be as vendor and platform independent as possible. This will prevent you from having to reimplement everything from scratch with a new product/platform/framework etc.
  • In practice this means using Java SE + Java EE + an open source RDBMS like PostgreSQL, encapsulated by JPA from Java EE. Do not use additional libraries or frameworks (Spring, Hibernate, ...). This way you can switch products and vendors any time you need to.
  • I think that you can only get this level of product and platform independence with Java. Even if you use OSS libraries and frameworks, you will regret using them if you find out that the implementation does not suit your needs and you have to redo everything.
  • You can check the product independence of your code with the Java Application Verification Kit.
  • Spend some time on the architecture beforehand, but also redesign the architecture throughout the implementation. A good book (unfortunately only in German) is "Java EE 5 Architekturen" by Adam Bien.

@j_random_hacker: Actually, no - I still think that my first point is an argument for using Java in large applications, not against it. Every language is ONE language. So you always have to make a commitment to a language, of course.

  • But Java SE & EE include the language, compiler and virtual machine as well as all the necessary libraries/frameworks. And there are different IMPLEMENTATIONS of the whole Java SE/EE platform: Java SE (JDK) from Sun, Apache, IBM, HP, Oracle, BEA. Java EE (application server) from Sun, Apache, Red Hat, IBM, Oracle and others. .NET with C# has only one implementation, from Microsoft (plus Mono, an implementation of a somewhat similar language/platform).
  • PHP also has only one implementation, I think. There are plenty of different C++ compilers, but they all implement slightly different C++ languages and they are not bundled with libraries that all share the same API. Choosing Java, I know that I can choose between half a dozen Java SE implementations and half a dozen Java EE application servers to run the software, which in turn run on Linux, Solaris, FreeBSD, HP-UX, IBM z/OS, Windows, Mac OS X and on a very large variety of hardware platforms. So I just do not have to worry if I find a really bad implementation problem late in development or even in production - I would just walk away from Sun and never look back. (This is why I recommended the Java Application Verification Kit. By checking your source with it you can be sure that Sun, IBM, Oracle or any other evil company did not sneak any of their proprietary stuff into your source as dependencies which could bind you to that company. You are free as a bird.)
  • You cannot do that with PHP or Ruby. With those languages, you would have to patch the implementation problem yourself if no one else does it, because spending months patching bugs in PHP or Ruby is still less effort than rewriting your complete application.
  • Sun has open sourced both Java SE (the complete JDK) and Java EE (the Glassfish application server). The only thing that is not "open source" is the binding language specification, which is led by Sun and gets massive contributions from others. This is why you can grab the Java implementation from Sun, modify the Java language and redistribute the source and binaries, but cannot call the result "Java" any more if it is not in line with the language specification (Sun protects the Java trademark so that it is only applied to things that are actually Java). This might sound "evil" at first, but it actually ensures that there is such a thing as "Java": you can write a Java application and run it on any Java implementation. You cannot do that with C++, as there is no C++ specification which is agreed upon by every C++ implementation (source code might compile with the Intel C++ compiler, but not with the GNU one) and - more importantly - there is no common library: if I write a C++ program with the Qt library, it will not compile with the GTK library, as they have completely different APIs.
  • If you cannot stand anything from Sun Microsystems, but want an open source Java, then you can just use Apache Harmony (Java SE) with Apache Geronimo (Java EE) on top of it.
SAL9000
You're already on -1 so I won't downvote you further, but can't you see how your first point (be as vendor neutral as possible) kinda contradicts your advice to use Java? Sure, Java is pretty WORA these days, but Sun owns Java. Many (most?) languages are not tied to a particular vendor.
j_random_hacker
Thanks for the update, I wasn't aware that other organisations produced Java JDKs. Regarding C++ though, there is an ISO standard which nowadays has good support across compilers, as well as the C++ Standard Library (sadly much smaller than Java's libraries, but standard nonetheless).
j_random_hacker