Organizing Java projects

+6 Q:

Organizing Java projects

Hello,

I'm a junior developer and recently started working for a very small office where they do a lot of in-house development. I have never worked on a project that involved more than one developer or was as big and complex as these.

The problem is that they don't use the available tools (version control, automated builds, continuous integration, etc.) to their full extent. Typically a project is one big project in Eclipse/NetBeans, with CVS for version control and everything checked in (including library JARs). They only started using branches when I began creating branches for small tasks and merging them back. As the projects get bigger and more complex, problems start to arise with dependencies, the project structure is tied to an IDE, building can be a PITA sometimes, etc. It's hectic at best.

What I want is to set up a development environment where most of these problems go away and I save time and effort. I would like to set up projects in a manner independent of the IDE used, put them under version control (I'm leaning towards SVN right now), avoid dependency messes and automate building as much as possible.

I know that there are multiple approaches and tools for this and I do not want to start a holy war. I would really appreciate practical recommendations based on experience and what you have found useful when facing similar problems. All projects are Java projects and range from web applications to "generic" ones. I use Eclipse most of the time, but also NetBeans if needed. Thanks in advance.

+1 A:

I recommend using Maven to build your projects. Maven brings value to a project because it:

promotes convention over configuration, which results in a good project structure
makes it easy to generate project files for IDEs (Eclipse, NetBeans, IDEA) through its plugins
handles all dependencies and the complete build lifecycle
facilitates project modularization (via multi-module projects)
eases the burden of releases and versioning
improves code quality through easy integration with continuous integration servers and a lot of code quality plugins

cetnar 2009-12-25 19:00:07

Do you actually use it yourself?

Thorbjørn Ravn Andersen 2010-06-25 08:05:29

Yes. I use Maven in almost all my work and home projects.

cetnar 2010-06-25 09:39:45

+4 A:

A fine, admirable instinct. Kudos to you.

Part of your problem might not be solved using tools. I'd say that source code management needs some work, because it doesn't sound like branching, tagging, and merging are being done properly. You'll need some training and communication to solve that.

I haven't used CVS myself, so I can't say how well it supports those practices. I will point out that Subversion and Git would be better choices. At worst, you should be reading the Subversion "red bean" book to get some generic advice on how to manage source code.

Personally, I'm not a Maven fan. I believe it's too heavyweight, especially when compared to Ant and Ivy. I'd say that using those with Cruise Control could be the solution to a lot of your problems.

You didn't mention unit testing. Start building TestNG and Fit tests into your build cycle.

Look into IntelliJ - I think it's a better IDE than either Eclipse or NetBeans, but that's just me.

Best of luck.

duffymo 2009-12-25 19:02:50

+1 A:

Maven can be a bit daunting given its initial learning curve, but it would nicely address many of your concerns. I also recommend you take a look at Git for version control.

2009-12-25 19:02:52

+1 for Maven. The initial investment in learning it pays for itself many times over within months.

Kris 2009-12-25 19:28:54

+2 A:

Maven is great; however, it has a fair bit of a learning curve, and it requires that the project fit a very specific file structure. If you have a big legacy project, it may be difficult to mavenize it. In that case, Ant+Ivy would do the same without the stringent requirements that Maven has.

For build automation, Hudson is beyond awesome. I've used a couple of different systems, but that is unquestionably the easiest to get set up and administer.

Milan Ramaiya 2009-12-25 19:26:21

+6 A:

Our development stack (team of 10+ developers)

Eclipse IDE with M2Eclipse and Subclipse/Subversive
Subversion for source control, some developers also use TortoiseSVN where Subclipse fails
Maven 2 for project configuration (dependencies, build plugins) and release mgmt (automatic tagging of releases)
Hudson for Continuous Integration (also creates snapshot releases with source attachments and reports)
Archiva for artifact repository (multiple repositories, e.g. releases and snapshots are separated)
Sonar for code quality tracking (e.g. hotspots, coverage, coding guidelines adherence)
JIRA for bug tracking
Confluence for developer wiki and communication of tech docs with other departments
Docbook for manuals (integrated into build)
JMeter for stress testing and long-term performance monitoring
Selenium/WebDriver for automated browser integration tests
Jetty, Tomcat, WebLogic and WebSphere as test environments for web apps. Products are deployed every night and automated tests are run on distributed Hudsons.
Mailing list with all developers for announcements and general info mails
Daily stand-up meetings where everybody says what they are currently doing

This setup is considered standard for our company as many departments are using those tools and there is a lot of experience and community support for those.

You are absolutely right about trying to automate as much as possible. If your colleagues start to see the benefits when parts of the development process are automated, they will be encouraged to improve on their own. Of course, every new technology gimmick ("tool") is a new burden and has to be managed and maintained; that is where the effort moves. You save time when, e.g., Maven automatically performs your releases, but you will spend time on managing Maven itself. My experience is that every time I introduced a new tool (one of the above), it took time to be adopted and cared for, but in the end it brought advantages to the whole team once its real value was experienced - especially in times of stress, when the tools take over much of the work you would otherwise have to do manually.

mhaller 2009-12-25 20:06:38

+1 A:

For project and repository management, I use Trac with Subversion.

trashgod 2009-12-25 20:07:55

+9 A:

You seem to be almost exactly where the place I work at was when I started there 1.5 years ago. The only difference is that you've started toying with branches, which is actually something we still don't do at my work, but more about that later in this answer.

Anyway, you're listing a very good set of tools which can help a small company, and they work really nicely as subtopics, so without further ado:

Version control systems

Most small companies currently use CVS or SVN and there's nothing bad in that; in fact, I'd be really worried if no version control was used at all. However, you have to use version control right; just having one won't make your life easier. We currently use CVS and are looking into Mercurial, but we've found that the following works as a good set of conventions when working with CVS (and, I suspect, SVN too):

Have separate users for all committers. It's beyond valuable to know who committed what.
Don't allow empty commit messages. In fact, if possible, configure the repository to reject any commit without a comment and/or with only the default comment. "Initial commit for FooBarizer" is better than "Empty log message".
Use tags to mark milestones, prototypes, alphas, betas, release candidates and final versions. Don't use tags for experimental work or as footnotes/Post-It notes.
Don't use branches, since they really don't work if you keep developing the application. This is mainly because in CVS and SVN branching just doesn't work as expected, and it becomes an exercise in futility to maintain more than two living branches (head and any secondary branch) over time.
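
The "no empty commit messages" rule above can be sketched as a small check. This is only an illustration, not something from our setup: the class name CommitMessagePolicy and the list of placeholder texts are assumptions, and how the check is wired into the repository (for example through a commit hook) is left out.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: decides whether a commit message is worth accepting.
class CommitMessagePolicy {

    // Placeholder texts treated as "no message". This set is an assumption;
    // adjust it to whatever default text your client or server inserts.
    private static final Set<String> PLACEHOLDERS = new HashSet<>(Arrays.asList(
            "empty log message",
            "*** empty log message ***",
            "no message"));

    // Rejects null, blank and placeholder-only messages.
    static boolean isAcceptable(String message) {
        if (message == null) {
            return false;
        }
        String normalized = message.trim().toLowerCase(Locale.ROOT);
        return !normalized.isEmpty() && !PLACEHOLDERS.contains(normalized);
    }

    public static void main(String[] args) {
        System.out.println(isAcceptable("Initial commit for FooBarizer")); // true
        System.out.println(isAcceptable("   "));                           // false
        System.out.println(isAcceptable("Empty log message"));             // false
    }
}
```

A check like this only catches the laziest messages; whether a message is actually useful still comes down to team discipline.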

Always remember that for a software company the source code is the source of income and contains all the business value, so treat it that way. Also, if you have an extra 70 minutes, I really recommend that you watch the talk Linus Torvalds gave at Google about git and (D)VCS in general; it's really insightful.

Automated builds and Continuous Integration environments

These are about the same, actually. Daily builds are a PR joke and bear little to no resemblance to the state of the actual software beyond some very rudimentary "Does it compile?" checks. You can compile a lot of awful code noise that doesn't do anything; keeping software quality up has nothing to do with getting the code to compile.

On the other hand, unit tests are a great way to maintain software quality, and I can say with a bit of personal pride that rigorous unit testing helps even the worst programmers improve a lot and catch stupid errors. In fact, so far only three bugs in code I have written have reached production environments, and I'd say that in 18 months that's a pretty damn good achievement. In our new production code we usually have instruction coverage of over 80%, mostly over 90%, and in one special case all the way up to 98%. This is a very lively field and you're better off Googling for the following: TDD, BDD, unit tests, integration tests, acceptance tests, xUnit, mock objects.

That's a bit of a lengthy preface, I know. The actual meat of all the above is this: if you want automated builds, have them run every time someone commits, and make sure there's a constantly growing and improving set of unit tests for the production code. Have the continuous integration system of your choice (we use Hudson CI) run all the unit tests related to the project and only accept builds if all the tests pass. Do not make any compromises! If unit tests show that the software is broken, fix the software.
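
The "only accept builds if all the tests pass" rule boils down to something like the sketch below. It is a minimal illustration and not Hudson's API: BuildGate and the map of test results are made-up names, and treating a build without any tests as rejected is my own assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical gate: a build is accepted only when every test passed.
class BuildGate {

    // testResults maps a test name to true (passed) or false (failed).
    static boolean accept(Map<String, Boolean> testResults) {
        if (testResults.isEmpty()) {
            return false; // assumption: a build with no tests proves nothing
        }
        for (boolean passed : testResults.values()) {
            if (!passed) {
                return false; // no compromises: one failure rejects the build
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Boolean> results = new LinkedHashMap<>();
        results.put("parsesValidInput", true);
        results.put("rejectsNullInput", false);
        System.out.println(accept(results)); // false, one test failed
    }
}
```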

Additionally, Continuous Integration systems aren't just for compiling code; they should also be used for tracking the software project's metrics. For Hudson CI I can recommend all these plugins:

Checkstyle - Checks whether the source code is written in the way you define. A big part of writing maintainable code is using common conventions.
Cobertura - Code coverage metrics, very useful for seeing how coverage develops over time. Also, in keeping with the "source is God" mentality, it allows you to discard builds if coverage falls below a certain level.
Task Scanner - Simple but sweet: scans for specific tags such as BUG, TODO, NOTE etc. in your code and creates a list of them for everyone to read. A simple way to track short notes, known bugs which need fixing, or whatever you can come up with.
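
To show what a task scanner does under the hood, here is a rough sketch of the idea. It is not the plugin's code: TaskTagScanner is a hypothetical name, the tag list is taken from the description above, and for simplicity it scans whole lines rather than only comments.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical scanner: lists lines that carry a BUG, TODO or NOTE tag.
class TaskTagScanner {

    // Word boundaries keep "DEBUG" from being reported as "BUG".
    private static final Pattern TAG =
            Pattern.compile("\\b(BUG|TODO|NOTE)\\b[:\\s]*(.*)");

    // Returns entries of the form "<line number>: <TAG> <text after the tag>".
    static List<String> scan(List<String> lines) {
        List<String> found = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            Matcher m = TAG.matcher(lines.get(i));
            if (m.find()) {
                found.add((i + 1) + ": " + m.group(1) + " " + m.group(2).trim());
            }
        }
        return found;
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList(
                "int x = 0; // TODO: remove magic number",
                "log.debug(\"DEBUG mode\");",
                "// BUG fails on empty input");
        for (String hit : scan(source)) {
            System.out.println(hit);
        }
        // prints:
        // 1: TODO remove magic number
        // 3: BUG fails on empty input
    }
}
```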

Project structure and Dependency Management

This is a controversial one. Basically everyone agrees that having a unified structure is great, but since there are several camps with different requirements, habits and views on the issue, they tend to disagree. For example, Maven people really believe that there's only one way - the Maven way - to do things and that's it, while Ivy supporters believe that the project structure shouldn't be hammered down your throat by external parties; only the dependencies need to be managed properly and in a unified manner. Just so it's not left unclear: our company simply loves Ivy.

So, since we don't use a project structure imposed by external parties, I'm going to tell you a bit about how we arrived at our current project structure.

In the beginning we used separate projects for the actual software and the related tests (usually named Product and Product_TEST). This is very close to what you have: one huge directory for everything, with the dependencies included directly in the directory as JARs. We checked out both projects from CVS and then linked the actual project to the test project in Eclipse as a runtime dependency. A bit clunky, but it worked.

We soon came to realize that these extra steps were completely unnecessary, since with Ant - which, by the way, you can invoke directly from Hudson - we could tell the JAR/WAR building step to ignore everything by either file name (say, everything that ends with Test or TestCase) or by source folder. Pretty soon we converted our software projects to use a simple structure with two root folders, src and test. We haven't looked back since. The only debate we currently have is whether we should allow a third folder called spikes in our standard project structure, and that's not a very heated debate at all.
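
In Ant itself this kind of exclusion is normally written as exclude patterns on the packaging step (something along the lines of `**/*Test.class`). The rule behind it can be sketched in plain Java as follows; PackagingFilter is a hypothetical name, and the assumption is that paths are given relative to the project root with src and test as the two root folders.

```java
// Hypothetical filter: decides whether a file belongs in the JAR/WAR.
class PackagingFilter {

    static boolean shouldPackage(String path) {
        String normalized = path.replace('\\', '/');

        // Exclusion by source folder: nothing under test/ is packaged.
        if (normalized.startsWith("test/")) {
            return false;
        }

        // Exclusion by file name: strip the directory and the extension,
        // then look at how the class name ends.
        String name = normalized.substring(normalized.lastIndexOf('/') + 1);
        int dot = name.lastIndexOf('.');
        if (dot >= 0) {
            name = name.substring(0, dot);
        }
        return !(name.endsWith("Test") || name.endsWith("TestCase"));
    }

    public static void main(String[] args) {
        System.out.println(shouldPackage("src/com/example/Invoice.java"));     // true
        System.out.println(shouldPackage("src/com/example/InvoiceTest.java")); // false
        System.out.println(shouldPackage("test/com/example/Helpers.java"));    // false
    }
}
```

The name-based rule only works if nobody gives a production class a name ending in Test, which is one reason the folder-based split is the simpler convention.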

This has worked tremendously well and doesn't require any additional support or plugins from any of the IDEs out there, which is a great plus - the number two reason we didn't choose Maven was seeing how M2Eclipse basically took over Eclipse. And since you must be wondering, the number one reason for rejecting Maven was the clunkiness of Maven itself: the endless amount of lengthy XML declarations for configuration and the related learning curve were considered too big a cost for what we would get from using it.

Rather interestingly, committing to Ivy instead of Maven later allowed us a smooth shift to some Grails development, which uses folder and class names as conventions for just about everything when structuring the web application.

Also, a final note about Maven: while it claims to promote convention over configuration, if you don't want to do things exactly the way Maven's structure says you should, you're in a world of pain for the aforementioned reasons. Certainly that's an expected side effect of having conventions, but no convention should be final; there always has to be at least some room for changes, bending the rules or choosing the appropriate option from a certain set.

In short, my opinion is that Maven is a bazooka, you work in a house and your ultimate goal is to have it bug free. Each of these is good on its own, and they work even if you pick any two of them, but the three together just don't work.

Final words

As long as you have fewer than 10 code-centric people, you have all the flexibility needed to make the important decisions. When you go beyond that, you have to live with whatever choices you've made, no matter how good or bad they are. Don't just believe things you hear on the Internet; sit down and test everything rigorously - heck, our senior tech guy even wrote his bachelor's thesis about Java web frameworks just to figure out which one we should use - and figure out what you really need. Don't commit to anything just because you may need some of the functionality it provides in the distant future; pick the things that have the lowest possible negative impact on the whole company. Being the 10th person hired at the company I work at, I can sign everything in this paragraph with my own blood: we currently have 16+ people working, and changing certain conventions would actually be a bit scary at this point.

Esko 2009-12-25 20:33:22

Great answer, thank you for it and for your time.

teto 2009-12-27 16:27:39

Thanks. While typing this I noticed that I could write a lot more about the subject, but unfortunately I'm preoccupied with a lot of other things which prevent me from doing that. I'm going to put it on my list of things to do, however. Oh, and maybe I should tone down my Maven dislike a bit.

Esko 2009-12-27 18:26:57

+1 A:

Here's what I'm using right now, but I will probably switch a few parts (see the end of this post).

Eclipse as IDE with a few plugins: JADClipse (to decompile .class files on the fly, pretty useful), DBViewer for quick access to the database from within Eclipse, WTP (Web Tools Platform) for running Tomcat 6 as a development web server (pretty fast), and Mylyn (linked with the JIRA bug tracker).

I'm also wondering about "IDE-independent projects". Right now we are all stuck on Eclipse - Eclipse project files (.project, .classpath, .settings) are even committed to the CVS repository (in order to have a project fully ready once checked out) - but with NetBeans, supported by Sun and running faster and faster with each release (and each new JRE version), the question isn't closed.

CVS for storing projects, with nearly no branches (only for patches).

In production I work with an Oracle RDBMS, but I use HSQLDB on my development computer to make the test, build and development process much faster (with the help of the open-source DDLUtils tool to ease database creation and data loading). Otherwise I use SQLWorkbench for quick DB tasks (including schema comparison) or the (free) Oracle SQL Developer for some Oracle-specific tasks (like investigating session locks and so on).

Tests are JUnit tests only (either simple unit test cases or more complex ones, nearly "integration" tests), nearly always running on HSQLDB so they run faster.

My build system is Ant (launched from Eclipse) for various small tasks (uploading a WAR to a remote server, for example) and (mainly) Maven 2 for:

the build process
the publishing of the released artefacts
the publishing of the project's web site (including reports)
launching test campaigns (launched every night)

The continuous integration front-end is Luntbuild, and the front-end for the Maven repository is Archiva.

All this works. But I'm pretty disappointed by a few elements of this ecosystem.

Mainly Maven: it's just too time-consuming and I have a lot of grievances with this tool. Dependency conflict resolution is a joke. There are a lot of XML lines in every pom.xml, redundant across projects (even with the help of a few root POMs). Plugins are way too inconsistent and buggy, and it's really difficult to find clear documentation explaining what has to be configured, and so on.

So I'm thinking about switching from Maven to Ant+Ivy. From what I've seen so far, it seems pretty cool (there are various conflict managers for dependency conflict resolution, and you can even write your own), and there is no need to have an additional tool installed and configured (Ant runs natively under Eclipse, whereas Maven needs a separate plugin - I've tried the three Maven plugins, by the way, and found all three of them buggy).
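
To make "conflict manager" a bit more concrete: when two libraries ask for different versions of the same dependency, something has to pick one. One common strategy is to keep the highest requested version. The sketch below only illustrates that idea and is not Ivy's implementation or API; the class name is hypothetical and it assumes purely numeric, dot-separated versions (no "-SNAPSHOT" or "rc1" suffixes).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical "latest version wins" conflict resolution.
class LatestVersionConflictManager {

    // Compares dotted numeric versions part by part, so "1.10" > "1.9.3".
    // Missing parts count as 0, so "2" equals "2.0".
    static int compare(String a, String b) {
        String[] pa = a.split("\\.");
        String[] pb = b.split("\\.");
        int n = Math.max(pa.length, pb.length);
        for (int i = 0; i < n; i++) {
            int x = i < pa.length ? Integer.parseInt(pa[i]) : 0;
            int y = i < pb.length ? Integer.parseInt(pb[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    // Picks the highest of all requested versions of one dependency.
    static String resolve(List<String> requestedVersions) {
        String winner = requestedVersions.get(0);
        for (String candidate : requestedVersions) {
            if (compare(candidate, winner) > 0) {
                winner = candidate;
            }
        }
        return winner;
    }

    public static void main(String[] args) {
        System.out.println(resolve(Arrays.asList("1.2", "1.10", "1.9.3"))); // 1.10
    }
}
```

Other strategies (fail on any conflict, always keep a pinned version) fit the same shape, which is why being able to plug in your own manager is attractive.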

However, Maven 3 is on its way. I'll give it a try, but I don't expect it to be fundamentally different from Maven 2.

Hudson would seem a better choice than Luntbuild, too, but this part won't be changed for now.

And Subversion will probably replace CVS in the near future (even though I have nearly no trouble with CVS).

Sergio 2009-12-28 13:16:24

Sergio, if you have no issues with CVS, why switch? We've recently switched on a large project, and besides losing all the previous history, SVN doesn't seem to be as well supported in Eclipse (we use Subversive). In particular, replacing one checked-out version with another is extremely awkward: instead of Team->Replace, it's "delete entire project, and check out new version".

CPerkins 2009-12-28 16:38:48

For a few (minor) points: mainly the LDAP authentication available with Subversion rather than CVS's internal accounts, and a few advantages for the team working on the infrastructure (I don't know the details, but they think that Subversion is easier to administer and to back up / restore in case of a system crash).

Sergio 2009-12-30 14:39:23

Strongly consider a distributed versioning system.

Thorbjørn Ravn Andersen 2010-06-25 08:04:37

The single best thing you can do without disrupting other people and their way of working is to set up Hudson to watch the CVS repository for each of your projects. Just doing that will give you a central place to see CVS commit messages.

The next step is getting these projects to compile under Hudson. For Eclipse this typically means either switching to Ant or - as we did - using ant4eclipse to model the existing Eclipse build process. Not easy, but very worthwhile. Remember to send out mails when the build breaks - this is extremely important. Ant4eclipse requires team project sets - introducing them in your organization will make your colleagues happy the next time they need to set up a fresh workspace.

When your stuff builds properly whenever somebody commits changes, consider making that automatically built code the code that actually goes to the customer. As it was built on the build server and not on a developer's machine, you know that you can reproduce the build. That is invaluable in a "hey, fix this ancient version" situation.

Thorbjørn Ravn Andersen 2010-06-25 06:59:29
