Help with TDD approach to a real world problem: linker

I'm trying to learn TDD. I've seen examples and discussions about how it's easy to TDD a coffee vending machine firmware from the smallest possible functionality up. These examples are either primitive or very well thought-out; it's hard to tell right away. But here's a real world problem.

Linker.

A linker, at its simplest, reads one object file, does magic, and writes one executable file. I don't think I can simplify it further. I do believe the linker design may be evolved, but I have absolutely no idea where to start. Any ideas on how to approach this?

Well, probably the whole linker is too big a problem for the first unit test. I can envision some rough structure beforehand. What a linker does is:

1. Represents an object file as a collection of segments. Segments contain code, data, symbol definitions and references, debug information, etc.
2. Builds a reference graph and decides which segments to keep.
3. Packs the remaining segments into a contiguous address space according to some rules.
4. Relocates references.
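
For illustration only, here is a minimal Python sketch of the "regular data structure" idea and of the second item in the list above (building a reference graph and deciding which segments to keep). The `Segment` class, its fields, and `segments_to_keep` are hypothetical names invented for this sketch, not part of any real linker. It assumes each symbol is defined in at most one segment and silently ignores undefined symbols.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """Hypothetical format-independent view of one object-file segment."""
    name: str
    data: bytes = b""
    defines: set = field(default_factory=set)      # symbols defined here
    references: set = field(default_factory=set)   # symbols used here


def segments_to_keep(segments, entry_symbol):
    """Return the names of segments reachable from the entry symbol."""
    # Map each symbol to the segment that defines it
    # (assumption: no symbol is defined twice).
    definer = {sym: seg for seg in segments for sym in seg.defines}
    keep = set()
    pending = [entry_symbol]
    while pending:
        seg = definer.get(pending.pop())
        if seg is None or seg.name in keep:
            continue  # undefined symbol, or segment already visited
        keep.add(seg.name)
        pending.extend(seg.references)
    return keep
```

Because this function only sees `Segment` objects, it can be tested without any object file on disk, which is what makes the later stages comparatively easy to drive with tests.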

My main problem is with bullet 1. Bullets 2, 3, and 4 basically take a regular data structure and convert it into a platform-dependent mess based on some configuration. I can design that, and the design looks feasible. But bullet 1 has to pick up a platform-dependent mess, in one of several supported formats, and convert it into a regular structure.

The task looks generic enough. It happens everywhere you need to support multiple input formats, be it image processing, document processing, you name it. Is it possible to TDD this? It seems like either test is too simple and I easily hack it to green, or it's a bit more complex and I need to implement the whole object/image/document format reader which is a lot of code. And there is no middle ground.

This is all very possible.
An example off the top of my head is NHAML.

It is an ASP.NET view engine that converts plain text to native .NET code.

You can have a look at source code and see how it is tested.

Dmytrii Nagirniak 2009-11-18 02:50:15

Early versions of NHAML didn't have any unit tests. The tests were added to cover an existing system after it had been designed to some extent. I, on the contrary, want to try TDD from scratch. My problem is that I cannot carve out a small enough task that would do something useful, and looking at mature projects cannot help here IMHO.

SnakE 2009-11-18 04:48:14

I guess what I do is come up with layers and blocks and sub-divide to the point where I might be thinking about code and then start writing tests.

I think your tests should be quite simple: it's not the individual tests that are the power of TDD but the sum of the tests.

One of the principles I follow is that a method should fit on a screen - when that's the case, the tests are usually simple enough.

Your design should allow you to mock out lower layers so that you're only testing one layer.

serialhobbyist 2009-11-18 13:50:48

I kind of understand all that. But then again, your approach doesn't look like test-driven design. It's more like designing with tests in mind, which is something rather different.

SnakE 2009-11-19 13:46:14

TDD is about specification, not test.

From your simplest spec of a linker, your TDD test just has to check whether an executable file has been created by the linker magic when you feed it an object file.

Then you write a linker that makes your test succeed, e.g.:

check whether input file is an object file
if so, generate a "Hello World!" executable (note that your spec didn't specify that different object files would produce different executables)
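
A minimal Python sketch of that first test and of the "hack" that makes it pass might look like this. Everything here is an assumption made for illustration: `link`, the `OBJ0` magic bytes, and the canned output are invented, and the output is just placeholder bytes rather than a real "Hello World!" executable.

```python
import os
import tempfile

OBJECT_MAGIC = b"OBJ0"  # hypothetical magic bytes, not a real object format


def link(object_path, output_path):
    """Deliberately naive linker: validate the input, emit canned output."""
    with open(object_path, "rb") as f:
        header = f.read(len(OBJECT_MAGIC))
    if header != OBJECT_MAGIC:
        raise ValueError("not an object file")
    # The spec does not yet say that different inputs give different
    # outputs, so the same placeholder bytes are written every time.
    with open(output_path, "wb") as f:
        f.write(b"canned executable placeholder")


def test_link_creates_executable():
    with tempfile.TemporaryDirectory() as tmp:
        obj = os.path.join(tmp, "in.o")
        exe = os.path.join(tmp, "out")
        with open(obj, "wb") as f:
            f.write(OBJECT_MAGIC + b"\x00")
        link(obj, exe)
        assert os.path.exists(exe)
```

The next refinement of the spec would add a test that this implementation cannot pass, for example one that expects two different inputs to produce different outputs.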

Then you refine your spec and your TDD (these are your four bullets).

As long as you can write a specification you can write TDD test cases.

mouviciel 2009-11-18 14:07:04

There is going to be a moment when I'll *have* to actually implement the object file parsing to pass my test. And parsing an object file format is a lot of work, so my project is going to be red for a while. I think this is going to happen with any high-level block. How do you avoid this?

SnakE 2009-11-19 14:02:18

I tend to associate TDD with agile methods, where big projects are divided into small incremental steps. At the end of each step a new feature has been specified (including TDD test cases), developed, and tested with those TDD test cases.

mouviciel 2009-11-19 15:12:24

First, have a look at "Growing Object-Oriented Software, Guided by Tests" by Freeman & Pryce.

Now, my attempt to answer a difficult question in a few lines.

TDD does require you to think (i.e. design) what you're going to do. You have to:

1. Think in small steps. Very small steps.
2. Write a short test to prove that the next small piece of behaviour works.
3. Run the test to show that it fails.
4. Do the simplest thing possible to get the test to pass.
5. Refactor ruthlessly to remove duplication and improve the structure of the code.
6. Run the test(s) again to make sure it all still works.
7. Go back to step 1.

An initial idea (design) of how your linker might be structured will guide your initial tests. The tests will enforce a modular design (because each test is only testing a single behaviour, and there should be minimal dependencies on other code you've written).

As you proceed you may find your ideas change. The tests you've already written will allow you to refactor with confidence.

The tests should be simple. It is easy to 'hack' a single test to green. But after each 'hack' you refactor. If you see the need for a new class or algorithm during the refactoring, then write tests to drive out its interface. Make sure that the tests only ever test a single behaviour by keeping your modules loosely coupled (dependency injection, abstract base classes, interfaces, function pointers etc.) and use fakes, stubs and mocks to isolate the code under test from the rest of your system.
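
As a rough Python sketch of that isolation, assume a hypothetical layout function that receives its object-file reader as a parameter (dependency injection). A test can then pass in a stub and never touch a real file format. `FakeReader`, `assign_addresses`, and the `(name, data)` pair shape are all invented for this illustration, and alignment is ignored.

```python
class FakeReader:
    """Stub standing in for a real object-file parser."""

    def __init__(self, segments):
        self._segments = segments

    def read(self, path):
        # The path is ignored: the stub always returns the canned segments.
        return self._segments


def assign_addresses(reader, path, base=0):
    """Pack segments back to back starting at `base` (no alignment)."""
    address = base
    layout = {}
    for name, data in reader.read(path):
        layout[name] = address
        address += len(data)
    return layout


def test_segments_are_packed_contiguously():
    reader = FakeReader([(".text", b"\x90" * 4), (".data", b"\x00" * 2)])
    layout = assign_addresses(reader, "ignored.o", base=0x1000)
    assert layout == {".text": 0x1000, ".data": 0x1004}
```

The test exercises only the packing behaviour; a bug in any real parser cannot make it fail, which keeps each test focused on a single behaviour.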

Finally, use 'customer' tests to ensure that you have delivered functional features.

It's a difficult change in mind-set, but a lot of fun and very rewarding. Honest.

Seb Rose 2009-11-20 13:02:46

Thank you for the book reference. It's an interesting read. The authors seem to address exactly the concerns I'm having.

SnakE 2009-11-21 00:25:20

You're right, a linker seems a bit bigger than a 'unit' to me, and TDD does not excuse you from sitting down and thinking about how you're going to break down your problem into units. The Sudoku saga is a good illustration of what goes wrong if you don't think first!

Concentrating on your point 1, you have already described a good collection of units (of functionality) by listing the kinds of things that can appear in segments, and hinting that you need to support multiple formats. Why not start by dealing with a simple case like, say, a file containing just a data segment in the binary format of your development platform? You could simply hard-code the file as a binary array in your test, and then check that it interprets just that correctly. Then pick another simple case, and test for that. Keep going.
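
Here is a minimal Python sketch of such a test. It does not use a real platform format: the `TOY1` layout (4-byte magic, 1-byte segment kind, 2-byte little-endian length, then the payload) is made up for illustration, as are `parse_object` and `KIND_DATA`. A real test would embed a few bytes of the actual target format instead.

```python
import struct

MAGIC = b"TOY1"   # hypothetical magic for a made-up format
KIND_DATA = 1     # hypothetical code for "data segment"


def parse_object(blob):
    """Parse a toy object file that holds exactly one segment."""
    if blob[:4] != MAGIC:
        raise ValueError("bad magic")
    # One unsigned byte (kind) and one little-endian uint16 (length),
    # starting right after the 4-byte magic.
    kind, length = struct.unpack_from("<BH", blob, 4)
    payload = blob[7:7 + length]
    if len(payload) != length:
        raise ValueError("truncated segment")
    return [(kind, payload)]


def test_single_data_segment():
    blob = bytes([
        0x54, 0x4F, 0x59, 0x31,  # magic "TOY1"
        0x01,                    # kind: data
        0x03, 0x00,              # payload length 3, little-endian
        0xAA, 0xBB, 0xCC,        # payload
    ])
    assert parse_object(blob) == [(KIND_DATA, b"\xaa\xbb\xcc")]
```

The next test might embed a file with two segments, which this parser cannot handle, and so force the single-segment code to grow into a loop.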

Now the magic bit is that pretty soon you'll see repeated structures in your code and in your tests, and because you've got tests you can be quite aggressive about refactoring it away. I suspect this is the bit that you haven't experienced yet, because you say "It seems like either test is too simple and I easily hack it to green, or it's a bit more complex and I need to implement the whole object/image/document format reader which is a lot of code. And there is no middle ground." The point is that you should hack them all to green, but as you're doing that you are also searching out the patterns in your hacks.

I wrote a (very simple) compiler in this fashion, and it mostly worked quite well. For each syntactic construction, I wrote the smallest program I could think of that used it in some observable way, and had the test compile the program and check that it worked as expected. I used a proper parser generator, as you can't plausibly TDD your way into one of those (you need to use a little forethought!). After about three cycles, it became obvious that I was repeating the code to walk the syntax tree, so that was refactored into something like a Visitor.

I also had larger-scale acceptance tests, but in the end I don't think these caught much that the unit tests didn't.

Dave Turner 2009-11-20 14:52:46

Thank you, I think I'm starting to get the idea. At least I've got some initial tests already running and passing. They took a very feasible amount of time to write, they test something useful, and the code they test does something useful. I actually embed tiny object files, only several bytes long, into the tests and make sure they are parsed as I expect.

SnakE 2009-11-22 13:56:48
