There are several ways to generate data for tests (not only unit tests), for example Object Mother, builders, etc. Another useful approach is to write test data as plain text:

product: Main; prices: 145, 255; Expire: 10-Apr-2011; qty: 2; includes: Sub
product: Sub; prices: 145, 255; Expire: 10-Apr-2011; qty: 2

and then parse it into C# objects. This is easy to use in unit tests (because deeply nested collections can be written on a single line), and it is even more convenient in a FitNesse-like system (because this DSL fits naturally into a wiki), and so on.

So I use this approach and write a parser, but it's tedious to write one each time. I'm no expert in DSLs or language parsers, but I think they could help here. What would be the right one to use? I have only heard about:

  • DSL (I mean, any DSL)
  • Boo (that I think can do DSL)
  • ANTLR

but I don't even know which one to pick and where to start.

So the question: is it reasonable to use some kind of DSL to generate test data? What would you suggest? Are there any existing cases?

Update: it seems I was not clear enough. This is not about raw string-to-object conversion. Look at the first line and compare it to

var main = Product.New("Main")
   .AddPrice(Price.New(145).WithType(PriceType.Main).AndQty(2))
   .AddPrice(Price.New(255).WithType(PriceType.Maintenance).AndQty(2))
   .Expiration(new DateTime(2011, 4, 10));  // DateTime takes (year, month, day)
var sub = Product
   .New("Sub").Parent(main)
   .AddPrice(...);
main.AddSubProduct(sub);
products.Add(main);
products.Add(sub);

And note that I first create the sub product and then add it to main, even though they are listed in the reverse order. Prices are handled in a special way. I want to specify the name of the Sub product and get a reference to the created object. I want to list all product properties, FLAT and NON-REPETITIVE, on a single line. I want to use defaults for properties. And so on.
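The requirements above (references by name that may point forward, positional prices, defaults) can be met by a small hand-written two-pass parser, without a parser generator. The following is only a sketch under stated assumptions: the `Product` class here is a hypothetical stand-in with public fields, not the real domain class, and `ProductDsl`, the default `Qty = 1`, and the rule "a single price applies to both Main and Maintenance" are illustrative choices.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical stand-in for the domain class; the real Product/Price API is not fully shown above.
public class Product
{
    public string Name;
    public decimal MainPrice, MaintenancePrice;
    public DateTime? Expires;
    public int Qty = 1;                        // assumed default when "qty" is omitted
    public Product Parent;
    public List<Product> SubProducts = new List<Product>();
}

public static class ProductDsl
{
    public static Dictionary<string, Product> Parse(string text)
    {
        var products = new Dictionary<string, Product>();
        // (parent, name of included product) pairs, resolved after every line is read
        var pendingIncludes = new List<KeyValuePair<Product, string>>();

        foreach (var line in text.Split(new[] { '\n' }, StringSplitOptions.RemoveEmptyEntries))
        {
            // Pass 1a: "key: value; key: value" becomes a case-insensitive field map.
            var fields = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
            foreach (var part in line.Split(';'))
            {
                var kv = part.Split(new[] { ':' }, 2);
                if (kv.Length == 2) fields[kv[0].Trim()] = kv[1].Trim();
            }
            if (!fields.ContainsKey("product")) continue;   // skip lines without a product name

            // Pass 1b: build the object, applying positional and default rules.
            var p = new Product { Name = fields["product"] };
            string v;
            if (fields.TryGetValue("prices", out v))
            {
                var prices = v.Split(',')
                              .Select(s => decimal.Parse(s.Trim(), CultureInfo.InvariantCulture))
                              .ToArray();
                p.MainPrice = prices[0];
                // Assumed rule: one price given means it is used for maintenance as well.
                p.MaintenancePrice = prices.Length > 1 ? prices[1] : prices[0];
            }
            if (fields.TryGetValue("expire", out v))
                p.Expires = DateTime.ParseExact(v, "d-MMM-yyyy", CultureInfo.InvariantCulture);
            if (fields.TryGetValue("qty", out v))
                p.Qty = int.Parse(v, CultureInfo.InvariantCulture);
            if (fields.TryGetValue("includes", out v))
                foreach (var name in v.Split(','))
                    pendingIncludes.Add(new KeyValuePair<Product, string>(p, name.Trim()));

            products[p.Name] = p;
        }

        // Pass 2: resolve "includes" by name, so a product may be referenced before its own line.
        foreach (var inc in pendingIncludes)
        {
            var sub = products[inc.Value];     // throws KeyNotFoundException for an unknown name
            sub.Parent = inc.Key;
            inc.Key.SubProducts.Add(sub);
        }
        return products;
    }
}
```

Calling `ProductDsl.Parse` on the two lines from the question returns a dictionary keyed by product name, with `Main.SubProducts` holding a reference to the `Sub` object. A real version would call the domain builders (`AddPrice`, `Parent`, and so on) instead of setting fields.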

Update: I'm not convinced that I should avoid a DSL, because all the alternative examples are too verbose and not user-friendly. And no one has said anything useful about DSLs.

+1  A: 

I would first start by seeing if my language of choice was rich enough to build my DSL. C# ought to handle your case quite easily:

Product[] products = new Product[] {
    new TestProduct{product="Main", prices=new[]{145, 255}, Expire="10-Apr-2011", qty=2, includes="Sub"},
    new TestProduct{product="Sub", prices=new[]{145, 255}, Expire="10-Apr-2011", qty=2}
};

Not quite as pretty, but certainly tolerable enough that I would struggle to justify the extra effort of a custom DSL.

Also note that Expire is initialised with a string, but it is obviously a date. This is perfectly reasonable for a DSL idiom, since TestProduct.Expire's setter can do the translation.
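A minimal sketch of how such setters could look. The `Product` class below is a hypothetical stand-in that exposes only `AddPrice`, and the rule for a single price (use it for both entries) is an assumption for illustration; `includes` is left out because resolving it needs a lookup of other products.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical minimal domain class: prices can only be added one at a time.
public class Product
{
    public string Name;
    public DateTime Expiration;
    public int Quantity = 1;
    public List<decimal> Prices = new List<decimal>();
    public void AddPrice(decimal amount) { Prices.Add(amount); }
}

// Test-only subclass whose write-only properties translate the compact
// notation into calls on the unmodified base class.
public class TestProduct : Product
{
    public string product { set { Name = value; } }

    public int qty { set { Quantity = value; } }

    // Accepts the date as text, e.g. "10-Apr-2011".
    public string Expire
    {
        set { Expiration = DateTime.ParseExact(value, "d-MMM-yyyy", CultureInfo.InvariantCulture); }
    }

    // The setter sees the whole array, so it can tell one price from two.
    public int[] prices
    {
        set
        {
            if (value.Length == 0) return;
            AddPrice(value[0]);                                 // first entry: main price
            AddPrice(value.Length > 1 ? value[1] : value[0]);   // second entry, or reuse the first
        }
    }
}
```

With this in place, `new TestProduct { product = "Sub", prices = new[] { 145 }, Expire = "10-Apr-2011" }` yields a product with two price entries of 145 and a parsed expiration date.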

Marcelo Cantos
It's a lot more verbose, especially once you add that prices are not just ints and includes are not strings but references to other products (see update). I want to write "prices: 145, 255" and have the parser know that the first is the Main product price and the second is the Maintenance product price, without writing these details each time.
queen3
And if I write "prices: 145", the parser should be smart enough to set both prices to the same value. Or maybe set maintenance to half the price. Once you add such details, you'll see the need for a DSL.
queen3
@queen3: Regarding your first comment, I agree; whether a DSL is worth it comes down to your own pain-threshold for line-noise. Your second point is incorrect, however; the setter for `TestProduct.prices` can easily see whether one or two prices have been supplied and apply your choice of rules accordingly.
Marcelo Cantos
No, it can't; I wouldn't alter domain classes to fit test needs. E.g. the domain class has AddPrice() but not SetPrices(); AddPrice doesn't know whether there are 1 or 2 prices.
queen3
`TestProduct` is a derived class of `Product`. It should be able to populate its base class in exactly the same way that your DSL would.
Marcelo Cantos
OK, I could make a more complex example, but why? Instead, consider the other part, which is passing DSL lines to a FitNesse-like wiki system. Not only is a DSL easier for users (and developers, too) to use; I would also rather parse definitions than compile them on the fly. But I accept your answer as "No, don't use DSL".
queen3
I'm not trying to convince you not to use a DSL. I'm simply pointing out that it's a cost-benefit tradeoff that depends on the complexity of the concepts you are trying to express and the richness of the host language. And yes, if you have a non-technical audience, then of course that tilts the scales in favour of a custom DSL.
Marcelo Cantos
+2  A: 

For a data DSL, YAML is an excellent candidate. Here is a sample from Wikipedia:

---
receipt:     Oz-Ware Purchase Invoice
date:        2007-08-06
customer:
    given:   Dorothy
    family:  Gale

items:
    - part_no:   A4786
      descrip:   Water Bucket (Filled)
      price:     1.47
      quantity:  4

    - part_no:   E1628
      descrip:   High Heeled "Ruby" Slippers
      price:     100.27
      quantity:  1

bill-to:  &id001
    street: |
            123 Tornado Alley
            Suite 16
    city:   East Westville
    state:  KS

ship-to:  *id001

specialDelivery:  >
    Follow the Yellow Brick
    Road to the Emerald City.
    Pay no attention to the
    man behind the curtain.

I have used YAML in several projects and am happy with it.

However, if we are talking about unit tests, it is usually simpler and more readable to construct the necessary objects “by hand”, with constructors and property assignments in place. This is because unit tests are by nature highly focused on one piece of code (the unit), and it shouldn't be hard to create just enough data infrastructure for the test. It is OK to operate on half-complete entities in unit tests; don't bother constructing data that is unrelated to the test at hand.

For functional tests YAML is great.

nailxx
It's multiline, which is worse to use in my opinion.
queen3
A single line remains readable only in simple cases where the data is flat. How would you represent nested structures on a single line?
nailxx
Simple cases are most of the cases when writing tests (just as you said). With a DSL we can use a single line for the 80% of simple cases and still go multiline for the remaining 20%; with a multiline format we're forced to be verbose even for simple cases. This is especially bad for FitNesse-like wiki systems where test data is entered into text tables.
queen3
A: 

For creating an external DSL I would recommend Eclipse TMF Xtext, which is really good (it is based on ANTLR but simpler). It is built on top of Eclipse and Java, but you can generate any code with it. When it comes to creating test data, I was inspired by the way the Ruby on Rails guys do it, which is YAML fixtures, as mentioned in another answer. I have also seen an approach using factories, which can help you get rid of some of the duplication and inflexibility. Look at Railscasts 158: Factories not Fixtures; it might give you some ideas for designing the DSL.

Gabriel Ščerbák