So you want examples? Last semester I took a compilers course. In it we had to write a register allocator. To put it in simple terms, my program can be summarized like this:
Input: A file written in ILOC, a pseudo-assembly language that was made up for my textbook. The instructions in the file have register names like "r<number>
". The problem is the program uses as many registers as it needs, which is usually greater than the number of registers on the target machine.
Output: Another file written in ILOC. This time, the instructions are rewritten so that it uses the correct max number of registers that are allowed.
In order to write this program, I had to make a class that could parse an ILOC file. I wrote a bunch of tests for that class. Below are my tests (I actually had more, but got rid of them to help shorten this. I also added some comments to help you read it). I did the project in C++, so I used Google's C++ testing framework (googletest) located here.
Before showing you the code... let me say something about the basic structure. Essentially, there is a test class. You get to put a bunch of the general setup stuff in the test class. Then there are test macros called TEST_F's. The testing framework picks up on these and understands that they need to be run as tests. Each TEST_F has 2 arguments, the test class name, and the name of the test (which should be very descriptive... that way if the test fails, you know exactly what failed). You will see the structure of each test is similar: (1) set up some initial stuff, (2) run the method you are testing, (3) verify the output is correct. The way you check (3) is by using macros like EXPECT_*. EXPECT_EQ(expected, result)
checks that result
is equal to the expected
. If it is not, you get a useful error message like "result was blah, but expected Blah".
Here is the code (I hope this isn't terribly confusing... it is certainly not a short or easy example, but if you take the time you should be able to follow and get the general flavor of how it works).
// Unit tests for the iloc_parser.{h, cc}
#include <fstream>
#include <iostream>
#include <gtest/gtest.h>
#include <sstream>
#include <string>
#include <vector>
#include "iloc_parser.h"
using namespace std;
namespace compilers {
// Here is my test class
class IlocParserTest : public testing::Test {
protected:
IlocParserTest() {}
virtual ~IlocParserTest() {}
virtual void SetUp() {
const testing::TestInfo* const test_info =
testing::UnitTest::GetInstance()->current_test_info();
test_name_ = test_info->name();
}
string test_name_;
};
// Here is a utility function to help me test
static void ReadFileAsString(const string& filename, string* output) {
ifstream in_file(filename.c_str());
stringstream result("");
string temp;
while (getline(in_file, temp)) {
result << temp << endl;
}
*output = result.str();
}
// All of these TEST_F things are macros that are part of the test framework I used.
// Just think of them as test functions. The argument is the name of the test class.
// The second one is the name of the test (A descriptive name so you know what it is
// testing).
TEST_F(IlocParserTest, ReplaceSingleInstanceOfSingleCharWithEmptyString) {
string to_replace = "blah,blah";
string to_find = ",";
string replace_with = "";
IlocParser::FindAndReplace(to_find, replace_with, &to_replace);
EXPECT_EQ("blahblah", to_replace);
}
TEST_F(IlocParserTest, ReplaceMultipleInstancesOfSingleCharWithEmptyString) {
string to_replace = "blah,blah,blah";
string to_find = ",";
string replace_with = "";
IlocParser::FindAndReplace(to_find, replace_with, &to_replace);
EXPECT_EQ("blahblahblah", to_replace);
}
TEST_F(IlocParserTest,
ReplaceMultipleInstancesOfMultipleCharsWithEmptyString) {
string to_replace = "blah=>blah=>blah";
string to_find = "=>";
string replace_with = "";
IlocParser::FindAndReplace(to_find, replace_with, &to_replace);
EXPECT_EQ("blahblahblah", to_replace);
}
// This test was suppsoed to strip out the "r" from register
// register names in the ILOC code.
TEST_F(IlocParserTest, StripIlocLineLoadI) {
string iloc_line = "loadI\t1028\t=> r11";
IlocParser::StripIlocLine(&iloc_line);
EXPECT_EQ("loadI\t1028\t 11", iloc_line);
}
// Here I make sure stripping the line works when it has a comment
TEST_F(IlocParserTest, StripIlocLineSubWithComment) {
string iloc_line = "sub\tr12, r10\t=> r13 // Subtract r10 from r12\n";
IlocParser::StripIlocLine(&iloc_line);
EXPECT_EQ("sub\t12 10\t 13 ", iloc_line);
}
// Here I make sure I can break a line up into the tokens I wanted.
TEST_F(IlocParserTest, TokenizeIlocLineNormalInstruction) {
string iloc_line = "sub\t12 10\t 13\n"; // already stripped
vector<string> tokens;
IlocParser::TokenizeIlocLine(iloc_line, &tokens);
EXPECT_EQ(4, tokens.size());
EXPECT_EQ("sub", tokens[0]);
EXPECT_EQ("12", tokens[1]);
EXPECT_EQ("10", tokens[2]);
EXPECT_EQ("13", tokens[3]);
}
// Here I make sure I can create an instruction from the tokens
TEST_F(IlocParserTest, CreateIlocInstructionLoadI) {
vector<string> tokens;
tokens.push_back("loadI");
tokens.push_back("1");
tokens.push_back("5");
IlocInstruction instruction(IlocInstruction::NONE);
EXPECT_TRUE(IlocParser::CreateIlocInstruction(tokens,
&instruction));
EXPECT_EQ(IlocInstruction::LOADI, instruction.op_code());
EXPECT_EQ(2, instruction.num_operands());
IlocInstruction::OperandList::const_iterator it = instruction.begin();
EXPECT_EQ(1, *it);
++it;
EXPECT_EQ(5, *it);
}
// Making sure the CreateIlocInstruction() method fails when it should.
TEST_F(IlocParserTest, CreateIlocInstructionFromMisspelledOp) {
vector<string> tokens;
tokens.push_back("ADD");
tokens.push_back("1");
tokens.push_back("5");
tokens.push_back("2");
IlocInstruction instruction(IlocInstruction::NONE);
EXPECT_FALSE(IlocParser::CreateIlocInstruction(tokens,
&instruction));
EXPECT_EQ(0, instruction.num_operands());
}
// Make sure creating an empty instruction works because there
// were times when I would actually have an empty tokens vector.
TEST_F(IlocParserTest, CreateIlocInstructionFromNoTokens) {
// Empty, which happens from a line that is a comment.
vector<string> tokens;
IlocInstruction instruction(IlocInstruction::NONE);
EXPECT_TRUE(IlocParser::CreateIlocInstruction(tokens,
&instruction));
EXPECT_EQ(IlocInstruction::NONE, instruction.op_code());
EXPECT_EQ(0, instruction.num_operands());
}
// This was a function that helped me generate actual code
// that I could output as a line in my output file.
TEST_F(IlocParserTest, MakeIlocLineFromInstructionAddI) {
IlocInstruction instruction(IlocInstruction::ADDI);
vector<int> operands;
operands.push_back(1);
operands.push_back(2);
operands.push_back(3);
instruction.CopyOperandsFrom(operands);
string output;
EXPECT_TRUE(IlocParser::MakeIlocLineFromInstruction(instruction, &output));
EXPECT_EQ("addI r1, 2 => r3", output);
}
// This test actually glued a bunch of stuff together. It actually
// read an input file (that was the name of the test) and parsed it
// I then checked that it parsed it correctly.
TEST_F(IlocParserTest, ParseIlocFileSimple) {
IlocParser parser;
vector<IlocInstruction*> lines;
EXPECT_TRUE(parser.ParseIlocFile(test_name_, &lines));
EXPECT_EQ(2, lines.size());
// Check first line
EXPECT_EQ(IlocInstruction::ADD, lines[0]->op_code());
EXPECT_EQ(3, lines[0]->num_operands());
IlocInstruction::OperandList::const_iterator operand = lines[0]->begin();
EXPECT_EQ(1, *operand);
++operand;
EXPECT_EQ(2, *operand);
++operand;
EXPECT_EQ(3, *operand);
// Check second line
EXPECT_EQ(IlocInstruction::LOADI, lines[1]->op_code());
EXPECT_EQ(2, lines[1]->num_operands());
operand = lines[1]->begin();
EXPECT_EQ(5, *operand);
++operand;
EXPECT_EQ(10, *operand);
// Deallocate memory
for (vector<IlocInstruction*>::iterator it = lines.begin();
it != lines.end();
++it) {
delete *it;
}
}
// This test made sure I generated an output file correctly.
// I built the file as an in memory representation, and then
// output it. I had a "golden file" that was supposed to represent
// the correct output. I compare my output to the golden file to
// make sure it was correct.
TEST_F(IlocParserTest, WriteIlocFileSimple) {
// Setup instructions
IlocInstruction instruction1(IlocInstruction::ADD);
vector<int> operands;
operands.push_back(1);
operands.push_back(2);
operands.push_back(3);
instruction1.CopyOperandsFrom(operands);
operands.clear();
IlocInstruction instruction2(IlocInstruction::LOADI);
operands.push_back(17);
operands.push_back(10);
instruction2.CopyOperandsFrom(operands);
operands.clear();
IlocInstruction instruction3(IlocInstruction::OUTPUT);
operands.push_back(1024);
instruction3.CopyOperandsFrom(operands);
// Propogate lines with the instructions
vector<IlocInstruction*> lines;
lines.push_back(&instruction1);
lines.push_back(&instruction2);
lines.push_back(&instruction3);
// Write out the file
string out_filename = test_name_ + "_output";
string golden_filename = test_name_ + "_golden";
IlocParser parser;
EXPECT_TRUE(parser.WriteIlocFile(out_filename, lines));
// Read back output file and verify contents are as expected.
string golden_file;
string out_file;
ReadFileAsString(golden_filename, &golden_file);
ReadFileAsString(out_filename, &out_file);
EXPECT_EQ(golden_file, out_file);
}
} // namespace compilers
int main(int argc, char** argv) {
// Boiler plate, test initialization
testing::InitGoogleTest(&argc, argv);
return RUN_ALL_TESTS();
}
After all is said and done... WHY DID I DO THIS!? Well first of all. I wrote the tests incrementally as I prepared to write each piece of code. It helped give me peace of mind that the code I already wrote was working properly. It would have been insane to write all my code and then just try it out on a file and see what happened. There were so many layers, how could I know where a bug would come from unless I had each little piece tested in isolation?
BUT... MOST IMPORTANTLY!!! Testing is not really about catching initial bugs in your code... it's about protecting yourself from accidentally breaking your code. Every time I refactored or altered my IlocParser class, I was confident I didn't alter it in a bad way because I could run my tests (in a matter of seconds) and see that all the code is still working as expected. THAT is the great use of unit tests.
They seem like they take too much time... but ultimately, they save you time tracking down bugs because you changed some code and don't know what happened. They are a useful way of verifying that small pieces of code are doing what they are supposed to do, and correctly.