Category Archives: mocks

A Test That Isn’t Automated Is a (fill in the blank)

Recently I ran into some problems with tests that could not easily be run in an automated manner.  In the spirit of a glass half full, I am trying to decide if manual tests are partially useful, partially useless, or totally useless.

The kind of work that I do, laboratory automation, scientific instrumentation, medical devices, deals a lot with hardware: sensors, motors, lasers, all kinds of fun stuff.  Over the last several years I have relied heavily on TDD, DI, and mocking frameworks to be able to abstract away many of the dependencies on the actual hardware for my testing and instead use mocks to verify higher level control functions.

At some point the rubber needs to meet the road and you need to make sure that what you are doing with the hardware is really what is supposed to happen, i.e. did a motor move, was an image captured, did you really measure the temperature correctly. I have found that the ease of use of modern testing frameworks like MbUnit makes writing a test a convenient way to perform this kind of hardware-software integration test. As an example, consider a test that verifies that a motor moves correctly. The test will read the position of the motor, move the motor in the forward direction, stop the motor, then read the position again and see if the motor has moved.

        [TestCategory("Motor Hardware Tests")]
        [TestMethod]
        public void TestMovingMotorForward()
        {
            double moveMillimeters = 10.0;
            double allowedTolerance = 0.1;
            LinearMotor motor = new LinearMotor();
            double startPosition = motor.GetPosition();
            motor.MoveRelative(moveMillimeters);
            double endPosition = motor.GetPosition();

            Assert.AreEqual(moveMillimeters, endPosition - startPosition, allowedTolerance);
        }

Now this test is doing a lot of things and it certainly is not a unit test in the strictest definition.   I want to stress that before I ever get to this kind of test I have already written a lot of lower level unit tests with the hardware abstracted away.  In many cases the code has been written and unit-tested long before the hardware even gets in the door and gets wired up, so these types of tests are really more like system integration tests.
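To make "hardware abstracted away" concrete, here is a minimal sketch, written in C++, of how a higher-level control function can be tested against a fake motor. The names IMotor, FakeMotor, StalledMotor, and MoveAndVerify are hypothetical and are not taken from the real project.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical interface standing in for the real motor driver.
class IMotor {
public:
    virtual ~IMotor() = default;
    virtual double GetPosition() const = 0;
    virtual void MoveRelative(double millimeters) = 0;
};

// Fake motor: tracks its position in memory, no hardware involved.
class FakeMotor : public IMotor {
public:
    double GetPosition() const override { return position_; }
    void MoveRelative(double millimeters) override { position_ += millimeters; }
private:
    double position_ = 0.0;
};

// Fake that simulates a stalled motor: commands are accepted but nothing moves.
class StalledMotor : public IMotor {
public:
    double GetPosition() const override { return 0.0; }
    void MoveRelative(double) override {}
};

// Higher-level control function under test; it only sees the interface.
// Returns true if the motor ended up within tolerance of the requested move.
bool MoveAndVerify(IMotor& motor, double millimeters, double tolerance) {
    const double start = motor.GetPosition();
    motor.MoveRelative(millimeters);
    const double moved = motor.GetPosition() - start;
    return std::fabs(moved - millimeters) <= tolerance;
}
```

Tests like these run anywhere, including the build server. The test against the real LinearMotor is still needed to confirm that the hardware does what the fake pretends to do.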

But here’s the problem. I can’t run this test on my automated build server because the test requires a special fixture, i.e. the hardware. I can exclude the test by including it in a special test category and telling my automated test runner to not execute that category, but what do I do about running the test? Do I make sure to run it manually at some frequency (1/week, 1/day, etc.)? Do I just leave it in the code base and only run it manually if there is a reported problem? What happens when I run it and there is a problem; how do I know when the error got introduced? Do I rip the test out since it is not automated and is just cruft?

After some recent experiences of having problems occur and not knowing when they were introduced, I am really leaning heavily towards the notion of setting up a special build platform that is able to run several types of hardware tests.  Tests like these can take a long time to run due to the delays associated with the hardware, so it is not realistic to run them on every check-in, otherwise any hope at the 10 minute build is out the window.  But you could certainly set up a build to run once a day, or on the weekend, or at some frequency that works for you.

Now you may not have test problems just like this, i.e. hardware related, but maybe you have some integration tests that are long running tests.  Maybe you want to test how a database runs under extended load or you have some other stress test.  Find a way to make sure that the tests are automatically executed at some frequency, otherwise you are inviting the time when you go to run a “manual” test, only to find that it has long since stopped working.

My take-away is that if a test is not automated at some meaningful frequency it should be removed from the code, because it is not doing what you need it to do and if it is not being run it is just more technical debt.

TDD with C++

I know, it is 2010, and doing TDD with C++ probably isn’t trending too high on anyone’s hot topics list, but after an 8 year hiatus I find myself doing some Agile consulting with an embedded systems team that is using C++. At first I was concerned about the transition. What was I going to do without my testing frameworks, my mocking frameworks, my continuous integration support, my…wait I have to manage my own memory too…

Well it is not all bad news folks; a lot has happened in the world of C++. The tools are not yet at the level you may be accustomed to coming from C#, Java, or Ruby, but this is not your father’s C++.

Testing Frameworks

There are a few testing frameworks out there for C++: CppUnit, CxxTest, and GoogleTest are some of the better known frameworks.  These three are all heavily influenced by JUnit, so if you are familiar with JUnit, NUnit or even MSTest, you will be familiar with the terminology, i.e. they support test fixtures, have setup and teardown support etc.

On the project I am on, we are using GoogleTest and it is working OK for us. It can be targeted to a variety of compilers and OS’s, so you may have some work to do to get it up and running. The syntax relies heavily on macros, which isn’t all that surprising but does make it a little ugly to look at. Still, it supports the assertions, test fixtures, and setup/teardown that you are probably familiar with. A few simple tests might look something like this:

// Tests that 1 really equals 1.
// Just looking to see if my framework does what I think it does
TEST(DumbTest, Checks1Is1) {
  EXPECT_EQ(1, 1);
}

// Tests that a person can be created with a first and last name
TEST(PersonTests, PersonFullName) {
  Person aPerson("Fred", "Flintstone");
  // ASSERT_STREQ compares C strings, so FullName() is assumed to return const char*
  ASSERT_STREQ("Fred Flintstone", aPerson.FullName());
}
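
The Person class itself is not shown above. A minimal sketch of what the second test assumes might look like the following; the member names and the std::string return type are guesses made for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical Person class; the original post does not show its definition.
class Person {
public:
    Person(const std::string& first, const std::string& last)
        : first_(first), last_(last) {}

    // Returns the first and last name separated by a single space.
    std::string FullName() const { return first_ + " " + last_; }

private:
    std::string first_;
    std::string last_;
};
```

If FullName() returns a std::string as in this sketch, the GoogleTest assertion would need to be ASSERT_EQ, or the test would call .c_str() on the result, because ASSERT_STREQ is meant for C strings.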

Test Output

To get your test results reported as part of your continuous integration process, you will want to see what output formats the testing framework provides. GoogleTest supports several output formats, including a text based output that is helpful when running within the IDE as well as an XML report that conforms to the Ant/JUnit format. We are using TeamCity, so reporting automated test results is straightforward in that regard.

Controlling What Tests Run

This is an area that hurts a bit, at least if you are coming from an environment where you are used to being able to easily control which tests run while you are in the IDE via a plug-in, for example right-click and select “Run this test” or “Run this test suite”. GoogleTest does have options to control which tests run and which do not, the ability to repeat a test, etc., but it is all via switches to the test runner, which makes the development cycle a little slower. Since I am using this on an embedded systems project, some tests may require hardware that is not at the development station or a build server. To have the test runner skip tests that have the word Hardware in the test name, you would pass --gtest_filter=-*.*Hardware*. Although it is a bit cumbersome, at least the capability exists to control which tests run.

Mocking Frameworks

I am a big fan of mocking frameworks in C#. I know they are just one tool in the unit testing toolbag, but I find them invaluable for testing interactions between components, and they can free me from having to write more sophisticated simulators. GoogleMock provides mocking support for C++ and like GoogleTest can be targeted towards a variety of development tools and OS’s. Note that as of this writing, there are some problems building GoogleMock for VS2010.

Coming up to speed on any mocking framework takes effort, especially once you get away from the plain vanilla mock scenarios and get into events, custom matchers, custom actions, etc. Most of my C# mocking is with NMock2, though there are other great mocking tools out there, like TypeMock, RhinoMocks, and Moq. If you are used to those tools, you are going to find GoogleMock a little clumsy and cumbersome, in my opinion. For example, this snippet sets an expectation that the mutator’s Mutate function will be called with true as the first argument and a don’t care on the second argument (indicated by the ‘_’). When the expectation is satisfied, it will also perform an action: setting the value pointed to by argument 1 (0-based, so the second argument) to 5.

  MockMutator mutator;
  EXPECT_CALL(mutator, Mutate(true, _))
      .WillOnce(SetArgumentPointee<1>(5));

Again, like GoogleTest, it is heavily macro-based and in my opinion a little…clumsy. Maybe it is just getting used to being in C++ again, but it doesn’t flow as naturally as some of the C# mocking tools. But on the plus side, at least there is a mocking framework and it will probably do 90% of what you need out of the box. If you need to write a custom action, there may be some work there.
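
For comparison, here is roughly what you would write by hand without a mocking framework to get the same effect as the expectation above. This is a sketch under assumptions: the Mutator interface, the HandRolledMutator class, and FetchMutatedValue are made up for illustration, since the snippet above only shows the mock's name and one call.

```cpp
#include <cassert>

// Hypothetical interface; the post only shows the mock's name and one call.
class Mutator {
public:
    virtual ~Mutator() = default;
    virtual bool Mutate(bool mutate, int* value) = 0;
};

// Hand-rolled stand-in for MockMutator: records each call and writes 5
// through the pointer argument (argument index 1, counting from 0).
class HandRolledMutator : public Mutator {
public:
    bool Mutate(bool mutate, int* value) override {
        ++call_count;
        last_flag = mutate;
        *value = 5;
        return true;
    }
    int call_count = 0;
    bool last_flag = false;
};

// Code under test: asks the mutator to fill in a value and returns it.
int FetchMutatedValue(Mutator& mutator) {
    int value = 0;
    mutator.Mutate(true, &value);
    return value;
}
```

A test would then check that FetchMutatedValue returns 5, that call_count is 1, and that last_flag is true. The framework version is more compact, but a hand-written double like this is sometimes easier to read when the interaction is simple.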

Other Niceties

Boost C++ Libraries

Ah memory management, how I have NOT missed you all these years. Fortunately the Boost C++ Libraries have good support for smart pointers which relieve the developer of a lot of the pain of memory management. Not all of the pain, but enough to merit using them. In addition to smart pointers, the Boost library has a variety of classes for string and text, containers, iterators, math, threading, and on and on. If your OS or compiler does not directly support something, check the Boost documentation before you implement something from scratch. Chances are it may have been solved already.
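
As a small illustration of what a reference-counted smart pointer buys you, here is a sketch that uses std::shared_ptr from the C++11 standard library, which was modeled on Boost's shared_ptr. That substitution is an assumption made so the example needs no extra libraries; with Boost you would use boost::shared_ptr instead. The Sensor type and UseSharedSensor function are invented for the example.

```cpp
#include <cassert>
#include <memory>

// Counts live Sensor objects so the automatic cleanup is observable.
struct Sensor {
    static int live_count;
    Sensor() { ++live_count; }
    ~Sensor() { --live_count; }
};
int Sensor::live_count = 0;

// Creates a shared Sensor, copies the pointer, and lets both go out of scope.
// Returns the number of owners seen while both pointers were alive.
long UseSharedSensor() {
    std::shared_ptr<Sensor> first = std::make_shared<Sensor>();
    std::shared_ptr<Sensor> second = first;  // shared ownership, Sensor is not copied
    return second.use_count();
}  // the last owner is destroyed here, so the Sensor is deleted with no explicit delete
```

After UseSharedSensor returns, Sensor::live_count is back to zero even though the code never calls delete.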

IOC/DI Tools

Although I have not had a chance to use it yet, there is at least one IOC container tool in C++, Pococapsule. I will leave it as an exercise to the reader to try it out, and if you have used it, please feel free to comment on it.

Summary

Although many of us may not have used it in many years, C++ is still chugging along and has a place in the software universe.   In certain areas it is the best choice based on criteria like performance, foot-print, etc.  It is encouraging to see TDD practices and tools that started in other frameworks/languages come into C++.  They may not have the same ease of use as the other frameworks, but these frameworks also had some warts in their early development.  The open-source aspect of these tools will also hopefully continue to move the feature set and usability of these tools forward.

If I had my choice, I would prefer to develop in C#, Python, Ruby, or Java. The ease of use and richness of tools make these more “productive” developer environments, especially when looked at from the perspective of TDD. However sometimes C++ is the right tool and I am glad to see that the tools are there to develop code in a TDD manner. They may not be at the level of the tools we are used to, but they are available to us.

And many thanks to the folks on these open-source projects who have put those tools out there.

Why TDD?

“When do you write your tests?”

This is a question that I have been putting to developers lately and the answers I get back sometimes surprise me. I still hear a lot of people say they are writing their tests after the majority of the code is written. These are people who, by and large, agree on the value of unit testing. Unfortunately by deferring testing to after the code is written I think they are missing out on an opportunity to make significant improvements in how they write software.

Remember, it is test driven development.  The tests play an important role in driving the interface definition, underlying design, and structure of the code.

TDD Helps Define the Interface

What is your code going to look like to clients?  What methods are going to be provided, what are the arguments, what are the failure modes and behaviors? These are all questions that will shake out of a test driven development process.

When I am doing TDD, I typically go through a series of test, code, and refactor cycles that take the component under test through a progression of increasing functionality/behavior.

  1. Simply create the component under test
  2. Implement a simple, sunny-day operation
  3. Add some error conditions
  4. Layer in some more functionality
  5. And so on…

As I go along this progression, my understanding of the behavior of the component and its interface is evolving.  And since I am driving the interface definition from my tests, I am thinking about how the interface looks from the outside.  That is an important distinction.  Without that view from the outside it is easy to put a lot of effort into the internals of the design, without having a good understanding of how it is going to get used.  When I am writing the tests I have to put a lot of thought into how a component is configured and called, as well as how it will respond to error cases.
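
The first three steps of that progression can be sketched as follows. ReadingBuffer is a made-up component, and the tests are plain functions using assert rather than a real test framework, to keep the example self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <stdexcept>
#include <vector>

// Hypothetical component grown test by test; name and behavior are illustrative.
class ReadingBuffer {
public:
    explicit ReadingBuffer(std::size_t capacity) : capacity_(capacity) {}

    std::size_t Size() const { return readings_.size(); }

    void Add(double reading) {
        if (readings_.size() >= capacity_) throw std::length_error("buffer full");
        readings_.push_back(reading);
    }

    double Average() const {
        if (readings_.empty()) throw std::logic_error("no readings");
        return std::accumulate(readings_.begin(), readings_.end(), 0.0) / readings_.size();
    }

private:
    std::size_t capacity_;
    std::vector<double> readings_;
};

// Step 1: the component can simply be created.
void TestCanCreate() {
    ReadingBuffer buffer(4);
    assert(buffer.Size() == 0);
}

// Step 2: a simple sunny-day operation.
void TestAverageOfTwoReadings() {
    ReadingBuffer buffer(4);
    buffer.Add(20.0);
    buffer.Add(22.0);
    assert(buffer.Average() == 21.0);
}

// Step 3: an error condition, which forces a decision about the failure mode.
void TestAverageOfEmptyBufferThrows() {
    ReadingBuffer buffer(4);
    bool threw = false;
    try { buffer.Average(); } catch (const std::logic_error&) { threw = true; }
    assert(threw);
}
```

Each test was a chance to decide something about the interface: that the capacity is a constructor argument, that Average is a query with no side effects, and that an empty buffer is reported with an exception rather than a sentinel value.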

After each of these steps, I am also checking in my code. There are several reasons for this. The obvious one is I am building up a revision history and keeping the scope of my changes small. That way if I paint myself into a corner or start to detect a code smell I can revert back to a known good state. I am also getting my tests to run in the automated build. That answers the question of whether there is an unknown dependency on a library or configuration that exists on my development system but not on the build server. It may also point to an expensive test setup or teardown condition that causes the automated tests to take a long time to execute.  Finding that out early makes diagnosing and fixing it a lot easier.

TDD Encourages Good Design

You can have good design without TDD, and you can write lousy code using TDD, but I find one of the strengths in TDD is that it encourages good design and good design practices.

The iterative nature of TDD,  sometimes referred to as red-green-refactor (or test-code-refactor), encourages continuous design.  One of the knocks I hear about Agile and TDD by people who really don’t understand it is that there is no design cycle.  In reality you are constantly thinking about and improving the design, just in incremental steps and in response to adding more functionality (via new tests).  As you continue to add in more functionality, opportunities to refactor for modularity, re-use, decomposition, and performance will present themselves naturally.  And since you have a test suite already in place you can refactor and get an indication that the component is still behaving as expected.

Another benefit to TDD is that it encourages loose coupling of components. When you are unit testing, you want to keep the amount of code that you are testing to a minimum. If the code under test has dependencies on other components, how do you restrict your testing efforts to the code under test and not all the pieces of code that it talks to? How do you decide which dependencies should be covered in the test suite and which ones should be treated more abstractly? There aren’t any hard and fast rules here. I have seen code that has so many injected dependencies that it is hard to figure out what exactly it does, and I have seen code that included so many other classes that it is nearly impossible to have good tests that don’t break when an underlying component changes. But by following an iterative, test driven strategy, you are forced to confront these issues early, before you have invested too much effort into what might be an unwieldy design.

You may argue that you can get these types of benefits with tests developed after the code is written, and that may be true, but at that point making changes to the design is more difficult.

TDD Enables Testability

That may seem obvious, but by writing tests starting on day 1, you are forced to deal with how to put in hooks for testing right away and how to determine that your component is behaving as expected.

I recently had to refactor some code that had practically useless unit tests. The tests were of little value because there was no easy way to see the interaction with the dependencies or evaluate the “success” of an operation. Essentially these tests boiled down to “call this method and verify that no exceptions are thrown”. That is an unacceptable success criterion for a “finished” piece of software.

Two big issues for me with unit tests are dependencies and infrastructure. How can I run my tests so my code is not dependent on too many other components, such that the tests become unwieldy or brittle? I am a big fan of mocking and dependency injection. For C# I use NMock2 for mocking. At some point I need to finally try out an IoC framework (Scott Hanselman has a good list of them for .NET) for dependency injection, but to this point providing overloaded constructors and manually injecting dependencies has not been too painful.
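
Manual injection through overloaded constructors can be sketched like this in C++. All of the names here (ILogger, NullLogger, RecordingLogger, Heater) are hypothetical; the point is only the shape of the two constructors.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical dependency: somewhere to send log messages.
class ILogger {
public:
    virtual ~ILogger() = default;
    virtual void Log(const std::string& message) = 0;
};

// Stand-in for the production implementation (a no-op in this sketch).
class NullLogger : public ILogger {
public:
    void Log(const std::string&) override {}
};

// Test double that records what was logged.
class RecordingLogger : public ILogger {
public:
    void Log(const std::string& message) override { messages.push_back(message); }
    std::vector<std::string> messages;
};

class Heater {
public:
    // Default constructor wires up the production dependency.
    Heater() : Heater(std::make_shared<NullLogger>()) {}
    // Overload lets a test inject its own logger.
    explicit Heater(std::shared_ptr<ILogger> logger) : logger_(std::move(logger)) {}

    void SetTarget(double celsius) {
        target_ = celsius;
        logger_->Log("target set to " + std::to_string(static_cast<int>(celsius)));
    }
    double Target() const { return target_; }

private:
    std::shared_ptr<ILogger> logger_;
    double target_ = 0.0;
};
```

A test constructs a Heater with a RecordingLogger and then inspects messages, which gives it something more meaningful to assert on than "no exception was thrown". Production code keeps using the default constructor.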

The infrastructure issue is another thing you want to get a handle on.  How dependent are your tests on things like the file system, do they need a database, are there any special configuration files?  Do they run quickly or require a costly setup/teardown?  These may be indicative of problems in your test design and may cause problems.  Remember that your tests can be refactored too if you detect a code smell.
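
One common way to reduce a test's dependence on the file system is to have the code read from a stream instead of opening a file itself. The sketch below assumes a simple key=value settings format and a function named ParseSettings, both invented for illustration.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Parses "key=value" lines from any input stream. Because it takes a
// std::istream rather than a file path, a test can feed it an in-memory string.
std::map<std::string, std::string> ParseSettings(std::istream& input) {
    std::map<std::string, std::string> settings;
    std::string line;
    while (std::getline(input, line)) {
        const std::string::size_type equals = line.find('=');
        if (equals == std::string::npos) continue;  // skip lines without '='
        settings[line.substr(0, equals)] = line.substr(equals + 1);
    }
    return settings;
}
```

In a test, the input can be a std::istringstream built from a string literal, so no configuration file has to exist on the build server. Production code would pass a std::ifstream to the same function.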

PyCon on the Charles – Part I

I have been doing some work with Python the last few months and recently joined a Python Meetup group in Cambridge that had a meeting last night at Microsoft NERD. The purpose of the meeting was to give some people who are presenting at PyCon a chance to do a dry run of their presentations. The meeting was called PyCon on the Charles – Part I.

This meeting definitely met my one good thing rule; in fact I thought all the presentations were interesting. There were three presentations:

  • Python for Large Astronomical Data Reduction and Analysis Systems by Francesco Pierfederici
  • Python’s Dusty Corners by Jack Diederich
  • Tests and Testability by Ned Batchelder

Francesco Pierfederici talked about how he uses Python in “Big Astronomy”, particularly for the LSST project. The code base they are using has a pipeline framework that I am interested in looking at. There is a lot of simulation that needs to be done before building a large telescope and the supporting computing infrastructure. Most of the high level code is developed in Python, with the computation intensive code written in C.

I was most interested in Ned Batchelder’s Tests and Testability presentation because I am interested in the tool support for testing in Python. He presented techniques for using mocking to make your code more testable, as well as ways to structure your code to support dependency injection. The mocking framework was Michael Foord’s mock.

I would have liked to have seen more emphasis on addressing testability early in the development cycle. As a proponent of TDD, I think you need to begin testing as soon as you start developing. In my experience it leads to a better design through loose coupling and you avoid having to make a major refactoring to put in test hooks after the fact. I think Ned did a nice job presenting the techniques and some of the tools available in Python to make testing easier and with better coverage.

All in all, an interesting evening, and if you are a Python developer in the Boston/Cambridge area you may want to join up. There is a follow-up scheduled for 2/3/10 called PyCon on the Charles – Part II where there will be three more practice presentations for PyCon.

And once again, kudos to Microsoft for making their space available for different groups. This is the 4th event that I have been at in their space and exactly none of them were Microsoft specific. I think having a space where developers can get together is something that has been sorely lacking and they deserve credit for trying to make that happen.

Rethinking the C# using statement

I’ve typically used the C# using construct to wrap an instance of an object that has a short lifetime and requires a call to Dispose. The using keyword hides the need to call Dispose explicitly and avoids having to use a try-finally to ensure that Dispose is always called. One example of a class that I often use with the using keyword is a stream class:

        using (FileStream fs = File.Create(path))
        {
            // Do something with the file stream
            ...

        }  //  Dispose gets called automatically here

Recently, while using NMock2, I came across a usage of the using statement that forced me to stop and think about what was going on under the covers. NMock2 is a mocking framework and the Ordered/Unordered properties are used to tell the framework whether you care that the expectations are satisfied in a particular order. The Ordered property tells NMock2 that the expectations must match the actual order of execution, while Unordered indicates that all the expectations must be satisfied, but the order doesn’t matter. When you are defining the expectations you put the Ordered/Unordered properties in a using statement. They can even be nested, so a test might look something like this (and no I don’t use class/method names like this):

        [Test]
        public void CanDoIt()
        {
            using (_mockery.Ordered)
            {
                Expect.Once.On(_thing1).Method("Method1").WithNoArguments();

                using (_mockery.Unordered)
                {
                    Expect.Once.On(_thing1).Method("Method2").WithNoArguments();
                    Expect.Once.On(_thing1).Method("Method3").With(foo, bar);
                }

                Expect.Once.On(_thing1).Method("Method4").WithNoArguments();
            }

            classUnderTest.DoIt();
        }

In the above test we are telling NMock2 that we expect the following methods to be called in a certain order. Method1 should be called first, followed by Method2 and Method3 in any order, followed by Method4.
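
To spell out which call sequences satisfy those nested expectations, here is a small hand-written check, sketched in C++. It is only an illustration of the ordering rule and says nothing about how NMock2 is implemented; the function name MatchesExpectedOrder is made up.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Checks a recorded call sequence against the expectations in the test above:
// Method1 first, then Method2 and Method3 in either order, then Method4.
bool MatchesExpectedOrder(std::vector<std::string> calls) {
    if (calls.size() != 4) return false;
    if (calls.front() != "Method1" || calls.back() != "Method4") return false;
    // Sort the middle pair so that either order compares equal.
    std::sort(calls.begin() + 1, calls.begin() + 3);
    return calls[1] == "Method2" && calls[2] == "Method3";
}
```

Under this rule both Method1, Method2, Method3, Method4 and Method1, Method3, Method2, Method4 pass, while any sequence that starts with something other than Method1 or ends with something other than Method4 fails.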

Using NMock2 for unit testing can be the subject of another post, but it was the use of the Ordered/Unordered properties inside of a using statement that really made me think about what was going on. What I found intriguing about the use of this construct is that the Ordered/Unordered properties are returning an object that implements IDisposable but you never do anything with the returned object. NMock2 is using the using construct with Ordered and Unordered as syntactic sugar to make the test cases easier to read, and I think it works well in NMock2’s case.

Under the covers, the constructor of the object returned from the Ordered/Unordered properties is actually being used to set state in the underlying owning Mockery object. I don’t think the NMock2 license agreement allows me to show just a snippet of its source code, but you can download the source and see for yourself what it is doing.

Basically the object that gets created by the Ordered/Unordered properties takes a reference to the “parent” object in the constructor. In this case the parent object is the Mockery object. The Dispose method cleans up the state in the parent object. I came up with a somewhat contrived package shipping class that does something similar to what NMock is doing as an example, although in my example you can not nest using statements like in NMock.

using System;
using System.Collections.Generic;
using System.Text;

namespace TestUsing
{
    class Program
    {
        static void Main(string[] args)
        {
            PrioritizedShipping shipping = new PrioritizedShipping();

            // Setup a bunch of packages for priority
            using (shipping.Overnight)
            {
                shipping.ScheduleForDelivery(new Package(10.0, 51.2));
                shipping.ScheduleForDelivery(new Package(11.0, 52.3));

                // Note you can not nest usings because there is no Push/Pop mechanism implemented
            }

            // Default is standard shipping, could also use a using( shipping.Standard)
            shipping.ScheduleForDelivery(new Package(300.0, 1248.0));

            // For this example I prefer a more straightforward method that is clear to a maintainer
            shipping.ABetterScheduleForDelivery(new Package(500.0, 2448.0), ShipPriority.Standard);
            shipping.ABetterScheduleForDelivery(new Package(20.0, 12.0), ShipPriority.Overnight);
            shipping.ABetterScheduleForDelivery(new Package(21.0, 13.0), ShipPriority.Overnight);
        }
    }

    public class Package
    {
        private double _weight;
        private double _girth;

        public Package(double weight, double girth)
        {
            _weight = weight;
            _girth = girth;
        }
    }

    public enum ShipPriority { Overnight, Standard };

    public class PrioritizedShipping
    {
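        private List<Package> _priorityPackages = new List<Package>();
        private List<Package> _standardPackages = new List<Package>();
        private List<Package> _activeList;

        public PrioritizedShipping()
        {
            // Default to standard shipping so ScheduleForDelivery works before any using block
            _activeList = _standardPackages;
        }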

        public IDisposable Overnight
        {
            get { return new SetupShippingHelper(this, ShipPriority.Overnight); }
        }

        public IDisposable Standard
        {
            get { return new SetupShippingHelper(this, ShipPriority.Standard); }
        }

        public void SetupPriority(ShipPriority priority)
        {
            if (priority == ShipPriority.Overnight)
                _activeList = _priorityPackages;
            else
                _activeList = _standardPackages;
        }

        public void ScheduleForDelivery(Package package)
        {
            _activeList.Add(package);
        }

        public void ABetterScheduleForDelivery(Package package, ShipPriority priority)
        {
            if (priority == ShipPriority.Overnight)
                _priorityPackages.Add(package);
            else
                _standardPackages.Add(package);
        }

    }

    public class SetupShippingHelper : IDisposable
    {
        PrioritizedShipping _parent;

        public SetupShippingHelper( PrioritizedShipping parent, ShipPriority priority )
        {
            _parent = parent;
            _parent.SetupPriority(priority);
        }

        #region IDisposable Members

        public void Dispose()
        {
            _parent.SetupPriority(ShipPriority.Standard);
        }

        #endregion
    }
}

Note that in the contrived example I’ve come up with, I think this approach is actually less clear, and I prefer a more explicit method, ABetterScheduleForDelivery, to indicate what priority to use. I would also have some serious concerns about how something like this would work in a multi-threaded environment. I can see how you could make it work, but at the end of the day I think this type of approach would be harder to maintain and could have some unforeseen consequences. I think it works really well for the NMock2 case, but I would have to think long and hard before using it in my own code.
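
For what it is worth, the missing nesting support comes down to remembering the previous state and restoring it on cleanup, instead of always resetting to Standard. Here is a minimal sketch of that idea in C++, where a destructor plays the role of Dispose. It is a C++ analogue written for illustration, not a port of the C# class above, and the Scope and Active names are made up.

```cpp
#include <cassert>
#include <vector>

enum class ShipPriority { Overnight, Standard };

class PrioritizedShipping {
public:
    ShipPriority Active() const { return active_; }

    // Scope guard: sets the priority on construction and restores the
    // previous one on destruction, which is what makes nesting work.
    class Scope {
    public:
        Scope(PrioritizedShipping& parent, ShipPriority priority)
            : parent_(parent), previous_(parent.active_) {
            parent_.active_ = priority;
        }
        ~Scope() { parent_.active_ = previous_; }
        Scope(const Scope&) = delete;
        Scope& operator=(const Scope&) = delete;
    private:
        PrioritizedShipping& parent_;
        ShipPriority previous_;
    };

private:
    ShipPriority active_ = ShipPriority::Standard;
};

// Walks through nested scopes and records the active priority at each point.
std::vector<ShipPriority> DemonstrateNesting() {
    PrioritizedShipping shipping;
    std::vector<ShipPriority> seen;
    {
        PrioritizedShipping::Scope overnight(shipping, ShipPriority::Overnight);
        seen.push_back(shipping.Active());      // Overnight
        {
            PrioritizedShipping::Scope standard(shipping, ShipPriority::Standard);
            seen.push_back(shipping.Active());  // Standard
        }
        seen.push_back(shipping.Active());      // restored to Overnight
    }
    seen.push_back(shipping.Active());          // restored to Standard
    return seen;
}
```

The same concerns apply as in the C# version: the active priority is hidden mutable state, so this sketch is not safe to share across threads, and a maintainer still has to know what the guard object does.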

I haven’t found a use for this particular usage of the using statement, but it does give me something to think about and a tool that perhaps I can use in the future. I think there is a lot to be said for code that reads well, but that also needs to be balanced against a maintainer being able to understand what is happening under the covers to avoid inadvertent consequences.