Unit Testing in C++: an Overview
Eirian Owen Perkins
CSCI 5828 - Presentation 2
Table of Contents
- Introduction and Motivation
- What is Unit Testing Anyway
- Unit Testing Terminology
- Selecting a Framework to Fit Your Needs
- Features under Consideration
- Summary of Features
- Discussion and Conclusions
- Final Remarks
Introduction and Motivation
- Unit testing is essential to software development[13]
- Enables developers to detect errors early on
- a form of white-box testing
- effective way to support test driven development
- unit tests can be seen as a "living document" that provides usage examples in context
- Inspect units, the "lowest" level of testing
Introduction and Motivation
- Detecting Errors Early: A business case[14, 15]
- Avoid damage to the company's reputation
- Boost product sales
- Customer and consumer confidence
- Catching errors late in the development cycle (or after release) has a "cascading impact" on time and money spent
- A fix can be orders of magnitude more expensive if caught late
- Don't drive business to your competitors!!!
Introduction and Motivation
- White-Box Testing[13]
- "Deriving tests from the source code internals of the software, specifically including branches, individual conditions, and statements."
- Are the "guts" doing what we think they're doing?
Introduction and Motivation
- Test-Driven Development[16]
- Write tests based on acceptance criteria.
- Run the tests. Tests will not pass until proper code is written
- Update code
- Run the tests again
- Update code until all tests pass
- "Test-Driven Development = Refactoring + Test First Development"
What is Unit Testing Anyway?
- Unit Testing is...[13]
- The "lowest" level of testing
- No knowledge of the "encapsulating software application"
- The developer's responsibility
- Testing at the function, method, or module level
- Coverage metrics such as branch or statement coverage may be captured at this level
- Unit tests should be independent
- The result of a previous unit test should not affect a subsequent test
What is Unit Testing Anyway?
- Unit Testing is NOT...
- Integration testing
- testing a combination of modules
- Black-box testing
- testing without knowledge of the internals
- System testing
- black-box testing at the application level
- A bench test
- testing with software or hardware assistance, for instance with a simulator
https://en.wikipedia.org/wiki/Black-box_testing
https://en.wikipedia.org/wiki/Test_bench
http://www.tutorialspoint.com/software_testing_dictionary/system_testing.htm
https://en.wikipedia.org/wiki/Integration_testing
Unit Testing Terminology
- Test Case
- Test Suite
- Test Harness
- Mocking and Stubbing
Unit Testing Terminology
- Test Case
- An individual test. For example:
- Consider testing isPrime()
- isPrime(42) == FALSE
- isPrime(17) == TRUE
- Each of these is a separate test case for isPrime().
https://en.wikipedia.org/wiki/Unit_testing
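The two isPrime() test cases above can be sketched in plain C++ without any framework. The trial-division implementation of isPrime() and the test function names are assumptions made for illustration; the slides do not specify them.

```cpp
#include <cassert>

// Hypothetical implementation under test (trial division), assumed for illustration.
bool isPrime(int n) {
    if (n < 2) return false;
    for (int d = 2; d * d <= n; ++d) {
        if (n % d == 0) return false;
    }
    return true;
}

// Each function is one test case; together they form a very small test suite.
void testCompositeIsNotPrime() { assert(!isPrime(42)); }
void testPrimeIsPrime()        { assert(isPrime(17)); }
```

Calling both functions from main() would act as a minimal, hand-written test runner.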
Unit Testing Terminology
- Test Suite
- A collection of test cases
Unit Testing Terminology
- Test Harness
- The software framework that runs test suites
Unit Testing Terminology
- Mocking and Stubbing
- Simulating some feature or piece of code
- For instance, returning a pre-defined string instead of querying a database
- Another example: returning some status from hardware that may otherwise be difficult to replicate
- Mocking is more in-depth than stubbing
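One common way to draw the stub/mock distinction is sketched below: the stub only returns a canned value, while the mock also records how it was called so the test can verify the interaction. UserStore, StubUserStore, MockUserStore, and greet() are hypothetical names invented for this sketch, not part of any framework discussed here.

```cpp
#include <string>

// Hypothetical interface that the code under test depends on.
class UserStore {
public:
    virtual ~UserStore() = default;
    virtual std::string nameFor(int userId) = 0;
};

// Stub: returns a pre-defined string instead of querying a database.
class StubUserStore : public UserStore {
public:
    std::string nameFor(int) override { return "Ada"; }
};

// Mock: returns the same canned value but also records each call.
class MockUserStore : public UserStore {
public:
    int calls = 0;
    int lastUserId = -1;
    std::string nameFor(int userId) override {
        ++calls;
        lastUserId = userId;
        return "Ada";
    }
};

// Code under test: builds a greeting from whatever the store returns.
std::string greet(UserStore& store, int userId) {
    return "Hello, " + store.nameFor(userId) + "!";
}
```

A test using the stub can only check the returned greeting; a test using the mock can additionally check that the store was queried exactly once and with the expected id.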
Selecting a Framework to Fit Your Needs
- There is a large selection of unit testing frameworks available to C++ developers
- Narrowing down a list of frameworks is a task in and of itself
- The frameworks on the remaining slides were chosen after reading reviews, recommendations, and discussions
- Noteworthy discussion: Exploring the C++ Unit Testing Framework Jungle
Selecting a Framework to Fit Your Needs
- Recommendation Preview:
- Boost Test
- Large community, rich feature set
- Google Test
- Large community, used in well-known projects such as the LLVM compiler.
- CppUnit
- Port of JUnit. Java developers may be reasonably familiar with this.
- CppUnitLite
- Stripped down version of CppUnit; recommended for embedded systems.
- xUnit++
- Newer project designed as an alternative to Boost Test and Google Test
- CxxTest
- Seems to receive consistently good reviews and does not rely on "advanced" features of C++ or C++11 because its test generator is implemented in Python
Features under Consideration (I)
- Run subset of tests in the test suite
- Portable -- minimal dependencies
- Time-Related Support
- Supports different output formats? Consider XML
- "Good" assert functionality
- Supports non-fatal failures
- Supports fatal failures
- Handles exceptions and crashes well
Features under Consideration (II)
- Minimum work required to add new test suites
- Actively Maintained
- Clear, up to date documentation
- Mocking Capability
- Repeat test N times
- License
- Per-check message
- Fixture support
Features -- Motivation
- Run subset of tests in the test suite
- Some test harnesses only support running all of the tests or none of them
- Sometimes a developer would prefer to run a subset of test cases in a test suite
- Some tests may be long-running
- A developer may be interested in one test case at a time during test-driven development
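A name-based filter is one simple way a harness could run a subset of tests. The sketch below is framework-independent; TestCase and runMatching() are hypothetical names, and real frameworks typically expose this through command-line options rather than a function call.

```cpp
#include <functional>
#include <string>
#include <vector>

// A test case is a name plus a callable that returns true on success.
struct TestCase {
    std::string name;
    std::function<bool()> body;
};

// Runs only the test cases whose name contains `filter`; an empty filter
// selects every test. Returns the names of the tests that were run.
std::vector<std::string> runMatching(const std::vector<TestCase>& suite,
                                     const std::string& filter) {
    std::vector<std::string> ran;
    for (const TestCase& tc : suite) {
        if (filter.empty() || tc.name.find(filter) != std::string::npos) {
            tc.body();  // result ignored here; a real runner would record it
            ran.push_back(tc.name);
        }
    }
    return ran;
}
```

With test names such as "Math.Add" and "Slow.Sort", passing "Math" skips the long-running test, and passing a full name runs a single test case.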
Features -- Motivation
- Portable -- minimal dependencies
- What third party libraries are required for this test harness to run?
- The software application under development is probably deployed to multiple platforms or environments
- Developers should test their code in environments where the product will be deployed
- The development environment might change
- Dependencies may become deprecated
- For this reason, portability in terms of minimal dependencies should be evaluated
Features -- Motivation
- Time-Related Support[8]
- Does the framework support timing-out tests?
- Report the time each test case took to run
- Report if a test exceeds some time limit
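The reporting half of time-related support can be sketched with std::chrono. TimedResult and runTimed() are hypothetical names; this version only measures and flags a slow test after it finishes and does not interrupt a test that hangs, which is the harder problem.

```cpp
#include <chrono>
#include <functional>

struct TimedResult {
    bool passed;                        // result reported by the test body
    std::chrono::milliseconds elapsed;  // wall-clock time the body took
    bool exceededLimit;                 // true if elapsed > limit
};

// Runs `body`, measures it with a monotonic clock, and flags it if it ran
// longer than `limit`.
TimedResult runTimed(const std::function<bool()>& body,
                     std::chrono::milliseconds limit) {
    const auto start = std::chrono::steady_clock::now();
    const bool passed = body();
    const auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(
        std::chrono::steady_clock::now() - start);
    return TimedResult{passed, elapsed, elapsed > limit};
}
```

steady_clock is used rather than system_clock so that clock adjustments during a run do not distort the measurement.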
Features -- Motivation
- Supports different output formats
- Teams may want to run unit tests after automated builds
- Teams may want to run unit tests on a regular schedule (nightly, for instance)
- When either of the above is desired, it makes sense to send out email summaries
- How to summarize output? How to parse?
- Specification of a standard output format like XML is "nice to have" for these purposes
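To show why a structured format is easier to parse than free-form console text, here is a minimal sketch of an XML reporter. The element and attribute names (testsuite, testcase, status) are invented for this example and do not follow the schema of any framework discussed here; test names are assumed to need no XML escaping.

```cpp
#include <sstream>
#include <string>
#include <vector>

struct Outcome {
    std::string name;
    bool passed;
};

// Serialises test outcomes into a small XML document, one element per test.
std::string toXml(const std::vector<Outcome>& outcomes) {
    std::ostringstream out;
    out << "<testsuite tests=\"" << outcomes.size() << "\">\n";
    for (const Outcome& o : outcomes) {
        out << "  <testcase name=\"" << o.name << "\" status=\""
            << (o.passed ? "pass" : "fail") << "\"/>\n";
    }
    out << "</testsuite>\n";
    return out.str();
}
```

A nightly build script could then extract failures with any XML parser instead of scraping console output.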
Features -- Motivation
- "Good" assert functionality
- Supports non-fatal failures
- Supports fatal failures
- Some test harnesses abort after a single test fails (fatal failure)
- Developers may want to continue running and see the total number of tests that pass/fail (non-fatal failure)
- Alternatively, if some critical function fails, there may be no reason to continue the test
- In that case, developers may want the test suite to halt
- Are both behaviors available? Quitting immediately (fatal) VS finish running tests (non-fatal)
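The difference between the two behaviors can be sketched as follows. TestContext, FatalFailure, and runTest() are hypothetical names; the sketch assumes a fatal failure is signalled by throwing, which is one possible design and not necessarily how any particular framework implements it.

```cpp
// Thrown by a fatal check to abandon the rest of the current test.
struct FatalFailure {};

// Tracks failures for one test case.
struct TestContext {
    int failures = 0;

    // Non-fatal: record the failure and let the test continue.
    void check(bool condition) {
        if (!condition) ++failures;
    }

    // Fatal: record the failure and stop the test immediately.
    void require(bool condition) {
        if (!condition) {
            ++failures;
            throw FatalFailure{};
        }
    }
};

// Runs a test body and returns how many checks failed. A fatal failure
// ends the body early but is still counted.
template <typename Body>
int runTest(Body body) {
    TestContext ctx;
    try {
        body(ctx);
    } catch (const FatalFailure&) {
        // Already counted inside require(); nothing more to do.
    }
    return ctx.failures;
}
```

Two failing check() calls report two failures, whereas a failing require() followed by a failing check() reports only one, because the second check never runs.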
Features -- Motivation
- Handles exceptions and crashes well
- It's reasonable to expect application code to throw exceptions
- Developers may even want to check that an exception was thrown
- The test suite should be able to handle this instead of dumping core
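Both needs above, asserting that an exception is thrown and surviving an unexpected one, can be sketched with ordinary try/catch. safeDivide(), throwsAs(), and runGuarded() are hypothetical names. This only covers C++ exceptions; hard crashes such as segmentation faults require signal handling or process isolation, which is not shown.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical function under test: rejects division by zero.
int safeDivide(int a, int b) {
    if (b == 0) throw std::invalid_argument("division by zero");
    return a / b;
}

// Returns true only if calling `f` throws an exception of type E.
template <typename E, typename F>
bool throwsAs(F f) {
    try {
        f();
    } catch (const E&) {
        return true;
    } catch (...) {
        return false;  // a different exception type was thrown
    }
    return false;      // nothing was thrown
}

// Runs a test body and converts any escaping exception into a failure
// message instead of letting it terminate the whole test run.
template <typename F>
std::string runGuarded(F body) {
    try {
        body();
        return "ok";
    } catch (const std::exception& e) {
        return std::string("unexpected exception: ") + e.what();
    } catch (...) {
        return "unexpected non-standard exception";
    }
}
```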
Features -- Motivation
- Minimum work required to add new test suites
- Software is continually growing. Developers will need to add new test suites for every module under test
- It should be trivial to add a new test suite
- Don't waste development time setting up boiler-plate code
Features -- Motivation
- Actively Maintained
- Avoid using deprecated software
- ...or software that has a high probability of becoming deprecated
- As we know, code always has defects
- Use a test harness under active maintenance
- Newer releases fix bugs
- Newer releases add features
Features -- Motivation
- Clear, up to date documentation
- Trying to figure out "how-to" with poorly documented software is a time-sink
- Unit test harnesses should help developers
- Developers must have adequate instruction on how to use the test harness
Features -- Motivation
- Mocking Capability
- Moot point: it turns out that most frameworks will work with an external mocking library
Features -- Motivation
- Repeat test N times
- There may be good reason to run a test repeatedly
- Example from Hewlett-Packard: I had to run a single test 800 times in order to consistently reproduce a failure
- We saw plenty of examples in class -- a developer may want to reproduce a failure in a concurrent application that occurs only infrequently
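Where a framework lacks a repeat option, the behavior is easy to approximate by hand. firstFailingRun() is a hypothetical helper written for this sketch.

```cpp
#include <functional>

// Runs `body` up to `times` times and returns the 1-based iteration at
// which it first failed, or 0 if every run passed.
int firstFailingRun(const std::function<bool()>& body, int times) {
    for (int run = 1; run <= times; ++run) {
        if (!body()) return run;
    }
    return 0;
}
```

Stopping at the first failing iteration keeps the failing state available for inspection instead of overwriting it with later runs.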
Features -- Motivation
- License
- If the application under development is not open-source, do not use a third party library that requires your code to be open-source
- Your organization may not want to pay license fees, if any are required
- There are legal considerations
Features -- Motivation
- Per-check message
- Is a message generated for each test case in a test suite?
Features -- Motivation
- Fixture support
- Support for setUp() and tearDown() methods
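A fixture gives every test case the same fresh starting state, which also helps keep unit tests independent of one another. StackFixture and runWithFixture() are hypothetical names used for this sketch; frameworks usually provide a base class or macro for the same purpose.

```cpp
#include <vector>

// Hypothetical fixture: every test case starts from the same known state.
class StackFixture {
public:
    std::vector<int> stack;
    void setUp()    { stack = {1, 2, 3}; }  // build fresh state before a test
    void tearDown() { stack.clear(); }      // release it afterwards
};

// Wraps a test body so that setUp() runs before it and tearDown() after it.
// Note: if `body` throws, tearDown() is skipped in this simple version.
template <typename Body>
bool runWithFixture(Body body) {
    StackFixture fixture;
    fixture.setUp();
    const bool passed = body(fixture);
    fixture.tearDown();
    return passed;
}
```

Because each call constructs and sets up its own fixture, a test that pops an element cannot affect the next test, which still sees all three elements.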
Summary of Features
| Feature | CppUnit | CppUnitLite | CxxTest | xUnit++ | Google Test | Boost Test |
|---|---|---|---|---|---|---|
| Run subset of tests | not without modifying | not without modifying | yes | yes | yes | yes |
| Minimal dependencies | issues integrating with Visual Studio | yes | requires Python | requires GCC 4.7 or above | yes | requires Boost Test library |
| Time-related support | there exists an example | unsure | not that I found | halts long-running tests, "inherently fragile" | yes | time-out not available on Win32 |
| Different output format | yes | trivial to add more | yes | yes | yes | yes |
| Good assert functionality | all tests run or none do | all tests run or none do | all tests run or none do | yes | yes | yes |
| Handles exceptions well | yes | no, must be added | yes | yes | yes | yes |
| Minimal work to add new suite | no | yes | yes | yes | yes | yes |
| Actively maintained | no (not CppUnit2 either) | other versions may be | yes | yes | yes | yes |
| Clear documentation | clear but not up to date | hardly any | yes | not really | yes | up to date but overly verbose |
| Repeat test N times | yes | not that I found | not that I found | not that I found | yes | test case must be specified N times |
| License | GNU LGPL | none in original version; various in modified versions | GNU LGPLv3 | MIT | BSD | Boost Software License |
| Per-check message | no | no | yes | yes | yes | yes |
| Fixture support | yes | must be added | yes | yes | yes | yes |
Discussion and Conclusions
- After the initial review, CppUnit and CppUnitLite seemed too limited, too poorly supported, and too poorly maintained for further consideration.
- That being said, CppUnitLite is great for development in embedded systems. In this context, CppUnitLite's best feature is arguably the fact that it is so stripped down.
Discussion and Conclusions
- CxxTest, xUnit++, Boost, and Google Test were all easy to use, feature rich, and actively maintained
- CxxTest was by far the easiest framework to get up and running
Discussion and Conclusions -- CxxTest
- CxxTest was easy to set up and get running.
- No main() function needs to be defined
- Tests are written in header files
// MyTestSuite1.h
#include <cxxtest/TestSuite.h>

class MyTestSuite1 : public CxxTest::TestSuite
{
public:
    void testAddition(void)
    {
        TS_ASSERT(1 + 1 > 1);
        TS_ASSERT_EQUALS(1 + 1, 2);
    }
};
Discussion and Conclusions -- CxxTest
- After writing the test cases, the developer must run the cxxtestgen script to generate a .cpp file
- The .cpp file will have to be compiled before it can be run
- CxxTest was ported to Python from Perl. It has fewer limitations now than it did in the past
- For instance, older versions could not handle #if and #endif preprocessor directives
- Another example: multiline comments could not previously be parsed by CxxTest. Now it is possible if the user specifies --fog-parser or -f on the command line.
Discussion and Conclusions -- CxxTest
- + CxxTest had many of the desirable features considered
- + CxxTest had the most thorough, easy to read, and understandable documentation out of any of the considered frameworks
- - Python 2.4-3.3 are supported, but 2.4 will no longer be supported in upcoming releases
- - CxxTest does not display the time spent running each test
Discussion and Conclusions -- xUnit++
- At first glance, xUnit++ looked as promising as CxxTest
- Let's take a look at their self evaluation (from the xUnit++ webpage)
Discussion and Conclusions -- xUnit++
- xUnit++ gives itself a deceptively good evaluation.
- Including a test runner isn't much of an advantage
- A test runner is essentially a boiler-plate file with main() defined
- Boost test DOES NOT REQUIRE a user-supplied test runner.
- xUnit++ boasts that it has support for attributes, but fails to give motivation or use cases for attributes. There is little distinction between attributes and test suites.
- xUnit++ reports that it has support for halting long-running tests, but the actual documentation says that this feature is "fragile."
Discussion and Conclusions -- Boost Test
- Boost Test has an enormous and useful feature set
- For example: Boost gives you the option to follow fork'd children and abort if they do not return a 0 status
- A user-supplied test runner is not required, but is easy to set up if desired
- For instance, it is straight-forward to set up a test runner to set options from a configuration file
- This may be useful in different environments -- don't run a test if it's not relevant
- Simply use a macro to define test cases and test suites. Test cases may exist inside of test suites.
Discussion and Conclusions -- Boost Test
- A test case example:
- Notice BOOST_CHECK and BOOST_REQUIRE. CHECK is non-fatal, while REQUIRE is fatal and aborts the current test case on failure.
- Adding a test suite requires only two additional lines
#define BOOST_TEST_MODULE MyTest
#include <boost/test/unit_test.hpp>

int add( int i, int j ) { return i+j; }

BOOST_AUTO_TEST_CASE( my_test )
{
    // seven ways to detect and report the same error:
    BOOST_CHECK( add( 2,2 ) == 4 );        // #1 continues on error
    BOOST_REQUIRE( add( 2,2 ) == 4 );      // #2 throws on error
    if( add( 2,2 ) != 4 )
        BOOST_ERROR( "Ouch..." );          // #3 continues on error
    if( add( 2,2 ) != 4 )
        BOOST_FAIL( "Ouch..." );           // #4 throws on error
    if( add( 2,2 ) != 4 ) throw "Ouch..."; // #5 throws on error
    BOOST_CHECK_MESSAGE( add( 2,2 ) == 4,  // #6 continues on error
                         "add(..) result: " << add( 2,2 ) );
    BOOST_CHECK_EQUAL( add( 2,2 ), 4 );    // #7 continues on error
}
Discussion and Conclusions -- Boost Test
- Their documentation sucks.
- There's a LOT of documentation
- a LOT of documentation isn't necessarily a good thing
- it's overly verbose, unclear, and in some cases misleading
- IMHO StackOverflow is a better resource than the Boost webpage
Discussion and Conclusions -- Google Test
- Google Test is by far the most feature-rich, easy-to-use framework under consideration
- The documentation is up to date and easy to read
- ...BUT sometimes leaves out important details
- like how to compile test suites
- There's an option to "drop" the developer into the GDB debugger on failure
- the details are somewhat unclear
- the test must be run INSIDE the debugger to begin with
- setting the environment variable GTEST_BREAK_ON_FAILURE=1 enables this behavior
- let's take a look
Discussion and Conclusions -- Google Test
// Tests factorial of positive numbers.
TEST(FactorialTest, Positive) {
    EXPECT_EQ(1, Factorial(1));
    EXPECT_EQ(2, Factorial(2));
    EXPECT_EQ(6, Factorial(3));
    EXPECT_EQ(40320, Factorial(8));
}
(gdb) run
Starting program: /home/yourfavoriteprotein/bin/cpp_unit_test_frameworks/gtest-1.7.0/samples/mytest
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Running main() from test_main.cc
[==========] Running 6 tests from 2 test cases.
[----------] Global test environment set-up.
[----------] 3 tests from FactorialTest
[ RUN ] FactorialTest.Negative
[ OK ] FactorialTest.Negative (0 ms)
[ RUN ] FactorialTest.Zero
[ OK ] FactorialTest.Zero (0 ms)
[ RUN ] FactorialTest.Positive
sample1_unittest.cc:112: Failure
Value of: 2*3
Actual: 6
Expected: 8
Program received signal SIGSEGV, Segmentation fault.
0x0000000000413427 in testing::UnitTest::AddTestPartResult(testing::TestPartResult::Type,
char const*, int, std::string const&, std::string const&) ()
Since the developer is in the debugger, she can use gdb to figure out what went wrong.
Example from https://code.google.com/p/googletest/source/browse/trunk/samples/sample1_unittest.cc and http://stackoverflow.com/questions/23746585/why-is-google-test-segfaulting
Discussion and Conclusions -- Google Test
- Here's what it looks like when GTEST_BREAK_ON_FAILURE=0
Discussion and Conclusions -- Google Test
- As noted, Google Test is incredibly feature-rich.
- One example is sharding, which none of the other frameworks under consideration offer.
- Sharding can be used to run tests concurrently on different machines
- Google Test also has fatal and non-fatal assertions
- Tests can be disabled easily
- Even if a test is disabled, it may be selected to be run individually
- This flexibility is noteworthy
- The colorful output, as seen on the previous slide, is also "nice to have."
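The idea behind sharding can be sketched independently of Google Test: each machine is given a shard index and the total shard count, and runs only its share of the tests. The round-robin rule and the name testsForShard() below are assumptions made for illustration; this sketch does not claim to reproduce how Google Test assigns tests to shards internally.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns the tests that shard `shardIndex` (0-based) out of `totalShards`
// should run, using round-robin assignment by position in the list.
// `totalShards` must be greater than zero.
std::vector<std::string> testsForShard(const std::vector<std::string>& all,
                                       std::size_t shardIndex,
                                       std::size_t totalShards) {
    std::vector<std::string> mine;
    for (std::size_t i = 0; i < all.size(); ++i) {
        if (i % totalShards == shardIndex) mine.push_back(all[i]);
    }
    return mine;
}
```

Because every test lands in exactly one shard, running all shards on separate machines covers the full suite without duplication.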
Final remarks
Due to the richness of features, ease of use, and clarity of the documentation, Google Test earns my recommendation as a unit test harness. That being said, it is roughly equivalent to Boost Test. Therefore, if Boost Test is in use anyway, I see no strong reason to switch to Google Test other than personal taste. It is worth noting that Google Test is compatible with other test harnesses. It appears that changing the test runner slightly would allow some other framework to drive, even while executing Google Test test cases.
References
[ 1] Games from Within. Exploring the C++ Unit Testing Framework Jungle. 2004.
http://gamesfromwithin.com/exploring-the-c-unit-testing-framework-jungle
[ 2] xUnit++. How does xUnit++ compare to …? 2013.
https://bitbucket.org/moswald/xunit/wiki/Compare.wiki
[ 3] ACCU Professionalism in Programming. C++ Unit Tests Frameworks. 2007.
http://accu.org/index.php/journals/1326
[ 4] Boost C++ Libraries. Boost Library Documentation. Accessed October 2015.
http://www.boost.org/doc/libs
[ 5] BaRiS AyDiNoZ. Unittest++ vs cppunit. Accessed October 2015.
https://sites.google.com/site/barisaydinoz/unittest-vscppunit
[ 6] stackoverflow. Repeat testcase in Boost test multiple times. May 2012.
http://stackoverflow.com/questions/10554613/repeat-testcase-in-boost-test-multiple-times
[ 7] SourceForge.net. CppUnit. December 2008.
http://sourceforge.net/apps/mediawiki/cppunit/index.php?title=Main_Page
[ 8] stackoverflow. Boost Unit Test timing mechanism. May 2012.
http://stackoverflow.com/questions/10518099/boost-unit-test-timing-mechanism
[ 9] stackoverflow. How to parametrize [sic] a test using cppunit. November 2014.
http://stackoverflow.com/questions/290099/how-to-parametrize-a-test-using-cppunit
[10] CxxTest. CxxTest User Guide. Accessed October 2015.
http://cxxtest.com/guide.html
[11] SourceForge.net. [soci-users] Using a unit testing framework? March 2013.
http://sourceforge.net/p/soci/mailman/soci-users/thread/1364228493.69102.YahooMailNeo@web172205.mail.ir2.yahoo.com/
[12] Wikipedia. Unit testing. Accessed October 2015.
http://en.wikipedia.org/wiki/Unit_testing
[13] Paul Ammann and Jeff Offutt. Introduction to Software Testing. 2008.
Cambridge University Press.
[14] JM Stecklein. Error Cost Escalation Through the Project Life Cycle. June 2004.
http://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/20100036670.pdf
NASA Johnson Space Center.
[15] Team Professor. Fundamentals of Secure Development. Accessed October 2015.
https://teamprofessor.absorbtraining.com/
[16] Agile Modeling. Introduction to Test Driven Development (TDD). Accessed October 2015.
http://agiledata.org/essays/tdd.html