Introduction
Unit testing is a technique for testing individual components (also referred to as “units”) of a program, such as methods or classes, independently from each other. Unit testing can identify bugs at an early stage, which means they can be fixed faster and more cheaply. A proper unit test suite also acts as a safety net, allowing developers to evolve existing functionality and add new features without fear of introducing regressions.
In software testing, code coverage is a metric that measures the extent to which the source code is executed during testing. In the context of unit testing, it indicates how many lines, branches, or paths in the codebase are covered by unit tests. Higher code coverage is generally better as it reduces the likelihood of untested, error-prone code. However, aiming for 100 percent coverage can lead to diminishing returns, encouraging superficial tests that do little to improve quality.
This article provides an overview of unit testing tools and practices and explains how to introduce and track code coverage effectively.
Unit Testing Tools
Unit testing always relies on tools, primarily “unit testing frameworks.” They enable developers to focus on writing effective tests in a structured manner rather than handling the intricacies of test execution and result analysis.
Most unit testing frameworks share several core features:
- A structure for defining and organizing test cases in files or classes
- Assertion functions that compare expected and actual test results, which determine whether a test passes or fails
- Setup and teardown methods for initializing and cleaning up test environments before and after running tests
- An environment to run unit tests and obtain their results
- Test-report generation
Here are some unit test frameworks that are commonly used with the world’s most popular programming languages:
- JavaScript/TypeScript: Jest, Mocha, Jasmine
- Python: unittest, pytest
- Java: JUnit, TestNG
- C#: xUnit.net, NUnit
- C++: GoogleTest, Catch2
- PHP: PHPUnit
- Go: testing (built-in), Testify
- Rust: built-in test harness (run via cargo test)
- Ruby: RSpec, minitest
Code Coverage Tools
In addition to writing and running test cases, you may want to measure how much of your production code is covered by tests. This is where code coverage tools come in.
Some unit testing frameworks measure code coverage out of the box, while others depend on plugins or third-party tools. For example, Jest has built-in support for code coverage, enabled by running tests with the --coverage flag. In pytest, you can add code coverage using the pytest-cov plugin. JUnit does not include built-in code coverage tools; instead, Java developers often use it alongside the JaCoCo code coverage library.
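To give a sense of what this looks like in practice, here are typical commands for collecting coverage in each of these ecosystems. This is a minimal sketch: it assumes pytest-cov is installed and, for the Maven example, that the jacoco-maven-plugin is configured in your pom.xml; src is a placeholder for your package.
# Jest: coverage support is built in
npx jest --coverage
# pytest: coverage via the pytest-cov plugin
pytest --cov=src
# Maven + JaCoCo: run the tests, then generate the coverage report
mvn test jacoco:report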
Apart from tools that collect code coverage metrics and generate basic reports, there are tools that ingest these reports in order to visualize them and help your team monitor code coverage over time. For example, SonarQube, a code-quality analysis platform, can show code coverage as part of its multiple metrics.
SonarQube collects coverage reports from various tools for languages such as Java, JavaScript/TypeScript, Python, PHP, C/C++, and the .NET languages. Once these reports are imported, they are combined with other code-quality metrics, such as code smells and maintainability issues, to provide a comprehensive view of your codebase’s health. From there, you can track coverage trends and configure quality gates to prevent poorly tested code from being introduced.
Writing Effective Unit Tests
Before getting into the details of code coverage, let’s first look at some unit testing best practices. Ensuring your team follows these guidelines will make coverage metrics more meaningful as you scale. If you’re already confident in your unit testing approach, feel free to skip ahead to “What Is Good Code Coverage?”
Well-written unit tests can help catch bugs early, make refactoring safer, and serve as living documentation for your code. Here are some best practices and guidelines for writing effective unit tests.
Follow the “Arrange, Act, Assert” (AAA) Pattern
The “Arrange, Act, Assert” (AAA) pattern organizes each test into three distinct sections:
- Arrange prepares the unit (usually a class) being tested and sets up the conditions and inputs for the test.
- Act executes the specific function being tested and captures the result.
- Assert verifies that the actual result matches the expected result.
This pattern improves test readability by clearly separating setup, execution, and verification steps.
Here’s an example of a unit test written in Java using the JUnit unit testing framework, which follows the AAA pattern. The three steps of AAA are outlined with comments:
import static org.junit.jupiter.api.Assertions.assertEquals;
import org.junit.jupiter.api.Test;
public class CalculatorTest {
@Test
void calculateSum_withPositiveNumbers_returnsCorrectSum() {
// Arrange
Calculator calculator = new Calculator();
int a = 5;
int b = 10;
// Act
int result = calculator.calculateSum(a, b);
// Assert
assertEquals(15, result, "Sum of 5 and 10 should be 15");
}
}
Use a Clear and Consistent Naming Convention for Test Cases
When a unit test name clearly describes what is being tested, developers can understand the purpose of the test at a glance without needing to delve deeply into the test code. Here are some commonly applicable naming recommendations:
- Use expressive test names: Name each test so its function is clear from reading its name. Names like Test1, Test2, and YetAnotherTest don’t convey the intent of the test, making it harder to understand what has and what has not been tested.
- Choose from existing naming conventions: Instead of inventing your own naming convention, consider picking one of the established conventions (each is illustrated in a short sketch after this list). Some options include the following:
  - Behavioral naming explicitly describes the behavior of the unit being tested under specific conditions and the expected outcome. The typical structure of a name is methodName_condition_expectedResult (eg calculateSum_sameNumbersOppositeSigns_returnsZero).
  - Given-When-Then emphasizes the context of a test (“given”), the behavior being tested (“when”), and the expected outcome (“then”). The structure would be given[Condition]_when[Action]_then[ExpectedOutcome] (eg givenNumbersWithOppositeSigns_whenCalculateSum_thenReturnsZero).
  - Feature-based naming advocates grouping unit tests by feature rather than by implementation methods. It would be structured as [FeatureName]_[Scenario] (eg ShoppingCart_AddItem_ItemAppearsInCart).
- Use underscores for readability: Even if your programming language’s naming conventions don’t usually use underscores (eg Java, JavaScript, or C#), it’s okay to use them in test names. Test names tend to be lengthy, and visually separating the components of a test name improves readability.
- Make test names approachable for nontechnical people: If you rely on your unit tests as a communication method between the development team and less technical employees, make test names reflect the expected behavior instead of implementation details.
- Mind your unit testing framework: When choosing a naming convention, consider any requirements imposed by your unit testing framework. For example, the two prominent Python unit testing frameworks, unittest and pytest, look for test methods that start with the test_ prefix. If you use the default configuration of one of these frameworks and do not follow their naming conventions, your tests won’t be discovered and run.
- Choose one and stick to it: Whatever naming convention you choose, stick to it and use it across your team. Consistent naming helps identify patterns, locate specific tests, and maintain organization in your test suite.
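To make the conventions above concrete, here’s a sketch showing how test names look under each one (the class name and the behaviors being tested are hypothetical):
import org.junit.jupiter.api.Test;
public class NamingConventionExamples {
    // Behavioral naming: methodName_condition_expectedResult
    @Test
    void calculateSum_sameNumbersOppositeSigns_returnsZero() { /* ... */ }
    // Given-When-Then: given[Condition]_when[Action]_then[ExpectedOutcome]
    @Test
    void givenNumbersWithOppositeSigns_whenCalculateSum_thenReturnsZero() { /* ... */ }
    // Feature-based naming: [FeatureName]_[Scenario]
    @Test
    void ShoppingCart_AddItem_ItemAppearsInCart() { /* ... */ }
}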
Keep Tests Independent and Isolated
An independent unit test isn’t affected by the results of running any other tests. When a test fails, independence makes it easier to fix because you know it tests one specific behavior. Independent tests can also run in parallel, which can dramatically speed up test execution.
An isolated unit test doesn’t depend on external systems. If your tests need to share external resources, such as databases or network sockets, make sure that the state of these resources is reset between tests.
Here’s what you can do to maintain test independence and isolation:
- Replace external dependencies like databases or APIs with mocks to isolate the unit being tested.
- Perform test cleanup. Clean up any resources—such as files, database entries, or configurations—created during a test. Many frameworks provide mechanisms for this, such as JUnit’s @AfterEach and @AfterAll annotations; see the sketch after this list.
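For example, here’s a minimal JUnit sketch that gives each test its own temporary file and removes it afterward (FileProcessorTest and the file-based scenario are illustrative):
import static org.junit.jupiter.api.Assertions.assertEquals;
import java.nio.file.Files;
import java.nio.file.Path;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
public class FileProcessorTest {
    private Path tempFile;
    @BeforeEach
    void setUp() throws Exception {
        // Each test gets a fresh file, so no state leaks between tests
        tempFile = Files.createTempFile("test-data", ".txt");
    }
    @AfterEach
    void tearDown() throws Exception {
        // Clean up the resource created for the test
        Files.deleteIfExists(tempFile);
    }
    @Test
    void readFile_withEmptyFile_returnsZeroBytes() throws Exception {
        assertEquals(0, Files.size(tempFile));
    }
}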
Test Edge Cases
Many bugs occur because of incorrect handling of minimum or maximum values or improper handling of empty and null values. Testing these cases catches issues early and often reveals parts of the code under test that need refining.
A solid unit test suite will contain:
- tests that validate behavior with input within a comfortable, expected range and
- tests that verify edge cases and invalid input.
Here are some common examples of edge cases to test for (a short sketch follows the list):
- An empty array for a function that expects a nonempty array
- A file with zero bytes for a function that processes file content
- A negative value for a function that expects a positive value
- A string containing non-ASCII characters for a function that processes strings
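To make this concrete, here’s a short sketch testing two of these edge cases against a hypothetical max() function, defined inline so the example is self-contained:
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertThrows;
import org.junit.jupiter.api.Test;
public class EdgeCaseTest {
    // Hypothetical unit under test: returns the largest value in a nonempty array
    private int max(int[] values) {
        if (values == null || values.length == 0) {
            throw new IllegalArgumentException("values must be nonempty");
        }
        int result = values[0];
        for (int value : values) {
            if (value > result) {
                result = value;
            }
        }
        return result;
    }
    @Test
    void max_withSingleElement_returnsThatElement() {
        assertEquals(7, max(new int[] {7}));
    }
    @Test
    void max_withEmptyArray_throwsException() {
        assertThrows(IllegalArgumentException.class, () -> max(new int[] {}));
    }
}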
Prefer Expressive Assertion APIs
Unit test frameworks often provide various assertion functions. Some of them are generic, such as Jest’s toEqual() matcher, while others are more specialized and expressive, such as toBeInstanceOf() or toBeTruthy(). While generic assertions can cover many cases, try using more expressive assertions to improve the readability and intent of your tests.
For example, here are two possible ways to assert a false Boolean result in a JUnit test. The generic assertion function assertEquals() works, but the more specialized assertFalse() is more concise and helps understand expectations at a glance:
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertFalse;
import org.junit.jupiter.api.Test;
public class BooleanTest {
@Test
void isEven_withOddNumber_returnsFalse() {
int number = 5;
boolean result = isEven(number);
// Generic assertion
assertEquals(false, result, "5 is not an even number");
// Expressive assertion
assertFalse(result, "5 is not an even number");
}
private boolean isEven(int number) {
return number % 2 == 0;
}
}
Here’s another JUnit example; see how the specialized assertArrayEquals() function allows asserting that two arrays contain the same values in the same order, something that requires several lines when using the generic assertEquals() function:
import static org.junit.jupiter.api.Assertions.assertArrayEquals;
import static org.junit.jupiter.api.Assertions.assertEquals;
import org.junit.jupiter.api.Test;
public class ArrayTest {
@Test
void arrays_shouldBeEqual() {
int[] expected = {1, 2, 3};
int[] actual = {1, 2, 3};
// Generic assertion
assertEquals(expected.length, actual.length, "Arrays should have the same length");
for (int i = 0; i < expected.length; i++) {
assertEquals(expected[i], actual[i], "Array elements should match at index " + i);
}
// Expressive assertion
assertArrayEquals(expected, actual, "Arrays should be equal");
}
}
Advanced Unit Testing Techniques
Basic assertions are not always enough as your unit test suite grows. Earlier, we discussed that effective unit tests need to be isolated from external systems such as databases, but how do you achieve this in practice? If you want to run many similar tests with a variety of inputs, do you need to duplicate a test for each input, or is there a better way? Knowing a few advanced techniques will help you address these challenges.
Mocking Dependencies
Mocking involves replacing real dependencies with substitutes (“mocks”) to ensure control over the unit of code being tested. This is important when the unit under test depends on external systems, such as databases, APIs, or file systems, which can be slow, unreliable, or difficult to set up for tests. Mocking is how you achieve isolation of your unit tests.
Using mocks, you can simulate specific dependency behaviors, control their responses, and test how your code interacts with them. For example, when testing a service that retrieves data from an external API, a mock can simulate different API responses, including success, failure, and timeouts, without making actual network calls.
Mocking frameworks, such as Mockito for Java, make it easier to create, configure, and verify mocks.
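Here’s a minimal sketch of a Mockito-based test; the UserRepository and UserService types are hypothetical and defined inline to keep the example self-contained:
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.when;
import org.junit.jupiter.api.Test;
public class UserServiceTest {
    // Hypothetical dependency that would normally query a database
    interface UserRepository {
        String findNameById(long id);
    }
    // Hypothetical unit under test
    static class UserService {
        private final UserRepository repository;
        UserService(UserRepository repository) { this.repository = repository; }
        String greet(long id) { return "Hello, " + repository.findNameById(id); }
    }
    @Test
    void greet_withKnownUser_returnsGreeting() {
        // Arrange: replace the real repository with a mock and script its response
        UserRepository mockRepository = mock(UserRepository.class);
        when(mockRepository.findNameById(42L)).thenReturn("Ada");
        // Act
        String greeting = new UserService(mockRepository).greet(42L);
        // Assert: check the result and verify the interaction with the mock
        assertEquals("Hello, Ada", greeting);
        verify(mockRepository).findNameById(42L);
    }
}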
Data-Driven Testing
Data-driven testing is a technique where test cases are executed with multiple sets of input data, allowing you to validate the behavior of a unit across a variety of scenarios without writing separate tests for each data set.
This approach is particularly useful when a function needs to handle a wide range of inputs, such as numerical calculations, string manipulations, or decision-making logic.
Many unit testing frameworks support data-driven testing through special test annotations. For example, in JUnit, @ParameterizedTest sets up a test to run with multiple sets of inputs, while @MethodSource defines where to look for these inputs.
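Here’s a minimal sketch of what this looks like in JUnit 5 (it assumes the junit-jupiter-params artifact is on the classpath; calculateSum() is a hypothetical unit under test):
import static org.junit.jupiter.api.Assertions.assertEquals;
import java.util.stream.Stream;
import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.Arguments;
import org.junit.jupiter.params.provider.MethodSource;
public class CalculatorParameterizedTest {
    // Hypothetical unit under test
    private int calculateSum(int a, int b) {
        return a + b;
    }
    // Each Arguments entry is one run of the test: (a, b, expected sum)
    static Stream<Arguments> sumInputs() {
        return Stream.of(
            Arguments.of(5, 10, 15),
            Arguments.of(-5, 5, 0),
            Arguments.of(0, 0, 0)
        );
    }
    @ParameterizedTest
    @MethodSource("sumInputs")
    void calculateSum_withVariousInputs_returnsExpectedSum(int a, int b, int expected) {
        assertEquals(expected, calculateSum(a, b));
    }
}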
How to Get Started with Code Coverage
When your team is comfortable enough with unit testing as a practice, the logical next steps are to see how much of your production code is tested, check whether you’re testing the right parts, and plan your future testing efforts. Code coverage is the right metric to help you with that.
A good starting point is to measure the current coverage using your unit testing framework or a separate code coverage tool. Establishing a coverage baseline helps your team understand which parts of the codebase are already well tested and which require additional attention.
Next, you should agree on your targets for code coverage and your strategy to achieve them. Remember: you want to encourage meaningful testing without being overly demanding or disruptive. There are at least two paths to get there:
- Target critical, high-risk areas of your codebase: Writing tests for business-critical components, complex logic, or areas with a history of bugs helps pull the most value out of your unit testing efforts early on. While doing this, consider adopting incremental coverage goals to make the process more manageable: if your baseline coverage stands at 15 percent, aim for 25 or 30 percent over a few months instead of forcing a less realistic target. When you achieve the smaller target, aim higher.
- Focus on high coverage for new code: Introducing thresholds for newly written code, such as requiring 80 percent coverage for new components, helps avoid dedicating sprints solely to improving code coverage. Your team can continue delivering new value while adopting the habit of properly testing new code.
Whichever path you choose, consider integrating code coverage reporting into your CI/CD pipeline. By tracking progress in CI dashboards, teams can see the impact of their efforts and maintain accountability.
What Is Good Code Coverage?
Code coverage refers to the extent to which your source code is executed by your test suite. Good code coverage is about more than hitting a number: it means strategically testing significant parts of your code, using tools to track and improve coverage trends, and ensuring critical business logic is adequately covered.
Let’s see what code coverage can tell you about the quality of your software project:
- Starting with any unit tests means progress: Zero percent code coverage means unit tests are nonexistent in your codebase. However, starting with even 1 percent coverage marks a significant improvement and sets the foundation for better code quality.
- Identifying critical areas for improvement: Detecting low code coverage in critical business logic highlights areas that may need immediate attention for risk mitigation. Addressing these gaps reduces the likelihood of bugs and regressions significantly. For example, if a key function handling customer data shows insufficient coverage, adding tests there ensures it performs reliably under different scenarios.
- Tracking code coverage trends: Leveraging tools such as SonarQube to monitor code coverage trends enables the team to track progress over time. This visibility helps the team ensure they are consistently improving their test coverage, addressing any areas where coverage may be degrading, and extending the share of the codebase covered with tests. An example is observing a consistent increase in test coverage percentage after each sprint, reflecting growing confidence in the stability of the codebase.
Here’s what code coverage can’t tell you:
- If your unit tests are of sufficient quality: Code coverage does not measure the quality of tests, and high coverage can include superficial or ineffective tests.
- Whether your application is fully tested: When a test executes a line of code, it contributes to code coverage, even if the test never actually verifies that line’s behavior. Critical scenarios or edge cases could still be missed.
- If it’s the right code that is covered: Coverage metrics don’t differentiate between essential business logic and trivial code (such as getters and setters), which may inflate coverage numbers without meaningful testing.
Since code coverage as a metric doesn’t convey the quality of unit tests, avoid mandating high coverage numbers without giving the team sufficient resources and investing in a culture of meaningful unit testing. When faced with administrative pressure to meet unrealistic coverage targets, developers may sabotage the initiative by introducing tests that execute as much code as possible without validating the program’s behavior in a meaningful way.
Once you’ve introduced code coverage cautiously and iteratively, making sure the substance of your tests keeps pace with your coverage targets, you’ll likely want to know how to set a meaningful target for the maximum code coverage you can ultimately achieve.
There are no universally mandated benchmarks for code coverage. However, the industry generally agrees that achieving 100 percent code coverage is not feasible, for the following reasons:
- Parts of your code may not be designed for unit testing. For instance, UI code is better tested through integration or end-to-end tests.
- Code automatically generated by tools or frameworks is typically excluded from testing.
- Thin wrappers around third-party APIs are often excluded, as their behavior is determined by the external dependency.
- Platform-specific code, such as operating system calls, may not be testable with your chosen unit test framework.
A commonly cited long-term target is 70–80 percent code coverage. For example, at Google, 75 percent coverage is considered commendable. While you may aim for this range, it’s how you get there that really matters:
- Focus on covering the most critical, complex, and dynamic parts of your codebase. This may require you to look at a broader picture, considering other factors such as cyclomatic complexity, frequency of change, and bug-reporting trends.
- Look for tools that guide you toward critical untested areas. For example, the Hot Spots view in JetBrains’ dotCover analyzes code coverage alongside complexity and recent modifications to identify risky code.
Prioritize quality over quantity when writing tests. A narrow focus on achieving high coverage may lead to bloated test suites full of superficial tests with little value. Educating your team on effective unit testing and prioritizing critical areas to test is what makes code coverage a meaningful metric.
Summary
In this article, you explored the importance of unit testing, focusing on best practices for writing effective unit tests and the role of code coverage in assessing test effectiveness. You also learned about various unit testing frameworks, code coverage tools, and techniques for writing high-quality tests, including mocking dependencies and data-driven testing.
Code coverage is a valuable metric that can help you identify areas of your codebase that are not well tested. However, it’s important to remember that high code coverage does not guarantee that your unit tests are of high quality. It can be misleading if you focus on achieving a high coverage percentage without also considering the quality of your tests.
SonarQube can help you keep track of code coverage by integrating with popular code coverage tools and acting as a central hub. It shows the percentage of code covered by test cases alongside other code-quality metrics, such as security, reliability, and maintainability, that are measured through static code analysis. By integrating these results with a quality gate, SonarQube can provide clear pass/fail criteria to ensure code meets quality standards. The end result is that developers can use these actionable insights to continuously improve the quality and security of their code while lowering the risk of regressions caused by testing gaps.
With the right approach, your code coverage will organically reflect the quality of your codebase, and you can use tools like SonarQube to track your coverage targets and enable developers to build better, faster.
Author: Jura Gorohovsky