Brownie: Evaluating Solidity Code Coverage via Opcode Tracing

Brownie is a Python framework for Ethereum smart contract testing, debugging, interaction and deployment.

Introduction

An important tool in every developer’s arsenal is the ability to evaluate code coverage. A coverage report provides a high-level overview that can be used to find gaps in your test suite. While a high coverage percentage by no means guarantees quality tests, it does provide a better sense of where undiscovered bugs may be lurking. Given the immutable nature of smart contracts and the vast sums of value they secure, we should welcome and utilize every tool available during the development process.

The following article discusses how Brownie handles code coverage via trace analysis. It explores the motivations, gives a summary of the implementation, discusses the benefits and challenges, and talks about where we’re going next.

Why is this Useful?

Before we explore the technical details… Behold, a sample Brownie branch coverage report!

Branch coverage report as displayed by the Brownie GUI

The color of each highlight indicates how that branch evaluated during testing:

  • Green branches evaluated both truthfully and falsely
  • Yellow branches were only evaluated truthfully
  • Orange branches were only evaluated falsely
  • Red branches were not evaluated

With this report you can quickly see how your tests are interacting with your contract. It can help you determine which areas of your project need more testing, as well as locate sections of unreachable code.

Techniques for Coverage Evaluation

Broadly speaking there are two approaches to evaluating coverage:

  • Instrumentation involves injecting data collectors throughout the code that are used to monitor exactly which lines and branches are executed. This is how the popular tool solidity-coverage works.
  • Tracing involves monitoring the program counter of the compiled code as it executes, and then mapping the program counters back to specific lines of code.

Instrumentation is the more common approach. It is far simpler to implement and generally viewed as giving the most accurate data. The main downside when compared to tracing is that it is invasive — the functions added to monitor execution mean that the code being evaluated is not the same code that will be used in production.

Additionally, the EVM brings a whole new set of challenges in the form of gas costs. Every operation uses gas, and each block has a finite supply of gas available. Instrumenting a contract means adding more operations, which in turn means increased deployment and execution costs. To stay within the block gas limit, this sometimes means running on a modified EVM ruleset. So now we are testing a modified set of code within a modified virtual machine!

For this reason, when implementing coverage analysis in Brownie I chose to use tracing rather than instrumentation. I felt the benefits of testing the real code justify the challenges in building such a system. So, where to begin?

Implementation Basics

The key to implementing coverage via tracing lies in two data structures returned by the Solidity compiler:

  • The abstract syntax tree, which is a standardized representation of the source code syntax. Brownie uses py-solc-ast to traverse the AST.
  • The deployed source mapping, where compiled opcodes are mapped to the original source code. Brownie expands this into its own program counter map which it uses extensively in coverage analysis.

By analyzing the AST we can locate statements and branches, and then use the source map to associate them with opcodes. We then query the debug_traceTransaction RPC endpoint on each transaction that runs during unit tests, and analyze the returned data to find out which code was hit. For contract calls, we instead broadcast them as transactions to get the trace, then immediately rewind the chain to ensure the state was not changed. Easy, right?
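To make the idea of a program counter map concrete, here is a minimal sketch of how a compressed Solidity source map could be expanded and lined up with the deployed bytecode. The function names and the dictionary layout are assumptions made for illustration and are not taken from Brownie's source. It relies on two facts: each source map entry has the form s:l:f:j with empty fields inherited from the previous entry, and PUSH1 through PUSH32 are followed by 1 to 32 bytes of immediate data.

```python
def expand_source_map(source_map):
    """Decompress a source map into one (start, length, file, jump) tuple per instruction."""
    entries = []
    current = ["0", "0", "0", "-"]  # running values; empty fields inherit these
    for raw in source_map.split(";"):
        for i, value in enumerate(raw.split(":")[:4]):
            if value:
                current[i] = value
        start, length, file_id, jump = current
        entries.append((int(start), int(length), int(file_id), jump))
    return entries


def build_pc_map(bytecode_hex, source_map):
    """Pair each instruction's program counter with its source offsets (hypothetical layout)."""
    if bytecode_hex.startswith("0x"):
        bytecode_hex = bytecode_hex[2:]
    code = bytes.fromhex(bytecode_hex)
    pc_map = {}
    pc = 0
    for start, length, file_id, jump in expand_source_map(source_map):
        if pc >= len(code):
            break
        op = code[pc]
        pc_map[pc] = {
            "op": op,
            "start": start,
            "stop": start + length,
            "file": file_id,
            "jump": jump,
        }
        # PUSH1 (0x60) .. PUSH32 (0x7f) carry 1..32 bytes of immediate data
        pc += 1 + (op - 0x5F if 0x60 <= op <= 0x7F else 0)
    return pc_map


# PUSH1 0x01, PUSH1 0x02, ADD
pc_map = build_pc_map("0x6001600201", "0:10:0:-;5:3;")
```

Because the loop is driven by the source map entries rather than the raw bytes, trailing data such as the metadata hash is simply never visited.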

Let's get into the nuts and bolts!

Statement Coverage

Statements are syntax units that express an action to be performed. They are self-contained and linear, having a single point of entry and exit within the code.

Mapping statement coverage within Solidity is relatively straightforward. First, we search the AST for the deepest statement nodes (those which are not the parent of another statement). Then we iterate through the program counter map looking for opcodes which have a source offset contained within the source offset of the statement node. Whenever one is found, that opcode is associated with the statement. We know that the presence of that opcode within a trace means that this statement was executed.
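The steps above can be sketched roughly as follows. This is a simplified illustration rather than Brownie's actual code: statements are represented as plain (start, stop) source offsets, the program counter map is a hypothetical dict keyed by program counter, and everything is assumed to live in a single source file.

```python
def deepest_statements(statements):
    """Keep only the statements that do not contain another statement."""
    result = []
    for outer in statements:
        contains_other = any(
            inner != outer and outer[0] <= inner[0] and inner[1] <= outer[1]
            for inner in statements
        )
        if not contains_other:
            result.append(outer)
    return result


def map_statements(statements, pc_map):
    """Associate each deepest statement with the opcodes whose offsets fall inside it."""
    mapping = {stmt: set() for stmt in deepest_statements(statements)}
    for pc, info in pc_map.items():
        for start, stop in mapping:
            if start <= info["start"] and info["stop"] <= stop:
                mapping[(start, stop)].add(pc)
    return mapping


def executed_statements(mapping, trace_pcs):
    """A statement counts as executed if any of its opcodes appears in the trace."""
    hit = set(trace_pcs)
    return {stmt for stmt, pcs in mapping.items() if pcs & hit}


# Hypothetical data: one block statement wrapping two simple statements
statements = [(0, 50), (10, 20), (30, 40)]
pc_map = {
    0: {"start": 0, "stop": 50},
    1: {"start": 12, "stop": 18},
    2: {"start": 30, "stop": 40},
    3: {"start": 15, "stop": 45},
}
mapping = map_statements(statements, pc_map)
hit = executed_statements(mapping, trace_pcs=[0, 1])
```

Here the outer block (0, 50) is discarded because it is the parent of other statements, and the opcode at program counter 3 is ignored because its offsets span more than one statement.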

Branch Coverage

Branch coverage is where things get interesting.

A branch is an instruction that can cause a program to execute different code. In the EVM, branches are denoted by the JUMPI opcode. Within Solidity, these occur during if statements, require statements, and ternary operations.

To map branch coverage, we must first search the AST for the following nodes:

  • IfStatement (if (x) {} else {})
  • Conditional (a = x > y ? 1 : 2;)
  • FunctionCall containing a require expression (require(x, "oopsie"); )

Next, we search the children of these nodes for BinaryOperation expressions (operations that evaluate in the branch, such as x > y or returnsBoolFn()). Because we must also account for nested operations such as ((x > y) || (x - 4 < y)), we ignore any BinaryOperation nodes that have children which are also BinaryOperation nodes.

Once this list of nodes is generated, we associate opcodes in a similar manner to how statement mapping was handled: by looking for opcodes that have a source offset contained within the offset of the node. We must also map these opcodes to JUMPI instructions, which we use to determine how the branch evaluated. To do so, we find the last opcode with a source offset contained inside the AST source offset, and associate it with the next JUMPI instruction. The reason we must use the last opcode has to do with the way Solidity handles jump instructions within nested binary operations.
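As a rough sketch of that last step, the function below finds the last opcode inside a branch node's source range and then scans forward for the next JUMPI (opcode 0x57). The function name and the pc_map layout are hypothetical, and "next" is read here as strictly after the last contained opcode, which is an assumption about the exact rule.

```python
JUMPI = 0x57  # EVM opcode for a conditional jump


def find_branch_jump(node, pc_map):
    """Return the program counter of the JUMPI associated with a branch node, or None."""
    start, stop = node
    pcs = sorted(pc_map)
    inside = [
        pc for pc in pcs
        if start <= pc_map[pc]["start"] and pc_map[pc]["stop"] <= stop
    ]
    if not inside:
        return None
    last = inside[-1]
    # Scan forward from the last opcode that belongs to the node
    for pc in pcs:
        if pc > last and pc_map[pc]["op"] == JUMPI:
            return pc
    return None


# Hypothetical map: the condition `x > y` occupies source offsets 10..15
pc_map = {
    0: {"op": 0x60, "start": 10, "stop": 15},  # PUSH1, part of the condition
    2: {"op": 0x10, "start": 10, "stop": 15},  # LT, last opcode of the condition
    3: {"op": 0x60, "start": 0, "stop": 40},   # PUSH1, jump destination
    5: {"op": 0x57, "start": 0, "stop": 40},   # JUMPI for the enclosing if
    6: {"op": 0x00, "start": 20, "stop": 30},  # STOP
}
jump_pc = find_branch_jump((10, 15), pc_map)
```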

Determining the relationship between the outcome of the JUMPI and whether the branch evaluated true or false depends on the type of node and its location relative to other nodes within the AST. There are many rules, and exceptions to them. If you’re still with me and interested in how this is handled, I invite you to view the relevant source code.

The end result of all this is a map of opcodes associated with both source offsets and jump instructions, which can be used to determine whether a branch has executed and whether it evaluated truthfully or falsely!
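To illustrate how such a map might be consumed, here is a small sketch that walks the structLogs list returned by debug_traceTransaction and records what happened at each tracked JUMPI. It relies on the fact that a JUMPI whose condition is zero continues at the next program counter, while a non-zero condition jumps elsewhere. The function name is hypothetical, the sketch assumes every step belongs to a single contract, and it deliberately reports "jumped" or "fell_through" rather than true or false, since that translation depends on the node-specific rules mentioned above.

```python
def record_jumpi_outcomes(struct_logs, jumpi_pcs, outcomes=None):
    """Record, for each tracked JUMPI, whether the trace jumped or fell through."""
    if outcomes is None:
        outcomes = {pc: set() for pc in jumpi_pcs}
    # Compare each step with the one that follows it
    for step, following in zip(struct_logs, struct_logs[1:]):
        pc = step["pc"]
        if step["op"] == "JUMPI" and pc in outcomes:
            jumped = following["pc"] != pc + 1
            outcomes[pc].add("jumped" if jumped else "fell_through")
    return outcomes


# Two hypothetical traces that reach the same JUMPI at program counter 5
trace_a = [
    {"pc": 4, "op": "PUSH1"},
    {"pc": 5, "op": "JUMPI"},
    {"pc": 6, "op": "PUSH1"},
]
trace_b = [
    {"pc": 5, "op": "JUMPI"},
    {"pc": 20, "op": "JUMPDEST"},
]
outcomes = record_jumpi_outcomes(trace_a, {5})
after_first = {pc: set(seen) for pc, seen in outcomes.items()}
outcomes = record_jumpi_outcomes(trace_b, {5}, outcomes)
```

A JUMPI that ends up with both outcomes across the whole test suite corresponds to a green branch in the report; one with a single outcome would be yellow or orange, and one with none would be red.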

Execution Times

This technique is not without its limitations, the biggest of which is execution time. Queries to debug_traceTransaction are slow! Brownie attempts to mitigate this by tracking coverage data on a per-transaction basis, and where an identical transaction is broadcast it does not repeat the process. With a well-designed test suite this can lead to significantly faster execution.
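The caching idea can be sketched in a few lines. This is only an illustration under stated assumptions: the class name is hypothetical, and the transaction hash is used as the cache key on the assumption that replaying an identical transaction after a chain rewind yields the same hash. How Brownie actually identifies identical transactions is not shown here.

```python
class TraceCoverageCache:
    """Memoize coverage results so each distinct transaction is traced only once."""

    def __init__(self, evaluate):
        self._evaluate = evaluate  # callable: trace -> coverage data
        self._by_hash = {}

    def get(self, tx_hash, fetch_trace):
        # fetch_trace is only called on a cache miss, which avoids the slow RPC query
        if tx_hash not in self._by_hash:
            self._by_hash[tx_hash] = self._evaluate(fetch_trace())
        return self._by_hash[tx_hash]


# Usage with stand-in callables in place of a real RPC query
calls = []

def fake_fetch():
    calls.append(1)
    return [{"pc": 0}, {"pc": 2}]

cache = TraceCoverageCache(evaluate=lambda trace: {step["pc"] for step in trace})
first = cache.get("0xabc", fake_fetch)
second = cache.get("0xabc", fake_fetch)
```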

Additionally, the Brownie Pytest plugin includes an --update flag that allows you to skip re-running tests where nothing has changed. This also requires tests to be designed with proper isolation, and could make for another article in itself. If you want to learn more you can view the documentation on this subject.

Try It Yourself!

If you’d like to see Brownie’s coverage evaluation in action you can use the following commands to install Brownie, download the Brownie Mix token template, run the tests, and open the GUI:

pip install eth-brownie
brownie bake token
cd token
pytest tests -C
brownie gui

Going Forward…

Brownie has only recently come out of beta and is still under active development. While this functionality has been tested extensively, more scrutiny from fresh eyes and more use from other projects is always a good thing.

In the near future Brownie will be adding similar support for Vyper. The latest Vyper release exposes an opcode source map comparable to that of Solidity, which should allow for the same level of coverage evaluation. Brownie and Vyper are a perfect fit in many ways; expect much more on this soon.

If you found this article interesting, if you have any ideas, comments or criticisms, please reach out! Leave a comment below, send me an email or find me on Gitter or Telegram. I would love to hear from you.


Brownie: Evaluating Solidity Code Coverage via Opcode Tracing was originally published in Coinmonks on Medium.