In the past at Nubank we used an End-to-End test suite to find issues across the boundaries of our microservices architecture in a staging environment. This meant that the interactions between services were usually backed by actual databases, messaging systems, HTTP requests, etc.
In general, End-to-End tests are black-box tests: we stimulate one of the inputs of the system (e.g. by making an HTTP request to an endpoint or producing an asynchronous message to a topic).
The system then produces several interactions between its parts, possibly many other HTTP requests and/or asynchronous messages. We check the validity of these interactions by checking specific outputs, for example by calling another HTTP endpoint to confirm that the desired effect has been produced.
End-to-end in a fintech
As a fintech, quality is of the utmost importance for us. We need our customers to trust us with their money. Our End-to-End test suite complemented our testing strategy to assure our systems were of very high quality and integrity.
It turns out that this practice has a lot of downsides. In my early days working here I conducted an assessment of engineering pain points. Speaking to many different people and teams, the struggle with this type of test became a common theme.
A diagnosis of our end-to-end test suite
Most companies still believe that End-to-End Integration Tests are the best way to catch bugs. But they also experience a progressive slowdown of value delivery due to the pain points that we uncovered during the assessment:
Presentation to engineering leadership
When showing the results of my assessment to the CTO and engineering leadership, I presented the situation:
A new hope: Contract Testing
This diagnosis was neither novel nor specific to Nubank. In my previous experience I had seen the same situation in many Fortune 500 companies that struggled with their End-to-End test automation. But at Nubank we were determined to change this. One of our Sr Staff Engineers, Rafael Ferreira, ran some numbers and applied queueing theory. He determined that by 2021 Nubank’s End-to-End test suite would take… an infinite time to run!
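The post does not say which queueing model or numbers were used. As a rough illustration only, assume the simplest textbook model (M/M/1): changes arrive at the pipeline at rate λ, the suite completes runs at rate μ, and the mean time a change spends in the system is W = 1 / (μ − λ). The function name and all figures below are hypothetical; the point is only that W grows without bound as the arrival rate approaches capacity.

```python
import math


def mean_time_in_system(arrival_rate: float, service_rate: float) -> float:
    """Mean time in system, W = 1 / (mu - lambda), for an M/M/1 queue.

    Returns math.inf when arrivals meet or exceed capacity, because the
    queue then never drains.
    """
    if arrival_rate >= service_rate:
        return math.inf
    return 1.0 / (service_rate - arrival_rate)


# Hypothetical capacity: the suite can complete 10 runs per hour.
SERVICE_RATE = 10.0

for arrival_rate in (5.0, 9.0, 9.9, 10.0):
    hours = mean_time_in_system(arrival_rate, SERVICE_RATE)
    print(f"{arrival_rate} changes/hour -> {hours} hours per change")
```

Under this assumed model, a growing engineering team pushes the arrival rate up year after year, and once it reaches what the suite can process, the wait is no longer just long but unbounded.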
So we decided to explore Consumer Driven Contract (CDC) testing as an option.
The billing service in turn, as a consumer of this endpoint, expects to receive two attributes: first_name and last_name (both non-empty strings). Any unexpected change to these assumptions (e.g. the customer service makes the last_name attribute optional, meaning that it may be empty) constitutes a contract breakage that may affect the behavior of business flows at runtime and must be caught by contract testing tools.
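The billing example can be made concrete with a minimal sketch. It is written in Python purely for illustration and is not how any particular contract testing tool works; `BILLING_EXPECTS`, `non_empty_string` and `contract_violations` are hypothetical names.

```python
from typing import Any, Callable, Dict, List


def non_empty_string(value: Any) -> bool:
    """True only for strings with at least one character."""
    return isinstance(value, str) and len(value) > 0


# Hypothetical consumer-side contract: what billing expects to receive.
BILLING_EXPECTS: Dict[str, Callable[[Any], bool]] = {
    "first_name": non_empty_string,
    "last_name": non_empty_string,
}


def contract_violations(
    payload: Dict[str, Any],
    contract: Dict[str, Callable[[Any], bool]],
) -> List[str]:
    """List every way the payload breaks the consumer's expectations."""
    violations = []
    for attribute, is_valid in contract.items():
        if attribute not in payload:
            violations.append(f"{attribute}: missing")
        elif not is_valid(payload[attribute]):
            violations.append(f"{attribute}: invalid value {payload[attribute]!r}")
    return violations


print(contract_violations({"first_name": "Ada", "last_name": "Lovelace"}, BILLING_EXPECTS))
# → []
print(contract_violations({"first_name": "Ada", "last_name": ""}, BILLING_EXPECTS))
# → ["last_name: invalid value ''"]
```

The second call mirrors the breakage described above: once last_name is allowed to be empty, the provider can emit a payload that the consumer was never written to handle.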
So in summary:
Why we chose to build our own framework
How these contracts are declared and validated depends on the contract testing framework in question. However, a common characteristic is that inputs and outputs are collected and validated without executing black-box tests against actual instances running on production-like environments.
Implementations of this concept may vary:
The second approach is the one implemented by Pact. We started with it, since it is a well-established framework for writing Contract Tests. However, two aspects caught our attention in our first experiments:
Our decision
Considering the above aspects, Rafael and our team, including Lead Engineers Rui Hayashi and Alan Ghelardi, decided we should develop our own Contract Testing tool.
In fact, we drew on many aspects of the Pact framework, as well as the consumer-driven contracts pattern, as sources of inspiration. However, throughout this journey, we realized that even the traditional model of Contract Tests had some downsides that could make adoption difficult in our circumstances.
In general, Contract Tests depend strongly on a correct (and often complex) initial state in the microservices being tested in order to exercise relevant interactions among them. For very simple interactions (like those that frequently appear in examples of CDC tests) this might not be a problem, but in a financial company with complex business rules spread across a wide variety of services, engineers would often struggle to put their microservices into valid states to write useful tests.
Building Sachem
And so we decided to create Sachem, our very own contract testing framework, to deprecate End-to-End testing in staging as a practice. One interesting thing about this project is that we realized we already had the contracts in our microservices. Clojure has a library, Schema, that lets you richly describe data structures.
Therefore, nothing really changed in terms of communication between teams, but it shows the value of having those contracts in place.
Factors considered
We learned in a few experimental sessions with our engineers that, to be successful, our solution should be less intrusive and fit our codebase organically by leveraging existing aspects of our architecture, preferably without forcing engineers to write complex tests. To proceed, we considered the following factors:
With those aspects in mind, we gave up on the idea of using contract tests as a replacement for testing business logic distributed among services, and concentrated on validating only their schemas with Sachem.
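Validating schemas rather than behavior can be sketched as a static comparison between what a producer declares and what a consumer relies on, with no service running. The sketch below is an assumption-laden illustration in Python, not Sachem's actual implementation (which is built on Clojure and Schema and is not shown in this post); `Field`, `incompatibilities` and the example schemas are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Field:
    """Hypothetical schema entry: a type name plus whether it is required."""
    type_name: str
    required: bool = True


Schema = Dict[str, Field]


def incompatibilities(producer: Schema, consumer: Schema) -> List[str]:
    """Compare declared schemas only; no service is started or called.

    Extra producer fields are ignored, since the consumer does not read them.
    """
    problems = []
    for name, expected in consumer.items():
        offered = producer.get(name)
        if offered is None:
            problems.append(f"{name}: not declared by producer")
        elif offered.type_name != expected.type_name:
            problems.append(
                f"{name}: producer sends {offered.type_name}, "
                f"consumer expects {expected.type_name}"
            )
        elif expected.required and not offered.required:
            problems.append(f"{name}: optional in producer, required by consumer")
    return problems


BILLING_NEEDS = {"first_name": Field("string"), "last_name": Field("string")}
CUSTOMER_V1 = {
    "first_name": Field("string"),
    "last_name": Field("string"),
    "email": Field("string"),
}
# Same as V1, except last_name has become optional.
CUSTOMER_V2 = {**CUSTOMER_V1, "last_name": Field("string", required=False)}

print(incompatibilities(CUSTOMER_V1, BILLING_NEEDS))
# → []
print(incompatibilities(CUSTOMER_V2, BILLING_NEEDS))
# → ['last_name: optional in producer, required by consumer']
```

Because the check needs only the two declarations, it requires no initial state in either service, which is the property that made a schema-only approach attractive here.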
One strategy to rule them all?
Getting rid of E2E tests really helped to eliminate the coordination problems. Before Sachem we frequently ran into issues: each team had a turn in the E2E pipeline, and teams would barter to skip ahead in the queue to get their change into production faster, causing friction with other teams.
The engineering costs of keeping the tests working, waiting to get code into production, and so on were the main drivers of this change. It used to take at least two hours, in the best scenario, to get some code into production, and it could take more than a day. Just think of the cost of context switching between tasks while you wait for something else to go to production.
At first, the goal was to completely remove the End-to-End tests, but in the long term we found that we missed some of their benefits. Mainly, Contract Tests can catch structural incompatibilities, but they are not good at testing behavior.
The guarantee
But to keep our high quality standard we also needed to guarantee the critical behavior of our applications.
So we found another way to complement our suites, by building what we call acceptance tests (not to be confused with Gherkin-style acceptance tests).
We also have an experimentation platform where people can run experiments against a subset of our customer base, using common techniques such as feature flags, percentage rollouts and A/B testing. This has been working well as a purely technical testing mechanism and, more importantly, as a way to gather business insights on new features.
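A percentage rollout is commonly implemented by hashing each customer into a stable bucket. The sketch below shows that general technique only; it is not a description of Nubank's experimentation platform, and `rollout_bucket`, `is_enabled` and the flag name are hypothetical.

```python
import hashlib


def rollout_bucket(flag: str, customer_id: str) -> int:
    """Map a (flag, customer) pair to a stable bucket in the range 0..99.

    SHA-256 is used instead of Python's built-in hash(), which is salted
    per process for strings and so would not be stable across restarts.
    """
    digest = hashlib.sha256(f"{flag}:{customer_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def is_enabled(flag: str, customer_id: str, percentage: int) -> bool:
    """Enable the flag for roughly `percentage` percent of customers."""
    return rollout_bucket(flag, customer_id) < percentage


# Hypothetical flag name and customer ids.
customers = [f"customer-{n}" for n in range(1000)]
enabled = sum(is_enabled("new-billing-flow", c, 10) for c in customers)
print(f"{enabled} of {len(customers)} customers see the new flow")
```

Including the flag name in the hash means that different experiments select different subsets of customers, and raising the percentage only adds customers without removing anyone who was already enrolled.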
We tried to implement distributed tracing instrumentation but turned it off, both because it saw little uptake from our internal users and because the infrastructure was difficult to maintain. We intend to revisit that in the future.
The results
The results were quite remarkable, especially for two important metrics:
It is important to mention that these are two of the four “Accelerate” metrics. The Accelerate book shows that companies that excel in those metrics are among the high performers in the industry.
I would like to thank César Vortmann for the inspiration for this post and the questions that drove it. We have previously discussed CDC in this podcast and in this video on how we do end-to-end tests for our microservices architecture. It became clear that people wanted to hear more, and we hope this post will help folks out there who are considering these different testing strategies. Thanks also to Rui Hayashi and Alan Ghelardi for co-writing the answers for this article, and to Rafael Ferreira, Paulo Victor and Ezequiel Siddig for their review.