Presenting Precog, Nubank’s Real-Time Event AI

Understand the promising future of Precog and customer representation AI at Nubank


Written by Cristiano Breuel, Gabriel Bakiewicz, George Salvino, Mariana Sant’Anna, Luana Campos, Cinthia Tanaka, Pedro Chaves, Allan Garcia, Kevvin Sahdo, Tan Mukhopadhyay


Imagine a world where your call to your bank is instantly picked up by an expert agent who quickly solves your issue. No more navigating through tedious menus, listening to endless recordings, or being transferred from one agent to another. For many Nubank customers, this is not a distant dream but a reality, thanks to our real-time event AI, Precog. 

In this article, we describe how Nubank’s journey over the past decade, from a single-product company to a multi-product financial powerhouse operating in three countries, motivated the development of Precog. We then detail its system architecture, some of the technical challenges we faced, and the results we achieved.

Background

Ever since Nubank started ten years ago, we have used machine learning to make decisions, beginning with credit underwriting. Highly skilled data scientists built some of the best-performing models, setting us up for unprecedented success.

However, back then, we were a single-product company, in a single country, serving fairly homogeneous users. A lot has changed since then, both inside Nubank and in the wider world. Nubank now operates in three countries, offering more than ten different financial products (e.g. checking/savings accounts, investments, insurance) to a very large portion of the population. This pace and scale of change and complexity have made it almost impossible for some Data Science teams to keep up, especially horizontal teams like Customer Excellence, which handles support.

Meanwhile, in the outside world, AI took huge leaps thanks to new model architectures and the emergence of the foundation model paradigm. In late 2021, a vision for doing AI at Nubank in this new reality started to take form. We decided to start with our customer support platforms, whose responsibilities include predicting customers’ support needs in real time.

Motivation

Nubank is organized into several business units, which in turn offer many products and functionalities, implemented as independent microservices. The result is a decentralized environment that lets teams move quickly and independently. However, it also makes it challenging to maintain a full view of the customer across all of these products. Traditionally, we relied on business knowledge to think of features that might be useful for specific models, then went looking across the organization for data to build those features. As our product portfolio and its complexity grew, it became harder to keep up, and our models started to lag behind product evolution.

To tackle this issue in a fundamental way, we needed to automate and scale up real-time data processing and feature engineering to support many product lines and use cases. That’s where Precog came in.

Representation

The key insight behind Precog is that customer events (e.g. app click stream, transactions) can be encoded as a sequence of symbols. This means that we can use techniques that were originally developed for natural language processing, like embeddings and sequence models, to understand and predict customer needs.

The richest source of data that we were missing from our models was our app click stream, so that’s where we decided to start. This data is collected by an internal system that receives events from Nubank’s mobile app and other sources, through a flexible API.

The main challenge with this data is its lack of enforced structure across different product teams. Each record is identified by a metric name and a JSON object of attribute/value pairs, all of which are determined by the engineers who implement each flow. To avoid development bottlenecks, there is no centralized governance, and these values can change quickly with each update of the app.

To deal with this semi-structured data, Precog turns records into sequences of text-based identifiers, dropping infrequent symbols, such as unique IDs, along the way.
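As a rough illustration, the transformation might look like the following sketch. The record fields, symbol format, and frequency threshold here are hypothetical; the real event schema is internal to Nubank.

```python
from collections import Counter

def record_to_symbols(record: dict) -> list[str]:
    """Flatten one semi-structured record into text symbols such as
    'button_click' and 'button_click.flow=loans' (illustrative format)."""
    metric = record["metric"]
    symbols = [metric]
    for attr, value in record.get("attributes", {}).items():
        symbols.append(f"{metric}.{attr}={value}")
    return symbols

def build_vocabulary(records: list[dict], min_count: int = 50) -> set[str]:
    """Keep only symbols frequent enough to generalize; unique IDs and other
    one-off values fall below the threshold and are dropped."""
    counts = Counter(s for r in records for s in record_to_symbols(r))
    return {s for s, c in counts.items() if c >= min_count}
```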

There are many possible ways of learning from these string representations, with different levels of complexity. We wanted something that would not be tied to any particular domain, so it could be easily reused in different models. Following our usual approach of starting simple to quickly assess the value of a solution, we decided to represent customers with a bag-of-words of their events’ symbols.

Precog learns embeddings of the events in a self-supervised manner, using contrastive learning. A set of events from a customer serves as the anchor, with a single event randomly removed from it to act as the positive sample, while events that don’t occur for that customer are randomly sampled as negatives. Contrastive learning minimizes the distance between the anchor and the positive sample while maximizing the distance to the negative samples. At the end, we obtain a vector for each symbol in the vocabulary, and we aggregate these vectors to get the customer representation.
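A minimal sketch of this objective, assuming a PyTorch-style setup (the embedding dimension, margin, and hinge loss form are illustrative choices, not the production configuration):

```python
import torch
import torch.nn.functional as F

vocab_size, dim = 10_000, 100
embeddings = torch.nn.Embedding(vocab_size, dim)

def contrastive_loss(event_ids: torch.Tensor, negative_ids: torch.Tensor,
                     margin: float = 0.2) -> torch.Tensor:
    """event_ids: one customer's event symbols, randomly permuted so the first
    entry serves as the held-out positive. negative_ids: symbols sampled from
    events that did not occur for this customer."""
    positive = embeddings(event_ids[:1]).squeeze(0)
    anchor = embeddings(event_ids[1:]).mean(dim=0)   # bag-of-words of the rest
    negatives = embeddings(negative_ids)
    sim_pos = F.cosine_similarity(anchor, positive, dim=0)
    sim_neg = F.cosine_similarity(anchor.unsqueeze(0), negatives, dim=1)
    # Pull the positive closer to the anchor than every negative, by a margin.
    return F.relu(margin - sim_pos + sim_neg).mean()
```

In production, the embedding training itself is handled by StarSpace, described in the next section.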

System architecture

Precog’s main component is a pipeline that trains event/customer embeddings and then incorporates the learned embeddings into downstream models at training and serving time. To train the embeddings, we use the StarSpace library, which provides a flexible and efficient implementation for learning embeddings of arbitrary entities.
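StarSpace consumes plain-text training files. A hedged sketch of preparing one, assuming one line per customer of space-separated event symbols (the file name and helper are illustrative):

```python
# In StarSpace's trainMode 1, each line is a collection of labels from which
# one is held out as the positive and the rest form the input, matching the
# contrastive setup described above.
def write_starspace_train_file(customer_events: dict[str, list[str]],
                               vocabulary: set[str],
                               path: str = "precog_train.txt") -> None:
    with open(path, "w") as f:
        for symbols in customer_events.values():
            kept = [s for s in symbols if s in vocabulary]
            if len(kept) >= 2:  # need at least an anchor and a positive
                f.write(" ".join(kept) + "\n")
```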

To train a downstream model, we take records that are keyed by anonymous customer identifiers and timestamps, together with the labels we want to predict and any other features, and join them with the relevant sets of events. For example, in order to predict the contact reason for a customer support call, we take the events of that customer from the event window preceding the call and the contact reason as classified by agents.
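In pandas-like pseudocode, this point-in-time join might look as follows; the column names are hypothetical, and the window length comes from the tuning discussed under Technical challenges below.

```python
import pandas as pd

WINDOW = pd.Timedelta(hours=3)

def join_events_with_labels(calls: pd.DataFrame, events: pd.DataFrame) -> pd.DataFrame:
    """calls: [customer_id, call_ts, contact_reason];
    events: [customer_id, event_ts, symbol]."""
    merged = calls.merge(events, on="customer_id")
    # Keep only events from the window preceding each call, so training data
    # matches what will actually be available at serving time.
    in_window = ((merged["event_ts"] < merged["call_ts"])
                 & (merged["event_ts"] >= merged["call_ts"] - WINDOW))
    return (merged[in_window]
            .groupby(["customer_id", "call_ts", "contact_reason"])["symbol"]
            .apply(list)
            .reset_index(name="event_symbols"))
```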

At run time, an event consumer microservice (implemented in our canonical business service language, Clojure) transforms the raw event data into a string format and stores it in a low-latency, temporary storage (Redis). At serving time, the downstream model microservice (built in Python) retrieves relevant events from the cache, computes embeddings, and uses them as features for a classification model.
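A hedged sketch of the read side of this serving path; the Redis key scheme, the sorted-set layout (events scored by timestamp), and the bag-of-words mean aggregation are illustrative. A matching write-path sketch appears under Technical challenges below.

```python
import time
import numpy as np
import redis

r = redis.Redis(host="localhost", port=6379)
WINDOW_SECONDS = 3 * 60 * 60

def customer_embedding(customer_id: str,
                       symbol_vectors: dict[str, np.ndarray],
                       dim: int = 100) -> np.ndarray:
    """Fetch the customer's recent event symbols from the cache and aggregate
    their learned vectors into a single feature vector for the classifier."""
    now = time.time()
    raw = r.zrangebyscore(f"precog:events:{customer_id}",
                          now - WINDOW_SECONDS, now)
    vectors = [symbol_vectors[s.decode()] for s in raw
               if s.decode() in symbol_vectors]
    if not vectors:
        return np.zeros(dim)  # cold start: no recent events in the cache
    return np.mean(vectors, axis=0)
```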

Technical challenges

Identifying the optimal window for the events: Our current modeling approach doesn’t take the order or age of events into account, so increasing the event window actually reduced model performance: the most relevant events tend to be recent, and older events dilute them. However, reducing the window too much would decrease our coverage, since many customers may not have interacted with the app recently. After running several tests, we settled on a 3-hour window for now, but we intend to explore better ways to select events that balance coverage and precision, including ways of incorporating event age information.

Data volume and cost: The volume of this data is fairly large, so we anticipated that storage costs would be an issue. We planned to keep the prepared data in low-latency storage only for the time it would be required for inference (historic raw data is already kept in high-latency storage). For ease of implementation, our first version used a low-latency key-value database with built-in support for time-to-live cleanup (AWS DynamoDB). However, the huge volume of events to be stored and the amount of writes and reads made the cost prohibitive. We then switched to an in-memory database (Redis), which made the implementation slightly more complex but reduced the cost to a small fraction of the original version, making the solution cost-effective.
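A hedged sketch of the write path with time-based cleanup, assuming events are kept in a Redis sorted set scored by timestamp (the layout and key scheme are our illustrative choices, matching the read-path sketch above):

```python
import time
import redis

r = redis.Redis(host="localhost", port=6379)
WINDOW_SECONDS = 3 * 60 * 60  # the 3-hour window discussed above

def store_event(customer_id: str, symbol: str) -> None:
    key = f"precog:events:{customer_id}"
    now = time.time()
    pipe = r.pipeline()
    # Note: in this sketch a repeated symbol keeps only its latest timestamp.
    pipe.zadd(key, {symbol: now})
    pipe.zremrangebyscore(key, 0, now - WINDOW_SECONDS)  # drop expired events
    pipe.expire(key, WINDOW_SECONDS)  # let keys for idle customers vanish
    pipe.execute()
```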

Need for frequent retraining: Since event definitions change frequently, for the reasons mentioned above, we also need to retrain the embeddings often. However, retraining the embeddings means we also need to retrain the downstream models that depend on them. Our solution to this problem is to provide a standardized retraining pipeline that downstream models can easily adopt to retrain periodically.

Results

Our first application of Precog is in routing of phone calls, through a downstream model that uses the customer embeddings and other features to rank the most likely product that the customer needs help with. If the model is confident enough, the system routes customers directly to specialists. Otherwise, they will talk to generalist agents, who may transfer them to specialists if they can’t handle the problem themselves.
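An illustrative version of this routing rule; the threshold value and queue names are hypothetical, not Nubank’s actual configuration:

```python
CONFIDENCE_THRESHOLD = 0.8  # hypothetical cutoff

def route_call(product_probabilities: dict[str, float]) -> str:
    """Send the call straight to a product specialist only when the model's
    top prediction is confident enough; otherwise fall back to a generalist."""
    product, probability = max(product_probabilities.items(), key=lambda kv: kv[1])
    if probability >= CONFIDENCE_THRESHOLD:
        return f"specialist:{product}"
    return "generalist"
```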

By adding Precog embeddings, we increased the volume of calls that are correctly routed to specialized agents, without the need for customer input, by more than 50% compared to the previous model. This resulted in significant decreases in time to solve issues and increases in customer satisfaction. There were improvements across all the products we tested, but the largest gains happened in newer products for which we hadn’t yet implemented traditional features. This confirms our hypothesis that we can replace laborious feature engineering with a more scalable approach.

Another benefit of Precog is the reduction in latency. For traditional features, we had to call multiple REST services and process the results, which resulted in high worst-case latency. With Precog, the time to fetch and process event data is much lower and more predictable, resulting in lower overall serving latency. The freshness of the events (the time between an event happening and it becoming available for serving) was a concern initially, but in practice we’re seeing events available almost instantly, making this approach very competitive with synchronous REST retrieval.

In addition to these runtime improvements, we also realized that there is a huge benefit for the development cycle: adding Precog features to a model typically takes a couple of weeks, versus months for traditional approaches, allowing us to iterate and deliver value to customers faster.

We are currently working on other applications for Precog, such as FAQ article recommendations and topic suggestions in the Chat support interface. Additionally, we foresee many more uses for it in the future.

Future work

This is just the beginning for Precog and customer representation AI at Nubank. There are several promising avenues for further developing this technology and applying it to our products, such as additional data sources and improved modeling techniques. We also intend to invest in additional foundation models to build the next generation of Nubank’s AI stack.