Transaction Classification Aids Credit Risk Assessment


Carlos Blancarte

Date Published:
March 15, 2019

A significant transformation is currently underway in the lending market. Banks are competing to provide lending decisions in a single day, with a vastly simplified customer experience as their primary way of growing market share. The key enabler of this transformation is accurate automation of the credit decision; that is, using advanced analytics to estimate a customer’s probability of default, affordability, and financial position so that the decision can be made quickly.


Traditional methods for deciding credit have relied on models using a mix of self-reported data and credit bureau scores (risk drivers). This approach ignores the treasure trove of transactional data available to a bank. We have found that using customer transactions to their full potential makes the next generation of credit decisioning possible. Let’s briefly look at how this works.

Customer transactions can reveal preferences and spending patterns, provided you can create structured features from their unstructured text. Several techniques are available to unlock this value, from inductive methods like data science and machine learning to more deductive methods like natural language processing (NLP), AI, and rule-based systems.

Short Transaction Descriptions

Transaction descriptions are usually short, ranging from one or two words to nine or ten. Not all transaction descriptions are equally detailed, nor are all of them strictly merchant descriptions. Some contain short memos of free-form text (e.g., an individual’s name) or an intended reason for the transaction (e.g., wages).
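To make this concrete, here is a minimal Python sketch of how a short description might be reduced to a handful of usable tokens before any classification. The sample description, the noise-token list, and the function name are illustrative assumptions, not taken from any bank's data or from our own pipeline.

```python
import re
from typing import List

# Hypothetical noise tokens; a real list would come from inspecting the bank's own data.
NOISE_TOKENS = {"pos", "debit", "card", "purchase", "ref"}


def normalize_description(text: str) -> List[str]:
    """Lowercase a description, drop digits and punctuation, remove noise tokens."""
    lowered = text.lower()
    # Replace anything that is not a lowercase letter or whitespace with a space.
    letters_only = re.sub(r"[^a-z\s]", " ", lowered)
    tokens = letters_only.split()
    # Keep tokens longer than one character that are not in the noise list.
    return [t for t in tokens if t not in NOISE_TOKENS and len(t) > 1]


# Example with a made-up description:
print(normalize_description("POS 1234 STARBUCKS #0456 SEATTLE WA"))
```

Even a simple step like this separates the merchant name from terminal codes and reference numbers, which is usually what the later classification stage needs.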

Methods for Classifying Transactions

The best method to use depends on the data available and the time and energy that can be dedicated to solving the problem. For example, a straightforward rule-based approach can provide very accurate results, but at the cost of hundreds of hours spent writing detailed rules, each covering a narrow type of transaction. In contrast, an inductive method is highly extensible, but its results may be difficult to interpret during validation. Note that a state-of-the-art approach is not always best; it can be computationally expensive and may be overkill if the attributes of the transactions are well-defined.
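As an illustration of the rule-based end of this spectrum, the following Python sketch classifies a description with a short list of regular-expression rules. The categories, keywords, and function name are hypothetical; a production rule set would be far larger, which is where the hundreds of hours go.

```python
import re

# Hypothetical rules as (category, pattern) pairs. Order matters: first match wins.
RULES = [
    ("income", re.compile(r"\b(wages|salary|payroll)\b")),
    ("groceries", re.compile(r"\b(grocery|supermarket)\b")),
    ("utilities", re.compile(r"\b(electric|water)\b")),
]


def classify_with_rules(description: str) -> str:
    """Return the category of the first matching rule, or 'unknown' if none match."""
    text = description.lower()
    for category, pattern in RULES:
        if pattern.search(text):
            return category
    return "unknown"


for sample in ["ACME CORP WAGES", "CITY SUPERMARKET 042", "J SMITH"]:
    print(sample, "->", classify_with_rules(sample))
```

Each rule is easy to read and audit, but anything the rules do not anticipate, such as a free-form memo containing only a person's name, falls through to "unknown".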

Table 1 summarizes the benefits, challenges, and complexity of competing ways to classify transactions. Typically, we’ve found it best to combine a couple of the listed methods.
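One possible way to combine two methods is to let precise rules decide first and fall back to a model learned from labeled examples. The Python sketch below assumes a deliberately simple token-voting model as a stand-in for a real machine learning classifier; the rules, training examples, and function names are all hypothetical.

```python
import re
from collections import Counter, defaultdict

# Hypothetical high-precision rule; first match wins.
RULES = [("income", re.compile(r"\b(wages|salary|payroll)\b"))]


def tokenize(text):
    """Split a description into lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train_token_votes(labeled):
    """Count how often each token appears under each category."""
    votes = defaultdict(Counter)
    for description, category in labeled:
        for token in tokenize(description):
            votes[token][category] += 1
    return votes


def classify(description, votes):
    """Try the rules first; otherwise let the known tokens vote on a category."""
    text = description.lower()
    for category, pattern in RULES:
        if pattern.search(text):
            return category, "rule"
    tally = Counter()
    for token in tokenize(text):
        tally.update(votes.get(token, {}))
    if tally:
        return tally.most_common(1)[0][0], "model"
    return "unknown", "none"


# Made-up labeled examples for illustration only.
LABELED = [
    ("CITY SUPERMARKET 042", "groceries"),
    ("FRESH FOODS SUPERMARKET", "groceries"),
    ("METRO TRANSIT FARE", "transport"),
]
VOTES = train_token_votes(LABELED)
print(classify("MONTHLY SALARY ACME", VOTES))
print(classify("VILLAGE SUPERMARKET", VOTES))
```

Returning the source of each decision ("rule", "model", or "none") alongside the category keeps the combined approach easier to validate, since reviewers can see which component made each call.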

This brief post provides an overview, which we plan to expand later with examples and more detail about the competing methods that make this step change in banking decision efficiency possible.

Want to Learn More?

Download our White Paper to learn more about Extracting Value From Unstructured Text such as bank customer transactions.