Open Banking Unleashed – More Connections, Better Data, Higher Revenue

EDGE Insights

Nearly every checking and savings account in the U.S. can be accessed through a bank data aggregator. An entire ecosystem of open banking fintech has emerged around this data – from financial health dashboards to payments authentication to credit risk assessment.

EDGE and our customers see tremendous value in cash flow underwriting as a more holistic picture of financial health than traditional credit scores. It turns out that the predictive value of risk insights depends as much on solid analysis as on the source of data that’s analyzed.  

How much data is sourced, and how accurately that data is aggregated, can be the difference between a reliable, actionable risk signal and noise that complicates your underwriting without improving it. Equally important is the effect on your funnel of sources with better (or worse) connection rates.

To maximize the value of our customers’ data spend, NinjaEdge takes a source-agnostic approach to bank data aggregation. For every request and for every bank, we optimize across a portfolio of aggregators for all of:

  • Reliability to initiate your connection request where we know there’s a healthy connection
  • Speed to achieve the fastest connection for better consumer experience
  • Data density to acquire the longest possible transaction history for more meaningful analysis
  • Data accuracy to minimize errors and omissions in an aggregator’s raw data that dilute insights
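
One way to picture optimizing across these four dimensions is a weighted score per aggregator for a given bank. The sketch below is illustrative only: the AggregatorStats fields, the 0-to-1 scaling, and the weights are assumptions made for this example, not a description of how EDGE actually scores its partners.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AggregatorStats:
    """Hypothetical per-bank metrics for one aggregator, each scaled to 0..1."""
    name: str
    reliability: float  # share of recent connection attempts that succeeded
    speed: float        # 1.0 = fastest observed connection, 0.0 = slowest
    density: float      # history length relative to the longest history seen
    accuracy: float     # share of transactions without errors or omissions


# Illustrative weights only (assumption); they sum to 1.0.
WEIGHTS = {"reliability": 0.4, "speed": 0.2, "density": 0.2, "accuracy": 0.2}


def score(stats: AggregatorStats) -> float:
    """Weighted sum of the four dimensions named in the list above."""
    return sum(weight * getattr(stats, dim) for dim, weight in WEIGHTS.items())


def best_aggregator(candidates: list[AggregatorStats]) -> AggregatorStats:
    """Return the highest-scoring aggregator for one bank."""
    if not candidates:
        raise ValueError("no aggregators available for this bank")
    return max(candidates, key=score)


candidates = [
    AggregatorStats("aggregator_a", reliability=0.9, speed=0.5, density=0.5, accuracy=0.5),
    AggregatorStats("aggregator_b", reliability=0.5, speed=1.0, density=1.0, accuracy=0.5),
]
print(best_aggregator(candidates).name)
```

With these made-up numbers, the second aggregator wins despite lower reliability because it is faster and returns a longer history. Shifting the weights changes the answer, which is why the weighting itself is a business decision.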

Looking Under the Hood

The same data requested from a particular bank but sourced by different aggregators can yield surprisingly different results along any (or all) of the above dimensions. Understanding the reasons for such differences requires context on bank data aggregation methods:

  • Screen scraping: The aggregator sees what the consumer sees, via code that “crawls” across a bank’s consumer-facing website and “scrapes” data displayed for packaging and transmission to downstream data recipients as a consistent, structured dataset.

  • OAuth: Direct connection via API to larger banks through a tokenized access standard that returns balances, transactions, and account details to the aggregator as enabled by each bank’s API specifications. In turn, the aggregator normalizes data from each bank to the same structure as their data from other sources.

  • Banking core direct: Banking core systems process and store all bank transaction data potentially accessed through OAuth and screen scraping – and more. Some aggregators work with one or more of the leading core providers covering hundreds (if not thousands) of banks. Select aggregators connect directly to individual banks’ core systems.
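
Whatever the method, the aggregator ends up normalizing differently shaped inputs into one structure. The following is a minimal sketch of that step; the field names, the scraped-row layout, and the API record layout are all hypothetical and do not reflect any particular bank or aggregator.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class Transaction:
    """Hypothetical normalized record shared by every source."""
    posted: date
    amount: Decimal     # negative = money leaving the account
    description: str
    source_method: str  # "scrape", "oauth", or "core"


def from_scraped_row(row: dict) -> Transaction:
    # Assumed scraped layout: display strings as a consumer would see them.
    month, day, year = (int(part) for part in row["Date"].split("/"))
    amount = Decimal(row["Amount"].replace("$", "").replace(",", ""))
    return Transaction(date(year, month, day), amount, row["Description"].strip(), "scrape")


def from_api_record(record: dict) -> Transaction:
    # Assumed API layout: ISO date, integer cents, explicit debit/credit flag.
    amount = Decimal(record["amount_cents"]) / 100
    if record["type"] == "debit":
        amount = -amount
    return Transaction(date.fromisoformat(record["posted_date"]), amount,
                       record["memo"].strip(), "oauth")


scraped = from_scraped_row(
    {"Date": "03/15/2024", "Amount": "-$1,250.00", "Description": " RENT PAYMENT "}
)
api = from_api_record(
    {"posted_date": "2024-03-15", "amount_cents": 125000, "type": "debit", "memo": "RENT PAYMENT"}
)
print(scraped.amount == api.amount and scraped.posted == api.posted)
```

The scraped path has to parse display formatting (currency symbols, thousands separators, local date order), which is one place where errors and omissions can creep in.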

A peek under the hood at each aggregator’s capabilities reveals pronounced differences in source mix for various legacy and strategic considerations. Given these structural differences, your choice of aggregator can profoundly impact the value realized on your data acquisition spend.

Connections matter

Historically, scraping has been the dominant method for accessing consumer-permissioned data from bank accounts in the U.S., going back to Yodlee as early as 1999. Since then, market acceptance of open banking and advances in technology have increased potential direct connectivity coverage to over 70% of U.S. banks, credit unions, and other financial institutions.

As you evaluate partners for acquiring consumer-permissioned bank transaction data, the number and type of direct connections available will drive the value you’re able to extract from the relationship. Having seen billions of transactions from our network of data providers and “bring your own data” arrangements with our customers, EDGE sees the merits and considerations of each aggregation method along the following lines:

[Table: ratings of each aggregation method by dimension. Labels that survive: Screen scraping, Banking core direct, Data density, Data accuracy. Legend: 🟢 Good, 🟡 Fair, 🔴 Poor.]

To scrape or not to scrape

Beyond the obvious drawback that banks are increasingly moving to eliminate scraping, the method is only as good as an aggregator’s individual experience with each bank’s website. As web interfaces evolve, scraping technology must catch up to every change in order to continue ingesting the same data.

As a primary method, scraping is also less reliable than directly connecting to a bank or its core system. Any change in a bank’s authentication technology can result in downtime for an aggregator’s ability to make a successful connection, and banks can impede scraping with such tools as CAPTCHAs and honeytraps.

EDGE has analyzed over four billion bank transactions from a range of aggregators, and we’ve seen firsthand that there is meaningful variability in the amount and quality of data at the individual bank level. We’ve also observed periodic downtime in individual aggregators’ ability to access, let alone scrape, bank websites.

With a fairly short list of U.S. banks currently offering OAuth APIs and the technology investment required to create OAuth capabilities, scraping probably isn’t going away any time soon. It’s reasonable to expect that for some time, scraping will be the necessary default for the more than 10,000 banks and credit unions in the U.S.

It’s also worth noting that certain core providers today conduct “friendly scraping” of data from their bank and credit union customers. Think of this as bank or credit union data that’s already processed by the core provider, but where the institution’s systems aren’t configured for direct access by aggregators at the core provider level.

OAuth – the emerging standard

Access to a bank’s customer data fully sanctioned by the bank itself offers the clear benefit of a reliable “pipe” directly established between banks and consumers, with requested data then transmitted to aggregators. As long as the aggregator’s platform and the bank’s API are online, you should expect consistent responses with low latency.

Consumer experience is an important driver of successfully completed requests for bank data. 30-35% of consumers will look for another service or cancel their account when faced with a difficult login process. Instead of replicating a bank’s login process, OAuth uses the same portal and process a consumer would use to log in directly to check their balance or make a payment. Additional steps required to permission and provision an aggregator for screen scraping introduce friction that increases the potential for drop-offs.

Underlying the move to OAuth by banks that can afford to invest in the required technology and support are stronger authentication and information security that protect consumers who have entrusted a bank with their data (and money). Tokenization allows limited, fit-for-purpose information sharing and access duration with predetermined expiration.
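
The idea of limited, fit-for-purpose access with a predetermined expiration can be sketched as a small data structure. This is an illustration under assumptions: the scope names and the 90-day duration are invented for the example, and real banks define their own scopes and lifetimes in their API specifications.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class AccessGrant:
    """Hypothetical view of what a token permits and for how long."""
    scopes: frozenset
    expires_at: datetime

    def allows(self, scope: str, now: datetime) -> bool:
        # Access requires both a granted scope and an unexpired token.
        return scope in self.scopes and now < self.expires_at


issued = datetime(2024, 1, 1, tzinfo=timezone.utc)
grant = AccessGrant(
    scopes=frozenset({"balances:read", "transactions:read"}),  # assumed scope names
    expires_at=issued + timedelta(days=90),                    # assumed duration
)

one_day_later = issued + timedelta(days=1)
print(grant.allows("transactions:read", one_day_later))
print(grant.allows("payments:write", one_day_later))
```

The second check fails because the consumer never granted that scope, and every check fails once the expiration passes. With scraping, by contrast, the consumer's own login credentials are shared, so access is not narrowed to a purpose in the same way.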

In limiting access to personally identifiable information and sensitive information via OAuth, some banks return a substitute account number that can’t be verified against the account number provided by the consumer. Downstream users that require the actual account number must instead call a different (typically more expensive) API endpoint or utilize a separate data provider for this use case.

While not necessarily a favorable or limiting factor for downstream users, an important consideration with data accessed via OAuth is that the structure, syntax, and duration of data are determined by each bank, since each has voluntarily developed and offered its own OAuth API. With most large banks and aggregators participating in a private consortium to promote standards for data sharing (as well as pending regulation), we expect greater consistency over time.

Banking cores – straight from the source

Similar to OAuth banks, transaction data access through a banking core provides a direct connection with generally more reliable (and therefore higher) connection rates, higher quality data, and lower latency. The core is your single source of truth for data that could also potentially be accessed with an OAuth API or scraped. And as long as the core is up and running (which is almost always the case), the data should be available through this direct connection.

Importantly, the direct access available through a core processor is not one-to-one like OAuth (or even one-to-many where aggregator OAuth partnerships average around 10-12 of the largest U.S. banks). It turns out that the three largest core processors dominate the market with over 80% share across U.S. banks of all sizes. So relationships with core providers – either directly or through an aggregator – open up a much “longer tail” of the U.S. banking footprint that would otherwise only be available through screen scraping.

Core providers offer a measure of consistency from bank to bank in how they structure data and memorialize transactions, which introduces stability and predictability for recipients of that data. Downstream users like EDGE and our customers can better ingest and analyze data with greater historical and expected consistency.

If there are any drawbacks to aggregation of bank data via core providers relative to OAuth, they probably fall at the policy level rather than the downstream perspective of EDGE and our customers. The risk of a data breach could be magnified when credentials for accessing hundreds of banks are funneled through a core provider’s platform instead of a direct connection. However, we view this risk as theoretical given the scale and security of the leading core providers that participate on the aggregation side of the open banking ecosystem.

More for Your Money

The value you realize from bank data is a question of optimizing at the aggregator level across all of reliability, speed, data density, and data accuracy. EDGE achieves this optimization with Smart Routing across multiple aggregator partners that we monitor for up-to-the-minute changes that could impact where we direct your data requests.

Sometimes getting the best data quickly simply means defaulting to OAuth or banking core access, with the understanding that scraping introduces unknowns with each new connection attempt. At the same time, all scraping is not equal. Across billions of scraped transactions, EDGE has learned which aggregators yield higher quality payloads and route our customers’ bank data requests accordingly.
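
The routing preference described here (direct methods first, then the scraper with the best track record) can be sketched as an ordered list of attempts. This is a simplified illustration, not EDGE's Smart Routing implementation; the Route fields, the health flag, and the payload_quality score are hypothetical.

```python
from dataclasses import dataclass

# Lower rank = tried first. Direct methods are preferred over scraping.
METHOD_RANK = {"oauth": 0, "core": 0, "scrape": 1}


@dataclass(frozen=True)
class Route:
    aggregator: str
    method: str             # "oauth", "core", or "scrape"
    healthy: bool           # assumed result of the latest monitoring check
    payload_quality: float  # assumed 0..1 score learned from past payloads


def attempt_order(routes: list[Route]) -> list[Route]:
    """Drop unhealthy routes, then sort by method rank and payload quality."""
    usable = [route for route in routes if route.healthy]
    return sorted(usable, key=lambda r: (METHOD_RANK[r.method], -r.payload_quality))


routes = [
    Route("aggregator_a", "oauth", healthy=False, payload_quality=0.9),
    Route("aggregator_b", "scrape", healthy=True, payload_quality=0.6),
    Route("aggregator_c", "scrape", healthy=True, payload_quality=0.8),
    Route("aggregator_d", "core", healthy=True, payload_quality=0.7),
]
print([route.aggregator for route in attempt_order(routes)])
```

In this made-up case the OAuth route is skipped because monitoring marks it as down, the core connection is tried first, and the two scrapers follow in order of payload quality.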

EDGE is also omnivorous in that we can and do work with data from any aggregator, whether we have a direct relationship or serve as a conduit utilizing our customers’ aggregator credentials (the “bring your own data” arrangements mentioned earlier).

Leveraging our portfolio of aggregator relationships, EDGE provides Smart Routing for most of our customers because they’ve realized meaningful value in several key areas:

  • Single partner for contracting and integrating to access bank data (along with our risk analytics and risk score)
  • Better data as the basis for more robust analytics, more predictive insights, and improved decisions
  • Higher connection rates to successfully acquire data through expanded, optimized coverage

This last point is readily quantifiable, and the results may surprise you: Customers typically see a 5-15 percentage point improvement in connection rates after switching from an incumbent aggregator (or aggregator of aggregators) to EDGE. Some individual aggregators deliver a connection rate of 30-40%, with the industry average somewhere around 50%. Most of our customers who previously utilized one or more aggregators had reached 50-60% connection rates before switching to our Smart Routing.

In contrast, EDGE reliably delivers connection rates of 65-70%, with some of our customers seeing connection rates as high as 75-80% (stay tuned for an upcoming blog post on best practices to improve connection rates). We’re able to unleash more potential with open banking through the broadest possible coverage where individual aggregators overlap and differentiate, experience with billions of transactions from nearly every U.S. bank, and technology to dynamically prioritize data sources.

Connection rates have a direct impact on your portfolio and profitability. If your baseline connection rate is 60% and it would increase to 70% with NinjaEdge, your aggregator is costing you one additional applicant for every six you connect today. Poor aggregator connectivity isn’t correlated with underwriting risk, so it’s reasonable to assume that you’ll approve a similar percentage of the applicants you wouldn’t otherwise see. So in this example, you could expect a roughly 17% increase in originations (10 points of uplift on a 60-point base) simply on the basis of optimized bank data acquisition and connection rate uplift.
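
The arithmetic behind this example is easy to check. The short sketch below assumes, as the paragraph does, that the approval rate is the same for newly connected applicants; the function name is ours, chosen for illustration.

```python
def origination_uplift(baseline_rate: float, improved_rate: float) -> float:
    """Relative increase in connected applicants, assuming an unchanged approval rate."""
    if not 0 < baseline_rate <= 1 or not 0 <= improved_rate <= 1:
        raise ValueError("rates must be fractions between 0 and 1, with a non-zero baseline")
    return (improved_rate - baseline_rate) / baseline_rate


uplift = origination_uplift(0.60, 0.70)
print(f"{uplift:.1%}")
```

Moving from 60% to 70% is 10 points on a base of 60, or one sixth, which rounds to about 17%. The same function shows why uplift matters more at lower baselines: the same 10 points on a 40% baseline would be a 25% increase.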

If you’ve already introduced bank data into your underwriting, put EDGE to the test – we’re confident that poor connection rates are leaving money on the table for your business. And if you’re new to bank data, we’re your one-stop shop to access the broadest, richest dataset possible and then turn it into actionable insights with performance-optimized analytics and the leading default risk score, all based on open banking data.