New Tool Helps States Reduce Unemployment Insurance Overpayment

Victor Diloreto

September 18, 2020

Elder Research has extensive experience helping enterprises find and eliminate fraud, waste, and abuse (FWA). For several state-level labor departments, we have helped identify fraudulent unemployment insurance (UI) claims. This capability is critical today, given the record-setting unemployment of the last few months. This post is meant to help agencies that are wary of domestic and foreign abusers of the UI system by describing our successful work, where and how it is being used, and how to leverage the lessons learned and tools built to identify FWA.

The Tool - DAPM

The Data Analytics and Predictive Modeling (DAPM) tool combines business rules, mathematical algorithms, and predictive machine learning models to evaluate claims data and assess the risk of overpayment. Because the DAPM tool is designed to be adoptable by any state workforce agency, it uses generally available UI data sets (claims data, certifications data, employment and income data, etc.), with the vision that states adopting the tool could first implement the core product and then build out functionality specific to their UI landscape. The analytics generated by DAPM are delivered using Elder Research’s RADR, a powerful, server-based data analytics product that fuses data from multiple sources, supports sophisticated predictive and machine learning risk models, and provides a flexible and intuitive visual interface.

The DAPM analytical modules can be classified into four broad groups:

  • IP address-based modules (IP Location, Changing IP, Sharing IP, Travel Risk, Employer IP Sharing Risk)
  • Associations to previously flagged claimants module
  • Report-based modules (Quarterly Wage Report, National Directory of New Hires)
  • Supervised models (k-Nearest Neighbors (kNN), Support Vector Machine, and Random Forest)
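
To give a feel for what an IP address-based module might compute, here is a minimal sketch of a "Sharing IP" style check. It is an illustration only, not DAPM's actual logic: the `(claimant_id, ip_address)` input shape and the function name `shared_ip_counts` are assumptions made for this example.

```python
from collections import defaultdict

def shared_ip_counts(claims):
    """For each claimant, return the largest number of distinct claimants
    seen on any single IP address that claimant used.

    `claims` is an iterable of (claimant_id, ip_address) pairs; this input
    shape is hypothetical, not DAPM's actual schema.
    """
    claimants_by_ip = defaultdict(set)
    for claimant_id, ip in claims:
        claimants_by_ip[ip].add(claimant_id)

    result = {}
    for claimants in claimants_by_ip.values():
        for claimant_id in claimants:
            result[claimant_id] = max(result.get(claimant_id, 0), len(claimants))
    return result

# Toy data using documentation-reserved IP ranges.
claims = [
    ("A", "203.0.113.7"),
    ("B", "203.0.113.7"),
    ("C", "203.0.113.7"),
    ("D", "198.51.100.2"),
    ("A", "198.51.100.9"),
]
counts = shared_ip_counts(claims)
```

In this toy data, claimants A, B, and C share one address, so each gets a count of 3, while D gets 1. A real module would presumably turn such counts into a risk score and account for legitimate sharing, such as households or public networks.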

Table-1

Each of these modules generates a risk score for each claimant. The module scores are aggregated and normalized to a scale of 0 to 100, with lower scores indicating that an overpayment is less likely and higher scores indicating that it is more likely.
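
The aggregation and normalization methods are not spelled out here, so the following is only a sketch under stated assumptions: it averages each claimant's module scores and then applies min-max scaling to reach the 0 to 100 range. Both choices, and the name `combined_risk_scores`, are hypothetical stand-ins.

```python
def combined_risk_scores(module_scores):
    """Average each claimant's module scores, then min-max scale to 0-100.

    `module_scores` maps claimant id -> non-empty list of per-module scores.
    Averaging and min-max scaling are assumptions for illustration only.
    """
    raw = {cid: sum(s) / len(s) for cid, s in module_scores.items()}
    lo, hi = min(raw.values()), max(raw.values())
    if hi == lo:
        # Every claimant has the same raw score; nothing to spread out.
        return {cid: 0.0 for cid in raw}
    return {cid: 100.0 * (v - lo) / (hi - lo) for cid, v in raw.items()}

module_scores = {
    "A": [0.9, 0.7, 0.8],
    "B": [0.1, 0.3, 0.2],
    "C": [0.5, 0.5, 0.5],
}
scores = combined_risk_scores(module_scores)
```

With this toy input, claimant A lands at the top of the scale, B at the bottom, and C near the middle. A weighted combination or a percentile rank would be equally plausible alternatives.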

Performance

Gain charts are used to assess the performance of this scoring process. A gain chart measures how quickly a model identifies overpayments by charting the percentage of certifications in the validation data set that subject matter experts (SMEs) would need to examine in order to detect a given percentage of the overpayments in the data set. For example, a model that identifies 25% of overpayments after examining only 5% of certifications would be extremely useful: it would identify overpayments at five times the baseline rate.
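
The arithmetic behind that example can be sketched in a few lines. The function name `gain_at` and the toy data are assumptions for illustration; the lift is simply the gain divided by the fraction of certifications examined.

```python
def gain_at(scored, fraction):
    """Share of all overpayments found in the top `fraction` of
    certifications when ranked by risk score (highest first).

    `scored` is a list of (risk_score, is_overpayment) pairs.
    """
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    n_examined = round(len(ranked) * fraction)
    found = sum(1 for _, overpaid in ranked[:n_examined] if overpaid)
    total = sum(1 for _, overpaid in ranked if overpaid)
    return found / total

# Toy validation set: 20 certifications, 4 of them overpayments,
# with the highest-scoring certification being one of the four.
scored = [(100 - 5 * i, i in {0, 4, 9, 15}) for i in range(20)]

gain = gain_at(scored, 0.05)  # examine the top 1 of 20 certifications
lift = gain / 0.05            # gain relative to the random baseline
```

Here the top 5% of certifications contains one of the four overpayments, so the gain is 25% and the lift works out to five, matching the example in the text.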

The gain charts contain three lines: the performance of the risk-scoring model (labeled “Model”), the baseline performance (“Random”), and the best performance possible if the model knew exactly which certifications were overpaid (“Wizard”). The charts below reflect a sample of results, including one that focuses on the highest-scoring 5% of claimants. The top 5% to 10% of claimants are typically of greatest interest because they represent the claims with the highest risk and are the ones most worth examining given limited resources. Results will vary by deployment, and a combination of model tuning and additional modules can hone results for a specific agency’s requirements.
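
As a rough sketch of how the three lines relate, the function below builds all of them from a list of scored certifications. The name `gain_curves` and the sample data are hypothetical; plotting is left out so the example stays dependency-free.

```python
def gain_curves(scored):
    """Return (model, random_line, wizard) cumulative-gain curves.

    Entry k of each list is the share of overpayments found after
    examining the first k + 1 certifications.
    `scored` is a list of (risk_score, is_overpayment) pairs.
    """
    n = len(scored)
    total = sum(1 for _, overpaid in scored if overpaid)

    def cumulative(labels):
        found, curve = 0, []
        for overpaid in labels:
            found += overpaid
            curve.append(found / total)
        return curve

    # "Model": examine certifications from highest to lowest risk score.
    by_score = [o for _, o in sorted(scored, key=lambda p: p[0], reverse=True)]
    # "Wizard": a perfect ordering that examines every overpayment first.
    perfect = sorted((o for _, o in scored), reverse=True)
    # "Random": with no model, expected gain grows in step with effort.
    random_line = [(k + 1) / n for k in range(n)]
    return cumulative(by_score), random_line, cumulative(perfect)

scored = [(90, True), (80, False), (70, True), (60, False), (50, False)]
model, random_line, wizard = gain_curves(scored)
```

The closer the "Model" curve sits to the "Wizard" curve, and the further above the "Random" line, the fewer certifications analysts need to examine to find a given share of overpayments.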

Figure 1: Gain charts showing the Model, Random, and Wizard lines

Delivery

Elder Research built Extract, Transform, and Load (ETL) pipelines to pull the required data from an agency’s data warehouse and prepare it for analysis, and configured RADR to allow analysts to view the results and work cases. The pipelines were automated to pull data on a periodic basis, update the model scores, and push the updates to RADR to provide analysts with fresh case scores. The system was deployed on internal agency servers but could be deployed in a cloud environment as well.
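
The overall shape of one refresh cycle can be sketched as follows. This is not the released DAPM pipeline: `run_pipeline`, the field names, and the in-memory stand-ins for the warehouse and the RADR-facing store are all assumptions made for illustration.

```python
def run_pipeline(extract, transform, score, publish):
    """Run one refresh cycle: pull rows, prepare them, score them,
    and push the results. Returns the number of claimants scored."""
    rows = extract()
    prepared = [transform(row) for row in rows]
    results = [
        {"claimant_id": row["claimant_id"], "score": score(row)}
        for row in prepared
    ]
    publish(results)
    return len(results)

# In-memory stand-ins for the agency data warehouse and the store that
# the analyst interface reads from (both hypothetical).
warehouse = [
    {"claimant_id": "A", "ip_changes": "4"},
    {"claimant_id": "B", "ip_changes": "0"},
]
published = []

n_scored = run_pipeline(
    extract=lambda: warehouse,
    # Transform step: cast the raw text field to an integer.
    transform=lambda row: {**row, "ip_changes": int(row["ip_changes"])},
    # Toy scoring rule, capped at 100; a real deployment would call the models.
    score=lambda row: min(100, 25 * row["ip_changes"]),
    publish=published.extend,
)
```

Passing each stage in as a function keeps the cycle easy to test and lets a scheduler invoke `run_pipeline` on whatever periodic basis the agency needs.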

The RADR configuration developed for this work lets investigators easily view the claim risk scores, and it can readily be altered to change or add views for a specific need. The current configuration supports several views:

  • Claimant Score Listings
  • Aggregation of Claimants Across Key Attributes
    • Employer
    • Email
    • IP Address
    • Phone Number
  • Claimant Details
    • Includes a graph to associate attributes to other claimants

Claimant Score Listing

Aggregation – IP Address Example

Claimant Detail – Summary

Claimant Detail – Graph

Leveraging DAPM

Given the recent influx of UI claims, we are working with several state agencies to enhance the current deployment and demonstrate how DAPM can improve their analysts’ efficiency. The source code required for implementing the DAPM ETL pipeline and models has been released as open source. While a full implementation will rely on agency-specific data, this solution provides the framework for the pipeline and training. Elder Research can be consulted to support, install, and/or augment these models for specific requirements.




About the Author

Vic Diloreto leads the software engineering group at Elder Research. In this role, Vic is chartered with the continuing support of our data science service to clients where software is needed in data preparation and/or visualization. Vic is also leading the efforts to convert select portions of Elder Research’s intellectual property library into standalone products. Prior to Elder Research, Vic was a Senior Director of Technology at MegaPath, where he led a team in charge of architecture, design, and operations of advanced communication solutions. Prior to MegaPath, Vic was VP of Engineering and CTO of telecommunication start-up sentitO Networks, where he led hardware and software development. In this capacity, he assisted in raising 60M in venture financing. Vic also held engineering and management positions at BNR/Northern Telecom and Pulsecom, most notably leading the technical aspects of the sale of Pulsecom’s Broadband division to ECI Telecom for 61M.