Efron, Simon, and the Bootstrap

Gambler’s Fallacy

Model Validation and Reproducibility of Results

Measuring Invisible Treatment Effects with Uplift Analysis: A Get-Out-The-Vote Example

New Tool Helps States Reduce Unemployment Insurance Overpayment

This blog is dedicated to helping agencies that are wary of domestic and foreign abusers of the Unemployment Insurance system. It describes our successful fraud detection work, where and how it is being used, and how to leverage the lessons learned and tools built to identify FWA.

AUC: A Fatally Flawed Model Metric

In this blog, Dr. John Elder explains the flaw in using Area Under the Curve (AUC) as a metric of model performance and describes better ways to measure that performance.

Natural Language Processing for RegTech: Uncovering Hidden Patterns in Regulatory Documents

This blog highlights how Natural Language Processing (NLP) helps regulatory agencies, regulated enterprises, and markets understand unstructured regulatory documents without spending countless hours researching, reading, and analyzing them. It helps analysts increase efficiency, derive actionable insights, and uncover hidden topics in large collections of rules, filings, or reports.

How and Why to Interpret Black Box Models

How to Pick a Winning March Madness Bracket

Updating a Data Pipeline with AWS’s Latest Offerings

Improve Predictive Model Performance With Ensembles

Dr. Jordan Barr explores the attributes and applications of model ensembles, along with their potential downsides, to provide context for when to use them.

Modeling Outcomes: Explain or Predict

This blog by Peter Bruce explores the differences between modeling for description and modeling for explanation, and how your goals determine which method to use.

Ways Machine Learning Models Fail: Missing Causes

Transaction Classification Aids Credit Risk Assessment

Data Engineering with Discipline

When trying to get decision-making insights from data, we often must start by helping to clean and organize the data architecture so we can build data science and machine learning models, a process called data engineering. This blog explores the process of preparing data for analysis.

The Problem with Random Stratified Partitioning

Group Optimization – An Application of the Nash Equilibrium

Supervised vs. Unsupervised Machine Learning

How to Automate Machine Learning Model Tuning

Hyperparameters are the high-level “knobs” or “levers” of a model. In this blog, Data Scientist Dr. Trent Bradberry explores hyperparameters in more detail, along with some ways to find good sets of them and reliably automate the model tuning process.

Fluency in the Language of Data Models