Measuring Invisible Treatment Effects with Uplift Analysis: A Get-Out-The-Vote Example

Mike Thurber

October 23, 2020


Models make predictions by identifying consistent correlations in what has been observed, but we usually require more than predictions to know what action we should take. For example, knowing that older people are more likely to have heart disease is a good first step, but knowing behaviors or treatments that will reduce the risk of heart disease as we age is actionable. Knowing millennials are more likely to buy your product than Gen Z is nice, but knowing which marketing approach will persuade Gen Z to buy is valuable. In this election season, knowing who will vote is interesting, but identifying unlikely voters who can be persuaded to show up at the polls is everything to campaign managers. When data science goes further to estimate the impact of alternative actions we may perform to achieve a better outcome, we call it uplift modeling or, more technically, treatment effect modeling. For this instructional blog we will use a very limited example of how uplift modeling can apply to get-out-the-vote campaigns, without divulging which sample or geography was used.
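As a rough illustration of the idea (not the method used in the post itself), uplift can be estimated in its simplest form as the difference in outcome rates between a treated group and a control group within each segment. The sketch below assumes a randomized contact experiment; the function name, the segment labels, and the toy records are all hypothetical.

```python
from collections import defaultdict


def uplift_by_segment(records):
    """Estimate uplift per segment as treated turnout rate minus control turnout rate.

    records: iterable of (segment, treated, voted) tuples (hypothetical layout).
    Assumes treatment (e.g., a get-out-the-vote contact) was assigned at random.
    """
    # For each segment, keep [voters, total] separately for treated and control.
    counts = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for segment, treated, voted in records:
        cell = counts[segment][bool(treated)]
        cell[0] += int(voted)
        cell[1] += 1

    uplift = {}
    for segment, arms in counts.items():
        (t_votes, t_n), (c_votes, c_n) = arms[True], arms[False]
        if t_n == 0 or c_n == 0:
            continue  # cannot compare without both arms
        uplift[segment] = t_votes / t_n - c_votes / c_n
    return uplift


# Made-up toy data: frequent voters turn out regardless of contact,
# while infrequent voters respond to it.
records = (
    [("frequent", True, True)] * 3 + [("frequent", True, False)] * 1
    + [("frequent", False, True)] * 3 + [("frequent", False, False)] * 1
    + [("infrequent", True, True)] * 2 + [("infrequent", True, False)] * 2
    + [("infrequent", False, True)] * 1 + [("infrequent", False, False)] * 3
)
uplift = uplift_by_segment(records)
```

In this toy data the frequent voters are the most likely to vote, yet contacting them changes nothing; the estimated uplift is concentrated in the infrequent voters, which is the group a campaign would want to reach.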

Read More

Building a Successful Analytics Project from the Ground Up

Lisa Targonski

October 16, 2020

Like a house built to withstand the seasons, a successful and sustainable analytics project must start with a firm foundation, a purposeful plan, a seasoned team, and the right tools and materials. Analytics project owners and budget-holders, much like a homeowner, want to see their large investment result in something that delivers value and supports evolving needs.

Read More

Creating a Legacy as a Chief Data Officer by Optimizing the Value of Data for Public Good

Christina Ho

October 1, 2020

As the Chief Data Officer (CDO) role gets established at each federal agency as required by the Evidence-Based Policymaking Act of 2018 (Public Law 115-435) (Evidence Act), I can’t help but feel conflicted. On one hand, I am excited that the federal government has taken such an intentional step towards a more data-driven government: there are enormous potential benefits when CDOs partner with program owners to innovate and deliver real value to the federal government and our citizens. On the other hand, I am concerned this role, if not properly supported and empowered, could become yet another silo yielding little real value.

Read More

New Tool Helps States Reduce Unemployment Insurance Overpayment

Victor Diloreto

September 18, 2020

Elder Research has extensive experience helping enterprises find and eliminate fraud, waste, and abuse (FWA). For several state-level labor departments, we have assisted with identifying fraudulent unemployment insurance (UI) claims. This capability is critical today, given the record-setting unemployment of the last few months. This blog is dedicated to helping agencies that are wary of domestic and foreign abusers of the UI system by describing our successful work, where and how it is being used, and how to leverage the lessons learned and tools built to identify FWA.

Read More

Policy Impact on COVID-19 Spread

Tom Shafer

September 4, 2020

As COVID-19 began to take off in the U.S. early this year, federal, state, and county governments hurried to contain it. Policies restricting movement, closing businesses, etc., have become a staple of government policy, including many stay-at-home directives across the U.S. These measures have been met with varying levels of enthusiasm, but they are rooted in an understanding about how epidemic diseases spread. Now, with several months of data, we can begin to test whether these policies have been effective in their primary function and to what extent a controlled “reopening” raises a community’s COVID-19 risk profile.

Read More

AUC: A Fatally Flawed Model Metric

John Elder, Ph.D.

August 21, 2020

The previously published blog Recidivism, and the Failure of AUC showed how the use of “Area Under the Curve” (AUC) concealed bias against African-American defendants in a model predicting recidivism, that is, which defendants would re-offend. There, a model varied greatly in its performance characteristics depending on whether the defendant was white or black. Though both situations resulted in virtually identical AUC measures, they led to very different false alarm vs. false dismissal rates. So, AUC failed the analysts relying on it, as it quantified the wrong property of the models and thus missed their vital real-world implications.
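A toy calculation can make this concrete. The sketch below uses made-up scores for two hypothetical groups (not data from the recidivism study) and shows that two groups can share exactly the same AUC while a single decision threshold produces very different false alarm and false dismissal rates. The function names and the 0.5 threshold are illustrative assumptions.

```python
def auc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly (ties count half)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))


def error_rates(pos_scores, neg_scores, threshold):
    """Return (false alarm rate, false dismissal rate) when flagging scores >= threshold."""
    false_alarm = sum(s >= threshold for s in neg_scores) / len(neg_scores)
    false_dismissal = sum(s < threshold for s in pos_scores) / len(pos_scores)
    return false_alarm, false_dismissal


# Hypothetical scores: group B has the same ranking pattern as group A,
# but all of its scores are shifted upward.
a_pos, a_neg = [0.9, 0.8, 0.4], [0.6, 0.3, 0.2]
b_pos, b_neg = [0.95, 0.9, 0.7], [0.8, 0.6, 0.55]

auc_a, auc_b = auc(a_pos, a_neg), auc(b_pos, b_neg)
rates_a = error_rates(a_pos, a_neg, threshold=0.5)
rates_b = error_rates(b_pos, b_neg, threshold=0.5)
```

Because AUC depends only on how scores are ranked within each group, it is identical for A and B here. At the shared threshold, however, group A sees one in three negatives flagged and one in three positives missed, while every negative in group B is flagged and no positive is missed.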

Read More

Sira-Kvina Hydro Power – Tracking Down Why the Generator Failed

Peter Bruce, Ramon Perez, Mark Smith

August 7, 2020

In early 2020, Sira-Kvina Kraftselskap, a large producer of hydroelectric power in Norway, suffered a breakdown of one of its major generators. Company technicians went through established diagnostics to identify the cause, but they were unable to pinpoint the trouble. Efforts to restart the generator continued to fail and the shutdown dragged on for months.

Read More

Four Common Data Engineering Pitfalls (and How to Avoid Them)

Will Goodrum, Ph.D.

July 24, 2020

Your company has made it a strategic priority to become more data-driven. Good! A major anticipated component of this transition is to implement new data technology (e.g., a data lake). Resources are thrown at identifying source systems and pulling information into a new, analytically-focused data repository or an even bigger data lake. Time is spent creating an ETL pipeline to move data from one place to another. Web endpoints are created to facilitate access for data customers. Dashboards are created that show information available in this centralized and optimized data source. At a brief with the company executive team 12 months later, the excited response from the C-level is a resounding: So how has any of this effort made us more data-driven?

Read More

Natural Language Processing for RegTech: Uncovering Hidden Patterns in Regulatory Documents

Evan Mitchell

July 10, 2020

Natural language processing (NLP) is a branch of artificial intelligence aimed at giving computers the ability to use and understand human language and speech. Technology features we take for granted every day are a product of NLP. When you dictate a text message to Siri or ask Alexa the weather, that’s natural language processing. When our email services filter out spam, check our spelling and grammar, and even autocomplete entire messages, that’s NLP too. NLP is also a key part of Elder Research's approach to RegTech.

Read More

Success With Analytics Starts with Data Literacy

Jennifer Schaff, PhD

June 26, 2020

The field of data analytics is dynamic with rapidly evolving innovations. To realize the potential of an enterprise-wide analytics program, leaders and managers at all levels need strong data literacy. Providing opportunities for continuous learning and sharpening of skills is necessary for a robust analytics enterprise able to reliably deliver measurable value to the organization.

Read More

COVID-19: Second Wave

Peter Bruce

June 10, 2020

Will there be a second wave of COVID-19 in the fall? Individuals and organizations are making plans that depend on an answer to that question, and there is much comment in the media to the effect that there definitely will be a second wave. However, specific predictions have limited credibility, due to the poor performance of epidemiological and statistical models in forecasting the progress of the first wave while we are still in it.

Read More

SearchPPE Fills Supply Chain Gaps for PPE

Paul Derstine

June 5, 2020


In the midst of the national emergency response to the Coronavirus Disease 2019 (COVID-19) pandemic, United States manufacturers and State governments have been unable to effectively identify and leverage manufacturing resources to fuel rapid supply chain order fulfillment and delivery of Personal Protective Equipment (PPE). This lack of transparency, through all layers of the manufacturing supply chain, has resulted in increased viral exposure risk for frontline healthcare providers, as well as multi-sector business revenue loss. State government and commercial industry need better supply chain visibility to effectively manage resources in crisis response and resupply.

Read More

Visualizing the Performance of COVID Models

Chris McLean & Peter Bruce

May 26, 2020

Never before have statistical models received the attention they are getting now in the midst of the Coronavirus pandemic.  It is hard to read a news feed today without encountering either:

  • New predictions from models such as the IHME model and others, or
  • Critiques of older predictions.

So – how have older predictions turned out? 

Read More

Lockdowns Knock Down the Spread of COVID-19, but Only to a Point (and only early on)

Mike Thurber and John Elder, Ph.D.

May 19, 2020

By tracking anonymized mobile phone location data and COVID-19 case reports for many countries with different policies, we studied the effect of restricting mobility on the spread of COVID-19. We found that lockdown policies did rapidly reduce the COVID-19 reproduction ratio, R, but only up until ~3 days before a country’s peak daily case rate, and they had little or negative impact after that. Also, people should be allowed to go to parks.

Read More

A Holistic Framework for Managing Data Analytics Projects

Mike Thurber

May 15, 2020

Data Science project management must be customized to work best with each organization, but we find that our projects are most successful when managed using an Agile + CRISP-DM process rather than a traditional Waterfall approach. Sprint planning in an Agile + CRISP-DM framework constantly encourages the team to consider emerging requirements and to pivot based on findings from the previous sprint.

Read More

COVID-19: Tuesday Blips in a “Downward Trajectory”

Carl D. Hoover

May 7, 2020

We observe a peculiar counter-fluctuation in a COVID-19 statistic -- daily percent changes in deaths; it has a downward trend, but Tuesdays tend to see small increases.

COVID-19 death counts continue to increase. Figure 1a shows total U.S. deaths over the last 7 weeks along with the daily increase in deaths (dashed line). According to The COVID Tracking Project, between March 22nd and May 3rd reported deaths attributed to COVID-19 rose from 436 to a staggering 61,868.

Read More

Coronavirus: Age and Health

Peter Bruce

April 30, 2020

Although there are frequent reports in the news media of young people contracting serious cases of COVID-19 and even dying, the disease in its serious form is overwhelmingly a disease of older people. Data on U.S. deaths from the U.S. Centers for Disease Control in Figure 1 portray this vividly. (We focus on deaths rather than cases of COVID-19 because deaths are less affected by differences in testing.)

Read More

COVID-19 Social Distancing Has Mitigated 2020 Flu Season

Mike Thurber

April 27, 2020

Three weeks ago, our Brief “Is the Spread of the COVID-19 Coronavirus Being Slowed?” looked at the impact of social distancing on the flu. Evidence showed that the unprecedented measures taken by the government are having the expected effect, as measured by seasonal flu cases. In this Brief, we update and amplify that information.

Read More

Roadmap to Becoming a Data-Driven Organization

Robert Pitney

April 24, 2020

Data analytics is the process of inspecting, cleaning, transforming, and modeling data with the goal of discovering useful information to support the decision-making process. Every organization can benefit from an effective data analytics program that uses data insights to accomplish its mission more efficiently and effectively. Developing an enterprise-wide core analytics group that consolidates analytics initiatives within the organization and facilitates communication between business units requires significant organizational and cultural change. Cultivating data-driven decision-making starts with the executive leadership.

Read More

COVID-19 Asymptomatic Rates and Implications

Mike Thurber

April 20, 2020

A key issue for COVID-19 response policies moving forward is the asymptomatic rate: the share of people who have the virus but do not show symptoms. If all the people who get the virus show symptoms, then despite the wide scope of the current outbreak, we have a long way to go. In New York state, the virus epicenter in the U.S., there are nearly 220,000 tested and confirmed infections (as of 4/16/20), and the vast majority are symptomatic (it’s very hard to get tested if you're not symptomatic). Still, this is less than 1% of the total New York state population. On the other hand, if there are 20 asymptomatic individuals for every confirmed symptomatic case, that would mean 4.5 million cases in New York, probably concentrated in the New York City area. This would bring the city much closer to “herd immunity” and reduce fear of resurgence or second waves. So what is the asymptomatic rate?

Read More

Covid-19: Epidemiological Models vs. Statistical Models

Peter Bruce

April 15, 2020

Nearly everyone is now familiar with the IHME Covid-19 forecasts (also called the “Murray” model after the lead project investigator at the Institute for Health Metrics and Evaluation), and perhaps its associated interactive visualizations. 

Read More

Is the Spread of the COVID-19 Coronavirus Being Slowed?

Mike Thurber

April 7, 2020

The COVID-19 pandemic has created a need for clear and actionable analytics like never before. The world can’t wait for controlled scientific studies to be completed, or dodge with, “We can’t say anything until we get more data.” To discover true relationships, we’d love to have detailed structured public data. But we usually must take the limited data at hand and quantify actionable insights in the face of uncertainty.  Here, we’ll look at one piece of the menacing puzzle:  is Social Distancing helping?

Read More

How and Why to Interpret Black Box Models

Grant Fleming

March 27, 2020

Demand for data science services continues to accelerate, which has fueled the rapid development of ever more complex models. That complexity has contributed to the poor application of models and thus to controversy surrounding the true value of data science. It is vital for us as data scientists to ensure that, while our models continue to improve in performance, we can also interpret how they function, and thereby diagnose any harms that they might cause through biased or unfair predictions.

Read More

What is the Value of Data Engineering?

William Proffitt

March 13, 2020

With more organizations discovering the value of using data science to make better decisions, new opportunities are emerging for Data Engineers to provide support and integration for analytics teams. What’s valuable about Data Engineering skills?

Read More

How to Pick a Winning March Madness Bracket

Robert Robison

February 28, 2020

In 2019, over 40 million Americans wagered money on March Madness brackets, according to the American Gaming Association. Most of this money was bet in “bracket pools,” which consist of a group of people each entering their predictions of the NCAA tournament games along with a buy-in. The bracket that comes closest to being right wins. If you also consider the bracket pools where only pride is at stake, the number of participants is much greater. Despite all this attention, most do not give themselves the best chance to win because they are focused on the wrong question.

Read More