

Are Orange Cars Really not Lemons?

White Paper

Overview

An article in The Seattle Times reported that “an orange used car is least likely to be a lemon.” This discovery surfaced in a competition hosted by Kaggle to predict bad buys among used cars using a labeled dataset.

Of the 72,983 used cars in the dataset, 8,976 were bad buys (12.3%). Yet of the 415 orange cars, only 34 were bad (8.2%). The visualization used was entirely appropriate and accurate, but it was susceptible to the small-sample effect, and so it led to incorrect conclusions.

This white paper dives into the details and explores techniques, particularly Target Shuffling, to avoid making the same mistake.
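
To make the idea concrete, here is a minimal Python sketch of target shuffling applied to the counts quoted above. It is an illustration under stated assumptions, not the white paper's own procedure: the function name `target_shuffle_single_group`, the number of shuffles, and the seed are hypothetical choices. It also looks at the orange group alone, so it does not account for orange having been singled out as the most extreme of many colors. A fuller version would record the lowest bad-buy rate across all color groups in each shuffle, which needs per-color counts that are not given here.

```python
import random

# Counts quoted in the overview above.
TOTAL_CARS = 72_983
TOTAL_BAD = 8_976
ORANGE_CARS = 415
ORANGE_BAD = 34


def target_shuffle_single_group(total, total_bad, group_size, observed_bad,
                                n_shuffles=2000, seed=0):
    """Estimate how often chance alone gives a group this few bad buys.

    Hypothetical helper for illustration. Each iteration breaks any real
    link between group membership and the target by reassigning the
    bad-buy labels at random, then counts the bad buys in the group.
    """
    rng = random.Random(seed)
    # 1 = bad buy, 0 = good buy, in the same proportions as the dataset.
    labels = [1] * total_bad + [0] * (total - total_bad)
    hits = 0
    for _ in range(n_shuffles):
        # Shuffling every label and reading off the group's rows is
        # equivalent to drawing group_size labels without replacement.
        shuffled_bad = sum(rng.sample(labels, group_size))
        if shuffled_bad <= observed_bad:
            hits += 1
    return hits / n_shuffles


overall_rate = TOTAL_BAD / TOTAL_CARS
observed_rate = ORANGE_BAD / ORANGE_CARS
fraction = target_shuffle_single_group(
    TOTAL_CARS, TOTAL_BAD, ORANGE_CARS, ORANGE_BAD
)

print(f"Overall bad-buy rate: {overall_rate:.1%}")
print(f"Orange bad-buy rate:  {observed_rate:.1%}")
print(f"Share of shuffles with <= {ORANGE_BAD} bad buys: {fraction:.4f}")
```

The returned fraction estimates how often a group of 415 cars chosen without regard to color would look at least as good as the orange cars did. Because it tests one pre-selected group, it will tend to overstate the evidence when that group was picked after comparing many colors.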

