How Good Am I at Ping Pong?

I had a fantastic family vacation on a cruise earlier this year, and while most of my activity revolved around kids and waterslides, I carved out time to enter the cruise’s ping pong tournament. I play a little ping pong in the office (and of course, being a data science shop, we keep an updated R Shiny app to track and visualize our custom Elo scores), so I thought I could be competitive.

I was randomly seeded along with 15 other competitors in a single elimination tournament. With my children watching on, I proceeded to lose in the first round by a score of 12-10.

Back to the waterslide we went, and I was left shaking my head and wondering how good I truly was at ping pong. My son was quite convinced I was the worst player of the 16 that entered the tournament, but the data didn’t support that claim with very much confidence. My intuition was that I was probably in the bottom half, but I couldn’t even make that claim with much confidence. There was very little I could say conclusively.

Small Data Problems

Over the past few decades there has been so much hype about big data and the challenges that it brings with storage capacity and computational efficiency. Small data, however, also poses an interesting challenge, and one where throwing more money (storage and compute) doesn’t make the analytics easier.

How can we make inferences when we have captured very little data?

How can we forecast the future when we’ve collected little about the past (or the past is not similar to the future, e.g., during COVID)?

How can I tell where I rank in ping pong when all we know is that I lost 12-10 in the first round?

I decided to try to answer that question in an effort to show my kids that it was unlikely I was the worst player in the tournament. First, given my background as an office ping pong player and Elo ranking calculator, I made some assumptions.

These aren’t perfect, but they provide a reasonable starting place to make inferences on overall ranking.

Each player has a ping pong ‘skill level’, and these skill levels are collectively normally distributed (this is based on the Elo scores in our office approximately following a normal distribution).

For each point played in a game, either player could win, but the more ‘skilled’ player is more likely to win.

The likelihood of winning a point is approximated by a sigmoid function where the input is the difference in players’ skills. Thus, a very large difference in skills will result in near certainty for the more skilled player to win a point, and equally skilled players each have a 50% chance to win any given point.

Each point is independent of previous points (there are no ‘hot streaks’ other than chance), and each game is independent of previous games.

I encoded these assumptions in R so that I could simulate many games. There was no hard science to getting the exact parameters on players’ skill distributions, but I tried to match the distributions to the actual results that we’ve seen in our office, which means there is really one more assumption:

The players that sign up for a cruise ship ping pong tournament are reasonably similar to the employees in the Raleigh office of Elder Research.
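As a rough illustration, the assumptions above can be sketched in a few lines of Python. This is not the original R code. The N(0, 1) skill distribution, the sigmoid `scale`, the function names, and the first-to-12 scoring rule are all assumptions made for this sketch (the article does not state the scoring rules; first-to-12 is simply consistent with the 12-10 and 12-6 scores reported).

```python
import math
import random

def point_win_prob(skill_a, skill_b, scale=1.0):
    """Probability that player A wins one point: a logistic sigmoid of the skill gap."""
    return 1.0 / (1.0 + math.exp(-(skill_a - skill_b) / scale))

def simulate_game(skill_a, skill_b, rng, target=12):
    """Play independent points until one player reaches `target` (assumed rule)."""
    p_a = point_win_prob(skill_a, skill_b)
    a = b = 0
    while a < target and b < target:
        if rng.random() < p_a:
            a += 1
        else:
            b += 1
    return a, b

rng = random.Random(42)
# Assumed skill distribution: standard normal, one draw per player.
skills = [rng.gauss(0.0, 1.0) for _ in range(16)]

print(point_win_prob(0.0, 0.0))                  # equal skills -> 0.5
print(simulate_game(skills[0], skills[1], rng))  # one simulated first-round score
```

Because every point is an independent draw with the same probability, a weaker player can still string together enough points to win a game, which is exactly what makes a single result so uninformative.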

Simulating Ping-Pong Tournaments

Now we can reasonably simulate games between randomly seeded players in this tournament. Running one simulation doesn’t help much, but it is quite cheap and fast to run many thousands of simulations. Then, we can look at all the players that have lost 12-10 in the first round and see where they fall in the skill rankings. For example, suppose we randomly assign skills to 16 ‘players’, simulate a tournament, and find that only one player lost 12-10 in the first round. When we compare that player’s skill to the skill of every other player in the simulation, we can see exactly where they rank. If we do this many thousands of times, then we can see a distribution of where a 12-10 first round loser ranks:

[Figure: How did Evan rank? Rank of the First Round 12-10 Loser]
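The procedure behind a chart like this can be sketched as follows. It is a minimal Python sketch under assumed parameters (standard normal skills, a unit-scale sigmoid, first-to-12 games), not the original R simulation, and the function names are hypothetical. Only the first round is simulated, since that is all this particular question needs.

```python
import math
import random
from collections import Counter

def play_game(skill_a, skill_b, rng, target=12):
    """Return (points_a, points_b); first to `target` points wins (assumed rule)."""
    p_a = 1.0 / (1.0 + math.exp(skill_b - skill_a))  # sigmoid of the skill gap
    a = b = 0
    while a < target and b < target:
        if rng.random() < p_a:
            a += 1
        else:
            b += 1
    return a, b

def first_round_close_loser_ranks(n_sims, rng, n_players=16):
    """Tally the true skill rank (1 = best) of each 12-10 first-round loser."""
    ranks = Counter()
    for _ in range(n_sims):
        skills = [rng.gauss(0.0, 1.0) for _ in range(n_players)]
        # Skills are i.i.d., so pairing neighbours in list order is a random seeding.
        for i in range(0, n_players, 2):
            a, b = play_game(skills[i], skills[i + 1], rng)
            if (a, b) == (10, 12):
                loser = i
            elif (a, b) == (12, 10):
                loser = i + 1
            else:
                continue
            # Rank = 1 + number of players in the field with strictly higher skill.
            ranks[1 + sum(s > skills[loser] for s in skills)] += 1
    return ranks

ranks = first_round_close_loser_ranks(5000, random.Random(0))
total = sum(ranks.values())
for rank in sorted(ranks):
    print(f"rank {rank:2d}: {ranks[rank] / total:.3f}")
```

The printed shares are the simulated distribution of skill rank for a player who lost 12-10 in round one; the exact numbers depend on the assumed parameters.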

This is interesting, but it mostly tells us what our intuition already suggested: we can’t be very confident about where I rank. But as luck would have it, my wife and I ran into the gentleman that beat me later in the cruise, and it turns out he won the entire tournament! Not only that, but his score in the championship match was 12-6! This is some very useful new information, and using the same assumptions and simulation technique, we can estimate where I likely ranked out of all 16 competitors. My son still thought I was the worst, but my (maybe a little biased) intuition was that I was confidently the 2nd best player in the tournament. I must be, right!? I lost to the champ by a closer margin than the person who lost in the finals. Let’s run the simulation and see the likely ranking of someone who lost 12-10 to an eventual champion, given the champion won 12-6 in the finals. Since we’re collecting the scores of each game in the simulation, we can simulate many tournaments and only use the ones that have this exact case for our inference.

[Figure: How did Evan rank? Rank of the first round loser]
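Keeping only the simulated tournaments that match the observed facts is a form of rejection sampling. A minimal Python sketch of that filtering step is below. As before, it is an illustration under assumed parameters (standard normal skills, unit-scale sigmoid, first-to-12 games, a plain 16-player bracket) with hypothetical function names, not the original R code.

```python
import math
import random
from collections import Counter

def play_game(skill_a, skill_b, rng, target=12):
    """Return (points_a, points_b); first to `target` points wins (assumed rule)."""
    p_a = 1.0 / (1.0 + math.exp(skill_b - skill_a))
    a = b = 0
    while a < target and b < target:
        if rng.random() < p_a:
            a += 1
        else:
            b += 1
    return a, b

def conditioned_loser_ranks(n_sims, rng, n_players=16):
    """Rank of the champion's first-round victim, keeping only tournaments where
    that game ended 12-10 and the final ended 12-6."""
    ranks = Counter()
    for _ in range(n_sims):
        skills = [rng.gauss(0.0, 1.0) for _ in range(n_players)]
        alive = list(range(n_players))
        first_round = {}           # winner -> (loser, loser's points)
        final_loser_points = None
        round_no = 0
        while len(alive) > 1:
            round_no += 1
            winners = []
            for i in range(0, len(alive), 2):
                x, y = alive[i], alive[i + 1]
                px, py = play_game(skills[x], skills[y], rng)
                winner, loser, loser_pts = (x, y, py) if px > py else (y, x, px)
                winners.append(winner)
                if round_no == 1:
                    first_round[winner] = (loser, loser_pts)
                if len(alive) == 2:   # this game is the final
                    final_loser_points = loser_pts
            alive = winners
        loser, loser_pts = first_round[alive[0]]  # alive[0] is the champion
        if loser_pts == 10 and final_loser_points == 6:
            ranks[1 + sum(s > skills[loser] for s in skills)] += 1
    return ranks

ranks = conditioned_loser_ranks(10000, random.Random(7))
total = sum(ranks.values())
for rank in sorted(ranks):
    print(f"rank {rank:2d}: {ranks[rank] / total:.3f}")
```

Most simulated tournaments are discarded because they do not match both scores, so many runs are needed before the surviving sample is large enough to read as a distribution.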

These results were quite surprising to me! While it was certainly possible that I ranked 2nd, it was more likely that I ranked 3rd (since the highest bar is the third, that was the most common result of all the relevant simulations). It is also very plausible that I ranked anywhere in the top 10, and to my son’s great excitement, not completely impossible that I ranked dead last.

I was happy to point out that it was quite possible that I ranked 1st, even though I lost in the first round. Being more skilled than an opponent simply means that you are more likely to win each individual point; a less skilled player can certainly get lucky and win enough points to take the game, especially if the skill levels are close.

Why is this interesting?

Often when businesses want to employ statistics or predictive modeling, leaders are looking for a single output: Will the prospect buy? What will revenue be? How many users will sign up next month? As practitioners of data science are well aware, a single number isn’t as informative as a distribution, and can sometimes be quite misleading.

Consider a forecast for the number of widgets that will be ordered next month. Warehouse A and Warehouse B both have a forecast of 100 widgets. Is that all the information that is needed? What if Warehouse A had sold exactly 100 widgets for each of the past 24 months and Warehouse B just opened 6 months ago and ranged between 10 and 200 widgets each month? The forecast distributions for the following month could look something like this:

[Figure: Warehouse Forecasts]

The uncertainty is critical, and given that uncertainty, we can treat the two warehouses differently despite the exact same forecast. Perhaps they should have different safety stock, overtime laborers, secondary sourcing, etc. The distribution gives actionable information that is too often left out of any decision-making process, and this is the case for nearly all predictive modeling or statistical inference.
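To make the warehouse comparison concrete, here is a small Python sketch. The monthly histories are hypothetical numbers invented to match the description (24 months at exactly 100, and 6 months ranging from 10 to 200 with the same average). The forecast is a deliberately naive one, assumed for illustration only: the historical mean as the point estimate, plus a rough upper bound from a normal approximation.

```python
import statistics

# Hypothetical monthly order histories (illustrative numbers, not real data).
warehouse_a = [100] * 24                   # 24 months at exactly 100 widgets
warehouse_b = [10, 200, 60, 140, 85, 105]  # 6 volatile months, also averaging 100

def naive_forecast(history, z=1.645):
    """Return (point forecast, sample std dev, rough one-sided 95% upper bound).

    Assumes next month looks like a normal draw around the historical mean,
    which is a simplification made for this sketch.
    """
    mean = statistics.mean(history)
    spread = statistics.stdev(history)
    return mean, spread, mean + z * spread

for name, history in [("A", warehouse_a), ("B", warehouse_b)]:
    mean, spread, upper = naive_forecast(history)
    print(f"Warehouse {name}: forecast={mean:.0f}, sd={spread:.1f}, upper={upper:.1f}")
```

Both warehouses get the same point forecast, but only Warehouse B has a wide spread around it, and that spread is what would drive a decision such as how much safety stock to hold.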

Conclusion

Even if all the assumptions are true, it would be misleading to say I was the 3rd best ping pong player in that tournament, even if that was the most likely case. It isn’t as concise to share a distribution, and it isn’t as intuitive to compare distributions as it is to compare individual point estimates, but that context is critical when using data and analytics to inform real world decisions.

It also lets me hold out hope that I secretly was the best player in the tournament.

Author

Evan Wimpey

Former Director of Analytics Strategy