What's Going on Here?

Racing is an evolving sport with mathematics, engineering, and technology in every aspect... except post-race stats. Starting position, finishing position, laps led, top-5s, DNFs: that's about all you can get when trying to learn which teams are good and bad. Why? Is there nothing else interesting? Is this information impossible to get? WRONG.


I have always been interested in sports statistics, but finding and analyzing stats in NASCAR is much less interesting than in other sports. In football you can look at basic stats like rushing yards and TDs thrown, but you can also go in-depth, like passes over 20 yards to the right side vs. the left. In baseball, people have built models that predict which pitch will be thrown next or how many wins each pitcher provides their team. These types of analytics just aren't available in NASCAR. One "stat" that opened my eyes was a post I saw that said something like "It's been 56 races since Kurt Busch led a lap at a 1.5-mile track last fall." When I looked into it, there were only 22 races at 1.5-mile tracks in that 56-race streak. It's such a lazy, misleading stat that it really made me wish for something better.
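To make the difference concrete, here is a minimal Python sketch of the two ways to count that kind of streak. The `Race` record, the `races_since_led_at` function, and the five-race sample are all made up for illustration; none of it is real race data.

```python
from dataclasses import dataclass


@dataclass
class Race:
    track_length_miles: float
    led_a_lap: bool


def races_since_led_at(races, track_length, only_matching_tracks):
    """Count races since the driver last led a lap at a track of the given length.

    `races` is in chronological order (newest last). With
    only_matching_tracks=False every race in the gap is counted (the "lazy"
    version); with True only races at that track length are counted.
    """
    count = 0
    for race in reversed(races):
        matches = race.track_length_miles == track_length
        if matches and race.led_a_lap:
            break  # streak starts after this race
        if matches or not only_matching_tracks:
            count += 1
    return count


# Hypothetical five-race stretch, oldest first (not real results).
sample = [
    Race(1.5, True),
    Race(0.5, True),
    Race(1.5, False),
    Race(2.5, False),
    Race(1.5, False),
]

lazy_count = races_since_led_at(sample, 1.5, only_matching_tracks=False)
honest_count = races_since_led_at(sample, 1.5, only_matching_tracks=True)
print(lazy_count, honest_count)
```

The first number counts every race on the schedule, while the second counts only the races where the driver actually had a chance to lead at a 1.5-mile track, which is the number the stat implies.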


The idea for this project started when I learned about the advanced stats movement in the NHL. Hockey fans started to capture their own data (and use the large amounts of data the NHL provided) to evaluate teams and players in much more depth than the previously recorded stats allowed. This has become a community of mathematicians and hobbyists creating highly advanced models for evaluating players. Some of these newly recorded statistics and models have even found their way into teams' talent evaluation processes and become full-time jobs for the people behind them. Now I'm trying to bring that movement to NASCAR.


The data collection process has been the biggest obstacle to these advanced stats, since the information wasn't stored anywhere. NASCAR didn't really provide much information about a race after it concluded. I created a scraper for NASCAR's timing and scoring webpage to get the information in real time as the race was taking place. The data was stored and, through the power of Excel spreadsheets, something useful came out the other side. With that data, it is possible to analyze how drivers do on every lap throughout the race, not just how they started and finished. In 2020, NASCAR started to publish significantly more data from the entire race, which can be scraped and stored for analysis. Since it is stored, it doesn't need to be scraped in real time, and it is [almost] guaranteed to be accurate.


Using this data, there is so much more that can be found. We can see which drivers are the fastest on restarts or on older tires. We can analyze getting on and off pit road and see which track really is the most difficult or who is the best. We could see which drivers are the best at passing, or leading, or getting the best finish with a bad car. I think this is much more valuable than looking at the finishing results on Monday and thinking that's how the entire race went. There is some really cool information if you dig into it, and I plan to.
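As a rough illustration of what lap-by-lap data makes possible, the Python sketch below computes two simple numbers per driver: average running position and positions gained from one lap to the next. The `(lap, driver, position)` row layout, the function names, and the three-lap sample are assumptions for the example, not NASCAR's actual data format or real results. Lap-to-lap position gains are also only a stand-in for passing, since pit stops and cautions shuffle the order too.

```python
from collections import defaultdict


def average_running_position(lap_rows):
    """Mean position across all recorded laps, per driver."""
    totals = defaultdict(lambda: [0, 0])  # driver -> [sum of positions, lap count]
    for _lap, driver, position in lap_rows:
        totals[driver][0] += position
        totals[driver][1] += 1
    return {driver: total / laps for driver, (total, laps) in totals.items()}


def positions_gained(lap_rows):
    """Total spots gained lap over lap, per driver (spots lost are ignored)."""
    by_driver = defaultdict(list)
    for _lap, driver, position in sorted(lap_rows):  # sort so laps are in order
        by_driver[driver].append(position)
    return {
        driver: sum(max(prev - cur, 0) for prev, cur in zip(pos, pos[1:]))
        for driver, pos in by_driver.items()
    }


# Hypothetical three-lap, three-driver sample: (lap, driver, position).
sample_laps = [
    (1, "A", 3), (1, "B", 1), (1, "C", 2),
    (2, "A", 2), (2, "B", 1), (2, "C", 3),
    (3, "A", 1), (3, "B", 2), (3, "C", 3),
]

avg_pos = average_running_position(sample_laps)
gained = positions_gained(sample_laps)
print(avg_pos["A"], gained["A"])
```

In this made-up sample, driver A finishes first with the most positions gained, yet driver B has the better average running position. A Monday results sheet hides exactly that kind of difference.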

© 2023 by The Real Speed Blog
