First, some background information.
- A regression problem is when your neural network outputs a continuous value. Think housing prices, stock market value, etc.
- A spread in sports is what the sportsbook thinks the difference in the final score of a game will be. For example, if the book expects Team 1 to finish with 24 points and Team 2 with 21 points, the spread is -3 for Team 1.
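The sign convention in that example can be written as a one-line calculation. This is only a small sketch; the function name `spread_from_scores` is made up for illustration.

```python
def spread_from_scores(team_points, opponent_points):
    """Spread from one team's point of view.

    Negative means this team is ahead by that many points,
    matching the 24-21 example (spread of -3 for Team 1).
    """
    return opponent_points - team_points


team_1_spread = spread_from_scores(24, 21)
team_2_spread = spread_from_scores(21, 24)
```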
I am assuming that some readers could guess where this is going. After doing the DevPost project on college basketball I tried to mess around with NFL scores. Going in I knew that I wouldn’t actually get anything of value but it would be “fun” to try. Anyway, I started training my model and I was getting amazing results. I was within a quarter of a point. If I actually had a model that would do that I would be a billionaire and pretty much shut down the sports betting market.
Well, since I am still here and not on a personal island you can assume I messed up. It turns out that I had skipped the step where I removed the ACTUAL SCORE OF THE GAME from my training data. The network picked up on this and was ignoring all my other inputs.
Back to work, I guess.
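A cheap guard against this kind of leak is to strip the outcome columns before anything reaches the network, and to fail loudly if one slips through. The sketch below is not the code from my project; the column names (`home_score`, `away_score`, `spread`, and the stat columns) are hypothetical stand-ins.

```python
# Hypothetical column names; the real dataset's columns will differ.
TARGET = "spread"
LEAKY_COLUMNS = {"home_score", "away_score"}  # only known AFTER the game


def split_features_and_target(rows):
    """Separate the target from the inputs and drop post-game columns."""
    features, targets = [], []
    for row in rows:
        targets.append(row[TARGET])
        features.append({
            name: value
            for name, value in row.items()
            if name != TARGET and name not in LEAKY_COLUMNS
        })
    return features, targets


def assert_no_leak(features):
    """Raise if any post-game column is still present in the inputs."""
    for row in features:
        leaked = LEAKY_COLUMNS.intersection(row)
        if leaked:
            raise ValueError(f"leaky columns in training data: {sorted(leaked)}")


# Made-up example row, purely for illustration.
games = [
    {"home_ppg": 27.1, "away_ppg": 20.3,
     "home_score": 24, "away_score": 21, "spread": -3},
]
features, targets = split_features_and_target(games)
assert_no_leak(features)
```

A suspiciously good validation error, like being within a quarter of a point, is usually the first hint that a check like this is missing.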
As I am going through the Google Developer Expert process I was tasked with coding up KNN in TensorFlow. I couldn’t find an example online so I decided to create it.
The gist was that I could use *tf.math.squared_difference* to measure the difference between a sample and each training point. Then, I used tf.reduce_sum to combine the 4 feature differences into a single number. Because tf.math.top_k returns the largest values, I flipped the sign of the distances with tf.negative first so that top_k would grab the nearest neighbors. Pretty straightforward.
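Those steps can be sketched roughly as follows. This is a minimal reconstruction from the description above, not the exact code I submitted; the helper name `knn_indices` and the tiny 4-feature dataset are made up for illustration.

```python
import tensorflow as tf


def knn_indices(train_x, query, k):
    """Return the indices of the k training rows closest to `query`.

    train_x: float tensor of shape [n_samples, 4]
    query:   float tensor of shape [4]
    """
    # Per-feature squared difference; `query` broadcasts across the rows.
    sq_diff = tf.math.squared_difference(train_x, query)  # [n_samples, 4]
    # Collapse the 4 feature differences into one squared distance per row.
    distances = tf.reduce_sum(sq_diff, axis=1)  # [n_samples]
    # top_k picks the LARGEST values, so negate to make the nearest rows win.
    _, indices = tf.math.top_k(tf.negative(distances), k=k)
    return indices


# Made-up training data: 4 samples, 4 features each, with integer labels.
train_x = tf.constant([[5.1, 3.5, 1.4, 0.2],
                       [4.9, 3.0, 1.4, 0.2],
                       [6.7, 3.1, 4.7, 1.5],
                       [6.3, 3.3, 6.0, 2.5]], dtype=tf.float32)
train_y = tf.constant([0, 0, 1, 2])
query = tf.constant([5.0, 3.4, 1.5, 0.2], dtype=tf.float32)

nearest = knn_indices(train_x, query, k=2)
nearest_labels = tf.gather(train_y, nearest)  # labels of the k neighbors
```

From here a prediction is just a majority vote over `nearest_labels`. Skipping the square root is fine because it does not change the ordering of the distances.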
Overall, it was a great experience. TWCC has done a great job in their 20+ years and this year was no different. I wasn’t able to stay the entire time but from everything I saw it was great. The facilities were perfect for an event this size, it appeared that everyone was getting along, and there were multiple groups of people having conversations about the topics that were presented.
The only downside was that I had to leave home at 5am to get to the start and hit some ice on the way up. Can’t fight mother nature!
GitHub Repo: https://github.com/ehennis/ReinforcementLearning
This post is about what I learned while submitting my basketball spread application to DevPost, which I covered in a previous post.
For data collection I used a C# application that would download the NCAA results for the last 4 seasons. It was pretty straightforward string manipulation. It wasn’t until later that I discovered that some team names aren’t written the same way everywhere: “State” versus “St.”, for example. At first, I went back and cleaned up the CSV but decided that it would be smarter to just handle it in code. This would allow me to not worry about changes going forward.
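Handling the names in code can be as simple as a lookup table applied to every team name as it is read. The sketch below is in Python rather than the C# I actually used, and the alias entries are hypothetical examples, not my real list.

```python
# Hypothetical alias table; extend it as new spellings show up in the data.
TEAM_ALIASES = {
    "Kansas St.": "Kansas State",
    "Michigan St.": "Michigan State",
}


def canonical_team_name(raw_name):
    """Map any known spelling of a team name to a single canonical form."""
    # Collapse stray whitespace before the lookup.
    name = " ".join(raw_name.split())
    # Unknown names pass through unchanged.
    return TEAM_ALIASES.get(name, name)


cleaned = [canonical_team_name(n) for n in ["Kansas St.", " Kansas  State ", "Duke"]]
```

An explicit table is safer than blindly replacing “St.” with “State”, since “St.” can also mean “Saint” in a school name.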
I fought with the structure (layers, nodes, optimizer, activations) for a while. I knew that I didn’t want more than a few layers, and with only 20k games I didn’t want a lot of nodes. I settled into a sweet spot with 32/32. I had dropout layers at first but decided to remove them.
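For reference, a 32/32 regression network looks roughly like this in Keras. Treat it as a sketch under assumptions: I am reading 32/32 as two hidden layers of 32 nodes, and the input width (`NUM_FEATURES = 10`), the ReLU activations, the Adam optimizer, and the MSE loss are placeholders rather than the settings I ended up with.

```python
import tensorflow as tf

NUM_FEATURES = 10  # assumed input width; depends on the stats you feed in

model = tf.keras.Sequential([
    tf.keras.Input(shape=(NUM_FEATURES,)),
    tf.keras.layers.Dense(32, activation="relu"),  # assumed activation
    tf.keras.layers.Dense(32, activation="relu"),
    # Single linear output: the predicted spread (a regression target).
    tf.keras.layers.Dense(1),
])

# Assumed optimizer and loss; mean squared error is a common regression choice.
model.compile(optimizer="adam", loss="mse")
```

With these placeholder sizes the network has well under two thousand trainable weights, which is in keeping with not wanting many nodes for 20k games.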
My network is set up for a Home/Away structure, and a tournament game doesn’t have a home team. At first I assumed that it wouldn’t matter as long as I used both teams’ “away” stats. This turned out to be not true. To work around this issue I ran the prediction twice, once with each team as the home team, and averaged the results.
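The workaround can be sketched as below. The predictor here is a stand-in (`predict_home_margin` is a fake model with a fixed home-court bonus), and flipping the sign of the second prediction so that both numbers are from the same team’s point of view is my assumption about how the averaging has to work.

```python
def predict_home_margin(home_stats, away_stats):
    """Stand-in for the trained network: expected margin for the HOME team.

    Fake model for illustration only: rating difference plus a
    3-point home-court bonus.
    """
    return home_stats["rating"] - away_stats["rating"] + 3.0


def predict_neutral_margin(team_a, team_b):
    """Expected margin for team_a on a neutral court."""
    a_at_home = predict_home_margin(team_a, team_b)  # margin for team_a
    b_at_home = predict_home_margin(team_b, team_a)  # margin for team_b
    # Negate the second run so both are from team_a's side, then average.
    return (a_at_home - b_at_home) / 2.0


# Made-up ratings, purely for illustration.
team_a = {"rating": 80.0}
team_b = {"rating": 75.0}
neutral_margin = predict_neutral_margin(team_a, team_b)
```

With the fake model the home-court bonus cancels out exactly, which is the whole point of running the prediction in both directions.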
Overall, I liked the experience and will probably try to do an NFL version. Just keep in mind that with MILLIONS of dollars spent each week in Vegas, you will never beat their lines overall. But you can find some weaknesses and beat them in a few games.