One new feature that I’ll be doing this year is a weekly set of computer picks. Unlike some other models which focus on beating spreads or simply picking the winners, this model will focus on something else: the money line. Each week it will estimate the probable outcomes of each game, compare it to the money line, and come up with a set of low, medium and high confidence picks. It will then make “bets” (not real money, obviously) of increasing scale depending on which (if any) pool each game falls into, and keep track of all of the results.
In order to ensure equal weights for all games, every “bet” inside a confidence pool will have the same amount of “money” at stake. In other words, the amount risked and the amount that can be won on any game will always add to a constant ($5 for Low, $10 for Medium, $15 for High).
For instance, let’s take a look at last week’s game between Purdue and Northwestern, where the money line was -300 for Purdue (bet 3 to win 1) and +260 for Northwestern (bet 1 to win 2.60). If the computer had thought that Purdue was the pick with low confidence, it would have bet $3.75 to potentially win $1.25. On the other hand, if it had picked Northwestern, it would have bet $1.39 to potentially win $3.61 (both numbers rounded). In either case, the total amount at stake would be $5.
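The split above follows directly from the money line. Here's a small sketch of the arithmetic (the function name and rounding choice are mine, not from the post), along with the implied win probability that the model's estimate would be compared against:

```python
def stakes_for_total(money_line, total=5.0):
    """Split a fixed total at stake into (risk, potential win),
    given an American money line."""
    if money_line < 0:
        ratio = -money_line / 100.0   # e.g. -300 -> risk 3 to win 1
        risk = total * ratio / (ratio + 1)
    else:
        ratio = money_line / 100.0    # e.g. +260 -> risk 1 to win 2.60
        risk = total / (ratio + 1)
    return round(risk, 2), round(total - risk, 2)

def implied_probability(money_line):
    """Break-even win probability implied by a money line
    (ignoring the bookmaker's vig)."""
    if money_line < 0:
        return -money_line / (-money_line + 100.0)
    return 100.0 / (money_line + 100.0)

print(stakes_for_total(-300))   # Purdue at low confidence
print(stakes_for_total(260))    # Northwestern at low confidence
```

For the Purdue/Northwestern example, the -300 line works out to a $3.75 risk / $1.25 win split and implies roughly a 75% chance of a Purdue win; the model would only flag a pick if its own probability estimate beat that threshold by enough.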
One other feature that will be included is a weekly set of upset picks. These are games in which the computer believes that the upset is the most likely outcome. And while the computer is wrong more often than it's right on these (and it makes some really wild picks from time to time, like 2005's picks of Indiana over Michigan State or Cincinnati over Louisville), the upset picks have a tendency to cover the spread for some reason, which is why they'll be included as well. Just remember not to take them too seriously, since early on especially (i.e. this week) you're going to see some weird picks.
And now a bit about how the actual model is built. Unlike a lot of other models, which rely on separately calculated parts and then combine them in one manner or another, this model has everything working in concert to build each team’s set of rating numbers. Each and every game is graded on a scale of 0 through 1 (for the winner; the loser gets the negative of that number), taking into account straight margin of victory, relative margin of victory (i.e. a 10-0 win is more impressive than a 42-28 win, despite a smaller absolute margin), and whether or not the game goes into overtime, since any overtime result is much less decisive than the same score in regulation.
Every score is then balanced against how strong the other team was, and the whole system is solved simultaneously, with adjustments made for each team’s individual home-field advantage and improvement (or lack thereof) over the course of the year (treated as a linear function). After that, the system looks at each game result for each team to see how much variation each team has.
And then, to make the new picks, the system compares each team’s rating numbers and variance and churns out a probability estimate for the upcoming matchup.
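The post doesn't give the exact formula for this last step, but one common way to turn two ratings and their variances into a win probability is to model each team's performance as an independent normal distribution. A minimal sketch under that assumption (the function and its details are illustrative, not the author's actual method):

```python
from math import erf, sqrt

def win_probability(rating_a, rating_b, var_a, var_b):
    """P(team A outperforms team B), assuming each team's
    single-game performance is normal around its rating with
    its observed variance, independently of the other team."""
    diff = rating_a - rating_b
    sd = sqrt(var_a + var_b)          # variance of the difference
    # Normal CDF evaluated at diff / sd, via the error function
    return 0.5 * (1.0 + erf(diff / (sd * sqrt(2.0))))
```

A higher-variance team drags the probability toward 50% even when its rating is worse, which is how an inconsistent underdog can still rate as a live upset pick.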
And that’s about it. For anyone who’s curious about the game data I used to build the model, here are the results for all of the games so far this year (1-A results only). Feel free to let me know if there’s an error in there (they're entered manually, so I wouldn't be surprised if I missed a typo somewhere). Weeks are arbitrarily defined as starting on Wednesdays and ending on Tuesdays (i.e. Saturday-centric), which is why the week numbers for some games are different from what you’d see elsewhere.
Questions, comments or suggestions? Email me at firstname.lastname@example.org