Now that the 2010 Compu-Picks model has been finalized, it's time to talk about what the model is, what it's for, and a little bit about how it works.
First and foremost, as the name suggests, Compu-Picks is a predictive system that exists for the sake of making picks. The various features I may
run from time to time that use the model outputs (such as the top 1-A teams of 2004 - 2009) are interesting, but
they are not the point of the NFL and CFB models, and they are not the standard that I use to judge them.
Instead, I judge these models by their predictive accuracy. Right now, I use these models to generate ATS (against the spread) picks in both NFL and college football (1-A only).
In terms of ATS picks, the ultimate goal is a long-term average of 60% accuracy. In the nearer term, my intermediate goals are long-term averages of 55% and then 57.5%.
While there is no way to be sure, I believe that my college model has hit the first target. In 2009, it went 169-138, which works out to about
55%, and after making a number of improvements from the 2009 model to the 2010 model, I believe it should at least hit that target again this year. Using the data from the last six seasons,
it would have been above 55% in five of the six years, with only 2007 failing to do so. I certainly wouldn't make any guarantees, but history seems to indicate that 2010 should go well for this model.
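As a quick check on the arithmetic above, here is a minimal sketch that turns a win-loss record into an ATS win rate and compares it with the three targets named earlier (55%, 57.5%, 60%). The function name `ats_win_rate` is made up for this illustration and is not part of Compu-Picks; pushes are assumed to be excluded from the record.

```python
def ats_win_rate(wins: int, losses: int) -> float:
    """Fraction of graded picks that won (pushes assumed excluded)."""
    graded = wins + losses
    if graded == 0:
        raise ValueError("no graded picks")
    return wins / graded


# 2009 college record quoted in the text: 169-138.
rate_2009 = ats_win_rate(169, 138)

# Targets taken from the text: 55%, then 57.5%, then 60%.
targets = [0.55, 0.575, 0.60]
targets_met = [t for t in targets if rate_2009 >= t]

print(f"2009 ATS win rate: {rate_2009:.1%}")  # 55.0%
print("targets met:", targets_met)            # [0.55]
```

A single season of about 300 picks is a small sample, which is why the targets are stated as long-term averages rather than single-year results.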
Unfortunately, the NFL model was less successful last year, going substantially under 50% with its picks.
After making a number of improvements to the NFL model, I hope that it can hit
a long-term average of 55% (the new model would have gone only about 50% in 2009, but would have done substantially better in 2006 - 2008),
but until proven otherwise, it's still experimental.
As for how the model actually works, most of the details are confidential, for obvious reasons (would you publicly talk about a model that you think can do 55% or better ATS?).
However, I don't mind talking a little about some of the things it does and does not consider. The first point is that both models only start making picks in week 8.
This is because it takes that long before either model has enough data for me to feel comfortable with its picks and ratings. The second point is about what sort of
data it does and does not factor in. Below are the first two rows of 2009 data used by the college model:
| Date || Vegas || O/U || Week || Team1Name || Team1Num || Team1Loc || Team1Score || Team2Name || Team2Num || Team2Loc || Team2Score || OT |
| 3-Sep-09 || 17 || 58.5 || 1 || North Texas || 67 || AWAY || 20 || Ball St || 10 || HOME || 10 || NO |
| 3-Sep-09 || 4 || 63.5 || 1 || Oregon || 75 || AWAY || 8 || Boise St || 12 || HOME || 19 || NO |
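To make the row layout concrete, here is a minimal sketch of how one row in the format above could be read into a typed record. This is not the Compu-Picks code: the `Game` class, its field names, and `parse_row` are assumptions for illustration, and the meaning and sign convention of the Vegas column is not interpreted here, only stored. The date parsing assumes English month abbreviations.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class Game:
    """One row of the game table (field names are illustrative)."""
    played_on: date
    vegas: float
    over_under: float
    week: int
    team1_name: str
    team1_num: int
    team1_loc: str
    team1_score: int
    team2_name: str
    team2_num: int
    team2_loc: str
    team2_score: int
    overtime: bool


def parse_row(line: str) -> Game:
    """Parse a '| a || b || ... |' row into a Game."""
    # Drop the outer pipes, then split on the doubled-pipe separator.
    cells = [c.strip() for c in line.strip().strip("|").split("||")]
    if len(cells) != 13:
        raise ValueError(f"expected 13 columns, got {len(cells)}")
    d, vegas, ou, week, t1, n1, l1, s1, t2, n2, l2, s2, ot = cells
    return Game(
        played_on=datetime.strptime(d, "%d-%b-%y").date(),
        vegas=float(vegas),
        over_under=float(ou),
        week=int(week),
        team1_name=t1, team1_num=int(n1), team1_loc=l1, team1_score=int(s1),
        team2_name=t2, team2_num=int(n2), team2_loc=l2, team2_score=int(s2),
        overtime=(ot.upper() == "YES"),
    )


row = "| 3-Sep-09 || 4 || 63.5 || 1 || Oregon || 75 || AWAY || 8 || Boise St || 12 || HOME || 19 || NO |"
game = parse_row(row)
print(game.team2_name, game.team2_score - game.team1_score)  # Boise St 11
```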
The NFL model uses the same data set structure. I also keep separate tables for college and NFL that link each team to the division / conference it's in.
And those two tables (the data table as shown above, plus the division/conference map) are the ONLY data sources that the model currently uses.
It doesn't factor in yardage totals, AP polls, etc. Anything else that gets calculated (e.g., bye weeks) comes from the data table above.
It also doesn't currently factor in numbers or ratings from any prior seasons, though this may change in later seasons, depending on future research.
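As an illustration of how something like bye weeks can be derived from the game table alone, here is a minimal sketch. It is one plausible approach, not the model's actual method: a bye is assumed to be any week between a team's first and last game in which the team does not appear. The function name `bye_weeks` and the three-team schedule are made up for the example.

```python
from collections import defaultdict


def bye_weeks(games):
    """Map each team to the weeks it sat out between its first and last game.

    `games` is an iterable of (week, team1_name, team2_name) tuples,
    i.e. three of the columns from the game table.
    """
    weeks_played = defaultdict(set)
    for week, team1, team2 in games:
        weeks_played[team1].add(week)
        weeks_played[team2].add(week)

    byes = {}
    for team, weeks in weeks_played.items():
        full_span = range(min(weeks), max(weeks) + 1)
        byes[team] = [w for w in full_span if w not in weeks]
    return byes


# Hypothetical schedule, not real data.
schedule = [(1, "A", "B"), (2, "A", "C"), (3, "B", "C")]
print(bye_weeks(schedule))  # {'A': [], 'B': [2], 'C': []}
```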