Chad Milliman posted an
interesting podcast earlier this week with Dr. Bob. One of the things they talked about
was technical betting (using ATS trends to predict covers, as opposed to fundamental betting,
which looks at the relative strengths of two teams and compares that to the spread to try to find value)
and why Dr. Bob tends to be skeptical of systems that promise
80%+ returns (which, as he discovered, tend to hover around 50% once they get published
and sold to the general public). I found the discussion pretty interesting and thought I'd
weigh in with my take on why this happens.
1) It is REALLY easy to generate seemingly meaningful relationships from random data.
Instead of going through a theoretical explanation of why this is, I'm going to illustrate this with the simple example of a coin flip.
I've got a coin, and I'm going to flip it 16 times.
That's plenty of data for you to reasonably establish whether something really wacky is going on... or is it?
For a simple "I've got one coin, let's see if it's heavily biased towards heads or tails" test, it pretty much is.
If I flip, say, 12+ heads or 12+ tails, maybe it's just luck, but much more likely there's a bias involved and it's not a fair coin.
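Here's a quick sketch of that intuition in numbers. It computes the exact chance that a fair coin lands 12 or more heads (or 12 or more tails) in 16 flips. The function name `tail_probability` is just something I made up for this illustration.

```python
from math import comb

def tail_probability(n_flips: int, threshold: int) -> float:
    """Chance a fair coin shows `threshold` or more heads in `n_flips` flips."""
    favorable = sum(comb(n_flips, k) for k in range(threshold, n_flips + 1))
    return favorable / 2 ** n_flips

one_sided = tail_probability(16, 12)  # 12+ heads
two_sided = 2 * one_sided             # 12+ heads or 12+ tails (the two cases can't overlap)

print(f"12+ heads: {one_sided:.4f}")
print(f"12+ heads or 12+ tails: {two_sided:.4f}")
```

That works out to roughly 3.8% for one direction and 7.7% for either direction, so a fair coin producing a 12-4 split is unusual but not impossible.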
But it gets messier when we start applying a bunch of other tests to the data. Let's use the following set of coinflips (randomly generated in Excel)
and apply some aggressive data mining logic (which is essentially what technical analysis is: digging through large data sets and applying large numbers
of different rules and combinations of data until something interesting pops out):
0  1  1  0 
0  1  1  1 
1  1  0  1 
0  1  0  0 
That's a set of 16 coinflips (1 = heads, 0 = tails), with the first row being the first four, the second row being the second four, etc. So what can data mining tell us?
Let's start with a somewhat straightforward adjustment: instead of there being just one coin, now there are two coins, coin A and coin B. Each gets flipped the same number of times,
and the order is determined in some sort of logical way. I'm not going to tell you what that way is, just that it's logical. And I'm not going to tell you for sure if
either of the coins is fair or not (remember, we do know that it's all generated through 50-50 coinflips, but for the sake of the thought experiment we'll pretend we don't know).
With just two coins, there aren't a whole lot of especially obvious ways of organizing the flips. The only two which really stand out are alternating coin flips and flipping the first eight times
and then the second eight times. So let's test those two rules and see if we can find (false) evidence of unfair coins. The first test, alternating coin flips, gets us 3 heads for coin A (add first, third columns)
and six heads for coin B (add second, fourth columns). The second test, flipping coin A eight times then coin B eight times, gets us five heads for coin A and four for coin B.
So in this case, the strongest potential conclusion we can reach is in the first test, where "Coin B" gave us six out of eight heads. A 6-2 record is nice, but it's only 75% on a fairly small sample size.
If we want a strong conclusion, we need to dig deeper.
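For anyone who wants to check the two-coin counts, here's a minimal sketch that applies both rules to the 16 flips in the table. The variable names are mine, and the only assumption is that the table is read left to right, top to bottom.

```python
# The 16 flips from the table above, row by row (1 = heads, 0 = tails).
flips = [0, 1, 1, 0,
         0, 1, 1, 1,
         1, 1, 0, 1,
         0, 1, 0, 0]

# Test 1: the coins alternate, so A gets flips 1, 3, 5, ... and B gets 2, 4, 6, ...
alt_a = sum(flips[0::2])
alt_b = sum(flips[1::2])

# Test 2: A is flipped eight times, then B is flipped eight times.
half_a = sum(flips[:8])
half_b = sum(flips[8:])

print(f"alternating: A {alt_a}/8 heads, B {alt_b}/8 heads")
print(f"eight then eight: A {half_a}/8 heads, B {half_b}/8 heads")
```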
So now let's say that instead of two coins, there were four coins, A through D. Again, we don't know how they were ordered, just that it was "something logical." So what can we do with four coins?
A few things: we can flip one after the other (1); we can flip A four times, then B four times, etc. (2); or we can do 2 flips of A, then 2 of B, then 2 of A, then 2 of B, then do the same with C and D (3),
which splits the table into quadrants: upper left, upper right, lower left, lower right.
Scenario 3 might seem counterintuitive, but it's just slicing the data set in half one way, then cutting it in a different direction to split A and B or C and D. A betting analogy might be that the table represents betting on teams coming off a win,
sorted in order of how much of a favorite that team is. In that case, coins A and B represent betting on a favorite while C and D represent an underdog.
But at the same time, A and C are the home team (slicing it in a different direction), while B and D are the road team. You've got four distinct groups.
In fact, I could just as easily create scenario 4, where I alternate A and B for the first eight flips, then do the same with C and D for the last eight.
We'll pass on this one, but it's worth noting that we don't have to.
So we've got three scenarios, 1, 2, and 3. In scenario 1 (alternating), we just add up each column to represent each coin. That gives us A as 1 out of 4, B as 4 out of 4, and C and D as 2 out of 4 each. Hey, we just found a 100% winner in "Coin B"!
Let's take a look at scenario 2 (4 A then 4 B etc.). Here we just add up each row, and get 2/4 for A, 3/4 each for B and C, and 1/4 for D.
And in scenario 3 (quadrants), we get 2/4 A, 3/4 B, 3/4 C, and 1/4 D.
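The three ways of slicing the table can be sketched as follows. The helper names (`by_column`, `by_row`, `by_quadrant`) are my own labels for scenarios 1 through 3, and each returns the heads count for coins A through D in that order.

```python
# First 16 flips as a 4x4 table (1 = heads, 0 = tails).
table = [[0, 1, 1, 0],
         [0, 1, 1, 1],
         [1, 1, 0, 1],
         [0, 1, 0, 0]]

def by_column(t):
    """Scenario 1: coins A-D take turns, so each coin is one column."""
    return [sum(row[c] for row in t) for c in range(4)]

def by_row(t):
    """Scenario 2: four flips of A, then four of B, etc., so each coin is one row."""
    return [sum(row) for row in t]

def by_quadrant(t):
    """Scenario 3: A = upper left, B = upper right, C = lower left, D = lower right."""
    return [sum(t[r][c] for r in rows for c in cols)
            for rows in ((0, 1), (2, 3))
            for cols in ((0, 1), (2, 3))]

for name, split in [("1 (columns)", by_column),
                    ("2 (rows)", by_row),
                    ("3 (quadrants)", by_quadrant)]:
    heads = split(table)
    print(f"scenario {name}:", ", ".join(f"{coin} {h}/4" for coin, h in zip("ABCD", heads)))
```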
At this point, scenario 1, coin B looks pretty promising, because we hit 4/4 heads, and a few other ideas gave us 3/4 or 1/4, and seem like they might get us somewhere if only
we can get more supporting data. So let's find that supporting data by doing 64 coin flips instead of 16:
Dome 
0  1  1  0 
0  1  1  1 
1  1  0  1 
0  1  0  0 

Mild 
0  0  1  0 
0  0  1  0 
1  0  1  1 
0  0  0  1 

Hot 
1  0  1  1 
0  0  1  0 
1  0  1  1 
0  1  0  1 

Snow 
0  0  1  1 
1  0  0  0 
0  1  0  0 
0  1  1  1 

But we're not just replicating the original set of rules that gave us our first 16, we're going to do 16 coin flips in four different conditions. Let's say
for this example that the first set is in a dome, the second set is outside in mild conditions, the third set is outside in hot conditions, and the fourth set is outside in the snow.
It'll become clear why we're doing this soon, but let's just go with it for now.
Let's start with scenario 1, alternating each coin in turn (i.e. adding columns). What kind of rules can we get here? Well, it turns out that "B is always heads" doesn't work so great,
since coin B comes up heads 0, 1, and 2 times in the other three tables. But as compensation, we have some other ones that look good instead. Remember how A was 1/4? Well, A is 1/4 in all but the third table.
That gives us a 9-3 ATS record for betting tails on "coin A whenever it isn't brutally hot outside." That's a pretty good rule, right? 9-3 is 75% on a pretty decent sample size.
So we've got 9-3 for "fade coin A if it's not brutally hot outside", and 4-0 for "flip coin B in a dome." Those are two nice-looking trends, but not as much as we were hoping for. What about
the other scenarios?
Let's look at scenario 2. Unfortunately, we don't have any 4-0's in any of the tables, but we do have one other promising lead. In each of the non-dome tables, coin B was 1/4. So that gets
us a 3-9 ATS record for flipping coin B outside a dome. We also have 6-2 for coin C and 2-6 for coin D for "when conditions aren't bad" (either dome or mild weather). Again, not a motherlode, but some stuff
that looks like it's got potential.
How about scenario 3? Right off the bat we have 0-4 for coin A outside in mild weather. In fact, we have 2-10 for coin A in all non-dome conditions combined, and 4-12 across the board!
That's a pretty nice set of numbers. Remember, back at the start of the article we established that 12 out of 16 (which is what you get betting against A) was enough to establish that there's likely something real under normal conditions (i.e. if this had
been an independent test and we hadn't just sliced and diced the data until we found something that hit the desired target).
And while we're at it, we have 8/12 for Coin D outside a dome, another data point that looks reasonably convincing to the naked eye.
And as a quick aside, scenario 4 gives us 0-4 for coin B in mild conditions, 2-10 in all non-dome conditions and 5-11 in all conditions combined. Plus it gives us an 11-5 record
for coin D across all conditions.
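If you want to reproduce the slicing across all four tables, here's a rough sketch. The `scenarios` mapping and the `record` helper are names I made up for this illustration; `record` returns a (heads, tails) pair for one coin under one scenario, pooled over whichever conditions you pass in.

```python
# The four 4x4 tables from above (1 = heads, 0 = tails), keyed by condition.
tables = {
    "dome": [[0, 1, 1, 0], [0, 1, 1, 1], [1, 1, 0, 1], [0, 1, 0, 0]],
    "mild": [[0, 0, 1, 0], [0, 0, 1, 0], [1, 0, 1, 1], [0, 0, 0, 1]],
    "hot":  [[1, 0, 1, 1], [0, 0, 1, 0], [1, 0, 1, 1], [0, 1, 0, 1]],
    "snow": [[0, 0, 1, 1], [1, 0, 0, 0], [0, 1, 0, 0], [0, 1, 1, 1]],
}

top, bottom = (0, 1), (2, 3)

# Each scenario maps a coin to the (row, column) cells it owns in a table.
scenarios = {
    1: {coin: [(r, c) for r in range(4)] for c, coin in enumerate("ABCD")},  # columns
    2: {coin: [(r, c) for c in range(4)] for r, coin in enumerate("ABCD")},  # rows
    3: {coin: [(r, c) for r in rows for c in cols]                           # quadrants
        for coin, (rows, cols) in zip("ABCD", [(top, (0, 1)), (top, (2, 3)),
                                               (bottom, (0, 1)), (bottom, (2, 3))])},
    4: {coin: [(r, c) for r in rows for c in cols]                           # alternating pairs
        for coin, (rows, cols) in zip("ABCD", [(top, (0, 2)), (top, (1, 3)),
                                               (bottom, (0, 2)), (bottom, (1, 3))])},
}

def record(scenario, coin, conditions):
    """Return (heads, tails) for one coin under one scenario, pooled over conditions."""
    cells = scenarios[scenario][coin]
    heads = sum(tables[cond][r][c] for cond in conditions for r, c in cells)
    return heads, len(cells) * len(conditions) - heads

outdoors = ["mild", "hot", "snow"]
print("scenario 3, coin A, non-dome:", record(3, "A", outdoors))
print("scenario 3, coin A, all:     ", record(3, "A", list(tables)))
print("scenario 3, coin D, non-dome:", record(3, "D", outdoors))
print("scenario 4, coin D, all:     ", record(4, "D", list(tables)))
```

Swapping in other scenario numbers, coins, and condition lists is all it takes to go hunting for more "trends", which is kind of the point.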
In other words, even though I was just putting up completely random numbers, I was still able to come up with a number of rules that, when looked at in isolation, seem like they represent something meaningful.
A 2-10 or 4-12 ATS record (top left of table) looks pretty good as a fade. And if you just saw "home favorites coming off a win are 2-10 ATS outside of domes", that would look pretty good if you didn't know better.
By reading this article, you have the advantage of knowing that I sliced and diced the data every which way possible, but if all you saw was "top left quadrant of the table virtually never wins ATS outside a dome",
would you know that I pretty much just stumbled across that conclusion after almost nothing else worked?
And that's just a simple set of rules. You can increase the set of data splits again if you have four tables for "the last game was a division opponent", another four for "the last game was a non-division opponent in the same conference", and another four for
"the last game was against the other conference." And you can do it again for "this game is against a division opponent" etc. The more you slice and dice the data, the more likely you are to find SOMETHING
that looks meaningful even though it's more likely than not a function of random chance.
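To get a feel for how often pure chance produces something that looks like a trend, here's a rough simulation sketch. The setup is my own assumption, not a formal test: it generates four random 4x4 tables, slices them the same four ways as above, pools each coin over every combination of three conditions (12 flips), and calls it a "trend" if either heads or tails reaches 10 of 12. For a single pre-chosen slice, a fair coin does that only about 3.9% of the time (158 out of 4096).

```python
import random
from itertools import combinations

def coin_cells():
    """Cells owned by each coin under the four ways of slicing a 4x4 table."""
    halves = ((0, 1), (2, 3))
    splits = []
    splits += [[(r, c) for r in range(4)] for c in range(4)]  # scenario 1: columns
    splits += [[(r, c) for c in range(4)] for r in range(4)]  # scenario 2: rows
    splits += [[(r, c) for r in rows for c in cols]           # scenario 3: quadrants
               for rows in halves for cols in halves]
    splits += [[(r, c) for r in rows for c in cols]           # scenario 4: alternating pairs
               for rows in halves for cols in ((0, 2), (1, 3))]
    return splits

def finds_a_trend(rng, n_tables=4, keep=3, extreme=10):
    """Generate fair flips, then report whether any coin/condition slice looks lopsided.

    A slice is one coin under one scenario pooled over `keep` of the tables
    (12 flips by default). It counts as a trend if heads or tails reaches `extreme`.
    """
    tables = [[[rng.randint(0, 1) for _ in range(4)] for _ in range(4)]
              for _ in range(n_tables)]
    for cells in coin_cells():
        per_table = [sum(t[r][c] for r, c in cells) for t in tables]
        for chosen in combinations(per_table, keep):
            heads = sum(chosen)
            if heads >= extreme or 4 * keep - heads >= extreme:
                return True
    return False

rng = random.Random(2011)  # fixed seed so the run is repeatable
trials = 2000
hits = sum(finds_a_trend(rng) for _ in range(trials))
print(f"{hits / trials:.0%} of random data sets contain a 10-2 (or better) 'trend'")
```

The gap between that printed percentage and the 3.9% single-slice figure is the cost of looking at 64 overlapping slices instead of one. Adding more tables or more scenarios to `coin_cells` should only widen it.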
2) There's no guarantee conditions won't change as a trend becomes widely known.
Remember that coin flip analogy? Well, what if Vegas suddenly changes the payout so that you're laying -200 instead of the standard -110 because everyone's betting it?
It could turn out that the real odds were 55% (i.e. there really was something legitimate but it wasn't a huge effect), but you've sliced and diced the data so much that you think it's 80%. You then bet into the teeth of the adjusted line
and get killed, while some random dude who doesn't know about the trend and says "hey, I'm getting +150 (or whatever) for betting a coin flip" becomes the winner
because he's getting value on his bet.
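The arithmetic behind that example can be sketched like this. The 55% true win rate and the -110, -200, and +150 prices come from the example above (the +150 is just the "or whatever" number); the function names are my own.

```python
def profit_per_unit(american_odds: int) -> float:
    """Profit on a winning 1-unit stake at the given American odds."""
    if american_odds < 0:
        return 100 / -american_odds
    return american_odds / 100

def break_even(american_odds: int) -> float:
    """Win probability at which a bet at these odds has zero expected value."""
    return 1 / (1 + profit_per_unit(american_odds))

def expected_value(win_prob: float, american_odds: int) -> float:
    """Expected profit per unit staked."""
    return win_prob * profit_per_unit(american_odds) - (1 - win_prob)

true_prob = 0.55  # assumed real strength of the trend, per the example above

for label, prob, odds in [("trend side at -110", true_prob, -110),
                          ("trend side at -200", true_prob, -200),
                          ("other side at +150", 1 - true_prob, 150)]:
    print(f"{label}: break-even {break_even(odds):.1%}, "
          f"EV {expected_value(prob, odds):+.3f} per unit")
```

Under those assumptions, the trend bettor goes from a small edge at -110 (about +0.05 per unit) to a big loser at -200 (about -0.175 per unit), while the guy on the other side at +150 is getting about +0.125 per unit on what is still close to a coin flip.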
Now, of course that doesn't mean that all technical betting is bad. But if you see "Trend X is 80% ATS over the last 10 years!" in some magazine that everyone else
can see, there's probably no longer any value in that trend, if in fact it ever was anything other than a data mining mirage (and it's REALLY easy to get fooled by those).
So be skeptical about technical betting. Unless you're an expert on the subject, or have found your own trends that you're pretty sure no one else knows about,
it's more than likely a losing proposition.
There are a few important notes and caveats I need to make about this article:
1) CompuPicks does not endorse implicitly or explicitly any form of illegal gambling.
CompuPicks is intended to be used for entertainment purposes only.
2) No guarantee or warranty is offered or implied by CompuPicks for any information provided and/or predictions made.
2011 CompuPicks Blog
Questions, comments or suggestions? Email me at cfn_ms@hotmail.com