Resident Amateur Mathematician Jared Kobe (@SCviaDC) drops in again to write up another math-centric post. As last time, I’ll be doing some interpreting for you as I see fit in a slightly different color. This week we’re looking at just how likely really good teams are to lose, or even get swept, during a season. Nats fans might want to read this. Without further ado…
How likely is it that a team that wins 60% of the time will be swept in a three game series?
How likely is it that a team that wins 75% of their games will go 4-6 in a 10 game stretch?
How likely is it that a .400 hitter has a 6 for 30 or worse stretch in a season?
These are all the same question: given a probability of success, like a team’s winning percentage, how likely is it that you will have a certain number of successes (wins) in a certain number of trials (games)?
How Does The Math Work?
Note: If you hate math, and/or are really going to get bogged down in this (which you shouldn’t, it’s pretty interesting), you can just take Jared’s work on faith and scroll down to the part where he applies this all to the Nats.
The most straightforward way to figure out the probability is to enumerate all of the possible outcomes, count the number of times what you are looking for happened, and divide by the total number of outcomes. This is fairly simple for a small number of trials with equally likely outcomes, like flipping a fair coin 3 times. Each flip has two possibilities, so there are 2 × 2 × 2 = 8 outcomes: HHH HHT HTH HTT THH THT TTH TTT. From those, you can see there is a 1/8 chance of 0 heads (TTT), a 3/8 chance of 1 head (HTT THT TTH), a 3/8 chance of 2 heads (HHT HTH THH), and a 1/8 chance of 3 heads (HHH).
But what if the probability isn’t 50/50? Or if there are more trials? If you are doing, say, 10 coin flips, that is 1024 different possible outcomes. Trust him. Please. And if you have a success rate of, say, 70%, even if you only do two trials, you have to list 100 possibilities (7 successes and 3 failures in the first trial, and then 7 successes and 3 failures for each of those in the second trial). And what if you are talking about a 162 game baseball season? Trying to list all of the possibilities for something that large will make your spreadsheet so large you can’t even open it, much less do any calculations (trust me on this one). I do! I do! Point being, there is a lot more going on in baseball than flipping a coin.
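For anyone who would rather let a computer do the counting, here is a minimal Python sketch of the enumeration described above. It is only an illustration (the variable names are made up for this example), and the count-and-divide shortcut is valid only because every sequence of fair coin flips is equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Every sequence of 3 fair coin flips: 2**3 = 8 equally likely outcomes.
outcomes = list(product("HT", repeat=3))

# Tally how many sequences contain each number of heads.
heads_counts = Counter(seq.count("H") for seq in outcomes)

# Probability = matching outcomes / total outcomes.
probabilities = {
    heads: Fraction(count, len(outcomes))
    for heads, count in sorted(heads_counts.items())
}

for heads, prob in probabilities.items():
    print(f"{heads} heads: {prob}")
```

This should print 1/8, 3/8, 3/8 and 1/8 for 0 through 3 heads, matching the hand count.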
In comes the Binomial Distribution to the rescue. Any test of an event that has a binary (2) number of outcomes (success or failure, heads or tails, win or loss, etc.) with a certain probability of success is something called a Bernoulli trial. The Binomial Distribution assumes that you have a certain number of these Bernoulli trials, n, that are independent of each other and have a probability of success, p. If you look for a certain number of successes, k, you can use the following formula to get the probability:
Probability = C(n,k)p^k(1-p)^(n-k)
Not as scary as it looks! Stick with it!
Let’s break down this equation and use getting 2 heads in three coin tosses, which was 3/8, to illustrate what’s going on.
- p^k is the probability of success multiplied by itself once for each success that you are looking for. For heads, p is ½, and we are looking for 2 successes, so k=2, leading us to (½)^2 = ¼.
- (1-p)^(n-k) is the probability of failure, sometimes written as q, multiplied by itself once for each failure that you are looking for; in our example this is (1-½)^(3-2), or one tails, which works out to ½.
- Lastly, C(n,k) represents the mathematical combination and is read “n choose k”; it is essentially the number of possible ways that you can get k successes in n trials. The mathematical equation for C(n,k) is n!/(k!(n-k)!), and as always with factorials (!), it can get pretty large really fast. In our example, we are doing 3 choose 2, which is 3!/(2!(3-2)!) = 3!/(2!*1!) = (3*2*1)/(2*1) = 3. So, using the Binomial probability function, we have 3 * ¼ * ½ = 3/8, which is what we came up with counting on our own.
Hmmm, okay. So basically you’ve got the number of times you are going to do something (n), the probability of success (p), and the number of times you are looking for that success to occur (k). Throw all those guys in there, do a little math-magic and boom! You’re in probability city!
The beauty of this formula is that it allows you to easily calculate the probability of any number of successes over any number of trials for any probability of success, making it a very powerful tool if you are looking at something that has a binary outcome like a baseball game, where the two options are a Curly W or a loss. Nice!
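The formula translates almost word for word into code. The sketch below is one way to write it in Python; the function name binomial_pmf is just a label chosen for this example, and math.comb assumes Python 3.8 or newer.

```python
from math import comb


def binomial_pmf(n: int, k: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials,
    each of which succeeds with probability p."""
    # C(n, k) * p^k * (1 - p)^(n - k)
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# The worked example: exactly 2 heads in 3 fair coin tosses.
two_heads = binomial_pmf(3, 2, 0.5)
print(two_heads)  # 0.375, i.e. 3/8
```

Swapping in n = 3, k = 0 and a team’s winning percentage for p gives the chance of being swept in a three game series.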
Now Applied To The Nats!
So, let’s look at a 3 game baseball series. Let’s say the Atlanta Braves are visiting the Washington Nationals for a weekend series or something. That never happens. What is the probability that the Nationals get swept by the Braves? Using the Binomial Distribution, this means that the number of trials, n, is 3, and the number of successes, k, is 0, because the team did not win any games. Plugging in different winning percentages for the Nationals gives us the following probabilities:
| Win % | 0 wins | 1 win | 2 wins | 3 wins |
|---|---|---|---|---|
| 0.100 | 72.9% | 24.3% | 2.7% | 0.1% |
| 0.200 | 51.2% | 38.4% | 9.6% | 0.8% |
| 0.300 | 34.3% | 44.1% | 18.9% | 2.7% |
| 0.350 | 27.5% | 44.4% | 23.9% | 4.3% |
| 0.400 | 21.6% | 43.2% | 28.8% | 6.4% |
| 0.450 | 16.6% | 40.8% | 33.4% | 9.1% |
| 0.500 | 12.5% | 37.5% | 37.5% | 12.5% |
| 0.550 | 9.1% | 33.4% | 40.8% | 16.6% |
| 0.556 | 8.8% | 32.9% | 41.2% | 17.1% |
| 0.600 | 6.4% | 28.8% | 43.2% | 21.6% |
| 0.605 | 6.2% | 28.3% | 43.4% | 22.1% |
| 0.650 | 4.3% | 23.9% | 44.4% | 27.5% |
| 0.700 | 2.7% | 18.9% | 44.1% | 34.3% |
| 0.800 | 0.8% | 9.6% | 38.4% | 51.2% |
| 0.900 | 0.1% | 2.7% | 24.3% | 72.9% |
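If you want to rebuild a table like this yourself, a short Python sketch along the following lines should do it. The helper names (binomial_pmf, series_row) are invented for this example; the winning percentages are the ones listed in the table.

```python
from math import comb


def binomial_pmf(n, k, p):
    # C(n, k) * p^k * (1 - p)^(n - k)
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def series_row(p, games=3):
    """Chance of winning exactly 0, 1, ..., `games` games in the series."""
    return [binomial_pmf(games, k, p) for k in range(games + 1)]


win_pcts = [0.100, 0.200, 0.300, 0.350, 0.400, 0.450, 0.500, 0.550,
            0.556, 0.600, 0.605, 0.650, 0.700, 0.800, 0.900]

print("Win % | 0 wins | 1 win | 2 wins | 3 wins")
for p in win_pcts:
    cells = " | ".join(f"{prob:.1%}" for prob in series_row(p))
    print(f"{p:.3f} | {cells}")
```

Each row adds up to 100% (give or take rounding), since a team has to win 0, 1, 2 or 3 games of a 3 game series.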
Note: Jared is using a bunch of different winning percentages here because, well, we don’t actually know what the winning percentage for the team will be yet. This lets you look at the likelihood of getting swept (or sweeping) the Braves if the Nats are a great, or awful, team.
Here, we can see that, if the Braves and Nationals are evenly matched and each will win games ½ of the time (the .500 line), a sweep of the Nationals will happen 1 time in 8 (12.5%), the same probability you have of getting 3 heads in a row. Obviously, if one team has an advantage, the probability of the worse team being swept creeps up. But even a team that only wins 1 out of every 10 games (.100 Win %) will avoid being swept a little more than ¼ of the time (27.1%). The opposite is also true; the probability of the team with the edge being swept diminishes, but it doesn’t go away until you get to REALLY good winning percentages. The chance of a team being swept in a 3-game series does not dip below 5% until you get to a winning percentage of about .632, which equates to a 102 or 103 win team.
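The 5% cutoff can be found by solving (1-p)^3 = 0.05 for p rather than by trial and error. Here is a small Python sketch of that step; the variable names are only for illustration.

```python
# A sweep means losing all 3 games: probability (1 - p)^3.
# Setting that equal to 5% and solving for p gives p = 1 - 0.05^(1/3).
sweep_limit = 0.05
games = 3

threshold = 1 - sweep_limit ** (1 / games)
wins_over_162 = threshold * 162

print(f"Winning percentage needed: {threshold:.4f}")
print(f"Equivalent wins in a 162 game season: {wins_over_162:.1f}")
```

The result should land a little above .630, which works out to somewhere between 102 and 103 wins.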
I have highlighted two percentages in the table above. The first, .556, is the record the Nationals had against the Braves last season (10-8). So, even assuming NOTHING changed in the off-season, the Nats would still get swept in nearly 1 out of every 10 series against the Braves.
The second percentage is the Nationals’ overall winning percentage for the season, .605 (98-64). This means that for ANY 3 game subset of last season, the Nationals had a 6% chance of losing all three games. Given that there are 160 distinct sets of 3 consecutive games in a 162 game season, that means the Nats should have lost 3 consecutive games about 10 times last season (they actually had 13).
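The “about 10 times” figure is just the number of 3 game windows multiplied by the chance of losing all three. A quick Python sketch, with names chosen only for this example:

```python
season_games = 162
window = 3
win_pct = 98 / 162  # the 98-64 record, roughly .605

# Games 1-3, 2-4, ..., 160-162: that is 162 - 3 + 1 = 160 windows.
windows = season_games - window + 1

# Chance of losing every game in any one window.
p_lose_all = (1 - win_pct) ** window

# Overlapping windows are not independent of each other, but expected
# values still add up, so the average count is simply windows * p.
expected_windows = windows * p_lose_all

print(windows, f"{p_lose_all:.3f}", f"{expected_windows:.1f}")
```

Note that a four game losing streak counts as two overlapping windows here, which is the same way the count in the text works.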
And what about season records against really bad teams? Let’s use a divisional series from this year’s schedule, which is 19 games. Let’s also say that the REALLY bad team will only win 50 games, which has happened maybe twice in a 162 game season in the history of baseball. That gives us a .309 winning percentage:
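| Win % | 0 wins | 1 win | 2 wins | 3 wins | 4 wins | 5 wins | 6 wins | 7 wins | 8 wins | 9 wins | 10 wins | 11 wins | 12 wins | 13 wins | 14+ wins |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.309 | 0.1% | 0.8% | 3.1% | 7.8% | 13.9% | 18.6% | 19.3% | 16.0% | 10.7% | 5.9% | 2.6% | 1.0% | 0.3% | 0.1% | <.1% |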
If you haven’t sussed it out, Jared is referencing the Marlins…only making them worse than they likely will be.
Looking at this, even a team that is historically bad will win at least 1 game against a division rival 99.9% of the time. Or, put another way, the Nats would complete a 19-0 season sweep of these ultra bad Marlins about once every thousand seasons. 54% of the time, this team is going to go somewhere between 5-14 and 7-12 in the 19 game series, and about 92% of the time, they will go somewhere between 3-16 and 9-10.
One last note: based on this distribution, this historically bad team will have a winning record against a given division rival (10+ wins) 4% of the time. Given that each team has 4 division rivals, this means that, even if this team were historically bad for several consecutive seasons, it would only take about 6 or 7 seasons on average for them to have a winning record against one of the teams in their division.
And what about that team that wins 75% of their games, which is about a 120 game winner, over a ten game stretch? What are the chances they win 4 or fewer games in those 10, just like the Nats did from the last game of the Reds series through the White Sox series and the Braves series?
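The figures quoted for the 19 game season series can be checked by adding up slices of the distribution. The Python sketch below is one way to do that. It rests on two assumptions of mine: it uses the unrounded 50/162 winning percentage, and it treats the four season series as independent of one another. The names are invented for this example.

```python
from math import comb


def binomial_pmf(n, k, p):
    # C(n, k) * p^k * (1 - p)^(n - k)
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n = 19
p = 50 / 162  # the hypothetical 50 win team, roughly .309

dist = [binomial_pmf(n, k, p) for k in range(n + 1)]

p_at_least_one_win = 1 - dist[0]
p_5_to_7_wins = sum(dist[5:8])    # 5-14 through 7-12
p_3_to_9_wins = sum(dist[3:10])   # 3-16 through 9-10
p_winning_record = sum(dist[10:])  # 10 or more wins

# Assumption: the 4 division series are independent of each other.
p_any_rival = 1 - (1 - p_winning_record) ** 4
expected_seasons = 1 / p_any_rival

print(f"{p_at_least_one_win:.1%} {p_5_to_7_wins:.1%} {p_3_to_9_wins:.1%}")
print(f"{p_winning_record:.1%} {expected_seasons:.1f}")
```

The expected wait should come out in the same neighborhood as the estimate in the text, give or take the rounding of the winning percentage.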
| Win % | 0 wins | 1 win | 2 wins | 3 wins | 4 wins | 5 wins | 6 wins | 7 wins | 8 wins | 9 wins | 10 wins |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.750 | 0.0% | 0.0% | 0.0% | 0.3% | 1.6% | 5.8% | 14.6% | 25.0% | 28.2% | 18.8% | 5.6% |
In case you didn’t know, 120 wins gives a team about the best record ever in MLB history.
There is a 2% chance that a 120 win team will win 4 or fewer games over a 10 game stretch, which is a 1 in 50 chance. In a 162 game season, there are 153 different sets of 10 consecutive games, so even a historically good team will go 4-6 or worse over 10 games about 3 times a season on average.
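“4 or fewer wins” is a cumulative question, so the probabilities for 0, 1, 2, 3 and 4 wins get added together. A minimal Python sketch of that sum and of the per-season count follows; binomial_cdf and the other names are labels made up for this example.

```python
from math import comb


def binomial_cdf(n, k_max, p):
    """Probability of k_max or fewer successes in n independent trials."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k)
               for k in range(k_max + 1))


# A .750 team winning 4 or fewer of 10 games.
p_slump = binomial_cdf(10, 4, 0.75)

# Games 1-10, 2-11, ..., 153-162: that is 162 - 10 + 1 = 153 windows.
windows = 162 - 10 + 1
expected_slumps = windows * p_slump

print(f"{p_slump:.2%}", windows, f"{expected_slumps:.1f}")
```

As with the 3 game windows, the stretches overlap, so these are average counts rather than counts of separate slumps.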
So, before you panic, remember that there is a reason that everyone decries small sample sizes. 3 games do not a season make. For that matter, 19 games do not a season make. The 2012 San Francisco Giants, AKA the reigning World Series Champions, began their season being swept by the Arizona Diamondbacks, a team that didn’t even make the playoffs at the end of the season. The 116 win 2001 Seattle Mariners went 4-6 over 10 games twice and the 114 win 1998 New York Yankees, who also won the World Series, went 4-6 over 10 games 11 times that season. Remember, there are still 148!/(58!90!) (~7.32×10^41) ways that the Nationals can get back to 98 wins this season.
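That last number is a plain “n choose k”: the number of ways to arrange 90 wins among the 148 remaining games. Python handles integers of any size, so it can be checked directly. The variable names below are just for illustration.

```python
from math import comb

remaining_games = 148
wins_needed = 90  # which leaves 58 losses

# 148! / (58! * 90!): choosing which of the remaining games are wins.
orderings = comb(remaining_games, wins_needed)

print(f"{orderings:.2e}")
```

Choosing the 58 losses instead of the 90 wins gives the same count, since C(148, 58) equals C(148, 90).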
Questions, comments welcome below!
I wish this was seen by my friends at Federal Baseball, esp. the ones who make doomsday comments even before a game is over, and often after just a 2-3 game stretch. It must be therapeutic to vent, but to those hearing these negative comments it often sounds irrational.
Thanks! By all means share, we love Federal Baseball!