Players at Grand Prix Niagara Falls completed a total of 3,466 matches, including Top 8 playoffs but excluding byes and draws. Last week, I looked at the results of 2,809 of those matches, all where either the winning or the losing deck could be determined. If you want to know which archetypes performed the best overall, look there.
This article deals with the 2,304 completed matches for which we know both the winning and the losing deck. I write “we know” because all 2,304 match results are available on a Google sheet for everyone to see.
So if you want to look up, say, how the battle between MUD and Miracles played out, you can. MUD met Miracles twice and won twice, which may be reassuring to the MUD people, but doesn’t really prove anything. Even in a perfectly even matchup, there’s a p=0.25 chance of going 2-0.
A very simple test of statistical significance does exactly this: calculate the probability that a result at least as extreme as the one observed occurs under the assumption that neither of the two decks is at an advantage. If said probability comes in at p<0.05, then it is customary to consider the observation significant. Then we can reject the null hypothesis: the result indeed indicates a significant advantage for the winning deck.
By these standards, a 3-0 or 4-0 record in a matchup isn’t significant, but a 5-0 record is. Likewise, 7-1 or better is significant, but not 6-1 or 7-2. Of course, if a record of 13-5 bears significance, then a flipped record of 5-13 does as well.
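The thresholds above can be checked with a few lines of Python. This is only a minimal sketch: it assumes a one-sided binomial test in which every match is an independent coin flip. The article does not spell this out, but it is the reading that matches the quoted cutoffs. The function name `one_sided_p` is made up for illustration.

```python
from math import comb

def one_sided_p(wins: int, losses: int) -> float:
    """Chance of at least `wins` wins in wins + losses matches
    when each match is an independent 50/50 coin flip (assumption)."""
    n = wins + losses
    return sum(comb(n, k) for k in range(wins, n + 1)) / 2 ** n

ALPHA = 0.05  # customary significance threshold mentioned in the text

for wins, losses in [(2, 0), (4, 0), (5, 0), (6, 1), (7, 1), (7, 2), (13, 5)]:
    p = one_sided_p(wins, losses)
    verdict = "significant" if p < ALPHA else "not significant"
    print(f"{wins}-{losses}: p = {p:.4f} ({verdict})")
```

For a losing record such as 5-13, the same function applies to the flipped record, that is, from the point of view of the deck that won 13 times.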
The following focuses on significant observations, although I’ll point out results that came close (p<0.07) along the way.
On the Edge of the Stoneblade
The most popular deck in the field naturally completed the most matches. But the large samples mostly just showed what a true all-rounder Stoneblade is. It seems to benefit or suffer from precious few lopsided matchups. In Niagara Falls, Stoneblade went even against a diverse rogues’ gallery including Miracles (an exact 50% record of 21-21), Death and Taxes (18-15), Storm (14-13), and various others. Even its 31-21 record versus Blue-Red Delver isn’t enough to call the matchup confidently for Stoneblade.
The one big exception, and Stoneblade’s kryptonite, proved to be Sneak and Show. Sneak and Show won 77% of its 26 matches against Stoneblade, a 20-6 record. This corresponds to a highly significant p-value of 0.0047.
Teetering on the edge of significance, at p=0.068 and p=0.0625, were Stoneblade’s results versus Turbo Depths and Enchantress. Stoneblade beat Depths 19 matches to 10, but lost 1-6 against Enchantress.
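As a rough cross-check, the Stoneblade records quoted in this section can be run through the same kind of calculation. The sketch below again assumes a one-sided binomial test against a 50/50 matchup. The helper `tail_probability` and the dictionary `stoneblade_records` are illustrative names, not something taken from the Google sheet.

```python
from math import comb

def tail_probability(wins: int, losses: int) -> float:
    """One-sided p-value for the side with more wins, assuming each
    match is an independent 50/50 coin flip (assumption)."""
    n = wins + losses
    better = max(wins, losses)  # a flipped record gives the same p-value
    return sum(comb(n, k) for k in range(better, n + 1)) / 2 ** n

# Stoneblade's records (wins, losses) as quoted in the text above.
stoneblade_records = {
    "Miracles": (21, 21),
    "Death and Taxes": (18, 15),
    "Storm": (14, 13),
    "Blue-Red Delver": (31, 21),
    "Sneak and Show": (6, 20),
    "Turbo Depths": (19, 10),
    "Enchantress": (1, 6),
}

for opponent, (wins, losses) in stoneblade_records.items():
    p = tail_probability(wins, losses)
    print(f"Stoneblade {wins}-{losses} vs {opponent}: p = {p:.4f}")
```

Under these assumptions, the sketch reproduces the values quoted above: roughly 0.0047 for the 6-20 record against Sneak and Show, 0.068 for 19-10 against Turbo Depths, and 0.0625 for 1-6 against Enchantress.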
Your Money and Your Life
Like Stoneblade, Death and Taxes exhibited hardly any strong tendencies anywhere. Its single significant stat also didn’t come against one of the major competitors. Instead, Death and Taxes lost five of five matches against Infect, the bare minimum to push a p-value below 0.05.
On the flip side, lots of decks almost achieved a significant winning record versus Death and Taxes: Miracles at 15-7, Sneak and Show at 11-4, and Goblins, Elves, and RUG Delver at 4-0 each.
A Work of Wonder
Miracles concludes the trifecta of archetypes with barely any favorable or unfavorable matchups. In fact, it brings matters to a head. None of the deck’s results at GP Niagara Falls indicate a significant advantage or disadvantage for anyone.
Somewhat notably, Miracles went 15-7 against Death and Taxes and 10-17 versus Blue-Red Delver.
Sneaking into the Show Room
Sneak and Show exhibited considerably more extreme matchups, although again most didn’t reach the level of significance. Two positive and two negative pairings did. The deck beat Stoneblade and Storm convincingly, at 20-6 and 14-3 respectively, whereas it lost each of its six encounters with Infect as well as 71% of its 21 matches against Grixis Delver.
Most of this appears to be in line with expectations. Grixis Delver in particular had always been the favorite in the matchup. It is interesting, if not significant, that Grixis Control—a similar strategy, albeit without a clock—only went 7-7 against Sneak and Show. It is interesting not because it is surprising, but on the contrary, because it adds to the mountain of evidence that teaches us to look for disruption plus pressure.
Other than that, Sneak and Show could boast an 11-4 record versus Death and Taxes and a 6-1 record versus Red Prison. Neither is quite significant. People expected Sneak and Show to be an unfavorable pairing for Lands, which lends a bit of extra credibility to Sneak and Show’s 11-5 performance in the matchup. Then again, 11-5 itself isn’t anywhere close to significant, which in turn puts the expectation into perspective.
Plumbing the Depths
Significantly at p<0.05, Turbo Depths went 5-0 versus Burn and 5-0 versus Eldrazi.
Almost significantly at p<0.07, Turbo Depths went 4-0 against Eldrazi Post, 4-0 against Arclight Phoenix decks, 6-1 against Reanimator, but 10-19 versus Stoneblade and 8-17 versus Blue-Red Delver.
I heard rumors that Red Prison “should beat Turbo Depths all day long.” Interestingly, the red deck just won six of ten such encounters at Grand Prix Niagara Falls. This result cannot call the original assessment into question. Maybe the Depths players got uncharacteristically lucky. But months of collecting data have taught me that most matchups in Magic aren’t as clear as people like to believe. See Stoneblade, Death and Taxes, Miracles, et al.
Delving into the Differences between Grixis and Blue-Red
The differences in performance between Blue-Red Delver and Grixis Delver weren’t all that great. What’s more, the specifics aren’t all that surprising.
Grixis won a significant portion of matches, 15 of 21, against Sneak and Show, whereas Blue-Red went about even, winning 12 of 25. The relevant black cards found themselves almost exclusively in the sideboard, but they appeared relevant indeed. The addition of black also improved the Stoneblade matchup, although neither Blue-Red’s 21-31 losing record nor Grixis’s 8-5 winning record was significant. The benefit of black didn’t extend to the Death and Taxes matchup, which was insignificantly unfavorable for both versions of Delver.
The third color’s disadvantage made itself felt against decks featuring more than the usual Wasteland action. To wit, Blue-Red went 5-5 versus Lands while Grixis went 0-4, and Blue-Red went 2-2 versus Aggro Loam while Grixis went 0-5. Curiously, the same didn’t apply to Delver’s performance against Red Prison’s Blood Moon, where Grixis won 8 of 11 matches while Blue-Red only won 8 of 18, neither significant.
Blood Moon didn’t prove an effective weapon against Grixis in general. In fact, Red Prison’s single significant matchup was its 3-10 result versus Grixis Control.
As far as I know, there were no further significant matchups. The field just wasn’t large enough, and it was too fractured on top of that. As a consequence, most decks simply didn’t meet any other deck sufficiently often to string together any sort of meaningful matchup record.
Sadly, this includes Steel Stompy. The Steel Stompy pilots in attendance delivered by far the best performance at GP Niagara Falls, but there were only four of them. Looking at the performance in detail might give additional indication as to its replicability. If the 34-15 overall record is composed of various unlikely results, then the composite looks much more questionable as well.
The Men of Steel Went:
- 3-0 versus Death’s Shadow
- 3-0 versus Grixis Control
- 3-0 versus Reanimator
- 2-0 versus Dredge
- 2-0 versus Miracles
- 2-0 versus Sneak and Show
- 1-0 versus Aggro Loam
- 1-0 versus Bomberman
- 1-0 versus Enchantress
- 1-0 versus Infect
- 1-0 versus Omni-Tell
- 1-0 versus Sneak and Breach
- 6-1 versus unknown decks
- 3-1 versus Blue-Red Delver
- 1-1 versus Death and Taxes
- 1-2 versus Grixis Delver
- 1-2 versus Turbo Depths
- 1-2 versus Red Prison
- 0-1 versus 4-Color Control
- 0-1 versus Burn
- 0-1 versus Eldrazi
- 0-1 versus MUD
- 0-1 versus RUG Delver
- 0-1 versus Stoneblade
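The list above can be tallied to confirm that it adds up to the 34-15 record, and to show how thin each individual sample is. This is just a bookkeeping sketch. The variable names are made up, and the numbers are copied from the list.

```python
# (wins, losses, opponent) for Steel Stompy, copied from the list above.
steel_stompy_results = [
    (3, 0, "Death's Shadow"), (3, 0, "Grixis Control"), (3, 0, "Reanimator"),
    (2, 0, "Dredge"), (2, 0, "Miracles"), (2, 0, "Sneak and Show"),
    (1, 0, "Aggro Loam"), (1, 0, "Bomberman"), (1, 0, "Enchantress"),
    (1, 0, "Infect"), (1, 0, "Omni-Tell"), (1, 0, "Sneak and Breach"),
    (6, 1, "unknown decks"), (3, 1, "Blue-Red Delver"),
    (1, 1, "Death and Taxes"), (1, 2, "Grixis Delver"),
    (1, 2, "Turbo Depths"), (1, 2, "Red Prison"),
    (0, 1, "4-Color Control"), (0, 1, "Burn"), (0, 1, "Eldrazi"),
    (0, 1, "MUD"), (0, 1, "RUG Delver"), (0, 1, "Stoneblade"),
]

total_wins = sum(w for w, _, _ in steel_stompy_results)
total_losses = sum(l for _, l, _ in steel_stompy_results)

# Largest number of matches against any single identified archetype.
largest_known_sample = max(
    w + l for w, l, name in steel_stompy_results if name != "unknown decks"
)

print(f"Overall record: {total_wins}-{total_losses}")
print(f"Largest sample against one known deck: {largest_known_sample} matches")
```

The totals come to 34-15, and no single identified archetype was faced more than four times. That is below the five matches that even a perfect record needs to reach p<0.05 by the standards used in this article.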
I can’t say for sure whether or not any of the above constitute outliers. But I wouldn’t have expected Steel Stompy to be a favorite against various Show and Tell or Sneak Attack decks or against Reanimator. Does this match history look like business as usual for Steel Stompy? Or does it rather look like a string of lucky breaks? I hope someone can enlighten me in the comments.