Picking the right deck for a Magic tournament involves three basic inputs.

1. What does the metagame look like?

We want to focus on popular matchups.

2. What does the win-percentage matrix look like?

I recommend reading Brad Nelson’s article if you want a more intuitive feel for why we should care about all the math I am about to do. He gives a specific example of a win-percentage matrix for the metagame of the time.

In a more detailed matrix, you would have percentages instead of just who’s favored. Paul Jordan used to do something like this with PT data, analyzing which decks won the largest percentage of their matches and against whom.

Specific builds of various archetypes will be better or worse than the representative example, but since these are generally just trade-offs you usually want to start with the best positioned macro archetype.

3. How many rounds is this tournament?

This last factor is subtly important. In a short tournament, the deck best positioned against the round-1 metagame is simply the best deck. In a longer tournament, however, the decks that were well positioned in round 1 make up a growing share of the top tables by around round 6, and a different deck may be better positioned against that field. The metagame later in the tournament is determined by the win matrix and the metagame at the beginning. If the tournament is long enough, we care about beating the “end boss” metagame as much as we do the round-1 metagame.
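To make this concrete, here is a minimal sketch of how a starting metagame drifts under a given win matrix. The decks, the numbers, and the proportional-growth pairing model are my own illustrative assumptions, not data from this article:

```python
import numpy as np

def evolve(metagame, win_matrix, rounds):
    """Expected share of each deck among the winners after `rounds`.

    Assumes random pairings: a deck's share grows in proportion to its
    expected win probability against the current field.
    """
    m = np.asarray(metagame, dtype=float)
    for _ in range(rounds):
        expected_win = win_matrix @ m  # P(deck i wins its next match)
        m = m * expected_win           # winners carry their deck forward
        m /= m.sum()                   # renormalize to a distribution
    return m

# Hypothetical two-deck field: [Aggro, Control], Control a 60/40 favorite.
W = np.array([[0.5, 0.4],
              [0.6, 0.5]])
start = np.array([0.5, 0.5])
print(evolve(start, W, rounds=6))  # Control's share climbs round over round
```

Even a modest 60/40 edge compounds: by round 6 the favored deck is well over half of the winners’ field, which is exactly why the “end boss” metagame can differ from the round-1 metagame.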

A good example is Infect at Pro Tour Return to Ravnica where Jund decks with Liliana—Infect’s worst matchup—were 30% of the metagame. For a more detailed discussion, see here.

So how should we answer these questions?

I think 1 and 3 are easy for players to answer. You take a guess at the metagame (forecasting from your intuition, or using historical numbers a la Chapin/Karsten), and you know the number of rounds.

But question 2 is hard to answer.

Ideally you would just play matches between every pair of decks and hope you got typical results to plug into your matrix. With a large enough sample size you would become more and more accurate. If you had access to the decklists (and reverse-engineered the pairings from Wizards’ site), you could get a fairly large sample from the PT.
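As a rough illustration of why sample size matters here (using a hypothetical 60/40 matchup, not data from this article), the standard error of an estimated win rate shrinks with the square root of the number of games played:

```python
import math

def win_rate_stderr(wins, games):
    """Estimated win rate and its binomial standard error."""
    p = wins / games
    return p, math.sqrt(p * (1 - p) / games)

# Same underlying 60% matchup, three playtest sample sizes.
for games in (10, 50, 500):
    p, se = win_rate_stderr(round(0.6 * games), games)
    print(f"{games:>3} games: {p:.2f} +/- {se:.2f}")
```

Ten games leaves the estimate uncertain by roughly fifteen percentage points either way; it takes hundreds of games to pin a matchup down to within a few points.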

But I am focused on estimating the win matrix from the relatively opaque data that we do get.

Below are the results of a methodology new to Magic analysis, built on a standard statistical tool: maximum likelihood estimation.

I’ll break down the rest of the article as follows:

1) Results
2) The Intuition Behind What I Have Done
3) Some Caveats
4) Actual Advice

If you aren’t interested in the math, I suggest skipping 2 and 3.

1) Results: Who Was Beating Whom

[Image: estimated win matrix for PT Dragons of Tarkir]

The above win matrix is the one that best fits the observed metagames from round 1 and round 16.
[Image: deck tiers for PT Dragons of Tarkir]

I’ve tiered decks to be close in overall win percentage versus the expected metagame at that point.

At the beginning of the tournament you wouldn’t sacrifice much win percentage by playing any of the tier 1 decks. As the tournament progresses, however, the tier 1 decks tend to become overrepresented and thus you would want to be the tier 1 deck with good matchups against the other T1s.

UB Control was the Best Deck

Interestingly enough, this suggests that UB control was a completely broken deck. And I think that this hasn’t been properly appreciated by the mainstream coverage.

This was Caw-blade-level dominance. Essentially no bad matchups. But because it was split over various versions and played by multiple teams we don’t see statements like “____ broke it.” This was an interesting example where data doesn’t quite fit the narrative that we like as Magic players—looking for the new broken deck piloted by one team. Having the winning deck be something other than UB only further misleads us as to the dominance of the archetype.

We have yet to see whether UB Control (and its Dragons alternatives) can survive a targeted attack like Caw-blade did, but I think it is possible.

Of note also is that when it comes to the “winner’s metagame” there are a lot of equally viable options in the tier 2 brackets. But this is somewhat misleading, because RDW and Green Devotion were better at the beginning of the tournament. So from an EV point of view you were better off picking one of those: not because they had a better chance of winning the tournament, but because your floor was probably higher.

2) Maximum Likelihood Estimate of Win Percentages

A fairly consistent problem in science is estimating parameters from empirical data. Classic examples include calculating the mean of a sample or performing a linear regression. Econometrics leans on a related tool called maximum likelihood estimation. The idea is simple: which parameter values (in our case, the entries of the win matrix) give the greatest probability of observing the data we actually saw? Here the data is the change in metagame from round 1 to round 11.

Here is the Wikipedia explanation.

Consider a simple example. Imagine a tournament with only two decks (RDW and UB Control). If the metagame was 50/50 at the beginning of the tournament and still 50/50 at the end, the MLE would put the RDW vs. UB matchup at 50/50.

If UB was 100% of the metagame by the end of the tournament then the MLE would put the matchup as favorable for UB. Exactly how favorable depends on how much RDW fell off and how many rounds there were.

Note we assume that the mirror is always 50/50.
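A toy version of this two-deck estimate can be sketched directly. The 65% observed final share is invented, and a squared-error fit over a grid stands in for the full likelihood machinery:

```python
import numpy as np

def predict_final_share(p_ub, start_ub=0.5, rounds=10):
    """Predicted UB share of the field after `rounds`, if UB wins the
    matchup with probability `p_ub` (mirrors fixed at 50/50)."""
    m = np.array([start_ub, 1.0 - start_ub])  # [UB, RDW]
    W = np.array([[0.5, p_ub],
                  [1.0 - p_ub, 0.5]])
    for _ in range(rounds):
        m = m * (W @ m)  # winners advance in proportion to win probability
        m /= m.sum()
    return m[0]

observed_final = 0.65  # hypothetical end-of-tournament UB share
grid = np.linspace(0.5, 1.0, 501)
errors = [(predict_final_share(p) - observed_final) ** 2 for p in grid]
best_p = grid[int(np.argmin(errors))]
print(f"best-fitting UB vs. RDW win rate: {best_p:.3f}")
```

Note how mild the implied edge is: a deck only needs to be a small favorite for its share of the field to drift up noticeably over ten rounds.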

Formally we would express it as:

r1 is the metagame in round 1.

f(x,r1,10) is the expected metagame after 10 rounds of play given the win matrix (x) and initial metagame.

There are also some other constraints that have to be placed on the win matrix.
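The formal expression itself did not survive in this copy, but given the definitions above, one plausible way to write the estimator and its constraints (my notation, not necessarily the original) is:

```latex
\hat{x} \;=\; \arg\max_{x} \; \Pr\!\left( m_{11} \,\middle|\, f(x, r_1, 10) \right)
\quad \text{s.t.} \quad
x_{ij} + x_{ji} = 1, \qquad
x_{ii} = \tfrac{1}{2}, \qquad
0 \le x_{ij} \le 1,
```

where m_11 is the observed round-11 metagame and x_ij is deck i’s win probability against deck j.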

3) Some Caveats

As always with these kinds of exercises, we are relying on a whole bunch of assumptions which may make our results less accurate.

Particular to this estimation method is one that wouldn’t apply to the normal metagame/data analysis:

The win matrix that solves a given metagame change is not unique.

For example, consider the two-deck metagame where RDW falls to 0% of the field after 100 rounds. Any UB win percentage vs. Mono-Red between 70% and 100% would be effectively indistinguishable in that case.

We account for this using two methods:

1) 300 simulations each with a randomly determined starting point for the win matrix. Furthermore, the results were fed back into the estimator to see if they were stable.

2) Favoring a matchup being closer to 50/50 than 100/0 whenever a judgment call has to be made. E.g. in our two-deck metagame we would assume the matchup is 70/30, not 100/0.
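Both safeguards can be sketched together. This toy version uses pure random search over 300 valid matrices rather than a real optimizer, and the two-deck field and observed shares are invented:

```python
import numpy as np

rng = np.random.default_rng(0)

def loss(win_matrix, start, observed, rounds=10, reg=0.01):
    """Squared fit error, plus a small penalty pulling matchups toward
    50/50 so near-indistinguishable solutions resolve to the mildest one."""
    m = start.copy()
    for _ in range(rounds):
        m = m * (win_matrix @ m)  # winners advance proportionally
        m /= m.sum()
    fit = np.sum((m - observed) ** 2)
    return fit + reg * np.sum((win_matrix - 0.5) ** 2)

def random_win_matrix(n):
    """Random valid win matrix: w_ij + w_ji = 1, mirrors at exactly 50%."""
    w = rng.uniform(0.3, 0.7, size=(n, n))
    w = (w + (1.0 - w.T)) / 2.0  # enforce w_ij + w_ji = 1
    np.fill_diagonal(w, 0.5)
    return w

start = np.array([0.5, 0.5])       # round-1 field: [RDW, UB]
observed = np.array([0.35, 0.65])  # hypothetical round-11 field
best = min((random_win_matrix(2) for _ in range(300)),
           key=lambda w: loss(w, start, observed))
print(best)  # UB's win rate vs. RDW lands modestly above 0.5
```

The regularization term is what encodes the 70/30-over-100/0 preference: among matrices that fit the metagame shift about equally well, the one closest to all-50/50 wins.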

4) Give Me Some Practical Advice

Play UB. Learn the mirror, prepare for the rest of the metagame.

Or, if that isn’t an option, and you feel the need to try and beat them:

The decks with the best estimated win percentages against UB were Abzan Control, “other,” and Abzan Aggro. Presumably all three were mostly built without UB at the forefront of their concerns, so that is where I would start. It is also possible that as UB lists become more inbred to beat the mirror, other decks (like Mono-Red or GR Rabblemaster variants) become better positioned.

“Other” includes a large variety of decks, so I would look through the successful ones and try to tune some of those. Note that I think this is what distinguishes UB Control from historically dominant decks (assuming my estimates are accurate). Historically dominant decks crushed all the random stuff that shows up at any large Magic tournament. They weren’t just tuned for a specific meta, they were overwhelmingly powerful.

Part of picking the best deck for the next few weeks is figuring out whether blue/black was a particularly well positioned tool or an overwhelmingly powerful deck.

What I wouldn’t do is play any of the following:

These decks are poor vs. the top dog and weak to some of the other contenders for tier 1 (such as Abzan Aggro).

• Whip
• Heroic
• Jeskai