For those unfamiliar with MTG Arena, drafting on the platform is currently only possible against bots. In principle, this has many benefits: you are not required to wait for others to make picks, you are not penalized for pausing to think for as long as you want, and you do not forfeit your entry fee if you must stop drafting midway through. But the experience of drafting on MTG Arena is not all positive, and the problems act as constant reminders of the vastness of the uncanny valley. Fortunately for Arena fans, Wizards of the Coast has so far been receptive to suggestions to improve the artificial intelligence behind the bots, so hopefully the downsides can be addressed while the upsides are preserved. The goal of this series is to theorize ways to improve the bot drafting experience.

The Problem With MTG Arena Drafting

Imagine you drafted for the entire lifespan of a Limited format, gained experience with the cards, and built your own internal evaluation system to use while drafting. Then imagine traveling back in time to the set's release and joining a Draft with prerelease players who have none of your knowledge or experience. This is what drafting on Arena feels like. The bots seem to lock into colors after the first pick and then never deviate. Their card evaluations are so far off, and their habits so rigid over the lifespan of a set, that they do not evolve fast enough to capture the current consensus on card quality. These flaws allow the bots to be metagamed (read: exploited) and detract from the enjoyable experience that drafting has traditionally provided.

These major problems are not entirely unique to Draft bots, so even if you are a human who is new to drafting and has no interest in Arena, this will be helpful. Both bots and novice drafters often rely rigidly on set reviews or pick orders created before a set has been extensively tested and the metagame established. Here is an example of metagaming the bots that seem to adhere to these principles:

And here are the bots seemingly drafting at random:

While I know this is the internet, where the norm is to just be negative and complain, my training as a scientist makes me want to theorize about a solution. I have no way of knowing whether this is possible or plausible, or whether it is already how the bots work to some extent, so just consider it a fun thought experiment about strategy through the lens of teaching a computer how to draft competently.

Identifying Parameters

First we must generate some principles for our computer to follow, knowing that it will do no more and no less than what we tell it. Following popular convention, my ideal Draft bot will follow three principles:

  1. Begin Day 0 able to execute a sensible pick order.
  2. Modify pick order depending on what cards have already been chosen in the draft.
  3. Evolve the Day 0 pick order over the life of the format based on feedback (machine learning).

This article is already pretty long and ChannelFireball doesn’t pay by the word, so part one will only focus on the first two principles. Part two will be even more theoretical than this, so the third principle will be the focus next time.

Establish a Day 0 Pick Order

Principle number one is straightforward: we need to give our bots a sensible place to start at the beginning of the Draft. When the set is released (Day 0), the bots should have a preset pick order in which all the cards in the set are sorted into tiers. Humans often aggregate this from major content creators, so maybe we can train the bots to mine that trove of information too. Wizards of the Coast could also plausibly generate these rankings completely internally through its qualified Play Design team, in the event that some content creators (who shall remain nameless) wait until after the set release to finish their Limited rankings.

For visualization purposes, in this article I'll be using LSV's rating scale and set review for War of the Spark. Magic is a complex game and we do not want our bots attempting to read and evaluate card text, so let's imagine that none of the cards have text. Instead, each card has just its color identity and a number equal to the rating assigned by the set review we're using (in our case, LSV's). Here's an example pack as we would see it and as we want our bot to see it instead:

A sample pack 1 and bot ratings.
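To make the later steps concrete, here's a minimal sketch (in Python, with hypothetical names of my own invention) of how a bot might represent a card this way: no rules text, just a color identity, a converted mana cost for tiebreaks, and the Day 0 review rating.

```python
from dataclasses import dataclass

# A card as the bot sees it: no rules text, just colors, converted
# mana cost (used only for tiebreaks), and a Day 0 review rating.
@dataclass(frozen=True)
class Card:
    name: str               # for our benefit; the bot never reads it
    colors: frozenset[str]  # e.g. frozenset({"R"}); empty = colorless
    cmc: int                # converted mana cost
    rating: float           # e.g. LSV's 0-5 scale

# Two cards from the sample pack above:
jayas_greeting = Card("Jaya's Greeting", frozenset({"R"}), 2, 3.5)
casualties_of_war = Card("Casualties of War", frozenset({"B", "G"}), 6, 3.5)
```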

Beginning to Draft

For our first pick we want the best card in the pack, so we want to train our bots to take the highest-rated card. But in this example, two cards are tied, so we need tiebreakers based on good drafting fundamentals: in the event of a tie, give preference to cards in fewer colors, then to cards with a lower converted mana cost. In other words:

P1p1: Take the highest-rated card in this pack regardless of color.

Tiebreaking procedure: Pick the card that requires fewer colors of mana. If there is still a tie, pick the card with the lower converted mana cost.

If we trained a Draft bot to use LSV’s rating in our example above, it would identify a tie between the two cards that are 3.5, and the tie would be broken by Jaya’s Greeting since it requires fewer colors of mana than Casualties of War.
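In Python, the entire P1p1 rule, tiebreakers included, might be a one-liner; this is just a sketch built on the hypothetical Card class above.

```python
# P1p1: highest rating wins; ties go to fewer colors, then lower CMC.
# max() compares the key tuples left to right, so rating dominates.
def first_pick(pack: list[Card]) -> Card:
    return max(pack, key=lambda c: (c.rating, -len(c.colors), -c.cmc))

# With the sample pack, the 3.5-rated Jaya's Greeting (one color)
# beats the 3.5-rated Casualties of War (two colors).
```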

For our second pick (P1p2), we want our bot to prioritize taking a better card than we took at P1p1, but if that's not possible we want it to take a good card that is in the same color as our P1p1 pick (or is colorless). If our bot can do neither, the third option is to take the best card in the pack overall. In other words:

P1p2 first choice: If the highest-rated card in this pack has a rating higher than or equal to our previous pick, then select the highest-rated card.

P1p2 second choice: If there is a card rated 3.0 or higher that is colorless or shares a color with our first pick, then pick that card.

P1p2 last choice: Pick the card with the highest rating regardless of color.

Here’s an example pack 2, again showing what we see and what our bot is seeing:

A sample pack 2 and bot ratings.

Our rules require our bot to draft Ugin's Conjurant at P1p2 since it is the highest-rated card in the pack, it is rated equal to our P1p1 pick, and it is colorless.
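Translated into code, the three P1p2 choices above become a simple cascade; again, this is only a sketch using the hypothetical Card class from earlier.

```python
# P1p2: upgrade if possible, otherwise stay compatible, otherwise
# just take the best card available.
def second_pick(pack: list[Card], first: Card) -> Card:
    best = max(pack, key=lambda c: c.rating)
    # First choice: a card at least as good as our first pick.
    if best.rating >= first.rating:
        return best
    # Second choice: a 3.0+ card that is colorless or shares a color.
    compatible = [c for c in pack if c.rating >= 3.0
                  and (not c.colors or c.colors & first.colors)]
    if compatible:
        return max(compatible, key=lambda c: c.rating)
    # Last choice: the best card regardless of color.
    return best
```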

For our third pick, we will sometimes have an indication of which colors we would like to play, but traditional Draft fundamentals suggest that we have not fully committed and should remain flexible (even more so in our case, because we have a good colorless card). We still want the bots to jump to another color if they get passed a card better than the cards we have already drafted, but if that is not an option, we want to take a good card that is colorless or shares a color with a good card we have already selected. If we can do neither, we take the best card. In other words:

P1p3 first option: If the highest-rated card in this pack has a rating higher than all of our previous picks or equal to the highest, then pick the highest-rated card.

P1p3 second option: If there is a card that is rated 3.0 or higher that is colorless or shares the same colors with the highest-rated card drafted so far, then pick that card. If more than one card meets this description, then pick the highest-rated of those cards.

P1p3 third option: If there is a card that is rated 3.0 or higher that shares the same colors with the second highest-rated card drafted so far, then pick that card. If more than one card meets this description, then pick the highest-rated of those cards.

P1p3 last option: Pick the card with the highest rating regardless of color.

Here’s an example of our possible pack 3:

A sample pack 3 with card ratings.

Our rules require our bot to take Spellgorger Weird at P1p3 since its 3.5 rating matches the highest rating among the cards we have already drafted. In our example it also happens to be the same color as our first pick, which is an unintended bonus.

As we accumulate picks, we normally lean even further toward the colors we are already in, but if our neighbors are asleep at the table and pass us a bomb in other colors (like a Roalesk, Apex Hybrid or an Enter the God-Eternals, for example), we could still be persuaded to switch. In other words (a code sketch covering these rules follows the list below):

P1p4 first option: If the highest-rated card in this pack has a rating higher than our previous picks or equal to our highest-rated card, then pick the highest-rated card.

P1p4 second option: If there is a card that is rated 3.0 or higher that is colorless or shares the same colors with the highest-rated card drafted so far, then pick that card. If more than one card meets this description, then pick the highest-rated of those cards.

P1p4 third option: If there is a card that is rated 3.0 or higher that shares the same colors with the second highest-rated card drafted so far, then pick that card. If more than one card meets this description, then pick the highest-rated of those cards.

P1p4 fourth option: If there is a card that is rated 3.0 or higher that shares the same colors with the third highest-rated card drafted so far, then pick that card. If more than one card meets this description, then pick the highest-rated of those cards.

P1p4 last option: Pick the card with the highest rating regardless of color.
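P1p3 and P1p4 follow the same shape, just with more fallback tiers, so one generalized sketch can cover both: walk our previous picks from best to worst and stop at the first color we can still support. As before, every name here is hypothetical.

```python
# Generalized early pick (covers the P1p3/P1p4 rules above):
# take an upgrade if one exists, otherwise fall back through our
# picks from best to worst looking for a compatible 3.0+ card.
def early_pick(pack: list[Card], picks: list[Card]) -> Card:
    best = max(pack, key=lambda c: c.rating)
    if best.rating >= max(c.rating for c in picks):
        return best
    for anchor in sorted(picks, key=lambda c: c.rating, reverse=True):
        compatible = [c for c in pack if c.rating >= 3.0
                      and (not c.colors or c.colors & anchor.colors)]
        if compatible:
            return max(compatible, key=lambda c: c.rating)
    return best
```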

Middle of Pack One

You might notice that this is pretty formulaic so far. That is great for us, because we can codify that behavior so our bot emulates the picks of a human drafter with solid fundamentals. As we move deeper into the pack, we start to get clear signals about which colors we should be in and develop color preferences based on the cards we have already drafted. We want our bots to do the same, relying less on the raw ranking of cards and more on the context of the draft. To that end, we can create a rudimentary equation that augments a card's base rating depending on what we have already selected with previous picks.

As an example of how we might do this, let's have our bot count the number of cards we have already drafted in each color that are rated 3.5 or greater, and use these tallies to give bonuses to future cards of those colors (e.g., rating = base rating x 1.03^(number of cards rated 3.5 or higher already drafted in the same color)). This pushes us to break ties between cards based on the color in which we already have more good cards. The example equation gives a 3% bonus to a color in which we have already selected one highly rated card, roughly a 13% bonus if we have already selected four good cards (at which point it might allow an "on color" 4.0 to outrank a 4.5 in a color where we have no good cards), and so on. Now if our bot is somehow faced with a decision between two cards with base ratings of 3.0, it will break the tie in favor of whichever color is already our strongest. Colorless cards can be made to scale with all colors, so that a powerful colorless card we open in a later pack is not accidentally outranked.
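Here's one way the color bonus might look in code. The handling of multicolor cards is an assumption on my part (I use the card's best-supported color), since the equation above only defines the single-color case.

```python
# 3% compounding bonus per 3.5+ card already drafted in a color.
def color_bonus(card: Card, picks: list[Card]) -> float:
    tally: dict[str, int] = {}
    for pick in picks:
        if pick.rating >= 3.5:
            for color in pick.colors:
                tally[color] = tally.get(color, 0) + 1
    if not card.colors:
        # Colorless cards scale with our best color, so a late
        # colorless bomb is never accidentally outranked.
        n = max(tally.values(), default=0)
    else:
        # Assumption: multicolor cards use their best-supported color.
        n = max((tally.get(c, 0) for c in card.colors), default=0)
    return 1.03 ** n
```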

We could go even further with these bonus ratings and instruct our bot to give another slight bonus to cards taken later in the pack, better known among humans as "reading signals." These cards tell us how open our colors are, since their presence late in a pack means our neighbors opted for other cards instead (e.g., rating = base rating x 1.01^(14 - cards left in pack)). We could even magnify this effect in later packs, where our signals should be stronger (e.g., rating = base rating x 1.01^((14 - cards left in pack) x pack number)). This second equation gives a 0% bonus at P1p1 (where we rely only on the base rating), a 1% bonus at P1p2, roughly a 4% bonus at P2p3, and roughly a 9% bonus at P3p4. These bonuses, combined with the formulaic selection system described above, should eventually push us into the colors that our neighbors are passing.
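The signal bonus is even simpler. In this sketch, cards_left_in_pack counts the cards in the booster as it is handed to us, so a full 14-card pack gives no bonus.

```python
# Later picks carry more signal, and later packs magnify it.
def signal_bonus(cards_left_in_pack: int, pack_number: int) -> float:
    return 1.01 ** ((14 - cards_left_in_pack) * pack_number)

# signal_bonus(14, 1) == 1.0    (P1p1: no bonus)
# signal_bonus(13, 1) ~= 1.01   (P1p2: ~1%)
# signal_bonus(12, 2) ~= 1.04   (P2p3: ~4%)
# signal_bonus(11, 3) ~= 1.09   (P3p4: ~9%)
```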

To combine these equations (each bonus is a multiplier on the base rating, so they compose by multiplication):

Rating = base rating x 1.03^(number of cards rated 3.5 or higher already drafted in the same color) x 1.01^((14 - cards left in pack) x pack number)
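Or, as a sketch reusing the two helper functions above:

```python
# Both bonuses are multipliers on the base rating, so the combined
# rating is simply their product with the base rating.
def adjusted_rating(card: Card, picks: list[Card],
                    cards_left_in_pack: int, pack_number: int) -> float:
    return (card.rating
            * color_bonus(card, picks)
            * signal_bonus(cards_left_in_pack, pack_number))
```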

By tinkering with these strictly theoretical numbers, we can transition our bot from merely reading our base rating scale to reading the draft table. Human drafters with good fundamentals probably already have an intuition for these behaviors; these equations are just an attempt to quantify it for our bot friends. This strategy ignores draft synergies, but we could add a "combo bonus" that values one card higher once you have drafted another card that combos well with it. You could even use a random number generator to give each bot in a pod slightly different bonus values, so that each bot weights already-drafted colors, signals, combos, or other variables a little differently, leading to a unique experience even if all of the same cards were opened in two separate Drafts.
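One simple way to implement that per-bot variety would be to jitter the bonus constants when the pod is created; all of the numbers and ranges below are made up for illustration.

```python
import random

# Give each bot its own "personality" by jittering the bonus bases,
# so identical packs can still produce different drafts.
def make_bot_params(rng: random.Random) -> dict[str, float]:
    return {
        "color_base": rng.uniform(1.02, 1.04),     # around the 1.03 above
        "signal_base": rng.uniform(1.005, 1.015),  # around the 1.01 above
    }

# Seven bot opponents for one Arena pod:
pod = [make_bot_params(random.Random(seed)) for seed in range(7)]
```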

Subsequent Packs

By the end of pack one, we are ideally heavily committed to one color and have some picks in other colors that will influence our P2p1, but we retain the freedom to take just about anything if it's good enough. Fine-tuning the bonus values described previously can produce a bot that, having drafted mostly black picks in pack one, still selects a P2p1 Sarkhan the Masterless, yet does not force a P3p1 Roalesk, Apex Hybrid into its red-white deck. If we might still play a bomb we open in a later pack, we want to allow our bot the freedom to draft it, but not at the cost of a valuable card that is "on color."

Conclusions

As good as professional Magic players are at Limited, even they do not have the entire metagame figured out by the time the set reviews are written, and breakout strategies sometimes catch many by surprise. Our bots should not attempt to anticipate these surprises, but ideally they should be able to adapt once those strategies become popular among human players. I think that if the principles in this article were implemented on Arena, it would greatly improve the drafting experience there, but the technology and data available could go even further. It is entirely possible that the Arena bots already follow the principles I have outlined and the calibration is just a little off; an inaccurate base rating or poorly configured bonuses could theoretically produce the bot behavior we observe. Next time I'll explain how to create bots that can actually learn from players and evolve in a way that does not require manually adjusting the base ratings at all after Day 0.

Drafting is an exciting and unique experience every time, which is why Limited has been my most-played format throughout my career. That variety comes partly from the randomness of pack contents and ordering, but it also stems from players' color preferences and differing card evaluations. Two players looking at the same cards in the same order can end up in very different places. Remember the example images above that had us drafting a red deck? Those screenshots came from a recent Draft by Reid Duke, and he went in a completely different direction than we did. That's the best part of Draft!

Do you think you intuitively follow a formula when you draft? Do you believe you can teach a computer to draft or is it just too complex? How would you design a Draft bot? Let’s discuss in the comments!