A few weeks ago, I talked about playtesting Limited, and this week I want to talk about its counterpart: how we playtest Constructed. This article will mostly focus on the structure of how and why we playtest Constructed, and not on the actual mechanics of it.
Playtesting in the Pit differs quite a bit from playtesting in the real world for a few reasons:
- Our goal is not simply to find the best deck; it is to create a deep and engaging format where no one deck or strategy is too dominant
- We are concerned with how both players experience the game, even more than with balance, and want games to be enjoyable for both players
- We have the power to change cards in order to accomplish these goals
Creating a game that is fun and reasonably balanced is a lot harder than just creating one that is perfectly balanced. Chess is about as perfectly balanced a game as possible, but still, people release new games every year. If balance were the ultimate goal, then they could give up. The mixture is what is important. We want to create something that is new and interesting, and that gives people opportunities to learn, which I believe Magic excels at. What we aim to do is provide cards to construct an environment that is robust enough to support millions of players, while at the same time letting people discover new things we were not expecting.
Early Playtesting ("Devign"–Structural Development)
When working on a set, we start off with Limited both because it is the most sensitive to art, and because it is the part that can be most heavily influenced without the sets around it being finished. Getting cards right for Limited is also less about the individual power level of cards, and more about the general shape of the set.
During these early times, we will try out cards that feel risky, but could also add something cool and interesting to the game. Early playtesting is for taking the big swings, and it's where the vast majority of our changes happen. Sometimes we add a mana to a card, sometimes we add two. Sometimes we realize that we would be most happy if the card was at a cost that kept anyone from ever casting it—and cut it from the set.
Sometimes a card turns out to be much stronger than we predicted, and we end up adding mana to it; other times, we find it doesn't need stats that are quite as large to get us to play it.
Many of the cards design hands over in the initial handoff will not make the final version of the set, and many more will get tweaks, such as to casting cost, power and toughness, or activation cost. Still others will get cut from the set entirely to make room for new cards that either fill a specific role that is missing or are just independently cool cards.
That isn't to say that the only thing that happens at this point is weakening cards; we also buff quite a few cards that we believe would be fun if they were stronger. Master of Waves is an example of a card that was buffed because it looked like it would be fun if it were priced in such a way as to make it a Standard card. The original 4U 3/1 that made 2/0s just wasn't going to get there.
In Magic's ancient past, removal, counterspells, and card draw were consistently made very strong because of the belief that this was what competitive players liked. To an extent, that was true, but I think in general most competitive players liked them because they were the strong cards. When looking at competitive play, it is dangerous to conflate "cards people played" with "cards people liked playing." Players make (mostly) rational decisions in deck building and play the strong cards, so simply seeing what decks look like won't tell you what people really want to do. I think people are generally happier when cards are flatter in overall power level and the fun and interesting decks have a little more room to compete, as opposed to when the range of decks that can be played is incredibly narrow and focused on a few pillars of the format.
I am sure that some people very much enjoyed casting Arcbound Ravager, but if its deck hadn't been the strongest by far in the environment, those players could still have done so, while the players who wanted to cast Tooth and Nail or Crystal Shard could've done their own things. In fact, I suspect that most of the people who genuinely enjoyed playing Arcbound Ravagers would've been happier if every other match wasn't a mirror match. Over time, we have become better at finding the cards that would be fun if they were strong, and making them so. Ideally, by the end of structural development, we have a set that plays well in Limited, and has enough roles for Constructed cards that it will make an impact on Constructed environments.
Middle Playtesting (Format Development)
After structural development, there is a break. When a set returns, it does so in format development, and the nature of things changes a bit. To some extent, sets in structural development exist in a bubble. By the time format development begins, most of the previous set is locked down, and the new set has to integrate with it for both Limited and Constructed.
During this time, we focus on adding the cards from the newest set in development to our existing decks, as well as seeing what cards in the new set allow for new decks to be constructed. Looking at Born of the Gods, Xenagos is an example of a card that we added to our Red-Green Monsters decks right from the beginning. He went through many iterations, but we knew he was going to fit into that kind of strategy. Pain Seer and Herald of Torment, on the other hand, didn't fit as neatly into existing decks, so we were able to construct new decks around them and get them to about the power level where we thought they would catch on.
We aren't going to get everything right, and due to the very nature of how Magic and our playtesting work, we will end up missing a few things. Sometimes that means cards we thought were Constructed staples didn't quite make it, and sometimes it means cards we thought were fringe were instead staples. As long as we aim cards at the right power level, and we don't miss on too many cards in the same direction, then a great deal of this will end up canceling itself out. In general, we don't get the exact levels or builds of the decks correct, but we do get the overall structure of the metagame's buckets about right.
Just to reiterate our basic bucket strategy, we try to make sure that there are enough decks of all major strategies in Standard to help balance the metagame. The basic outline of this strategy can be found below:
You can read more about it here, but for the tl;dr, the idea is that the basic structure of the decks above is such that each is strongest against the two decks following it, and weakest against the two decks preceding it. So, midrange will beat up on the pure aggro and disruptive aggro decks because it has larger creatures and removal, but it is weak against decks that are trying to either pull off a combo/ramp into something or traditional control decks. On the other end of the spectrum, disruptive aggro (aggro with counterspells or discard) will be strong against control and combo/ramp, but have a hard time beating a more focused pure aggro strategy or midrange.
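For readers who think better in code, the rotation among the five buckets can be sketched as a simple ring in Python. This is only an illustration, not a tool we use: the function names are made up for the example, and the relative order of control and combo/ramp in the ring is an assumption, since the two examples in this article only pin down that both sit after disruptive aggro and before midrange.

```python
# Hypothetical ring of deck buckets. Each bucket is favored against the two
# buckets that follow it and is an underdog against the two that precede it.
# ASSUMPTION: the order of "control" and "combo/ramp" relative to each other
# is a guess; the article's examples do not determine it.
BUCKETS = ["midrange", "pure aggro", "disruptive aggro", "control", "combo/ramp"]


def favored_against(bucket):
    """Return the two buckets that follow `bucket` in the ring."""
    i = BUCKETS.index(bucket)
    n = len(BUCKETS)
    return [BUCKETS[(i + 1) % n], BUCKETS[(i + 2) % n]]


def weak_against(bucket):
    """Return the two buckets that precede `bucket` in the ring."""
    i = BUCKETS.index(bucket)
    n = len(BUCKETS)
    return [BUCKETS[(i - 2) % n], BUCKETS[(i - 1) % n]]


if __name__ == "__main__":
    for b in BUCKETS:
        print(f"{b}: beats {favored_against(b)}, loses to {weak_against(b)}")
```

With this ordering, `favored_against("midrange")` returns `['pure aggro', 'disruptive aggro']`, and disruptive aggro comes out favored against control and combo/ramp, which matches the two examples in the text. The matchups for the other buckets depend on the assumed order and should not be read as an official statement.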
The goal isn't to perfectly balance things here, but to make sure that there are fun and attractive decks in each bucket to keep the metagame moving. The idea is that if the midrange decks become too popular, ramp/combo or control will rise up to defeat them, which will leave an opening for the aggro decks to come back in the metagame. It doesn't always work, but it gives us a lot of direction for balancing, and making sure that we are making the metagame broad enough. If we feel that one of the buckets is lacking, we will find some cards that would be fun, and push on them a bit to give those decks enough juice to be competitive.
Finishing Touches (Polish)
After full-scale development of the set has officially stopped, there is another gap of time before the set actually needs to get typeset and sent to the printers. The main work that happens during this time is making adjustments based on Future Future League (FFL) feedback, again strengthening and weakening cards to make the environment as a whole more fun and balanced. Because the next set in the chain is now in format development, changes can be made to both sets to improve things.
During this time, the art has already been commissioned, so unless we need a major change that justifies rushing an art piece (which is rare), we are stuck either working within the confines of the art, or doing an art swap. This means a creature can usually gain a little power or toughness, but not something like flying. It also pushes us to find creative ways to fix our problems, which leads to a lot of unique cards, sometimes with some very tenuous flavor justifications.
This is the part of the process where a lot of the "developer-y" text tends to get added to cards. Sometimes, we realize we need a card to fill a very specific role, like being good against strong decks in the metagame (e.g., Stormbreath Dragon's pro-white and anti-Sphinx's Revelation text), and sometimes we need to slightly ding a card to make room for future cards (like cards that I can't talk about). In any case, the fine-tuning here is mostly just that: very fine movements once we have a much better handle on the metagame than we did several weeks before. Cards generally change a little bit here or there as the set approaches the pencils-down moment, often with a focus on getting better templates, but for all intents and purposes, our playtesting focus moves to the next set in the line.
That's it for this week. Until next time,