It is development's job to balance the set, but also to make it as fun as possible. On top of that, we have to attend to broader concerns: common word count and complexity, color balance, and making sure the set has enough depth to be played fifty or more times. We have a relatively limited amount of time to accomplish all of these goals, so we have come up with a number of processes to improve how we work and to make our playtesting sessions as fruitful as possible.
It All Starts with a Point
Before development does any Limited playtests, it's important that the team spends the time to quickpoint the set and try to get some of the most basic balancing done and make sure the colors (at least on a first pass) are all around the same power level. Quickpointing is a system we basically use to divide the set up into different power categories for Limited—A+s, As, Bs, and Cs.
As the name implies, it's intended to be a quick way of getting the job done, not to be 100% accurate in relating cards to each other. It's just enough that we can get a general grasp on where the strength of the set lies. Some of these numbers reflect how strong the card is in Constructed but are very far from exact. For instance, a card like Duress is not very powerful in Limited, but is definitely a Constructed card, while Enlarge is one of green's strongest cards in Magic 2014 Limited, but will probably not see any real Constructed play. The goal of quickpointing is not to perfectly balance the set (that will come with time and learning which cards are better in the current environment than expected), but to try and reduce the number of Limited playtests that only serve to tell us that one common is too powerful, or that one color is much stronger than the rest.
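To make the idea concrete, here is a minimal sketch of how quickpoint grades could be totaled per color to spot an obvious outlier before any playtest. The grade names come from the description above; the numeric weights, the tolerance, and the placeholder cards are assumptions made for illustration, not the values development actually uses.

```python
from collections import defaultdict

# Hypothetical weights: the grades are named in the article,
# but the numeric values here are assumed for illustration.
GRADE_POINTS = {"A+": 4, "A": 3, "B": 2, "C": 1}


def color_totals(quickpoints):
    """Sum grade points per color from (card, color, grade) tuples."""
    totals = defaultdict(int)
    for _card, color, grade in quickpoints:
        totals[color] += GRADE_POINTS[grade]
    return dict(totals)


def outlier_colors(totals, tolerance=0.25):
    """Return colors whose total is more than `tolerance` (as a fraction
    of the mean) away from the mean total. The tolerance is an assumption."""
    mean = sum(totals.values()) / len(totals)
    return sorted(
        color for color, total in totals.items()
        if abs(total - mean) > tolerance * mean
    )


# Placeholder cards and grades, not real quickpoint data.
sample = [
    ("white common 1", "white", "A"), ("white common 2", "white", "B"),
    ("blue common 1", "blue", "A"), ("blue common 2", "blue", "B"),
    ("black common 1", "black", "A"), ("black common 2", "black", "B"),
    ("red common 1", "red", "A"), ("red common 2", "red", "B"),
    ("green common 1", "green", "A+"), ("green common 2", "green", "A"),
    ("green common 3", "green", "B"),
]

totals = color_totals(sample)
print(totals)
print(outlier_colors(totals))
```

In this made-up sample, green's total sits well above the others, which is the kind of imbalance quickpointing is meant to catch before it costs a playtest.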
Another related exercise we do before too much Limited playtesting occurs is to chart out the top commons of each color and make sure there is a reasonable distribution among them between the colors, as well as what the cards are doing. While we generally expect that the best red and black commons will be removal, if every good common in the set is removal, we probably have a problem. Similarly, if the best blue and white commons all hit the same points on the converted mana cost curve, it is likely that a white-blue deck is going to have real problems curving out in the format, and it would be good to shift a few cards around a bit to make sure that doesn't happen.
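The curve check described above can be sketched in a few lines. This is only an illustration: the list of top commons, their converted mana costs, and the range of costs checked (two through five) are all assumptions, not data from a real set.

```python
from collections import Counter


def pair_curve(top_commons, color_a, color_b):
    """Count the top commons of two colors at each converted mana cost.

    `top_commons` is a list of (color, converted_mana_cost) tuples.
    """
    return Counter(
        cmc for color, cmc in top_commons if color in (color_a, color_b)
    )


def missing_costs(curve, costs=range(2, 6)):
    """Return the costs in the checked range where the pair has no top common.

    The default range (2 through 5) is an assumed choice for illustration.
    """
    return [cost for cost in costs if curve[cost] == 0]


# Hypothetical top commons: (color, converted mana cost).
sample = [
    ("white", 4), ("white", 4), ("white", 5),
    ("blue", 4), ("blue", 4), ("blue", 5),
    ("green", 2), ("green", 3),
]

curve = pair_curve(sample, "white", "blue")
print(sorted(curve.items()))
print(missing_costs(curve))
```

Here the hypothetical white-blue pair has all of its best commons at four and five mana and nothing at two or three, which is the sort of clumping that would prompt shifting a few cards around.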
The majority of our Limited playtesting comes in the form of Sealed. This might be surprising, as the majority of Limited play after the Prerelease weekend is Draft, but Sealed has some inherent advantages over Draft, namely that it isn't a self-correcting format. The problem with relying too heavily on Draft data, especially early on in playtesting, is that even if the color balance is pretty out of whack, it can easily be hidden by the fact that everyone needs to construct a forty-card deck, and people tend to drift toward the colors that are just not being taken enough. It would be easy, at the end of the draft, to look at what people are playing and decide the format is balanced, when really everyone tried to go black, most failed, and two players very reluctantly slipped into blue because the first-pick blue commons were going ninth pick in the first pack.
Sealed Deck doesn't have this problem. If green and red are the strongest colors by a country mile, then almost everyone will end up playing green and red in the Sealed playtest, which is a very easy indication that it's time to rework some stuff. Similarly, if one color is underpowered (or at least unappealing), then people will simply ignore it for Sealed. Because the sealed decks we get are pretty well color balanced, it's usually pretty easy to tell if one color is being avoided and to figure out how to solve that for future playtests.
That doesn't mean we don't do drafts, though. Plenty of those occur; they just tend to come later in the process, when most of the major balance issues have been solved. What draft playtesting is very good at doing is highlighting how Limited archetypes are playing out, and of course simulating how deep the Limited format is. One of the things we have found from Magic Online is that a surprisingly large number of people draft good Limited environments more than twenty times, and we want to make sure the Limited environments we are putting out can hold up to that many drafts. Making sure that the actual draft experience has fun and meaningful choices is just as important as making sure that the games themselves play out in fun and meaningful ways. Missing either of these aspects can end up with Limited environments that are fun for a while, but not ones that will stand the test of time.
Even though your average Pro Tour player will do more drafts of a set in the weeks leading up to the Pro Tour than we have done in the entire development cycle for the set (and all of their drafts will be with the final card pool, which few of ours will be), we have become pretty good as a team at figuring out how to layer things together to create an environment that is deep enough for players to keep discovering new things about it over time.
Once everyone has been given a sealed deck or has drafted, but before we play, we like to get a better sense of what is going on as a whole, so we record our numbers, usually on a whiteboard. The purpose of this is to give the lead of the set a good idea of how the Sealed or Draft playtest sorted itself out.
Below, I have created an example of what these numbers might look like:
What we are looking for is a somewhat even color distribution (with the slashes representing gold cards), people playing different color pairs, and a reasonable number of the rares seeing play.
For this last point, one of the things that makes Limited fun to play over and over again is that there are powerful rares that show up from time to time. In a Sealed playtest, we assume that your rares are going to be one of the main (but not only) driving forces for assembling your deck. If almost every rare were seeing play, that would probably be a sign that they were too strong in Limited, while if almost none of the rares were seeing play, that would be a sign that they were just too weak. For draft, we want to make sure that the rares are doing one of their intended jobs: helping to anchor players into a color. These numbers aren't perfect at giving us all that information, but they do give us a reasonable place to start.
For the above draft, it looks like blue and white are a little underpowered. Red might be, as a lot of its total came from one person playing a mono-red deck, so we would make sure that the rewards for playing that weren't too high. In addition, two people played red-white and two played black-green. While a single point of data might not be enough to make any decisions, if we completed additional drafts and saw those color pairs popping up much more than others, we would look at what the synergies are between them and see if either they are too strong or the motivations to go into other color pairs aren't strong enough.
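As a rough illustration of the kind of tally described above, the sketch below counts how many decks played each color and which color pairs showed up more than once. The deck list is invented for the example and is not a reconstruction of the whiteboard numbers; it only borrows the pattern of one mono-red deck, two red-white decks, and two black-green decks. The real record also tracks gold cards and rares, which this sketch leaves out.

```python
from collections import Counter


def tally(decks):
    """Count decks per color and per color combination.

    `decks` is a list of tuples of color names, one tuple per player.
    """
    color_counts = Counter()
    pair_counts = Counter()
    for colors in decks:
        color_counts.update(colors)
        # Sort so ("red", "white") and ("white", "red") count as one pair.
        pair_counts[tuple(sorted(colors))] += 1
    return color_counts, pair_counts


def repeated_pairs(pair_counts):
    """Return two-color pairs played by more than one person."""
    return sorted(
        pair for pair, count in pair_counts.items()
        if len(pair) == 2 and count > 1
    )


# Hypothetical eight-person playtest.
decks = [
    ("red",),
    ("red", "white"), ("white", "red"),
    ("black", "green"), ("green", "black"),
    ("blue", "black"), ("green", "red"), ("black", "red"),
]

color_counts, pair_counts = tally(decks)
print(color_counts.most_common())
print(repeated_pairs(pair_counts))
```

A single playtest like this is one data point. The useful signal comes from running the same tally across several sessions and seeing whether the same pairs keep repeating.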
The final part of the equation is our internal wiki pages. After each session, someone sends out a link to this page, and people are encouraged to leave feedback, both positive and negative, about their experiences with the set. This can vary from comments about whether the person felt the set was too hard to build a sealed deck in, to whether the person enjoyed some of the cards he or she played, to whether he or she felt tricked by a first pick whose theme never really panned out, or anything of that ilk.
As sets move through the process, we try to draft not only with members of the development team, but also with different people throughout the company. While it would be easy for us to develop a draft format that is only enjoyable to the developers, we know that we are not necessarily our typical consumer, so getting a variety of viewpoints is important. Knowing how people who are simply less experienced or familiar with a set react to it helps to make sure that, by the time the set has actually been printed and is being opened up in booster packs, it has the widest appeal possible and the Limited environment is enjoyable to as many people as possible.
The thing the development team feeds on the most is actionable feedback. Concrete reasons why a person had a bad time, or card interactions they just didn't like, go much further than general "it was fun" or "I didn't enjoy it." We have a limited amount of time to get sets from design to development, and feedback that gives us possible fixes just goes much further in terms of trying to decide on how to make alterations to the set.
Once the cards have been put down and the feedback has been absorbed, it is time for the set's development team to meet and make changes. In the beginning, these can come at a pretty breakneck pace. Ten to twenty cards can change per playtest, dramatically changing how the set plays out.
Development's charge isn't just to balance a set, but to do so while fulfilling the goals set by design. While some of that can be done by simply moving around a few casting costs or numbers, other parts require large reworking of the card set. That's just how our process works. The sets are about more than just the individual cards; they are also about the loftier ideas being pushed by design, such as making Ravnica's guild system play out or creating a Limited environment that can support the Eldrazi titans. Between the time a set enters development and the time development ends, a vast majority of the cards will change at least a bit, and many will be taken out and replaced by designs created by the developers to fill gaps in the set, both for Constructed and Limited. There is a perception that developer cards all have protection or weird rules text aimed totally at Standard. You might not think of a card like Mercurial Chemister as a developer card, but that is where it came from in the process.
We talk a lot about how the Future Future League works, but little about how we playtest Limited. Of course, a lot more goes into making a set than all this, but I hope you enjoyed the basic overview of what we do. If there is more interest, I can do a follow-up to this at some point with further details. If you want to know more, just shoot me an email.
Until next time,