We certainly see a number of matches in the real world play out in ways we would've expected from our testing just from the cards on the table, but the details of the decks and their construction are often radically different. We didn't play much Boros Reckoner in a control shell, for example. While we did have the Blasphemous Act combo with him, we didn't have the unlimited life combo. Things are recognizable, but not the same.
One thing that most players tend to do in the real world is find the best deck (or at least the deck they feel is the best) and then just keep tweaking it week after week to keep it up to date and on top of the metagame. Our goal is different: to make sure there is enough for all of the decks in the format to do, and that the cards we are releasing are reasonably balanced. We spend time theorycrafting what the best deck might be, but the cards change too quickly to do the kind of real-world tweaking that most players are used to. It's a moving-goalpost problem. It's not uncommon, even late in the development of a set, to have two or three cards get tweaked every week, each one potentially shifting the metagame in radical ways.
The Future Future League
The Future Future League is a group of dedicated playtesters with a recurring meeting time each week to talk about the cards and to sit down and battle with decks that we have worked on over the previous week. We have a place where we can post decks that we are working on for others to look over, and where the set leads can post either notes about cards they want tested or cards that have received changes.
The people most involved with the Future Future League currently are:
There are other designers and people in other departments who will occasionally put some work in, bring a deck, and play a few games, but these are the main people and the ones who can be expected to always have a new deck brewing in their heads.
Our major task in the Future Future League isn't to predict the metagame perfectly, but rather to try to get the cards in the right spots. In these weekly meetings, discussions include topics such as which cards are fun, which cards aren't, which cards could use a little extra juice, which cards have a bit too much, what strategies are proving to be too strong, and what cards we can throw into the format to act as safety valves if one strategy is too strong. As a group, we try to come to a conclusion on which cards to adjust, which cards should get redesigned, or which new numbers to try a card at for a week.
Pay No Attention to the Deck Behind the Curtain
We tend not to show a lot of the decks that people played in the Future Future League, not because we are too worried about wrecking the real-world metagame, but because they tend to represent decks that will simply never exist in the real world. It wouldn't do much good to post a deck based around Vorel/Jace or the Boros decks when Truefire Paladin naturally had both first strike and vigilance. The timing of changes also doesn't always line up with information we can show you just yet. Even when we moved beyond just testing Dragon's Maze as the newest set, cards from Magic 2014 impacted how we balanced the cards in Dragon's Maze. That makes showing this interconnectedness at the best time (when Dragon's Maze is the newest set) basically impossible without spoiling something in M14. Still, I've combed through the archives and found a few examples of decks that included cards that haven't changed much since they were in the FFL, and I'm willing to share those with you, to give you a better idea of why we do what we do.
Some decks we make are pegged by us to be potential role players in Standard, and we keep an eye on them, making sure to keep them in the regular rotation of decks we play new brews against. One example is an RWU Delver of Secrets deck that filled itself with Charms and Snapcaster, going up to Thundermaw Hellkite as the heavy finisher.
The real-world decks are much better tuned than this one (and have more refined mana bases); they ended up dropping Delver of Secrets and went with Boros Reckoner. Playing with this still gives us a pretty good sense of what this style of deck will end up playing like in the future. It's not always going to perform how we expect, but it is close enough to work as a baseline.
Another tactic we sometimes take is to start with a deck that is succeeding in the real world and slot in updates from the sets we are working on, to get a better picture of how strong it will be. We played a little with Blood Artist when Avacyn Restored was in development, but the card took off much more in the real world than we had expected. When Killing Wave Zombies made itself a force to be reckoned with on the independent tournament scene, we put a deck together in our FFL to get an idea of what Zombies might look like using cards through Dragon's Maze.
While many of the cards in this list haven't seen much play in the real world, it's generally better for our purposes to go wide, trying out a broad variety of cards from the get-go to get an idea of how they play out. Better to have a deck with a few suboptimal cards the first time we play it than to totally miss an important interaction.
Playtesting in the FFL isn't just about trying out the strongest decks or trying to balance obviously powerful cards. We're going to be wrong about those some amount of the time, and it's important that, while we're trying to figure out how best to test Delver of Secrets, something else doesn't slip through the cracks. There are a lot of cards in our sets, and trying to get some games with all of them is important, as you will often find that cards you hadn't thought of as contenders end up as role players. Sometimes, the decks we try like this aren't refined and aren't intended to reflect what they might look like as tournament decks; instead, these decks are built to test some interaction and get a sense of how powerful it is when it gets to work. Here is an example of a deck put together to test out Lavinia of the Tenth, in an earlier state when her enters-the-battlefield ability detained all nonland permanents your opponents control and not just ones with converted mana cost 4 or less.
We'd already received feedback from several people that her ability was incredibly annoying, and there were concerns about how it would play in Commander, so a member of the FFL threw this deck together to see what the worst-case scenario was. What if I can blink Lavinia every turn? What does that look like? It's only an outline, but it's good enough to get some games in. The net result, unsurprisingly, was that when the deck managed to get both Lavinia and Conjurer's Closet, it beat most other decks. As a result, we changed Lavinia to only hit permanents with converted mana cost 4 or less, giving most decks an out: going over the top of it.
We have regular meeting times twice a week where we have a few hours set aside to play Future Future League matches, but we also play pickup games throughout the week, often when someone has an idea for a deck or after a development meeting on a set has changed some stats, and a card needs to get reevaluated.
We don't always conform to regular Magic tournament rules. The first difference is that we use a modified mulligan rule, where you can freely mulligan any zero-, one-, six-, or seven-land hands. We play so few games of each matchup, at least in the grand scheme of things, that we want to make sure our testing isn't horribly biased by one person just having two or three bad draws in a row. We also alternate who goes first, back and forth, regardless of who won the previous game.
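For readers who like to see a rule spelled out, here is a rough Python sketch of that mulligan rule and the alternating start. It is only an illustration, not how the FFL actually tracks games. Everything beyond the land counts is an assumption: the deck is a plain list of the strings "Land" and "Spell", a free mulligan redraws a fresh seven, the rule only applies to seven-card hands, and the `max_free_mulligans` cap is a made-up safety limit.

```python
import random

# Land counts that allow a free mulligan, taken from the rule described above.
FREE_MULLIGAN_LAND_COUNTS = {0, 1, 6, 7}


def count_lands(hand):
    """Count lands in a hand (assumption: lands are the string 'Land')."""
    return sum(1 for card in hand if card == "Land")


def qualifies_for_free_mulligan(hand):
    """True if a seven-card hand has zero, one, six, or seven lands."""
    return len(hand) == 7 and count_lands(hand) in FREE_MULLIGAN_LAND_COUNTS


def draw_opening_hand(deck, rng, max_free_mulligans=10):
    """Shuffle and draw seven, redrawing seven while the hand qualifies.

    Assumption: a free mulligan goes back to a fresh seven-card hand, and
    the player always takes it. The cap is hypothetical, just to guarantee
    the loop ends.
    """
    for _ in range(max_free_mulligans + 1):
        shuffled = deck[:]
        rng.shuffle(shuffled)
        hand = shuffled[:7]
        if not qualifies_for_free_mulligan(hand):
            break
    return hand, shuffled[7:]


def first_player(game_number, players):
    """Alternate who goes first, ignoring who won the previous game."""
    return players[game_number % len(players)]
```

With a 24-land, 60-card deck, `draw_opening_hand(deck, random.Random())` returns a seven-card hand plus the remaining 53-card library, and `first_player(n, ["A", "B"])` simply flips between the two players from game to game.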
Another important trick for us when testing out an important card, especially if it is a combo or a metagame card, is to draw six-card hands and automatically add the card to our opening hand. How does the new card designed to combat aggressive decks fare? Well, let's see what it does when it comes out on turn three every game. If it doesn't do anything, then, well... time to go back to the drawing board. If it wins every game... well, it might be time to go back to the drawing board on that one, too. Ideally, it is impactful but not unbeatable, and it makes the virtual post-board matchup more interesting.
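The six-card trick is easy to picture in code as well. The sketch below is a hypothetical helper, not an FFL tool, and it assumes the test card is pulled out of the deck itself (rather than added as an extra copy) before the other six cards are drawn.

```python
import random


def opening_hand_with_card(deck, test_card, rng):
    """Build a seven-card opener: the test card plus six random cards.

    Assumption: one copy of test_card is removed from the deck first, so
    the hand plus the library still add up to the original deck.
    """
    if test_card not in deck:
        raise ValueError(f"{test_card!r} is not in the deck")
    library = deck[:]
    library.remove(test_card)  # set aside exactly one copy
    rng.shuffle(library)
    hand = [test_card] + library[:6]
    return hand, library[6:]
```

Calling it with a 60-card list gives a seven-card hand that always starts with the card under test, leaving a 53-card library for the rest of the game.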
Why We Do What We Do
Our goal with the Future Future League is to make sure that we get the cards as correct as possible. Our goal is not to correctly predict what each deck will look like a year later when the set hits Standard. There are just too many moving parts for us to be entirely accurate about that, and if we spend too much time only trying to balance around our predictions, we risk everything falling apart if we misevaluate just how good one deck is. Beyond that, our goal is to create a metagame that is diverse and complex enough to survive the pressures of hundreds of thousands of Magic players playing everywhere, from their kitchen tables to Friday Night Magic to the Pro Tour, all while allowing plenty of space for all of the existing decks to grow. If our group of FFL testers were able to figure out how the format will play out in the amount of time we devote to it, the real-world player base would crack it in no time.
In terms of cards that were different than we'd expected, Domri Rade was a huge part of our internal metagame in RG decks, but he has only made a minor splash in the real world. Strangleroot Geist was one of our go-to two-drops and Jace, Architect of Thought was one of the best Planeswalkers in the format. So, what caused this difference? Well, a lot, actually. Thragtusk was stronger than we'd expected and went into decks we just weren't testing it with—like Bant Control. This, of course, changed the makeup of the successful aggro decks, turned the Reanimator deck into more of a value deck than the combo-y version we were testing, and even forced the control decks to use different win conditions than we were using. Fortunately, we'd left enough room in the format for it to easily adjust for all of these differences and still be fun and balanced.
After being here a little over a year, I can say that the common wisdom Billy Moreno gave me when I started was right—we are generally correct on the power level of the individual cards, and generally wrong on the composition of decks. Plenty of cards that we played with while testing Dragon's Maze may be waiting for the metagame to shift in a way to make them viable. I'm looking forward to Magic 2014 and Theros and seeing which of those come true, and further than that, how all of the other cards we created do.