Today, I want to talk about the difference between good and popular, and why we as game designers need to keep our eye on what's popular even though most of our talents lie in finding out what's good.
The Truth, the Whole Truth, and the Approach to Truth
When you get a bunch of talented and smart people together, they can do extraordinary things. Of course, sometimes those people disagree with each other. And because they are both talented and smart, they argue a lot. How can we resolve these arguments without getting progressively more aggravated with other R&D members who are clearly wrong and geez why don't they just accept I'm right already?
The answer is data.
Sometimes, asking someone for a specific response can give you bad data. That's because what people think and can state about what they want doesn't always match what they actually like. What a person actually does when presented with such a choice is called that person's revealed preferences. We get a lot of good data on what people like in Magic by observing how they play and what they purchase, because those are revealed preferences.
A lot of what I'm talking about in this article is about stated preferences, though (surveys, interviews, question/response), so what gives? Well, collecting this sort of information is much easier in many cases. And there's still useful data to be gleaned from these tools, especially about how players perceive cards and how appealing they find the various options.
I am a physicist by training. It is wonderful and beautiful when there are elegant equations and principles that allow you to derive truth. But most of science (and life) doesn't work that way. Most of the time, we have to experiment to explore the edges of that truth.
It works the same way in game design—when we don't have the beautiful undeniable truth in hand, we run experiments to determine what the underlying truths about fun and accessibility might be. Some of those experiments are internal playtests to check the design. Some of them are releasing the product into the wild to see how people react with their pocketbooks. And some experiments involve talking to players to see how they perceive what we are trying to do.
There are three major ways surveying and polling can be used as tools in our development of Magic:
- Identify which cards have high appeal
- Look at why low-appeal cards have low appeal, and investigate if and why they have high appeal to specific individuals
- Choose among many options that play similarly but read differently
I'll walk through each of these and show you some specific examples that illustrate our experimental process in testing appeal.
Rare but Well-Polled
So there you are, hotshot lead on an upcoming set, running one of your weekly team meetings. You've been playtesting both Limited and Constructed a ton, learning what's fun and what isn't, what's been working in terms of themes/power levels and what hasn't. As you stand at the whiteboard, slowly a horrible realization dawns on you...
"We forgot the green cards!" you exclaim to your team. They look at you puzzled. No, wait, that isn't it... then it hits you:
You haven't really collected much data at all on a huge fraction of the rares!
As their designation implies, rares are... well, rare. And that means that when we run playtests, we do a lot of playing and reviewing of the commons and uncommons, but only a handful of rares. Similarly, when FFL starts working on the Constructed cards in the set, there are still only a handful of rares that impact Standard and older formats. This can be for a lot of reasons:
- Some rares are just not very powerful, or haven't found the right context to be powerful, in Limited and non-casual Constructed.
- Some rares have fringe use, but the conditions for that use are complicated to set up internally. Think of a specific kind of metagame (maybe one where targeted graveyard hate is quite useful) that would evolve in the real world but which takes an entire week or longer to engineer properly in FFL.
- Some rares are build-around cards that aren't tournament quality.
To get a better sense of which of these are appealing, we run a system we internally call Rare Polls, which sometimes just goes by Rate the Cards.
Even though we do survey people inside R&D with these Rare Polls, the primary purpose is to collect feedback from more casual and less plugged-into-the-future Magic players. Luckily, there are plenty such players inside the walls at Wizards, and even better, they LOVE to see upcoming cards and tell us what they think!
For fun, I went back to find the data for "Hook," set one in the Hook-Line-Sinker block. You probably know it better as Return to Ravnica, now in stores. What do you think were the top five Hook rares, as rated by the players inside the walls of Wizards?
Well, I suppose that's not too surprising. Those lands are pretty appealing to anyone who is interested in playing lots of colors, and they are more powerful than most dual lands we supply to Standard. So let's consider the top ten non-shockland Hook rares—what do you think they were?
Let's talk about what's going on with this handy spreadsheet of data:
- On the left, you can see the overall ranking for this card among the 68 cards polled (53 rares and 15 mythic rares).
- Next to that is the playtest name for that card and a color indicator, so when we discuss the card in the review meeting, we have an at-a-glance understanding of color appeal.
- Then you can see three categories—Overall, Casual, and Pit. We order by Casual average rating (descending) and then keep an eye on the number of "8+" and "4-" ratings. Even if a card is overall not appealing to the segment, if there are a significant number of 8+ ratings, we try to preserve the fun aspects of the card.
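For readers who like to see the bookkeeping, here is a minimal sketch of the kind of tally described above. It is not R&D's actual tool. The card names and ratings are made up, the 1-10 rating scale and the `fan_threshold` cutoff for "a significant number" of 8+ ratings are assumptions, and only the Casual segment is modeled (Overall and Pit are left out).

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class PolledCard:
    name: str                  # playtest name (placeholder names below)
    casual_ratings: list[int]  # assumed 1-10 scale, one entry per respondent


def summarize(cards, fan_threshold=3):
    """Rank cards by Casual average (descending) and count 8+ / 4- ratings.

    fan_threshold is a hypothetical cutoff for "a significant number"
    of 8+ ratings; the article does not give a real number.
    """
    rows = []
    for card in cards:
        rows.append({
            "name": card.name,
            "casual_avg": mean(card.casual_ratings),
            "high": sum(r >= 8 for r in card.casual_ratings),  # "8+" count
            "low": sum(r <= 4 for r in card.casual_ratings),   # "4-" count
        })
    # Order by Casual average rating, highest first.
    rows.sort(key=lambda row: row["casual_avg"], reverse=True)
    for rank, row in enumerate(rows, start=1):
        row["rank"] = rank
        # A card can rank low overall and still have devoted fans.
        row["has_fans"] = row["high"] >= fan_threshold
    return rows


# Made-up ratings, purely for illustration.
cards = [
    PolledCard("Card A", [9, 8, 8, 7, 9]),
    PolledCard("Card B", [5, 6, 5, 6, 5]),
    PolledCard("Card C", [9, 2, 8, 3, 1, 9]),
]

rows = summarize(cards)
for row in rows:
    print(row["rank"], row["name"], round(row["casual_avg"], 2),
          row["high"], row["low"], row["has_fans"])
```

In this toy data, "Card C" lands at the bottom of the ranking yet still gets flagged as having fans, which is the sort of card whose fun aspects we would try to preserve.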
Looking at the list of most appealing cards gives us a good sense of whether development of individual cards is meeting our goals for the set. Remember, each set has needs—a Limited environment that's fresh and interesting, contributions to Standard and older formats, theme and flavor—that are the primary purpose for including cards in that set specifically.
- Are the colors (and specifically the guilds) well-represented in the top cards? It looks like there's no very-appealing Rakdos card here—the first two to show up on the list are Demon's Persecution (now Rakdos's Return) and Rakdos himself. Development took this note and improved Rakdos's appeal a good deal—it was the second-highest-rated card in Return to Ravnica on Gatherer during the week before the Prerelease.
- Are the top cards doing extra work for the set? Yes! A high-profile rare cycle is at the top (the shocklands), and the next few are used as promos/special previews (Duel Decks: Izzet vs. Golgari) or high-profile build-around guild cards.
- Are cards present for all three primary player psychographics—Timmy, Johnny, and Spike? Yes, it appears so, although this particular list skews Timmy-heavy. Development made some tweaks to individual cards based on the comments to increase Johnny appeal on some of the mythic rares in the set.
Next we'll dive into the second part of the Rare Poll—the lower-rated cards and how we mine the comments on those to figure out which to keep and how to fix them.
(As an aside, you'll see two cards on the list, Haunted Armor and Refraction, that aren't in Return to Ravnica booster packs at all! What gives? As you might have guessed, those cards didn't make the final cut for reasons other than their perceived appeal. For example, the needs of the set for Limited or Constructed might require changes to a card that is "perfect" as is. Generally, when this happens, rather than change it to fit, we keep that gem of a card unchanged and ready to slot into a future set where it will be able to shine.)
For reference, here are the next two top-rated rares that did actually make the set.
Let's Give Them Something to Talk About
The numbers function as guideposts so we know which cards to talk about (the low-rated cards are unappealing, the cards with many 8+ ratings have devoted fans, and so on), but it's the comments about why people rated the cards as they did that give us the most information about how to change the cards to be more appealing and still serve the needs of the set.
Let's take a look at a low-rated card from the Return to Ravnica Rare Poll—one of my favorites!—and figure out what's going on:
At the beginning of each player's upkeep, that player loses half of his or her life, rounded up.
This card was rated 62 out of 68 in the set's Rare Poll, but it clearly has fans—seventeen 8+ ratings and only five 4- ratings. So if we want to preserve it and figure out how to make it more appealing, what can we learn from the comments?
A lot of responders enjoy the Rakdos feel, and part of that is that it hits all players.
There's a bit of unease that this card doesn't have enough bang for its buck.
Additionally, it feels more like a card than a card.
Most developers believed Half Life was more and more likely to be cut from the set as time went on. (Heh.) It wasn't pulling its weight in the rare slot for its guild. As we've seen already, Rakdos cards were generally lower appeal at this point in the process. Unless something changed, its days were numbered.
But Return to Ravnica lead developer Erik Lauer saw the potential of this card to be something cool and kept it in, searching for ways to tweak it. Often, when a development team is searching for answers to a puzzle like the one this Rare Poll data creates, the team looks at older cards to help inform their tweaks. Have a look at this famous Cube card from Scourge:
Even though no specific respondent mentioned a "can't gain life" clause, it seems obvious that such a clause would soup up the appeal of Half Life while also making the card a more cohesive and still very Rakdos-y whole. After that change, reception for Half Life became much more positive.
A Card in Hand is Worth Two in the Multiverse File
Take a look at these two cards and figure out which one players will like more:
I am fond of this particular example, first shown to me by R&D Editor and former DailyMTG.com editor Kelly Digges. It indicates the value of appeal even to smart, experienced Magic players. I would not have guessed that the gut-wrenching anti-appeal of "T: Lose 5 life" would overcome players' natural inclination to prefer a 5/6 over a 5/5, but through surveying we find that it does matter and the 5/6 is much less well-loved. Most Magic players can identify that the 5/6 is the better card, but they make the stink face that tells us (as developers) that something is wrong here.
One of the most insidious traps a game designer can fall into is setting up incentives such that players will do something but hate themselves for doing it. Massively multiplayer games that make you "grind" for quests, experience, and loot fall into this camp—players do not enjoy feeling forced into the situation the game has constructed for them.
Here are some interesting examples of one-on-one card surveys that impacted sets in development:
Early in Alara Reborn development, I took two decks (sort of proto-Duel Decks) that used cascade cards and played them with more casual players around the company. In doing so, I noted that people dreamed of chaining cascade spells together, but at the time the mechanic was "next spell of equal cost, then stop." Using this data, I convinced the design and development teams we should change it to "next spell of lesser cost, continue if a cascade spell." I believe the bigger dream ended up having bigger appeal overall.
During Avacyn Restored, I took a number of designs of the now-extinct forbidden mechanic and shopped them around to various folks in the company. Finding out how "ungrokkable" the mechanic was—especially trying to explain WHY a player would want to use cards with the mechanic—eventually convinced lead developer Dave Humpherys to try out the miracle mechanic instead. It was much more promising when surveyed: people just got what it was trying to do.
Thanks for Your Participation
Surveying appeal is one of the texturing operations we do on products; it's not on the "main sequence" for a set's development of design to playtest to tweak to iterate, but it is still a key component of refining the set and making sure it will be well received. Where video games and software can iterate post-release with patches or upgrades, we cannot. By involving a more casual audience and incorporating its feedback, we can represent some amount of "time in the real world" in our process and create a more polished product.
I talked about Rare Polls and one-on-one surveys, but we do a number of other things in R&D to continue to polish each set as it makes its way to the production line:
- We post the mythic rares on a board near R&D, where any and all feedback (in the form of angry post-it notes) is welcome. (The most common—and most dangerous—response is "this doesn't feel mythic rare!" See also: Jace, the Mind Sculptor and Batterskull, both late pushes based in part on this feedback.)
- We run a slideshow showing ALL of the cards in the set once it is near completion, to get a big-picture review and give everyone a chance to "ooooh," "ahhhh," and sometimes "uhhh..." with enough time left to tweak if needed.
Thanks for reading! If you have specific feedback on the guild Prerelease packs, or if you are a player who reads DailyMTG.com and rarely (if ever) goes to events, I'd love to hear from you—you can use the author form below to email me and we'll start a conversation.