Data-Mining Extended Tournaments for Statistics

Posted in Feature on January 24, 2007

By Frank Karsten

The IPA qualifiers ran over the last week, so we have many well-attended events to draw metagame data from. Let's begin with the Standard update.

Deck name | Popularity | Change in popularity from last week
1. Mono Green Aggro | ■■■■■ ■■■■■ ■■ (12%) | +2%
2. Dralnu du Louvre | ■■■■■ ■■■■■ ■■ (12%) | +5% (!!!)
3. U/b Pickles | ■■■■■ ■■■■■ ■ (11%) | -3%
4. U/W Urzatron | ■■■■■ ■■■■■ (10%) | +1%
5. U/G Scryb&Force | ■■■■■ ■■ (7%) | -3%
6. Dragonstorm | ■■■■■ ■ (6%) | +1%
7. Angelfire | ■■■■■ (5%) | +4% (!!!)
8. Boros Deck Wins | ■■■■■ (5%) | -3%
9. Zoo | ■■■■ (4%) | +2%
10. Solar Flare | ■■■■ (4%) | +3%
11. R/G Aggro | ■■■ (3%) | +1%
12. Izzetron | ■■ (2%) | 0%
13. W/R/B Control | ■■ (2%) | +2%
14. GhaziGlare | ■■ (2%) | -4% (!!!)
15. BlinkRiders | ■■ (2%) | +1%
16. W/B Control | ■■ (2%) | 0%
17. Panda Connection | ■■ (2%) | 0%
18. SnakeBlink | ■■ (2%) | +2%
19. Ignite the Warrens | ■ (1%) | -3%
20. U/R Snow | ■ (1%) | 0%

That's an interesting new number one! Classified by many - including me - as "a weak budget deck," Mono Green Aggro has proven it has the tools to dominate the format. And it even makes sense in the current metagame full of blue/black control decks. Imagine you get out a Silhana Ledgewalker enchanted with Blanchwood Armor. That is (a) a fast clock and (b) very hard to answer for such a deck. Your opponent had better find a Skeletal Vampire or the Brine Elemental plus Vesuvan Shapeshifter combo soon, or the game is over. Perhaps black/blue decks will have to put awkward cards like Cruel Edict or Evacuation in their sideboards in order to deal with Silhana Ledgewalker.

By the way, I know that technically U/b Pickles doesn't play black cards, but I still label it as a blue/black deck because it plays some black-producing lands and Signets to flashback Mystical Teachings. The U/b notation (with lowercase b) indicates that the main color is blue and black is a splash. While we're at it, Dralnu du Louvre appears to be superior to U/b Pickles. The main differences between the two are that U/b Pickles plays Vesuvan Shapeshifter, Dimir Signet, Fathom Seer, Willbender, and Brine Elemental, where Dralnu du Louvre has Think Twice, extra counters, extra Mystical Teachings and one-of targets, Skeletal Vampire, and Dralnu, Lich Lord. U/b Pickles may be more fun to play, but after observing both decks for a while, I think that Dralnu du Louvre just has stronger cards and tools. The kill condition is largely irrelevant; if you take control of the game, you don't care whether you win with a Vesuvan Shapeshifter / Brine Elemental lock or with a billion Skeletal Vampire-fueled Bat tokens. And in my opinion the extra card draw and control cards in Dralnu du Louvre are superior to a bunch of morphs.

Of course, things will get crazier once Planar Chaos becomes legal and black decks get access to Damnation. That card will certainly push blue/black control decks to the max. I have already heard stories about people trading away their Hallowed Fountains for Watery Graves at the prerelease. I'll advise up front, though, that it won't fit easily in the current Dralnu du Louvre build. That deck works well because every card can be played at instant speed (even the creatures, thanks to Teferi, Mage of Zhalfir) and the deck can therefore execute the "draw-go" plan perfectly: always keeping mana open during the opponent's turn and waiting to see what he does before making your own play. Adding a sorcery to that plan is worse than it may seem. Nevertheless, a deck that is at least close to Dralnu du Louvre will likely rise to the top ranks once Planar Chaos is released online.


But I'm getting ahead of myself. I'll talk about the impact of Planar Chaos in more detail in a few weeks. We're still in the current format, where we only have access to Wrath of God. Wrath should be really good against Mono Green Aggro, as a deck with just creatures and pump spells and nothing else usually rolls over when its army is taken out. If Mono Green Aggro stays as popular as it is now, then I would choose a Wrath-based control deck for an upcoming Standard tournament. I still like my Angelfire deck.

Battling Extended Archetypes

Over the last two weeks I only touched briefly on the Online Extended metagame, but today it is time to visit it in detail. I have compiled the Magic Online Extended Premier Events Top 8 results of the last three weeks in a handy table. The first column is the deck name. The second column shows the popularity percentage for the week of January 1-7, the third column shows it for the week of January 8-14, and the fourth column shows it for the week of January 15-21. This popularity percentage is my way of ranking the deck archetypes on a combination of Top 8 appearances and performance in the elimination rounds. Simply put, I add up the expected total match points of each Premier Event Top 8 deck, rather than allocating points on a one-point-per-Top 8 basis. Therefore, a high popularity percentage indicates a deck that many people play and win with. The last column shows the average weighted popularity over the three weeks. The decks are ranked in order of that average popularity.

Deck | Week 1 | Week 2 | Week 3 | Average Popularity
1. TEPS Desire | 8% | 14% | 15% | ■■■■■ ■■■■■ ■■ (12%)
2. Boros Deck Wins | 20% | 5% | 9% | ■■■■■ ■■■■■ ■ (11%)
3. U/W Urzatron | 6% | 16% | 11% | ■■■■■ ■■■■■ ■ (11%)
4. Aggro Loam | 4% | 11% | 7% | ■■■■■ ■■ (7%)
5. Trinket Angel | 6% | 11% | 4% | ■■■■■ ■■ (7%)
6. Affinity | 2% | 8% | 5% | ■■■■■ (5%)
7. Gifts/Rock | 10% | 2% | 3% | ■■■■■ (5%)
8. Scepter/Chant | 5% | 3% | 5% | ■■■■ (4%)
9. CAL | 3% | 7% | 1% | ■■■■ (4%)
10. Ichorid | 6% | 1% | 2% | ■■■ (3%)
11. Tooth and Nail | 1% | 4% | 3% | ■■■ (3%)
12. UW Post | 3% | 0% | 6% | ■■■ (3%)
13. Flow Rock | 4% | 0% | 4% | ■■■ (3%)
14. Aggro Flow Rock | 5% | 1% | 1% | ■■■ (3%)
15. Goblin Storm | 5% | 2% | 1% | ■■■ (3%)
16. Flow Deck Wins | 4% | 3% | 0% | ■■ (2%)
17. MUC Tron | 2% | 1% | 2% | ■■ (2%)
18. R/G Aggro | 1% | 1% | 2% | ■ (1%)
19. G/W Hate | 0% | 0% | 4% | ■ (1%)
20. U/G Opposition | 0% | 0% | 3% | ■ (1%)

The major decks are TEPS Desire, Boros Deck Wins, and U/W Urzatron, a pattern that is not new. Today I'm not going to feature special decks, because there haven't been that many spectacular innovations and the people that have interesting decks usually don't want to share their lists, as they want to keep a deck advantage for upcoming PTQs or GPs. Furthermore, if you are just looking for a good decklist or the latest tech, you can always check the deck-o-pedia entry (click on a deck name in the above table to view a decklist and short explanation in my deck-o-pedia forum thread), the Pro Tour - Yokohama Qualifying Season Top 8 Decklists, or keep an eye on Mike Flores's Swimming with Sharks column on Thursdays.

Today I am going to present a little experiment. I always base my metagame discussions on the visible replays of the Premier Event Top 8s, but there is more data to be gained from Magic Online. While Premier Events are running, you can also watch replays of the current round and see every round's match results. By noting down who plays what (for instance, PlayerA=TEPS, PlayerB=Boros), copying the match results (which appear in the format "PlayerA defeated PlayerB."), and then matching these by some handy usage of Excel ("TEPS defeated Boros"), we have a bunch of matchup data to draw from that can indicate what a deck's good and bad matchups are. This week, I tracked this data for five Extended Premier Events and added some of my own playtesting results. Adding everything up, we can draw conclusions on how matchups play out. For example, Boros beat TEPS in 5 out of 26 matches; therefore Boros beats TEPS 19% of the time and TEPS beats Boros 81% of the time.
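The bookkeeping described above was done in Excel. As a rough illustration only, the same join between "who plays what" and "who defeated whom" could be sketched in Python. The function names, the player-to-deck dictionary, and the sample log are all hypothetical; the 21-5 record simply restates the Boros versus TEPS example from the text.

```python
from collections import Counter


def tally_matchups(result_lines, deck_of):
    """Count wins per (winning deck, losing deck) pair.

    result_lines: strings in the form "PlayerA defeated PlayerB."
    deck_of: hypothetical dict mapping a player name to a deck label
    """
    wins = Counter()
    for line in result_lines:
        text = line.strip()
        if text.endswith("."):
            text = text[:-1]  # drop the trailing period of the result line
        winner, sep, loser = text.partition(" defeated ")
        # Skip malformed lines and players whose deck was not noted down.
        if not sep or winner not in deck_of or loser not in deck_of:
            continue
        wins[(deck_of[winner], deck_of[loser])] += 1
    return wins


def win_rate(wins, deck, opponent):
    """Share of matches that `deck` won against `opponent`, or None if unseen."""
    won = wins[(deck, opponent)]
    lost = wins[(opponent, deck)]
    total = won + lost
    return won / total if total else None


# Made-up log that reproduces the 5-out-of-26 example from the text.
deck_of = {"PlayerA": "TEPS", "PlayerB": "Boros"}
results = ["PlayerA defeated PlayerB."] * 21 + ["PlayerB defeated PlayerA."] * 5

wins = tally_matchups(results, deck_of)
print(f"Boros beats TEPS {win_rate(wins, 'Boros', 'TEPS'):.0%}")
print(f"TEPS beats Boros {win_rate(wins, 'TEPS', 'Boros'):.0%}")
```

With this made-up log the two printed rates are 19% and 81%, the same figures as in the example. Matchups with no recorded matches return `None` rather than a misleading percentage.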

Trying to communicate all this matchup information in a well-organized, conveniently arranged manner raised some issues. I can hardly put up a 40x40 matrix with an entry for every random deck, so I had to make some abstractions, only showing matchup information for the most popular global deck archetypes. Furthermore, showing stuff like 21-5 also got a bit chaotic, so I am just showing calculated match win percentages (I indicate very reliable results and very unreliable results with a small colored note, in order to still give some information on how many matches each matchup calculation is based on). I'll show the table now, and afterwards discuss some of the deck classifications I used.

Deck | Boros | TEPS | UW Mana | Scepter | Trinket | GiftsRock | Affinity | Loam | Flow | Other
UW Mana | 59% | 50% | 50% | 75% | 50% | 29% | 28% | 40% | 21% | 63%

Reliable result: based on 20 or more matches.
Unreliable result: based on fewer than 6 matches.

Boros, TEPS, Scepter (-Chant), Trinket (-Angel), GiftsRock, and Affinity all are what you expect. But the categories "UW Mana," "Loam," and "Flow" warrant extra explanation. "UW Mana" includes UW Tron decks as well as UW Post decks. My reasoning for combining these two archetypes is that UW Post plays the same cards as UW Tron; it just has a slightly different mana base: Urzatron vs. Cloudpost/Vesuva. It's an easy distinction, but their matchups against the field are about the same, and that's what this table is about. Combining the UW Post matchup data with the UW Tron matchup data should give more accuracy to the matchup percentages because it improves the sample size.

Similar reasoning has been used for the "Loam" and "Flow" categories. "Loam" includes Aggro Loam and CAL. Whereas these two archetypes always looked and played games differently - Aggro Loam traditionally played aggressive cards like Terravore, Devastating Dreams, Werebear, and Firebolt, whereas CAL tended to play control cards like Eternal Witness, Solitary Confinement, Dark Confidant, and extra lands - lately these decks have more or less blended together. Seeing Solitary Confinement, Eternal Witness, Terravore, and Devastating Dreams together in one deck is not uncommon. That makes classifying pretty hard (although I try to label the decks according to the strategy that they resemble most for my metagame table), but it also blends together CAL's and Aggro Loam's matchups. Since the difference between these archetypes is usually only a few cards now, they should have more or less similar matchups against the field. The same holds for the "Flow" category, which includes Flow Rock (with Eternal Witness and Sensei's Divining Top), Aggro Flow Rock (with Troll Ascetic and equipment), and Flow Deck Wins (with Kird Ape and Firebolt). These versions are also somewhat blending together lately (Semi-Aggro Flow Semi-Rock Deck Wins, anyone?). They all have a similar game plan in Destructive Flow, and all Flow decks together make up a decent part of the metagame (8%), so I felt it was warranted to create a special category for these decks as well.
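As a small illustration of this grouping, the averaged shares from the three-week metagame table can be summed per category. This is only a sketch with hypothetical variable names; the shares are the rounded average popularity figures from the table, and the archetype-to-category mapping follows the description above.

```python
# Averaged popularity (in percent), copied from the three-week metagame table.
average_share = {
    "U/W Urzatron": 11, "UW Post": 3,
    "Aggro Loam": 7, "CAL": 4,
    "Flow Rock": 3, "Aggro Flow Rock": 3, "Flow Deck Wins": 2,
}

# Archetype-to-category grouping as described in the text.
category_of = {
    "U/W Urzatron": "UW Mana", "UW Post": "UW Mana",
    "Aggro Loam": "Loam", "CAL": "Loam",
    "Flow Rock": "Flow", "Aggro Flow Rock": "Flow", "Flow Deck Wins": "Flow",
}

# Sum the shares of all archetypes that fall into the same category.
category_share = {}
for deck, share in average_share.items():
    category = category_of[deck]
    category_share[category] = category_share.get(category, 0) + share

print(category_share)
```

The Flow total comes out at 8, which agrees with the 8% quoted for all Flow decks together.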

So What's the Use of All This?

The above table can be used to see which deck type tends to beat what. If you know what your bad matchups are, you can try to find out why that matchup is bad (a vital thing that the above table unfortunately doesn't tell) and come up with fixes. Also, you can take this to your friend who still claims that Boros beats UW Tron to prove him wrong. Nowadays Sphere of Law frequently appears in sideboards, so that matchup is not what it used to be anymore. But we can go further than just matchup percentages. The next step is to combine this matchup data with the metagame table, in order to calculate an expected match win percentage against this field. I'll explain how that works with an example. The expected match win percentage of Boros against the field is equal to:

{ Percentage of Boros decks in the metagame (11%) * Match win percentage against Boros (50%) } + { Percentage of TEPS decks in the metagame (12%) * Match win percentage against TEPS (19%) } + … + … + { Percentage of Other decks in the metagame (23%) * Match win percentage against Other (73%) } = 46.5%

Doing this for all deck categories, we get the following results:

UW Mana | 49.4%
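The weighted sum can be checked against the UW Mana row of the matchup table. The sketch below assumes the metagame shares are the averaged popularity figures, with UW Mana, Loam, and Flow summed from their member archetypes (14%, 11%, and 8%). That is a reading of the tables rather than something stated for every category; only the Boros (11%), TEPS (12%), and Other (23%) shares appear explicitly in the formula. The function and variable names are hypothetical.

```python
def expected_win_percentage(metagame_share, win_vs):
    """Average of matchup percentages, weighted by each opponent's metagame share."""
    total_share = sum(metagame_share.values())
    weighted = sum(share * win_vs[deck] for deck, share in metagame_share.items())
    return weighted / total_share


# Metagame shares in percent (assumed as described in the lead-in).
metagame_share = {
    "Boros": 11, "TEPS": 12, "UW Mana": 14, "Scepter": 4, "Trinket": 7,
    "GiftsRock": 5, "Affinity": 5, "Loam": 11, "Flow": 8, "Other": 23,
}

# UW Mana's match win percentages, taken from its row in the matchup table.
uw_mana_vs = {
    "Boros": 59, "TEPS": 50, "UW Mana": 50, "Scepter": 75, "Trinket": 50,
    "GiftsRock": 29, "Affinity": 28, "Loam": 40, "Flow": 21, "Other": 63,
}

print(f"UW Mana: {expected_win_percentage(metagame_share, uw_mana_vs):.1f}%")
```

With these inputs the sketch gives 49.4%, matching the UW Mana figure listed above. Dividing by the total share keeps the result a proper average even if the rounded shares did not add up to exactly 100.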

This would lead us to believe that some Life from the Loam deck is the best choice to take to an upcoming tournament, and that Trinket Angel, Scepter/Chant, and TEPS are fine options as well. These are interesting observations. But you have to take them with a grain of salt. There are, after all, a lot of methodological problems with my method. I'll highlight a few. The most important one is obvious: these conclusions are only valid for the stated metagame - in this case it is represented by the average online popularity percentage over the last three weeks, which is already skewed towards top tier decks - and metagames tend to vary widely over time and place. But there are other problems. Some matchup percentages are rather unreliable (based on fewer than 6 matches), and that can skew results.
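To make the sample-size caveat concrete, one common way to express the uncertainty of an observed win rate is a Wilson score interval. This is not part of the method used for the tables above; it is offered only as an illustration, and the function name, the 95% level (z = 1.96), and the 1-win-in-5 record are choices made for this sketch.

```python
import math


def wilson_interval(wins, matches, z=1.96):
    """Approximate 95% Wilson score interval for a win rate (z = 1.96)."""
    if matches == 0:
        raise ValueError("need at least one match")
    p = wins / matches
    z2 = z * z
    denom = 1 + z2 / matches
    center = (p + z2 / (2 * matches)) / denom
    half = z * math.sqrt(p * (1 - p) / matches + z2 / (4 * matches ** 2)) / denom
    return center - half, center + half


# 5 wins in 26 matches (the Boros versus TEPS example) compared with a
# hypothetical 1-win-in-5 record that has a similar observed win rate.
for wins, matches in [(5, 26), (1, 5)]:
    low, high = wilson_interval(wins, matches)
    print(f"{wins}/{matches}: {low:.0%} to {high:.0%}")
```

The interval for the five-match record is much wider than the one for 26 matches, which is why percentages based on only a handful of matches deserve extra caution.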

Furthermore, this method abstracts away from playing skill. I am actually not surprised that control decks have higher win percentages than beatdown decks (Affinity and Boros are relatively low), because the good players usually play control decks, as they usually feel control decks offer them more opportunity to outplay their opponents and they appreciate the challenge of a difficult deck better. Pro Tours are also usually more control-heavy than State Championships, for instance. If all high-rated players had chosen Boros/Affinity instead and if all low-rated players had chosen Life from the Loam decks instead, then the numbers would probably all be closer to 50%.


Moreover, my method also doesn't give proper weight to well-tuned versions of decks, and most importantly, because I abstracted away from specific versions, it doesn't tell you yet what the best Life from the Loam variant is. Finding a well-tuned deck that suits your own personal playing style is smarter than just picking the deck with the highest win percentage presented here. If you have found a good version of Affinity and can pilot it very well, do not make the mistake of switching to a Life from the Loam deck at the last minute because of something you read here. Experience and comfort with a certain deck are more important factors! That said, I previously discussed how Aggro Loam and CAL decks are looking more and more similar. Aggro Loam is borrowing Solitary Confinement from CAL, and CAL is borrowing Terravore from Aggro Loam, for instance. I like this and think the best possible Loam deck is somewhere in between. I personally would try to make a Life from the Loam deck with all the most powerful tools available. I don't like random Firebolts - I'd rather make a deck that runs Solitary Confinement (a.k.a. good game against Boros/Affinity), the Terravore plus Devastating Dreams combo (good game against pretty much everything), Cabal Therapy (you need it to battle TEPS), and at least 25 lands. I don't have a fixed decklist yet - I honestly have no clue whether cards like Werebear, Wall of Roots, Eternal Witness, Loxodon Hierarch, and Putrefy should be in or not, and I also need to figure out a good sideboard plan against Trinket Angel's Tormod's Crypts - but I would certainly include every non-blue card that has good synergy with Life from the Loam.

Behind the Scenes: Interview with Beta's Own Jerry Vanhulle

Implementing the Planar Chaos cards on Magic Online is similar to making a new computer program; the code for the new cards has to be tested. Software testing culminates in a Beta, where end users test the computer program and the bugs they discover are fixed. The Planar Chaos Beta is supposed to start this week, so I asked Jerry Vanhulle, the manager of most of the Beta process, some questions on what happens behind the scenes.

Online Tech: Jerry, tell us a little about yourself: age, background, job title and description, how long have you worked for Wizards and the beta program, etc.


Jerry Vanhulle: I am 33 years old. My background is actually in dairy farming and herd management. I grew bored of that and got into gaming as a hobby (I have been playing D&D and other pen and paper games since I was 14). I eventually got work as a game tester and worked for a number of the major game companies in the greater Seattle area. I really loved testing and have been involved in it for nearly 10 years now. My official title is Software Test Engineer, or "Black Box" tester, and, as my manager calls me, "All around Beta Dude," but that isn't official yet. I have been working on the Betas since the Mirrodin block. I have worked at Wizards on and off since March 2003, when I was involved in a very small group of testers who worked on Scourge. I began working on the Beta program as a minor tester during the Mirrodin block. I took up a more significant role in the Beta each time I worked on one, until about a year and a half ago when I began running the Betas under the direction of Mister X.

OT: Can you offer an estimated release date for online Planar Chaos?

JV: The target is February 26th 2007. Of course, quality is king, and if we find that we need more time as we gauge the stability of the new set during Beta, we'll take it.

OT: How is the beta tester pool going to be selected this time around?

JV: The Beta tester pool will be selected from the top scoring applicants who applied to our online application. We have a small group of about 75 people we call our 'core' testers who receive an invite each Beta. People who submit a large quantity of valid bugs, provide our Beta staff with assistance in testing, and players that assist in helping other players test or give direction to new Beta testers on a regular basis are included on the core testers list. Basically people who make the Beta process better or smoother are included as a core tester. The list has people removed and added each Beta but it generally remains around 75 people. We do not always see eye to eye with our core testers, and sometimes they do not agree with what we are doing. That kind of input is what makes Magic Online the game it is today.

OT: For the Time Spiral beta, certain changes had been implemented to ensure the beta players are finding bugs rather than improving their draft skills or testing their latest competitive decks. I remember draft queues were replaced with Sealed Deck events, and in Constructed only Singleton matches were allowed, for instance. Are you happy with that direction, and how is it going to affect the Planar Chaos beta?


JV: We are constantly tweaking our Beta process. We have far more Betas than other companies that produce online content, and because of this we are constantly tweaking the process to ensure two things. First, and foremost, we need to have a Beta that provides results. We need to maximize bug detection and elimination. Second, we want our Beta participants to have a pleasant experience. We can't make everyone happy, but we seek to make the experience as enjoyable as possible while never losing sight of what everyone is on the Beta for… to find bugs. We are not at the point, and probably never will be, where we think our process cannot be improved. There is always room for improvement and player input provides constant improvement. As far as the direction we are going in this Beta, we are sticking with the Sealed format, but we are taking a more relaxed stance on the Constructed games. However, people who advertise games titled "Competitive decks only," or anything else that suggests they are using Beta to practice and not test, will be permanently kicked and banned from Beta. It is okay to play a fun game with another Beta tester now and again, but keep it to a minimum. If you're tired of testing, log onto live and come back when you're ready for some more test games.

OT: What do you like the best about your job? What do you like the least?

JV: I like playing Magic and interacting with the players. I also love finding bugs. It is fun for me to puzzle out where exactly a card is breaking during the process of a game. The most fun I have is working with the players on figuring out really complex bug interactions. It is almost a game in itself. What do I like least about my job? I would have to say it would be Beta conduct enforcement. I hate having to ban, kick, or reprimand Beta participants. It just bums me out.

OT: Is there one Beta that stands out to you or that has left fond memories and why?

JV: The Beta that was the most memorable for me was the Dissension Beta. For two reasons: We introduced our Beta boards, and we introduced Testmonkey5, or as her fans know her, TM5. The Beta boards opened up a whole new dynamic for communicating with our Beta testers. The boards allow us to archive and index Beta testers' comments and ideas for future reference. We are very happy we put in the effort to set up the Beta boards. They are the most useful tool we have for directing and interacting with our Beta testers. Testmonkey5 was the other reason why that Beta stands out so vividly for me. TM5 put a lot of time and effort into the Beta and the Beta testers fell in love with her wit and effervescent personality. She was the only Test Monkey to ever get a fan club! I am happy to announce that Testmonkey5 is now a permanent part of the team. Welcome aboard Tonja, a.k.a. Testmonkey5!

OT: What is a "Testmonkey," and what is their role in the beta? Have you ever been one?


JV: A Test Monkey is a contractor. They work with us for about a year and often come back later and work for us again. We are picky. We look for people who are experienced black box testers, have knowledge of Magic: The Gathering, and are gamers themselves. I started off over 4 years ago as a testbot, testbot8 in fact. Later we began card set testing online through our Beta program. Testbots became Testmonkeys and I was Testmonkey7. Eventually I became Testmonkey1, and then when Wizards of the Coast hired me on directly I became the JerryVan you all know and love today. The role of a Testmonkey in Beta varies depending on the Testmonkey. Some manage bug submissions, which is a full time job by itself. Others handle Beta conduct enforcement. Some are mostly on our internal test environment conducting countless repetitive test case passes and playing one-man eight-player games. Beta is our busiest time here at Wizards HQ. If the players are the heart of the Beta, the Testmonkeys are the brain, tracking each and every bug and working on getting them fixed with our development staff.

OT: How will v3.0 affect the way Betas are run, if at all?

JV: 3.0 Betas will work exactly as 2.5 Betas work. We may need new tools for working with 3.0, but the basics will remain the same.

OT: Is there an MTG card that best describes you?

JV: I wasn't sure what card I would be so I asked my co-workers, and they all think the card that best represents me is Meddling Mage, but mostly because of the flavor text! I use Meddling Kids as my avatar on the boards, because not only is it funny, but Meddling Mage is one of my favorite cards. So I think their observation is apt.

Lastly, I'd like to give special thanks to…

  • Jerry Vanhulle for answering my questions and joekewwl for his help on the interview.
  • Mike Mandelsberg, a top-notch programmer who created a program that makes it easier to record replays. It's a great help for me and my assistant Josh Clark to get all the required data to write this column, hopefully making it even stronger. Josh wanted to use this opportunity to publicly thank his friend for creating this tool.