Magic Data
General Discussion of the Intricacies
Moderator: CCGHQ Admins
Magic Data
by Arch » 07 Jul 2009, 09:30
Data: http://dl.dropbox.com/u/2771470/mtg-data-2015-11-08.zip
Text: http://dl.dropbox.com/u/2771470/mtg-dat ... -11-08.zip
Source is available at:
Gatherer download/parsing: https://github.com/karmag/loa
Text version: https://github.com/karmag/loa-format
Support library: https://github.com/karmag/ants
This project has been discontinued. The last version of the software is compiled and ready to go here:
http://dl.dropbox.com/u/2771470/loa-standalone.zip
Original post:
I was wondering how/where the developers around here get their data. The data I'm talking about is card information (names, p/t, Oracle text...), which sets contain which cards, and other such straight-up data.
The way I currently do it (and I suspect others do as well) is to parse whatever decently structured sources I can find (Crystal Keep, Gatherer, MWS, the comprehensive rules, to name some). The problem is that none of the data is really formatted for easy processing: it suffers from duplicated data, missing data and plainly malformed data.
So does anyone have a better source for this kind of data?
Last edited by Arch on 08 Nov 2015, 17:38, edited 41 times in total.
Re: Magic Data
by Snacko » 07 Jul 2009, 10:22
Spider Gatherer for cards and organise the data as you see fit. This is where all the card information comes from. However, look out for the new Gatherer bugs, especially when parsing non-English cards.
For rules the Crystal Keep rulings summaries are nearly always properly formatted and pretty much everyone uses them.
I only use GH MWS masterbase, because it contains promo cards which are missing from Gatherer. The other upside is that GH puts in effort to ensure the card data doesn't have Gatherer bugs, plus he fixes all reported issues.
Re: Magic Data
by MageKing17 » 07 Jul 2009, 17:54
The old Gatherer used to spit out a perfectly suitable Oracle spoiler, but for some reason that option is no longer there (or it wasn't last time I looked). As for determining which cards are in which sets, on both the old and new Gatherers you can get it to spit out a "text spoiler", which is similar to an Oracle spoiler but contains extra information such as rarity and whatnot. It lacks line breaks in the rules text, though, and is therefore unsuitable as a replacement for the Oracle spoiler: it makes a world of difference whether a "whenever" is part of an ability or the start of a separate one.
Re: Magic Data
by Arch » 09 Jul 2009, 09:12
Decided to implement some tools for extracting the data from Gatherer. I did encounter some weird errors as mentioned by Snacko (mainly rarity being wrong), but they seem to mostly affect older cards/sets.
I made the data available if anyone wants it here:
http://www.easy-share.com/1906676470/mtg-data.zip
It's XML-formatted and is organized as follows:
cardlist.xml: All the rule information for all cards, basically all information that is not set-dependent. Un* sets are not included here.
glossary.xml, rules.xml: The comprehensive rules in XML format.
setlist.xml: All the sets with set codes, release data and such. Information taken from Wikipedia and the Wizards website.
set/*.xml: Set-specific information on a set-by-set basis. Contains rarity, collector numbers, artist.
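For anyone wanting to consume a layout like this, here is a minimal Python sketch of joining the set-independent card data with the per-set data. The element and attribute names (card, name, cost, type, code, rarity) and the sample card are assumptions made up for illustration; the actual schema in the zip may well differ, so check the files first.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for cardlist.xml (tag names are assumed, not the real schema).
CARDLIST_XML = """
<cardlist>
  <card>
    <name>Example Card</name>
    <cost>1W</cost>
    <type>Creature - Example</type>
  </card>
</cardlist>
"""

# Hypothetical stand-in for one file under set/ (again, assumed names).
SET_XML = """
<set code="XXX">
  <card name="Example Card" rarity="R"/>
</set>
"""


def load_cards(xml_text):
    """Return {card name: set-independent info} from a cardlist document."""
    root = ET.fromstring(xml_text.strip())
    return {
        card.findtext("name"): {
            "cost": card.findtext("cost"),
            "type": card.findtext("type"),
        }
        for card in root.iter("card")
    }


def load_printings(xml_text):
    """Return a list of (card name, set code, rarity) from a per-set document."""
    root = ET.fromstring(xml_text.strip())
    code = root.get("code")
    return [(c.get("name"), code, c.get("rarity")) for c in root.iter("card")]


def merge(cards, printings):
    """Attach each printing to its card; printings of unknown cards are skipped."""
    merged = {name: dict(info, printings=[]) for name, info in cards.items()}
    for name, code, rarity in printings:
        if name in merged:
            merged[name]["printings"].append((code, rarity))
    return merged


merged = merge(load_cards(CARDLIST_XML), load_printings(SET_XML))
print(merged["Example Card"])
```

Keeping the rule data keyed by card name and hanging the printings off it mirrors the split between cardlist.xml and set/*.xml described above.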
What is not included here, though I had wanted to include it, is flavor text and printed rules/types. Since it doesn't really interest me, I didn't spend any time trying to parse that (really annoying) part of Gatherer.
I have not spent any time trying to correct information that might be wrong in Gatherer. This has been straight-up parsing. I've just made sure the data ended up in the right place.
Will make an update when gatherer is updated with M10.
Re: Magic Data
by staggerwingjtstw » 09 Jul 2009, 14:10
Question Arch, is that the old Comp Rules, or the one they said they'd be releasing on like, Tuesday?
I like being bad... it makes me happy...
Re: Magic Data
by Arch » 09 Jul 2009, 16:50
staggerwingjtstw wrote: Question Arch, is that the old Comp Rules, or the one they said they'd be releasing on like, Tuesday?
Those are the old (090501) comp-rules. Seems like both Gatherer and comp-rules will be updated tomorrow though, so I'll do my update then as well.
Thanks for the heads-up. I'm not following the wizards homepage on a daily basis as I'm currently on vacation. In my mind the update was further down the line.
Re: Magic Data
by Arch » 10 Jul 2009, 19:28
Updated with M10 changes.
http://www.easy-share.com/1906697239/mt ... 090710.zip
If anyone would happen to be interested in the script behind this it can be made available.
Re: Magic Data
by Marek14 » 11 Jul 2009, 05:41
For some reason, I can't unpack con.xml 
EDIT: Also, what do you use to view the xml files properly? I tried the browser, but it still displays tags...
Re: Magic Data
by Arch » 11 Jul 2009, 08:37
Marek14 wrote: For some reason, I can't unpack con.xml
That sounds weird. Is it only con.xml that's failing? Are you able to unpack the rest of the files without trouble?
Marek14 wrote: EDIT: Also, what do you use to view the xml files properly? I tried the browser, but it still displays tags...
Not sure what you mean by "properly". You usually want to see the tags when looking at XML. I myself use whatever text editor I have available to view XML files.
Side-note: This package is aimed at developers. If you're not a developer you probably have little to no use for this data.
Re: Magic Data
by Marek14 » 12 Jul 2009, 05:25
Only con.xml. If it was everything, I'd think the file was corrupted, but it's only con.xml.
I am not a developer, but I am always on the hunt for a good text file with all the info. I tried the new Gatherer, but the best I was able to obtain is a gigantic html file, which can be browsed offline, but still isn't quite what I seek.
Re: Magic Data
by Arch » 12 Jul 2009, 16:41
http://blogs.msdn.com/oldnewthing/archi ... 55388.aspx
It appears that no file can be named con.<whatever>: CON is a reserved device name on Windows, regardless of extension. It's supposedly to support backwards compatibility with DOS 1.0. (Only affects Microsoft OSs, of course.) I'm on Ubuntu, so I didn't have any problems with it. It's something that would have to be addressed though...
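One way the reserved-name problem could be handled when writing the per-set files is sketched below in Python. The set of reserved device names (CON, PRN, AUX, NUL, COM1 to COM9, LPT1 to LPT9) is the standard Windows list; the function name safe_filename and the choice of an underscore prefix are just illustrative assumptions, not what the package actually does.

```python
# Device names Windows reserves, with or without an extension, in any letter case.
RESERVED = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)


def safe_filename(name):
    """Return name unchanged unless its stem is a reserved Windows device name.

    The underscore prefix is an arbitrary choice for this sketch.
    """
    stem = name.split(".", 1)[0]
    if stem.strip().upper() in RESERVED:
        return "_" + name
    return name


print(safe_filename("con.xml"))      # reserved stem, gets renamed
print(safe_filename("conflux.xml"))  # merely starts with "con", left alone
```

Only an exact match on the part before the first dot counts, so a name that merely begins with "con" is not affected.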
Marek14 wrote: I am not a developer, but I am always on a hunt for good text file with all the info. I tried new Gatherer, but the best I was able to obtain is a gigantic html file, which can be browsed offline, but still isn't quite what I seek.
One of the reasons I made this is because with a proper data format like this it's quite trivial to manipulate the data, including turning it into plain text. I could help you with it if you don't know how to do it yourself; just give me an idea of how you would like the text to be presented.
Re: Magic Data
by Marek14 » 13 Jul 2009, 04:19
Well, I guess I would like the format of the old Oracle spoiler, which looks like this:
Absolver Thrull
3W
Creature - Thrull Cleric
2/3
Haunt (When this card is put into a graveyard from play, remove it from the game haunting target creature.)
When Absolver Thrull comes into play or the creature it haunts is put into a graveyard, destroy target enchantment.
This is how it looks now:
Name: {Absolver Thrull}
Cost: 3W
Type: Creature — Thrull Cleric
Pow/Tgh: (2/3)
Rules Text: Haunt (When this card is put into a graveyard from the battlefield, exile it haunting target creature.)
When Absolver Thrull enters the battlefield or the creature it haunts is put into a graveyard, destroy target enchantment.
Set/Rarity: Guildpact Common
The Set/Rarity line would be a nice one to have (perhaps using shortcuts instead of full names), but not strictly necessary.
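A conversion into the old spoiler layout could be sketched in Python roughly like this. The dict layout (name, cost, type, pt, rules, printings) is hypothetical and not the actual structure of the XML data, and the "GPT C" abbreviation for Guildpact Common is just one possible shortcut style.

```python
def format_card(card):
    """Render one card in the old Oracle spoiler layout, one field per line.

    `card` is a hypothetical dict; empty or missing fields are skipped.
    """
    lines = [card["name"]]
    if card.get("cost"):
        lines.append(card["cost"])
    # Replace the long dash (U+2014) in the type line with a plain hyphen.
    lines.append(card["type"].replace("\u2014", "-"))
    if card.get("pt"):
        lines.append(card["pt"])
    lines.extend(card.get("rules", []))
    if card.get("printings"):
        # Optional Set/Rarity line using short codes, e.g. "GPT C".
        lines.append(", ".join(f"{code} {rarity}" for code, rarity in card["printings"]))
    return "\n".join(lines)


absolver_thrull = {
    "name": "Absolver Thrull",
    "cost": "3W",
    "type": "Creature \u2014 Thrull Cleric",
    "pt": "2/3",
    "rules": [
        "Haunt (When this card is put into a graveyard from the battlefield, "
        "exile it haunting target creature.)",
        "When Absolver Thrull enters the battlefield or the creature it haunts "
        "is put into a graveyard, destroy target enchantment.",
    ],
    "printings": [("GPT", "C")],  # assumed abbreviation style
}

print(format_card(absolver_thrull))
```

Keeping the rules text as a list of lines preserves the line breaks between abilities, which is the part that matters most for reading the spoiler.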
Re: Magic Data
by Marek14 » 16 Jul 2009, 18:01
Thanks, I got it and the data are nice.
The only change I had to make was to replace the dash in card text.
Re: Magic Data
by Marek14 » 19 Jul 2009, 17:46
OK, I found a bug in the text file you posted here. It seems that planeswalker texts are completely off.
Ajani Goldmane
2WW
Planeswalker - Ajani
1/1
: Target creature an opponent controls attacks you this turn if able.
LRW R, M10 M
(Interestingly, it looks like Alluring Siren itself is missing from the file).
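A simple sanity check could catch entries like this one automatically. The Python sketch below is only a heuristic based on this single report, and the dict layout (type, pt, rules) is a hypothetical one, not the actual data format.

```python
def find_problems(card):
    """Return a list of suspicious things about one card record (heuristic only)."""
    problems = []
    # A planeswalker should not carry a power/toughness value.
    if "Planeswalker" in card["type"] and card.get("pt"):
        problems.append("planeswalker has power/toughness")
    # A rules line starting with ':' looks like an ability that lost its cost.
    for line in card.get("rules", []):
        if line.lstrip().startswith(":"):
            problems.append("ability is missing its cost: " + line)
    return problems


# The broken entry as it appears in the text file.
ajani = {
    "name": "Ajani Goldmane",
    "type": "Planeswalker - Ajani",
    "pt": "1/1",
    "rules": [": Target creature an opponent controls attacks you this turn if able."],
}

for problem in find_problems(ajani):
    print(problem)
```

This only flags the symptoms seen here; it would not notice that the text belongs to a different card, or that a card is missing from the file altogether.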