Re: Magic Data

Posted: 05 Feb 2013, 23:21
by Arch
Not completely sure why I have set codes in there actually.

They are not good as identifiers since they may change. Any persistent data should not rely on them. I used to use them in the data for mapping but I just changed this.

Text version still uses them but those codes could be whatever.

Are there any use-cases for set codes? I don't have one. There's one or two versions for the HQ pictures if I recall correctly but those are more or less arbitrary.

Re: Magic Data

Posted: 06 Feb 2013, 12:48
by Marek14
Hm, did the alphabetization algorithm change? Before, all lowercase letters were put behind the uppercase letters, now it seems they are mixed (see the Leylines, for example, Meek and Void one used to be at the end of the list).

Also, Chancellor of the Dross doesn't have "flying" capitalized.

Re: Magic Data

Posted: 06 Feb 2013, 21:00
by Arch
http://dl.dropbox.com/u/2771470/mtg-data.zip
http://dl.dropbox.com/u/2771470/mtg-data-txt.zip

Some updates.
- Protection with 3 items is split correctly.
- A bunch of flip cards fixed in language. (And lavarunner in cards.xml.) There are still some that are incorrect: flip cards that are not creatures.
- Text version has proper capitalization of split-rules.

Things to look into.
- Double cards (x // y) in lang_* are not ok.
- Set codes. As I said earlier, I don't know where this should go.
I'm calling it "things to look into" because I can't really see a solution at this point. At best I have ideas for work-arounds.

Marek14 wrote:Hm, did the alphabetization algorithm change? Before, all lowercase letters were put behind the uppercase letters, now it seems they are mixed (see the Leylines, for example, Meek and Void one used to be at the end of the list).
Probably, since it has changed. I'm currently downcasing everything before sorting, guess I didn't do that before. Want me to change it back? Either way is fine with me.
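
To illustrate the difference being discussed, here is a minimal Python sketch of the two orderings. It assumes the old behavior was a plain code-point sort (every uppercase ASCII letter sorts before any lowercase one) and the new behavior lowercases names before comparing; the actual sorting code in the data generator may differ.

```python
# Four of the Leylines mentioned above; the list is deliberately unsorted.
leylines = [
    "Leyline of the Void",
    "Leyline of Vitality",
    "Leyline of the Meek",
    "Leyline of Sanctity",
]

# Assumed "old way": code-point order, so "V" (uppercase) sorts before "t".
old_order = sorted(leylines)

# "New way": names are lowercased before comparison, so "the ..." sorts
# between "Sanctity" and "Vitality".
new_order = sorted(leylines, key=str.lower)

for label, order in (("old", old_order), ("new", new_order)):
    print(label, order)
```

With code-point order the "the Meek" and "the Void" entries end up last, which matches the behavior Marek14 describes.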

Re: Magic Data

Posted: 06 Feb 2013, 22:21
by Marek14
I'd prefer if you kept the old way, yes. I got used to it :)

Re: Magic Data

Posted: 06 Feb 2013, 22:53
by PilotPirx
Mana Cost for Soltari Guerrillas misses some brackets.

Re: Magic Data

Posted: 07 Feb 2013, 00:28
by friarsol
Arch wrote:Are there any use-cases for set codes? I don't have one. There's one or two versions for the HQ pictures if I recall correctly but those are more or less arbitrary.
We parse the set codes up above in the .txt file for determining distinct cards, and set coverage in Forge. Tempest set code is TE, any card that has TE <rarity> on their set line (the last line in a card block) gets accumulated to see how much of the set we reach.

It's not a required functionality, but it is extremely handy. I don't really care if the set codes are arbitrary as long as they are unique per set, non-empty, and consistent within the mtg-data.txt file (which they have been). In the last version of the mtg-data.txt we were using (post RTR) all set codes existed, so all cards would be parsed appropriately.

But in this version, for example, since Astral doesn't have a set code anymore, the parsing is off for the set line.

Broken set code:
Aswan Jaguar
{1}{G}{G}
Creature - Cat
2/2
As Aswan Jaguar enters the battlefield, choose a creature type at random from among all creature types that a creature card in target opponent's decklist has.
{G}{G}, {T}: Destroy target creature with the chosen type. It can't be regenerated.
S
(Above that would normally be <setcode> <rarity> but setcode is "")


Looking through the old file vs the new, it looks like only Astral and Dreamcast sets have cards listed in the mtgdata.txt file including wonky Set lines. So I guess that means that Anthologies, Deckmasters, Renaissance, Rivals Quick Start, and Multiverse Gift Box, all don't have any cards attached to them. (So I don't really care if they don't have an attached set code, since there aren't any cards to parse with that code.)
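
As a rough illustration of the kind of parsing described above, the Python sketch below counts cards per set code from the last line of each card block and flags blocks whose set line is malformed. It is not Forge's actual parser. The blank-line block separator, the comma-separated `<setcode> <rarity>` entries, the set code `XX`, and the two "Hypothetical Card" entries are assumptions made for the example; only the Aswan Jaguar set line (`S`) and the Tempest code `TE` come from the post.

```python
from collections import Counter

SAMPLE = """\
Hypothetical Card A
{1}{G}
Creature - Cat
2/2
TE R

Hypothetical Card B
{R}
Instant
TE C, XX U

Aswan Jaguar
{1}{G}{G}
Creature - Cat
2/2
S
"""


def set_coverage(text):
    """Return (cards per set code, names of cards with a broken set line)."""
    counts = Counter()
    broken = []
    for block in text.strip().split("\n\n"):
        lines = block.splitlines()
        name, set_line = lines[0], lines[-1]
        for entry in set_line.split(","):
            parts = entry.split()
            if len(parts) != 2:
                # An empty set code leaves only the rarity on the line.
                broken.append(name)
                continue
            code, _rarity = parts
            counts[code] += 1
    return counts, broken


counts, broken = set_coverage(SAMPLE)
print(dict(counts), broken)
```

A set line consisting of a rarity alone, as in the Aswan Jaguar block, has one token instead of two, which is why it ends up in the broken list rather than in any set's count.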

Re: Magic Data

Posted: 07 Feb 2013, 01:08
by randian
The rule-number algorithm assigns the wrong no="X" to rules. For example, from Elite Inquisitor:

1 First Strike
1 Vigilance
2 Protection from Vampires
2 Protection from Werewolves
2 Protection from Zombies

I think that should read

1 First Strike
2 Vigilance
3 Protection from Vampires
4 Protection from Werewolves
5 Protection from Zombies

Forked-Branch Garami also incorrectly gives no="1" to its two rules.

I'd like to propose a new XML schema that I think handles duals much better. The example below has both a single and a dual. Dual cards are a natural extension of singles and are handled the same way. You could do 3-way cards without any change. It eliminates some unnecessary XML elements. You can use the same XML schema for both card and meta (currently they're different, and lang* is yet a third), though I think the logic of lang*xml suggests getting rid of card/meta since they're derived information (from a hypothetical all-in-one language_english.xml).

Code:
<set name="Invasion">
  <card name="Absorb">
    <rarity>R</rarity>
    <number>226</number>
    <spell name="Absorb">
      <cost>{W}{U}{U}</cost>
      <typelist>
        <type type="card">Instant</type>
      </typelist>
      <rulelist>
        <rule no="1">Counter target spell. You gain 3 life.</rule>
      </rulelist>
      <flavor>"Your presumption is your downfall."</flavor>
      <artist>Andrew Goldhawk</artist>
    </spell>
  </card>
  <card name="Assault // Battery" multi="split">
    <rarity>U</rarity>
    <number>295</number>
    <spell name="Assault">
      <cost>{R}</cost>
      <typelist>
        <type type="card">Sorcery</type>
      </typelist>
      <rulelist>
        <rule no="1" reminder="reminder">Assault deals 2 damage to target creature or player.</rule>
      </rulelist>
      <flavor>Flavor 1</flavor>
      <artist>Ben Thompson</artist>
    </spell>
    <spell name="Battery">
      <cost>{3}{G}</cost>
      <typelist>
        <type type="card">Sorcery</type>
      </typelist>
      <rulelist>
        <rule no="1" reminder="">Put a 3/3 green Elephant creature token into play.</rule>
      </rulelist>
      <flavor>Flavor 2</flavor>
      <artist>Ben Thompson</artist>
    </spell>
  </card>
</set>
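
A minimal Python sketch of how a consumer might read this proposed layout, using the standard library's `xml.etree.ElementTree`. The embedded XML is a trimmed, well-formed excerpt of the example above, and the function name `spells_by_card` is hypothetical. The point it illustrates is that single and split cards go through the same loop, since a single is just a card with one `<spell>` child.

```python
import xml.etree.ElementTree as ET

# Trimmed excerpt of the proposed schema (only name and cost are kept).
XML = """
<set name="Invasion">
  <card name="Absorb">
    <spell name="Absorb">
      <cost>{W}{U}{U}</cost>
    </spell>
  </card>
  <card name="Assault // Battery" multi="split">
    <spell name="Assault">
      <cost>{R}</cost>
    </spell>
    <spell name="Battery">
      <cost>{3}{G}</cost>
    </spell>
  </card>
</set>
"""


def spells_by_card(xml_text):
    """Map each card name to a list of (spell name, cost) pairs."""
    root = ET.fromstring(xml_text)
    return {
        card.get("name"): [
            (spell.get("name"), spell.findtext("cost"))
            for spell in card.findall("spell")
        ]
        for card in root.findall("card")
    }


cards = spells_by_card(XML)
for name, spells in cards.items():
    print(name, spells)
```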

Re: Magic Data

Posted: 07 Feb 2013, 17:48
by PilotPirx
That's for multi-word lines like:

Elite Inquisitor
First Strike, vigilance
Protection from Vampires, from Werewolves, and from Zombies


Re: Magic Data

Posted: 07 Feb 2013, 19:44
by Arch
Marek14 wrote:I'd prefer if you kept the old way, yes. I got used to it
I'll update it for next release.

PilotPirx wrote:Mana Cost for Soltari Guerrillas misses some brackets.
Fixing.

friarsol wrote:...
Yeah the setcodes are required for the text version - and will be fixed. (At least for Astral and Dreamcast as you noted.) The way they currently exist though they serve very little purpose in setinfo.xml.

randian wrote:The rule-number algorithm assigns the wrong no="X" to rules.
The no attribute corresponds to the line of the card text that the rule appears on (described here: https://github.com/karmag/loa/blob/ec0e ... at.txt#L53). The rules are split to make parsing easier; the numbering was introduced so that the original card text can be recovered.
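
As a small illustration of that numbering, the Python sketch below regroups the split rules of Elite Inquisitor by their `no` value, giving one group per line of the original card text. The `(no, text)` tuple representation and the function name `rules_by_line` are assumptions for the example, not part of the data format.

```python
from itertools import groupby

# (no, rule text) pairs as listed for Elite Inquisitor earlier in the thread.
rules = [
    (1, "First Strike"),
    (1, "Vigilance"),
    (2, "Protection from Vampires"),
    (2, "Protection from Werewolves"),
    (2, "Protection from Zombies"),
]


def rules_by_line(rules):
    """Group rule texts by the card-text line they were split from."""
    ordered = sorted(rules, key=lambda rule: rule[0])  # stable sort keeps rule order
    return {
        no: [text for _, text in group]
        for no, group in groupby(ordered, key=lambda rule: rule[0])
    }


lines = rules_by_line(rules)
for no, texts in lines.items():
    print(no, texts)
```

Rebuilding the exact printed wording (for example "Protection from Vampires, from Werewolves, and from Zombies") would need more than this grouping; the sketch only shows which rules share a line.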

randian wrote:I'd like to propose a new XML schema that I think handles duals much better. The example below has both a single and a dual. Dual cards are a natural extension of singles and are handled the same way. You could do 3-way cards without any change. It eliminates some unnecessary XML elements. You can use the same XML schema for both card and meta (currently they're different, and lang* is yet a third), though I think the logic of lang*xml suggests getting rid of card/meta since they're derived information (from a hypothetical all-in-one language_english.xml).
I can see what you want with breaking up multi-part cards that way, but I don't see that it would provide a benefit over the current format. You're always stuck with handling multi-part cards specially since they need to be linked. Whether they are fetched sequentially or off another element doesn't really change much.

The split of data over cards, meta and lang* is actually very deliberate and structured that way to avoid repeating information. Consider handling the cards in your example when they are added to another set. You'd need to duplicate a lot of the data.

  • cards.xml contains the oracle version of the card, meaning the most up-to-date version from a rules perspective.
  • meta.xml contains the mapping to set as well as the information related to the card in that set. Should be pretty obvious about the split from cards.xml here since cards are re-printed all the time.
  • language_*.xml contains the version of the card as it was printed for each set. That is, it only duplicates cards.xml when a printed card has no corrections. language_english and cards belong together as much (or little) as language_russian and cards.
The relation between cards -> meta and meta -> lang* is a one-to-many in both cases and while they could be merged I think it's clearer to keep them apart.
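
The two one-to-many relations can be sketched in Python as three record types. All class and field names, the second set, and the non-English flavor entry are hypothetical and chosen only for illustration; the real files are XML and carry more fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OracleCard:      # cards.xml: one oracle entry per card
    name: str
    rules: tuple


@dataclass(frozen=True)
class SetEntry:        # meta.xml: one entry per (card, set)
    card_name: str
    set_name: str
    rarity: str


@dataclass(frozen=True)
class PrintedText:     # language_*.xml: one entry per (card, set, language)
    card_name: str
    set_name: str
    language: str
    flavor: str


cards = [OracleCard("Absorb", ("Counter target spell. You gain 3 life.",))]
metas = [
    SetEntry("Absorb", "Invasion", "R"),
    SetEntry("Absorb", "Hypothetical Reprint Set", "R"),  # made-up reprint
]
printed = [
    PrintedText("Absorb", "Invasion", "english", '"Your presumption is your downfall."'),
    PrintedText("Absorb", "Invasion", "russian", "(hypothetical translated flavor)"),
]


def sets_for(card_name):
    """cards -> meta: every set a card appears in."""
    return [m.set_name for m in metas if m.card_name == card_name]


def languages_for(card_name, set_name):
    """meta -> lang*: every language a given printing exists in."""
    return [
        p.language
        for p in printed
        if p.card_name == card_name and p.set_name == set_name
    ]


print(sets_for("Absorb"), languages_for("Absorb", "Invasion"))
```

Adding the card to another set adds one `SetEntry` (and its printed texts) while the oracle entry stays a single record, which is the duplication the split is meant to avoid.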

Re: Magic Data

Posted: 07 Feb 2013, 20:44
by randian
Arch wrote:
  • meta.xml contains the mapping to set as well as the information related to the card in that set. Should be pretty obvious about the split from cards.xml here since cards are re-printed all the time.
Why does meta.xml no longer contain flavor text? Clearly that's "information related to the card in that set", since flavor text can change from set to set, but flavor text now only exists in lang*xml.

Re: Magic Data

Posted: 07 Feb 2013, 22:02
by friarsol
Arch wrote:
friarsol wrote:...
Yeah the setcodes are required for the text version - and will be fixed. (At least for Astral and Dreamcast as you noted.) The way they currently exist though they serve very little purpose in setinfo.xml.
Gotcha, just a miscommunication then. I should have clarified that I was referring to mtgdata.txt in the original post. Thanks Arch

Re: Magic Data

Posted: 08 Feb 2013, 09:13
by Arch
randian wrote:Why does meta.xml no longer contain flavor text? Clearly that's "information related to the card in that set", since flavor text can change from set to set, but flavor text now only exists in lang*xml.
Because there's one flavor-text per language. So when the language* files were introduced it made sense to put them there as well.

Re: Magic Data

Posted: 10 Feb 2013, 18:21
by Arch
http://dl.dropbox.com/u/2771470/mtg-data.zip
http://dl.dropbox.com/u/2771470/mtg-data-txt.zip

Fixed:
- Altered sorting for text version.
- Included set codes for dreamcast and astral. ("Fixes" the text version.)
- Fixed Soltari Guerrillas cost.

Remaining:
- Some double/flip cards are wrong in language_*.
- Update format.txt

Re: Magic Data

Posted: 12 Feb 2013, 13:43
by Marek14
Druids' Repository is written without a space in card text (Druids'Repository). The same error seems to be present in other cards whose name contains "s' ", like All Suns' Dawn.

Re: Magic Data

Posted: 12 Feb 2013, 22:12
by Arch
http://dl.dropbox.com/u/2771470/mtg-data.zip
http://dl.dropbox.com/u/2771470/mtg-data-txt.zip

Fixed:
- Kamigawa flip cards in language_*.
- "s' ".

Known issues:
- Double cards in language_*.