How-To sitescriptTemplate

Introduction

This wiki page describes the properties and functions found in the LHpi sitescriptTemplate file. It explains what you must or can define, and how to define it.

Any remarks or feedback are welcome.

To keep this page in a non-LordHelmchen perspective, he will usually restrict his edits to How-To_sitescriptTemplate#updating sitescripts to new library/template version. Make sure you read the sections appropriate for the LHpi version you are developing for. You could also incorporate changes mentioned in those sections into the rest of this page and then remove what is no longer needed.

Writing a new sitescript

Properties

LHpi library version

Make sure that your script defines the version of LHpi library it uses.

The property to adapt in sitescript file is: libver

Example:

libver = "2.9"

LHpi data version

Make sure that your script defines the version of LHpi data file it uses.

The property to adapt in sitescript file is: dataver

Example:dataver = "2"

sitescript revision number

Start at 1 here.

The property to adapt in sitescript file is: scriptver

Example:scriptver = "1"

(semi-optional)Name of sitescript

The first thing to do is to define the sitescript filename, so that the LHpi library knows which sitescript is being executed.
It is used to construct the log file name (if SAVELOG is true) and the default savepath (needed, for example, if you want to save the HTML pages locally for later offline processing).

The property to adapt in sitescript file is: scriptname

Example:

scriptname = "LHpi.TheWebsiteName-v".. libver .. "." .. dataver .. "." .. scriptver .. ".lua"

It is a good idea to keep the library and data versions in the filename, so that you can tell which version of the LHpi library is used without opening the sitescript file. If unset, scriptname defaults to the fallback value "LHpi.SITESCRIPT_NAME_NOT_SET-v" .. LHpi.version .. ".lua", which is probably not what you want.

See also How-To_sitescriptTemplate#multiple_copies_ot_the_same_script

Regular expression to retrieve card info

The next thing to configure is the regular expression (a Lua pattern) that LHpi uses to retrieve all card information from the HTML page.
You can find useful information about Lua patterns in the Lua reference page: [1]

The regular expression must match all of one card's price info from the HTML page, so that each result contains (at least) the card name (or names, if multiple localized names are given in one entry) and price, and also any other information that is given for the card, such as foil status and language (if this information is not constant throughout the source file).
Separating all this information into its parts is done later via site.ParseHtmlData. Having all localized names is useful to set the card language semi-automatically. If only one common entry is given for both foil and nonfoil price, you will need to apply some additional tricks later.

To test your regular expression, you can use your browser to save a set's HTML page from the website locally.
Open that HTML page in Notepad++ and use the Find function (CTRL + F). In the Find window, under Search Mode at the bottom left, you have the choice "Regular expression".
Select that option and try your regular expression. Each time you click the Find Next button, Notepad++ should highlight all information of ONE card, with a different card highlighted at each click. Keep in mind that Notepad++ does not use Lua pattern syntax: for example, Lua's .- corresponds to .*? and %d to \d, so translate the pattern before testing it there.
Alternatively, continue writing the absolute minimum required for a runnable sitescript, set DEBUG to true and DEBUGSKIPFOUND to false, run the sitescript and compare the log with the source data.

The property to adapt in sitescript file is: site.regex

Example: site.regex = "(<div.-name=\".-\".-foil=\"%d\".-class=\"price\">.-</span>)"
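
To try a pattern outside Magic Album, a minimal sketch along the following lines can be run in a standalone Lua interpreter. The HTML snippet is invented for illustration and the loop is only an approximation of what a matching pass looks like; LHpi's own processing may differ.

```lua
-- Hypothetical HTML snippet; real pages will look different.
local html = [[
<div name="Elspeth, Sun's Champion" foil="0"><span class="price">12,50 EUR</span></div>
<div name="Elspeth, Sun's Champion" foil="1"><span class="price">25,00 EUR</span></div>
]]

-- The example pattern from above.
local regex = "(<div.-name=\".-\".-foil=\"%d\".-class=\"price\">.-</span>)"

-- Collect every match; each one should cover exactly ONE card entry.
local found = {}
for entry in string.gmatch(html, regex) do
  found[#found + 1] = entry
end

for i, entry in ipairs(found) do
  print(i, entry)
end
```

If the number of printed entries differs from the number of card entries in the page, the pattern is either too greedy or too strict.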

(Optional) Currency

You can also define currency used by your website when different from the dollar '$'. Currently, this information is not used, but it might be useful later. Once it is, it will default to "$" if unset.

The property to adapt in sitescript file is: site.currency

Example: site.currency="€"

(Optional) Regular expression about expected results

If the website displays information about the set, for example the number of cards, you can define a regular expression to retrieve this information.
LHpi will log it together with the number of cards it is about to set prices for. This is useful for manually checking that the number of matched card entries equals the number of cards claimed in the html source file, to make sure your regex finds all entries.

The property to adapt in sitescript file is: site.resultregex

By default, property is not set.
When property is not defined, the corresponding log line will be skipped.
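
As an illustration only, the following sketch shows what such a pattern could look like. Both the pattern and the page text are hypothetical, and how exactly LHpi reports the match in its log is not covered here.

```lua
-- Hypothetical pattern and page text; adapt both to the real website.
local resultregex = "Showing (%d+) cards"
local html = "<p class=\"summary\">Showing 249 cards</p>"

-- string.match returns the first capture when the pattern matches.
local claimed = tonumber(string.match(html, resultregex))
print("site claims " .. tostring(claimed) .. " cards")
```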

(Optional) Site Encoding

When the website uses a specific encoding (e.g. UTF-8), you can define it in this property.
A correct encoding definition is needed in order for LHpi to process card data correctly and make sure that the price can be imported by MA. Both LHpi and MA expect strings to be utf-8 encoded.
The names returned from site.ParseHtmlData are converted by LHpi.LHpi.Toutf8. Currently, only "cp1252" actually does any converting, while "utf8" or "utf-8" intentionally returns the string as it was.
If you need it, you can call LHpi.LHpi.Toutf8 with other strings from your sitescript.

The property to adapt in sitescript file is: site.encoding

Example: site.encoding="UTF8"
By default, the property is not set and default encoding used is "cp1252"

Site languages

The language of cards supported by the website should be defined here.

By default, English language is defined but you can add more languages when supported.

The property to adapt in sitescript file is: site.langs

Example:

 site.langs = {
      [1] = {id=1, full = "English", abbr="ENG",  url="eng" },
      [4] = {id=4, full = "French",  abbr="FRA",  url="fra" }
 }

The format of each entry is:

[<langid>] = { id=<langid>, full="<Fullname>", abbr="<AbbrName>", url="<ParameterValue>" }

where

  • <langid> is the ID of language as defined by Magic Album in file \Database\Languages.txt
  • <Fullname> is the full name of the language
  • <AbbrName> is the abbreviated name of the language
  • <ParameterValue> is the value that can be appended to the website url when retrieving cards in that language
    • this field is only used in site.BuildUrl, so you can define it any way you need.
2.9 and above

site.langs was shortened, and it is now recommended to read site-independent fields from LHpi.Data.languages[<langid>] instead. Example:

 site.langs = {
      [1] = {id=1, url="eng" },
      [4] = {id=4, url="fra" }
 }
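
A minimal sketch of reading the site-independent fields from the library data instead of site.langs. The LHpi table below is only a stand-in so that the snippet runs on its own, and the field names full and abbr are an assumption (mirroring the older site.langs layout); check LHpi.Data for the actual fields.

```lua
-- Stand-in for the library table; in a real sitescript LHpi.Data comes from the library.
-- The field names "full" and "abbr" are assumed, not confirmed.
local LHpi = { Data = { languages = {
  [1] = { id = 1, full = "English", abbr = "ENG" },
  [4] = { id = 4, full = "French",  abbr = "FRA" },
} } }

local site = {}
site.langs = {
  [1] = { id = 1, url = "eng" },
  [4] = { id = 4, url = "fra" },
}

-- Combine the site-specific url value with the site-independent names.
local function describeLang(langid)
  local data = LHpi.Data.languages[langid]
  return data.full .. " (" .. data.abbr .. "), url value: " .. site.langs[langid].url
end

print(describeLang(4))
```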

FRUC - Foil Rare Uncommon Common table

This property defines all types of cards available on the website. The table is useful when card information is available at different urls depending on rarity/foilage.

The property to adapt in sitescript file is: site.frucs

2.7 and below

You can use the frucs' strings as url infixes in site.BuildUrl.
Important note: The library assumes fruc[1] to be foil and all other frucs to be nonfoil. Thus, when you choose to import only one type of foilage in MA's price manager dialog, all non-applicable frucs will not be imported. If the site gives foil and nonfoil prices on the same page, you still need to define two frucs; have site.BuildUrl return the same url for both. Duplicate urls for the same set will be neither downloaded nor processed twice.

Example: site.frucs = { "foil", "regular" }

2.8 and above

The rarity categories defined in this table contain booleans to have each fruc identify itself as foil and/or nonfoil. They are used to determine whether a category should be imported, according to the setting chosen in MA's "Manage Prices" dialog. The fruc table also contains url infixes you can use in site.BuildUrl, similar to the url fields in site.sets and site.langs.

Examples:
site has a separate page for foil and nonfoil prices:

site.frucs = {
  [1]= { id=1, name="Foil"   , isfoil=true , isnonfoil=false, url="foil" },
  [2]= { id=2, name="nonFoil"   , isfoil=false, isnonfoil=true , url="regular" },
}

site has only a single page per set, with both foil and nonfoil prices:

site.frucs = {
  [1]= { id=1, name="nonfoil+Foil"   , isfoil=true , isnonfoil=true, url="" },
}

site has a separate page for foils, three pages for nonfoil cards of different rarities, and a page with foil and nonfoil prices for Timespiral-Timeshifted cards:

site.frucs = {
  [1]= { id=1, name="Foil"   , isfoil=true , isnonfoil=false, url="foil" },
  [2]= { id=2, name="Rare"   , isfoil=false, isnonfoil=true , url="rare" },
  [3]= { id=3, name="Uncommon", isfoil=false, isnonfoil=true , url="uncommon" },
  [4]= { id=4, name="Common"   , isfoil=false, isnonfoil=true , url="common" },
  [5]= { id=5, name="Purple"   , isfoil=true, isnonfoil=true , url="timeshifted" },
}
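
To illustrate how the isfoil/isnonfoil booleans can drive the selection, here is a rough sketch. The function selectFrucs and its wantFoil/wantNonfoil arguments are hypothetical stand-ins for the user's choice in MA; the library's actual selection code may look different.

```lua
local frucs = {
  [1] = { id = 1, name = "Foil",     isfoil = true,  isnonfoil = false, url = "foil" },
  [2] = { id = 2, name = "Rare",     isfoil = false, isnonfoil = true,  url = "rare" },
  [3] = { id = 3, name = "Uncommon", isfoil = false, isnonfoil = true,  url = "uncommon" },
  [4] = { id = 4, name = "Common",   isfoil = false, isnonfoil = true,  url = "common" },
  [5] = { id = 5, name = "Purple",   isfoil = true,  isnonfoil = true,  url = "timeshifted" },
}

-- Hypothetical helper: keep every fruc that can contain a wanted kind of price.
local function selectFrucs(frucTable, wantFoil, wantNonfoil)
  local selected = {}
  for _, fruc in ipairs(frucTable) do
    if (wantFoil and fruc.isfoil) or (wantNonfoil and fruc.isnonfoil) then
      selected[#selected + 1] = fruc.name
    end
  end
  return selected
end

print(table.concat(selectFrucs(frucs, true, false), ", "))  -- foil prices only
print(table.concat(selectFrucs(frucs, false, true), ", "))  -- nonfoil prices only
```

Note that the "Purple" fruc is selected in both cases, because its page carries both kinds of prices.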

Site sets

The most important property of the sitescript.

The property to adapt in sitescript file is: site.sets

It defines the mapping between Magic Album (MA) sets and the sets supported by the website.
It tells the LHpi library that the MA set with id xxx is available on the website with foil/regular prices, in which language(s), and at which url suffix.

Here is the format of one set mapping:

[<setid>]={id = <setid>, lang = { <eng avail>, [<langid>]=<lang avail>}, fruc = { <foil avail>, <regular avail> }, url = "<url suffix>"},

where

  • <setid> is the id of the set as defined by MA. See file \Database\Sets.txt
  • <eng avail> is a boolean indicating whether the price information is available for cards in ENGLISH language or not
  • <langid> is the id of a language as defined by MA. See file \Database\Languages.txt
  • <lang avail> is a boolean indicating if the price information is available for cards in language defined by <langid> (previous marker)

NOTE: <langid> and <lang avail> markers are used together when website supports card information in language different than English. If no other language is available, only <eng avail> marker needs to be set.

  • <foil avail> is a boolean indicating whether price of foil cards is available or not
  • <regular avail> is a boolean indicating whether price of regular cards is available or not.

NOTE: You can have more booleans in the fruc array. The number of booleans depends on the FRUC property you defined previously. If that property is an array of 4 frucs (e.g. {"foil", "rare", "uncommon", "common"}), then the fruc property in the set mapping would have 4 booleans: {true, true, true, true}, each one defining whether the price of the corresponding fruc is available or not.

  • <url suffix> is the suffix added to the website url in order to retrieve price of cards for the current set.
    • this field is only used in site.BuildUrl, so you can define it any way you need.

Example:

[800]={id = 800, lang = { true, [4]=true}, fruc = { true, true }, url = "1966"}, -- Theros

  • 800 is the id of Theros set defined by MA
  • Both English and French languages are available on the website for card prices
  • Both foil and regular prices are available on the website. Assume that site.frucs is defined like {"foil", "regular"}
  • The id of the Theros set on website is 1966. That id (=suffix) will be appended to the url when retrieving card prices of Theros.
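
The lang field mixes a positional entry (index 1, English) with explicit keys, which is easy to misread. The following sketch simply lists which language ids are flagged as available for a set entry; availableLangs is a hypothetical helper written for this page, not a library function.

```lua
local sets = {
  [800] = { id = 800, lang = { true, [4] = true }, fruc = { true, true }, url = "1966" }, -- Theros
}

-- Hypothetical helper: collect the langids whose availability flag is true.
local function availableLangs(set)
  local ids = {}
  for langid, avail in pairs(set.lang) do
    if avail then
      ids[#ids + 1] = langid
    end
  end
  table.sort(ids)
  return ids
end

print(table.concat(availableLangs(sets[800]), ", "))
```

Here { true, [4]=true } is the same table as { [1]=true, [4]=true }, so English (1) and French (4) are reported.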

(Optional)Card name replacement

Sometimes, the card name given by the website does not match the card name defined in MA.
The two names must match for the price to be set; this property provides the mapping.

The property to adapt in sitescript file is: site.namereplace

The format is:

 {
  [<setid>] = { 
     ["<website cardname 1>"]	= "<MA cardname 1>",
     ["<website cardname 2>"]	= "<MA cardname 2>"
  }
 }

where

  • <setid> is the id of the set defined by MA. See file \Database\Sets.txt.
  • <website cardname> is the name of the card as defined by the website
  • <MA cardname> is the name of the corresponding card defined by MA

Note:

  • A set can have more than one namereplacement mapping (cardname 1, cardname 2, ...)
  • site.namereplace can have more than one set replacement definition
using namereplace instead of variant/foiltweak tables

If you want to make use of the default variant and foiltweak tables defined in LHpi.Data, but the card names of the site do not match the expected names (and are not automatically converted to a matching format by LHpi.BuildCardData), you can also use namereplacement for this.
Example:

 {
  [786] = { -- Avacyn Restored
   ["Spirit Token (White)"]				= "Spirit (3)",
   ["Spirit Token (Blue)"]				= "Spirit (4)",
   ["Human Token (White)"]				= "Human (2)",
   ["Human Token (Red)"]				= "Human (7)",
  },
 }
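
Conceptually, a name replacement is a plain table lookup keyed by set id and website card name. The sketch below shows that lookup in isolation; replaceName is a hypothetical helper, and the order in which LHpi applies replacements relative to its other processing is not shown.

```lua
local namereplace = {
  [786] = { -- Avacyn Restored
    ["Spirit Token (White)"] = "Spirit (3)",
    ["Spirit Token (Blue)"]  = "Spirit (4)",
    ["Human Token (White)"]  = "Human (2)",
    ["Human Token (Red)"]    = "Human (7)",
  },
}

-- Hypothetical helper: returns the (possibly replaced) name and whether a replacement happened.
local function replaceName(setid, name)
  local setTable = namereplace[setid]
  if setTable and setTable[name] then
    return setTable[name], true
  end
  return name, false
end

print(replaceName(786, "Human Token (Red)"))
print(replaceName(786, "Angel Token"))
```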

(Optional)Variants

Some sets have variants of the same card, for example basic lands.
The variants property defines them and maps the card name given by the website to its variant in MA.

The property to adapt in sitescript file is: site.variants

The format is: { [<setid>] = { ["<website cardname>"] = { "<MA cardname>", { <variants name> } }, ... } ... }

where

  • <setid> is the id of the set defined by MA. See file \Database\Sets.txt.
  • <website cardname> is the variant name of the card as defined by the website
  • <MA cardname> is the name of the corresponding card defined by MA
  • <variants name> is the array of variant names as defined by MA. Usually, it is { 1, 2, 3, 4 }; see the default variant tables in LHpi.Data for other examples.

Example:

[800] = { -- Theros
  ["Plains"] 						= { "Plains"	, { 1    , 2    , 3    , 4     } },
  ["Island"] 						= { "Island" 	, { 1    , 2    , 3    , 4     } },
  ["Swamp 1"] 						= { "Swamp"		, { 1    , false, false, false } },
  ["Swamp 2"] 						= { "Swamp"		, { false, 2    , false, false } },
  ["Swamp 3"] 						= { "Swamp"		, { false, false, 3    , false } },
  ["Swamp 4"] 						= { "Swamp"		, { false, false, false, 4     } },
  ["Mountain"] 					= { "Mountain"	, { 1    , 2    , 3    , 4     } },
  ["Forest"] 						= { "Forest" 	, { 1    , 2    , 3    , 4     } }
}

The website uses the same name for all Plains, Island, Mountain and Forest variants, so the price for each of these cards will be identical for all variants.
However, the website does distinguish between the variants of Swamp. So "Swamp 1" is mapped to variant 1 of Swamp in MA, "Swamp 2" is mapped to variant 2, ...
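
To make the meaning of the false entries concrete, the following sketch resolves a website card name to the MA card name and the list of variants that would receive the price. The helper variantTargets is hypothetical and only mirrors the table layout described above; it is not how the library itself is implemented.

```lua
local variants = {
  [800] = { -- Theros (excerpt)
    ["Plains"]  = { "Plains", { 1, 2, 3, 4 } },
    ["Swamp 2"] = { "Swamp",  { false, 2, false, false } },
  },
}

-- Hypothetical helper: returns the MA card name and the variants that are not false.
local function variantTargets(setid, siteName)
  local entry = variants[setid] and variants[setid][siteName]
  if not entry then
    return siteName, {}
  end
  local active = {}
  for i = 1, #entry[2] do
    if entry[2][i] then
      active[#active + 1] = entry[2][i]
    end
  end
  return entry[1], active
end

local name, targets = variantTargets(800, "Swamp 2")
print(name, table.concat(targets, ","))
```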

Please note that defining a set variant in sitescript will override all default variants defined by LHpi library for that set. So make sure to define all variants in your sitescript to avoid losing variants previously defined by LHpi. Alternatively, use a namereplacement table to set the card names to the ones expected by the default variant table. In sets that have a collector number, the default variant table defined by LHpi.Data uses "name (collector number)" to denote variants, and "name" for "all variants". Sets without collector numbers use the variant name instead of the collector number.

(Optional)Foil tweak

This is similar to the namereplace table and can be used to set specific cards to foil or nonfoil explicitly.

The property to adapt in sitescript file is: site.foiltweak

The format is:

{ [<setid>] = { ["<website cardname 1>"] = { foil = <foilstatus> }, ["<website cardname 2>"] = { foil = <foilstatus> } ... } ... }

where

  • <setid> is the id of the set defined by MA. See file \Database\Sets.txt.
  • <website cardname> is the name of the card as defined by the website
  • <foilstatus> is a boolean, true for foil.

Please note that defining a set foiltweak in the sitescript will override all default foiltweaks defined by the LHpi library for that set. So make sure to define all foiltweaks in your sitescript to avoid losing foiltweaks previously defined by LHpi. Alternatively, use a namereplacement table to set the card names to the ones expected by the default foiltweak table.
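
As an illustration of the format, here is a sketch with placeholder card names (they are not real entries) together with a hypothetical lookup helper. It only shows the table shape and the intended effect; the library's own handling may differ in detail.

```lua
local foiltweak = {
  [800] = { -- Theros; the card names below are placeholders
    ["Example Prerelease Card"] = { foil = true },
    ["Example Intro Pack Card"] = { foil = false },
  },
}

-- Hypothetical helper: a tweak entry overrides the foil status detected so far.
local function applyFoiltweak(setid, name, foil)
  local tweak = foiltweak[setid] and foiltweak[setid][name]
  if tweak ~= nil then
    return tweak.foil
  end
  return foil
end

print(applyFoiltweak(800, "Example Prerelease Card", false))
```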

(Optional)Expected results

When you know exactly what LHpi should retrieve from the website, you can define these expectations in this property.

The property to adapt in sitescript file is: site.expected

The format is:

{
  [<setid>] = {  pset={ <nb of card in ENG>, [<langid>]=<nb of card in lang> }, failed={ <nb of failed in ENG>, [<langid>]=<nb of failed in lang> }, 
                 dropped=<nb of dropped card>, namereplaced=<nb of card with name replacement>, foiltweaked=<nb of card foil tweaked> },
}

where

  • <setid> is the id of the set defined by MA. See file \Database\Sets.txt.
  • <nb of card in ENG> is the total number of cards expected for the set, in English language.
  • <langid> is the id of a language as defined by MA. See file \Database\Languages.txt.
  • <nb of card in lang> is the total number of cards expected for the set, in the corresponding language.
  • <nb of failed in ENG> is the total number of cards that LHpi failed to import for the set, in English language.
  • <nb of failed in lang> is the total number of cards that LHpi failed to import for the set, in the corresponding language.
  • <nb of dropped card> is the total number of cards that LHpi dropped for the set, all languages combined.
  • <nb of card with name replacement> is the total number of cards for which LHpi did a name replacement, all languages combined.
  • <nb of card foil tweaked> is the total number of cards for which LHpi did a foiltweak replacement, all languages combined.


If undefined, for each set that is both supported by the sitescript and chosen to be imported, pset[<langid>] defaults to the number of cards in the set (as supplied by LHpi.Data.sets[<setid>].cardcount) for each (supported & selected) language, while failed[<langid>], dropped, namereplaced and foiltweaked each default to 0. Thus, you only need to define expectations for sets where anything special happens.

Example:

{
  [788] = { pset={ 249+11, nil, 249 }, failed={ 0, nil, 11 }, dropped=0, namereplaced=1, foiltweaked=0 }, -- M2013
}

Please note that expectations are only checked if the LHpi property CHECKEXPECTED is set to true.

EXPECTTOKENS

The site.expected.EXPECTTOKENS option can be used to control how expectations work for sets without explicitly set expectations:
#boolean EXPECTTOKENS false: pset defaults to regular, true: pset defaults to regular+tokens

LHpi properties

You can customize how LHpi works or reports events during price imports with the help of Global properties.

  • VERBOSE

Controls the amount of feedback/logging done by LHpi.
If unset, defaults to true.

  • LOGDROPS

Controls whether dropped cards are logged or not by LHpi.
If unset, defaults to false.

  • LOGNAMEREPLACE

Controls whether name replacement of cards are logged or not by LHpi.
If unset, defaults to false.

  • LOGFOILTWEAK new in 2.7

Controls whether changing the foil status of cards are logged or not by LHpi.
If unset, defaults to false.

  • CHECKEXPECTED

Controls whether to check the import counters kept by LHpi against the expected counter values defined in site.expected. With library 2.8 and above, this only checks set and failed prices. See Site Expected results property and STRICTCHECKEXPECTED below.
If unset, defaults to true.

  • STRICTCHECKEXPECTED new in 2.8

Controls whether to complain if drop,namereplace or foiltweak count differs. Only has any effect if CHECKEXPECTED==true. See Site Expected results property.
If unset, defaults to false.

  • DEBUG

When true, LHpi logs everything (which is more than VERBOSE, but still honouring LOGDROPS and LOGNAMEREPLACE) and exits in case of errors (instead of continuing as best as it can). If unset, defaults to false.

  • DEBUGSKIPFOUND

While DEBUG is true, controls whether LHpi skips logging the raw html data found by site.regex. Set this to false and check the log to debug your site.regex. If unset, defaults to true.

  • DEBUGVARIANTS

Enable DEBUG when card variants are encountered by LHpi. This probably only needs to be true if you change the library code that deals with variants.
If unset, defaults to false.

  • OFFLINE

Controls whether LHpi reads source data only from the local directory ('savepath' property) or from the site url. Use this to import old data or to save yourself and the website some bandwidth while debugging your sitescript.
If unset, defaults to false.

  • SAVEHTML

Controls whether LHpi saves a local copy of each source html to 'savepath' when not in OFFLINE mode.
When property is activated, the local directory where LHpi saves data must be writable, otherwise LHpi will just disable SAVEHTML and note it in the log.
If unset, defaults to false.

  • SAVELOG

Controls whether LHpi logs to a separate logfile (true) or to Magic Album.log (false).
If unset, defaults to true.

  • SAVETABLE

Controls whether LHpi saves prices into a file (in 'savepath') before importing to Magic Album. As with SAVEHTML, this will be set to false if savepath is not writable.
If unset, defaults to false.

  • savepath

Name of an existing directory, in the \Prices folder, where LHpi reads (OFFLINE) or writes (SAVEHTML) source html data when the corresponding properties are activated.
If this property is not set and LHpi needs it (OFFLINE or SAVEHTML activated), the savepath defaults to the scriptname without versioning information.
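
As an example of how these options combine, the following settings are one plausible configuration while developing and debugging a new site.regex. The option names and meanings are the ones listed above; the particular combination is only a suggestion.

```lua
-- Possible settings while debugging a new sitescript (a suggestion, not a requirement).
VERBOSE = true           -- keep the normal feedback
LOGDROPS = true          -- log dropped cards
LOGNAMEREPLACE = true    -- log name replacements
LOGFOILTWEAK = true      -- log foil status changes (2.7 and above)
CHECKEXPECTED = true     -- compare counters against site.expected
DEBUG = true             -- log everything and stop on errors
DEBUGSKIPFOUND = false   -- do log the raw data found by site.regex
OFFLINE = false          -- fetch from the website ...
SAVEHTML = true          -- ... and keep a local copy for later OFFLINE runs
SAVELOG = true           -- write to a separate logfile
```

Once a local copy of the pages has been saved, switching OFFLINE to true avoids repeated downloads during further debugging.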

Functions

ImportPrice ( importfoil, importlangs, importsets )

This is the main function, called by Magic Album to import the prices of the selected sets for the selected languages, in regular and/or foil quality.
It is the entry point where LHpi loads its Lua library file and starts the import magic.

Do not change anything in this function.

site.BuildUrl ( setid, langid, frucid, offline )

The purpose of this function is to construct the url where LHpi will retrieve HTML data containing card information about a specific set, language and FRUC.

Parameters of the function are:

  • setid: ID of the set for which url is constructed
  • langid: ID of the language to import
  • frucid: ID of the FRUC to import
  • offline: Boolean flag indicating whether the source data is retrieved from a local file or from the internet

The content of the function already present in the sitescript.lua file can be used as an example.
What you should adapt for your website are the following properties in the function:

site.domain = 'www.example.com/prices/'
The domain of the website where you find the prices. Without 'http' prefix.


site.file = 'set.php?context=magic'
The name of the file appended to the domain, and containing HTML parameters which do not concern the setid, langid nor fruc.


site.setprefix = "&set="
The URL parameter defining the ID of the set on the website.
The corresponding value of the parameter is retrieved from "site.sets[<setid>].url" property.


site.langprefix = "&lang="
The URL parameter defining the ID of the language on the website.
The corresponding value of the parameter is retrieved from "site.langs[<langid>].url" property.


site.frucprefix = "&fruc="
The URL parameter defining the ID of the FRUC on the website.
The corresponding value of the parameter is retrieved from the "site.frucs[<frucid>].url" property (with library 2.7 and below, from the string "site.frucs[<frucid>]" itself).


site.suffix = ""
Any url suffix that can be appended to the url.


The next property in the function you can adapt, if for example some previous parameters are not required, is the url.
In the function, you will find a line like:

local url = site.domain .. site.file .. site.setprefix .. site.sets[setid].url .. site.langprefix .. site.langs[langid].url .. site.frucprefix .. site.frucs[frucid].url
which uses all properties previously defined in the function.
If some of them are not needed by the website, just remove them from the url construction.

The rest of the function should not be adapted, except if you know what you're doing.
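
The url construction can be tried on its own with a sketch like the one below. It uses the example values from this page and a hypothetical helper buildUrlString; the table that the real site.BuildUrl returns around the url (with fields such as isfile and foilonly) is deliberately left out, so copy that part from the template.

```lua
local site = {}
site.sets  = { [800] = { id = 800, url = "1966" } }
site.langs = { [1] = { id = 1, url = "eng" } }
site.frucs = { [1] = { id = 1, name = "Foil", isfoil = true, isnonfoil = false, url = "foil" } }

site.domain     = 'www.example.com/prices/'
site.file       = 'set.php?context=magic'
site.setprefix  = "&set="
site.langprefix = "&lang="
site.frucprefix = "&fruc="
site.suffix     = ""

-- Hypothetical helper: only the string concatenation, nothing else.
local function buildUrlString(setid, langid, frucid)
  return site.domain .. site.file
    .. site.setprefix  .. site.sets[setid].url
    .. site.langprefix .. site.langs[langid].url
    .. site.frucprefix .. site.frucs[frucid].url
    .. site.suffix
end

print(buildUrlString(800, 1, 1))
```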

2.8

site.BuildUrl can potentially return the same url twice, with different url.foilonly; only the first of the duplicate urls is kept. Therefore, LHpi does not query url.foilonly anymore. If you know this does not happen with your site.BuildUrl and want to query url.foilonly, do it in site.ParseHtmlData.

site.ParseHtmlData ( foundstring, urldetails )

The purpose of the function is to extract card information (name, price, ...) from each entry found by the site.regex property in the HTML page.

Parameters of the function are:

foundstring
The card entry found by site.regex in the source HTML page. This entry should match the information of ONE card.
If you know that the HTML page has more than one entry for the same card (e.g. one entry for regular price, one entry for foil price),
process only ONE entry at a time and set the correct properties on the card returned by the function.
urldetails
Table containing some information about the currently processed url.
The information is the following: { foilonly = #boolean , isfile = #boolean , setid = #number, langid = #number, frucid = #number }
Properties foilonly and isfile are the ones coming from the site.BuildUrl function.

You must first retrieve card information from foundstring with the help of regular expressions or other mechanisms that you know.
For example:

 local _start,_end,name = string.find(foundstring, 'cardname=\"(.-)\"' )
 local _start,_end,price = string.find( foundstring , 'class="price">([%d.,]+) .-</span>' )

The price retrieved from the html source usually contains decimal (and possibly thousands) separators. For LHpi to support different separator conventions ( 1.000,00 vs 1,000.00 ), it is important that the price you return does not contain any.
For example, if the price on the page is '12.50' or '12,50' (12 euros/dollars/<anything else> and 50 cents), the price you should return is 1250.
LHpi divides the returned value by 100 to recover the actual amount, so this assumes that the source price carries exactly two decimal places. To make sure LHpi does not attempt to divide a string, please do an explicit tonumber conversion.
A good way to do that is:

  price = string.gsub( price , "[,.]" , "" ) -- Remove all ',' or '.' characters from the price
  price = tonumber( price ) -- Convert price String to Number
 

Once you're done, you can construct the actual newCard returned by the function with following code:

  local newCard = { names = { [urldetails.langid] = name }, price = { [urldetails.langid] = price } [, <any other properties you can find>] }
 

names and price can contain entries for multiple languages, but the langids that contain information should match. Note the marker <any other properties you can find>: in the newCard table, you can define properties in addition to the required names and price.
This information can afterwards be used in the functions site.BCDpluginPre and/or site.BCDpluginPost (explained later).

These extra properties must be defined in a table and added to newCard under pluginData property.
Example:

  local newCard = { names = { ... }, price = { ... } , pluginData={ propA=ValueA, propB=ValueB ... } }
 

In addition to the required names and price properties, you can also set specific properties on the card that will be handled by LHpi. For each of these properties, if it is already defined by site.ParseHtmlData, LHpi.BuildCardData will keep the predefined value and skip the automatic detection and processing:

name
this, together with the setid, is the primary key (unique identifier) of the card, as far as LHpi is concerned. Multiple entries of the same card will be collected, for example to associate different foilage, language and/or variant prices. This name must be identical to the oracle name in MA (alternatively, the localized name if the LHpi card dataset is for only one language), or through namereplacement and other processing by BuildCardData, become such (this means that for example "Forest Nr. 247 (fOiL)" would still be valid at this point). See LHpi.BuildCardData source for details or experiment. You can also set name to something unique to make sure this card triggers your namereplace,variant or foiltweak tables.
foil
a boolean that, if true, marks the card and its price as being foil. It's probably a good idea to set this explicitly if the card name does not include foilage information. Still subject to change by foiltweak table entries.
drop
a boolean value that if true results in LHpi skipping further processing of this card entry and not adding a price to be imported. Compare also different methods of #dropping cards
lang
a table of the form { <langid>=<langabbr>, ... }, where langid is a number corresponding to site.langs, and langabbr the matching abbreviation. It should only be set if the card contains a price for this language (in the price field, which will then be applied to all languages set here; or predefined by regprice/foilprice below).
variant
to override the variants defined in 'site.variants' for the setid and cardname.
regprice
a table of the form { <langid>=<price>, ... }, where langid again corresponds to site.langs, and price is the price of the nonfoil card in this language. If variants is defined, <price> instead is a subtable of the form { <varname>=<price>, ... }, where varname is a string that must correspond to a variant name that exists for this card.
foilprice
a table of the form { <langid>=<price>, ... }, where langid again corresponds to site.langs, and price is the price of the foil card in this language. If variants is defined, <price> instead is a subtable of the form { <varname>=<price>, ... }, where varname is a string that must correspond to a variant name that exists for this card.
2.8 and above

To allow returning multiple cards, GetSourceData now expects site.ParseHtmlData to return a table of cards. In most cases, you'll parse a single card from each foundstring, so just wrap your parsed card in a container table:

 local newCard
 ...
 return { newCard }
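
Putting the pieces of this section together, a minimal site.ParseHtmlData for library 2.8 and above could look like the following sketch. The HTML layout (attributes name and foil, a span with class price) is invented to fit the example site.regex, and the sketch assumes that every foundstring really contains a name and a price; a real sitescript should handle entries where a match fails.

```lua
local site = {}

function site.ParseHtmlData(foundstring, urldetails)
  -- Extract the raw fields; the attribute names are assumptions about the page layout.
  local _, _, name     = string.find(foundstring, 'name="(.-)"')
  local _, _, price    = string.find(foundstring, 'class="price">([%d.,]+) .-</span>')
  local _, _, foilflag = string.find(foundstring, 'foil="(%d)"')

  -- Strip separators, then convert: "25,00" becomes the number 2500.
  price = string.gsub(price, "[,.]", "")
  price = tonumber(price)

  local newCard = {
    names = { [urldetails.langid] = name },
    price = { [urldetails.langid] = price },
    foil  = (foilflag == "1"),
  }
  return { newCard } -- 2.8 and above: return a table of cards
end

-- Usage with a hypothetical entry and urldetails table:
local entry = '<div name="Elspeth, Sun\'s Champion" foil="1"><span class="price">25,00 EUR</span>'
local cards = site.ParseHtmlData(entry, { setid = 800, langid = 1, frucid = 1 })
print(cards[1].names[1], cards[1].price[1], cards[1].foil)
```

The result of string.gsub is assigned to a variable before calling tonumber because gsub returns a second value (the replacement count), which tonumber would otherwise interpret as a numeric base.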
 

preferred way to supply data
  1. Preferably, BuildCardData can process all needed information from newCard={ names,price }.
  2. The next best solution is to also set newCard.foil, but otherwise still let BuildCardData, well, build the card data :-)
  3. If the library does not do all that is needed, help out in the two BuildCardData plugin functions. You can supply additional data to them that LHpi.BuildCardData will not touch via card.pluginData .
  4. Setting other card properties will make BuildCardData skip its processing for this property, so it is probably only a last ditch solution.
  5. Once you set newCard.variants or either newCard.regprice or newCard.foilprice for cards with variants, you will probably need to enable DEBUGVARIANTS and read the logfile to see why it fails ;-) .
how BuildCardData uses newCard
  • how the card language is actually set:
    1. newCard.names is checked, in ascending langid order, and the first encountered nonempty name is used as card.name .
    2. newCard.names is checked for nonempty names and, if the corresponding language is set to be imported, card.lang is set.
    3. for each card.lang that is not nil/false, either card.regprice[langid] or card.foilprice[langid] is set to newCard.price .
  • how the card foilage is actually set:
    1. LHpi.Data.sets[setid].foilonly is queried
    2. card.name is checked for occurrence of a "(foil)"-like substring
  • TODO should other details of BuildCardData processing be given here, or does the LHpi library's source suffice for the more interested developers?

(Optional)site.BCDpluginPre ( card , setid )

Function for special-case card data manipulation.
Ties into LHpi.BuildCardData to make changes that are specific to one site and thus don't belong in the library.
This plugin function is called before most of LHpi's BuildCardData processing. At this point, card.name and card.lang have already been processed and set, and cards that were set as drop==true have already been dropped.

Parameters:

card
The card currently being processed, as returned by site.ParseHtmlData and then initialized by LHpi.
a table { name=<name>, pluginData=<customdata>, names={ [<langid>]=<localname>, ...}, lang={ [<langid>]=<langabbr>, ...}, <ParseHtmlData-presets> }
setid
ID of the set currently processed.
Return card
modified card for further processing.

An example of such a function is one that replaces characters in all card names with an alternative.
For example:

  card.name = string.gsub( card.name , "AE" , "Æ")
  card.name = string.gsub( card.name , "Ae" , "Æ") 
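
Wrapped into a complete plugin function, the two lines above could look like the sketch below. The `site = site or {}` line and the setid value 0 are only there to make the snippet run on its own; in a real sitescript the site table already exists. The two extra parameters are the ones passed by library 2.8 and above and are simply nil with older libraries.

```lua
-- Assumption for this standalone sketch: create the site table if missing.
site = site or {}

function site.BCDpluginPre( card, setid, importfoil, importlangs )
  -- string.gsub returns two values (string and count);
  -- the assignment keeps only the string.
  card.name = string.gsub( card.name, "AE", "Æ" )
  card.name = string.gsub( card.name, "Ae", "Æ" )
  return card
end

-- hypothetical test card and set id
local card = site.BCDpluginPre( { name = "AEther Vial" }, 0 )
print( card.name )
```

Note that a blanket replacement like this also hits names where "Ae" is not a ligature, so you may want to restrict it to specific sets or names.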
 

2.8 and above

BCDpluginPre and BCDpluginPost now have two additional parameters passed from LHpi.BuildCardData: importfoil and importlangs. There's currently no example where they are used, but as the information is accessible to LHpi.CardData, it should probably be available to the plugins as well.

(Optional)site.BCDpluginPost( card , setid )

Function for special-case card data manipulation.
Ties into LHpi.BuildCardData to make changes that are specific to one site and thus don't belong in the library.
This plugin function is called after LHpi's BuildCardData processing (and is probably not needed). At this point, LHpi.BuildCardData has finished its processing; card.foil and card.pluginData are still present for use, but are no longer needed by LHpi.

Parameters:

card
The card currently processed by LHpi.
TODO: complete card table anatomy
setid
ID of the set currently processed.
Return card
modified card for further processing.

Updating sitescripts

Updating sitescripts with new set information

If you want to update a sitescript to include all sets in a given version of LHpi.Data, you can try using the scripts in update helper mode. This requires

  • library version >=2.15
  • the latest Data version (so it includes the set's data you want to add to the sitescript)
  • LHpi.dummyMA >= 0.6
  • a sitescript that responds properly to site.Initialize("update"), site.BuildUrl("list") and site.FetchExpansionList() and with site.updateFormatString matching the sitescript's site.sets table layout (As of 2015-10-08, this is true of all existing sitescripts).

Then edit the dummyMA to load the sitescript and run site.Initialize("update"). You can find the results in LHpi.log (or the sitescript's log if you enabled SAVELOG). To check how well the new sets are imported, enable VERBOSE, LOGNAMEREPLACE, LOGDROPS, LOGFOILTWEAK and STRICTEXPECTED in the sitescript and run it via Magic Album. Then check the log again.

resolving failed cards

You will probably need multiple runs of the script to determine everything, so to save time and bandwidth, I suggest the following preparation (example for LHpi.tcgplayerPriceGuide.lua):

  1. If you have not done so yet, create a $PATH_TO_MA\Prices\LHpi.tcgplayerPriceGuide\ directory
  2. edit LHpi.tcgplayerPriceGuide.lua to set SAVEHTML=true
  3. run the script to download all source html files. With dummyMA >=0.6, you don't need MA for this step anymore.
  4. edit LHpi.tcgplayerPriceGuide.lua to set OFFLINE=true (don't forget to change it back once you're done and want live data again)
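
As a sketch, the option block near the top of the sitescript would go through these two states. This assumes the options are plain global booleans, as the "set SAVEHTML=true" wording suggests; the value of OFFLINE in the first phase is an assumption.

```lua
-- Phase 1 (steps 2 and 3): fetch live data and keep a local copy of the html
SAVEHTML = true
OFFLINE = false

-- Phase 2 (step 4): work from the saved files only.
-- Uncomment once the download run has finished, and revert when you are done.
-- OFFLINE = true
```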

Now for the namereplacement

  • The target names can be found in lib\LHpi.Data-vX.lua (for cards that have variants) or in MA (Oracle name or localized name column).
  • To find the source names, set VERBOSE=true and LOGNAMEREPLACE=true (I'd set LOGDROPS,LOGFOILTWEAK,STRICTEXPECTED true as well), then run the script again and open the log.
  • find all lines that start with "! LHpi.SetPrice". Those tell you exactly which cards failed to be set, including the name LHpi sent to MA. For tokens, suffix the name with " Token" to enable LHpi to detect the object type.
  • If the site_name-to-LHpi_name mapping is not obvious, you can compare the card image on the site with the one in MA (prominent example: Basic Lands in sets without collector numbers).
  • Run the script again and observe the new "namereplaced X to Y" lines in the log, then hopefully see how the number of failed cards has decreased, while the number of namereplaced cards has increased.
  • repeat until either all cards from the source are identified or all cards in MA have received the correct price
    • If you have cards that refuse to be namereplaced and you suspect that special characters are to blame, you can try writing them as "\ddd" (where ddd is the byte value, padded with leading zeros to three digits). Lua is character encoding agnostic; every string is just a string of bytes internally.
      • set DEBUG=true
      • The log is now very wordy, but you can search for (part of) the card name.
      • At some point, there will be a byte-for-byte representation of the name in the log.
      • This is the state of the card name manipulation directly before LHpi looks for a namereplace entry (which, in turn, happens before foiltweak, variant and object type detection).
    • If you run into troubles, either try to figure it out by looking at the other sets' namereplacement and variant tables, or report back to me and I will fix the rest.
  • change site.expected to tell LHpi how many successfully set ("pset"), failed and namereplaced cards to expect.
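
If you want to turn a troublesome name into the "\ddd" form mentioned above without reading the bytes from the log by hand, a small helper like the following can do it. The function name toDecimalEscapes is made up for this sketch and is not part of LHpi; it only escapes bytes above 127 and leaves plain ASCII readable.

```lua
-- Hypothetical helper, not part of the LHpi library.
local function toDecimalEscapes( s )
  -- the outer parentheses drop gsub's second return value (the count)
  return ( string.gsub( s, "[\128-\255]", function( c )
    -- always pad to three digits, so that a digit following
    -- the escape in the name cannot be misread as part of it
    return string.format( "\\%03d", string.byte( c ) )
  end ) )
end

-- "\195\134" is the UTF-8 byte sequence for Æ
print( toDecimalEscapes( "\195\134ther Vial" ) )
```

The printed text can be pasted between quotes into a namereplace table entry.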

updating sitescripts to new library/template version

Obviously, if you intend to actually run your sitescript, you need to download the corresponding library and data file versions.
Unless otherwise noted, each library version (starting with 2.7) should be compatible with each data file version. You can for example change dataver upon release of a new card set by WotC, and keep libver at a library that you and your sitescript are comfortable with.
It is usually a very good idea to use the ImportPrice function from the template version that matches the library version, and only incorporate changes you might have made to it into the new function definition.
If you want to be nice to others that might want to use your sitescript as example or as a starting point for their own sitescript, you could verify that the luadoc comments still match the ones in the template that uses the same library version.

2.6 to 2.7
  • The new control option LOGFOILTWEAK should be added below LOGNAMEREPLACE.
  • The new configuration dataver should be added below libver.
  • the filename and scriptname configuration should include the data file version as infix between library version and sitescript revision number.
  • ImportPrice was changed to load the library from "Prices\lib" instead of "Prices". If you made no changes to this function, just copy the new version from the updated template to your sitescript. It was written in a way that still works, even if you decide to keep the library in "Prices", so you don't strictly need to create a "\lib" subfolder for the library and data file, though future documentation will assume you did.
  • due to a high amount of improved (luadoc) comments, it's probably best to either copy&paste all of your functions and tables into a copy of the new template file, or copy&paste all (luadoc) comments from the new template file into your sitescript.
2.7 to 2.8
  • You need to rewrite site.frucs to match the new format. The examples at How-To_sitescriptTemplate#site.frucs should get you started.
    • If you used site.fruc in your functions, they will need to be adapted, of course.
  • site.BuildUrl luadoc and example was changed slightly, but no change is strictly required to your existing implementation.
    • note that url.foilonly is not queried by LHpi anymore.
  • site.BCDpluginPre and site.BCDpluginPost are both passed two additional parameters now. To be able to take advantage of the new parameters, you need to change the function definition. If you do, please update the luadoc as well. I think the way Lua works makes this change not strictly necessary if you don't need access to the new parameters' contents.
  • site.ParseHtmlData now can return multiple cards. This means that you need to wrap the returned card(s) in a container table.
  • new option STRICTEXPECTED should be added above DEBUG
  • site.langs can be shortened, static fields can be read from LHpi.Data.languages[langid]
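
The container change for site.ParseHtmlData could look like the sketch below. The parameter names, the html snippet and the pattern are invented for illustration, and the container is assumed to be a plain array of card tables; check the template for the exact signature your library version expects.

```lua
-- Assumption for this standalone sketch: create the site table if missing.
site = site or {}

-- Hypothetical parameter names and pattern; adapt them to your site.
function site.ParseHtmlData( foundstring, urldetails )
  local name, price = string.match( foundstring, "<td>([^<]+)</td><td>([%d.]+)</td>" )
  local newCard = { names = { [1] = name }, price = tonumber( price ) }
  -- before 2.8: return newCard
  -- 2.8 and above: wrap the card(s) in a container table
  local container = {}
  table.insert( container, newCard )
  return container
end

local cards = site.ParseHtmlData( "<td>Forest</td><td>0.25</td>", {} )
print( #cards, cards[1].names[1], cards[1].price )
```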
2.8 to 2.9
  • please reintroduce site.langs[langid].id fields, in case they are needed in the future.
  • boolean OPTIONS ("LHpi properties") have been reordered to better fit expected end-user needs, but this change is only cosmetic.
  • new SetPrice return value handling means you will need to check and adapt your site.expected.
  • new configuration scriptver should be added above scriptname, and scriptname should probably include it.
2.8 to 2.12
  • rename DEBUGSKIPFOUND -> DEBUGFOUND
  • rename STRICTCHECKEXPECTED -> STRICTEXPECTED
  • site.variants subtables can have boolean field override. If true, default variant table from LHpi.Data is ignored, otherwise both are merged.
  • table site.expected should be wrapped in function site.SetExpected(). see template for example.
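
A rough sketch of the wrapping is shown below. The set id 999, all numbers and the per-set field layout are assumptions made for this example; copy the exact shape from the template of your library version.

```lua
-- Assumption for this standalone sketch: create the site table if missing.
site = site or {}

function site.SetExpected()
  site.expected = {
    EXPECTTOKENS = true,
    -- hypothetical set id and counts, keyed by langid where a table is used
    [999] = { pset = { [1] = 249 }, failed = { [1] = 3 }, namereplaced = 2 },
  }
end

site.SetExpected()
print( site.expected[999].namereplaced )
```

Wrapping the table in a function means it is only built when the library asks for it, instead of when the sitescript file is loaded.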
2.12 to 2.13
  • in addition to site.expected.EXPECTTOKENS, you can set the new options site.expected.EXPECTNONTRAD and site.expected.EXPECTREPL to match the default expectations to what kind of cards the site offers to reduce hardcoded numbers.
  • take note that Tokens, Replicas (Oversized Commanders), Nontraditionals (Conspiracies, Schemes, Planes) and Inserts (Double-sided checklists) need their object type to be set. In most cases this should already be done by the library, but you may need to use namereplacement or variant tables to set card type suffixes or variants.
  • If you had BuildCardDataPlugins remove "(oversized)" and similar suffixes, you should probably let the library handle those to find the right object type.
  • use Data version 5 or later, as previous variant tables did not include object type variants.
  • to debug object type merging, you can set the new option STRICTOBJTYPE.

Hints for developing

Moved here (from the main page) until they find a proper place in the howto. Feel free to reorder or move them.

just try to run it

The library will exit with an error if any non-optional sitescript properties or functions are undefined, to inform you what absolutely needs to be defined.

LHpi helper functions

Besides the functions outlined below at How-To_sitescriptTemplate#procedure of function calls, LHpi also defines a few helper functions that are used throughout and can also be used in your sitescript.

GuessFileEncoding

LHpi.GuessFileEncoding(string)
Checks the beginning of the string parameter for unicode Byte Order Marks ("BOM") and returns the most probable character encoding. If none of the three unicode BOMs is found, "cp1252" will be guessed. Was planned to be used for autodetecting site.encoding property, but did not seem reliable enough.
Returns, as string, one of "cp1252" , "utf8" , "utf16-le" (little endian), "utf16-be" (big endian)

ByteRep

LHpi.ByteRep(string)
Lua does not really care about character encoding, but treats every string as a sequence of bytes. This function separates the string parameter into single bytes and returns a string with a sequence of the bytes' decimal representation:

 mystring = "Zwölffüßler"
 print(LHpi.ByteRep(mystring))
 [Z]=90 [w]=119 [�]=195 [�]=182 [l]=108 [f]=102 [f]=102 [�]=195 [�]=188 [�]=195 [�]=159 [l]=108 [e]=101 [r]=114 

When you need to replace characters in the strings you parsed from the website and are unsure what to search for, you can use the decimal byte representation for string replacements:

 mystring = string.gsub( mystring, "\195\182" , "ö" )
 mystring = string.gsub( mystring, "\195\188" , "ü" )
 mystring = string.gsub( mystring, "\195\159" , "ß" )
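
If you want to play with this outside of MA, where the LHpi library is not loaded, the idea can be approximated in a few lines of plain Lua. This is a standalone sketch named byteRep, not the library's implementation, and its output formatting may differ in details such as trailing spaces.

```lua
-- Standalone approximation of the idea behind LHpi.ByteRep.
local function byteRep( s )
  local parts = {}
  for i = 1, #s do
    local c = string.sub( s, i, i )
    parts[#parts + 1] = string.format( "[%s]=%d", c, string.byte( c ) )
  end
  return table.concat( parts, " " )
end

-- "\195\182" is ö in UTF-8; each of its two bytes is listed separately
print( byteRep( "Zw\195\182lf" ) )
```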

Toutf8

LHpi.Toutf8(string,encoding)
In theory, returns the parameter string, with special characters converted to utf-8.

Parameter encoding is optional and if unset will default to site.encoding.
if encoding == "utf-8" or encoding == "utf8" or encoding == "unicode" then string is returned unchanged.
elseif encoding == "cp1252" or encoding == "ansi" then string is subjected to a number of gsubs. Characters are added to this function as needed, so not every possible character is converted yet.
When called with other parameters, it will throw an error.
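
The branching described above can be sketched like this. It is a simplified stand-in named toUtf8 with only four cp1252 characters mapped; the real function's character list and error message are not reproduced here.

```lua
-- Simplified sketch, NOT the library's LHpi.Toutf8.
local cp1252ToUtf8 = {
  ["\228"] = "\195\164", -- ä
  ["\246"] = "\195\182", -- ö
  ["\252"] = "\195\188", -- ü
  ["\223"] = "\195\159", -- ß
}

local function toUtf8( str, encoding )
  if encoding == "utf-8" or encoding == "utf8" or encoding == "unicode" then
    return str
  elseif encoding == "cp1252" or encoding == "ansi" then
    -- bytes without a mapping are left as they are
    -- (the callback returns nil for them, which keeps the original match)
    return ( string.gsub( str, "[\128-\255]", function( c )
      return cp1252ToUtf8[c]
    end ) )
  else
    error( "unsupported encoding: " .. tostring( encoding ) )
  end
end

print( toUtf8( "Zw\246lff\252\223ler", "cp1252" ) )
```

The sketch requires the encoding argument; the fallback to site.encoding is left out to keep it self-contained.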

Log

LHpi.Log(string,level,file,append)
Parameter level is optional and if unset will default to 0; 1 is VERBOSE, 2 is DEBUG. If level is negative, then ma.Log(string) will be called instead of ma.PutFile.
Parameter file is optional and if unset will default to "Prices\\LHpi.log" or "Prices\\" .. string.gsub( scriptname , "lua$" , "log" ), depending on SAVELOG.
Parameter append is optional and if unset will default to 1; 0 is overwrite and will delete the logfile's previous contents.
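
The default for the file parameter can be reproduced in isolation as shown below. The helper name defaultLogFile is invented for this sketch, and the scriptname is just the example from the top of this page with version numbers filled in.

```lua
-- Hypothetical helper that mirrors the documented default only.
local function defaultLogFile( scriptname, savelog )
  if savelog then
    -- parentheses drop gsub's second return value (the match count)
    return "Prices\\" .. ( string.gsub( scriptname, "lua$", "log" ) )
  end
  return "Prices\\LHpi.log"
end

print( defaultLogFile( "LHpi.TheWebsiteName-v2.9.2.1.lua", true ) )
print( defaultLogFile( "LHpi.TheWebsiteName-v2.9.2.1.lua", false ) )
```

Because the pattern is anchored with $, only a trailing "lua" is replaced, so a "lua" elsewhere in the script name is left alone.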

Length

LHpi.Length(table)
non-recursively counts table rows and returns the (first level) table length as number. For non-tables, length(nil)=length(false)=nil, otherwise 1. In lua, length of a table t is defined to be any integer index n such that t[n] is not nil and t[n+1] is nil, which does not play nice with the table indices LHpi uses.
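
A standalone sketch of the same counting rule, written from the description above rather than from the library source (the name length is chosen for this example):

```lua
-- Sketch of the documented behaviour, not the library's code.
local function length( t )
  if type( t ) == "table" then
    local n = 0
    for _ in pairs( t ) do
      n = n + 1
    end
    return n
  elseif t == nil or t == false then
    return nil
  else
    return 1
  end
end

-- a sparse table, similar to tables indexed by langid
local sparse = { [1] = "English", [3] = "German", [11] = "Korean" }
print( length( sparse ) )
-- #sparse is not reliable here, because the indices have gaps
```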

Tostring

LHpi.Tostring(table)
The native lua tostring function returns an internal identifier if called with a table. This function recursively returns a string representation of the table, in the form { [key]='value';... ;}
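
To illustrate the idea, here is a minimal recursive version written for this page. It is not the library's code, the exact spacing of the output is an assumption, and the order of keys follows pairs(), which is undefined for tables with more than one key.

```lua
-- Minimal sketch of a recursive table-to-string conversion.
local function tableToString( value )
  if type( value ) ~= "table" then
    return tostring( value )
  end
  local parts = {}
  for k, v in pairs( value ) do
    if type( v ) == "table" then
      parts[#parts + 1] = string.format( "[%s]=%s;", tostring( k ), tableToString( v ) )
    else
      parts[#parts + 1] = string.format( "[%s]='%s';", tostring( k ), tostring( v ) )
    end
  end
  return "{ " .. table.concat( parts, " " ) .. " }"
end

print( tableToString( { names = { [1] = "Forest" } } ) )
```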

Logtable

LHpi.Logtable(table,name,loglevel)
For large tables, LHpi.Tostring crashes MA (possibly because the recursion gets too deep or memory runs out). This function instead applies LHpi.Tostring to each row and logs the rows separately.
Parameter name is optional and if unset will default to tostring(table).
Parameter loglevel is optional and if unset will default to 0.

multiple copies ot the same script

You can have multiple versions of the same script, but you need to make sure scriptname = "LHpi.SCRIPTNAME-v" .. libver .. ".x.lua" (somewhere around line 75, depending on sitescript) is changed to reflect the script's filename. Otherwise, they will overwrite each other's logfile and html raw data. Of course, this could be intentional, so keep the name identical if that is what you want.

procedure of function calls

Helper function calls are not mentioned.

MA calls sitescript.ImportPrice
sitescript.ImportPrice loads the library, then calls LHpi.DoImport
LHpi.DoImport calls LHpi.ProcessUserParams and initializes
LHpi.DoImport calls LHpi.ListSources
 LHpi.ListSources loops through sets,langs,frucs and calls site.BuildUrl
LHpi.DoImport calls LHpi.MainImportCycle
LHpi.MainImportCycle loops through sets,urls and calls LHpi.GetSourceData
 LHpi.GetSourceData loops through matches with site.regex and calls site.ParseHtmlData, then returns sourceTable
LHpi.MainImportCycle loops through sourceTable and calls LHpi.BuildCardData
 LHpi.BuildCardData calls site.BCDpluginPre and site.BCDpluginPost and returns newcard
 LHpi.MainImportCycle calls LHpi.FillCardsetTable with newcard
  LHpi.FillCardsetTable might call LHpi.MergeCardrows
  then adds a (merged) row to cardsetTable
(optionally) LHpi.MainImportCycle calls LHpi.SaveCSV once cardsetTable is complete
LHpi.MainImportCycle loops through cardsetTable and calls LHpi.SetPrice

dropping cards

Reasons for and ways to make LHpi.BuildCardData drop cards:

  • if LHpi.BuildCardData sets card.drop=true, LHpi.FillCardsetTable is skipped.
  • site.ParseHtmlData can set card.drop, which will be preserved by LHpi.BuildCardData.
  • if not card.name then card.drop = true
  • if string.find( card.name , "%(DROP[ %a]*%)" ) then card.drop = true
    • An example is in LHpi.magicuniverseDE.lua's site.BCDpluginPre. The advantage of changing the name over setting card.drop directly is that you can also add a reason for dropping to the name, which will be logged if LOGDROPS=true.
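
The checks listed above can be tried out in isolation with the following sketch. The function shouldDrop is made up for this page and only mirrors the listed conditions; it is not how the library structures its code.

```lua
-- Hypothetical helper that mirrors the listed drop conditions.
local function shouldDrop( card )
  if card.drop then
    return true
  end
  if not card.name then
    return true
  end
  if string.find( card.name, "%(DROP[ %a]*%)" ) then
    return true
  end
  return false
end

-- a plugin could mark a card by name and give a reason at the same time
local card = { name = "Forest" }
card.name = card.name .. " (DROP sample card)"
print( shouldDrop( card ) )
```

The pattern only allows letters and spaces after DROP, so a reason containing digits or punctuation, such as "(DROP 2nd printing!)", would not match.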

non-trivial tables

To prevent difficult-to-find syntax errors, you might want to end each table field with a , or ;, even if it is not necessary for the last entry.
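
For example, with made-up table contents, Lua accepts a trailing separator after the last field, so rows can be added, removed or reordered without touching the neighbouring lines:

```lua
-- Hypothetical table; only the separators matter here.
local exampleTable = {
  ["Forest"] = { "Forest", { 1, 2, 3 }, },
  ["Island"] = { "Island", { 1, 2 }, },
  ["Swamp"]  = { "Swamp", { 1 }; }; -- ; works as a separator, too
}

print( exampleTable["Island"][2][2] )
```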

releasing your sitescript

Here are a few non-obligatory suggestions if you want to share your (more or less?) finished sitescript.

copyright information

I suggest to change the beginning comments to something like

--[[- LHpi.sitescript name
LHpi sitescript to parse data and import prices from website name and/or url

Based on LHpi.sitescriptTemplate-vversion numbers by LordHelmchen
Inspired by and loosely based on "MTG Mint Card.lua" by Goblin Hero, Stromglad1 and "Import Prices.lua" by woogerboy21
everything else Copyright (C) 2014-this year by your name here
add your preferred contact method here

@module LHpi
@author your name
@copyright 2014-this year your name except parts by LordHelmchen, Goblin Hero, Stromglad1 or woogerboy21
@release...keep the gpl notes intact

upload the file somewhere

If you don't want to create a separate thread for your script(s), feel free to reply to LHpi's release thread and attach your script there.

add your script to LHpi wiki page

How about adding a section for your sitescript to LHpi#Sitescripts ?
It should start with a file information box that includes

  1. a download link
  2. the sitescript revision and required library and data file version numbers
    • those numbers can be combined to form one full version number in the format [library version].[data file version].[sitescript revision]
    • the date of the latest update to the script

A link to the queried website and a short description would also be nice.
If you want, you can keep a changelog in LHpi#version_history