EDHREC's Average Decks
EDHREC's Average Decks
by LtEntropy » 07 Jul 2019, 08:48
As many of you who play EDH may know, the popular site EDHREC has a feature that shows what an average deck looks like, based on statistics from the decks it has found online.
In the past I would manually copy and paste these decks into the deck editor's importer to try them out. Long story short, I got sick of that, scraped the site for all its decks, and with a bit of regex and Python work got every average deck (themes included) formatted for Forge.
That comes to 3691 decks, including the themed variants that EDHREC builds average decks for under each commander.
Any questions or feedback would be great to hear.
Attachments
- EDH Rec Average Decks N-Z.zip (1.67 MiB), downloaded 483 times
- EDH Rec Average Decks A-M.zip (1.95 MiB), downloaded 502 times
Re: EDHREC's Average Decks
by Keineahnungking » 27 Aug 2019, 16:02
I have been really enjoying the decks and was wondering if you could post updated decklists now that M20 and Commander 2019 have been released, or post the script you used. Thanks in advance.
- Keineahnungking
- Posts: 2
- Joined: 21 Jun 2017, 19:07
- Has thanked: 1 time
- Been thanked: 0 time
Re: EDHREC's Average Decks
by LtEntropy » 28 Aug 2019, 05:30
Thanks!
The program that I used was WinHTTrack Website Copier (http://www.httrack.com). It's something I've had lying around for a while, so I can't say whether other programs or scripting methods are better, but it did the job I wanted.
The address I set the program to copy everything from was https://edhrec.com/decks/ hoping that I'd get just the basic average decks, but luckily I ended up with all the themes for each commander as well.
So I end up with a bunch of HTML files with the commander and theme in the file name. What I want to isolate in each of them is a div with the class "test_cardlistscontainer". This is the div with just the deck and a couple of "buy this deck here" links. The problem is editing thousands of files with the same regex to strip away all the other HTML.
I ended up using another program I had called Ecobyte Replace Text (http://www.ecobyte.com/).
I used this to apply regex text replacements across all the files. Once that was done, I took some time to relearn a bit of Python to go through all the files and add the tags needed for a Forge commander deck.
The script I used is below. It opens .txt files: I used the Bulk Rename Utility (http://www.bulkrenameutility.co.uk) to rename all the files from .htm to .txt, and afterwards from .txt to .dck. I'm a Windows user, and while I could have done a lot of this with Linux and bash, Windows batch files, or Python if I took the time to learn it better, I also like collecting these kinds of utility apps and using them when I need to.
The python script:
Code:
import os

# Folder of renamed .txt decklists (the username is redacted in the original post)
src_dir = r"C:\Users\------\Desktop\EDH Rec Average Decks"
# Output folder; it must already exist
out_dir = os.path.join(src_dir, "New folder")

for filename in os.listdir(src_dir):
    if not filename.endswith(".txt"):
        continue
    textfile = os.path.join(src_dir, filename)
    # Get the file's name without extension
    name = os.path.splitext(filename)[0]
    print(name)
    with open(textfile, "r") as file, \
            open(os.path.join(out_dir, "new" + filename), "w") as newfile:
        newfile.write("[metadata]\n")
        newfile.write("Name=" + name + "\n")
        newfile.write("[Commander]\n")
        # Skip the first line, then write the second line as the commander
        file.readline()
        newfile.write(file.readline())
        # Everything after that goes in the main deck
        newfile.write("[Main]\n")
        for line in file:
            newfile.write(line)
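For anyone who would rather do the div-isolation step in Python instead of a bulk regex tool, here is a rough sketch using only the standard library. It is not what was actually used for these decks. The class name "test_cardlistscontainer" comes from the description above; the names DivTextExtractor and extract_decklist and the sample HTML are made up for illustration, and the real pages may be laid out differently.

```python
from html.parser import HTMLParser


class DivTextExtractor(HTMLParser):
    """Collect the text inside the first <div> whose class list contains target_class."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.depth = 0      # nesting depth of <div> tags, counted from the target div
        self.done = False   # set once the target div has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag != "div" or self.done:
            return
        if self.depth > 0:
            self.depth += 1  # a nested div inside the target
        elif self.target_class in (dict(attrs).get("class") or "").split():
            self.depth = 1   # entering the target div

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data)


def extract_decklist(html, target_class="test_cardlistscontainer"):
    """Return the non-empty, stripped text lines found inside the target div."""
    parser = DivTextExtractor(target_class)
    parser.feed(html)
    parser.close()
    text = "".join(parser.chunks)
    return [line.strip() for line in text.splitlines() if line.strip()]


# Made-up sample page, only to show the shape of the input
sample = """<html><body><div class="nav">Menu</div>
<div class="test_cardlistscontainer"><a href="#">Buy this deck</a>
<div>1 Sol Ring
1 Command Tower</div></div>
<div class="footer">Footer</div></body></html>"""

if __name__ == "__main__":
    for line in extract_decklist(sample):
        print(line)
```

Looping this over os.listdir() on the folder of downloaded pages would replace the regex pass; the "buy this deck" link text still comes through and has to be dropped afterwards.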
Re: EDHREC's Average Decks
by LtEntropy » 29 Sep 2019, 02:16
I wanted to post an update about downloading the decks from EDHREC.
I've made a Python script that asks you to enter a legendary creature when you run it, then gets the average deck and all the other themed decks for that commander. A good place to put the script is Forge's commander decks folder: run it whenever you want all the current decks for a commander, and it will drop the deck files right into the same folder.
Code:
import re
import time
import urllib.request

from bs4 import BeautifulSoup


def savedeck(url, partner=False):
    # Get the html of the card's page
    with urllib.request.urlopen(url) as fp:
        mystr = fp.read().decode("utf8")
    # Isolate the div with the decklist
    content = BeautifulSoup(mystr, "html.parser").find("div", class_="well")
    # Remove the first three lines, which are two links to buy the deck and an empty line
    decklist = re.sub("<[^<]+?>", "", content.text).split("\n", 3)[3].replace("\n\n", "\n")
    # Turn the deck into a list
    decklist = decklist.split("\n")
    # The name of the deck is just the page's folder name
    name = url[url.rindex("/") + 1:]
    # Write the .dck file; the first card (first two with partner=True) is the commander
    with open(name + ".dck", "w") as file:
        file.write("[metadata]\n")
        file.write("Name=" + name + "\n")
        file.write("[Commander]\n")
        file.write(decklist[0] + "\n")
        if partner:
            file.write(decklist[1] + "\n")
        file.write("[Main]\n")
        for i in range(1 + partner, len(decklist)):
            file.write(decklist[i] + "\n")
    # Return the html text
    return mystr


def getdecks(url, partner=False):
    # Print the names as you go
    name = url[url.rindex("/") + 1:]
    print(name)
    # Pass the partner flag through (the posted version always passed False)
    htmltext = savedeck(url, partner=partner)
    # Find the themed deck links; stop at the closing quote so one match cannot span two links
    otherdecks = re.findall(re.escape(name) + '-[^"]*">', htmltext)
    # Repeat for all other themes found
    for i in range(len(otherdecks)):
        # Limits the scraping to two decks a second
        time.sleep(0.5)
        # Trim off the "> from the regex search
        otherdecks[i] = otherdecks[i][:-2].lower()
        print(otherdecks[i])
        savedeck("https://edhrec.com/decks/" + otherdecks[i], partner=partner)


card = input("Legendary Creature: ")
# Turn the card into a URL
card = re.sub(" ", "-", card)
card = re.sub("'", "", card)
card = re.sub(",", "", card)
card = re.sub('"', "", card)
getdecks("https://edhrec.com/decks/" + card.lower())
You could also modify the code to loop over a list of commander names.
Also, regarding my earlier description of scraping EDHREC with WinHTTrack: I forgot that I didn't simply enter "https://edhrec.com/decks/", but a list of all the commanders' deck pages. Otherwise you get a 404.
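Both ideas (looping over a list of commanders, and giving a site copier a list of deck pages) start from the same step: turning card names into deck-page URLs. A minimal sketch of that, assuming the same name-to-URL substitutions as the script above. The helper names (card_to_slug, deck_urls, fetch_all) and the example commander list are hypothetical, and the URL pattern is only what the script used at the time, so it may not match the site today.

```python
import re
import time

BASE_URL = "https://edhrec.com/decks/"


def card_to_slug(card):
    """Spaces become hyphens; apostrophes, commas and double quotes are dropped;
    the result is lowercased (the same substitutions as the script above)."""
    slug = card.replace(" ", "-")
    slug = re.sub(r"[',\"]", "", slug)
    return slug.lower()


def deck_urls(commanders):
    """Build one deck-page URL per commander name."""
    return [BASE_URL + card_to_slug(name) for name in commanders]


def write_url_list(commanders, path):
    """Write one URL per line, e.g. as the address list for a site copier."""
    with open(path, "w") as f:
        for url in deck_urls(commanders):
            f.write(url + "\n")


def fetch_all(commanders, fetch, delay=0.5):
    """Call fetch(url) for every commander, pausing between requests."""
    for url in deck_urls(commanders):
        fetch(url)
        time.sleep(delay)


if __name__ == "__main__":
    # Hypothetical example list; replace with your own commanders
    commanders = ["Atraxa, Praetors' Voice", "Edgar Markov"]
    for url in deck_urls(commanders):
        print(url)
```

With the script above in the same file, fetch_all(commanders, getdecks) would download every deck for each name in turn; passing print instead is a harmless dry run.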
Finally I'm providing zip files of updated decks and commanders as of Sept 27. 4573 decks in total.
Attachments
- EDHREC Decks Q-Z.zip (1.59 MiB), downloaded 490 times
- EDHREC Decks I-P.zip (1.6 MiB), downloaded 430 times
- EDHREC Decks A-H.zip (1.31 MiB), downloaded 572 times
Re: EDHREC's Average Decks
by Keineahnungking » 24 Mar 2022, 00:12
I wanted an updated version of these decks, especially with all the Commander-related products released lately and the decks being almost 3 years old, so I modified LtEntropy's Python script to work with the current EDHREC website.
There are 8100 decks included in this download which I had to split into five parts.
Enjoy!
Attachments
- EDHREC Decks A-F.zip (1.82 MiB), downloaded 232 times
- EDHREC Decks G-J.zip (1.22 MiB), downloaded 224 times
- EDHREC Decks K-N.zip (1.94 MiB), downloaded 245 times
- EDHREC Decks O-S.zip (1.61 MiB), downloaded 233 times
- EDHREC Decks T-Z.zip (1.55 MiB), downloaded 222 times
Re: EDHREC's Average Decks
by Computica » 28 Oct 2022, 21:14
Thanks! Please post the updated script.