Storing Multi-dimensional Lists? - python

(Code below)
I'm scraping a website, and the data I'm getting back is in two multi-dimensional arrays. I want everything to be in JSON format because I want to save this and load it in again later when I add "tags".
So, to be less vague: I'm writing a program which takes in data like what characters you have and what missions require you to do (you can complete multiple at once if the attributes align), then checks that against a list of attributes that each character fulfills and returns a sorted list of the best characters for the context.
Right now I'm only scraping character data, but I've already "got" the attribute data per character - the problem there was that it wasn't sorted by name, so it was just a randomly repeating list that I needed to be able to look up. I still haven't quite figured out how to do that one.
Right now I have two arrays, one for the headers of the table and one for the rows of the table. The rows contain the "Answers" to the headers' "Questions" / "Titles"; e.g. Maximum Level, 50.
This is true for everything but the first entry, which is the Name, Pronunciation (and I just want to store the name, of course).
So:
Iterations = 0
While loop based on RowArray length / 9 (While Iterations <= that)
HeaderArray[0] gives me the name
RowArray[Iterations + 1] gives me data type 2
RowArray[Iterations + 2] gives me data type 3
Repeat until Array[Iterations + 8]
Iterations +=9
So I'm going through and appending these to separate lists - single arrays like CharName[] and CharMaxLevel[] and so on.
But I'm actually not sure if that's going to make this easier or not, because my end goal here is to send "CharacterName" and get stuff back based on that AND be able to send in "DesiredTraits" and get "CharacterNames who fit that trait" back. Which means I also need to figure out how to store that category data semi-efficiently. There are over 80 possible categories and most characters only fit about 10 of them. I don't know how I'm going to store or load that data.
I'm assuming JSON is the best way? And I'm trying to keep it all in one file for performance and code readability reasons - don't want a file for each character.
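For what it's worth, a minimal sketch of one way the JSON could be laid out - every field name and value below ("Series", "MaxScore", "Tags", the sample entries) is a placeholder for illustration, not something the scraper returns yet - is a single dict keyed by character name, with the tags stored as a list per character:

characters = {
    "Mickey Mouse": {
        "Series": "Mickey & Friends",    # made-up sample values
        "MaxScore": 999,
        "Tags": ["Black", "White Gloves", "Mickey & Friends"],
    },
    # ...one entry per character...
}

# "Send in CharacterName, get stuff back":
print(characters["Mickey Mouse"]["MaxScore"])

# "Send in DesiredTraits, get CharacterNames who fit those traits back":
desired = {"Black", "White Gloves"}
print([name for name, data in characters.items()
       if desired.issubset(data["Tags"])])

The whole dict then serializes to a single file with json.dump and comes back with json.load.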
CODE: (Forgive me, I've never scraped anything before + I'm actually somewhat new to Python - just got it 4? days ago)
https://pastebin.com/yh3Z535h
^ In the event anyone wants to run this and this somehow makes it easier to grab the raw code (:
import time
import requests, bs4, re
from urllib.parse import urljoin
import json
import os

target_dir = r"D:\00Coding\Js\WebScraper" #Yes, I do know that storing this in my Javascript folder is filthy
fullname = os.path.join(target_dir, 'TsumData.txt')

StartURL = 'http://disneytsumtsum.wikia.com/wiki/Skill_Upgrade_Chart'
URLPrefix = 'http://disneytsumtsum.wikia.com'

def make_soup(url):
    r = requests.get(url)
    soup = bs4.BeautifulSoup(r.text, 'lxml')
    return soup

def get_links(url):
    soup = make_soup(url)
    a_tags = soup.find_all('a', href=re.compile(r"^/wiki/"))
    links = [urljoin(URLPrefix, a['href']) for a in a_tags]  # convert relative url to absolute url
    return links

def get_tds(link):
    soup = make_soup(link)
    #tds = soup.find_all('li', class_="category normal") #This will give me the attributes / tags of each character
    tds = soup.find_all('table', class_="wikia-infobox")
    RowArray = []
    HeaderArray = []
    if tds:
        for td in tds:
            #print(td.text.strip()) #This is everything
            rows = td.findChildren('tr')#[0]
            headers = td.findChildren('th')#[0]
            for row in rows:
                cells = row.findChildren('td')
                for cell in cells:
                    cell_content = cell.getText()
                    clean_content = re.sub(r'\s+', ' ', cell_content).strip()
                    if clean_content:
                        RowArray.append(clean_content)
            for row in rows:
                cells = row.findChildren('th')
                for cell in cells:
                    cell_content = cell.getText()
                    clean_content = re.sub(r'\s+', ' ', cell_content).strip()
                    if clean_content:
                        HeaderArray.append(clean_content)
    print(HeaderArray)
    print(RowArray)
    return (RowArray, HeaderArray)
    #Output = json.dumps([dict(zip(RowArray, row_2)) for row_2 in HeaderArray], indent=1)
    #print(json.dumps([dict(zip(RowArray, row_2)) for row_2 in HeaderArray], indent=1))
    #TempFile = open(fullname, 'w') #Read only, Write Only, Append
    #TempFile.write("EHLLO")
    #TempFile.close()
    #print(td.tbody.Series)
    #print(td.tbody[Series])
    #print(td.tbody["Series"])
    #print(td.data-name)
    #time.sleep(1)
if __name__ == '__main__':
    links = get_links(StartURL)

    MainHeaderArray = []
    MainRowArray = []
    MaxIterations = 60
    Iterations = 0
    for link in links: #Specifically I'll need to return and append the arrays here because they're being cleared repeatedly.
        #print("Getting tds calling")
        if Iterations > 38: #There are this many webpages it'll first look at that don't have the data I need
            TempRA, TempHA = get_tds(link)
            MainHeaderArray.append(TempHA)
            MainRowArray.append(TempRA)
            MaxIterations -= 1
        Iterations += 1
        #print(MaxIterations)
        if MaxIterations <= 0: #I don't want to scrape the entire website for a prototype
            break
        #print("This is the end ??")
        #time.sleep(3)

    #jsonized = map(lambda item: {'Name':item[0], 'Series':item[1]}, zip())
    print(MainHeaderArray)
    #time.sleep(2.5)
    #print(MainRowArray)
    #time.sleep(2.5)
    #print(zip())

    TsumName = []
    TsumSeries = []
    TsumBoxType = []
    TsumSkillDescription = []
    TsumFullCharge = []
    TsumMinScore = []
    TsumScoreIncreasePerLevel = []
    TsumMaxScore = []
    TsumFullUpgrade = []

    Iterations = 0
    MaxIterations = len(MainRowArray)
    while Iterations <= MaxIterations: #This will fire 1 time per Tsum
        print(Iterations)
        print(MainHeaderArray[Iterations][0]) #Holy this gives us Mickey ;
        print(MainHeaderArray[Iterations+1][0])
        print(MainHeaderArray[Iterations+2][0])
        print(MainHeaderArray[Iterations+3][0])
        TsumName.append(MainHeaderArray[Iterations][0])
        print(MainRowArray[Iterations][1])
        #At this point it will, of course, crash - that's because I only just realized I needed to append AND I just realized that everything
        #isn't stored in a list as I thought, but rather a multi-dimensional array (as you can see below I didn't know this)
        TsumSeries[Iterations] = MainRowArray[Iterations+1]
        TsumBoxType[Iterations] = MainRowArray[Iterations+2]
        TsumSkillDescription[Iterations] = MainRowArray[Iterations+3]
        TsumFullCharge[Iterations] = MainRowArray[Iterations+4]
        TsumMinScore[Iterations] = MainRowArray[Iterations+5]
        TsumScoreIncreasePerLevel[Iterations] = MainRowArray[Iterations+6]
        TsumMaxScore[Iterations] = MainRowArray[Iterations+7]
        TsumFullUpgrade[Iterations] = MainRowArray[Iterations+8]
        Iterations += 9
        print(Iterations)

    print("It's Over")
    time.sleep(3)

    print(TsumName)
    print(TsumSkillDescription)
Edit:
tl;dr my goal here is to be like
"For this Mission Card I need a Blue Tsum with high score potential, a Monster's Inc Tsum for a bunch of games, and a Male Tsum for a long chain.. what's the best Tsum given those?" and it'll be like "SULLY!" and automatically select it or at the very least give you a list of Tsums. Like "These ones match all of them, these ones match 2, and these match 1"
Edit 2:
Here's the command Line Output for the code above:
https://pastebin.com/vpRsX8ni
Edit 3: Alright, just got back for a short break. With some minor looking over I see what happened - my append code is saying "Append this list to the array" meaning I've got a list of lists for both the Header and Row arrays that I'm storing. So I can confirm (for myself at least) that these aren't nested lists per se but they are definitely 2 lists, each containing a single list at every entry. Definitely not a dictionary or anything "special case" at least. This should help me quickly find an answer now that I'm not throwing "multi-dimensional list" around my google searches or wondering why the list stuff isn't working (as it's expecting 1 value and gets a list instead).
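In other words (with made-up values), each .append() adds a whole list, so the result is a list of lists and the name lives at index [i][0]:

MainHeaderArray = [
    ["Mickey Mouse", "Series", "Box Type"],   # headers scraped from page 1
    ["Minnie Mouse", "Series", "Box Type"],   # headers scraped from page 2
]
print(MainHeaderArray[0][0])   # -> Mickey Mouse
print(MainHeaderArray[1][0])   # -> Minnie Mouse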
Edit 4:
I need to simply add another list! But super nested.
It'll just store the categories that the Tsum has as a string.
so Array[10] = ArrayOfCategories[Tsum] (which contains every attribute in string form that the Tsum has)
So that'll be ie TsumArray[10] = ["Black", "White Gloves", "Mickey & Friends"]
And then I can just use the "Switch" that I've already made in order to check them. Possibly. Not feeling too well and haven't gotten that far yet.

Just use with open(...) as json_file and then write/read (super easy).
Ultimately I stored 3 JSON files. No big deal - much easier than appending everything into one big file.
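For reference, that write/read pattern is just the following (the filename and the exact structure of data are whatever you ended up storing):

import json

data = {"placeholder": "whatever structure you ended up with"}

# write
with open("TsumData.json", "w") as json_file:
    json.dump(data, json_file, indent=2)

# read
with open("TsumData.json") as json_file:
    data = json.load(json_file)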

Related

Getting the last two variables in for loop

I am trying to make a program that shows me the data of two specific coins. What it basically does is take the data in an infinite "for loop" and display the info until I close the program.
Now I am trying to get the last two elements of this infinite for loop every time it runs again and make calculations with them. I know I can't just hold all the items in a list, and I am not sure how to store the last two and use them each time.
for line in lines:
    coinsq = line.strip()
    url = priceKey + coinsq + "USDT"
    data = requests.get(url)
    datax = data.json()
    print(datax['symbol'] + " " + datax['price'])
Store the data in a deque (from the collections module).
Initialise your deque like this:
from collections import deque
d = deque([], 2)
Now you can append to d as many times as you like and it will only ever have the most recent two entries.
So, for example:
d.append('a')
d.append('b')
d.append('c')

for e in d:
    print(e)
Will give the output:
b
c
Adapting your code to use this technique should be trivial.
I recommend this approach over using two variables because it's easier to change if you (for some reason) decide that you want the last N values: all you need to do is change the deque constructor.
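For example, keeping the last five instead of the last two is only a change to that second argument (maxlen):

from collections import deque

d = deque([], 5)           # now only the five most recent entries are kept
for i in range(10):
    d.append(i)
print(list(d))             # [5, 6, 7, 8, 9]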
You can just use two variables that you update for each new element; at the end you will have the last two elements seen:
pre_last = None
last = None

for line in lines:
    coinsq = line.strip()
    url = priceKey + coinsq + "USDT"
    data = requests.get(url)
    datax = data.json()
    print(datax['symbol'] + " " + datax['price'])
    pre_last = last
    last = datax

# Do the required calculations with last and pre_last
(And just to be exact this isn't an infinite loop otherwise there wouldn't be a 'last' element)
As your script does not have prior information about when the execution is going to halt, I suggest defining a queue-like structure. In each iteration, you update your last item and your previous-to-last. That way, you only have to keep two elements in memory. I don't know how you were planning on accessing those two elements when the execution has finished, but you should be able to access that queue when the execution is over.
Sorry for not providing code, but this can be done in many ways; I supposed it was better to suggest a way of proceeding.
You can define a variable for the second-last element of the for loop, and use the datax variable that's already defined in the loop as the last element:
sec_last = None
datax = None

for line in lines:
    sec_last = datax
    coinsq = line.strip()
    url = priceKey + coinsq + "USDT"
    data = requests.get(url)
    datax = data.json()
    print(datax['symbol'] + " " + datax['price'])

print("Last element:", datax)
print("Second last element:", sec_last)

Using Try Except to iterate through a list in Python

I'm trying to iterate through a list of NFL QBs (over 100) and create a list of links that I will use later.
The links follow a standard format, however if there are multiple players with the same name (such as 'Josh Allen') the link format needs to change.
I've been trying to do this with different nested while/for loops with Try/Except with little to no success. This is what I have so far:
test = ['Josh Allen', 'Lamar Jackson', 'Derek Carr']
empty_list = []
name_int = 0

for names in test:
    try:
        q_b_name = names.split()
        link1 = q_b_name[1][0].capitalize()
        link2 = q_b_name[1][0:4].capitalize() + q_b_name[0][0:2].capitalize() + f'0{name_int}'
        q_b = pd.read_html(f'https://www.pro-football-reference.com/players/{link1}/{link2}/gamelog/')
        q_b1 = q_b[0]
        #filter_stats is a function that only works with QB data
        df = filter_stats(q_b1)
        #triggers the except if the link wasn't a QB
        df.head(5)
        empty_list.append(f'https://www.pro-football-reference.com/players/{link1}/{link2}/gamelog/')
    except:
        #adds one to the variable to change the link to find the proper QB link
        name_int += 1
The result only appends the final correct link. I need to append each correct link to the empty list.
Still a beginner in Python and trying to challenge myself with different projects. Thanks!
As stated, the try/except will work in that it will try the code under the try block. If at any point within that block it fails or raises an exception/error, it goes and executes the block of code under the except.
There are better ways to go about this problem (for example, I'd use BeautifulSoup to simply check the HTML for the "QB" position - a rough sketch of that idea is at the end of this answer), but since you are a beginner, I think trying to learn this process will help you understand the loops.
So what this code does:
1) It formats your player name into the link format.
2) It initializes a while loop that it will enter.
3) It gets the table.
4a) It enters a function that checks if the table contains 'passing' stats by looking at the column headers.
4b) If it finds 'passing' in the columns, it returns True to indicate it is a "QB" type of table (keep in mind sometimes there might be running backs or other positions who have passing stats, but we'll ignore that). If it returns True, the while loop stops and goes to the next name in your test list.
4c) If it returns False, it increments your name_int and checks the next one.
5) To take care of the case where it never finds a QB table, the while loop breaks out after 10 attempts.
Code:
import pandas as pd

def check_stats(q_b1):
    for col in q_b1.columns:
        if 'passing' in col.lower():
            return True
    return False

test = ['Josh Allen', 'Lamar Jackson', 'Derek Carr']
empty_list = []

for names in test:
    name_int = 0
    q_b_name = names.split()
    link1 = q_b_name[1][0].capitalize()
    qbStatsInTable = False
    while qbStatsInTable == False:
        link2 = q_b_name[1][0:4].capitalize() + q_b_name[0][0:2].capitalize() + f'0{name_int}'
        url = f'https://www.pro-football-reference.com/players/{link1}/{link2}/gamelog/'
        try:
            q_b = pd.read_html(url, header=0)
            q_b1 = q_b[0]
        except Exception as e:
            print(e)
            break

        # Check if "passing" is in the table columns
        qbStatsInTable = check_stats(q_b1)
        if qbStatsInTable == True:
            print(f'{names} - Found QB stats in {link1}/{link2}/gamelog/')
            empty_list.append(f'https://www.pro-football-reference.com/players/{link1}/{link2}/gamelog/')
        else:
            name_int += 1
            if name_int == 10:
                print(f'Did not find a link for {names}')
                break
Output:
print(empty_list)
['https://www.pro-football-reference.com/players/A/AlleJo02/gamelog/', 'https://www.pro-football-reference.com/players/J/JackLa00/gamelog/', 'https://www.pro-football-reference.com/players/C/CarrDe02/gamelog/']
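And as a rough sketch of the BeautifulSoup alternative mentioned at the top of this answer - the "Position: QB" text check is an assumption about how the page renders and may need adjusting:

import requests
from bs4 import BeautifulSoup

def looks_like_qb(url):
    # Fetch the player page and look for a QB position marker in the visible text.
    resp = requests.get(url)
    soup = BeautifulSoup(resp.text, "html.parser")
    return "Position: QB" in soup.get_text()

# e.g. looks_like_qb('https://www.pro-football-reference.com/players/A/AlleJo02/gamelog/')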

Python BeautifulSoup - Improve readability of find by Id function?

I would like to improve the readability of the following code, especially lines 8 to 11:
import requests
from bs4 import BeautifulSoup

URL = 'https://docs.google.com/forms/d/e/1FAIpQLSd5tU8isVcqd02ymC2n952LC2Nz_FFPd6NT1lD4crDeSsJi2w/viewform?usp=sf_link'
page = requests.get(URL)
soup = BeautifulSoup(page.content, 'html.parser')

question1 = str(soup.find(id='i1'))
question1 = question1.split('>')[1].lstrip().split('.')[1]
question1 = question1[1:]
question1 = question1.replace("_", "")

print(question1)
Thanks in advance :)
You could use the following
question1 = soup.find(id='i1').getText().split(".")[1].replace("_","").strip()
to replace lines 8 to 11.
.getText() takes care of removing the html-tags. Rest is pretty much the same.
In Python you can almost always just chain operations, so your code would also be valid as a one-liner:
question1 = str(soup.find(id='i1')).split('>')[1].lstrip().split('.')[1][1:].replace("_", "")
But in most cases it is better to leave the code in a more readable form than to reduce the line-count.
Abhinav, it is not very clear what you want to achieve; the script is actually already very simple, which is a good thing and follows the Pythonic principle from The Zen of Python:
"Simple is better than complex."
It's also not entirely clear what you actually mean:
Make it simpler, as in understandable and clear for human beings?
Make it simpler for the machine to compute, hence improving performance?
Reduce the number of lines of code and follow the programming guidelines more closely?
I point this out because next time it would be better to make this explicit in the question. Having said that, since I don't know exactly what you mean, I've come up with an answer that more or less covers all 3 points:
ANSWER
import requests
from bs4 import BeautifulSoup

URL = 'https://docs.google.com/forms/d/e/1FAIpQLSd5tU8isVcqd02ymC2n952LC2Nz_FFPd6NT1lD4crDeSsJi2w/viewform?usp=sf_link'
page = requests.get(URL)
soup = BeautifulSoup(page.content, 'html.parser')

# ========= < FUNCTION TO GET ALL QUESTIONS DYNAMICALLY > ========= #
def clean_string_by_id(page, id):
    content = str(page.find(id=id))  # Get the content of the page by the different ids
    if content != 'None':  # Check if there is actual content or not
        find_question = content.split('>')  # NOTE: Split at tag closings
        if len(find_question) >= 2 and find_question[1][0].isdigit():  # NOTE: If len is 1 it is not the correct element; we also check that the first character is a digit, which means it is a question
            cleaned_question = find_question[1].split('.')[1].strip()  # We get the actual question and strip it already!
            result = cleaned_question.replace('_', '')
            return result
    else:
        return

# ========= < Scan the entire page dynamically + add results to a list > ========= #
all_questions = []
for i in range(1, 50):  # NOTE: I went up to 50 but there may be many more, I let you test it
    get_question = clean_string_by_id(soup, f'i{i}')
    if get_question:  # Append the result to the list only if there is actual content
        all_questions.append(get_question)

# ========= < Show all results > ========= #
for question in all_questions:
    print(question)
NOTE
Here I'm assuming that you want to get all the elements from this page, hence you don't want to write 2000 variables. As you can see, I left the logic basically the same as yours but wrapped everything in a function instead.
In fact the steps you followed were pretty good, and yes, you may "improve it" or make it "smarter", but comprehensibility wins over complexity. Also keep in mind that I assumed getting all the 'questions' from that Google Form was your goal.
EDIT
As pointed out by @wuerfelfreak, and as he explains in his answer, further improvement can be achieved by using the getText() function.
Hence, here is the result of the above function using getText:
def clean_string_by_id(page, id):
    content = page.find(id=id)
    if content:  # NOTE: Check that there is actual content (i.e. the element exists)
        find_question = content.getText()  # NOTE: getText() takes care of removing the HTML tags
        if find_question:  # NOTE: If the string is empty we skip it
            cleaned_question = find_question.split('.')[1].strip()  # Same as before
            result = cleaned_question.replace('_', '')
            return result
Documentation & Guides
Zen of Python
getText
geeksforgeeks.org | isdigit()

CS50 'DNA': Ways to speed up my Week 6 'dna.py' program?

So for this problem I had to create a program that takes in two arguments. A CSV database like this:
name,AGATC,AATG,TATC
Alice,2,8,3
Bob,4,1,5
Charlie,3,2,5
And a DNA sequence like this:
TAAAAGGTGAGTTAAATAGAATAGGTTAAAATTAAAGGAGATCAGATCAGATCAGATCTATCTATCTATCTATCTATCAGAAAAGAGTAAATAGTTAAAGAGTAAGATATTGAATTAATGGAAAATATTGTTGGGGAAAGGAGGGATAGAAGG
My program works by first getting the "Short Tandem Repeat" (STR) headers from the database (AGATC, etc.), then counting the highest number of times each STR repeats consecutively within the sequence. Finally, it compares these counted values to the values of each row in the database, printing out a name if a match is found, or "No match" otherwise.
The program works for sure, but it is ridiculously slow whenever run with the larger database provided, to the point where the terminal pauses for an entire minute before returning any output. Unfortunately this causes the 'check50' marking system to time out and return a negative result when testing with this large database.
I'm presuming the slowdown is caused by the nested loops within the 'STR_count' function:
def STR_count(sequence, seq_len, STR_array, STR_array_len):
    # Creates a list to store max recurrence values for each STR
    STR_count_values = [0] * STR_array_len

    # Temp value to store current count of STR recurrence
    temp_value = 0

    # Iterates over each STR in STR_array
    for i in range(STR_array_len):
        STR_len = len(STR_array[i])

        # Iterates over each sequence element
        for j in range(seq_len):

            # Ensures it's still physically possible for STR to be present in sequence
            while (seq_len - j >= STR_len):

                # Gets sequence substring of length STR_len, starting from jth element
                sub = sequence[j:(j + (STR_len))]

                # Compares current substring to current STR
                if (sub == STR_array[i]):
                    temp_value += 1
                    j += STR_len
                else:
                    # Ensures current STR_count_value is highest
                    if (temp_value > STR_count_values[i]):
                        STR_count_values[i] = temp_value

                    # Resets temp_value to break count, and pushes j forward by 1
                    temp_value = 0
                    j += 1
        i += 1

    return STR_count_values
And the 'DNA_match' function:
# Searches database file for DNA matches
def DNA_match(STR_values, arg_database, STR_array_len):

    with open(arg_database, 'r') as csv_database:
        database = csv.reader(csv_database)

        name_array = [] * (STR_array_len + 1)
        next(database)

        # Iterates over one row of database at a time
        for row in database:

            name_array.clear()

            # Copies entire row into name_array list
            for column in row:
                name_array.append(column)

            # Converts name_array number strings to actual ints
            for i in range(STR_array_len):
                name_array[i + 1] = int(name_array[i + 1])

            # Checks if a row's STR values match the sequence's values, prints the row name if match is found
            match = 0
            for i in range(0, STR_array_len, + 1):
                if (name_array[i + 1] == STR_values[i]):
                    match += 1

            if (match == STR_array_len):
                print(name_array[0])
                exit()

    print("No match")
    exit()
However, I'm new to Python, and haven't really had to consider speed before, so I'm not sure how to improve upon this.
I'm not particularly looking for people to do my work for me, so I'm happy for any suggestions to be as vague as possible. And honestly, I'll value any feedback, including stylistic advice, as I can only imagine how disgusting this code looks to those more experienced.
Here's a link to the full program, if helpful.
Thanks :) x
Thanks for providing a link to the entire program. It seems needlessly complex, but I'd say it's just a lack of knowing what features are available to you. I think you've already identified the part of your code that's causing the slowness - I haven't profiled it or anything, but my first impulse would also be the three nested loops in STR_count.
Here's how I would write it, taking advantage of the Python standard library. Every entry in the database corresponds to one person, so that's what I'm calling them. people is a list of dictionaries, where each dictionary represents one line in the database. We get this for free by using csv.DictReader.
To find the matches in the sequence, for every short tandem repeat in the database, we create a regex pattern (the current short tandem repeat, repeated one or more times). If there is a match in the sequence, the total number of repetitions is equal to the length of the match divided by the length of the current tandem repeat. For example, if AGATCAGATCAGATC is present in the sequence, and the current tandem repeat is AGATC, then the number of repetitions will be len("AGATCAGATCAGATC") // len("AGATC") which is 15 // 5, which is 3.
count is just a dictionary that maps short tandem repeats to their corresponding number of repetitions in the sequence. Finally, we search for a person whose short tandem repeat counts match those of count exactly, and print their name. If no such person exists, we print "No match".
def main():

    import argparse
    from csv import DictReader
    import re

    parser = argparse.ArgumentParser()
    parser.add_argument("database_filename")
    parser.add_argument("sequence_filename")
    args = parser.parse_args()

    with open(args.database_filename, "r") as file:
        reader = DictReader(file)
        short_tandem_repeats = reader.fieldnames[1:]
        people = list(reader)

    with open(args.sequence_filename, "r") as file:
        sequence = file.read().strip()

    count = dict(zip(short_tandem_repeats, [0] * len(short_tandem_repeats)))

    for short_tandem_repeat in short_tandem_repeats:
        pattern = f"({short_tandem_repeat}){{1,}}"
        match = re.search(pattern, sequence)
        if match is None:
            continue
        count[short_tandem_repeat] = len(match.group()) // len(short_tandem_repeat)

    try:
        person = next(person for person in people if all(int(person[k]) == count[k] for k in short_tandem_repeats))
        print(person["name"])
    except StopIteration:
        print("No match")

    return 0

if __name__ == "__main__":
    import sys
    sys.exit(main())

Grabbing values of an array in sets of 100

In the code below, ids is an array which contains the Steam64 ids of all users in your friends list. Now, according to the Steam Web API documentation, GetPlayerSummaries only takes a list of 100 comma-separated Steam64 ids. Some users have more than 100 friends, and instead of running a for loop 200 times that calls the API each time, I want to take the array in sets of 100 Steam ids. What would be the most efficient way to do this (in terms of speed)?
I know that I can do ids[0:100] to grab the first 100 elements of an array, but how do I accomplish this for a friends list of, say, 230 users?
def getDescriptions(ids):
    sids = ','.join(map(str, ids))
    r = requests.get('http://api.steampowered.com/ISteamUser/GetPlayerSummaries/v0002/?key=' + API_KEY + '&steamids=' + sids)
    data = r.json()
    ...
Utilizing the code from this answer, you are able to break this into groups of 100 (or less for the last loop) of friends.
def chunkit(lst, n):
    newn = int(len(lst) / n)
    for i in range(0, n - 1):
        yield lst[i * newn:i * newn + newn]
    yield lst[n * newn - newn:]

def getDescriptions(ids):
    friends = chunkit(ids, 3)
    while True:
        try:
            fids = next(friends)
            sids = ','.join(map(str, fids))
            r = requests.get('http://api.steampowered.com/ISteamUser/GetPlayerSummaries/v0002/?key=' + API_KEY + '&steamids=' + sids)
            data = r.json()
            # Do something with the data variable
        except StopIteration:
            break
This will create a generator that yields the list broken into 3 groups (the second parameter to chunkit). I chose 3 because the base size of the friends list is 250. You can get more (rules from this post), but it is a safe place to start. You can fine-tune that value as you need.
Utilizing this method, your data value will be overwritten each loop. Make sure you do something with it at the place indicated.
I have an easy alternative: just reduce your list size on each loop until exhaustion:
def getDescriptions(ids):
    sids = ','.join(map(str, ids))
    sids_queue = sids.split(',')
    data = []
    while len(sids_queue) != 0:
        r = requests.get('http://api.steampowered.com/ISteamUser/GetPlayerSummaries/v0002/?key=' +
                         API_KEY + '&steamids=' + ','.join(sids_queue[:100]))
        data.append(r.json())  # note: r.json() is a method call
        # then skip [0:100] and reassign to sids_queue, you get the idea
        sids_queue = sids_queue[100:]
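For comparison, a plain slicing loop with a step of 100 does the same chunking without a generator (API_KEY as defined elsewhere in your script):

import requests

def getDescriptions(ids):
    for start in range(0, len(ids), 100):
        chunk = ids[start:start + 100]   # at most 100 ids per request
        sids = ','.join(map(str, chunk))
        r = requests.get('http://api.steampowered.com/ISteamUser/GetPlayerSummaries/v0002/?key='
                         + API_KEY + '&steamids=' + sids)
        data = r.json()
        # Do something with data here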
