I'm building a Google Sheet to keep track of stock prices for the stocks I own. I have an API running that's connected to Google Sheets and my own Python application.
My Google Sheet looks like this:
Stock | Previous close
AAPL | 316.73
NVDA | 348.71
SPOT | 191.00
I currently have the code running as follows:
import requests
import gspread
from oauth2client.service_account import ServiceAccountCredentials
sheet = client.open("Stock").sheet1
AAPL = sheet.cell(2,1).value
url = ('https://ca.finance.yahoo.com/quote/'+AAPL+'?p='+AAPL+'&.tsrc=fin-srch')
response = requests.get(url)
htmltext = response.text
splitlist = htmltext.split("Previous Close")
afterfirstsplit =splitlist[1].split("\">")[2]
aftersecondsplit = afterfirstsplit.split("</span>")
datavalue = aftersecondsplit[0]
sheet.update_cell(2,2,datavalue)
# this would update the value within my google sheet to the previous close price
For each individual stock, I would copy and paste this code and change the stock symbol to find the value of the next quote.
I know there's a way to use FOR statements to automate this process. I tried that with the following, but it wouldn't update as needed. I reached a wall at this point and would appreciate any help or insight on how I could automate this function.
tickers = {sheet.cell(2,1).value : [],
           sheet.cell(3,1).value : [],
           sheet.cell(4,1).value : [],
           sheet.cell(5,1).value :[]}
for symbols in tickers:
    url = ('https://ca.finance.yahoo.com/quote/'+symbols+'?p='+symbols+'&.tsrc=fin-srch')
    response = requests.get(url)
    htmltext = response.text
    splitlist = htmltext.split("Previous Close")
    afterfirstsplit =splitlist[1].split("\">")[2]
    aftersecondsplit = afterfirstsplit.split("</span>")
    datavalue = aftersecondsplit[0]
    sheet.update.cell(2,1,datavalue)
    print (datavalue)
Doing this gathers the current prices for all the stocks and does write them into the sheet, but only to one coordinate. I don't know how to increase the '1' within sheet.update.cell(2,1,datavalue) each time through the FOR statement. I believe that is the way to solve this, but if anyone has any other suggestions, I'm all ears.
In regards to answering this part of your question:
"I don't know how to increase the '1' within sheet.update.cell(2,1,datavalue), each time within the FOR statement."
This is how you typically increment a counter inside a for loop:
counter = 1
for symbol in tickers:
    # Your code
    sheet.update_cell(2, counter, datavalue)  # gspread's method is update_cell(row, col, value)
    counter = counter + 1
While counter variables are a very common pattern in most programming languages (see Akib Rhast's answer), the more Pythonic way to do it is with the built-in enumerate function:
for column, symbol in enumerate(tickers, start=1):
    # do stuff
    sheet.update_cell(2, column, datavalue)
What is enumerate?
As the documentation states, enumerate takes something that you can iterate over (like a list) and yields tuples with the counter as the first element and the item from the iterable as the second element:
seasons = ['Spring', 'Summer', 'Fall', 'Winter']
list(enumerate(seasons, start=1))
# outputs [(1, 'Spring'), (2, 'Summer'), (3, 'Fall'), (4, 'Winter')]
It also has the advantage of doing so in a memory-efficient manner and is directly tied to your loop.
Why is there a comma in my for loop?
This is just syntactic sugar in Python that allows you to unpack a tuple or list:
alist = [1, 2, 3]
first, second, third = alist
print(third) # outputs 3
print(second) # outputs 2
print(first) # outputs 1
As enumerate yields tuples, you are basically assigning each element of that tuple to a different variable at the same time.
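Applied to the sheet layout in the question (tickers in column 1, rows 2 to 5; previous close in column 2), it is the row that has to advance rather than the column. A minimal sketch, assuming gspread's update_cell(row, col, value) argument order; FakeSheet and write_previous_closes are hypothetical names, and the hard-coded prices stand in for the scraped values so the snippet runs without credentials or network access:

```python
class FakeSheet:
    """Hypothetical stand-in for a gspread worksheet; records writes in a dict."""

    def __init__(self):
        self.cells = {}

    def update_cell(self, row, col, value):
        # Same argument order as gspread: row first, then column.
        self.cells[(row, col)] = value


def write_previous_closes(sheet, prices, first_row=2, price_col=2):
    """Write one price per row, starting at first_row, into price_col."""
    for row, (symbol, price) in enumerate(prices.items(), start=first_row):
        sheet.update_cell(row, price_col, price)


# Prices copied from the table in the question; in the real script they
# would come from the scraping code.
prices = {"AAPL": "316.73", "NVDA": "348.71", "SPOT": "191.00"}
sheet = FakeSheet()
write_previous_closes(sheet, prices)
print(sheet.cells)
```

With a real worksheet, the same function could be called with the gspread sheet object in place of FakeSheet.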
Related
I am a beginner and I am trying to teach myself Python by using topics that are interesting to me and where I can at the same time challenge myself. I am currently struggling with a generic logical problem.
I would like to consume the CoinGecko API by using the following endpoint:
https://api.coingecko.com/api/v3/coins/bitcoin/market_chart?vs_currency=usd&days=max&interval=daily
I would like to replace "bitcoin" with a dynamic variable that refers to a list that I already have. (bitcoin, ethereum, fantom, avalanche-2)
Therefore I use the following code in combination with for i in list with a counter:
counter = 0
CoinDatabase = []
for i in CoinIDList:
    if counter >2:
        break
    else:
        r = requests.get (f"https://api.coingecko.com/api/v3/coins/{i}/market_chart?vs_currency=usd&days=max&interval=daily")
        data = r.json()
        prices = data["prices"]
        market_caps = data["market_caps"]
        total_volumes = data["total_volumes"]
        for i in range (len(CoinDatabase)):
            prices = data["prices"]
        CoinDatabase.append(prices)
    counter = counter+1
What is the smartest way to match the ID that is used in the "i" with the JSON response that I receive in every loop? Otherwise I get tons of JSONs without having the reference to what coin it belongs to.
My long term objective is to setup a small database with different values and maximum historical data for prices, market_caps, total_volumes, timestamp - just as a Python learning exercise.
Thanks in advance!
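One way to keep each response tied to its coin is to store it in a dictionary keyed by the coin ID instead of appending to a list. A minimal sketch, not a definitive implementation: build_coin_database and fake_fetch are hypothetical names, the fetcher is passed in as a parameter so the loop can be tried without network access, the numbers returned by fake_fetch are placeholders, and the response is assumed to contain the prices, market_caps and total_volumes keys used in the code above.

```python
def build_coin_database(coin_ids, fetch, limit=3):
    """Return {coin_id: {...}} for the first `limit` coins in coin_ids."""
    database = {}
    for coin_id in coin_ids[:limit]:
        data = fetch(coin_id)
        # The coin ID is the key, so every series stays attached to its coin.
        database[coin_id] = {
            "prices": data["prices"],
            "market_caps": data["market_caps"],
            "total_volumes": data["total_volumes"],
        }
    return database


def fake_fetch(coin_id):
    """Placeholder for the real API call; returns made-up [timestamp, value] pairs."""
    return {
        "prices": [[0, 1.0]],
        "market_caps": [[0, 2.0]],
        "total_volumes": [[0, 3.0]],
    }


coin_ids = ["bitcoin", "ethereum", "fantom", "avalanche-2"]
database = build_coin_database(coin_ids, fake_fetch)
print(list(database))
```

In real use, fetch would be a function that requests the endpoint from the question for the given coin ID and returns r.json(). The limit of 3 mirrors the counter check in the original loop.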
I am working on a project for school where I am creating a nutrition plan based on our school's nutrition menu. I am trying to create a dictionary with every item and its calorie content, but for some reason the loop I'm using gets stuck at 7 and never advances through the rest of the list to add to my dictionary. So when I search for a known key (Sour Cream) it throws an error because it was never added to the dictionary. I have also noticed it prints several numbers twice in a row, as well as double adding them to the dictionary.
Edit: I have discovered the double printing was from the print statement I had. Still wondering about the 7, however.
from bs4 import BeautifulSoup
import urllib3
import requests
url = "https://menus.sodexomyway.com/BiteMenu/Menu?menuId=14756&locationId=11870001&whereami=http://mnsu.sodexomyway.com/dining-near-me/university-dining-center"
r = requests.get(url)
soup = BeautifulSoup(r.content, "html5lib")
allFood = soup.findAll('a', attrs={'class':'get-nutritioncalculator primary-textcolor'})
allCals = soup.findAll('a', attrs={'class':'get-nutrition primary-textcolor'})
nums = '0123456789'
def printData(charIndex):
    for char in allFood[charIndex].contents:
        print(char)
    for char in allCals[charIndex].contents:
        print(char)
def getGoals():
    userCalories = int(input("Please input calorie goal for the day (kC): "))
    #Display Info (Text/RsbPi)
fullList = {}
def compileFood():
    foodCount = 0
    for food in allFood:
        print(foodCount)
        for foodName in allFood[foodCount].contents:
            fullList[foodName] = 0
            foodCount += 1
            print(foodCount)
compileFood()
print(fullList['Sour Cream'])
Any help would be great. Thanks!
OK, first, why is this happening?
The reason is that the food at index 7 is empty. Because it is empty, the inner for loop is never entered, so foodCount is never increased and stays stuck at 7 forever.
If you moved the index increment out of the inner for loop, it would work without a problem.
But you are doing something clumsy here.
You already iterate through the food items and still use an additional index variable.
You could solve it more neatly this way:
def compileFood():
    for food in allFood:
        for foodName in food.contents:
            fullList[foodName] = 0
With this you don't need to care about an additional variable at all.
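The stall can be reproduced without the website. A minimal sketch, where menu is made-up stand-in data for the .contents lists (its third entry is deliberately empty) and both function names are hypothetical:

```python
# Stand-in for [tag.contents for tag in allFood]; the third entry is empty.
menu = [["Rice"], ["Beans"], [], ["Sour Cream"]]


def compile_with_counter(items):
    """Mirrors the original loop: the counter only moves inside the inner loop."""
    found = {}
    count = 0
    for _ in items:
        for name in items[count]:
            found[name] = 0
            count += 1  # never reached when items[count] is empty
    return found, count


def compile_directly(items):
    """Iterates over the items themselves, so no counter can stall."""
    found = {}
    for contents in items:
        for name in contents:
            found[name] = 0
    return found


stuck, final_count = compile_with_counter(menu)
print(stuck, final_count)      # the counter stops at the empty entry
print(compile_directly(menu))  # includes 'Sour Cream'
```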
I must be missing something here since the computer doesn't joke around but this simple for loop is seemingly not giving me the desired output. Below is the code which uses aztro's API to grab today's horoscope for each of 12 zodiac signs and put them all in a list.
import requests
import json
zodiacSigns = ['Aries', 'Taurus', 'Gemini', 'Cancer', 'Leo', 'Virgo', 'Libra', 'Scorpio', 'Sagittarius', 'Capricorn', 'Aquarius', 'Pisces']
for zodiacSign in zodiacSigns:
    params = (('sign','{}'.format(zodiacSign)), ('day','today'))
    output = json.loads(requests.post('https://aztro.sameerkumar.website/', params=params).text)
    descriptions = []
    descriptions.append(output['description'])
print(descriptions)
This code outputs the horoscope for only Pisces, the last element in the list above:
["You need to take work more seriously today -- it may be that you've got an opportunity coming up that shouldn't be missed. It's easier than usual for you to make career moves, so go for it!"]
As a reference, a sample output of this aztro's API for a single zodiac sign is:
{
"compatibility":" Virgo",
"date_range":"Jan 20 - Feb 18",
"current_date":"August 23, 2018",
"description":"Today requires a willingness to go deeper than usual -- maybe to explore the nuances of your primary relationship, maybe to really get to know that one client or maybe just reading between the lines.",
"lucky_time":" 10am",
"lucky_number":" 13",
"color":" Navy Blue",
"mood":" Thoughtful"
}
The desired output would be a list of horoscopes for all 12 zodiac signs. I can't seem to catch the issue here, so I'd appreciate input from more experienced eyes. Gracias!
The problem lies in the declaration of the descriptions variable, which at every iteration is initialized to be an empty list.
Just move it out from the loop, like so:
descriptions = []
for zodiacSign in zodiacSigns:
    params = (('sign','{}'.format(zodiacSign)), ('day','today'))
    output = json.loads(requests.post('https://aztro.sameerkumar.website/', params=params).text)
    descriptions.append(output['description'])
The statement descriptions = [] should be outside the for loop. If it is inside the loop, the list is re-initialized (erased, in this case) on every iteration.
The code below should work:
import requests
import json
zodiacSigns = ['Aries', 'Taurus', 'Gemini', 'Cancer', 'Leo', 'Virgo', 'Libra', 'Scorpio', 'Sagittarius', 'Capricorn', 'Aquarius', 'Pisces']
descriptions = []
for zodiacSign in zodiacSigns:
    params = (('sign','{}'.format(zodiacSign)), ('day','today'))
    output = json.loads(requests.post('https://aztro.sameerkumar.website/', params=params).text)
    descriptions.append(output['description'])
print(descriptions)
I was trying to scrape Instagram and the scraping itself already works. The result I get is correct, but I want it stored as a list of lists, with one inner list per post.
Code:
post_links = ['https://www.instagram.com/p/BesW08pHfUt', 'https://www.instagram.com/p/BQZyTtej4yj']
for post_link in post_links:
    _ = API.getMediaComments(get_media_id(post_link), max_id = 100)
    for c in reversed(API.LastJson['comments']):
        comment.append(c["user"]["username"])
The comments I get from each post link on Instagram:
'https://www.instagram.com/p/BesW08pHfUt':- 'headhotel', 'famegalore', 'motivationpoem', 'malicioussatan'
'https://www.instagram.com/p/BQZyTtej4yj':- 'monarch_motivation', 'headhotel', 'motivationpoem'
The output I get
['headhotel', 'famegalore', 'motivationpoem', 'malicioussatan', 'monarch_motivation', 'headhotel', 'motivationpoem']
The output I want
[['headhotel', 'famegalore', 'motivationpoem', 'malicioussatan'], ['monarch_motivation', 'headhotel', 'motivationpoem']]
I know this is kind of easy, but I coded this scraper in 2 days, so I have got a bit confused!
I'm not familiar with that API, but I think you want to do something like this:
for post_link in post_links:
    _ = API.getMediaComments(get_media_id(post_link), max_id = 100)
    sublist = []
    for c in reversed(API.LastJson['comments']):
        sublist.append(c["user"]["username"])
    comment.append(sublist)
That creates a new sublist on each iteration of the outer loop, which the inner loop fills, and then we append the sublist to the main comment list.
(Code below)
I'm scraping a website and the data I'm getting back is in 2 multi-dimensional arrays. I'm wanting everything to be in a JSON format because I want to save this and load it in again later when I add "tags".
So, less vague. I'm writing a program which takes in data like what characters you have and what missions are requiring you to do (you can complete multiple at once if the attributes align), and then checks that against a list of attributes that each character fulfills and returns a sorted list of the best characters for the context.
Right now I'm only scraping character data but I've already "got" the attribute data per character - the problem there was that it wasn't sorted by name so it was just a randomly repeating list that I needed to be able to look up. I still haven't quite figured out how to do that one.
Right now I have 2 arrays, 1 for the headers of the table and one for the rows of the table. The rows contain the "Answers" for the Header's "Questions" / "Titles" ; ie Maximum Level, 50
This is true for everything but the first entry which is the Name, Pronunciation (and I just want to store the name of course).
So:
Iterations = 0
While loop based on RowArray length / 9 (While Iterations <= that)
HeaderArray[0] gives me the name
RowArray[Iterations + 1] gives me data type 2
RowArray[Iterations + 2] gives me data type 3
Repeat until Array[Iterations + 8]
Iterations +=9
So I'm going through and appending these to separate lists - single arrays like CharName[] and CharMaxLevel[] and so on.
But I'm actually not sure if that's going to make this easier or not? Because my end goal here is to send "CharacterName" and get stuff back based on that AND be able to send in "DesiredTraits" and get "CharacterNames who fit that trait" back. Which means I also need to figure out how to store that category data semi-efficiently. There's over 80 possible categories and most only fit into about 10. I don't know how I'm going to store or load that data.
I'm assuming JSON is the best way? And I'm trying to keep it all in one file for performance and code readability reasons - don't want a file for each character.
CODE: (Forgive me, I've never scraped anything before + I'm actually somewhat new to Python - just got it 4? days ago)
https://pastebin.com/yh3Z535h
^ In the event anyone wants to run this and this somehow makes it easier to grab the raw code (:
import time
import requests, bs4, re
from urllib.parse import urljoin
import json
import os
target_dir = r"D:\00Coding\Js\WebScraper" #Yes, I do know that storing this in my Javascript folder is filthy
fullname = os.path.join(target_dir,'TsumData.txt')
StartURL = 'http://disneytsumtsum.wikia.com/wiki/Skill_Upgrade_Chart'
URLPrefix = 'http://disneytsumtsum.wikia.com'
def make_soup(url):
    r = requests.get(url)
    soup = bs4.BeautifulSoup(r.text, 'lxml')
    return soup
def get_links(url):
    soup = make_soup(url)
    a_tags = soup.find_all('a', href=re.compile(r"^/wiki/"))
    links = [urljoin(URLPrefix, a['href'])for a in a_tags] # convert relative url to absolute url
    return links
def get_tds(link):
    soup = make_soup(link)
    #tds = soup.find_all('li', class_="category normal") #This will give me the attributes / tags of each character
    tds = soup.find_all('table', class_="wikia-infobox")
    RowArray = []
    HeaderArray = []
    if tds:
        for td in tds:
            #print(td.text.strip()) #This is everything
            rows = td.findChildren('tr')#[0]
            headers = td.findChildren('th')#[0]
            for row in rows:
                cells = row.findChildren('td')
                for cell in cells:
                    cell_content = cell.getText()
                    clean_content = re.sub( '\s+', ' ', cell_content).strip()
                    if clean_content:
                        RowArray.append(clean_content)
            for row in rows:
                cells = row.findChildren('th')
                for cell in cells:
                    cell_content = cell.getText()
                    clean_content = re.sub( '\s+', ' ', cell_content).strip()
                    if clean_content:
                        HeaderArray.append(clean_content)
    print(HeaderArray)
    print(RowArray)
    return(RowArray, HeaderArray)
    #Output = json.dumps([dict(zip(RowArray, row_2)) for row_2 in HeaderArray], indent=1)
    #print(json.dumps([dict(zip(RowArray, row_2)) for row_2 in HeaderArray], indent=1))
    #TempFile = open(fullname, 'w') #Read only, Write Only, Append
    #TempFile.write("EHLLO")
    #TempFile.close()
    #print(td.tbody.Series)
    #print(td.tbody[Series])
    #print(td.tbody["Series"])
    #print(td.data-name)
    #time.sleep(1)
if __name__ == '__main__':
    links = get_links(StartURL)
    MainHeaderArray = []
    MainRowArray = []
    MaxIterations = 60
    Iterations = 0
    for link in links: #Specifically I'll need to return and append the arrays here because they're being cleared repeatedly.
        #print("Getting tds calling")
        if Iterations > 38: #There are this many webpages it'll first look at that don't have the data I need
            TempRA, TempHA = get_tds(link)
            MainHeaderArray.append(TempHA)
            MainRowArray.append(TempRA)
        MaxIterations -= 1
        Iterations += 1
        #print(MaxIterations)
        if MaxIterations <= 0: #I don't want to scrape the entire website for a prototype
            break
    #print("This is the end ??")
    #time.sleep(3)
    #jsonized = map(lambda item: {'Name':item[0], 'Series':item[1]}, zip())
    print(MainHeaderArray)
    #time.sleep(2.5)
    #print(MainRowArray)
    #time.sleep(2.5)
    #print(zip())
    TsumName = []
    TsumSeries = []
    TsumBoxType = []
    TsumSkillDescription = []
    TsumFullCharge = []
    TsumMinScore = []
    TsumScoreIncreasePerLevel = []
    TsumMaxScore = []
    TsumFullUpgrade = []
    Iterations = 0
    MaxIterations = len(MainRowArray)
    while Iterations <= MaxIterations: #This will fire 1 time per Tsum
        print(Iterations)
        print(MainHeaderArray[Iterations][0]) #Holy this gives us Mickey ;
        print(MainHeaderArray[Iterations+1][0])
        print(MainHeaderArray[Iterations+2][0])
        print(MainHeaderArray[Iterations+3][0])
        TsumName.append(MainHeaderArray[Iterations][0])
        print(MainRowArray[Iterations][1])
        #At this point it will, of course, crash - that's because I only just realized I needed to append AND I just realized that everything
        #Isn't stored in a list as I thought, but rather a multi-dimensional array (as you can see below I didn't know this)
        TsumSeries[Iterations] = MainRowArray[Iterations+1]
        TsumBoxType[Iterations] = MainRowArray[Iterations+2]
        TsumSkillDescription[Iterations] = MainRowArray[Iterations+3]
        TsumFullCharge[Iterations] = MainRowArray[Iterations+4]
        TsumMinScore[Iterations] = MainRowArray[Iterations+5]
        TsumScoreIncreasePerLevel[Iterations] = MainRowArray[Iterations+6]
        TsumMaxScore[Iterations] = MainRowArray[Iterations+7]
        TsumFullUpgrade[Iterations] = MainRowArray[Iterations+8]
        Iterations += 9
        print(Iterations)
    print("It's Over")
    time.sleep(3)
    print(TsumName)
    print(TsumSkillDescription)
Edit:
tl;dr my goal here is to be like
"For this Mission Card I need a Blue Tsum with high score potential, a Monster's Inc Tsum for a bunch of games, and a Male Tsum for a long chain.. what's the best Tsum given those?" and it'll be like "SULLY!" and automatically select it or at the very least give you a list of Tsums. Like "These ones match all of them, these ones match 2, and these match 1"
Edit 2:
Here's the command Line Output for the code above:
https://pastebin.com/vpRsX8ni
Edit 3: Alright, just got back for a short break. With some minor looking over I see what happened - my append code is saying "Append this list to the array" meaning I've got a list of lists for both the Header and Row arrays that I'm storing. So I can confirm (for myself at least) that these aren't nested lists per se but they are definitely 2 lists, each containing a single list at every entry. Definitely not a dictionary or anything "special case" at least. This should help me quickly find an answer now that I'm not throwing "multi-dimensional list" around my google searches or wondering why the list stuff isn't working (as it's expecting 1 value and gets a list instead).
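The difference described in this edit comes down to append versus extend. A small illustration, with made-up values standing in for the scraped rows:

```python
# Placeholder rows, one per scraped page.
page_one = ["Mickey", "50"]
page_two = ["Minnie", "50"]

# append adds each page's list as a single element: a list of lists.
nested = []
nested.append(page_one)
nested.append(page_two)

# extend adds the elements of each page's list: one flat list.
flat = []
flat.extend(page_one)
flat.extend(page_two)

print(nested)  # the first name is nested[0][0]
print(flat)    # the first name is flat[0]
```

With the nested form, each page keeps its own boundaries, so a value is reached with two indexes (page, then field) rather than by stepping through one long list in strides.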
Edit 4:
I need to simply add another list! But super nested.
It'll just store the categories that the Tsum has as a string.
so Array[10] = ArrayOfCategories[Tsum] (which contains every attribute in string form that the Tsum has)
So that'll be ie TsumArray[10] = ["Black", "White Gloves", "Mickey & Friends"]
And then I can just use the "Switch" that I've already made in order to check them. Possibly. Not feeling too well and haven't gotten that far yet.
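With the categories stored per Tsum as described in Edit 4, the lookup from the first edit (which Tsums match all, two, or one of the requested traits) can be sketched as a count of shared traits. The trait sets below are made-up placeholders rather than scraped data, and rank_by_matches is a hypothetical helper name:

```python
from collections import defaultdict

# Placeholder data: name -> set of category strings.
tsum_traits = {
    "Sulley": {"Blue", "Monsters Inc", "Male"},
    "Mike": {"Green", "Monsters Inc", "Male"},
    "Mickey": {"Black", "White Gloves", "Mickey & Friends", "Male"},
    "Daisy": {"White", "Mickey & Friends"},
}


def rank_by_matches(traits_by_name, wanted):
    """Group names by how many wanted traits they have, best group first."""
    wanted = set(wanted)
    groups = defaultdict(list)
    for name, traits in traits_by_name.items():
        matches = len(wanted & traits)  # size of the set intersection
        if matches:
            groups[matches].append(name)
    return dict(sorted(groups.items(), reverse=True))


ranking = rank_by_matches(tsum_traits, ["Blue", "Monsters Inc", "Male"])
print(ranking)
```

Sets are used for the categories so that the number of matches is just the size of an intersection; lists loaded from JSON can be converted with set(...) first.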
Just use with open(...) as json_file and then write/read (super easy).
Ultimately I stored 3 JSON files. No big deal. Much easier than appending into one big file.
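For reference, the write/read round trip mentioned above can look like the following. This is only a sketch: the file name tsum_data.json and the record contents are placeholders, and a temporary directory is used so nothing is written to a real project folder.

```python
import json
import os
import tempfile

# Placeholder record; the real data would come from the scraper.
characters = {
    "Mickey": {
        "Maximum Level": "50",
        "Categories": ["Black", "White Gloves", "Mickey & Friends"],
    },
}

with tempfile.TemporaryDirectory() as target_dir:
    path = os.path.join(target_dir, "tsum_data.json")
    # Write the whole structure in one call.
    with open(path, "w", encoding="utf-8") as json_file:
        json.dump(characters, json_file, indent=1)
    # Read it back into ordinary dicts and lists.
    with open(path, encoding="utf-8") as json_file:
        loaded = json.load(json_file)

print(loaded == characters)
```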