Need to understand my error message to move forward - python

I am trying to use the following to retrieve stock data from Yahoo. Can anyone tell me why this is not working? I would be super grateful for a reply.
Here is my input:
import pandas_datareader as pdweb
from pandas_datareader import data, wb
import datetime
prices = pdweb.get_data_yahoo(['CVX', 'XOM','BP'], start=datetime.datetime(2010,1,1), end=datetime.datetime(2013,1,1)) ,
['Adj Close']
prices.head()
Here is the error message:
AttributeError Traceback (most recent call last)
<ipython-input-9-95b02a209848> in <module>()
----> 1 prices = pdweb.get_data_yahoo(['CVX', 'XOM','BP'], start=datetime.datetime(2010,1,1), end=datetime.datetime(2013,1,1)) ,
2 ['Adj Close']
3
4 prices.head()
AttributeError: 'tuple' object has no attribute 'head'

It looks like the function you are calling is returning a tuple. And it looks like you want to access an instance of the class pdweb. To do this do:
P = pdweb()
Tuple = p.get_data_yahoo()
P.head()
Just a suggestion, I don't know how this library works, but I hope that this helps!
Edit:
Actually, as inspectorg4det said, prices is a tuple and not a DataFrame, so you would use prices[index] to get an element out of it. I do not know what I was thinking when I saw this question.
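The traceback is consistent with the trailing comma at the end of the first line: in Python, a comma after an expression builds a one-element tuple, and the ['Adj Close'] on the following line is a separate statement that is never applied to the result. Here is a minimal sketch of that behaviour. It uses a hypothetical fetch_prices function as a stand-in for get_data_yahoo (it returns a plain dict, not a DataFrame) so that it runs without network access:

```python
def fetch_prices():
    """Hypothetical stand-in for get_data_yahoo; returns a plain dict, not a DataFrame."""
    return {"Adj Close": [75.1, 76.3], "Volume": [1000, 1200]}


# Trailing comma: the right-hand side becomes a one-element tuple.
with_comma = fetch_prices() ,
print(type(with_comma).__name__)

# No trailing comma, with the selection written as part of the same expression.
adj_close = fetch_prices()["Adj Close"]
print(adj_close)

# The object inside the tuple is still reachable by index.
print(with_comma[0]["Adj Close"] == adj_close)
```

Applied to the question, that would mean dropping the comma and writing the ['Adj Close'] selection directly after the closing parenthesis of the call.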

Related

How to import API data using Pandas?

I am trying to pull some data from the EIA API. Below is what I tried, but I'm getting this error on the first line of code:
AttributeError: 'str' object has no attribute 'text'
Any help would be much appreciated!
call_eia = requests.get = 'https://api.eia.gov/v2/nuclear-outages/facility-nuclear-outages/data?api_key=XXXXXXXX'
data_eia=pd.read_csv(StringIO(call_eia.text))
You haven't requested anything from the API. Look carefully at your first line:
call_eia = requests.get = 'https://api.eia.gov/v2/nuclear-outages/facility-nuclear-outages/data?api_key=XXXXXXXX'
#        ^              ^
There are 2 = signs, so what you're really doing is assigning the URL string to both your call_eia variable and the get attribute of the requests module, overwriting the function that was there originally. Then, when you read call_eia.text to pass it to pd.read_csv(), call_eia is not a response object but just a string, the URL, and a string has no text attribute.
Try
call_eia = requests.get('https://api.eia.gov/v2/nuclear-outages/facility-nuclear-outages/data?api_key=XXXXXXXX')
instead and your code should work.
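The difference between calling a function and chained assignment can be reproduced without sending any request. The sketch below uses hypothetical stand-ins (fake_get, fake_requests and a placeholder URL) in place of the real requests module:

```python
from types import SimpleNamespace


def fake_get(url):
    """Hypothetical stand-in for requests.get; returns an object with a .text attribute."""
    return SimpleNamespace(text="facility,outage\nA,10\n")


# Hypothetical stand-in for the requests module, so no request is actually sent.
fake_requests = SimpleNamespace(get=fake_get)
url = "https://example.invalid/data"

# Calling the function returns a response-like object.
response = fake_requests.get(url)
print(type(response.text).__name__)

# Chained assignment binds the same string to both targets,
# replacing the function stored in the `get` attribute.
call_result = fake_requests.get = url
print(call_result is fake_requests.get)
print(hasattr(call_result, "text"))
```

In an interactive session the overwritten attribute stays in place, so after correcting the line the module would need to be reloaded, or the session restarted, before requests.get is a function again.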

Translate a Pandas df using googletrans, AttributeError error

I am trying to translate words from a Pandas dataframe column and get an error in the googletrans.Translator() class. It works normally with single words or phrases. Can it be an environmental issue?
Any help or suggestions much appreciated
import pandas as pd
from googletrans import Translator
translator = Translator()
df = pd.DataFrame({'Spanish':['piso','cama']})
df['English'] = df['Spanish'].apply(translator.translate, src='es', dest='en').apply(getattr, args=('text',))
Output:
AttributeError: 'Translator' object has no attribute 'raise_Exception'
This error occurred because an exception was raised at runtime. To see the underlying error, insert the line below after creating the translator:
translator.raise_Exception = True
If you get the error below
Exception: Unexpected status code "429" from ['translate.google.com']
it means Too Many Requests. Hopefully you will not get this error; if you do, you have to upgrade your account. To avoid the error, please refer to this answer
Source 1

Unable to retrieve value from dictionary after webscraping

I was hoping people on here would be able to answer what I believe to be a simple question. I'm a complete newbie and have been attempting to create an image webscraper from the site Archdaily. Below is my code so far after numerous attempts to debug it:
#### - Webscraping 0.1 alpha -
#### - Archdaily -
import requests
from bs4 import BeautifulSoup
# Enter the URL of the webpage you want to download the images from
page = 'https://www.archdaily.com/63267/ad-classics-house-vi-peter-eisenman/5037e0ec28ba0d599b000190-ad-classics-house-vi-peter-eisenman-image'
# Returns the webpage source code under page_doc
result = requests.get(page)
page_doc = result.content
# Returns the source code as BeautifulSoup object, as nested data structure
soup = BeautifulSoup(page_doc, 'html.parser')
img = soup.find('div', class_='afd-gal-items')
img_list = img.attrs['data-images']
for k, v in img_list():
    if k == 'url_large':
        print(v)
These elements here:
img = soup.find('div', class_='afd-gal-items')
img_list = img.attrs['data-images']
attempt to isolate the data-images attribute, shown here:
My github upload of this portion, very long
As you can see, or maybe I'm completely wrong here, my attempts to call the 'url_large' values from this final dictionary list comes to a TypeError, shown below:
Traceback (most recent call last):
  File "D:/Python/Programs/Webscraper/Webscraping v0.2alpha.py", line 23, in <module>
    for k, v in img_list():
TypeError: 'str' object is not callable
I believe my error lies in the resulting isolation of 'data-images', which to me looks like a dict within a list, as they're wrapped by brackets and curly braces. I'm completely out of my element here because I basically jumped into this project blind (haven't even read past chapter 4 of Guttag's book yet).
I also looked everywhere for ideas and tried to mimic what I found. I've found solutions others have offered previously to change the data to JSON data, so I found the code below:
jsonData = json.loads(img.attrs['data-images'])
print(jsonData['url_large'])
But that was a bust, shown here:
Traceback (most recent call last):
  File "D:/Python/Programs/Webscraper/Webscraping v0.2alpha.py", line 29, in <module>
    print(jsonData['url_large'])
TypeError: list indices must be integers or slices, not str
There is a step I'm missing here in changing these string values, but I'm not sure where I could change them. I'm hoping someone can help me resolve this issue, thanks!
It's all about the types.
img_list is actually not a list, but a string. Writing img_list() tries to call it, which results in the error.
You had the right idea of parsing it with json.loads. The second error is pretty straightforward - jsonData is a list, not a dictionary, because you have more than one image.
You can loop through the list. Each item in the list is a dictionary, and you'll be able to find the url_large key in each dictionary in the list:
images_json = img.attrs['data-images']
for image_properties in json.loads(images_json):
    print(image_properties['url_large'])
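To see the type change without fetching the page, here is a small sketch on a made-up string in the same shape as the attribute (a JSON array of objects). The URLs and the id field are placeholders, not values from the real page:

```python
import json

# Made-up sample: a JSON array of objects, held as plain text.
images_json = (
    '[{"url_large": "https://example.invalid/a.jpg", "id": 1},'
    ' {"url_large": "https://example.invalid/b.jpg", "id": 2}]'
)

# The raw attribute value is text.
print(type(images_json).__name__)

# After parsing, the outer value is a list and each item is a dict.
images = json.loads(images_json)
print(type(images).__name__, type(images[0]).__name__)

# Collect one value per image.
large_urls = [image["url_large"] for image in images]
print(large_urls)
```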
@infinity & @simic0de are both right, but I wanted to more explicitly address what I see in your code as well.
In this particular block:
img_list = img.attrs['data-images']
for k, v in img_list():
    if k == 'url_large':
        print(v)
There are a couple of errors.
If 'img_list' truly WAS a dictionary, you could not iterate through it this way. You would need to use img_list.items() (for python3) or img_list.iteritems() (python2) in the second line.
When you use the parenthesis like that, it implies that you're calling a function. But here, you're trying to iterate through a dictionary. That is why you get the 'is not callable' error.
The other main issue is the Type issue. simic0de & Infinity address that, but ultimately you need to check the type of img_list and convert it as needed so you can iterate through it.
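A short Python 3 sketch of both points, using a made-up dictionary rather than the scraped data: iterating key/value pairs goes through .items(), and putting parentheses after a non-callable object raises the same kind of TypeError as in the question.

```python
# Made-up dictionary standing in for one image entry.
image = {
    "url_large": "https://example.invalid/a.jpg",
    "url_small": "https://example.invalid/a_small.jpg",
}

# Iterating over key/value pairs needs .items(); a bare dict yields only keys.
found = [value for key, value in image.items() if key == "url_large"]
print(found)

# Parentheses mean "call this object"; a dict is not callable.
try:
    image()
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```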
Source of error:
img_list is a string. You have to convert it using json.loads, and it then becomes a list of dicts that you have to loop over.
Working Solution:
import json
import requests
from bs4 import BeautifulSoup
# Enter the URL of the webpage you want to download the images from
page = 'https://www.archdaily.com/63267/ad-classics-house-vi-peter-eisenman/5037e0ec28ba0d599b000190-ad-classics-house-vi-peter-eisenman-image'
# Returns the webpage source code under page_doc
result = requests.get(page)
page_doc = result.content
# Returns the source code as BeautifulSoup object, as nested data structure
soup = BeautifulSoup(page_doc, 'html.parser')
img = soup.find('div', class_='afd-gal-items')
img_list = img.attrs['data-images']
for img in json.loads(img_list):
    for k, v in img.items():
        if k == 'url_large':
            print(v)

python-for-list index out of range

I am a beginner at Python. Could someone point out why it keeps saying
Traceback (most recent call last):
  File "C:/Python27/practice example/datascraper templates.py", line 21, in <module>
    print findPatTitle[i]
IndexError: list index out of range
Thanks a lot.
Here is the code:
from urllib import urlopen
from BeautifulSoup import BeautifulSoup
import re
webpage=urlopen('http://www.voxeu.org/').read()
patFinderTitle=re.compile('<title>(.*)</title>') ##title tag
patFinderLink=re.compile('<link rel.*href="(.*)"/>') ##link tag
findPatTitle=re.findall(patFinderTitle,webpage)
findPatLink=re.findall(patFinderLink,webpage)
listIterator=[]
listIterator=range(2,16)
for i in listIterator:
    print findPatTitle[i]
    print findPatLink[i]
    print '/n'
The error message is perfectly descriptive.
You're trying to access a hard-coded range of indices (2,16) into findPatTitle, but you have no idea how many items there are.
When you want to iterate over multiple similar collections simultaneously, use zip().
for title, link in zip(findPatTitle, findPatLink):
    print 'Title={0} Link={1}'.format(title, link)
The problem is you have a different number of results than you expected. Don't hard-code that. But let's also rewrite this to be a bit more pythonic:
Replace this:
listIterator=[]
listIterator=range(2,16)
for i in listIterator:
    print findPatTitle[i]
    print findPatLink[i]
    print '/n'
with the two lists zipped together:
for title, link in zip(findPatTitle, findPatLink):
    print title
    print link
    print '\n'
This will loop over both at once, however long the list is. 1 element or 100 elements, it makes no difference.
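As a rough illustration with made-up data, here is the same idea in Python 3 syntax (the code in this thread is Python 2). Note that zip() stops at the shorter of the two lists, so no index can run past the end:

```python
# Made-up sample data; the two lists deliberately differ in length.
titles = ["First post", "Second post", "Third post"]
links = ["http://example.invalid/1", "http://example.invalid/2"]

# zip() pairs items up and stops when the shorter list is exhausted.
pairs = list(zip(titles, links))
for title, link in pairs:
    print("Title={0} Link={1}".format(title, link))

# A hard-coded index beyond the end of the list raises IndexError.
try:
    titles[5]
except IndexError as exc:
    error_message = str(exc)
    print(error_message)
```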

Google search with python is sporadically non-accurate and has Type Errors

I am using some code I found here on SO to google search a set of strings and return the "expected" amount of results. Here is that code:
for a in months:
    for b in range(1, daysInMonth[a] + 1):
        #Code
        if not myString:
            googleStats.append(None)
        else:
            try:
                query = urllib.urlencode({'q': myString})
                url = 'http://ajax.googleapis.com/ajax/services/search/web?v=1.0&%s' % query
                search_response = urllib.urlopen(url)
                search_results = search_response.read()
                results = json.loads(search_results)
                data = results['responseData']
                googleStats.append(data['cursor']['estimatedResultCount'])
            except TypeError:
                googleStats.append(None)
for x in range(0, len(googleStats)):
    if googleStats[x] != None:
        finalGoogleStats.append(googleStats[x])
There are two problems, which may be related. When I return the len(finalGoogleStats), it's different every time. One time it's 37, then it's 12. However, it should be more like 240.
This is the TypeError I receive when I take out the try/except:
TypeError: 'NoneType' object has no attribute '__getitem__'
which occurs on line
googleStats.append(data['cursor']['estimatedResultCount'])
So, I just can't figure out why the number of Nones in googleStats changes every time and it's never as low as it should be. If anyone has any ideas, I'd love to hear them, thanks!
UPDATE
When I try to print out data for every thing I'm searching, I get a ton of Nones and very, very few actual JSON dictionaries. The dictionaries I do get are spread out across all the searches, and I don't see a pattern in what is a None and what isn't. So, the problem looks like it has more to do with the Google API than anything else.
First, I'd say remove your try..except clause and see where exactly the problem is. Then, as a general good practice, when you access layers of dictionary elements, use the .get() method instead for better control.
As a demonstration of your possible TypeError, here is my educated guess:
>>> a = {}
>>> a['lol'] = None
>>> a['lol']['teemo']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'NoneType' object has no attribute '__getitem__'
>>>
There are ways to use .get(), for a simple demonstration:
>>> a = {}
>>> b = a.get('lol') # will return None
>>> if type(b) is dict: # determine type
...     print b.get('teemo') # same technique if b is indeed of type dict
...
>>>
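Putting the two demonstrations together, a layered lookup could be wrapped in a small helper. This is only a sketch in Python 3 syntax: estimated_count is a hypothetical name, and the two payloads are made up in the shape the question describes.

```python
# Made-up payloads in the shape the question describes.
ok = {"responseData": {"cursor": {"estimatedResultCount": "1200"}}}
blocked = {"responseData": None}


def estimated_count(results):
    """Return the estimated result count, or None if any layer is missing."""
    data = results.get("responseData")
    if not isinstance(data, dict):
        return None
    cursor = data.get("cursor")
    if not isinstance(cursor, dict):
        return None
    return cursor.get("estimatedResultCount")


print(estimated_count(ok))
print(estimated_count(blocked))
print(estimated_count({}))
```

Unlike a broad try/except, this returns None only when a layer is missing, so any other unexpected error still surfaces.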
The answer is what I was fearing for a while, but thanks to everyone who tried to help; I upvoted you if anything was useful.
So, Google seems to randomly freak out that I'm searching so much stuff. Here's the error they give me:
Suspected Terms of Service Abuse ...... responseStatus:403
So, I guess they put limits on how much I can search with them. What is still strange, though, is that it doesn't happen all the time, I still get sporadic successful searches within the sea of errors. That is still a mystery...
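One way to make such rejections visible, instead of silently collecting None, would be to check the status field before reading the data. The sketch below parses two made-up response bodies in Python 3. Only the responseData and responseStatus keys come from this thread, and the value 200 for a successful response is an assumption:

```python
import json

# Made-up response bodies; only the responseData and responseStatus keys
# appear in the thread, and 200 as the success status is an assumption.
bodies = [
    '{"responseData": {"cursor": {"estimatedResultCount": "1200"}}, "responseStatus": 200}',
    '{"responseData": null, "responseStatus": 403}',
]

counts = []
rejected = []
for body in bodies:
    results = json.loads(body)
    if results.get("responseStatus") != 200 or results.get("responseData") is None:
        # Record the status instead of silently appending None.
        rejected.append(results.get("responseStatus"))
        continue
    counts.append(results["responseData"]["cursor"]["estimatedResultCount"])

print(counts)
print(rejected)
```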
By default the Google API returns the smallest number of results. If you want to increase the number of results returned, add another parameter, 'rsz=8', to your URL (by default rsz=1, hence the small result).
So your new URL becomes:
url = 'http://ajax.googleapis.com/ajax/services/search/web?v=1.0&rsz=8&%s' % query
See the detailed documentation here: https://developers.google.com/web-search/docs/reference#_class_GSearch
