Hacker News API, KeyError: 'title' - python

I am just starting to learn Python, and I am now working on the web-based side of it.
Following the instructions I have, I keep getting a KeyError: 'title' on line 18. As far as I can tell, the request is not returning a title, so the lookup fails. How would I write this so that it gives a generic description if there is no 'title'?
import requests
from operator import itemgetter as ig
url = 'https://hacker-news.firebaseio.com/v0/topstories.json'
r = requests.get(url)
print("Status Code:", r.status_code)
submission_ids = r.json()
submission_dicts = []
for submission_id in submission_ids[:30]:
    url = ("https://hacker-news.firebaseio.com/v0/item" + str(submission_id) + '.json')
    submission_r = requests.get(url)
    print(submission_r.status_code)
    response_dict = submission_r.json()
    submission_dict = {
        'title': response_dict['title'],
        'link': "https://news.ycombinator.com/item?id=" + str(submission_id),
        'comments': response_dict.get('descendants', 0)
    }
submission_dicts = sorted(submission_dicts, key=ig('comments'), reverse=True)
for submission_dict in submission_dicts:
    print("\nTitle:", submission_dict['title'])
    print("Discussion link:", submission_dict['link'])
    print("Comments:", submission_dict['comments'])
Status Code: 200
401
Traceback (most recent call last):
File "C:\Users\Shit Head\Documents\Programming\Tutorial Files\hn_submissions.py", line 18, in <module>
'title': response_dict['title'],
KeyError: 'title'
[Finished in 1.2s]

It sounds like you're just looking for the get method:
get(key[, default])
Return the value for key if key is in the dictionary, else default. If default is not given, it defaults to None, so that this method never raises a KeyError.
So, instead of this:
'title': response_dict['title'],
… you do this:
'title': response_dict.get('title', 'Generic Hacker News Submission'),
Under the covers, this is just a more convenient way to write something you could have done anyway. The following are all pretty much equivalent:
title = response_dict.get('title', 'Generic')
title = response_dict['title'] if 'title' in response_dict else 'Generic'
if 'title' in response_dict:
    title = response_dict['title']
else:
    title = 'Generic'
try:
    title = response_dict['title']
except KeyError:
    title = 'Generic'
This is worth knowing because Python usually provides shortcuts like get only for really common cases, such as looking things up in a dictionary. If you wanted to do the same thing with, say, a list that may be empty or have one item, or a file that might or might not exist, or a regular expression that might return a match with a group string or might return None, you'd need to do things the long way.
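As a rough sketch of how the get approach could be wrapped up for the submission dict: the helper name build_submission, the FALLBACK_TITLE constant, and the sample error payload below are made up for illustration and are not part of the tutorial code. Separately, the 401 printed in the question's output suggests the item request itself was rejected; the item URL there is built as ".../v0/item" + id with no slash after item, which may be why no 'title' came back.

```python
FALLBACK_TITLE = 'Generic Hacker News Submission'  # wording taken from the answer above


def build_submission(response_dict, submission_id):
    """Build the summary dict, tolerating responses without a 'title' key."""
    # Assumption: a failed request might decode to something that is not a
    # dict (for example None), so normalise it before calling .get().
    if not isinstance(response_dict, dict):
        response_dict = {}
    return {
        'title': response_dict.get('title', FALLBACK_TITLE),
        'link': 'https://news.ycombinator.com/item?id=' + str(submission_id),
        'comments': response_dict.get('descendants', 0),
    }


# A normal item, then a hypothetical error payload with no 'title' key.
print(build_submission({'title': 'Show HN', 'descendants': 12}, 1))
print(build_submission({'error': 'Permission denied'}, 2))
```

Keeping the fallback in one place means the later sorting and printing code can rely on every dict having the same three keys.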

Related

How do you search for incomplete or partial queries in Mongo?

I am using the following code, with perfect results, to pull records from Mongo.
@app.route('/search-all/', methods=['GET', 'POST'])
def search_all():
    query = request.form.get("SearchBox")
    try:
        for x in db['mongo-moviedb'].find({"Films.Title" : f"{query}" }, {"Films.$": 1}):
            title = x["Films"][0]["Title"]
            genre = x["Films"][0]["Genre"]
            cast = x["Films"][0]["Actors"]
            plot = x["Films"][0]["Plot"]
        return render_template('Date-Night.html', title=title,genre=genre,cast=cast,plot=plot)
    except (RuntimeError, TypeError, NameError):
        error = 'Not found.'
        return render_template('Date-Night.html', error = error)
The problem is it won't pull partial matches. I can get exact matches or the error prompt. I've tried using $regex and /like/ in the find() function to no avail. Haven't been able to find anything that returns a result that only partially matches the record. Also it has to include the 'query' variable pulled from form input.
Ex: 'Star' would return Star Wars, Starman, Wish Upon a Star....etc.
Thanks.
The correct syntax was:
for x in db['mongo-moviedb'].find({"Films.Title" : {"$regex" :f"{query}"} }, {"Films.$": 1}):
Thanks, Joe! It was in the first link you sent.
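Since the query comes straight from form input, it may be worth escaping it before handing it to $regex, and adding the "i" option so that 'star' also matches 'Star Wars'. The sketch below only builds the filter document; the helper name title_filter is hypothetical, and escaping with re.escape is an assumption about what is wanted here rather than something from the thread.

```python
import re


def title_filter(query):
    """Return a MongoDB filter matching film titles that contain `query`."""
    # re.escape keeps characters such as '.' or '(' from being treated as
    # regex syntax; "$options": "i" makes the match case-insensitive.
    return {"Films.Title": {"$regex": re.escape(query), "$options": "i"}}


print(title_filter("Star"))
print(title_filter("A.I."))
```

The returned dict would be passed as the first argument to find() in place of the inline filter.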

KeyError when parsing JSON from API

I need to parse data from a JSON file I've requested from an API. The API documentation provides example code showing how to implement it, but their method leads to the following KeyError:
Traceback (most recent call last):
File "testing2.py", line 11, in <module>
for flight in api_response['results']:
KeyError: 'results'
I've spent a lot of time looking for possible reasons but I can't find a solution that works for me. The Python file looks like this:
import requests
params = {
    'access_key': 'my_access_key'
}
results = requests.get('http://api.aviationstack.com/v1/flights?access_key=my_access_key&airline_name=Air%20India&flight_number=560')
api_response = results.json()
for flight in api_response['results']:
    if (flight['live']['is_ground'] is False):
        print(u'%s flight %s from %s (%s) to %s (%s) is in the air.' % (
            flight['airline']['name'],
            flight['flight']['iata'],
            flight['departure']['airport'],
            flight['departure']['iata'],
            flight['arrival']['airport'],
            flight['arrival']['iata']))
I've been able to read through the data with print(api_response), however I can't parse through any of it since the key "results" simply doesn't seem to exist.
Edit:
This is the result of simply going to the URL or of print(api_response), a large chunk of text:
{"pagination":{"limit":100,"offset":0,"count":4,"total":4},"data":[{"flight_date":"2020-09-02","flight_status":"landed","departure":{"airport":"Indira Gandhi International","timezone":"Asia/Kolkata","iata":"DEL","icao":"VIDP","terminal":"3","gate":"33","delay":4,"scheduled":"2020-09-02T07:15:00+00:00","estimated":"2020-09-02T07:15:00+00:00","actual":"2020-09-02T07:18:00+00:00","estimated_runway":"2020-09-02T07:18:00+00:00","actual_runway":"2020-09-02T07:18:00+00:00"},"arrival":{"airport":"Hyderabad Airport","timezone":"Asia/Kolkata","iata":"HYD","icao":"VOHS","terminal":"1","gate":null,"baggage":null,"delay":2,"scheduled":"2020-09-02T09:10:00+00:00","estimated":"2020-09-02T09:10:00+00:00","actual":"2020-09-02T09:09:00+00:00","estimated_runway":"2020-09-02T09:09:00+00:00","actual_runway":"2020-09-02T09:09:00+00:00"},"airline":{"name":"Air India","iata":"AI","icao":"AIC"},"flight":{"number":"560","iata":"AI560","icao":"AIC560","codeshared":null},"aircraft":{"registration":"VT-EXM","iata":"A20N","icao":"A20N","icao24":"800C4B"},"live":null},{"flight_date":"2020-09-01","flight_status":"landed","departure":{"airport":"Tirupati","timezone":"Asia/Kolkata","iata":"TIR","icao":"VOTP","terminal":null,"gate":null,"delay":null,"scheduled":"2020-09-01T10:20:00+00:00","estimated":"2020-09-01T10:20:00+00:00","actual":"2020-09-01T10:06:00+00:00","estimated_runway":"2020-09-01T10:06:00+00:00","actual_runway":"2020-09-01T10:06:00+00:00"},"arrival":{"airport":"Hyderabad Airport","timezone":"Asia/Kolkata","iata":"HYD","icao":"VOHS","terminal":"3","gate":null,"baggage":null,"delay":null,"scheduled":"2020-09-01T11:40:00+00:00","estimated":"2020-09-01T11:40:00+00:00","actual":"2020-09-01T10:51:00+00:00","estimated_runway":"2020-09-01T10:51:00+00:00","actual_runway":"2020-09-01T10:51:00+00:00"},"airline":{"name":"Air 
India","iata":"AI","icao":"AIC"},"flight":{"number":"560","iata":"AI560","icao":"AIC560","codeshared":null},"aircraft":null,"live":null},{"flight_date":"2020-09-01","flight_status":"landed","departure":{"airport":"Indira Gandhi International","timezone":"Asia/Kolkata","iata":"DEL","icao":"VIDP","terminal":"3","gate":"29B","delay":6,"scheduled":"2020-09-01T06:50:00+00:00","estimated":"2020-09-01T06:50:00+00:00","actual":"2020-09-01T06:56:00+00:00","estimated_runway":"2020-09-01T06:56:00+00:00","actual_runway":"2020-09-01T06:56:00+00:00"},"arrival":{"airport":"Hyderabad Airport","timezone":"Asia/Kolkata","iata":"HYD","icao":"VOHS","terminal":"2","gate":null,"baggage":null,"delay":null,"scheduled":"2020-09-01T09:20:00+00:00","estimated":"2020-09-01T09:20:00+00:00","actual":null,"estimated_runway":null,"actual_runway":null},"airline":{"name":"Air India","iata":"AI","icao":"AIC"},"flight":{"number":"560","iata":"AI560","icao":"AIC560","codeshared":null},"aircraft":{"registration":"VT-EXM","iata":"A20N","icao":"A20N","icao24":"800C4B"},"live":null},{"flight_date":"2020-09-01","flight_status":"landed","departure":{"airport":"Hyderabad Airport","timezone":"Asia/Kolkata","iata":"HYD","icao":"VOHS","terminal":null,"gate":null,"delay":null,"scheduled":"2020-09-01T12:45:00+00:00","estimated":"2020-09-01T12:45:00+00:00","actual":"2020-09-01T12:41:00+00:00","estimated_runway":"2020-09-01T12:41:00+00:00","actual_runway":"2020-09-01T12:41:00+00:00"},"arrival":{"airport":"Indira Gandhi International","timezone":"Asia/Kolkata","iata":"DEL","icao":"VIDP","terminal":"3","gate":null,"baggage":null,"delay":null,"scheduled":"2020-09-01T14:55:00+00:00","estimated":"2020-09-01T14:55:00+00:00","actual":"2020-09-01T14:42:00+00:00","estimated_runway":"2020-09-01T14:42:00+00:00","actual_runway":"2020-09-01T14:42:00+00:00"},"airline":{"name":"Air India","iata":"AI","icao":"AIC"},"flight":{"number":"560","iata":"AI560","icao":"AIC560","codeshared":null},"aircraft":null,"live":null}]}
Try this. It accesses the key data instead of the key results (which probably does not exist in the response), and it does not raise a TypeError when accessing the live key within a flight object, which might be null (when it is on the ground).
import requests
access_key = 'my_access_key'  # placeholder: replace with your own key
params = {
    'access_key': access_key,
    'airline_name': 'Air India',
    'flight_number': 560
}
results = requests.get('http://api.aviationstack.com/v1/flights', params=params)
api_response = results.json()
any_in_air = False
for flight in api_response['data']:
    # 'live' can be null, so check it before reading 'is_ground'
    if flight.get('live') and not flight['live']['is_ground']:
        any_in_air = True
        print('{} flight {} from {} ({}) to {} ({}) is in the air.'.format(
            flight['airline']['name'],
            flight['flight']['iata'],
            flight['departure']['airport'],
            flight['departure']['iata'],
            flight['arrival']['airport'],
            flight['arrival']['iata']))
if not any_in_air:
    print("All flights landed.")
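When a documented key turns out to be missing, listing the keys the response actually has is usually the quickest check. The sketch below does that on a trimmed-down copy of the response from the question and filters with .get() so that a null live block is skipped; the function name flights_in_air and the second, airborne flight are invented for illustration.

```python
def flights_in_air(api_response):
    """Return the flights whose 'live' block says they are not on the ground."""
    airborne = []
    for flight in api_response.get('data', []):
        live = flight.get('live') or {}  # 'live' may be missing or null
        if live.get('is_ground') is False:
            airborne.append(flight)
    return airborne


sample = {
    'pagination': {'limit': 100, 'offset': 0, 'count': 2, 'total': 2},
    'data': [
        {'flight': {'iata': 'AI560'}, 'live': None},
        # Hypothetical airborne flight, not from the real response.
        {'flight': {'iata': 'XX123'}, 'live': {'is_ground': False}},
    ],
}

print(sorted(sample.keys()))  # shows which top-level keys exist
print([f['flight']['iata'] for f in flights_in_air(sample)])
```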

Python: not every web page has a certain element

When I tried to scrape web pages from a set of URLs, I found that some elements exist only on some pages and not on others. Take the following code as an example.
Code:
for urls in article_url_set:
    re=requests.get(urls)
    soup=BeautifulSoup(re.text.encode('utf-8'), "html.parser")
    title_tag = soup.select_one('.page_article_title')
    if title_tag=True:
        print(title_tag.text)
    else:
        #do something
If title_tag exists, I want to print it; if it does not, just skip it.
Another thing is that I need to save other elements and title_tag.text in data.
data={
    "Title":title_tag.text,
    "Registration":fruit_tag.text,
    "Keywords":list2
}
This raises an error because not all articles have a title: 'NoneType' object has no attribute 'text'. What should I do to skip them when I try to save?
Edit: I decided not to skip them and to keep them as Null or None.
Your code is wrong:
for urls in article_url_set:
    re=requests.get(urls)
    soup=BeautifulSoup(re.text.encode('utf-8'), "html.parser")
    title_tag = soup.select_one('.page_article_title')
    if title_tag=True: # wrong
        print(title_tag.text)
    else:
        #do something
Your code has if title_tag=True, where a single = is an assignment and is not allowed in a condition.
The comparison should be written title_tag == True.
It is also recommended to write such conditions with the constant first:
title_tag == True => True == title_tag
That way a typo turns into an error instead of going unnoticed.
If the code says True = title_tag, Python raises an error.
You can simply use a truth test to check whether the tag exists, and otherwise assign a value like None. Then you can insert it into the data container:
title_tag = soup.select_one('.page_article_title')
if title_tag:
    print(title_tag.text)
    title = title_tag.text
else:
    title = None
Or in one line:
title = title_tag.text if title_tag else None
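The same idea can be applied to every field of the data dict at once. The following is only a sketch: text_or_none and build_record are hypothetical helper names, and SimpleNamespace objects stand in for what soup.select_one() returns (an object with a .text attribute, or None) so that the example runs without fetching any page.

```python
from types import SimpleNamespace


def text_or_none(tag):
    """Return tag.text, or None when the selector matched nothing."""
    return tag.text if tag is not None else None


def build_record(title_tag, fruit_tag, keywords):
    """Assemble the data dict, keeping missing elements as None."""
    return {
        "Title": text_or_none(title_tag),
        "Registration": text_or_none(fruit_tag),
        "Keywords": keywords,
    }


# Stand-ins for select_one() results: one match and one miss.
found = SimpleNamespace(text="Some article title")
record = build_record(found, None, ["a", "b"])
print(record)
```

Comparing against None explicitly avoids depending on how a tag object behaves in a boolean context.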

Not able to create a SOAP filter in suds

I have a SOAP request that takes the XML body below:
<x:Body>
  <ser:CreateExportJobRequest>
    <ser:ExportJobTypeName>Products</ser:ExportJobTypeName>
    <ser:ExportColumns>
      <ser:ExportColumn>Id</ser:ExportColumn>
      <ser:ExportColumn>itemName</ser:ExportColumn>
    </ser:ExportColumns>
    <ser:ExportFilters>
      <ser:ExportFilter id="updatedSince">
        <ser:Text>2.0</ser:Text>
      </ser:ExportFilter>
    </ser:ExportFilters>
    <ser:Frequency>ONETIME</ser:Frequency>
  </ser:CreateExportJobRequest>
</x:Body>
I can make a successful request using Boomerang.
Now I want to make the same request from my Python code, so I tried:
inputElement = client.factory.create('CreateExportJobRequest')
inputElement.ExportJobTypeName = "Products"
inputElement.ExportColumns.ExportColumn = ["Id", "itemName"]
inputElement.Frequency = 'ONETIME'
if updatedSince:
    inputElement.ExportFilters.ExportFilter = ['updatedSince']
t = client.service.CreateExportJob(inputElement.ExportJobTypeName, inputElement.ExportColumns, inputElement.ExportFilters, None, None, inputElement.Frequency)
I get an error,
'list' object has no attribute 'id'
Because a somewhat wrong XML request gets created
<ns1:ExportFilters>
  <ns1:ExportFilter>updatedSince</ns1:ExportFilter>
</ns1:ExportFilters>
So I tried a few other things for ExportFilter, like
inputElement.ExportFilters.ExportFilter = [{'id': 'updatedSince', 'text': updatedSince}]
and
inputElement.ExportFilters.ExportFilter = [('updatedSince', updatedSince)]
and
inputElement.ExportFilters.ExportFilter = [{'updatedSince': updatedSince}]
# says, Type not found: 'updatedSince'
and
inputElement.ExportFilters.ExportFilter = [
    {'key': 'updatedSince', 'value': {'key': 'eq', 'value': updatedSince}}
]
# says, Type not found: 'value'
but nothing is working.
Before setting ExportFilter, its value is in the form of
ExportFilters: (ExportFilters){
    ExportFilter[] = <empty>
}
Please help.
After debugging and going through some suds code, I have found the fix.
The complete code snippet of the fix:
inputElement = client.factory.create('CreateExportJobRequest')
inputElement.ExportJobTypeName = "Products"
inputElement.ExportColumns.ExportColumn = ["Id", "itemName"]
inputElement.Frequency = 'ONETIME'
if updatedSince:
    efilter = client.factory.create("ExportFilter")
    efilter._id = 'updatedSince'
    efilter.Text = updatedSince
    inputElement.ExportFilters.ExportFilter.append(efilter)
t = client.service.CreateExportJob(inputElement.ExportJobTypeName, inputElement.ExportColumns, inputElement.ExportFilters, None, None, inputElement.Frequency)
Debugging:
Because suds was raising a TypeNotFound exception, I looked for all the places inside suds that raise TypeNotFound and set breakpoints on them in PyCharm.
I found that the start method of the Typed class in suds/mx/literal.py was raising the error I was getting.
def start(self, content):
    #
    # Start marshalling the 'content' by ensuring that both the
    # 'content' _and_ the resolver are primed with the XSD type
    # information. The 'content' value is both translated and
    # sorted based on the XSD type. Only values that are objects
    # have their attributes sorted.
    #
    log.debug('starting content:\n%s', content)
    if content.type is None:
        name = content.tag
        if name.startswith('_'):
            name = '#'+name[1:]
        content.type = self.resolver.find(name, content.value)
        if content.type is None:
            raise TypeNotFound(content.tag)
    else:
        known = None
        if isinstance(content.value, Object):
            known = self.resolver.known(content.value)
            if known is None:
                log.debug('object has no type information', content.value)
                known = content.type
        frame = Frame(content.type, resolved=known)
        self.resolver.push(frame)
    frame = self.resolver.top()
    content.real = frame.resolved
    content.ancestry = frame.ancestry
    self.translate(content)
    self.sort(content)
    if self.skip(content):
        log.debug('skipping (optional) content:\n%s', content)
        self.resolver.pop()
        return False
    else:
        return True
This method handles names that start with an underscore specially, which is what led me to set efilter._id in the fix.
Still, it would be really great if somebody could suggest a standard procedure for this.

Django and JSON/AJAX testing

I tried looking around for an answer and gave it a great many tries, but there's something strange going on here. I got some functions in my view that operate on JSON data that comes in via AJAX. Currently I'm trying to do some unit testing on these.
In my test case I have:
kwargs = {'HTTP_X_REQUESTED_WITH': 'XMLHttpRequest'}
url = '/<correct_url>/upload/'
data = {
    "id" : p.id
}
c = Client()
response = c.delete(url, data, **kwargs)
content_unicode = response.content.decode('utf-8')
content = json.loads(content_unicode)
p.id is just an integer that comes from a model I'm using.
I then have the function that is being tested, part of which looks as follows:
def delete_ajax(self, request, *args, **kwargs):
    print (request.body)
    body_unicode = request.body.decode('utf-8')
    print (body_unicode)
    body_json = json.loads(body_unicode)
The first print statement yields:
.....b"{'id': 1}"
The other one:
{'id': 1}
and finally I get the following error on the fourth line:
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
What's going wrong here? I understand that the correct JSON format should be {"id": 1}, and that's what I'm sending from my test case. But somewhere along the way single quotes are introduced into the mix, causing me a headache.
Any thoughts?
You need to pass a JSON string to Client.delete(), not a Python dict:
kwargs = {'HTTP_X_REQUESTED_WITH': 'XMLHttpRequest'}
url = '/<correct_url>/upload/'
data = json.dumps({
    "id" : p.id
})
c = Client()
response = c.delete(url, data, **kwargs)
You should also set the content-type header to "application/json" and check the content-type header in your view but that's another topic.
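The body printed in the view looks like Python's str() of the dict rather than JSON, which would explain where the single quotes come from (this is an inference from the printed output, not something confirmed in the thread). A small standalone sketch of the difference, using only the standard library:

```python
import json

payload = {"id": 1}

as_str = str(payload)          # Python repr of the dict, single-quoted keys
as_json = json.dumps(payload)  # real JSON, double-quoted keys

# The repr form is not valid JSON, so json.loads rejects it.
try:
    json.loads(as_str)
    str_is_valid_json = True
except json.JSONDecodeError:
    str_is_valid_json = False

print(as_str, str_is_valid_json)
print(as_json, json.loads(as_json) == payload)
```

Only the json.dumps form survives the round trip through json.loads, which is why the test has to serialise the dict itself before sending it.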
