Specific exception output - python

I am trying to find out which portion of my code raises a KeyError for my events list. Events is a list that contains JSON elements. I want to put timestamp, event_sequence_number, and device_id in their respective variables. However, each JSON object is different, and some do not contain the timestamp, event_sequence_number, or device_id keys. How can I change my code so that it outputs which specific key(s) are missing?
ex:
When timestamp is missing
"timestamp key is missing"
when timestamp and device_id is missing
"timestamp key is missing"
"device_id key is missing"
etc
Code:
for event in events:
    try:
        timestamp = event["event"]["timestamp"]
        event_sequence_num = event["event"]["properties"]["event_sequence_number"]
        device_id = event["application"]["mobile"]["device_id"]
        event_identifier = str(device_id) + "_" + str(timestamp) + "_" + str(event_sequence_num)
        event_dict[event_identifier] = 1
    except KeyError:
        print "JSON Key does not exist"

You can print the exception as that will include the key for which the KeyError was raised:
except KeyError as exc:
    print "JSON Key does not exist: " + str(exc)
You can also access the key by looking at exc.args[0]:
except KeyError as exc:
    print "JSON Key does not exist: " + str(exc.args[0])
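To make the difference between the two forms concrete, here is a minimal sketch in Python 3 syntax. The `event` dict is made up for the example. `str(exc)` yields the repr of the key, quotes included, while `exc.args[0]` is the key object itself, which is handier for building a message such as "timestamp key is missing".

```python
# Hypothetical event that lacks the "timestamp" key (made up for this sketch).
event = {"event": {"properties": {"event_sequence_number": 7}}}

try:
    timestamp = event["event"]["timestamp"]
except KeyError as exc:
    as_text = str(exc)      # repr of the key, including the quotes
    raw_key = exc.args[0]   # the key object itself
    print("JSON Key does not exist: " + as_text)
    print("{0} key is missing".format(raw_key))
```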

Simeon Visser's answer is spot-on. Reporting the key causing the KeyError is probably the best that can be done in bare, straightforward Python. If you're only accessing the JSON structure once, that's the way to go.
I offer a longer alternative, however, for situations where you need to access the multi-level event data repeatedly. If you're accessing it often, your program can afford a few more lines of setup and infrastructure. Consider:
def getpath(obj, path, post=str):
    """
    Use path as sequence of keys/indices into obj. Return the value
    there, filtered through the post (postprocessing function).
    If there is no such value, raise KeyError displaying the
    partial path to the point where there is no index/key.
    """
    c = obj
    try:
        for i, p in enumerate(path):
            c = c[p]
        return post(c) if post else c
    except (KeyError, IndexError) as e:
        msg = "JSON keys {0!r} don't exist".format(path[:i+1])
        raise KeyError(msg)
        # raise type(e)(msg) # Alternative if you want more exception variety

EID_COMPONENTS = [('application', 'mobile', 'device_id'),
                  ('event', 'timestamp'),
                  ('event', 'properties', 'event_sequence_number')]

for event in events:
    event_identifier = '_'.join(getpath(event, p) for p in EID_COMPONENTS)
    event_dict[event_identifier] = 1
There is more preparation here, with a separate getpath function and globally defined specification of what paths into the JSON data to get. On the plus side, the assembly of event_identifier is much shorter (if it were wrapped in a function, it'd be about 1/3 the size in either source lines or bytecodes).
If an attempted access fails, it returns a more complete error message, giving the path into the structure up to that point, not just the final key that was missing. In complex JSON with duplicated keys in different sub-structures (multiple timestamps, e.g.), knowing which attempted access failed can save you much debugging effort. You may also notice that the code is prepared to use integer indices and gracefully handle IndexError; in JSON, array values are common.
This is abstraction in action: More framework and more setup, but if you need to do a lot of deep structure accesses, the code size savings and better error reporting would advantage multiple parts of your program, making it potentially a good investment.
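Both approaches above stop at the first missing key, whereas the question asks for a message per missing key. One way to get that is to check every path independently and collect the ones that fail. The following is only a sketch: the paths come from the question, but `REQUIRED_PATHS`, `lookup`, and `missing_keys` are hypothetical names introduced for illustration, and the sample event is made up.

```python
# Paths are taken from the question; all helper names here are hypothetical.
REQUIRED_PATHS = {
    "timestamp": ("event", "timestamp"),
    "event_sequence_number": ("event", "properties", "event_sequence_number"),
    "device_id": ("application", "mobile", "device_id"),
}

_MISSING = object()  # sentinel, so that a stored None still counts as present


def lookup(obj, path):
    """Follow path into obj; return _MISSING if any step cannot be taken."""
    current = obj
    for step in path:
        try:
            current = current[step]
        except (KeyError, IndexError, TypeError):
            return _MISSING
    return current


def missing_keys(event):
    """Return the names of all required keys that event does not contain."""
    return [name for name, path in REQUIRED_PATHS.items()
            if lookup(event, path) is _MISSING]


# Made-up event with neither a timestamp nor an application section.
sample = {"event": {"properties": {"event_sequence_number": 3}}}
for name in missing_keys(sample):
    print("{0} key is missing".format(name))
```

If `missing_keys(event)` returns an empty list, the event identifier can be built as before without a try/except.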

Related

Web scraping table with missing attributes via Python Selenium and Pandas

I am scraping a table from a website but encountering empty cells during the process. The try-except block below is messing up the data at the end. I also don't want to exclude the complete row, as the information is still relevant even when some attribute is missing.
try:
    for i in range(10):
        data = {'ID': IDs[i].get_attribute('textContent'),
                'holder': holder[i].get_attribute('textContent'),
                'view': view[i].get_attribute('textContent'),
                'material': material[i].get_attribute('textContent'),
                'Addons': addOns[i].get_attribute('textContent'),
                'link': link[i].get_attribute('href')}
        list.append(data)
except:
    print('Error')
Any ideas?
What you can do is place all the objects whose attributes you want to access in a dictionary like this:
objects={"IDs":IDs,"holder":holder,"view":view,"material":material...}
Then you can iterate through this dictionary and if the specific attribute does not exist, simply append an empty string to the value corresponding to the dict key. Something like this:
the_keys=list(objects.keys())
for i in range(len(objects["IDs"])): #I assume the ID field will never be empty
    #so making a for loop like this is better since you iterate only through
    #existing objects
    data={}
    for j in range(len(objects)):
        try:
            data[the_keys[j]]=objects[the_keys[j]][i].get_attribute('textContent')
        except Exception as e:
            print("Exception: {}".format(e))
            data[the_keys[j]]="" #this means we had an exception
            #it is better to catch the specific exception that is thrown
            #when the attribute of the element does not exist but I don't know what it is
    list.append(data)
I don't know if this code works since I didn't try it but it should give you an overall idea on how to solve your problem.
If you have any questions, doubts, or concerns please ask away.
Edit: To get another object's attribute like the href you can simply include an if statement checking the value of the key. I also realized you can just loop through the objects dictionary getting the keys and values instead of accessing each key and value by an index. You could change the inner loop to be like this:
for key,value in objects.items():
    try:
        if key=="link":
            data[key]=objects[key][i].get_attribute("href")
        else:
            data[key]=objects[key][i].get_attribute("textContent")
    except Exception as e:
        print("Error: ",e)
        data[key]=""
Edit 2:
data={}
for i in list(objects.keys()):
    data[i]=[]
for key,value in objects.items():
    for i in range(len(objects["IDs"])):
        try:
            if key=="link":
                data[key].append(objects[key][i].get_attribute("href"))
            else:
                data[key].append(objects[key][i].get_attribute("textContent"))
        except Exception as e:
            print("Error: ",e)
            data[key].append("")
Try with this. You won't have to append the data dictionary to the list. Without the original data I won't be able to help much more. I believe this should work.
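On the point about catching a specific exception: assuming the failure comes from one of the element lists being shorter than the ID list, the exception raised by `objects[key][i]` would be an IndexError. The sketch below narrows the except clause accordingly. `FakeElement` is a stand-in written only for this example so that it runs without a browser, and `attr_or_blank` is a hypothetical helper name. As far as I know, a real element's `get_attribute` returns None for an attribute that does not exist, so that case is mapped to an empty string as well.

```python
class FakeElement:
    """Stand-in for a Selenium element, used only so this sketch is runnable."""

    def __init__(self, attrs):
        self._attrs = attrs

    def get_attribute(self, name):
        # Assumed to mirror the real method: None when the attribute is absent.
        return self._attrs.get(name)


def attr_or_blank(elements, index, attr="textContent"):
    """Return elements[index].get_attribute(attr), or "" if it is unavailable."""
    try:
        value = elements[index].get_attribute(attr)
    except IndexError:  # the list has no element at this position
        return ""
    return value if value is not None else ""


# Made-up data: two IDs but only one holder cell.
IDs = [FakeElement({"textContent": "A1"}), FakeElement({"textContent": "B2"})]
holder = [FakeElement({"textContent": "Smith"})]

rows = [{"ID": attr_or_blank(IDs, i), "holder": attr_or_blank(holder, i)}
        for i in range(len(IDs))]
print(rows)
```

One caveat with any index-based approach: if a cell is missing in the middle of the table, the separately collected lists shift relative to each other, so later rows can end up with values from the wrong row. Locating the cells row by row avoids that, if the page structure allows it.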

Python: Receiving 'none' when trying to add the string of an exception to a dictionary

So I'm trying to get this working, where I remove the week's stats (weeklydict) from this second's stats (instantdict) so I have an accurate weekly progress for all keys of instantdict (keys being members). It works fine and dandy, but when a new member joins (adding to the keys in instantdict), shit hits the fan, so I use try/except, and attempt to add the missing member to weeklydict too, except when I do that using except KeyError as e and str(e), I'm given a 'none' value. Any idea on what to do?
Code:
for member, wins in instantDict.items():
    try:
        instantDict[member] = instantDict[member] - weeklyDict[member]
    except KeyError as e:
        weeklyDict[str(e)] = instantDict.get(str(e)) #error occurs here
        instantDict[member] = instantDict[member] - weeklyDict[member] #thus fucking this up
Based on my testing, str(e) returns a string as such:
"'test'"
The value is a string displaying a string, so .get() is not finding the value. Try something like:
for member, wins in instantDict.items():
    try:
        instantDict[member] = instantDict[member] - weeklyDict[member]
    except KeyError as e:
        weeklyDict[str(e).strip("'")] = instantDict.get(str(e).strip("'"))
        instantDict[member] = instantDict[member] - weeklyDict[member]
That should take the extra string characters off of the keyword, and allow .get() to actually find the value.
Alternatively, if you know that it errored because you know that member is not in the dictionary, why pull the exact same variable from the exception when you could just use member again?
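A minimal sketch of that alternative, using `member` directly and no exception handling at all. The data is made up, and it assumes (as the question's except branch suggests) that a new member's weekly baseline should be their current total, so their progress starts at zero.

```python
instantDict = {"alice": 12, "bob": 7, "carol": 3}  # made-up current totals
weeklyDict = {"alice": 10, "bob": 7}               # carol joined this week

for member, wins in instantDict.items():
    # setdefault returns the stored baseline, inserting the current total
    # first if the member is not in weeklyDict yet.
    baseline = weeklyDict.setdefault(member, wins)
    instantDict[member] = wins - baseline

print(instantDict)
print(weeklyDict)
```

Reassigning values for existing keys while iterating over `items()` is fine here because no keys are added to or removed from `instantDict` during the loop.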
Maybe .get() can't find the value because str(e) is not the key itself, so try using the key stored on the exception instead:
weeklyDict[e.args[0]] = instantDict.get(e.args[0])

KeyError: "Key 'fields' not found. If specifying a record_path, all elements of data should have the path."

I am getting very nested json for different items through an API and am then trying to convert some of the received information into a dataframe.
I have worked with this line to get the dataframe I want:
df = pd.json_normalize(result, record_path=['fields'],errors='ignore')
This works sometimes, but other times I either get a KeyError for the record-path:
KeyError: "Key 'fields' not found. If specifying a record_path, all elements of data should have the path."
I assume that this is because the json I receive is not always exactly the same but can vary according to the type of item that information about is requested.
My question now is, if there is a way to skip data which doesn't have any of these keys? Or if there are other options to ignore the data that doesn't have those keys in it?
Thanks for the well written question. To do this, you want to learn about "Exception Handling".
It's worth learning a bit more about it, but here is the tl;dr:
import json
import pandas as pd

try:
    df = pd.json_normalize(result, record_path=['fields'], errors='ignore')
except KeyError as e:
    # The record path is missing from this result; report it and carry on.
    print(f"Unable to normalize json: {json.dumps(result, indent=4)}")
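To skip only the records that lack the key, rather than giving up on the whole result, the data can be filtered before it is normalized. As far as I know, `errors='ignore'` only covers keys listed in `meta`, not the `record_path`, which would explain why it does not help here. The sketch below is plain Python with made-up data; `with_record_path` is a hypothetical helper name.

```python
def with_record_path(records, key="fields"):
    """Keep only the records that contain a list under key."""
    if isinstance(records, dict):
        records = [records]  # a single JSON object is treated as one record
    return [r for r in records
            if isinstance(r, dict) and isinstance(r.get(key), list)]


# Made-up API result: the second item has no 'fields' entry.
result = [
    {"id": 1, "fields": [{"name": "a"}, {"name": "b"}]},
    {"id": 2},
    {"id": 3, "fields": [{"name": "c"}]},
]

usable = with_record_path(result)
print([r["id"] for r in usable])

# With pandas installed, the filtered list could then be normalized:
# df = pd.json_normalize(usable, record_path=['fields'])
```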

json value printing by key within if in, but not without

Sorry if this is a silly question but I'm streaming data from a server and trying to pull specific values by keys, and they are only working if I first check if the key is present
JSON Example
{"time_exchange":"2018-04-04T14:29:53.0847306Z","time_coinapi":"2018-04-04T14:29:53.0847306Z","ask_price":117.1,"ask_size":158.30616728,"bid_size":102.60064,"bid_price":117.09,"symbol_id":"COINBASE_SPOT_LTC_USD","sequence":25388355,"type":"quote"}
It prints correctly if I do this:
data = json.loads(ws.recv())
if 'ask_size' in data:
    print data['ask_size']
But if I do just:
data = json.loads(ws.recv())
print data['ask_size']
I get a key error:
KeyError: 'ask_size'
First point: neither using an intermediate variable nor checking if the key is present will change the content of the dict. Period. The only effect of checking the key's presence in the dict is preventing the KeyError when it's missing.
Very obviously, what is happening here is that the key is sometimes missing and sometimes not. You can easily check this out with the correct test:
data = json.loads(ws.recv())
if 'ask_size' in data:
    print data['ask_size']
else:
    print "'ask_size' not found in %s" % data
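If messages without the key are expected and should simply be skipped, `dict.get` expresses the same check in a single lookup. This is a sketch in Python 3 syntax; `ask_size_or_none` is a hypothetical helper, and the second message is invented to stand for whatever the server sends that has no `ask_size`.

```python
import json


def ask_size_or_none(raw_message):
    """Parse one message and return its ask_size, or None if it has none."""
    data = json.loads(raw_message)
    return data.get("ask_size")


quote = '{"ask_size": 158.30616728, "type": "quote"}'
other = '{"type": "heartbeat"}'  # hypothetical message without ask_size

for raw in (quote, other):
    size = ask_size_or_none(raw)
    if size is None:
        print("'ask_size' not found in %s" % raw)
    else:
        print(size)
```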

Search in text file by date

I have code like this:
from datetime import datetime
from tabulate import tabulate
def search_projection_date():
    projections = open('projections.txt','r').readlines()
    date = input("Input projection date: ")
    try:
        date = date.strptime(date, "%d.%m.%Y.")
    except:
        print("Date input error!")
        #search_projection_date()
    for i in projections:
        projections = i.strip("\n").split("|")
        if date == projections[2]:
            table = [['Code:',projections[0]],['Hall:',projections[1]],['Date:',projections[2]],['Start time:',projections[3]],['End time:', projections[4]],['Day(s):', projections[5]], ['Movie:', projections[6]], ['Price:', projections[7]]]
            print (tabulate(table))
            #break
    else:
        print("No projection on that date")
And text file like this:
0001|AAA|17.12.2017.|20:30|21:00|sunday|For a few dolars more|150
0002|BBB|17.12.2017.|19:30|21:15|saturday|March on the Drina|300
0003|GGG|19.12.2017.|18:00|19:00|tuesday|A fistful of Dolars|500
0004|GGG|16.12.2017.|21:15|00:00|sunday|The Good, the Bad and the Ugly|350
I try to search movie projections by date...
If there is a projection on the entered date, it will find it and print the list, but before printing that list it will always print "Date input error", and after the list "No projection on that date". (If I put break in the if statement, it will print only the first projection found on the entered day, without the else statement, obviously.)
Questions: How do I print ONLY the list of projections, without "Date input error", if the date is entered correctly?
How do I print only "No projection on that date" if the date is correct but there is no projection, and how do I keep asking the user for input until they enter it correctly? With recursion, as it is now, it will always throw the exception and call search_projection_date() again.
There are a whole bunch of major problems with your code. As it happens, they showcase why some general advice we hear so often is actually good advice.
1. The line date = input("Input projection date: ") creates a string named date. input always returns a string. Strings in Python do not have a method called strptime. Which brings us to issue #2:
2. You should not catch generic exceptions. You were probably looking to trap a TypeError or ValueError in the except clause. However, you are getting an error that says AttributeError: 'str' object has no attribute 'strptime'. This is because you can't call methods that you want to exist but don't. Your except line should probably read something like except ValueError:.
3. Your except clause does nothing useful (beyond the problems listed above). If the string is not formatted correctly, you print a message but continue anyway. You probably want to use raise in the except clause to propagate the exception further. Luckily for you, you actually want the date to be a string, which brings us to issue #4:
4. Why are you attempting to convert the string to a date to begin with? You can not compare a date object and a string that you get from the file and ever expect them to be equal. You want to compare a string to a string. If you had some kind of validation in mind, that's fine, but use datetime.strptime and don't replace the original string; just raise an error if it doesn't convert properly.
5. The else clause in a for loop will execute whenever the loop terminates normally (i.e., without a break). Since you always iterate through all the lines, you will always trigger the else clause. You need to have another way to determine if you found matching items, like a boolean flag or a counter. I will show an example with a counter, since it is more general.
6. You never close your input file. Not a huge problem in this tiny example, but can cause major issues with bigger programs. Use a with block instead of raw open.
7. Your method of iterating through the file is not wrong per se, but is inefficient. You load the entire file into memory, and then iterate over the lines. In Python, text files are already iterable over the lines, which is much more efficient since it only loads one line at a time into memory, and also doesn't make you process the file twice.
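The for/else behaviour described above is easy to check in isolation. In this small sketch (the `scan` function and its arguments are made up for the demonstration), the else branch runs whenever the loop finishes without hitting break, regardless of whether anything matched along the way.

```python
def scan(items, target, stop_on_match):
    """Record what happens while looping, to show when the else clause runs."""
    log = []
    for item in items:
        if item == target:
            log.append("found")
            if stop_on_match:
                break
    else:
        # Reached only when the loop was not left through break.
        log.append("else ran")
    return log


print(scan(["a", "b", "a"], "a", stop_on_match=False))
print(scan(["a", "b", "a"], "a", stop_on_match=True))
print(scan(["b"], "a", stop_on_match=True))
```

The first call matches twice and still runs the else clause, which is the situation in the question's code with the break commented out.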
Combining all that, you can make your code look like this:
def search_projection_date():
    counter = 0
    with open('projections.txt','r') as projections:
        date = input("Input projection date: ")
        for line in projections:
            projection = line.strip("\n").split("|")
            if date == projection[2]:
                table = [['Code:',projection[0]],
                         ['Hall:',projection[1]],
                         ['Date:',projection[2]],
                         ['Start time:',projection[3]],
                         ['End time:', projection[4]],
                         ['Day(s):', projection[5]],
                         ['Movie:', projection[6]],
                         ['Price:', projection[7]]]
                print(tabulate(table))
                counter += 1
    if not counter:
        print("No projection on that date")
    else:
        print("Found {0} projections on {1}".format(counter, date))
I trusted your use of tabulate since I am not familiar with the module and have no intention of installing it. Keep in mind that the date verification is optional. If the user enters an invalid string, that's the end of it: you don't need to check for dates like 'aaaa' because they will just print No projection on that date. If you insist on keeping the verification, do it more like this:
from datetime import datetime
datetime.strptime(date, '%d.%m.%Y.')
That's it. It will raise a ValueError if the date does not match. You don't need to do anything with the result. If you want to change the error message, or return instead of raising an error, you can catch the exception:
try:
    datetime.strptime(date, '%d.%m.%Y.')
except ValueError:
    print('Bad date entered')
    return
Notice that I am catching a very specific type of exception here, not using a generic except clause.
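The question also asks how to keep prompting until the date is entered correctly. A loop is a simpler fit for that than recursion. The following is a sketch only: `ask_for_date`, `DATE_FORMAT`, and the `read` parameter are names introduced here, with `read` defaulting to the built-in `input` so that the function can be exercised without a keyboard.

```python
from datetime import datetime

DATE_FORMAT = "%d.%m.%Y."  # same format as in the question's text file


def ask_for_date(prompt="Input projection date: ", read=input):
    """Keep asking until the text parses as a date; return the original string."""
    while True:
        text = read(prompt).strip()
        try:
            datetime.strptime(text, DATE_FORMAT)  # validation only
        except ValueError:
            print("Date input error, expected something like 17.12.2017.")
            continue
        return text


# Inside search_projection_date() this could replace the bare input() call:
# date = ask_for_date()
```

The string is returned unchanged, so it can still be compared directly with the third field of each line in the file.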
