Sorry if this is a silly question, but I'm streaming data from a server and trying to pull specific values by key, and the lookups only work if I first check that the key is present.
JSON Example
{"time_exchange":"2018-04-04T14:29:53.0847306Z","time_coinapi":"2018-04-04T14:29:53.0847306Z","ask_price":117.1,"ask_size":158.30616728,"bid_size":102.60064,"bid_price":117.09,"symbol_id":"COINBASE_SPOT_LTC_USD","sequence":25388355,"type":"quote"}
It prints correctly if I do this:
data = json.loads(ws.recv())
if 'ask_size' in data:
    print data['ask_size']
But if I do just:
data = json.loads(ws.recv())
print data['ask_size']
I get a key error:
KeyError: 'ask_size'
First point: neither using an intermediate variable nor checking whether the key is present will change the content of the dict. Period. The only effect of checking for the key's presence in the dict is preventing the KeyError when it's missing.
Very obviously, what is happening here is that the key is sometimes missing and sometimes not. You can easily check this out with the correct test:
data = json.loads(ws.recv())
if 'ask_size' in data:
    print data['ask_size']
else:
    print "'ask_size' not found in %s" % data
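A feed like this can interleave messages of different shapes, so only some of them carry the quote fields. A minimal sketch in Python 3 syntax, assuming each message is a JSON object like the example above; the helper name `ask_size_or_none` and the "heartbeat" sample message are made up for illustration and are not taken from the server's API:

```python
import json


def ask_size_or_none(raw_message):
    """Return the 'ask_size' value of a JSON message, or None if it has none."""
    data = json.loads(raw_message)
    # dict.get() returns None instead of raising KeyError for a missing key.
    return data.get('ask_size')


# Hypothetical sample messages: a quote, and a message without 'ask_size'.
quote = '{"ask_size": 158.30616728, "type": "quote"}'
other = '{"type": "heartbeat"}'

print(ask_size_or_none(quote))
print(ask_size_or_none(other))
```

Printing the whole message when the key is missing, as in the test above, remains the quickest way to see what the server actually sends.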
Related
I am making a telegram chatbot and can't figure out how to take out the [{' from the output.
def tether(bot, update):
    tetherCall = "https://api.omniexplorer.info/v1/property/31"
    tetherCallJson = requests.get(tetherCall).json()
    tetherOut = tetherCallJson ['issuances'][:1]
    update.message.reply_text("Last printed tether: " + str (tetherOut)+" Please take TXID and past it in this block explorer to see more info: https://www.omniexplorer.info/search")
My user will see this as a response: [{'grant': '25000000.00000000', 'txid': 'f307bdf50d90c92278265cd92819c787070d6652ae3c8af46fa6a96278589b03'}]
This looks like a list with a single dict in it:
[{'grant': '25000000.00000000',
'txid': 'f307bdf50d90c92278265cd92819c787070d6652ae3c8af46fa6a96278589b03'}]
You should be able to access the dict by indexing the list with [0]…
tetherOut[0]
# {'grant': '25000000.00000000',
# 'txid': 'f307bdf50d90c92278265cd92819c787070d6652ae3c8af46fa6a96278589b03'}
…and if you want to get a particular value from the dict you can index by its name, e.g.
tetherOut[0]['txid']
# 'f307bdf50d90c92278265cd92819c787070d6652ae3c8af46fa6a96278589b03'
Be careful chaining these things, though. If tetherOut is an empty list, tetherOut[0] will generate an IndexError. You'll probably want to catch that (and the KeyError that an invalid dict key will generate).
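One way to catch both errors in a single place is a small helper. This is only a sketch: the name `first_txid` and the sample data are hypothetical, and returning None on failure is a design choice, not the only option.

```python
def first_txid(issuances):
    """Return the 'txid' of the first issuance, or None if it is unavailable."""
    try:
        return issuances[0]['txid']
    except (IndexError, KeyError):
        # IndexError: the list is empty; KeyError: the dict has no 'txid'.
        return None


# Made-up sample data shaped like the API response shown above.
print(first_txid([{'grant': '1.0', 'txid': 'abc123'}]))
print(first_txid([]))
```

The caller can then decide what message to send when the result is None.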
I am trying sentiment analysis where I have data like
source_text-> #LiesbethHBC I have a good feeling actually 🙈 its not that long, it's pretty soon!\nAw you deserve these tickets
then! 💖
result_value-> Sentiment(polarity=0.0, subjectivity=0.0)
I want to store this key value pair in a python dictionary.
I tried creating one as:
dict={}
dict[source_text].append(result_value)
but I get KeyError
Is there a way to store such text(just not characters) in a dictionary?
Your problem has nothing to do with "non-character text" (which doesn't mean anything, actually). The only requirement for an object to be usable as a dict key is that it's hashable, and there's absolutely no restriction on what you can use as a value.
Your problem quite simply comes from the fact that you're trying to get the value for a nonexistent key (that's what KeyError means: the key you asked for does not exist in the dict).
Here:
mydict = {}
at this point, mydict is empty, so any item access will raise a KeyError
then you're doing this:
mydict[source_text].append(result_value)
which is basically:
something = mydict[source_text] # get value for key `source_text`
something.append(result_value)
Since your dict is empty, the first line WILL obviously raise a KeyError.
If you want to store one unique result_value for each source_text value then the proper syntax is:
mydict[source_text] = result_value
If you want to store a list of result_value for each source_text value, then you have to either explicitly test whether the key is set and, if it isn't, set it to an empty list before appending to this list:
if source_text not in mydict:
    mydict[source_text] = []
mydict[source_text].append(result_value)
or just use a defaultdict instead:
from collections import defaultdict
mydict = defaultdict(list)
# defaultdict will automagically create the key with an empty list
# as value if the key is missing
mydict[source_text].append(result_value)
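A third option, not mentioned above, is `dict.setdefault()` on a plain dict, which folds the "create the list if missing" step into one call. A minimal sketch; the helper name `add_result` and the sample texts and values are made up for illustration:

```python
results = {}


def add_result(store, source_text, result_value):
    # setdefault() returns the existing list for the key, or inserts
    # a new empty list and returns it if the key is missing.
    store.setdefault(source_text, []).append(result_value)


add_result(results, 'some tweet', 0.0)
add_result(results, 'some tweet', 0.5)
add_result(results, 'another tweet', -0.2)
print(results)
```

Unlike the automatic approach, a plain dict still raises KeyError on an ordinary lookup of a missing key, which can be useful for catching typos.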
Now I strongly suggest that you invest some time in properly learning Python if you have to use it (hint: there's a quite decent tutorial in the official documentation). This will save everyone's time.
The problem is that when you tried to pull out the key #LiesbethHBC I have a good feeling actually 🙈 its not that long, it's pretty soon!\nAw you deserve these tickets then! 💖 in the dictionary which in this case is non-existent, Python gave you a KeyError meaning that the key didn't exist in the dictionary. A simple way to solve this is by initially checking whether you have that particular key in the dictionary, if yes, do whatever you wanna do with it, else create that key first.
By the way, avoid using dict (dictionary datatype) or any other datatypes as a variable name.
This is what you should actually do:
dictionary = {} # Since, 'dict' is the dictionary data-type in Python
if source_text in dictionary:
    # If the key exists, append to its list...
    dictionary[source_text].append(result_value)
else:
    # If the key does not exist, create it with a list holding the first value...
    dictionary[source_text] = [result_value]
This should help...
Have you tried using '.update' method?
dict = {}
dict.update({'First':'Test'})
dict.update({'Lets Get':'Real'})
print (dict)
Output:
{'First': 'Test', 'Lets Get': 'Real'}
EDIT:
Or even:
dict = {}
dict.update({'Polarity':0.91})
dict.update({'Subjectivity':0.73})
print (dict)
Output:
{'Polarity': 0.91, 'Subjectivity': 0.73}
I am still new to python, and brand new to json. I am trying to go through output that is in json. I am not yet sure which fields will need to be printed out, but I do know that two of them will be needed.
How could I change:
import json
from pprint import pprint
with open('out.json') as data_file:
    data = json.load(data_file)
pprint(data)
to print out say, field one, and field two?
I figure if I can print field one, and two, I can play around with it until I find the right fields. I imagine this is a derp level question, but being able to print specific fields is what I need to be able to do.
json.load returns a Python object (https://docs.python.org/3/library/json.html#json.load), so depending on the content of 'out.json' it can be a dict, a list, or one of a few other types.
In the case of a dictionary you can use data['key'], or if it's a list use data[index], where index is 0, 1, 2, ...
For looping, use for, e.g. for a list:
for elem in data:
    print(elem)
or for a dictionary:
for key, value in data.items():
    print(key, value)
You could have found it easily in Python's json documentation.
Here data is a dict type object. You can get any value by using the corresponding key like this:
print data['field']
But it will throw a KeyError if the field key is not present in the dict. For avoiding this issue you can use the get() method.
print data.get('field')
This will return None in case of missing key.
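`get()` also accepts a second argument that is returned instead of None when the key is missing. A short sketch in Python 3 syntax; the dict contents and field names here are invented for illustration:

```python
# Hypothetical data standing in for the parsed contents of out.json.
data = {'field_one': 'a', 'field_two': None}

print(data.get('field_one'))         # a
print(data.get('missing'))           # None
print(data.get('missing', 'n/a'))    # n/a
# The default is only used when the key is absent, not when its value is None.
print(data.get('field_two', 'n/a'))  # None
```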
Summary: a dictionary/json object indicates it does not have a given key (using either a hasattr call or a value in object.keys boolean test) even though that key shows up in an object.keys() call. So how can I access the value for that key?
Longer version: I am quite puzzled trying to parse some json coming back from an API. When I try to determine whether the json object, which is showing up as a dictionary, has a given key, the code returns false for the key even when it shows the key is there for the object.
Here is how I am retrieving the json:
r = requests.get(url, headers = {'User-Agent':UA})
try:
    print(r.json())
    jsonobject = r.json()
    print("class of jsonobject is %s"%jsonobject.__class__.__name__)
    print("here are dictionary keys %s"%jsonobject.keys())
    if hasattr(jsonobject, 'laps') and jsonobject['laps'] is not None:
        ...
    else:
        print("no laps object")
    if hasattr(jsonobject, 'points') and jsonobject['points'] is not None:
        ...
The reason I am doing this is that often I am getting encoding errors from the field nested within the 'laps' array or the 'points' array so that I cannot insert the json data into a MongoDB database. I would like to delete these fields from the json object since they don't contain useful information anyway.
The problem is that the json object is always returning false for hasattr(jsonobject, 'laps') and hasattr(jsonobject, 'points'). It returned false even in the case of a record where I then printed out the keys and they showed:
here are dictionary keys dict_keys(['is_peptalk_allowed', 'show_workout', 'hydration', 'records', 'include_in_stats', 'expand', 'pb_count', 'start_time', 'calories', 'altitude_max', 'hashtags', 'laps', 'pictures', 'duration', 'playlist'\
, 'sport', 'points', 'show_map', 'local_start_time', 'speed_avg', 'tagged_users', 'distance', 'altitude_min', 'is_live', 'author', 'feed_id', 'speed_max', 'id'])
So I thought perhaps the dict was behaving strangely with hasattr, and rewrote the code as:
if 'laps' in jsonobject.keys() and jsonobject['laps'] is not None:
but that also returns false even though it again prints the same array of keys that does include 'laps'.
hasattr() is entirely the wrong tool to use. It tests for attributes, but dictionary keys are not attributes.
To test for keys, use the in test directly against the dictionary:
if 'laps' in jsonobject:
Calling jsonobject.keys() is redundant and creates a new dictionary view object.
That test will be true for your dictionary, but it's not the only thing you are testing for. Your test is:
if 'laps' in jsonobject and jsonobject['laps'] is not None:
That'll fail if 'laps' is a key but the value in the dictionary is None.
The above test can be more simply and compactly stated as:
if jsonobject.get('laps') is not None:
If None is a valid value, don't test for it; stick to just 'laps' in jsonobject.
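The difference between the three tests can be seen on a small dict. This is an illustrative sketch with made-up values; only the key names 'laps' and 'points' come from the question:

```python
# Hypothetical stand-in for the parsed API response.
jsonobject = {'laps': [1, 2], 'points': None}

print(hasattr(jsonobject, 'laps'))             # False: 'laps' is a key, not an attribute
print(hasattr(jsonobject, 'keys'))             # True: keys() is a dict method
print('laps' in jsonobject)                    # True
print('points' in jsonobject)                  # True: the key exists...
print(jsonobject.get('points') is not None)    # False: ...but its value is None
print(jsonobject.get('missing') is not None)   # False: the key is absent
```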
I am trying to find out which portion of my code contains a KeyError in my events list. Events is a list that contains JSON elements. I want to put timestamp, event_sequence_number, and device_id in their respective variables. However each JSON object is different and some do not contain the timestamp, event_sequence_number, or device_id keys. How can I change my bit of code so that I am able to output which specific key(s) is missing?
ex:
When timestamp is missing
"timestamp key is missing"
when timestamp and device_id is missing
"timestamp key is missing"
"device_id key is missing"
etc
Code:
for event in events:
    try:
        timestamp = event["event"]["timestamp"]
        event_sequence_num = event["event"]["properties"]["event_sequence_number"]
        device_id = event["application"]["mobile"]["device_id"]
        event_identifier = str(device_id) + "_" + str(timestamp) + "_" + str(event_sequence_num)
        event_dict[event_identifier] = 1
    except KeyError:
        print "JSON Key does not exist"
You can print the exception as that will include the key for which the KeyError was raised:
except KeyError as exc:
    print "JSON Key does not exist: " + str(exc)
You can also access the key by looking at exc.args[0]:
except KeyError as exc:
    print "JSON Key does not exist: " + str(exc.args[0])
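The two forms differ slightly in what they print: converting the exception to a string gives the repr of the key, quotes included, while exc.args[0] is the key object itself. A small sketch in Python 3 syntax, using a made-up event that lacks the timestamp key:

```python
# Hypothetical event with no 'timestamp' entry.
event = {'event': {}}

try:
    event['event']['timestamp']
except KeyError as exc:
    shown = str(exc)   # the repr of the key, so it carries quotes
    key = exc.args[0]  # the key object itself
    print("str(exc):    " + shown)
    print("exc.args[0]: " + str(key))
```

Note that the try block stops at the first failing lookup, so only one missing key is reported per event.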
Simeon Visser's answer is spot-on. Reporting the key causing the KeyError is probably the best that can be done in bare, straightforward Python. If you're only accessing the JSON structure once, that's the way to go.
I offer a longer alternative, however, for situations where you need to access the multi-level event data repeatedly. If you're accessing it often, your program can afford a few more lines of setup and infrastructure. Consider:
def getpath(obj, path, post=str):
    """
    Use path as sequence of keys/indices into obj. Return the value
    there, filtered through the post (postprocessing function).
    If there is no such value, raise KeyError displaying the
    partial path to the point where there is no index/key.
    """
    c = obj
    try:
        for i, p in enumerate(path):
            c = c[p]
        return post(c) if post else c
    except (KeyError, IndexError) as e:
        msg = "JSON keys {0!r} don't exist".format(path[:i+1])
        raise KeyError(msg)
        # raise type(e)(msg) # Alternative if you want more exception variety

EID_COMPONENTS = [('application', 'mobile', 'device_id'),
                  ('event', 'timestamp'),
                  ('event', 'properties', 'event_sequence_number')]

for event in events:
    event_identifier = '_'.join(getpath(event, p) for p in EID_COMPONENTS)
    event_dict[event_identifier] = 1
There is more preparation here, with a separate getpath function and globally defined specification of what paths into the JSON data to get. On the plus side, the assembly of event_identifier is much shorter (if it were wrapped in a function, it'd be about 1/3 the size in either source lines or bytecodes).
If an attempted access fails, it returns a more complete error message, giving the path into the structure up to that point, not just the final key that was missing. In complex JSON with duplicated keys in different sub-structures (multiple timestamps, e.g.), knowing which attempted access failed can save you much debugging effort. You may also notice that the code is prepared to use integer indices and gracefully handle IndexError; in JSON, array values are common.
This is abstraction in action: more framework and more setup, but if you need to do a lot of deep structure accesses, the code size savings and better error reporting would benefit multiple parts of your program, making it potentially a good investment.
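The original question asked for every missing key to be reported, not just the first one. The same path idea can be adapted for that by checking each path independently. This is a sketch in Python 3 syntax, not part of the answer above: the helper name `find_missing`, the `PATHS` list, and the sample event are all hypothetical.

```python
def find_missing(obj, paths):
    """Return the paths (tuples of keys/indices) that cannot be followed in obj."""
    missing = []
    for path in paths:
        current = obj
        try:
            for step in path:
                current = current[step]
        except (KeyError, IndexError, TypeError):
            # TypeError covers indexing into a non-container such as None or an int.
            missing.append(path)
    return missing


PATHS = [('application', 'mobile', 'device_id'),
         ('event', 'timestamp'),
         ('event', 'properties', 'event_sequence_number')]

# Made-up event that lacks both device_id and timestamp.
event = {'event': {'properties': {'event_sequence_number': 7}},
         'application': {'mobile': {}}}

for path in find_missing(event, PATHS):
    print("%s key is missing" % path[-1])
```

Because each path is tried separately, one missing key does not hide the others, at the cost of walking the structure once per path.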