KeyError exception in List Comprehension - python

I have the following function:
def calculate(blob, count_per_data):
    return geometric_mean([score_per_count[count_per_data[data]] for data in combinations(blob)])
The problem with my code is that if data is not found in count_per_data I get an exception. Instead, I wish count_per_data["unknown"] to evaluate to 0, i.e. "unknown"'s count is 0.
In turn, the value 0 exists in score_per_count and is not equal to 0. In other words, the score associated with a count of 0 is not itself 0.
How would you recommend I fix the above code to achieve my goal?

If you want to make sure that data exists in count_per_data and that count_per_data[data] exists in score_per_count, you can use the condition of the list comprehension as a filter, like this:
return geometric_mean([score_per_count[count_per_data[data]] for data in combinations(blob) if data in count_per_data and count_per_data[data] in score_per_count])
A more readable version:
return geometric_mean([score_per_count[count_per_data[data]]
                       for data in combinations(blob)
                       if data in count_per_data and count_per_data[data] in score_per_count])
But if you want to use a default value when a key is not found in a dictionary, you can use dict.get. Quoting from the dict.get docs:
get(key[, default])
Return the value for key if key is in the dictionary, else default. If
default is not given, it defaults to None, so that this method never
raises a KeyError.
You can use it, like this
count_per_data.get(data, 0)
If data is not found in count_per_data, 0 will be used instead.
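Putting this into the comprehension from the question might look like the sketch below. The sample dictionaries, the plain list standing in for combinations(blob), and the use of statistics.geometric_mean (Python 3.8+) are assumptions made for illustration, since the question does not show how those names are defined.

```python
from statistics import geometric_mean  # assumed stand-in for the question's geometric_mean

# Hypothetical sample data; the real dictionaries are not shown in the question.
count_per_data = {"a": 2, "b": 1}
score_per_count = {0: 0.5, 1: 2.0, 2: 8.0}  # the score for a count of 0 is not 0


def calculate(items, count_per_data):
    # .get(data, 0) treats any unknown item as having a count of 0
    return geometric_mean(
        [score_per_count[count_per_data.get(data, 0)] for data in items]
    )


items = ["a", "b", "unknown"]  # stands in for combinations(blob)
scores = [score_per_count[count_per_data.get(data, 0)] for data in items]
print(scores)  # [8.0, 2.0, 0.5]
print(calculate(items, count_per_data))  # approximately 2.0
```

Unlike the filtering version, nothing is dropped here: an unknown item still contributes the score associated with a count of 0.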

Add a condition to the list comprehension, falling back to 0 when a key is missing:
return geometric_mean([score_per_count[count_per_data[data]] if data in count_per_data and count_per_data[data] in score_per_count else 0 for data in combinations(blob)])

Related

Read json key value as insensitive key

I need to be able to pull the value of the key 'irr' from this json address in python:
IRR = conparameters['components'][i]['envelope'][j]['irr']
Even if 'irr' is in any other case, like IRR, Irr, etc.
Is that easy?
There's nothing built-in that does it, you have to search for a matching key.
See Get the first item from an iterable that matches a condition for how to write a first() function that finds the first element of an iterable that matches a condition. I'll use that in the solution below.
cur = conparameters['components'][i]['envelope'][j]
key = first(cur.keys(), lambda k: k.lower() == 'irr')
IRR = cur[key]
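For completeness, here is a self-contained sketch of that approach. The first() helper below is one possible way to write it (built on next()), and the sample dictionary is made up; neither comes from the question.

```python
def first(iterable, condition):
    """Return the first item for which condition(item) is true.

    Raises StopIteration if nothing matches.
    """
    return next(item for item in iterable if condition(item))


# Hypothetical innermost dictionary with an oddly cased key
cur = {"name": "wall", "Irr": 0.12}

key = first(cur.keys(), lambda k: k.lower() == "irr")
IRR = cur[key]
print(key, IRR)  # Irr 0.12
```

If several keys differ only by case, this returns whichever one the dictionary yields first.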

if-else condition working with JSON format in Python to determine which rows to append to a list

I am producing a dataset from a series of JSON files, each associated with a certain ID (authors_df contains a bunch of IDs), and I am using a for loop to do this.
I tried it with a subset of authors and it works fine.
The problem is that some of the IDs have an incomplete JSON file. So I tried to include an else condition to make the code also work with incomplete data (JSON files of length 0).
The problem is that I don't know how to do it.
I tried if len(json_value['resonanceCategorizations']['1']['fullData']) > 0 else null
but it does not work (KeyError: '1'). I guess I have to write a different condition that takes into account the JSON structure of the complete files, rather than using null.
Here is my code. It all works, except for the line with else null.
json_values_per_author = {}
datalist = []
datadict = {}
for index, row in authors_df.iterrows():
    #get the author
    author = row['author']
    print(author)
    #build the url
    url = f'http://keystone-db.default.svc.cluster.local:5000/keystonedb/profiles/resonance/categorization?profileId={author}&regionId=1'
    #get the json value
    json_value = requests.get(url).json()
    full_data = json_value['resonanceCategorizations']['1']['fullData'] if len(json_value['resonanceCategorizations']['1']['fullData']) > 0 else null
    datalist.append({
        "author": author,
        "seed1": full_data[0]['seed'],
        "seed2": full_data[1]['seed'] if len(full_data) > 2 else 'NA',
        "seed3": full_data[2]['seed'] if len(full_data) > 3 else 'NA'
    })
Another thing I tried was
z = {"000": [{"seed": 0, "globalSegmentId": 0, "globalSegmentName": "Nope", "regionId": 0, "resonance": 0, "isGlobal": true, "globalRegion": 1}]}
full_data = json_value['resonanceCategorizations']['1']['fullData'] if len(json_value['resonanceCategorizations']['1']['fullData']) > 0 else z
basically creating a "null" JSON value to use as a default if there is no data.
Alternatively, it would be fine if I could just avoid appending the authors with no data.
If you are having problems with missing keys in a dictionary, have a look at returning a default value from the dictionary with dict.get:
get(key[, default])
Return the value for key if key is in the dictionary, else default. If default is not given, it defaults to None, so that this method never raises a KeyError.
So in your case it might look like
full_data = json_value.get('resonanceCategorizations', {}).get('1', {}).get('fullData')
The problem is that it is unclear which key was not found. If 'resonanceCategorizations', '1' or 'fullData' is missing, full_data is None and you cannot apply len to it.
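One way to handle that, and to skip authors with no data as the question suggests, is sketched below. The build_row helper and the sample responses are hypothetical, and the sketch assumes every level that is present is a dictionary; the real service may return something different.

```python
def build_row(author, json_value):
    """Return a row for datalist, or None when the author has no data."""
    full_data = (
        json_value.get("resonanceCategorizations", {})
        .get("1", {})
        .get("fullData")
    ) or []  # a missing key or an empty list both end up as []
    if not full_data:
        return None
    # Take up to three seeds and pad with 'NA' when there are fewer
    seeds = [entry.get("seed", "NA") for entry in full_data[:3]]
    seeds += ["NA"] * (3 - len(seeds))
    return {"author": author, "seed1": seeds[0], "seed2": seeds[1], "seed3": seeds[2]}


# Hypothetical responses keyed by author id
responses = {
    "a1": {"resonanceCategorizations": {"1": {"fullData": [{"seed": 11}, {"seed": 22}]}}},
    "a2": {"resonanceCategorizations": {}},  # incomplete file
}

rows = (build_row(author, value) for author, value in responses.items())
datalist = [row for row in rows if row is not None]
print(datalist)  # [{'author': 'a1', 'seed1': 11, 'seed2': 22, 'seed3': 'NA'}]
```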
There are two approaches you can take. The first one is to use the dict.get method. Consider the following example:
my_dict = {"a": 1, "b":2}
print(my_dict["a"]) # prints 1
print(my_dict.get("a")) # prints 1
print(my_dict.get("a", None)) # prints 1
print(my_dict["c"]) # raises KeyError
print(my_dict.get("c")) # prints None
print(my_dict.get("c", None)) # prints None
This way, you can check whether a given field exists in a dictionary. Of course, you need to do this every time you access a field, and handle the case where the output is None.
Another approach is to use a try/except block.
try:
    value = some_dictionary["a"]["b"]["c"]
except KeyError:
    value = None
The disadvantage of this method is that you do not know whether a, a.b or a.b.c was missing.
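If knowing which level was missing matters, one option is to walk the keys one at a time. The dig helper below is a hypothetical name used only for this sketch; it returns both the value and the dotted path of the first key that could not be resolved.

```python
def dig(mapping, *keys):
    """Follow keys through nested containers.

    Returns (value, None) on success, or (None, path) where path is the
    dotted path up to and including the first key that failed.
    """
    current = mapping
    for depth, key in enumerate(keys):
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            # TypeError covers subscripting None or another non-container
            return None, ".".join(str(k) for k in keys[: depth + 1])
    return current, None


print(dig({"a": {"b": {"c": 5}}}, "a", "b", "c"))  # (5, None)
print(dig({"a": {"b": {}}}, "a", "b", "c"))        # (None, 'a.b.c')
print(dig({"a": None}, "a", "b", "c"))             # (None, 'a.b')
```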

How to get all the iterations in a list with none values included from a tweet?

I have a set of tweets with 10 dictionaries in the list "tweets". Each dictionary has several tweets. The first has 100 and the remaining 9 have 15 each.
I need the location of each tweet in all the dictionaries.
When I try to iterate the values from a list it shows this error.
if (type(tweets[j]['statuses'][k]['place']['name'])) != None:
TypeError: 'NoneType' object is not subscriptable
The code I have used for the iteration is
for j in range (0,10):
    while j == 0:
        for k in range(0,100):
            st1 = tweets[j]['statuses'][k]['place']['name']
            print(st1)
I tried using "filter" to take out the None values, but even that is not working.
Not every tweet has a location tagged to it, so some have None values. I need to print the locations of the tweets that are tagged.
Have you tried checking whether the 'place' key is available first? I can see from your code that you are accessing ['place']['name'] directly.
Can you test your logic with the following filter, without ['name']:
...
if isinstance(tweets[j].get('statuses', [])[k].get('place'), dict):
...
The Twitter API returns JSON, which is parsed into a dictionary in Python. Accessing keys with the dict[key] syntax is called subscripting. Nested accesses on a dict object depend on each intermediate object being a dictionary:
dict[a][b] relies on dict[a] being a dictionary with key b available. If dict[a] is a different type, say None or int, it is not subscriptable. This also means that there is not necessarily a get attribute for that type. A simple way to fix this would be the following:
check = tweets[j]['statuses'][k]['place']
if isinstance(check, dict):
    # do things
This makes sure that check is of type dict and therefore can be subscripted with a key.
EDIT: Note that using dict[key] syntax is not safe against KeyErrors. If you want to avoid those, use get:
my_dictionary = {'a': 1, 'b': 2}
my_dictionary['c'] # Raises KeyError: 'c' not in dictionary
my_dictionary.get('c') # returns None, no KeyError
It takes the form dict.get(key, <return_value>), where return_value defaults to None
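One caveat with the default argument: it is only used when the key is absent. If the key is present and its value is None, which is how an untagged tweet's 'place' appears in the question, get returns that None. The sketch below uses made-up status dictionaries and a hypothetical place_name helper to show one way around this.

```python
# Hypothetical statuses: no 'place' key, 'place' set to None, and a tagged tweet
status_missing = {"text": "hi"}
status_null = {"text": "hi", "place": None}
status_tagged = {"text": "hi", "place": {"name": "Berlin"}}

# The default is ignored here because the key exists (its value is None)
print(status_null.get("place", {}))  # None


def place_name(status):
    # 'or {}' replaces both a missing key and an explicit None with an empty dict
    place = status.get("place") or {}
    return place.get("name")


print(place_name(status_missing))  # None
print(place_name(status_null))     # None
print(place_name(status_tagged))   # Berlin
```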
To make your program a bit more readable and avoid the inevitable infinite loop, ditch the while loop:
# This will get individual tweets
for tweet in tweets:
    # Returns all statuses or empty list
    statuses = tweet.get('statuses', [])
    for status in statuses:
        if not isinstance(status, dict):
            continue # will jump to next iteration of inner for loop
        # 'place' may be present but None, so fall back to an empty dict
        place = status.get('place') or {}
        # Will be a name or None
        name = place.get('name')
        if name:
            print(name)
for element in tweets:
    for s in element.get('statuses', []):
        place = s.get('place')
        if place:
            print(place['name'])
This fixed it.

Getting a strange result when comparing 2 dictionaries in python

So I have a pair of dictionaries in python: (both have exactly the same keys)
defaults = {'ToAlpha': 4, 'ToRed': 4, 'ToGreen': 4, 'ToBlue': 4,}
bridged = {'ToAlpha': 3, 'ToRed': 0, 'ToGreen': 1, 'ToBlue': 2,}
When I iterate through one of the dictionaries I do a quick check to see if the other dict has the same key, if it does then print it.
for key, value in defaults.iteritems():
    if bridged.get(key):
        print key
What I would expect to see is:
ToAlpha
ToRed
ToGreen
ToBlue
But for some reason, 'ToRed' is not printed. I must be missing something really simple here, but have no idea what might be causing this.
bridged.get('ToRed')
and
defaults.get('ToRed')
both work independently, but when iterated through the loop... Nothing!
Any ideas?
0 is false. Use in to check for containment.
if key in bridged:
The problem is in the if statement when 'ToRed' gets passed.
if 0
evaluates to false, so the key is not printed. Use
if key in bridged
The problem is when key is ToRed, then bridged.get('ToRed') will be 0.
So following will evaluate to False:
if bridged.get(key):
thereby not printing 'ToRed'.
Instead of this, use the in operator.
Using in is the most Pythonic way to check if a key is in a dictionary.
So check using this:
if key in bridged:
Final code now becomes:
>>> for key, value in defaults.iteritems():
...     if key in bridged:
...         print key
...
ToAlpha
ToRed
ToBlue
ToGreen
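The answers above use Python 2 syntax. As a rough Python 3 sketch of the same point, using the two dictionaries from the question, the snippet below contrasts testing the value returned by get with testing membership; the variable names by_truthiness and by_membership are made up for the illustration.

```python
defaults = {'ToAlpha': 4, 'ToRed': 4, 'ToGreen': 4, 'ToBlue': 4}
bridged = {'ToAlpha': 3, 'ToRed': 0, 'ToGreen': 1, 'ToBlue': 2}

# bridged.get('ToRed') is 0, which is falsy, so 'ToRed' is dropped here
by_truthiness = [key for key in defaults if bridged.get(key)]

# 'in' only asks whether the key exists, whatever its value
by_membership = [key for key in defaults if key in bridged]

print(by_truthiness)  # ['ToAlpha', 'ToGreen', 'ToBlue']
print(by_membership)  # ['ToAlpha', 'ToRed', 'ToGreen', 'ToBlue']
```

In Python 3.7 and later, dictionaries keep insertion order, so the keys come out in the order they were written.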

Check if Dictionary Values exist in a another Dictionary in Python

I am trying to compare values from 2 dictionaries in Python. I want to know if a value from one dictionary exists anywhere in another dictionary. If it exists I want to return True, else False. Here is what I have so far.
The code I have is close, but not working right.
I'm using VS2012 with Python Plugin
I'm passing both Dictionary items into the functions.
def NameExists(best_guess, line):
    return all (line in best_guess.values() #Getting Generator Exit Error here on values
                for value in line['full name'])
Also, I want to see if there are duplicates within best_guess itself.
def CheckDuplicates(best_guess, line):
    if len(set(best_guess.values())) != len(best_guess):
        return True
    else:
        return False
As the error is about a generator exiting, I guess you use Python 3.x. There, best_guess.values() is a view rather than a generator, so searching it does not exhaust it; the generator that exits is the expression passed to all, which stops at the first value in line['full name'] for which a match is not found.
Also, I guess the use of all is incorrect if you are looking for any value to exist (I am not sure from which dictionary, though).
You can use something like the following, provided line is the second dictionary:
def NameExists(best_guess, line):
    vals = set(best_guess.values())
    return bool(set(line.values()).intersection(vals))
The syntax in NameExists seems wrong: you aren't using value, and the test checks line itself instead. In Python 3.x (you are using it, aren't you?), best_guess.values() returns a view that in searches linearly each time, so it is worth converting it to a set once. I believe this is what you meant:
def NameExists(best_guess, line):
    vals = set(best_guess.values())
    return all(value in vals for value in line['full name'])
And the CheckDuplicates function can be written in a shorter way like this:
def CheckDuplicates(best_guess, line):
    return len(set(best_guess.values())) != len(best_guess)
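To see both checks in action, here is a small sketch with made-up dictionaries. The function names any_value_shared and has_duplicate_values are hypothetical, and the sketch assumes all values are hashable (strings here), since it puts them in a set.

```python
def any_value_shared(best_guess, line):
    """True if at least one value of line also appears as a value of best_guess."""
    vals = set(best_guess.values())
    return any(value in vals for value in line.values())


def has_duplicate_values(best_guess):
    """True if two or more keys of best_guess share the same value."""
    return len(set(best_guess.values())) != len(best_guess)


# Hypothetical sample data
best_guess = {"full name": "Ada Lovelace", "alias": "Ada Lovelace", "city": "London"}
line = {"full name": "Ada Lovelace", "city": "Paris"}

print(any_value_shared(best_guess, line))                           # True
print(any_value_shared(best_guess, {"full name": "Grace Hopper"}))  # False
print(has_duplicate_values(best_guess))                             # True
```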
