Let's say we have a list of dictionaries, numbering in the thousands. Every dictionary has exactly the same keys, but different values.
Is there any way to look up which dictionary a given value comes from, and then access other values from that same dictionary?
For example, say you have a list containing every person in a large city, each stored as a dictionary, like so:
people_in_city = [
    {'name': 'Bob Jang',
     'age': 45,
     'sex': 'male'},
    {'name': 'Barb Esau',
     'age': 56,
     'sex': 'female'},
    # etc.
]
Now, you want to look up the age of Bob Jang, but you only have his name. Is there any way to get his corresponding age using this format?
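There's no built-in fast way to do this with a plain list in Python; you have to scan the list, which takes time proportional to its length. I'd suggest something like:
def get_dict_from_key(list_of_dicts, key, value):
    # Return the first dict whose `key` equals `value`, or None if there is none.
    return next((d for d in list_of_dicts if d[key] == value), None)

result_d = get_dict_from_key(people_in_city, 'name', 'Bob Jang')
result_d['age']  # 45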
That said, this is the kind of thing that relational databases are made for! Learn SQL and use it :)
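To make the database suggestion concrete, here is a minimal sketch using the sqlite3 module from the standard library. The table name people, the index name idx_people_name and the in-memory database are assumptions chosen for illustration, not something from the question.

```python
import sqlite3

people_in_city = [
    {'name': 'Bob Jang', 'age': 45, 'sex': 'male'},
    {'name': 'Barb Esau', 'age': 56, 'sex': 'female'},
]

# ':memory:' keeps the example self-contained; a real application would use a file.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE people (name TEXT, age INTEGER, sex TEXT)')
# An index on name lets SQLite find matching rows without scanning the whole table.
conn.execute('CREATE INDEX idx_people_name ON people (name)')
conn.executemany(
    'INSERT INTO people (name, age, sex) VALUES (:name, :age, :sex)',
    people_in_city,
)

row = conn.execute(
    'SELECT age FROM people WHERE name = ?', ('Bob Jang',)
).fetchone()
bob_age = row[0] if row is not None else None
print(bob_age)  # 45
```

The same SELECT can return any other column of the matching row, which is the "access other values from the same dictionary" part of the question.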
I would suggest looking at a database as Adam Smith suggested. Though I would also like to suggest looking at a class. That would go something like:
class Person:
    def __init__(self, name, age, gender):
        self.name = name
        self.age = age
        self.gender = gender

people_in_city = []

def add_person(name, age, gender, people_in_city=people_in_city):
    people_in_city.append(Person(name, age, gender))

def find_by_name(name):
    # Returns the first person with this name, or None if nobody matches.
    for person in people_in_city:
        if person.name == name:
            return person
Not the most elegant way, but it gets the job done, plus you can add more information without having to change your search function. So let's say you find dear 'Bob Jang', and you realize that you want to know what job he is doing (assuming you coded that into the class). You just do:
person_of_interest = find_by_name('Bob Jang')
person_of_interest.job
person_of_interest.age
Note that this only gives the first person found with that name, not everyone with that name. Other methods will have to be employed for that. This method also means that you are holding all the information in a list and scanning it on every search, which gets slow as the list grows. That is why a database would work better as your data grows.
And as a bonus, it is possible to create each person in parallel.
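One possible way to get everyone with a given name, and to avoid rescanning the list on each search, is to build an index from name to a list of Person objects. This is only a sketch: people_by_name and find_all_by_name are hypothetical names, and the second 'Bob Jang' is made-up sample data.

```python
from collections import defaultdict

class Person:
    def __init__(self, name, age, gender):
        self.name = name
        self.age = age
        self.gender = gender

people_in_city = [
    Person('Bob Jang', 45, 'male'),
    Person('Barb Esau', 56, 'female'),
    Person('Bob Jang', 23, 'male'),  # a second person with the same name
]

# Build the index once: name -> list of every Person with that name.
people_by_name = defaultdict(list)
for person in people_in_city:
    people_by_name[person.name].append(person)

def find_all_by_name(name):
    # .get avoids creating an empty entry for names that are not present.
    return people_by_name.get(name, [])

ages = [p.age for p in find_all_by_name('Bob Jang')]
print(ages)  # [45, 23]
```

The index has to be kept in sync whenever a person is added or renamed, which is the price for the faster lookup.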
Try this:
provided_name = 'some name'
persons_list_with_name = [person_info for person_info in people_in_city if person_info['name'] == provided_name]
for person_info in persons_list_with_name:
    print(person_info['name'], person_info['age'])
(age,) = [d['age'] for d in people_in_city if d['name'] == "Bob Jang"]  # fails unless exactly one person matches
people_in_city = [
{'name': 'Bob Jang',
'age': 45,
'sex': 'male'},
{'name': 'Barb Esau',
'age': 56,
'sex': 'female'}]
v = 45  # the value to search for: an age, a name, a sex, etc.
for i in people_in_city:
    if v in i.values():
        print(i)
There is a Python package called bidict that can do this. It provides a two-way dict, which allows you to get the key from the value or the value from the key. An example adapted from the documentation (older releases used a slice syntax, element_by_symbol[:'hydrogen'], for the inverse lookup; current releases use the inverse attribute):
>>> from bidict import bidict
>>> element_by_symbol = bidict(H='hydrogen')
>>> element_by_symbol['H']  # forward mapping works just like with dict
'hydrogen'
>>> element_by_symbol.inverse['hydrogen']  # inverse mapping
'H'
I am new to Python, and I want to know if there is a way to calculate the age value from the year of birth, which is an item in the same dictionary as the age.
This is what came to mind; I think there should be a simple way like this, without using additional variables or functions.
person = {
    'name': 'Jane',
    'yearofbirth': 1995,
    'yearnow': 2019,
    'age': person['yearnow'] - person['yearofbirth']  # fails: person is not defined yet
}
Any help would be appreciated. Thank you!
Yes, you can, just not by declaring the whole dict in one statement:
person = {
    'name': 'Jane',
    'yearofbirth': 1995,
    'yearnow': 2019
}
# **_ swallows the keys the lambda does not use (here 'name')
person["age"] = (lambda yearnow, yearofbirth, **_: yearnow - yearofbirth)(**person)
But in your example you shouldn't change anything, because there is no easy way to simplify it. My solution should be used only for complicated tasks; I just showed you a way to simplify things when the dict holds a large number of values.
Instead of hardcoding the current year, you could get Python to give it to you:
from datetime import datetime

currentYear = datetime.now().year
person = {
    'name': 'Jane',
    'yearofbirth': 1995
}
# index the key directly: a default of "" would only fail later in the subtraction
age = currentYear - person['yearofbirth']
person.update({'age': age})
print(person)
You can't set the age inside the dict literal, because the dict has not been defined yet at that point.
With the code above, we compute the age outside the dict, then update the dict with the value we just calculated from currentYear and the year of birth.
The output (when run in 2019) is:
{'name': 'Jane', 'yearofbirth': 1995, 'age': 24}
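If this needs to be done for many people, the same idea can be wrapped in a small helper. This is a sketch only; with_age and its optional year parameter are hypothetical names introduced here, and passing the year explicitly just makes the result reproducible.

```python
from datetime import date

def with_age(person, year=None):
    """Return a copy of person with 'age' derived from 'yearofbirth'."""
    if year is None:
        year = date.today().year
    # {**person, ...} builds a new dict, so the original is left untouched.
    return {**person, 'age': year - person['yearofbirth']}

jane = with_age({'name': 'Jane', 'yearofbirth': 1995}, year=2019)
print(jane)  # {'name': 'Jane', 'yearofbirth': 1995, 'age': 24}
```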
You can get a self-referential dictionary by using a class derived from dict, but dict itself doesn't have this capability. The following code is taken from this answer.
class MyDict(dict):
    def __getitem__(self, item):
        # Interpolate %(key)s placeholders using the dict itself as the mapping.
        return dict.__getitem__(self, item) % self

dictionary = MyDict({
    'user': 'gnucom',
    'home': '/home/%(user)s',
    'bin': '%(home)s/bin'
})
print(dictionary["home"])  # /home/gnucom
print(dictionary["bin"])   # /home/gnucom/bin
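The string-formatting trick above only works for string values. For a numeric case such as the age, one possible variation is to store a callable and evaluate it on access. This is a sketch under that assumption; ComputedDict is a hypothetical name, not a standard class.

```python
class ComputedDict(dict):
    def __getitem__(self, key):
        value = dict.__getitem__(self, key)
        # Callables are treated as "computed" entries and receive the dict itself.
        return value(self) if callable(value) else value

person = ComputedDict({
    'name': 'Jane',
    'yearofbirth': 1995,
    'yearnow': 2019,
    'age': lambda d: d['yearnow'] - d['yearofbirth'],
})

print(person['age'])  # 24
person['yearnow'] = 2020
print(person['age'])  # 25
```

Keep in mind that only square-bracket access goes through __getitem__; methods such as get() and values() still return the stored callable.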
Let's say I have a city (value) and people (key).
One city can have many people.
For example:
cities = {'Berlin':{'Dan', 'john'},'Tokyo':{'John'}}
city_dict = {}
people = {}
for city in cities:
?
I want to construct a dictionary in Python that inserts an entry only when there is a match between keys.
For example, the desired result:
{'dan' : {'dan':'berlin','dan':'colorado'},'john' : {'john':'berlin','john':'Tokyo'}}
Thanks.
The desired result can't be achieved because dictionaries, by definition, can't contain duplicate keys.
You can, however, do the following (which is somewhat close to the output you wanted):
from collections import defaultdict
cities = {'Berlin': {'Dan', 'John'}, 'Tokyo': {'John'}}
output = defaultdict(set)
for city, names in cities.items():
    for name in names:
        output[name].add(city)
print(output)
# defaultdict(<class 'set'>, {'Dan': {'Berlin'}, 'John': {'Berlin', 'Tokyo'}})
Another option, without imports, that returns a list of cities for each person:
cities = {'Berlin':{'Dan', 'John'},'Tokyo':{'John', 'Paul'}, 'Liverpool':{'John', 'Paul', 'George', 'Ringo'}, 'Colorado':{'Ringo'} }
res = {}
for k, v in cities.items():
    for e in v:
        res.setdefault(e, []).append(k)
print(res)
#=> {'Dan': ['Berlin'], 'John': ['Berlin', 'Tokyo', 'Liverpool'], 'Paul': ['Tokyo', 'Liverpool'], 'Ringo': ['Liverpool', 'Colorado'], 'George': ['Liverpool']}
You can't have a dictionary with duplicate keys, as @DeepSpace indicated, so for your problem I can suggest the following alternative.
Use a dictionary with people's names as keys and lists of cities as values. Then, whenever you need to, combine the two to create a list of tuples, and so on.
people = {"Dan": ["Berlin", "San Francisco"], "Mario": ["Rome"]}
for name, locations in people.items():
    # combine name with a single city if needed
    for city in locations:
        tuple_tmp = (name, city)
        # next store it, print it, ...
The cons of this approach are:
You need to process the values.
If you have a city and want to retrieve all the names in it, that is a very slow operation, because every person has to be checked.
You can maintain another structure with the inverted relation, but it consumes memory.
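Maintaining the inverted relation next to the main one could look roughly like this. It is a sketch: cities_by_person, people_by_city and add_residence are hypothetical names, and the sample pairs are made up.

```python
from collections import defaultdict

cities_by_person = defaultdict(set)  # name -> cities
people_by_city = defaultdict(set)    # city -> names (the inverted relation)

def add_residence(name, city):
    # Update both mappings together so they never drift apart.
    cities_by_person[name].add(city)
    people_by_city[city].add(name)

pairs = [('Dan', 'Berlin'), ('Dan', 'San Francisco'),
         ('Mario', 'Rome'), ('John', 'Berlin')]
for name, city in pairs:
    add_residence(name, city)

print(sorted(people_by_city['Berlin']))  # ['Dan', 'John']
print(sorted(cities_by_person['Dan']))   # ['Berlin', 'San Francisco']
```

Both lookups are then direct dictionary accesses, at the cost of storing every pair twice.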
I have a dictionary of dictionaries called data_dict. This is how it looks:
{'UMANOFF ADAM S': {'total_stock_value': 'NaN', 'loans': 'NaN', 'salary': 288589},
'YEAP SOON': {'total_stock_value': 192758, 'loans': 'NaN', 'salary': 'NaN'},
'PIPER GREGORY F': {'total_stock_value': 880290, 'loans': 1452356, 'salary': 19791},
'Jack S': {'total_stock_value': 88000, 'loans': 'NaN', 'salary': 288589}
}
Basically it is of the format
{Person Name : Dictionary of that person's attributes}
I am trying to find the name of a person whose salary is a certain X.
Specifically, in the above example, let's say I am trying to find the names of the persons whose salary is 288589. I expect all the names whose salary is 288589.
I have written the following generalised function, which takes a search key and value and returns the names of the persons for which that key-value pair holds true.
def search_person_by_attribute(attribute, value):
    person_names = []
    for person, attributes_dict in data_dict.items():
        if attributes_dict[attribute] == value:
            person_names.append(person)
    return person_names
This method runs successfully
results = search_person_by_attribute("salary", 288589)
print(results)
and prints
['UMANOFF ADAM S','Jack S']
But somehow I feel this is quite a long way to write it. Is there a better/shorter/more Pythonic way to do it?
If you can also mention the efficiency (in terms of time complexity) of my solution as well as your suggested one, that would be a great bonus.
I would suggest something like this, which I think is not just shorter, but more readable than your version:
def search_person_by_attribute(d, attribute, value):
    return [name for name in d if d[name][attribute] == value]
It works exactly like yours, but requires the dictionary as an additional parameter, because I think that's better style:
>>> search_person_by_attribute(d, "salary", 288589)
['UMANOFF ADAM S', 'Jack S']
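On the time-complexity part of the question: both the explicit loop and the list comprehension visit every person once, so each search is O(n) in the number of people. If the same attribute is searched repeatedly, one option is to build an index once (O(n)) and then answer each lookup in roughly constant time on average. A sketch, where build_index and names_by_salary are hypothetical names:

```python
from collections import defaultdict

data_dict = {
    'UMANOFF ADAM S': {'total_stock_value': 'NaN', 'loans': 'NaN', 'salary': 288589},
    'YEAP SOON': {'total_stock_value': 192758, 'loans': 'NaN', 'salary': 'NaN'},
    'PIPER GREGORY F': {'total_stock_value': 880290, 'loans': 1452356, 'salary': 19791},
    'Jack S': {'total_stock_value': 88000, 'loans': 'NaN', 'salary': 288589},
}

def build_index(d, attribute):
    # Maps each attribute value to the list of names that have it.
    index = defaultdict(list)
    for name, attributes in d.items():
        index[attributes[attribute]].append(name)
    return index

names_by_salary = build_index(data_dict, 'salary')
print(names_by_salary.get(288589, []))  # ['UMANOFF ADAM S', 'Jack S']
```

The index only reflects the data at the time it was built, so it has to be rebuilt or updated when data_dict changes.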
Here I have a list of dictionaries, and I need to find an object by one of its values.
people = [
    {'name': 'mifta'},
    {'name': 'khaled', 'age': 30},
    {'name': 'reshad', 'age': 31}
]
I would like to find by the 'age' key where the value is 30. I can do this in the following way:
for person in people:
    if person.get('age'):
        if person['age'] == 30:
            print(person)
Is there any better way to do this without lots of if/else?
You can just use dict.get() once instead of also indexing person['age']. It allows you to provide a default value if the key is missing:
dict.get
Return the value for key if key is in the dictionary, else default. If default is not given, it defaults to None, so that this method never raises a KeyError.
So you can try this:
people = [
{'name': 'mifta'},
{'name': 'khaled', 'age':30},
{'name': 'reshad', 'age':31}
]
for person in people:
    if person.get('age', 0) == 30:
        print(person)
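If you want to avoid if..else, you can use filter() with a lambda function (in Python 3, filter() returns an iterator, so wrap it in list() if you need a list):
fieldMatch = list(filter(lambda x: x.get('age') == 30, people))
Or use a list comprehension to get the names in a list: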
names = [person['name'] for person in people if person.get('age') == 30]
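If only the first matching person is needed, a generator expression passed to next() with a default is another option. A small sketch using the same sample data; first_match and no_match are names chosen for illustration.

```python
people = [
    {'name': 'mifta'},
    {'name': 'khaled', 'age': 30},
    {'name': 'reshad', 'age': 31},
]

# next() stops at the first hit; the second argument is returned if nothing matches.
first_match = next((p for p in people if p.get('age') == 30), None)
no_match = next((p for p in people if p.get('age') == 99), None)

print(first_match)  # {'name': 'khaled', 'age': 30}
print(no_match)     # None
```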
I have a list of people:
[
{'name' : 'John', 'wins' : 10 },
{'name' : 'Sally', 'wins' : 0 },
{'name' : 'Fred', 'wins' : 3 },
{'name' : 'Mary', 'wins' : 6 }
]
I am adding wins using a list of names (['Fred', 'Mary', 'Sally']). I don't know if the name is in the list of people already, and I need to insert a new record if not. Currently I'm doing the following:
name = 'John'
person = None
pidx = None
for p in people_list:
    if p['name'] == name:
        person = p
        pidx = people_list.index(p)
        break
if person is None:
    person = {'name': name, 'wins': 0}
person['wins'] += 1
if pidx is None:
    people_list.append(person)
else:
    people_list[pidx] = person
Is there a better way to do this with a list? Given that I'm saving this to MongoDB I can't use a dict as it will save as an object and I want to use native array functions for sorting and mapping that aren't available for objects.
I'm assuming here that you don't want to use any structure other than the list. Your code should work, although you unnecessarily write the dictionary back to the list after updating it. The list stores a reference to each dictionary rather than a copy, so once you update the dictionary, the change is already visible through the list. After a little housekeeping, your code could look like this:
def add_win(people_list, name):
    person = find_person(people_list, name)
    person['wins'] += 1

def find_person(people_list, name):
    for person in people_list:
        if person['name'] == name:
            return person
    # Not found: create the record, add it to the list and return it.
    person = {'name': name, 'wins': 0}
    people_list.append(person)
    return person
Yes, use a dict.
# winners is the list of names, e.g. ['Fred', 'Mary', 'Sally']
wins = {}
for name in winners:
    wins.setdefault(name, 0)
    wins[name] += 1
edit:
index = {}
for name in winners:
    person = index.setdefault(name, {'name': name, 'wins': 0})
    if person['wins'] == 0:
        people_list.append(person)
    person['wins'] += 1
If you don't want a dict permanently, use one temporarily.
people = [
    {'name': 'John', 'wins': 10},
    {'name': 'Sally', 'wins': 0},
    {'name': 'Fred', 'wins': 3},
    {'name': 'Mary', 'wins': 6}
]
wins = ['Fred', 'Mary', 'Sally']
people_dict = {p["name"]: p for p in people}
for winner in wins:
    # create a record for names that are not in the list yet
    person = people_dict.setdefault(winner, {"name": winner, "wins": 0})
    person["wins"] += 1
people = list(people_dict.values())
Your access pattern dictates the use of a different data structure (or at least another helper data structure). Scanning the list as you're doing is in fact the right thing to do if you're using a list, but you shouldn't be using a list (if you want it to be efficient, anyhow).
If the order of the list doesn't matter, you should use a dictionary (Python dict). If it does, you should use an OrderedDict from the collections module (since Python 3.7, a plain dict also preserves insertion order).
You could also use two separate data structures - the list you already have, and additionally a set containing just the names in the list so you have quick access to test inclusion or not. However, the set doesn't help you access the actual name data quickly (you'd still have to do a linear search in the list for that), so it would only be a helpful pattern if you merely were testing inclusion, but otherwise always walking the list as it was inserted.
Edit: it seems like what you might actually want is a list and a dict, where the dictionary is a mapping between the name and the index in the list. Alternatively, you could still use a dict or OrderedDict, but insert the data into Mongo as an array by using dict.items() (dict.iteritems() in Python 2) to create an array (or what would look like an array to Mongo) on insertion. You could use various tools, from zip to things in itertools, to dynamically build up the objects you need in your resultant array.
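The list-plus-dict idea from the edit above could be sketched like this. The names index_by_name and add_win are hypothetical, 'Zoe' is made-up data for a name that is not in the list yet, and the sketch assumes records are only ever appended, so the stored indexes stay valid.

```python
people_list = [
    {'name': 'John', 'wins': 10},
    {'name': 'Sally', 'wins': 0},
    {'name': 'Fred', 'wins': 3},
    {'name': 'Mary', 'wins': 6},
]

# Helper mapping: name -> position of that person's record in people_list.
index_by_name = {p['name']: i for i, p in enumerate(people_list)}

def add_win(people_list, index_by_name, name):
    i = index_by_name.get(name)
    if i is None:
        # Unknown name: append a new record and remember where it lives.
        i = len(people_list)
        people_list.append({'name': name, 'wins': 0})
        index_by_name[name] = i
    people_list[i]['wins'] += 1

for winner in ['Fred', 'Mary', 'Sally', 'Zoe']:
    add_win(people_list, index_by_name, winner)

print(people_list[-1])  # {'name': 'Zoe', 'wins': 1}
```

The list keeps its order and can be saved as an array, while each update costs a dictionary lookup instead of a scan.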
This specific case is implemented by the collections.Counter type. Combined with a list comprehension, this is one expression (names is the list of winners):
from collections import Counter
[{'name': name, 'wins': wins}
 for name, wins in Counter(names).items()]
If you want a specific order, sorted() is the easiest way (this also uses a generator expression (), rather than a list comprehension [], since the sequence is temporary):
sorted(({'name': name, 'wins': wins} for name, wins in Counter(names).items()),
       key=lambda item: item['name'])
Where item['name'] could be item['wins'] or any other comparable expression.
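Note that counting the names alone starts everyone at zero. If the existing win totals should be kept, one possibility is to seed the Counter from the current list first. A sketch under that assumption; totals is a hypothetical name and the names list is made-up sample input.

```python
from collections import Counter

people = [
    {'name': 'John', 'wins': 10},
    {'name': 'Sally', 'wins': 0},
    {'name': 'Fred', 'wins': 3},
    {'name': 'Mary', 'wins': 6},
]
names = ['Fred', 'Mary', 'Sally', 'Fred', 'Zoe']

# Start from the existing totals, then add one win per occurrence in names.
totals = Counter({p['name']: p['wins'] for p in people})
totals.update(names)

people = [{'name': name, 'wins': wins} for name, wins in totals.items()]
print(people)
```

Counter.update() adds to the existing counts rather than replacing them, which is what makes the merge a single call.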