pythonic way to iterate over this dict structure - python

My structure is something like this in a YAML file:
nutrition:
  fruits:
    apple:
    banana:
    pear:
  veggies:
    spinach:
    zucchini:
    squash:
  meats:
    chicken:
    fish:
    ham:
I load this in with yaml.load()
Not sure but likely because of the colons at the end of the leaf-elements (which I'm not sure need to be there), the entire structure is a 3-level dict. I can change the YML if needed to make it more efficient.
Now, I want to quickly iterate over the structure, and based on the leaf-level element I find (e.g. 'spinach'), I want to look up another simple dict, called 'recipes', which can have the string 'spinach' as a substring of its keys. This lookup dict can have keys that say 'spinach juice' or 'spinach pie' or 'chicken spinach'.
I found a way to do this, but not sure it is the right pythonic way. Here is what I have:
for food_class in database['nutrition']:
    for food in database['nutrition'][food_class]:
        for key, value in recipes.items():
            if re.search(food, key):
                print(key)
Any advice/pointers to make it more efficient and/or pythonic?

You can use dict.items() so you don't need to put the dictionary lookup in the nested loop.
The foods aren't regular expressions, just strings, so use in rather than re.search().
Since you're not using the value from recipe, you don't need .items()
for food_class, foods in database['nutrition'].items():
    for food in foods:
        for key in recipes:
            if food in key:
                print(key)
If you want to match the food only as a whole word, you can use re.search(r'\b' + re.escape(food) + r'\b', key) or food in key.split(' ') as the test.
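To make the difference between substring and whole-word matching concrete, here is a minimal sketch. The sample data and the helper name matching_recipes are made up for illustration, and the nested dict assumes that leaf keys with a trailing colon load from YAML as keys whose value is None.

```python
import re

# Hypothetical sample data standing in for the loaded YAML and the recipes dict.
database = {
    "nutrition": {
        "fruits": {"apple": None, "pear": None},
        "veggies": {"spinach": None},
        "meats": {"ham": None},
    }
}
recipes = {
    "pineapple cake": 1,
    "spinach pie": 2,
    "chicken spinach": 3,
    "graham crackers": 4,
}


def matching_recipes(database, recipes, whole_word=True):
    """Return recipe keys that mention any food listed under 'nutrition'."""
    matches = []
    for foods in database["nutrition"].values():
        for food in foods:
            if whole_word:
                # re.escape guards against foods containing regex metacharacters.
                pattern = re.compile(r"\b" + re.escape(food) + r"\b")
                matches.extend(key for key in recipes if pattern.search(key))
            else:
                # Plain substring test: 'ham' also matches 'graham crackers'.
                matches.extend(key for key in recipes if food in key)
    return matches


whole = matching_recipes(database, recipes, whole_word=True)
loose = matching_recipes(database, recipes, whole_word=False)
```

With this data, the substring test also picks up 'pineapple cake' for 'apple' and 'graham crackers' for 'ham', while the whole-word test only keeps the two spinach recipes.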

Related

How to create objects from a dict where the keys are the name objects and the values are the attributes? Using a loop

class Cars(object):
    def __init__(self,brand=None,color=None,cost=None):
        self.brand = brand
        self.color = color
        self.cost = cost
Imagine I have 300 cars (from car1 to car300):
dict = {"car1":["Toyota","Red",10000],
        "car2":["Tesla","White",20000],
        "car3":["Honda","Red",15000]
       }
What I have tried:
dict1 = globals()
for k,v in dict.items():
    dict1[f"{k}"] = Cars(v[0],v[1],v[2])
    print(k,v)
Is this the best possible way to do it?
Is this an aggressive way to do it?
I would like to learn an efficient, safe way to do it.
Close. First, you seem to have a naming problem: Cars versus Coche. You also shouldn't use dict as a variable name, because it shadows the built-in type. And you really need to consider whether putting these variables in the global namespace is a good idea. Besides that, there is no point in an f-string that adds nothing to the variable it references. You can unpack the list with *v and use a dict comprehension instead of a loop:
my_dict = {k:Cars(*v) for k,v in dict.items()}
Use a dictionary for all the cars, not globals.
You can create it in one step with a dictionary comprehension.
all_cars = {name: Cars(brand, color, cost) for name, (brand, color, cost) in dict.items()}
print(all_cars['car1'].brand)
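Putting the pieces together, a self-contained sketch might look like the following. The name car_specs is an assumption, chosen only to avoid shadowing the built-in dict; the class is the Cars class from the question.

```python
class Cars(object):
    def __init__(self, brand=None, color=None, cost=None):
        self.brand = brand
        self.color = color
        self.cost = cost


# car_specs is a hypothetical name used here instead of `dict`.
car_specs = {
    "car1": ["Toyota", "Red", 10000],
    "car2": ["Tesla", "White", 20000],
    "car3": ["Honda", "Red", 15000],
}

# One dictionary holds every car object; nothing is written to globals().
all_cars = {name: Cars(*spec) for name, spec in car_specs.items()}

# Looking cars up or filtering them is then ordinary dict work.
red_cars = sorted(name for name, car in all_cars.items() if car.color == "Red")
```

Keeping the objects in one dictionary also means that 300 cars never turn into 300 module-level names.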

Mapping Python Dict keys to another Dict

Due to some poor planning I have a script that expects a Python dict with certain keys; however, the other script that creates this dict uses a different naming convention.
Unfortunately, due to translations that have already taken place it looks like I'll need to convert the dict keys.
Basically go from
{'oldKey':'data'}
to
{'newKey':'data'}
I was thinking of creating a dict:
{'oldKey':'newKey'}
and iterating through it to convert from oldKey to newKey. Is this the most efficient/pythonic way to do it?
I can think of a couple of ways to do this which use dictionaries, but one of them might be more efficient depending on the coverage of the key usage.
a) With a dictionary comprehension:
old_dict = {'oldkey1': 'val1', 'oldkey2': 'val2',
            'oldkey3': 'val3', 'oldkey4': 'val4',
            'oldkey5': 'val5'}
key_map = {'oldkey1': 'newkey1', 'oldkey2': 'newkey2',
           'oldkey3': 'newkey3', 'oldkey4': 'newkey4',
           'oldkey5': 'newkey5'}
new_dict = {newkey: old_dict[oldkey] for (oldkey, newkey) in key_map.items()}
print(new_dict['newkey1'])
b) With a simple class that does the mapping. (Note that I have switched the order of key/value in key_map in this example.) This might be more efficient because it will use lazy evaluation - no need to iterate through all the keys - which may save time if not all the keys are used.
class DictMap(object):
    def __init__(self, key_map, old_dict):
        self.key_map = key_map
        self.old_dict = old_dict
    def __getitem__(self, key):
        return self.old_dict[self.key_map[key]]
key_map = {'newkey1': 'oldkey1',
           'newkey2': 'oldkey2',
           'newkey3': 'oldkey3',
           'newkey4': 'oldkey4',
           'newkey5': 'oldkey5'}
new_dict2 = DictMap(key_map, old_dict)
print(new_dict2['newkey1'])
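If the consuming script needs more than square-bracket lookups (for example in, get() or iteration), one possible extension of the class idea is to derive from collections.abc.Mapping, which supplies those methods once __getitem__, __iter__ and __len__ exist. This is only a sketch; KeyRenamedView is a hypothetical name, not something from the answers here.

```python
from collections.abc import Mapping


class KeyRenamedView(Mapping):
    """Read-only view of old_dict under new key names (hypothetical helper)."""

    def __init__(self, key_map, old_dict):
        self._key_map = key_map      # maps new key -> old key
        self._old_dict = old_dict

    def __getitem__(self, key):
        # Translate lazily on each lookup; nothing is copied up front.
        return self._old_dict[self._key_map[key]]

    def __iter__(self):
        return iter(self._key_map)

    def __len__(self):
        return len(self._key_map)


old_dict = {"oldkey1": "val1", "oldkey2": "val2"}
key_map = {"newkey1": "oldkey1", "newkey2": "oldkey2"}
view = KeyRenamedView(key_map, old_dict)

snapshot = dict(view)            # materialise a plain dict when needed
old_dict["oldkey1"] = "changed"  # the view reflects later changes
```

Because the view holds a reference to old_dict rather than a copy, later changes to old_dict show through, which may or may not be what you want.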
This will solve your problem:
new_dict={key_map[oldkey]: vals for (oldkey, vals) in old_dict.items()}
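Note that indexing key_map directly raises KeyError for any key of old_dict that has no entry in the map. If some keys should simply pass through unchanged, a variation along these lines could be used. The function name rename_keys and the pass-through behaviour are assumptions, not something the question asked for.

```python
def rename_keys(old_dict, key_map):
    """Return a new dict with keys renamed per key_map.

    Keys without an entry in key_map are kept as they are (an assumption
    made for this sketch; raise instead if unmapped keys are an error).
    """
    return {key_map.get(key, key): value for key, value in old_dict.items()}


old_dict = {"oldkey1": "val1", "oldkey2": "val2", "untouched": "val3"}
key_map = {"oldkey1": "newkey1", "oldkey2": "newkey2"}
new_dict = rename_keys(old_dict, key_map)
```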

How to get keys by value in dictionary (python 2.7)

I have this dictionary that describes students courses:
the keys are names (string) and the values are lists of courses (string)
students_dict={"name1":["math","computer science", "statistics"],"name2":["algebra","statistics","physics"],"name3":["statistics","math","programming"]}
I want to create a function that takes this dictionary and returns a new one:
the keys will be the courses (string)
and the values will be lists of the names of the students who take that course (list of strings)
course_students={"statistics":["name1","name2","name3"],"algebra":["name2"],"programming":["name3"],"computer science":["name1"],"physics":["name2"],"math":["name1","name3"]}
The order doesn't matter.
Edit: this is kind of what I'm trying to do:
def swap_student_courses(students_dict):
    students_in_each_cours={}
    cours_list=[...]
    cours_names=[]
    for cours in cours_list:
        if students_dict.has_key(cours)==True:
            cours_names.append(...)
            students_in_each_cours.append(cours_names)
    return students_in_each_cours
I would use a defaultdict here for simplicity's sake, but know that you can accomplish the same with a regular dict:
from collections import defaultdict
students_dict={"name1":["math","computer science", "statistics"],
               "name2":["algebra","statistics","physics"],
               "name3":["statistics","math","programming"]}
course_students = defaultdict(list)
for name, course_list in students_dict.items():
    for course in course_list:
        course_students[course].append(name)
It can be done with a set comprehension (to first get a unique set of course names) followed by a dict comprehension (to associate course names with a list of students for whom that course appears in their respective list):
all_courses = {course for student_course_list in students_dict.values() for course in student_course_list}
course_students = {course:[student_name for student_name,student_course_list in students_dict.items() if course in student_course_list] for course in all_courses}
Your attempted approach neglected to search through each student's course list: you used students_dict.has_key(cours) forgetting that student names, not courses, are the keys of students_dict.
Here is a simple function you could use.
from collections import defaultdict
def create_new_dict(old_dict):
    new_dict = defaultdict(list)
    for student, courses in old_dict.items():
        for course in courses:
            new_dict[course].append(student)
    return new_dict
The only difference between a standard Python dict and a defaultdict is what happens when you access a key that does not exist: a standard dict raises KeyError, while a defaultdict creates the key with a default value produced by the factory passed when the dict was created. In our case that is an empty list.
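A small sketch of that difference, using made-up keys:

```python
from collections import defaultdict

plain = {}
try:
    # 'math' is not in the dict yet, so indexing raises KeyError.
    plain["math"].append("name1")
    plain_error = None
except KeyError as exc:
    plain_error = exc

auto = defaultdict(list)
# The missing key is created with list() as its value, then appended to.
auto["math"].append("name1")

# Merely reading a missing key also inserts it with the default value.
_ = auto["physics"]
```

The last line is worth keeping in mind: with a defaultdict, even a lookup that only reads adds the key to the dictionary.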
Implementation without defaultdict
def create_new_dict(old_dict):
    new_dict = dict()
    for student, courses in old_dict.items():
        for course in courses:
            try:
                new_dict[course].append(student)
            except KeyError:
                new_dict[course] = [student]
    return new_dict
EDIT----
The KeyError is raised by a standard dict because the first time we try to access a key, 'math' for example, it is not yet in the dictionary. Here is an excellent explanation of dictionaries.
It is not a problem that values repeat because in that case we simply append new student to the list.

Most efficient way to add new keys or append to old keys in a dictionary during iteration in Python?

Here's a common situation when compiling data in dictionaries from different sources:
Say you have a dictionary that stores lists of things, such as things I like:
likes = {
    'colors': ['blue','red','purple'],
    'foods': ['apples', 'oranges']
}
and a second dictionary with some related values in it:
favorites = {
    'colors':'yellow',
    'desserts':'ice cream'
}
You then want to iterate over the "favorites" object and either append the items in that object to the list with the appropriate key in the "likes" dictionary or add a new key to it with the value being a list containing the value in "favorites".
There are several ways to do this:
for key in favorites:
    if key in likes:
        likes[key].append(favorites[key])
    else:
        likes[key] = [favorites[key]]  # list('ice cream') would split the string into characters
or
for key in favorites:
    try:
        likes[key].append(favorites[key])
    except KeyError:
        likes[key] = [favorites[key]]  # list('ice cream') would split the string into characters
And many more as well...
I generally use the first syntax because it feels more pythonic, but if there are other, better ways, I'd love to know what they are. Thanks!
Use collections.defaultdict, where the default value is a new list instance.
>>> import collections
>>> mydict = collections.defaultdict(list)
In this way calling .append(...) will always succeed, because in case of a non-existing key append will be called on a fresh empty list.
You can instantiate the defaultdict with a previously generated dict, in case you get the dict likes from another source, like so:
>>> mydict = collections.defaultdict(list, likes)
Note that using list as the default_factory attribute of a defaultdict is also discussed as an example in the documentation.
Use collections.defaultdict:
import collections
likes = collections.defaultdict(list)
for key, value in favorites.items():
    likes[key].append(value)
defaultdict takes a factory for creating values for unknown keys on demand as its first argument. list is such a function; it creates empty lists.
And iterating over .items() will save you from using the key to get the value.
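If the existing likes dict comes from elsewhere and should stay untouched, a sketch along these lines might help. merge_favorites is a hypothetical helper name; it copies the lists into a defaultdict and hands back a plain dict at the end.

```python
from collections import defaultdict


def merge_favorites(likes, favorites):
    """Return a new plain dict combining likes (lists) and favorites (single values)."""
    merged = defaultdict(list)
    for key, values in likes.items():
        merged[key].extend(values)   # copies the items, leaving likes untouched
    for key, value in favorites.items():
        merged[key].append(value)
    return dict(merged)              # callers get an ordinary dict back


likes = {"colors": ["blue", "red", "purple"], "foods": ["apples", "oranges"]}
favorites = {"colors": "yellow", "desserts": "ice cream"}
merged = merge_favorites(likes, favorites)
```

Converting back with dict() at the end is a design choice: code that receives the result gets the usual KeyError on missing keys rather than silently growing the dictionary.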
Besides defaultdict, the regular dict offers one possibility (that might look a bit strange): dict.setdefault(k[, d]):
for key, val in favorites.items():
    likes.setdefault(key, []).append(val)
Thank you for the +20 in rep -- I went from 1989 to 2009 in 30 seconds. Let's remember it is 20 years since the Wall fell in Europe..
>>> from collections import defaultdict
>>> d = defaultdict(list, likes)
>>> d
defaultdict(<class 'list'>, {'colors': ['blue', 'red', 'purple'], 'foods': ['apples', 'oranges']})
>>> for i, j in favorites.items():
...     d[i].append(j)
...
>>> d
defaultdict(<class 'list'>, {'desserts': ['ice cream'], 'colors': ['blue', 'red', 'purple', 'yellow'], 'foods': ['apples', 'oranges']})
All of the answers are defaultdict, but I'm not sure that's the best way to go about it. Giving out defaultdict to code that expects a dict can be bad. (See: How do I make a defaultdict safe for unexpecting clients? ) I'm personally torn on the matter. (I actually found this question looking for an answer to "which is better, dict.get() or defaultdict") Someone in the other thread said that you don't want a defaultdict if you don't want this behavior all the time, and that might be true. Maybe using defaultdict for the convenience is the wrong way to go about it. I think there are two needs being conflated here:
"I want a dict whose default values are empty lists." to which defaultdict(list) is the correct solution.
and
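"I want to append to the list at this key if it exists and create a list if it does not exist." to which my_dict.setdefault('foo', []) with append() is the answer. (Note that my_dict.get('foo', []) would not work here: get() returns the fallback list without storing it in the dict, so the appended value would be lost.)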
What do you guys think?
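For anyone weighing these options, here is a short sketch of how dict.get and dict.setdefault differ when the key is missing. The data is made up.

```python
likes = {"colors": ["blue"]}

# dict.get returns the fallback list but never stores it in the dict,
# so this append goes to a throwaway list.
likes.get("desserts", []).append("ice cream")
stored_after_get = "desserts" in likes

# dict.setdefault stores the fallback under the key, then returns it,
# so the append is visible in the dict afterwards.
likes.setdefault("desserts", []).append("ice cream")

# For an existing key, setdefault just returns the current list.
likes.setdefault("colors", []).append("yellow")
```

So a plain dict with setdefault gives the append-or-create behaviour at the call site, without changing how the dict behaves for anyone else who reads it.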

Check if value exists in nested lists

in my list:
animals = [ ['dog', ['bite'] ],
            ['cat', ['bite', 'scratch'] ],
            ['bird', ['peck', 'bite'] ], ]
add('bird', 'peck')
add('bird', 'screech')
add('turtle', 'hide')
The add function should check that the animal and action haven't been added before adding them to the list. Is there a way to accomplish this without nesting a loop for each step into the list?
You're using the wrong data type. Use a dict of sets instead:
def add(key, value, userdict):
    userdict.setdefault(key, set())
    userdict[key].add(value)
Usage:
animaldict = {}
add('bird', 'peck', animaldict)
add('bird', 'screech', animaldict)
add('turtle', 'hide', animaldict)
While it is possible to construct a generic function that finds the animal in the list using a.index or testing with "dog" in animals, you really want a dictionary here, otherwise the add function will scale abysmally as more animals are added:
animals = {'dog':set(['bite']),
           'cat':set(['bite', 'scratch'])}
You can then "one-shot" the add function using setdefault:
animals.setdefault('dog', set()).add('bite')
It will create the 'dog' key if it doesn't exist, and since setdefault returns the set that either exists or was just created, you can then add the bite action. Sets ensure that there are no duplicates automatically.
Based on recursive's solution, in Python 2.5 or newer you can use the defaultdict class, something like this:
from collections import defaultdict
a = defaultdict(set)
def add(animal, behavior):
    a[animal].add(behavior)
add('bird', 'peck')
add('bird', 'screech')
add('turtle', 'hide')
You really should use a dictionary for this purpose. Or alternatively a class Animal.
You could improve your code like this:
if not any(animal[0] == "bird" for animal in animals):
    animals.append(["bird", []])  # append "bird" to animals
animals_dict = dict(animals)
def add(key, action):
    animals_dict.setdefault(key, [])
    if action not in animals_dict[key]:
        animals_dict[key].append(action)
(Updated to use setdefault - nice one @recursive)
While I agree with the others re. your choice of data structure, here is an answer to your question:
def add(name, action):
    for animal in animals:
        if animal[0] == name:
            if action not in animal[1]:
                animal[1].append(action)
            return
    else:
        animals.append([name, [action]])
The for loop is an inevitable consequence of your data structure, which is why everyone is advising you to consider dictionaries instead.
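If the nested list already exists and you want to move to the dictionary approach, a conversion along these lines may be enough. The name animal_actions is made up for this sketch, and the conversion back to the list layout is only needed if other code still expects it.

```python
animals = [
    ["dog", ["bite"]],
    ["cat", ["bite", "scratch"]],
    ["bird", ["peck", "bite"]],
]

# Convert the nested list into a dict of sets once.
animal_actions = {name: set(actions) for name, actions in animals}


def add(name, action):
    """Add an action for an animal; duplicates are ignored by the set."""
    animal_actions.setdefault(name, set()).add(action)


add("bird", "peck")      # already present, no change
add("bird", "screech")   # new action for an existing animal
add("turtle", "hide")    # new animal

# Back to the original nested-list layout, with actions sorted for a stable order.
as_list = [[name, sorted(actions)] for name, actions in animal_actions.items()]
```

Each add call is then a dictionary lookup plus a set insertion, with no explicit loop over the animals.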
