Store path to dictionary value - python

How might one store the path to a value in a dict of dicts? For instance, we can easily store the path to the name value in a variable name_field:
person = {}
person['name'] = 'Jeff Atwood'
person['address'] = {}
person['address']['street'] = 'Main Street'
person['address']['zip'] = '12345'
person['address']['city'] = 'Miami'
# Get name
name_field = 'name'
print( person[name_field] )
How might the path to the city value be stored?
# Get city
city_field = ['address', 'city']
print( person[city_field] )  # Obviously won't work!

You can do:
path = ('address', 'city')
lookup = person
for key in path:
    lookup = lookup[key]
print(lookup)
# gives: Miami
This will raise a KeyError if a part of the path does not exist.
It will also work if path consists of a single key, such as ('name',).
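If you would rather get a default value than a KeyError when part of the path is missing, a small helper along these lines works (the name deep_get is just illustrative):

```python
def deep_get(mapping, path, default=None):
    """Follow a sequence of keys into nested dicts, returning
    default if any key along the path is missing."""
    current = mapping
    for key in path:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current

person = {'name': 'Jeff Atwood',
          'address': {'street': 'Main Street', 'zip': '12345', 'city': 'Miami'}}

print(deep_get(person, ('address', 'city')))   # Miami
print(deep_get(person, ('address', 'state')))  # None
```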

You can use the reduce function to do this (in Python 3 it lives in functools):
from functools import reduce
print(reduce(lambda x, y: x[y], city_field, person))
Output:
Miami

An alternative that builds on Simeon's (and Unubtu's deleted) answer is to create your own dict class that defines an extra method:
class mydict(dict):
    def lookup(self, *args):
        tmp = self
        for field in args:
            tmp = tmp[field]
        return tmp

person = mydict()
person['name'] = 'Jeff Atwood'
person['address'] = {}
person['address']['street'] = 'Main Street'
person['address']['zip'] = '12345'
person['address']['city'] = 'Miami'
print(person.lookup('address', 'city'))
print(person.lookup('name'))
print(person.lookup('city'))
which results in:
Miami
Jeff Atwood
Traceback (most recent call last):
  File "look.py", line 17, in <module>
    print(person.lookup('city'))
  File "look.py", line 5, in lookup
    tmp = tmp[field]
KeyError: 'city'
You can shorten the loop per the suggestion of thefourtheye. If you want to be really fancy, you can override special methods like __getitem__ to allow for a case like person['address', 'city'], but then things may become tricky.
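As a sketch of that fancier route (not thefourtheye's actual suggestion), overriding __getitem__ so that tuple keys walk the nesting might look like this:

```python
class PathDict(dict):
    """dict subclass where person['address', 'city'] walks nested dicts."""
    def __getitem__(self, key):
        if isinstance(key, tuple):
            # first hop through this dict, remaining hops through plain dicts
            value = dict.__getitem__(self, key[0])
            for part in key[1:]:
                value = value[part]
            return value
        return dict.__getitem__(self, key)

person = PathDict(name='Jeff Atwood',
                  address={'street': 'Main Street', 'zip': '12345', 'city': 'Miami'})
print(person['address', 'city'])  # Miami
print(person['name'])             # Jeff Atwood
```

One of the tricky parts the answer alludes to: a tuple that is itself a legitimate key becomes ambiguous with this scheme.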

Here's another way - it behaves exactly the same as Simeon Visser's:
from functools import reduce
from operator import itemgetter

pget = lambda map, path: reduce(lambda x, p: itemgetter(p)(x), path, map)
With your example data:
person = {
    'name': 'Jeff Atwood',
    'address': {
        'street': 'Main Street',
        'zip': '12345',
        'city': 'Miami',
    },
}
print(pget(person, ('address', 'zip')))  # Prints '12345'
print(pget(person, ('name',)))           # Prints 'Jeff Atwood'
pget(person, ('nope',))                  # Raises KeyError

Related

Linking two lists based on a common value and

I am new to Python 2.7. I want the first column of employees to act as the key, check it against the first column of dept, and generate the results below. employees comes from a text file and dept comes from a database. I tried a lot but didn't find an easy answer. What is wrong with my code?
**Inputs :**
employees=['1','peter','london']
employees=['2','conor','london']
employees=['3','ciara','london']
employees=['4','rix','london']
dept=['1','account']
dept=['2','developer']
dept=['3','hr']
**Expected Output :**
results=['1','peter','london','account']
results=['2','conor','london','developer']
results=['3','ciara','london','hr']
results=['4','rix','london',null]
Your input as written makes no sense: each line overwrites the previous one, data-wise. It seems that the digits (as strings) are the keys, and some default action must be taken when no info is found in dept.
To keep the spirit, just create two dictionaries, then use a dictionary comprehension to generate the result:
employees = dict()
dept = dict()
employees['1'] = ['peter','london']
employees['2'] = ['conor','london']
employees['3'] = ['ciara','london']
employees['4'] = ['rix','london']
dept['1']=['account']
dept['2']=['developer']
dept['3']=['hr']
result = {k:v+dept.get(k,[None]) for k,v in employees.items()}
print(result)
which yields a dictionary with all the info. Note that null is None in python:
{'1': ['peter', 'london', 'account'], '4': ['rix', 'london', None], '3': ['ciara', 'london', 'hr'], '2': ['conor', 'london', 'developer']}
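If you do need flat rows matching the expected output, the same dictionaries can be flattened back (a small sketch; sorted is only there to fix the order):

```python
employees = {'1': ['peter', 'london'], '2': ['conor', 'london'],
             '3': ['ciara', 'london'], '4': ['rix', 'london']}
dept = {'1': ['account'], '2': ['developer'], '3': ['hr']}

# join, falling back to None when the ID has no dept entry
result = {k: v + dept.get(k, [None]) for k, v in employees.items()}
# flatten the dict back into rows, key first
rows = [[k] + v for k, v in sorted(result.items())]
print(rows[0])  # ['1', 'peter', 'london', 'account']
print(rows[3])  # ['4', 'rix', 'london', None]
```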
You could go for a class. Consider this:
class Employee:
    def __init__(self, number, name, location, dept):
        self.number = str(number)
        self.name = name
        self.location = location
        self.dept = dept

    def data(self):
        return [self.number,
                self.name,
                self.location,
                self.dept]

peter = Employee(1, 'peter', 'london', 'account')
print(peter.data())
which prints:
['1', 'peter', 'london', 'account']

Run value through function

I have the following management command in Django which updates a record with data from an external source:
class Command(BaseCommand):
    def handle(self, *args, **options):
        field_mappings = {
            'first_name': 'Voornaam',
            'initials': 'Voorletters',
            'last_name_prefix': 'Voorvoegsel',
            'last_name': 'Achternaam',
            'sex': 'Geslacht',
            'DOB': ['Geboortedatum', 'convert_DOB'],
            'street': 'Straat',
            'house_number': 'Huisnummer',
            'zipcode': 'Postcode',
            'city': 'Woonplaats',
            'country': 'Land',
            'phone_number': 'Telefoonnummer',
            'phone_number_mobile': 'MobielTelefoonnummer',
            'email_address': 'Emailadres',
            'insurer_name': 'NaamVerzekeraar',
            'insurance_policy_number': 'PolisnummerVerzekering',
            'gp_name': 'NaamHuisarts',
        }
        patients = Patient.objects.all()
        for patient in patients:
            result = Query(patient.pharmacy, 'patient_by_bsn', {'BSN': patient.BSN}).run()
            for x, y in field_mappings.items():
                if type(y) == list:
                    pass
                else:
                    setattr(patient, x, result[y]['0'])
            patient.save()
            print('Patient {}-{} updated'.format(patient.pharmacy.vv_id, patient.vv_id))

    @staticmethod
    def convert_DOB(value):
        return timezone.datetime.utcfromtimestamp(value).date()
Most fields can be saved without converting the data first, but some fields like the DOB need converting (in this case from a UNIX timestamp to a Python datetime.date). Where it currently says pass underneath if type(y) == list I want to run the value through the listed function first so that it would save convert_DOB(value) instead of the original value - how can I do this?
First, in your mapping don't use the name of the conversion function but the function itself, i.e.:
def convert_DOB(dob):
    # your code here
    # ...

field_mappings = {
    # ...
    'DOB': ['Geboortedatum', convert_DOB],
    # ...
}
Then you just have to pass your value to the function and retrieve the result:
for attrname, sourcekey in field_mappings.items():
    if isinstance(sourcekey, list):
        # unpack the list (assumes length 2)
        sourcekey, converter = sourcekey
        value = result[sourcekey]['0']
        # convert the value
        value = converter(value)
    else:
        # no conversion needed
        value = result[sourcekey]['0']
    # done
    setattr(patient, attrname, value)
Note that from a purely semantic POV you should be using a tuple, not a list - a list is supposed to be a homogeneous collection where position is not significant, while a tuple is a heterogeneous collection where position is significant. In your case the position is indeed significant, since the first item is the key into the results dict and the second is the conversion function.
Also the way you use field_mappings suggests you could replace it with a list of (attrname, key_or_key_converter) tuples since you only iterate over the dict's (key, value) pairs.
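A minimal sketch of that tuple-based layout, using a couple of the question's field names and a stand-in payload (the converter shown is only an assumed example of what convert_DOB might do):

```python
import datetime

def convert_DOB(value):
    # hypothetical converter: UNIX timestamp -> ISO date string
    return datetime.datetime.utcfromtimestamp(value).date().isoformat()

# each entry: (model attribute, source key) or (model attribute, (source key, converter))
field_mappings = [
    ('first_name', 'Voornaam'),
    ('DOB', ('Geboortedatum', convert_DOB)),
]

# sample payload shaped like result[key]['0'] from the question
result = {'Voornaam': {'0': 'Jeff'}, 'Geboortedatum': {'0': 0}}

record = {}
for attrname, spec in field_mappings:
    if isinstance(spec, tuple):
        sourcekey, converter = spec
        record[attrname] = converter(result[sourcekey]['0'])
    else:
        record[attrname] = result[spec]['0']

print(record)  # {'first_name': 'Jeff', 'DOB': '1970-01-01'}
```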

Normalize (map) JSON response in Python

I have an object that I'm retrieving via request that I need to map over and normalize to an expected response:
address_list = list(map(normalize_shipping_address, address_list))
I was hoping to be able to set up some sort of dict to handle the mapping such as:
def normalize_shipping_address(address):
    normalized_address = {}
    valueMapping = {
        'firstname': 'firstName',
        'lastname': 'lastName',
        'city': 'city',
        'postcode': 'zip',
        'countryId': 'country',
        'defaultShipping': 'isDefault',
    }
    for source_key, destination_key in valueMapping.items():
        try:
            value = address[source_key]
        except KeyError:
            value = None
        normalized_address[destination_key] = value
    return normalized_address
but realized there is not a good way for me to get nested dictionary keys or list objects such as: address.some.nested.key or address.some.list[0].key.
I ended up handling these special "cases" by doing things such as:
try:
    state = address['region']['regionCode']
except KeyError:
    state = None
normalized_address['state'] = state
and
try:
    address_name = address['customAttributes'][0]['value']
except KeyError:
    address_name = None
normalized_address['addressName'] = address_name
This seems a bit verbose and clunky. Is there a more elegant way to parse/normalize dictionaries like this?
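For what it's worth, one way to make the mapping handle nested keys and list indices (a sketch, not an accepted answer) is to let the mapping values be key paths and walk them with a small helper:

```python
def get_path(obj, path, default=None):
    """Walk a sequence of dict keys / list indices; return default on any miss."""
    for step in path:
        try:
            obj = obj[step]
        except (KeyError, IndexError, TypeError):
            return default
    return obj

# source path -> destination key (field names taken from the question)
PATH_MAPPING = {
    ('firstname',): 'firstName',
    ('region', 'regionCode'): 'state',
    ('customAttributes', 0, 'value'): 'addressName',
}

def normalize_shipping_address(address):
    return {dest: get_path(address, path) for path, dest in PATH_MAPPING.items()}

address = {'firstname': 'Jeff',
           'region': {'regionCode': 'FL'},
           'customAttributes': [{'value': 'Home'}]}
print(normalize_shipping_address(address))
# {'firstName': 'Jeff', 'state': 'FL', 'addressName': 'Home'}
```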

Consolidating row data from DB into a list of dicts

I'm reading data from a SELECT statement in SQLite. Data comes in the following form:
ID|Phone|Email|Status|Role
Multiple rows may be returned for the same ID, Phone, or Email. And for a given row, either Phone or Email can be empty/NULL. However, for the same ID, the Status and Role values are always the same. For example:
1|1234567892|a#email.com| active |typeA
2|3434567893|b#email.com| active |typeB
2|3434567893|c#email.com| active |typeB
3|5664567891|d#email.com|inactive|typeC
3|7942367891|d#email.com|inactive|typeC
4|5342234233| NULL | active |typeD
5| NULL |e#email.com| active |typeD
These data are returned as a list by Sqlite3, let's call it results. I need to go through them and reorganize the data to construct another list structure in Python. The final list basically consolidates the data for each ID, such that:
Each item of the final list is a dict, one for each unique ID in results. In other words, multiple rows for the same ID will be merged.
Each dict contains these keys: 'id', 'phones', 'emails', 'types', 'role', 'status'.
'phones' and 'emails' are lists, and contains zero or more items, but no duplicates.
'types' is also a list, and contains either 'phone' or 'email' or both, but no duplicates.
The order of dicts in the final list does not matter.
So far I have come up with this:
processed = {}
for r in results:
    if r['ID'] in processed:
        p_data = processed[r['ID']]
        if r['Phone']:
            p_data['phones'].add(r['Phone'])
            p_data['types'].add('phone')
        if r['Email']:
            p_data['emails'].add(r['Email'])
            p_data['types'].add('email')
    else:
        p_data = {'id': r['ID'], 'status': r['Status'], 'role': r['Role'],
                  'phones': set(), 'emails': set(), 'types': set()}
        if r['Phone']:
            p_data['phones'].add(r['Phone'])
            p_data['types'].add('phone')
        if r['Email']:
            p_data['emails'].add(r['Email'])
            p_data['types'].add('email')
        processed[r['ID']] = p_data

consolidated = list(processed.values())
I wonder if there is a faster and/or more concise way to do this.
EDIT:
A final detail: I would prefer to have 'phones', 'emails', and 'types' in each dict as list instead of set. The reason is that I need to dump consolidated into JSON, and JSON does not allow set.
When faced with something like this I usually use:
import collections

processed = collections.defaultdict(lambda: {'phone': set(), 'email': set(), 'status': None, 'type': set()})
and then something like:
for r in results:
    for field in ['Phone', 'Email']:
        if r[field]:
            processed[r['ID']][field.lower()].add(r[field])
            processed[r['ID']]['type'].add(field.lower())
Finally, you can dump it into a dictionary or a list:
a_list = list(processed.items())
a_dict = dict(a_list)
Regarding the JSON problem with sets, you can either convert the sets to lists right before serializing or write a custom encoder (very useful!). Here is an example of one I have for dates, extended to handle sets:
class JSONDateTimeEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, datetime.datetime):
            return int(time.mktime(obj.timetuple()))
        elif isinstance(obj, set):
            return list(obj)
        try:
            return json.JSONEncoder.default(self, obj)
        except:
            return str(obj)
and to use it:
json.dumps(a_list, sort_keys=True, indent=2, cls=JSONDateTimeEncoder)
I assume results is a 2d list:
print(results)
# [['1', '1234567892', 'a#email.com', ' active ', 'typeA'],
#  ['2', '3434567893', 'b#email.com', ' active ', 'typeB'],
#  ['2', '3434567893', 'c#email.com', ' active ', 'typeB'],
#  ['3', '5664567891', 'd#email.com', 'inactive', 'typeC'],
#  ['3', '7942367891', 'd#email.com', 'inactive', 'typeC'],
#  ['4', '5342234233', ' NULL ', ' active ', 'typeD'],
#  ['5', ' NULL ', 'e#email.com', ' active ', 'typeD']]
Now we group this list by id:
from itertools import groupby

data_grouped = [(k, list(v)) for k, v in groupby(sorted(results, key=lambda x: x[0]), lambda x: x[0])]
# make a list of column names (should correspond to results). These will be dict keys
names = ['id', 'phone', 'email', 'status', 'role']
ID_info = {g[0]: {names[i]: list(list(map(set, zip(*g[1])))[i]) for i in range(len(names))} for g in data_grouped}
Now for the types:
for k in ID_info:
    email = [i for i in ID_info[k]['email'] if i.strip() != 'NULL' and i != '']
    phone = [i for i in ID_info[k]['phone'] if i.strip() != 'NULL' and i != '']
    if email and phone:
        ID_info[k]['types'] = ['phone', 'email']
    elif email and not phone:
        ID_info[k]['types'] = ['email']
    elif phone and not email:
        ID_info[k]['types'] = ['phone']
    else:
        ID_info[k]['types'] = []
    # project
    ID_info[k]['id'] = ID_info[k]['id'][0]
    ID_info[k]['role'] = ID_info[k]['role'][0]
    ID_info[k]['status'] = ID_info[k]['status'][0]
And what you asked for (a list of dicts) is returned by list(ID_info.values()).

How do I get an associated value in a JSON variable using Python?

How do I look up the 'id' associated with a person's 'name' when the two are in a dictionary?
user = 'PersonA'
id = ?  # How do I retrieve the 'id' from the user_stream json variable?
JSON, stored in a variable named "user_stream":
[
    {
        'name': 'PersonA',
        'id': '135963'
    },
    {
        'name': 'PersonB',
        'id': '152265'
    },
]
You'll have to decode the JSON structure and loop through all the dictionaries until you find a match:
for person in json.loads(user_stream):
    if person['name'] == user:
        id = person['id']
        break
else:
    # The else branch is only ever reached if no match was found
    raise ValueError('No such person')
If you need to make multiple lookups, you probably want to transform this structure to a dict to ease lookups:
name_to_id = {p['name']: p['id'] for p in json.loads(user_stream)}
then look up the id directly:
id = name_to_id.get(name) # if name is not found, id will be None
The above example assumes that names are unique, if they are not, use:
from collections import defaultdict

name_to_id = defaultdict(list)
for person in json.loads(user_stream):
    name_to_id[person['name']].append(person['id'])

# lookup
ids = name_to_id.get(name, [])  # list of ids, defaults to empty
This is as always a trade-off, you trade memory for speed.
Martijn Pieters's solution is correct, but if you intend to make many such look-ups it's better to load the json and iterate over it just once, not once per look-up.
name_id = {}
for person in json.loads(user_stream):
    name = person['name']
    id = person['id']
    name_id[name] = id

user = 'PersonA'
print(name_id[user])
persons = json.loads(...)
results = list(filter(lambda p: p['name'] == 'avi', persons))
if results:
    id = results[0]["id"]
results can contain more than one match, of course.
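If you only need the first match, next() with a generator expression avoids building the whole list (sketched here with a valid-JSON version of the question's data):

```python
import json

user_stream = '[{"name": "PersonA", "id": "135963"}, {"name": "PersonB", "id": "152265"}]'
persons = json.loads(user_stream)

user = 'PersonA'
# stop at the first person whose name matches; None if nobody matches
person_id = next((p['id'] for p in persons if p['name'] == user), None)
print(person_id)  # 135963
```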
