Trying not to use too many variables in code, I came up with the code below. It looks horrible. Any ideas on how to format it nicely? Do I need to use more variables?
I write code like this a lot, and it would help to see what methods people usually resort to in order to keep code readable while creating fewer variables.
exceptions = []
# find all the distinct parent exceptions (sorted) and add to the list
# with their children list
for parent in collection.find(
        {'tags': 'exception'}).sort('viewPriority').distinct('parentException'):
    group_info = {'groupName': parent,
                  'children': [{'value': ex['value'],
                                'label': ex['label'],}
                               for ex in collection.find({'tags': 'exception',
                                                          'parentException': parent}
                                                         ).sort('viewPriority')],
                  }
    exceptions.append(group_info)
I would break your logic up into functions:
def get_children(parent):
    result = collection.find({'tags': 'exception', 'parentException': parent})
    result = result.sort('viewPriority')
    return [{'value': ex['value'], 'label': ex['label']} for ex in result]

def get_group_info(parent):
    return {'groupName': parent, 'children': get_children(parent)}

result = collection.find({'tags': 'exception'})
result = result.sort('viewPriority').distinct('parentException')
exceptions = [get_group_info(parent) for parent in result]
As a bonus, you can easily unittest get_children and get_group_info
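To illustrate the testing point, here is a minimal sketch of how get_children could be exercised without a database. It assumes the collection is passed in as an argument rather than read from a global, and FakeCollection and FakeCursor are hypothetical stand-ins that only mimic equality-based find() and single-key sort(); they are not part of any real driver.

```python
class FakeCursor:
    """Hypothetical stand-in for a database cursor; only supports sort()."""
    def __init__(self, docs):
        self.docs = list(docs)

    def sort(self, key):
        # Return a new cursor ordered by the given field.
        return FakeCursor(sorted(self.docs, key=lambda d: d[key]))

    def __iter__(self):
        return iter(self.docs)


class FakeCollection:
    """Hypothetical in-memory collection with equality-only find()."""
    def __init__(self, docs):
        self.docs = docs

    def find(self, query):
        return FakeCursor(
            d for d in self.docs
            if all(d.get(k) == v for k, v in query.items())
        )


def get_children(collection, parent):
    # Same logic as above, but the collection is an explicit parameter.
    result = collection.find({'tags': 'exception', 'parentException': parent})
    result = result.sort('viewPriority')
    return [{'value': ex['value'], 'label': ex['label']} for ex in result]


docs = [
    {'tags': 'exception', 'parentException': 'IO', 'viewPriority': 2,
     'value': 'b', 'label': 'B'},
    {'tags': 'exception', 'parentException': 'IO', 'viewPriority': 1,
     'value': 'a', 'label': 'A'},
    {'tags': 'exception', 'parentException': 'Net', 'viewPriority': 3,
     'value': 'c', 'label': 'C'},
]
children = get_children(FakeCollection(docs), 'IO')
```

With the data above, children holds the two 'IO' entries ordered by viewPriority, which is easy to assert in a unit test.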
It is definitely difficult to get this to look good. Here is my best attempt at keeping the lines short while maintaining readability:
exceptions = []
# find all the distinct parent exceptions (sorted) and add to the list
# with their children list
for parent in (collection.find({'tags': 'exception'})
               .sort('viewPriority').distinct('parentException')):
    group_info = {
        'groupName': parent,
        'children': [{'value': ex['value'], 'label': ex['label'],}
                     for ex in (collection.find({'tags': 'exception',
                                                 'parentException': parent})
                                .sort('viewPriority'))],
    }
    exceptions.append(group_info)
I'm scraping a website, which returns a dictionary:
person = {'name0': {'first0': 'John', 'last0': 'Smith'},
          'age0': '10',
          'location0': {'city0': 'Dublin'}
          }
I'm trying to write a function that will return a dictionary {'name':'John', 'age':'10'} when passed the above dictionary.
I want to ideally put a try:... except KeyError around each item since sometimes keys will be missing.
def func(person):
    filters = [('age', 'age0'), ('name', ['name0', 'first0'])]
    result = {'name': None, 'age': None}
    for i in filters:
        try:
            result[i[0]] = person[i[1]]
        except KeyError:
            pass
    return result
The problem is that result[i[0]] = person[i[1]] doesn't work for 'name', since there are two keys that need to be followed sequentially and I don't know how to do that.
I want some way of telling it (in the loop) to go to person['name0']['first0'] (and so on to whatever depth the thing I want is).
I have lots of things to extract, so I'd rather do it in a loop instead of a try..except statement for each variable individually.
To follow several keys sequentially, you can use get and set the default value to {} (an empty dictionary) for the upper levels. Set the default value to None (or whatever suits you) for the last level:
def func(person):
    return {'name': person.get('name0', {}).get('first0', None),
            'age': person.get('age0', None)}
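Since the question asks for something that works in a loop, the same idea can be sketched as a small helper that takes a list of keys. The name dig and the filters mapping are only illustrative. Like the chained get calls above, this assumes every intermediate value is a dictionary when it is present.

```python
from functools import reduce


def dig(data, path, default=None):
    """Follow the keys in `path`; return `default` if any key is missing."""
    *parents, last = path
    # Upper levels fall back to {} so the next .get() call is always valid.
    inner = reduce(lambda d, key: d.get(key, {}), parents, data)
    # Only the last level uses the caller's default.
    return inner.get(last, default)


def func(person):
    # Output key -> path of keys to follow in the scraped dictionary.
    filters = {'name': ['name0', 'first0'], 'age': ['age0']}
    return {name: dig(person, path) for name, path in filters.items()}


person = {'name0': {'first0': 'John', 'last0': 'Smith'},
          'age0': '10',
          'location0': {'city0': 'Dublin'}}
result = func(person)
```

Adding another field to extract then only means adding one entry to filters.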
Best I could manage was using a for loop to iterate through the keys:
person = {'name0': {'first0': 'John', 'last0': 'Smith'},
          'age0': '10',
          'location0': {'city0': 'Dublin'}
          }
Additionally I used .get(key) rather than try..except as suggested by #wiwi
def func(person):
    filters = [('age', ['age0']), ('name', ['name0', 'first0'])]
    result = {'name': None, 'age': None}
    for filter in filters:
        temp = person.copy()
        for key in filter[1]:
            temp = temp.get(key)
            if not temp:  # NoneType doesn't have .get method
                break
        result[filter[0]] = temp
    return result
func(person) then returns {'name': 'John', 'age': '10'}.
It handles missing input too:
person2 = {'age0':'10',
'location0':{'city0':'Dublin'}}
func(person2) returns {'name': None, 'age': '10'}
You can put the try...except in another loop, if there's a list of keys instead of a single key:
def getNestedVal(obj, kPath: list, defaultVal=None):
    if isinstance(kPath, str) or not hasattr(kPath, '__iter__'):
        kPath = [kPath]  # if not iterable, wrap as list
    for k in kPath:
        try:
            obj = obj[k]
        except (KeyError, IndexError, TypeError):  # missing key or non-subscriptable value
            return defaultVal
    return obj

def func(person):
    filters = [('age', 'age0'), ('name', ['name0', 'first0']),
               ('gender', ['gender0'], 'N/A')]  # includes default value
    return {k[0]: getNestedVal(person, *k[1:3]) for k in filters}
[I added gender just to demonstrate how defaults can also be specified for missing values.]
With this, func(person) should return
{'age': '10', 'name': 'John', 'gender': 'N/A'}
I also have a flattenObj function, a version of which is defined below:
def flattenDict(orig: dict, kList=[], kSep='_', stripNum=True):
    if not isinstance(orig, dict):
        return [(kList, orig)]
    tList = []
    for k, v in orig.items():
        if isinstance(k, str) and stripNum:
            k = k.strip('0123456789')
        # recurse with kSep=None so nested calls return (key path, value) pairs;
        # stripNum is passed along so the setting also applies to nested keys
        tList += flattenDict(v, kList + [str(k)], None, stripNum)
    if not isinstance(kSep, str):
        return tList
    return {kSep.join(kl): v for kl, v in tList}
[I added stripNum just to get rid of the 0s in your keys...]
flattenDict(person) should return
{'name_first': 'John', 'name_last': 'Smith', 'age': '10', 'location_city': 'Dublin'}
def read_data(service_client):
    data = list_data(domain, realm)  # This returns a data frame
    building_data = []
    building_names = {}
    all_buildings = {}
    for elem in data.iterrows():
        building = elem[1]['building_name']
        region_id = elem[1]['region_id']
        bandwith = elem[1]['bandwith']
        building_id = elem[1]['building_id']
        return {
            'Building': building,
            'Region Id': region_id,
            'Bandwith': bandwith,
            'Building Id': building_id,
        }
Basically, in this example I am only able to return a single dictionary, from one iteration. I have also tried printing it, among other things.
I am trying to find a way to store a dictionary on each iteration and return all of them, instead of just returning one. Does anyone know a way to achieve this?
You may replace your for-loop with the following to get all dictionaries in a list.
naming = {
    'building_name': 'Building',
    'region_id': 'Region Id',
    'bandwith': 'Bandwith',
    'building_id': 'Building Id',
}
return [
    row[list(naming.values())].to_dict()
    for idx, row in data.rename(naming, axis=1).iterrows()
]
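If you prefer to keep an explicit loop, the underlying fix is to append to a list inside the loop and return once after it. The following is a minimal sketch of that pattern. It assumes plain dictionaries as a stand-in for the data frame rows, and the function name read_rows and the sample values are made up for illustration.

```python
def read_rows(rows):
    """Collect one dictionary per row and return them all after the loop."""
    buildings = []
    for row in rows:
        buildings.append({
            'Building': row['building_name'],
            'Region Id': row['region_id'],
            'Bandwith': row['bandwith'],
            'Building Id': row['building_id'],
        })
    # The return sits outside the loop, so every row is processed first.
    return buildings


# Hypothetical sample rows standing in for the data frame contents.
rows = [
    {'building_name': 'HQ', 'region_id': 1, 'bandwith': 100, 'building_id': 10},
    {'building_name': 'Lab', 'region_id': 2, 'bandwith': 50, 'building_id': 11},
]
all_buildings = read_rows(rows)
```

With a real data frame, the same loop body would read from the second element of each item yielded by iterrows(), as in the original code.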
I wrote code that takes 9 keys from an API.
The authors, isbn_one, isbn_two, thumbinail, and page_count fields may not always be retrievable, and if any of them is missing, I would like it to be None. Unfortunately, using if statements, even nested ones, doesn't work, because that leads to a lot of branches. I also tried try and except KeyError, but each key can raise a different error, and it is not known which key to assign None to. Here is an example of the logic when a photo is missing:
th = result['volumeInfo'].get('imageLinks')
if th is not None:
    book_exists_thumbinail = {
        'thumbinail': result['volumeInfo']['imageLinks']['thumbnail']
    }
    dnew = {**book_data, **book_exists_thumbinail}
    book_import.append(dnew)
else:
    book_exists_thumbinail_n = {
        'thumbinail': None
    }
    dnew_none = {**book_data, **book_exists_thumbinail_n}
    book_import.append(dnew_none)
When I use if/else logic, once one condition is met, e.g. for thumbinail, the rest are not even checked.
When I use try and except, it's similar. There's also an ISBN in the keys, but there's a list in the dictionary over there, and I need to use something like this:
isbn_zer = result['volumeInfo']['industryIdentifiers']
dic = collections.defaultdict(list)
for d in isbn_zer:
    for k, v in d.items():
        dic[k].append(v)
Output data: [{'type': 'ISBN_10', 'identifier': '8320717507'}, {'type': 'ISBN_13', 'identifier': '9788320717501'}]
I don't know what to use anymore to check each key separately and in the case of its absence or lack of one ISBN (identifier) assign the value None. I have already tried many ideas.
The rest of the code:
book_import = []
if request.method == 'POST':
    filter_ch = BookFilterForm(request.POST)
    if filter_ch.is_valid():
        cd = filter_ch.cleaned_data
        filter_choice = cd['choose_v']
        filter_search = cd['search']
        search_url = "https://www.googleapis.com/books/v1/volumes?"
        params = {
            'q': '{}{}'.format(filter_choice, filter_search),
            'key': settings.BOOK_DATA_API_KEY,
            'maxResults': 2,
            'printType': 'books'
        }
        r = requests.get(search_url, params=params)
        results = r.json()['items']
        for result in results:
            book_data = {
                'title': result['volumeInfo']['title'],
                'authors': result['volumeInfo']['authors'][0],
                'publish_date': result['volumeInfo']['publishedDate'],
                'isbn_one': result['volumeInfo']['industryIdentifiers'][0]['identifier'],
                'isbn_two': result['volumeInfo']['industryIdentifiers'][1]['identifier'],
                'page_count': result['volumeInfo']['pageCount'],
                'thumbnail': result['volumeInfo']['imageLinks']['thumbnail'],
                'country': result['saleInfo']['country']
            }
            book_import.append(book_data)
else:
    filter_ch = BookFilterForm()
return render(request, "BookApp/book_import.html", {'book_import': book_import,
                                                    'filter_ch': filter_ch})
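One possible way to check each key separately is to look every field up with dict.get, so that a missing key yields None without affecting the other fields. The sketch below is only an illustration under some assumptions: the helper name extract_book is made up, and mapping isbn_one to the 'ISBN_10' entry and isbn_two to the 'ISBN_13' entry is a guess based on the sample output shown above, not something the API response order guarantees.

```python
def extract_book(result):
    """Build one book dictionary; any missing field becomes None."""
    volume = result.get('volumeInfo', {})
    # 'authors' may be absent or empty; fall back to a one-item list with None.
    authors = volume.get('authors') or [None]
    # Turn the list of identifier dicts into {'ISBN_10': ..., 'ISBN_13': ...}.
    isbns = {
        d.get('type'): d.get('identifier')
        for d in volume.get('industryIdentifiers', [])
    }
    return {
        'title': volume.get('title'),
        'authors': authors[0],
        'publish_date': volume.get('publishedDate'),
        'isbn_one': isbns.get('ISBN_10'),   # assumption: isbn_one means ISBN-10
        'isbn_two': isbns.get('ISBN_13'),   # assumption: isbn_two means ISBN-13
        'page_count': volume.get('pageCount'),
        'thumbnail': volume.get('imageLinks', {}).get('thumbnail'),
        'country': result.get('saleInfo', {}).get('country'),
    }


# Hypothetical sparse result: no authors, no image, only one identifier.
sparse = {
    'volumeInfo': {
        'title': 'Example',
        'industryIdentifiers': [
            {'type': 'ISBN_13', 'identifier': '9788320717501'},
        ],
    },
}
book = extract_book(sparse)
```

Inside the loop over results, book_import.append(extract_book(result)) would then replace the hand-built book_data dictionary.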
I'm trying to separate various functions in my program to keep things neat. And I'm getting stuck trying to use variables created in one module in another module. I tried using global list_of_names but it wasn't working, and I've read that it's recommended not to do so anyway.
Below is a sample of my code. In my opinion, it doesn't make sense to pass list_of_names as a function argument because there are multiple other variables that I need to do this with, aside from the actual arguments that do get passed.
Unfortunately, even if I were to move read_json into engine.py, I'd still have the same problem in main.py as I need to reference list_of_names there as well.
# main.py:
import json
from engine import create_person

def read_json():
    with open('names.json', 'r') as file:
        data = json.load(file)
    return data

list_of_names = read_json()
person1 = create_person()

# engine.py:
from random import choice

def create_person():
    name = choice(list_of_names)
    new_person = {
        'name': name,
        # other keys/values created in similar fashion
    }
    return new_person
EDIT1:
Here's my new code. To me, this doesn't seem efficient to have to build the parameter list and then deconstruct it inside the function. (I know I'm reusing variable names for this example) Then I have to pass some of those parameters to other functions.
# main.py:
import json
from engine import create_person

def read_json():
    with open('names.json', 'r') as file:
        data = json.load(file)
    return data

player_id_index = 0
list_of_names = read_json()
person_parameters = [
    list_of_names,
    dict_of_locations,
    player_id_index,
    dict_of_occupations,
    .
    .
    .
]
person1, player_id_index = create_person(person_parameters)

# engine.py:
from random import choice

def create_person(person_params):
    list_of_names = person_params[0]
    dict_of_locations = person_params[1]
    player_id_index = person_params[2]
    dict_of_occupations = person_params[3]
    .
    .
    .
    attr = person_params[n]
    name = choice(list_of_names)
    location = get_location(dict_of_locations)  # a function elsewhere in engine.py
    p_id = player_id_index
    occupation = get_occupation(dict_of_occupations)  # a function elsewhere in engine.py
    new_person = {
        'name': name,
        'hometown': location,
        'player id': p_id,
        'occupation': occupation,
        .
        .
        .
    }
    player_id_index += 1
    return new_person, player_id_index
In general, you should not rely on shared global state. If you need to share state, encapsulate it in objects or pass it as function arguments.
Regarding your specific problem, it looks like you want to assemble random dictionaries from a set of options. It could be coded like this:
from random import choice

person_options = {
    'name': ['fred', 'mary', 'john', 'sarah', 'abigail', 'steve'],
    'health': [6, 8, 12, 15],
    'weapon': ['sword', 'bow'],
    'armor': ['naked', 'leather', 'iron']
}

def create_person(person_options):
    return {k: choice(opts) for k, opts in person_options.items()}

for _ in range(4):
    print(create_person(person_options))
In action:
>>> for _ in range(4):
... print(create_person(person_options))
...
{'armor': 'naked', 'weapon': 'bow', 'health': 15, 'name': 'steve'}
{'armor': 'iron', 'weapon': 'sword', 'health': 8, 'name': 'fred'}
{'armor': 'iron', 'weapon': 'sword', 'health': 6, 'name': 'john'}
{'armor': 'iron', 'weapon': 'sword', 'health': 12, 'name': 'john'}
Note that a dictionary like {'armor': 'naked', 'weapon': 'bow', 'health': 15, 'name': 'steve'} looks like it might want to be an object. A dictionary is a glob of state without any defined behavior. If you make a class to house this state the class can grow methods that act on that state. Of course, explaining all this could make this answer really really long. For now, just realize that you should move away from having shared state that any old bit of code can mess with. A little bit of discipline on this will make your code much easier to refactor later on.
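As a rough illustration of encapsulating the state in an object, here is a minimal sketch. The class name PersonFactory and its attributes are made up for this example, and it only covers a name, an occupation, and a running player id. The point is that the id counter and the option lists live in one place instead of being passed around or kept as globals.

```python
from itertools import count
from random import choice


class PersonFactory:
    """Holds the option lists and the running player id in one object."""

    def __init__(self, names, occupations):
        self.names = names
        self.occupations = occupations
        self._ids = count(1)  # yields 1, 2, 3, ... on successive next() calls

    def create_person(self):
        return {
            'name': choice(self.names),
            'occupation': choice(self.occupations),
            'player id': next(self._ids),
        }


factory = PersonFactory(['fred', 'mary'], ['smith', 'farmer'])
first = factory.create_person()
second = factory.create_person()
```

Here main.py would build the factory once after reading the JSON, and any other module that needs to create people would receive the factory rather than the individual lists.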
This addresses your edited question:
from random import choice
from itertools import count
from functools import partial

person_options = {
    'name': partial(
        choice, ['fred', 'mary', 'john', 'sarah', 'abigail', 'steve']),
    # get_location is the function from your engine.py
    'location': partial(
        get_location, {'heaven': 1, 'hell': 2, 'earth': 3}),
    'player id': count(1).__next__  # Python 3; on Python 2 this was count(1).next
}

def create_person(person_options):
    return {k: func() for k, func in person_options.items()}
However, we are now way beyond the scope of your original question and getting into specifics that won't be helpful to anyone other than you. Such questions are better asked on Code Review Stack Exchange.
I am parsing JSON that stores various code snippets and I am first building a dictionary of languages used by these snippets:
snippets = {'python': {}, 'text': {}, 'php': {}, 'js': {}}
Then, when looping through the JSON, I want to add the information about each snippet, as its own dictionary, to the dictionary listed above. For example, if I had JS snippets, the end result would be:
snippets = {'js':
{"title":"Script 1","code":"code here", "id":"123456"}
{"title":"Script 2","code":"code here", "id":"123457"}
}
Not to muddy the waters, but in PHP, working on a multi-dimensional array, I would just do the following (I am looking for something similar):
snippets['js'][] = array here
I know I saw one or two people talking about how to create a multidimensional dictionary - but can't seem to track down adding a dictionary to a dictionary within python. Thanks for the help.
This is called autovivification:
You can do it with defaultdict:
import collections

def tree():
    return collections.defaultdict(tree)

d = tree()
d['js']['title'] = 'Script1'
If the idea is to have lists, you can do:
d = collections.defaultdict(list)
d['js'].append({'foo': 'bar'})
d['js'].append({'other': 'thing'})
The idea of defaultdict is to automatically create the element when a missing key is accessed. BTW, for this simple case, you can simply do:
d = {}
d['js'] = [{'foo': 'bar'}, {'other': 'thing'}]
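Applied to the question's situation, the list variant could look roughly like the sketch below. It assumes each parsed JSON entry carries its language under a key named "lang"; that key name and the sample data are made up for illustration, since the question does not show the actual JSON layout.

```python
import json
from collections import defaultdict

# Hypothetical input: each snippet names its language under "lang".
raw = '''[
    {"lang": "js", "title": "Script 1", "code": "code here", "id": "123456"},
    {"lang": "js", "title": "Script 2", "code": "code here", "id": "123457"},
    {"lang": "python", "title": "Tool", "code": "code here", "id": "123458"}
]'''

snippets = defaultdict(list)
for item in json.loads(raw):
    lang = item.pop('lang')        # remove the language key from the snippet
    snippets[lang].append(item)    # the list is created on first access

js_titles = [s['title'] for s in snippets['js']]
```

This is the closest equivalent to PHP's snippets['js'][] = ..., because the list for a new language does not have to be created in advance.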
From
snippets = {'js':
{"title":"Script 1","code":"code here", "id":"123456"}
{"title":"Script 2","code":"code here", "id":"123457"}
}
It looks to me like you want a list of dictionaries. Here is some Python code that should hopefully give you what you want:
snippets = {'python': [], 'text': [], 'php': [], 'js': []}
snippets['js'].append({"title":"Script 1","code":"code here", "id":"123456"})
snippets['js'].append({"title":"Script 2","code":"code here", "id":"123457"})
print(snippets['js'])
# On Python 3.7+ this prints:
# [{'title': 'Script 1', 'code': 'code here', 'id': '123456'}, {'title': 'Script 2', 'code': 'code here', 'id': '123457'}]
Does that make it clear?