What is the pythonic way to extract values from dict - python

What is the best way to extract values from a dictionary? Suppose we have a list of dicts:
projects = [{'project': 'project_name1',
             'dst-repo': 'some_dst_path',
             'src-repo': 'some_src_path',
             'branches': ['*']},
            {...},
            {...}]
Right now I just iterate through this list and get the values, something like:
for project in projects:
    project_name = project.get('project')
    project_src = ....
    project_dst = ....
    ....
    ....
So the question is: "Is there a more Pythonic way to extract values by key from a dictionary, one that does not need a separate line of code for each new variable assignment?"

There's nothing wrong with what you're doing, but you can make it more compact by using a list comprehension to extract the values from the current dictionary. For example:
projects = [
    {
        'project': 'project_name1',
        'dst-repo': 'some_dst_path',
        'src-repo': 'some_src_path',
        'branches': ['*']
    },
]
keys = ['project', 'src-repo', 'dst-repo', 'branches']
for project in projects:
    name, src, dst, branches = [project[k] for k in keys]
    # Do stuff with the values
    print(name, src, dst, branches)
output
project_name1 some_src_path some_dst_path ['*']
However, this approach gets unwieldy if the number of keys is large.
If keys are sometimes absent from the dict, then you will need to use the .get method, which returns None for missing keys (unless you pass it a default arg):
name, src, dst, branches = [project.get(k) for k in keys]
If you need a specific default for each key, you can put the defaults into a dict, e.g.:
defaults = {
    'project': 'NONAME',
    'src-repo': 'NOSRC',
    'dst-repo': 'NODEST',
    'branches': ['*'],
}
projects = [
    {
        'project': 'project_name1',
        'src-repo': 'some_src_path',
    },
]
keys = ['project', 'src-repo', 'dst-repo', 'branches']
for project in projects:
    name, src, dst, branches = [project.get(k, defaults[k]) for k in keys]
    # Do stuff with the values
    print(name, src, dst, branches)
output
project_name1 some_src_path NODEST ['*']
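As a side note to the list-comprehension approach above, here is a minimal sketch (not part of the original answer) using operator.itemgetter from the standard library. Given several keys, it returns their values as a tuple that can be unpacked directly. It raises KeyError when a key is missing, so it only fits the case where every key is present. The name get_fields and the rows list are illustrative.

```python
from operator import itemgetter

projects = [
    {
        'project': 'project_name1',
        'dst-repo': 'some_dst_path',
        'src-repo': 'some_src_path',
        'branches': ['*'],
    },
]

# Build the getter once; calling it on a dict returns a tuple of values
# in the same order as the keys listed here.
get_fields = itemgetter('project', 'src-repo', 'dst-repo', 'branches')

rows = []
for project in projects:
    name, src, dst, branches = get_fields(project)
    rows.append((name, src, dst, branches))
    print(name, src, dst, branches)
```

With a single key, itemgetter returns the bare value rather than a one-element tuple, so this unpacking pattern is intended for two or more keys.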

out = [elt.values() for elt in projects]

for project in projects:
    project_name = project['project']
    project_src = ....
    project_dst = ....
    ....
    ....
I'm not sure you can get away with less typing.
EDIT:
OK, it looks like I misunderstood the question:
Assume we have a list of dicts like this:
projects = [ {'project': "proj1", 'other': "value1", 'other2': "value2"},
{'project': "proj2", 'other': "value3", 'other2': "value4"},
{'project': "proj2", 'other': "value3", 'other2': "value4"} ]
To extract the list of project fields, you can use the following expression:
projects_names = [x['project'] for x in projects]
This iterates over the projects list, extracting the value of the 'project' key from each dictionary.

Related

How to check each key separately in a loop, without creating multiple loops, when any of them may raise a KeyError

I wrote code that takes 9 keys from an API.
The authors, isbn_one, isbn_two, thumbinail, and page_count fields may not always be retrievable, and if any of them is missing, I would like its value to be None. Unfortunately, if statements, even nested ones, don't work, because that leads to a lot of loops. I also tried try and except KeyError, but each key can fail separately and it is not known in advance which one to assign None to. Here is an example of the logic for when a photo is missing:
th = result['volumeInfo'].get('imageLinks')
if th is not None:
    book_exists_thumbinail = {
        'thumbinail': result['volumeInfo']['imageLinks']['thumbnail']
    }
    dnew = {**book_data, **book_exists_thumbinail}
    book_import.append(dnew)
else:
    book_exists_thumbinail_n = {
        'thumbinail': None
    }
    dnew_none = {**book_data, **book_exists_thumbinail_n}
    book_import.append(dnew_none)
When I use if/else logic, once one condition is met (e.g. for thumbinail), the rest are not even checked.
When I use try and except, it's similar. The keys also include the ISBNs, but those are stored as a list inside the dictionary, so I need to use something like this:
isbn_zer = result['volumeInfo']['industryIdentifiers']
dic = collections.defaultdict(list)
for d in isbn_zer:
    for k, v in d.items():
        dic[k].append(v)
Output data: [{'type': 'ISBN_10', 'identifier': '8320717507'}, {'type': 'ISBN_13', 'identifier': '9788320717501'}]
I no longer know what to use to check each key separately and assign None when a key, or one of the ISBN identifiers, is missing. I have already tried many ideas.
The rest of the code:
book_import = []
if request.method == 'POST':
    filter_ch = BookFilterForm(request.POST)
    if filter_ch.is_valid():
        cd = filter_ch.cleaned_data
        filter_choice = cd['choose_v']
        filter_search = cd['search']
        search_url = "https://www.googleapis.com/books/v1/volumes?"
        params = {
            'q': '{}{}'.format(filter_choice, filter_search),
            'key': settings.BOOK_DATA_API_KEY,
            'maxResults': 2,
            'printType': 'books'
        }
        r = requests.get(search_url, params=params)
        results = r.json()['items']
        for result in results:
            book_data = {
                'title': result['volumeInfo']['title'],
                'authors': result['volumeInfo']['authors'][0],
                'publish_date': result['volumeInfo']['publishedDate'],
                'isbn_one': result['volumeInfo']['industryIdentifiers'][0]['identifier'],
                'isbn_two': result['volumeInfo']['industryIdentifiers'][1]['identifier'],
                'page_count': result['volumeInfo']['pageCount'],
                'thumbnail': result['volumeInfo']['imageLinks']['thumbnail'],
                'country': result['saleInfo']['country']
            }
            book_import.append(book_data)
else:
    filter_ch = BookFilterForm()
return render(request, "BookApp/book_import.html", {'book_import': book_import,
                                                    'filter_ch': filter_ch})
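One possible way to get a None for each missing field independently is to look up every field through a small helper that swallows the lookup error for that field only. The sketch below is an illustration under assumptions, not a tested fix for the view above: the helper names dig and extract_book are hypothetical, the sample dict is made up, and mapping isbn_one to the 'ISBN_10' entry and isbn_two to 'ISBN_13' is an assumption (the question's code picks them by list position instead).

```python
def dig(data, *path, default=None):
    """Follow the keys/indexes in `path`; return `default` if any step is missing."""
    current = data
    for step in path:
        try:
            current = current[step]
        except (KeyError, IndexError, TypeError):
            return default
    return current


def extract_book(result):
    """Build one book record; every missing field becomes None on its own."""
    info = result.get('volumeInfo', {})
    # Index the ISBN list by its 'type' so a missing entry is just a missing key.
    isbns = {entry.get('type'): entry.get('identifier')
             for entry in info.get('industryIdentifiers', [])}
    return {
        'title': info.get('title'),
        'authors': dig(info, 'authors', 0),
        'publish_date': info.get('publishedDate'),
        'isbn_one': isbns.get('ISBN_10'),   # assumption: isbn_one means ISBN-10
        'isbn_two': isbns.get('ISBN_13'),   # assumption: isbn_two means ISBN-13
        'page_count': info.get('pageCount'),
        'thumbnail': dig(info, 'imageLinks', 'thumbnail'),
        'country': dig(result, 'saleInfo', 'country'),
    }


# Made-up sample with no authors, no thumbnail and only one ISBN.
sample = {
    'volumeInfo': {
        'title': 'Example',
        'industryIdentifiers': [{'type': 'ISBN_10', 'identifier': '8320717507'}],
    },
    'saleInfo': {'country': 'PL'},
}
book = extract_book(sample)
print(book)
```

Because each field is resolved on its own line, a missing thumbnail no longer prevents the ISBN or author checks from running.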

Filter the config dictionary based on user input - Python

I have a JSON config and, based on user input, need to filter it down to only the matching sections. I tried running the code below, but it returns only part of the expected result.
Config:
superset_config = """
[
    {
        "Area": "Texas",
        "Fruits": {
            "RED": {
                "Apple": ["val1"],
                "Grapes": ["green"]
            },
            "YELLOW": {"key2": ["val2"]}
        }
    },
    {
        "Area": "Dallas",
        "Fruits": {
            "GREEN": {"key3": ["val3"]}
        }
    }
]
"""
User Input:
inputs = ['Apple'] # input list
Code:
import json
derived_config = []
for each_src in json.loads(superset_config):
    temp = {}
    for src_keys in each_src:
        if src_keys == 'Fruits':
            temp_inner = {}
            for key, value in each_src[src_keys].items():
                metrics = {key_inner: value_inner for key_inner, value_inner in value.items() if key_inner in inputs}
                temp_inner[key] = metrics
            temp[src_keys] = temp_inner
        else:
            temp[src_keys] = each_src[src_keys]
    derived_config.append(temp)
What I get from the above code:
derived_config= [
{'Area': 'Texas',
'Fruits': {'RED': {'Apple': 'val1'},
'YELLOW': {}
}
},
{'Area': 'Dallas',
'Fruits': {'GREEN': {}
}
}
]
What I need is the result below:
derived_config= [
{'Area': 'Texas',
'Fruits': {'RED': {'Apple': 'val1'}
}
}
]
Can anyone please help? Thanks.
Maybe something like this:
import json
inputs = ['Apple']  # input list
derived_config = []
for each_src in json.loads(superset_config):
    filtered_fruits = {k: v for k, v in (each_src.get('Fruits') or {}).items()
                       if any(input_ in v for input_ in inputs)}
    if filtered_fruits:
        each_src['Fruits'] = filtered_fruits
        derived_config.append(each_src)
print(derived_config)
Edit: Based on the comments, it looks like you might want to filter the inner Fruits map based on the input list of fruits as well. In that case, we don't need the any function used above.
There is also a risk of unintentionally mutating the original source config. For example, if you save the result of json.loads(superset_config) to a variable and then filter several fruits from it, you will likely mutate the original config object. If you call json.loads each time, you don't need to worry about that; however, keep in mind that list and dict are mutable types in Python, so this can be a concern.
The solution below avoids mutating the original source object. But again, if you are calling json.loads each time anyway, you don't need to worry about this and are free to modify the original config object.
import json
# Note: If you are using Python 3.9+, you can just use the standard collections
# for `dict` and `list`, as they now support parameterized values.
from typing import Dict, Any, List

# The inferred type of the 'Fruits' key in the superset config.
# This is a mapping of fruit color to a `FruitMap`.
Fruits = Dict[str, 'FruitMap']
FruitMap = Dict[str, Any]
# The inferred type of the superset config.
Config = List[Dict[str, Any]]

def get_fruits_config(src_config: Config, fruit_names: List[str]) -> Config:
    """
    Returns the specified fruit section(s) from the superset config.
    """
    fruits_config: Config = []
    final_src: Dict
    for each_src in src_config:
        fruits: Fruits = each_src.get('Fruits') or {}
        final_fruits: Fruits = {}
        for fruit_color, fruit_map in fruits.items():
            desired_fruits = {fruit: val for fruit, val in fruit_map.items()
                              if fruit in fruit_names}
            if desired_fruits:
                final_fruits[fruit_color] = desired_fruits
        if final_fruits:
            final_src = each_src.copy()
            final_src['Fruits'] = final_fruits
            fruits_config.append(final_src)
    return fruits_config
Usage:
inputs = ['Apple'] # input list
config = json.loads(superset_config)
derived_config = get_fruits_config(config, inputs)
print(derived_config)
# prints:
# [{'Area': 'Texas', 'Fruits': {'RED': {'Apple': ['val1']}}}]
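The mutation concern raised in this answer can be seen in a small sketch. It uses a trimmed-down, made-up config and a hypothetical helper named filter_colors. It compares assigning the filtered map back onto the parsed element with assigning it onto a shallow copy.

```python
import json

# Trimmed-down config, for illustration only.
superset_config = (
    '[{"Area": "Texas", "Fruits": {"RED": {"Apple": ["val1"]}, '
    '"YELLOW": {"key2": ["val2"]}}}]'
)
names = ['Apple']


def filter_colors(fruits, names):
    """Keep only the colors whose inner map contains one of `names`."""
    return {color: inner for color, inner in fruits.items()
            if any(name in inner for name in names)}


# Variant 1: assign back onto the parsed element. The parsed config itself changes.
shared = json.loads(superset_config)
for src in shared:
    src['Fruits'] = filter_colors(src['Fruits'], names)

# Variant 2: assign onto a shallow copy. The parsed config keeps all its colors.
untouched = json.loads(superset_config)
derived = []
for src in untouched:
    new_src = src.copy()
    new_src['Fruits'] = filter_colors(src['Fruits'], names)
    derived.append(new_src)

print('YELLOW' in shared[0]['Fruits'])     # False
print('YELLOW' in untouched[0]['Fruits'])  # True
```

A shallow copy is enough here because only the top-level 'Fruits' key is reassigned. The inner maps are still shared with the original, so editing them in place would need copy.deepcopy.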

Customize an OrderedDict's format and convert it into a dictionary

I have a question about how to customize an OrderedDict and convert it into JSON or a dictionary, while being able to rename the keys and change the structure. I have the data below:
result= OrderedDict([('index', 'cfs_fsd_00001'),
('host', 'GIISSP707'),
('source', 'D:\\usrLLSS_SS'),
('_time', '2018-11-02 14:43:30.000 EDT'),
('count', '153')])
However, I want to change the format like this:
{
    "servarname": {
        "index": "cfs_fsd_00001",
        "host": "GIISSP707"
    },
    "times": "2018-11-02 14:43:30.000 EDT",
    "metricTags": {
        "source": "D:\\ddevel.log"
    },
    "metricName": "serverice count",
    "metricValue": 153,
    "metricType": "count"
}
I would really appreciate your help. Basically, the output I get is flat, but I want to customize the structure. The original structure is
OrderedDict([('index', 'cfs_fsd_00001'),('host', 'GIISSP707').....]).
The output I want to achieve is {"servarname": {"index":"cfs_fsd_00001","host":"GIISSP707"},......
You can simply reference the result dict with the respective keys that you want your target data structure to have:
{
    "servarname": {
        "index": result['index'],
        "host": result['host']
    },
    "times": result['_time'],
    "metricTags": {
        "source": result['source']
    },
    "metricName": "serverice count",
    "metricValue": result['count'],
    "metricType": "count"
}
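Note that the desired output in the question shows metricValue as the number 153, while result['count'] holds the string '153'. A minimal sketch of the same mapping with that conversion and a JSON dump, assuming the count is always a decimal integer string (the variable names payload and encoded are illustrative):

```python
import json
from collections import OrderedDict

result = OrderedDict([('index', 'cfs_fsd_00001'),
                      ('host', 'GIISSP707'),
                      ('source', 'D:\\usrLLSS_SS'),
                      ('_time', '2018-11-02 14:43:30.000 EDT'),
                      ('count', '153')])

payload = {
    "servarname": {
        "index": result['index'],
        "host": result['host'],
    },
    "times": result['_time'],
    "metricTags": {"source": result['source']},
    "metricName": "serverice count",
    # Assumption: 'count' always holds a decimal integer string.
    "metricValue": int(result['count']),
    "metricType": "count",
}

# json.dumps escapes the backslash in the Windows path automatically.
encoded = json.dumps(payload, indent=2)
print(encoded)
```

If the count can be missing or non-numeric, int() raises KeyError or ValueError respectively, so that case would need its own handling.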
I'm not sure how flexible your method needs to be. I assume you have a few common keys in your OrderedDict and want to find the metric among the remaining keys, then reformat everything into a new dict. Here is a short function, implemented in Python 3, that I hope helps.
from collections import OrderedDict
import json

def reformat_ordered_dict(dict_result):
    """Reconstruct the OrderedDict result into a specific format.

    This method assumes that your input OrderedDict has the following common keys: 'index',
    'host', 'source', '_time', and a potential metric which is subject to change (of course
    you can support more metrics with a minor tweak of the code). The function also re-maps the
    keys (for example, mapping '_time' to 'times', packing 'index' and 'host' into 'servarname').

    :param dict_result: the OrderedDict
    :return: the reformatted OrderedDict
    """
    common_keys = ('index', 'host', 'source', '_time')
    assert all(common_key in dict_result for common_key in common_keys), (
        'You have to provide all the common keys!')
    # write common keys
    reformated = OrderedDict()
    reformated["servarname"] = OrderedDict([
        ("index", dict_result['index']),
        ("host", dict_result['host'])
    ])
    reformated["times"] = dict_result['_time']
    reformated["metricTags"] = {"source": dict_result['source']}
    # write metric: the first key that is not one of the common keys
    metric = None
    for key in dict_result.keys():
        if key not in common_keys:
            metric = key
            break
    assert metric is not None, 'Cannot find metric in the OrderedDict!'
    # I don't know where you get this value, but you can customize it if needed,
    # for example if the metric name is needed here
    reformated['metricName'] = "serverice count"
    reformated['metricValue'] = dict_result[metric]
    reformated['metricType'] = metric
    return reformated

if __name__ == '__main__':
    result = OrderedDict([('index', 'cfs_fsd_00001'),
                          ('host', 'GIISSP707'),
                          ('source', 'D:\\usrLLSS_SS'),
                          ('_time', '2018-11-02 14:43:30.000 EDT'),
                          ('count', '153')])
    reformated = reformat_ordered_dict(result)
    print(json.dumps(reformated))

Dictionary key name from variable

I am trying to create a nested dictionary in which the key of each nested dictionary is taken from the value of a variable. My end result should look something like this:
data_dict = {
    'jane': {'name': 'jane', 'email': 'jane@example.com'},
    'jim': {'name': 'jim', 'email': 'jim@example.com'}
}
Here is what I am trying:
data_dict = {}
s = "jane"
data_dict[s][name] = 'jane'
To my surprise, this does not work. Is this possible?
You want something like:
data_dict = {}
s = "jane"
data_dict[s] = {}
data_dict[s]['name'] = s
That should work, though I would recommend instead of a nested dictionary that you use a dictionary of names to either namedtuples or instances of a class.
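The namedtuple suggestion could look roughly like the sketch below. The type name Person and its fields are illustrative choices, not something given in the question.

```python
from collections import namedtuple

# A lightweight record type with named, read-only fields.
Person = namedtuple('Person', ['name', 'email'])

people = {}
for name in ('jane', 'jim'):
    people[name] = Person(name=name, email=name + '@example.com')

# Fields are accessed by attribute instead of by a second dictionary key.
print(people['jane'].email)
```

Unlike a nested dict, a namedtuple is immutable, so changing a field means building a new record, for example with people['jane']._replace(email=...).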
Try this:
data_dict = {}
s = ["jane", "jim"]
for name in s:
    data_dict[name] = {}
    data_dict[name]['name'] = name
    data_dict[name]['email'] = name + '@example.com'
As @Milad mentioned in the comments, you first need to initialize the nested entry as an empty dictionary:
data={}
data['Tom']={}
data['Tom']['name'] = 'Tom Marvolo Riddle'
data['Tom']['email'] = 'iamlordvoldermort.com'
For an existing dictionary you can do dict[key] = value, although if the dictionary does not exist yet that raises an error. I think this is the code you want:
data_dict = {}
s = "jane"
data_dict[s] = {"name": s, "email": f"{s}@example.com"}
print(data_dict)
I just realized, when I got a notification about this question, that this would be a better answer, I think:
from collections import defaultdict
data_dict = defaultdict(dict)
data_dict["jane"]["name"] = "jane"
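A slightly fuller sketch of the defaultdict approach, including one caveat worth knowing: merely reading a missing key on a defaultdict creates an entry for it. The names "nobody" and "ghost" are made up for the demonstration.

```python
from collections import defaultdict

data_dict = defaultdict(dict)
for name in ("jane", "jim"):
    # The inner dict is created automatically on first access.
    data_dict[name]["name"] = name
    data_dict[name]["email"] = name + "@example.com"

# Caveat: indexing a missing key inserts an empty dict for it.
_ = data_dict["nobody"]
print("nobody" in data_dict)   # True

# .get() and `in` look up a key without inserting anything.
print(data_dict.get("ghost"))  # None
print("ghost" in data_dict)    # False

# Convert to a plain dict once building is done, if the auto-creation is unwanted.
plain = dict(data_dict)
```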

Parsing json file with changeable structure in Python

I'm using the Yahoo Placemaker API, which returns a different JSON structure depending on the input.
A simple JSON response looks like this:
{
    'document': {
        'itemDetails': {
            'id': '0',
            'prop1': '1',
            'prop2': '2'
        },
        'other': {
            'propA': 'A',
            'propB': 'B'
        }
    }
}
When I want to access itemDetails I simply write json_file['document']['itemDetails'].
But when I get more complicated response, such as
{
    'document': {
        '1': {
            'itemDetails': {
                'id': '1',
                'prop1': '1',
                'prop2': '2'
            }
        },
        '0': {
            'itemDetails': {
                'id': '0',
                'prop1': '1',
                'prop2': '2'
            }
        },
        '2': {
            'itemDetails': {
                'id': '1',
                'prop1': '1',
                'prop2': '2'
            }
        },
        'other': {
            'propA': 'A',
            'propB': 'B'
        }
    }
}
the solution obviously does not work.
I use id, prop1 and prop2 to create objects.
What would be the best approach to automatically access itemDetails in the second case without writing json_file['document']['0']['itemDetails'] ?
If I understand correctly, you want to loop through all of json_file['document']['0']['itemDetails'], json_file['document']['1']['itemDetails'], ...
If that's the case, then:
item_details = {}
for key, value in json_file['document'].items():
    if key != 'other':  # 'other' has no 'itemDetails'
        item_details[key] = value['itemDetails']
Or, a one-liner:
item_details = {k: v['itemDetails'] for k, v in json_file['document'].items() if k != 'other'}
Then, you would access them as item_details['0'], item_details['1'], ...
Note: You can use integer keys instead of the strings '0' and '1' by using int(key) or int(k).
Edit:
If you want to access both cases seamlessly (whether there is one result or many), you could check:
if 'itemDetails' in json_file['document']:
    item_details = {'0': json_file['document']['itemDetails']}
else:
    item_details = {k: v['itemDetails'] for k, v in json_file['document'].items() if k != 'other'}
Then loop through the item_details dict.
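Putting both cases together, a minimal self-contained sketch might look like this. The function name collect_item_details and the two sample responses are made up for illustration. Instead of special-casing the 'other' key by name, it skips any entry that lacks 'itemDetails', which is an assumption about what the API can return.

```python
def collect_item_details(json_file):
    """Return {key: itemDetails} for both the single-result and multi-result shapes."""
    document = json_file['document']
    if 'itemDetails' in document:
        # Single result: wrap it so callers always get the same shape.
        return {'0': document['itemDetails']}
    # Multiple results: keep only the entries that actually carry 'itemDetails'.
    return {key: value['itemDetails']
            for key, value in document.items()
            if isinstance(value, dict) and 'itemDetails' in value}


single = {'document': {'itemDetails': {'id': '0', 'prop1': '1', 'prop2': '2'},
                       'other': {'propA': 'A', 'propB': 'B'}}}
multi = {'document': {'1': {'itemDetails': {'id': '1', 'prop1': '1', 'prop2': '2'}},
                      '0': {'itemDetails': {'id': '0', 'prop1': '1', 'prop2': '2'}},
                      'other': {'propA': 'A', 'propB': 'B'}}}

for response in (single, multi):
    for key, details in collect_item_details(response).items():
        print(key, details['id'])
```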
