I have a list of dictionaries
d = [{"Date": "2020/10/03 3:30", "Name": "John"}, {"Date": "2020/10/03 5:15", "Name": "Harry"}, {"Date": "2020/10/05 6:30", "Name": "Rob"}]
and I want to print only the names that share the same date.
Output:
John
Harry
I am not sure how to implement this. Any tips?
Your problem can be solved by traversing the list of entries and collecting the names by date in a new dictionary: use the dates as keys and append each name to the list for its date. Here is a code snippet that does that:
d = [{"Date": "2020/10/03 3:30", "Name": "John"}, {"Date": "2020/10/03 5:15","Name": "Harry"}, {"Date": "2020/10/05 6:30", "Name": "Rob"}]
dates = {}
for entry in d:
    date = entry["Date"].split()[0]
    if date in dates:
        dates[date].append(entry["Name"])
    else:
        dates[date] = []
        dates[date].append(entry["Name"])
print(dates["2020/10/03"])
print(dates["2020/10/05"])
Yes, I know my code snippet doesn't directly provide your specified output. I kept it open ended so you can tailor it towards your specific needs.
Here's an approach using collections.Counter and a couple list comprehensions:
from collections import Counter
d = [{"Date": "2020/10/03 3:30", "Name": "John"}, {"Date": "2020/10/03 5:15", "Name": "Harry"}, {"Date": "2020/10/05 6:30", "Name": "Rob"}]
dates = Counter([obj['Date'].split()[0] for obj in d])
multiples = [val for val in dates.keys() if dates[val] > 1]
for obj in d:
    if obj['Date'].split()[0] in multiples:
        print(obj['Name'])
This prints the following output:
John
Harry
You can extract dates and then sort and group by them.
from datetime import datetime
from itertools import groupby
This is a helper function for extracting the date from a dictionary:
def dategetter(one_dict):
    return datetime.strptime(one_dict['Date'],
                             "%Y/%m/%d %H:%M").date()
This is a dictionary comprehension that extracts, sorts, groups, and organizes the results into a dictionary. You can print the dictionary data in any way you want.
{date: [name['Name'] for name in names] for date,names
in groupby(sorted(d, key=dategetter), key=dategetter)}
#{datetime.date(2020, 10, 3): ['John', 'Harry'],
# datetime.date(2020, 10, 5): ['Rob']}
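To go from that grouped dictionary to the output the question asks for, one option is to keep only the dates that have more than one name. This is a minimal sketch that repeats dategetter and the sample list d so it runs on its own; the names grouped and shared_names are made up for this example.

```python
from datetime import datetime
from itertools import groupby

d = [
    {"Date": "2020/10/03 3:30", "Name": "John"},
    {"Date": "2020/10/03 5:15", "Name": "Harry"},
    {"Date": "2020/10/05 6:30", "Name": "Rob"},
]

def dategetter(one_dict):
    # Parse the timestamp and keep only the calendar date.
    return datetime.strptime(one_dict["Date"], "%Y/%m/%d %H:%M").date()

# groupby only merges adjacent items, so sort by the same key first.
grouped = {
    date: [entry["Name"] for entry in entries]
    for date, entries in groupby(sorted(d, key=dategetter), key=dategetter)
}

# Hypothetical name: names that share their date with at least one other entry.
shared_names = [
    name
    for names in grouped.values()
    if len(names) > 1
    for name in names
]

for name in shared_names:
    print(name)
```

Because sorted is stable, names that share a date keep the order they had in the input list.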
Assuming the "Date" format is consistent, you can group the names by date with a defaultdict and print only the groups that have more than one name:
from collections import defaultdict
res = defaultdict(list)
for i in d:
    dt = i["Date"].split()[0]
    res[dt].append(i["Name"])
for date, names in res.items():
    if len(names) > 1:
        print(*names, sep="\n")
I have this json file:
[
{"industry": "food", "price": 100.0, "name": "mcdonald's"},
{"industry": "food", "price": 90.0, "name": "tacobell"},
{"industry": "food", "price": 150.0, "name": "Subway"},
{"industry": "Cars", "price": 90.0, "name": "Audi"}
]
This is my code:
import json
from pprint import pprint
with open('firm_list.json', encoding='utf-8') as data_file:
    data = json.loads(data_file.read())
pprint(data)
result_list=[]
for json_dict in data:
    result_list.append(json_dict['price'])
result_list=[json_dict['price'] for json_dict in data]
result_list.sort(reverse= True)
print(result_list)
I want to print a list of firms in the food industry and their respective prices, sorted by price with the highest price first. But my code also prints the firm from the car industry. How can I print just the firms from the food industry? Is it possible to have the names of the firms in the list too?
You can filter by food industry then sort the result:
import json
with open('example.json') as data_file:
    data = json.loads(data_file.read())
data = [i for i in data if i['industry'] == 'food']
data.sort(key=lambda i: i['price'], reverse=True)
# EDIT: filtering keys in dictionaries
data = [{k: d[k] for k in ['name', 'price']} for d in data]
print(data)
Result:
[{'name': 'Subway', 'price': 150.0}, {'name': "mcdonald's", 'price': 100.0}, {'name': 'tacobell', 'price': 90.0}]
You can add an if filter to your list comprehension:
result_list = [
json_dict['price']
for json_dict in data
if json_dict['industry'] == 'food'
]
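The comprehension above keeps only the prices. If the firm names are wanted too, a sketch along these lines builds (name, price) pairs and sorts them by price, highest first. The data is inlined here instead of being read from firm_list.json, and food_firms is a name chosen for this example.

```python
data = [
    {"industry": "food", "price": 100.0, "name": "mcdonald's"},
    {"industry": "food", "price": 90.0, "name": "tacobell"},
    {"industry": "food", "price": 150.0, "name": "Subway"},
    {"industry": "Cars", "price": 90.0, "name": "Audi"},
]

# Keep only food firms, pair each name with its price,
# then sort the pairs by price in descending order.
food_firms = sorted(
    ((firm["name"], firm["price"]) for firm in data if firm["industry"] == "food"),
    key=lambda pair: pair[1],
    reverse=True,
)

for name, price in food_firms:
    print(name, price)
```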
In your for loop, you append every price to result_list. You might want to add a check and append the price only if json_dict['industry'] == 'food':
result_list=[]
for json_dict in data:
    if json_dict['industry'] == 'food':
        result_list.append(json_dict['price'])
result_list.sort(reverse= True)
print(result_list)
Also, result_list=[json_dict['price'] for json_dict in data] seems like a redundant line to me. I don't think it's required there.
Below are 4 JSON files:
3 JSON files have 3 key fields: name, rating, and year
1 JSON has only 2 key fields: name, rating (no year)
[
{
"name": "Apple",
"year": "2014",
"rating": "21"
},
{
"name": "Pear",
"year": "2003",
"rating": ""
},
{
"name": "Pineapple",
"year": "1967",
"rating": "60"
}
]
[
{
"name": "Pineapple",
"year": "1967",
"rating": "5.7"
},
{
"name": "Apple",
"year": "1915",
"rating": "2.3"
},
{
"name": "Apple",
"year": "2014",
"rating": "3.7"
}
]
[
{
"name": "Apple",
"year": "2014",
"rating": "2.55"
}
]
[
{
"name": "APPLE",
"rating": "+4"
},
{
"name": "LEMON",
"rating": "+3"
}
]
When you search for 'Apple' across all 4 files, you want to return 1 name, 1 year, and 4 ratings:
name: Apple (closest match to search term across all 4 files)
year: 2014 (the MOST COMMON year for Apple across first 3 JSONs)
rating: 21 (from JSON1)
3.7 (from JSON2)
2.55 (from JSON3)
+4 (from JSON4)
Now pretend JSON3 (or any JSON) has no match for 'name: Apple'. In that case, instead return the following. Assume there will be at least one match in at least one file.
name: Apple (closest match to search term across all 4 files)
year: 2014 (the MOST COMMON year for Apple across first 3 JSONs)
rating: 21 (from JSON1)
3.7 (from JSON2)
Not Found (from JSON3)
+4 (from JSON4)
How would you get this output in Python?
This question is similar to the example code in Python - Getting the intersection of two Json-Files, except there are 4 files, 1 file is missing the year key, and we don't need the intersection of the rating key's value.
Here's what I have so far, just for two sets of JSON above:
import json
with open('1.json', 'r') as f:
    json1 = json.load(f)
with open('2.json', 'r') as f:
    json2 = json.load(f)
json2[0]['name'] = list(set(json2[0]['name']) - set(json1[0]['name']))
print(json.dumps(json2, indent=2))
I get output from this, but it doesn't match what I'm trying to achieve. For example, this is part of the output:
{
"name": [
"a",
"n",
"i",
"P"
],
"year": "1967",
"rating": "5.7"
},
When you are creating a set with the set constructor, it expects an iterable object and will iterate through the values of this object to make your set. So when you try to make a set directly from a string you end up with
name = set('Apple')
# name = {'A', 'p', 'l', 'e'}  (a set keeps each character only once)
since the string is an iterable object made up of characters. Instead, you would want to wrap the string into a list or tuple like so
name = set(['Apple'])
# name = {'Apple'}
which in your case would look like
json2[0]['name'] = list(set([json2[0]['name']]) - set([json1[0]['name']]))
but I still don't think this is really what you are trying to achieve. Instead, I would suggest iterating through each of your JSON files and building your own dictionary keyed on the names from the files. Each value would be another dictionary with two keys, rating and year, each holding a list of values. Once the dictionary is built, you have a rating list and a year list for each name, and you can reduce each year list to a single value by choosing its most frequent year.
Here's an example of how your dictionary might look
{
"Apple": { "rating": [21, 3.7, ...], "year": [1915, 2014, 2014] }
"Pineapple": ...
...
}
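A rough sketch of that idea, under a few assumptions that are not in the question: the four files are already loaded into lists (inlined here), "closest match" is reduced to a case-insensitive comparison of names, and the helper name search is invented for this example. It reports one rating per file, preferring the entry from the most common year.

```python
from collections import Counter

json1 = [{"name": "Apple", "year": "2014", "rating": "21"},
         {"name": "Pear", "year": "2003", "rating": ""},
         {"name": "Pineapple", "year": "1967", "rating": "60"}]
json2 = [{"name": "Pineapple", "year": "1967", "rating": "5.7"},
         {"name": "Apple", "year": "1915", "rating": "2.3"},
         {"name": "Apple", "year": "2014", "rating": "3.7"}]
json3 = [{"name": "Apple", "year": "2014", "rating": "2.55"}]
json4 = [{"name": "APPLE", "rating": "+4"},
         {"name": "LEMON", "rating": "+3"}]

def search(term, files):
    # Assumption: a "match" is a case-insensitive equality test on the name.
    per_file = [
        [e for e in entries if e["name"].casefold() == term.casefold()]
        for entries in files
    ]
    # Count years over every matching entry that has a year key.
    years = Counter(
        e["year"] for matches in per_file for e in matches if "year" in e
    )
    year = years.most_common(1)[0][0] if years else None
    ratings = []
    for matches in per_file:
        # Prefer the entry from the most common year, else the first match.
        chosen = next(
            (e for e in matches if e.get("year") == year),
            matches[0] if matches else None,
        )
        ratings.append(chosen["rating"] if chosen else "Not Found")
    return {"name": term, "year": year, "ratings": ratings}

result = search("Apple", [json1, json2, json3, json4])
print(result)

# Same search with the third file contributing no match.
missing = search("Apple", [json1, json2, [], json4])
print(missing["ratings"])
```

A real "closest match" would need a fuzzy comparison; the equality test here is only a placeholder for it.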
I extract data using an API and retrieve a list of servers and backups. Some servers have more than one backup. This is how I get the list of all servers with backup IDs.
bkplist = requests.get('https://heee.com/1.2/storage/backup')
bkplist_json = bkplist.json()
backup_list = bkplist.json()
backupl = backup_list['storages']['storage']
Json looks like this:
{
"storages": {
"storage": [
{
"access": "",
"created": "",
"license": ,
"origin": "01165",
"size": ,
"state": "",
"title": "",
"type": "backup",
"uuid": "01019",
"zone": ""
},
Firstly I create a dictionary to store this data:
backup = {}
for u in backup_list['storages']['storage']:
    srvuuidorg = u['origin']
    backup_uuid = u['uuid']
    backup[srvuuidorg] = backup_uuid
But then I found out that there is more than one value for each server. Since a dictionary can hold only one value per key, I wanted to use some hybrid of a list and a dictionary, but I can't figure out how to do this with this example.
Servers are nested in storages->storage, and I need to assign several uuid values (backup IDs) to one origin (the server ID).
I know about the collections module, and with a simple example it is quite understandable, but I don't see how to use it in my case, where the data is extracted through an API.
How can I extract origin and assign to this key the other values stored in the JSON uuid field?
What's more, it is a massive amount of data, so I cannot add every value manually.
You can do something like this.
from collections import defaultdict
backup = defaultdict(list)
for u in backup_list['storages']['storage']:
    srvuuidorg = u['origin']
    backup_uuid = u['uuid']
    backup[srvuuidorg].append(backup_uuid)
Note that you can simplify your loop like this.
from collections import defaultdict
backup = defaultdict(list)
for u in backup_list['storages']['storage']:
    backup[u['origin']].append(u['uuid'])
But this may be considered less readable.
You could store a list of uuids for each origin key.
I suggest the following 2 ways:
Creating an empty list the first time an origin is seen, and then appending to it:
backup = {}
for u in backup_list['storages']['storage']:
    srvuuidorg = u['origin']
    backup_uuid = u['uuid']
    if not backup.get(srvuuidorg):
        backup[srvuuidorg] = []
    backup[srvuuidorg].append(backup_uuid)
Using defaultdict collection, which basically does the same for you under the hood:
from collections import defaultdict
backup = defaultdict(list)
for u in backup_list['storages']['storage']:
    srvuuidorg = u['origin']
    backup_uuid = u['uuid']
    backup[srvuuidorg].append(backup_uuid)
It seems to me that the second way is more elegant.
If you need the uuids for each origin to be unique, use the same approach with a set instead of a list.
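For the unique case, a minimal sketch with defaultdict(set) could look like this. The sample payload is invented to stand in for the API response and deliberately contains a duplicated uuid.

```python
from collections import defaultdict

# Hypothetical stand-in for bkplist.json(); the values are invented.
backup_list = {"storages": {"storage": [
    {"origin": "srv-1", "uuid": "bkp-a"},
    {"origin": "srv-1", "uuid": "bkp-b"},
    {"origin": "srv-1", "uuid": "bkp-a"},
    {"origin": "srv-2", "uuid": "bkp-c"},
]}}

backup = defaultdict(set)
for u in backup_list["storages"]["storage"]:
    # add() ignores a uuid that is already in the set for this origin.
    backup[u["origin"]].add(u["uuid"])

for origin, uuids in backup.items():
    print(origin, sorted(uuids))
```

Sets do not preserve insertion order, so sort the uuids when a stable order matters.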
JSON allows a key to hold an array:
var= {
"array": [
{"id": 1, "value": "one"},
{"id": 2, "value": "two"},
{"id": 3, "value": "three"}
]
}
print(var)
# {'array': [{'id': 1, 'value': 'one'}, {'id': 2, 'value': 'two'}, {'id': 3, 'value': 'three'}]}
var["array"].append({"id": 4, "value": "new"})
print(var)
# {'array': [{'id': 1, 'value': 'one'}, {'id': 2, 'value': 'two'}, {'id': 3, 'value': 'three'}, {'id': 4, 'value': 'new'}]}
You can use a list for multiple values.
greetings = {"Greetings": ["hello", "hi"]}
I have a very long JSON file that I need to make sense of in order to query the data I am interested in. To do this, I would like to extract all of the keys so I know what is available to query. Is there a quick way of doing this, or should I just write a parser that traverses the JSON file and extracts anything in between either { and : or , and : ?
given the example:
[{"Name": "key1", "Value": "value1"}, {"Name": "key2", "Value": "value2"}]
I am looking for the values:
"Name"
"Value"
That will depend on if there's any nesting. But the basic pattern is something like this:
import json
with open("foo.json", "r") as fh:
    data = json.load(fh)
all_keys = set()
for datum in data:
    keys = set(datum.keys())
    all_keys.update(keys)
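If the file does contain nested objects or arrays, a recursive walk can collect the keys at every level. This is a sketch only: the function name collect_keys and the sample data are made up for illustration.

```python
import json

def collect_keys(node, found=None):
    """Return the set of all dictionary keys found anywhere inside node."""
    if found is None:
        found = set()
    if isinstance(node, dict):
        for key, value in node.items():
            found.add(key)
            collect_keys(value, found)  # descend into the value
    elif isinstance(node, list):
        for item in node:
            collect_keys(item, found)   # descend into each array element
    return found

# Invented sample; in practice this would come from json.load(fh).
data = json.loads(
    '[{"Name": "key1", "Value": {"Unit": "kg", "Parts": [{"Id": 1}]}}]'
)
all_keys = collect_keys(data)
print(sorted(all_keys))
```

Strings, numbers, booleans and null are leaves, so the walk simply stops there.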
This:
data = [{"Name": "key1", "Value": "value1"}, {"Name": "key2", "Value": "value2"}]
for val in data:
    print(val.keys())
gives you:
dict_keys(['Name', 'Value'])
dict_keys(['Name', 'Value'])
Input data:
results= [
{
"timestamp_datetime": "2014-03-31 18:10:00 UTC",
"job_id": 5,
"processor_utilization_percentage": 72
},
{
"timestamp_datetime": "2014-03-31 18:20:00 UTC",
"job_id": 2,
"processor_utilization_percentage": 60
},
{
"timestamp_datetime": "2014-03-30 18:20:00 UTC",
"job_id": 2,
"processor_utilization_percentage": 0
}]
Output has to be sorted like below, grouping by job_id in ascending order:
newresult = {
'2':[{ "timestamp_datetime": "2014-03-31 18:20:00 UTC",
"processor_utilization_percentage": 60},
{"timestamp_datetime": "2014-03-30 18:20:00 UTC",
"processor_utilization_percentage": 0},]
'5':[{
"timestamp_datetime": "2014-03-31 18:10:00 UTC",
"processor_utilization_percentage": 72},
],
}
What is the Pythonic way to do this?
You are grouping; this is easiest with a collections.defaultdict() object:
from collections import defaultdict
newresult = defaultdict(list)
for entry in results:
    job_id = entry.pop('job_id')
    newresult[job_id].append(entry)
newresult is a dictionary and these are not ordered; if you need to access job ids in ascending order, sort the keys as you list them:
for job_id in sorted(newresult):
    # loops over the job ids in ascending order.
    for job in newresult[job_id]:
        # entries per job id
        print(job_id, job)
You can use itertools.groupby to group the results by their job_id:
from itertools import groupby
new_results = {k: list(g) for k, g in groupby(results, key=lambda d: d["job_id"])}
The result is a dictionary, i.e. it has no particular order. If you want to iterate the values in ascending order, you can just do something like this:
for key in sorted(new_results):
    entries = new_results[key]
    # do something with entries
Update: as Martijn points out, this requires entries with the same job_id to be adjacent in the results list (as they are in your example). Otherwise a later group with the same job_id overwrites the earlier one in the dictionary and entries are lost.
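When the input order cannot be relied on, sorting by the same key before grouping avoids that problem. Here is a short sketch using the sample data from the question, reordered on purpose so that the two entries for job 2 are not adjacent; the helper name by_job is chosen for this example.

```python
from itertools import groupby
from operator import itemgetter

results = [
    {"timestamp_datetime": "2014-03-31 18:20:00 UTC", "job_id": 2,
     "processor_utilization_percentage": 60},
    {"timestamp_datetime": "2014-03-31 18:10:00 UTC", "job_id": 5,
     "processor_utilization_percentage": 72},
    {"timestamp_datetime": "2014-03-30 18:20:00 UTC", "job_id": 2,
     "processor_utilization_percentage": 0},
]

by_job = itemgetter("job_id")

# Sorting first puts equal job_ids next to each other, which groupby needs.
new_results = {
    job_id: list(group)
    for job_id, group in groupby(sorted(results, key=by_job), key=by_job)
}

for job_id, entries in new_results.items():
    print(job_id, len(entries))
```

The sort costs O(n log n); the defaultdict approach does the same grouping in a single pass without sorting.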
Assuming you really didn't want the job_id in the newresult:
from collections import defaultdict
newresult = defaultdict(list)
for result in results:
    job_id = result['job_id']
    newresult[job_id].append(
        {'timestamp_datetime':result['timestamp_datetime'],
         'processor_utilization_percentage':result['processor_utilization_percentage']}
    )
#print newresult
I don't really see a way to do this with a dictionary comprehension, but I'm sure there's someone out there with more experience in doing that sort of thing who could pull it off. This is pretty straightforward, though.