I have this json file:
[
{"industry": "food", "price": 100.0, "name": "mcdonald's"},
{"industry": "food", "price": 90.0, "name": "tacobell"},
{"industry": "food", "price": 150.0, "name": "Subway"},
{"industry": "Cars", "price": 90.0, "name": "Audi"}
]
This is my code:
import json
from pprint import pprint
with open('firm_list.json', encoding='utf-8') as data_file:
    data = json.loads(data_file.read())
pprint(data)
result_list=[]
for json_dict in data:
    result_list.append(json_dict['price'])
result_list=[json_dict['price'] for json_dict in data]
result_list.sort(reverse= True)
print(result_list)
I want to print a list of firms in the food industry and their respective prices, sorted by price with the highest price first. But my code also prints the firm from the car industry. How can I print just the firms from the food industry? Is it possible to have the names of the firms in the list too?
You can filter by food industry then sort the result:
import json
with open('example.json') as data_file:
    data = json.loads(data_file.read())
data = [i for i in data if i['industry'] == 'food']
data.sort(key=lambda i: i['price'], reverse=True)
# EDIT: filtering keys in dictionaries
data = [{k: d[k] for k in ['name', 'price']} for d in data]
print(data)
Result:
[{'name': 'Subway', 'price': 150.0}, {'name': "mcdonald's", 'price': 100.0}, {'name': 'tacobell', 'price': 90.0}]
You can add an if filter to your list comprehension:
result_list = [
    json_dict['price']
    for json_dict in data
    if json_dict['industry'] == 'food'
]
In your for loop, you are appending every price to result_list. You might want to add a check: append the price to the results only if json_dict['industry'] == 'food'.
result_list=[]
for json_dict in data:
    if json_dict['industry'] == 'food':
        result_list.append(json_dict['price'])
result_list.sort(reverse= True)
print(result_list)
Also, result_list=[json_dict['price'] for json_dict in data] seems like a redundant line to me. I don't think it's required there.
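To also keep the firm names, which the question asks about, one option is to collect (name, price) pairs instead of bare prices. This is only a minimal sketch: the data is inlined as an assumption so it runs without firm_list.json, and food_firms is a name I made up for illustration.

```python
# Inline copy of the question's data, so the sketch runs without firm_list.json.
data = [
    {"industry": "food", "price": 100.0, "name": "mcdonald's"},
    {"industry": "food", "price": 90.0, "name": "tacobell"},
    {"industry": "food", "price": 150.0, "name": "Subway"},
    {"industry": "Cars", "price": 90.0, "name": "Audi"},
]

# Keep only food firms, as (name, price) pairs.
food_firms = [(d["name"], d["price"]) for d in data if d["industry"] == "food"]

# Sort by price (second element of each pair), highest first.
food_firms.sort(key=lambda pair: pair[1], reverse=True)

for name, price in food_firms:
    print(f"{name}: {price}")
```

Sorting the pairs by their second element keeps each name attached to its price, which a plain list of prices cannot do.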
I receive orders that contain the name of customer, price of order and the quantity of the order.
The format of the order looks like this: {'first order': ['Alex', '100#2']}
(100 refers to a price and 2 refers to a quantity).
So I have different orders: {'first order': ['Alex', '99#2'], 'second order': ['Ann', '101#2'], 'third order': ['Nick', '110#3']}
We need to compare the prices and see which is the highest price and which is the lowest.
I was thinking of doing this by cutting the substring into two parts, the first part before the '#' symbol and the second part after the '#' symbol, then extract the numbers from the first part and compare them with others.
What's the most efficient way you can tell to solve this issue?
Thank you
I'd suggest transforming the dictionary into a list of dictionaries and converting the string into two floats. For example:
orders = {
    "first order": ["Alex", "99#2"],
    "second order": ["Ann", "101#2"],
    "third order": ["Nick", "110#3"],
}
orders = [
    {
        "order": k,
        "name": name,
        "price": float(pq.split("#")[0]),
        "quantity": float(pq.split("#")[1]),
    }
    for k, (name, pq) in orders.items()
]
Then if you want to find highest and lowest price you can use min/max function easily:
highest = max(orders, key=lambda k: k["price"])
lowest = min(orders, key=lambda k: k["price"])
print(highest)
print(lowest)
Prints:
{'order': 'third order', 'name': 'Nick', 'price': 110.0, 'quantity': 3.0}
{'order': 'first order', 'name': 'Alex', 'price': 99.0, 'quantity': 2.0}
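A variation on the same idea, in case the quantity should stay a whole number: split each string once with str.partition and convert the quantity with int. Treat this as a sketch only. parse_order and parsed are hypothetical names, and it assumes every value has exactly the 'price#quantity' shape shown in the question.

```python
orders = {
    "first order": ["Alex", "99#2"],
    "second order": ["Ann", "101#2"],
    "third order": ["Nick", "110#3"],
}


def parse_order(label, fields):
    """Hypothetical helper: turn ['Alex', '99#2'] into a dict with numeric fields."""
    name, price_quantity = fields
    # partition splits at the first '#' and returns (before, separator, after).
    price, _, quantity = price_quantity.partition("#")
    return {
        "order": label,
        "name": name,
        "price": float(price),
        "quantity": int(quantity),
    }


parsed = [parse_order(label, fields) for label, fields in orders.items()]

highest = max(parsed, key=lambda o: o["price"])
lowest = min(parsed, key=lambda o: o["price"])
print(highest["name"], highest["price"])
print(lowest["name"], lowest["price"])
```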
I have a list of dictionaries
d = [{"Date": "2020/10/03 3:30", "Name": "John"}, {"Date": "2020/10/03 5:15", "Name": "Harry"}, {"Date": "2020/10/05 6:30", "Name": "Rob"}]
and I want to print only the names that share the same date.
Output:
John
Harry
I am not sure how I can implement this, any tips ?
Your problem can easily be solved by traversing the list of entries and collecting the dates with the names in a new dictionary. So, you use the dates as key for a dictionary and add the names in a corresponding list of that date. I'm adding a code snippet that does that fairly easily:
d = [{"Date": "2020/10/03 3:30", "Name": "John"}, {"Date": "2020/10/03 5:15","Name": "Harry"}, {"Date": "2020/10/05 6:30", "Name": "Rob"}]
dates = {}
for entry in d:
    date = entry["Date"].split()[0]
    if date in dates:
        dates[date].append(entry["Name"])
    else:
        dates[date] = []
        dates[date].append(entry["Name"])
print(dates["2020/10/03"])
print(dates["2020/10/05"])
Yes, I know my code snippet doesn't directly provide your specified output. I kept it open ended so you can tailor it towards your specific needs.
Here's an approach using collections.Counter and a couple list comprehensions:
from collections import Counter
d = [{"Date": "2020/10/03 3:30", "Name": "John"}, {"Date": "2020/10/03 5:15", "Name": "Harry"}, {"Date": "2020/10/05 6:30", "Name": "Rob"}]
dates = Counter([obj['Date'].split()[0] for obj in d])
multiples = [val for val in dates.keys() if dates[val] > 1]
for obj in d:
    if obj['Date'].split()[0] in multiples:
        print(obj['Name'])
This prints the following output:
John
Harry
You can extract dates and then sort and group by them.
from datetime import datetime
from itertools import groupby
This is a helper function for extracting the date from a dictionary:
def dategetter(one_dict):
    return datetime.strptime(one_dict['Date'],
                             "%Y/%m/%d %H:%M").date()
This is a dictionary comprehension that extracts, sorts, groups, and organizes the results into a dictionary. You can print the dictionary data in any way you want.
{date: [name['Name'] for name in names] for date,names
in groupby(sorted(d, key=dategetter), key=dategetter)}
#{datetime.date(2020, 10, 3): ['John', 'Harry'],
# datetime.date(2020, 10, 5): ['Rob']}
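The sorted() call in that comprehension matters: itertools.groupby only merges items that are next to each other, so equal dates that are not adjacent end up in separate groups. A small sketch with plain date strings (the sample list is made up for illustration) shows the difference:

```python
from itertools import groupby

# Two entries share 2020/10/03, but they are not adjacent in this list.
dates = ["2020/10/03", "2020/10/05", "2020/10/03"]

# Without sorting, each run of equal neighbours becomes its own group.
unsorted_groups = [(key, len(list(group))) for key, group in groupby(dates)]

# After sorting, equal dates sit next to each other and are grouped together.
sorted_groups = [(key, len(list(group))) for key, group in groupby(sorted(dates))]

print(unsorted_groups)
print(sorted_groups)
```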
Assuming the "Date" format is consistent, you can group the names by date with a defaultdict and print only the groups with more than one name:
from collections import defaultdict
res = defaultdict(list)
for i in d:
    dt = i["Date"].split()[0]
    res[dt].append(i["Name"])
for date, names in res.items():
    if len(names) > 1:
        print(*names, sep="\n")
I extract data using an API and retrieve a list of servers and backups. Some servers have more than one backup. This is how I get the list of all servers with backup IDs.
bkplist = requests.get('https://heee.com/1.2/storage/backup')
bkplist_json = bkplist.json()
backup_list = bkplist.json()
backupl = backup_list['storages']['storage']
Json looks like this:
{
"storages": {
"storage": [
{
"access": "",
"created": "",
"license": ,
"origin": "01165",
"size": ,
"state": "",
"title": "",
"type": "backup",
"uuid": "01019",
"zone": ""
},
Firstly I create a dictionary to store this data:
backup = {}
for u in backup_list['storages']['storage']:
    srvuuidorg = u['origin']
    backup_uuid = u['uuid']
    backup[srvuuidorg] = backup_uuid
But then I found out there is more than one value for every server. As a dictionary can have just one value assigned to each key, I wanted to use some hybrid of list and dictionary, but I just can't figure out how to do this with this example.
Servers are nested in storages->storage, and I need to assign several uuid values (the backup IDs) to one origin (the server ID).
I know about the collections module, and with a simple example it is quite understandable, but I have trouble applying it to my example with data extracted through an API.
How do I extract origin and assign to this key the other values stored in the JSON uuid field?
What's more, it is a massive amount of data, so I cannot add every value manually.
You can do something like this.
from collections import defaultdict
backup = defaultdict(list)
for u in backup_list['storages']['storage']:
    srvuuidorg = u['origin']
    backup_uuid = u['uuid']
    backup[srvuuidorg].append(backup_uuid)
Note that you can simplify your loop like this.
from collections import defaultdict
backup = defaultdict(list)
for u in backup_list['storages']['storage']:
    backup[u['origin']].append(u['uuid'])
But this may be considered less readable.
You could store a list of uuid values for each origin key.
I suggest the following 2 ways:
Creating an empty list the first time an origin is accessed, and then appending to it:
backup = {}
for u in backup_list['storages']['storage']:
    srvuuidorg = u['origin']
    backup_uuid = u['uuid']
    if not backup.get(srvuuidorg):
        backup[srvuuidorg] = []
    backup[srvuuidorg].append(backup_uuid)
Using defaultdict collection, which basically does the same for you under the hood:
from collections import defaultdict
backup = defaultdict(list)
for u in backup_list['storages']['storage']:
    srvuuidorg = u['origin']
    backup_uuid = u['uuid']
    backup[srvuuidorg].append(backup_uuid)
It seems to me that the last way is more elegant.
If you need to store a list of unique uuid values, you should use the same approach with set instead of list.
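The set variant could look roughly like this. It is a sketch under assumptions: the backup_list sample below is made up to mimic the shape of the API response in the question, and the origin and uuid values are placeholders, not real IDs.

```python
from collections import defaultdict

# Hypothetical sample shaped like the API response in the question;
# the origin and uuid values are invented for illustration.
backup_list = {
    "storages": {
        "storage": [
            {"origin": "server-a", "uuid": "backup-1"},
            {"origin": "server-a", "uuid": "backup-2"},
            {"origin": "server-a", "uuid": "backup-1"},  # duplicate entry
            {"origin": "server-b", "uuid": "backup-3"},
        ]
    }
}

# defaultdict(set) creates an empty set the first time an origin is seen.
backup = defaultdict(set)
for u in backup_list["storages"]["storage"]:
    # add() ignores values that are already present, so duplicates collapse.
    backup[u["origin"]].add(u["uuid"])

for origin, uuids in backup.items():
    print(origin, sorted(uuids))
```

Note that a set does not keep insertion order, so sort it (as in the print above) if a stable order matters.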
JSON allows a key to hold an array:
var = {
    "array": [
        {"id": 1, "value": "one"},
        {"id": 2, "value": "two"},
        {"id": 3, "value": "three"}
    ]
}
print(var)
# {'array': [{'id': 1, 'value': 'one'}, {'id': 2, 'value': 'two'}, {'id': 3, 'value': 'three'}]}
var["array"].append({"id": 4, "value": "new"})
print(var)
# {'array': [{'id': 1, 'value': 'one'}, {'id': 2, 'value': 'two'}, {'id': 3, 'value': 'three'}, {'id': 4, 'value': 'new'}]}
You can use a list to hold multiple values under one key (the variable is named greetings here to avoid shadowing the built-in dict):
greetings = {"Greetings": ["hello", "hi"]}
I am using this database: https://cpj.org/data/killed/?status=Killed&motiveConfirmed%5B%5D=Confirmed&type%5B%5D=Journalist&localOrForeign%5B%5D=Foreign&start_year=1992&end_year=2019&group_by=year
I have preprocessed it into this csv (showing only 2 lines of 159):
year,combinedStatus,fullName,sortName,primaryNationality,secondaryNationality,tertiaryNationality,gender,photoUrl,photoCredit,type,lastStatus,typeOfDeath,status,employedAs,organizations,jobs,coverage,mediums,country,location,region,state,locality,province,localOrForeign,sourcesOfFire,motiveConfirmed,accountabilityCrossfire,accountabilityAssignment,impunityMurder,tortured,captive,threatened,charges,motive,lengthOfSentence,healthProblems,impCountry,entry,sentenceDate,sentence,locationImprisoned
1994,Confirmed,Abdelkader Hireche,,,,,Male,,,Journalist,,Murder,Killed,Staff,Algerian Television (ENTV),Broadcast Reporter,Politics,Television,Algeria,Algiers,,,Algiers,,Foreign,,Confirmed,,,Partial Impunity,No,No,No,,,,,,,,,
2014,Confirmed,Ahmed Hasan Ahmed,,,,,Male,,,Journalist,,Dangerous Assignment,Killed,Staff,Xinhua News Agency,"Camera Operator,Photographer","Human Rights,Politics,War",Internet,Syria,Damascus,,,Damascus,,Foreign,,Confirmed,,,,,,,,,,,,,,,
And I want to make this type of JSON out of it:
"Afghanistan": {"year": 2001, "fullName": "Volker Handloik", "gender": "Male", "typeOfDeath": "Crossfire", "employedAs": "Freelance", "organizations": "freelance reporter", "jobs": "Print Reporter", "coverage": "War", "mediums": "Print", "photoUrl": NaN}, "Somalia": {"year": 1994, "fullName": "Pierre Anceaux", "gender": "Male", "typeOfDeath": "Murder", "employedAs": "Freelance", "organizations": "freelance", "jobs": "Broadcast Reporter", "coverage": "Human Rights", "mediums": "Television", "photoUrl": NaN}
The problem is that Afghanistan (as you can see in the link) has had many journalist deaths. I want to list all these killings under the Index 'Afghanistan'. However, as I currently do it, only the last case (Volker Handloik) in the csv file shows up. How can I get it so every case shows up?
This is my code at the moment:
import pandas as pd
import pprint as pp
import json
# list with stand-ins for empty cells
missing_values = ["n/a", "na", "unknown", "-", ""]
# set missing values to NaN
df = pd.read_csv("data_journalists.csv", na_values = missing_values, skipinitialspace = True, error_bad_lines=False)
# columns
columns_keep = ['year', 'fullName', 'gender', 'typeOfDeath', 'employedAs', 'organizations', 'jobs', 'coverage', 'mediums', 'country', 'photoUrl']
small_df = df[columns_keep]
with pd.option_context('display.max_rows', None, 'display.max_columns', None): # more options can be specified also
    print(small_df)
# create dict with country-column as index
df_dict = small_df.set_index('country').T.to_dict('dict')
print(df_dict)
# make json file from the dict
with open('result.json', 'w') as fp:
    json.dump(df_dict, fp)
# use pretty print to see if dict matches the json example in the exercise
pp.pprint(df_dict)
I want to include all of these names (and more) in the JSON under the index Afghanistan.
I think I need a list of objects attached to each country's index, so that every country shows all of its cases of journalist deaths instead of only one (each being replaced by the next in the csv). I hope this is clear enough.
I'll keep your code until the definition of small_df.
After that, we perform a groupby on the 'country' column and call DataFrame.to_json on each group:
country_series = small_df.groupby('country').apply(lambda r : r.drop(['country'], axis=1).to_json())
country_series is a pd.Series with the countries as index.
After that, we create a nested dictionary, so that we have a valid json object:
fullDict = {}
# Series.items() replaces iteritems(), which was removed in pandas 2.0
for ind, a in country_series.items():
    b = json.loads(a)
    c = b['fullName']
    smallDict = {}
    for index, journalist in c.items():
        smallDict[journalist] = {}
        for i in b.keys():
            smallDict[journalist][i] = b[i][index]
    fullDict[ind] = (smallDict)
The nomenclature in my part of code is pretty bad, but I tried to write all the steps explicitly so that things should be clear.
Finally, we write the results to a file:
with open('result.json', 'w') as f:
    json.dump(fullDict, f)
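If a list of cases per country is what you are after (as the question suggests), the same grouping can also be sketched without pandas, using only the standard library. This is not the approach above, just an alternative illustration: the sample rows are placeholders I made up with a subset of the question's columns, and cases_by_country is a hypothetical name.

```python
import csv
import io
import json
from collections import defaultdict

# Made-up rows with a subset of the question's columns,
# standing in for data_journalists.csv.
sample_csv = """year,fullName,country
2001,Person A,Afghanistan
2001,Person B,Afghanistan
1994,Person C,Somalia
"""

cases_by_country = defaultdict(list)
for row in csv.DictReader(io.StringIO(sample_csv)):
    # Remove the country from the record and use it as the grouping key.
    country = row.pop("country")
    cases_by_country[country].append(row)

# Every country now maps to a list with one object per case.
print(json.dumps(cases_by_country, indent=2))
```

With a real file you would pass the opened file object to csv.DictReader instead of the io.StringIO wrapper. Note that csv.DictReader leaves every value as a string, so year would need an explicit int() conversion if numbers are wanted.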
I am new to Python. I have two lists. One is key list and another list is value list.
title = ["Code","Title","Value",....]
value = [["100","abcd","100",...],["101","efgh","200",...],...]
data={}
data.setdefault("data",[]).append({"code": sp[0],"val": sp[2]})
this code gives me the following result.
{'data': [{'code': '100', 'val': '100'},{'code': '101', 'val': '200'}]}
But I want the result as the below,
{ "100": { "Title": "abcd", "Value": "100", ............, ............}, "101": { "Title": "efgh", "Value": "200", ............, ............} }
i.e., the first column of the value list should be the key of each JSON object, and the other items of each list should be generated as key-value pairs. How can I generate this JSON with Python from those two lists?
As the size of the lists is not mentioned, the code below loops over however many rows and titles there are. I am using Python 3.x.
title = ["Code","Title","Value"]
value = [["100","abcd","100"],["101","efgh","200"]]
dic1={}
# outer loop over the rows, inner loop over every title except the first ("Code")
for i in range(len(value)):
    for j in range(1, len(title)):
        dic1.setdefault(value[i][0],{}).update({title[j]:value[i][j]})
print(dic1)
Output is
{'100': {'Title': 'abcd', 'Value': '100'}, '101': {'Title': 'efgh', 'Value': '200'}}
I hope it is helpful!
You can build a dict from these lists. I made a quick snippet just so you can see the idea:
title = ["Code","Title","Value"]
value = [['100','abcd','100'],['101','efgh','200']]
data={}
for whatever in value:
    your_titles = {}
    # pair each title with the cell at the same position
    your_titles[title[0]] = whatever[0]
    your_titles[title[1]] = whatever[1]
    your_titles[title[2]] = whatever[2]
    data[whatever[0]] = your_titles
print(data)
The output:
{'100': {'Code': '100', 'Title': 'abcd', 'Value': '100'}, '101': {'Code': '101', 'Title': 'efgh', 'Value': '200'}}
Please read this tutorial and try to make it yourself. This is not the optimal solution for this problem.
Make a data frame, set the Code column as the index, and then convert it to JSON:
import pandas as pd
data_frame = pd.DataFrame(columns = title, data = value)
data = data_frame.set_index('Code')
json1 = data.to_json(orient='index')
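If you would rather not pull in pandas for this, a dict comprehension with zip gives the same nested shape for any number of columns. A minimal sketch, assuming every row has as many items as title and that the first column is unique:

```python
import json

title = ["Code", "Title", "Value"]
value = [["100", "abcd", "100"], ["101", "efgh", "200"]]

# The first column becomes the outer key; the remaining titles are
# zipped with the remaining cells of the same row.
data = {row[0]: dict(zip(title[1:], row[1:])) for row in value}

print(json.dumps(data))
```

Because zip pairs titles and cells by position, adding more columns to title and value needs no change to the comprehension.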