Adding a count to an object array in Python

How would you count and group an array of objects using Counter?
list = [ {
"text": "London",
},
{
"text": "New york",
},
{
"text": "London",
}]
The output that I'm expecting is:
[ {
"text": "London",
"count": 2
},
{
"text": "New york",
"count": 1
}
]

An option would be the following:
counter = {}
for d in current_list:  # current_list is the list of dictionaries from the question
    for v in d.values():
        # Start at 0 the first time a value is seen, then add one
        counter[v] = counter.get(v, 0) + 1
print(counter)
new_list = [{"text": k, "count": v} for k, v in counter.items()]
print(new_list)

If you are looking for a more compact option, you can do this with a one-line list comprehension:
lst = [ {
"text": "London",
},
{
"text": "New york",
},
{
"text": "London",
}]
[{'text': location, 'count': [x['text'] for x in lst].count(location)} for location in set([x['text'] for x in lst])]
Output:
[{'text': 'London', 'count': 2}, {'text': 'New york', 'count': 1}]
Note: I renamed the original list to avoid shadowing the built-in name list with a variable.
To extend this to an arbitrary number of attributes that are unique to the location, use this:
[{**{n['text']: n for n in lst}[loc], **{'count': [x['text'] for x in lst].count(loc)}} for loc in set([x['text'] for x in lst])]
with a new input of
lst = [ {
"text": "London",
"country": "UK"
},
{
"text": "New york",
"country": "USA"
},
{
"text": "London",
"country": "UK"
}]
Output:
[{'text': 'New york', 'country': 'USA', 'count': 1},
{'text': 'London', 'country': 'UK', 'count': 2}]
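The comprehensions above rebuild the list of names once per location, so the work grows quadratically with the input. A minimal sketch of a single-pass alternative with collections.Counter follows; it assumes, as the answer does, that the extra attributes are identical for every occurrence of the same "text". The names counts and first_seen are only illustrative.

```python
from collections import Counter

# Sample input taken from the answer above.
lst = [
    {"text": "London", "country": "UK"},
    {"text": "New york", "country": "USA"},
    {"text": "London", "country": "UK"},
]

# Count every location in one pass over the list.
counts = Counter(d["text"] for d in lst)

# Remember the first dictionary seen for each location
# (assumption: later duplicates carry the same attributes).
first_seen = {}
for d in lst:
    first_seen.setdefault(d["text"], d)

# Merge the stored attributes with the count for each location.
result = [{**first_seen[text], "count": n} for text, n in counts.items()]
print(result)
```

Because Counter and dict preserve insertion order on Python 3.7+, the result follows the order in which each location first appears, unlike the set-based versions.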

I would do this in two steps:
>>> from collections import Counter
>>> l = [...]
>>> counter = Counter([x['text'] for x in l])
>>> [{'text': k, 'count': c} for k, c in counter.items()]
[{'text': 'London', 'count': 2}, {'text': 'New york', 'count': 1}]

Unserious itertools.groupby solution
from itertools import groupby
L = [ {
"text": "London",
},
{
"text": "New york",
},
{
"text": "London",
}]
L = [{**key, "count": len(list(group))} for key, group in groupby(sorted(L, key=str))]
print(L)
It works in this case because each dictionary has only one key, but it might not work if two dictionaries hold the same keys in a different order, since str() would then produce different sort keys for equal dictionaries. I'm happy to improve it if you have a suggestion for the sorting function.
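One possible answer to the key-ordering concern is to sort and group on the sorted (key, value) pairs of each dictionary, which are the same no matter how the keys were inserted. This is only a sketch: the helper name canonical and the sample records are made up for illustration, and it assumes the values stored under a given key can be compared with each other.

```python
from itertools import groupby

# Hypothetical sample: the first two dicts are equal but list their keys in a different order.
records = [
    {"text": "London", "country": "UK"},
    {"country": "UK", "text": "London"},
    {"text": "New york", "country": "USA"},
]

def canonical(d):
    # Sorted (key, value) pairs do not depend on insertion order.
    return sorted(d.items())

# Sort and group with the same key so equal dicts end up adjacent.
grouped = [
    {**dict(key), "count": sum(1 for _ in group)}
    for key, group in groupby(sorted(records, key=canonical), key=canonical)
]
print(grouped)
```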

Related

Fetch the Json for a particular key value

for an input json
[{
"Name": "John",
"Age": "23",
"Des": "SE"
},
{
"Name": "Rai",
"Age": "33",
"Des": "SSE"
},
{
"Name": "James",
"Age": "42",
"Des": "SE"
}
]
I want to keep only the entries where "Des" is "SE" and drop the "Des" key from the result.
Required output:
[{
"Name": "John",
"Age": "23"
},
{
"Name": "James",
"Age": "42"
}
]
A list comprehension should do it:
out = [{'Name':d['Name'], 'Age':d['Age']} for d in lst if d['Des']=='SE']
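Another way (note that pop removes "Des" from every dictionary in lst, including the ones that are filtered out):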
out = [d for d in lst if d.pop('Des')=='SE']
Output:
[{'Name': 'John', 'Age': '23'}, {'Name': 'James', 'Age': '42'}]
To make it more dynamic if each json has more elements:
import json
input_str = '[{"Name": "John", "Age": "23", "Des": "SE"}, {"Name": "Rai", "Age": "33", "Des": "SSE"}, {"Name": "James", "Age": "42", "Des": "SE"}]'
input_list = json.loads(input_str)
# If you already converted to a list of dicts, then you don't need the above
# Using pop here removes the key you are using to filter
output = [each for each in input_list if each.pop("Des") == "SE"]
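The pop-based comprehensions modify the input: after they run, no dictionary in the original list has a "Des" key any more. If the input must stay intact, a non-mutating variant can build new dictionaries instead. The helper name filter_and_drop below is hypothetical, and this is a sketch rather than the only way to do it.

```python
import json

input_str = '[{"Name": "John", "Age": "23", "Des": "SE"}, {"Name": "Rai", "Age": "33", "Des": "SSE"}, {"Name": "James", "Age": "42", "Des": "SE"}]'
input_list = json.loads(input_str)

def filter_and_drop(records, key, wanted):
    """Return copies of the records whose records[key] equals wanted, without that key."""
    return [
        {k: v for k, v in rec.items() if k != key}  # copy everything except the filter key
        for rec in records
        if rec.get(key) == wanted  # .get() tolerates records that lack the key
    ]

output = filter_and_drop(input_list, "Des", "SE")
print(output)
```

Since every other key is copied as-is, this works unchanged when each record has more fields than "Name" and "Age".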
Using the json module, you can load a file using load or a string using loads. From there, the result acts as a normal Python list of dictionaries which you can iterate over and check the keys of. You then simply create a new list of the dictionaries that match your desired pattern and remove the key you are no longer using. Example:
import json
jsonString = """[{
"Name": "John",
"Age": "23",
"Des": "SE"
},
{
"Name": "Rai",
"Age": "33",
"Des": "SSE"
},
{
"Name": "James",
"Age": "42",
"Des": "SE"
}
]"""
jsonList = json.loads(jsonString)
filteredList = []
def CheckDes(dataDict: dict):
    # Keep only entries whose "Des" is "SE", dropping that key
    if dataDict['Des'] == 'SE':
        dataDict.pop('Des')
        filteredList.append(dataDict)
print(jsonList)
"""
[
{
'Name': 'John',
'Age': '23',
'Des': 'SE'
},
{
'Name': 'Rai',
'Age': '33',
'Des': 'SSE'
},
{
'Name': 'James',
'Age': '42',
'Des': 'SE'
}
]"""
for dataDict in jsonList:
    CheckDes(dataDict)
print(filteredList)
"""[
{
'Name': 'John',
'Age': '23'
},
{
'Name': 'James',
'Age': '42'
}
]
"""

How to Manipulate dictionary data in python [duplicate]

I need your help to solve the task: I have a list of dicts with the following data about products:
- id;
- title;
- country;
- seller;
In the output I expect all dictionaries with the same id to be grouped, with a new key called "info" that holds a list of dicts with the "country" and "seller" of each entry for that product.
Input data
data = [
{"id": 1, "title": "Samsung", "country": "France", "seller": "amazon_fr"},
{"id": 2, "title": "Apple", "country": "Spain", "seller": "amazon_es"},
{"id": 2, "title": "Apple", "country": "Italy", "seller": "amazon_it"},
]
Output data
result = [
{"id": 1, "title": "Samsung", "info": [{"country": "France", "seller": "amazon_fr"}]},
{"id": 2, "title": "Apple", "info": [{"country": "Spain", "seller": "amazon_es"}, {"country": "Italy", "seller": "amazon_it"}]},
]
Thanks a lot in advance for your efforts.
P.S. Pandas solutions are also appreciated.
Here's a straight python solution, creating a result dictionary based on the id values from each dictionary in data, and updating the values in that dictionary when a matching id value is found. The values of the dictionary are then used to create the output list:
data = [
{"id": 1, "title": "Samsung", "country": "France", "seller": "amazon_fr"},
{"id": 2, "title": "Apple", "country": "Spain", "seller": "amazon_es"},
{"id": 2, "title": "Apple", "country": "Italy", "seller": "amazon_it"},
]
result = {}
for d in data:
    key = d['id']  # named key to avoid shadowing the built-in id()
    info = {"country": d['country'], "seller": d['seller']}
    if key in result:
        result[key]['info'].append(info)
    else:
        result[key] = {"id": key, "title": d['title'], "info": [info]}
result = list(result.values())
print(result)
Output:
[
{'title': 'Samsung', 'id': 1, 'info': [{'seller': 'amazon_fr', 'country': 'France'}]},
{'title': 'Apple', 'id': 2, 'info': [{'seller': 'amazon_es', 'country': 'Spain'},
{'seller': 'amazon_it', 'country': 'Italy'}
]
}
]
You can use itertools.groupby:
from operator import itemgetter
from itertools import groupby
data.sort(key=itemgetter('id'))
group = groupby(data, key=lambda x: (x['id'], x['title']))
result = [
    {'id': i, 'title': t, 'info': [{'country': d['country'], 'seller': d['seller']} for d in v]}
    for (i, t), v in group]
Output:
[{'id': 1,
'title': 'Samsung',
'info': [{'country': 'France', 'seller': 'amazon_fr'}]},
{'id': 2,
'title': 'Apple',
'info': [{'country': 'Spain', 'seller': 'amazon_es'},
{'country': 'Italy', 'seller': 'amazon_it'}]}]
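The groupby version needs the data sorted first, and data.sort reorders the input list in place. A sketch of a variant that avoids sorting uses collections.defaultdict keyed on (id, title); the variable names are illustrative and it assumes every row has all four keys.

```python
from collections import defaultdict

data = [
    {"id": 1, "title": "Samsung", "country": "France", "seller": "amazon_fr"},
    {"id": 2, "title": "Apple", "country": "Spain", "seller": "amazon_es"},
    {"id": 2, "title": "Apple", "country": "Italy", "seller": "amazon_it"},
]

# Map (id, title) to the list of info dicts collected for that product.
grouped = defaultdict(list)
for row in data:
    grouped[(row["id"], row["title"])].append(
        {"country": row["country"], "seller": row["seller"]}
    )

# Dicts keep insertion order (Python 3.7+), so products appear in first-seen order.
result = [
    {"id": id_, "title": title, "info": info}
    for (id_, title), info in grouped.items()
]
print(result)
```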

How to create a dictionary if value contain a particular string

I have the dictionary below.
First I need to check whether parent contains Main or Contract.
For Main, add the name to the level1 entry; for Contract, add it to the level2 entry.
d = {"employee": [
{
"id": "18",
"name": "Manager",
"parent": "Main level"
},
{
"id": "19",
"name": "Employee",
"parent": "Main level"
},
{
"id": "32",
"name": "Contract",
"parent": "Contract level"
},
{
"id": "21",
"name": "Admin",
"parent": "Main level"
},
]}
The expected output is below:
{"employee": [
{'level1':['Manager','Employee']},
{'level2':['Test','HR']},
{
"id": "18",
"name": "Manager",
"parent": "Main level"
},
{
"id": "19",
"name": "Employee",
"parent": "Main level"
},
{
"id": "32",
"name": "Test",
"parent": "Contract level"
},
{
"id": "21",
"name": "HR",
"parent": "Contract level"
},
]}
Code:
d['level1'] = {}
d['level2'] = {}
for i,j in d.items():
    #check parent is Main
    if j['parent'] in 'Main':
        d['level1'] = j['name']
    if j['parent'] in 'Contract':
        d['level2'] = j['name']
I got the error TypeError: list indices must be integers or slices, not str
Your for loop is misguided.
You made 3 mistakes:
You tried looping over the parent dict instead of the actual list of employees.
You are using x in y backwards, for checking if one string contains another.
You are not actually appending new names to the "levels" lists.
Try this:
d["level1"] = []
d["level2"] = []
for j in d["employee"]:
    # check parent is Main
    if "Main" in j["parent"]:
        d["level1"] += [j["name"]]
    if "Contract" in j["parent"]:
        d["level2"] += [j["name"]]
That will give you the "levels" as dict "siblings" of the employees (instead of in the list of employees, which is what you actually want).
To get the exact result you want, you would have to do something like this:
level1 = []
level2 = []
for j in d["employee"]:
    # check parent is Main
    if "Main" in j["parent"]:
        level1 += [j["name"]]
    if "Contract" in j["parent"]:
        level2 += [j["name"]]
d["employee"] = [{"level1": level1}, {"level2": level2}] + d["employee"]
Try this:
dd = {'Main level': 'level1', 'Contract level': 'level2'}
res = {}
for x in d['employee']:
    k = dd[x['parent']]
    if k in res:
        res[k].append(x['name'])
    else:
        res[k] = [x['name']]
d['employee'] = [{k: v} for k, v in res.items()] + d['employee']
print(d)
Output:
{'employee': [{'level1': ['Manager', 'Employee', 'Admin']},
{'level2': ['Contract']},
{'id': '18', 'name': 'Manager', 'parent': 'Main level'},
{'id': '19', 'name': 'Employee', 'parent': 'Main level'},
{'id': '32', 'name': 'Contract', 'parent': 'Contract level'},
{'id': '21', 'name': 'Admin', 'parent': 'Main level'}]}
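The two approaches above can be combined: keep the substring test on "parent" from the first answer, but drive it from a mapping as in the second, so that adding a level only means adding one entry. This is a sketch under those assumptions; the name LEVELS is hypothetical.

```python
d = {"employee": [
    {"id": "18", "name": "Manager", "parent": "Main level"},
    {"id": "19", "name": "Employee", "parent": "Main level"},
    {"id": "32", "name": "Contract", "parent": "Contract level"},
    {"id": "21", "name": "Admin", "parent": "Main level"},
]}

# Hypothetical mapping: substring to look for in "parent" -> output key.
LEVELS = {"Main": "level1", "Contract": "level2"}

# Start every level with an empty list so it appears even when nothing matches.
levels = {level: [] for level in LEVELS.values()}
for emp in d["employee"]:
    for needle, level in LEVELS.items():
        if needle in emp["parent"]:
            levels[level].append(emp["name"])

# Put one single-key dict per level in front of the employee records.
d["employee"] = [{k: v} for k, v in levels.items()] + d["employee"]
print(d)
```

Unlike the exact-match lookup dd[x['parent']], an unknown parent value is simply skipped here instead of raising a KeyError.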

Python: What is Best Approach for Flattening a list of Dictionary

How can I flatten a list of dictionaries that contain nested dictionaries? Say I have the following list:
data = [
{ 'Name':'xyx',
'Age':22,
'EmpDetails':{'Salary':100,'Job':'Intern','Location':'TER'}
},
{ 'Name':'abc',
'Age':23,
'EmpDetails':{'JoinDate':'20110912','Salary':200,'Job':'Intern2','Location':'TER2'}
},
{'Name':'efg',
'Age':24,
'EmpDetails':{'JoinDate':'20110912','enddate':'20120912','Salary':300,'Job':'Intern3','Location':'TER3'}
}
]
I need the EmpDetails node removed and its values moved one level up, like below:
data = [
{ 'Name':'xyx','Age':22,'Salary':100,'Job':'Intern','Location':'TER'},
{ 'Name':'abc','Age':23,'JoinDate':'20110912','Salary':200,'Job':'Intern2','Location':'TER2'},
{'Name':'efg','Age':24,'JoinDate':'20110912','enddate':'20120912','Salary':300,'Job':'Intern3','Location':'TER3'}
]
I am currently doing this with the code below. Is there a faster way?
newlist = []
for d in data:
    empdict = {}
    for key, val in d.items():
        if key != 'EmpDetails':
            empdict[key] = val
        else:
            # Lift the nested values one level up
            for key2, val2 in val.items():
                empdict[key2] = val2
    newlist.append(empdict)
This is one approach using dict.update and .pop
Ex:
data = [
{ 'Name':'xyx',
'Age':22,
'EmpDetails':{'Salary':100,'Job':'Intern','Location':'TER'}
},
{ 'Name':'abc',
'Age':23,
'EmpDetails':{'JoinDate':'20110912','Salary':200,'Job':'Intern2','Location':'TER2'}
},
{'Name':'efg',
'Age':24,
'EmpDetails':{'JoinDate':'20110912','enddate':'20120912','Salary':300,'Job':'Intern3','Location':'TER3'}
}
]
for i in data:
    i.update(i.pop("EmpDetails"))
print(data)
Output:
[{'Age': 22, 'Job': 'Intern', 'Location': 'TER', 'Name': 'xyx', 'Salary': 100},
{'Age': 23,
'Job': 'Intern2',
'JoinDate': '20110912',
'Location': 'TER2',
'Name': 'abc',
'Salary': 200},
{'Age': 24,
'Job': 'Intern3',
'JoinDate': '20110912',
'Location': 'TER3',
'Name': 'efg',
'Salary': 300,
'enddate': '20120912'}]
One line method, maybe a little tricky.
data = [
{
"Name": "xyx",
"Age": 22,
"EmpDetails": {"Salary": 100, "Job": "Intern", "Location": "TER"},
},
{
"Name": "abc",
"Age": 23,
"EmpDetails": {
"JoinDate": "20110912",
"Salary": 200,
"Job": "Intern2",
"Location": "TER2",
},
},
{
"Name": "efg",
"Age": 24,
"EmpDetails": {
"JoinDate": "20110912",
"enddate": "20120912",
"Salary": 300,
"Job": "Intern3",
"Location": "TER3",
},
},
]
# only python3.5+
res = [{**item.pop("EmpDetails", {}), **item} for item in data]
I'd prefer the json_normalize() function from the pandas library (exposed as pandas.json_normalize in recent versions), since it is an elegant solution and does not hurt the readability of your code.
Examples can be seen here: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.io.json.json_normalize.html
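If pandas is not available, or the nesting goes deeper than one level, a small recursive helper is another option. The function name flatten is illustrative, and the sketch assumes that dropping the wrapper key (such as "EmpDetails") is what you want, as in the question; when a nested key collides with an outer one, the value written last wins.

```python
def flatten(record):
    """Return a new dict with every nested dict merged into the top level."""
    flat = {}
    for key, value in record.items():
        if isinstance(value, dict):
            # Drop the wrapper key and lift its (possibly nested) contents.
            flat.update(flatten(value))
        else:
            flat[key] = value
    return flat

data = [
    {"Name": "xyx", "Age": 22,
     "EmpDetails": {"Salary": 100, "Job": "Intern", "Location": "TER"}},
    {"Name": "abc", "Age": 23,
     "EmpDetails": {"JoinDate": "20110912", "Salary": 200, "Job": "Intern2", "Location": "TER2"}},
]

# Build new dicts; the original records keep their "EmpDetails" key.
flattened = [flatten(item) for item in data]
print(flattened)
```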
