Is there a way to see how many items in a dictionary share the same value in Python?
Let's say that I have a dictionary like:
{"a": 600, "b": 75, "c": 75, "d": 90}
I'd like to get a resulting dictionary like:
{600: 1, 75: 2, 90: 1}
My first naive attempt would be a nested for loop: for each value, iterate over the dictionary again and count the matches. Is there a better way to do this?
You could use itertools.groupby for this.
import itertools
x = {"a": 600, "b": 75, "c": 75, "d": 90}
[(k, len(list(v))) for k, v in itertools.groupby(sorted(x.values()))]
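The expression above returns a list of (value, count) pairs. If you want the dictionary from the question, a dict comprehension over the same groupby call should do it. This is only a sketch on the question's data, and the name counts is made up for illustration:

```python
import itertools

x = {"a": 600, "b": 75, "c": 75, "d": 90}

# groupby only merges *adjacent* equal items, so the values are sorted first
counts = {value: len(list(group))
          for value, group in itertools.groupby(sorted(x.values()))}

print(counts)
```

Without the sorted() call, equal values that are not next to each other would end up in separate groups and the later count would overwrite the earlier one.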
When Python 2.7 comes out you can use its collections.Counter class;
otherwise see the Counter recipe.
Under Python 2.7a3
from collections import Counter
items = {"a": 600, "b": 75, "c": 75, "d": 90}
# Count the values: Counter(items) would only copy the key -> value mapping
c = Counter(items.values())
print( dict( c.items() ) )
output is
{600: 1, 90: 1, 75: 2}
>>> a = {"a": 600, "b": 75, "c": 75, "d": 90}
>>> b = {}
>>> for k, v in a.items():
...     b[v] = b.get(v, 0) + 1
...
>>> b
{600: 1, 90: 1, 75: 2}
>>>
Use Counter (2.7+, see below at link for implementations for older versions) along with dict.values().
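A minimal sketch of what that could look like, assuming Python 2.7 or later. The variable name value_counts is only illustrative:

```python
from collections import Counter

a = {"a": 600, "b": 75, "c": 75, "d": 90}

# Counter tallies each element of the iterable it is given,
# so pass the dictionary's values rather than the dictionary itself
value_counts = Counter(a.values())

print(dict(value_counts))           # plain dict of value -> count
print(value_counts.most_common(1))  # the most frequent value with its count
```

Counter is a dict subclass, so you can usually use it directly; dict() is only needed if you want a plain dictionary.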
>>> a = {"a": 600, "b": 75, "c": 75, "d": 90}
>>> d={}
>>> for v in a.values():
...     if v not in d:
...         d[v] = 1
...     else:
...         d[v] += 1
...
>>> d
{600: 1, 90: 1, 75: 2}
I need to use a for loop to return the list of values in a dictionary that are greater than x.
d = {}
for key in d():
    if key > x:
        return(d(key))
>>> d = dict(a=1, b=10, c=30, d=2)
>>> d
{'a': 1, 'c': 30, 'b': 10, 'd': 2}
>>> d = dict((k, v) for k, v in d.items() if v >= 10)
>>> d
{'c': 30, 'b': 10}
>>> values_list = list(d.values())
>>> values_list
[30, 10]
We keep a greater_than_x list and append each value from the dictionary d that is greater than the given x.
x = 20
greater_than_x = []
d = {"a": 10, "b": 20, "c": 30}
for value in d.values():
    if value > x:
        greater_than_x.append(value)
print(greater_than_x)
>[30]
One-liner applying the same logic:
x = 20
d = {"a": 10, "b": 20, "c": 30}
greater_than_x = [value for value in d.values() if value > x]
print(greater_than_x)
>[30]
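Since the question uses return, the loop probably belongs inside a function. A rough sketch of that, where the function name values_greater_than is just something I picked:

```python
def values_greater_than(d, x):
    """Return a list of the values in d that are greater than x."""
    result = []
    for value in d.values():
        if value > x:
            result.append(value)
    # Return only after the loop has finished, so every value is checked
    return result


d = {"a": 10, "b": 20, "c": 30}
print(values_greater_than(d, 15))
```

Putting the return inside the loop, as in the question, would stop at the first match instead of collecting all of them.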
I have json like this:
json = {
"b": 22,
"x": 12,
"a": 2,
"c": 4
}
When I generate an Excel file from this JSON like this:
import pandas as pd
df = pd.read_json(json_text)
file_name = 'test.xls'
file_path = "/tmp/" + file_name
df.to_excel(file_path, index=False)
print("path to excel " + file_path)
Pandas does its own ordering in the Excel file like this:
pandas_json = {
"a": 2,
"b": 22,
"c": 4,
"x": 12
}
I don't want this. I need the ordering which exists in the json. Please give me some advice how to do this.
UPDATE:
If I have JSON like this:
json = [
{"b": 22, "x":12, "a": 2, "c": 4},
{"b": 22, "x":12, "a": 2, "c": 2},
{"b": 22, "x":12, "a": 4, "c": 4},
]
pandas will generate its own ordering like this:
pandas_json = [
{"a": 2, "b":22, "c": 4, "x": 12},
{"a": 2, "b":22, "c": 2, "x": 12},
{"a": 4, "b":22, "c": 4, "x": 12},
]
How can I make pandas preserve my own ordering?
You can read the JSON into an OrderedDict, which retains the original key order:
import json
from collections import OrderedDict
import pandas as pd
json_ = """{
"b": 22,
"x": 12,
"a": 2,
"c": 4
}"""
data = json.loads(json_, object_pairs_hook=OrderedDict)
pd.DataFrame.from_dict(data,orient='index')
    0
b  22
x  12
a   2
c   4
Edit: this also works with the updated JSON:
j="""[{"b": 22, "x":12, "a": 2, "c": 4},
{"b": 22, "x":12, "a": 2, "c": 2},{"b": 22, "x":12, "a": 4, "c": 4}]"""
data = json.loads(j, object_pairs_hook=OrderedDict)
pd.DataFrame.from_dict(data).to_json(orient='records')
'[{"b":22,"x":12,"a":2,"c":4},{"b":22,"x":12,"a":2,"c":2},
{"b":22,"x":12,"a":4,"c":4}]'
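To check the parsing step on its own, without pandas, something like the following should show that the keys come back in document order. It is only a sketch: json_text, records and columns are names I made up, and the data is shortened from the question.

```python
import json
from collections import OrderedDict

json_text = '[{"b": 22, "x": 12, "a": 2, "c": 4}, {"b": 22, "x": 12, "a": 4, "c": 4}]'

# object_pairs_hook receives each object's key/value pairs in document order
records = json.loads(json_text, object_pairs_hook=OrderedDict)

# The desired column order can be read off the first record
columns = list(records[0].keys())
print(columns)
```

If the DataFrame still comes out sorted, selecting the columns explicitly with df[columns] before calling to_excel is one way to force the order. On Python 3.7 and later, plain dicts keep insertion order as well, so the hook matters mostly on older versions.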
I looked up intersections of dictionaries, and tried to use the set library, but couldn't figure out how to show the values and not just pull out the keys to work with them, so I'm hoping for some help. I've got three dictionaries of random length:
dict_a= {1: 488, 2: 336, 3: 315, 4: 291, 5: 275}
dict_b={2: 0, 3: 33, 1: 61, 5: 90, 15: 58}
dict_c= {1: 1.15, 9: 0, 2: 0.11, 15: 0.86, 19: 0.008, 20: 1834}
I need to figure out what keys are in dictionary A, B, and C, and combine those to a new dictionary. Then I need to figure out what keys are in dictionary A&B or A&C or B&C, and pull those out to a new dictionary. What I should have left over in A, B, and C are the ones that are unique to that dictionary.
So, eventually, I'd wind up with separate dictionaries, as follows:
total_intersect= {1: {488, 61, 1.15}, 2: {336, 0, 0.11}}
A&B_only_intersect = {3: {315,33}, 5:{275,90}} (then dicts for A&C intersect and B&C intersect)
dict_a_leftover= {4:291} (and dicts for leftovers from B and C)
I thought about using zip, but it's important that all those values stay in their respective places, meaning I can't have A values in the C position. Any help would be awesome!
lst = [dict_a,dict_b,dict_c]
total_intersect_key = set(dict_a) & set(dict_b) & set(dict_c)
total_intersect = { k:[ item[k] for item in lst ] for k in total_intersect_key}
output:
{1: [488, 61, 1.15], 2: [336, 0, 0.11]}
For the other cases, just reduce the elements of lst:
lst = [dict_a, dict_b]
a_b_intersect = {k: [item[k] for item in lst] for k in set(dict_a) & set(dict_b)}
You can also convert it to a function:
from functools import reduce  # built in on Python 2, needs this import on Python 3

def intersect(lst):
    return {k: [item[k] for item in lst if k in item] for k in reduce(lambda x, y: set(x) & set(y), lst)}
example:
>>> a
{1: 488, 2: 336, 3: 315, 4: 291, 5: 275}
>>> b
{1: 61, 2: 0, 3: 33, 5: 90, 15: 58}
>>> c
{1: 1.15, 2: 0.11, 9: 0, 15: 0.86, 19: 0.008, 20: 1834}
>>> intersect( [a,b] )
{1: [488, 61], 2: [336, 0], 3: [315, 33], 5: [275, 90]}
>>> intersect( [a,c] )
{1: [488, 1.15], 2: [336, 0.11]}
>>> intersect( [b,c] )
{1: [61, 1.15], 2: [0, 0.11], 15: [58, 0.86]}
>>> intersect( [a,b,c] )
{1: [488, 61, 1.15], 2: [336, 0, 0.11]}
-----update-----
def func(lst, intersection):
    if intersection:
        return {k: [item[k] for item in lst if k in item] for k in reduce(lambda x, y: set(x) & set(y), lst)}
    else:
        return {k: [item[k] for item in lst if k in item] for k in reduce(lambda x, y: set(x).difference(set(y)), lst)}
>>> func([a,c],False)
{3: [315], 4: [291], 5: [275]}
>>> func([a,b],False)
{4: [291]}
>>> func( [func([a,b],False),func([a,c],False)],True)
{4: [[291], [291]]}
One issue: the final result contains duplicates, so you need to remove them afterwards (or improve func itself):
{k:set( reduce( lambda x,y:x+y, v) ) for k,v in func( [func([a,b],False),func([a,c],False)],True).iteritems()}
{4: set([291])}
I hope this might help
dict_a= {1: 488, 2: 336, 3: 315, 4: 291, 5: 275}
a = set(dict_a)
dict_b={2: 0, 3: 33, 1: 61, 5: 90, 15: 58}
b = set( dict_b)
dict_c= {1: 1.15, 9: 0, 2: 0.11, 15: 0.86, 19: 0.008, 20: 1834}
c = set( dict_c )
a_intersect_b = a & b
a_intersect_c = a & c
b_intersect_c = b & c
a_interset_b_intersect_c = a_intersect_b & c
total_intersect = {}
for key in a_interset_b_intersect_c:
    total_intersect[key] = {dict_a[key], dict_b[key], dict_c[key]}
print(total_intersect)
# Note: the pairwise dicts below also include keys shared by all three dictionaries
a_b_only_intersect = {}
for key in a_intersect_b:
    a_b_only_intersect[key] = {dict_a[key], dict_b[key]}
print(a_b_only_intersect)
b_c_only_intersect = {}
for key in b_intersect_c:
    b_c_only_intersect[key] = {dict_b[key], dict_c[key]}
print(b_c_only_intersect)
a_c_only_intersect = {}
for key in a_intersect_c:
    a_c_only_intersect[key] = {dict_a[key], dict_c[key]}
print(a_c_only_intersect)
Similarly, you can find the leftovers in a, b and c using the set "difference" operation.
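A sketch of how the difference step might look for the A&B-only keys and the A leftovers. I am assuming "A&B only" means keys in A and B that are not in C, as in the question's expected output. The names a_b_only and a_leftover_keys are just for illustration:

```python
dict_a = {1: 488, 2: 336, 3: 315, 4: 291, 5: 275}
dict_b = {2: 0, 3: 33, 1: 61, 5: 90, 15: 58}
dict_c = {1: 1.15, 9: 0, 2: 0.11, 15: 0.86, 19: 0.008, 20: 1834}

a, b, c = set(dict_a), set(dict_b), set(dict_c)

# Keys shared by A and B but NOT present in C
a_b_only = (a & b) - c
# Keys that appear in A and in neither B nor C
a_leftover_keys = a - b - c

# Lists (rather than sets) keep each value in its A, B position
a_b_only_intersect = {k: [dict_a[k], dict_b[k]] for k in a_b_only}
dict_a_leftover = {k: dict_a[k] for k in a_leftover_keys}

print(a_b_only_intersect)
print(dict_a_leftover)
```

The same pattern, with the set names swapped, should give the A&C and B&C dictionaries and the leftovers for B and C.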
Can anyone tell me how I can get my code to produce the desired outputs below? Cheers
def dict_invert(d):
    inv = {}
    for k, v in d.iteritems():
        keys = inv.setdefault(v, [])
        keys.append(k)
    return inv
my input1: >>> dict_invert({30000: 30, 600: 30, 2: 10})
my output1: >>> {10: [2], 30: [30000, 600]}
desired output1 >>> {10: [2], 30: [600, 30000]}
my input2: >>> dict_invert({0: 9, 9: 9, 5: 9})
my output2: >>> {9: [0, 9, 5]}
desired output2: >>> {9: [0, 5, 9]}
You can use a collections.defaultdict to group the keys of the input dictionary into lists by the values of the input dictionary:
from collections import defaultdict
def dict_invert(d):
    dd = defaultdict(list)
    for k in d:
        dd[d[k]].append(k)
    return {k: sorted(dd[k]) for k in dd}
>>> dict_invert({30000: 30, 600: 30, 2: 10})
{10: [2], 30: [600, 30000]}
>>> dict_invert({0: 9, 9: 9, 5: 9})
{9: [0, 5, 9]}
So, for these examples, this produces the output that you wanted. It's not clear whether you also want the resultant dictionary to be sorted by key. In the examples above, the keys appear sorted, but they are not really because a dictionary has no inherent order.
>>> dict_invert({30000: 30, 600: 30, 2: 10, 1234: -1})
{10: [2], 30: [600, 30000], -1: [1234]}
If you want the keys to be ordered, take a look at collections.OrderedDict.
from collections import OrderedDict, defaultdict

def dict_invert(d):
    dd = defaultdict(list)
    for k in d:
        dd[d[k]].append(k)
    return OrderedDict(sorted((k, sorted(dd[k])) for k in dd))
>>> dict_invert({30000: 30, 600: 30, 2: 10, 1234: -1})
OrderedDict([(-1, [1234]), (10, [2]), (30, [600, 30000])])
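On Python 3.7 and later a plain dict preserves insertion order, so a variant along these lines may be enough without OrderedDict. This is a sketch only, and dict_invert_sorted is a made-up name to keep it apart from the versions above:

```python
from collections import defaultdict


def dict_invert_sorted(d):
    """Invert d, with sorted key lists and result keys inserted in sorted order."""
    groups = defaultdict(list)
    for key, value in d.items():
        groups[value].append(key)
    # Inserting in sorted order gives an ordered result on Python 3.7+
    return {value: sorted(groups[value]) for value in sorted(groups)}


inverted = dict_invert_sorted({30000: 30, 600: 30, 2: 10, 1234: -1})
print(inverted)
```

On older interpreters the ordering of the result is not guaranteed, so OrderedDict remains the safer choice there.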
function dict_invert(obj){
    var objArr = {};
    for(var key in obj){
        if(!objArr[obj[key]]){
            objArr[obj[key]] = [];
        }
        objArr[obj[key]].push(key);
    }
    return objArr;
}
var objArr = dict_invert({30000: 30, 600: 30, 2: 10});
I have a list of data of the form:
[line1,a]
[line2,c]
[line3,b]
I want to use a mapping of a=5, c=15, b=10:
[line1,5]
[line2,15]
[line3,10]
I am trying to use this code, which I know is incorrect. Can someone guide me on how best to achieve this?
mapping = {"a": 5, "b": 10, "c": 15}
applyMap = [line[1] = 'a' for line in data]
Thanks
EDIT:
Just to clarify, here is the input and output for one line; I want this mapping applied to all lines in the file:
Input: ["line1","a"]
Output: ["line1",5]
You could try with a list comprehension.
lines = [
["line1", "much_more_items1", "a"],
["line2", "much_more_items2", "c"],
["line3", "much_more_items3", "b"],
]
mapping = {"a": 5, "b": 10, "c": 15}
# here I assume the key you need to replace is at the last position of each item
result = [line[0:-1] + [mapping[line[-1]]] for line in lines]
Try something like this:
data = [
['line1', 'a'],
['line2', 'c'],
['line3', 'b'],
]
mapping = {"a": 5, "b": 10, "c": 15}
applyMap = [[line[0], mapping[line[1]]] for line in data]
print(applyMap)
>>> data = [["line1", "a"], ["line2", "b"], ["line3", "c"]]
>>> mapping = { "a": 5, "b": 10, "c": 15}
>>> [[line[0], mapping[line[1]]] for line in data]
[['line1', 5], ['line2', 10], ['line3', 15]]
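One thing to watch: mapping[line[1]] raises KeyError if a line carries a key that is not in mapping. If that can happen in your file, a variant with dict.get might help. This is a sketch that assumes unmapped entries should be left unchanged; the "z" entry and the name mapped are invented for the example:

```python
data = [["line1", "a"], ["line2", "z"], ["line3", "c"]]
mapping = {"a": 5, "b": 10, "c": 15}

# dict.get returns its second argument when the key is missing,
# so entries without a mapping are passed through as they are
mapped = [[name, mapping.get(key, key)] for name, key in data]
print(mapped)
```

If a missing key should be treated as an error instead, keeping the plain mapping[...] lookup is the simpler choice.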
lineMap = {'line1': 'a', 'line2': 'b', 'line3': 'c'}
cha2num = {'a': 5, 'b': 10, 'c': 15}
result = [[key,cha2num[lineMap[key]]] for key in lineMap]
print(result)
What you need is a mapping that relates 'a' to 5.