Finding the percentage in a list of values - python

I have a dictionary that has multiple values assigned to each key. For the list of values under each key, I am trying to find the percentage that fit the 'flexibility' criteria. Since the values are strings, it is throwing me for a loop (pun not intended). I am trying to get one value per key: the share of values that are either 'None' or 'Flexible' out of the total number of values in that key's list.
Basically if the dictionary looks like this:
dict1 = {'German' : ["None", "None" ,"Flexible", "Hard"],
"French" : ["Hard", "Hard", "Hard", "Hard"]
}
I want the code to give me this (rounding to 2 decimals is fine):
dict1 = {"German" : "0.75",
"French" : "0.00"
}
import pandas as pd

def course_prereq_flexibility(fn):
    df = pd.read_csv(fn)
    df2 = df[["area", "prereq_type"]].copy()

def percentages(df2):
    dict1 = {}
    for items in range(len(df2)):
        key = df2.iloc[items, 0]
        values = df2.iloc[items, 1]
        dict1.setdefault(key, [])
        dict1[key].append(values)
    dict1
I am a bit confused about where to go after creating the dictionary and would really appreciate a walkthrough of the steps I could take.

Without using pandas, it's reasonably straightforward to do this with just collections.Counter.
>>> from collections import Counter
>>> dict1 = {'German' : ["None", "None" ,"Flexible", "Hard"],
...          "French" : ["Hard", "Hard", "Hard", "Hard"]
... }
>>>
>>> {k: c
... for k, v in dict1.items()
... for c in (Counter(v),)}
{'German': Counter({'None': 2, 'Flexible': 1, 'Hard': 1}), 'French': Counter({'Hard': 4})}
>>> {k: (c['None'] + c['Flexible']) / len(v)
... for k, v in dict1.items()
... for c in (Counter(v),)}
{'German': 0.75, 'French': 0.0}
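For the CSV-based setup in the question, the intermediate dictionary of lists can be skipped by counting while the rows are read. The following is a minimal sketch, assuming the rows are available as (area, prereq_type) pairs; the `rows` sample and the name `flexible_share_by_area` are made up for illustration. With a DataFrame like `df2`, `df2.itertuples(index=False)` should yield pairs of that shape.

```python
from collections import Counter

# Labels counted as "flexible", taken from the question
FLEXIBLE = {"None", "Flexible"}

def flexible_share_by_area(rows):
    """Return {area: share of rows whose prereq_type is in FLEXIBLE}."""
    totals = Counter()
    flexible = Counter()
    for area, prereq_type in rows:
        totals[area] += 1
        if prereq_type in FLEXIBLE:
            flexible[area] += 1
    # A Counter returns 0 for areas that never had a flexible row
    return {area: round(flexible[area] / totals[area], 2) for area in totals}

# Hypothetical sample standing in for the two CSV columns
rows = [
    ("German", "None"), ("German", "None"), ("German", "Flexible"),
    ("German", "Hard"), ("French", "Hard"), ("French", "Hard"),
]
shares = flexible_share_by_area(rows)
print(shares)
```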

There are a number of ways to achieve this. The following is one example:
dict1 = {
    "German": ["None", "None", "Flexible", "Hard"],
    "French": ["Hard", "Hard", "Hard", "Hard"]
}
def percentage_in_list(input_list, elements_to_find=None):
    if elements_to_find is None:
        elements_to_find = ["None", "Flexible"]
    nr_found = len([x for x in input_list if x in elements_to_find])
    return (nr_found / len(input_list)) * 100
percentages = {k: percentage_in_list(v) for k,v in dict1.items()}
print(percentages)
The function percentage_in_list returns the percentage of values that match one of the values in elements_to_find, which defaults to "None" and "Flexible". Inside the function, a list comprehension keeps only the elements of input_list that are in elements_to_find, so the len of its result is the number of matching elements. Dividing this number by the length of the input list and multiplying by 100 gives the percentage (75.0 for "German"); drop the multiplication to get the fraction 0.75 instead.
In the main code, a dictionary comprehension is used to iterate over dict1 and call the function percentage_in_list for every value in the dictionary.
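If the result should match the format asked for in the question (fractions as two-decimal strings), a variant along these lines might work. This is only a sketch: the name `share_in_list` and the choice to return 0.0 for an empty list are assumptions, not part of the answer above.

```python
def share_in_list(values, wanted=("None", "Flexible")):
    """Return the fraction (0.0 to 1.0) of values that appear in wanted."""
    if not values:
        # Assumption: an empty list counts as 0.0 rather than raising
        return 0.0
    # True counts as 1 when summed, so this counts the matches
    return sum(v in wanted for v in values) / len(values)

dict1 = {
    "German": ["None", "None", "Flexible", "Hard"],
    "French": ["Hard", "Hard", "Hard", "Hard"],
}
# Format each fraction as a string with two decimals
formatted = {k: f"{share_in_list(v):.2f}" for k, v in dict1.items()}
print(formatted)
```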

Related

Get keys from a dictionary that contains duplicate list values, then store their corresponding keys in a nested list

I've tried to word this as well as I can, but it will probably be clearer if I provide an example of what I am trying to achieve:
Input:
source_dictionary = {"person1": ["x1","x2","x3","x4"],
"person2": ["x1","x2","x3","x4"],
"person3": ["x1","x2"],
"person4": ["x1","x2"],
"person5": ["x1","x2"]
}
Intended output:
[["person1","person2"],["person3","person4","person5"]]
Handling the lists in the dictionary is proving to be quite a challenge.
Apologies, I forgot to include what I have tried so far. As mentioned above, I am having issues with the lists:
rev_dict = {}
for key, value in source_dictionary.items():
    rev_dict.setdefault(value, set()).add(key)
result = [key for key, values in rev_dict.items()
          if len(values) > 1]
Assuming you want to join the keys by identical value, use a defaultdict:
source_dictionary = {"person1": ["x1","x2","x3","x4"],
"person2": ["x1","x2","x3","x4"],
"person3": ["x1","x2"],
"person4": ["x1","x2"],
"person5": ["x1","x2"]
}
from collections import defaultdict
d = defaultdict(list)
for key, value in source_dictionary.items():
    # lists are unhashable, so convert each one to a tuple to use it as a key
    d[tuple(value)].append(key)
out = list(d.values())
Alternative with setdefault:
d = {}
for key, value in source_dictionary.items():
    d.setdefault(tuple(value), []).append(key)
out = list(d.values())
output:
[['person1', 'person2'], ['person3', 'person4', 'person5']]
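The attempt in the question also filtered on `len(values) > 1`, which suggests that keys whose list is unique should be left out. A minimal sketch of that variant follows, assuming that reading is correct; the helper name `keys_sharing_values` and the extra `person6` entry are hypothetical.

```python
from collections import defaultdict

def keys_sharing_values(mapping):
    """Group keys by identical list value; keep only groups with 2+ keys."""
    groups = defaultdict(list)
    for key, values in mapping.items():
        # A tuple is hashable, so it can serve as the grouping key
        groups[tuple(values)].append(key)
    return [keys for keys in groups.values() if len(keys) > 1]

source_dictionary = {
    "person1": ["x1", "x2", "x3", "x4"],
    "person2": ["x1", "x2", "x3", "x4"],
    "person3": ["x1", "x2"],
    "person4": ["x1", "x2"],
    "person5": ["x1", "x2"],
    "person6": ["x9"],  # hypothetical entry with no duplicate
}
duplicates = keys_sharing_values(source_dictionary)
print(duplicates)
```

Note that `["x1", "x2"]` and `["x2", "x1"]` are treated as different values here; if order should not matter, `tuple(sorted(values))` could be used as the key instead.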
source_dictionary = {"person1": ["x1","x2","x3","x4"],
"person2": ["x1","x2","x3","x4"],
"person3": ["x1","x2"],
"person4": ["x1","x2"],
"person5": ["x1","x2"]
}
L = []
for i in source_dictionary.values():
    K = []
    for j in source_dictionary.keys():
        if source_dictionary[j] == i:
            K.append(j)
    if K not in L:
        L.append(K)
print(L)

Collect sub-elements in a json object python

So essentially I have a JSON object obtained through an API that looks similar to the one below, and I am wondering how I would collect sub-elements such as name and quantity and place them into an array/list.
{
    "item_one": {
        "name": "Item One",
        "weight": 0,
        "quantity": 1
    },
    "item_two": {
        "name": "Item Two",
        "weight": 0,
        "quantity": 23
    },
    "item_three": {
        "name": "Item Three",
        "weight": 0,
        "quantity": 53
    }
}
An example for what the desired output is would be the following:
nameLst = ['Item One', 'Item Two', 'Item Three']
quantityLst = ['1', '23', '53']
So far the only way I know how to do this would be to individually collect the quantity and name data by searching through all the specific items; this, however, would be impossible due to the sheer number of potential items.
You don't need to know the item names, you can simply loop over the keys of the dictionary and use those keys to query the JSON blob for each subdict.
namelst = []
quantitylst = []
for key in d.keys():
    subdict = d[key]
    namelst.append(subdict["name"])
    quantitylst.append(subdict["quantity"])
If you don't need the keys at any point, then you can loop over the values solely as Kelly Bundy mentions.
for v in d.values():
    namelst.append(v["name"])
    quantitylst.append(v["quantity"])
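The same loop can be written as two list comprehensions. The sketch below starts from the raw JSON text and converts the quantities to strings, since the desired output in the question shows them quoted; the variable names `raw`, `name_lst`, and `quantity_lst` are placeholders.

```python
import json

# Sample payload copied from the question
raw = """{
    "item_one": {"name": "Item One", "weight": 0, "quantity": 1},
    "item_two": {"name": "Item Two", "weight": 0, "quantity": 23},
    "item_three": {"name": "Item Three", "weight": 0, "quantity": 53}
}"""

d = json.loads(raw)

# One pass per field over the sub-dictionaries; the item keys are never needed
name_lst = [item["name"] for item in d.values()]
quantity_lst = [str(item["quantity"]) for item in d.values()]

print(name_lst)
print(quantity_lst)
```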
"So far the only way I know how to do this would be to individually collect the quantity and name data by searching through all the specific items; this, however, would be impossible due to the sheer number of potential items."
I imagine you're just saying that this would be hard to do by hand, and you could do something like this.
distinct_keys = {k for d in json_obj.values() for k in d}
# you seem to want to convert ints to strings?
# if so, consider (some_transform(d[k]) if k in d else None)
result = {k:[d.get(k, None) for d in json_obj.values()] for k in distinct_keys}
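To see how the comprehension handles missing data, here is a small worked example. The two-item `json_obj` is hypothetical, with "quantity" deliberately left out of the second item.

```python
# Hypothetical input: the second item has no "quantity" field
json_obj = {
    "item_one": {"name": "Item One", "quantity": 1},
    "item_two": {"name": "Item Two"},
}

# Every field name that appears in at least one item
distinct_keys = {k for d in json_obj.values() for k in d}

# One list per field, with None where an item lacks that field
result = {k: [d.get(k) for d in json_obj.values()] for k in distinct_keys}
print(result)
```

The lists stay aligned by position, so index 1 in every list refers to "item_two". The order of the keys in `result` follows set iteration order and should not be relied on.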
If you actually need to iterate through this thing one object at a time though, consider something like the following:
from collections import defaultdict
result = defaultdict(list)
for d in json_obj.values():
    # if you KNOW you don't have missing data
    # for k,v in d.items(): result[k].append(v)
    # you probably do have missing data though, so a cost proportional
    # to your key sizes is unavoidable starting from completely unprocessed
    # json data. you could save a little work, but here's the basic idea
    # the work we do is different based on which sets/maps have the
    # keys we're operating on
    s = set(d.keys())
    new_keys = s.difference(result)
    missing_keys = [k for k in result if k not in s]
    same_keys = s.intersection(result)
    # this doesn't necessarily have to be special cased, but it
    # allows us to guarantee result is non-empty everywhere else
    # and avoid some more special casing.
    if new_keys and not result:
        for k,v in d.items():
            result[k].append(v)
    else:
        # backfill new keys we found with None
        L = result[next(iter(result))]
        for key in new_keys:
            result[key] = [None]*len(L)
            result[key].append(d[key])
        # fill in None for the current object for any keys in result
        # that we don't have available
        for key in missing_keys:
            result[key].append(None)
        # for everything in both objects, just append the new data
        for key in same_keys:
            result[key].append(d[key])
Then if you really needed variables and not a dictionary you can explicitly store them that way.
for k,L in result.items():
    globals()[f'{k}Lst'] = L

Dictionary function to find the key,value pair with least value

How can I print the cheapest item in a dictionary whose keys are items and whose values are their prices?
I tried using an operator function for sorting, but it converts the dictionary to tuples, and then I am unable to display the dictionary key/value.
Is there any other approach?
You can use min with the dictionary's .items(), and pass the value of the pair to sort against.
>>> data = {'foo': 17.5, 'bar': 5.8, 'abc': 12.6}
>>> min(data.items(), key=lambda i: i[1])
('bar', 5.8)
Below are your answers:
Novice way:
shoes_list = {'adidas':1000, 'Nike':3000, 'local': 100}
cheapest = ""
for key in shoes_list:
    if cheapest == "" or shoes_list[cheapest] > shoes_list[key]:
        cheapest = key
print(cheapest)
Intermediate:
shoes_list = {'adidas':1000, 'Nike':3000, 'local': 100}
Cheapest = min(shoes_list, key=shoes_list.get)
print(Cheapest)
Most efficient way:
import operator
shoes_list = {'adidas':1000, 'Nike':3000, 'local': 100}
print(min(shoes_list.items(), key=operator.itemgetter(1))[0])
My initial thought was a dictionary comprehension:
>>> data = {'foo': 17.5, 'bar': 5.8, 'abc': 12.6}
>>> min_val = min(data.values())
>>> {k: v for k, v in data.items() if v == min_val}
{'bar': 5.8}
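However, CoryKramer's answer iterates over the dictionary only once, whereas mine needs two passes. On the other hand, the comprehension keeps every item that is tied for the lowest value, while min returns only the first one.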
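If ties matter and a single pass is preferred, the two ideas can be combined. The following is a rough sketch rather than a recommendation; the function name `cheapest_items` and the extra 'xyz' entry are invented for the example.

```python
def cheapest_items(prices):
    """Return a dict of all items tied for the lowest price, in one pass."""
    cheapest = {}
    min_price = None
    for item, price in prices.items():
        if min_price is None or price < min_price:
            # Found a new minimum: start the result over
            min_price = price
            cheapest = {item: price}
        elif price == min_price:
            # Tie with the current minimum: keep both
            cheapest[item] = price
    return cheapest

# 'xyz' is a hypothetical extra entry that ties with 'bar'
data = {'foo': 17.5, 'bar': 5.8, 'abc': 12.6, 'xyz': 5.8}
tied = cheapest_items(data)
print(tied)
```

An empty dictionary yields an empty result here, whereas calling min on an empty dictionary raises ValueError.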
#1) Convert the dictionary values into list,find the minimum
#2) Find the index value of the minimum value.
#3) Finally convert the list value to strings and print it.
#SOURCE CODE
dd = {'mobile1':10000, 'mobile2':11000, 'mobile3':13000, 'mobile4':9000, 'mobile5':15000, 'mobile6':16000, 'mobile7':17000, 'mobile8':18000, 'mobile9':19000}
k = list(dd.values())
d = {}
def get_key(val):
    for key, value in dd.items():
        if val == value:
            return key
mm = k[0]
for i in range(1, len(k)):
    if k[i] < mm:
        mm = k[i]
l = mm
index_value = list(dd.keys()).index(get_key(l))
f = list(list(dd.items())[index_value])
print(str(f[0]) + ":" + str(f[1]))

Python summing up values in a nested dictionary

I have a dictionary P which represents a dictionary within a dictionary within a dictionary. It looks something like this.
P={key1:{keyA:{value1: 1, value2:3}, keyB:{value1:3,value2:4}},
key2:{keyA:{value1: 1, value2:3}, keyB:{value1:3,value2:4}}, key3{...:{...:}}}
What I am trying to do is to rewrite each value of value1 and value2 as its share of the total population under its base key.
For example key1 should look like
key1:{keyA:{value1: 1/(1+3+3+4), value2:3/(1+3+3+4)}, keyB:
{value1:3/(1+3+3+4),value2:4/(1+3+3+4)}
What I am not sure about is how to iterate over this dictionary and only collect the innermost values of a certain key so I can then sum up all the values and divide each value by that sum.
This can be done in a single line using a dict comprehension and map, like this:
# from __future__ import division  # needed only on Python 2.x, where .items() below would also be .iteritems()
p = {"key1":{"keyA":{"value1": 1, "value2":3}, "keyB":{"value1":3,"value2":4}}}
p = {kOuter:{kInner:{kVal: vVal/sum(map(lambda x: sum(x.values()), vOuter.values())) for kVal, vVal in vInner.items()} for kInner, vInner in vOuter.items()} for kOuter, vOuter in p.items()}
A more readable version of the above:
p = {
    kOuter: {
        kInner: {
            kVal: vVal/sum(map(lambda x: sum(x.values()), vOuter.values())) for kVal, vVal in vInner.items()
        }
        for kInner, vInner in vOuter.items()
    }
    for kOuter, vOuter in p.items()
}
OUTPUT
>>> p
{'key1': {'keyA': {'value1': 0.09090909090909091, 'value2': 0.2727272727272727}, 'keyB': {'value1': 0.2727272727272727, 'value2': 0.36363636363636365}}}
The only problem with this is that the sum is recalculated for every innermost value. You can fix that by calculating the sum for each of key1, key2, ... before the dict comprehension and using the stored values instead, like this:
keyTotals = {kOuter:sum(map(lambda x: sum(x.values()), vOuter.values())) for kOuter, vOuter in p.items()}
and then you can simply look up the sums calculated above by key, like this:
p = {kOuter:{kInner:{kVal: vVal/keyTotals[kOuter] for kVal, vVal in vInner.items()} for kInner, vInner in vOuter.items()} for kOuter, vOuter in p.items()}
test = {"key1":{"keyA":{"value1": 1, "value2":3}, "keyB":{"value1":3,"value2":4}}}
for a in test:
    s = 0
    for b in test[a]:
        for c in test[a][b]:
            s += test[a][b][c]
    print(s)
    for b in test[a]:
        for c in test[a][b]:
            test[a][b][c] = test[a][b][c] / s
This should do what you want. I've only included "key1" in this example.
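The loop above overwrites the values in place. If the original dictionary should be kept, the same calculation can build a new one instead. This is a sketch under the assumption that each outer key has a non-zero total; the function name `normalize_by_outer_key` is made up.

```python
def normalize_by_outer_key(nested):
    """Return a new dict where each innermost value is divided by the
    total of all innermost values under the same outer key."""
    result = {}
    for outer_key, groups in nested.items():
        # Total computed once per outer key (assumed to be non-zero)
        total = sum(sum(inner.values()) for inner in groups.values())
        result[outer_key] = {
            group_key: {k: v / total for k, v in inner.items()}
            for group_key, inner in groups.items()
        }
    return result

p = {"key1": {"keyA": {"value1": 1, "value2": 3},
              "keyB": {"value1": 3, "value2": 4}}}
shares = normalize_by_outer_key(p)
print(shares["key1"]["keyA"]["value1"])  # 1 / (1 + 3 + 3 + 4)
```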

make a dict/json from string with duplicate keys Python

I have a string that could be parsed as a JSON or dict object. My string variable looks like this :
my_string_variable = """{
    "a":1,
    "b":{
        "b1":1,
        "b2":2
    },
    "b": {
        "b1":3,
        "b2":2,
        "b4":8
    }
}"""
When I do json.loads(my_string_variable), I have a dict but only the second value of the key "b" is kept, which is normal because a dict can't contain duplicate keys.
What would be the best way to have some sort of defaultdict like this :
result = {
"a": 1,
"b": [{"b1": 1, "b2": 2}, {"b1": 3, "b2": 2, "b4": 8}],
}
I have already looked for similar questions but they all deal with dicts or lists as an input and then create defaultdicts to handle the duplicate keys.
In my case I have a string variable and I would want to know if there is a simple way to achieve this.
Something like the following can be done.
import json
def join_duplicate_keys(ordered_pairs):
    d = {}
    for k, v in ordered_pairs:
        if k in d:
            if isinstance(d[k], list):
                # key already seen at least twice: extend the list
                d[k].append(v)
            else:
                # second occurrence: wrap both values in a list
                d[k] = [d[k], v]
        else:
            d[k] = v
    return d
raw_post_data = '{"a":1, "b":{"b1":1,"b2":2}, "b": { "b1":3, "b2":2,"b4":8} }'
newdict = json.loads(raw_post_data, object_pairs_hook=join_duplicate_keys)
print(newdict)
Please note that the code above relies on the value type (the isinstance(d[k], list) check). If a value in the original string is itself a list, a later duplicate of that key will be appended to it instead of being grouped alongside it, so some extra handling may be required to make the code robust.
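One way to sidestep the type check is to count how often each key occurs before deciding whether to wrap its values in a list. The sketch below is an assumption about how that could look, not a change to the answer above, and the sample string `raw` is made up to include values that are already lists.

```python
import json
from collections import Counter

def join_duplicate_keys(ordered_pairs):
    """Collect values of repeated keys into a list, based on how often the
    key occurs rather than on the type of the value."""
    pairs = list(ordered_pairs)
    counts = Counter(k for k, _ in pairs)
    d = {}
    for k, v in pairs:
        if counts[k] > 1:
            # Duplicated key: always gather its values in a list
            d.setdefault(k, []).append(v)
        else:
            # Unique key: keep the value as-is, even if it is a list
            d[k] = v
    return d

# Hypothetical input where some values are lists themselves
raw = '{"a": [1, 2], "b": {"b1": 1}, "b": [3], "b": {"b2": 2}}'
parsed = json.loads(raw, object_pairs_hook=join_duplicate_keys)
print(parsed)
```

With this approach a unique key whose value is a list and a duplicated key still both end up as lists, so the caller cannot tell them apart from the result alone.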
The accepted answer is perfectly fine. I just wanted to show another approach.
First, every key gets a list so that later values can simply be appended. At the end, pop is called on the lists that have only one item, since a single item means that the key had no duplicates:
import json
from collections import defaultdict
my_string_variable = '{"a":1, "b":{"b1":1,"b2":2}, "b": { "b1":3, "b2":2,"b4":8} }'
def join_duplicate_keys(ordered_pairs):
    d = defaultdict(list)
    for k, v in ordered_pairs:
        d[k].append(v)
    return {k: v.pop() if len(v) == 1 else v for k, v in d.items()}
d = json.loads(my_string_variable, object_pairs_hook=join_duplicate_keys)
print(d)
output:
{'a': 1, 'b': [{'b1': 1, 'b2': 2}, {'b1': 3, 'b2': 2, 'b4': 8}]}
