Fastest way to generate dictionaries from a pandas df without to_dict - python

I am trying to go from this dataframe:
   run property  low  high  abs1perc0  in1out0  weight
0  bob        a    5     9          1        1       2
1  bob        s    5     9          1        1       2
2  bob        d    1    10          0        1       2
3  tom        a    1     2          1        1       2
4  tom        s    2     3          1        1       2
5  tom        d    8     9          0        1       2
to dictionaries named after a concatenation of each individual 'run' name and each column name (except property). Property has to become the key and the data has to become the values, i.e.:
boblow = {'a':5, 's':5, 'd':1}
bobhigh = {'a':9, 's':9, 'd':10}
bobabs1perc0 = {'a':1, 's':1, 'd':0}
...
tomlow = {'a':1, 's':2, 'd':8}
...
This would have to happen to huge dfs and I can't wrap my head around how to do it other than by hand. I started making a list of concatenated names of individual values of the 'run' column but I'm certain someone here has a much faster and smarter way of doing it.
Thanks a Bunch!!

I recommend saving the output as a dict of dicts, and keeping the (run, column) tuple as the key instead of merging it into a single string. After reshaping your df, to_dict still works:
d=df.set_index(['run','property']).stack().unstack(1).to_dict('index')
{('bob', 'low'): {'a': 5, 'd': 1, 's': 5}, ('bob', 'high'): {'a': 9, 'd': 10, 's': 9}, ('bob', 'abs1perc0'): {'a': 1, 'd': 0, 's': 1}, ('bob', 'in1out0'): {'a': 1, 'd': 1, 's': 1}, ('bob', 'weight'): {'a': 2, 'd': 2, 's': 2}, ('tom', 'low'): {'a': 1, 'd': 8, 's': 2}, ('tom', 'high'): {'a': 2, 'd': 9, 's': 3}, ('tom', 'abs1perc0'): {'a': 1, 'd': 0, 's': 1}, ('tom', 'in1out0'): {'a': 1, 'd': 1, 's': 1}, ('tom', 'weight'): {'a': 2, 'd': 2, 's': 2}}
d[('bob','low')]
{'a': 5, 'd': 1, 's': 5}
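If the flat, concatenated names from the question are still wanted, a minimal sketch that skips to_dict could group by run and zip each value column against property. The DataFrame is rebuilt from the question's sample; the names value_cols and result are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "run": ["bob", "bob", "bob", "tom", "tom", "tom"],
    "property": ["a", "s", "d", "a", "s", "d"],
    "low": [5, 5, 1, 1, 2, 8],
    "high": [9, 9, 10, 2, 3, 9],
    "abs1perc0": [1, 1, 0, 1, 1, 0],
    "in1out0": [1, 1, 1, 1, 1, 1],
    "weight": [2, 2, 2, 2, 2, 2],
})

# Every column except the two identifier columns holds values.
value_cols = [c for c in df.columns if c not in ("run", "property")]

# One dict per (run, column) pair, stored under the concatenated name.
result = {}
for run, group in df.groupby("run", sort=False):
    keys = group["property"].tolist()
    for col in value_cols:
        result[run + col] = dict(zip(keys, group[col].tolist()))

print(result["boblow"])
```

Keeping everything in one result dict, rather than creating variables named boblow, bobhigh and so on, means a lookup is just result["boblow"].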

Related

find difference of values between 2 array of objects in python

I have 2 arrays of objects:
a = [{'a': 1, 'b': 2}, {'c': 3, 'd': 4}, {'e': 5, 'f': 6}]
b = [{'a': 1, 'b': 2}, {'g': 3, 'h': 4}, {'f': 6, 'e': 5}]
Output:
a - b = [{'c': 3, 'd': 4}] ("-" symbol is only for representation, showing difference. Not mathematical minus.)
b - a = [{'g': 3, 'h': 4}]
In each array, the order of the keys may be different. I could try the following and check for that:
for i in range(len(a)):
    current_val = a[i]
    for x, y in current_val.items():
        # search for key x in array b and compare its value with y
        pass
but this approach doesn't feel right. Is there a simpler way to do this, or a utility library that can do it, similar to fnc or pydash?
You can use lambda:
g = lambda a,b : [x for x in a if x not in b]
g(a,b) # a-b
[{'c': 3, 'd': 4}]
g(b,a) # b-a
[{'g': 3, 'h': 4}]
Just test whether each element is in the other array:
a = [{'a': 1, 'b': 2}, {'c': 3, 'd': 4}, {'e': 5, 'f': 6}]
b = [{'a': 1, 'b': 2}, {'g': 3, 'h': 4}, {'f': 6, 'e': 5}]
def find_diff(array_a, array_b):
    diff = []
    for e in array_a:
        if e not in array_b:
            diff.append(e)
    return diff
print(find_diff(a, b))
print(find_diff(b, a))
The same with a list comprehension:
def find_diff(array_a, array_b):
    return [e for e in array_a if e not in array_b]
Here is the code for subtracting lists of dictionaries:
a = [{'a': 1, 'b': 2}, {'c': 3, 'd': 4}, {'e': 6, 'f': 6}]
b = [{'a': 1, 'b': 2}, {'g': 3, 'h': 4}, {'f': 6, 'e': 6}]
a_b = []
b_a = []
for element in a:
    if element not in b:
        a_b.append(element)
for element in b:
    if element not in a:
        b_a.append(element)
print("a-b =", a_b)
print("b-a =", b_a)

How to change specific values in dictionary regardless of the keys in python?

How can I change the values in a dictionary in python regardless of the keys? If we take the following dictionary for our example:
d = {'a': 1, 'b': 2, 'c': 0, 'd': 2, 'e': 1}
Now I want to manipulate the dictionary in the way that all the values 2 will be changed into 3 so the output would be:
d = {'a': 1, 'b': 3, 'c': 0, 'd': 3, 'e': 1}
I am sure the problem is very basic, but I somehow haven't managed to find the answer online.
Thank you in advance.
In [338]: d = {'a': 1, 'b': 2, 'c': 0, 'd': 2, 'e': 1}
In [339]: for k, v in d.items():
     ...:     if v == 2:
     ...:         d[k] = 3
     ...:
In [340]: d
Out[340]: {'a': 1, 'b': 3, 'c': 0, 'd': 3, 'e': 1}
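The loop above changes the dictionary in place. If a new dictionary is preferred and the original should stay untouched, a dict comprehension is one possible alternative. The name updated is illustrative.

```python
d = {'a': 1, 'b': 2, 'c': 0, 'd': 2, 'e': 1}

# Build a new dict, replacing every value 2 with 3 and keeping the rest.
updated = {k: (3 if v == 2 else v) for k, v in d.items()}

print(updated)
```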

How to convert a nested dictionary to pandas dataframe?

I have a dictionary "my_dict" in this format:
{'l1':{'c1': {'a': 0, 'b': 1, 'c': 2},
       'c2': {'a': 3, 'b': 4, 'c': 5}},
 'l2':{'c1': {'a': 0, 'b': 1, 'c': 2},
       'c2': {'a': 3, 'b': 4, 'c': 5}}
}
Currently, I am using pd.DataFrame.from_dict(my_dict, orient='index') and get a df like this:
                             c2                           c1
l1  {u'a': 3, u'c': 5, u'b': 4}  {u'a': 0, u'c': 2, u'b': 1}
l2  {u'a': 3, u'c': 5, u'b': 4}  {u'a': 0, u'c': 2, u'b': 1}
However, what I want is both l1/l2 and c1/c2 as indexes and a/b/c as columns.
Something like this:
       a  b  c
l1 c1  0  1  2
   c2  3  4  5
l2 c1  0  1  2
   c2  3  4  5
What's the best way to do this?
Consider a dictionary comprehension to build a dictionary with tuple keys. Then use pandas' MultiIndex.from_tuples. Below, ast is used only to rebuild your original dictionary from a string (ignore that step on your end).
import pandas as pd
import ast
origDict = ast.literal_eval("""
{'l1':{'c1': {'a': 0, 'b': 1, 'c': 2},
       'c2': {'a': 3, 'b': 4, 'c': 5}},
 'l2':{'c1': {'a': 0, 'b': 1, 'c': 2},
       'c2': {'a': 3, 'b': 4, 'c': 5}}
}""")
# DICTIONARY COMPREHENSION
newdict = {(k1, k2): v2 for k1, v1 in origDict.items()
                        for k2, v2 in v1.items()}
print(newdict)
# {('l1', 'c2'): {'c': 5, 'a': 3, 'b': 4},
#  ('l2', 'c1'): {'c': 2, 'a': 0, 'b': 1},
#  ('l1', 'c1'): {'c': 2, 'a': 0, 'b': 1},
#  ('l2', 'c2'): {'c': 5, 'a': 3, 'b': 4}}
# DATA FRAME ASSIGNMENT
df = pd.DataFrame([newdict[i] for i in sorted(newdict)],
                  index=pd.MultiIndex.from_tuples(sorted(newdict)))
print(df)
#        a  b  c
# l1 c1  0  1  2
#    c2  3  4  5
# l2 c1  0  1  2
#    c2  3  4  5
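Another possible route, shown here only as a sketch, is to build one small frame per outer key and let pd.concat stack them; when pd.concat is given a dict, its keys become the outer index level. The variable names my_dict and stacked are illustrative.

```python
import pandas as pd

my_dict = {
    'l1': {'c1': {'a': 0, 'b': 1, 'c': 2},
           'c2': {'a': 3, 'b': 4, 'c': 5}},
    'l2': {'c1': {'a': 0, 'b': 1, 'c': 2},
           'c2': {'a': 3, 'b': 4, 'c': 5}},
}

# Each inner dict becomes a frame with c1/c2 as rows and a/b/c as columns;
# concat then adds l1/l2 as the outer level of a MultiIndex.
stacked = pd.concat(
    {outer: pd.DataFrame.from_dict(inner, orient='index')
     for outer, inner in my_dict.items()}
)

print(stacked)
```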

Python: Pandas: making a dictionary from dataframe

I am trying to convert a dataframe to dictionary:
xtest_cat = xtest_cat.T.to_dict().values()
but it gives a warning :
Warning: DataFrame columns are not unique, some columns will be omitted
I checked the columns names of the dataframe(xtest_cat) :
len(list(xtest_cat.columns.values))
len(set(list(xtest_cat.columns.values)))
they are all unique.
Can anyone help me out ?
The warning comes from a non-unique index: transposing with .T turns the index into the columns. You can use reset_index to create a unique index:
xtest_cat = pd.DataFrame({'A':[1,2,3],
                          'B':[4,5,6],
                          'C':[7,8,9]})
xtest_cat.index = [0,1,1]
print (xtest_cat)
   A  B  C
0  1  4  7
1  2  5  8
1  3  6  9
print (xtest_cat.index.is_unique)
False
xtest_cat.reset_index(drop=True, inplace=True)
print (xtest_cat)
   A  B  C
0  1  4  7
1  2  5  8
2  3  6  9
xtest_cat = xtest_cat.T.to_dict().values()
print (xtest_cat)
dict_values([{'B': 4, 'C': 7, 'A': 1}, {'B': 5, 'C': 8, 'A': 2}, {'B': 6, 'C': 9, 'A': 3}])
You can also omit T and add parameter orient='index':
xtest_cat = xtest_cat.to_dict(orient='index').values()
print (xtest_cat)
dict_values([{'B': 4, 'C': 7, 'A': 1}, {'B': 5, 'C': 8, 'A': 2}, {'B': 6, 'C': 9, 'A': 3}])
orient='records' is better, because it returns a list of dicts directly:
xtest_cat = xtest_cat.to_dict(orient='records')
print (xtest_cat)
[{'B': 4, 'C': 7, 'A': 1}, {'B': 5, 'C': 8, 'A': 2}, {'B': 6, 'C': 9, 'A': 3}]
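As a small self-contained check, the sketch below rebuilds a frame with a duplicated index label and converts it with orient='records'. Because that orientation does not use the index at all, it should work here even without resetting the index first. The names frame and records are illustrative.

```python
import pandas as pd

# A frame whose index contains the label 1 twice.
frame = pd.DataFrame({'A': [1, 2, 3],
                      'B': [4, 5, 6],
                      'C': [7, 8, 9]},
                     index=[0, 1, 1])

print(frame.index.is_unique)

# One dict per row; the (non-unique) index is simply dropped.
records = frame.to_dict(orient='records')
print(records)
```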

python - Letter Count Dict

Write a Python function called LetterCount() which takes a string as an argument and returns a dictionary of letter counts.
The line:
print LetterCount("Abracadabra, Monsignor")
Should produce the output:
{'a': 5, 'c': 1, 'b': 2, 'd': 1, 'g': 1, 'i': 1, 'm': 1, 'o': 2, 'n': 2, 's': 1, 'r': 3}
I tried:
import collections
c = collections.Counter('Abracadabra, Monsignor')
print c
print list(c.elements())
The answer I am getting looks like this:
{'a': 4, 'r': 3, 'b': 2, 'o': 2, 'n': 2, 'A': 1, 'c': 1, 'd': 1, 'g': 1, ' ': 1, 'i': 1, 'M': 1, ',': 1, 's': 1}
['A', 'a','a','a','a','c','b','b','d','g', and so on
Okay, now with this code:
import collections
c = collections.Counter('Abracadabra, Monsignor'.lower())
print c
I am getting this:
{'a': 5, 'r': 3, 'b': 2, 'o': 2, 'n': 2, 'c': 1, 'd': 1, 'g': 1, ' ': 1, 'i': 1, ',': 1, 's': 1}
but the answer should be this:
{'a': 5, 'c': 1, 'b': 2, 'd': 1, 'g': 1, 'i': 1, 'm': 1, 'o': 2, 'n': 2, 's': 1, 'r': 3}
You are close. Note that in the task description, the case of the letters is not taken into account. They want {'a': 5}, where you have {'a': 4, 'A': 1}.
So you have to convert the string to lower case first (I'm sure you will find out how).
Use a dictionary for the letter count:
s = "string is an immutable object"
d = {}
for i in s:
    d[i] = d.get(i, 0) + 1
print(d)
Output:
{'a': 2, ' ': 4, 'c': 1, 'b': 2, 'e': 2, 'g': 1, 'i': 3, 'j': 1, 'm': 2, 'l': 1, 'o': 1, 'n': 2, 's': 2, 'r': 1, 'u': 1, 't': 3}
Maybe this, to count only letters (excluding spaces and punctuation), ignore case, and sort them alphabetically:
t = "The cat is out of the bag."
word_count = {}
for i in t.casefold():
    # isalpha() keeps letters only; isalnum() would also count digits
    if i.isalpha():
        word_count[i] = word_count.get(i, 0) + 1
for letter, count in sorted(word_count.items()):
    print(letter, count)
Output:
a 2
b 1
c 1
e 2
f 1
g 1
h 2
i 1
o 2
s 1
t 4
u 1
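Putting the pieces from the answers together, one possible sketch of the requested function lower-cases the string, keeps only letters, and lets collections.Counter do the counting. The name letter_count is illustrative (the task itself asks for LetterCount), and the snippet uses Python 3 print syntax.

```python
from collections import Counter


def letter_count(text):
    """Return a dict mapping each letter in `text` to its count, ignoring case."""
    # Lower-case first, then drop spaces, commas and anything else non-alphabetic.
    return dict(Counter(ch for ch in text.lower() if ch.isalpha()))


counts = letter_count("Abracadabra, Monsignor")
print(counts)
```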
