Fill dictionary with value from the same row, but different column - python

Lately I've been trying to map some values, so I'm trying to create a dictionary to do so. The odd thing is my DataFrame has a column made of lists, and DataFrames are always a bit awkward with lists. The DataFrame has the following structure:
rules procedure
['10','11','12'] 1
['13','14'] 2
['20','21','22','24'] 3
So I want to create a dictionary that maps '10' to 1, '14' to 2, and so on. I tried the following:
dicc=dict()
for j in df['rules']:
    for i,k in zip(j,df.procedure):
        dicc[i]=k
But that isn't making it. Probably something to do with indexes. What am I missing?
Edit: I'm trying to create a dictionary that maps the values '10', '11', '12' to 1; '13', '14' to 2; '20', '21', '22', '24' to 3, so if I type dicc['10'] I get 1, and if I type dicc['22'] I get 3. Obviously, the actual DataFrame is much bigger and I can't do it manually.

You can do it like this:
import pandas as pd
data = [[['10', '11', '12'], 1],
[['13', '14'], 2],
[['20', '21', '22', '24'], 3]]
df = pd.DataFrame(data=data, columns=['rules', 'procedure'])
d = {r : p for rs, p in df[['rules', 'procedure']].values for r in rs}
print(d)
Output
{'20': 3, '10': 1, '11': 1, '24': 3, '14': 2, '22': 3, '13': 2, '12': 1, '21': 3}
Notes:
The code {r : p for rs, p in df[['rules', 'procedure']].values for r
in rs} is a dictionary comprehension, the dictionary counterpart of a
list comprehension.
The expression df[['rules', 'procedure']].values is equivalent to
zip(df.rules, df.procedure): it yields (list, int) pairs, so the
rs variable is a list and p is an integer.
Finally, you iterate over the values of rs using the second for loop.
UPDATE
As suggested by @piRSquared, you can use zip:
d = {r : p for rs, p in zip(df.rules, df.procedure) for r in rs}
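For comparison, the explicit loop from the question can be repaired in the same spirit. The original zip(j, df.procedure) pairs the first element of each list with procedure 1, the second with procedure 2, and so on, instead of pairing each whole list with the procedure from its own row. A minimal sketch, assuming plain Python lists (rules and procedure, hypothetical stand-ins for df['rules'] and df['procedure']):

```python
# Hypothetical stand-ins for df['rules'] and df['procedure'].
rules = [['10', '11', '12'], ['13', '14'], ['20', '21', '22', '24']]
procedure = [1, 2, 3]

dicc = {}
# Pair each list of rules with the procedure from the same row.
for rule_list, proc in zip(rules, procedure):
    # Every rule in that row maps to the row's procedure.
    for rule in rule_list:
        dicc[rule] = proc

print(dicc['10'], dicc['14'], dicc['22'])  # 1 2 3
```

With a real DataFrame, zip(df['rules'], df['procedure']) should be able to take the place of zip(rules, procedure).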

Help from cytoolz
from cytoolz.dicttoolz import merge
merge(*map(dict.fromkeys, df.rules, df.procedure))
{'10': 1,
'11': 1,
'12': 1,
'13': 2,
'14': 2,
'20': 3,
'21': 3,
'22': 3,
'24': 3}
Note
I updated my post to mimic how @jpp passed multiple iterables to map. @jpp's answer is very good. Though I'd advocate for upvoting all useful answers, I wish I could upvote their answer again (-:

Using collections.ChainMap:
from collections import ChainMap
res = dict(ChainMap(*map(dict.fromkeys, df['rules'], df['procedure'])))
print(res)
{'10': 1, '11': 1, '12': 1, '13': 2, '14': 2,
'20': 3, '21': 3, '22': 3, '24': 3}
For many uses, the final dict conversion is not necessary:
A ChainMap class is provided for quickly linking a number of
mappings so they can be treated as a single unit. It is often much
faster than creating a new dictionary and running multiple update()
calls.
See also What is the purpose of collections.ChainMap?
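One behaviour worth knowing if the same rule can appear in more than one row: a ChainMap lookup returns the value from the first mapping that contains the key, whereas a dictionary comprehension keeps the last value assigned. A small sketch with made-up data (the duplicated key '11' is an assumption for illustration and is not in the question's DataFrame):

```python
from collections import ChainMap

# Hypothetical rows in which '11' appears twice.
rules = [['10', '11'], ['11', '12']]
procedure = [1, 2]

# ChainMap: the first mapping containing the key wins.
chained = dict(ChainMap(*map(dict.fromkeys, rules, procedure)))

# Dict comprehension: later rows overwrite earlier ones.
comprehended = {r: p for rs, p in zip(rules, procedure) for r in rs}

print(chained['11'])       # 1
print(comprehended['11'])  # 2
```

If every rule belongs to exactly one row, as in the question, the two approaches give the same mapping.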

You can flatten the lists:
dict(zip(sum(df.rules.tolist(),[]),df.procedure.repeat(df.rules.str.len())))
Out[60]:
{'10': 1,
'11': 1,
'12': 1,
'13': 2,
'14': 2,
'20': 3,
'21': 3,
'22': 3,
'24': 3}

Using itertools.chain and DataFrame.itertuples:
from itertools import chain
dict(
    chain.from_iterable(
        ((rule, row.procedure) for rule in row.rules) for row in df.itertuples()
    )
)

Related

random.choice appears to sometimes return keys that are not there

The following code
import random
cards = {'A': 1, '2': 2, '3': 3, '4': 4, '5': 5, '6': 6, '7': 7, '8': 8, '9': 9, '10': 10, 'J': 10, 'Q': 10, 'K': 10}
def random_cards(n):
    dealt_cards = []
    for i in range(n):
        dealt_cards += random.choice(list(cards.keys()))
    print(dealt_cards)
produces this result
>>> random_cards(4)
['A', '4', '1', '0', '3']
>>>
The 1 isn't present in cards.keys(), and it doesn't come from the value of the A key. I tested this with setting the value to 100 and it still placed the 1.
Is random.choice broken?
If not, what is broken in my program?
Your error stems from the fact that, to add to a list, you don't use +=, but append. The correct code is:
def random_cards(n):
    dealt_cards = []
    for _ in range(n):
        dealt_cards.append(random.choice(list(cards.keys())))
    print(dealt_cards)
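This is because += on a list behaves like list.extend: it accepts any iterable and adds each of its elements separately. A string is an iterable of its characters, so dealt_cards += '10' adds '1' and '0'. That is, after x = ['c'], the statement x += 'ab' leaves x == ['c', 'a', 'b']. (Plain + is stricter: ['c'] + 'ab' raises a TypeError.)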
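The difference is easy to check in isolation. A minimal sketch (the variable names dealt_extend and dealt_append are made up for this example):

```python
dealt_extend = []
dealt_extend += '10'        # extends the list with each character of the string

dealt_append = []
dealt_append.append('10')   # adds the whole string as one element

print(dealt_extend)  # ['1', '0']
print(dealt_append)  # ['10']

# Plain + between a list and a str is not defined at all.
try:
    ['c'] + 'ab'
except TypeError as exc:
    print('TypeError:', exc)
```

This is why the two-character key '10' shows up as separate '1' and '0' entries in the question's output.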

Creating a Pandas DataFrame from list of dictionaries? Each dictionary as row in DataFrame?

I have been through several posts; however, I am unable to work out how to use each dictionary within a list of dictionaries to create a row in a pandas DataFrame. Specifically, I have two issues that my limited experience with dictionaries is unable to work around.
So far I have separated each key and value into two columns; however, what I am looking for is to create a row for each dictionary and use the keys as the column names.
Only the first key in each dictionary is unique, so I would like either to drop it completely or to use the key only as a value to populate a column named "id".
Example List of Dictionaries (>500k in total):
pep_list=[{'HV404': 'WVLSQVQLQESGPGLVKPSGTLSLTCAVSGGSISSSNWWSWVR',
'gene': 'HV404',
'aa_comp': {'W': 4,
'V': 5,
'L': 5,
'S': 10,
'Q': 3,
'E': 1,
'G': 5,
'P': 2,
'K': 1,
'T': 2,
'C': 1,
'A': 1,
'I': 1,
'N': 1,
'R': 1},
'peptide': ['WVLSQVQLQESGPGLVKPSGTLSLTCAVSGGSISSSNWWSWVR'],
'Length': 43,
'z': 3,
'Mass': 4557,
'm/z': 1519.0},
{'A0A0G2JNQ3': 'ISGNTSR',
'gene': 'A0A0G2JNQ3',
'aa_comp': {'I': 1, 'S': 2, 'G': 1, 'N': 1, 'T': 1, 'R': 1},
'peptide': ['ISGNTSR'],
'Length': 7,
'z': 2,
'Mass': 715,
'm/z': 357.5},etc.]
Expected output:
Dataframe = pd.DataFrame({values from dictionaries}, columns=["id", "gene", 'aa_comp', 'peptide', 'length', 'z', 'mass', 'm/z'])
id              columns of keys
dictionary 1    values in separate columns
dictionary 2    values in separate columns
Thank you for any insight!
Whatever these things are
{'HV404': 'WVLSQVQLQESGPGLVKPSGTLSLTCAVSGGSISSSNWWSWVR',}
{'A0A0G2JNQ3': 'ISGNTSR',}
are messing it up, plus it doesn't look like they are needed because the info is repeated.
If you want to take out a non-representative key you can do something like this
key_intersect = set(pep_list[0].keys()).intersection(set(pep_list[1].keys()))
new_list_of_dictionaries = [{key:value for (key,value) in dicts.items() if key in key_intersect} for dicts in pep_list]
df = pd.DataFrame(new_list_of_dictionaries)
Pretty compact code, but you could unroll it into loops if needed. Beware of blindly removing the first element: before Python 3.7 a plain dict does not guarantee insertion order, so the first element is not guaranteed to be the unique key.
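If the unique key should be kept as an "id" column rather than dropped, the same intersection idea can be extended. A rough sketch, assuming every record has exactly one key that the others lack; the two records below are shortened, and the names common_keys and rows are made up for this example:

```python
# Two shortened records shaped like the entries in pep_list.
pep_list = [
    {'HV404': 'WVLSQVQLQESGPGLVKPSGTLSLTCAVSGGSISSSNWWSWVR',
     'gene': 'HV404', 'Length': 43, 'z': 3, 'Mass': 4557, 'm/z': 1519.0},
    {'A0A0G2JNQ3': 'ISGNTSR',
     'gene': 'A0A0G2JNQ3', 'Length': 7, 'z': 2, 'Mass': 715, 'm/z': 357.5},
]

# Keys shared by every record become the columns.
common_keys = set(pep_list[0]).intersection(*pep_list[1:])

rows = []
for record in pep_list:
    row = {key: value for key, value in record.items() if key in common_keys}
    # Whatever key is left over is the record-specific one; store it as "id".
    extra_keys = [key for key in record if key not in common_keys]
    row['id'] = extra_keys[0] if extra_keys else None
    rows.append(row)

print(rows[0]['id'], rows[1]['id'])  # HV404 A0A0G2JNQ3
# pd.DataFrame(rows) would then give one row per dictionary.
```

Intersecting over all records is more robust than comparing only the first two, at the cost of one pass over the list.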

How to order a dictionary based on the keys? [duplicate]

This question already has answers here:
How do I sort a dictionary by key?
(32 answers)
Closed 2 years ago.
I have a dictionary in which the keys represent items and the values represent their counts.
For example, in the dictionary below:
dict= {'11': 4, '0': 2, '65': 1, '88': 1, '12': 1, '13': 1}
'11' occurred 4 times
'0' occurred 2 times
'65' occurred 1 time
How can I order the dictionary so that dict.keys() is in ascending or descending order?
The ideal output would be either
dict={'0':2,'11':4,'12':1,'13':1,'65':1,'88':1}
or
dict={'88':1,'65':1,'13':1,'12':1,'11':4,'0':2}
Any help would be appreciated
score = {'eng': 33, 'sci': 85, 'math': 60}
You can do it like this:
score_sorted = sorted(score.items(), key=lambda x:x[0])
If you want to sort by value instead, use score_sorted = sorted(score.items(), key=lambda x:x[1]). You can also add reverse=True to reverse the order.
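Note that sorted() applied to score.items() returns a list of (key, value) tuples rather than a dictionary. A short sketch of turning it back into a dict, which relies on dictionaries preserving insertion order (Python 3.7+); the name score_dict is made up for this example:

```python
score = {'eng': 33, 'sci': 85, 'math': 60}

# sorted() on items() gives a list of (key, value) tuples.
score_sorted = sorted(score.items(), key=lambda x: x[0])
print(score_sorted)  # [('eng', 33), ('math', 60), ('sci', 85)]

# On Python 3.7+, dict() keeps the order of the sorted list.
score_dict = dict(score_sorted)
print(list(score_dict))  # ['eng', 'math', 'sci']
```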
Contrary to older posts, dictionaries are no longer unordered: they preserve insertion order as of CPython 3.6 (unofficially, as a C implementation detail) and Python 3.7 (officially), so a dictionary can be rebuilt in sorted order.
To sort by key use a dictionary comprehension to build a new dictionary in the order desired. If you want to sort by string collation order, use the following, but note that '2' comes after '11' as a string:
>>> d = {'11': 4, '2': 2, '65': 1, '88': 1, '12': 1, '13': 1}
>>> {k:d[k] for k in sorted(d)}
{'11': 4, '12': 1, '13': 1, '2': 2, '65': 1, '88': 1}
To order by integer value, pass a key function that converts the string to an integer:
>>> {k:d[k] for k in sorted(d,key=lambda x: int(x))}
{'2': 2, '11': 4, '12': 1, '13': 1, '65': 1, '88': 1}
To reverse the order, pass reverse=True or simply negate the integer:
>>> {k:d[k] for k in sorted(d,key=lambda x: -int(x))}
{'88': 1, '65': 1, '13': 1, '12': 1, '11': 4, '2': 2}
With older Python versions convert the dictionary to a list with list(d.items()) and use similar sorting.
myDict= {'11': 4, '0': 2, '65': 1, '88': 1, '12': 1, '13': 1}
sortDict = {}
for i in sorted(myDict.keys()) :
    sortDict[i] = myDict[i]
print(sortDict)
dict= {'11': 4, '0': 2, '65': 1, '88': 1, '12': 1, '13': 1}
You can try dictionary comprehension like this
sorted_dict={k:dict[k] for k in sorted(dict)}
print(sorted_dict)
Note: Don't use dict as a variable name, as it shadows the built-in dict type.
your_dict = {'11': 4, '0': 2, '65': 1, '88': 1, '12': 1, '13': 1} is better.
You can use sample_list = list(your_dict.items()), which converts the given dict into a list of (key, value) tuples.
The dictionary method items() returns a view of all the dictionary's key-value pairs.
Use sample_list.sort() to sort the list.
To sort in reverse order, pass reverse=True:
sample_list = list(your_dict.items())
sample_list.sort(reverse = True)
Then use sorted_dict = dict(sample_list) to convert it back into a dictionary and print it out.

python - how to iterate through values of dict

I have a dict that I want to iterate through to find all values that contain a given key. My output would be a separate dict containing the numbers from each such value, without the key itself, or each specific key's values in the final
dict_in = {'6': ['2,9,8,10'], '1': ['3,5,8,9,10,12'], '4': ['2,5,7,8,9'], '2': ['3,4,7,6,13'], '12': ['1,7,5,9'], '3': ['9,11,10,1,2,13'], '10': ['1,3,6,11'], '5': ['4,1,7,11,12'], '13': ['2,3'], '8': ['1,6,4,11'], '7': ['5,2,4,9,12'], '11': ['3,5,10,8'], '9': ['12,1,3,6,4,7']}
so the output would be like this:
{'6':['3,4,7,13,1,3,11,1,4,11,12,1,3,4,7'] , '4':['3,4,6,13,1,11,12,1,6,11,12,12,1,3,6'],'13': ['4,7,6,9,11,10,1']}
I am a beginner and I do not even know where to start. Would it be easier to convert it to a list of lists?
There is one thing about your problem that is an extra challenge, which I'll leave for you or some other good Samaritan to solve; this is just a nudge in the right direction. Each of your values is actually a list holding a single string. If the values were lists of actual integers, the problem would not be too complicated. Also note that the expected output you wrote is mistyped: you missed a few values.
Assuming lists of integers instead of strings, here is one approach that you can hopefully follow as a beginner:
dict_in = {'6': [2,9,8,10], '1': [3,5,8,9,10,12], '4': [2,5,7,8,9], '2': [3,4,7,6,13], '12': [1,7,5,9], '3': [9,11,10,1,2,13], '10': [1,3,6,11], '5': [4,1,7,11,12], '13': [2,3], '8' :[1,6,4,11], '7': [5,2,4,9,12], '11': [3,5,10,8], '9': [12,1,3,6,4,7]}
dict_out = {}
for key in dict_in:
    if key == "6" or key == "4" or key == "13":
        for k,v in dict_in.items():
            for y in v:
                if int(key) in v and y != int(key):
                    dict_out.setdefault(key, []).append(y)
print(dict_out)
Output:
{'6': [3, 4, 7, 13, 1, 3, 11, 1, 4, 11, 12, 1, 3, 4, 7], '4': [3, 7, 6, 13, 1, 7, 11, 12, 1, 6, 11, 5, 2, 9, 12, 12, 1, 3, 6, 7], '13': [3, 4, 7, 6, 9, 11, 10, 1, 2]}
One last note: I have no idea why you wanted only the keys 6, 4 and 13 in the result.
In any case, do not consider this as a full answer.
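For the string-to-integer step that the answer leaves open, one possible approach is to split each comma-separated string and convert the pieces. A minimal sketch on a shortened sample in the question's original format (the name dict_ints is made up for this example):

```python
# Shortened sample in the question's original format.
dict_in = {'6': ['2,9,8,10'], '13': ['2,3'], '2': ['3,4,7,6,13']}

# Split each comma-separated string and convert the pieces to int.
dict_ints = {
    key: [int(n) for chunk in values for n in chunk.split(',')]
    for key, values in dict_in.items()
}

print(dict_ints)
# {'6': [2, 9, 8, 10], '13': [2, 3], '2': [3, 4, 7, 6, 13]}
```

The result has the list-of-integers shape that the loop above expects as its dict_in.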
Not sure what you mean, so I just created something that covers most of it. Change it to suit what you want.
dict_in = {'6': ['2,9,8,10'], '1': ['3,5,8,9,10,12'], '4': ['2,5,7,8,9'], '2': ['3,4,7,6,13'], '12': ['1,7,5,9'], '3': ['9,11,10,1,2,13'], '10': ['1,3,6,11'], '5': ['4,1,7,11,12'], '13': ['2,3'], '8' :['1,6,4,11'], '7': ['5,2,4,9,12'], '11': ['3,5,10,8'], '9': ['12,1,3,6,4,7']}
dict_out = {}
for key, values in dict_in.items():
    for item in values:
        if True:#your condition
            if key in dict_out.keys():
                dict_out[key].append(item)
            else:
                dict_out[key] = [item]
print(dict_out)

How to Convert Dictionary Values to sets of Dictionaries

I am trying to solve some graph problems but I am stuck halfway. I have a Python dictionary of sets, and I would like to convert the original dictionary values (which are sets) into dictionaries, such that each element of a set becomes a key with a value of 1. I think this is what is called a nested dictionary, but I am not sure.
I looped through dict.values(), assigned each value to a variable xxx, and used dict.fromkeys(xxx, 1), which worked, but I am unable to integrate the result back into the original dictionary.
Here is an example of a dictionary:
d = {'35': {'1', '37', '36', '71'}, '37': {'1', '35'}}
I want the output to look like:
d = {35: {1 : 1, 37 : 1, 36 : 1, 71 : 1}, 37: {1 : 1, 35 : 1}}
As you can see, the original dictionary values have become dictionaries of their own, and the quotes ('') are gone.
Can someone please assist me or give me some pointers? Thank you.
You just need a nested dictionary comprehension:
def convert(mapping):
    return {key: {val: 1 for val in vals} for key, vals in mapping.items()}
print(convert({'35': {'1', '37', '36', '71'}, '37': {'1', '35'}}))
# {'35': {'1': 1, '37': 1, '36': 1, '71': 1}, '37': {'1': 1, '35': 1}}
You are almost there. Just wrap keys and values with int:
{int(k):dict.fromkeys(map(int, v), 1) for k, v in d.items()}
Output:
{35: {37: 1, 71: 1, 36: 1, 1: 1}, 37: {35: 1, 1: 1}}
