I'd like to sort a dictionary as shown below:
from
unsorted_list : {'F': 3, 'A': 1, 'B': 2, 'C': 2, 'E': 2, 'D': 1,}
to
sorted_list : {'D': 1, 'A': 1, 'E': 2, 'C': 2, 'B': 2, 'F': 3,}
That is, sort by value ascending, then by key descending.
How can I do this in Python?
Dictionaries preserve order of insertion, so if you want to re-order them, you need to build a new dict where you insert the items in the desired order. You can do this pretty easily by sorting the dict's items() and then passing the result to dict():
>>> d = {'F': 3, 'A': 1, 'B': 2, 'C': 2, 'E': 2, 'D': 1,}
>>> dict(sorted(d.items(), key=lambda i: (-i[1], i[0]), reverse=True))
{'D': 1, 'A': 1, 'E': 2, 'C': 2, 'B': 2, 'F': 3}
Note that if dictionary ordering is important to your use case, you might want to use an OrderedDict, which has more robust support for rearranging items in-place.
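The one-liner above negates the value, which only works when the values are numbers. As a rough sketch of an alternative (not part of the original answer), the same ordering can be built from two passes that rely on Python's sort being stable: sort by the secondary criterion first, then by the primary one.

```python
d = {'F': 3, 'A': 1, 'B': 2, 'C': 2, 'E': 2, 'D': 1}

# Pass 1: secondary criterion, key descending.
items = sorted(d.items(), key=lambda kv: kv[0], reverse=True)
# Pass 2: primary criterion, value ascending. The sort is stable,
# so items with equal values keep their order from pass 1.
items.sort(key=lambda kv: kv[1])

result = dict(items)
print(result)
```

This version does not care whether the values can be negated, at the cost of sorting twice.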
I have been through several posts, but I cannot work out how to use each dictionary within a list of dictionaries to create a row in a pandas DataFrame. Specifically, I have two issues that my limited experience with dictionaries cannot work around.
So far I have separated each key and value into two columns. What I am looking for, however, is one row per dictionary, with the keys used as the column names.
Only the first key in each dictionary is unique, so I would like either to drop it completely or to use that key as a value in a column named "id".
Example List of Dictionaries (>500k in total):
pep_list=[{'HV404': 'WVLSQVQLQESGPGLVKPSGTLSLTCAVSGGSISSSNWWSWVR',
'gene': 'HV404',
'aa_comp': {'W': 4,
'V': 5,
'L': 5,
'S': 10,
'Q': 3,
'E': 1,
'G': 5,
'P': 2,
'K': 1,
'T': 2,
'C': 1,
'A': 1,
'I': 1,
'N': 1,
'R': 1},
'peptide': ['WVLSQVQLQESGPGLVKPSGTLSLTCAVSGGSISSSNWWSWVR'],
'Length': 43,
'z': 3,
'Mass': 4557,
'm/z': 1519.0},
{'A0A0G2JNQ3': 'ISGNTSR',
'gene': 'A0A0G2JNQ3',
'aa_comp': {'I': 1, 'S': 2, 'G': 1, 'N': 1, 'T': 1, 'R': 1},
'peptide': ['ISGNTSR'],
'Length': 7,
'z': 2,
'Mass': 715,
'm/z': 357.5},etc.]
Expected output:
Dataframe = pd.DataFrame({values from dictionaries}, columns=["id", "gene", 'aa_comp', 'peptide', 'length', 'z', 'mass','m/z'])
id              columns of keys
dictionary 1    values in separate columns
dictionary 2    values in separate columns
Thank you for any insight!
Whatever these things are
{'HV404': 'WVLSQVQLQESGPGLVKPSGTLSLTCAVSGGSISSSNWWSWVR',}
{'A0A0G2JNQ3': 'ISGNTSR',}
are messing it up, plus it doesn't look like they are needed because the info is repeated.
If you want to take out a non-representative key you can do something like this
key_intersect = set(pep_list[0].keys()).intersection(set(pep_list[1].keys()))
new_list_of_dictionaries = [{key:value for (key,value) in dicts.items() if key in key_intersect} for dicts in pep_list]
df = pd.DataFrame(new_list_of_dictionaries)
This is fairly compact code, but you could unfurl it into loops if needed. Beware of blindly removing the first element: unless the dict is ordered, the first element is not guaranteed to be the same one.
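The snippet above only intersects the keys of the first two dictionaries. A minimal sketch that intersects the keys of every record is shown here; the helper name keep_common_keys is made up for illustration, and the sample keeps only some of the fields from the question.

```python
def keep_common_keys(records):
    """Keep only the keys that appear in every record (hypothetical helper)."""
    common = set(records[0])
    for record in records[1:]:
        common.intersection_update(record)  # iterating a dict yields its keys
    return [{k: v for k, v in record.items() if k in common} for record in records]

# Two records shaped like pep_list, with some fields left out for brevity.
sample = [
    {'HV404': 'WVLSQVQLQESGPGLVKPSGTLSLTCAVSGGSISSSNWWSWVR',
     'gene': 'HV404', 'Length': 43, 'z': 3, 'Mass': 4557, 'm/z': 1519.0},
    {'A0A0G2JNQ3': 'ISGNTSR',
     'gene': 'A0A0G2JNQ3', 'Length': 7, 'z': 2, 'Mass': 715, 'm/z': 357.5},
]

rows = keep_common_keys(sample)
print(rows)
```

The resulting rows can be passed to pd.DataFrame(rows) as in the answer. Because 'gene' repeats the unique key, that column could serve as the "id" column if one is wanted.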
I have a dict that I want to iterate through to find all values that contain a given key. My output would be a separate dict that, for each key, contains the numbers from every value in which that key appears, without the key itself.
dict_in =
{'6': ['2,9,8,10'], '1': ['3,5,8,9,10,12'], '4': ['2,5,7,8,9'], '2': ['3,4,7,6,13'], '12': ['1,7,5,9'], '3': ['9,11,10,1,2,13'], '10': ['1,3,6,11'], '5': ['4,1,7,11,12'], '13': ['2,3'], '8': ['1,6,4,11'], '7': ['5,2,4,9,12'], '11': ['3,5,10,8'], '9': ['12,1,3,6,4,7']}
so the output would be like this:
{'6':['3,4,7,13,1,3,11,1,4,11,12,1,3,4,7'] , '4':['3,4,6,13,1,11,12,1,6,11,12,12,1,3,6'],'13': ['4,7,6,9,11,10,1']}
I am a beginner and I do not even know where to start. Would it be easier to convert it to a list of lists?
There is one thing about your problem that is an extra challenge, and I leave it for you or some other good samaritan to solve; this is just a nudge in the right direction. The value for each key is actually a list holding a single string. If the values were lists of actual integers, the problem would not be too complicated. Also note that the expected output you wrote does not match your own requirement: you missed a few values.
Assuming integers instead of strings, here is one approach that you can hopefully follow as a beginner:
dict_in = {'6': [2,9,8,10], '1': [3,5,8,9,10,12], '4': [2,5,7,8,9], '2': [3,4,7,6,13], '12': [1,7,5,9], '3': [9,11,10,1,2,13], '10': [1,3,6,11], '5': [4,1,7,11,12], '13': [2,3], '8' :[1,6,4,11], '7': [5,2,4,9,12], '11': [3,5,10,8], '9': [12,1,3,6,4,7]}
dict_out = {}
for key in dict_in:
    if key == "6" or key == "4" or key == "13":
        for k,v in dict_in.items():
            for y in v:
                if int(key) in v and y != int(key):
                    dict_out.setdefault(key, []).append(y)
Output:
{'6': [3, 4, 7, 13, 1, 3, 11, 1, 4, 11, 12, 1, 3, 4, 7], '4': [3, 7, 6, 13, 1, 7, 11, 12, 1, 6, 11, 5, 2, 9, 12, 12, 1, 3, 6, 7], '13': [3, 4, 7, 6, 9, 11, 10, 1, 2]}
One last note: I have no idea why you wanted the only remaining keys to be 6, 4 and 13.
In any case, do not consider this as a full answer.
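The single-string values mentioned above can be turned into lists of integers before running the loop. This is only a sketch: the helper name parse_values is made up, and the sample is a shortened version of the question's dict_in.

```python
def parse_values(raw):
    """Turn {'6': ['2,9,8,10']} into {'6': [2, 9, 8, 10]} (hypothetical helper)."""
    return {
        key: [int(n) for s in strings for n in s.split(',')]
        for key, strings in raw.items()
    }

# Shortened sample taken from the question's dict_in.
raw = {'6': ['2,9,8,10'], '2': ['3,4,7,6,13'], '13': ['2,3']}
parsed = parse_values(raw)
print(parsed)
```

The parsed dict has the same shape as the integer version of dict_in used in the answer, so the loop can be applied to it unchanged.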
Not sure what you mean, so I just created something that covers most of it. Change the condition to whatever you need.
dict_in = {'6': ['2,9,8,10'], '1': ['3,5,8,9,10,12'], '4': ['2,5,7,8,9'], '2': ['3,4,7,6,13'], '12': ['1,7,5,9'], '3': ['9,11,10,1,2,13'], '10': ['1,3,6,11'], '5': ['4,1,7,11,12'], '13': ['2,3'], '8' :['1,6,4,11'], '7': ['5,2,4,9,12'], '11': ['3,5,10,8'], '9': ['12,1,3,6,4,7']}
dict_out = {}
for key, values in dict_in.items():
    for item in values:
        if True:  # your condition
            if key in dict_out.keys():
                dict_out[key].append(item)
            else:
                dict_out[key] = [item]
print(dict_out)
I am trying to solve some graph problems but I am stuck halfway. I have a Python dictionary of sets, and I would like to convert the original dictionary values (which are sets) into dictionaries, so that each element of a set becomes a key with the value 1. I think this is what is called a nested dictionary, but I am not sure.
I looped through dict.values(), assigned each one to a variable xxx, and used dict.fromkeys(xxx, 1). That worked, but I am unable to integrate the result back into the original dictionary.
Here is an example of a dictionary:
d = {'35': {'1', '37', '36', '71'}, '37': {'1', '35'}}
I want the output to look like:
d = {35: {1 : 1, 37 : 1, 36 : 1, 71 : 1}, 37: {1 : 1, 35 : 1}}
As you can see, the original dictionary values have become dictionaries of their own, and the quotes ('') are gone.
Can someone assist me, please, or give me some pointers? Thank you.
You just need a nested dict comprehension:
def convert(input):
    return {key: {val: 1 for val in vals} for key, vals in input.items()}
print(convert({'35': {'1', '37', '36', '71'}, '37': {'1', '35'}}))
# {'35': {'1': 1, '37': 1, '36': 1, '71': 1}, '37': {'1': 1, '35': 1}}
You are almost there. Just wrap keys and values with int:
{int(k):dict.fromkeys(map(int, v), 1) for k, v in d.items()}
Output:
{35: {37: 1, 71: 1, 36: 1, 1: 1}, 37: {35: 1, 1: 1}}
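As a small sketch of how the converted structure might be used in a graph problem (the weight update is an assumption about the use case, not something stated in the question), each inner value can be read or changed independently:

```python
d = {'35': {'1', '37', '36', '71'}, '37': {'1', '35'}}

# Same conversion as above: int keys, and every neighbour mapped to 1.
converted = {int(k): dict.fromkeys(map(int, v), 1) for k, v in d.items()}

# Assumed use: treat the inner value as an edge weight and bump one of them.
converted[35][37] += 1
print(converted[35][37], converted[37][35])
```

dict.fromkeys gives every key the same value object, which is harmless here because 1 is an immutable int. With a mutable default such as a list, a comprehension would be the safer choice.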
Lately I've been trying to map some values, so I'm trying to create a dictionary to do so. The odd thing is my DataFrame has a column made of lists, and DataFrames are always a bit awkward with lists. The DataFrame has the following structure:
rules                    procedure
['10','11','12']         1
['13','14']              2
['20','21','22','24']    3
So I want to create a dictionary that maps '10' to 1, '14' to 2, and so on. I tried the following:
dicc=dict()
for j in df['rules']:
    for i,k in zip(j,df.procedure):
        dicc[i]=k
But that isn't making it. Probably something to do with indexes. What am I missing?
Edit: I'm trying to create a dictionary that maps the values '10', '11', '12' to 1; '13', '14' to 2; and '20', '21', '22', '24' to 3, so that if I type dicc['10'] I get 1, and if I type dicc['22'] I get 3. Obviously, the actual DataFrame is much bigger and I can't do it manually.
You can do it like this:
import pandas as pd
data = [[['10', '11', '12'], 1],
[['13', '14'], 2],
[['20', '21', '22', '24'], 3]]
df = pd.DataFrame(data=data, columns=['rules', 'procedure'])
d = {r : p for rs, p in df[['rules', 'procedure']].values for r in rs}
print(d)
Output
{'20': 3, '10': 1, '11': 1, '24': 3, '14': 2, '22': 3, '13': 2, '12': 1, '21': 3}
Notes:
The code {r : p for rs, p in df[['rules', 'procedure']].values for r in rs} is a dictionary comprehension, the dictionary counterpart of a list comprehension.
Iterating over df[['rules', 'procedure']].values yields the same pairs as zip(df.rules, df.procedure): each row gives a (list, int) pair, so the rs variable is a list and p is an integer.
Finally, the second for loop iterates over the values of rs.
UPDATE
As suggested by @piRSquared, you can use zip:
d = {r : p for rs, p in zip(df.rules, df.procedure) for r in rs}
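To see why the loop in the question goes wrong, here is a sketch with plain lists standing in for df.rules and df.procedure (an assumption made so the example runs without pandas). The question's inner zip pairs the elements of one rule list with the first few procedures, instead of pairing each whole list with its own procedure.

```python
# Plain lists standing in for df.rules and df.procedure (illustration only).
rules = [['10', '11', '12'], ['13', '14'], ['20', '21', '22', '24']]
procedure = [1, 2, 3]

# The question's loop: each rule is zipped against the procedures column,
# so '11' gets 2 and '24' is dropped because zip stops at the shorter input.
wrong = {}
for j in rules:
    for i, k in zip(j, procedure):
        wrong[i] = k

# Pair each rule list with its own procedure first, then expand the list.
right = {r: p for rs, p in zip(rules, procedure) for r in rs}

print(wrong)
print(right)
```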
Help from cytoolz
from cytoolz.dicttoolz import merge
merge(*map(dict.fromkeys, df.rules, df.procedure))
{'10': 1,
'11': 1,
'12': 1,
'13': 2,
'14': 2,
'20': 3,
'21': 3,
'22': 3,
'24': 3}
Note
I updated my post to mimic how @jpp passed multiple iterables to map. @jpp's answer is very good. Though I'd advocate for upvoting all useful answers, I wish I could upvote their answer again (-:
Using collections.ChainMap:
from collections import ChainMap
res = dict(ChainMap(*map(dict.fromkeys, df['rules'], df['procedure'])))
print(res)
{'10': 1, '11': 1, '12': 1, '13': 2, '14': 2,
'20': 3, '21': 3, '22': 3, '24': 3}
For many uses, the final dict conversion is not necessary:
A ChainMap class is provided for quickly linking a number of
mappings so they can be treated as a single unit. It is often much
faster than creating a new dictionary and running multiple update()
calls.
See also What is the purpose of collections.ChainMap?
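One difference worth knowing appears if the same rule shows up in more than one row. The question does not say whether that can happen, so the duplicated '11' below is an assumption for illustration. A ChainMap returns the value from the first mapping that contains the key, while a dict comprehension keeps the last value written.

```python
from collections import ChainMap

# Plain lists standing in for df['rules'] and df['procedure'];
# the repeated '11' is a made-up case, not taken from the question.
rules = [['10', '11'], ['11', '12']]
procedure = [1, 2]

chained = dict(ChainMap(*map(dict.fromkeys, rules, procedure)))
comprehension = {r: p for rs, p in zip(rules, procedure) for r in rs}

print(chained['11'])        # value from the first row that contains '11'
print(comprehension['11'])  # value from the last row that contains '11'
```

If rules never repeat across rows, both approaches give the same mapping.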
You can flatten the list:
dict(zip(sum(df.rules.tolist(),[]),df.procedure.repeat(df.rules.str.len())))
Out[60]:
{'10': 1,
'11': 1,
'12': 1,
'13': 2,
'14': 2,
'20': 3,
'21': 3,
'22': 3,
'24': 3}
Using itertools.chain and DataFrame.itertuples:
from itertools import chain
dict(
    chain.from_iterable(
        ((rule, row.procedure) for rule in row.rules) for row in df.itertuples()
    )
)