I have a dictionary like this:
d = {'name':'xyz','c1':['a','b','r','d','c'],'c2':['21','232','11','212','34']}
Here the values of d['c1'] and d['c2'] are interdependent. That is, 'a' is related to '21', 'd' is related to '212', 'b' is related to '232', and so on.
I have to sort c1, and c2 should be reordered accordingly. The final output should be
d = {'name':'xyz','c1':['a','b','c','d','r'],'c2':['21','232','34','212','11']}
What is the most efficient way to do this?
This works:
d = {'name': 'xyz','c1':['a','b','r','d','c'],'c2':['21','232','11','212','34']}
s = sorted(list(zip(d['c1'], d['c2'])))
d['c1'] = [x[0] for x in s]
d['c2'] = [x[1] for x in s]
Result:
{'c1': ['a', 'b', 'c', 'd', 'r'],
'c2': ['21', '232', '34', '212', '11'],
'name': 'xyz'}
UPDATE
The call to list is not needed. Thanks to tzaman for the hint. It was a relic of putting together the solution from separate steps.
d = {'name': 'xyz','c1':['a','b','r','d','c'],'c2':['21','232','11','212','34']}
s = sorted(zip(d['c1'], d['c2']))
d['c1'] = [x[0] for x in s]
d['c2'] = [x[1] for x in s]
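If more than two lists have to stay aligned, the same idea can be expressed by sorting the index positions once and reordering every list with them. This is only a minimal sketch; the helper name sort_aligned is hypothetical and not part of any library. Unlike sorted(zip(...)), ties in the key list are not broken by the values of the other list, because the index sort is stable.

```python
def sort_aligned(d, key, others):
    """Sort d[key] and reorder each list named in `others` to match.

    Hypothetical helper for illustration; returns a new dict.
    """
    # Positions of d[key] in sorted order, e.g. [0, 1, 4, 3, 2]
    order = sorted(range(len(d[key])), key=d[key].__getitem__)
    result = dict(d)  # shallow copy so the input dict is left untouched
    for name in [key] + list(others):
        result[name] = [d[name][i] for i in order]
    return result


d = {'name': 'xyz',
     'c1': ['a', 'b', 'r', 'd', 'c'],
     'c2': ['21', '232', '11', '212', '34']}
sorted_d = sort_aligned(d, 'c1', ['c2'])
print(sorted_d['c1'])  # ['a', 'b', 'c', 'd', 'r']
print(sorted_d['c2'])  # ['21', '232', '34', '212', '11']
```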
Your data structure does not reflect the real relationship between its elements. I would start by merging c1 and c2 into an OrderedDict, which takes care of both the relationship and the order of elements. Like this:
from collections import OrderedDict

d = dict(
    name='xyz',
    c=OrderedDict(sorted(
        zip(
            ['a','b','r','d','c'],
            ['21','232','11','212','34']
        )
    ))
)
Frankly, the most efficient way is to store the data in a form compatible with its use. In this case, you would use a table (e.g. a pandas DataFrame) or simply a list of tuples:
xyz_table = [('a', '21'), ('b', '232'), ...]
Then you simply sort the list on the first element of each tuple:
xyz_table = [('a', '21'), ('b', '232'), ('c', '34'), ('d', '212'), ('r', '11')]
xyz_sort = sorted(xyz_table, key = lambda row: row[0])
print(xyz_sort)
You could also look up a primer on Python sorting.
Here's one way you could do it, using zip for both joining and unjoining. You can re-cast them as lists instead of tuples when you put them back into the dictionary.
a = ['a','b','r','d','c']
b = ['21','232','11','212','34']
mix = list(zip(a, b))
mix.sort()
a_new, b_new = zip(*mix)
>>> a_new
('a', 'b', 'c', 'd', 'r')
>>> b_new
('21', '232', '34', '212', '11')
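To put the result back into the dictionary from the question as lists rather than tuples, the zip/unzip round trip could look like the sketch below. The guard for empty input is an addition of this sketch: unpacking zip(*pairs) into two names fails when there are no pairs.

```python
d = {'name': 'xyz',
     'c1': ['a', 'b', 'r', 'd', 'c'],
     'c2': ['21', '232', '11', '212', '34']}

pairs = sorted(zip(d['c1'], d['c2']))  # join and sort by the c1 value
if pairs:  # with empty lists there is nothing to unpack
    c1_sorted, c2_sorted = zip(*pairs)  # unzip; both results are tuples
    d['c1'], d['c2'] = list(c1_sorted), list(c2_sorted)

print(d['c1'])  # ['a', 'b', 'c', 'd', 'r']
print(d['c2'])  # ['21', '232', '34', '212', '11']
```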
Related
I have the following DataFrame:
df = pd.DataFrame({
'From':['a','b','c','d'],
'To':['h','m','f','f'],
'week':[1,2,3,3]
})
I want to use the columns 'To' and 'week' as keys that map to the value in 'From', creating a dictionary like {(1,'h'):'a',(2,'m'):'b',(3,'f'):['c','d']}. Is there a way to do this? I tried
dict(zip([tuple(x) for x in df[['week','To']].to_numpy()], df['From']))
but it only gives me {(1,'h'):'a',(2,'m'):'b',(3,'f'):'d'}.
If there are multiple 'From' values for the same ('week', 'To'), I want to put them in a list or set. Thanks!!
You can use .groupby() method followed by an .apply(list) method on the column From to convert the results into a list. From here, pandas has a .to_dict() method to convert your results to a dictionary.
>>> df.groupby(['To', 'week'])['From'].apply(list).to_dict()
{('f', 3): ['c', 'd'], ('h', 1): ['a'], ('m', 2): ['b']}
>>>
>>> # use lambda to convert lists with only one value to string
>>> df.groupby(['To', 'week'])['From'].apply(lambda x: list(x) if len(x) > 1 else list(x)[0]).to_dict()
{('f', 3): ['c', 'd'], ('h', 1): 'a', ('m', 2): 'b'}
Use the code below to get your desired dictionary:
df.groupby(['To','week'])['From'].agg(','.join).apply(lambda s: s.split(',') if ',' in s else s).to_dict()
Output:
>>> df.groupby(['To','week'])['From'].agg(','.join).apply(lambda s: s.split(',') if ',' in s else s).to_dict()
{('f', 3): ['c', 'd'], ('h', 1): 'a', ('m', 2): 'b'}
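This groups by 'To' and 'week' and joins the 'From' values of each group with a comma. It then uses apply to split the comma-separated strings back into lists (leaving single values as plain strings), and finally converts the result to a dictionary. Note that this relies on no 'From' value containing a comma itself.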
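For comparison, the same grouping can be sketched without pandas using collections.defaultdict. The rows list below is just the DataFrame from the question written out as (From, To, week) tuples, and the keys follow the (week, To) order the question asked for. Every value is kept as a list here, which is an assumption of this sketch; it tends to be easier to consume than a mix of strings and lists.

```python
from collections import defaultdict

# The question's DataFrame, written out as (From, To, week) rows
rows = [('a', 'h', 1), ('b', 'm', 2), ('c', 'f', 3), ('d', 'f', 3)]

grouped = defaultdict(list)
for src, dst, week in rows:
    # Every (week, To) key collects all matching 'From' values
    grouped[(week, dst)].append(src)

result = dict(grouped)
print(result)  # {(1, 'h'): ['a'], (2, 'm'): ['b'], (3, 'f'): ['c', 'd']}
```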
I am currently trying to obtain the top 2 values from the following list (Quant) and their corresponding values from the second list (FF).
Quant = ['1', '29', '109', '2', '1', '1', '100']
FF = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
The top 2 values in the first list are 109 and 100, and their corresponding values in the second list are c and g. I tried to get the positions of the top values in the Quant list as follows:
a = max(Quant)
pos1 = [i for i, j in enumerate(Qu) if j == a]
Quant.remove(a)
b = max(Quant)
pos2 = [i for i, j in enumerate(Qu) if j == b]
for x, y in zip(pos1, pos2):
    FC1 = FF[x]
    FC2 = FF[y]
I am not sure whether this is the correct way. The current Quant list has no duplicates among its maximum values. What if there are duplicates, so that pos1 holds 2 index values? In that case I would need both of those values from the first list along with their corresponding values from the second list.
Kindly assist me with this part.
In one line, you can do this by sorting the zipped list then unzipping only the first two items:
((FC1,FC2), (pos1,pos2)) = zip(
*sorted(zip(Quant,FF), key=lambda x:int(x[0]), reverse=True)[:2])
or if you interchange the variables, you don't even need to unzip:
((FC1,pos1), (FC2,pos2)) = sorted(zip(Quant,FF),
key=lambda x:int(x[0]), reverse=True)[:2]
>>> FC1
'109'
>>> FC2
'100'
>>> pos1
'c'
>>> pos2
'g'
This would do it, I hope you find it an elegant solution:
[*map(lambda x: FF[x], map(lambda x: Quant.index(str(x)), sorted(map(int, Quant),
reverse=True)[:2]))]
['c', 'g']
Or this:
[FF[i] for i in map(lambda x: Quant.index(str(x)), sorted(map(int, Quant),
reverse=True)[:2])]
Will the values in Quant always be strings? If you have control over it, you should make them numbers, because strings are compared lexicographically and right now max(Quant) returns '29'.
Here's one way to get what you're looking for:
Quant = ['1', '29', '109', '2', '1', '1', '100']
FF = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
quantNums = [int(n) for n in Quant]
max2, max1 = sorted(zip(quantNums, FF))[-2:]
max1 # (109, 'c')
max2 # (100, 'g')
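When only the top few items are needed, heapq.nlargest from the standard library avoids sorting the whole list. A minimal sketch on the same data, converting to int only inside the key function so the original strings are kept:

```python
import heapq

Quant = ['1', '29', '109', '2', '1', '1', '100']
FF = ['a', 'b', 'c', 'd', 'e', 'f', 'g']

# Compare pairs by the numeric value of the quantity, largest first
top2 = heapq.nlargest(2, zip(Quant, FF), key=lambda pair: int(pair[0]))
print(top2)  # [('109', 'c'), ('100', 'g')]
```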
You can achieve that using numpy:
import numpy as np
# Convert the list to a numpy array of integers
Quant = ['1', '29', '109', '2', '1', '1', '100']
Quant = np.array(Quant).astype(int)
# Get the indices of the two largest elements (in ascending order of value)
ind = Quant.argsort()[-2:]
# Get the corresponding values from FF
FF = np.array(['a', 'b', 'c', 'd', 'e', 'f', 'g'])
FF[ind]
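The question also asks what happens when a maximum value is duplicated. One possible reading, and it is only an assumption, is that every label sharing one of the top 2 distinct values should be returned. A sketch of that reading follows; the helper name top_n_with_ties is hypothetical, and the Quant list is altered from the question so that 109 appears twice.

```python
from collections import defaultdict


def top_n_with_ties(quantities, labels, n=2):
    """Return [(value, [labels...])] for the n largest distinct values."""
    by_value = defaultdict(list)
    for q, label in zip(quantities, labels):
        by_value[int(q)].append(label)  # group labels by numeric value
    top_values = sorted(by_value, reverse=True)[:n]
    return [(value, by_value[value]) for value in top_values]


# Illustrative data: the question's list with '109' duplicated at index 4
Quant = ['1', '29', '109', '2', '109', '1', '100']
FF = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
print(top_n_with_ties(Quant, FF))  # [(109, ['c', 'e']), (100, ['g'])]
```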
List A:
[('Harry', 'X', 'A'),
('James', 'Y', 'G'),
('John', 'Z', 'D')]
List B:
[('Helen', '2', '(A; B)', '3'),
('Victor', '9', '(C; D; E)', '4'),
('Alan', '10', '(A)', '57'),
('Paul', '11', '(F; B)', '43'),
('Sandra', '12', '(F)', '31')]
Basically, I have to compare the third element of each tuple in list A (for x in listA -> x[2]) and check whether any tuple in list B contains the same element (for y in listB, x[2] == y[2]), but I'm just losing my mind with this.
My idea was to get the third element from each list in list B, put them into a new list, and then remove that ";" so I could access each element way more easily.
for x in listB:
    j = x[2]
    j = j.strip().split(', ')
    for k in j:
        FinalB.append(k)
FinalB = [(k[1:-1].split(";")) for k in FinalB]
Then I'd take the third element from each tuple in list A and compare it with the elements inside each list of FinalB. If there is a match, I'd get the index of the matching list in FinalB, use that index to access the corresponding tuple in listB, and take its first element. (Basically, I need the names of the users in each list that share the same third element.)
My code so far:
FinalB = []
DomainsList = []
for x in listA:
    j = x[2]
    j = j.strip().split(', ')
    for k in j:
        FinalB.append(k)
FinalB = [(k[1:-1].split(";")) for k in FinalB]
for y in listA:
    for z in FinalB:
        for k in z:
            if y[2] == k:
                m = FinalB.index(z)
                DomainsList.append([listA[m][0],listB[m][0]])
return DomainsList
Yes, this is not working (no error, I have probably just gone about it completely the wrong way), and I can't figure out what I'm doing wrong or where.
First, I think a better way to handle '(C; D; E)' is to change it to 'CDE', so the first loop becomes:
FinalB = [''.join(filter(str.isalpha, x[2])) for x in listB]
We take each string and keep only the alpha characters, so we end up with:
In [18]: FinalB
Out[18]: ['AB', 'CDE', 'A', 'FB', 'F']
This means we can use listA[x][2] in FinalB[y] to test if we have a match:
for y in listA:
    for z in FinalB:
        if y[2] in z:
            DomainsList.append([y[0], listB[FinalB.index(z)][0]])
I had to tweak the arguments to the append() to pick the right elements, so we end up with:
In [17]: DomainsList
Out[17]: [['Harry', 'Helen'], ['Harry', 'Alan'], ['John', 'Victor']]
Usefully, if instead of '(C; D; E)' you have '(foo; bar; baz)', then with just one tweak the code can work for that too:
import re
FinalB = [[s for s in re.split(r"[; ()]+", x[2]) if s] for x in listB]
The remaining code works as before.
It always helps to start a question with context and details.
The Python version could also come into play.
The data structure you have given us to work with is questionable, especially the third element in each of the tuples in listB. Why store several values in a single string like '(C; D; E)'?
Even though the post gives no context about what this is meant to achieve, this code should get you there.
It gives you a list of tuples (listC), each with two elements: the name from listA and the name from listB where they match as described in the post.
NOTE: at the moment the match is done with a simple find, which works with the data provided. You may need to change this if your data could cause false positives (find matches substrings, so 'A' would also match inside 'AB') or if you want to ignore case.
listA = [('Harry', 'X', 'A'), ('James', 'Y', 'G'), ('John', 'Z', 'D')]
listB = [('Helen', '2', '(A; B)', '3'),
('Victor', '9', '(C; D; E)', '4'),
('Alan', '10', '(A)', '57'),
('Paul', '11', '(F; B)', '43'),
('Sandra', '12', '(F)', '31')]
listC = []
for a in listA:
    for b in listB:
        if b[2].find(a[2]) != -1:
            listC.append((a[0], b[0]))
print(listC)
This gives you:
[('Harry', 'Helen'), ('Harry', 'Alan'), ('John', 'Victor')]
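If substring false positives are a concern, one option is to parse each '(C; D; E)' string into a set of tokens and test exact membership instead. A minimal sketch on the same data; the helper name parse_tokens is hypothetical:

```python
listA = [('Harry', 'X', 'A'), ('James', 'Y', 'G'), ('John', 'Z', 'D')]
listB = [('Helen', '2', '(A; B)', '3'),
         ('Victor', '9', '(C; D; E)', '4'),
         ('Alan', '10', '(A)', '57'),
         ('Paul', '11', '(F; B)', '43'),
         ('Sandra', '12', '(F)', '31')]


def parse_tokens(field):
    """Turn a string like '(C; D; E)' into the set {'C', 'D', 'E'}."""
    parts = field.strip('()').split(';')
    return {part.strip() for part in parts if part.strip()}


# Parse listB once, so each row is not re-parsed for every row of listA
parsed_b = [(b[0], parse_tokens(b[2])) for b in listB]

matches = [(a[0], name_b)
           for a in listA
           for name_b, tokens in parsed_b
           if a[2] in tokens]  # exact token match, not a substring match
print(matches)  # [('Harry', 'Helen'), ('Harry', 'Alan'), ('John', 'Victor')]
```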
I want to make a dictionary from lists.
import numpy as np
a1 = [1,2,3,4,5,6,7,8,9]
b1 = ['a','b','c','d','e','f','g','h','i']
c1 = ['A','B','C','D','E','F','G','H','I']
array2 = np.array([a1,b1,c1]).tolist()
keys = ['name', 'type','description','logo']
print dict(zip(keys, zip(*array2)))
Output:
{'logo': ('4', 'd', 'D'), 'type': ('2', 'b', 'B'), 'name': ('1', 'a', 'A'), 'description': ('3', 'c', 'C')}
Why am I getting only 4 elements?
Why are the dictionary elements in a random order (4, 2, 1, 3) and not 1, 2, 3, 4?
Why am I getting only 4 elements?
According to the zip documentation:
The returned list is truncated in length to the length of the shortest argument sequence.
Since keys only contains 4 elements, your zipped list resulting from zip(keys, zip(*array2)) will also contain 4 values, resulting in your dict only containing 4 values.
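The truncation is easy to see on a smaller example. The sketch below is written for Python 3 and also shows itertools.zip_longest, which pads the shorter input with None instead of truncating (in Python 2 the equivalent is itertools.izip_longest). The five-row columns list is illustrative data, not the full array from the question.

```python
from itertools import zip_longest

keys = ['name', 'type', 'description', 'logo']
# Illustrative rows: five of them, one more than there are keys
columns = [(1, 'a', 'A'), (2, 'b', 'B'), (3, 'c', 'C'),
           (4, 'd', 'D'), (5, 'e', 'E')]

truncated = list(zip(keys, columns))      # stops at the shorter input
padded = list(zip_longest(keys, columns)) # pads the shorter input with None

print(len(truncated))  # 4
print(len(padded))     # 5
print(padded[-1])      # (None, (5, 'e', 'E'))
```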
Why the dictionary elements are in random order?
In Python 2 (and in Python 3 before 3.7), the order of elements in a regular dictionary is not guaranteed; it is implementation-specific and depends on the keys' hashes. From Python 3.7 on, dicts preserve insertion order. On older versions, use OrderedDict if you want to maintain a specific order:
import numpy as np
from collections import OrderedDict
a1 = [1,2,3,4,5,6,7,8,9]
b1 = ['a','b','c','d','e','f','g','h','i']
c1 = ['A','B','C','D','E','F','G','H','I']
array2 = np.array([a1,b1,c1]).tolist()
keys = ['name', 'type','description','logo']
print OrderedDict(zip(keys, zip(*array2)))
# prints: OrderedDict([('name', ('1', 'a', 'A')), ('type', ('2', 'b', 'B')), ('description', ('3', 'c', 'C')), ('logo', ('4', 'd', 'D'))])
I have a dictionary, and I would like to find the minimum key where value[1] is equal to a specified string.
somedict = {'1': ['110', 'A'], '3': ['1', 'A'], '2': ['3', 'B'], '4': ['1', 'B']}
mindict = min(somedict.iteritems(), key=itemgetter(0) )
This gives me ('1', ['110', 'A'])
I would like to further filter this by finding the min key where the value is 'B',
to give me the result ('2', ['3', 'B'])
How would I go about this?
Use a generator expression filtering your items first:
min((i for i in somedict.iteritems() if i[1][-1] == 'B'), key=itemgetter(0))
The generator expression produces elements from somedict.iteritems() where the last entry in the value is equal to 'B'.
Note that there is a risk here that no items match your filter! If that could be the case, make sure you catch the ValueError raised by min() when it is passed an empty sequence. If you are using Python 3.4 or newer, you can specify a default to be returned for that case (in Python 3, use items() instead of iteritems()):
min((i for i in somedict.items() if i[1][-1] == 'B'),
    key=itemgetter(0), default=())
which would return an empty tuple if no items have a last entry 'B' in their value.
Demo:
>>> from operator import itemgetter
>>> somedict = {'1': ['110', 'A'], '3': ['1', 'A'], '2': ['3', 'B'], '4': ['1', 'B']}
>>> min((i for i in somedict.iteritems() if i[1][-1] == 'B'), key=itemgetter(0))
('2', ['3', 'B'])
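For Python 3, where iteritems() no longer exists, the same filter can be sketched with items() and the default argument of min(). Using None as the default is a choice of this sketch. Note that the keys are strings, so they compare lexicographically ('10' sorts before '2'); pass key=lambda item: int(item[0]) if numeric order is wanted.

```python
from operator import itemgetter

somedict = {'1': ['110', 'A'], '3': ['1', 'A'],
            '2': ['3', 'B'], '4': ['1', 'B']}

# Keep only items whose value ends in 'B', then take the smallest key
result = min((item for item in somedict.items() if item[1][-1] == 'B'),
             key=itemgetter(0), default=None)
print(result)  # ('2', ['3', 'B'])

# With no matching items, the default is returned instead of a ValueError
no_match = min((item for item in somedict.items() if item[1][-1] == 'Z'),
               key=itemgetter(0), default=None)
print(no_match)  # None
```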