Fetching elements from multiple columns of lists in a dataframe - python

I have a dataframe like this:
A B C
[1,2,3] ['a','b','c'] ['aa', 'bb', 'cc']
[4,5,6] ['d','e','f'] ['dd', 'ee', 'ff']
[7,8,9] ['g','h','i'] ['gg', 'hh', 'ii']
I would like to combine the values from these columns as follows:
[[[1,'a', 'aa'], [2,'b','bb'], [3, 'c', 'cc']], [[4,'d','dd'], [5,'e', 'ee'], [6,'f','ff']], [[7,'g','gg'], [8,'h','hh'], [9,'i','ii']]]
My idea was to change each column to list like this (which will give a list of list) :
first = df['A'].values.tolist() # similarly for other columns
And then zip all lists and iterate through them and fetch corresponding values from each list and create a new list as per the output format. But, I am sure there are better solutions than mine. Can anyone help me with this?

If I understand correctly, you can use explode with groupby:
pd.concat([df[[x]].explode(x) for x in df.columns],axis=1)\
.apply(lambda x : x.tolist(),axis=1).groupby(level=0).agg(list).tolist()
Out[366]:
[[[1, 'a', 'aa'], [2, 'b', 'bb'], [3, 'c', 'cc']],
[[4, 'd', 'dd'], [5, 'e', 'ee'], [6, 'f', 'ff']],
[[7, 'g', 'gg'], [8, 'h', 'hh'], [9, 'i', 'ii']]]

An extreme solution with apply:
df.apply(lambda x: list(zip(*x.to_list())), axis=1).to_list()
Output (note that the inner elements are tuples rather than lists):
[[(1, 'a', 'aa'), (2, 'b', 'bb'), (3, 'c', 'cc')],
[(4, 'd', 'dd'), (5, 'e', 'ee'), (6, 'f', 'ff')],
[(7, 'g', 'gg'), (8, 'h', 'hh'), (9, 'i', 'ii')]]
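The zip idea described in the question can also be sketched without any DataFrame methods. This is only an illustration: the names col_a, col_b and col_c are hypothetical stand-ins for what df['A'].values.tolist() and its siblings would return.

```python
# Column values as plain lists, standing in for df[col].values.tolist()
col_a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
col_b = [['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']]
col_c = [['aa', 'bb', 'cc'], ['dd', 'ee', 'ff'], ['gg', 'hh', 'ii']]

# The outer zip walks the rows; the inner zip transposes the lists in one row.
combined = [[list(triple) for triple in zip(*row)]
            for row in zip(col_a, col_b, col_c)]
print(combined)
```

Wrapping each triple in list() gives the nested-list shape asked for in the question instead of tuples.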

Related

Possible sets of different sub-list items with one element of each sub-list

I am looking for a way to obtain combinations of single elements of all sub-lists contained in a list without knowing in advance the length of the list and the sub-lists. Let me illustrate what I mean via two examples below. I have two lists (myList1 and myList2) and would like to obtain the two combination sets (setsCombo1 and setsCombo2):
myList1 = [['a'], [1, 2, 3], ['X', 'Y']]
setsCombo1 = [['a', 1, 'X'],
['a', 1, 'Y'],
['a', 2, 'X'],
['a', 2, 'Y'],
['a', 3, 'X'],
['a', 3, 'Y']]
myList2 = [['a'], [1, 2, 3], ['X', 'Y'], [8, 9]]
setsCombo2 = [['a', 1, 'X', 8],
['a', 1, 'X', 9],
['a', 1, 'Y', 8],
['a', 1, 'Y', 9],
['a', 2, 'X', 8],
['a', 2, 'X', 9],
['a', 2, 'Y', 8],
['a', 2, 'Y', 9],
['a', 3, 'X', 8],
['a', 3, 'X', 9],
['a', 3, 'Y', 8],
['a', 3, 'Y', 9]]
I looked a bit into itertools but couldn't really find anything quickly that is appropriate...
itertools.product with unpacking * (almost) does that:
>>> from itertools import product
>>> list(product(*myList1))
[('a', 1, 'X'),
('a', 1, 'Y'),
('a', 2, 'X'),
('a', 2, 'Y'),
('a', 3, 'X'),
('a', 3, 'Y')]
To cast the inner elements to lists, we map:
>>> list(map(list, product(*myList1)))
[['a', 1, 'X'],
['a', 1, 'Y'],
['a', 2, 'X'],
['a', 2, 'Y'],
['a', 3, 'X'],
['a', 3, 'Y']]
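Because product accepts any number of iterables, the same call should work for myList2 without changes. A small sketch, where one_from_each is a hypothetical helper name chosen for illustration:

```python
from itertools import product

def one_from_each(sublists):
    """Return every way of picking one element from each sublist, as lists."""
    return [list(combo) for combo in product(*sublists)]

myList2 = [['a'], [1, 2, 3], ['X', 'Y'], [8, 9]]
setsCombo2 = one_from_each(myList2)
print(len(setsCombo2))  # 1 * 3 * 2 * 2 = 12 combinations
print(setsCombo2[0])
```

The number of results is the product of the sub-list lengths, so it can grow quickly for long inputs.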

How can I apply a permutation to a list?

How might one get a SymPy Permutation to act on a list? E.g.,
from sympy.combinatorics import Permutation
lst = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']
perm = Permutation([[0, 2, 8, 6], [1, 5, 7, 3]])
# Then something like...
perm * lst # This doesn't work. Throws AttributeError because of list
I'd like something like this that returns (in this example):
['g', 'd', 'a', 'h', 'e', 'b', 'i', 'f', 'c']
I have read https://docs.sympy.org/latest/modules/combinatorics/permutations.html, and don't see how.
Any suggestions as to how one might go about this?
You can just do perm(lst):
>>> from sympy.combinatorics import Permutation
>>> lst = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']
>>> perm = Permutation([[0, 2, 8, 6], [1, 5, 7, 3]])
>>> perm(lst)
['c', 'f', 'i', 'b', 'e', 'h', 'a', 'd', 'g']
Your example output seems to be the result of applying the inverse of the given Permutation to the list. If that is your required output, you need to either reverse the final list (which happens to work for this example) or reverse each cycle within the permutation.
From the documentation:
The permutation can be ‘applied’ to any list-like object, not only Permutations.
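To make the forward and inverse results concrete, here is a pure-Python sketch that does not use SymPy at all. The helper names cycles_to_array_form, apply_perm and invert are hypothetical and only mirror the indexing behaviour shown in the session above. In SymPy itself, the inverse should be available as ~perm or perm**-1, though that is worth checking against your installed version.

```python
def cycles_to_array_form(cycles, size):
    """Build the image list: index i is sent to array_form[i]."""
    array_form = list(range(size))
    for cycle in cycles:
        for pos, src in enumerate(cycle):
            array_form[src] = cycle[(pos + 1) % len(cycle)]
    return array_form

def apply_perm(array_form, items):
    """Pick items[array_form[i]] for each position i."""
    return [items[j] for j in array_form]

def invert(array_form):
    """Return the array form of the inverse permutation."""
    inverse = [0] * len(array_form)
    for i, j in enumerate(array_form):
        inverse[j] = i
    return inverse

lst = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']
p = cycles_to_array_form([[0, 2, 8, 6], [1, 5, 7, 3]], len(lst))
print(apply_perm(p, lst))          # same ordering as perm(lst) above
print(apply_perm(invert(p), lst))  # the ordering requested in the question
```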

Sorting dictionaries in python by values that are actually lists

If I'm given a dictionary to represent a graph, where vertices are keys and values are lists, whose entries contain both a neighbor vertex and the weight between the two vertices, how can I return a list of edges in increasing order with no repeats? For example, I may be given the following dictionary...:
{"A": [["B",10], ["D",5]], "B": [["A",10], ["C",5]], "C": [["B",5],["D",15]], "D": [["C",15], ["A",5]]}.
Also I'm only allowed to import the copy library, so I could copy one list and use deepcopy() to create a new object with the same elements.
Right now, I'm trying to turn the dictionary into a list, because I figure it might be easier to sort elements within a list, and delete duplicate edges. So at the moment I have the following (graph is the dictionary, and in this case the one I provided above)...
def edge_get(graph):
    input_list = []
    sorted_list = []
    for key, value in graph.items():
        temp = [key,value]
        input_list.append(temp)
    print(input_list)
This prints out...
[['A', [['B', 10], ['D', 5]]], ['B', [['A', 10], ['C', 5]]], ['C', [['B', 5], ['D', 15]]], ['D', [['C', 15], ['A', 5]]]]
I would like to get it to output:
[['A', 'B', 10], ['A', 'D', 5], ['B', 'A', 10], ['B', 'C', 5],...
I figure if I can get it like this, I can compare the third element of each list, within the list, and if they are the same, check to see if the other elements match (same edge). And based off of that I can add it to the final list or forget it and move on.
For this example the ultimate goal is:
[['A', 'D'], ['B', 'C'], ['A', 'B'], ['C', 'D']]
So you have a dict that represents a graph as adjacency list, and you want to convert that adjacency list into an edge list.
You can do that with a nested list comprehension:
graph = {"A": [["B",10], ["D",5]], "B": [["A",10], ["C",5]], "C": [["B",5],["D",15]], "D": [["C",15], ["A",5]]}
edges = [(src, dst, weight) for src, adjs in graph.items() for dst, weight in adjs]
# edges = [('A', 'B', 10), ('A', 'D', 5), ('B', 'A', 10), ('B', 'C', 5), ('C', 'B', 5), ('C', 'D', 15), ('D', 'C', 15), ('D', 'A', 5)]
Then you can eliminate duplicate edges by converting to a dict. Note that if you have duplicate edges with conflicting weights, this keeps only one of the weights (the one seen last):
uniques = {frozenset([src, dst]): weight for src, dst, weight in edges}
# uniques = {frozenset({'B', 'A'}): 10, frozenset({'A', 'D'}): 5, frozenset({'B', 'C'}): 5, frozenset({'C', 'D'}): 15}
and then sort the edges with sorted:
sorted_uniques = sorted(uniques.items(), key=lambda v: v[1])
# sorted_uniques = [(frozenset({'A', 'D'}), 5), (frozenset({'C', 'B'}), 5), (frozenset({'A', 'B'}), 10), (frozenset({'C', 'D'}), 15)]
Finally, to get the result in the structure you wanted, you simply do:
result = [sorted(e) for e, weight in sorted_uniques]
# result = [['A', 'D'], ['B', 'C'], ['A', 'B'], ['C', 'D']]
You can represent each edge as a frozenset and filter out duplicate edges with the help of a set:
G = {"A": [["B",10], ["D",5]], "B": [["A",10], ["C",5]], "C": [["B",5],["D",15]], "D": [["C",15], ["A",5]]}
edges = {(frozenset((k, i)), j) for k, v in G.items()
         for i, j in v}
[sorted(i) for i, _ in sorted(edges, key=lambda x: x[1])]
# [['B', 'C'], ['A', 'D'], ['A', 'B'], ['C', 'D']]
You can use itertools.product to generate the combinations of key with each related sublist. If you sort and unpack the string components of each combination, then you get the initial output you are looking for. From there you can sort the entire list first by the weight value and then by the vertices in order to get an ordered list. If you slice that list with a step value you can remove the duplicates. Then you can just remove the weight value to get the list of pairs for your final output.
You could consolidate the steps below just a bit more but this goes through the steps outlined in your question to hopefully make it a bit easier to follow.
from itertools import product
from operator import itemgetter
d = {"A": [["B",10], ["D",5]], "B": [["A",10], ["C",5]], "C": [["B",5],["D",15]], "D": [["C",15], ["A",5]]}
combos = [[*sorted([c1, c2]), n] for k, v in d.items() for c1, [c2, n] in product(k, v)]
print(combos)
# [['A', 'B', 10], ['A', 'D', 5], ['A', 'B', 10], ['B', 'C', 5], ['B', 'C', 5], ['C', 'D', 15], ['C', 'D', 15], ['A', 'D', 5]]
ordered = sorted(combos, key=itemgetter(2, 0, 1))[::2]
print(ordered)
# [['A', 'D', 5], ['B', 'C', 5], ['A', 'B', 10], ['C', 'D', 15]]
pairs = [o[:-1] for o in ordered]
print(pairs)
# [['A', 'D'], ['B', 'C'], ['A', 'B'], ['C', 'D']]
EDIT (without imports):
Per comment highlighting a restriction on using imports in your solution, here is a modified version of the original. Differences are replacement of itertools.product with list comprehension that accomplishes the same thing and the replacement of operator.itemgetter with a lambda.
d = {"A": [["B",10], ["D",5]], "B": [["A",10], ["C",5]], "C": [["B",5],["D",15]], "D": [["C",15], ["A",5]]}
combos = [[*sorted([k, c]), n] for k, v in d.items() for c, n in v]
print(combos)
# [['A', 'B', 10], ['A', 'D', 5], ['A', 'B', 10], ['B', 'C', 5], ['B', 'C', 5], ['C', 'D', 15], ['C', 'D', 15], ['A', 'D', 5]]
ordered = sorted(combos, key=lambda x: (x[2], x[0], x[1]))[::2]
print(ordered)
# [['A', 'D', 5], ['B', 'C', 5], ['A', 'B', 10], ['C', 'D', 15]]
pairs = [o[:-1] for o in ordered]
print(pairs)
# [['A', 'D'], ['B', 'C'], ['A', 'B'], ['C', 'D']]
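The [::2] slice above relies on every edge being listed exactly twice, once from each endpoint. As a hedged alternative that needs no imports and does not depend on that assumption, duplicates can be tracked in a set. The function name sorted_unique_edges is hypothetical, and if the two directions of an edge carried different weights, this sketch would keep the first one it sees.

```python
def sorted_unique_edges(graph):
    """Return undirected edges as [u, v] pairs, ordered by weight, then name."""
    seen = set()
    weighted = []
    for src, neighbours in graph.items():
        for dst, weight in neighbours:
            edge = tuple(sorted((src, dst)))  # ('A', 'B') and ('B', 'A') collapse
            if edge not in seen:
                seen.add(edge)
                weighted.append((weight, edge))
    weighted.sort()  # tuples compare by weight first, then by vertex names
    return [list(edge) for _, edge in weighted]

graph = {"A": [["B", 10], ["D", 5]], "B": [["A", 10], ["C", 5]],
         "C": [["B", 5], ["D", 15]], "D": [["C", 15], ["A", 5]]}
print(sorted_unique_edges(graph))
```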

python: output data from a list

I'm trying to figure out how to output list items. The code below takes answers and checks them against a key to see which answers are correct. For each student, the number of correct answers is stored in correct_count. Then I sort in ascending order based on the correct count.
def main():
    answers = [
        ['A', 'B', 'A', 'C', 'C', 'D', 'E', 'E', 'A', 'D'],
        ['D', 'B', 'A', 'B', 'C', 'A', 'E', 'E', 'A', 'D'],
        ['E', 'D', 'D', 'A', 'C', 'B', 'E', 'E', 'A', 'D'],
        ['C', 'B', 'A', 'E', 'D', 'C', 'E', 'E', 'A', 'D'],
        ['A', 'B', 'D', 'C', 'C', 'D', 'E', 'E', 'A', 'D'],
        ['B', 'B', 'E', 'C', 'C', 'D', 'E', 'E', 'A', 'D'],
        ['B', 'B', 'A', 'C', 'C', 'D', 'E', 'E', 'A', 'D'],
        ['E', 'B', 'E', 'C', 'C', 'D', 'E', 'E', 'A', 'D']]
    keys = ['D', 'B', 'D', 'C', 'C', 'D', 'A', 'E', 'A', 'D']
    grades = []
    # Grade all answers
    for i in range(len(answers)):
        # Grade one student
        correct_count = 0
        for j in range(len(answers[i])):
            if answers[i][j] == keys[j]:
                correct_count += 1
        grades.append([i, correct_count])
        grades.sort(key=lambda x: x[1])
        # print("Student", i, "'s correct count is", correct_count)

if __name__ == '__main__':
    main()
If I print out grades inside the loop, the output looks like this:
[[0, 7]]
[[1, 6], [0, 7]]
[[2, 5], [1, 6], [0, 7]]
[[3, 4], [2, 5], [1, 6], [0, 7]]
[[3, 4], [2, 5], [1, 6], [0, 7], [4, 8]]
[[3, 4], [2, 5], [1, 6], [0, 7], [5, 7], [4, 8]]
[[3, 4], [2, 5], [1, 6], [0, 7], [5, 7], [6, 7], [4, 8]]
[[3, 4], [2, 5], [1, 6], [0, 7], [5, 7], [6, 7], [7, 7], [4, 8]]
What I'm interested in is the last row. The first number of each pair is a student id, and the row is sorted in ascending order by the second number, which is the grade (4, 5, 6, 7, 7, 7, 7, 8).
I'm not sure how to grab that last row and iterate through it so that I get output like
student 3 has a grade of 4 and student 2 has a grade of 5
[[3, 4], [2, 5], [1, 6], [0, 7], [5, 7], [6, 7], [7, 7], [4, 8]]
def main():
    answers = [
        ['A', 'B', 'A', 'C', 'C', 'D', 'E', 'E', 'A', 'D'],
        ['D', 'B', 'A', 'B', 'C', 'A', 'E', 'E', 'A', 'D'],
        ['E', 'D', 'D', 'A', 'C', 'B', 'E', 'E', 'A', 'D'],
        ['C', 'B', 'A', 'E', 'D', 'C', 'E', 'E', 'A', 'D'],
        ['A', 'B', 'D', 'C', 'C', 'D', 'E', 'E', 'A', 'D'],
        ['B', 'B', 'E', 'C', 'C', 'D', 'E', 'E', 'A', 'D'],
        ['B', 'B', 'A', 'C', 'C', 'D', 'E', 'E', 'A', 'D'],
        ['E', 'B', 'E', 'C', 'C', 'D', 'E', 'E', 'A', 'D']]
    keys = ['D', 'B', 'D', 'C', 'C', 'D', 'A', 'E', 'A', 'D']
    grades = []
    # Grade all answers
    for i in range(len(answers)):
        # Grade one student
        correct_count = 0
        for j in range(len(answers[i])):
            if answers[i][j] == keys[j]:
                correct_count += 1
        grades.append([i, correct_count])
    # Sort once, after every student has been graded
    grades.sort(key=lambda x: x[1])
    for student, correct in grades:
        print("Student", student,"'s correct count is", correct)

if __name__ == '__main__':
    main()
You were printing grades while still inside the loop. If you had printed grades after both loops, you would have seen only the last line: [[3, 4], [2, 5], [1, 6], [0, 7], [5, 7], [6, 7], [7, 7], [4, 8]]. Then just loop through grades, and Python will unpack each inner list into student and correct, respectively, as shown above.
Here is the output:
Student 3 's correct count is 4
Student 2 's correct count is 5
Student 1 's correct count is 6
Student 0 's correct count is 7
Student 5 's correct count is 7
Student 6 's correct count is 7
Student 7 's correct count is 7
Student 4 's correct count is 8
Don't forget to click the check mark if you like this answer.
What about something like the following:
students_grade = {}
for student_id, ans in enumerate(answers):
    # Each comparison is True/False, so sum() counts the matches
    students_grade[student_id] = sum([x == y for x, y in zip(ans, keys)])
Now you have a dictionary with the id of students mapping to their score ;)
Of course, you can change the enumerate to have the true list of ids instead!
While MMelvin0581 already addressed the problem in your code, you can also use a nested list comprehension to achieve the same result:
>>> out = [(a, sum([1 if k == i else 0 for k, i in zip(keys, j)])) for a, j in enumerate(answers)]
>>> out
[(0, 7), (1, 6), (2, 5), (3, 4), (4, 8), (5, 7), (6, 7), (7, 7)]
Then you can sort the results by score:
>>> from operator import itemgetter
>>> sorted_list = sorted(out, key=itemgetter(1))
>>> sorted_list
[(3, 4), (2, 5), (1, 6), (0, 7), (5, 7), (6, 7), (7, 7), (4, 8)]
Note: itemgetter has a slight performance benefit over a lambda.
Finally, print your list:
for item in sorted_list:
    print("Student: {} Scored: {}".format(item[0], item[1]))
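The grading, sorting and printing steps discussed in these answers can be gathered into one small function. This is a sketch only: the name grade is hypothetical, and the three-student data below is made up for brevity rather than taken from the question.

```python
def grade(answers, keys):
    """Return (student_index, correct_count) pairs, lowest score first."""
    scores = [(student, sum(given == expected for given, expected in zip(row, keys)))
              for student, row in enumerate(answers)]
    # sorted() is stable, so students with equal scores keep their index order
    return sorted(scores, key=lambda pair: pair[1])

# Made-up sample data: three students, four questions
keys = ['D', 'B', 'D', 'C']
answers = [['A', 'B', 'A', 'C'],
           ['D', 'B', 'D', 'C'],
           ['E', 'D', 'A', 'A']]

for student, correct in grade(answers, keys):
    print("Student {} has a grade of {}".format(student, correct))
```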

append list of values to sublists

How do you append each item of one list to each sublist of another list?
a = [['a','b','c'],['d','e','f'],['g','h','i']]
b = [1,2,3]
Result should be:
[['a','b','c',1],['d','e','f',2],['g','h','i',3]]
Keep in mind that I want to do this to a very large list, so efficiency and speed is important.
I've tried:
for sublist,value in a,b:
    sublist.append(value)
It raises 'ValueError: too many values to unpack'.
Perhaps a listindex or a listiterator could work, but I'm not sure how to apply them here.
a = [['a','b','c'],['d','e','f'],['g','h','i']]
b = [1,2,3]
for ele_a, ele_b in zip(a, b):
    ele_a.append(ele_b)
Result:
>>> a
[['a', 'b', 'c', 1], ['d', 'e', 'f', 2], ['g', 'h', 'i', 3]]
Your original solution did not work because a,b creates a tuple of the two lists, which is not what you want.
>>> z = a,b
>>> type(z)
<class 'tuple'>
>>> z
([['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']], [1, 2, 3])
>>> len(z[0])
3
>>> for ele in z:
...     print(ele)
...
[['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']]
[1, 2, 3]
# In your original code, the first of these (a list of 3 elements) is
# unpacked into two names, hence 'ValueError: too many values to unpack'.
>>> list(zip(a,b)) # using zip gives you what you want.
[(['a', 'b', 'c'], 1), (['d', 'e', 'f'], 2), (['g', 'h', 'i'], 3)]
Here is a simple solution:
a = [['a','b','c'],['d','e','f'],['g','h','i']]
b = [1,2,3]
for i in range(len(a)):
    a[i].append(b[i])
print(a)
One option, using list comprehension:
a = [(a[i] + [b[i]]) for i in range(len(a))]  # wrap b[i] in a list so it can be concatenated
Just loop through the sublists, adding one item at a time:
for i in range(0,len(listA)):
    listA[i].append(listB[i])  # append to the i-th sublist, not to listA itself
You can do:
>>> a = [['a','b','c'],['d','e','f'],['g','h','i']]
>>> b = [1,2,3]
>>> [l1+[l2] for l1, l2 in zip(a,b)]
[['a', 'b', 'c', 1], ['d', 'e', 'f', 2], ['g', 'h', 'i', 3]]
You can also abuse a side effect of list comprehensions to get this done in place:
>>> [l1.append(l2) for l1, l2 in zip(a,b)]
[None, None, None]
>>> a
[['a', 'b', 'c', 1], ['d', 'e', 'f', 2], ['g', 'h', 'i', 3]]
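One thing to keep in mind with the zip-based versions is that zip stops at the shorter input, so a length mismatch would go unnoticed. A minimal sketch that checks lengths first and then appends in place (append_each is a hypothetical name):

```python
def append_each(sublists, values):
    """Append values[i] to sublists[i] in place; refuse mismatched lengths."""
    if len(sublists) != len(values):
        raise ValueError("sublists and values must have the same length")
    for sublist, value in zip(sublists, values):
        sublist.append(value)
    return sublists

a = [['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']]
b = [1, 2, 3]
print(append_each(a, b))
```

Appending in place avoids building a second large list, which may matter for the very large input mentioned in the question.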
