Python: output data from a list

I'm trying to figure out how to output list items. The code below takes each student's answers and checks them against a key to see which answers are correct. For each student, the number of correct answers is stored in correct_count. Then I sort in ascending order based on the correct count.
def main():
    answers = [
        ['A', 'B', 'A', 'C', 'C', 'D', 'E', 'E', 'A', 'D'],
        ['D', 'B', 'A', 'B', 'C', 'A', 'E', 'E', 'A', 'D'],
        ['E', 'D', 'D', 'A', 'C', 'B', 'E', 'E', 'A', 'D'],
        ['C', 'B', 'A', 'E', 'D', 'C', 'E', 'E', 'A', 'D'],
        ['A', 'B', 'D', 'C', 'C', 'D', 'E', 'E', 'A', 'D'],
        ['B', 'B', 'E', 'C', 'C', 'D', 'E', 'E', 'A', 'D'],
        ['B', 'B', 'A', 'C', 'C', 'D', 'E', 'E', 'A', 'D'],
        ['E', 'B', 'E', 'C', 'C', 'D', 'E', 'E', 'A', 'D']]
    keys = ['D', 'B', 'D', 'C', 'C', 'D', 'A', 'E', 'A', 'D']
    grades = []
    # Grade all answers
    for i in range(len(answers)):
        # Grade one student
        correct_count = 0
        for j in range(len(answers[i])):
            if answers[i][j] == keys[j]:
                correct_count += 1
        grades.append([i, correct_count])
        grades.sort(key=lambda x: x[1])
        # print("Student", i, "'s correct count is", correct_count)
if __name__ == '__main__':
    main()
If I print out grades, the output looks like this:
[[0, 7]]
[[1, 6], [0, 7]]
[[2, 5], [1, 6], [0, 7]]
[[3, 4], [2, 5], [1, 6], [0, 7]]
[[3, 4], [2, 5], [1, 6], [0, 7], [4, 8]]
[[3, 4], [2, 5], [1, 6], [0, 7], [5, 7], [4, 8]]
[[3, 4], [2, 5], [1, 6], [0, 7], [5, 7], [6, 7], [4, 8]]
[[3, 4], [2, 5], [1, 6], [0, 7], [5, 7], [6, 7], [7, 7], [4, 8]]
What I'm interested in is the last row. The first number of each pair is a student id, and the row is sorted in ascending order by the second number, which is the grade (4, 5, 6, 7, 7, 7, 7, 8).
I'm not sure how to grab that last row and iterate through it so that I get output like
student 3 has a grade of 4 and student 2 has a grade of 5
[[3, 4], [2, 5], [1, 6], [0, 7], [5, 7], [6, 7], [7, 7], [4, 8]]

def main():
    answers = [
        ['A', 'B', 'A', 'C', 'C', 'D', 'E', 'E', 'A', 'D'],
        ['D', 'B', 'A', 'B', 'C', 'A', 'E', 'E', 'A', 'D'],
        ['E', 'D', 'D', 'A', 'C', 'B', 'E', 'E', 'A', 'D'],
        ['C', 'B', 'A', 'E', 'D', 'C', 'E', 'E', 'A', 'D'],
        ['A', 'B', 'D', 'C', 'C', 'D', 'E', 'E', 'A', 'D'],
        ['B', 'B', 'E', 'C', 'C', 'D', 'E', 'E', 'A', 'D'],
        ['B', 'B', 'A', 'C', 'C', 'D', 'E', 'E', 'A', 'D'],
        ['E', 'B', 'E', 'C', 'C', 'D', 'E', 'E', 'A', 'D']]
    keys = ['D', 'B', 'D', 'C', 'C', 'D', 'A', 'E', 'A', 'D']
    grades = []
    # Grade all answers
    for i in range(len(answers)):
        # Grade one student
        correct_count = 0
        for j in range(len(answers[i])):
            if answers[i][j] == keys[j]:
                correct_count += 1
        grades.append([i, correct_count])
    # Sort once, after every student has been graded
    grades.sort(key=lambda x: x[1])
    for student, correct in grades:
        print("Student", student, "'s correct count is", correct)
if __name__ == '__main__':
    main()
What you were doing was printing grades while you were still inside the loop. If you had printed grades after both loops, you would have seen only the last line: [[3, 4], [2, 5], [1, 6], [0, 7], [5, 7], [6, 7], [7, 7], [4, 8]]. Then just loop through grades, and Python will "unpack" each inner list into student and correct, respectively, as shown above.
Here is the output:
Student 3 's correct count is 4
Student 2 's correct count is 5
Student 1 's correct count is 6
Student 0 's correct count is 7
Student 5 's correct count is 7
Student 6 's correct count is 7
Student 7 's correct count is 7
Student 4 's correct count is 8
Don't forget to click the check mark if you like this answer.
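As a small illustration of the unpacking described above, the sketch below starts from the final sorted list in the question and builds the sentence the asker wanted. The variable name lines is only illustrative.

```python
# The final sorted list from the question: [student_id, grade] pairs.
grades = [[3, 4], [2, 5], [1, 6], [0, 7], [5, 7], [6, 7], [7, 7], [4, 8]]

# Each inner list is unpacked into two names on every iteration.
lines = [f"student {student} has a grade of {grade}" for student, grade in grades]

# Join the first two entries to get the sentence asked for in the question.
print(" and ".join(lines[:2]))
```

Unpacking works here because every inner list has exactly two items; a pair of any other length would raise a ValueError.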

What about something like the following:
students_grade = {}
for student_id, ans in enumerate(answers):
    students_grade[student_id] = sum([x == y for x, y in zip(ans, keys)])
Now you have a dictionary mapping each student's id to their score ;)
Of course, you can replace the enumerate with the real list of ids instead!
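A dictionary does not give you the ascending order the question asks for, so one more step is needed. The sketch below, which uses only the first three students from the question to stay short, sorts the dictionary's items by score; the name ranked is an assumption made for illustration.

```python
answers = [
    ['A', 'B', 'A', 'C', 'C', 'D', 'E', 'E', 'A', 'D'],
    ['D', 'B', 'A', 'B', 'C', 'A', 'E', 'E', 'A', 'D'],
    ['E', 'D', 'D', 'A', 'C', 'B', 'E', 'E', 'A', 'D'],
]
keys = ['D', 'B', 'D', 'C', 'C', 'D', 'A', 'E', 'A', 'D']

# Map each student id to the number of answers that match the key.
students_grade = {}
for student_id, ans in enumerate(answers):
    students_grade[student_id] = sum(x == y for x, y in zip(ans, keys))

# items() yields (id, score) pairs; sort them by the score.
ranked = sorted(students_grade.items(), key=lambda item: item[1])
for student_id, score in ranked:
    print(f"student {student_id} has a grade of {score}")
```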

While MMelvin0581 already addressed the problem in your code, you can also use a nested list comprehension to achieve the same result:
>>> out = [(a, sum([1 if k == i else 0 for k, i in zip(keys, j)])) for a, j in enumerate(answers)]
>>> out
[(0, 7), (1, 6), (2, 5), (3, 4), (4, 8), (5, 7), (6, 7), (7, 7)]
Then you can sort the results by score:
>>> from operator import itemgetter
>>> sorted_list = sorted(out, key=itemgetter(1))
>>> sorted_list
[(3, 4), (2, 5), (1, 6), (0, 7), (5, 7), (6, 7), (7, 7), (4, 8)]
Note: itemgetter may have a slight performance benefit over an equivalent lambda.
Then finally print your list like:
for item in sorted_list:
    print("Student: {} Scored: {}".format(item[0], item[1]))

Related

Convert a list of string to category integer in Python

Given a list of string,
['a', 'a', 'c', 'a', 'a', 'a', 'd', 'c', 'd', 'd', 'd', 'd', 'c', 'd', 'd', 'd', 'd', 'c', 'd', 'd', 'd', 'd', 'c', 'b', 'b', 'b', 'd', 'b', 'b', 'b']
I would like to convert to an integer-category form
[0, 0, 2, 0, 0, 0, 3, 2, 3, 3, 3, 3, 2, 3, 3, 3, 3, 2, 3, 3, 3, 3, 2, 1, 1, 1, 3, 1, 1, 1]
This can be achieved using numpy.unique as below
import numpy as np
ipt=['a', 'a', 'c', 'a', 'a', 'a', 'd', 'c', 'd', 'd', 'd', 'd', 'c', 'd', 'd', 'd', 'd', 'c', 'd', 'd', 'd', 'd', 'c', 'b', 'b', 'b', 'd', 'b', 'b', 'b']
_, opt = np.unique(np.array(ipt), return_inverse=True)
But I am curious whether there is an alternative that does not need numpy.
If you are solely interested in finding integer representation of factors, then you can use a dict comprehension along with enumerate to store the mapping, after using set to find unique values:
lst = ['a', 'a', 'c', 'a', 'a', 'a', 'd', 'c', 'd', 'd', 'd', 'd', 'c', 'd', 'd', 'd', 'd', 'c', 'd', 'd', 'd', 'd', 'c', 'b', 'b', 'b', 'd', 'b', 'b', 'b']
d = {x: i for i, x in enumerate(set(lst))}
lst_new = [d[x] for x in lst]
print(lst_new)
# [3, 3, 0, 3, 3, 3, 2, 0, 2, 2, 2, 2, 0, 2, 2, 2, 2, 0, 2, 2, 2, 2, 0, 1, 1, 1, 2, 1, 1, 1]
This approach can be used for general factors, i.e., the factors do not have to be 'a', 'b' and so on, but can be 'dog', 'bus', etc. One drawback is that it ignores the order of the factors: a set is unordered, so the integer assigned to each factor is arbitrary and the output above is just one possible result. If you want the integers to follow the sorted order of the factors, you can use sorted:
d = {x: i for i, x in enumerate(sorted(set(lst)))}
lst_new = [d[x] for x in lst]
print(lst_new)
# [0, 0, 2, 0, 0, 0, 3, 2, 3, 3, 3, 3, 2, 3, 3, 3, 3, 2, 3, 3, 3, 3, 2, 1, 1, 1, 3, 1, 1, 1]
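If you would rather number the factors in order of first appearance than in sorted order, one possible variation is sketched below. It swaps set for dict.fromkeys, which is not part of the answer above, and relies on dictionaries preserving insertion order (guaranteed from Python 3.7). The shortened list and the name codes are illustrative.

```python
lst = ['a', 'a', 'c', 'a', 'd', 'c', 'b', 'd']

# dict.fromkeys drops duplicates while keeping first-appearance order.
codes = {value: index for index, value in enumerate(dict.fromkeys(lst))}
lst_new = [codes[value] for value in lst]

print(codes)    # {'a': 0, 'c': 1, 'd': 2, 'b': 3}
print(lst_new)  # [0, 0, 1, 0, 2, 1, 3, 2]
```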
You could take a note out of the functional programming book:
ipt=['a', 'a', 'c', 'a', 'a', 'a', 'd', 'c', 'd', 'd', 'd', 'd', 'c', 'd', 'd', 'd', 'd', 'c', 'd', 'd', 'd', 'd', 'c', 'b', 'b', 'b', 'd', 'b', 'b', 'b']
opt = list(map(lambda x: ord(x)-97, ipt))
This code iterates through the input array and passes each element through the lambda function, which takes the ascii value of the character, and subtracts 97 (to convert the characters to 0-25).
If each string isn't a single character, then the lambda function may need to be adapted.
You could write a custom function to do the same thing as you are using numpy.unique() for.
def unique(my_list):
    ''' Takes a list and returns two lists, a list of each unique entry and the index of
    each unique entry in the original list
    '''
    unique_list = []
    int_cat = []
    for item in my_list:
        if item not in unique_list:
            unique_list.append(item)
        int_cat.append(unique_list.index(item))
    return unique_list, int_cat
Or if you wanted your indexing to be ordered.
def unique_ordered(my_list):
    ''' Takes a list and returns two lists, an ordered list of each unique entry and the
    index of each unique entry in the original list
    '''
    # Unique list
    unique_list = []
    for item in my_list:
        if item not in unique_list:
            unique_list.append(item)
    # Sorting unique list alphabetically
    unique_list.sort()
    # Integer category list
    int_cat = []
    for item in my_list:
        int_cat.append(unique_list.index(item))
    return unique_list, int_cat
Comparing the computation time for these two vs numpy.unique() for 100,000 iterations of your example list, we get:
numpy = 2.236004s
unique = 0.460719s
unique_ordered = 0.505591s
This shows that either option is faster than numpy for simple lists. More complicated strings slow down unique() and unique_ordered() much more than numpy.unique(). Doing 10,000 iterations on a random, 100-element list of 20-character strings, we get times of:
numpy = 0.45465s
unique = 1.56963s
unique_ordered = 1.59445s
So if efficiency was important and your list had more complex/a larger variety of strings, it would likely be better to use numpy.unique()
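One likely reason the pure-Python versions slow down as the variety of strings grows is that both `item not in unique_list` and `unique_list.index(item)` scan the list from the start. A possible way to avoid those scans, sketched here under the assumption that the items are hashable, is to look positions up in a dictionary. The function name unique_ordered_fast is hypothetical and it has not been timed against the numbers above.

```python
def unique_ordered_fast(my_list):
    """Return the sorted unique entries and the integer category of each item."""
    unique_list = sorted(set(my_list))
    # Dictionary lookups take constant time on average, unlike list.index.
    position = {item: index for index, item in enumerate(unique_list)}
    int_cat = [position[item] for item in my_list]
    return unique_list, int_cat


labels, categories = unique_ordered_fast(['a', 'a', 'c', 'd', 'b'])
print(labels)      # ['a', 'b', 'c', 'd']
print(categories)  # [0, 0, 2, 3, 1]
```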

Possible sets of different sub-list items with one element of each sub-list

I am looking for a way to obtain combinations of single elements of all sub-lists contained in a list without knowing in advance the length of the list and the sub-lists. Let me illustrate what I mean via two examples below. I have two lists (myList1 and myList2) and would like to obtain the two combination sets (setsCombo1 and setsCombo2):
myList1 = [['a'], [1, 2, 3], ['X', 'Y']]
setsCombo1 = [['a', 1, 'X'],
['a', 1, 'Y'],
['a', 2, 'X'],
['a', 2, 'Y'],
['a', 3, 'X'],
['a', 3, 'Y']]
myList2 = [['a'], [1, 2, 3], ['X', 'Y'], [8, 9]]
setsCombo2 = [['a', 1, 'X', 8],
['a', 1, 'X', 9],
['a', 1, 'Y', 8],
['a', 1, 'Y', 9],
['a', 2, 'X', 8],
['a', 2, 'X', 9],
['a', 2, 'Y', 8],
['a', 2, 'Y', 9],
['a', 3, 'X', 8],
['a', 3, 'X', 9],
['a', 3, 'Y', 8],
['a', 3, 'Y', 9]]
I looked a bit into itertools but couldn't really find anything quickly that is appropriate...
itertools.product with unpacking * (almost) does that:
>>> from itertools import product
>>> list(product(*myList1))
[('a', 1, 'X'),
('a', 1, 'Y'),
('a', 2, 'X'),
('a', 2, 'Y'),
('a', 3, 'X'),
('a', 3, 'Y')]
To cast the inner elements to lists, we map:
>>> list(map(list, product(*myList1)))
[['a', 1, 'X'],
['a', 1, 'Y'],
['a', 2, 'X'],
['a', 2, 'Y'],
['a', 3, 'X'],
['a', 3, 'Y']]
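Because the sub-lists are unpacked with *, the same call works no matter how many sub-lists there are. As a quick check, here is the approach applied to myList2 from the question, written as a list comprehension instead of map:

```python
from itertools import product

myList2 = [['a'], [1, 2, 3], ['X', 'Y'], [8, 9]]

# product yields tuples, so each one is converted to a list.
setsCombo2 = [list(combo) for combo in product(*myList2)]

print(len(setsCombo2))  # 12
print(setsCombo2[0])    # ['a', 1, 'X', 8]
print(setsCombo2[-1])   # ['a', 3, 'Y', 9]
```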

Averaging an element in 3D list python

I have a 3D list such as -
[
[[ 'A', 'B', 4], [ 'A', 'B', 6], [ 'A', 'B', 5], [ 'A', 'B', 7]],
[[ 'C', 'D', 5], [ 'C', 'D', 3], [ 'C', 'D', 2]],
[[ 'E', 'F', 4], [ 'E', 'F', 7], [ 'E', 'F', 3], [ 'E', 'F', 9], [ 'E', 'F', 11]]
....
..
]
I need to calculate the average of 3rd element of each 2D list and add it as an element in the list itself.
Tried itertools and other techniques to crawl the list but failing to get the number of elements in each 2D list for average calculation.
Expected output of the list -
[
[[ 'A', 'B', 4], [ 'A', 'B', 6], [ 'A', 'B', 5], [ 'A', 'B', 7], [ 'A', 'B', 5.5]],
[[ 'C', 'D', 5], [ 'C', 'D', 3], [ 'C', 'D', 2], [ 'C', 'D', 3.3]],
[[ 'E', 'F', 4], [ 'E', 'F', 7], [ 'E', 'F', 3], [ 'E', 'F', 9], [ 'E', 'F', 11], [ 'E', 'F', 6.8]]
....
..
]
I have tried -
for eachReading in list_final:
    avg = sum(eachReading[0], eachReading[len(list_final - 1)]) / len(list_final)
    eachReading.append(eachReading[0], eachReading[1], avg)
itertools is definitely an overkill. Use a simple loop (sorry for the terrible variable names):
li = [
[['A', 'B', 4], ['A', 'B', 6], ['A', 'B', 5], ['A', 'B', 7]],
[['C', 'D', 5], ['C', 'D', 3], ['C', 'D', 2]],
[['E', 'F', 4], ['E', 'F', 7], ['E', 'F', 3], ['E', 'F', 9], ['E', 'F', 11]]
]
for inner_list in li:
    avg = sum(inner_inner[-1] for inner_inner in inner_list) / len(inner_list)
    # round to however many digits you want
    inner_list.append([inner_list[0][0], inner_list[0][1], round(avg, 2)])
print(li)
Outputs
[[['A', 'B', 4], ['A', 'B', 6], ['A', 'B', 5], ['A', 'B', 7], ['A', 'B', 5.5]],
[['C', 'D', 5], ['C', 'D', 3], ['C', 'D', 2], ['C', 'D', 3.33]],
[['E', 'F', 4],
['E', 'F', 7],
['E', 'F', 3],
['E', 'F', 9],
['E', 'F', 11],
['E', 'F', 6.8]]]
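The loop above modifies li in place. If the original data needs to stay untouched, a variation along these lines could build a new list instead. The function name with_averages and its digits parameter are hypothetical, and the sketch assumes every group has at least one row.

```python
def with_averages(groups, digits=2):
    """Return a copy of `groups` with an average row appended to each group."""
    result = []
    for group in groups:
        avg = sum(row[-1] for row in group) / len(group)
        first, second = group[0][0], group[0][1]
        # Copy the rows so the caller's lists are not shared with the result.
        result.append([list(row) for row in group] + [[first, second, round(avg, digits)]])
    return result


li = [
    [['A', 'B', 4], ['A', 'B', 6], ['A', 'B', 5], ['A', 'B', 7]],
    [['C', 'D', 5], ['C', 'D', 3], ['C', 'D', 2]],
]
averaged = with_averages(li)
for group in averaged:
    print(group[-1])
```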

Python: how to replicate the same row of a matrix?

How can I copy each row of an array n times?
So if I have a 2x3 array, and I copy each row 3 times, I will have a 6x3 array. For example, I need to convert A to B below:
A = np.array([[1, 2, 3],
[4, 5, 6]])
B = np.array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3],
[4, 5, 6],
[4, 5, 6],
[4, 5, 6]])
If possible, I would like to avoid a for loop.
If I read correctly, this is probably what you want, assuming you started with mat:
transformed = np.concatenate([np.vstack([mat[i, :]] * 3).T for i in range(mat.shape[0])], axis=1).T
Here's a verifiable example:
# mocking a starting array
import string
import numpy as np
mat = np.random.choice(list(string.ascii_lowercase), size=(5,3))
>>> mat
array([['s', 'r', 'e'],
['g', 'v', 'c'],
['i', 'b', 'd'],
['f', 'g', 's'],
['o', 'm', 'w']], dtype='<U1')
Transform it:
# this repeats it 3 times for sake of displaying
transformed = np.concatenate([np.vstack([mat[i, :]] * 3).T for i in range(mat.shape[0])], axis=1).T
>>> transformed
array([['s', 'r', 'e'],
['s', 'r', 'e'],
['s', 'r', 'e'],
['g', 'v', 'c'],
['g', 'v', 'c'],
['g', 'v', 'c'],
['i', 'b', 'd'],
['i', 'b', 'd'],
['i', 'b', 'd'],
['f', 'g', 's'],
['f', 'g', 's'],
['f', 'g', 's'],
['o', 'm', 'w'],
['o', 'm', 'w'],
['o', 'm', 'w']], dtype='<U1')
The idea is to use vstack to stack each row on itself multiple times, and then concatenate the results to get the final array.
You can use np.repeat with integer positional indexing:
B = A[np.repeat(np.arange(A.shape[0]), 3)]
array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3],
[4, 5, 6],
[4, 5, 6],
[4, 5, 6]])
v1=[3,2]
v3=v1[:]*10
print(v3)
np.repeat is exactly what you are looking for. You can use the axis option to specify that you want to duplicate rows.
B = np.repeat(A, 3, axis=0)
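For comparison, if the data is a plain nested list rather than a NumPy array, the same row repetition might be sketched as below. The helper name repeat_rows is hypothetical, and the comprehension is still a loop under the hood, so for arrays np.repeat remains the more natural choice.

```python
def repeat_rows(rows, n):
    """Return a new list in which every row of `rows` appears n times in a row."""
    # list(row) makes each repeated row an independent copy.
    return [list(row) for row in rows for _ in range(n)]


A = [[1, 2, 3], [4, 5, 6]]
B = repeat_rows(A, 3)
print(B)
```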

Slicing flat list into multi-level nested list efficiently

For example, I have a flat list
[1, 2, 3, 4, 5, 6, 7, 8, 9, 'A', 'B', 'C', 'D', 'E', 'F', 'G']
I want to transform it into 4-deep list
[[[[1, 2], [3, 4]], [[5, 6], [7, 8]]], [[[9, 'A'], ['B', 'C']], [['D', 'E'], ['F', 'G']]]]
Is there a way to do it without creating a separate variable for every level? What is the most memory- and performance-efficient way?
UPDATE:
Also, is there a way to do it in a non-symmetrical fashion?
[[[[1, 2, 3], 4], [[5, 6, 7], 8]], [[[9, 'A', 'B'], 'C'], [['D', 'E', 'F'], 'G']]]
Note that your first list has 15 elements instead of 16. Also, what should A be? Is it a constant you've defined somewhere else? I'll just assume it's a string : 'A'.
If you work with np.arrays, you could simply reshape your array:
import numpy as np
r = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 'A', 'B', 'C', 'D', 'E', 'F', 'G'])
r.reshape(2,2,2,2)
It outputs:
array([[[['1', '2'],
         ['3', '4']],
        [['5', '6'],
         ['7', '8']]],
       [[['9', 'A'],
         ['B', 'C']],
        [['D', 'E'],
         ['F', 'G']]]],
      dtype='<U11')
This should be really efficient because numpy doesn't change the underlying data format. It's still a flat array, displayed differently.
Numpy doesn't support irregular shapes. You'll have to work with standard python lists then:
i = iter([1, 2, 3, 4, 5, 6, 7, 8, 9, 'A', 'B', 'C', 'D', 'E', 'F', 'G'])
l1 = []
for _ in range(2):
    l2 = []
    for _ in range(2):
        l3 = []
        l4 = []
        for _ in range(3):
            l4.append(next(i))
        l3.append(l4)
        l3.append(next(i))
        l2.append(l3)
    l1.append(l2)
print(l1)
# [[[[1, 2, 3], 4], [[5, 6, 7], 8]], [[[9, 'A', 'B'], 'C'], [['D', 'E', 'F'], 'G']]]
As you said, you'll have to define a temporary variable for each level. I guess you could use list comprehensions, but they wouldn't be pretty.
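For the symmetrical case, one way to avoid a separate variable per level without NumPy is a small recursive helper. The sketch below is only one possible approach: the function name nest and its shape argument are hypothetical, it handles regular shapes only, and the slicing creates temporary copies, so it is not tuned for memory use. math.prod requires Python 3.8 or later.

```python
import math


def nest(flat, shape):
    """Split `flat` into nested lists whose sizes per level follow `shape`."""
    if len(flat) != math.prod(shape):
        raise ValueError("shape does not match the number of elements")
    if len(shape) == 1:
        return list(flat)
    # Each of the shape[0] chunks holds the elements of one sub-list.
    chunk = len(flat) // shape[0]
    return [nest(flat[k * chunk:(k + 1) * chunk], shape[1:]) for k in range(shape[0])]


flat = [1, 2, 3, 4, 5, 6, 7, 8, 9, 'A', 'B', 'C', 'D', 'E', 'F', 'G']
nested = nest(flat, (2, 2, 2, 2))
print(nested)
```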
