Python list comparison issues - python

I need to write a program in Python that compares two parallel lists to grade a multiple choice exam. One list has the exam solution and the second list has a student's answers. The question number for each missed question is to be stored in a third list using the natural index numbers. The solution must use indexing.
I keep getting an empty list returned for the third list. All help much appreciated!
def main():
    exam_solution = ['B', 'D', 'A', 'A', 'C', 'A', 'B', 'A', 'C', 'D', 'B', 'C',
                     'D', 'A', 'D', 'C', 'C', 'B', 'D', 'A']
    student_answers = ['B', 'D', 'B', 'A', 'C', 'A', 'A', 'A', 'C', 'D', 'B', 'C',
                       'D', 'B', 'D', 'C', 'C', 'B', 'D', 'A']
    questions_missed = []
    for item in exam_solution:
        if item not in student_answers:
            questions_missed.append(item)

questions_missed = [i for i, (ex,st) in enumerate(zip(exam_solution, student_answers)) if ex != st]
Or alternatively, if you prefer loops over list comprehensions:
questions_missed = []
for i, (ex,st) in enumerate(zip(exam_solution, student_answers)):
    if ex != st:
        questions_missed.append(i)
Both give [2, 6, 13] (zero-based indices).
Explanation:
enumerate is a utility function that returns an iterable object which yields tuples of indices and values. Loosely speaking, it lets you "have the current index available during an iteration".
zip pairs up corresponding elements from two or more iterable objects (in your case, lists) as tuples. In Python 2 it returns a list of those tuples; in Python 3 it returns a lazy iterator over them.
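To make the explanation concrete, here is a minimal sketch on shortened, made-up answer lists (not the exam data from the question). It shows what enumerate(zip(...)) yields, and an index-based loop that stores 1-based question numbers, on the assumption that this is what the question means by "natural index numbers". The last lines suggest why the loop in the question returns an empty list: `item not in student_answers` tests membership anywhere in the list, and every letter in the solution occurs somewhere among the student's answers.

```python
# Shortened, made-up lists for illustration only
exam_solution = ['B', 'D', 'A', 'A', 'C']
student_answers = ['B', 'D', 'C', 'A', 'A']

# enumerate(zip(...)) yields (index, (solution, answer)) tuples
pairs = list(enumerate(zip(exam_solution, student_answers)))
print(pairs[2])  # (2, ('A', 'C'))

# Index-based loop; i + 1 turns a zero-based index into a question number
questions_missed = []
for i in range(len(exam_solution)):
    if exam_solution[i] != student_answers[i]:
        questions_missed.append(i + 1)
print(questions_missed)  # [3, 5]

# The membership test from the question never fires on this data,
# because each solution letter appears somewhere in student_answers
membership_misses = [item for item in exam_solution if item not in student_answers]
print(membership_misses)  # []
```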
I'd prefer the list comprehension version.
If I add some timing code, I see that performance doesn't really differ here:
def list_comprehension_version():
    questions_missed = [i for i, (ex,st) in enumerate(zip(exam_solution, student_answers)) if ex != st]
    return questions_missed
def loop_version():
    questions_missed = []
    for i, (ex,st) in enumerate(zip(exam_solution, student_answers)):
        if ex != st:
            questions_missed.append(i)
    return questions_missed
import timeit
# The statement string must call the function; a bare name only times a name lookup.
print("list comprehension: %s" % timeit.timeit("list_comprehension_version()", "from __main__ import exam_solution, student_answers, list_comprehension_version", number=10000000))
print("loop: %s" % timeit.timeit("loop_version()", "from __main__ import exam_solution, student_answers, loop_version", number=10000000))
Note that the figures below were recorded with statement strings that lacked the call parentheses ("list_comprehension_version" rather than "list_comprehension_version()"), so they reflect name lookups rather than the work done by either function:
list comprehension: 0.895029446804
loop: 0.877159359719
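As a hedged alternative, timeit.timeit also accepts a callable instead of a statement string, which sidesteps the risk of leaving the call parentheses out of the string. The sketch below uses only the first ten answers from the question's lists and a small repetition count chosen for illustration; the timings it prints will vary by machine and are not a claim about which version is faster.

```python
import timeit

# First ten entries of the lists from the question
exam_solution = ['B', 'D', 'A', 'A', 'C', 'A', 'B', 'A', 'C', 'D']
student_answers = ['B', 'D', 'B', 'A', 'C', 'A', 'A', 'A', 'C', 'D']

def list_comprehension_version():
    return [i for i, (ex, st) in enumerate(zip(exam_solution, student_answers)) if ex != st]

def loop_version():
    questions_missed = []
    for i, (ex, st) in enumerate(zip(exam_solution, student_answers)):
        if ex != st:
            questions_missed.append(i)
    return questions_missed

# Passing the function objects means timeit calls them on every repetition
t_comprehension = timeit.timeit(list_comprehension_version, number=10000)
t_loop = timeit.timeit(loop_version, number=10000)
print("list comprehension:", t_comprehension)
print("loop:", t_loop)
```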

A solution based on iterators (Python 2 only, because the lambda unpacks its tuple argument in the parameter list, which Python 3 no longer allows):
questions_missed = list(index for (index, _)
                        in filter(
                            lambda (_, (answer, solution)): answer != solution,
                            enumerate(zip(student_answers, exam_solution))))
For the purists, note that in Python 2 you should import the lazy equivalents of zip and filter (izip and ifilter) from itertools.
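A possible Python 3 rendering of the same idea, as a sketch: since Python 3 lambdas cannot unpack tuples in their parameter list, the pair is indexed instead, and the built-in zip and filter are already lazy. The intermediate names pairs and mismatches are introduced here for readability only.

```python
exam_solution = ['B', 'D', 'A', 'A', 'C', 'A', 'B', 'A', 'C', 'D', 'B', 'C',
                 'D', 'A', 'D', 'C', 'C', 'B', 'D', 'A']
student_answers = ['B', 'D', 'B', 'A', 'C', 'A', 'A', 'A', 'C', 'D', 'B', 'C',
                   'D', 'B', 'D', 'C', 'C', 'B', 'D', 'A']

# Each item is (index, (answer, solution)); nothing is evaluated yet
pairs = enumerate(zip(student_answers, exam_solution))

# pair[1] is the (answer, solution) tuple
mismatches = filter(lambda pair: pair[1][0] != pair[1][1], pairs)

questions_missed = [index for index, _ in mismatches]
print(questions_missed)  # [2, 6, 13]
```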

One more solution comes to mind. I put it in a separate answer as it is "special".
Using numpy this task can be accomplished by:
import numpy as np
exam_solution = np.array(exam_solution)
student_answers = np.array(student_answers)
(exam_solution!=student_answers).nonzero()[0]
With numpy-arrays, elementwise comparison is possible via == and !=. .nonzero() returns the indices of the array elements that are not zero. That's it.
Timing is really interesting now. For your short lists, the timings are (N=19, repetitions=100,000):
list comprehension: 0.904024521544
loop: 0.936516107421
numpy: 0.349371968612
This is already a factor of almost 3. Nice, but not amazing.
But when I increase the size of your lists by a factor of 100, I get (N=19*100=1900, repetitions=1000):
list comprehension: 0.866544042939
loop: 1.04464069977
numpy: 0.0334220694495
Now we have a factor of 26 or 31 - that is definitely a lot.
Probably, performance won't be your problem, but, nevertheless, I thought it's worth pointing out.

Related

How to get only lowercase strings from a list using list comprehension

The question asked:
Use list comprehensions to generate a list with only the lowercase letters in my_list. Print the result list.
['a', 'A', 'b', 'B', 'c', 'C', 'd', 'D']
My code:
my_list = ['a', 'A', 'b', 'B', 'c', 'C', 'd', 'D']
hi = ([ char for char in range(len(my_list)) if char%2 == 0])
print(hi)
I tried it out, but got integers as answers and not the strings I wanted.
Note: several answers here assume that what you want is to select the values in the list that are lowercase. This answer assumes that that was an example and that the thing you're trying to do is to select the values in the list that occur at every other list index. (This seems to me to be the correct interpretation, because that's what the implementation in the question appears to be trying to do.) I'm not sure who misunderstood the question here, but since the question can be interpreted multiple ways, I think the question is probably at fault here. Until the question is clarified, I think it should be placed on hold.
The simplest and fastest way to do this is with a slice:
print(my_list[::2]) # Slice the whole list, with step=2
To replicate the logic you're describing, where you want the values at even indexes (index % 2 == 0), you need to generate both the indexes and the values for your list in the comprehension, and use one for the filtering and the other for the result:
hi = [ch for ix, ch in enumerate(my_list) if ix % 2 == 0]
Python strings have an islower method. Also, you can iterate directly over the list; there is no need to check its length or the parity of the indexes.
my_list = ['a', 'A', 'b', 'B', 'c', 'C', 'd', 'D']
hi = [char for char in my_list if char.islower()]
print(hi)
# ['a', 'b', 'c', 'd']
Your list comprehension:
[char for char in range(len(my_list)) if char%2 == 0]
Will produce integers instead of characters. This is because range(len(my_list)) gives you indices. You instead need to get the characters.
This can be done using enumerate():
[char for i, char in enumerate(my_list) if i % 2 == 0]
Or a less pythonic approach, using just indexing my_list:
[my_list[i] for i in range(len(my_list)) if i % 2 == 0]
You can also just filter out the lowercase letters with str.islower():
[char for char in my_list if char.islower()]
Which avoids having to use indices altogether.
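A short sketch of the difference between the approaches discussed above. On the list from the question, selecting by position and selecting by case happen to agree; the second list, mixed, is made up here to show that they are not the same operation in general.

```python
my_list = ['a', 'A', 'b', 'B', 'c', 'C', 'd', 'D']

# What the code in the question builds: the even indices themselves
indices = [i for i in range(len(my_list)) if i % 2 == 0]
print(indices)  # [0, 2, 4, 6]

# Two ways of getting characters back
by_position = my_list[::2]
by_case = [char for char in my_list if char.islower()]
print(by_position == by_case)  # True

# A made-up list where lowercase letters do not sit at even positions
mixed = ['a', 'b', 'C', 'd']
print(mixed[::2])  # ['a', 'C']
print([char for char in mixed if char.islower()])  # ['a', 'b', 'd']
```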
You can use a list comprehension as follows, iterating over the individual elements and checking whether each one is lowercase with .islower():
my_list = ['a', 'A', 'b', 'B', 'c', 'C', 'd', 'D']
lower = [i for i in my_list if i.islower()]
# ['a', 'b', 'c', 'd']
my_list = ['a', 'A', 'b', 'B', 'c', 'C', 'd', 'D']
res = [ char for char in my_list if ord(char)>=97]
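One caveat worth illustrating with a small sketch: the ord(char) >= 97 test works for the letters in the question, but it accepts any character whose code point is 97 or higher, including punctuation such as '{' and '~'. The samples list is made up for this comparison.

```python
# Made-up sample containing non-letters with code points above 97
samples = ['a', 'Z', '{', '~', 'q']

by_ord = [char for char in samples if ord(char) >= 97]
by_method = [char for char in samples if char.islower()]

print(by_ord)     # ['a', '{', '~', 'q']
print(by_method)  # ['a', 'q']
```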
Using the islower() method:
l = ['a', 'A', 'b', 'B', 'c', 'C', 'd', 'D']
result = [el for el in l if el.islower()]
To add: range(len(my_list)) creates range(0, 8), so char is an integer in this case and you end up building a list of integers.
To generate a list with only the lowercase letters, use the islower method:
hi = ([ char for char in my_list if char.islower()])

Alternative to using the sort function when adding to a list?

I want to insert a word alphabetically into a list. Originally I would append the word I'm adding to the end of the list and then sort the list, but I am not allowed to use the sort() function.
Is there a way to do this through a function?
Based off of @SheshankS.'s answer. A function to do this for you:
def insert(item, _list):
    for index, element in enumerate(_list):
        if item < element:  # on strings, < compares lexicographically (alphabetical for same-case letters)
            _list.insert(index, item)
            return  # exit the function since we already inserted
    # if the item was not inserted, it sorts after every existing element, so just append it
    _list.append(item)
Note that since lists are mutable, this will actually mutate the given instance.
So, this:
someList = ["a", "b", "d"]
insert("c", someList)
Will actually change someList instead of just returning the new value.
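If mutating the caller's list is not wanted, one possible variant returns a new list instead. This is a sketch only; the name inserted is made up here and is not part of the answer above.

```python
def inserted(item, items):
    """Return a new sorted list with item added; items is left untouched."""
    for index, element in enumerate(items):
        if item < element:
            # Slicing copies, so the original list is not modified
            return items[:index] + [item] + items[index:]
    # item sorts after every existing element
    return items + [item]

some_list = ["a", "b", "d"]
new_list = inserted("c", some_list)
print(some_list)  # ['a', 'b', 'd']
print(new_list)   # ['a', 'b', 'c', 'd']
```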
Try doing this:
array = ["asdf", "bsdf", "kkkk", "zssdd"]
insertion_string = "zzat"
for i, element in enumerate(array):
    if insertion_string < element:
        array.insert(i, insertion_string)
        break
else:
    # no larger element was found, so the new string goes last
    # (for/else also handles a duplicate of the current last element)
    array.append(insertion_string)
print(array)
Repl.it = https://repl.it/repls/VitalAvariciousCodec
You did not say if you are allowed to use third-party modules, and you did not say if speed is a factor. If you want to add a new item to your sorted list quickly and you are allowed to use a module, use the SortedList class from sortedcontainers. This is a module included in many distributions of Python, such as Anaconda.
This will be simple and fast, even for large lists.
from sortedcontainers import SortedList
someList = SortedList(["a", "b", "d"])
someList.add("c")
print(someList)
The printout from that is
SortedList(['a', 'b', 'c', 'd'])
>>> import bisect
>>> someList = ["a", "b", "d"]
>>> bisect.insort(someList,'c')
>>> someList
['a', 'b', 'c', 'd']
>>>
If the standard library is allowed, you can use bisect:
>>> import bisect
>>> lst = list('abcefg')
>>> for x in 'Adh':
...     lst.insert(bisect.bisect(lst, x), x)
...     print(lst)
...
['A', 'a', 'b', 'c', 'e', 'f', 'g']
['A', 'a', 'b', 'c', 'd', 'e', 'f', 'g']
['A', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
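If neither sort() nor the bisect module is allowed, the same binary search can be written by hand. The sketch below is an assumption about what such a restriction would permit; insert_sorted is a made-up name, and the loop places the new item after any equal elements, which is the position bisect.bisect would also report.

```python
def insert_sorted(items, item):
    """Insert item into the already sorted list items, keeping it sorted."""
    lo, hi = 0, len(items)
    while lo < hi:
        mid = (lo + hi) // 2
        if item < items[mid]:
            hi = mid          # item belongs in the left half
        else:
            lo = mid + 1      # item belongs to the right of mid
    items.insert(lo, item)

words = ['a', 'b', 'd']
insert_sorted(words, 'c')
print(words)  # ['a', 'b', 'c', 'd']

# Uppercase letters compare as smaller than lowercase ones, so 'A' goes first
insert_sorted(words, 'A')
print(words)  # ['A', 'a', 'b', 'c', 'd']
```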

For loop accessing two iteration positions in an array

I have the following code:
someList = ['a', 'b', 'c', 'd', 'e', 'f']
for i,j in enumerate(someList) step 2:
    print('%s, %s' % (someList[i], someList[i+1]))
My question is, is there any way to simplify the iteration over the array in order to avoid the enumerate part and still accessing two variables at a time?
for x, y in zip(someList, someList[1:]):
    print(x, y)
Standard technique. Note that this yields overlapping pairs: (a, b), (b, c), (c, d), and so on.
You can create two iterators, call next on the second and then zip which avoids the need to copy the list elements by slicing:
someList = ['a', 'b', 'c', 'd', 'e', 'f']
it1, it2 = iter(someList), iter(someList)
next(it2)
for a,b in zip(it1, it2):
    print(a, b)
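If the "step 2" in the question means non-overlapping pairs, that is (a, b), (c, d), (e, f), two common idioms are sketched below. Reading the question that way is an assumption. With an odd number of elements, both idioms silently drop the last one.

```python
some_list = ['a', 'b', 'c', 'd', 'e', 'f']

# Pair the even-indexed elements with the odd-indexed ones (copies via slicing)
sliced_pairs = list(zip(some_list[::2], some_list[1::2]))

# Pass the same iterator twice, so each tuple consumes two consecutive items
it = iter(some_list)
shared_pairs = list(zip(it, it))

print(sliced_pairs)  # [('a', 'b'), ('c', 'd'), ('e', 'f')]
print(shared_pairs)  # [('a', 'b'), ('c', 'd'), ('e', 'f')]
```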

python: compare lists in a sequence using nested for loops

So I have two lists where I compare a person's answers to the correct answers:
correct_answers = ['A', 'C', 'A', 'B', 'D']
user_answers = ['B', 'A', 'C', 'B', 'D']
I need to compare the two of them (without using sets, if that's even possible) and keep track of how many of the person's answers are wrong, in this case 3.
I tried using the following for loops to count how many were correct:
correct = 0
for i in correct_answers:
    for j in user_answers:
        if i == j:
            correct += 1
print(correct)
but this doesn't work and I'm not sure what I need to change to make it work.
Just count them:
correct_answers = ['A', 'C', 'A', 'B', 'D']
user_answers = ['B', 'A', 'C', 'B', 'D']
incorrect = sum(1 if correct != user else 0
                for correct, user in zip(correct_answers, user_answers))
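A sketch of why the nested loops in the question overcount: they compare every correct answer with every user answer, not just the answer at the same position. Pairing the lists with zip compares position by position. The names nested_count, right, and given are introduced here for illustration.

```python
correct_answers = ['A', 'C', 'A', 'B', 'D']
user_answers = ['B', 'A', 'C', 'B', 'D']

# Nested loops: 5 x 5 comparisons, counting every cross-position match
nested_count = 0
for i in correct_answers:
    for j in user_answers:
        if i == j:
            nested_count += 1
print(nested_count)  # 6

# zip: one comparison per question
correct = 0
for right, given in zip(correct_answers, user_answers):
    if right == given:
        correct += 1
print(correct)                         # 2
print(len(correct_answers) - correct)  # 3
```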
I blame @alecxe for convincing me to post this, the ultra-efficient solution:
from future_builtins import map  # Python 2 only: iterator-based map that avoids an intermediate list; omit on Python 3, where map is already lazy
from operator import ne
numincorrect = sum(map(ne, correct_answers, user_answers))
Pushes all the work to the C layer (making it crazy fast, modulo the initial cost of setting it all up; no byte code is executed if the values processed are Python built-in types, which removes a lot of overhead), and one-lines it without getting too cryptic.
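On Python 3 the same one-liner should work without the future_builtins import, since that module exists only in Python 2. A minimal sketch, relying on the fact that ne returns booleans and that sum counts True as 1:

```python
from operator import ne

correct_answers = ['A', 'C', 'A', 'B', 'D']
user_answers = ['B', 'A', 'C', 'B', 'D']

# map calls ne(correct, user) for each position
num_incorrect = sum(map(ne, correct_answers, user_answers))
print(num_incorrect)  # 3
```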
The less pythonic, more generic (and readable) solution is pretty simple too.
correct_answers = ['A', 'C', 'A', 'B', 'D']
user_answers = ['B', 'A', 'C', 'B', 'D']
incorrect = 0
for i in range(len(correct_answers)):
    if correct_answers[i] != user_answers[i]:
        incorrect += 1
This assumes your lists are the same length. If you need to validate that, you can do it before running this code.
EDIT: The following code does the same thing, provided you are familiar with zip
correct_answers = ['A', 'C', 'A', 'B', 'D']
user_answers = ['B', 'A', 'C', 'B', 'D']
incorrect = 0
for answer_tuple in zip(correct_answers, user_answers):
    if answer_tuple[0] != answer_tuple[1]:
        incorrect += 1
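The length validation mentioned above could look roughly like the sketch below. The function name count_incorrect and the choice of raising ValueError are assumptions made for illustration. The check matters for the zip version in particular, because zip stops at the shorter list without complaining.

```python
def count_incorrect(correct_answers, user_answers):
    """Count positions where the two answer lists differ."""
    if len(correct_answers) != len(user_answers):
        raise ValueError("answer lists differ in length")
    incorrect = 0
    for i in range(len(correct_answers)):
        if correct_answers[i] != user_answers[i]:
            incorrect += 1
    return incorrect

print(count_incorrect(['A', 'C', 'A', 'B', 'D'], ['B', 'A', 'C', 'B', 'D']))  # 3
```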

Combining elements in list using python

Given input:
list = [['a']['a', 'c']['d']]
Expected Output:
mylist = a,c,d
Tried various possible ways, but the error received is TypeError: list indices must be integers, not tuple.
Tried:
1.
k= []
list = [['a']['a', 'c']['d']]
#k=str(list)
for item in list:
    k+=item
print k
2.
print zip(*list)
etc.
I also need to strip the opening and closing brackets from the output.
What you want is to flatten a list.
>>> import itertools
>>> l
[['a'], ['a', 'c'], ['d']]
>>> res = list(itertools.chain.from_iterable(l))
>>> res
['a', 'a', 'c', 'd']
>>> set(res) #for uniqify, but doesn't preserve order
{'a', 'c', 'd'}
Edit: Your problem is that, when defining a list, you should separate the values with commas. So, not:
list = [['a']['a', 'c']['d']]
Use commas:
list = [['a'], ['a', 'c'], ['d']]
Also, using list as a variable name is a bad idea, because it shadows the built-in list type.
And, if you want to use a for loop:
l = [['a'], ['a', 'c'], ['d']]
k = []
for sublist in l:
    for item in sublist:
        if item not in k:  # if you want the list to be unique
            k.append(item)
But using itertools.chain is a better idea and more Pythonic, I think.
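To get from the flattened list to the a,c,d output the question asks for, one option is sketched below. It assumes Python 3.7 or later, where dict preserves insertion order, so dict.fromkeys removes duplicates while keeping the first occurrence of each item in place. The variable names are made up for the example.

```python
from itertools import chain

nested = [['a'], ['a', 'c'], ['d']]

flat = list(chain.from_iterable(nested))  # ['a', 'a', 'c', 'd']
unique = list(dict.fromkeys(flat))        # duplicates dropped, order kept
joined = ",".join(unique)                 # a plain string, so no brackets

print(joined)  # a,c,d
```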
While utdemir's answer does the job efficiently, I think you should read this - start from "11.6. Recursion".
The first example deals with a similar problem, so you'll see how to handle these kinds of problems using the basic tools.
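For a sense of what a recursive approach can look like, here is a generic sketch. It is not taken from the text referenced above, and the function name flatten is made up. Unlike itertools.chain, it also handles lists nested more than one level deep.

```python
def flatten(nested):
    """Return a flat list of the non-list items found anywhere in nested."""
    result = []
    for element in nested:
        if isinstance(element, list):
            # Recurse into sublists and splice their items in
            result.extend(flatten(element))
        else:
            result.append(element)
    return result

print(flatten([['a'], ['a', 'c'], ['d']]))       # ['a', 'a', 'c', 'd']
print(flatten(['a', ['b', ['c', 'd']], 'e']))    # ['a', 'b', 'c', 'd', 'e']
```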
