Finding the common items between two lists in Python

Hey guys, I need help with this past test question. Basically, I am given two lists of objects, and I am supposed to find the number of items that appear in the same position in both the first list and the second. Here is the example that was provided:
>>> commons(['a', 'b', 'c', 'd'], ['a', 'x', 'b', 'd'])
2
>>> commons(['a', 'b', 'c', 'd', 'e'], ['a', 'x', 'b', 'd'])
2
I am having trouble writing the code. Our class is using Python 3. I have no idea where to start. It is a first-year programming course and I have never programmed before.

I think a more straightforward solution would be:
def commons(L1,L2):
    return len([x for x in zip(L1,L2) if x[0]==x[1]])

This is not a simple problem for a beginner. A more straightforward approach would use functions like sum and zip with a generator expression, like so:
def commons(L1, L2):
    # True counts as 1 and False as 0, so summing the comparisons counts the matches.
    return sum(el1 == el2 for el1, el2 in zip(L1, L2))
A more typical but error prone approach taken by beginners is:
def commons(L1, L2):
    count = 0
    for i, elem in enumerate(L2):
        if elem == L1[i]:
            count += 1
    return count
I say this is more error prone because there are more parts to get right.
Without using enumerate you could do:
def commons(L1, L2):
    count = 0
    for i in range(len(L2)):
        if L1[i] == L2[i]:
            count += 1
    return count
but these previous two will work only if len(L2) <= len(L1). See what I mean by more error prone? To fix this you would need to do:
def commons(L1, L2):
    count = 0
    for i in range(min(len(L2), len(L1))):
        if L1[i] == L2[i]:
            count += 1
    return count
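To make the length problem concrete, here is a small sketch. The names commons_unguarded and commons_guarded are made up for this illustration; they are just the two index-based versions above under different names.

```python
def commons_unguarded(L1, L2):
    """Index-based count; only safe when len(L2) <= len(L1)."""
    count = 0
    for i in range(len(L2)):
        if L1[i] == L2[i]:
            count += 1
    return count


def commons_guarded(L1, L2):
    """Same loop, but stops at the end of the shorter list."""
    count = 0
    for i in range(min(len(L1), len(L2))):
        if L1[i] == L2[i]:
            count += 1
    return count


short = ['a', 'x', 'b', 'd']
long_ = ['a', 'b', 'c', 'd', 'e']

# With the shorter list first, the unguarded loop indexes past its end.
try:
    commons_unguarded(short, long_)
    raised = False
except IndexError:
    raised = True

guarded_result = commons_guarded(short, long_)
print(raised, guarded_result)
```

The zip-based versions avoid this entirely, because zip stops at the end of the shorter input.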

Seems like this would work:
from itertools import zip_longest
def commons(l1, l2):
    return sum(1 for v1, v2 in zip_longest(l1, l2) if v1 == v2)
Note: zip_longest pads the shorter list with None (Python 2's map(None, l1, l2) did the same, but that form no longer works in Python 3), so this works even if l1 and l2 are not the same length. It assumes that neither l1 nor l2 contains None, since a None in the longer list would count as a false positive against the padding.
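If the lists might contain None, one possible way around the false positive is to pad with a private sentinel object instead. This is only a sketch; _MISSING is a name invented here, and itertools.zip_longest with its fillvalue argument is the Python 3 standard-library way to pad the shorter input.

```python
from itertools import zip_longest

# Hypothetical sentinel: a fresh object compares equal only to itself,
# so it can never match a real list element.
_MISSING = object()


def commons(l1, l2):
    """Count positions where both lists hold equal items."""
    return sum(
        1
        for v1, v2 in zip_longest(l1, l2, fillvalue=_MISSING)
        if v1 == v2
    )


print(commons(['a', 'b', 'c', 'd'], ['a', 'x', 'b', 'd']))
print(commons([None, 'a'], [None, 'a', None]))
```

With the default None padding, the second call would count the trailing None as a match; with the sentinel it does not.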

Related

New list of not repeated elements

I want to create a function that takes a list as an argument, for example:
list = ['a','b','a','d','e','f','a','b','g','b']
and returns a specific number of list elements (I choose the number) such that no element occurs twice. For example, if I choose 3:
new_list = ['a','b','d']
I tried the following:
def func(j, list):
    new_list=[]
    for i in list:
        while(len(new_list)<j):
            for k in new_list:
                if i != k:
                    new_list.append(i)
    return new_list
But the function goes into an infinite loop.
def func(j, mylist):
    # dedup, preserving order (dict is insertion-ordered as a language guarantee as of 3.7):
    deduped = list(dict.fromkeys(mylist))
    # Slice off all but the part you care about:
    return deduped[:j]
If performance for large inputs is a concern, that's suboptimal (it processes the whole input even if j unique elements are found in first j indices out of an input where j is much smaller than the input), so the more complicated solution can be used for maximum efficiency. First, copy the itertools unique_everseen recipe:
from itertools import filterfalse, islice # At top of file, filterfalse for recipe, islice for your function
def unique_everseen(iterable, key=None):
    "List unique elements, preserving order. Remember all elements ever seen."
    # unique_everseen('AAAABBBCCDAABBB') --> A B C D
    # unique_everseen('ABBCcAD', str.lower) --> A B C D
    seen = set()
    seen_add = seen.add
    if key is None:
        for element in filterfalse(seen.__contains__, iterable):
            seen_add(element)
            yield element
    else:
        for element in iterable:
            k = key(element)
            if k not in seen:
                seen_add(k)
                yield element
Now wrap it with islice to pull only as many elements as required, exiting immediately once you have them (without processing the rest of the input at all):
def func(j, mylist): # Note: Renamed list argument to mylist to avoid shadowing built-in
    return list(islice(unique_everseen(mylist), j))
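To see that the lazy approach really stops early, here is a small self-contained sketch. It uses a simplified stand-in for the recipe (the names first_unique and _unique are made up here) and feeds it an endless input, which a "dedupe everything, then slice" approach could never finish.

```python
from itertools import count, islice


def first_unique(j, iterable):
    """Return the first j distinct items of iterable, in first-seen order."""
    seen = set()

    def _unique():
        # Simplified stand-in for the unique_everseen recipe (no key support).
        for item in iterable:
            if item not in seen:
                seen.add(item)
                yield item

    # islice stops pulling from the generator once j items have been produced.
    return list(islice(_unique(), j))


endless = (n % 5 for n in count())  # 0, 1, 2, 3, 4, 0, 1, ... forever
print(first_unique(3, endless))
print(first_unique(3, ['a', 'b', 'a', 'd', 'e', 'f', 'a', 'b', 'g', 'b']))
```

Like the recipe, this assumes the items are hashable, since they are stored in a set.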
Try this.
lst = ['a','b','a','d','e','f','a','b','g','b']
j = 3
def func(j,list_):
    new_lst = []
    for a in list_:
        if a not in new_lst:
            new_lst.append(a)
    return new_lst[:j]
print(func(j,lst)) # ['a', 'b', 'd']
I don't know why no one has posted a numpy.unique solution.
Here is a memory-efficient way (I think 😉).
import numpy as np
lst = ['a','b','a','d','e','f','a','b','g','b']
def func(j,list_):
    # Note: np.unique returns the unique values sorted, not in first-seen order.
    return np.unique(list_).tolist()[:j]
print(func(3,lst)) # ['a', 'b', 'd']
list is a built-in name in Python, so avoid using it as a variable name.
If order of the elements is not a concern then
def func(j, user_list):
    return list(set(user_list))[:j]
It's bad practice to use "list" as a variable name.
You can solve the problem by using Counter from Python's collections module:
from collections import Counter
a=['a','b','a','d','e','f','a','b','g','b']
b = list(Counter(a))
print(b[:3])
So your function will be something like this:
def unique_slice(list_in, elements):
    new_list = list(Counter(list_in))
    print("New list: {}".format(new_list))
    if int(elements) <= len(new_list):
        return new_list[:elements]
    return new_list
Hope it solves your problem.
As others have said, you should not shadow the built-in name 'list', because that can lead to many issues. This is a simple problem where you add each element to a new list after checking that it was not already added.
Slice syntax in Python lets you split a list at an index.
>>> l = [1, 2, 3, 4]
>>> l[:1]
[1]
>>> l[1:]
[2, 3, 4]
lst = ['a', 'b', 'a', 'd', 'e', 'f', 'a', 'b', 'g', 'b']
def func(number, _list):
    out = []
    for a in _list:
        if a not in out:
            out.append(a)
    return out[:number]
print(func(4, lst)) # ['a', 'b', 'd', 'e']

Sequence using recursion

I am trying to write a function which outputs all possible sequences of a given length from a list of characters, without repeats like aa, bb, etc.
I am now at this stage:
def sequences(char_list, n, lst = []):
    if len(lst) == n:
        print(lst)
    else:
        for i in range(len(char_list)):
            temp_list = [char_list[j] for j in range(len(char_list)) if i != j]
            sequences(temp_list, n, lst + [char_list[i]])
print(sequences(["a", "b", "c"], 2))
Output is correct but I have None at the end. I actually have no idea why.
['a', 'b']
['a', 'c']
['b', 'a']
['b', 'c']
['c', 'a']
['c', 'b']
None
And what is the best way to get strings in the output and not lists?
The function sequences doesn't return anything (there's no return statement anywhere in the code), so it'll automatically return None. print(sequences(["a", "b", "c"], 2)) will execute this function and print its return value, outputting None.
To get strings instead of lists, concatenate all the strings in the list like this:
print(''.join(lst))
Every function has an implicit return None at the end of it. On the last line of your code, you asked Python to print the output of sequences, which is None since no return value was specified.
Well, the problem was that I printed it once again at the end.
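One way to sidestep the stray None altogether is to have the function produce its results instead of printing them. The following is only a sketch of that idea: it rewrites sequences as a generator that yields strings (the prefix parameter is introduced here for illustration) and checks the result against itertools.permutations from the standard library.

```python
from itertools import permutations


def sequences(chars, n, prefix=""):
    """Yield every length-n string built from distinct positions of chars."""
    if len(prefix) == n:
        yield prefix
        return
    for i, ch in enumerate(chars):
        # Drop the character at position i so it cannot be reused.
        rest = chars[:i] + chars[i + 1:]
        yield from sequences(rest, n, prefix + ch)


result = list(sequences(["a", "b", "c"], 2))
print(result)

# The standard library produces the same sequences, as tuples.
expected = ["".join(p) for p in permutations(["a", "b", "c"], 2)]
print(result == expected)
```

Because the caller now decides what to do with the values, there is nothing left for print to show except the sequences themselves.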

Returning semi-unique values from a list

Not sure how else to word this, but say I have a list containing the following sequence:
[a,a,a,b,b,b,a,a,a]
and I would like to return:
[a,b,a]
How would one do this in principle?
You can use itertools.groupby, which groups consecutive equal elements together and returns an iterator of (key, group) pairs, where the key is the unique element you are looking for:
from itertools import groupby
lst = ['a','a','a','b','b','b','a','a','a']
[k for k, _ in groupby(lst)]
# ['a', 'b', 'a']
Psidom's way is a lot better, but I may as well write this so you can see how it'd be possible just using basic loops and statements. It's always good to figure out what steps you'd need to take for any problem, as it usually makes coding the simple things a bit easier :)
original = ['a','a','a','b','b','b','a','a','a']
new = [original[0]]
for letter in original[1:]:
    if letter != new[-1]:
        new.append(letter)
Basically it will append a letter if the previous letter is something different.
Using list comprehension:
original = ['a','a','a','b','b','b','a','a','a']
packed = [original[i] for i in range(len(original)) if i == 0 or original[i] != original[i-1]]
print(packed) # > ['a', 'b', 'a']
Similarly (thanks to pylang) you can use enumerate instead of range:
[ x for i,x in enumerate(original) if i == 0 or x != original[i-1] ]
more_itertools has an implementation of the unique_justseen recipe from itertools:
import more_itertools as mit
list(mit.unique_justseen(["a","a","a","b","b","b","a","a","a"]))
# ['a', 'b', 'a']
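If installing more_itertools is not an option, the same behaviour can be sketched with groupby from the standard library. This is a minimal version written for illustration, not the library's actual source; the optional key argument mirrors the one groupby accepts.

```python
from itertools import groupby


def unique_justseen(iterable, key=None):
    """Yield the first element of each run of consecutive equal elements."""
    for _, group in groupby(iterable, key):
        # Each group is a run of consecutive items with the same key.
        yield next(group)


print(list(unique_justseen(['a', 'a', 'a', 'b', 'b', 'b', 'a', 'a', 'a'])))
print(list(unique_justseen('ABBcCAD', str.lower)))
```

With a key, the first element of each run is kept as it appeared in the input, so the case of the original item is preserved.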

Sort list in place using another list of index

Given
a = [1,4,5,3,2,6,0]
b = ['b','e','f','d','c','g','a']
order b in place. The expected position of each element of b is given by the element of a at the same index.
The output will be
['a','b','c','d','e','f','g']
Try other similar input sets:
a = [4,0,1,3,2]
b = ['E','A','B','D','C']
I can get it done using a third list (even sorted() creates a third list), but the key is to sort b in place:
print(sorted(b,key=lambda bi : a[b.index(bi)]))
The core of the problem is how to avoid iterating over items in b that have already been processed.
Try this:
list(zip(*sorted(zip(a, b))))[1]
Should give:
('a', 'b', 'c', 'd', 'e', 'f', 'g')
Since during sorting the b itself appears to be empty (see my question about that), you can use that piece of code to do it in-place:
b.sort(key=lambda x, b=b[:]: a[b.index(x)])
This uses a copy of the b to search in during sorting. This is certainly not very good for performance, so don't blame me ;-)
The key is to realise that the items in b aren't much use to the key function. You are interested in their counterparts in a. Doing this in place means you can't just use zip to pair the items up. Here I use the default argument trick to get an iterator over a into the lambda function.
>>> a = [1,4,5,3,2,6,0]
>>> b = ['b','e','f','d','c','g','a']
>>> b.sort(key=lambda x, it=iter(a): next(it))
>>> b
['a', 'b', 'c', 'd', 'e', 'f', 'g']
def sorter(a,b):
    for i in range(len(a)):
        while i != a[i]:
            ai = a[i]
            b[i], b[ai], a[i], a[ai] = b[ai], b[i], a[ai], a[i]
    return b
Simple bubble sort:
for i in range(len(a)):
    for j in range(len(a) - 1 - i):
        if a[j] > a[j + 1]:
            # swap a[j] & a[j+1]
            a[j], a[j + 1] = a[j + 1], a[j]
            # swap b[j] & b[j+1]
            b[j], b[j + 1] = b[j + 1], b[j]
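If "in place" only needs to mean "the same list object ends up reordered" (an assumption, since a temporary list is still built along the way), slice assignment is another option. A minimal sketch, with reorder_in_place as a made-up name:

```python
def reorder_in_place(a, b):
    """Reorder b so that each b[i] ends up at position a[i].

    Builds a temporary list, then writes it back into the same list
    object with slice assignment, so other references to b see the change.
    """
    ordered = [item for _, item in sorted(zip(a, b), key=lambda pair: pair[0])]
    b[:] = ordered


a = [1, 4, 5, 3, 2, 6, 0]
b = ['b', 'e', 'f', 'd', 'c', 'g', 'a']
alias = b  # a second reference to the same list object

reorder_in_place(a, b)
print(b)
print(alias is b)
```

Unlike the swap-based versions, this leaves a untouched, at the cost of the extra temporary list.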

Python - compare nested lists and append matches to new list?

I wish to compare two nested lists of unequal length. I am interested only in a match between the first element of each sub-list. Should a match exist, I wish to add the match to another list for subsequent transformation into a tab-delimited file. Here is an example of what I am working with:
x = [['1', 'a', 'b'], ['2', 'c', 'd']]
y = [['1', 'z', 'x'], ['4', 'z', 'x']]
match = []
def find_match():
    for i in x:
        for j in y:
            if i[0] == j[0]:
                match.append(j)
    return match
This returns:
[['1', 'x'], ['1', 'y'], ['1', 'x'], ['1', 'y'], ['1', 'z', 'x']]
Would it be good practise to reprocess the list to remove duplicates or can this be done in a simpler fashion?
Also, is it better to use tuples and/or tuples of tuples for the purposes of comparison?
Any help is greatly appreciated.
Regards,
Seafoid.
Use sets to obtain collections with no duplicates.
You'll have to use tuples instead of lists as the items because set items must be hashable.
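As a rough sketch of that idea, the matches can be de-duplicated while they are collected by keeping a set of tuple copies of the rows already seen. The function name unique_matches and the extra row in x are made up for this example.

```python
def unique_matches(x, y):
    """Collect rows of y whose first element matches a row of x, without duplicates."""
    seen = set()    # tuple copies of rows already added (lists are not hashable)
    result = []
    for i in x:
        for j in y:
            if i[0] == j[0]:
                key = tuple(j)
                if key not in seen:
                    seen.add(key)
                    result.append(j)
    return result


x = [['1', 'a', 'b'], ['2', 'c', 'd'], ['1', 'e', 'f']]
y = [['1', 'z', 'x'], ['4', 'z', 'x']]
print(unique_matches(x, y))
```

Here two rows of x start with '1', so a plain nested loop would append the matching row of y twice; the set keeps it to one.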
The code you posted doesn't seem to generate the output you posted. I do not have any idea how you are supposed to generate that output from that input. For example, the output has 'y' and the input does not.
I think the design of your function could be much improved. Currently you define x, y, and match at the module level and read and mutate them explicitly. This is not how you want to design functions: as a general rule, a function shouldn't mutate something at the global level. It should be explicitly passed everything it needs and return a result, not implicitly receive information and change something outside itself.
I would change
x = some list
y = some list
match = []
def find_match():
    for i in x:
        for j in y:
            if i[0] == j[0]:
                match.append(j)
    return match # This is the only line I changed. I think you meant
                 # your return to be over here?
find_match()
to
x = some list
y = some list
def find_match(x, y):
    match = []
    for i in x:
        for j in y:
            if i[0] == j[0]:
                match.append(j)
    return match
match = find_match(x, y)
To take that last change to the next level, I usually replace the pattern
def f(...):
    return_value = []
    for...
        return_value.append(foo)
    return return_value
with the similar generator
def f(...):
    for...
        yield foo
which would make the above function
def find_match(x, y):
    for i in x:
        for j in y:
            if i[0] == j[0]:
                yield j
Another way to express this generator's effect is with the generator expression (j for i in x for j in y if i[0] == j[0]).
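One thing to keep in mind with the generator version is that it can be consumed only once. A small sketch using the example data from the question, with the tab-delimited rows built via str.join (the variable names first_pass, second_pass and lines are made up here):

```python
def find_match(x, y):
    """Lazily yield each row of y whose first element matches a row of x."""
    for i in x:
        for j in y:
            if i[0] == j[0]:
                yield j


x = [['1', 'a', 'b'], ['2', 'c', 'd']]
y = [['1', 'z', 'x'], ['4', 'z', 'x']]

matches = find_match(x, y)
first_pass = list(matches)    # runs the loops and collects the rows
second_pass = list(matches)   # the generator is already exhausted

# Rows joined with tabs, ready to be written to a tab-delimited file.
lines = ['\t'.join(row) for row in first_pass]
print(first_pass, second_pass, lines)
```

If the matches are needed more than once, call list() on the generator once and reuse that list.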
I don't know if I interpret your question correctly, but given your example it seems that you might be using a wrong index:
change
if i[1] == j[1]:
into
if i[0] == j[0]:
You can do this a lot more simply by using sets.
set_x = set([i[0] for i in x])
set_y = set([i[0] for i in y])
matches = list(set_x & set_y)
if i[1] == j[1]
checks whether the second elements of the arrays are identical. You want if i[0] == j[0].
Otherwise, I find your code quite readable and wouldn't necessarily change it.
A simpler expression should work here too, provided the matching rows sit at the same index in x and y (zip pairs them up by position):
list_of_lists = filter(lambda l: l[0][0] == l[1][0], zip(x, y))
list(map(lambda l: l[1], list_of_lists))
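For larger inputs, the nested loop can be avoided by indexing y by its first element first. This is a sketch under the assumption that the first elements are hashable (strings in the question's data); the extra rows in y are added here only to make the example more interesting.

```python
def find_match(x, y):
    """Return rows of y whose first element matches the first element of a row of x."""
    # Index y by first element so each row of x needs one lookup, not a full scan.
    by_key = {}
    for row in y:
        by_key.setdefault(row[0], []).append(row)

    match = []
    for row in x:
        match.extend(by_key.get(row[0], []))
    return match


x = [['1', 'a', 'b'], ['2', 'c', 'd']]
y = [['1', 'z', 'x'], ['4', 'z', 'x'], ['2', 'q', 'r'], ['1', 'm', 'n']]
print(find_match(x, y))
```

The result comes out in the same order as the nested-loop version: rows are grouped by the order of x, and within each group they keep their order from y.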
