Code not performing as expected [For in cycle] - python

Why is this not working? Actual result is [] for any entry.
def non_unique(ints):
    """
    Return a list consisting of only the non-unique elements from the list lst.
    You are given a non-empty list of integers (ints). You should return a
    list consisting of only the non-unique elements in this list. To do so
    you will need to remove all unique elements (elements which are
    contained in a given list only once). When solving this task, do not
    change the order of the list.
    >>> non_unique([1, 2, 3, 1, 3])
    [1, 3, 1, 3]
    >>> non_unique([1, 2, 3, 4, 5])
    []
    >>> non_unique([5, 5, 5, 5, 5])
    [5, 5, 5, 5, 5]
    >>> non_unique([10, 9, 10, 10, 9, 8])
    [10, 9, 10, 10, 9]
    """
    new_list = []
    for x in ints:
        for a in ints:
            if ints.index(x) != ints.index(a):
                if x == a:
                    new_list.append(a)
    return new_list
Working code (not from me):
def non_unique(ints):
    result = []
    for c in ints:
        # keep every element that occurs more than once
        if ints.count(c) > 1:
            result.append(c)
    return result

list.index returns the first index that contains the input parameter, so if x == a is true, then ints.index(x) will always equal ints.index(a). If you want to keep the same code structure, I'd recommend keeping track of the indices within the loop using enumerate, as in:
for x_ind, x in enumerate(ints):
    for a_ind, a in enumerate(ints):
        if x_ind != a_ind:
            if x == a:
                new_list.append(a)
Although, for what it's worth, I think your example of working code is a better way of accomplishing the same task.
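To make the difference concrete, here is a minimal sketch comparing what list.index reports with the positions that enumerate yields. The variable names first_positions and true_positions are only illustrative.

```python
ints = [1, 2, 3, 1, 3]

# index() reports the position of the FIRST match, so both 1s map to index 0
first_positions = [ints.index(x) for x in ints]
print(first_positions)  # [0, 1, 2, 0, 2]

# enumerate() yields the real position of every element
true_positions = [i for i, _ in enumerate(ints)]
print(true_positions)  # [0, 1, 2, 3, 4]
```

Because equal values always share the same first index, the original test ints.index(x) != ints.index(a) can never be true when x == a.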

Although the example of working code is correct, it suffers from quadratic complexity, which makes it slow for larger lists. I'd prefer something like this:
from nltk.probability import FreqDist
def non_unique(ints):
    fd = FreqDist(ints)
    return [x for x in ints if fd[x] > 1]
It precomputes a frequency distribution in the first step, and then selects all non-unique elements. Both steps have O(n) performance characteristics.
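If installing nltk just for this is not desirable, the same two-step idea can be sketched with collections.Counter from the standard library. This is a swapped-in technique, not the code from the answer above.

```python
from collections import Counter

def non_unique(ints):
    counts = Counter(ints)  # step 1: one O(n) pass to count every value
    # step 2: one O(n) pass that keeps values seen more than once, in order
    return [x for x in ints if counts[x] > 1]

print(non_unique([10, 9, 10, 10, 9, 8]))  # [10, 9, 10, 10, 9]
```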

Related

How can I recursively add only unique items from one list to another?

I've been trying to recursively add unique items from one list to another, but I only get the first value. How can I fix this? Here is my code:
def bag_to_set(bag):
    new_set = []
    if len(bag) > 0:
        if bag[0] not in new_set:
            new_set.append(bag[0])
        bag = bag[1:]
        bag = bag_to_set(bag)
    return new_set
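def fun(arr):
    def helper(ind):
        if ind == len(arr):  # stop only after the last element has been visited
            return res.copy()
        # keep an element unless it equals its predecessor (adjacent duplicates only)
        if ind == 0 or arr[ind] != arr[ind-1]:
            res.append(arr[ind])
        return helper(ind+1)
    return helper(0)
res=[]
arr=[1,2,3,3]
fun(arr)
print(res)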
Using recursion:
def bag_to_set(array, unique=None):
    if unique is None:
        unique = set()
    if not array:
        return unique
    val = array.pop()  # note: this consumes the caller's list
    unique.add(val)
    return bag_to_set(array, unique)
array = [1,2,3,4,5,6,12,3,4,5,6]
print(bag_to_set(array))
# output -> {1, 2, 3, 4, 5, 6, 12}
This gives the correct results and is in the spirit of the recursive approach I think you are driving at:
def bag_to_set(bag):
    if not bag:
        return []
    new_set = bag_to_set(bag[:-1])
    if bag[-1] not in new_set:
        new_set.append(bag[-1])
    return new_set
Some example cases:
print(bag_to_set([]))
print(bag_to_set([2]))
print(bag_to_set([2, 2]))
print(bag_to_set([2, 2, 3, 3, 5, 10]))
print(bag_to_set([10, 5, 3, 2, 2, 3]))
which prints:
[]
[2]
[2]
[2, 3, 5, 10]
[10, 5, 3, 2]
Note that if you impose the requirements that (a) the function returns a list, and (b) it operates recursively, then you end up having to check not in against a list, which is O(N), whereas for a set it would be O(1). So the recursive process as a whole will be O(N**2), when non-recursive O(N) approaches exist.
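For comparison, one of the non-recursive O(N) approaches mentioned above might look like the sketch below. The name unique_in_order is made up for illustration; a set handles the membership checks while a list keeps the order.

```python
def unique_in_order(bag):
    seen = set()    # O(1) average-case membership checks
    result = []     # preserves first-occurrence order
    for item in bag:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result

print(unique_in_order([10, 5, 3, 2, 2, 3]))  # [10, 5, 3, 2]
```

This assumes the items are hashable, which holds for the integers used in the examples.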

How to calculate a cumulative product of a list using list comprehension

I'm trying my hand at converting the following loop to a comprehension.
The problem: given input_list = [1, 2, 3, 4, 5],
return a list in which each element is the product of all elements up to and including that index, going from left to right.
Hence the returned list would be [1, 2, 6, 24, 120].
The normal loop I have (and it's working):
l2r = list()
for i in range(lst_len):
    if i == 0:
        l2r.append(lst_num[i])
    else:
        l2r.append(lst_num[i] * l2r[i-1])
Python 3.8+ solution, using an assignment expression (:=):
lst = [1, 2, 3, 4, 5]
curr = 1
out = [(curr:=curr*v) for v in lst]
print(out)
Prints:
[1, 2, 6, 24, 120]
Other solution (with itertools.accumulate):
from itertools import accumulate
out = [*accumulate(lst, lambda a, b: a*b)]
print(out)
Well, you could do it like this(a):
import math
orig = [1, 2, 3, 4, 5]
print([math.prod(orig[:pos]) for pos in range(1, len(orig) + 1)])
This generates what you wanted:
[1, 2, 6, 24, 120]
and basically works by running a counter from 1 to the size of the list, at each point working out the product of all terms before that position:
pos  values     prod
===  =========  ====
1    1          1
2    1,2        2
3    1,2,3      6
4    1,2,3,4    24
5    1,2,3,4,5  120
(a) Just keep in mind that's less efficient at runtime since it calculates the full product for every single element (rather than caching the most recently obtained product). You can avoid that while still making your code more compact (often the reason for using list comprehensions), with something like:
def listToListOfProds(orig):
    curr = 1
    newList = []
    for item in orig:
        curr *= item
        newList.append(curr)
    return newList
print(listToListOfProds([1, 2, 3, 4, 5]))
That's obviously not a list comprehension but still has the advantages in that it doesn't clutter up your code where you need to calculate it.
People seem to often discount the function solution in Python, simply because the language is so expressive and allows things like list comprehensions to do a lot of work in minimal source code.
But, other than the function itself, this solution has the same advantages of a one-line list comprehension in that it, well, takes up one line :-)
In addition, you're free to change the function whenever you want (if you find a better way in a later Python version, for example), without having to change all the different places in the code that call it.
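As an illustration of that last point, the body of the function could later be swapped for itertools.accumulate while callers stay untouched. This is only a sketch of one possible replacement, keeping the listToListOfProds name from above.

```python
from itertools import accumulate
from operator import mul

def listToListOfProds(orig):
    # Same contract as the loop version: a running product of the input
    return list(accumulate(orig, mul))

# The call site is identical to the one used with the loop-based version
print(listToListOfProds([1, 2, 3, 4, 5]))  # [1, 2, 6, 24, 120]
```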
This should not be made into a list comprehension if one iteration depends on the state of an earlier one!
If the goal is a one-liner, then there are lots of solutions, with @AndrejKesely's itertools.accumulate() being an excellent one (+1). Here's mine that abuses functools.reduce():
from functools import reduce
lst = [1, 2, 3, 4, 5]
print(reduce(lambda x, y: x + [x[-1] * y], lst, [lst.pop(0)]))
But as far as list comprehensions go, @AndrejKesely's assignment-expression-based solution is the wrong thing to do (-1). Here's a more self-contained comprehension that doesn't leak into the surrounding scope:
lst = [1, 2, 3, 4, 5]
seq = [a.append(a[-1] * b) or a.pop(0) for a in [[lst.pop(0)]] for b in [*lst, 1]]
print(seq)
But it's still the wrong thing to do! This is based on a similar problem that also got upvoted for the wrong reasons.
A recursive function could help.
input_list = [ 1, 2, 3, 4, 5]
def cumprod(ls, i=None):
    i = len(ls)-1 if i is None else i
    if i == 0:
        return ls[0]  # base case: the first element itself (not 1), so lists that don't start with 1 also work
    return ls[i] * cumprod(ls, i-1)
output_list = [cumprod(input_list, i) for i in range(len(input_list))]
output_list has value [1, 2, 6, 24, 120]
This method can be compressed in python3.8 using the walrus operator
input_list = [ 1, 2, 3, 4, 5]
def cumprod_inline(ls, i=None):
    return ls[0] if (i := len(ls)-1 if i is None else i) == 0 else ls[i] * cumprod_inline(ls, i-1)
output_list = [cumprod_inline(input_list, i) for i in range(len(input_list))]
output_list has value [1, 2, 6, 24, 120]
Because you plan to use this in list comprehension, there's no need to provide a default for the i argument. This removes the need to check if i is None.
input_list = [ 1, 2, 3, 4, 5]
def cumprod_inline_nodefault(ls, i):
    return ls[0] if i == 0 else ls[i] * cumprod_inline_nodefault(ls, i-1)
output_list = [cumprod_inline_nodefault(input_list, i) for i in range(len(input_list))]
output_list has value [1, 2, 6, 24, 120]
Finally, if you really wanted to keep it to a single, self-contained list comprehension line, you can use recursive lambda calls:
input_list = [ 1, 2, 3, 4, 5]
output_list = [(lambda func, x, y: func(func,x,y))(lambda func, ls, i: ls[0] if i == 0 else ls[i] * func(func, ls, i-1),input_list,i) for i in range(len(input_list))]
output_list has value [1, 2, 6, 24, 120]
It's entirely over-engineered and barely legible, but hey, it works and it's just for fun.
For your list, it might not be intentional that the numbers are consecutive, starting from 1. But for cases where that pattern is intentional, you can use the standard library function math.factorial():
from math import factorial
input_list = [1, 2, 3, 4, 5]
l2r = [factorial(i) for i in input_list]
print(l2r)
Output:
[1, 2, 6, 24, 120]
The numpy package has fast, vectorized implementations of many common array operations built in. To obtain, for example, a cumulative product:
>>> import numpy as np
>>> np.cumprod([1, 2, 3, 4, 5])
array([ 1, 2, 6, 24, 120])
The above returns a numpy array. If you are not familiar with numpy, you may prefer to obtain just a normal Python list of plain ints, which tolist() provides:
>>> np.cumprod([1, 2, 3, 4, 5]).tolist()
[1, 2, 6, 24, 120]
Using itertools and operator:
from itertools import accumulate
import operator as op
ip_lst = [1,2,3,4,5]
print(list(accumulate(ip_lst, func=op.mul)))

How to remove first occurrence of a specific item from a list of items without using .pop() or .remove()

I have a list, let us call it l = [1,2,3,7,8,9,10,7]. Given this list l, I am trying to remove the first occurrence of the number 7 without using the .pop() or .remove() built-in functions.
I have tried
def remove_item(l, item_to_remove):
    newlst = []
    for item in l:
        if item != item_to_remove:
            newlst.append(item)
    return newlst
However, this removes all instances of the item I am trying to remove when in fact I only want to remove the very first instance of that said specific item. Does anyone have some tips on how to accomplish this??
You only need to take care that the removing part of your code doesn't run twice.
lst = [1,2,3,7,8,9,10,7]
print(lst) # [1, 2, 3, 7, 8, 9, 10, 7]
for i in range(len(lst)):
    if lst[i] == 7:
        del lst[i]
        break
print(lst) # [1, 2, 3, 8, 9, 10, 7]
It does exactly the same as the following:
lst = [1,2,3,7,8,9,10,7]
print(lst) # [1, 2, 3, 7, 8, 9, 10, 7]
for i in range(len(lst)):
    if lst[i] == 7:
        lst.pop(i)
        break
print(lst) # [1, 2, 3, 8, 9, 10, 7]
as well as this
lst = [1,2,3,7,8,9,10,7]
print(lst) # [1, 2, 3, 7, 8, 9, 10, 7]
for i in range(len(lst)):
    if lst[i] == 7:
        lst.remove(lst[i])
        break
print(lst) # [1, 2, 3, 8, 9, 10, 7]
Overview of the used methods:
del list[i] - remove the item at index i. The del statement can also be used to remove slices from a list.
list.pop - remove and return item at index (default last). Raises IndexError if list is empty or index is out of range.
list.remove - remove first occurrence of value. Raises ValueError if the value is not present.
You just need to add a little logic to it. I add a looking variable, which signifies that we haven't yet found the entry we're looking for. Here's the code:
def remove_item(l, item_to_remove):
    newlst = []
    looking = True
    for item in l:
        if item != item_to_remove or not looking:
            newlst.append(item)
        else:
            looking = False
    return newlst
lst = [1,3,4,5,6,7,3,10]
print(remove_item(lst, 3))
which returns [1, 4, 5, 6, 7, 3, 10]
Very wasteful, but here you go, a solution:
def remove_first(sequence, element):
    return sequence[:sequence.index(element)] + sequence[sequence.index(element)+1:]
Then you can:
>>> remove_first(["a", "b", "a", "c"], "a")
['b', 'a', 'c']
index returns the index of the first found occurrence of an element.
The rest is sequence slicing and concatenation.
Of course, you could generalize this to remove(sequence, element, n) to remove the n-th found element.
EDIT: I just stated falsely that index also supports that. Statement removed.
Or you could choose to mutate the input, but for one, returning the output is cleaner, and you could not have a general "sequence" argument, as not all sequences are mutable. See the tuple type.
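The generalization to the n-th occurrence mentioned above could be sketched roughly as follows. The name remove_nth and its 1-based n parameter are assumptions made for illustration. While index has no "n-th match" option, it does accept a start position, which lets each search resume after the previous hit.

```python
def remove_nth(sequence, element, n=1):
    """Return a copy of `sequence` without the n-th (1-based) occurrence of `element`."""
    if n < 1:
        raise ValueError("n must be >= 1")
    pos = -1
    for _ in range(n):
        # resume the search just past the previous match;
        # index() raises ValueError if there are fewer than n matches
        pos = sequence.index(element, pos + 1)
    return sequence[:pos] + sequence[pos + 1:]

print(remove_nth(["a", "b", "a", "c"], "a", 2))  # ['a', 'b', 'c']
print(remove_nth((1, 2, 1), 1))                  # (2, 1)
```

Since it only uses index, slicing, and concatenation, the sketch works for tuples as well as lists and leaves the input unchanged.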
.index(x) returns the first index location of x within the list, so just delete it. If x is not found, it raises ValueError.
my_list = [1, 2, 3, 7, 8, 9, 10, 7]
val = 7
if val in my_list:
    del my_list[my_list.index(val)]
>>> my_list
[1, 2, 3, 8, 9, 10, 7]
Signature: my_list.index(value, start=0, stop=9223372036854775807, /)
Docstring:
Return first index of value.
Raises ValueError if the value is not present.
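Since index raises ValueError for a missing value, the membership test and the lookup can also be folded into a single scan with try/except. The helper name delete_first below is hypothetical, and this is just one way to arrange it.

```python
def delete_first(items, value):
    """Delete the first occurrence of `value` from `items` in place.

    Returns True if something was deleted, False if `value` was absent.
    """
    try:
        position = items.index(value)  # single scan; raises ValueError if absent
    except ValueError:
        return False
    del items[position]
    return True

my_list = [1, 2, 3, 7, 8, 9, 10, 7]
print(delete_first(my_list, 7), my_list)   # True [1, 2, 3, 8, 9, 10, 7]
print(delete_first(my_list, 42), my_list)  # False [1, 2, 3, 8, 9, 10, 7]
```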
Welcome to StackOverflow!
A minor modification to your code is enough.
I would prefer remove, but here is your modified code to do the required job:
def remove_item(l, item_to_remove):
    newlst = []
    for item in l:
        if item != item_to_remove:
            newlst.append(item)
        else:
            return newlst + l[len(newlst) + 1 :]
    return newlst
In Python, you can add lists together. Using slicing, you select the sub-list that follows the removed item (l[len(newlst) + 1 :]).
Testing
>>> lst = [1,3,4,5,6,7,3,10]
>>> print(remove_item(lst, 3))
[1, 4, 5, 6, 7, 3, 10]
lst = [1,2,3,7,8,9,10,7]
new_lst = lst[:lst.index(7)] + lst[lst.index(7) + 1:]
new_lst
[1, 2, 3, 8, 9, 10, 7]
Similar idea to CEWeinhauer's solution, but one which takes advantage of Python features to minimize overhead once we've found the item to remove:
def remove_item(l, item_to_remove):
    newlst = []
    liter = iter(l)  # Make single pass iterator, producing each item once
    for item in liter:
        if item == item_to_remove:  # Found single item to remove, we're done
            break
        newlst.append(item)  # Not found yet
    newlst += liter  # Quickly consume all elements after removed item without tests
    return newlst
The above works with any input iterable in a single pass, so it's better if the input might not be a list and/or might be huge. But it's admittedly more complex code. The much simpler solution is to just find the element with index and remove it. It might be slightly slower in some cases, since it's two O(n) steps instead of just one, but it uses C built-ins more, so it's likely to be faster in practice:
def remove_item(l, item_to_remove):
    newlst = list(l)
    del newlst[newlst.index(item_to_remove)]
    return newlst

Unable to create duplicate list from existing list using list comprehension with an if condition

I have a sorted list with duplicate elements like
>>> randList = [1, 2, 2, 3, 4, 4, 5]
>>> randList
[1, 2, 2, 3, 4, 4, 5]
I need to create a list that removes the adjacent duplicate elements. I can do it like:
>>> dupList = []
>>> for num in randList:
...     if num not in dupList:
...         dupList.append(num)
But I want to do it with list comprehension. I tried the following code:
>>> newList = []
>>> newList = [num for num in randList if num not in newList]
But I get the result like the if condition isn't working.
>>> newList
[1, 2, 2, 3, 4, 4, 5]
Any help would be appreciated.
Thanks!!
Edit 1: The wording of the question does seem to be confusing given the data I have provided. The for loop that I am using will remove all duplicates, but since I am sorting the list beforehand, that shouldn't be a problem when removing adjacent duplicates.
Using itertools.groupby is the simplest approach to remove adjacent (and only adjacent) duplicates, even for unsorted input:
>>> from itertools import groupby
>>> [k for k, _ in groupby(randList)]
[1, 2, 3, 4, 5]
Removing all duplicates while maintaining the order of occurrence can be efficiently achieved with an OrderedDict. This also works for both ordered and unordered input:
>>> from collections import OrderedDict
>>> list(OrderedDict.fromkeys(randList))
[1, 2, 3, 4, 5]
I need to create a list that removes the adjacent duplicate elements
Note that your for loop based solution will remove ALL duplicates, not only adjacent ones. Test it with this:
rand_list = [1, 2, 2, 3, 4, 4, 2, 5, 1]
according to your spec the result should be:
[1, 2, 3, 4, 2, 5, 1]
but you'll get
[1, 2, 3, 4, 5]
instead.
A working solution to only remove adjacent duplicates is to use a generator:
def dedup_adjacent(seq):
    prev = seq[0]
    yield prev
    for current in seq[1:]:
        if current == prev:
            continue
        yield current
        prev = current
rand_list = [1, 2, 2, 3, 4, 4, 2, 5, 1]
list(dedup_adjacent(rand_list))
=> [1, 2, 3, 4, 2, 5, 1]
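Since the question asked for a list comprehension, the same adjacent-only rule can also be sketched as one by comparing each element with its predecessor in the input list rather than with the list being built. The variable name deduped is only illustrative.

```python
rand_list = [1, 2, 2, 3, 4, 4, 2, 5, 1]

# keep the first element, then every element that differs from the one before it
deduped = [x for i, x in enumerate(rand_list) if i == 0 or x != rand_list[i - 1]]
print(deduped)  # [1, 2, 3, 4, 2, 5, 1]
```

This works because the comprehension only reads rand_list, which already exists, and never refers to its own unfinished result.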
Python first evaluates the list comprehension and then assigns it to newList, so you cannot refer to it during execution of the list comprehension.
You can remove duplicates in two ways:
1. Using a for loop
rand_list = [1,2,2,3,3,4,5]
new_list=[]
for i in rand_list:
    if i not in new_list:
        new_list.append(i)
2. Converting the list to a set, converting the set back to a list, and finally sorting the new list.
A set stores its values in no particular order, so after converting the set back into a list you need to sort that list to get the items in ascending order:
rand_list = [1,2,2,3,3,4,5]
sets = set(rand_list)
new_list = list(sets)
new_list.sort()
Update: Comparison of different Approaches
There have been three ways of achieving the goal of removing adjacent duplicate elements in a sorted list, i.e. removing all duplicates:
using groupby (only adjacent elements, requires initial sorting)
using OrderedDict (all duplicates removed)
using sorted(list(set(_))) (all duplicates removed, ordering restored by sorting).
I compared the running times of the different solutions using:
from timeit import timeit
print('groupby:', timeit('from itertools import groupby; l = [x // 5 for x in range(1000)]; [k for k, _ in groupby(l)]'))
print('OrderedDict:', timeit('from collections import OrderedDict; l = [x // 5 for x in range(1000)]; list(OrderedDict.fromkeys(l))'))
print('Set:', timeit('l = [x // 5 for x in range(1000)]; sorted(list(set(l)))'))
> groupby: 78.83623623599942
> OrderedDict: 94.54144410200024
> Set: 65.60372123999969
Note that the set approach is the fastest among all alternatives.
Old Answer
Python first evaluates the list comprehension and then assigns it to newList, so you cannot refer to it during execution of the list comprehension. To illustrate, consider the following code:
randList = [1, 2, 2, 3, 4, 4, 5]
newList = []
newList = [num for num in randList if print(newList)]
> []
> []
> []
> …
This becomes even more evident if you try:
# Do not initialize newList2
newList2 = [num for num in randList if print(newList2)]
> NameError: name 'newList2' is not defined
You can remove duplicates by turning randList into a set:
sorted(list(set(randList)))
> [1, 2, 3, 4, 5]
Be aware that this does remove all duplicates (not just adjacent ones) and ordering is not preserved. The former also holds true for your proposed solution with the loop.
edit: added a sorted clause as to specification of required ordering.
In the line newList = [num for num in randList if num not in newList], the list on the right-hand side is built first and only then assigned to newList. That's why the check if num not in newList returns True every time: newList remains empty until the assignment.
You can try this:
randList = [1, 2, 2, 3, 4, 4, 5]
new_list=[]
for i in randList:
    if i not in new_list:
        new_list.append(i)
print(new_list)
You cannot access the items in a list comprehension as you go along. The items in a list comprehension are only accessible once the comprehension is completed.
For large lists, checking for membership in a list will be expensive, albeit with minimal memory requirements. Instead, you can track the values already seen in a set:
randList = [1, 2, 2, 3, 4, 4, 5]
def gen_values(L):
    seen = set()
    for i in L:
        if i not in seen:
            seen.add(i)
            yield i
print(list(gen_values(randList)))
[1, 2, 3, 4, 5]
This algorithm has been implemented in the 3rd party toolz library. It's also known as the unique_everseen recipe in the itertools docs:
from toolz import unique
res = list(unique(randList))
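If a comprehension is still wanted, a commonly seen workaround keeps the "seen" state in a separate set that exists before the comprehension starts. The sketch below relies on set.add returning None, so it is a side-effect trick rather than a recommended style.

```python
randList = [1, 2, 2, 3, 4, 4, 5]

seen = set()
# `num in seen` is checked first; for a new value, seen.add(num) runs and
# returns None, so the whole condition is truthy only for first occurrences
newList = [num for num in randList if not (num in seen or seen.add(num))]
print(newList)  # [1, 2, 3, 4, 5]
```

Unlike the attempt in the question, the condition refers to seen, which is updated while the comprehension runs, not to the list that is still being built.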
Since your list is sorted, using set will be the fastest way to achieve your goal. Keep in mind that a set does not guarantee any ordering, so wrap the result in sorted() if the order matters:
>>> randList = [1, 2, 2, 3, 4, 4, 5]
>>> randList
[1, 2, 2, 3, 4, 4, 5]
>>> remove_dup_list = list(set(randList))
>>> remove_dup_list
[1, 2, 3, 4, 5]
>>>

concatenate an arbitrary number of lists in a function in Python

I hope to write the join_lists function to take an arbitrary number of lists and concatenate them. For example, if the inputs are
m = [1, 2, 3]
n = [4, 5, 6]
o = [7, 8, 9]
then when I call print(join_lists(m, n, o)), it will print [1, 2, 3, 4, 5, 6, 7, 8, 9]. I realize I should use *args as the argument in join_lists, but I'm not sure how to concatenate an arbitrary number of lists. Thanks.
Although you can use something which invokes __add__ sequentially, that is very much the wrong thing (for starters you end up creating as many new lists as there are lists in your input, which ends up having quadratic complexity).
The standard tool is itertools.chain:
import itertools
def concatenate(*lists):
    return itertools.chain(*lists)
or
def concatenate(*lists):
    return itertools.chain.from_iterable(lists)
This will return an iterator which yields each element of the lists in sequence. If you need it as a list, use list: list(itertools.chain.from_iterable(lists))
If you insist on doing this "by hand", then use extend:
def concatenate(*lists):
    newlist = []
    for l in lists:
        newlist.extend(l)
    return newlist
Actually, don't use extend like that - it's still inefficient, because it has to keep extending the original list. The "right" way (it's still really the wrong way):
def concatenate(*lists):
    lengths = [len(l) for l in lists]  # a list, since it is used twice (sum and zip)
    newlen = sum(lengths)
    newlist = [None]*newlen
    start = 0
    end = 0
    for l,n in zip(lists,lengths):
        end+=n
        newlist[start:end] = l  # copy this input list into its slot
        start+=n
    return newlist
http://ideone.com/Mi3UyL
You'll note that this still ends up doing as many copy operations as there are total slots in the lists. So, this isn't any better than using list(chain.from_iterable(lists)), and is probably worse, because list can make use of optimisations at the C level.
Finally, here's a version using extend (suboptimal) in one line, using reduce:
from functools import reduce  # reduce is not a builtin in Python 3
concatenate = lambda *lists: reduce((lambda a,b: a.extend(b) or a),lists,[])
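Applied to the join_lists function from the question, the chain-based recommendation might be sketched like this, using the m, n, and o lists from the question as sample input:

```python
from itertools import chain

def join_lists(*lists):
    # chain.from_iterable walks each input list in turn; list() materializes the result
    return list(chain.from_iterable(lists))

m = [1, 2, 3]
n = [4, 5, 6]
o = [7, 8, 9]
print(join_lists(m, n, o))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(join_lists())         # []
```

The inputs are left unmodified, and calling it with no arguments simply yields an empty list.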
One way would be this (using reduce) because I currently feel functional:
import operator
from functools import reduce
def concatenate(*lists):
    return reduce(operator.add, lists)
However, a better functional method is given in Marcin's answer:
from itertools import chain
def concatenate(*lists):
    return chain(*lists)
although you might as well use itertools.chain(*iterable_of_lists) directly.
A procedural way:
def concatenate(*lists):
    new_list = []
    for i in lists:
        new_list.extend(i)
    return new_list
A golfed version: j=lambda*x:sum(x,[]) (do not actually use this).
You can use sum() with an empty list as the start argument:
def join_lists(*lists):
    return sum(lists, [])
For example:
>>> join_lists([1, 2, 3], [4, 5, 6])
[1, 2, 3, 4, 5, 6]
Another way:
>>> m = [1, 2, 3]
>>> n = [4, 5, 6]
>>> o = [7, 8, 9]
>>> p = []
>>> for (i, j, k) in (m, n, o):
... p.append(i)
... p.append(j)
... p.append(k)
...
>>> p
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>>
This seems to work just fine:
def join_lists(*args):
    output = []
    for lst in args:
        output += lst
    return output
It returns a new list with all the items of the previous lists. Is using + not appropriate for this kind of list processing?
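On the question of whether + is appropriate here: for lists, the augmented form += extends the existing list in place (much like extend), whereas a = a + b builds a brand-new list each time. A small sketch of the difference, with illustrative variable names:

```python
a = [1, 2]
alias = a          # second name for the same list object

a += [3]           # in place: the object that alias refers to grows
print(alias)       # [1, 2, 3]

a = a + [4]        # builds a new list and rebinds a; alias is untouched
print(alias)       # [1, 2, 3]
print(a)           # [1, 2, 3, 4]
```

So the loop with output += lst copies each element once, while repeatedly writing output = output + lst would recopy the accumulated result on every iteration.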
Or you could be logical instead: copy the items of the first list passed to the join_lists function (not the list itself) into a new list, to which you'll then be able to add the elements of the other lists:
m = [1, 2, 3]
n = [4, 5, 6]
o = [7, 8, 9]
def join_lists(*x):
    new_list = list(x[0])  # copy the items, so the caller's first list is not modified
    for item in x[1:]:     # add the elements of the remaining lists
        new_list += item
    return new_list
then
print(join_lists(m, n, o))
would output:
[1, 2, 3, 4, 5, 6, 7, 8, 9]
