pythonic way of removing similar items from list

I have a list of items from which I want to remove all consecutive equal values except the first and the last one of each run. For example:
listIn = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1]
First three elements "1, 1, 1" are similar, so remove the middle "1".
Next two zeros are unmodified.
One is just one. Leave unmodified.
Four zeros. Remove items in-between the first and the last.
Resulting in:
listOut = [1, 1, 0, 0, 1, 0, 0, 1]
The way of doing this in C++ is very obvious, but it looks very different from the Python coding style. Or is it the only way?
Basically, I am just removing excess points on a graph where the "y" value does not change.

Use itertools.groupby() to group your values:
from itertools import groupby
listOut = []
for value, group in groupby(listIn):
    listOut.append(next(group))
    for i in group:
        listOut.append(i)
        break
or, for added efficiency, as a generator:
from itertools import groupby
def reduced(it):
    for value, group in groupby(it):
        yield next(group)
        for i in group:
            yield i
            break
Demo:
>>> listIn = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1]
>>> list(reduced(listIn))
[1, 1, 0, 0, 1, 0, 0, 1]
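For plain numbers, the first two items of a run are interchangeable with the first and the last. That stops being true when the items only share a key, for example (x, y) points on a graph with the same y. Below is a minimal sketch of that variant, assuming the points are (x, y) tuples. The name `first_and_last` and its `key` parameter are illustrative and not part of the answer above.

```python
from itertools import groupby
from operator import itemgetter

_MISSING = object()  # sentinel: marks "this run had no second item"


def first_and_last(items, key=None):
    """Yield the first and last item of every run of consecutive equal keys."""
    for _, group in groupby(items, key=key):
        yield next(group)  # first item of the run
        last = _MISSING
        for last in group:  # exhaust the run, remembering its final item
            pass
        if last is not _MISSING:  # runs of length one are emitted only once
            yield last


points = [(0, 1), (1, 1), (2, 1), (3, 0), (4, 0), (5, 1)]
print(list(first_and_last(points, key=itemgetter(1))))
# [(0, 1), (2, 1), (3, 0), (4, 0), (5, 1)]
```

With no key, `first_and_last` gives the same result as `reduced` on the list from the question.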

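One-liner (on Python 3, first run from functools import reduce). It starts from the first two items and folds in the rest; feeding the whole list in again would re-append items whenever the first two differ:
listOut = reduce(lambda x, y: x if x[-1] == y and x[-2] == y else x + [y], listIn[2:], listIn[:2])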

This provides a numpythonic solution to the problem; it should be a lot faster for large arrays than one based on itertools. Arguably, if you are doing signal processing of any kind, there is plenty of reason to be using numpy.
import numpy as np
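a = np.array([1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1], int)  # np.int was removed from recent NumPy
change = a[:-1] != a[1:]  # True where a value differs from its successor
I = np.zeros_like(a, bool)
I[:-1] = change  # last item of each run
I[1:] |= change  # first item of the following run
I[0] = I[-1] = True  # the array's endpoints always start or end a run
print(a[I])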

Related

How to check that there is no `1` touching a border in a 2D-list of `0` and `1`

I need to decide whether a swimming pool is "legitimate", meaning that no 1 touches the border. For the given list, the function should return "illegitimate". However, my code returns "legitimate", even though I haven't done anything to the data.
This is the code I tried. I expected it to return "illegitimate" for the unmodified list.
pool = [[0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [1, 1, 1, 0, 0],
        [0, 1, 0, 0, 0],
        [0, 1, 0, 0, 0]]
def is_legitimate_pool(pool):
    for r in range(len(pool)):
        for l in range(len(pool[r])):
            if pool[r][0] == 1 or pool[4][l] == 1:
                return str("illegitimate")
            elif pool[r][0] == 0 or pool[4][l] == 0:
                return str("legitimate")
print(is_legitimate_pool(pool))

Solution
You could start off by checking if any element in the first and last sub-list is non-zero. Any non-zero integer i when passed to bool(i) will evaluate to True and only zero is "falsy" (see Truth Value Testing). This allows us to simply use the built-in any function for checking those two lists. If it returns True, at least one element is not zero.
Then we just iterate through the other sub-lists and check if their first or last element is falsy (i.e. zero). If at least one is not, we can immediately return. If we get to the end of the loop, that means the "pool is legitimate".
Code
LEGIT = "legitimate"
NOT_LEGIT = "illegitimate"
def is_legitimate_pool(pool: list[list[int]]) -> str:
    if any(pool[0]) or any(pool[-1]):
        return NOT_LEGIT
    for row in pool[1:-1]:
        if row[0] or row[-1]:
            return NOT_LEGIT
    return LEGIT
Test
test_pool1 = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
]
test_pool2 = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
]
test_pool3 = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
print(is_legitimate_pool(test_pool1))  # illegitimate
print(is_legitimate_pool(test_pool2))  # illegitimate
print(is_legitimate_pool(test_pool3))  # legitimate
Caveat
The assumption is of course, that we are only interested in the "borders of the pool" being 0 and that an element can only ever be 0 or 1. If you actually need to explicitly check for border elements being 1s, we'd have to be a little more strict:
def is_legitimate_pool(pool: list[list[int]]) -> str:
    if any(element == 1 for element in pool[0] + pool[-1]):
        return NOT_LEGIT
    for row in pool[1:-1]:
        if row[0] == 1 or row[-1] == 1:
            return NOT_LEGIT
    return LEGIT
Errors in your code
There are a number of problems with your original function. One of them is that you must not return, before you have checked each sub-list. You need to check each of them, but you have a statement returning "legitimate", if your elif-condition holds, which would interrupt the loop as soon as just one row satisfies that condition.
The second problem is that you have your indices all messed up. The expression if pool[r][0] == 1 or pool[4][l] == 1 is the equivalent of saying "if the zero-th element in row r or the l-th element in row 4 equals 1". So the second part is only ever checking row 4. You should check row r in both cases, but for the 0-th and 4-th element in that row being 1, so something like this: if pool[r][0] == 1 or pool[r][4] == 1.
Finally, you are not accounting for the fact that the very first row and the very last row must not contain any 1 at all. You'd have to check that at some point (preferably before starting to loop).
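Applying only these three corrections to the original function, and keeping its index-based style, might look like the sketch below. The name `is_legitimate_pool_fixed` is made up for illustration, and the hard-coded index 4 is kept from the question, so this version still assumes a 5x5 list of lists.

```python
def is_legitimate_pool_fixed(pool):
    # Fix 3: the first and the last row must not contain any 1 at all.
    if 1 in pool[0] or 1 in pool[4]:
        return "illegitimate"
    for r in range(len(pool)):
        # Fix 2: look at the first and last element of row r itself.
        if pool[r][0] == 1 or pool[r][4] == 1:
            return "illegitimate"
    # Fix 1: report "legitimate" only after every row has been checked.
    return "legitimate"


pool = [[0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [1, 1, 1, 0, 0],
        [0, 1, 0, 0, 0],
        [0, 1, 0, 0, 0]]
print(is_legitimate_pool_fixed(pool))  # illegitimate
```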
Optimizations
Fixing those problems would make the function work correctly, but it would still be sub-optimal because you would only be able to work on 5x5-lists of lists since you hard-coded the index 4 in a row to mean the "last" element. If you instead use index -1 it will refer to the last element no matter how long the list is.
For the sake of readability, you should avoid "index-juggling" as much as possible and instead leverage the fact that lists are iterable and can therefore be used in for-loops that yield each element one after the other. That way we can explicitly name and work on each sub-list/row in pool, making the code much clearer to the reader/yourself.
str("legitimate") is a no-op on the string literal "legitimate". You don't need the str function.
You should avoid shadowing global names in a local namespace. That means, if you have a global variable named pool, you should not also have a locally scoped variable pool in your function. Change one or the other so they have distinct names.

Numpy: How to find the most frequent nonzero values in array?

Suppose I have a numpy array of shape (1,4,5),
arr = np.array([[[ 0, 0, 0, 3, 0],
[ 0, 0, 2, 3, 2],
[ 0, 0, 0, 0, 0],
[ 2, 1, 0, 0, 0]]])
And I would like to find the most frequent non-zero value in the array across a specific axis, and only returns zero if there are no other non-zero values.
Let's say I'm looking at axis=2, I would like to get something like [[3,2,0,2]] from this array (For the last row either 1 or 2 would be fine). Is there a good way to implement this?
I've tried the solution in the following question (Link), but I am unsure how to modify it so that it excludes a specific value. Thanks again!

We can use numpy.apply_along_axis and a simple function to solve this. Here, we make use of numpy.bincount to count the occurrences of numeric values and then numpy.argmax to get the highest occurrence. If there are no other values than exclude, we return it.
Code:
import numpy as np
def get_freq(array, exclude):
    count = np.bincount(array[array != exclude])
    if count.size == 0:
        return exclude
    else:
        return np.argmax(count)
np.apply_along_axis(lambda x: get_freq(x, 0), axis=2, arr=arr)
Output:
array([[3, 2, 0, 1]])
Please note, that it will also return exclude if you pass an empty array.
EDIT:
As Ehsan noted, above solution will not work for negative values in the given array. For this case, use Counter from collections:
arr = np.array([[[ 0, -3, 0, 3, 0],
                 [ 0, 0, 2, 3, 2],
                 [ 0, 0, 0, 0, 0],
                 [ 2, -5, 0, -5, 0]]])
from collections import Counter
def get_freq(array, exclude):
    count = Counter(array[array != exclude]).most_common(1)
    if not count:
        return exclude
    else:
        return count[0][0]
Output:
array([[-3, 2, 0, -5]])
most_common(1) returns the most frequent value in the Counter object as a one-element list holding a tuple, in which the first element is the value and the second is its number of occurrences. Because the tuple is wrapped in a list, double indexing is needed. If the list is empty, most_common has not found any occurrences (the input was either empty or contained only exclude).
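The return shape of most_common(1) is easiest to see on a small input. The sketch below uses a plain Python list instead of a NumPy row, and the sample values are made up for illustration.

```python
from collections import Counter

values = [-3, 3, 3, -5]  # hypothetical sample data
top = Counter(values).most_common(1)
print(top)        # [(3, 2)]  -> a list holding one (value, count) tuple
print(top[0][0])  # 3         -> the value itself

empty = Counter([]).most_common(1)
print(empty)      # []        -> nothing to count, so the list is empty
```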

This is an alternate solution (maybe not as efficient as the above one, but a unique one) -
#Gets the positions for the highest frequency numbers in axis=2
count_max_pos = np.argmax(np.sum(np.eye(5)[arr][:,:,:,1:], axis=2), axis=2)[0]+1
#gets the max values in based on the indices
k = enumerate(count_max_pos)
result = [arr[0][i] for i in k]
print(result)
[3,2,0,1]

cannot index igraph vs with output from numpy

I want to index the vertices of a graph with a list generated by numpy. It generates only [], although the same list specified explicitly works as expected. Here is an example:
import igraph as ig
import numpy as np
# Create a graph and give the vertices some numerical values, just 0 to start.
t = ig.Graph()
t.add_vertices(10)
t.vs['inf'] = [0]*10
print('the initial list: ', t.vs['inf'])
# Now choose a few indices and change the values to 1.
choose = [0, 1, 2, 3, 4]
print('the chosen indices: ', choose)
t.vs[choose]['inf'] = 1
print('indices selected explicitly: ', t.vs['inf'])
# That works as expected.
# Create the same list using numpy.
nums = np.arange(0,10)
npchoose = list(np.where(nums < 5)[0])
print('indices selected via numpy: ', npchoose)
t.vs[npchoose]['inf'] = 2
print('this has no effect: ', t.vs['inf'])
# The list appears identical but it does not index the .vs
# The index is actually empty.
print('just the chosen list, explicitly: ', t.vs[choose]['inf'])
print('just the chosen list, by numpy: ', t.vs[npchoose]['inf'])
The output is:
the initial list: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
the chosen indices: [0, 1, 2, 3, 4]
indices selected explicitly: [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
indices selected via numpy: [0, 1, 2, 3, 4]
this has no effect: [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
just the chosen list, explicitly: [1, 1, 1, 1, 1]
just the chosen list, by numpy: []
I'm at a loss because I want to select vertices for modification based on several conditions in other arrays, where numpy.where() is the most convenient tool.

Two solutions I came up with, in case someone else has the same problem:
1. Use npchoose = np.where(nums < 5)[0].tolist()
rather than list(np.where(nums < 5)[0]). tolist() converts the elements to plain Python ints, whereas list() leaves them as NumPy integer scalars.
2. Make better use of the igraph vertex sequencing, rather than direct indexing: t.vs.select(lambda ve: ve.index in npchoose)['inf'] = 2. npchoose works fine as a numpy array in the lambda.

Move zeroes to end of list

I am working on moving all zeroes to the end of a list. Is this approach bad or computationally expensive?
a = [1, 2, 0, 0, 0, 3, 6]
temp = []
zeros = []
for i in range(len(a)):
    if a[i] != 0:
        temp.append(a[i])
    else:
        zeros.append(a[i])
print(temp+zeros)
My program works, but I am not sure whether this is a good approach.

A sorted solution that avoids changing the order of the other elements is:
from operator import not_
sorted(a, key=not_)
or without an import:
sorted(a, key=lambda x: not x) # Or x == 0 for specific numeric test
By making the key a simple boolean, sorted splits it into things that are truthy followed by things that are falsy, and since it's a stable sort, the order of things within each category is the same as the original input.
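A short sketch of the difference between the two keys mentioned above. The `mixed` list is a made-up example to show that `not x` treats every falsy value as a zero, while `x == 0` only moves values equal to zero.

```python
a = [1, 2, 0, 0, 0, 3, 6]
# False sorts before True, and the stable sort keeps the original order
# inside each of the two groups.
print(sorted(a, key=lambda x: x == 0))  # [1, 2, 3, 6, 0, 0, 0]

mixed = [1, None, 0, "", 2]  # hypothetical input with other falsy values
by_truthiness = sorted(mixed, key=lambda x: not x)
by_zero_test = sorted(mixed, key=lambda x: x == 0)
print(by_truthiness)  # [1, 2, None, 0, '']
print(by_zero_test)   # [1, None, '', 2, 0]
```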

This looks like a list. Could you just use sort?
a = [1, 2, 0, 0, 0, 3, 6]
a.sort(reverse=True)
a
[6, 3, 2, 1, 0, 0, 0]

To move all the zeroes to the end of the list in one traversal while preserving the order of the other elements, we can keep a count of the non-zero elements seen so far and, whenever a non-zero element is encountered, swap it into position count.
This can be explained as:
arr = [18, 0, 4, 0, 0, 6]
count = 0
for i in range(len(arr)):
    if arr[i] != 0:
        arr[i], arr[count] = arr[count], arr[i]
        count += 1
How the loop works:
When i = 0, arr[i] is 18, so it is swapped with itself, which changes nothing, and count is incremented to 1. When i = 1, nothing happens, because the part of the list traversed so far is already what we want (the zero at the end). When i = 2, arr[i] = 4 and arr[count] = arr[1] = 0, so we swap them, leaving the list as [18, 4, 0, 0, 0, 6], and count becomes 2, signifying two non-zero elements at the beginning. The loop then continues in the same way.
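To follow the walk-through step by step, the same loop can be wrapped in a function that prints the list after every iteration. This is only a tracing sketch; the name `move_zeros_trace` is not from the answer.

```python
def move_zeros_trace(arr):
    """Move zeros to the end in place, printing the state after each step."""
    count = 0  # number of non-zero elements already placed at the front
    for i in range(len(arr)):
        if arr[i] != 0:
            # Swap the non-zero element into the next free front position.
            arr[i], arr[count] = arr[count], arr[i]
            count += 1
        print(f"i={i} count={count} arr={arr}")
    return arr


result = move_zeros_trace([18, 0, 4, 0, 0, 6])
```

For the list above, the final state is [18, 4, 6, 0, 0, 0].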

You can try my solution if you like
from typing import List
class Solution:
    def moveZeroes(self, nums: List[int]) -> None:
        for num in nums:
            if num == 0:
                nums.remove(num)
                nums.append(num)
I have tried this code in leetcode & my submission got accepted using above code.

Nothing wrong with your approach, really depends on how you want to store the resulting values. Here is a way to do it using list.extend() and list.count() that preserves order of the non-zero elements and results in a single list.
a = [1, 2, 0, 0, 0, 3, 6]
result = [n for n in a if n != 0]
result.extend([0] * a.count(0))
print(result)
# [1, 2, 3, 6, 0, 0, 0]

You can try this
a = [1, 2, 0, 0, 0, 3, 6]
x=[i for i in a if i!=0]
y=[i for i in a if i==0]
x.extend(y)
print(x)

There's nothing wrong with your solution, and you should always pick a solution you understand over a 'clever' one you don't if you have to look after it.
Here's an alternative which never makes a new list and only passes through the list once. It will also preserve the order of the items. If that's not necessary the reverse sort solution is miles better.
def zeros_to_the_back(values):
    zeros = 0
    for value in values:
        if value == 0:
            zeros += 1
        else:
            yield value
    yield from (0 for _ in range(zeros))
print(list(
    zeros_to_the_back([1, 2, 0, 0, 0, 3, 6])
))
# [1, 2, 3, 6, 0, 0, 0]
This works using a generator which spits out answers one at a time. If we spot a good value we return it immediately, otherwise we just count the zeros and then return a bunch of them at the end.
yield from is Python 3 specific, so if you are using Python 2, you can replace it with a loop that yields zero the required number of times.

Numpy solution that preserves the order
import numpy as np
a = np.asarray([1, 2, 0, 0, 0, 3, 6])
# mask is a boolean array that is True where a is equal to 0
mask = (a == 0)
# Take the subset of the array that are zeros
zeros = a[mask]
# Take the subset of the array that are NOT zeros
temp = a[~mask]
# Join the arrays
joint_array = np.concatenate([temp, zeros])

I tried using sorted, which is similar to sort().
a = [1, 2, 0, 0, 0, 3, 6]
sorted(a,reverse=True)
ans:
[6, 3, 2, 1, 0, 0, 0]

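from typing import List
def move(A: List[int]) -> None:
    j = 0  # index where the next non-zero element goes
    size = len(A)
    for i in range(size):
        if A[i] != 0:
            A[j] = A[i]
            j += 1
    # Write the zeros only after the pass: setting A[k] = 0 inside the loop
    # could overwrite elements at the end that have not been read yet.
    for k in range(-1, j - size - 1, -1):  # k = -1, -2, ... one per zero
        A[k] = 0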
Since we have to keep the relative order, when you see a non-zero element, place it at index j.
first_nonzero=A[0] # j=0
second_nonzero=A[1] # j=1
third_nonzero=A[2] # j=2
With k we keep track of 0 elements. In python A[-1] refers to the last element of the array.
first_zero=A[-1] # k=-1
second_zero=A[-2] # k=-2
third_zero= A[-3] # k=-3

a = [4,6,0,6,0,7,0]
# list() is required on Python 3, where filter returns an iterator
a = list(filter(lambda x: x != 0, a)) + [0]*a.count(0)
[4, 6, 6, 7, 0, 0, 0]

Can't figure out how to increment certain values in a list

So me and my buddy are helping each other learn about programming, and we have been coming up with challenges for ourselves. He came up with one where there are 20 switches. We need to write a program that first hits every other switch, and then every third switch, and then every fourth switch and have it output which are on and off.
I have the basic idea in my head about how to proceed but, I'm not entirely sure how to pick out every other/3rd/4th value from the list. I think once I get that small piece figured out the rest should be easy.
Here's the list:
start_list = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
I know I can select each element by doing:
start_list[2]
But then, how do I choose every other element, and then increment it by 1?

Use Python's List Slicing Notation:
start_list[::2]
Slicing goes as [start:stop:step]. [::2] means: from the beginning to the end, take every second element, starting with the first one (index 0).
I'm sure you can figure out how to get every third and fourth values :p.
To change the values of each, you can do this:
>>> start_list[::2] = len(start_list[::2])*[1]
>>> start_list
[1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
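The question asks to increment the selected values rather than set them to 1. A minimal sketch of that uses slice assignment with a list comprehension of the same length as the slice. Starting the second slice at index 2, so that it hits the 3rd, 6th, 9th, ... switch, is an assumption about how the switches are counted.

```python
start_list = [0] * 20

# Every second switch, starting at index 0: add 1 to each selected value.
start_list[::2] = [x + 1 for x in start_list[::2]]

# Every third switch, starting at index 2 (the 3rd, 6th, 9th, ... switch).
start_list[2::3] = [x + 1 for x in start_list[2::3]]

print(start_list)
# [1, 0, 2, 0, 1, 1, 1, 0, 2, 0, 1, 1, 1, 0, 2, 0, 1, 1, 1, 0]
```

Assigning to a stepped slice requires the right-hand side to have exactly as many items as the slice selects, which the comprehension guarantees.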

Every other switch:
mylist[::2]
Every third:
mylist[::3]
You can assign to it too:
mylist=[1,2,3,4]
mylist[::2]=[7,8]

>>> start_list = [0] * 20
>>> for i in range(2, len(start_list)):
...     start_list[::i] = [1-x for x in start_list[::i]]
...
>>> start_list
[0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1]
