How to remove duplicates in nested lists?

I want to remove both duplicates and permutations from my nested list.
Input:
[[-1, 0, 1], [-1, 1, 0], [-1, 2, -1], [-1, 2, -1], [-1, -1, 2]]
Expected Output:
[[-1, 0, 1], [-1, 2, -1]]
I tried using a list comprehension but I end up with the output as
[[-1, 1, 0], [-1, 2, -1], [-1, 0, 1], [-1, -1, 2]]
Here is what I attempted.
a = [[-1, 0, 1], [-1, 1, 0], [-1, 2, -1], [-1, 2, -1], [-1, -1, 2]]
b_set = set(tuple(x) for x in a)
b = [ list(x) for x in b_set ]
print(b)

That result is expected: [-1, 0, 1] != [-1, 1, 0], so the set treats them as two distinct tuples. Sort each inner list before converting it to a tuple so that permutations compare equal:
b_set = set(tuple(sorted(x)) for x in a)

Or with map:
b_set = set(map(lambda x: tuple(sorted(x)),a))
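A set does not keep the input order, and sorting replaces each row with its sorted form. If the first-seen rows should survive as written, a minimal sketch is to use the sorted tuple only as a lookup key (the helper name unique_up_to_order is an assumption, not something from the question):

```python
def unique_up_to_order(rows):
    """Keep the first row seen for each multiset of values, in input order."""
    seen = set()
    result = []
    for row in rows:
        key = tuple(sorted(row))  # permutations of a row share this key
        if key not in seen:
            seen.add(key)
            result.append(row)
    return result


a = [[-1, 0, 1], [-1, 1, 0], [-1, 2, -1], [-1, 2, -1], [-1, -1, 2]]
print(unique_up_to_order(a))  # [[-1, 0, 1], [-1, 2, -1]]
```

For this input the result matches the expected output in the question.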

Related

Why is this function being applied to a variable which is not called as a parameter?

I am having trouble with some code I am attempting to write.
I am attempting to take a list of lists of coordinates (representing possible positions of a shape in 3D) and form a list which consists of all the elements in the original list and additionally the elements in the original list rotated so that the [x, y, z] coordinates are shifted to include [z, x, y] and [y, z, x] also.
I think this is better illustrated with an example:
Taking the list (representing the possible positions of a 2x2x1 block, hence "two_by_two"):
two_by_two = [
[[-1, -1, 1], [-1, -1, 0], [-1, 0, 0], [-1, 0, 1]],
[[-1, -1, 0], [-1, -1, -1], [-1, 0, -1], [-1, 0, 0]]
...
]
(the ellipses representing more similar lists of coordinates) I am attempting to form the complete list:
two_by_two_comp = [
[[-1, -1, 1], [-1, -1, 0], [-1, 0, 0], [-1, 0, 1]],
[[-1, -1, 0], [-1, -1, -1], [-1, 0, -1], [-1, 0, 0]]
...
[[1, -1, -1], [0, -1, -1], [0, -1, 0], [1, -1, 0]],
[[0, -1, -1], [-1, -1, -1], [-1, -1, 0], [0, -1, 0]]
...
[[-1, 1, -1], [-1, 0, -1], [0, 0, -1], [0, 1, -1]],
[[-1, 0, -1], [-1, -1, -1], [0, -1, -1], [0, 0, -1]]
...
]
I hope that this is clear.
I am attempting to achieve this by using a function which shifts all of the coordinates in two_by_two:
# function to change [x, y, z] to [z, x, y]
def rotate_coordinates(parameter):
    coord_list = parameter[len(parameter) - 1]
    coordinates = coord_list[len(coord_list) - 1]
    z_coordinate = coordinates[2]
    coordinates.pop()
    coordinates.insert(0, z_coordinate)
# function to change list[x, y, z] to list[z, x, y]
def rotate_coord_list(parameter):
    coord_list = parameter[len(parameter) - 1]
    a = len(coord_list)
    while a > 0:
        coordinates = coord_list[len(coord_list) - 1]
        rotate_coordinates(parameter)
        coord_list.pop()
        coord_list.insert(0, coordinates)
        a = a - 1
# function to change list[list[x, y, z]] to list[list[z, x, y]]
def rotate_positions_list(parameter):
    b = len(parameter)
    while b > 0:
        coord_list = parameter[len(parameter) - 1]
        rotate_coord_list(parameter)
        parameter.pop()
        parameter.insert(0, coord_list)
        b = b - 1
This seems to me to be successful in that when I run:
print(two_by_two)
rotate_positions_list(two_by_two)
print(two_by_two)
It outputs:
[[[-1, -1, 1], [-1, -1, 0], [-1, 0, 0], [-1, 0, 1]],
[[-1, -1, 0], [-1, -1, -1], [-1, 0, -1], [-1, 0, 0]]
...]
[[[1, -1, -1], [0, -1, -1], [0, -1, 0], [1, -1, 0]],
[[0, -1, -1], [-1, -1, -1], [-1, -1, 0], [0, -1, 0]]
...]
And so it shifts all of the coordinates as I intended, the issue arises when I try to begin creating two_by_two_comp as so:
two_by_two_comp = []
two_by_two_comp.extend(two_by_two)
print(two_by_two_comp)
rotate_positions_list(two_by_two)
two_by_two_comp.extend(two_by_two)
print(two_by_two_comp)
Which returns:
[[[-1, -1, 1], [-1, -1, 0], [-1, 0, 0], [-1, 0, 1]],
[[-1, -1, 0], [-1, -1, -1], [-1, 0, -1], [-1, 0, 0]]
...]
[[[1, -1, -1], [0, -1, -1], [0, -1, 0], [1, -1, 0]],
[[0, -1, -1], [-1, -1, -1], [-1, -1, 0], [0, -1, 0]],
...
[[1, -1, -1], [0, -1, -1], [0, -1, 0], [1, -1, 0]],
[[0, -1, -1], [-1, -1, -1], [-1, -1, 0], [0, -1, 0]]
...]
So I end up with the same "version" of two_by_two copied twice, as opposed to the original and the shifted version. I have no idea why the section of two_by_two_comp which I print out first gets affected by the rotate_positions_list(two_by_two) function.
If anyone could clear up my confusion, I would be very grateful. I will include the full script in one piece below.
Thank you,
Dan
two_by_two = [
[[-1, -1, 1], [-1, -1, 0], [-1, 0, 0], [-1, 0, 1]],
[[-1, -1, 0], [-1, -1, -1], [-1, 0, -1], [-1, 0, 0]],
[[-1, 0, 0], [-1, 0, -1], [-1, 1, -1], [-1, 1, 0]],
[[-1, 0, 1], [-1, 0, 0], [-1, 1, 0], [-1, 1, 1]],
[[0, -1, 1], [0, -1, 0], [0, 0, 0], [0, 0, 1]],
[[0, -1, 0], [0, -1, -1], [0, 0, -1], [0, 0, 0]],
[[0, 0, 0], [0, 0, -1], [0, 1, -1], [0, 1, 0]],
[[0, 0, 1], [0, 0, 0], [0, 1, 0], [0, 1, 1]],
[[1, -1, 1], [1, -1, 0], [1, 0, 0], [1, 0, 1]],
[[1, -1, 0], [1, -1, -1], [1, 0, -1], [1, 0, 0]],
[[1, 0, 0], [1, 0, -1], [1, 1, -1], [1, 1, 0]],
[[1, 0, 1], [1, 0, 0], [1, 1, 0], [1, 1, 1]],
]
# function to change [x, y, z] to [z, x, y]
def rotate_coordinates(parameter):
    coord_list = parameter[len(parameter) - 1]
    coordinates = coord_list[len(coord_list) - 1]
    z_coordinate = coordinates[2]
    coordinates.pop()
    coordinates.insert(0, z_coordinate)
# function to change list[x, y, z] to list[z, x, y]
def rotate_coord_list(parameter):
    coord_list = parameter[len(parameter) - 1]
    a = len(coord_list)
    while a > 0:
        coordinates = coord_list[len(coord_list) - 1]
        rotate_coordinates(parameter)
        coord_list.pop()
        coord_list.insert(0, coordinates)
        a = a - 1
# function to change list[list[x, y, z]] to list[list[z, x, y]]
def rotate_positions_list(parameter):
    b = len(parameter)
    while b > 0:
        coord_list = parameter[len(parameter) - 1]
        rotate_coord_list(parameter)
        parameter.pop()
        parameter.insert(0, coord_list)
        b = b - 1
two_by_two_comp = []
two_by_two_comp.extend(two_by_two)
print(two_by_two_comp)
rotate_positions_list(two_by_two)
two_by_two_comp.extend(two_by_two)
print(two_by_two_comp)

Your problem lies in the difference between a deep copy and a shallow copy. As the Python docs for the copy module say:
Assignment statements in Python do not copy objects, they create bindings between a target and an object. For collections that are mutable or contain mutable items, a copy is sometimes needed so one can change one copy without changing the other.
The problematic line is thus:
two_by_two_comp.extend(two_by_two)
Let me illustrate with an example using two lists a and b:
a = [[2, 3, 4], [1, 2, 3]]
b = []
b.extend(a)
Now let's say I modify something inside a:
a[0].append(3)
print(a) # [[2, 3, 4, 3], [1, 2, 3]]
Everything is fine, but have a look at what happened to b in the meantime:
print(b) # [[2, 3, 4, 3], [1, 2, 3]]
It was also modified.
extend only copies references to the inner lists, so two_by_two_comp and two_by_two end up sharing the same list objects. To achieve what you want, you need to add a deep copy of two_by_two instead. Long story short, instead of:
two_by_two_comp.extend(two_by_two)
You must do:
two_by_two_comp.extend(copy.deepcopy(two_by_two))
Don't forget to import the copy module at the top of your script:
import copy
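An alternative to copying is to avoid mutating two_by_two in the first place. The sketch below is an assumption about how the rotation could be rewritten, not the asker's code: a hypothetical helper rotated builds brand-new lists with every [x, y, z] turned into [z, x, y], and the two-block sample is shortened from the question's data.

```python
def rotated(positions):
    """Return a new list with every [x, y, z] replaced by [z, x, y]."""
    return [[[z, x, y] for x, y, z in block] for block in positions]


two_by_two = [
    [[-1, -1, 1], [-1, -1, 0], [-1, 0, 0], [-1, 0, 1]],
    [[-1, -1, 0], [-1, -1, -1], [-1, 0, -1], [-1, 0, 0]],
]

once = rotated(two_by_two)   # [z, x, y]
twice = rotated(once)        # [y, z, x]
two_by_two_comp = two_by_two + once + twice

print(two_by_two[0][0])  # [-1, -1, 1], the original is untouched
print(once[0][0])        # [1, -1, -1]
print(twice[0][0])       # [-1, 1, -1]
```

Because nothing is modified in place, it does not matter that two_by_two_comp still shares the original inner lists with two_by_two.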

Transform matrix via rotation of values

How would you go about transforming the values of a matrix from
A=
[0, 1, 2]
[-1, 0, 1]
[-2, -1, 0]
To this:
[0, -1, -2]
[1, 0, -1]
[2, 1, 0]
The operation is a mirror over the y=-x axis

In numpy, do .T:
>>> import numpy as np
>>> A = np.array([[0, 1, 2],
...               [-1, 0, 1],
...               [-2, -1, 0]])
>>> A.T
array([[ 0, -1, -2],
[ 1, 0, -1],
[ 2, 1, 0]])
>>>
In regular Python, do zip:
>>> A = [[0, 1, 2],
...      [-1, 0, 1],
...      [-2, -1, 0]]
>>> list(zip(*A))
[(0, -1, -2), (1, 0, -1), (2, 1, 0)]
>>>
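Note that zip yields tuples. If the rows need to stay lists, as in the input, a small sketch is to convert each one back:

```python
A = [[0, 1, 2],
     [-1, 0, 1],
     [-2, -1, 0]]

# zip(*A) pairs up the i-th elements of every row, i.e. the columns of A
transposed = [list(column) for column in zip(*A)]
print(transposed)  # [[0, -1, -2], [1, 0, -1], [2, 1, 0]]
```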

Generating all combinations of 1 and -1 in n Dimensions with numpy?

I need a way to generate a numpy array with all possible combinations of [-1, 1] given a number of dimensions.
For example, if I have 2 dimensions I would get:
[[1, 1], [1, -1], [-1, 1], [-1, -1]]
If I have 3 dimensions I would get:
[[1, 1, 1], [1, 1, -1], [1, -1, 1], [1, -1, -1], [-1, 1, 1], [-1, 1, -1], [-1, -1, 1], [-1, -1, -1]],
I have tried something like this :
import numpy as np
def permgrid(n):
    inds = np.indices((2,) * n)
    return inds.reshape(n, -1).T
But this only returns all combinations of 0 and 1.

You can use the product function from itertools.
With repeat=n it yields every length-n tuple drawn from [1, -1], which is 2**n tuples in total. For two dimensions:
import itertools
print(list(itertools.product([1, -1], repeat=2)))
From the documentation:
itertools.product(*iterables[, repeat])
Cartesian product of input iterables.
Roughly equivalent to nested for-loops in a generator expression.
See the itertools documentation for more.
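Wrapped in a function of the number of dimensions, this could look like the sketch below. The name sign_grid is an assumption; the nested list it returns can be passed to np.array if a NumPy array is required.

```python
import itertools


def sign_grid(n):
    """All length-n rows of 1 and -1, with the last column varying fastest."""
    return [list(signs) for signs in itertools.product([1, -1], repeat=n)]


print(sign_grid(2))       # [[1, 1], [1, -1], [-1, 1], [-1, -1]]
print(len(sign_grid(3)))  # 8
```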

Here's a NumPy broadcasting-based method -
def broadcasting_typecast(n):
    return -2*((np.arange(2**n)[:,None] & (1 << np.arange(n-1,-1,-1))) != 0)+1
Sample runs -
In [231]: n = 2
In [232]: broadcasting_typecast(n)
Out[232]:
array([[ 1, 1],
[ 1, -1],
[-1, 1],
[-1, -1]])
In [233]: n = 3
In [234]: broadcasting_typecast(n)
Out[234]:
array([[ 1, 1, 1],
[ 1, 1, -1],
[ 1, -1, 1],
[ 1, -1, -1],
[-1, 1, 1],
[-1, 1, -1],
[-1, -1, 1],
[-1, -1, -1]])

Either replace the zeros with -1,
def permgrid(n):
    inds = np.indices((2,) * n)
    out = inds.reshape(n, -1).T
    return np.where(out==0, -np.ones_like(out), out)
or do it with math:
def permgrid(n):
    inds = np.indices((2,) * n)
    return inds.reshape(n, -1).T*2-1

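You might want to have a look at itertools, a standard-library module of iterator building blocks. Note that combinations_with_replacement([1, -1], 3) yields only the 4 unordered selections; product with repeat=3 gives all 8 ordered tuples:
import itertools as it
for element in it.product([1, -1], repeat=3):
    print(element)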

You could use np.ix_. Advantage: you can easily replace -1, 1 with whatever you like (other numbers, other dtypes, more than 2 values, etc.)
>>> n = 3
>>> out = np.empty(n*(2,)+(n,), dtype=int)
>>> for j, sl in enumerate(np.ix_(*(n*((-1,1),)))):
... out[..., j] = sl
...
>>> out
array([[[[-1, -1, -1],
[-1, -1, 1]],
[[-1, 1, -1],
[-1, 1, 1]]],
[[[ 1, -1, -1],
[ 1, -1, 1]],
[[ 1, 1, -1],
[ 1, 1, 1]]]])
Optionally:
flat_out = np.reshape(out, (-1, n))

Two dimensional matrix using for loop in Python

I want to make the following matrix using loops:
matrix = [[x - 3 , y - 3], [ x - 2 , y - 3], [x - 1, y - 3], [ x , y - 3],
[x - 3, y - 2], [x - 2, y - 2], [x - 1, y - 2], [x, y - 2],
[x - 3, y - 1], [x - 2, y - 1], [x - 1, y - 1], [x, y - 1],
[x - 3, y], [x - 2, y ], [x - 1, y ], [x, y],
[x - 3, y + 1], [x - 2, y + 1], [x - 1, y + 1], [x, y + 1],
[x - 3, y + 2], [x - 2, y + 2], [x - 1, y + 2], [x, y + 2],
[x - 3, y + 3], [x - 2, y + 3], [x - 1, y + 3], [x, y + 3]]
so that if I want to increase the constant from 3 to 5 or any other number, the matrix is created accordingly. It is a 7x4 matrix. Any suggestions? Thanks

Here's an approach with np.meshgrid -
r,c = np.ogrid[x-3:x+1, y-3:y+4]
out = np.dstack(np.meshgrid(r,c))
Sample input, output -
In [114]: x,y = 0,0
In [115]: out.tolist() # Showing as list
Out[115]:
[[[-3, -3], [-2, -3], [-1, -3], [0, -3]],
[[-3, -2], [-2, -2], [-1, -2], [0, -2]],
[[-3, -1], [-2, -1], [-1, -1], [0, -1]],
[[-3, 0], [-2, 0], [-1, 0], [0, 0]],
[[-3, 1], [-2, 1], [-1, 1], [0, 1]],
[[-3, 2], [-2, 2], [-1, 2], [0, 2]],
[[-3, 3], [-2, 3], [-1, 3], [0, 3]]]
You can also use np.mgrid that would produce X's and Y's swapped -
np.dstack(np.mgrid[y-3:y+4, x-3:x+1])

Here is another way to do it:
>>> def compute(x,y):
...     return [[x+j, y+i] for i in range(-3,4) for j in range(-3,1)]
...
>>> print(compute(0,0))
[[-3, -3], [-2, -3], [-1, -3], [0, -3], [-3, -2], [-2, -2], [-1, -2], [0, -2], [-3, -1], [-2, -1], [-1, -1], [0, -1], [-3, 0], [-2, 0], [-1, 0], [0, 0], [-3, 1], [-2, 1], [-1, 1], [0, 1], [-3, 2], [-2, 2], [-1, 2], [0, 2], [-3, 3], [-2, 3], [-1, 3], [0, 3]]
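The question also asks for the constant 3 to be adjustable. A minimal sketch of that, assuming the pattern generalises to x offsets from -k to 0 and y offsets from -k to k (the function name offset_grid and the parameter k are assumptions), and returning one inner list per row:

```python
def offset_grid(x, y, k=3):
    """Rows of [x + j, y + i] for j in -k..0 and i in -k..k."""
    return [[[x + j, y + i] for j in range(-k, 1)]
            for i in range(-k, k + 1)]


grid = offset_grid(0, 0)             # 7 rows of 4 pairs, as in the question
print(len(grid), len(grid[0]))       # 7 4
print(grid[0])                       # [[-3, -3], [-2, -3], [-1, -3], [0, -3]]

bigger = offset_grid(0, 0, k=5)
print(len(bigger), len(bigger[0]))   # 11 6
```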

Create 3D array using Python

I would like to create a 3D array in Python (2.7) to use like this:
distance[i][j][k]
And the size of the array should be determined by a variable I have (n*n*n).
I tried using:
distance = [[[]*n]*n]
but that didn't seem to work.
I can only use the default libraries, and the method of multiplying (i.e., [[0]*n]*n) won't work because the rows are linked to the same pointer and I need all of the values to be individual.

You should use a list comprehension:
>>> import pprint
>>> n = 3
>>> distance = [[[0 for k in xrange(n)] for j in xrange(n)] for i in xrange(n)]
>>> pprint.pprint(distance)
[[[0, 0, 0], [0, 0, 0], [0, 0, 0]],
[[0, 0, 0], [0, 0, 0], [0, 0, 0]],
[[0, 0, 0], [0, 0, 0], [0, 0, 0]]]
>>> distance[0][1]
[0, 0, 0]
>>> distance[0][1][2]
0
You could have produced a data structure with a statement that looked like the one you tried, but it would have had side effects since the inner lists are copy-by-reference:
>>> distance=[[[0]*n]*n]*n
>>> pprint.pprint(distance)
[[[0, 0, 0], [0, 0, 0], [0, 0, 0]],
[[0, 0, 0], [0, 0, 0], [0, 0, 0]],
[[0, 0, 0], [0, 0, 0], [0, 0, 0]]]
>>> distance[0][0][0] = 1
>>> pprint.pprint(distance)
[[[1, 0, 0], [1, 0, 0], [1, 0, 0]],
[[1, 0, 0], [1, 0, 0], [1, 0, 0]],
[[1, 0, 0], [1, 0, 0], [1, 0, 0]]]
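The sharing can be checked directly with the is operator, which compares object identity rather than contents. A short sketch (the variable names aliased and independent are only for illustration):

```python
n = 3
aliased = [[[0] * n] * n] * n
independent = [[[0 for k in range(n)] for j in range(n)] for i in range(n)]

# Multiplying a list repeats references to the same inner object.
print(aliased[0] is aliased[1])            # True
print(aliased[0][0] is aliased[2][1])      # True

# The comprehension evaluates its body afresh for every element.
print(independent[0] is independent[1])    # False
independent[0][0][0] = 1
print(independent[1][0][0])                # 0
```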

NumPy arrays are designed for exactly this case:
numpy.zeros((i,j,k))
will give you an array of dimensions i*j*k, filled with zeroes.
Depending on what you need it for, numpy may be the right library for your needs.

The right way would be
[[[0 for _ in range(n)] for _ in range(n)] for _ in range(n)]
(What you're trying to do should be written, for NxNxN, as
[[[0]*n]*n]*n
but that is not correct; see @Adaman's comment for why.)

d3 = [[[0 for col in range(4)] for row in range(4)] for x in range(6)]
d3[1][2][1] = 144
d3[4][3][0] = 3.12
for x in range(len(d3)):
    print(d3[x])
Output:
[[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
[[0, 0, 0, 0], [0, 0, 0, 0], [0, 144, 0, 0], [0, 0, 0, 0]]
[[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
[[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
[[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [3.12, 0, 0, 0]]
[[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]

"""
Create 3D array for given dimensions - (x, y, z)
@author: Naimish Agarwal
"""
def three_d_array(value, *dim):
    """
    Create 3D-array
    :param dim: a tuple of dimensions - (x, y, z)
    :param value: value with which 3D-array is to be filled
    :return: 3D-array
    """
    return [[[value for _ in xrange(dim[2])] for _ in xrange(dim[1])] for _ in xrange(dim[0])]
if __name__ == "__main__":
    array = three_d_array(False, *(2, 3, 1))
    x = len(array)
    y = len(array[0])
    z = len(array[0][0])
    print x, y, z
    array[0][0][0] = True
    array[1][1][0] = True
    print array
Prefer to use numpy.ndarray for multi-dimensional arrays.

You can also use nested for loops, as shown below:
n = 3
arr = []
for x in range(n):
    arr.append([])
    for y in range(n):
        arr[x].append([])
        for z in range(n):
            arr[x][y].append(0)
print(arr)

There are many ways to address your problem.
The first is the accepted answer by @robert. Here is a generalised
version of it:
def multi_dimensional_list(value, *args):
    # args: as many dimensions as you like, e.g. *args = 4, 3, 2 => x=4, y=3, z=2
    # value should be immutable (0, -1, 'X', etc.); a list passed here would be shared by every cell
    if len(args) > 1:
        return [multi_dimensional_list(value, *args[1:]) for col in range(args[0])]
    elif len(args) == 1: # base case of recursion
        return [value for col in range(args[0])]
    else: # edge case when no dimensions are specified
        return None
Eg:
>>> multi_dimensional_list(-1, 3, 4) #2D list
[[-1, -1, -1, -1], [-1, -1, -1, -1], [-1, -1, -1, -1]]
>>> multi_dimensional_list(-1, 4, 3, 2) #3D list
[[[-1, -1], [-1, -1], [-1, -1]], [[-1, -1], [-1, -1], [-1, -1]], [[-1, -1], [-1, -1], [-1, -1]], [[-1, -1], [-1, -1], [-1, -1]]]
>>> multi_dimensional_list(-1, 2, 3, 2, 2 ) #4D list
[[[[-1, -1], [-1, -1]], [[-1, -1], [-1, -1]], [[-1, -1], [-1, -1]]], [[[-1, -1], [-1, -1]], [[-1, -1], [-1, -1]], [[-1, -1], [-1, -1]]]]
P.S If you are keen to do validation for correct values for args i.e. only natural numbers, then you can write a wrapper function before calling this function.
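One way such a validating variant might look is sketched below. Everything here is an assumption rather than part of the answer: the name make_nd_list is hypothetical, and it takes a zero-argument callable instead of a value, so that mutable cells such as empty lists are created separately for each position.

```python
def make_nd_list(fill, *dims):
    """Build a nested list of shape dims, calling fill() once per cell."""
    if not dims:
        raise ValueError("at least one dimension is required")
    for d in dims:
        # bool is a subclass of int, so reject it explicitly
        if not isinstance(d, int) or isinstance(d, bool) or d < 1:
            raise ValueError("dimensions must be positive integers, got %r" % (d,))
    if len(dims) == 1:  # base case of the recursion
        return [fill() for _ in range(dims[0])]
    return [make_nd_list(fill, *dims[1:]) for _ in range(dims[0])]


grid = make_nd_list(list, 2, 3)   # every cell is its own empty list
grid[0][0].append("x")
print(grid)  # [[['x'], [], []], [[], [], []]]

zeros = make_nd_list(int, 2, 2, 2)  # int() returns 0
```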
Secondly, any multidimensional array can be stored as a one-dimensional array, which means you don't strictly need a nested structure. Here are the functions for converting between the two kinds of index:
def convert_single_to_multi(value, max_dim):
    dim_count = len(max_dim)
    values = [0]*dim_count
    for i in range(dim_count-1, -1, -1): #reverse iteration
        values[i] = value%max_dim[i]
        value //= max_dim[i] #integer division, so this also works on Python 3
    return values
def convert_multi_to_single(values, max_dim):
    dim_count = len(max_dim)
    value = 0
    length_of_dimension = 1
    for i in range(dim_count-1, -1, -1): #reverse iteration
        value += values[i]*length_of_dimension
        length_of_dimension *= max_dim[i]
    return value
Since these functions are inverses of each other, here is the output:
>>> convert_single_to_multi(convert_multi_to_single([1,4,6,7],[23,45,32,14]),[23,45,32,14])
[1, 4, 6, 7]
>>> convert_multi_to_single(convert_single_to_multi(21343,[23,45,32,14]),[23,45,32,14])
21343
If you are concerned about performance issues then you can use some libraries like pandas, numpy, etc.
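To show the flat-storage idea in use, here is a small sketch that keeps a 2 x 3 x 4 array in one list. The names flat_index and unflatten are hypothetical; they compute the same row-major offsets as the conversion functions above, written with Horner's scheme and divmod.

```python
def flat_index(indices, shape):
    """Row-major offset of `indices` within an array of the given shape."""
    offset = 0
    for idx, size in zip(indices, shape):
        if not 0 <= idx < size:
            raise IndexError("index %r out of range for size %r" % (idx, size))
        offset = offset * size + idx
    return offset


def unflatten(offset, shape):
    """Inverse of flat_index: recover the per-dimension indices."""
    indices = []
    for size in reversed(shape):
        offset, idx = divmod(offset, size)
        indices.append(idx)
    return indices[::-1]


shape = (2, 3, 4)
cells = [0] * (2 * 3 * 4)             # one flat list instead of nested lists
cells[flat_index((1, 2, 3), shape)] = 7

print(flat_index((1, 2, 3), shape))   # 23
print(unflatten(23, shape))           # [1, 2, 3]
```

Multiplying a list of integers is safe here because the cells are immutable values that get reassigned, not shared lists that get mutated.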

import numpy as np
n1 = np.arange(90).reshape((3, 3, -1))
print(n1)
print(n1.shape)

I just want to note that
distance = [[[0 for k in range(n)] for j in range(n)] for i in range(n)]
can be shortened to
distance = [[[0] * n for j in range(n)] for i in range(n)]

def n_arr(n, default=0, size=1):
    # n is the number of dimensions, size the length of each one
    if n == 0:
        return default
    return [n_arr(n-1, default, size) for _ in range(size)]
arr = n_arr(3, 42, 3)
assert arr[2][2][2] == 42

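If you insist on everything initializing as empty, you need an extra set of brackets on the inside ([[]] instead of [], since this is "a list containing 1 empty list to be duplicated" as opposed to "a list containing nothing to duplicate"). Be aware that the multiplication still repeats references: every innermost position refers to the same single list object, so appending through one index shows up at all of them.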
distance=[[[[]]*n]*n]*n
