Delete line from 2D array in Python - python

I have a numpy array of shape (585L, 2L) in Python.
The data is organized like in this example.
0,0
1,0
2,1
3,0
4,0
5,1
...
I would like to create an array that omits all rows where the second column is 0. Thus, the intended final result is:
2,1
5,1
I have read some related examples but I'm still struggling with this.

Since you mention your structure being a numpy array, rather than a list, I would use numpy logical indexing to select only the values you care for.
>>> import numpy as np
>>> x = [[0,0], [1,0], [2,1], [3,0], [4,0], [5,1]] # Create dummy list
>>> x = np.asarray(x) # Convert list to numpy array
>>> x[x[:, 1] != 0] # Select the rows whose second column is not zero
array([[2, 1],
[5, 1]])
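To see why the logical indexing above works, it may help to look at the intermediate boolean mask. The following is a minimal sketch on the same dummy data; the names `mask`, `kept`, `zero_rows` and `also_kept` are only illustrative, and `np.delete` is shown as one possible alternative rather than a recommendation.

```python
import numpy as np

x = np.asarray([[0, 0], [1, 0], [2, 1], [3, 0], [4, 0], [5, 1]])

# One boolean per row: True where the second column is non-zero.
mask = x[:, 1] != 0

# Boolean indexing keeps only the rows where the mask is True.
kept = x[mask]
print(kept)

# Alternative: find the rows whose second column is zero and delete them.
zero_rows = np.where(x[:, 1] == 0)[0]
also_kept = np.delete(x, zero_rows, axis=0)
print(also_kept)
```

Both approaches return a new array and leave `x` unchanged.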

Here is my answer, assuming your list looks like this: [[0,0],[2,1],[4,3],[2,0]]
If your list structure is different, please tell me.
This prints the rows whose second element is not equal to 0:
print([x for x in your_list if x[1] != 0]) # your_list is the variable holding the list

You could use a list comprehension. These are described on the Python Data Structures page (Python 2, Python 3).
If your array is:
x = [[0,0],
[1,0],
[2,1],
[3,0],
[4,0],
[5,1]]
Then the following command
[y for y in x if y[1] != 0]
Would return the desired result of:
[[2, 1], [5, 1]]
Edit: I overlooked that it was a numpy array. Taking that into account, JoErNanO's answer is better.

Related

Reverse sort a column-based numpy array

I want to sort (descending) a numpy array that has been reshaped into a single column. However, the following code does not seem to work.
a = array([5,1,2,4,9,2]).reshape(-1, 1)
a_sorted = np.sort(a)[::-1]
print("a=",a)
print("a_sorted=",a_sorted)
Output is
a= [[5]
[1]
[2]
[4]
[9]
[2]]
a_sorted= [[2]
[9]
[4]
[2]
[1]
[5]]
That is due to the reshape function. If I remove that, the sort works fine. How can I fix that?
Here the axis needs to be 0 (column-wise sorting):
np.sort(a,axis=0)[::-1]
Discussion:
a = np.array([[4,1],[23,2]])
print(a)
Output:
[[ 4 1]
[23 2]]
# axis=None (sort the flattened array)
print(np.sort(a,axis=None))
Output:
[ 1 2 4 23]
# axis=1 (row-wise sort; the default is axis=-1, the last axis, which is axis 1 for a 2D array)
print(np.sort(a,axis=1))
[[ 1 4]
[ 2 23]]
# axis=0 (column-wise sort)
print(np.sort(a,axis=0))
[[ 4 1]
[23 2]]
For more details have a look in:
https://numpy.org/doc/stable/reference/generated/numpy.sort.html
As @tmdavison pointed out in the comments, you forgot to use the axis option: by default, np.sort sorts along the last axis, i.e. within each row. By calling reshape you turn the array into a one-column matrix, and sorting each of its one-element rows trivially returns the matrix itself.
This would do the job
import numpy as np
a = np.array([5,1,2,4,9,2]).reshape(-1, 1)
a_sorted = np.sort(a, axis = 0)[::-1]
print("a=",a)
print("a_sorted=",a_sorted)
Extra points:
reference to the doc of sort
Next time remember to make the code reproducible (no np before array and no imports in your example). This was an easy case but it's not always like this
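The effect of the axis argument on a one-column array can be checked directly. This is a small sketch using the array from the question; the variable names are illustrative, and the double-negation variant is an extra option for numeric data that the answers above do not mention.

```python
import numpy as np

a = np.array([5, 1, 2, 4, 9, 2]).reshape(-1, 1)

# Default axis=-1 sorts within each row; every row has one element, so nothing moves.
unchanged = np.sort(a)

# axis=0 sorts down the column; [::-1] then reverses the row order.
descending = np.sort(a, axis=0)[::-1]

# Another way to get descending order for numeric data: negate, sort, negate back.
descending_alt = -np.sort(-a, axis=0)

print(descending.ravel())
```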

Stable conversion of a multi-column (2D) numpy array to an indicator vector

I often need to convert a multi-column (or 2D) numpy array into an indicator vector in a stable (i.e., order preserved) manner.
For example, I have the following numpy array:
import numpy as np
arr = np.array([
[2, 20, 1],
[1, 10, 3],
[2, 20, 2],
[2, 20, 1],
[1, 20, 3],
[2, 20, 2],
])
The output I like to have is:
indicator = [0, 1, 2, 0, 3, 2]
How can I do this (preferably using numpy only)?
Notes:
I am looking for a high performance (vectorized) approach as the arr (see the example above) has millions of rows in a real application.
I am aware of the following auxiliary solutions, but none is ideal. It would be nice to hear an expert's opinion.
My thoughts so far:
1. Numpy's unique: This would not work, as it is not stable:
arr_unq, indicator = np.unique(arr, axis=0, return_inverse=True)
print(arr_unq)
# output 1:
# [[ 1 10 3]
# [ 1 20 3]
# [ 2 20 1]
# [ 2 20 2]]
print(indicator)
# output 2:
# [2 0 3 2 1 3]
Notice how the indicator starts from 2. This is because unique function returns a "sorted" array (see output 1). However, I would like it to start from 0.
Of course I can use LabelEncoder from sklearn to convert the items in a manner that they start from 0 but I feel that there is a simple numpy trick that I can use and therefore avoid adding sklearn dependency to my program.
Or I can resolve this by a dictionary mapping like below, but I can imagine that there is a better or more elegant solution:
dct = {}
for idx, item in enumerate(indicator):
    if item not in dct:
        dct[item] = len(dct)
    indicator[idx] = dct[item]
print(indicator)
# outputs:
# [0 1 2 0 3 2]
2. Stabilizing numpy's unique output: This solution is already posted on Stack Overflow and correctly returns a stable unique array. But I do not know how to convert the returned indicator vector (returned when return_inverse=True) so that it represents the values in a stable order starting from 0.
3. Pandas's get_dummies function: It returns a "hot encoding" (a matrix of indicator values), whereas I would like to have an indicator vector. It is possible to convert the "hot encoding" to the indicator vector with a few lines of code and data manipulation, but that approach is not going to be very efficient.
In addition to return_inverse, you can add the return_index option. This will tell you the first occurrence of each sorted item:
unq, idx, inv = np.unique(arr, axis=0, return_index=True, return_inverse=True)
Now you can use the fact that np.argsort is its own inverse to fix the order. Note that idx.argsort() places unq into sorted order. The corrected result is therefore
indicator = idx.argsort().argsort()[inv]
And of course the byproduct
unq = unq[idx.argsort()]
Of course, nothing about these operations is specific to 2D.
A Note on the Intuition
Let's say you have an array x:
x = np.array([7, 3, 0, 1, 4])
x.argsort() is the index that tells you what elements of x are placed at each of the locations in the sorted array. So
i = x.argsort() # 2, 3, 1, 4, 0
But how would you get from np.sort(x) back to x (which is the problem you express in #2)?
Well, it happens that i tells you the original position of each element in the sorted array: the first (smallest) element was originally at index 2, the second at 3, ..., the last (largest) element was at index 0. This means that to place np.sort(x) back into its original order, you need the index that puts i into sorted order. That means that you can write x as
np.sort(x)[i.argsort()]
Which is equivalent to
x[i][i.argsort()]
OR
x[x.argsort()][x.argsort().argsort()]
So, as you can see, np.argsort is effectively its own inverse: argsorting something twice gives you the index to put it back in the original order.
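Putting the pieces together on the array from the question, the approach can be sketched end to end as follows. The names `order`, `rank` and `stable_unq` are illustrative, and the `ravel()` call is a defensive assumption in case the installed NumPy version returns the inverse with an extra dimension.

```python
import numpy as np

arr = np.array([
    [2, 20, 1],
    [1, 10, 3],
    [2, 20, 2],
    [2, 20, 1],
    [1, 20, 3],
    [2, 20, 2],
])

unq, idx, inv = np.unique(arr, axis=0, return_index=True, return_inverse=True)
inv = inv.ravel()  # defensive flatten (assumption about version differences)

order = idx.argsort()    # positions of the sorted unique rows, by first appearance
rank = order.argsort()   # new label for each sorted unique row
indicator = rank[inv]    # relabel every original row
stable_unq = unq[order]  # unique rows in order of first appearance

print(indicator.tolist())  # [0, 1, 2, 0, 3, 2]
```

Indexing `stable_unq` with `indicator` reconstructs `arr`, which is a convenient sanity check on real data.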

Generation of numpy arrays for permutations with constraints

To make a long story short, I'm trying to generate all the possible permutations of a set of numpy arrays. I have three numbers [j,k,m] and I would like to specify a maximum value for each one [J,K,M]. How would I then get all the combinations of arrays under these values? How could I force the k values to always be even as well? For instance:
So with the max values set to [1,2,2], the permutations would be: [0,0,0], [0,0,1], [0,0,2], [0,2,0], [0,2,1], [0,2,2], [1,0,0], [1,0,1] ...
I realise I don't have any example to code to show but I'm afraid I have literally no idea where to start with this.
From other answers it seems like sympy would be of some use?
I found an answer here that might interest you and generalised it. You can construct a list of possible values for each item like so:
X = [[0, 1], [0, 1, 2], [0, 1, 2]]
And then use:
np.array(np.meshgrid(*X)).T.reshape(-1, len(X))
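The output contains 18 rows, one per combination; note that this X does not yet restrict k to even values. If you only have the maximum values [J, K, M], you can construct X using X = [range(J+1), range(K+1), range(M+1)]; using range(0, K+1, 2) for the second entry keeps only the even values of k.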
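The even-k constraint from the question can be handled by giving the second range a step of 2. The sketch below is one possible variant, not the answerer's exact code: it assumes the maximum values [1, 2, 2] from the question and uses `indexing="ij"` with `np.stack` so that the rows come out in the same order as the question lists them.

```python
import numpy as np

J, K, M = 1, 2, 2  # maximum values taken from the question's example

# Candidate values per position; the step of 2 keeps k even.
X = [range(J + 1), range(0, K + 1, 2), range(M + 1)]

# "ij" indexing keeps the first axis slowest, so rows come out in lexicographic order.
grids = np.meshgrid(*X, indexing="ij")
combos = np.stack(grids, axis=-1).reshape(-1, len(X))

print(combos.shape)  # (12, 3)
print(combos)
```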

best way to create a numpy array from a list and additional individual values

I want to create an array from list entries and some additional individual values.
I am using the following approach which seems clumsy:
x=[1,2,3]
y=some_variable1
z=some_variable2
x.append(y)
x.append(z)
arr = np.array(x)
#print arr --> [1 2 3 some_variable1 some_variable2]
Is there a better solution to the problem?
You can use list addition to add the variables all placed in a list to the larger one, like so:
arr = np.array(x + [y, z])
Appending or concatenating lists is fine, and probably fastest.
Concatenating at the array level works as well
In [456]: np.hstack([x,y,z])
Out[456]: array([1, 2, 3, 4, 5])
This is compact, but under the covers it does
np.concatenate([np.array(x),np.array([y]),np.array([z])])
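For comparison, the options mentioned above can be run side by side. This is a small sketch; the values 4 and 5 are assumed stand-ins for some_variable1 and some_variable2, and `np.append` and list unpacking are additional variants not mentioned in the answers.

```python
import numpy as np

x = [1, 2, 3]
y, z = 4, 5  # assumed stand-ins for some_variable1 and some_variable2

via_list = np.array(x + [y, z])    # builds a new list; x itself is not modified
via_unpack = np.array([*x, y, z])  # same idea using unpacking
via_hstack = np.hstack([x, y, z])  # converts each piece to an array, then joins them
via_append = np.append(x, [y, z])  # flattens both arguments and concatenates them

print(via_list)
```

Unlike the `x.append(...)` approach in the question, none of these variants changes the original list.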

Two dimensional array in python

I want to know how to declare a two dimensional array in Python.
arr = [[]]
arr[0].append("aa1")
arr[0].append("aa2")
arr[1].append("bb1")
arr[1].append("bb2")
arr[1].append("bb3")
The first two assignments work fine. But when I try to do, arr[1].append("bb1"), I get the following error:
IndexError: list index out of range.
Am I doing anything silly in trying to declare the 2-D array?
Edit:
but I do not know the number of elements in the array (both rows and columns).
You do not "declare" arrays or anything else in python. You simply assign to a (new) variable. If you want a multidimensional array, simply add a new array as an array element.
arr = []
arr.append([])
arr[0].append('aa1')
arr[0].append('aa2')
or
arr = []
arr.append(['aa1', 'aa2'])
There aren't multidimensional arrays as such in Python; what you have is a list containing other lists.
>>> arr = [[]]
>>> len(arr)
1
What you have done is declare a list containing a single list. So arr[0] contains a list but arr[1] is not defined.
You can define a list containing two lists as follows:
arr = [[],[]]
Or to define a longer list you could use:
>>> arr = [[] for _ in range(5)]
>>> arr
[[], [], [], [], []]
What you shouldn't do is this:
arr = [[]] * 3
As this puts the same list in all three places in the container list:
>>> arr[0].append('test')
>>> arr
[['test'], ['test'], ['test']]
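The difference between the two constructions can be made visible with an identity check. This is a short sketch; the names `shared` and `independent` are only illustrative.

```python
shared = [[]] * 3                     # three references to one inner list
independent = [[] for _ in range(3)]  # three separate inner lists

print(shared[0] is shared[1])            # True
print(independent[0] is independent[1])  # False

shared[0].append('test')
independent[0].append('test')

print(shared)       # [['test'], ['test'], ['test']]
print(independent)  # [['test'], [], []]
```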
What you're using here are not arrays, but lists (of lists).
If you want multidimensional arrays in Python, you can use Numpy arrays. You'd need to know the shape in advance.
For example:
import numpy as np
arr = np.empty((3, 2), dtype=object)
arr[0, 1] = 'abc'
You are trying to append to the second element of the list, but it does not exist yet.
Create it first:
arr = [[]]
arr[0].append("aa1")
arr[0].append("aa2")
arr.append([])
arr[1].append("bb1")
arr[1].append("bb2")
arr[1].append("bb3")
We can create a multidimensional array dynamically as follows.
Create 2 variables to read x and y from standard input:
print("Enter the value of x: ")
x=int(input())
print("Enter the value of y: ")
y=int(input())
Create a list of lists with initial values filled with 0 (or anything else) using the following code:
z=[[0 for col in range(0,y)] for row in range(0,x)]
This creates x rows and y columns for your data.
Read data from standard input:
for i in range(x):
    for j in range(y):
        z[i][j]=input()
Display the Result:
for i in range(x):
    for j in range(y):
        print(z[i][j],end=' ')
    print("\n")
or use another way to display above dynamically created array is,
for row in z:
    print(row)
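The construction step above can also be wrapped in a small helper so that it can be tried without typing input. This is a minimal sketch; `make_grid` and its `fill` parameter are hypothetical names, not part of any library.

```python
def make_grid(rows, cols, fill=0):
    """Return a rows x cols list of lists; every row is a separate list."""
    return [[fill for _ in range(cols)] for _ in range(rows)]


z = make_grid(2, 3)
z[1][2] = 7  # only the last cell of the second row changes

for row in z:
    print(row)
```

Because each row is built by its own inner comprehension, assigning to one cell does not affect the other rows.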
When constructing multi-dimensional lists in Python I usually use something similar to ThiefMaster's solution, but rather than appending items to index 0, then appending items to index 1, etc., I always use index -1 which is automatically the index of the last item in the array.
i.e.
arr = []
arr.append([])
arr[-1].append("aa1")
arr[-1].append("aa2")
arr.append([])
arr[-1].append("bb1")
arr[-1].append("bb2")
arr[-1].append("bb3")
will produce the 2D-array (actually a list of lists) you're after.
You can first append elements to the initialized array and then for convenience, you can convert it into a numpy array.
import numpy as np
a = [] # declare null array
a.append(['aa1']) # append elements
a.append(['aa2'])
a.append(['aa3'])
print(a)
a_np = np.asarray(a) # convert to numpy array
print(a_np)
a = [[] for index in range(n)] # n is the desired number of rows
For competitive programming
1) To read the values into a 2D array
row=int(input())
main_list=[]
for i in range(0,row):
    temp_list=list(map(int,input().split()))
    main_list.append(temp_list)
2) To display the 2D array
for i in range(0,row):
    for j in range(0,len(main_list[0])):
        print(main_list[i][j],end=' ')
    print()
The method above did not work for me inside a for loop, where I wanted to transfer data from a 2D array to a new array under an if condition. This method works:
a_2d_list = [[1, 2], [3, 4]]
a_2d_list.append([5, 6])
print(a_2d_list)
OUTPUT - [[1, 2], [3, 4], [5, 6]]
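The use case described above, copying rows into a new list under a condition, might look like the following sketch. The data and the condition (second element non-zero) are made up for illustration.

```python
source = [[1, 2], [3, 4], [5, 6], [7, 0]]

selected = []
for row in source:
    if row[1] != 0:                 # hypothetical condition
        selected.append(list(row))  # append a copy so source stays independent

print(selected)  # [[1, 2], [3, 4], [5, 6]]
```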
x=3#rows
y=3#columns
a=[]#create an empty list first
for i in range(x):
    a.append([0]*y)#append a new row of y zeros to the original list
    for j in range(y):
        a[i][j]=input("Enter the value")
In my case I had to do this:
for index, user in enumerate(users):
    table_body.append([])
    table_body[index].append(user.user.id)
    table_body[index].append(user.user.username)
Output:
[[1, 'john'], [2, 'bill']]
