Iterate over numpy.ma array, ignoring masked values - python

I would like to iterate over only unmasked values in a np.ma.ndarray.
With the following:
import numpy as np
a = np.ma.array([1, 2, 3], mask = [0, 1, 0])
for i in a:
    print(i)
I get:
1
--
3
I would like to get the following:
1
3
It seems like np.nditer() may be the way to go, but I can't find any flags that would do this. How might I do this? Thanks!

You want to use a.compressed(), which returns only the unmasked values as a 1-D array:
import numpy as np
a = np.ma.array([1, 2, 3], mask = [0, 1, 0])
for i in a.compressed():
    print(i)
which gives:
1
3
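
If the position of each unmasked value is also needed, one possible approach (a sketch that goes beyond the answer above) is to combine np.ma.getmaskarray with np.flatnonzero. The names `indices` and `pairs` are only illustrative.

```python
import numpy as np

a = np.ma.array([1, 2, 3], mask=[0, 1, 0])

# compressed() returns a plain 1-D ndarray holding only the unmasked values
values = a.compressed()

# getmaskarray() always returns a full boolean array, even if nothing is masked
mask = np.ma.getmaskarray(a)

# Positions of the unmasked entries
indices = np.flatnonzero(~mask)

# (index, value) pairs for the unmasked entries only
pairs = [(int(i), int(a[i])) for i in indices]
print(pairs)  # [(0, 1), (2, 3)]
```

Note that compressed() always flattens its result, so for a multi-dimensional masked array the original shape is lost.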

Related

Replace occurrences of a numpy array in another numpy array with a value

I have a 2D numpy array:
[[1,2,3,4,5],[2,4,5,6,7],[0,9,3,2,4]]
I also have a second 1D array:
[2,3,4]
I want to replace all occurrences of the elements of the second array with 0.
So eventually, my 2D array should look like:
[[1,0,0,0,5],[0,0,5,6,7],[0,9,0,0,0]]
Is there a way to do this in Python/NumPy without using a loop?
I already looked at np.where, but the condition there seems to test against a single value (for example, element == 1), not multiple values.
Thanks a lot !
Use numpy.isin.
>>> import numpy as np
>>> a = np.array([[1,2,3,4,5],[2,4,5,6,7],[0,9,3,2,4]])
>>> b = np.array([2,3,4])
>>> a[np.isin(a, b)] = 0
>>> a
array([[1, 0, 0, 0, 5],
       [0, 0, 5, 6, 7],
       [0, 9, 0, 0, 0]])
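
If the original array should stay untouched, a possible variant (a sketch, not part of the answer above) combines np.isin with np.where to build a new array instead of assigning in place. The name `replaced` is only illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3, 4, 5], [2, 4, 5, 6, 7], [0, 9, 3, 2, 4]])
b = np.array([2, 3, 4])

# np.isin builds a boolean mask with the same shape as a;
# np.where picks 0 where the mask is True and the original value elsewhere.
replaced = np.where(np.isin(a, b), 0, a)

print(replaced.tolist())
# [[1, 0, 0, 0, 5], [0, 0, 5, 6, 7], [0, 9, 0, 0, 0]]
```

This also shows that np.where is not limited to a single-value condition: any boolean array of matching shape works.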

Vaex: filter a dataframe using a mask from another series

I want to use a mask from series x to filter out a vaex dataframe y.
I know how to do this in pandas and numpy. In pandas it's like:
import pandas as pd
a = [0,0,0,1,1,1,0,0,0]
b = [4,5,7,8,9,9,0,6,4]
x = pd.Series(a)
y = pd.Series(b)
print(y[x==1])
The result is like:
3    8
4    9
5    9
dtype: int64
But in vaex, the following code doesn't work.
import vaex
import numpy as np
a = np.array([0, 0, 0, 1, 1, 1, 0, 0, 0])
b = np.array([4, 5, 7, 8, 9, 9, 0, 6, 4])
x = vaex.from_arrays(x=a)
y = vaex.from_arrays(x=b)
print(y[x.x == 1].values)
The result is empty:
[]
It seems that Vaex doesn't have the same index concept as pandas and NumPy. Although the two dataframes have the same shape, y can't be filtered with the mask x.x == 1.
Is there any way to achieve the equivalent of the pandas result, please?
Thanks
While Vaex has a similar API to that of pandas (similarly named methods that do the same thing), the implementations of the two libraries are completely different, so it is not easy to "mix and match".
In order to work with any kind of data, that data needs to be part of the same Vaex dataframe.
So in order to achieve what you want, something like this is possible:
import vaex
import numpy as np
a = np.array([0, 0, 0, 1, 1, 1, 0, 0, 0])
b = np.array([4, 5, 7, 8, 9, 9, 0, 6, 4])
y = vaex.from_arrays(x1=b)
y.add_column(name='x2', f_or_array=a)
print(y[y.x2 == 1])

Make 1 dimensional array 2D using numpy

I have a list of numbers to which I wish to add a second column, so that the array becomes 2D as in the example below:
a = [1,1,1,1,1]
b = [2,2,2,2,2]
should become:
c = [[1,2],[1,2],[1,2],[1,2],[1,2]]
I am not sure how to do this using numpy.
I would just stack them and then transpose the resulting array with .T:
import numpy as np
a = np.array([1, 1, 1, 1, 1])
b = np.array([2, 2, 2, 2, 2])
c = np.stack((a, b)).T
Use numpy built-in functions:
import numpy as np
c = np.vstack((np.array(a),np.array(b))).T.tolist()
np.vstack stacks arrays vertically. .T transposes the array and tolist() converts it back to a list.
Another similar way to do it is to add a dimension using [:,None], after which you can stack them horizontally without needing to transpose:
c = np.hstack((np.array(a)[:,None],np.array(b)[:,None])).tolist()
output:
[[1, 2], [1, 2], [1, 2], [1, 2], [1, 2]]
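
As a further option (a sketch, not taken from the answers above), np.column_stack treats each 1-D input as a column, so neither the transpose nor the [:,None] step is needed.

```python
import numpy as np

a = [1, 1, 1, 1, 1]
b = [2, 2, 2, 2, 2]

# column_stack turns each 1-D sequence into one column of a 2-D array
c = np.column_stack((a, b))

print(c.shape)     # (5, 2)
print(c.tolist())  # [[1, 2], [1, 2], [1, 2], [1, 2], [1, 2]]
```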

Efficient way to create a numpy boolean array from an array [duplicate]

This question already has answers here:
Construct two dimensional numpy array from indices and values of a one dimensional array
(3 answers)
Closed 4 years ago.
I am trying to convert a numpy array
np.array([1,3,2])
to
np.array([[1,0,0],[0,0,1],[0,1,0]])
Any idea of how to do this efficiently?
Thanks!
Create a bool array, and then fill it:
import numpy as np
a = np.array([1, 2, 3, 0, 3, 2, 1])
b = np.zeros((len(a), a.max() + 1), bool)
b[np.arange(len(a)), a] = 1
It is also possible to just select the right values from np.eye or the identity matrix:
a = np.array([1,3,2])
b = np.eye(max(a))[a-1]
This would probably be the most straightforward.
You can compare to [1, 2, 3] like so:
>>> a = np.array([1,3,2])
>>> np.equal.outer(a, np.arange(1, 4)).view(np.int8)
array([[1, 0, 0],
       [0, 0, 1],
       [0, 1, 0]], dtype=int8)
or, equivalently but slightly faster:
>>> (a[:, None] == np.arange(1, 4)).view(np.int8)
Try pandas's get_dummies function.
import pandas as pd
import numpy as np
arr = np.array([1 ,3, 2])
df = pd.get_dummies(arr)
If what you need is a numpy array object, do:
arr2 = df.values
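
To compare the pure-NumPy approaches from the answers above, here is a small sketch that builds the same matrix two ways and checks that they agree. It assumes the labels run from 1 to the maximum value in the array; the names `via_eye` and `via_compare` are only illustrative.

```python
import numpy as np

a = np.array([1, 3, 2])
n = int(a.max())  # number of classes, assuming labels run from 1 to n

# Row i of the identity matrix is the one-hot vector for label i + 1
via_eye = np.eye(n, dtype=np.int8)[a - 1]

# Broadcast a column of labels against the row [1, ..., n]
via_compare = (a[:, None] == np.arange(1, n + 1)).astype(np.int8)

print(via_eye.tolist())                      # [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
print(np.array_equal(via_eye, via_compare))  # True
```

Here astype(np.int8) is used rather than view(np.int8); it makes a copy but does not rely on the memory layout of the bool dtype.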

R's order equivalent in python

Any ideas what Python's equivalent of R's order is?
order(c(10,2,-1, 20), decreasing = F)
# 3 2 1 4
In numpy there is a function named argsort
import numpy as np
lst = [10,2,-1,20]
np.argsort(lst)
# array([2, 1, 0, 3])
Note that indices start at 0 in Python but at 1 in R, so the results differ by one.
It is numpy.argsort()
import numpy
a = numpy.array([10,2,-1, 20])
a.argsort()
# array([2, 1, 0, 3])
If you want the equivalent of the decreasing = T option, you can try:
(-a).argsort()
#array([3, 0, 1, 2])
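
To reproduce R's 1-based output exactly, one option (a sketch, assuming numeric input) is to add 1 to the argsort result. Passing kind="stable" keeps tied values in their original order.

```python
import numpy as np

a = np.array([10, 2, -1, 20])

# argsort returns 0-based positions; adding 1 gives R-style 1-based positions
increasing = np.argsort(a, kind="stable") + 1

# Negating the values reverses the order (works for numeric arrays only)
decreasing = np.argsort(-a, kind="stable") + 1

print(increasing.tolist())  # [3, 2, 1, 4]
print(decreasing.tolist())  # [4, 1, 2, 3]
```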
