Print Values From 2D Numpy Array - python

I'm new to numpy and have read several other posts like mine but nothing is working for me.
I have a large array with many NaNs and I'd like to look at values that are not NaN.
import numpy as np
NaN = np.nan  # so the literal used below is defined

flower_matrix = np.array([
[NaN,1,2,NaN,NaN,NaN,NaN,NaN,NaN,NaN,10,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN],
[0,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,12,13,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN],
[0,NaN,NaN,3,4,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN],
[NaN,NaN,2,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,22,23],
[NaN,NaN,2,NaN,NaN,5,6,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN],
[NaN,NaN,NaN,NaN,4,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,16,17,NaN,NaN,NaN,NaN,NaN,NaN],
[NaN,NaN,NaN,NaN,4,NaN,NaN,7,8,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,],
[NaN,NaN,NaN,NaN,NaN,NaN,6,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,18,19,NaN,NaN,NaN,NaN],
[NaN,NaN,NaN,NaN,NaN,NaN,6,NaN,NaN,9,10,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN],
[NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,8,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,20,21,NaN,NaN],
[0,NaN,NaN,NaN,NaN,NaN,NaN,NaN,8,NaN,NaN,11,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN],
[NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,10,NaN,NaN,NaN,14,15,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN],
[NaN,1,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,15,NaN,NaN,NaN,19,NaN,NaN,NaN,NaN],
[NaN,1,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,18,NaN,NaN,NaN,22,NaN],
[NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,11,NaN,NaN,NaN,NaN,NaN,17,NaN,NaN,NaN,21,NaN,NaN],
[NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,11,12,NaN,NaN,NaN,16,NaN,NaN,NaN,NaN,NaN,NaN,NaN],
[NaN,NaN,NaN,NaN,NaN,5,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,15,NaN,NaN,NaN,NaN,NaN,NaN,NaN,23],
[NaN,NaN,NaN,NaN,NaN,5,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,14,NaN,NaN,NaN,18,NaN,NaN,NaN,NaN,NaN],
[NaN,NaN,NaN,NaN,NaN,NaN,NaN,7,NaN,NaN,NaN,NaN,NaN,13,NaN,NaN,NaN,17,NaN,NaN,NaN,NaN,NaN,NaN],
[NaN,NaN,NaN,NaN,NaN,NaN,NaN,7,NaN,NaN,NaN,NaN,12,NaN,NaN,NaN,NaN,NaN,NaN,NaN,20,NaN,NaN,NaN],
[NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,9,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,19,NaN,NaN,NaN,23],
[NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,9,NaN,NaN,NaN,NaN,14,NaN,NaN,NaN,NaN,NaN,NaN,NaN,22,NaN],
[NaN,NaN,NaN,3,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,13,NaN,NaN,NaN,NaN,NaN,NaN,NaN,21,NaN,NaN],
[NaN,NaN,NaN,3,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,16,NaN,NaN,NaN,20,NaN,NaN,NaN]])
I know that I can do
print(flower_matrix[0,1])
to get the value 1.0. I'm looking to do something similar but iterating through rows and columns. My best guess is something like:
for i in flower_matrix:
    for j in flower_matrix:
        if (i,j) != NaN:
            print(i,j)
But of course this doesn't work. I have 24 columns and 24 rows and I want to iterate through each value and return the value if it is not NaN. Does this make sense?
Thanks in advance!

You can use numpy's isnan function instead of a != comparison, which does not work for NaN values. You may also try Python's is keyword.
In Python:
NaN == NaN gives False
NaN is NaN gives True (np.nan is a single object, so the identity check works here)
This should help you with your problem.
print(flower_matrix[~np.isnan(flower_matrix)])
If you want the iterative case:
for i in flower_matrix:
    for j in i:
        if j == j:  # NaN is the only value not equal to itself
            print(j)
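If you also need the row and column of each non-NaN value, a minimal sketch (using np.argwhere, and assuming flower_matrix as defined above) could look like this:
import numpy as np

# indices of every entry that is not NaN
for row, col in np.argwhere(~np.isnan(flower_matrix)):
    print(row, col, flower_matrix[row, col])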

Related

Pandas - change cell value based on conditions from cell and from column

I have a DataFrame with a lot of "bad" cells. Let's say they all have -99.99 as values, and I want to remove them (set them to NaN).
This works fine:
df[df == -99.99] = None
But actually I want to delete all these cells ONLY if another cell in the same row is marked as 1 (e.g. in the column "Error").
I want to delete all -99.99 cells, but only if df["Error"] == 1.
The most straightforward solution, I think, is something like
df[(df == -99.99) & (df["Error"] == 1)] = None
but it gives me the error:
ValueError: cannot reindex from a duplicate axis
I tried every solution given on the internet but I can't get it to work! :(
Since my DataFrame is big I don't want to iterate over it (which, of course, would work, but would take a lot of time).
Any hint?
Try using broadcasting while passing numpy values:
import numpy as np
import pandas as pd

# sample data, special value is -99
df = pd.DataFrame([[-99, -99, 1], [2, -99, 2],
                   [1, 1, 1], [-99, 0, 1]],
                  columns=['a', 'b', 'Errors'])
# note the double square brackets
df[(df == -99) & (df[['Errors']] == 1).values] = np.nan
Output:
     a     b  Errors
0  NaN   NaN       1
1  2.0 -99.0       2
2  1.0   1.0       1
3  NaN   0.0       1
At least, this is working (but with column iteration):
for i in df.columns:
    df.loc[df[i].isin([-99.99]) & df["Error"].isin([1]), i] = None
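A vectorized alternative sketch using DataFrame.mask (same assumed sample frame and 'Errors' column as above; the double brackets keep a 2-D shape so the row condition broadcasts across every column):
import numpy as np
import pandas as pd

df = pd.DataFrame([[-99, -99, 1], [2, -99, 2],
                   [1, 1, 1], [-99, 0, 1]],
                  columns=['a', 'b', 'Errors'])

# rows where Errors == 1, kept 2-D so the condition broadcasts over all columns
bad_rows = (df[['Errors']] == 1).to_numpy()
df = df.mask((df == -99) & bad_rows)  # cells where the condition is True become NaN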

Replace values by NaN

I've got a dataframe that looks like this:
[index, Data]
[1, [5, 3, 6, 8, 4, 5, 7, ...]]
The data in my "Data" column is stored as an array. I need to have at least 75 values in each array. The dataframe is 438 rows long.
I need to make a filter where all the arrays that contains less than 75 values, will be replaced by NaN.
I thought of something like this:
for i in range(len(df_window)):
    if len(df_window['Data'][i][0]) < 75:
I don't know if this is right or how to continue. The dataframe is called df_window.
Can someone help me quickly, please?
You can use lengths = df_window['Data'].apply(len) to get the Series of array lengths. Then by using df_window.loc[(lengths < 75), 'Data'] = np.nan you should get what you want.
EDIT: Corrected first line.
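A minimal self-contained sketch of that answer (the three toy arrays below are assumptions, not the actual data):
import numpy as np
import pandas as pd

df_window = pd.DataFrame({'Data': [np.arange(80), np.arange(40), np.arange(100)]})

lengths = df_window['Data'].apply(len)        # Series of array lengths
df_window.loc[lengths < 75, 'Data'] = np.nan  # arrays shorter than 75 become NaN
print(df_window)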

Why is max and min of numpy array nan?

What could be the reason why the max and min of my numpy array are nan?
I checked my array with:
for i in range(data[0]):
    if data[i] == numpy.nan:
        print("nan")
And there is no nan in my data.
Is my search wrong?
If not: What could be the reason for max and min being nan?
Here you go:
import numpy as np
a = np.array([1, 2, 3, np.nan, 4])
print(f'a.max() = {a.max()}')
print(f'np.nanmax(a) = {np.nanmax(a)}')
print(f'a.min() = {a.min()}')
print(f'np.nanmin(a) = {np.nanmin(a)}')
Output:
a.max() = nan
np.nanmax(a) = 4.0
a.min() = nan
np.nanmin(a) = 1.0
Balaji Ambresh showed precisely how to find the min / max even if the source array contains NaN; there is nothing to add on that matter.
But your code sample also contains other flaws that deserve to be pointed out.
Your loop contains for i in range(data[0]):.
You probably wanted to execute this loop for each element of data,
but your loop will be executed as many times as the value of
the initial element of data.
Depending on that value:
If it is e.g. 1, the loop will be executed only once.
If it is 0 or negative, it will not be executed at all.
If it is >= the size of data, an IndexError exception will be raised.
If your array contains at least one NaN, then the whole array is of float type (NaN is a special case of float) and you get a TypeError exception: 'numpy.float64' object cannot be interpreted as an integer.
Remedy (one possible variant): the loop should start with for elem in data: and the code inside should use elem as the current element of data.
The next line contains if data[i] == numpy.nan:.
Even if you corrected it to if elem == np.nan:, the code inside
the if block will never be executed.
The reason is that np.nan is by definition not equal to any other value, even if this other value is another np.nan.
Remedy: change it to if np.isnan(elem): (Balaji wrote in his comment how to change your code; I added why).
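Putting both remedies together, a minimal sketch of the corrected check (the sample data array is an assumption):
import numpy as np

data = np.array([1.0, 2.0, np.nan, 4.0])

for elem in data:        # iterate over the elements, not range(data[0])
    if np.isnan(elem):   # a NaN test that actually works
        print("nan")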
And finally, how to quickly check an array for NaNs:
To get a detailed list, whether each element is NaN, run np.isnan(data)
and you will get a bool array.
To get a single answer, whether data contains at least one NaN,
no matter where, run np.isnan(data).any().
This code is shorter and runs significantly faster.
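For example, with the small data array from the sketch above:
np.isnan(data)        # array([False, False,  True, False])
np.isnan(data).any()  # True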
The reason is that np.nan == x is always False, even when x is np.nan. This is aligned with the definition of NaN (see Wikipedia).
Check yourself:
In [4]: import numpy as np
In [5]: np.nan == np.nan
Out[5]: False
If you want to check if a number x is np.nan, you must use
np.isnan(x)
If you want to get max/min of an np.array with nan's, use np.nanmax()/ np.nanmin():
minval = np.nanmin(data)
Simply use np.nanmax(variable_name) and np.nanmin(variable_name):
import numpy as np

z = np.arange(10, 20)
z = np.where(z < 15, np.nan, z)  # set values below 15 to NaN
print(z)
print("z max value excluding nan:", np.nanmax(z))
print("z min value excluding nan:", np.nanmin(z))

Values being altered in numpy array

So I have a 2D numpy array (256, 256), containing values between 0 and 10, which is essentially an image. I need to remove the 0 values and set them to NaN so that I can plot the array using a specific library (APLpy). However, whenever I try to change all of the 0 values, some of the other values get altered, in some cases to 100 times their original value (no idea why).
The code I'm using is:
for index, value in np.ndenumerate(tex_data):
    if value == 0:
        tex_data[index] = 'NaN'
where tex_data is the data array from which I need to remove the zeros. Unfortunately I can't just use a mask for the values I don't need, as APLpy won't accept masked arrays as far as I can tell.
Is there anyway I can set the 0 values to NaN without changing the other values in the array?
Use fancy-indexing. Like this:
tex_data[tex_data==0] = np.nan
I don't know why your original code was failing. It looks correct to me, although terribly inefficient.
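A minimal sketch of that approach (the small sample array is an assumption; note that NaN can only be stored in a float array, so cast first if tex_data has an integer dtype):
import numpy as np

tex_data = np.array([[0, 3, 0], [5, 0, 7]], dtype=float)  # NaN requires a float dtype
tex_data[tex_data == 0] = np.nan
print(tex_data)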
Using float rules,
tex_data / tex_data * tex_data
also does the job here.
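A quick sketch of why that trick works (sample float array assumed): 0/0 evaluates to NaN (with a RuntimeWarning), while any other value x survives since x/x*x == x:
import numpy as np

tex_data = np.array([[0., 3., 0.], [5., 0., 7.]])
result = tex_data / tex_data * tex_data  # zeros become NaN, other values are unchanged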

numpy.insert() invalid slice -- Trying to Insert NaN in Numpy Array

I know there are already lots of questions about this, but none of the answers I've seen have solved my problem. I have a pandas DataFrame with 10 columns for data, but on some rows I have just 9 columns-worth of data. For the rows with just 9 datapoints, I need the data to be in the last nine columns. My solution is to insert a NaN value in front of the length-9 arrays so that the data is pushed to the correct columns. But everything I've tried has thrown up errors!
(I'm trying to insert NaN into a numpy array that looks like this: [6070000.0 6639000.0 15004000.0 15944000.0 8888000.0 9896000.0 22502500.0 23577000.0 14835500.0])
My current best guess:
a = np.array(a, dtype=float)
a = np.insert(a, np.nan, 0)
IndexError: invalid slice
Any ideas about how I can get this doggone NaN into the array?
Your code is currently attempting to insert 0 at index np.nan. Switch the args around:
a = np.insert(a, 0, np.nan)
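A runnable sketch of the fix, using a shortened version of the array from the question:
import numpy as np

a = np.array([6070000.0, 6639000.0, 15004000.0], dtype=float)  # first three values only
a = np.insert(a, 0, np.nan)  # insert NaN at index 0, pushing the data one slot to the right
print(a)  # nan followed by the original values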
