matplotlib 2d numpy array - python

I have created a 2d numpy array as:
for line in finp:
    tdos = []
    for _ in range(250):
        sdata = finp.readline()
        tdos.append(sdata.split())
    break
tdos = np.array(tdos)
Which results in:
[['-3.463' '0.0000E+00' '0.0000E+00' '0.0000E+00' '0.0000E+00']
['-3.406' '0.0000E+00' '0.0000E+00' '0.0000E+00' '0.0000E+00']
['-3.349' '-0.2076E-29' '-0.3384E-30' '-0.1181E-30' '-0.1926E-31']
...,
['10.594' '0.2089E+02' '0.3886E+02' '0.9742E+03' '0.9664E+03']
['10.651' '0.1943E+02' '0.3915E+02' '0.9753E+03' '0.9687E+03']
['10.708' '0.2133E+02' '0.3670E+02' '0.9765E+03' '0.9708E+03']]
Now I need to plot $0:$1 and $0:-$2 using matplotlib, so that on the x axis I have:
tdos[i][0] (i.e. -3.463, -3.406, -3.349, ..., 10.708)
and on the y axis I have:
tdos[i][1] (i.e. 0.0000E+00, 0.0000E+00, -0.2076E-29, ..., 0.2133E+02)
How can I define the x axis and y axis from the numpy array?
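One possible approach, as a minimal sketch rather than a definitive answer: the array holds strings, so it can be converted to floats first and then sliced by column. The four rows below are copied from the output above; the names data, x, y1 and y2 are illustrative, and the Agg backend line is only there so the sketch also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption); drop this line to get a window
import matplotlib.pyplot as plt
import numpy as np

# A few rows in the same shape as the question's output (strings, 5 columns).
tdos = np.array([
    ['-3.463', '0.0000E+00', '0.0000E+00', '0.0000E+00', '0.0000E+00'],
    ['-3.349', '-0.2076E-29', '-0.3384E-30', '-0.1181E-30', '-0.1926E-31'],
    ['10.594', '0.2089E+02', '0.3886E+02', '0.9742E+03', '0.9664E+03'],
    ['10.708', '0.2133E+02', '0.3670E+02', '0.9765E+03', '0.9708E+03'],
])

data = tdos.astype(float)  # parses every entry, including the E-notation ones
x = data[:, 0]             # column 0 on the x axis
y1 = data[:, 1]            # $0:$1
y2 = -data[:, 2]           # $0:-$2

fig, ax = plt.subplots()
ax.plot(x, y1, label="column 1")
ax.plot(x, y2, label="-column 2")
ax.legend()
# With an interactive backend, plt.show() would display the figure here.
```

Converting once with astype(float) keeps the plotting calls free of string handling; without it, matplotlib would treat the values as categories rather than numbers.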

Just try the following recipe and see if it is what you want (two image plot methods followed by the same methods but with cropped image):
import matplotlib.pyplot as plt
import numpy as np
X, Y = np.meshgrid(range(100), range(100))
Z = X**2+Y**2
plt.imshow(Z,origin='lower',interpolation='nearest')
plt.show()
plt.pcolormesh(X,Y,Z)
plt.show()
plt.imshow(Z[20:40,30:70],origin='lower',interpolation='nearest')
plt.show()
plt.pcolormesh(X[20:40,30:70],Y[20:40,30:70],Z[20:40,30:70])
plt.show()

Related

Strange output in matplotlib

Can someone explain why I get this strange output when running this code:
import matplotlib.pyplot as plt
import numpy as np
def x_y():
    return np.random.randint(9999, size=1000), np.random.randint(9999, size=1000)
plt.plot(x_y())
plt.show()
The output:
Your data is a tuple of two 1000 length arrays.
def x_y():
    return np.random.randint(9999, size=1000), np.random.randint(9999, size=1000)
xy = x_y()
print(len(xy))
# > 2
print(xy[0].shape)
# > (1000,)
Let's read pyplot's documentation:
plot(y) # plot y using x as index array 0..N-1
Thus pyplot will plot a line between (0, xy[0][i]) and (1, xy[1][i]), for i in range(1000).
You probably meant to do this:
plt.plot(*x_y())
This time, it will plot 1000 points joined by lines: (xy[0][i], xy[1][i]) for i in range(1000).
Yet, the lines don't represent anything here. Therefore you probably want to see individual points:
plt.scatter(*x_y())
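The difference described above can be checked by counting the line objects that plot returns. This is a small sketch; the names lines_single and lines_unpacked are illustrative, and the Agg backend is an assumption made so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption); not needed interactively
import matplotlib.pyplot as plt
import numpy as np

def x_y():
    return np.random.randint(9999, size=1000), np.random.randint(9999, size=1000)

xy = x_y()

# A single argument is treated as y. The tuple becomes a (2, 1000) array,
# so each of the 1000 columns is drawn as its own two-point line.
fig1, ax1 = plt.subplots()
lines_single = ax1.plot(xy)

# Unpacked, the two arrays are the x and y of one line with 1000 points.
fig2, ax2 = plt.subplots()
lines_unpacked = ax2.plot(*xy)

print(np.asarray(xy).shape)  # (2, 1000)
print(len(lines_single))     # 1000
print(len(lines_unpacked))   # 1
```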
Your function x_y returns a tuple; unpacking it into two variables gives the expected plot:
import matplotlib.pyplot as plt
import numpy as np
def x_y():
    return np.random.randint(9999, size=1000), np.random.randint(9999, size=1000)
x, y = x_y()
plt.plot(x, y)
plt.show()

Normal distribution appears too dense when plotted in matplotlib

I am trying to estimate the probability density function (PDF) of my data. In my case, the data is a satellite image with shape 8200 x 8100.
Below is the code for the PDF (the function 'is_outlier' is borrowed from code that someone posted here). As figure 1 shows, the plotted PDF is too dense. I guess this is due to the thousands of pixels that make up the satellite image. It looks very ugly.
My question is: how can I plot a PDF that is not too dense, something like the one shown in figure 2?
lst = 'satellite_img.tif' #import the image
lst_flat = lst.flatten() #create 1D array
#the function below removes the outliers
def is_outlier(points, thres=3.5):
    if len(points.shape) == 1:
        points = points[:,None]
    median = np.median(points, axis=0)
    diff = np.sum((points - median)**2, axis=-1)
    diff = np.sqrt(diff)
    med_abs_deviation = np.median(diff)
    modified_z_score = 0.6745 * diff / med_abs_deviation
    return modified_z_score > thres
lst_flat = np.r_[lst_flat]
lst_flat_filtered = lst_flat[~is_outlier(lst_flat)]
fit = stats.norm.pdf(lst_flat_filtered, np.mean(lst_flat_filtered), np.std(lst_flat_filtered))
plt.plot(lst_flat_filtered, fit)
plt.hist(lst_flat_filtered, bins=30, density=True)  # 'normed' was removed in newer matplotlib
plt.show()
figure 1
figure 2
The issue is that the x values in the PDF plot are not sorted, so the plotted line goes back and forth between random points, creating the mess you see.
Two options:
1. Don't plot the line, just plot the points (not great if you have lots of points, but it will confirm whether the explanation above is right):
plt.plot(lst_flat_filtered, fit, 'bo')
2. Sort the lst_flat_filtered array before calculating the PDF and plotting it:
lst_flat = np.r_[lst_flat]
lst_flat_filtered = np.sort(lst_flat[~is_outlier(lst_flat)]) # Changed this line
fit = stats.norm.pdf(lst_flat_filtered, np.mean(lst_flat_filtered), np.std(lst_flat_filtered))
plt.plot(lst_flat_filtered, fit)
Here are some minimal examples showing these behaviours:
Reproducing your problem:
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt
lst_flat_filtered = np.random.normal(7, 5, 1000)
fit = stats.norm.pdf(lst_flat_filtered, np.mean(lst_flat_filtered), np.std(lst_flat_filtered))
plt.hist(lst_flat_filtered, bins=30, density=True)  # 'normed' was removed in newer matplotlib
plt.plot(lst_flat_filtered, fit)
plt.show()
Plotting points:
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt
lst_flat_filtered = np.random.normal(7, 5, 1000)
fit = stats.norm.pdf(lst_flat_filtered, np.mean(lst_flat_filtered), np.std(lst_flat_filtered))
plt.hist(lst_flat_filtered, bins=30, density=True)  # 'normed' was removed in newer matplotlib
plt.plot(lst_flat_filtered, fit, 'bo')
plt.show()
Sorting the data:
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt
lst_flat_filtered = np.sort(np.random.normal(7, 5, 1000))
fit = stats.norm.pdf(lst_flat_filtered, np.mean(lst_flat_filtered), np.std(lst_flat_filtered))
plt.hist(lst_flat_filtered, bins=30, density=True)  # 'normed' was removed in newer matplotlib
plt.plot(lst_flat_filtered, fit)
plt.show()
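For an image with tens of millions of pixels, a variation that may be cheaper than sorting every value is to evaluate the fitted PDF on a small, evenly spaced grid. This is a sketch of that alternative and not part of the answer above: the random data stands in for the filtered pixel values, and the grid size of 200 and the Agg backend are arbitrary assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption); not needed interactively
import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as stats

rng = np.random.default_rng(0)
data = rng.normal(7, 5, 100_000)  # stand-in for the filtered pixel values

mu, sigma = np.mean(data), np.std(data)

# Evaluate the fitted normal PDF on 200 evenly spaced x values, which are
# already in increasing order, instead of on every data point.
grid = np.linspace(data.min(), data.max(), 200)
fit = stats.norm.pdf(grid, mu, sigma)

fig, ax = plt.subplots()
ax.hist(data, bins=30, density=True)
ax.plot(grid, fit)
# With an interactive backend, plt.show() would display the figure here.
```

Because the grid is monotonic, the line is drawn left to right in one pass, and the number of plotted points no longer depends on the image size.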

Plot class probabilities using matplotlib

I have two numpy arrays y_prob and dataY whose values correspond. dataY is a one dimensional array where each value is a 1 or a 0. y_prob is a two dimensional array. I wish to plot a scatter plot using y_prob to determine the location and dataY to determine the color of the point. How can I do this?
Sample data:
y_prob = [[0.5,0.5], [0.3,0.7], [0.2,0.8], [0.1,0.9]]
dataY = [1,0,0,0]
You can use the standard packages numpy and matplotlib:
import numpy as np
import matplotlib.pyplot as plt
y_prob = np.array([[0.5,0.5], [0.3,0.7], [0.2,0.8], [0.1,0.9]])
dataY = [1,0,0,0]
fig = plt.figure()
plt.scatter(x=y_prob[:,0], y=y_prob[:,1], c=dataY)
plt.show()  # blocks until the window is closed, unlike fig.show() in a script
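If a legend naming each class is wanted, one option is to draw one scatter call per class using a boolean mask. This is a sketch under assumptions: the label text "class 0" / "class 1" and the axis labels are made up for illustration, and the Agg backend is only there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption); not needed interactively
import matplotlib.pyplot as plt
import numpy as np

y_prob = np.array([[0.5, 0.5], [0.3, 0.7], [0.2, 0.8], [0.1, 0.9]])
dataY = np.array([1, 0, 0, 0])

fig, ax = plt.subplots()
for cls in np.unique(dataY):
    mask = dataY == cls  # True for the rows that belong to this class
    ax.scatter(y_prob[mask, 0], y_prob[mask, 1], label=f"class {cls}")
ax.set_xlabel("y_prob[:, 0]")
ax.set_ylabel("y_prob[:, 1]")
ax.legend()
```

Each class gets its own color from the default color cycle, so no explicit c argument is needed in this variant.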

Plot or reshape 2D array matplotlib

I have no idea how to make a scatter plot from a 2D array of this type:
a=[[x0,t0],[x1,t1],...,[xn,tn]]
The plot should be x vs t. Alternatively, instead of doing this with a matplotlib routine, I could reshape a to obtain:
a=[[x0,x1,...,xn],[t0,t1,...,tn]]
Thanks!
Assuming your data starts in the format a = [[x0, t0]]:
Split x & t into separate lists, then you can pass them into matplotlib.
import matplotlib.pyplot as plt
x = [i[0] for i in a]
t = [i[1] for i in a]
plt.plot(x, t)
You can use numpy.transpose:
import numpy as np
a=[["x0","t0"],["x1","t1"],["xn","tn"]]
np.transpose(a)
# array([['x0', 'x1', 'xn'],
# ['t0', 't1', 'tn']],
# dtype='<U2')
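With numeric data, the transposed array can be unpacked directly into the two coordinate arrays and passed to scatter. A small sketch, where the values in a are hypothetical and the Agg backend is an assumption so that it runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption); not needed interactively
import matplotlib.pyplot as plt
import numpy as np

a = [[0.0, 1.0], [0.5, 1.5], [1.0, 4.0]]  # hypothetical [x, t] pairs

# Transposing gives one row of x values and one row of t values,
# which unpack into two 1D arrays.
x, t = np.transpose(a)

fig, ax = plt.subplots()
ax.scatter(x, t)
ax.set_xlabel("x")
ax.set_ylabel("t")
```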

add legend to numpy array in matplotlib

I am plotting 2D numpy arrays using
import numpy as np
import matplotlib.pyplot as plt
x = np.array([1,2,3])
y = np.array([[2,2.2,3],[1,5,1]])
plt.plot(x,y.T[:,:])
plt.legend()
plt.show()
I want a legend that tells which line belongs to which row. Of course, I realize it can't give the lines meaningful names, but I need some sort of unique label for each line without running through a loop.
import numpy as np
import matplotlib.pyplot as plt
import uuid
x = np.array([1,2,3])
y = np.array([[2,2.2,3],[1,5,1]])
fig, ax = plt.subplots()
lines = ax.plot(x,y.T[:,:])
ax.legend(lines, [str(uuid.uuid4())[:6] for j in range(len(lines))])
plt.show()
(This is off of the current mpl master branch with a preview of the 2.0 default styles)
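If readable labels are preferred over random identifiers, the row index can serve as the label. This is a variation on the code above, not a different mechanism: it still passes the line handles and a list of labels to legend. The "row i" label format and the Agg backend are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption); not needed interactively
import matplotlib.pyplot as plt
import numpy as np

x = np.array([1, 2, 3])
y = np.array([[2, 2.2, 3], [1, 5, 1]])

fig, ax = plt.subplots()
lines = ax.plot(x, y.T)  # one line per row of y

# Label each line with the index of the row it came from.
labels = [f"row {i}" for i in range(len(lines))]
legend = ax.legend(lines, labels)
```

The lines come back from plot in column order of y.T, which is the row order of y, so the index in the label matches the row in the array.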
