Matplotlib plotting additional lines - python

I have data like this:
X = array([ 24.41, 54.98, 89.57, 114.26, 133.61, 202.14, 250.99, 321.31,
333.47, 373.79, 422.02, 447.41, 522.47, 549.53, 20.15, 39.12,
73.42, 134.03, 179.86, 262.52, 337.23, 432.68, 253.24, 346.62,
450.1 , 552.22, 656.2 , 33.84, 60.41, 94.88, 147.73, 206.76,
237.12, 372.72, 495.47, 544.47, 28.93, 49.87, 85.15, 143.84,
226.86, 339.15, 393.32, 524.7 , 623.86, 39.22, 96.44, 156.92,
223.88, 271.78, 349.52, 429.66, 523.03, 622.05, 748.29, 646.89,
749.27, 851.37, 851.61])
y = array([ 0.70168044, 4.93931985, 8.71831269, 10.84590729, 12.22458808,
15.46380214, 16.61898425, 17.29600649, 17.34369784, 17.43434118,
17.50907445, 17.57419685, 18.00322011, 18.26260499, 0.03686716,
2.85433237, 7.0779359 , 12.25192523, 14.65463193, 16.79352551,
17.35594282, 17.53284075, 16.65553712, 17.38224061, 17.58297862,
18.29143563, 19.71214346, 2.10666383, 5.59990814, 9.21325511,
13.08716841, 15.60686344, 16.36464679, 17.43271999, 17.80134835,
18.20983513, 1.38643181, 4.29326544, 8.28990266, 12.86092195,
16.1416266 , 17.36179504, 17.46194981, 18.02244612, 19.22640164,
2.86822848, 9.35464796, 13.58885705, 16.07082828, 16.91213557,
17.38928103, 17.52563605, 18.00801144, 19.19976288, 20.8797045 ,
19.5713721 , 20.88735117, 20.40458438, 20.39937509])
When I plot them using:
import matplotlib.pyplot as plt
plt.plot(X, y, "ro", markersize=6)
plt.plot(X, y)
I get:
My expectation would be one line that connects from red point to red point, but no matter how I tweak the parameters a subset of points get connected by straight lines. I am reading through the documentation and can't figure out what parameters to tweak to stop these lines from appearing.

Sort your data by X before drawing the line:
import numpy as np
index = np.argsort(X)
plt.plot(X, y, "ro", markersize=6)
plt.plot(X[index], y[index])
plt.show()
If you don't do this, the line is drawn through the points in the order they appear in your data, which is what you see in your plot.
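As a minimal sketch of the same idea on a handful of made-up points (the helper name sort_by_x and the demo values are illustrative, not taken from the question), both arrays can be reordered together with a single index array:

```python
import numpy as np

def sort_by_x(x, y):
    """Return copies of x and y reordered so that x is ascending.

    Hypothetical helper for illustration; not part of matplotlib or numpy.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # a stable sort keeps the original relative order of points with equal x
    order = np.argsort(x, kind="stable")
    return x[order], y[order]

# made-up, unsorted demo data
X_demo = np.array([24.41, 133.61, 54.98, 20.15, 89.57])
y_demo = np.array([0.70, 12.22, 4.94, 0.04, 8.72])

xs, ys = sort_by_x(X_demo, y_demo)
print(xs)
print(ys)
```

Passing xs and ys to plt.plot should then give one line running left to right, since each y value still travels with its own x value.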

Related

Interpolate time series, select y value from x

I have been searching for an answer to this for a while, and have gotten close but keep running into errors. There are a lot of similar questions that almost answer this, but I haven't been able to solve it. Any help or a point in the right direction is appreciated.
I have a graph showing temperature as a mostly non-linear function of depth, with the x and y values drawn from a pandas data frame.
import matplotlib.pyplot as plt
x = (22.81, 22.81, 22.78, 22.71, 22.55, 22.54, 22.51, 22.37)
y = (5, 16, 23, 34, 61, 68, 77, 86)
#Plot details
plt.figure(figsize=(10,7)), plt.plot(style='.-')
plt.title("Temperature as a Function of Depth")
plt.xlabel("Temperature"), plt.ylabel("Depth")
plt.gca().invert_yaxis()
plt.plot(x,y, linestyle='--', marker='o', color='b')
Which gives me an image somewhat like this one (note the flipped y axis since I'm talking about depth):
I would like to find the y value at a specific x value of 22.61, which is not one of the original temperature values in the dataset. I've tried the following steps:
np.interp(22.61, x1, y1)
Which gives me a value that I know to be incorrect, as does
s = pd.Series([5,16,23,34,np.nan,61,68,77,86], index=[22.81,22.81,22.78,22.71,22.61,22.55,22.54,22.51,22.37])
s.interpolate(method='index')
where I am trying to just set up a frame and force the interpolation. I also tried
line = plt.plot(x,y)
xvalues = line[0].get_xdata()
yvalues = line[0].get_ydata()
idx = np.where(xvalues==xvalues[3]) ## 3 is the position
yvalues[idx]
but this returns y values for a specific, already-listed x value, rather than an interpolated one.
I hope this is clear enough. I'm brand new to data science, and to stackoverflow, so if I need to rephrase the question please let me know.
You may indeed use the numpy.interp function. As the documentation states
The x-coordinates of the data points, must be increasing [...]
So you need to sort the arrays on the x array, before using this function.
# Sort arrays
xs = np.sort(x)
ys = np.array(y)[np.argsort(x)]
# x coordinate
x0 = 22.61
# interpolated y coordinate
y0 = np.interp(x0, xs, ys)
Complete Code:
import numpy as np
import matplotlib.pyplot as plt
x = (22.81, 22.81, 22.78, 22.71, 22.55, 22.54, 22.51, 22.37)
y = (5, 16, 23, 34, 61, 68, 77, 86)
# Sort arrays
xs = np.sort(x)
ys = np.array(y)[np.argsort(x)]
# x coordinate
x0 = 22.61
# interpolated y coordinate
y0 = np.interp(x0, xs, ys)
#Plot details
plt.figure(figsize=(10,7)), plt.plot(style='.-')
plt.title("Temperature as a Function of Depth")
plt.xlabel("Temperature"), plt.ylabel("Depth")
plt.gca().invert_yaxis()
plt.plot(x,y, linestyle='--', marker='o', color='b')
plt.plot(x0,y0, marker="o", color="C3")
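If this comes up often, the sort-then-interpolate steps could be wrapped in a small function. This is only a sketch: the name interp_unsorted is made up for illustration, and it assumes plain linear interpolation is what is wanted.

```python
import numpy as np

def interp_unsorted(x0, x, y):
    """Linearly interpolate y at x0 when x is not in ascending order.

    Hypothetical helper: sorts both arrays by x first, because np.interp
    expects increasing x-coordinates.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x, kind="stable")
    return np.interp(x0, x[order], y[order])

# data from the question: temperature (x) and depth (y)
x = (22.81, 22.81, 22.78, 22.71, 22.55, 22.54, 22.51, 22.37)
y = (5, 16, 23, 34, 61, 68, 77, 86)

depth = interp_unsorted(22.61, x, y)
print(depth)
```

For 22.61 this lands between the points (22.55, 61) and (22.71, 34), which is where a depth of about 50.9 comes from.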
I think Scipy provides a more intuitive API to solve this problem. You can then easily continue working with your data in Pandas.
from scipy.interpolate import interp1d
x = np.array((22.81, 22.81, 22.78, 22.71, 22.55, 22.54, 22.51, 22.37))
y = np.array((5, 16, 23, 34, 61, 68, 77, 86))
# fit the interpolation on the original index and values
f = interp1d(x, y, kind='linear')
# perform interpolation for values across the full desired index
f([22.81,22.81,22.78,22.71,22.61,22.55,22.54,22.51,22.37])
Output:
array([16. , 16. , 23. , 34. , 50.875, 61. , 68. , 77. ,
86. ])
You can choose multiple other non-linear interpolations too (quadratic, cubic and so on). Check out the comprehensive interpolation documentation for more detail.
[Edit]: You will need to sort your arrays by x, as @ImportanceOfBeingErnest adds.

plotting conditional distribution in python

I'm new to Python and trying to plot a Gaussian distribution whose function is defined as [image]
I plotted the normal distribution P(x,y) and it gives the correct output. The code and output are below.
Code : [image]
Output : [image]
Now I need to plot a conditional distribution, and the output should look like [image]. To do this I need to define a boundary condition for the equation. I tried to define a boundary condition, but it's not working. The code I tried is [image], but it gives the wrong output.
Please help me plot this.
Thanks,
You applied the boundary condition to the wrong parameter. Try applying it after creating the grid points.
R = np.arange(-4, 4, 0.1)
X, Y = np.meshgrid(R, R)
Then filter X and Y based on the condition:
valid_xy = np.sqrt(X**2+Y**2) >= 1
X = X[valid_xy]
Y = Y[valid_xy]
Then continue with the rest of the code.
Update
If you want just to reset values around the peak to zero, you can use the following code:
import numpy as np
import matplotlib.pyplot as plt
R = np.arange(-4, 4, 0.1)
X, Y = np.meshgrid(R, R)
Z = np.sum(np.exp(-0.5*(X**2+Y**2)))
P = (1/Z)*np.exp(-0.5*(X**2+Y**2))
# reset the peak
invalid_xy = (X**2+Y**2)<1
P[invalid_xy] = 0
# plot the result
fig = plt.figure(figsize=(10, 6))
ax = fig.add_subplot(111, projection='3d')
ax.scatter(X, Y, P, s=0.5, alpha=0.5)
plt.show()
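If the goal is a proper conditional distribution rather than just zeroing the peak, one option is to renormalize over the region that remains. This is a sketch under an assumption the question does not state explicitly: that the condition is x² + y² ≥ 1 and that the result should sum to 1 over the grid. The names W, outside and P_cond are illustrative.

```python
import numpy as np

R = np.arange(-4, 4, 0.1)
X, Y = np.meshgrid(R, R)

# unnormalized Gaussian weights on the full grid
W = np.exp(-0.5 * (X**2 + Y**2))

# keep only the grid points that satisfy the (assumed) condition
outside = (X**2 + Y**2) >= 1
P_cond = np.where(outside, W, 0.0)

# renormalize so the conditional probabilities sum to 1
P_cond /= P_cond.sum()
print(P_cond.sum())
```

P_cond keeps the 2-D shape of X and Y, so it can be passed to the same scatter call as P above.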
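You can't use np.meshgrid directly here, because it outputs matrices in which the X and Y coordinates form a full grid (hence its name), not a custom shape (a grid minus a disc, as you want).
However, you can build your custom grid the following way:
import numpy as np
# assumed start value of 0, chosen to match the output below; use -4 for the full range in the question
R = np.arange(0, 4, 0.1)
# np.array needs a list here; a generator expression would not be expanded into rows
xy_coord = np.array([(x, y) for x in R for y in R if (x*x + y*y) > 1])
X, Y = xy_coord.transpose()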
X
# array([ 0. , 0. , 0. , ..., 3.9, 3.9, 3.9])
Y
# array([ 1.1, 1.2, 1.3, ..., 3.7, 3.8, 3.9])

2d density contour plot with matplotlib

I'm attempting to plot my dataset, x and y (generated from a csv file via numpy.genfromtxt('/Users/.../somedata.csv', delimiter=',', unpack=True)), as a simple density plot. To ensure this is self-contained, I will define them here:
x = [ 0.2933215 0.2336305 0.2898058 0.2563835 0.1539951 0.1790058
0.1957057 0.5048573 0.3302402 0.2896122 0.4154893 0.4948401
0.4688092 0.4404935 0.2901995 0.3793949 0.6343423 0.6786809
0.5126349 0.4326627 0.2318232 0.538646 0.1351541 0.2044524
0.3063099 0.2760263 0.1577156 0.2980986 0.2507897 0.1445099
0.2279241 0.4229934 0.1657194 0.321832 0.2290785 0.2676585
0.2478505 0.3810182 0.2535708 0.157562 0.1618909 0.2194217
0.1888698 0.2614876 0.1894155 0.4802076 0.1059326 0.3837571
0.3609228 0.2827142 0.2705508 0.6498625 0.2392224 0.1541462
0.4540277 0.1624592 0.160438 0.109423 0.146836 0.4896905
0.2052707 0.2668798 0.2506224 0.5041728 0.201774 0.14907
0.21835 0.1609169 0.1609169 0.205676 0.4500787 0.2504743
0.1906289 0.3447547 0.1223678 0.112275 0.2269951 0.1616036
0.1532181 0.1940938 0.1457424 0.1094261 0.1636615 0.1622345
0.705272 0.3158471 0.1416916 0.1290324 0.3139713 0.2422002
0.1593835 0.08493619 0.08358301 0.09691083 0.2580497 0.1805554 ]
y = [ 1.395807 1.31553 1.333902 1.253527 1.292779 1.10401 1.42933
1.525589 1.274508 1.16183 1.403394 1.588711 1.346775 1.606438
1.296017 1.767366 1.460237 1.401834 1.172348 1.341594 1.3845
1.479691 1.484053 1.468544 1.405156 1.653604 1.648146 1.417261
1.311939 1.200763 1.647532 1.610222 1.355913 1.538724 1.319192
1.265142 1.494068 1.268721 1.411822 1.580606 1.622305 1.40986
1.529142 1.33644 1.37585 1.589704 1.563133 1.753167 1.382264
1.771445 1.425574 1.374936 1.147079 1.626975 1.351203 1.356176
1.534271 1.405485 1.266821 1.647927 1.28254 1.529214 1.586097
1.357731 1.530607 1.307063 1.432288 1.525117 1.525117 1.510123
1.653006 1.37388 1.247077 1.752948 1.396821 1.578571 1.546904
1.483029 1.441626 1.750374 1.498266 1.571477 1.659957 1.640285
1.599326 1.743292 1.225557 1.664379 1.787492 1.364079 1.53362
1.294213 1.831521 1.19443 1.726312 1.84324 ]
Now, I have made many attempts to plot my contours using variations on:
delta = 0.025
OII_OIII_sAGN_sorted = numpy.arange(numpy.min(OII_OIII_sAGN), numpy.max(OII_OIII_sAGN), delta)
Dn4000_sAGN_sorted = numpy.arange(numpy.min(Dn4000_sAGN), numpy.max(Dn4000_sAGN), delta)
OII_OIII_sAGN_X, Dn4000_sAGN_Y = np.meshgrid(OII_OIII_sAGN_sorted, Dn4000_sAGN_sorted)
Z1 = matplotlib.mlab.bivariate_normal(OII_OIII_sAGN_X, Dn4000_sAGN_Y, 1.0, 1.0, 0.0, 0.0)
Z2 = matplotlib.mlab.bivariate_normal(OII_OIII_sAGN_X, Dn4000_sAGN_Y, 0.5, 1.5, 1, 1)
# difference of Gaussians
Z = 0.2 * (Z2 - Z1)
pyplot_middle.contour(OII_OIII_sAGN_X, Dn4000_sAGN_Y, Z, 12, colors='k')
This doesn't seem to give the desired output. I have also tried:
H, xedges, yedges = np.histogram2d(OII_OIII_sAGN,Dn4000_sAGN)
extent = [xedges[0],xedges[-1],yedges[0],yedges[-1]]
ax.contour(H, extent=extent)
Not quite working as I wanted either. Essentially, I am looking for something similar to this:
If anyone could help me with this I would be very grateful, either by suggesting a totally new method or modifying my existing code. Please also attach images of your output if you have some useful techniques or ideas.
seaborn does density plots right out of the box:
import seaborn as sns
import matplotlib.pyplot as plt
# seaborn 0.11 and later expect x and y as keyword arguments
sns.kdeplot(x=x, y=y)
plt.show()
It seems that histogram2d takes some fiddling to plot the contour in the right place. I took the transpose of the histogram matrix and also took the mean values of the elements in xedges and yedges instead of just removing one from the end.
from matplotlib import pyplot as plt
import numpy as np
fig = plt.figure()
h, xedges, yedges = np.histogram2d(x, y, bins=9)
xbins = xedges[:-1] + (xedges[1] - xedges[0]) / 2
ybins = yedges[:-1] + (yedges[1] - yedges[0]) / 2
h = h.T
CS = plt.contour(xbins, ybins, h)
plt.scatter(x, y)
plt.show()
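The two adjustments described above (bin centers instead of edges, and the transpose) can be checked without plotting. A small sketch with made-up random data; the names x_demo, y_demo, xcenters and ycenters are illustrative:

```python
import numpy as np

# made-up sample data standing in for the x and y of the question
rng = np.random.default_rng(0)
x_demo = rng.normal(0.3, 0.1, size=200)
y_demo = rng.normal(1.5, 0.2, size=200)

# different bin counts per axis make the orientation easy to check
h, xedges, yedges = np.histogram2d(x_demo, y_demo, bins=(9, 7))

# bin centers: midpoint of each pair of neighbouring edges
xcenters = 0.5 * (xedges[:-1] + xedges[1:])
ycenters = 0.5 * (yedges[:-1] + yedges[1:])

# histogram2d returns h[x_bin, y_bin]; contour expects Z[y_index, x_index]
h_for_contour = h.T
print(h.shape, h_for_contour.shape)
```

With these arrays, plt.contour(xcenters, ycenters, h_for_contour) should place the contours over the data, because the first axis of the transposed histogram now runs along y.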

`map.scatter` on basemap not displaying markers

I have a map of Germany, and the coords of a few cities.
plot displays the dots properly. I would like to use scatter instead, so that I can color the markers according to another variable and then display a colorbar. The code runs in the console, but the dots are not shown when I replace map.plot with map.scatter.
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
import numpy as np
plt.figure(1)
map = Basemap(projection='merc',
resolution='l',
llcrnrlat=44.0,
llcrnrlon=5.0,
urcrnrlat=57.0,
urcrnrlon=17)
map.drawcoastlines()
map.drawcountries()
map.fillcontinents(color='lightgray')
map.drawmapboundary()
long = np.array([ 13.404954, 11.581981, 9.993682, 8.682127, 6.960279,
6.773456, 9.182932, 12.373075, 13.737262, 11.07675 ,
7.465298, 7.011555, 12.099147, 9.73201 , 7.628279,
8.801694, 10.52677 , 8.466039, 8.239761, 10.89779 ,
8.403653, 8.532471, 7.098207, 7.216236, 9.987608,
7.626135, 11.627624, 6.852038, 10.686559, 8.047179,
8.247253, 6.083887, 7.588996, 9.953355, 10.122765])
lat = np.array([ 52.520007, 48.135125, 53.551085, 50.110922, 50.937531,
51.227741, 48.775846, 51.339695, 51.050409, 49.45203 ,
51.513587, 51.455643, 54.092441, 52.375892, 51.36591 ,
53.079296, 52.268874, 49.487459, 50.078218, 48.370545,
49.00689 , 52.030228, 50.73743 , 51.481845, 48.401082,
51.960665, 52.120533, 51.47512 , 53.865467, 52.279911,
49.992862, 50.775346, 50.356943, 49.791304, 54.323293])
colors = np.array([ 2.72189792, 3.62138986, 1.7947676 , 1.36524602, 1.75664228,
3.0777491 , 2.39580451, 1.17822874, 1.35503558, 2.28517658,
3.66472978, 1.76467741, 0.72551119, 1.76997962, 4.49420944,
2.34434288, 1.3243405 , 2.35945794, 3.16147488, 2.94025564,
1.68774158, 0.67602518, 1.60727613, 1.85608281, 3.57769226,
1.33501838, 3.32549868, 2.95492675, 2.83391381, 2.33983198,
2.59607424, 1.24260218, 1.89258818, 2.07508363, 3.03319927])
x, y = map(long, lat)
map.plot(x,y,'o')
plt.show()
Try adding zorder so that the points show up above the map:
map.fillcontinents(color='lightgray',zorder=0)
I also had the problem mentioned in the comment on the accepted answer (the whole map becoming blue), so I ended up increasing the zorder of the plot instead:
map.plot(x,y,'o',zorder=9999)
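The effect of zorder can be tried without Basemap. The sketch below uses plain matplotlib with a gray polygon as a stand-in for the filled continents; the coordinates and values are made up. With Basemap, the analogous call would presumably be map.scatter(x, y, c=colors, zorder=10), which is not tested here.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()

# stand-in for map.fillcontinents: a filled area drawn at the bottom
background = ax.fill([0, 10, 10, 0], [0, 0, 10, 10],
                     color="lightgray", zorder=0)

# made-up points and the variable used to color them
px = np.array([2.0, 5.0, 8.0])
py = np.array([3.0, 7.0, 4.0])
values = np.array([0.7, 2.3, 4.5])

# a higher zorder draws the markers on top of the filled area
sc = ax.scatter(px, py, c=values, cmap="viridis", zorder=10)
fig.colorbar(sc, ax=ax, label="value")
fig.canvas.draw()
```

Artists with a larger zorder are drawn later, so the markers end up above the fill regardless of the order in which the two calls are made.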

Plotting text in matplotlib

I am trying to plot a graph something similar to this:
For that, I have written the following function in python
def plot_graph_perf(dataset):
    #TODO: Give labels as power ranges in spaces of 1000
    plotter = ['0',
               '1200000-10', '1200000-14', '1200000-18',
               '1200000-2', '1200000-22', '1200000-26', '1200000-30',
               '1200000-34', '1200000-38', '1200000-42', '1200000-46',
               '1200000-6',
               '1600000-10', '1600000-14',
               '1600000-18', '1600000-2', '1600000-22',
               '1600000-26', '1600000-30', '1600000-34',
               '1600000-38', '1600000-42', '1600000-46',
               '1600000-6',
               '2000000-10', '2000000-14',
               '2000000-18', '2000000-2', '2000000-22',
               '2000000-26', '2000000-30', '2000000-34',
               '2000000-38', '2000000-42', '2000000-46',
               '2000000-6',
               '2400000-10', '2400000-14',
               '2400000-18', '2400000-2', '2400000-22',
               '2400000-26', '2400000-30', '2400000-34',
               '2400000-38', '2400000-42', '2400000-46',
               '2400000-6',
               '800000-10', '800000-14',
               '800000-18', '800000-2', '800000-22',
               '800000-26', '800000-30', '800000-34',
               '800000-38', '800000-42', '800000-46',
               '800000-6']
    x_axis_labels = dataset[1]
    x=[a for a in range(len(x_axis_labels))]
    y_axis_labels = dataset[0]
    y=[a for a in range(len(y_axis_labels))]
    width = 0.1
    plt.figure
    plt.plot(plotter, color = 'g')
    plt.tight_layout(pad=1, h_pad=4, w_pad=None)
    plt.xticks(x,x_axis_labels, rotation='vertical')
    plt.yticks(y,y_axis_labels, rotation='horizontal')
    plt.xlabel('Power')
    plt.ylabel('perf')
    plt.title(file + ' | (Power)')
    fig = plt.gcf()
    fig.set_size_inches(28.5,10.5)
    plt.savefig('watt' + '.png',bbox_inches='tight', pad_inches=0.5,dpi=100)
    plt.clf()
where dataset is a two-dimensional list, something like this:
dataset = [[],[]]
Each sublist contains the same number of elements as plotter.
I plotted dataset[0] and dataset[1] as y and x respectively, but was unable to plot the string values in plotter.
Can you please shed some light and help me plot the plotter values on the graph?
Thanks.
You have to call the text function for each word separately:
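import numpy as np
import matplotlib.pyplot as plt
words = list("abcdefg")
xs = np.random.randint(0,10,len(words))
ys = np.random.randint(0,10,len(words))
for x, y, s in zip(xs,ys,words):
    plt.text(x,y,s)
# plt.text does not rescale the axes, so set limits that cover the labels
plt.xlim(-1, 10)
plt.ylim(-1, 10)
plt.show()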
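Applied to labels like the ones in the question, this could look roughly as follows. It is only a sketch: the three labels are taken from the plotter list, but the xs and ys values are made up, since the question does not show the real data.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# first few labels from the question's plotter list; positions are made up
labels = ['1200000-10', '1200000-14', '1200000-18']
xs = [0, 1, 2]
ys = [3.0, 1.5, 2.2]

fig, ax = plt.subplots()
ax.plot(xs, ys, color="g", marker="o")

# one text call per label, placed at the matching data point
for x, y, label in zip(xs, ys, labels):
    ax.text(x, y, label, rotation=45, ha="left", va="bottom")

fig.canvas.draw()
```

Because the line is plotted first, the axes limits already cover the points, and each label is anchored at its own point.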
