Below is a simple example of my dataset. I have 4 points, and at each step their values change. The points are plotted in the x,y plane, and I want their size to change with their value. There is one other problem: the points are connected by a line, and I don't want that. (I cannot use plt.scatter.)
import pandas as pd
import matplotlib.pyplot as plt
data=[[1,1,3],[1,2,1],[2,1,9],[2,2,0]]
a=pd.DataFrame(data)
a.columns=['x','y','value']
data2=[[1,1,5],[1,2,2],[2,1,1],[2,2,3]]
b=pd.DataFrame(data2)
b.columns=['x','y','value']
data3=[[1,1,15],[1,2,7],[2,1,4],[2,2,8]]
c=pd.DataFrame(data3)
c.columns=['x','y','value']
final=[a,b,c]
for i in range(0,len(final)):
    fig, ax = plt.subplots()
    plt.plot(final[i]['x'],final[i]['y'],marker='o',markersize=22)
With this I fix the marker size, but a line appears. How can I remove it?
If I change the markersize, it doesn't work:
for i in range(0,len(final)):
    fig, ax = plt.subplots()
    plt.plot(final[i]['x'],final[i]['y'],marker='o',markersize=final[i]['value'])
As I said before, the result I want is a plot in which there are only the points with different dimensions depending on their value.
Since you cannot use scatter, you need to loop over the points and call plot once per point, because markersize accepts only a scalar, not an array. To draw just a marker without a line, pass the format string 'o' (a circle). I used size*5 to enlarge the circles further.
for i in range(0,len(final)):
    fig, ax = plt.subplots()
    for x, y, size in zip(final[i]['x'],final[i]['y'], final[i]['value']):
        plt.plot(x, y, 'o', markersize=size*5)
In case you want to plot them as subplots:
fig, axes = plt.subplots(1,3, figsize=(9, 2))
for i in range(0,len(final)):
    for x, y, size in zip(final[i]['x'],final[i]['y'], final[i]['value']):
        axes[i].plot(x, y, 'o', markersize=size*5)
plt.tight_layout()
plt.plot takes a linewidth argument. Set it to zero:
plt.plot(final[i]["x"], final[i]["y"], marker="o", markersize=22, linewidth=0)
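The two answers above can be combined into one small helper. This is only a sketch: plot_sized_points and scale are names made up for the illustration, the data is the first frame from the question written as plain lists, and the Agg backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, drop this line for interactive use
import matplotlib.pyplot as plt


def plot_sized_points(ax, xs, ys, values, scale=5):
    """Draw one marker per point, sized by its value, with no connecting line.

    Hypothetical helper: the name and the `scale` factor are not from the question.
    """
    for x, y, value in zip(xs, ys, values):
        # markersize must be a scalar, so every point gets its own plot call;
        # linestyle="None" states explicitly that only markers are wanted.
        ax.plot([x], [y], marker="o", linestyle="None", markersize=value * scale)
    return ax


# First frame of the question's data (columns x, y, value)
xs = [1, 1, 2, 2]
ys = [1, 2, 1, 2]
values = [3, 1, 9, 0]

fig, ax = plt.subplots()
plot_sized_points(ax, xs, ys, values)
```

DataFrame columns such as final[i]['x'] can be passed in place of the lists. Note that a value of 0 gives a marker of size 0, so that point is not visible.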
I am a beginner so this might be a stupid question. If I run the following code, I get shown the scatterplot of x and y, but the regression line plt.plot(x, estimated_y, color="r", linewidth=3.0) does not show up. I estimated y using the covariance matrix. x, y and estimated_y are all numpy arrays. If I run plt.plot(x, estimated_y, color="r", linewidth=3.0) alone, I get shown an empty figure.
plt.figure()
plt.scatter(x,y)
plt.plot(x, estimated_y, color="r", linewidth=3.0)
plt.show()
plt.xlabel("x")
plt.ylabel("y")
Thanks for all help!
Your code looks quite fine. I added some random data to create a minimal reproducible example:
import matplotlib.pyplot as plt
# create dummy data
x = list(range(0,10))
y = list(range(10,0,-1))
estimated_y = [1]*10
plt.figure()
plt.scatter(x,y)
plt.plot(x, estimated_y, color="r", linewidth=3.0)
plt.show()
# Do not add anything to the axes after plt.show(). It finishes rendering the figure, so anything called afterwards goes to a new plot.
The output looks good:
However, you can make sure that you plot to the same axes (a figure may hold several axes or "subplots") by
# open a figure + create a single axis
fig, ax = plt.subplots()
ax.scatter(x,y)
ax.plot(x, estimated_y, color="r", linewidth=3.0)
plt.show()
Any plotting function that you call from the base, e.g. matplotlib.pyplot.plot() (or plt.plot(), since you have imported that part of the library as plt), will plot to the axes that is currently active. This should work in your case.
A good thing about working with axes is that you can count the number of lines plotted on them:
len(ax.get_lines())
1
It is one, because scatter() does not plot "lines" but "points". Now, if you call it with your data, it should also return 1. If it does and you can't see the red line, your data might contain NaNs, or the line might have been plotted outside the axis limits. If it returns 0, you have plotted it to some different (perhaps invisible or not-yet-drawn) axes.
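The checks described above could be scripted roughly as follows. This is a sketch on the dummy data from this answer, not on the asker's arrays; the variable names n_lines, has_nan and inside_limits are made up here, and the Agg backend is only chosen so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Dummy data in place of the asker's arrays
x = np.arange(10, dtype=float)
y = np.arange(10, 0, -1, dtype=float)
estimated_y = np.ones(10)

fig, ax = plt.subplots()
ax.scatter(x, y)
ax.plot(x, estimated_y, color="r", linewidth=3.0)
# Labels are set before any call to plt.show(), so they end up on this axes
ax.set_xlabel("x")
ax.set_ylabel("y")

# 1) Was the line added to this axes? (scatter adds a collection, not a line)
n_lines = len(ax.get_lines())

# 2) Does the line data contain NaNs?
has_nan = bool(np.isnan(estimated_y).any())

# 3) Is at least part of the line inside the current y limits?
y_low, y_high = ax.get_ylim()
inside_limits = bool(((estimated_y >= y_low) & (estimated_y <= y_high)).any())

print(n_lines, has_nan, inside_limits)
```

If n_lines is 1, has_nan is False and inside_limits is True, the line should be visible on that axes.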
Consider the figure below.
This image has been set up with the following code.
plt.rc('text', usetex=True)
plt.rc('font', family='serif')
fig, ax = plt.subplots()
ax.set_xlabel("Run Number", fontsize=25)
plt.grid(True, linestyle='--')
plt.tick_params(labelsize=20)
ax.set_xticklabels(map(str,range(number_of_runs)))
ax.minorticks_on()
ax.set_ylim([0.75,1.75])
I have not included the code that actually generates the data for plotting for the sake of clarity.
Unlike the diagram above, I would like to draw grid-lines perpendicular to the X-axis through each orange (and hence blue) dot. How do I do this?
The x-coordinates of the successive orange and blue dots form the same arithmetic progression in my code.
Also I notice that the tick numbers numbered 1,2,... are wrong for my application. Instead, I would like each successive grid-line, which I ask for as perpendicular to the X-axis in the previous step, to be numbered sequentially from 1 along the X-axis. How do I configure the Xtick marks for this?
Grid lines are drawn at the xticks (or yticks).
You need to define the xticks so that the grid lines pass through your data points (the dots).
Example below:
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
number_of_runs = range(1,10) # use your actual number_of_runs
ax.set_xticks(number_of_runs, minor=False)
ax.xaxis.grid(True, which='major')
In case you want to have only vertical lines, add this:
ax.yaxis.grid(False, which='major')
Similar question here.
You should specify the exact places where you want the grids using a call to ax.set_xticks and then specify the exact numbers you want on the axis using a call to ax.set_xticklabels.
I am plotting two random arrays in the example below:
import numpy as np
import matplotlib.pyplot as plt
plt.rc('text', usetex=True)
plt.rc('font', family='serif')
y1 = np.random.random(10)
y2 = np.random.random(10)
fig, ax = plt.subplots(ncols=2, figsize=(8, 3))
# equivalent to your figure
ax[0].plot(y1, 'o-')
ax[0].plot(y2, 'o-')
ax[0].grid(True, linestyle='--')
ax[0].set_title('Before')
# hopefully what you want
ax[1].plot(y1, 'o-')
ax[1].plot(y2, 'o-')
ax[1].set_title('After')
ax[1].set_xticks(range(0, len(y1)))
ax[1].set_xticklabels(range(1, len(y1)+1))
ax[1].grid(True, linestyle='--')
plt.show()
This is the output:
A note: Looking at your plot, it seems that the actual x-axis values are not integers, but you want integers starting from 1. Probably the best way to do this is to pass only the y data array to the plot command (plt.plot(y) instead of plt.plot(x, y)), as I have done above. You should decide whether this is appropriate for your case.
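If the real x positions have to be kept, the ticks can be placed on them directly and then relabelled. The following is a minimal sketch, assuming made-up x values that form an arithmetic progression (the question does not give the actual ones); the Agg backend is selected only so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical non-integer x positions in arithmetic progression
x = 0.5 + 0.25 * np.arange(7)
y1 = np.linspace(1.0, 1.5, 7)
y2 = np.linspace(0.9, 1.2, 7)

fig, ax = plt.subplots()
ax.plot(x, y1, "o-")
ax.plot(x, y2, "o-")

# One major tick (and hence one vertical grid line) per data point
ax.set_xticks(x)
# Number the ticks 1, 2, 3, ... regardless of the underlying x values
ax.set_xticklabels([str(i) for i in range(1, len(x) + 1)])

ax.xaxis.grid(True, which="major", linestyle="--")  # vertical lines only
ax.yaxis.grid(False)
```

The tick positions stay at the true x coordinates, so only the labels change.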
I produce multiple plots containing each 5 subplots, generated in a for loop.
How can I define the coloring of the subplots? Do I need something like a Matrix with numbers and colors and use it somehow like Matrix[z] instead of the Color?
fig = plt.figure()
ax = fig.add_subplot(111)
for z in Var:
    ax.plot(x, y, color='black', alpha=0.5 , label=labelString)
It is unclear what exactly you mean. But if you mean plotting 5 different curves in the same plot, each in a different color, this is one way to do it. It lets you choose the colors as you want. If you do not specify colors manually as in the code below, matplotlib will assign them automatically. In that case you just have to write ax.plot(x, y, label=r'y=%dx$^2$' %(i+1))
import numpy as np
import matplotlib.pyplot as plt
fig = plt.figure(figsize=(8, 5))
ax = fig.add_subplot(111)
colors = ['r', 'g', 'b', 'k', 'y']
x = np.linspace(0, 5, 100)
for i in range(5):
    y = (i+1)*x**2
    ax.plot(x, y, color=colors[i], label=r'y=%dx$^2$' %(i+1))
plt.legend(fontsize=16)
Output
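If the question is really about 5 separate subplots rather than 5 curves in one plot, the same color list can be indexed per subplot, which is close to the "Matrix[z]" idea in the question. A sketch under that assumption (the curves are placeholders, and the Agg backend is only there so it runs without a display):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

colors = ["r", "g", "b", "k", "y"]  # one entry per subplot, looked up by index z
x = np.linspace(0, 5, 100)

fig, axes = plt.subplots(1, 5, figsize=(15, 3), sharex=True)
for z, ax in enumerate(axes):
    y = (z + 1) * x**2  # placeholder data for subplot z
    ax.plot(x, y, color=colors[z], alpha=0.5, label=r"y=%dx$^2$" % (z + 1))
    ax.legend()
fig.tight_layout()
```

A dict keyed by whatever the loop variable is would work the same way as the list.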
I am trying to plot a large dataset with a scatter plot.
I want to use matplotlib to plot it with single pixel marker.
It seems to have been solved.
https://github.com/matplotlib/matplotlib/pull/695
But I cannot find a mention of how to get a single pixel marker.
My simplified dataset (data.csv)
Length,Time
78154393,139.324091
84016477,229.159305
84626159,219.727537
102021548,225.222662
106399706,221.022827
107945741,206.760239
109741689,200.153263
126270147,220.102802
207813132,181.67058
610704756,50.59529
623110004,50.533158
653383018,52.993885
659376270,53.536834
680682368,55.97628
717978082,59.043843
My code is below.
import pandas as pd
import os
import numpy
import matplotlib.pyplot as plt
inputfile='data.csv'
iplevel = pd.read_csv(inputfile)
base = os.path.splitext(inputfile)[0]
fig = plt.figure()
plt.yscale('log')
#plt.xscale('log')
plt.title(' My plot: '+base)
plt.xlabel('x')
plt.ylabel('y')
plt.scatter(iplevel['Time'], iplevel['Length'],color='black',marker=',',lw=0,s=1)
fig.tight_layout()
fig.savefig(base+'_plot.png', dpi=fig.dpi)
You can see below that the points are not single pixel.
Any help is appreciated
The problem
I fear that the bugfix discussed in the matplotlib pull request you cite only applies to plt.plot() and not to plt.scatter().
import matplotlib.pyplot as plt
fig = plt.figure(figsize=(4,2))
ax = fig.add_subplot(121)
ax2 = fig.add_subplot(122, sharex=ax, sharey=ax)
ax.plot([1, 2],[0.4,0.4],color='black',marker=',',lw=0, linestyle="")
ax.set_title("ax.plot")
ax2.scatter([1,2],[0.4,0.4],color='black',marker=',',lw=0, s=1)
ax2.set_title("ax.scatter")
ax.set_xlim(0,8)
ax.set_ylim(0,1)
fig.tight_layout()
print(fig.dpi)  # prints 80 in my case
fig.savefig('plot.png', dpi=fig.dpi)
The solution: Setting the markersize
The solution is to use a usual "o" or "s" marker, but set the markersize to be exactly one pixel. Since the markersize is given in points, one would need to use the figure dpi to calculate the size of one pixel in points. This is 72./fig.dpi.
For a plot, the markersize is set directly:
ax.plot(..., marker="o", ms=72./fig.dpi)
For a scatter, the marker size is given through the s argument, which is in square points:
ax.scatter(..., marker='o', s=(72./fig.dpi)**2)
Complete example:
import matplotlib.pyplot as plt
fig = plt.figure(figsize=(4,2))
ax = fig.add_subplot(121)
ax2 = fig.add_subplot(122, sharex=ax, sharey=ax)
ax.plot([1, 2],[0.4,0.4], marker='o',ms=72./fig.dpi, mew=0,
        color='black', linestyle="", lw=0)
ax.set_title("ax.plot")
ax2.scatter([1,2],[0.4,0.4],color='black', marker='o', lw=0, s=(72./fig.dpi)**2)
ax2.set_title("ax.scatter")
ax.set_xlim(0,8)
ax.set_ylim(0,1)
fig.tight_layout()
fig.savefig('plot.png', dpi=fig.dpi)
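The conversion used in the example above is plain arithmetic, so it can be checked without matplotlib. A small sketch; the function names are made up for the illustration:

```python
POINTS_PER_INCH = 72.0  # a point is 1/72 inch


def one_pixel_markersize(dpi):
    """Marker size in points that spans one pixel; for ms= in plot()."""
    return POINTS_PER_INCH / dpi


def one_pixel_scatter_area(dpi):
    """Marker area in points squared that spans one pixel; for s= in scatter()."""
    return one_pixel_markersize(dpi) ** 2


for dpi in (72, 100, 144):
    print(dpi, one_pixel_markersize(dpi), one_pixel_scatter_area(dpi))
```

The result only corresponds to one pixel if the figure is saved at the same dpi that went into the calculation, which is why the example passes dpi=fig.dpi to savefig.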
For anyone still trying to figure this out, the solution I found was to specify the s argument in plt.scatter.
The s argument refers to the area of the point you are plotting.
It doesn't seem to be quite perfect, since s=1 seems to cover about 4 pixels of my screen, but this definitely makes them smaller than anything else I've been able to find.
https://matplotlib.org/devdocs/api/_as_gen/matplotlib.pyplot.scatter.html
s : scalar or array_like, shape (n, ), optional
size in points^2. Default is rcParams['lines.markersize'] ** 2.
Set the plt.scatter() parameter to linewidths=0 and figure out the right value for the parameter s.
Source: https://stackoverflow.com/a/45803960/4063622
Define data
x = np.linspace(0,2*np.pi,100)
y = 2*np.sin(x)
Plot
fig = plt.figure()
ax = plt.axes()
fig.add_subplot(ax)
ax.plot(x,y)
Add second axis
newax = plt.axes(axisbg='none')
This gives me ValueError: Unknown element o, even though it should do the same thing as what I am about to describe. I can also see that this works (no error) to do the same thing:
newax = plt.axes()
fig.add_subplot(newax)
newax.set_axis_bgcolor('none')
However, it turns the background color of the original figure "gray" (or whatever the figure background is)? I don't understand, as I thought this would make newax transparent except for the axes and box around the figure. Even if I switch the order, same thing:
plt.close('all')
fig = plt.figure()
newax = plt.axes()
fig.add_subplot(newax)
newax.set_axis_bgcolor('none')
ax = plt.axes()
fig.add_subplot(ax)
ax.plot(x,y)
This is surprising because I thought the background of one would be overlaid on the other, but in either case it is the newax background that appears to be visible (or at least this is the color I see).
What is going on here?
You're not actually adding a new axes.
Matplotlib is detecting that there's already a plot in that position and returning it instead of a new axes object.
(Check it for yourself. ax and newax will be the same object.)
There's probably not a reason why you'd want to, but here's how you'd do it.
(Also, don't call newax = plt.axes() and then call fig.add_subplot(newax). You're doing the same thing twice.)
Edit: With newer (>=1.2, I think?) versions of matplotlib, you can accomplish the same thing as the example below by using the label kwarg to fig.add_subplot. E.g. newax = fig.add_subplot(111, label='some unique string')
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(1,1,1)
# If you just call `plt.axes()` or equivalently `fig.add_subplot()` matplotlib
# will just return `ax` again. It _won't_ create a new axis unless we
# call fig.add_axes() or reset fig._seen
newax = fig.add_axes(ax.get_position(), frameon=False)
ax.plot(range(10), 'r-')
newax.plot(range(50), 'g-')
newax.axis('equal')
plt.show()
Of course, this looks awful, but it's what you're asking for...
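For reference, the label route mentioned in the edit above might look like the sketch below. It uses set_facecolor, which replaced the axisbg keyword and set_axis_bgcolor in later matplotlib releases; the label string is arbitrary, and the Agg backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, drop this line for interactive use
import matplotlib.pyplot as plt

fig = plt.figure()
ax = fig.add_subplot(111)

# A distinct label asks for a separate axes object in the same position
newax = fig.add_subplot(111, label="overlay")
# Transparent background so the first axes stays visible underneath
newax.set_facecolor("none")

ax.plot(range(10), "r-")
newax.plot(range(50), "g-")

print(newax is ax)
```

As with the add_axes version, the two sets of ticks overlap, so this is mainly useful as a check that two separate axes objects exist.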
I'm guessing from your earlier questions that you just want to add a second x-axis? If so, this is a completely different thing.
If you want the y-axes linked, then do something like this (somewhat verbose...):
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
newax = ax.twiny()
# Make some room at the bottom
fig.subplots_adjust(bottom=0.20)
# I'm guessing you want them both on the bottom...
newax.set_frame_on(True)
newax.patch.set_visible(False)
newax.xaxis.set_ticks_position('bottom')
newax.xaxis.set_label_position('bottom')
newax.spines['bottom'].set_position(('outward', 40))
ax.plot(range(10), 'r-')
newax.plot(range(21), 'g-')
ax.set_xlabel('Red Thing')
newax.set_xlabel('Green Thing')
plt.show()
If you want to have a hidden, unlinked y-axis, and an entirely new x-axis, then you'd do something like this:
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots()
fig.subplots_adjust(bottom=0.2)
newax = fig.add_axes(ax.get_position())
newax.patch.set_visible(False)
newax.yaxis.set_visible(False)
for spinename, spine in newax.spines.items():
    if spinename != 'bottom':
        spine.set_visible(False)
newax.spines['bottom'].set_position(('outward', 25))
ax.plot(range(10), 'r-')
x = np.linspace(0, 6*np.pi)
newax.plot(x, 0.001 * np.cos(x), 'g-')
plt.show()
Note that the y-axis values for anything plotted on newax are never shown.
If you wanted, you could even take this one step further, and have independent x and y axes (I'm not quite sure what the point of it would be, but it looks neat...):
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots()
fig.subplots_adjust(bottom=0.2, right=0.85)
newax = fig.add_axes(ax.get_position())
newax.patch.set_visible(False)
newax.yaxis.set_label_position('right')
newax.yaxis.set_ticks_position('right')
newax.spines['bottom'].set_position(('outward', 35))
ax.plot(range(10), 'r-')
ax.set_xlabel('Red X-axis', color='red')
ax.set_ylabel('Red Y-axis', color='red')
x = np.linspace(0, 6*np.pi)
newax.plot(x, 0.001 * np.cos(x), 'g-')
newax.set_xlabel('Green X-axis', color='green')
newax.set_ylabel('Green Y-axis', color='green')
plt.show()
You can also just add an extra spine at the bottom of the plot. Sometimes this is easier, especially if you don't want ticks or numerical things along it. Not to plug one of my own answers too much, but there's an example of that here: How do I plot multiple X or Y axes in matplotlib?
As one last thing, be sure to look at the parasite axes examples if you want to have the different x and y axes linked through a specific transformation.
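For the simple case of a second x-axis tied to the first through a fixed conversion, newer matplotlib releases (3.1 and later, as far as I know) also provide Axes.secondary_xaxis, which may be lighter than parasite axes. A sketch, assuming the x values from the question are radians and the extra axis should show degrees; the Agg backend is only there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Data as defined in the question
x = np.linspace(0, 2 * np.pi, 100)
y = 2 * np.sin(x)

fig, ax = plt.subplots()
ax.plot(x, y)
ax.set_xlabel("angle [rad]")

# functions=(forward, inverse): forward maps the primary axis values to the
# secondary axis, inverse maps them back
secax = ax.secondary_xaxis("top", functions=(np.rad2deg, np.deg2rad))
secax.set_xlabel("angle [deg]")
```

Unlike the add_axes examples above, nothing is plotted on the secondary axis; it only relabels the existing data in other units.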