Scrolling plot using matplotlib "smears" when updating - python

The application I'm coding for requires a real-time plot of incoming data that is being stored long-term in an Excel spreadsheet, so the real-time graph displays only the 25 most recent data points.
The problem comes when the plot has to shift in the newest data point and shift out the oldest point. When I do this, the graph "smears" as shown here:
I then began to use plt.cla(), but this causes me to lose all formatting in my plots, such as the title, axes, etc. Is there any way for me to update my graph, but keep my graph formatting?
Here's an example after plt.cla(): [image not available]
And here's basically how I'm updating my graphs within a larger loop:
if data_point_index < max_data_points:
    y_data[data_point_index] = measurement
    plt.plot(x_data[:data_point_index + 1], y_data[:data_point_index + 1], 'or--')
else:
    plt.cla()
    y_data[0:max_data_points - 1] = y_data[1:max_data_points]
    y_data[max_data_points - 1] = measurement
    plt.plot(x_data, y_data, 'or--')
plt.pause(0.00001)
I know I can just re-add the axis labels and such, but I feel like there should be a more elegant way to do so. It is also somewhat of a hassle, as there can be multiple subplots and reformatting the figure takes a non-trivial amount of time.

Rather than calling plt.cla(), which as you have found out clears everything on the axes, you could remove just the last line plotted, which will leave your labels and formatting intact.
The Axes instance has an attribute lines, which stores all the lines currently plotted on the axes. To remove the last line plotted, access the current axes using plt.gca() and call remove() on the last entry of that list. (Older Matplotlib versions also allowed plt.gca().lines.pop(), but the list is read-only in recent releases.)
else:
    plt.gca().lines[-1].remove()
    y_data[0:max_data_points - 1] = y_data[1:max_data_points]
    y_data[max_data_points - 1] = measurement
    plt.plot(x_data, y_data, 'or--')
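An alternative that avoids removing and re-plotting altogether is to create the line once and only update its data. The following is a minimal sketch of that idea, not the asker's actual loop: the measurements are made up, the add_measurement helper is hypothetical, and the Agg backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

max_data_points = 25
x_data = np.arange(max_data_points)
y_data = np.full(max_data_points, np.nan)  # NaN entries are simply not drawn

fig, ax = plt.subplots()
ax.set_title("Live measurements")  # formatting is set once and is never cleared
ax.set_xlabel("Sample")
line, = ax.plot(x_data, y_data, "or--")  # the only line ever created


def add_measurement(measurement, index):
    """Store a new value, scrolling the window once it is full."""
    if index < max_data_points:
        y_data[index] = measurement
    else:
        y_data[:-1] = y_data[1:]  # shift the oldest point out
        y_data[-1] = measurement
    line.set_ydata(y_data)  # update the existing artist in place
    ax.relim()              # recompute data limits from the new values
    ax.autoscale_view()


for i in range(40):  # hypothetical measurements 0.0, 1.0, ...
    add_measurement(float(i), i)
    fig.canvas.draw_idle()  # with an interactive backend, plt.pause(...) would go here
```

Because the axes are never cleared and no second line is ever added, the title, labels and other formatting stay as they were set, and there are no stale lines left to smear.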

Related

MatPlotLib not displaying both graphs when sharing X axes

This is a bit of an odd problem I've encountered. I'm trying to read data from a CSV file in Python, and have the two resulting lines be inside of the same box, with different scales so they're both clear to read.
The CSV file looks like this:
Date,difference,current
11/19/20, 0, 606771
11/20/20, 14612, 621383
and the code looks like this:
data = pd.read_csv('data.csv')
time = data['Date']  # the CSV column shown above is named 'Date'
ycurr = data['current']
ydif = data['difference']
fig, ax = plt.subplots()
line1, = ax.plot(time, ycurr, label='Current total')
line1.set_dashes([2, 2, 10, 2]) # 2pt line, 2pt break, 10pt line, 2pt break
line2, = ax.twinx().plot(time, ydif, dashes=[6, 2], label='Difference')
ax.legend()
plt.show()
I can display the graphs with the X-axis having Date values and Y-axis having difference values or current values just fine.
However, when I use subplots() and call the twinx() method for the second line, I can only see one of the two lines.
I initially thought this might be a formatting issue in my code, so I changed the code to create the second axes with ax2 = ax1.twinx() and plotted the second line on it, but the result stayed the same. I suspect that this might be an issue with reading in the CSV data. I tried plotting the following instead, and that worked:
x = np.linspace(0, 10, 500)
y = np.sin(x)
y2 = np.sin(x - 0.05)
Everything is working as expected, but probably not how you want it to work!
Each line consists of only two data points, so each one is drawn as a straight line. Both lines share the same x-coordinates, while each y-axis is scaled to its own data. And here comes the problem: after scaling, both lines run between the same corners of the plot, so they lie exactly on top of each other. This is difficult to see because both lines are dashed.
You can see what is going on by changing the color of one line. For example, add color='C1' to one of the plot calls.
By the way, what do you want to show with your plot? A curve consisting of two data points doesn't show much, and you are better off just showing the values directly.
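To make the overlap visible, here is a small sketch using the two rows from the question. It is an illustration under assumptions: the x positions 0 and 1 stand in for the two dates, and the combined legend via handles= is one possible way to show both entries. The Agg backend is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; use plt.show() with an interactive one
import matplotlib.pyplot as plt

dates = ["11/19/20", "11/20/20"]
current = [606771, 621383]
difference = [0, 14612]
x = [0, 1]  # stand-in positions for the two dates

fig, ax = plt.subplots()
ax2 = ax.twinx()  # second y-axis sharing the same x-axis

# Different colors make both lines visible even though they coincide
line1, = ax.plot(x, current, color="C0", dashes=[2, 2, 10, 2], label="Current total")
line2, = ax2.plot(x, difference, color="C1", dashes=[6, 2], label="Difference")

ax.set_xticks(x)
ax.set_xticklabels(dates)

# ax.legend() alone only sees line1, so pass both handles explicitly
legend = ax.legend(handles=[line1, line2])
```

Both series rise by the same amount (14612) between the two dates, which is why the independently scaled axes draw them on top of each other.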

Do not display missing values matplotlib

I would like to remove the flat lines on my graph while keeping the x labels.
I have this code which gives me a picture
dates = df_stock.loc[start_date:end_date].index.values
x_values = np.array([datetime.datetime.strptime(d, "%Y-%m-%d %H:%M:%S") for d in dates])
fig, ax = plt.subplots(figsize=(15,9))
# y values
y_values = np.array(df_stock.loc[start_date:end_date, 'Bid'])
# plotting
_ = ax.plot(x_values, y_values, label='Bid')
# formatting
formatter = mdates.DateFormatter('%m-%d %H:%M')
ax.xaxis.set_major_formatter(formatter)
The flat lines correspond to data that does not exist. I would like to know whether it is possible not to display them while keeping the spacing of the x labels.
Thank you so much.
You want to have time on the x-axis, and time is equidistant, regardless of whether you have data or not.
You now have several options:
1. Don't use time on the x-axis; use the sample index instead.
2. Do as in 1., but change the ticks and labels to show time again (this time not equidistantly).
3. Make the value vector equidistant and use NaNs to fill the gaps.
Why is this so?
By default, matplotlib produces a line plot, which connects the points with lines in the order in which they are given. In contrast, a scatter plot draws only the individual points and does not suggest any underlying order. You get the same result from a line plot that draws only markers and no connecting lines.
In general, you have 3-4 options:
1. Use the plot command but only plot markers (add linestyle='').
2. Use the scatter command.
3. Use NaNs: plot does not know what to draw for a NaN and draws nothing there (and also won't connect non-existing points with lines).
4. Use a loop and plot the connected sections as separate lines in the same axes.
Options 1 and 2 are the easiest if you want to change almost nothing in your code. Option 3 is the most proper, and option 4 mimics its result.
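The NaN option can be sketched as follows. The data here is invented (hourly samples with three hours missing) and plain integers stand in for the timestamps; the point is only that the line breaks wherever the value vector contains NaN. The Agg backend is used so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical samples: hours 3, 4 and 5 are missing
hours = np.array([0, 1, 2, 6, 7, 8])
bid = np.array([10.0, 10.5, 10.2, 11.0, 11.3, 11.1])

# Build an equidistant grid and fill the gaps with NaN
grid = np.arange(hours.min(), hours.max() + 1)
filled = np.full(grid.shape, np.nan)
filled[np.searchsorted(grid, hours)] = bid  # put known values at their grid slots

fig, ax = plt.subplots()
# The line is interrupted at the NaN entries instead of bridging the gap
line, = ax.plot(grid, filled, marker="o", label="Bid")
ax.legend()
```

The x-axis still covers the full, evenly spaced range, so the tick spacing is kept while the connecting segment across the missing hours disappears.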

Data points cut off when plotting

I have a plot where the data points have been cut off, as can be seen in the picture. I need to fix this so that the data points are clearly shown. I have already tried ax.margins, as suggested in previous questions, but it does not change anything on my plot. The following is the code I am using. I guess ylim might be causing the issue, but if I don't use ylim all my data stays very close to the zero axis.
def doscatterplot(xcoord, ycoord, labellist, ax=None):
    if ax is None:
        ax = plt.gca()  # fall back to the current axes
    ax.scatter(xcoord, ycoord, label=labellist)
    # ax.xaxis.set_major_formatter(mtick.FormatStrFormatter('%.2f'))
    ax.legend()
    ax.margins(x=0.1, y=0.7)
    ax.set_ylim(min(ycoord), max(ycoord))  # overrides the y margin set above
    ax.ticklabel_format(axis='y', style='sci', scilimits=(-3, -4))
    ax.axhline(y=0, color='g')
    ax.axvline(x=0, color='g')
    ax.set_ylabel('Transversal Resistance [\u03A9]')
    ax.set_xlabel('HCools [T]')
    ax.set_title('Transversal Resistance [\u03A9] vs HCools [T]')
    return
What I meant in my comment is to add this extra line of code:
dy = (max(ycoord) - min(ycoord))/20
This adds a little extra space above and below your plotted data (in this case, one twentieth of the range of your data). Then replace the old set_ylim line with this one and it should work as you want:
ax.set_ylim(min(ycoord) - dy, max(ycoord) + dy)
Also, note that you can write Greek symbols without resorting to Unicode escapes. Try, for example:
ax.set_ylabel(r'Transversal Resistance [$\Omega$]')
You can see more information here.
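The padding calculation can be wrapped in a small helper, sketched here. The name padded_limits and its fraction parameter are made up for illustration, and the sample values are invented; the Agg backend is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt


def padded_limits(values, fraction=1 / 20):
    """Return (low, high) limits with `fraction` of the data range added on each side."""
    low, high = min(values), max(values)
    dy = (high - low) * fraction
    return low - dy, high + dy


xcoord = [-1.0, 0.0, 1.0]   # hypothetical field values
ycoord = [0.0, 12.0, 20.0]  # hypothetical resistances

fig, ax = plt.subplots()
ax.scatter(xcoord, ycoord)
ax.set_ylim(*padded_limits(ycoord))  # points at the extremes are no longer clipped
```

With a data range of 20 and the default fraction, the limits become -1 and 21, so the markers at the minimum and maximum sit fully inside the axes.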

Change X axis labeling using Pandas/matplotlib in Python

I am plotting some columns of a CSV file using Pandas/Matplotlib. The index column is the time in seconds (very large numbers).
For example:
401287629.8
401287630.8
401287631.7
401287632.8
401287633.8
401287634.8
I need these values to be printed as my x tick labels when I plot, but the number format gets changed as shown below:
plt.figure()
ax = dfPlot.plot()
legend = ax.legend(loc='center left', bbox_to_anchor=(1,0.5))
labels = ax.get_xticklabels()
for label in labels:
    label.set_rotation(45)
    label.set_fontsize(10)
I couldn't find a way to make the x tick labels print the exact value rather than a shortened version of it.
This is essentially the same problem as "How to remove relative shift in matplotlib axis".
The solution is to tell the formatter not to use an offset:
ax.get_xaxis().get_major_formatter().set_useOffset(False)
Also related:
useOffset=False in config file?
https://github.com/matplotlib/matplotlib/issues/2400
https://github.com/matplotlib/matplotlib/pull/2401
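A short sketch of the offset fix using the sample timestamps from the question. The y values are invented. Note one assumption beyond the answer above: style="plain" is passed as well, because with values this large the default formatter may otherwise switch to scientific notation once the offset is disabled. The Agg backend is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

times = [401287629.8, 401287630.8, 401287631.7,
         401287632.8, 401287633.8, 401287634.8]
values = [1.0, 2.0, 1.5, 2.5, 2.0, 3.0]  # hypothetical measurements

fig, ax = plt.subplots()
ax.plot(times, values)

# Disable the additive offset and the scientific multiplier on the x-axis
ax.ticklabel_format(axis="x", style="plain", useOffset=False)

for label in ax.get_xticklabels():
    label.set_rotation(45)
    label.set_fontsize(10)

fig.canvas.draw()  # forces the tick labels and offset text to be computed
```

After drawing, the offset text next to the axis is empty and the tick labels carry the full values.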
If it's not rude of me to point out, you're asking for a great deal of precision from a single chart. Your sample data shows a five-second difference between two times that are both more than twelve and a half years long.
You have to cut your cloth to your measure on this one. If you want to keep the years, you can't keep the seconds. If you want to keep the seconds, you can't have the years.

Adding error bars to Matplotlib-generated graph of Pandas dataframe creates invalid legend

I am trying to graph a Pandas dataframe using Matplotlib. The dataframe contains four data columns composed of natural numbers, and an index of integers. I would like to produce a single plot with line graphs for each of the four columns, as well as error bars for each point. In addition, I would like to produce a legend providing labels for each of the four graphed lines.
Graphing the lines and legend without error bars works fine. When I introduce error bars, however, the legend becomes invalid -- the colours it uses no longer correspond to the appropriate lines. If you compare a graph with error bars and a graph without, the legend and the shapes/positions of the curves remain exactly the same. The colours of the curves get switched about, however, so that though the same four colours are used, they now correspond to different curves, meaning that the legend now assigns the wrong label to each curve.
My graphing code is thus:
def plot_normalized(agged, show_errorbars, filename):
    combined = {}
    # "agged" is a dictionary containing Pandas dataframes. Each dataframe
    # contains both a CPS_norm_mean and CPS_norm_std column. By running the code
    # below, the single dataframe "combined" is created, which has integer
    # indices and a column for each of the four CPS_norm_mean columns contained
    # in agged's four dataframes.
    for k in agged:
        combined[k] = agged[k]['CPS_norm_mean']
    combined = pandas.DataFrame(combined)
    plt.figure()
    combined.plot()
    if show_errorbars:
        for k in agged:
            plt.errorbar(
                x=agged[k].index,
                y=agged[k]['CPS_norm_mean'],
                yerr=agged[k]['CPS_norm_std']
            )
    plt.xlabel('Time')
    plt.ylabel('CPS/Absorbency')
    plt.title('CPS/Absorbency vs. Time')
    plt.savefig(filename)
The full 100-line script is available on GitHub. To run, download both graph.py and lux.csv, then run "python2 graph.py". It will generate two PNG files in your working directory -- one graph with error bars and one without.
The graphs are thus:
Correct graph (with no error bars):
Incorrect graph (with error bars):
Observe that the graph without error bars is properly labelled, while the graph with error bars is not: although the legend is identical, the changed colours of the line graphs mean that each legend entry now refers to a different (wrong) curve.
Thanks for any help you can provide. I've spent a number of extremely aggravating hours bashing my head against the wall, and I suspect that I'm making a stupid beginner's mistake. For what it's worth, I've tried with the Matplotlib development tree, version 1.2.0, and 1.1.0, and all three have exhibited identical behaviour.
I am new to programming and Python in general, but I managed to throw together a dirty fix. The legend entries are now correct; the colors are not.
def plot_normalized(agged, show_errorbars, filename):
    combined = {}
    for k in agged:
        combined[k] = agged[k]['CPS_norm_mean']
    combined = pandas.DataFrame(combined)
    ax = combined.plot()
    if show_errorbars:
        for k in agged:
            plt.errorbar(
                x=agged[k].index,
                y=agged[k]['CPS_norm_mean'],
                yerr=agged[k]['CPS_norm_std'],
                label=k  # added
            )
    if show_errorbars:  # try this, dirty fix
        # get_legend_handles_labels() returns (handles, labels), in that order
        handles, labels = ax.get_legend_handles_labels()
        n = len(handles) // 2  # integer division, so the slices also work on Python 3
        plt.legend(handles[:n], labels[n:])
        # Why does the fix work?
        # handles, labels = ax.get_legend_handles_labels()
        # print(labels)
        # out:
        # [u'Blank', u'H9A', u'Q180K', u'Wildtype', 'Q180K', 'H9A', 'Wildtype', 'Blank']
        # The right half has the correct order; these are the labels from label=k in errorbar above
    plt.xlabel('Time')
    plt.ylabel('CPS/Absorbency')
    plt.title('CPS/Absorbency vs. Time')
    plt.savefig(filename)
Produces: [image not available]
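The label list printed in the fix suggests that the dictionary is iterated in a different order than the DataFrame's columns, so the error bars cycle through the colours in a different order than the lines. One way to sidestep this, sketched here with plain Matplotlib and invented data, is to draw each line and its error bars together, in one fixed order and with one shared colour. The series dictionary is a hypothetical stand-in for agged, and the Agg backend is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# Hypothetical stand-in for "agged": name -> (x, mean, std)
series = {
    "Wildtype": ([0, 1, 2], [3.0, 3.5, 4.0], [0.2, 0.1, 0.3]),
    "Blank": ([0, 1, 2], [1.0, 1.1, 0.9], [0.1, 0.1, 0.1]),
    "H9A": ([0, 1, 2], [2.0, 2.4, 2.2], [0.3, 0.2, 0.2]),
}

fig, ax = plt.subplots()
for name in sorted(series):  # one fixed order for lines and error bars
    x, mean, std = series[name]
    line, = ax.plot(x, mean, label=name)
    # fmt="none" draws only the bars; they reuse the colour of the line
    ax.errorbar(x, mean, yerr=std, fmt="none", ecolor=line.get_color())

legend = ax.legend()  # only the labelled lines appear in the legend
ax.set_xlabel("Time")
ax.set_ylabel("CPS/Absorbency")
ax.set_title("CPS/Absorbency vs. Time")
```

Since the error bars carry no label of their own, the legend lists each curve once, and both its entry and its bars use the colour of the matching line.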
