MatPlotLib not displaying both graphs when sharing X axes - python

This is a bit of an odd problem I've encountered. I'm trying to read data from a CSV file in Python, and have the two resulting lines be inside of the same box, with different scales so they're both clear to read.
The CSV file looks like this:
Date,difference,current
11/19/20, 0, 606771
11/20/20, 14612, 621383
and the code looks like this:
import pandas as pd
import matplotlib.pyplot as plt

data = pd.read_csv('data.csv')
time = data['Date']  # the CSV header names this column 'Date', not 'Time'
ycurr = data['current']
ydif = data['difference']
fig, ax = plt.subplots()
line1, = ax.plot(time, ycurr, label='Current total')
line1.set_dashes([2, 2, 10, 2]) # 2pt line, 2pt break, 10pt line, 2pt break
line2, = ax.twinx().plot(time, ydif, dashes=[6, 2], label='Difference')
ax.legend()
plt.show()
I can display the graphs with the X-axis having Date values and Y-axis having difference values or current values just fine.
However, when I use subplots() and call the twinx() method for the second line, I can only see one of the two lines.
I initially thought this might be a formatting issue in my code, so I changed the code to create the second axes with ax2 = ax1.twinx() and plot the second line on it, but the result stayed the same. I suspect this might be an issue with reading in the CSV data. When I used x = np.linspace(0, 10, 500), y = np.sin(x), y2 = np.sin(x-0.05) instead, that worked.

Everything is working as expected, but probably not how you want it to work!
Each line consists of only two data points, which gives you a straight line. Both lines share the same x-coordinates, while the y-axis is scaled separately for each plot. And here comes the problem: each axis is autoscaled so that its data fills the plot in the same way. This means the lines lie exactly on top of each other, which is hard to see because both lines are dashed.
You can see what is going on by changing the color of one line, for example by adding color='C1' to one of the plot calls.
By the way, what do you want to show with your plot? A curve consisting of two data points doesn't show much, and you are better off just showing the values directly.
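A minimal sketch of the color suggestion, assuming the two CSV rows from the question are typed in by hand instead of read with pandas. The Agg backend, the output file name, and the explicit legend handles are additions for illustration, not part of the original code:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# The two rows from the CSV in the question, typed in by hand
dates = ["11/19/20", "11/20/20"]
current = [606771, 621383]
difference = [0, 14612]

fig, ax = plt.subplots()
ax2 = ax.twinx()  # second y-axis sharing the same x-axis

# Distinct colors keep the lines distinguishable even where they coincide
line1, = ax.plot(dates, current, color="C0", dashes=[2, 2, 10, 2], label="Current total")
line2, = ax2.plot(dates, difference, color="C1", dashes=[6, 2], label="Difference")

# ax.legend() alone only knows about line1, so pass both handles explicitly
ax.legend(handles=[line1, line2])
fig.savefig("twin_axes.png")  # hypothetical output name
```

With only two points per series the lines will still overlap, but the two colors and the combined legend make it visible that both were drawn.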

Related

Combine multiple matplotlib axes without re-plotting data

Optional context (feel free to skip): I'm currently using cartopy and matplotlib to read in and plot weather model data on a map. I have three different fields I'm plotting: temperature, wind, and surface pressure. I'm using contourf, barbs, and contour respectively to plot each field. I want one image for each field, and then I'd like one image that contains all three fields overlaid on a single map. Currently I'm doing this by plotting each field individually, saving each of the individual images, then replotting all three fields on a single ax and a new fig, and saving that fig. Since the data takes a while to plot, I would like to be able to plot each of the single fields, then combine the axes into one final image.
I'd like to be able to combine multiple matplotlib axes without replotting the data on the axes. I'm not sure if this is possible, but doing so would be a pretty major time and performance saver. An example of what I'm talking about:
from matplotlib import pyplot as plt
import numpy as np
x1 = np.linspace(0, 2*np.pi, 100)
x2 = x1 + 5
y = np.sin(x1)
firstFig = plt.figure()
firstAx = firstFig.gca()
firstAx.scatter(x1, y, 1, "red")
firstAx.set_xlim([0, 12])
secondFig = plt.figure()
secondAx = secondFig.gca()
secondAx.scatter(x2, y, 1, "blue")
secondAx.set_xlim([0, 12])
firstFig.savefig("1.png")
secondFig.savefig("2.png")
This generates two images, 1.png and 2.png.
Is it possible to save a third file, 3.png that would look something like the following, but without calling scatter again, because for my dataset, the actual plotting takes a long time?
If you just want to save images of your plots and you don't intend to further use the Figure objects, you can use the following after saving "2.png".
# get the scatter object from the first figure
scatter = firstAx.get_children()[0]
# remove it from this collection so you can assign it to a new axis
# the axis reassignment will raise an error if it already belongs to another axis
scatter.remove()
scatter.axes = secondAx
# now you can add it to your new axis
secondAx.add_artist(scatter)
secondFig.savefig("3.png")
This modifies both figures, as it removes a scatter from one and adds it to another. If for some reason you want to preserve them, you can copy the contents of secondFig to a new one and then add the scatter to that. However, this will still modify the first plot as you have to remove the scatter from there.

Peculiar horizontal lines in 3d plots

I am trying to make a 3D plot using three 2D arrays. I get the plot just fine, but I want to truncate the axes using ax.set_xlim().
This is the original plot I have:
I want to remove the long tails for each curve. I use ax.set_xlim(0,30) and this is the result:
I want to get rid of the extra part that is flowing out of the graph area. If I reduce the limits further with ax.set_xlim(0,25), I get something even weirder:
Ideally, I would want the x-axis to only show points until 25. How can I get rid of the extra parts?
Thanks in advance!
The code (the arrays are being read from a file):
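import matplotlib.pyplot as plt

# E, freq, DF (2D arrays) and filename are defined earlier; the arrays are read from a file
fig = plt.figure()
ax = plt.axes(projection='3d')
for i in range(5):
    ax.plot(E[:, i], freq[:, i], DF[:, i])
# raw strings keep the LaTeX backslashes from being treated as escape sequences
ax.set_zlabel(filename + r"$\ [eV^{-1} cm^{-3}]$")
ax.set_xlabel(r"$\epsilon [eV]$")
ax.set_ylabel("f [MHz]")
ax.set_xlim(0, 25)
ax.set_yticks(freq[0])
fig.savefig(filename + "3d.png", dpi=1500, bbox_inches='tight')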
Data file from where the data is being read: File

Do not display missing values matplotlib

I would like to remove the flat lines on my graph while keeping the x labels.
I have this code, which produces the plot:
import datetime
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# df_stock, start_date and end_date are defined earlier
dates = df_stock.loc[start_date:end_date].index.values
x_values = np.array([datetime.datetime.strptime(d, "%Y-%m-%d %H:%M:%S") for d in dates])
fig, ax = plt.subplots(figsize=(15,9))
# y values
y_values = np.array(df_stock.loc[start_date:end_date, 'Bid'])
# plotting
_ = ax.plot(x_values, y_values, label='Bid')
# formatting
formatter = mdates.DateFormatter('%m-%d %H:%M')
ax.xaxis.set_major_formatter(formatter)
The flat lines correspond to data that does not exist. I would like to know whether it is possible not to display them while keeping the gaps between the x labels.
Thank you so much.
You want time on the x-axis, and time is equidistant regardless of whether you have data or not.
You now have several options:
1. don't use time on the x-axis but samples/index
2. do as in 1. but change the ticks and labels to show time again (this time not equidistantly)
3. make the value vector equidistant and use NaNs to fill the gaps
Why is this so?
By default, matplotlib produces a line plot, which connects the points with lines in the order in which they are given. In contrast, a scatter plot just draws the individual points without suggesting any underlying order. You get the same result from a line plot that draws only markers and no connecting line.
In general, you have 3-4 options:
1. use the plot command but only plot markers (add linestyle='')
2. use the scatter command
3. use NaNs: plot does not know what to draw there and draws nothing (and also won't connect non-existing points with lines)
4. use a loop and plot connected sections as separate lines in the same axes
Options 1 and 2 are the easiest if you want to change almost nothing in your code. Option 3 is the most proper, and option 4 mimics its result.
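Option 3 can be sketched as follows. The data here is hypothetical (one sample per minute with a three-hour hole), and the names t_start, minutes, grid and filled are made up for the illustration. The idea is to build an equidistant time grid and leave NaN wherever no sample exists:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical series: one sample per minute, with a 3-hour hole in the middle
t_start = np.datetime64("2021-01-04T09:00")
minutes = np.concatenate([np.arange(0, 60), np.arange(240, 300)])
values = np.sin(minutes / 20.0)

# Build an equidistant time grid and fill the missing slots with NaN
grid = t_start + np.arange(0, 300).astype("timedelta64[m]")
filled = np.full(grid.shape, np.nan)
filled[minutes] = values  # each sample goes into the slot for its minute

fig, ax = plt.subplots()
ax.plot(grid, filled, label="Bid")  # the line breaks wherever the data is NaN
ax.legend()
fig.savefig("gaps.png")  # hypothetical output name
```

The x-axis still spans the whole period, so the gap stays visible, but no flat connecting segment is drawn across it.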

Visual output of Matplotlib bar chart - Python

I am working on getting some graphs generated for 4 columns, with the COLUMN_NM being the main index.
The issue I am facing is that the column names are shown along the bottom. This is problematic for two reasons. First, there could be dozens of these columns, so the graph would look messy and could stretch too far to the right. Second, the names are getting cut off (though I am sure that can be fixed).
I would prefer to have the column names listed vertically in the box where 'MAX_COL_LENGTH' currently resides, and have the bars in a different color per column instead.
Any ideas how I would adjust this or suggestions to make this better?
for col in ['DISTINCT_COUNT', 'MAX_COL_LENGTH', 'MIN_COL_LENGTH', 'NULL_COUNT']:
    grid[['COLUMN_NM', col]].set_index('COLUMN_NM').plot.bar(title=col)
    plt.show()
In this case you can plot the bars one by one and set a label for each bar:
import matplotlib.pyplot as plt
from matplotlib import gridspec

gs = gridspec.GridSpec(1,1)
fig = plt.figure(figsize=(5, 5))
ax = fig.add_subplot(gs[:, :])
data = [1,2,3,4,5]
label = ['l1','l2','l3','l4','l5']
# each ax.bar call draws one bar with its own color and legend entry
for n,(p,l) in enumerate(zip(data,label)):
    ax.bar(n,p,label=l)
ax.set_xticklabels([])
ax.legend()
This is the output for the code above:
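Applied to the question's setup, the same idea might look like the sketch below. The contents of grid are invented here as a stand-in for the real DataFrame, only two of the four metric columns are shown, and the output file names are hypothetical:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the `grid` DataFrame from the question
grid = pd.DataFrame({
    "COLUMN_NM": ["id", "name", "city"],
    "MAX_COL_LENGTH": [8, 40, 25],
    "NULL_COUNT": [0, 3, 12],
})

for col in ["MAX_COL_LENGTH", "NULL_COUNT"]:
    fig, ax = plt.subplots()
    # One ax.bar call per row gives each bar its own color and legend entry
    for n, (name, value) in enumerate(zip(grid["COLUMN_NM"], grid[col])):
        ax.bar(n, value, label=name)
    ax.set_xticks([])  # the column names live in the legend, not on the x-axis
    ax.set_title(col)
    ax.legend(title="COLUMN_NM")
    fig.savefig(f"{col}.png")
```

With dozens of columns the legend itself can get long; passing ncol to ax.legend is one way to split it into several columns.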

Scrolling plot using matplotlib "smears" when updating

The application I'm coding for requires a real time plot of incoming data that is being stored long term in an excel spreadsheet. So the real time graph displays the 25 most recent data points.
The problem comes when the plot has to shift in the newest data point and shift out the oldest point. When I do this, the graph "smears" as shown here:
I then began to use plt.cla(), but this causes me to lose all formatting in my plots, such as the title, axes, etc. Is there any way for me to update my graph, but keep my graph formatting?
Here's an example after plt.cla():
And here's basically how I'm updating my graphs within a larger loop:
if data_point_index < max_data_points:
    y_data[data_point_index] = measurement
    plt.plot(x_data[:data_point_index + 1], y_data[:data_point_index + 1], 'or--')
else:
    plt.cla()
    y_data[0:max_data_points - 1] = y_data[1:max_data_points]
    y_data[max_data_points - 1] = measurement
    plt.plot(x_data, y_data, 'or--')
plt.pause(0.00001)
I know I can just re-add axis labels and such, but I feel like there should be a more elegant way to do so, and it is somewhat of a hassle as there can be multiple subplots and reformatting the figure takes a non-trivial amount of time.
Rather than plt.cla(), which as you have found out clears everything on the axes, you could just remove the last line plotted, which will leave your labels and formatting intact.
The Axes instance has an attribute lines, which holds all the lines currently plotted on the axes. To remove the last line plotted, access the current axes with plt.gca() and call remove() on the last entry of lines. (Calling lines.pop() used to work as well, but recent Matplotlib releases no longer allow modifying that list directly.)
else:
    plt.gca().lines[-1].remove()
    y_data[0:max_data_points - 1] = y_data[1:max_data_points]
    y_data[max_data_points - 1] = measurement
    plt.plot(x_data, y_data, 'or--')
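A different technique from the one above, sketched under assumptions: create the line once and update its data in place, so nothing has to be removed or re-plotted. The title text, the random measurement source, and the loop count of 40 are placeholders for the real application:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

max_data_points = 25
x_data = np.arange(max_data_points)
y_data = np.full(max_data_points, np.nan)  # NaN slots are simply not drawn

fig, ax = plt.subplots()
ax.set_title("Live measurements")  # formatting is set once and never cleared
ax.set_xlim(0, max_data_points - 1)
line, = ax.plot(x_data, y_data, "or--")

rng = np.random.default_rng(0)  # stand-in for the real measurement source
for data_point_index in range(40):
    measurement = rng.normal()
    if data_point_index < max_data_points:
        y_data[data_point_index] = measurement
    else:
        y_data[:-1] = y_data[1:]  # shift out the oldest point
        y_data[-1] = measurement
    line.set_ydata(y_data)  # update the existing line instead of re-plotting
    ax.relim()  # recompute data limits from the updated line
    ax.autoscale_view(scalex=False)  # rescale the y-axis only
    fig.canvas.draw_idle()  # in an interactive session, call plt.pause(...) here
```

Because the axes are never cleared, titles, labels and other formatting survive every update, and only one Line2D object ever exists on the axes.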
