I have two similar pieces of matplotlib code that produce different results.
1:
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0,10,100)
y = np.linspace(0,10,100)
y[10:40] = np.nan
plt.plot(x,y)
plt.savefig('fig')
2:
from pylab import *
x = linspace(0,10,100)
y = linspace(0,10,100)
y[10:40] = np.nan
plot(x,y)
savefig('fig')
Code #1 produces a straight line with the NaN region filled in by a solid line of a different color.
Code #2 produces a straight line but does not fill in the NaN region. Instead, there is a gap.
How can I make code #1 produce a gap in place of the NaNs, like code #2? I have been googling for a couple of days and have come up with nothing. Any help or advice would be appreciated. Thanks in advance.
Just to explain what's probably happening:
The two pieces of code you showed are functionally identical. They will always produce the same output if run by themselves. pylab is basically just a few lines of code that do the following (there's a bit more to it than this, but it's the basic idea):
from numpy import *
from matplotlib.mlab import *
from matplotlib.pyplot import *
There's absolutely no way for pylab.plot to reference a different function than plt.plot
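A quick way to check this yourself, assuming your matplotlib installation still ships the pylab module, is to compare the two function objects directly. The variable name same_function is only for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs without a display
import matplotlib.pyplot as plt
import matplotlib.pylab as pylab

# pylab re-exports pyplot's names, so both attributes should be the same object
same_function = pylab.plot is plt.plot
print(same_function)
```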
However, if you just call plt.plot (or pylab.plot, they're the same function), it plots on the current figure.
If you plotted something on that figure before, it will still be there. (If you're familiar with MATLAB, matplotlib behaves as if hold('on') were set. Older versions let you change this with plt.hold, which was removed in matplotlib 3.0, but it's best to be more explicit in Python and just create a new figure.)
Basically, you probably did this:
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0,10,100)
y = np.linspace(0,10,100)
plt.plot(x,y)
plt.savefig('fig')
And then, in the same interactive ipython session, you did this:
y[10:40] = np.nan
plt.plot(x, y)
plt.savefig('fig')
Because you didn't call show, the current figure is still the same one as it was before. The "full" line is still present beneath the second one, and the second line with the NaN's is a different color because you've plotted on the same axes.
This is one of the many reasons why it's a good idea to use the object-oriented interface. That way you're aware of exactly which axes and figure you're plotting on.
For example:
fig, ax = plt.subplots()
ax.plot(x, y)
fig.savefig('test.png')
If you're not going to do that, at the very least always explicitly create a new figure and/or axes when you want a new figure (e.g. start by calling plt.figure()).
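To see the difference described above, here is a minimal sketch that counts the lines on the axes in both situations. It assumes the Agg backend so it runs without a display, and the names y_gap, lines_on_shared_axes and lines_on_fresh_axes are made up for this illustration:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)
y = np.linspace(0, 10, 100)
y_gap = y.copy()
y_gap[10:40] = np.nan

# Implicit interface: both calls draw on the same "current" axes,
# so the full line from the first call is still there under the second.
plt.figure()
plt.plot(x, y)
plt.plot(x, y_gap)
lines_on_shared_axes = len(plt.gca().lines)

# Explicit interface: a fresh figure holds only the line with the gap.
fig, ax = plt.subplots()
ax.plot(x, y_gap)
lines_on_fresh_axes = len(ax.lines)

print(lines_on_shared_axes, lines_on_fresh_axes)
plt.close("all")
```

Saving the second figure with fig.savefig(...) should then show the gap, since nothing else was drawn on those axes.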
I am trying to plot a scatter diagram. It will take multiple arrays as input but plot into a single graph.
Here is my code:
import numpy as np
import os
import matplotlib.pyplot as plt
ax = plt.gca()
n_p=np.array([17.2,25.7,6.1,0.9,0.5,0.2])
n_d=np.array([1,2,3])
a_p=np.array([4.3,1.4,8.1,1.8,7.9,7.0])
a_d=np.array([12,13,14])
ax.scatter = ([n_d[0]/n_d[1]],[n_p[0]/n_p[1]])
ax.scatter = ([a_d[0]/a_d[1]],[a_p[0]/a_p[1]])
I will read the arrays from a CSV file; here I have just put a simple example (that is why I imported os). I want to plot the ratio of array element 2 / element 1 of n_p (as x-axis) and the same for n_d (as y-axis). This will give one point in the graph. The same operation will be applied to the a_p and a_d arrays, and that point will be added to the same graph. There will be more data to append, but two is enough to understand the process.
I tried to follow the example from here.
If I use the color, I get a syntax error.
If I do not use color, I get a blank plot.
Sorry, I am a beginner, so the code is rather messy.
Thanks in advance.
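Remove the = from the function call! Writing ax.scatter = (...) assigns a tuple to the name ax.scatter instead of calling the method, so nothing is drawn.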
import numpy as np
import os
import matplotlib.pyplot as plt
ax = plt.gca()
n_p=np.array([17.2,25.7,6.1,0.9,0.5,0.2])
n_d=np.array([1,2,3])
a_p=np.array([4.3,1.4,8.1,1.8,7.9,7.0])
a_d=np.array([12,13,14])
ax.scatter([n_d[0]/n_d[1]],[n_p[0]/n_p[1]])
ax.scatter([a_d[0]/a_d[1]],[a_p[0]/a_p[1]])
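Since the question also mentions colors and appending more datasets later, here is one possible sketch that loops over the datasets and passes the color as a keyword argument. The labels, the color names and the datasets list are assumptions made for this example, not part of the original code:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs anywhere
import matplotlib.pyplot as plt
import numpy as np

n_p = np.array([17.2, 25.7, 6.1, 0.9, 0.5, 0.2])
n_d = np.array([1, 2, 3])
a_p = np.array([4.3, 1.4, 8.1, 1.8, 7.9, 7.0])
a_d = np.array([12, 13, 14])

fig, ax = plt.subplots()

# Hypothetical grouping: (label, d array, p array, color) per dataset
datasets = [("n", n_d, n_p, "tab:blue"), ("a", a_d, a_p, "tab:red")]
for label, d, p, color in datasets:
    # Each call adds one point to the same axes; color is a keyword argument
    ax.scatter([d[0] / d[1]], [p[0] / p[1]], color=color, label=label)
ax.legend()

# Count the points that ended up on the axes
n_points = sum(len(c.get_offsets()) for c in ax.collections)
print(n_points)
```

Adding another dataset is then a matter of appending one more tuple to the list.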
When I plot some data with matplotlib without setting any parameters, the data gets plotted with both x and y axis limits set correctly, meaning that all data is shown and no space is wasted (case 1):
import matplotlib
matplotlib.use('QT5Agg')
import matplotlib.pyplot as plt
x = range(10)
plt.plot(x,'-o',markersize='10')
plt.tight_layout()
plt.show()
Result:
If I set some limits for e. g. the x axis, even using autoscale() does not autoscale the y axis anymore (case 2):
import matplotlib
matplotlib.use('QT5Agg')
import matplotlib.pyplot as plt
x = range(10)
plt.plot(x,'-o',markersize='10')
plt.autoscale(enable=True,axis='y')
plt.xlim(7.5,11)
plt.tight_layout()
plt.show()
Result:
Question: which function is used internally by matplotlib to determine the limits for both axes and update the plot in case 1?
Background: I want to use this function as a base for reimplementing / extending this functionality for case 2.
As @ImportanceOfBeingEarnest pointed out in the answer below, there is no such automatic way at the moment. So, in case you are interested in knowing how to rescale your y-axis, one way to do so is by recomputing the corresponding y-values and then reassigning the y-limits using the method specified in this unaccepted answer. I haven't marked this as a duplicate because your example differs in a few ways:
First (the major one), you have plotted only x-values. So, to apply the method from the other answer, I first had to get the y-values into an array. This is done using get_ydata().
Second, the x-values were changed from a range() object to a NumPy array, as the former does not support the element-wise arithmetic that find_nearest relies on.
Third, I had to use a variable for the x-limits to be consistent with the function.
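import numpy as np
import matplotlib.pyplot as plt
x = np.arange(10)
plt.plot(x, '-o', markersize=10)
x_lims = [7.5, 11]
plt.xlim(x_lims)
ax = plt.gca()
# Only one array was passed to plot(), so read the y-values back from the line
y = ax.lines[0].get_ydata()
def find_nearest(array, value):
    # Index of the element in array that is closest to value
    idx = (np.abs(array - value)).argmin()
    return idx
# y-values at the data points nearest to the two x-limits
y_low = y[find_nearest(x, x_lims[0])]
y_high = y[find_nearest(x, x_lims[1])]
ax.set_ylim(y_low, y_high)
plt.tight_layout()
plt.show()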
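Reading y at the two points nearest the x-limits works for monotonic data like this example. For data that goes up and down, a variation is to take the minimum and maximum of all y-values whose x lies inside the current x-limits. The sketch below is one way to do that; autoscale_y_to_visible and its margin parameter are hypothetical names, not a matplotlib API:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

def autoscale_y_to_visible(ax, margin=0.05):
    """Set the y-limits from the line data that falls inside the x-limits."""
    x_low, x_high = ax.get_xlim()
    y_min, y_max = np.inf, -np.inf
    for line in ax.lines:
        xd = np.asarray(line.get_xdata(), dtype=float)
        yd = np.asarray(line.get_ydata(), dtype=float)
        visible = (xd >= x_low) & (xd <= x_high) & np.isfinite(yd)
        if visible.any():
            y_min = min(y_min, yd[visible].min())
            y_max = max(y_max, yd[visible].max())
    if not (np.isfinite(y_min) and np.isfinite(y_max)):
        return  # nothing visible, leave the limits alone
    pad = margin * (y_max - y_min)
    if pad == 0:
        pad = margin  # avoid identical lower and upper limits
    ax.set_ylim(y_min - pad, y_max + pad)

fig, ax = plt.subplots()
ax.plot(np.arange(10), "-o", markersize=10)
ax.set_xlim(7.5, 11)
autoscale_y_to_visible(ax)
y_lims = ax.get_ylim()
print(y_lims)
```

The helper only looks at Line2D artists in ax.lines, so scatter collections or images would need their own handling.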
The question has already been asked and has a good solution using masks.
I'm asking again because I'd like to know if there is a way to make matplotlib handle missing data on its own: something like, if any of the x or y data is missing, just ignore it and draw a line through the gap.
Here's some sample code:
import numpy as np
import matplotlib.pyplot as plt
plt.figure()
x = np.arange(0, 100, 10)
y = np.random.randint(0, 10, 10)
plt.plot(x,y, "*-")
x_nan = np.arange(100)
y_nan = np.asarray([np.nan] * 100)
y_nan[::10] = np.random.randint(0, 10, 10)
plt.plot(x_nan,y_nan,"*-")
mask = np.isfinite(y_nan)
plt.plot(x_nan[mask],y_nan[mask],"--")
plt.show()
The second plot draws markers only for the non-NaN points, but no line through them.
The easiest way to make it look like the first is to define a mask, as in the third plot. I'd like to know if there is a way to make matplotlib behave like this automatically, without the extra mask.
Short answer: No!
Long answer: One could indeed imagine a feature built into matplotlib's plot function that would remove NaNs from the input.
However, there is none.
But since the solution is essentially only one extra line of code, the fact that matplotlib does not provide this functionality is bearable.
Just as a fun fact: interestingly, a scatter plot does ignore NaN values, e.g.
line, = plt.plot(x_nan,y_nan,"-")
scatter = plt.scatter(x_nan,y_nan)
print(len(line.get_xdata())) # 100
print(len(scatter.get_offsets())) # 10
While the line still has 100 points, the scatter only has 10, as all NaN values are removed.
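If you need the masking in many places, the one extra line can be wrapped in a small helper. This is only a sketch; plot_skipping_nans is a made-up name and not something matplotlib provides:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

def plot_skipping_nans(ax, x, y, *args, **kwargs):
    """Plot only the points where both x and y are finite."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mask = np.isfinite(x) & np.isfinite(y)
    return ax.plot(x[mask], y[mask], *args, **kwargs)

x_nan = np.arange(100)
y_nan = np.full(100, np.nan)
y_nan[::10] = np.arange(10)  # every tenth value is real data

fig, ax = plt.subplots()
line, = plot_skipping_nans(ax, x_nan, y_nan, "*-")
n_plotted = len(line.get_xdata())
print(n_plotted)
```

Because the NaN points are dropped before plotting, the remaining points are joined by one continuous line.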
I have a simple problem that I cannot quite understand why it doesn't work.
MWE:
import numpy as np
import matplotlib.pyplot as plt
test = np.random.rand(100,5)
plt.plot(test)
plt.show()
Produces
Now all I want to do is to quite literally transpose the whole test matrix so that my data on the x-axis is now plotted vertically instead (so [0-100] is on y instead). But when I do that:
plt.plot(test.T)
plt.show()
I get this instead
The data streams are thus being superimposed on top of each other rather than the plot being transposed. I was expecting the whole thing to just get flipped, so x --> y and y --> x. Perhaps what I want is not a transpose. The data is plotted horizontally now, and I just want to plot it vertically instead.
Hence, where am I going wrong? I have clearly misunderstood something very basic.
Well this solved it...
plt.plot(test,range(100))
plt.show()
Generalising Astrid's answer, one can define a helper function to transpose the axis for any 1d-array plot, like this:
def transpose_plot(array):
    return plt.plot(array, range(len(array)))
Demo:
import numpy as np
import matplotlib.pyplot as plt
def transpose_plot(array):
    return plt.plot(array, range(len(array)))
test = np.random.rand(100,5)
transpose_plot(test)
plt.show()
Suppose I have the following code, which creates one matplotlib figure with two axes, the second of which has dates as x-axis labels:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import datetime as dt
x1 = np.arange(0,30)
x2 = pd.date_range('1/1/2016', periods=30, freq='D')
y1 = np.random.randn(30)
y2 = np.random.randn(30)
%matplotlib inline
fig, ax = plt.subplots(1,2, figsize=(18,5))
ax[0].scatter(x1,y1)
ax[1].scatter(x2,y2)
Displaying this in an IPython notebook will show the x-axis labels of the right-hand graph running into one another. I would like to rotate the labels to improve visibility. All of the documentation and online searching seems to suggest one of the following two options (both after the last line above):
#1
plt.setp(ax[1].xaxis.get_majorticklabels(),rotation=90,horizontalalignment='right')
or #2
plt.xticks(rotation=90)
Either of these will work, but will also print a list of labels (which for some reason is different in the first example than in the second).
How do I accomplish the rotation/display without also outputting some array?
I was able to use this approach. I'm not sure if it is the most elegant way, but it works without outputting an array:
for tick in ax[1].get_xticklabels():
    tick.set_rotation(90)
In Jupyter you can also just put a semicolon at the end of a command and it will suppress the output. This is handy for plotting graphs without printing the returned data:
plt.xticks(rotation=90);
A bit late, so just for future reference:
Jupyter notebooks always print the return value from the last command in a cell.
I would just suppress this by adding another function or statement that doesn't return anything, instead of searching for a workaround for the actual function you want to call.
So you could do:
plt.xticks(rotation=90)
pass
or:
plt.xticks(rotation=90)
plt.show()
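Two more variations on the same idea, sketched under the assumption of a headless backend and with plain datetime objects standing in for the pandas date range: Axes.tick_params returns None, so there is nothing for the notebook to echo, and assigning a return value to a throwaway name turns the last line into a statement rather than an expression.

```python
import datetime as dt
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x1 = np.arange(30)
x2 = [dt.date(2016, 1, 1) + dt.timedelta(days=i) for i in range(30)]
y1 = rng.standard_normal(30)
y2 = rng.standard_normal(30)

fig, ax = plt.subplots(1, 2, figsize=(18, 5))
ax[0].scatter(x1, y1)
ax[1].scatter(x2, y2)

# Option A: tick_params returns None, so a notebook has nothing to print
result = ax[1].tick_params(axis="x", labelrotation=90)

# Option B: bind the returned value to a throwaway name
_ = plt.xticks(rotation=90)
```

Option A also has the advantage of targeting ax[1] explicitly instead of relying on whichever axes happens to be current.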