Matplotlib - pyplot incorrectly setting axes ticks when using scatter() - python

I am trying to customize the xticks and yticks for my scatterplot with the simple code below:
import numpy as np
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(1,1,1)
y_ticks = np.arange(10, 41, 10)
x_ticks = np.arange(1000, 5001, 1000)
ax.set_yticks(y_ticks)
ax.set_xticks(x_ticks)
ax.scatter(some_x, some_y)
plt.show()
If we comment out the line ax.scatter(some_x, some_y), we get an empty plot with the correct ticks:
However, if the code is run exactly as shown, we get this:
Finally, if we run the code with ax.set_yticks(y_ticks) and ax.set_xticks(x_ticks) commented out, we also get a correct result (just with the axes not in the ranges I want):
Note that I am using Python version 2.7. Additionally, some_x and some_y are omitted.
Any input on why the axes are changing in such an odd manner only after I try plotting a scatterplot would be appreciated.
EDIT:
If I run ax.scatter(some_x, some_y) before the xticks and yticks are set, I get odd results that are slightly different from before:

By default, Matplotlib axes adjust themselves to the content. This is a desirable feature, because it lets you always see the plotted data, no matter whether it ranges from -10 to -9 or from 1000 to 10000.
Setting the xticks will only change the tick locations. So if you set the ticks to locations between -10 and -9, but then plot data from 1000 to 10000, you would simply not see any ticks, because they do not lie in the shown range.
If the automatically chosen limits are not what you are looking for, you need to set them manually, using ax.set_xlim() and ax.set_ylim().
Finally, it should be clear that in order to have correct numbers appear on the axes, you need to actually use numbers. If some_x and some_y in ax.scatter(some_x, some_y) are strings, they will not obey any reasonable limits, but will simply be plotted one after the other.
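A minimal sketch of the two points above, assuming the data arrive as strings: convert them to numbers first, then fix the limits explicitly after plotting. The values in raw_x and raw_y are made up for illustration, since the question omits some_x and some_y.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-ins for the omitted data, deliberately given as strings.
raw_x = ["1200", "2500", "4800"]
raw_y = ["12", "25", "38"]

# Convert to floats so the axes are numeric rather than categorical.
some_x = np.asarray(raw_x, dtype=float)
some_y = np.asarray(raw_y, dtype=float)

fig, ax = plt.subplots()
ax.scatter(some_x, some_y)

# Tick locations only say where ticks go ...
ax.set_xticks(np.arange(1000, 5001, 1000))
ax.set_yticks(np.arange(10, 41, 10))

# ... the visible range is set separately.
ax.set_xlim(1000, 5000)
ax.set_ylim(10, 40)
```

Setting the limits after the scatter call keeps a later autoscale from overriding them.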

Related

Matplotlib: datetime-based linecollection in interactive jupyter plot

I am trying to plot a collection of tens of thousands of line segments in a matplotlib interactive plot in a Jupyter notebook. The problem I have is that
the x-values are datetimes (datetime64[ns], basically POSIX timestamps)
LineCollections can only be based on numbers
when leaving the x-axis of the plot to be numbers, when I zoom the plot, the x-axis nicely adjusts in scale to the zoom. However, the x-axis values are uninformative. When formatting the x-axis to informative datetime values, this information is lost when zooming.
Example:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import collections as mc
import matplotlib.dates as mdates
# interactive plot in jupyter notebook
%matplotlib nbagg
x = np.array([['2018-03-19T07:01:00.000073810', '2018-03-19T07:01:00.632164618'],
['2018-03-19T07:01:00.000073811', '2018-03-19T07:01:00.742295898'],
['2018-03-19T07:01:00.218747698', '2018-03-19T07:01:00.260067814'],
['2018-03-19T07:01:01.218747698', '2018-03-19T07:01:02.260067814'],
['2018-03-19T07:01:02.218747698', '2018-03-19T07:01:02.260067814'],
['2018-03-19T07:01:02.218747698', '2018-03-19T07:01:02.260067814']],
dtype='datetime64[ns]')
y = np.array([[12355.5, 12355.5],
[12363. , 12363. ],
[12362.5, 12362.5],
[12355.5, 12355.5],
[12363. , 12363. ],
[12362.5, 12362.5]])
fig, ax = plt.subplots()
segs = np.zeros((x.shape[0], x.shape[1], 2))
segs[:, :, 1] = y
segs[:, :, 0] = mdates.date2num(x)
lc = mc.LineCollection(segs)
ax.set_xlim(segs[:,:,0].min(), segs[:,:,0].max())
ax.set_ylim(segs[:,:,1].min()-1, segs[:,:,1].max()+1)
ax.add_collection(lc)
Now, zooming works fine -- the x-axis scale adjusts with the zoom -- but the x-axis values don't tell me anything useful, i.e. the precise time I'm currently looking at. To remedy this I tried to e.g. do:
ax.xaxis.set_major_locator(mdates.SecondLocator())
#ax.xaxis.set_minor_locator(mdates.MicrosecondLocator()) # this causes the plot not to display
Fmt = mdates.DateFormatter("%S")
ax.xaxis.set_major_formatter(Fmt)
Now clearly zooming doesn't work well, since matplotlib doesn't know how to format the finer ticks. So if I zoom sufficiently (which I need to do), I basically have no ticks on the x-axis.
Is there a way to address this? One way I could think of is to set up a callback that gets called when the plot zooms, and adjust the format of the x-axis there. But as far as I could find, this is not possible.
It appears that the main problem is currently to get just any useful ticks and labels on your plot. The default way to do this would be
loc = mdates.AutoDateLocator()
fmt = mdates.AutoDateFormatter(loc)
ax.xaxis.set_major_locator(loc)
ax.xaxis.set_major_formatter(fmt)
This would automatically choose useful tick locations for you and is correct down to some microseconds; below that, ticking may become inaccurate due to floating point restrictions.
Meaning, if you need customized or more accurate tick locations you will need to write your own locator and/or change the units of your data (e.g. to "seconds since midnight").
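As a rough sketch of the "change the units" suggestion, the datetime64[ns] values can be turned into float seconds since midnight with plain NumPy before building the segments. The two-row array below is a shortened copy of the question's x; the names midnight and seconds are introduced here for illustration only.

```python
import numpy as np

# Shortened copy of the question's timestamps.
x = np.array([['2018-03-19T07:01:00.000073810', '2018-03-19T07:01:00.632164618'],
              ['2018-03-19T07:01:01.218747698', '2018-03-19T07:01:02.260067814']],
             dtype='datetime64[ns]')

# Midnight of the earliest day in the data (truncate to day resolution).
midnight = x.min().astype('datetime64[D]')

# Subtracting gives timedelta64[ns]; dividing by one second gives float seconds.
seconds = (x - midnight) / np.timedelta64(1, 's')
```

The resulting array can be used as the x-coordinates of the segments instead of mdates.date2num(x), with the default numeric tick locator handling the zoom. Because the numbers are much smaller than day-based date numbers, more of the float precision is left for the sub-second part.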

Multiple x labels on Pyplot

Below is my code for a line graph. I would like another x label under the current one (so I can show the days of the week).
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
sns.set()
data = pd.read_csv("123.csv")
data['DAY']=["01","02","03","04","05","06","07","08","09","10","11","12","13","14","15","16","17","18","19","20","21","22","23","24","25","26","27","28","29","30","31"]
plt.figure(figsize=(15,8))
plt.plot('DAY','SWST',data=data,linewidth=2,color="k")
plt.plot('DAY','WMID',data=data,linewidth=2,color="m")
plt.xlabel('DAY', fontsize=20)
plt.ylabel('VOLUME', fontsize=20)
plt.legend()
EDIT: After following the documentation, I have 2 issues. The scale has changed from 31 to 16, and the days of the week do not line up with the day number.
data['DAY']=["01","02","03","04","05","06","07","08","09","10","11","12","13","14","15","16","17","18","19","20","21","22","23","24","25","26","27","28","29","30","31"]
tick_labels=['1','\n\nThu','2','\n\nFri','3','\n\nSat','4','\n\nSun','5','\n\nMon','6','\n\nTue','7','\n\nWed','8','\n\nThu','9','\n\nFri','10','\n\nSat','11','\n\nSun','12','\n\nMon','13','\n\nTue','14','\n\nWed','15','\n\nThu','16','\n\nFri','17','\n\nSat','18','\n\nSun','19','\n\nMon','20','\n\nTue','21','\n\nWed','22','\n\nThu','23','\n\nFri','24','\n\nSat','25','\n\nSun','26','\n\nMon','27','\n\nTue','28','\n\nWed','29','\n\nThu','30','\n\nFri','31','\n\nSat']
tick_locations = np.arange(31)
plt.figure(figsize=(15,8))
plt.xticks(tick_locations, tick_labels)
plt.plot('DAY','SWST',data=data,linewidth=2,color="k")
plt.plot('DAY','WMID',data=data,linewidth=2,color="m")
plt.xlabel('DAY', fontsize=20)
plt.ylabel('VOLUME', fontsize=20)
plt.legend()
plt.show()
The pyplot function you are looking for is plt.xticks(). This is essentially a combination of ax.set_xticks() and ax.set_xticklabels().
From the documentation:
Parameters:
ticks : array_like
    A list of positions at which ticks should be placed. You can pass an empty list to disable xticks.
labels : array_like, optional
    A list of explicit labels to place at the given locs.
You would want something like the below code. Note you should probably explicitly set the tick locations as well as the labels to avoid setting labels in the wrong positions:
tick_labels = ['1','\n\nThu','2',..., '31','\n\nSat']
plt.xticks(tick_locations, tick_labels)
Note that the object-orientated API (i.e. using ax.) allows for more customisable plots.
Update
After the edit, I see that the labels you want to go below are part of the same list. Therefore your label list actually has a length of 62. So you need to join every 2 elements of your list together:
tick_labels=['1','\n\nThu','2','\n\nFri','3','\n\nSat','4','\n\nSun','5','\n\nMon','6','\n\nTue','7','\n\nWed','8',
'\n\nThu','9','\n\nFri','10','\n\nSat','11','\n\nSun','12','\n\nMon','13','\n\nTue','14','\n\nWed','15',
'\n\nThu','16','\n\nFri','17','\n\nSat','18','\n\nSun','19','\n\nMon','20','\n\nTue','21','\n\nWed','22',
'\n\nThu','23','\n\nFri','24','\n\nSat','25','\n\nSun','26','\n\nMon','27','\n\nTue','28','\n\nWed','29',
'\n\nThu','30','\n\nFri','31','\n\nSat']
tick_locations = np.arange(31)
new_labels = [ ''.join(x) for x in zip(tick_labels[0::2], tick_labels[1::2]) ]
plt.figure(figsize=(15, 8))
plt.xticks(tick_locations, new_labels)
plt.show()
Never use ax.set_xticklabels without setting the locations of the ticks as well. This can be done via ax.set_xticks.
ax.set_xticks(...)
ax.set_xticklabels(...)
Of course you may do the same with pyplot
ax = plt.gca()
ax.set_xticks(...)
ax.set_xticklabels(...)
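Put together, a sketch of this approach for the 31-day plot might look as follows. The volume array is placeholder data (the question reads its values from a CSV file), and the weekday order simply follows the labels in the question, where day 1 is a Thursday.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

# Day strings as in the question: "01" ... "31".
days = ["%02d" % d for d in range(1, 32)]

# One label per tick: day number on top, weekday two lines below.
weekdays = ["Thu", "Fri", "Sat", "Sun", "Mon", "Tue", "Wed"]
labels = ["%d\n\n%s" % (i + 1, weekdays[i % 7]) for i in range(len(days))]

volume = np.arange(len(days))  # placeholder for the SWST / WMID columns

fig, ax = plt.subplots(figsize=(15, 8))
ax.plot(days, volume, linewidth=2, color="k")

# String x-values are drawn at positions 0, 1, 2, ...; set locations and labels together.
ax.set_xticks(np.arange(len(days)))
ax.set_xticklabels(labels)
```

Building one label per tick avoids the length mismatch (62 labels for 31 ticks) described in the update above.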

Avoid overlapping ticks in matplotlib

I am generating plots like this one:
When using fewer ticks, the plot fits nicely and the bars are wide enough to see them correctly. However, when there are lots of ticks, instead of making the plot larger, it just compresses the y axis, resulting in thin bars and overlapping tick text.
This is happening both for plt.show() and plt.savefig().
Is there any solution so that it plots the figure at a scale which guarantees that bars have the specified width, not more (if there are too few ticks) and not less (if there are too many, overlapping)?
EDIT:
Yes, I'm using barh, and yes, I'm setting height to a fixed value (8):
height = 8
ax.barh(yvalues-width/2, xvalues, height=height, color='blue', align='center')
ax.barh(yvalues+width/2, xvalues, height=height, color='red', align='center')
I don't quite understand your code: it seems you draw two plots with the same (only shifted) yvalues, but the image doesn't look like that. And are you sure you want to shift by width/2 if you have align='center'? Anyway, on changing the image size:
I am not sure whether there is another way, but I don't see anything in the manual at a glance. To set the image size by hand:
fig = plt.figure(figsize=(5, 80))
ax = fig.add_subplot(111)
...your_code
The size is in inches, not centimeters. You can compute it beforehand; try for example
import numpy as np
fig_height = (max(yvalues) - min(yvalues)) / np.min(np.diff(np.sort(yvalues)))
This would (approximately) set the minimum distance between ticks to an inch, which is too much, but try adjusting it.
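One way to make that concrete is to derive the figure height from the number of bars, so each bar gets a fixed amount of vertical space. This is only a sketch: figure_height, inches_per_bar and margin are names and values chosen here for illustration, and the bar data is synthetic.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

def figure_height(n_bars, inches_per_bar=0.25, margin=1.0):
    """Height in inches giving each bar a fixed slot plus room for axis labels."""
    return n_bars * inches_per_bar + margin

# Synthetic data: 200 horizontal bars at integer positions.
yvalues = np.arange(200)
xvalues = np.linspace(1, 10, 200)

height_in = figure_height(len(yvalues))
fig, ax = plt.subplots(figsize=(5, height_in))
ax.barh(yvalues, xvalues, height=0.8, align="center")

# Tight y-limits so the per-bar spacing is not diluted by padding.
ax.set_ylim(-0.5, len(yvalues) - 0.5)
```

With few bars the figure stays short, and with many bars it grows instead of squeezing them.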
I can think of two solutions for your case:
If you are trying to plot a histogram, use the hist function. This will automatically bin your data. You can even plot multiple overlapping histograms as long as you set the alpha value lower than 1.
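import matplotlib.pyplot as plt
import numpy as np
mu, sigma = 100, 15  # example values; the original snippet leaves mu and sigma undefined
x = mu + sigma*np.random.randn(10000)
# density=True replaces the old normed argument, which newer Matplotlib versions no longer accept
plt.hist(x, 50, density=True, facecolor='green',
         alpha=0.75, orientation='horizontal')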
You can also set the interval of your axis ticks. This will place a tick at every multiple of 10. But I doubt this will solve your problem.
import matplotlib.ticker as ticker
...
ax.yaxis.set_major_locator(ticker.MultipleLocator(10))

Colorbar ticklabels don't match tick positions

I'm plotting a meshgrid with pyplot.pcolormesh, and I want to customize the ticklabels on the colorbar. I set a list of tick positions, and provide a list of ticklabels, which should match the tick positions, but I don't know ahead of time which ticks will actually be included, since I don't know the max and the min of the data. The problem is that the first ticklabel I provide is always used at the first visible tick, regardless of whether that is the first tick in my list or not.
Working example:
import matplotlib.pyplot as plt
import numpy as np
a = np.arange(1,10).reshape(3,3)
m = plt.pcolormesh(a)
c = plt.colorbar(m)
c.set_ticks(np.arange(11))
c.set_ticklabels(np.arange(11))
plt.savefig('mesh.png')
This code produces the image below, and the problem here is that the darkest blue is labeled 0, while the value in that cell is actually 1, and similarly all the other labels are shifted by 1.
Is this a bug or a feature, and if it's a feature, how can I make sure the labels will match in an elegant manner? I guess I could manage with some tests on the data, trying to figure out which tick will be the first visible one and so on, but that doesn't seem very pythonic.
It's a feature, because you are setting the ticklabels yourself (with the wrong labels). It's best to avoid setting the ticklabels manually unless there is no other way.
If you remove this line, the labels will show up correctly:
c.set_ticklabels(np.arange(11))
To improve readability you could also consider normalizing the colors so they become discrete and match specific integer values. But this only works well if the total amount of colors is limited, like in this example.
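import matplotlib as mpl  # needed for mpl.colors below
fig, ax = plt.subplots()
cmap = plt.cm.jet
# bin edges at 0.5, 1.5, ..., 9.5 so that each integer 1..9 falls in its own bin
bounds = np.arange(0.5,10.5,1)
norm = mpl.colors.BoundaryNorm(bounds, cmap.N)
m = ax.pcolormesh(a, cmap=cmap, norm=norm)
c = plt.colorbar(m, ticks=bounds-0.5)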
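If custom label text really is needed, one possible workaround (not part of the answer above) is to filter the candidate ticks to the data range first and build the labels from that same filtered array, so positions and labels cannot drift apart. The names candidate_ticks, visible and labels, and the "level N" label text, are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

a = np.arange(1, 10).reshape(3, 3)
fig, ax = plt.subplots()
m = ax.pcolormesh(a)
c = fig.colorbar(m)

# All ticks we might want, before knowing the data range.
candidate_ticks = np.arange(11)

# Keep only the ticks that fall inside the data range.
visible = candidate_ticks[(candidate_ticks >= a.min()) & (candidate_ticks <= a.max())]

# Derive labels from the same array that defines the positions.
labels = ["level %d" % t for t in visible]
c.set_ticks(visible)
c.set_ticklabels(labels)
```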

Change distance between boxplots in the same figure in python [duplicate]

I'm drawing the boxplot shown below using python and matplotlib. Is there any way I can reduce the distance between the two boxplots on the X axis?
This is the code that I'm using to get the figure above:
import os
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import rcParams
rcParams['ytick.direction'] = 'out'
rcParams['xtick.direction'] = 'out'
fig = plt.figure()
xlabels = ["CG", "EG"]
ax = fig.add_subplot(111)
ax.boxplot([values_cg, values_eg])
ax.set_xticks(np.arange(len(xlabels))+1)
ax.set_xticklabels(xlabels, rotation=45, ha='right')
fig.subplots_adjust(bottom=0.3)
ylabels = yticks = np.linspace(0, 20, 5)
ax.set_yticks(yticks)
ax.set_yticklabels(ylabels)
ax.tick_params(axis='x', pad=10)
ax.tick_params(axis='y', pad=10)
plt.savefig(os.path.join(output_dir, "output.pdf"))
And this is an example closer to what I'd like to get visually (although I wouldn't mind if the boxplots were even a bit closer to each other):
You can either change the aspect ratio of plot or use the widths kwarg (doc) as such:
ax.boxplot([values_cg, values_eg], widths=1)
to make the boxes wider.
Try changing the aspect ratio using
ax.set_aspect(1.5) # or some other float
The larger the number, the narrower (and taller) the plot should be:
a circle will be stretched such that the height is num times the width. aspect=1 is the same as aspect=’equal’.
http://matplotlib.org/api/axes_api.html#matplotlib.axes.Axes.set_aspect
When your code writes:
ax.set_xticks(np.arange(len(xlabels))+1)
This places the ticks at 1 and 2, which is where the two box plots are drawn by default (even though you change the tick labels afterwards), just as in the second, "wanted" example, where the boxes sit at 1, 2, 3.
So I think an alternative solution would be to play with the xticks positions and the xlim of the plot.
For example, using
ax.set_xlim(-0.5, 3.5)
would make them appear closer together, since the same distance between the boxes then takes up a smaller share of the axis.
positions : array-like, optional
Sets the positions of the boxes. The ticks and limits are automatically set to match the positions. Defaults to range(1, N+1) where N is the number of boxes to be drawn.
https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.boxplot.html
This should do the job!
As @Stevie mentioned, you can use the positions kwarg (doc) to manually set the x-coordinates of the boxes:
ax.boxplot([values_cg, values_eg], positions=[1, 1.3])
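A self-contained sketch combining positions with the widths kwarg and explicit ticks might look like this. The random samples stand in for values_cg and values_eg, which the question does not show, and the specific widths and limits are guesses to be tuned.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

# Placeholder samples; the question's real data is not shown.
rng = np.random.default_rng(0)
values_cg = rng.normal(10, 2, 50)
values_eg = rng.normal(12, 3, 50)

positions = [1, 1.3]
fig, ax = plt.subplots()

# Narrow boxes so they do not overlap at the reduced spacing.
ax.boxplot([values_cg, values_eg], positions=positions, widths=0.2)

# Set tick locations and labels together, matching the box positions.
ax.set_xticks(positions)
ax.set_xticklabels(["CG", "EG"], rotation=45, ha="right")

# Leave some room on both sides of the pair.
ax.set_xlim(0.7, 1.6)
```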
