I have a DataFrame that looks something like this:
df = [4, -1, 5, -32, 4, -32, -1]
I want to set the xticks like this:
tick_locs = [-30, -10, 0, 10, 30, 100, 300, 1000, 3000]
plt.xticks(tick_locs, tick_locs)
That gives me a weird graph:
I can set the ticks to all positive, but that won't give me negative numbers on the x-axis:
tick_locs = [10, 30, 100, 300, 1000, 3000]
plt.xticks(tick_locs, tick_locs)
Any idea how to get the negative tick marks?
P.S. The data is set up as logged, but the x-axis is set to show the actual numbers:
bin_edges = 10 ** np.arange(-0.1, np.log10(planes_df['ArrDelay'].max())+0.1, 0.1)
plt.hist(planes_df['ArrDelay'], bins = bin_edges)
plt.xscale('log')
tick_locs = [10, 30, 100, 300, 1000, 3000]
plt.xticks(tick_locs, tick_locs)
Try removing the line plt.xscale('log'). This will make the x-axis scale linear. A logarithmic axis cannot display non-positive values, as log(x) is undefined for x <= 0.
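To see why the negative ticks cannot appear, here is a minimal sketch in plain Python. The helper name `log_axis_position` is made up for illustration and is not part of matplotlib; it only mimics the log10 transform that a log-scaled axis has to apply to every tick location.

```python
import math

def log_axis_position(value):
    """Return log10(value), or None when the value cannot be placed on a log axis."""
    if value <= 0:
        # log10 is undefined for zero and for negative numbers
        return None
    return math.log10(value)

tick_locs = [-30, -10, 0, 10, 30, 100, 300, 1000, 3000]
positions = {tick: log_axis_position(tick) for tick in tick_locs}

# Ticks that have no position on a log axis
unplottable = [tick for tick, pos in positions.items() if pos is None]
print(unplottable)
```

Every tick that ends up in `unplottable` is one a log axis has no place for, which is why only the positive ticks work while plt.xscale('log') is active.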
I'm trying to use seaborn to create a colored bubbleplot of 3-D points (x,y,z), each coordinate being an integer in range [0,255]. I want the axes to represent x and y, and the hue color and size of the scatter bubbles to represent the z-coordinate.
The code:
import seaborn
seaborn.set()
import pandas
import matplotlib.pyplot
x = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200]
y = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200]
z = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200]
df = pandas.DataFrame(list(zip(x, y, z)), columns =['x', 'y', 'z'])
ax = seaborn.scatterplot(x="x", y="y",
hue="z",
data=df)
matplotlib.pyplot.xlim(0,255)
matplotlib.pyplot.ylim(0,255)
matplotlib.pyplot.show()
gets me pretty much what I want:
This however makes the hue range be based on the data in z. I instead want to set the range according to the range of the min and max z values (as 0,255), and then let the color of the actual points map onto that range accordingly (so if a point has z-value 50, then that should be mapped onto the color represented by the value 50 in the range [0,255]).
My summarized question:
How to manually set the hue color range of a numerical variable in a scatterplot using seaborn?
I've looked thoroughly online on many tutorials and forums, but have not found an answer. I'm not sure I've used the right terminology. I hope my message got across.
Following @JohanC's suggestion, using hue_norm was the solution. I first tried removing the [hue=] parameter and using only the [hue_norm=] parameter, which didn't produce any colors at all (which makes sense).
Naturally one should use both the [hue=] and the [hue_norm=] parameters.
import seaborn
seaborn.set()
import pandas
import matplotlib.pyplot
x = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200]
y = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200]
z = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 255]
df = pandas.DataFrame(list(zip(x, y, z)), columns =['x', 'y', 'z'])
ax = seaborn.scatterplot(x="x", y="y",
hue="z",
hue_norm=(0,255), # <------- the solution
data=df)
matplotlib.pyplot.xlim(0,255)
matplotlib.pyplot.ylim(0,255)
matplotlib.pyplot.show()
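To illustrate what the fixed range changes, here is a rough sketch in plain Python. It assumes the (0, 255) tuple is used as the (vmin, vmax) of a linear normalization onto [0, 1]; `linear_norm` is a hypothetical helper written for this example, not a seaborn function.

```python
def linear_norm(value, vmin, vmax):
    """Map value linearly from [vmin, vmax] onto [0, 1]."""
    return (value - vmin) / (vmax - vmin)

z = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200]

# Range taken from the data: the largest z (200) lands at the top of the colormap
data_scaled = [linear_norm(v, min(z), max(z)) for v in z]

# Fixed range, as with hue_norm=(0, 255): 200 no longer reaches the top color
fixed_scaled = [linear_norm(v, 0, 255) for v in z]

print(data_scaled[-1], fixed_scaled[-1])
```

With the fixed range, a point with z = 50 always sits at 50/255 of the way along the colormap, no matter what other z values are in the data.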
I have multiple polygons, and for each one I want to change the color and width of a specific edge. Initializing the polygons works fine (Fig. 1), but when I try to change the color and width of one edge of each polygon in the for-loop (Fig. 2), it raises this error:
File "C:\Users\Initi__BC_1024_E2.py", line 41, in <module>
vertex[4,i,0,:] = one_coord[j][-1][:]
IndexError: index 4 is out of bounds for axis 0 with size 4
Fig 1. Preliminary polygon(Input)
Fig 2. Final Polygon (output)
import numpy as np
import matplotlib.pyplot as plt
pixels = 600
my_dpi = 100
num_geo=4
one_coord = np.array([[[-150, -200], [300, -200], [300, 0], [150, 200], [-150, 200]],
[[-300, -200], [200, -300], [200, -50], [200, 300], [-150, 200]],
[[-140, -230], [350, -260], [350, 0], [140, 200], [-180, 220]],
[[-180, -240], [370, -270], [370, 0], [170, 200], [-190, 230]]])
for i in range(4):
geo =one_coord[i, :, :]
print(one_coord[i])
fig = plt.figure(num_geo, figsize=(pixels/my_dpi, pixels/my_dpi),
facecolor='k', dpi=my_dpi)
plt.axes([0,0,1,1])
rectangle = plt.Rectangle((-300, -300), 600, 600, fc='k')
plt.gca().add_patch(rectangle)
polygon = plt.Polygon(one_coord[i],color='w')
plt.gca().add_patch(polygon)
plt.axis('off')
plt.axis([-300,300,-300,300])
plt.close()
vertex_number = 5
vertex = np.zeros((4,vertex_number,2, 2))
for j in range(num_geo):
one_coord[j]
for k in range(vertex_number-1): #range(4), (0,1,2,3)
vertex[j] = one_coord[j][k:k+2] #(0:2) to (3:5)
vertex[j,4,0,:] = one_coord[j][-1][:]
vertex[j,4,1,:] = one_coord[j][0][:]
plt.plot( vertex[j,:,0], vertex[j,:,1], linewidth=5, color='r')
plt.savefig('figureBc/%d.jpg' % i, dpi=my_dpi)
plt.close()
The line
vertex = np.zeros((4, vertex_number, 2, 2))
creates a NumPy array with shape (4, vertex_number, 2, 2). Since Python indexes start at zero, the zeroth axis (axis 0 in the error traceback) only has indexes 0, 1, 2 and 3, so
vertex[4,i,0,:]
is trying to access index 4 on the first axis, which does not exist. On every axis, a non-negative index must be strictly less than the size of that axis.
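A small sketch of the same point, using the shapes from the question. The `edges` part at the end is only one possible way to collect the start and end point of every edge of a closed polygon (it uses np.roll to pair each vertex with the next one); it is an illustration, not the asker's original approach.

```python
import numpy as np

vertex_number = 5
vertex = np.zeros((4, vertex_number, 2, 2))

# Axis 0 has size 4, so only indices 0..3 are valid there
print(vertex.shape[0])

try:
    vertex[4, 0, 0, :] = [1, 2]
    raised = False
except IndexError:
    raised = True

# Axis 1 has size 5, so index 4 is fine on that axis
vertex[0, 4, 0, :] = [-150, 200]

# Pair every vertex with the next one; the last vertex wraps around to the first
coords = np.array([[-150, -200], [300, -200], [300, 0], [150, 200], [-150, 200]])
edges = np.stack([coords, np.roll(coords, -1, axis=0)], axis=1)  # shape (5, 2, 2)
print(edges[-1])
```

Here `edges[k]` holds the two end points of edge k, so a single edge can be drawn with something like plt.plot(edges[k][:, 0], edges[k][:, 1], linewidth=5, color='r').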
I'm plotting efficiency (y) against solution concentration (x). I have this set up to display x between 0 and 100, but I want to add another datapoint as a control, without any solution at all. I'm having issues because this doesn't really fit anywhere on the concentration axis, but I'd like to add it either before 0 or after 100, potentially with a break in the axis to separate them. So my x-axis would look like ['control', 0, 20, 40, 60, 80, 100]
MWE:
x_array = ['control', 0, 20, 40, 50, 100]
y_array = [1, 2, 3, 4, 5, 6]
plt.plot(x_array, y_array)
Trying this, I get an error of:
ValueError: could not convert string to float: 'control'
Any ideas how I could make something like this work? I've looked at xticks, but that would plot the x-axis as strings and lose the continuity of the axis, which would mess up the plot because the datapoints are not equally spaced.
You can add a single point to your graph as a separate call to plot, then adjust the x-axis labels.
import matplotlib.pyplot as plt
x_array = [0, 20, 40, 50, 100]
y_array = [2, 3, 4, 5, 6]
x_con = -20
y_con = 1
x_ticks = [-20, 0, 20, 40, 60, 80, 100]
x_labels = ['control', 0, 20, 40, 60, 80, 100]
fig, ax = plt.subplots(1,1)
ax.plot(x_array, y_array)
ax.plot(x_con, y_con, 'ro') # add a single red dot
# set tick positions, adjust label text
ax.xaxis.set_ticks(x_ticks)
ax.xaxis.set_ticklabels(x_labels)
ax.set_xlim(x_con-10, max(x_array)+3)
ax.set_ylim(0,7)
plt.show()
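If the tick values change often, the two lists can be built instead of typed by hand. This is only a sketch: `ticks_with_control` and its `gap` parameter are names invented for this example, and the 20-unit gap simply matches the spacing used above.

```python
def ticks_with_control(numeric_ticks, control_label="control", gap=20):
    """Return (positions, labels) with one extra tick placed `gap` units left of the smallest tick."""
    control_pos = min(numeric_ticks) - gap
    positions = [control_pos] + list(numeric_ticks)
    labels = [control_label] + [str(t) for t in numeric_ticks]
    return positions, labels

positions, labels = ticks_with_control([0, 20, 40, 60, 80, 100])
print(positions)
print(labels)
```

The results can then be passed to ax.xaxis.set_ticks(positions) and ax.xaxis.set_ticklabels(labels), with the control point plotted at positions[0].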
So I've got some data which I wish to plot via a frequency density (unequal class width) histogram, and via some searching online, I've created this to allow me to do this.
import numpy as np
import matplotlib.pyplot as plt
plt.xkcd()
freqs = np.array([3221, 1890, 866, 529, 434, 494, 382, 92, 32, 7, 7])
bins = np.array([0, 5, 10, 15, 20, 30, 50, 100, 200, 500, 1000, 1500])
widths = bins[1:] - bins[:-1]
heights = freqs.astype(float)/widths  # np.float was removed in NumPy 1.24; the builtin float does the same job
plt.xlabel('Cost in Pounds')
plt.ylabel('Frequency Density')
plt.fill_between(bins.repeat(2)[1:-1], heights.repeat(2), facecolor='steelblue')
plt.show()
As you can see, however, the data stretches into the thousands on the x-axis, and on the y-axis the density goes from tiny values (<1) to very large ones (>100). To solve this I will need to break both axes. The closest help I've found so far is this, which I've found hard to use. Would you be able to help?
Thanks, Aj.
You could just use a bar plot and set the xtick labels to represent the bin ranges.
With a logarithmic y scale:
import numpy as np
import matplotlib.pyplot as plt
plt.xkcd()
fig, ax = plt.subplots()
freqs = np.array([3221, 1890, 866, 529, 434, 494, 382, 92, 32, 7, 7])
freqs = np.log10(freqs)
bins = np.array([0, 5, 10, 15, 20, 30, 50, 100, 200, 500, 1000, 1500])
width = 0.35
ind = np.arange(len(freqs))
rects1 = ax.bar(ind, freqs, width)
plt.xlabel('Cost in Pounds')
plt.ylabel('Frequency Density')
tick_labels = [ '{0} - {1}'.format(*bin) for bin in zip(bins[:-1], bins[1:])]
ax.set_xticks(ind+width)
ax.set_xticklabels(tick_labels)
fig.autofmt_xdate()
plt.show()
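Note that the bars above show log10 of the raw frequencies. If the frequency density from the question is what should be plotted, it can be computed first. A minimal sketch with the same `freqs` and `bins`; the variable names `density` and `log_density` are mine.

```python
import numpy as np

freqs = np.array([3221, 1890, 866, 529, 434, 494, 382, 92, 32, 7, 7])
bins = np.array([0, 5, 10, 15, 20, 30, 50, 100, 200, 500, 1000, 1500])

widths = np.diff(bins)             # class widths, one per bin
density = freqs / widths           # frequency density = frequency / class width
log_density = np.log10(density)    # negative wherever the density is below 1

print(density[0], density[-1])
```

Because some densities are below 1, their log10 values are negative and the bars would point downward. Plotting `density` directly and calling ax.set_yscale('log') may be the more readable option in that case.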
This question has probably a totally simple solution but I just can't find it. I'd like to plot a contourf plot where the one part of my data varies in steps of order 1 and the other part varies with steps of order 100.
Now I tried to just give contour levels like this:
contour_levels = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100, 200, 300, 400]
However, this leads to the first 11 levels all having the same color, as matplotlib is somehow normalizing this to the maximum value. How can I make every level equally important in terms of my color map?
Thanks a lot HYRY, your answer solved my problem. This is what the plots look like before and after the implementation (I adjusted the levels a bit; data from the GOZCARDS team/NASA):
Use the colors argument:
import pylab as pl
import numpy as np
x, y = np.mgrid[-1:1:100j, 0:1:100j]
z = ... # your function
contour_levels = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100, 200, 300, 400]
cmap = pl.cm.BuPu
colors = cmap(np.linspace(0, 1, len(contour_levels)))
pl.contour(x, y, z, levels=contour_levels, colors=colors)
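A quick numeric check of why this works, as a sketch with NumPy only. It compares the evenly spaced colormap positions from np.linspace with the positions the same levels would get from a plain linear scaling to the maximum level (`linear` is just an illustrative name).

```python
import numpy as np

contour_levels = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100, 200, 300, 400]

# Evenly spaced colormap positions, one per level
positions = np.linspace(0, 1, len(contour_levels))

# Positions the same levels would get from a linear scaling to the maximum
linear = np.array(contour_levels) / max(contour_levels)

print(positions[:3])
print(linear[:3])
```

With linear scaling, levels 0 to 10 are squeezed into the lowest 2.5% of the colormap, so they look identical. With one evenly spaced color per level, they cover roughly the first 70% of it.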
I am a little wary of HYRY's solution, as the mapping between colors and levels can become arbitrary. I would suggest using LogNorm instead, which maps your values to colors on a logarithmic scale.
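import pylab as pl
import numpy as np
import matplotlib.colors
x, y = np.mgrid[-1:1:100j, 0:1:100j]
z = ... # your function
# LogNorm needs positive values, so the 0 level may have to be replaced by a small positive number
contour_levels = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100, 200, 300, 400]
cmap = pl.cm.BuPu
# pass a LogNorm instance, not the class itself
pl.contourf(x, y, z, levels=contour_levels, cmap=cmap, norm=matplotlib.colors.LogNorm())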
If you also use vmin and vmax you can explicitly control the limits of the normalization and ensure that the color scales match between graphs independent of what levels you use.
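For intuition, here is a plain-Python sketch of the arithmetic behind a log normalization compared with a linear one. `log_norm` and `linear_norm` are illustrative helpers, not matplotlib functions, and vmin = 1 and vmax = 400 are assumed limits picked to match the levels above.

```python
import math

def log_norm(value, vmin, vmax):
    """Map value from [vmin, vmax] onto [0, 1] on a log scale (all inputs must be > 0)."""
    return (math.log10(value) - math.log10(vmin)) / (math.log10(vmax) - math.log10(vmin))

def linear_norm(value, vmin, vmax):
    """Map value linearly from [vmin, vmax] onto [0, 1]."""
    return (value - vmin) / (vmax - vmin)

vmin, vmax = 1, 400
for level in (1, 10, 100, 400):
    print(level, round(linear_norm(level, vmin, vmax), 3), round(log_norm(level, vmin, vmax), 3))
```

On the log scale, the levels between 1 and 10 get a sizeable share of the colormap instead of being crowded at the bottom. Fixing vmin and vmax is what keeps that share the same from one plot to the next.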