Add a string to an x-axis of integers - python

I'm plotting solution concentration (x) against efficiency (y). I have this set up to display x between 0 and 100, but I want to add another datapoint as a control, without any solution at all. I'm having issues because this doesn't really fit anywhere on the concentration axis, but I'd like to add it either before 0 or after 100, potentially with a break in the axis to separate them. So my x-axis would look like ['control', 0, 20, 40, 60, 80, 100]
MWE:
x_array = ['control', 0, 20, 40, 50, 100]
y_array = [1, 2, 3, 4, 5, 6]
plt.plot(x_array, y_array)
Trying this, I get an error of:
ValueError: could not convert string to float: 'control'
Any ideas how I could make something like this work? I've looked at xticks, but that would plot the x-axis as strings and lose the continuity of the axis, which would mess up the plot because the datapoints are not equally spaced.

You can add a single point to your graph as a separate call to plot, then adjust the x-axis labels.
import matplotlib.pyplot as plt
x_array = [0, 20, 40, 50, 100]
y_array = [2, 3, 4, 5, 6]
x_con = -20
y_con = 1
x_ticks = [-20, 0, 20, 40, 60, 80, 100]
x_labels = ['control', 0, 20, 40, 60, 80, 100]
fig, ax = plt.subplots(1,1)
ax.plot(x_array, y_array)
ax.plot(x_con, y_con, 'ro') # add a single red dot
# set tick positions, adjust label text
ax.xaxis.set_ticks(x_ticks)
ax.xaxis.set_ticklabels(x_labels)
ax.set_xlim(x_con-10, max(x_array)+3)
ax.set_ylim(0,7)
plt.show()
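
The question also mentions a possible break in the axis, which the code above does not draw. One way to approximate it, sketched here under the assumption that two side-by-side axes sharing a y-axis are acceptable, is to give the control its own narrow panel. The names ax_ctrl and ax_main and the width ratio are illustrative choices, not part of the original answer.

```python
import matplotlib.pyplot as plt

x_array = [0, 20, 40, 50, 100]
y_array = [2, 3, 4, 5, 6]
y_con = 1

# Narrow left panel for the control, wide right panel for the concentrations.
fig, (ax_ctrl, ax_main) = plt.subplots(
    1, 2, sharey=True, gridspec_kw={"width_ratios": [1, 6], "wspace": 0.05}
)

# The control sits at an arbitrary position (0) on its own small axis.
ax_ctrl.plot([0], [y_con], "ro")
ax_ctrl.set_xticks([0])
ax_ctrl.set_xticklabels(["control"])
ax_ctrl.set_xlim(-1, 1)

# The numeric data keeps a continuous, correctly spaced axis.
ax_main.plot(x_array, y_array)
ax_main.set_xlim(-5, 105)
ax_main.set_ylim(0, 7)

# Hide the facing spines so the gap between the panels reads as a break.
ax_ctrl.spines["right"].set_visible(False)
ax_main.spines["left"].set_visible(False)
ax_main.tick_params(left=False)
```

Call plt.show() or fig.savefig(...) afterwards to view the result.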

Related

How to draw a line that crosses all points by using a list of point indices?

I have the following lists that correspond to 6 (clients) points (each one having an id, and x and y coordinates)
all_ids = [0, 1, 2, 3, 4, 5]
all_x = [50, 25, 43, 80, 25, 18]
all_y = [50, 54, 96, 50, 90, 47]
For example, the 1st point has an id of 0, its x coordinates are 50, and its y coordinates are 50, and so on.
I am trying to solve a traveling salesman problem, so I want to plot the points of a specific route between points, connected with a line that will represent the closed route.
The final route I want to plot is represented by the following list:
final_route = [0, 4, 5, 2, 1, 3, 0]
and represents a path between clients' ids
So far I have only managed to plot the points, with the following code:
fig, ax = plt.subplots()
fig.set_size_inches(6, 6)
ax.plot(all_x, all_y, ls="", marker="o", markersize=8)
for xi, yi, pidi in zip(all_x, all_y, all_ids):
    ax.annotate(str(pidi), xy=(xi,yi))
plt.xlim([0, 100])
plt.ylim([0, 100])
plt.show()
Which produces the following plot:
[Figure: plot of the points]
Any ideas about how to plot the line between the points? Thanks
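
No answer is included for this question. A minimal sketch of one approach, assuming each id equals the point's position in the coordinate lists (as it does in the data above), is to look up the coordinates in route order and pass them to a single plot call. The names route_x and route_y are illustrative.

```python
import matplotlib.pyplot as plt

all_ids = [0, 1, 2, 3, 4, 5]
all_x = [50, 25, 43, 80, 25, 18]
all_y = [50, 54, 96, 50, 90, 47]
final_route = [0, 4, 5, 2, 1, 3, 0]

# Reorder the coordinates to follow the route (assumes id == list index).
route_x = [all_x[i] for i in final_route]
route_y = [all_y[i] for i in final_route]

fig, ax = plt.subplots(figsize=(6, 6))
# A solid line with markers draws both the points and the closed route.
ax.plot(route_x, route_y, ls="-", marker="o", markersize=8)
for xi, yi, pid in zip(all_x, all_y, all_ids):
    ax.annotate(str(pid), xy=(xi, yi))
ax.set_xlim(0, 100)
ax.set_ylim(0, 100)
```

Because the route starts and ends at id 0, the line closes on itself without any extra step.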

How to set the hue range for a numeric variable using a colored bubble plot in seaborn, python?

I'm trying to use seaborn to create a colored bubbleplot of 3-D points (x,y,z), each coordinate being an integer in range [0,255]. I want the axes to represent x and y, and the hue color and size of the scatter bubbles to represent the z-coordinate.
The code:
import seaborn
seaborn.set()
import pandas
import matplotlib.pyplot
x = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200]
y = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200]
z = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200]
df = pandas.DataFrame(list(zip(x, y, z)), columns =['x', 'y', 'z'])
ax = seaborn.scatterplot(x="x", y="y",
                         hue="z",
                         data=df)
matplotlib.pyplot.xlim(0,255)
matplotlib.pyplot.ylim(0,255)
matplotlib.pyplot.show()
gets me pretty much what I want:
This however makes the hue range be based on the data in z. I instead want to set the range according to the range of the min and max z values (as 0,255), and then let the color of the actual points map onto that range accordingly (so if a point has z-value 50, then that should be mapped onto the color represented by the value 50 in the range [0,255]).
My summarized question:
How to manually set the hue color range of a numerical variable in a scatterplot using seaborn?
I've looked thoroughly online on many tutorials and forums, but have not found an answer. I'm not sure I've used the right terminology. I hope my message got across.
Following @JohanC's suggestion of using hue_norm was the solution. I first tried doing so by removing the [hue=] parameter and only using the [hue_norm=] parameter, which didn't produce any colors at all (which makes sense).
Naturally one should use both the [hue=] and the [hue_norm=] parameters.
import seaborn
seaborn.set()
import pandas
import matplotlib.pyplot
x = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200]
y = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200]
z = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 255]
df = pandas.DataFrame(list(zip(x, y, z)), columns=['x', 'y', 'z'])
ax = seaborn.scatterplot(x="x", y="y",
                         hue="z",
                         hue_norm=(0,255), # <------- the solution
                         data=df)
matplotlib.pyplot.xlim(0,255)
matplotlib.pyplot.ylim(0,255)
matplotlib.pyplot.show()
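
To see what a fixed range does to the colors, here is a small sketch that uses matplotlib's Normalize directly. It assumes the tuple passed to hue_norm acts as linear limits in data units, which is how I read the seaborn documentation; the names norm and fractions are illustrative.

```python
from matplotlib.colors import Normalize

# Fixed limits: 0 maps to the start of the colormap, 255 to the end.
norm = Normalize(vmin=0, vmax=255)

z = [0, 50, 100, 255]
# Position of each z value along the colormap, as a fraction in [0, 1].
fractions = [float(norm(v)) for v in z]
```

With these limits, a point with z = 50 always lands at 50/255 of the way along the colormap, regardless of which other z values happen to be in the data.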

How do I include negative numbers in xticks, using matplotlib.pyplot?

I have a DataFrame that looks something like this:
df = [4, -1, 5, -32, 4, -32, -1]
I want to set the xticks like this:
tick_locs = [-30, -10, 0, 10, 30, 100, 300, 1000, 3000]
plt.xticks(tick_locs, tick_locs)
That gives me a weird graph:
I can set the ticks to all positive, but that won't give me negative numbers on the x-axis:
tick_locs = [10, 30, 100, 300, 1000, 3000]
plt.xticks(tick_locs, tick_locs)
Any idea how to get the negative tick marks?
P.S. The data is set up as logged, but the x-axis is set to show the actual numbers:
bin_edges = 10 ** np.arange(-0.1, np.log10(planes_df['ArrDelay'].max())+0.1, 0.1)
plt.hist(planes_df['ArrDelay'], bins = bin_edges)
plt.xscale('log')
tick_locs = [10, 30, 100, 300, 1000, 3000]
plt.xticks(tick_locs, tick_locs)
Try removing the line plt.xscale('log'). This will make the x-axis scale linear. A logarithmic axis cannot display non-positive values, as log(x) is undefined for x <= 0.
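
A minimal sketch of the linear-axis version, using the small sample list from the question rather than the real ArrDelay column. The bin edges are a hypothetical choice that simply covers the negative values.

```python
import numpy as np
import matplotlib.pyplot as plt

delays = [4, -1, 5, -32, 4, -32, -1]   # sample values from the question
tick_locs = [-30, -10, 0, 10, 30]

# Hypothetical linear bins wide enough to include the negative delays.
bin_edges = np.arange(-40, 41, 10)

fig, ax = plt.subplots()
counts, edges, _ = ax.hist(delays, bins=bin_edges)

# No xscale('log') call: the axis stays linear, so negative ticks are valid.
ax.set_xticks(tick_locs)
```

Every sample value falls inside the bins here, so none are silently dropped the way non-positive values are on a log axis.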

coloring matplotlib scatterplot by third variable with log color bar

I'm trying to make a scatterplot of two arrays/lists, one of which is the x coordinate and the other the y. I'm not having any trouble with that. However, I need to color-code these points based on their values at a specific point in time, based on data which I have in a 2d array. Also, this 2d array of data has a very large spread, so I'd like to color the points logarithmically (I'm not sure if this means just change the color bar labels or if there's a more fundamental difference.)
Here is my code so far:
import numpy as np
import matplotlib.pyplot as plt
fig = plt.figure(1)
time = #I'd like to specify time here.
x = [1, 2, 3, 4, 5]
y = [5, 4, 3, 2, 1]
multi_array = [[1, 1, 10, 100, 1000], [10000, 1000, 100, 10, 1], [300, 400, 5000, 12, 47]]
for counter in np.arange(0, 5):
    t = multi_array[time, counter] #I tried this, and it did not work.
    s = plt.scatter(x[counter], y[counter], c = t, marker = 's')
plt.show()
I followed the advice I saw elsewhere to color by a third variable, which was to set the color equal to that variable, but then when I tried that with my data set, I just got all the points as one color, and then when I try it with this mockup it gives me the following error:
TypeError: list indices must be integers, not tuple
Could someone please help me color my points the way I need to?
If I understand the question (which I'm not at all sure of), here is the answer:
import numpy as np
import matplotlib.pyplot as plt
# %matplotlib inline  # uncomment in a Jupyter notebook only
fig = plt.figure(1)
time = 2 #I'd like to specify time here.
x = [1, 2, 3, 4, 5]
y = [5, 4, 3, 2, 1]
multi_array = np.asarray([[1, 1, 10, 100, 1000], [10000, 1000, 100, 10, 1], [300, 400, 5000, 12, 47]])
log_array=np.log10(multi_array)
s = plt.scatter(x, y, c=log_array[time], marker = 's',s=100)
cb = plt.colorbar(s)
cb.set_label('log of ...')
plt.show()
After some tinkering, and using information learned from user4421975's answer and the link in the comments, I've puzzled it out. In short, I used plt.scatter's norm parameter to make the color mapping logarithmic.
import numpy as np
import matplotlib.colors
import matplotlib.pyplot as plt
fig = plt.figure(1)
time = 2
x = [1, 2, 3, 4, 5]
y = [5, 4, 3, 2, 1]
multi_array = np.asarray([[1, 1, 10, 100, 1000], [10000, 1000, 100, 10, 1], [300, 400, 5000, 12, 47]])
# one shared log normalization for every point in the chosen row
norm = matplotlib.colors.LogNorm(vmin=multi_array[time].min(), vmax=multi_array[time].max())
for counter in np.arange(0, 5):
    s = plt.scatter(x[counter], y[counter], c = multi_array[time, counter], cmap = 'winter', norm = norm, marker = 's')
cb = plt.colorbar(s)
cb.set_label('Log of Data')
plt.show()
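
The question asks whether log coloring is just a change of colorbar labels or something more fundamental. The sketch below compares the two answers numerically. It assumes the row time = 2 from the example data, and the names via_lognorm and via_log10 are illustrative.

```python
import numpy as np
from matplotlib.colors import LogNorm, Normalize

values = np.array([300, 400, 5000, 12, 47], dtype=float)

# Approach 1: keep the raw values and let LogNorm do the scaling.
log_norm = LogNorm(vmin=values.min(), vmax=values.max())
via_lognorm = np.ma.filled(log_norm(values), np.nan)

# Approach 2: take log10 of the data first, then scale linearly.
logged = np.log10(values)
lin_norm = Normalize(vmin=logged.min(), vmax=logged.max())
via_log10 = np.ma.filled(lin_norm(logged), np.nan)
```

Both approaches place each point at the same position along the colormap. The practical difference is in the colorbar: with LogNorm the ticks show the original values on a log scale, while with pre-logged data they show the log10 values.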

Contourplot with 2 different step sizes in matplotlib

This question has probably a totally simple solution but I just can't find it. I'd like to plot a contourf plot where the one part of my data varies in steps of order 1 and the other part varies with steps of order 100.
Now I tried to just give contour levels like this:
contour_levels = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100, 200, 300, 400]
However, this leads to the first 11 levels all having the same color, as matplotlib is somehow normalizing this to the maximum value. How can I make every level equally important in terms of my color map?
Thanks a lot HYRY, your answer solved my problem. This is what the plots look like before and after the implementation (I adjusted the levels a bit; data from the GOZCARDS team/NASA):
Use colors argument:
import pylab as pl
import numpy as np
x, y = np.mgrid[-1:1:100j, 0:1:100j]
z = ... # your function
contour_levels = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100, 200, 300, 400]
cmap = pl.cm.BuPu
colors = cmap(np.linspace(0, 1, len(contour_levels)))
pl.contour(x, y, z, levels=contour_levels, colors=colors)
I am a little wary of HYRY's solution, as the mapping between colors and levels can become arbitrary. I would suggest using LogNorm instead, which maps your values to colors on a log scale.
import pylab as pl
import numpy as np
import matplotlib.colors
x, y = np.mgrid[-1:1:100j, 0:1:100j]
z = ... # your function
contour_levels = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100, 200, 300, 400]
cmap = pl.cm.BuPu
# pass a LogNorm instance (not the class itself) along with the colormap
pl.contourf(x, y, z, levels=contour_levels, cmap=cmap, norm=matplotlib.colors.LogNorm())
If you also use vmin and vmax you can explicitly control the limits of the normalization and ensure that the color scales match between graphs independent of what levels you use.
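
A minimal sketch of the vmin/vmax idea, under some loud assumptions: z here is made-up data spanning 1 to 400 (the real function is not shown above), and the levels start at 1 rather than 0 so that everything stays positive for the log scale.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import LogNorm

x, y = np.mgrid[-1:1:100j, 0:1:100j]
# Hypothetical stand-in data covering 1..400; replace with the real field.
z = 1 + 399 * y ** 4

contour_levels = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100, 200, 300, 400]

# Explicit limits make the value-to-color mapping independent of the levels.
norm = LogNorm(vmin=1, vmax=400)

fig, ax = plt.subplots()
cs = ax.contourf(x, y, z, levels=contour_levels, norm=norm, cmap="BuPu")
```

Reusing the same norm object (or the same vmin and vmax) for a second plot keeps the colors comparable even if that plot uses a different set of levels.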
