How to plot a scatter plot over a line plot? [duplicate] - python

Does anybody have a suggestion on what's the best way to present overlapping lines on a plot? I have a lot of them, and I had the idea of having full lines of different colors where they don't overlap, and having dashed lines where they do overlap so that all colors are visible and overlapping colors are seen.
But how do I do that?

I have the same issue on a plot with a high degree of discretization.
Here is the starting situation:
import matplotlib.pyplot as plt
grid=[x for x in range(10)]
graphs=[
[1,1,1,4,4,4,3,5,6,0],
[1,1,1,5,5,5,3,5,6,0],
[1,1,1,0,0,3,3,2,4,0],
[1,2,4,4,3,2,3,2,4,0],
[1,2,3,3,4,4,3,2,6,0],
[1,1,3,3,0,3,3,5,4,3],
]
for gg,graph in enumerate(graphs):
    plt.plot(grid,graph,label='g'+str(gg))
plt.legend(loc=3,bbox_to_anchor=(1,0))
plt.show()
No one can say exactly where the green and blue lines run.
And here is my "solution":
import matplotlib.pyplot as plt
grid=[x for x in range(10)]
graphs=[
[1,1,1,4,4,4,3,5,6,0],
[1,1,1,5,5,5,3,5,6,0],
[1,1,1,0,0,3,3,2,4,0],
[1,2,4,4,3,2,3,2,4,0],
[1,2,3,3,4,4,3,2,6,0],
[1,1,3,3,0,3,3,5,4,3],
]
for gg,graph in enumerate(graphs):
    lw=10-8*gg/len(graphs)
    ls=['-','--','-.',':'][gg%4]
    plt.plot(grid,graph,label='g'+str(gg), linestyle=ls, linewidth=lw)
plt.legend(loc=3,bbox_to_anchor=(1,0))
plt.show()
I am grateful for suggestions on improvement!

Just decrease the opacity of the lines so that they are see-through. You can do that with the alpha parameter. Example:
plt.plot(x, y, alpha=0.7)
alpha ranges from 0 to 1, where 0 is fully transparent (invisible) and 1 is fully opaque.
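As a rough illustration, here is how the alpha idea might look when applied to the data from the question. The value alpha=0.5 and the thicker linewidth are assumptions chosen for visibility, not recommendations from the answer.

```python
import matplotlib.pyplot as plt

grid = list(range(10))
graphs = [
    [1, 1, 1, 4, 4, 4, 3, 5, 6, 0],
    [1, 1, 1, 5, 5, 5, 3, 5, 6, 0],
    [1, 1, 1, 0, 0, 3, 3, 2, 4, 0],
    [1, 2, 4, 4, 3, 2, 3, 2, 4, 0],
    [1, 2, 3, 3, 4, 4, 3, 2, 6, 0],
    [1, 1, 3, 3, 0, 3, 3, 5, 4, 3],
]

fig, ax = plt.subplots()
lines = []
for gg, graph in enumerate(graphs):
    # Semi-transparent lines blend where they overlap, so shared
    # segments show up as a mixed, darker color.
    (line,) = ax.plot(grid, graph, label=f"g{gg}", alpha=0.5, linewidth=3)
    lines.append(line)

ax.legend(loc="lower left", bbox_to_anchor=(1, 0))
fig.tight_layout()
# plt.show()  # uncomment to display the figure interactively
```

With many lines on exactly the same segment the blended color can still be hard to read, so this tends to work best when only two or three lines coincide.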

Suppose your pandas DataFrame is called respone_times. You can then use alpha to set the opacity of the plotted lines. Compare the plot before and after setting alpha.
plt.figure(figsize=(15, 7))
plt.plot(respone_times,alpha=0.5)
plt.title('a sample title')
plt.grid(True)
plt.show()

Depending on your data and use case, it might be OK to add a bit of random jitter to artificially separate the lines.
from numpy.random import default_rng
import pandas as pd
rng = default_rng()
def jitter_df(df: pd.DataFrame, std_ratio: float) -> pd.DataFrame:
    """
    Add jitter to a DataFrame.
    Adds normal distributed jitter with mean 0 to each of the
    DataFrame's columns. The jitter's std is the column's std times
    `std_ratio`.
    Returns the jittered DataFrame.
    """
    std = df.std().values * std_ratio
    jitter = pd.DataFrame(
        std * rng.standard_normal(df.shape),
        index=df.index,
        columns=df.columns,
    )
    return df + jitter
Here's a plot of the original data from Markus Dutschke's example:
And here's the jittered version, with std_ratio set to 0.1:
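The comparison described here can be reproduced along the following lines. This is a sketch under assumptions: the fixed seed, the column names g0..g5, and the side-by-side layout are illustrative choices, and jitter_df is repeated so the snippet runs on its own.

```python
import matplotlib.pyplot as plt
import pandas as pd
from numpy.random import default_rng

rng = default_rng(0)  # fixed seed (an assumption) so the jitter is reproducible


def jitter_df(df: pd.DataFrame, std_ratio: float) -> pd.DataFrame:
    """Add zero-mean normal jitter; per-column std is column std * std_ratio."""
    std = df.std().values * std_ratio
    jitter = pd.DataFrame(
        std * rng.standard_normal(df.shape),
        index=df.index,
        columns=df.columns,
    )
    return df + jitter


graphs = [
    [1, 1, 1, 4, 4, 4, 3, 5, 6, 0],
    [1, 1, 1, 5, 5, 5, 3, 5, 6, 0],
    [1, 1, 1, 0, 0, 3, 3, 2, 4, 0],
    [1, 2, 4, 4, 3, 2, 3, 2, 4, 0],
    [1, 2, 3, 3, 4, 4, 3, 2, 6, 0],
    [1, 1, 3, 3, 0, 3, 3, 5, 4, 3],
]
# One column per line, one row per grid point.
df = pd.DataFrame({f"g{i}": g for i, g in enumerate(graphs)})
jittered = jitter_df(df, std_ratio=0.1)

fig, (ax_orig, ax_jit) = plt.subplots(1, 2, figsize=(10, 4), sharey=True)
df.plot(ax=ax_orig, title="original")
jittered.plot(ax=ax_jit, title="jittered (std_ratio=0.1)")
fig.tight_layout()
# plt.show()  # uncomment to display the figure interactively
```

Keep in mind that jitter changes the plotted values, so it is only appropriate when small distortions are acceptable.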

Replacing solid lines with dotted or dashed ones works too:
import seaborn as sns
# data is assumed to be a DataFrame with the columns used below
g = sns.FacetGrid(data, col='config', row='outputs', sharex=False)
g.map_dataframe(sns.lineplot, x='lag',y='correlation',hue='card', linestyle='dotted')

Instead of random jitter, the lines can be offset just a little bit, creating a layered appearance:
import matplotlib.pyplot as plt
from matplotlib.transforms import offset_copy
grid = list(range(10))
graphs = [[1, 1, 1, 4, 4, 4, 3, 5, 6, 0],
[1, 1, 1, 5, 5, 5, 3, 5, 6, 0],
[1, 1, 1, 0, 0, 3, 3, 2, 4, 0],
[1, 2, 4, 4, 3, 2, 3, 2, 4, 0],
[1, 2, 3, 3, 4, 4, 3, 2, 6, 0],
[1, 1, 3, 3, 0, 3, 3, 5, 4, 3]]
fig, ax = plt.subplots()
lw = 1
for gg, graph in enumerate(graphs):
    trans_offset = offset_copy(ax.transData, fig=fig, x=lw * gg, y=lw * gg, units='dots')
    ax.plot(grid, graph, lw=lw, transform=trans_offset, label='g' + str(gg))
ax.legend(loc='upper left', bbox_to_anchor=(1.01, 1.01))
# manually set the axes limits, because the transform doesn't set them automatically
ax.set_xlim(grid[0] - .5, grid[-1] + .5)
ax.set_ylim(min([min(g) for g in graphs]) - .5, max([max(g) for g in graphs]) + .5)
plt.tight_layout()
plt.show()

Related

how to make loop to create a graph of every column in dataset in python [duplicate]

I am trying to plot multiple histograms on the same window using a list of tuples. I have managed to get it to sketch only 1 tuple at a time and I just can't seem to get it to work with all of them.
import numpy as np
import matplotlib.pyplot as plt
a = [(1, 2, 0, 0, 0, 3, 3, 1, 2, 2), (0, 2, 3, 3, 0, 1, 1, 1, 2, 2), (1, 2, 0, 3, 0, 1, 2, 1, 2, 2),(2, 0, 0, 3, 3, 1, 2, 1, 2, 2),(3,1,2,3,0,0,1,2,3,1)] #my list of tuples
q1,q2,q3,q4,q5,q6,q7,q8,q9,q10 = zip(*a) #split into [(1,0,1,2,3) ,(2,2,2,0,1),..etc] where q1=(1,0,1,2,3)
labels, counts = np.unique(q1,return_counts=True) #labels = 0,1,2,3 and counts = number of occurrences of 0,1,2,3
ticks = range(len(counts))
plt.bar(ticks,counts, align='center')
plt.xticks(ticks, labels)
plt.show()
As you can see from the above code, I can plot one tuple at a time say q1,q2 etc but how do I generalise it so that it plots all of them.
I've tried to mimic the question "python plot multiple histograms", which is exactly what I want, but I had no luck.
Thank you for your time :)
You need to define a grid of axes with plt.subplots, taking into account the number of tuples in the list and how many you want per row. Then iterate over the returned axes and plot each histogram in the corresponding axis. You could use Axes.hist, but I've always preferred ax.bar on the result of np.unique, which can also return the counts of the unique values:
from matplotlib import pyplot as plt
import numpy as np
l = list(zip(*a))
n_cols = 2
fig, axes = plt.subplots(nrows=int(np.ceil(len(l)/n_cols)),
                         ncols=n_cols,
                         figsize=(15,15))
for i, (t, ax) in enumerate(zip(l, axes.flatten())):
    labels, counts = np.unique(t, return_counts=True)
    ax.bar(labels, counts, align='center', color='blue', alpha=.3)
    ax.title.set_text(f'Tuple {i}')
plt.tight_layout()
plt.show()
You can customise the above to whatever number of rows/columns you prefer, for 3 columns for instance:
l = list(zip(*a))
n_cols = 3
fig, axes = plt.subplots(nrows=int(np.ceil(len(l)/n_cols)),
                         ncols=n_cols,
                         figsize=(15,15))
for i, (t, ax) in enumerate(zip(l, axes.flatten())):
    labels, counts = np.unique(t, return_counts=True)
    ax.bar(labels, counts, align='center', color='blue', alpha=.3)
    ax.title.set_text(f'Tuple {i}')
plt.tight_layout()
plt.show()
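When the number of tuples does not fill the grid exactly (10 tuples in a 3-column grid leave two axes empty), the leftover axes can be hidden. The following self-contained sketch shows one way to do this; hiding the spare axes with set_visible(False) is an addition that is not part of the answer above, and the names columns and flat_axes are illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

a = [(1, 2, 0, 0, 0, 3, 3, 1, 2, 2),
     (0, 2, 3, 3, 0, 1, 1, 1, 2, 2),
     (1, 2, 0, 3, 0, 1, 2, 1, 2, 2),
     (2, 0, 0, 3, 3, 1, 2, 1, 2, 2),
     (3, 1, 2, 3, 0, 0, 1, 2, 3, 1)]

columns = list(zip(*a))  # 10 tuples, one per position in the inner tuples
n_cols = 3
n_rows = int(np.ceil(len(columns) / n_cols))

fig, axes = plt.subplots(nrows=n_rows, ncols=n_cols, figsize=(12, 10))
flat_axes = axes.flatten()

for i, (t, ax) in enumerate(zip(columns, flat_axes)):
    # Count how often each distinct value occurs in this tuple.
    labels, counts = np.unique(t, return_counts=True)
    ax.bar(labels, counts, align="center", color="blue", alpha=0.3)
    ax.set_title(f"Tuple {i}")

# Hide the axes that received no data.
for ax in flat_axes[len(columns):]:
    ax.set_visible(False)

fig.tight_layout()
# plt.show()  # uncomment to display the figure interactively
```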

drawing grid lines between points in python

I have 6 points in the (x,y) plane: x=[x1,x2,x3,x4,x5,x6] and y=[y1,y2,y3,y4,y5,y6]
import matplotlib.pyplot as plt
x = [0, 2, 4, 0, 2, 4, 0, 2, 4]
y = [0, 0, 0, 3, 3, 3, 7, 7, 7]
plt.scatter(x, y)
plt.show()
I want to draw lines between the points, parallel to the x and y axes (like in the photo), and I would also like to know how to hide the x and y axes of the diagram. I want to draw a 2D view of the beams and columns of a 3-story building; can matplotlib get me there, or should I use another library?
Absolutely matplotlib can do this. Take a look at their Rectangle Patch:
Example usage (you'll have to modify this to your needs):
import matplotlib.pyplot as plt
import matplotlib.patches as patches
fig = plt.figure()
ax = fig.add_subplot()
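rect = patches.Rectangle(
    (0.1, 0.1),  # lower-left corner
    0.5,         # width
    0.5,         # height
    fill=False
)
ax.add_patch(rect)
plt.show()  # blocks until the window is closed, unlike fig.show() in a script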
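For the specific layout in the question, an alternative to Rectangle patches is to draw the lines directly with ax.hlines and ax.vlines and switch the axes off with ax.axis("off"). The sketch below assumes the points form a complete rectangular grid, so every distinct y value becomes a beam and every distinct x value becomes a column; the variable names are illustrative.

```python
import matplotlib.pyplot as plt

x = [0, 2, 4, 0, 2, 4, 0, 2, 4]
y = [0, 0, 0, 3, 3, 3, 7, 7, 7]

xs = sorted(set(x))  # column positions
ys = sorted(set(y))  # floor levels

fig, ax = plt.subplots()
# Horizontal lines (beams) at every level, spanning the full width.
beams = ax.hlines(ys, xmin=xs[0], xmax=xs[-1], colors="black")
# Vertical lines (columns) at every x position, spanning the full height.
columns = ax.vlines(xs, ymin=ys[0], ymax=ys[-1], colors="black")
# Draw the joints on top of the lines.
ax.scatter(x, y, zorder=3)

ax.set_aspect("equal")  # keep the building's proportions
ax.axis("off")          # hide both axes, ticks, and the frame
# plt.show()  # uncomment to display the figure interactively
```

If the structure is not a full grid (for example, a missing bay), the individual members would need to be drawn one by one with ax.plot instead.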

How can I use Matplotlib to re-adjust limits of an axis (added to host plot) using the interactive zoom tool?

I am displaying information with two y-axes and a common x-axis using the following script.
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1 import host_subplot
import mpl_toolkits.axisartist as AA
#creating a host plot with x and y axis
hostplot = host_subplot(111, axes_class=AA.Axes)
#creating a second y axis
extra_y_axis = hostplot.twinx()
extra_y_axis.set_navigate_mode(True)
extra_y_axis.set_navigate(True)
print(extra_y_axis.can_zoom()) #prints True on output
hostplot.set_xlabel("host_x")
hostplot.set_ylabel("host_y")
extra_y_axis.set_ylabel("extra_y")
hostplot.plot([0, 1, 2], [0, 1, 2])
extra_y_axis.plot([0, 1, 2], [0, 3, 2])
plt.draw()
plt.show()
After this I used the 'Zoom to Rectangle' tool from the tray in the bottom-left and compared the plot before and after zooming.
Please notice the y-axis scales in both the images. While the zoom functionality is working correctly for the host plot, I am unable to get the extra_y_axis to rescale and it just maintains a constant scale throughout (so I can't really zoom in on plots using the second axis).
How can I make it so that all the axes are rescaled on zooming in a small portion?
Thanks
I've traced your problem down to the use of the axes_grid1 toolkit. If you don't require this toolkit, you can easily fix your issue by initialising your figure in the usual manner:
import matplotlib.pyplot as plt
#creating a host plot with x and y axis
fig, hostplot = plt.subplots()
#creating a second y axis
extra_y_axis = hostplot.twinx()
hostplot.set_xlabel("host_x")
hostplot.set_ylabel("host_y")
extra_y_axis.set_ylabel("extra_y")
hostplot.plot([0, 1, 2], [0, 1, 2])
extra_y_axis.plot([0, 1, 2], [0, 3, 2])
plt.show()
If you do want to use the toolkit then you have to add a couple of lines to get the two y axes to scale and transform together:
import matplotlib.transforms as mtransforms
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1.parasite_axes import SubplotHost
fig = plt.figure()
ax1 = SubplotHost(fig, 1, 1, 1)
#set the scale difference between the two y axes
aux_trans = mtransforms.Affine2D().scale(sx = 1.,sy= 1.5)
ax2 = ax1.twin(aux_trans)
fig.add_subplot(ax1)
ax1.plot([0, 1, 2], [0, 1, 2])
ax2.plot([0, 1, 2], [0, 3, 2])
ax1.set_ylim(0,3)
plt.show()

add a subplot to the plot produced by a previous function

I've written a function that reads data from a csv file and plots it. Now I need to add a subplot with another part of the data from the same file, so I've tried to write a function that calls the first function and adds a subplot. When I do this, I get the two to show up as different figures. How can I suppress this and make both of them show in the same figure?
Here is a mockup of my code:
def timex(h_ratio = [3, 1]):
    import matplotlib.pyplot as plt
    import numpy as np
    import matplotlib.gridspec as gridspec
    total_height = h_ratio[0] + h_ratio[1]
    gs = gridspec.GridSpec(total_height, 1)
    time = [1, 2, 3, 4, 5]
    x = [1, 2, 3, 4, 5]
    y = [1, 1, 1, 1, 1]
    ax1 = plt.subplot(gs[:h_ratio[0], :])
    plt.plot(time, x)
    plot = plt.gcf
    plt.show()
    return time, x, y, plot, gs, h_ratio
def timeyx():
    import matplotlib.pyplot as plt
    import matplotlib.gridspec as gridspec
    time, x, y, plot, gs, h_ratio = timex(h_ratio = [3, 1])
    ax2 = plt.subplot(gs[h_ratio[1], :])
    plt.plot(time, y)
    plt.show()
timeyx()
I realize that I have two plt.show() statements, but if I remove one that figure will not show at all.
I am not sure whether you need to use matplotlib.gridspec specifically or not, but you can use subplot2grid to make the job easy.
import matplotlib.pyplot as plt
def timex():
    time = [1, 2, 3, 4, 5]
    x = [1, 2, 3, 4, 5]
    y = [1, 1, 1, 1, 1]
    ax1 = plt.subplot2grid((1,2), (0,0))
    ax1.plot(time, x)
    return time, x, y
def timeyx():
    time, x, y = timex()
    ax2 = plt.subplot2grid((1,2), (0,1))
    ax2.plot(time, y)
timeyx()
plt.show()
This produces one figure with two subplots.
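If the stacked layout with a 3:1 height ratio from the question is wanted, one possible variant is to create both axes up front and pass them into the plotting functions, so that neither function opens or shows a figure itself. This is a sketch, not the answerer's method: the ax parameter on timex and the gridspec_kw height ratios are assumptions introduced here.

```python
import matplotlib.pyplot as plt


def timex(ax):
    """Plot x against time on the given axes and return the data."""
    time = [1, 2, 3, 4, 5]
    x = [1, 2, 3, 4, 5]
    y = [1, 1, 1, 1, 1]
    ax.plot(time, x)
    return time, x, y


def timeyx(h_ratio=(3, 1)):
    """Create one figure with two stacked axes and fill both of them."""
    fig, (ax1, ax2) = plt.subplots(
        2, 1, sharex=True, gridspec_kw={"height_ratios": list(h_ratio)}
    )
    time, x, y = timex(ax1)  # first function draws into the upper axes
    ax2.plot(time, y)        # second plot goes into the lower axes
    return fig


fig = timeyx()
# plt.show()  # call once, after everything has been drawn
```

Calling plt.show() only once at the end avoids the situation in the question, where the first call displayed a figure before the second subplot had been added.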

coloring matplotlib scatterplot by third variable with log color bar

I'm trying to make a scatterplot of two arrays/lists, one of which is the x coordinate and the other the y. I'm not having any trouble with that. However, I need to color-code these points based on their values at a specific point in time, based on data which I have in a 2d array. Also, this 2d array of data has a very large spread, so I'd like to color the points logarithmically (I'm not sure if this means just change the color bar labels or if there's a more fundamental difference.)
Here is my code so far:
import numpy as np
import matplotlib.pyplot as plt
fig = plt.figure(1)
time = #I'd like to specify time here.
x = [1, 2, 3, 4, 5]
y = [5, 4, 3, 2, 1]
multi_array = [[1, 1, 10, 100, 1000], [10000, 1000, 100, 10, 1], [300, 400, 5000, 12, 47]]
for counter in np.arange(0, 5):
    t = multi_array[time, counter] #I tried this, and it did not work.
    s = plt.scatter(x[counter], y[counter], c = t, marker = 's')
plt.show()
I followed the advice I saw elsewhere to color by a third variable, which was to set the color equal to that variable, but then when I tried that with my data set, I just got all the points as one color, and then when I try it with this mockup it gives me the following error:
TypeError: list indices must be integers, not tuple
Could someone please help me color my points the way I need to?
If I understand the question (which I'm not at all sure of), here is the answer:
import numpy as np
import matplotlib.pyplot as plt
# Jupyter/IPython only; remove the next line in a plain Python script
%matplotlib inline
fig = plt.figure(1)
time = 2 #I'd like to specify time here.
x = [1, 2, 3, 4, 5]
y = [5, 4, 3, 2, 1]
multi_array = np.asarray([[1, 1, 10, 100, 1000], [10000, 1000, 100, 10, 1], [300, 400, 5000, 12, 47]])
log_array=np.log10(multi_array)
s = plt.scatter(x, y, c=log_array[time], marker = 's',s=100)
cb = plt.colorbar(s)
cb.set_label('log of ...')
plt.show()
After some tinkering, and using information learned from user4421975's answer and the link in the comments, I've puzzled it out. In short, I used plt.scatter's norm parameter to map the colors logarithmically.
import numpy as np
import matplotlib.colors  # needed for matplotlib.colors.LogNorm below
import matplotlib.pyplot as plt
fig = plt.figure(1)
time = 2
x = [1, 2, 3, 4, 5]
y = [5, 4, 3, 2, 1]
multi_array = np.asarray([[1, 1, 10, 100, 1000], [10000, 1000, 100, 10, 1], [300, 400, 5000, 12, 47]])
for counter in np.arange(0, 5):
    s = plt.scatter(x[counter], y[counter], c = multi_array[time, counter], cmap = 'winter', norm = matplotlib.colors.LogNorm(vmin=multi_array[time].min(), vmax=multi_array[time].max()), marker = 's')
cb = plt.colorbar(s)
cb.set_label('Log of Data')
plt.show()
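The loop is not strictly necessary: scatter accepts whole arrays for x, y, and c, so the same log-scaled coloring can be sketched in a single call. The snippet below is one possible vectorized form under that assumption; the variable name values and the colorbar label are illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import LogNorm

time = 2
x = [1, 2, 3, 4, 5]
y = [5, 4, 3, 2, 1]
multi_array = np.asarray([[1, 1, 10, 100, 1000],
                          [10000, 1000, 100, 10, 1],
                          [300, 400, 5000, 12, 47]])
values = multi_array[time]  # the row of data for the chosen time

fig, ax = plt.subplots()
sc = ax.scatter(
    x, y,
    c=values,          # raw data; the norm handles the log mapping
    cmap="winter",
    marker="s",
    s=100,
    norm=LogNorm(vmin=values.min(), vmax=values.max()),
)
cb = fig.colorbar(sc, ax=ax)
cb.set_label("Data (log color scale)")
# plt.show()  # uncomment to display the figure interactively
```

Compared with plotting np.log10 of the data, LogNorm keeps the colorbar ticks in the original units while still spacing the colors logarithmically. Note that LogNorm requires strictly positive values.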
