How to correlate geometrical features with target variable?

I want to correlate the geometry of surfaces with a target variable. Tutorials for scikit-learn, TensorFlow and similar libraries usually evaluate routine features, for example the relation between house prices in Boston and features such as the number of rooms or the neighborhood.
In my work I have coordinates in 3D space (x, y and z) that represent surfaces, and I want to find out how the arrangement of these points affects the target variable. I would very much appreciate it if anyone could suggest ML methods in Python, perhaps specialized ones, that can do this. I have uploaded a view of two simple surfaces that I created. I want to correlate the depth (z values) of such surfaces with an arbitrary target. Each surface may have hundreds of points, i.e. z values.
The following code makes the figure:
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # registers the 3D projection on older matplotlib
from matplotlib.ticker import (AutoMinorLocator, MultipleLocator)
# %matplotlib qt5  # IPython/Jupyter magic; enable only in a notebook or IPython session
x_s_up = np.array([[1, 1, 1], [2, 2, 2]])
y_s_up = np.array([[1, 2, 3], [1, 2, 3]])
z_s_up = np.array([[5, 5, 5], [5.1, 5.2, 5.1]])
x_s_d = np.array([[1, 1, 1], [2, 2, 2]])
y_s_d = np.array([[1, 2, 3], [1, 2, 3]])
z_s_d = np.array([[3.9, 4., 3.8], [4.1, 4.1, 4.2]])
fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")
ax.plot_surface(x_s_up, y_s_up, z_s_up, color='b')  # upper surface
ax.plot_surface(x_s_d, y_s_d, z_s_d, color='r')  # lower surface
ax.set_xlabel('X'); ax.set_ylabel('Y'); ax.set_zlabel('Z')
plt.show()

Related

Seaborn Heatmap without lines between cells

I'm trying to create a heatmap with seaborn using a transparent colormap, because an image should be displayed in the background. The heatmap creation works fine so far, but some lines between the cells are still visible even though the linewidth of the heatmap is set to 0.0.
The code that creates the heatmap looks like this:
ax = sns.heatmap(image, cmap="rocket_r", linewidths=0.0)
ax.collections[0].set_alpha(0.5)
Here image is a 64x64 numpy array. The resulting heatmap looks like this:
Heatmap (sorry, not enough reputation to embed pictures)
The problem is the thin lines between the cells. Strangely, they aren't at every edge.
Does anyone know how to get rid of those lines?
Many thanks
Update 1 (Complete working example):
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
image = np.array([[1, 1, 2, 2], [3, 3, 3, 3], [4, 5, 4, 5], [6, 6, 6, 6]])
ax = sns.heatmap(image, cmap="rocket_r", linewidths=0.0)
ax.collections[0].set_alpha(0.5)
plt.show()
The result is this heatmap:
Here you can see that there are thin lines between every column, but there isn't any line between the first and second row.

The lines are the overlapping of semitransparent patches which cannot be perfectly aligned on the pixel grid.
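To see why overlap shows up as a visible line, it can help to work through the compositing arithmetic. The following is a minimal sketch, assuming standard source-over compositing of identical layers; `stacked_opacity` is a hypothetical helper, not part of matplotlib or seaborn.

```python
def stacked_opacity(alpha, layers):
    """Effective opacity of `layers` identical layers, each with opacity `alpha`.

    Assumes standard source-over compositing, where each layer lets a
    fraction (1 - alpha) of whatever is beneath it show through.
    """
    return 1.0 - (1.0 - alpha) ** layers


single = stacked_opacity(0.5, 1)   # pixel covered by one cell
overlap = stacked_opacity(0.5, 2)  # pixel where two neighbouring cells both draw
print(single, overlap)  # 0.5 0.75
```

A pixel touched by two cells with alpha=0.5 ends up more opaque than its neighbours, which is one way a seam can appear. Which pixels are affected depends on how the cell edges fall on the pixel grid.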
alpha blending
An option is to not use transparency, but instead create opaque colors with alpha blending.
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
import numpy as np
import seaborn as sns
def get_alpha_blend_cmap(cmap, alpha):
    cls = plt.get_cmap(cmap)(np.linspace(0,1,256))
    cls = (1-alpha) + alpha*cls
    return ListedColormap(cls)
image = np.array([[1, 1, 2, 2], [3, 3, 3, 3], [4, 5, 4, 5], [6, 6, 6, 6]])
ax = sns.heatmap(image, cmap=get_alpha_blend_cmap("rocket_r", 0.5), linewidths=0.0)
plt.show()
An obvious advantage of this is that the colorbar has the same colors as the heatmap.
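The expression `(1-alpha) + alpha*cls` in `get_alpha_blend_cmap` gives the colour you would see when drawing each colormap entry with opacity `alpha` over a white background. Here is a small sketch of that arithmetic for a single colour, assuming a white background (`blend_over_white` is a hypothetical name used only for illustration):

```python
def blend_over_white(rgb, alpha):
    """Opaque colour equivalent to `rgb` drawn with opacity `alpha` on white.

    White is 1.0 in every channel, so each channel becomes
    (1 - alpha) * 1.0 + alpha * channel.
    """
    return tuple((1.0 - alpha) + alpha * channel for channel in rgb)


pure_red = (1.0, 0.0, 0.0)
print(blend_over_white(pure_red, 0.5))  # (1.0, 0.5, 0.5), a light red
```

Because the resulting colours are opaque, they hide anything drawn underneath, so this approach fits best when the background is plain white.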
increase dpi
If the above is not an option you may increase the dpi when saving.
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
image = np.array([[1, 1, 2, 2], [3, 3, 3, 3], [4, 5, 4, 5], [6, 6, 6, 6]])
ax = sns.heatmap(image, cmap="rocket_r", linewidths=0.0, edgecolor="none", alpha=0.5)
plt.savefig("test.png", dpi=1000)
This does of course not have any effect on the figure shown on screen though.
imshow
Finally, consider not using seaborn here, but instead a matplotlib imshow plot.
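import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns  # importing seaborn registers the "rocket_r" colormap
plt.style.use("seaborn-v0_8-dark")  # this style was named "seaborn-dark" before matplotlib 3.6
plt.rcParams["axes.facecolor"] = "white"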
image = np.array([[1, 1, 2, 2], [3, 3, 3, 3], [4, 5, 4, 5], [6, 6, 6, 6]])
im = plt.imshow(image, cmap="rocket_r", alpha=0.5)
plt.colorbar(im)
plt.gca().set(xticks=(range(image.shape[1])),yticks=(range(image.shape[0])))
plt.show()

I have just come across this problem. I need to upload the plot to overleaf, so I do not like the dpi solutions. Here is what I came up with, based on the solution of OP:
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
image = np.array([[1, 1, 2, 2], [3, 3, 3, 3], [4, 5, 4, 5], [6, 6, 6, 6]])
ax = sns.heatmap(image, cmap="rocket_r", linewidths=0.1)
colors = ax.collections[0].get_facecolors()
ax.collections[0].set_edgecolors(colors)
plt.show()
The idea is to create a thin edge around each cell with the cell's facecolor. This removes the lines without changing the original cell colors.

Matplotlib - understanding color values

I found a piece of code that passes a 1D NumPy array to Matplotlib. The values of the array are either 0 or 1, but the points in the plot are colored yellow or purple. I am unable to find any documentation about this.
Here is the code:
import numpy as np
import matplotlib.pyplot as plt
num_observations = 5000
x1 = np.random.multivariate_normal([0, 0], [[1, .85],[.85, 1]], num_observations) # mean, covariance
x2 = np.random.multivariate_normal([1, 4], [[1, .85],[.85, 1]], num_observations)
features = np.vstack((x1, x2)).astype(np.float32)
labels = np.hstack((np.zeros(num_observations),np.ones(num_observations)))
plt.figure(figsize=(12,8))
plt.scatter(features[:, 0], features[:, 1],
            c = labels, alpha = .4)
plt.show()
Can anyone explain how we are getting the colors as yellow and violet? Relevant Documentation would also help.

It's using the default viridis colormap, so purple represents 0 and yellow represents 1. See here for more about colormaps: https://matplotlib.org/examples/color/colormaps_reference.html.
Adding a colorbar helps here. Adding one to your example is easy:
import numpy as np
import matplotlib.pyplot as plt
num_observations = 5000
x1 = np.random.multivariate_normal([0, 0], [[1, .85],[.85, 1]], num_observations) # mean, covariance
x2 = np.random.multivariate_normal([1, 4], [[1, .85],[.85, 1]], num_observations)
features = np.vstack((x1, x2)).astype(np.float32)
labels = np.hstack((np.zeros(num_observations),np.ones(num_observations)))
plt.figure(figsize=(12,8))
p = plt.scatter(features[:, 0], features[:, 1],
                c = labels, alpha = .4)
plt.colorbar(p)
plt.show()
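What happens to the values passed as `c` can be sketched as follows, assuming a reasonably recent matplotlib: they are first normalised to the range 0 to 1 and then looked up in the colormap. The `Agg` backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize

labels = [0.0, 1.0]

# Scale the data range [min, max] linearly onto [0, 1].
norm = Normalize(vmin=min(labels), vmax=max(labels))
cmap = plt.get_cmap("viridis")

low_rgba = cmap(norm(0.0))   # colour for label 0: dark purple
high_rgba = cmap(norm(1.0))  # colour for label 1: yellow
print(low_rgba)
print(high_rgba)
```

Passing another colormap name through the `cmap` argument of `plt.scatter` changes the two colours in the same way.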

How to plot a scatter plot over a line plot? [duplicate]

Does anybody have a suggestion for the best way to present overlapping lines on a plot? I have a lot of them, and my idea was to draw solid lines of different colors where they don't overlap and dashed lines where they do, so that all colors remain visible in the overlapping parts.
But how do I do that?

I have the same issue on a plot with a high degree of discretization.
Here is the starting situation:
import matplotlib.pyplot as plt
grid=[x for x in range(10)]
graphs=[
[1,1,1,4,4,4,3,5,6,0],
[1,1,1,5,5,5,3,5,6,0],
[1,1,1,0,0,3,3,2,4,0],
[1,2,4,4,3,2,3,2,4,0],
[1,2,3,3,4,4,3,2,6,0],
[1,1,3,3,0,3,3,5,4,3],
]
for gg,graph in enumerate(graphs):
    plt.plot(grid,graph,label='g'+str(gg))
plt.legend(loc=3,bbox_to_anchor=(1,0))
plt.show()
No one can say exactly where the green and blue lines run.
And here is my "solution":
import matplotlib.pyplot as plt
grid=[x for x in range(10)]
graphs=[
[1,1,1,4,4,4,3,5,6,0],
[1,1,1,5,5,5,3,5,6,0],
[1,1,1,0,0,3,3,2,4,0],
[1,2,4,4,3,2,3,2,4,0],
[1,2,3,3,4,4,3,2,6,0],
[1,1,3,3,0,3,3,5,4,3],
]
for gg,graph in enumerate(graphs):
    lw=10-8*gg/len(graphs)
    ls=['-','--','-.',':'][gg%4]
    plt.plot(grid,graph,label='g'+str(gg), linestyle=ls, linewidth=lw)
plt.legend(loc=3,bbox_to_anchor=(1,0))
plt.show()
I am grateful for suggestions on improvement!

Just decrease the opacity of the lines so that they are see-through. You can achieve that with the alpha parameter. Example:
plt.plot(x, y, alpha=0.7)
Here alpha ranges from 0 to 1, with 0 being invisible and 1 fully opaque.

Imagine your pandas DataFrame is called respone_times; then you can use alpha to set a different opacity for your graphs. Compare the plot before and after using alpha.
plt.figure(figsize=(15, 7))
plt.plot(respone_times,alpha=0.5)
plt.title('a sample title')
plt.grid(True)
plt.show()

Depending on your data and use case, it might be OK to add a bit of random jitter to artificially separate the lines.
from numpy.random import default_rng
import pandas as pd
rng = default_rng()
def jitter_df(df: pd.DataFrame, std_ratio: float) -> pd.DataFrame:
    """
    Add jitter to a DataFrame.

    Adds normal distributed jitter with mean 0 to each of the
    DataFrame's columns. The jitter's std is the column's std times
    `std_ratio`.

    Returns the jittered DataFrame.
    """
    std = df.std().values * std_ratio
    jitter = pd.DataFrame(
        std * rng.standard_normal(df.shape),
        index=df.index,
        columns=df.columns,
    )
    return df + jitter
Here's a plot of the original data from Markus Dutschke's example:
And here's the jittered version, with std_ratio set to 0.1:
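As a rough usage sketch, `jitter_df` could be applied to the six example lines like this. The column names `g0` to `g5` and the fixed seed are assumptions made for reproducibility, and the function body is repeated so that the block runs on its own.

```python
import pandas as pd
from numpy.random import default_rng

rng = default_rng(0)  # fixed seed, chosen here only so runs are repeatable


def jitter_df(df: pd.DataFrame, std_ratio: float) -> pd.DataFrame:
    """Add zero-mean normal jitter, scaled per column by std * std_ratio."""
    std = df.std().values * std_ratio
    jitter = pd.DataFrame(
        std * rng.standard_normal(df.shape),
        index=df.index,
        columns=df.columns,
    )
    return df + jitter


graphs = [
    [1, 1, 1, 4, 4, 4, 3, 5, 6, 0],
    [1, 1, 1, 5, 5, 5, 3, 5, 6, 0],
    [1, 1, 1, 0, 0, 3, 3, 2, 4, 0],
    [1, 2, 4, 4, 3, 2, 3, 2, 4, 0],
    [1, 2, 3, 3, 4, 4, 3, 2, 6, 0],
    [1, 1, 3, 3, 0, 3, 3, 5, 4, 3],
]

# One column per line, one row per grid point.
df = pd.DataFrame({f"g{i}": graph for i, graph in enumerate(graphs)})
jittered = jitter_df(df, std_ratio=0.1)
print(jittered.round(2).head())
```

The jittered frame can then be plotted with `jittered.plot()`. Since the noise changes the plotted values, this suits qualitative comparisons better than reading off exact numbers.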

Replacing solid lines by dots or dashes works too
g = sns.FacetGrid(data, col='config', row='outputs', sharex=False)
g.map_dataframe(sns.lineplot, x='lag',y='correlation',hue='card', linestyle='dotted')

Instead of random jitter, the lines can be offset just a little bit, creating a layered appearance:
import matplotlib.pyplot as plt
from matplotlib.transforms import offset_copy
grid = list(range(10))
graphs = [[1, 1, 1, 4, 4, 4, 3, 5, 6, 0],
[1, 1, 1, 5, 5, 5, 3, 5, 6, 0],
[1, 1, 1, 0, 0, 3, 3, 2, 4, 0],
[1, 2, 4, 4, 3, 2, 3, 2, 4, 0],
[1, 2, 3, 3, 4, 4, 3, 2, 6, 0],
[1, 1, 3, 3, 0, 3, 3, 5, 4, 3]]
fig, ax = plt.subplots()
lw = 1
for gg, graph in enumerate(graphs):
    trans_offset = offset_copy(ax.transData, fig=fig, x=lw * gg, y=lw * gg, units='dots')
    ax.plot(grid, graph, lw=lw, transform=trans_offset, label='g' + str(gg))
ax.legend(loc='upper left', bbox_to_anchor=(1.01, 1.01))
# manually set the axes limits, because the transform doesn't set them automatically
ax.set_xlim(grid[0] - .5, grid[-1] + .5)
ax.set_ylim(min([min(g) for g in graphs]) - .5, max([max(g) for g in graphs]) + .5)
plt.tight_layout()
plt.show()

How can I use Matplotlib to re-adjust limits of an axis (added to host plot) using the interactive zoom tool?

I am displaying information with two y-axes and a common x-axis using the following script.
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1 import host_subplot
import mpl_toolkits.axisartist as AA
#creating a host plot with x and y axis
hostplot = host_subplot(111, axes_class=AA.Axes)
#creating a second y axis
extra_y_axis = hostplot.twinx()
extra_y_axis.set_navigate_mode(True)
extra_y_axis.set_navigate(True)
print(extra_y_axis.can_zoom())  # prints True
hostplot.set_xlabel("host_x")
hostplot.set_ylabel("host_y")
extra_y_axis.set_ylabel("extra_y")
hostplot.plot([0, 1, 2], [0, 1, 2])
extra_y_axis.plot([0, 1, 2], [0, 3, 2])
plt.draw()
plt.show()
After this I used the 'Zoom to Rectangle' tool from the toolbar at the bottom left and compared the plot before and after zooming.
Please note the y-axis scales in both cases. While the zoom functionality works correctly for the host plot, I am unable to get extra_y_axis to rescale; it just keeps a constant scale throughout (so I can't really zoom in on plots using the second axis).
How can I make it so that all the axes are rescaled on zooming in a small portion?
Thanks

I've traced your problem down to the use of the axes_grid1 toolkit. If you don't require this toolkit, you can easily fix the issue by initialising your figure in the usual manner:
import matplotlib.pyplot as plt
#creating a host plot with x and y axis
fig, hostplot = plt.subplots()
#creating a second y axis
extra_y_axis = hostplot.twinx()
hostplot.set_xlabel("host_x")
hostplot.set_ylabel("host_y")
extra_y_axis.set_ylabel("extra_y")
hostplot.plot([0, 1, 2], [0, 1, 2])
extra_y_axis.plot([0, 1, 2], [0, 3, 2])
plt.show()
If you do want to use the toolkit then you have to add a couple of lines to get the two y axes to scale and transform together:
import matplotlib.transforms as mtransforms
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1.parasite_axes import SubplotHost
fig = plt.figure()
ax1 = SubplotHost(fig, 1, 1, 1)
#set the scale difference between the two y axes
aux_trans = mtransforms.Affine2D().scale(sx = 1.,sy= 1.5)
ax2 = ax1.twin(aux_trans)
fig.add_subplot(ax1)
ax1.plot([0, 1, 2], [0, 1, 2])
ax2.plot([0, 1, 2], [0, 3, 2])
ax1.set_ylim(0,3)
plt.show()

How to plot 2d math vectors with matplotlib?

How can we plot 2D math vectors with matplotlib? Does anyone have an example or suggestion about that?
I have a couple of vectors stored as 2D numpy arrays, and I would like to plot them as directed edges.
The vectors to be plotted are constructed as below:
import numpy as np
# a list contains 3 vectors;
# each list is constructed as the tail and the head of the vector
a = np.array([[0, 0, 3, 2], [0, 0, 1, 1], [0, 0, 9, 9]])
Edit:
I have added the plot produced by tcaswell's final answer for anyone who is interested in the output and wants to plot 2D vectors with matplotlib:

The suggestion in the comments by halex is correct: you want to use quiver, but you need to tweak the properties a bit.
import numpy as np
import matplotlib.pyplot as plt
soa = np.array([[0, 0, 3, 2], [0, 0, 1, 1], [0, 0, 9, 9]])
X, Y, U, V = zip(*soa)
plt.figure()
ax = plt.gca()
ax.quiver(X, Y, U, V, angles='xy', scale_units='xy', scale=1)
ax.set_xlim([-1, 10])
ax.set_ylim([-1, 10])
plt.draw()
plt.show()
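One caveat worth illustrating: `quiver` interprets `U` and `V` as vector components, not as head coordinates. In the question every tail is at the origin, so the two coincide. For tails elsewhere, a conversion along these lines would be needed. This is only a sketch: `tail_head_to_quiver` is a hypothetical helper, and the second row of the sample array is made up for illustration.

```python
import numpy as np


def tail_head_to_quiver(vectors):
    """Convert rows of [tail_x, tail_y, head_x, head_y] to quiver's X, Y, U, V."""
    vectors = np.asarray(vectors, dtype=float)
    x, y = vectors[:, 0], vectors[:, 1]
    u = vectors[:, 2] - x  # x component = head_x - tail_x
    v = vectors[:, 3] - y  # y component = head_y - tail_y
    return x, y, u, v


# First row is from the question; second row is a hypothetical
# vector whose tail is not at the origin.
a = np.array([[0, 0, 3, 2], [1, 1, 3, 2]])
x, y, u, v = tail_head_to_quiver(a)
print(u, v)
```

The result can be passed straight to `ax.quiver(x, y, u, v, angles='xy', scale_units='xy', scale=1)`.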

It's pretty straightforward. Hope this example helps.
import matplotlib.pyplot as plt
import numpy as np
x = np.random.normal(10,5,100)
y = 3 + .5*x + np.random.normal(0,1,100)
myvec = np.array([x,y])
plt.plot(myvec[0,],myvec[1,],'ro')
plt.show()
Will produce:
To plot the arrays you can just slice them up into 1D vectors and plot them. I'd read the full documentation of matplotlib for all the different options. But you can treat a numpy vector as if it were a normal tuple for most of the examples.
