2D histogram where one axis is cumulative and the other is not - python

Let's say I have instances of two random variables that can be treated as paired.
import numpy as np
x = np.random.normal(size=1000)
y = np.random.normal(size=1000)
Using matplotlib it is pretty easy to make a 2D histogram.
import matplotlib.pyplot as plt
plt.hist2d(x,y)
In 1D, matplotlib has an option to make a histogram cumulative.
plt.hist(x,cumulative=True)
What I would like incorporates elements of both classes. I would like to construct a 2D histogram such that the horizontal axis is cumulative and the vertical axis is not cumulative.
Is there a way to do this with Python/Matplotlib?

You can take advantage of np.cumsum to create your cumulative histogram. First save the counts and bin edges returned by hist2d, then plot the cumulative sum of the counts along the x axis with pcolormesh.
import matplotlib.pyplot as plt
import numpy as np
#Some random data
x = np.random.normal(size=1000)
y = np.random.normal(size=1000)
#create a figure
plt.figure(figsize=(16,8))
ax1 = plt.subplot(121) #Left plot original
ax2 = plt.subplot(122) #right plot the cumulative distribution along axis
#What you have so far; keep the returned counts and bin edges
h, xedge, yedge, image = ax1.hist2d(x,y)
#h has x along axis 0, so h.T has x along axis 1; np.cumsum accumulates along that axis
ax2.pcolormesh(xedge,yedge,np.cumsum(h.T,axis=1))
plt.show()
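If you only need the numbers, the same idea can be sketched without drawing anything by using np.histogram2d. This is a minimal sketch, not the only way to do it; the seed, the bin count of 10 and the names counts and cum_x are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, only so the sketch is repeatable
x = rng.normal(size=1000)
y = rng.normal(size=1000)

# counts has shape (number of x bins, number of y bins): x varies along axis 0
counts, xedges, yedges = np.histogram2d(x, y, bins=10)

# Accumulate along x only; every y bin keeps its own running total
cum_x = np.cumsum(counts, axis=0)
```

pcolormesh expects the x bins along the columns, so the array would be passed transposed, for example plt.pcolormesh(xedges, yedges, cum_x.T).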

Related

Vertically draw plot with matplotlib where each row in an array is a line

I have a dataset, an even numpy array where each row represents a line:
matrix = np.random.rand(10,10)
Ultimately, I would like a graph like this:
Which I have made before in R. But I can't get it to work in Python. I'm not too proficient yet with Python, and have to use it this time.
I simply plot with:
plt.plot(matrix)
Which results in a good starting point:
My first step would be to flip the x and y axis, but the plot function requires an x_vals and y_vals, which my array does not have. There are just values. How can I (for starters) swap the x- and y-axis so that each row in the array gets drawn as an individual vertical line as shown in the image above?
If you don't provide x values, matplotlib will just use a range.
Try:
import numpy as np
import matplotlib.pyplot as plt
matrix = np.random.rand(10,10)
x_vals = np.arange(10)
for y_vals in matrix:
    plt.plot(y_vals, x_vals)
plt.show()
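The loop can also be replaced by a single call, because plot draws one line per column when it is given a 2D array. The sketch below additionally shifts each row sideways so the lines do not overlap; that offset is an assumption about the target figure, which is not reproduced here, and the names row_index and offsets are illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
matrix = rng.random((10, 10))
row_index = np.arange(matrix.shape[1])  # vertical coordinate of each sample
offsets = np.arange(matrix.shape[0])    # hypothetical horizontal shift, one per row

fig, ax = plt.subplots()
# Each column of the 2D x argument becomes one line, so transpose to plot rows
lines = ax.plot(matrix.T + offsets, row_index)
```

Call plt.show() afterwards to display the figure. Dropping the offsets term gives the same picture as the loop.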

How do I cluster values of y axis against x axis in scatterplot?

Let's say I have 2 arrays:
x = [1,2,3,4,5,6,7]
y = [1,2,2,2,3,4,5]
Its scatter plot looks like this:
What I want is for the x-axis to show only these ticks in the plot:
0,4,8
so that the y values within each section of x sit closer together.
I have seen similar behaviour in bar plots, where it is called clustering. How do I do the same for a scatter plot, or is there another plot I should be using?
I hope my question is clear.
Any help is appreciated.
With your plot, try this before you display the plot:
plt.xticks([0,4,8])
or
import numpy as np
plt.xticks(np.arange(0, 8+1, step=4))
Then, to change the figure size, you can try something like this (set rcParams before the figure is created):
plt.rcParams["figure.figsize"] = (10,5)
plt.xticks([0,4,8])
I got this with my example:
import numpy as np
import matplotlib.pyplot as plt
#set the figure size before the figure is created, otherwise it has no effect on it
plt.rcParams["figure.figsize"] = (7,3)
x = np.linspace(0, 10, 30)
y = np.sin(x)
plt.plot(x, y, 'o', color='black')
plt.xticks([0,4,8])
plt.show()
output
I think what you are looking for is close to swarmplots and stripplots in Seaborn. However, Seaborn's swarmplot and stripplot are purely categorical on one of the axes, which means that they wouldn't preserve the relative x-axis order of your elements inside each category.
One way to do what you want would be to increase the space in your x-axis between categories ([0,4,8]) and modify your xticks accordingly.
Below is an example of this where I assign the data to 3 different categories: [-2,2), [2,6), [6,10). Each category is shifted dil_k further along the x-axis than the previous one.
import matplotlib.pyplot as plt
import numpy as np
#Generating data
x= np.random.choice(8,size=(100))
y= np.random.choice(8,size=(100))
dil_k=20
#Creating the spacing between categories
x[np.logical_and(x<6, x>=2)]+=dil_k
x[np.logical_and(x<10, x>=6)]+=2*dil_k
#Plotting
ax=plt.scatter(x,y)
#Modifying axes accordingly
plt.xticks([0,2,22,24,26,46,48,50],[0,2,2,4,6,6,8,10])
plt.show()
And the output gives:
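The shifting step can be written more generally with np.digitize, which returns the category index of each point. This is only a sketch applied to the arrays from the question; edges, gap and x_spread are illustrative names, and gap plays the role of dil_k.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7])
y = np.array([1, 2, 2, 2, 3, 4, 5])

edges = np.array([2, 6])  # category boundaries: x < 2, 2 <= x < 6, x >= 6
gap = 20                  # hypothetical spacing between categories

# Category index of each point: 0, 1 or 2
group = np.digitize(x, edges)

# Shift every category by a multiple of the gap, keeping the order inside it
x_spread = x + group * gap
```

Plotting x_spread against y with plt.scatter, and relabelling the ticks as above, gives the clustered layout.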
Alternatively, if you don't care about keeping the order of your elements along the x-axis inside each category, then you can use swarmplot directly.
The code can be seen below:
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
#Generating data
x= np.random.choice(8,size=(100))
y= np.random.choice(8,size=(100))
#Collapsing every category onto a single x value
x[np.logical_and(x<2,x>=-2)]=0
x[np.logical_and(x<6, x>=2)]=4
x[np.logical_and(x<10, x>=6)]=8
#Plotting
sns.swarmplot(x=x,y=y)
plt.show()
And the output gives:

Seaborn jointplot with defined axes limits

I am trying to generate some figures using the jointplot command of Seaborn. I have a list of tuples called "data", with the coordinates (x,y) that need to be plotted. The x-coordinates are in range (200,1400) and the y-coordinates are in range (300,900). However, I need to show the entire region I am working in, demonstrating the concentration of the points. I need the x-coordinate to be in range (0,3000) and the y-coordinate to be in range (0,1200), and I am failing to do so.
Here is my code:
import seaborn as sns
import numpy as np
np.shape(y)
xx = np.linspace(0,1080,np.shape(y))
yy = np.linspace(0,1920,np.shape(y))
sns.jointplot(xx="x", yy="y", data=data, kind="kde")
It returns the error: "TypeError: jointplot() missing 2 required positional arguments: 'x' and 'y'".
If I don't use the xx and yy variables, it gives the plot with the axes limited automatically.
How can I set the axes to the ranges I need?
You can plot your data and modify the plot's axis limits later:
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
# generate some random data
x = np.random.normal(loc=650, scale=100, size=1000)
y = np.random.normal(loc=600, scale=200, size=1000)
# pass x and y as keywords; newer seaborn versions no longer accept them positionally
plot = sns.jointplot(x=x, y=y, kind="kde")
plot.ax_marg_x.set_xlim(0, 3000)
plot.ax_marg_y.set_ylim(0, 1200)
plt.show()

plot log-scale and linear scale functions and histograms on same canvas

I have a probability density function for which I can only evaluate the logarithm without running into numeric issues. I have a histogram that I would like to plot on the same canvas. However, for the histogram I need the option log=True to have it plotted in log scale, whereas for the function I only have the logarithms of the values directly. How can I plot both on the same canvas?
Please look at this MWE for illustration of the problem:
import matplotlib.pyplot as plt
import random
import math
import numpy as np
sqrt2pi = math.sqrt(2*math.pi)
def gauss(l):
    return [ 1/sqrt2pi * math.exp(-x*x) for x in l]
def loggauss(l):
    return [ -math.log(sqrt2pi) -x*x for x in l ]
# just fill a histogram
h = [ random.gauss(0,1) for x in range(0,1000) ]
# density=True replaces the normed=True argument, which newer Matplotlib no longer accepts
plt.hist(h,bins=21,density=True,log=True)
# this works nicely
xvals = np.arange(-4,4,0.1)
plt.plot(xvals,gauss(xvals),"-k")
# but I would like to plot this on the same canvas:
# plt.plot(xvals,loggauss(xvals),"-r")
plt.show()
Any suggestions?
If I understand correctly, you want to plot two data sets in the same figure, on the same x-axis, but one on a log y-scale and one on a linear y-scale. You can do this using twinx:
fig, ax = plt.subplots()
ax.hist(h,bins=21,density=True,log=True)
ax2 = ax.twinx()
ax2.plot(xvals, loggauss(xvals), '-r')
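With twinx alone the two y-axes are scaled independently, so the curve will not generally line up with the bars. One possible refinement is to set the limits of the linear axis to the natural logarithm of the limits of the log axis. The sketch below assumes the log-density is a natural logarithm and uses the standard normal log-density as a stand-in for loggauss.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
h = rng.normal(size=1000)
xvals = np.arange(-4, 4, 0.1)
# Natural log of the standard normal density (stand-in for loggauss)
log_pdf = -0.5 * np.log(2 * np.pi) - xvals**2 / 2

fig, ax = plt.subplots()
ax.hist(h, bins=21, density=True, log=True)

ax2 = ax.twinx()
ax2.plot(xvals, log_pdf, "-r")

# Make the linear right axis cover exactly the log of the left axis range,
# so a value v on the right sits at the same height as exp(v) on the left
lo, hi = ax.get_ylim()
ax2.set_ylim(np.log(lo), np.log(hi))
```

Call plt.show() to display the figure. If the log-density were in base 10, np.log10 would be used for the limits instead.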

Python - Line colour of 3D parametric curve

I have 2 lists, tab_x (containing the values of x) and tab_z (containing the values of z), which have the same length, and a single value of y.
I want to plot a 3D curve which is colored by the value of z. I know it can be plotted as a 2D plot, but I want to plot a few of these curves with different values of y to compare, so I need it to be 3D.
My tab_z also contains negative values.
I've found the code to color the curve by time (index) in this question, but I don't know how to transform this code to get it to work in my case.
Thanks for the help.
I add my code to be more specific:
fig8 = plt.figure()
ax8 = fig8.gca(projection = '3d')
tab_y=[]
for i in range (0,len(tab_x)):
    tab_y.append(y)
ax8.plot(tab_x, tab_y, tab_z)
I have this for now
I've tried this code
for i in range (0,len(tab_t)):
    ax8.plot(tab_x[i:i+2], tab_y[i:i+2], tab_z[i:i+2],color=plt.cm.rainbow(255*tab_z[i]/max(tab_z)))
A total failure:
Your second attempt almost has it. The only change is that the input to the colormap cm.jet() needs to be in the range 0 to 1. You can scale your z values to fit this range with Normalize.
import numpy as np
from matplotlib import pyplot as plt
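from mpl_toolkits.mplot3d import Axes3D  # only needed to register the 3D projection on older Matplotlib
from matplotlib import colors
fig = plt.figure()
ax = fig.add_subplot(projection='3d')  # fig.gca(projection='3d') no longer works in recent Matplotlib
N = 100
y = np.ones(N)  # constant y, same length as x and z
x = np.arange(1,N + 1)
z = 5*np.sin(x/5.)
cn = colors.Normalize(min(z), max(z)) # creates a Normalize object for these z values
for i in range(N-1):
    # draw each two-point segment in the colour of its starting z value
    ax.plot(x[i:i+2], y[i:i+2], z[i:i+2], color=plt.cm.jet(cn(z[i])))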
plt.show()
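To see why the scaling matters, it may help to compare the two mappings on a few numbers. This is a small sketch with made-up z values; naive reproduces the expression from the question and scaled is what Normalize produces.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import colors

z = np.array([-5.0, -2.5, 0.0, 2.5, 5.0])  # hypothetical z values, including negatives

# Scaling from the question: these floats fall far outside the 0..1 range
naive = 255 * z / z.max()

# Normalize maps [z.min(), z.max()] linearly onto [0, 1]
cn = colors.Normalize(vmin=z.min(), vmax=z.max())
scaled = cn(z)

# One RGBA row per z value
rgba = plt.cm.jet(scaled)
```

As far as I understand, float inputs outside 0 to 1 are clipped to the ends of the colormap, which would explain why the first attempt showed almost no colour variation.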
