I have a CSV table, for example:
[image: the CSV table]
I'm trying to create a heat map where the higher the number, the warmer the color, so that, for example, 20 is shown in a warm color and 0 in blue.
I tried a lot of heat maps, but all of them look blocky, and I want something smoother:
[image: the desired smooth heat map]
I used matplotlib, but the result looks like squares:
[image: the blocky result]
I also tried the following code, where data is the table shown above:
n = 1e5
Z1 = ml.bivariate_normal(data, data, 2, 2, 0, 0)
Z2 = ml.bivariate_normal(data, data, 4, 1, 1, 1)
ZD = Z2 - Z1
x = data.ravel()
y = data.ravel()
z = ZD.ravel()
gridsize = 10
plt.subplot(111)
plt.hexbin(x, y, C=z, gridsize=gridsize, cmap=cm.jet, bins=None)
plt.axis([x.min(), x.max(), y.min(), y.max()])
cb = plt.colorbar()
cb.set_label('mean value')
plt.show()
but it shows me a straight line (I don't know why; maybe because the numbers were divided by themselves, which produces a straight line).
If you look at this answer, you can see how to create the heatmap properly. To get your desired result, just change the interpolation argument from nearest to, for instance, bicubic:
import matplotlib.pyplot as plt
import numpy as np
a = np.random.random((16, 16))
plt.imshow(a, cmap="hot", interpolation="bicubic")
plt.show()
The other available interpolation options are the following:
'antialiased', 'bilinear', 'bicubic', 'spline16', 'spline36', 'hanning', 'hamming', 'hermite', 'kaiser', 'quadric', 'catrom', 'gaussian', 'bessel', 'mitchell', 'sinc', 'lanczos', 'blackman'
To convert your csv file to a numpy array you can use the following code:
from numpy import genfromtxt
my_data = genfromtxt("test.csv", delimiter=",")
but, of course, you can use other data structures, like pandas dataframes (in this case use pandas.read_csv("test.csv")).
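Putting the two steps together, a minimal sketch might look like this. It assumes the CSV is a plain rectangular grid of numbers with no header row; the inline sample text stands in for test.csv, and the load_grid helper name is made up for illustration. The coolwarm colormap is used only because it runs from blue (low) to red (high), as the question asks.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Stand-in for "test.csv" (assumed: numbers only, no header row).
SAMPLE_CSV = "0,2,5,3\n1,8,20,6\n0,4,9,2\n"


def load_grid(source):
    """Read a comma-separated grid of numbers into a 2-D float array."""
    return np.genfromtxt(source, delimiter=",")


grid = load_grid(io.StringIO(SAMPLE_CSV))

fig, ax = plt.subplots()
# Bicubic interpolation smooths the blocky cells; vmin/vmax pin the colour range.
image = ax.imshow(grid, cmap="coolwarm", interpolation="bicubic",
                  vmin=grid.min(), vmax=grid.max())
fig.colorbar(image, ax=ax, label="value")
```

For a real file, pass its path to load_grid instead of the StringIO object, then call plt.show() or fig.savefig(...) to see the result.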
I am looking to produce a graph plotting the points of particles under the action of gravity and am currently producing a plot as below:
However, I would like to produce a clearer plot showing a line for the path of the particles and a marker at the final point indicating their final positions, like in the plot below:
My current line of code plotting each line is:
plt.plot(N_pos[:,0] * AU, N_pos[:,1], 'o')
This just plots the x and y coordinate from an array listing the x, y and z coordinate for each particle
Is the simplest way to do this to remove the 'o' marker from the code and just plot the last position of each particle again, this time using a marker? If so, how do I make the line and the final marker the same colour, unlike in the code below?
for i in range(len(all_positions[0])):
    N_pos = all_positions[:,i]
    plt.plot(N_pos[:,0] , N_pos[:,1])
    plt.plot(N_pos[:,0][-1] , N_pos[:,1][-1], 'o')
When no explicit color is given, plt.plot() cycles through a list of default colors.
A simple solution would be to extract the color from the lineplot and provide it as the color for the dot:
import numpy as np
import matplotlib.pyplot as plt
a = np.random.randn(200, 10, 1).cumsum(axis=0) * 0.1
all_positions = np.dstack([np.sin(a), np.cos(a)]).cumsum(axis=0)
for i in range(len(all_positions[0])):
    N_pos = all_positions[:, i]
    line, = plt.plot(N_pos[:, 0], N_pos[:, 1])
    plt.plot(N_pos[:, 0][-1], N_pos[:, 1][-1], 'o', color=line.get_color())
plt.show()
Another option would be to create a scatter plot, and set the size of the dots via an array. For example, N-1 times 1 and one time 20:
for i in range(len(all_positions[0])):
    N_pos = all_positions[:, i]
    plt.scatter(N_pos[:, 0], N_pos[:, 1], s=np.append(np.ones(len(N_pos) - 1), 20))
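A third possibility, not mentioned above and offered only as a sketch, is to draw line and marker in a single plot() call and use the markevery argument so that only the last point gets a marker. Line and marker then share one colour automatically. The random trajectories below are made-up stand-ins for all_positions.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
# Made-up trajectories: 200 time steps, 10 particles, (x, y) per step.
all_positions = rng.standard_normal((200, 10, 2)).cumsum(axis=0)

fig, ax = plt.subplots()
lines = []
for i in range(all_positions.shape[1]):
    N_pos = all_positions[:, i]
    # '-o' requests a line with markers; markevery=[-1] keeps only the last marker.
    line, = ax.plot(N_pos[:, 0], N_pos[:, 1], "-o", markevery=[-1])
    lines.append(line)
```

Because each particle is a single Line2D object here, a legend entry would also show the line and its marker together.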
You can define your own color palette and give each trace its unique(ish) color:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
np.random.seed(123)  # fix the seed so the example is reproducible
all_positions = np.random.randn(10, 5, 2).cumsum(axis=0) #shamelessly stolen from JohanC
l = all_positions.shape[1]
my_cmap = cm.plasma
for i in range(l):
    N_pos = all_positions[:,i]
    plt.plot(N_pos[:,0], N_pos[:,1], c= my_cmap(i/l))
    plt.plot(N_pos[:,0][-1], N_pos[:,1][-1], 'o', color=my_cmap(i/l))
plt.show()
You can reset the color cycler and plot the markers in a second round (not recommended, just to illustrate cycler properties):
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(123)  # fix the seed so the example is reproducible
all_positions = np.random.randn(10, 5, 2).cumsum(axis=0)
l = all_positions.shape[1]
for i in range(l):
    N_pos = all_positions[:,i]
    plt.plot(N_pos[:,0], N_pos[:,1])
plt.gca().set_prop_cycle(None)
for i in range(l):
    N_pos = all_positions[:,i]
    plt.plot(N_pos[:,0][-1], N_pos[:,1][-1], 'o')
plt.show()
I would like to plot a vector field with curved arrows in python, as can be done in vfplot (see below) or IDL.
You can get close in matplotlib, but using quiver() limits you to straight vectors (see below left) whereas streamplot() doesn't seem to permit meaningful control over arrow length or arrowhead position (see below right), even when changing integration_direction, density, and maxlength.
So, is there a python library that can do this? Or is there a way of getting matplotlib to do it?
If you look at the streamplot.py that is included in matplotlib, on lines 196 - 202 (ish, idk if this has changed between versions - I'm on matplotlib 2.1.2) we see the following:
... (to line 195)
# Add arrows half way along each trajectory.
s = np.cumsum(np.sqrt(np.diff(tx) ** 2 + np.diff(ty) ** 2))
n = np.searchsorted(s, s[-1] / 2.)
arrow_tail = (tx[n], ty[n])
arrow_head = (np.mean(tx[n:n + 2]), np.mean(ty[n:n + 2]))
... (after line 196)
changing that part to this will do the trick (changing assignment of n):
... (to line 195)
# Add arrows half way along each trajectory.
s = np.cumsum(np.sqrt(np.diff(tx) ** 2 + np.diff(ty) ** 2))
n = np.searchsorted(s, s[-1]) ### THIS IS THE EDITED LINE! ###
arrow_tail = (tx[n], ty[n])
arrow_head = (np.mean(tx[n:n + 2]), np.mean(ty[n:n + 2]))
... (after line 196)
If you modify this to put the arrow at the end, then you could generate the arrows more to your liking.
Additionally, from the docs at the top of the function, we see the following:
*linewidth* : numeric or 2d array
vary linewidth when given a 2d array with the same shape as velocities.
The linewidth can be a numpy.ndarray, and if you can pre-calculate the desired width of your arrows, you'll be able to modify the pencil width while drawing the arrows. It looks like this part has already been done for you.
So, in combination with shortening the arrows maxlength, increasing the density, and adding start_points, as well as tweaking the function to put the arrow at the end instead of the middle, you could get your desired graph.
With these modifications, and the following code, I was able to get a result much closer to what you wanted:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
import matplotlib.patches as pat
w = 3
Y, X = np.mgrid[-w:w:100j, -w:w:100j]
U = -1 - X**2 + Y
V = 1 + X - Y**2
speed = np.sqrt(U*U + V*V)
fig = plt.figure(figsize=(14, 18))
gs = gridspec.GridSpec(nrows=3, ncols=2, height_ratios=[1, 1, 2])
grains = 10
tmp = tuple([x]*grains for x in np.linspace(-2, 2, grains))
xs = []
for x in tmp:
    xs += x
ys = tuple(np.linspace(-2, 2, grains))*grains
seed_points = np.array([list(xs), list(ys)])
# Varying color along a streamline
ax1 = fig.add_subplot(gs[0, 1])
strm = ax1.streamplot(X, Y, U, V, color=U, linewidth=np.array(5*np.random.random_sample((100, 100))**2 + 1), cmap='winter', density=10,
minlength=0.001, maxlength = 0.07, arrowstyle='fancy',
integration_direction='forward', start_points = seed_points.T)
fig.colorbar(strm.lines)
ax1.set_title('Varying Color')
plt.tight_layout()
plt.show()
tl;dr: go copy the source code, and change it to put the arrows at the end of each path, instead of in the middle. Then use your streamplot instead of the matplotlib streamplot.
Edit: I got the linewidths to vary
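The snippet above draws its linewidths from random numbers. As a sketch of pre-calculating them instead, the array below is derived from the field's speed, so faster regions get thicker lines. The floor of 0.5 and the factor of 4 are arbitrary choices for illustration, and this uses the unmodified matplotlib streamplot.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np

w = 3
Y, X = np.mgrid[-w:w:100j, -w:w:100j]
U = -1 - X**2 + Y
V = 1 + X - Y**2
speed = np.sqrt(U**2 + V**2)

# Linewidth array with the same shape as U and V: thin where slow, thick where fast.
lw = 0.5 + 4 * speed / speed.max()

fig, ax = plt.subplots()
strm = ax.streamplot(X, Y, U, V, color=speed, linewidth=lw, cmap="winter", density=1)
fig.colorbar(strm.lines, ax=ax)
```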
Starting with David Culbreth's modification, I rewrote chunks of the streamplot function to achieve the desired behaviour. The changes are too numerous to list here, but they include a length-normalising method and disabling the trajectory-overlap checking. I've appended two comparisons of the new curved quiver function with the original streamplot and quiver.
Here's a way to obtain the desired output in vanilla pyplot (i.e., without modifying the streamplot function or anything that fancy). As a reminder, the goal is to visualize a vector field with curved arrows whose length is proportional to the norm of the vector.
The trick is to:
1. make a streamplot with no arrows that is traced backward from a given point
2. plot a quiver from that point, making the quiver small enough that only the arrowhead is visible
3. repeat 1. and 2. in a loop for every seed, scaling the length of the streamplot to be proportional to the norm of the vector.
import matplotlib.pyplot as plt
import numpy as np
w = 3
Y, X = np.mgrid[-w:w:8j, -w:w:8j]
U = -Y
V = X
norm = np.sqrt(U**2 + V**2)
norm_flat = norm.flatten()
start_points = np.array([X.flatten(),Y.flatten()]).T
plt.clf()
scale = .2/np.max(norm)
plt.subplot(121)
plt.title('scaling only the length')
for i in range(start_points.shape[0]):
    plt.streamplot(X,Y,U,V, color='k', start_points=np.array([start_points[i,:]]),minlength=.95*norm_flat[i]*scale, maxlength=1.0*norm_flat[i]*scale,
                   integration_direction='backward', density=10, arrowsize=0.0)
plt.quiver(X,Y,U/norm, V/norm,scale=30)
plt.axis('square')
plt.subplot(122)
plt.title('scaling length, arrowhead and linewidth')
for i in range(start_points.shape[0]):
    plt.streamplot(X,Y,U,V, color='k', start_points=np.array([start_points[i,:]]),minlength=.95*norm_flat[i]*scale, maxlength=1.0*norm_flat[i]*scale,
                   integration_direction='backward', density=10, arrowsize=0.0, linewidth=.5*norm_flat[i])
plt.quiver(X,Y,U/np.max(norm), V/np.max(norm),scale=30)
plt.axis('square')
Just looking at the documentation on streamplot(): what if you used something like streamplot(..., minlength=n/2, maxlength=n), where n is the desired length? You will need to play with those numbers a bit to get your desired graph.
You can control the seed points using start_points, as shown in the example provided by @JohnKoch.
Here's an example of how I controlled the length with streamplot(). It's pretty much a straight copy/paste/crop of the example above.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
import matplotlib.patches as pat
w = 3
Y, X = np.mgrid[-w:w:100j, -w:w:100j]
U = -1 - X**2 + Y
V = 1 + X - Y**2
speed = np.sqrt(U*U + V*V)
fig = plt.figure(figsize=(14, 18))
gs = gridspec.GridSpec(nrows=3, ncols=2, height_ratios=[1, 1, 2])
grains = 10
tmp = tuple([x]*grains for x in np.linspace(-2, 2, grains))
xs = []
for x in tmp:
    xs += x
ys = tuple(np.linspace(-2, 2, grains))*grains
seed_points = np.array([list(xs), list(ys)])
arrowStyle = pat.ArrowStyle.Fancy()
# Varying color along a streamline
ax1 = fig.add_subplot(gs[0, 1])
strm = ax1.streamplot(X, Y, U, V, color=U, linewidth=1.5, cmap='winter', density=10,
minlength=0.001, maxlength = 0.1, arrowstyle='->',
integration_direction='forward', start_points = seed_points.T)
fig.colorbar(strm.lines)
ax1.set_title('Varying Color')
plt.tight_layout()
plt.show()
Edit: made it prettier, though still not quite what we were looking for.
I am quite a beginner in coding. I'm trying to plot curves from two-column x-y data as a continuous line, not as a scatter plot. I want the line to be colored according to the value of y.
I can make it work for a scatter plot but not for a line plot.
My code:
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import matplotlib
# read data ... (the data are two x-y columns, so one can simply use two lists, say a and b)
a = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17]
b = [11,12,3,34,55,16,17,18,59,50,51,42,13,14,35,16,17]
fig = plt.figure()
ax = fig.add_subplot(111)
bnorm = []
for i in b:
    i = i/float(np.max(b)) ### normalizing the data
    bnorm.append(i)
plt.scatter(a, b, c = plt.cm.jet(bnorm))
plt.show()
With scatter it works.
How can I make it a line plot colored in the same way?
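One way this is often done in matplotlib, sketched here as an assumption rather than as an answer from the thread, is to split the curve into short two-point segments and draw them as a LineCollection, colouring each segment by its y value. The sketch reuses the a and b lists from the question; colouring by the mean y of each segment's end points is an arbitrary choice.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import LineCollection

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], dtype=float)
b = np.array([11, 12, 3, 34, 55, 16, 17, 18, 59, 50, 51, 42, 13, 14, 35, 16, 17], dtype=float)

# Build one short segment per pair of neighbouring points: shape (N-1, 2, 2).
points = np.column_stack([a, b]).reshape(-1, 1, 2)
segments = np.concatenate([points[:-1], points[1:]], axis=1)

# Colour each segment by the mean y value of its two end points.
segment_values = 0.5 * (b[:-1] + b[1:])

fig, ax = plt.subplots()
lc = LineCollection(segments, cmap="jet", norm=plt.Normalize(b.min(), b.max()))
lc.set_array(segment_values)
lc.set_linewidth(2)
ax.add_collection(lc)
ax.autoscale()  # rescale the view so the added collection is fully visible
fig.colorbar(lc, ax=ax, label="y")
```

With only 17 points the colour changes in visible steps; resampling the data onto a finer x grid before building the segments would give a smoother gradient.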
I have a set of data which I want plotted as a line-graph. For each series, some data is missing (but different for each series). Currently matplotlib does not draw lines which skip missing data: for example
import matplotlib.pyplot as plt
xs = range(8)
series1 = [1, 3, 3, None, None, 5, 8, 9]
series2 = [2, None, 5, None, 4, None, 3, 2]
plt.plot(xs, series1, linestyle='-', marker='o')
plt.plot(xs, series2, linestyle='-', marker='o')
plt.show()
results in a plot with gaps in the lines. How can I tell matplotlib to draw lines through the gaps? (I'd rather not have to interpolate the data).
You can mask the NaN values this way:
import numpy as np
import matplotlib.pyplot as plt
xs = np.arange(8)
series1 = np.array([1, 3, 3, None, None, 5, 8, 9]).astype(np.double)
s1mask = np.isfinite(series1)
series2 = np.array([2, None, 5, None, 4, None, 3, 2]).astype(np.double)
s2mask = np.isfinite(series2)
plt.plot(xs[s1mask], series1[s1mask], linestyle='-', marker='o')
plt.plot(xs[s2mask], series2[s2mask], linestyle='-', marker='o')
plt.show()
Quoting @Rutger Kassies (link):
Matplotlib only draws a line between consecutive (valid) data points,
and leaves a gap at NaN values.
A solution if you are using pandas:
# pd.Series
s.dropna().plot()  # masking (as @Thorsten Kranz suggested)
#pd.DataFrame
df['a_col_ffill'] = df['a_col'].ffill()
df['b_col_ffill'] = df['b_col'].ffill() # changed from a to b
df[['a_col_ffill','b_col_ffill']].plot()
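To see what these two approaches do to the data before it is plotted, here is a small sketch using the first series from the question. Note that ffill repeats the last valid value, so the plotted line has flat steps across each gap rather than a straight connecting segment.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 3, 3, np.nan, np.nan, 5, 8, 9], index=range(8), dtype=float)

dropped = s.dropna()  # removes missing points; the index keeps the original x values
filled = s.ffill()    # repeats the last valid value across each gap

print(list(dropped.index))  # [0, 1, 2, 5, 6, 7]
print(filled.tolist())      # [1.0, 3.0, 3.0, 3.0, 3.0, 5.0, 8.0, 9.0]
```

Calling dropped.plot() then joins x=2 directly to x=5, while filled.plot() holds the value 3 until x=4 and jumps to 5 afterwards.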
A solution with pandas:
import matplotlib.pyplot as plt
import pandas as pd
def splitSerToArr(ser):
    # Series.as_matrix() was removed in pandas 1.0; to_numpy() is its replacement
    return [ser.index, ser.to_numpy()]
xs = range(8)
series1 = [1, 3, 3, None, None, 5, 8, 9]
series2 = [2, None, 5, None, 4, None, 3, 2]
s1 = pd.Series(series1, index=xs)
s2 = pd.Series(series2, index=xs)
plt.plot( *splitSerToArr(s1.dropna()), linestyle='-', marker='o')
plt.plot( *splitSerToArr(s2.dropna()), linestyle='-', marker='o')
plt.show()
The splitSerToArr function is very handy when plotting with pandas.
Without interpolation you'll need to remove the None's from the data. This also means you'll need to remove the X-values corresponding to None's in the series. Here's an (ugly) one liner for doing that:
x1Clean,series1Clean = zip(* filter( lambda x: x[1] is not None , zip(xs,series1) ))
The lambda function returns False for None values, so filter drops those (x, series) pairs from the list. The outer zip(*...) then unzips the remaining pairs back into separate x and y sequences.
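The same filtering can be spelled out in two steps, which may be easier to read. This is only a restatement of the one-liner; the names pairs, x1_clean and series1_clean are chosen here for illustration.

```python
xs = range(8)
series1 = [1, 3, 3, None, None, 5, 8, 9]

# Keep only the (x, y) pairs whose y value is present.
pairs = [(x, y) for x, y in zip(xs, series1) if y is not None]

# Unzip back into two tuples, ready for plt.plot(x1_clean, series1_clean).
x1_clean, series1_clean = zip(*pairs)
```

Both versions fail on a series that contains only None values, because unpacking an empty zip yields nothing to assign.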
For what it may be worth, after some trial and error I would like to add one clarification to Thorsten's solution, hopefully saving time for users who looked elsewhere after having tried this approach.
I was unable to get success with an identical problem while using
from pyplot import *
and attempting to plot with
plot(abscissa[mask],ordinate[mask])
It seemed that import matplotlib.pyplot as plt was required to get proper NaN handling, though I cannot say why.
Another solution for pandas DataFrames:
plot = df.plot(style='o-') # draw the lines so they appears in the legend
colors = [line.get_color() for line in plot.lines] # get the colors of the markers
df = df.interpolate(limit_area='inside') # interpolate
lines = plot.plot(df.index, df.values) # add more lines (with a new set of colors)
for color, line in zip(colors, lines):
    line.set_color(color) # overwrite the new lines' colors with the same colors as the old lines
I had the same problem, but masking removed the points in between and the line was still cut (the pink lines in the picture are the only runs of consecutive non-NaN data, which is why those segments were drawn). Here is the code that masks the data (the plot still had gaps):
xs = df['time'].to_numpy()
series1 = np.array(df['zz'].to_numpy()).astype(np.double)
s1mask = np.isfinite(series1)
fplt.plot(xs[s1mask], series1[s1mask], ax=ax_candle, color='#FF00FF', width = 1, legend='ZZ')
Maybe this happened because I was using finplot (to plot a candlestick chart). So I decided to fill in the missing Y-axis points using the line formula y2 - y1 = m(x2 - x1), and wrote a function that generates the Y values between the missing points.
def fillYLine(y):
    # Line formula; assumes numpy is imported as np
    fi = 0
    first = None
    next = None
    for i in range(0, len(y), 1):
        ne = not(np.isnan(y[i]))
        next = y[i] if ne else next
        if not(next is None):
            if not(first is None):
                m = (first-next)/(i-fi)  # m = (y1 - y2) / (x1 - x2)
                cant_points = np.abs(i-fi)-1  # number of missing points in the gap
                if (cant_points) > 0:
                    # Create the line between the two valid values to generate the missing points
                    points = createLine(next, first, i, fi, cant_points)
                    x = 1
                    for p in points:
                        y[fi+x] = p
                        x = x + 1
            first = next
            fi = i
        next = None
    return y

def createLine(y2, y1, x2, x1, cant_points):
    m = (y2-y1)/(x2-x1)  # slope
    points = []
    x = x1 + 1  # first point to assign
    for i in range(0, cant_points, 1):
        y = ((m*(x2-x))-y2)*-1
        points.append(y)
        # The line uses integer positions rather than the time values, but in the same order
        x = x + 1
    return points
Then I simply call the function to fill the gaps, as in y = fillYLine(y), and my finplot call became:
x = df['time'].to_numpy()
y = df['zz'].to_numpy()
y = fillYLine(y)
fplt.plot(x, y, ax=ax_candle, color='#FF00FF', width = 1, legend='ZZ')
Keep in mind that the data in the y variable is only for the plot. I need the NaN values in between for other operations (or need to remove them from the list), which is why I created a separate y variable from the pandas column df['zz'].
Note: I noticed that the data is eliminated in my case because, if I don't mask X (xs), the values slide to the left in the graph. They then become consecutive non-NaN values, and the line is drawn without gaps but compressed to the left:
fplt.plot(xs, series1[s1mask], ax=ax_candle, color='#FF00FF', width = 1, legend='ZZ') #No xs masking (xs[masking])
This made me think that masking works for some people because they are plotting only that line, or because there is little difference between the masked and unmasked data (few gaps, unlike my data, which has a lot).
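For comparison, the same gap filling can probably be expressed with numpy alone. The sketch below is not the code from this answer: fill_gaps_linear is a hypothetical helper that, like fillYLine, interpolates over the sample index (so it assumes evenly spaced samples) and leaves leading and trailing gaps as NaN. Unlike fillYLine, it returns a new array instead of modifying its input.

```python
import numpy as np


def fill_gaps_linear(y):
    """Return a copy of y with interior NaN runs filled by linear interpolation."""
    y = np.asarray(y, dtype=float)
    idx = np.arange(len(y))
    valid = np.isfinite(y)
    if valid.sum() < 2:
        return y.copy()  # nothing to interpolate between
    # left/right=NaN keeps leading and trailing gaps empty instead of extrapolating.
    return np.interp(idx, idx[valid], y[valid], left=np.nan, right=np.nan)


filled = fill_gaps_linear([np.nan, 1.0, np.nan, np.nan, 4.0, 6.0, np.nan])
# filled is [nan, 1.0, 2.0, 3.0, 4.0, 6.0, nan]
```

The result can be passed to the plotting call in place of y, while the original column keeps its NaN values for other calculations.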
Perhaps I missed the point, but I believe Pandas now does this automatically. The example below is a little involved, and requires internet access, but the line for China has lots of gaps in the early years, hence the straight line segments.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# read data from Maddison project
url = 'http://www.ggdc.net/maddison/maddison-project/data/mpd_2013-01.xlsx'
mpd = pd.read_excel(url, skiprows=2, index_col=0, na_values=[' '])
mpd.columns = list(map(str.rstrip, mpd.columns))  # list() because map is lazy in Python 3
# select countries
countries = ['England/GB/UK', 'USA', 'Japan', 'China', 'India', 'Argentina']
mpd = mpd[countries].dropna()
mpd = mpd.rename(columns={'England/GB/UK': 'UK'})
mpd = np.log(mpd)/np.log(2) # convert to log2
# plots
ax = mpd.plot(lw=2)
ax.set_title('GDP per person', fontsize=14, loc='left')
ax.set_ylabel('GDP Per Capita (1990 USD, log2 scale)')
ax.legend(loc='upper left', fontsize=10, handlelength=2, labelspacing=0.15)
fig = ax.get_figure()
fig.show()