Draw seamless distribution of tweets

I've collected tweets from Twitter and now I'm trying to draw their geographic distribution. To do that, I divide the entire square area into small squares and count the number of tweets in each square. Finally, I use matplotlib to draw the following figure:
ax.plot_surface(X, Y, Z, rstride=1, cstride=1, alpha=0.3, cmap='Accent')
The problem is that the elevation map is not smooth. I'd like a way to draw a smooth surface from the data. A 2D analogue is the histogram of an image, over which we can draw a smooth curve as follows:
So my question is: is there a way to draw a smooth surface from the discrete data?

Expanding on my answer, here's what you can get with resampling and smoothing (gaussian_filter()) or spline interpolation (RectBivariateSpline). Note that it would have been nice to have template code that plots your graph; since you haven't provided any, I had to improvise.
import numpy
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

def plot(name, method):
    # random sparse "counts" standing in for the real data
    numpy.random.seed(123)
    x = numpy.linspace(0, 50, 51)
    X, Y = numpy.meshgrid(x, x)
    Z = numpy.zeros((x.size, x.size))
    for n in range(50):
        i = numpy.random.randint(0, x.size)
        j = numpy.random.randint(0, x.size)
        Z[i, j] = numpy.abs(numpy.random.normal()) * 1000
    fig = plt.figure()
    ax = fig.add_subplot(111, projection='3d')
    if method == 0:
        # regular plot
        ax.plot_surface(X, Y, Z, rstride=1, cstride=1, alpha=0.3, cmap='Accent')
    else:
        # create a finer grid
        resample_coeff = 2
        Z2 = numpy.repeat(Z, resample_coeff, 0).repeat(resample_coeff, 1)
        x2 = numpy.linspace(x[0], x[-1], x.size * resample_coeff)
        X2, Y2 = numpy.meshgrid(x2, x2)
        if method == 1:
            # smoothing (scipy.ndimage.filters is a deprecated namespace)
            from scipy.ndimage import gaussian_filter
            Z2 = gaussian_filter(Z2, 1)
        elif method == 2:
            # interpolation
            from scipy.interpolate import RectBivariateSpline
            spline = RectBivariateSpline(
                x, x, Z, bbox=[x[0], x[-1], x[0], x[-1]])
            # rows of Z follow Y and columns follow X, so Y2 goes first
            Z2 = spline.ev(Y2, X2)
        ax.plot_surface(X2, Y2, Z2, rstride=1, cstride=1, alpha=0.3, cmap='Accent')
    fig.savefig(name)

if __name__ == '__main__':
    plot('t0.png', 0)
    plot('t1.png', 1)
    plot('t2.png', 2)
Initial graph:
Smoothing:
Interpolation (notice the negative regions; that's polynomial interpolation for you):
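For the original tweet-count case, the same smoothing idea can be applied directly to the binned counts. The following is only a sketch under assumptions: the `lon` and `lat` arrays are synthetic stand-ins for tweet coordinates, and the bin count and `sigma` are arbitrary choices, not values from the question. Because the Gaussian kernel has non-negative weights, the smoothed counts cannot go negative, unlike the spline interpolation above.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

# Synthetic stand-in for tweet coordinates (hypothetical data).
rng = np.random.default_rng(0)
lon = rng.normal(loc=0.0, scale=1.0, size=2000)
lat = rng.normal(loc=0.0, scale=1.0, size=2000)

# Count tweets per grid cell: 50 x 50 cells over the bounding box.
counts, lon_edges, lat_edges = np.histogram2d(lon, lat, bins=50)

# Smooth the counts; sigma is measured in grid cells.
smooth = gaussian_filter(counts, sigma=2)

# Cell centres, usable as the X and Y arguments of plot_surface.
# indexing='ij' matches histogram2d, whose first axis follows lon.
lon_c = 0.5 * (lon_edges[:-1] + lon_edges[1:])
lat_c = 0.5 * (lat_edges[:-1] + lat_edges[1:])
X, Y = np.meshgrid(lon_c, lat_c, indexing='ij')
```

The arrays `X`, `Y` and `smooth` can then be passed to `ax.plot_surface(X, Y, smooth, ...)` in the same way as in the code above.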

Related

scipy griddata produces nan values between samples

I'm trying to interpolate grid points based on unstructured samples. My samples are taken from a log space between 0.01 and 10 (x axis) and between 1e-8 and 1 (y axis). When I run this code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.interpolate import griddata
data = pd.read_csv('data.csv')
param1, param2, errors = data['param1'].values, data['param2'].values, data['error'].values
x = np.linspace(param1.min(), param1.max(), 100, endpoint=True)
y = np.linspace(param2.min(), param2.max(), 100, endpoint=True)
X, Y = np.meshgrid(x, y)
Z = griddata((param1, param2), errors, (X, Y), method='linear')
fig, ax = plt.subplots(figsize=(10, 7))
cax = ax.contourf(X, Y, Z, 25, cmap='hot')
ax.scatter(param1, param2, s=1, color='black', alpha=0.4)
ax.set(xscale='log', yscale='log')
cbar = fig.colorbar(cax)
fig.tight_layout()
I get this result. The white area shows NaN values. Both x and y axes are in log scale:
Even though there are samples in the white area (the scatter points prove that), griddata produces NaNs. There are no NaNs/infs in the data. Am I missing something, or is it just a bug in SciPy?
data.csv

This is due to the linear spacing of your X-Y interpolation grid combined with the logarithmic scaling of the axes. It is fairly easily fixed by geometrically ("logarithmically") spacing the interpolation grid.
One can also interpolate in log-space; IMO this gives a better-looking result, but it may not be valid.
Here's a more coarsely sampled version of your figure, showing how the interpolation grid points are "clumped up" to the top right in the log-scaled plot. The top row of axes shows where the data is finite; the bottom row is the "real" plot:
You can see that the extreme left and extreme bottom points of a linearly spaced sample grid are (just!) outside the set of sample values, and griddata with method='linear' returns NaN outside the convex hull of the samples. This is especially bad because the next closest lines of grid points are visually far away due to the logarithmic scaling.
Here's a result with the interpolation grid geometrically spaced, and interpolation also done in that space.
You can run the code below to view the other two variants.
from itertools import product
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import griddata
import pandas as pd

CMAP = None
# crude, to make interpolation grid visible
NX = 11
NY = 11

def plot_general(log_grid=False, log_interp=False):
    data = pd.read_csv('data.csv')
    param1, param2, errors = data['param1'].values, data['param2'].values, data['error'].values
    if log_grid:
        x = np.geomspace(param1.min(), param1.max(), NX)
        y = np.geomspace(param2.min(), param2.max(), NY)
    else:
        x = np.linspace(param1.min(), param1.max(), NX)
        y = np.linspace(param2.min(), param2.max(), NY)
    X, Y = np.meshgrid(x, y)
    if log_interp:
        Z = griddata((np.log10(param1), np.log10(param2)), errors, (np.log10(X), np.log10(Y)), method='linear')
    else:
        Z = griddata((param1, param2), errors, (X, Y), method='linear')
    fZ = np.isfinite(Z)
    fig, ax = plt.subplots(2, 2)
    # top row: where the interpolated data is finite (linear and log axes)
    ax[0,0].contourf(X, Y, fZ, levels=[0.5,1.5])
    ax[0,0].scatter(param1, param2, s=1, color='black')
    ax[0,0].plot(X.flat, Y.flat, '.', color='blue')
    ax[0,1].contourf(X, Y, fZ, levels=[0.5,1.5])
    ax[0,1].scatter(param1, param2, s=1, color='black')
    ax[0,1].plot(X.flat, Y.flat, '.', color='blue')
    ax[0,1].set(xscale='log', yscale='log')
    # bottom row: the interpolated values (linear and log axes)
    ax[1,0].contourf(X, Y, Z, levels=25, cmap=CMAP)
    ax[1,0].scatter(param1, param2, s=1, color='black')
    ax[1,0].plot(X.flat, Y.flat, '.', color='blue')
    ax[1,1].contourf(X, Y, Z, levels=25, cmap=CMAP)
    ax[1,1].scatter(param1, param2, s=1, color='black')
    ax[1,1].set(xscale='log', yscale='log')
    ax[1,1].plot(X.flat, Y.flat, '.', color='blue')
    fig.suptitle(f'{log_grid=}, {log_interp=}')
    fig.tight_layout()
    return fig

plt.close('all')
for log_grid, log_interp in product([False, True],
                                    [False, True]):
    fig = plot_general(log_grid, log_interp)
    #if you want to save results:
    #fig.savefig(f'log_grid{log_grid}-log_interp{log_interp}.png')
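To make the "clumping" concrete without the data file, here is a small sketch that counts how many grid points of each kind fall in the lowest decade of the x-range quoted in the question (0.01 to 10, 100 points). The threshold and variable names are illustrative choices only.

```python
import numpy as np

# Same x-range and grid size as in the question.
x_lin = np.linspace(0.01, 10, 100)
x_geo = np.geomspace(0.01, 10, 100)

# Number of grid points below 0.1, i.e. in the lowest of the three decades.
n_lin = int(np.sum(x_lin < 0.1))
n_geo = int(np.sum(x_geo < 0.1))

print("linear grid points below 0.1:   ", n_lin)
print("geometric grid points below 0.1:", n_geo)
```

On log-scaled axes that lowest decade takes up a third of the plot width, so a linear grid leaves it almost empty while a geometric grid covers it evenly.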

Plot surface with binary colormap

I would like to make a 3d plot of a surface parametrised by a function, and I would like the surface to be of one color (say white) where it is above some value a, and of another color (say black) where it is below a.
Here is the code to generate and plot the surface (the way the surface is generated is not important, it could be a much simpler function):
from __future__ import division
import numpy as np
import time,random
random.seed(-2)

def build_spden(N,M, alpha):
    #computes the spectral density in momentum space
    sp_den = np.zeros((N,M))
    # range instead of prange, which is not imported here (likely numba.prange)
    for k1 in range(-N//2, N//2):
        for k2 in range(-M//2, M//2):
            sp_den[k1,k2] = np.abs(2*(np.cos(2*np.pi*k1/N)+np.cos(2*np.pi*k2/M)-2))
    sp_den[0,0]=1
    return 1/sp_den**(alpha/2)

def gaussian_field(N,M,alpha):
    '''Builds a correlated gaussian field on a surface NxM'''
    spectral_density = build_spden(N,M, alpha)
    # FFT of gaussian noise:
    noise_real = np.random.normal(0, 1, size = (N,M))
    noise_fourier = np.fft.fft2(noise_real)
    # Add correlations by Fourier Filtering Method:
    convolution = noise_fourier*np.sqrt(spectral_density)
    # Take IFFT and exclude residual complex part
    correlated_noise = np.fft.ifft2(convolution).real
    # Return normalized field
    return correlated_noise * (np.sqrt(N*M)/np.sqrt(np.sum(spectral_density)) )
#PLOT
N = 2**5
alpha = .75
a = -.1985
surf = gaussian_field(N,N,alpha)
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
x = np.outer(np.arange(0, N), np.ones(N))
y = x.copy().T # transpose
z = surf
fig = plt.figure()
ax = plt.axes(projection='3d')
ax.plot_surface(x, y, z,alpha=.4) #plot the surface
z2 = a*np.ones((N,N))
ax.plot_surface(x, y, z2, alpha=0.9) #plot a plane z = a.
plt.show()
The output is:
I would therefore like the surface to be white above the plane and black below.
Many thanks !

You can define a custom color map and norm and pass them to plot_surface:
from matplotlib.colors import ListedColormap, BoundaryNorm
cmap = ListedColormap(['r', 'b'])
norm = BoundaryNorm([z.min(), a, z.max()], cmap.N)
ax.plot_surface(x, y, z, cmap=cmap, norm=norm, alpha=.4) #plot the surface
z2 = a*np.ones((N,N))
ax.plot_surface(x, y, z2, alpha=0.9) #plot a plane z = a.
plt.show()
Output:
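To see what the norm does on its own, and to get the white/black colors asked for in the question rather than red/blue, here is a small sketch. The sample heights in `z` are made up for illustration; only the threshold `a` comes from the question.

```python
import numpy as np
from matplotlib.colors import ListedColormap, BoundaryNorm

a = -0.1985                             # threshold from the question
z = np.array([-1.0, -0.5, 0.0, 0.7])    # made-up sample heights

# Color 0 (black) below the threshold, color 1 (white) above it.
cmap = ListedColormap(['black', 'white'])
norm = BoundaryNorm([z.min(), a, z.max()], cmap.N)

# Bin index for each height; the topmost value is left out here because
# it sits exactly on the upper boundary.
idx = norm(z[:-1])
rgba = cmap(idx)    # one RGBA row per height

print(np.asarray(idx))
print(rgba)
```

Passing `cmap=cmap, norm=norm` to `plot_surface` as in the answer then colors each face by the bin its height falls into.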

How to take into account the data's uncertainty (standard deviation) when fitting with scipy.linalg.lstsq?

I am trying to fit a surface to 3D data (z is a function of x and y). I have asymmetrical error bars for each point, and I would like the fit to take this uncertainty into account.
I am using scipy.linalg.lstsq(). It does not have any option for uncertainties in its arguments.
I am trying to adapt some code found on this page.
import numpy as np
import scipy.linalg
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
# Create data with x and y random over [-1, 1), and z = exp(-(x + y**2)).
np.random.seed(12345)
x = 2 * (np.random.random(500) - 0.5)
y = 2 * (np.random.random(500) - 0.5)
def f(x, y):
    return np.exp(-(x + y ** 2))
z = f(x, y)
data = np.c_[x,y,z]
# regular grid covering the domain of the data
mn = np.min(data, axis=0)
mx = np.max(data, axis=0)
X,Y = np.meshgrid(np.linspace(mn[0], mx[0], 20), np.linspace(mn[1], mx[1], 20))
XX = X.flatten()
YY = Y.flatten()
# best-fit quadratic curve (2nd-order)
A = np.c_[np.ones(data.shape[0]), data[:,:2], np.prod(data[:,:2], axis=1), data[:,:2]**2]
C,_,_,_ = scipy.linalg.lstsq(A, data[:,2])
# evaluate it on a grid
Z = np.dot(np.c_[np.ones(XX.shape), XX, YY, XX*YY, XX**2, YY**2], C).reshape(X.shape)
# plot points and fitted surface using Matplotlib
fig = plt.figure(figsize=(10, 10))
ax = fig.add_subplot(projection='3d')  # fig.gca(projection=...) fails in recent Matplotlib
ax.plot_surface(X, Y, Z, rstride=1, cstride=1, alpha=0.2)
ax.scatter(data[:,0], data[:,1], data[:,2], c='r', s=50)
plt.xlabel('X')
plt.ylabel('Y')
ax.set_zlabel('Z')
ax.axis('equal')
ax.axis('tight')
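One common way to bring uncertainties into a linear least-squares fit is to divide each row of the design matrix and each observation by that point's standard deviation before calling lstsq. The sketch below assumes a single symmetric `sigma` per point; for asymmetric error bars, averaging the upper and lower bar would be one crude way to get such a value. The data, `sigma` and `true_coef` are synthetic and purely illustrative, and `np.linalg.lstsq` is used here, though the same scaled arrays could be passed to `scipy.linalg.lstsq`.

```python
import numpy as np

rng = np.random.default_rng(12345)
n = 500
x = rng.uniform(-1, 1, n)
y = rng.uniform(-1, 1, n)

# Hypothetical per-point standard deviations.
sigma = rng.uniform(0.01, 0.05, n)

# Known quadratic surface plus noise whose spread matches sigma.
true_coef = np.array([1.0, -0.5, 0.3, 0.2, 0.8, -0.4])
A = np.c_[np.ones(n), x, y, x * y, x**2, y**2]
z = A @ true_coef + rng.normal(0.0, sigma)

# Weighted least squares: scale each row of A and each z by 1/sigma,
# so that precise points pull harder on the fit than noisy ones.
A_w = A / sigma[:, None]
z_w = z / sigma
coef, *_ = np.linalg.lstsq(A_w, z_w, rcond=None)

print(coef)
```

The fitted `coef` can be evaluated on the grid exactly as `C` is in the code above.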

How do you create a 3D surface plot with missing values matplotlib?

I am trying to create a 3D surface energy diagram where an x,y position on a grid contains an associated z level. The issue is that the grid is not uniform (ie, there is not a z component for every x,y position). Is there a way to refrain from plotting those values by calling them NaN in the corresponding position in the array?
Here is what I have tried so far:
import numpy as np
from matplotlib import pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import pylab
from matplotlib import cm
#Z levels
energ = np.array([0,3.5,1,-0.3,-1.5,-2,-3.4,-4.8])
#function for getting x,y associated z values?
def fun(x,y,array):
    return array[x]
#arrays for grid
x = np.arange(0,7,0.5)
y = np.arange(0,7,0.5)
#create grid
X, Y = np.meshgrid(x,y)
zs = np.array([fun(x,y,energ) for x in zip(np.ravel(X))])
Z = zs.reshape(X.shape)
plt3d = plt.figure().add_subplot(projection='3d')
#gradients now with respect to x and y, but ideally with respect to z only
Gx, Gz = np.gradient(X * Y)
G = (Gx ** 2 + Gz ** 2) ** .5 # gradient magnitude
N = G / G.max() # normalize 0..1
plt3d.plot_surface(X, Y, Z, rstride=1, cstride=1,
                   facecolors=cm.jet(N), edgecolor='k', linewidth=0, antialiased=False, shade=False)
plt.show()
I cannot post an image of this plot here, but if you run the code you will see it.
But I would like to not plot certain x,y pairs, so that the figure narrows like a triangle down to the minimum. Can this be accomplished by using NaN values? I would also like spacing between each level, with the levels connected by lines.
n = np.nan
#energ represents the z levels, so the overall figure should look like a triangle.
energ = np.array([[0,0,0,0,0,0,0,0,0,0,0,0,0],[n,n,n,n,n,n,n,n,n,n,n,n,n],[n,2.6,n,2.97,n,2.6,n,2.97,n,2.6,n,3.58,n],[n,n,n,n,n,n,n,n,n,n,n,n,n],[n,n,1.09,n,1.23,n,1.09,n,1.23,n,1.7,n,n],[n,n,n,n,n,n,n,n,n,n,n,n,n],[n,n,n,-0.65,n,-0.28,n,-0.65,n,0.33,n,n,n],[n,n,n,n,n,n,n,n,n,n,n,n,n],[n,n,n,n,-2.16,n,-2.02,n,-1.55,n,n,n,n],[n,n,n,n,n,n,n,n,n,n,n,n,n],[n,n,n,n,n,-3.9,n,-2.92,n,n,n,n,n,],[n,n,n,n,n,n,n,n,n,n,n,n,n],[n,n,n,n,n,n,-4.8,n,n,n,n,n,n,]])
plt3d = plt.figure().add_subplot(projection='3d')
Gx, Gz = np.gradient(X * energ) # gradients with respect to x and z
G = (Gx ** 2 + Gz ** 2) ** .5 # gradient magnitude
N = G / G.max() # normalize 0..1
x = np.arange(0,13,1)
y = np.arange(0,13,1)
X, Y = np.meshgrid(x,y)
#but the shapes don't seem to match up
plt3d.plot_surface(X, Y, energ, rstride=1, cstride=1,
                   facecolors=cm.jet(N), edgecolor='k',
                   linewidth=0, antialiased=False, shade=False
                   )
Using masked arrays generates the following error: local Python[7155] : void CGPathCloseSubpath(CGMutablePathRef): no current point.
import numpy.ma as ma
n = np.nan
energ = np.array([[0,0,0,0,0,0,0,0,0,0,0,0,0],[n,n,n,n,n,n,n,n,n,n,n,n,n],[n,2.6,n,2.97,n,2.6,n,2.97,n,2.6,n,3.58,n],[n,n,n,n,n,n,n,n,n,n,n,n,n],[n,n,1.09,n,1.23,n,1.09,n,1.23,n,1.7,n,n],[n,n,n,n,n,n,n,n,n,n,n,n,n],[n,n,n,-0.65,n,-0.28,n,-0.65,n,0.33,n,n,n],[n,n,n,n,n,n,n,n,n,n,n,n,n],[n,n,n,n,-2.16,n,-2.02,n,-1.55,n,n,n,n],[n,n,n,n,n,n,n,n,n,n,n,n,n],[n,n,n,n,n,-3.9,n,-2.92,n,n,n,n,n,],[n,n,n,n,n,n,n,n,n,n,n,n,n],[n,n,n,n,n,n,-4.8,n,n,n,n,n,n,]])
x = np.arange(0,13,1)
y = np.arange(0,13,1)
X, Y = np.meshgrid(x,y)
#create masked arrays
mX = ma.masked_array(X, mask=[[0,0,0,0,0,0,0,0,0,0,0,0,0],[1,1,1,1,1,1,1,1,1,1,1,1,1],[1,0,1,0,1,0,1,0,1,0,1,0,1],[1,1,1,1,1,1,1,1,1,1,1,1,1],[1,1,0,1,0,1,0,1,0,1,0,1,1],[1,1,1,1,1,1,1,1,1,1,1,1,1],[1,1,1,0,1,0,1,0,1,0,1,1,1],[1,1,1,1,1,1,1,1,1,1,1,1,1],[1,1,1,1,0,1,0,1,0,1,1,1,1],[1,1,1,1,1,1,1,1,1,1,1,1,1],[1,1,1,1,1,0,1,0,1,1,1,1,1],[1,1,1,1,1,1,1,1,1,1,1,1,1],[1,1,1,1,1,1,0,1,1,1,1,1,1]])
mY = ma.masked_array(Y, mask=[[0,0,0,0,0,0,0,0,0,0,0,0,0],[1,1,1,1,1,1,1,1,1,1,1,1,1],[1,0,1,0,1,0,1,0,1,0,1,0,1],[1,1,1,1,1,1,1,1,1,1,1,1,1],[1,1,0,1,0,1,0,1,0,1,0,1,1],[1,1,1,1,1,1,1,1,1,1,1,1,1],[1,1,1,0,1,0,1,0,1,0,1,1,1],[1,1,1,1,1,1,1,1,1,1,1,1,1],[1,1,1,1,0,1,0,1,0,1,1,1,1],[1,1,1,1,1,1,1,1,1,1,1,1,1],[1,1,1,1,1,0,1,0,1,1,1,1,1],[1,1,1,1,1,1,1,1,1,1,1,1,1],[1,1,1,1,1,1,0,1,1,1,1,1,1]])
m_energ = ma.masked_array(energ, mask=[[0,0,0,0,0,0,0,0,0,0,0,0,0],[1,1,1,1,1,1,1,1,1,1,1,1,1],[1,0,1,0,1,0,1,0,1,0,1,0,1],[1,1,1,1,1,1,1,1,1,1,1,1,1],[1,1,0,1,0,1,0,1,0,1,0,1,1],[1,1,1,1,1,1,1,1,1,1,1,1,1],[1,1,1,0,1,0,1,0,1,0,1,1,1],[1,1,1,1,1,1,1,1,1,1,1,1,1],[1,1,1,1,0,1,0,1,0,1,1,1,1],[1,1,1,1,1,1,1,1,1,1,1,1,1],[1,1,1,1,1,0,1,0,1,1,1,1,1],[1,1,1,1,1,1,1,1,1,1,1,1,1],[1,1,1,1,1,1,0,1,1,1,1,1,1]])
plt3d = plt.figure().add_subplot(projection='3d')
plt3d.plot_surface(mX, mY, m_energ, rstride=1, cstride=1, edgecolor='k', linewidth=0, antialiased=False, shade=False)
plt.show()

I was playing around with the code from this forum post, and I was able to make the graph have missing values. You can try the code yourself! I got it to work using float("nan") for the missing values.
import plotly.graph_objects as go
import numpy as np
x = np.arange(0.1,1.1,0.1)
y = np.linspace(-np.pi,np.pi,10)
#print(x)
#print(y)
X,Y = np.meshgrid(x,y)
#print(X)
#print(Y)
result = []
for i,j in zip(X,Y):
    result.append(np.log(i)+np.sin(j))
result[0][0] = float("nan")
upper_bound = np.array(result)+1
lower_bound = np.array(result)-1
fig = go.Figure(data=[
go.Surface(z=result),
go.Surface(z=upper_bound, showscale=False, opacity=0.3,colorscale='purp'),
go.Surface(z=lower_bound, showscale=False, opacity=0.3,colorscale='purp')])
fig.show()
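As a side note on the masked-array attempt in the question: the three hand-typed masks can be derived from the NaN entries instead. This is only a sketch on a small made-up 3x3 array, not the 13x13 array from the question, and it shows how to build the masks rather than claiming anything about how a given plotting backend renders them.

```python
import numpy as np

n = np.nan
# Small made-up stand-in for the energy levels array.
energ = np.array([[0.0, 0.0, 0.0],
                  [n,   2.6, n],
                  [n,   n,   n]])

# Mask every NaN entry in one call instead of typing the mask by hand.
m_energ = np.ma.masked_invalid(energ)

# Reuse the same mask for the coordinate grids.
x = np.arange(energ.shape[1])
y = np.arange(energ.shape[0])
X, Y = np.meshgrid(x, y)
mX = np.ma.masked_array(X, mask=m_energ.mask)
mY = np.ma.masked_array(Y, mask=m_energ.mask)

print(m_energ)
```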

3d plotting of data analysis from a file

I need to compute two functions, mutualinfo2d and mutualinfo3d, on data from files and produce 3D plots of the results.
I have to loop over all the files, which are indexed by three numbers:
The first and the third are parameters; their values are taken from two known arrays, mu and krat.
The second number indicates the kind of file (it is an input to my code).
For each file I get back a number, and that number is the value I have to plot in 3D against the mu and krat parameters that characterize the file name.
So, looping over the parameters, I compute these functions and store the results in a list; at the end I want a 3D plot where x and y are mu and krat and z holds the values computed by mutualinfo2d and mutualinfo3d.
I wrote a simple script that runs on these files and analyzes them correctly. I am able to make a 2D plot of my functions with one of the two parameters fixed, but I am not able to vary both together and get the 3D plot.
This is the code that I wrote to get the 3D plot:
n= 0 # selected the folder to analyze
path="/storage1/monti/Desktop/info_topology/VaryingK_mu_GRN"
path= path + "%d/"%(n)
krat=[1,3,5,7,9,10,30,50,70,90,100,300,500,700,900,1000,3000,5000,7000,9000]
mu=[1,3,5,7,9,10,30,50,70,90,100,300,500,700,900]
mu = array(mu)
krat=array(krat)
inf0=[]
inf1=[]
inf2=[]
for j in mu:
    for k in krat:
        filename="InfoDati_"+"%lf_"%(j)+"%d_"%(n) +"%lf.txt"%(k) # InfoDati_mu_GRN_krat
        fil = path + filename
        data=loadtxt(fil)
        t=data[:,0]
        x=data[:,1]
        y=data[:,2] # y, z should be the simulation, x the sin
        z=data[:,3]
        x=array(x)
        y=array(y)
        z=array(z)
        t=tran(t,T=24,dx=0.05)
        dx1=[0.05, 1]
        dx2=[0.05,1,1]
        inf0.append(mutualinfo2d(t,x, dx=dx1[0],dy=dx1[1]))
        inf1.append(mutualinfo2d(t,y, dx=dx1[0],dy=dx2[1]))
        inf2.append(mutualinfo3d(t,y,z, dx2))
fig = plt.figure()
ax = fig.add_subplot(projection='3d')
Y = krat
X = mu
X, Y = meshgrid(X, Y)
Gx, Gy = gradient(inf0) # gradients with respect to x and y
G = (Gx**2+Gy**2)**.5 # gradient magnitude
N = G/G.max()
Gx1, Gy1 = gradient(inf1) # gradients with respect to x and y
G1 = (Gx1**2+Gy1**2)**.5 # gradient magnitude
N1 = G1/G1.max()
Gx2, Gy2 = gradient(inf2) # gradients with respect to x and y
G2 = (Gx2**2+Gy2**2)**.5 # gradient magnitude
N2 = G2/G2.max()
surf = ax.plot_surface(X, Y, inf0, rstride=1, cstride=1,facecolors=cm.jet(N),linewidth=0, antialiased=False, shade=False)
surf1 = ax.plot_surface(X, Y, inf1, rstride=1, cstride=1,facecolors=cm.jet(N1),linewidth=0, antialiased=False, shade=False)
surf2 = ax.plot_surface(X, Y, inf2, rstride=1, cstride=1,facecolors=cm.jet(N2),linewidth=0, antialiased=False, shade=False)
m = cm.ScalarMappable(cmap=cm.jet)
m.set_array(G)
m1 = cm.ScalarMappable(cmap=cm.jet)
m1.set_array(G1)
m2 = cm.ScalarMappable(cmap=cm.jet)
m2.set_array(G2)
plt.colorbar(m)
plt.colorbar(m1)
plt.colorbar(m2)
plt.show()
I get several empty 3D plots... and I would like to have just one 3D plot with the three different surfaces!

The problem in the code is probably before the plot_surface commands.
An example:
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import numpy as np
# create some surfaces
X, Y = np.meshgrid(np.arange(10, dtype='float'), np.arange(8, dtype='float'))
z1 = X+Y
z2 = X-Y
z3 = np.sin(Y / 8. * 2* np.pi)
N1 = z1 / z1.max()
N2 = z2 / z2.max()
N3 = z3 / z3.max()
# create the figure
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_surface(X, Y, z1, cstride=1, rstride=1, facecolors=plt.cm.jet(N1), linewidth=0, shade=False)
ax.plot_surface(X, Y, z2, cstride=1, rstride=1, facecolors=plt.cm.jet(N2), linewidth=0, shade=False)
ax.plot_surface(X, Y, z3, cstride=1, rstride=1, facecolors=plt.cm.jet(N3), linewidth=0, shade=False)
This gives what it should:
Check that your inf0 and N (etc) are valid. If you have even a single nan in your gradient, the normalization (calculation of N) will give an array full of nans, and then nothing is drawn.
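Two things worth checking can be sketched as follows, under assumptions: the values appended in the loop are replaced by a made-up formula, and the short `mu` and `krat` arrays are stand-ins for the real ones. First, a flat list filled by a double loop has to be reshaped into a 2D array that matches the meshgrid; second, a single NaN makes `G.max()` return NaN, which `np.nanmax` avoids.

```python
import numpy as np

mu = np.array([1, 3, 5])
krat = np.array([1, 10, 100, 1000])

# Stand-in for the values appended inside the double loop
# (outer loop over mu, inner loop over krat).
inf0 = []
for j in mu:
    for k in krat:
        inf0.append(float(j) + np.log10(k))   # made-up value

# meshgrid(mu, krat) has shape (len(krat), len(mu)), so reshape the flat
# list to (len(mu), len(krat)) and transpose it to match.
X, Y = np.meshgrid(mu, krat)
Z = np.array(inf0).reshape(len(mu), len(krat)).T

# Introduce one NaN to show its effect on the normalization.
Z_bad = Z.copy()
Z_bad[1, 1] = np.nan
Gx, Gy = np.gradient(Z_bad)
G = np.hypot(Gx, Gy)          # gradient magnitude

N_bad = G / G.max()           # G.max() is NaN, so every entry becomes NaN
N_ok = G / np.nanmax(G)       # ignores NaNs when finding the maximum

print(np.isnan(N_bad).all())
```

With `Z`, `X` and `Y` all the same 2D shape, `plot_surface(X, Y, Z, facecolors=plt.cm.jet(N_ok), ...)` has consistent inputs; any NaN entries remaining in `N_ok` still mark faces whose gradient could not be computed.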
