Graphing a line and scatter points using Matplotlib?

I'm using matplotlib at the moment to try and visualise some data I am working on. I'm trying to plot around 6500 points and the line y = x on the same graph but am having some trouble in doing so. I can only seem to get the points to render and not the line itself. I know matplotlib doesn't plot equations as such, rather just a set of points, so I'm trying to use an identical set of points for the x and y co-ordinates to produce the line.
The following is my code
from matplotlib import pyplot
import numpy
from pymongo import *
class Store(object):
    """docstring for Store"""
    def __init__(self):
        super(Store, self).__init__()
        c = Connection()
        ucd = c.ucd
        self.tweets = ucd.tweets
    def fetch(self):
        x = []
        y = []
        for t in self.tweets.find():
            x.append(t['positive'])
            y.append(t['negative'])
        return [x,y]
if __name__ == '__main__':
    c = Store()
    array = c.fetch()
    t = numpy.arange(0., 0.03, 1)
    pyplot.plot(array[0], array[1], 'ro', t, t, 'b--')
    pyplot.show()
Any suggestions would be appreciated,
Patrick

Correct me if I'm wrong (I'm not a pro at matplotlib), but 't' will simply get the value [0.].
t = numpy.arange(0.,0.03,1)
That means start at 0 and go up to 0.03 (not inclusive) with a step size of 1, which results in an array containing just 0.
In that case you are simply plotting one point. It takes two to make a line.
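To see this concretely, here is a minimal sketch comparing the question's call with two alternatives that do produce a line. It assumes the line is meant to span 0 to 0.03, as the question's arguments suggest; the names t_bad, t_line and t_fine are only illustrative.

```python
import numpy as np

# The call from the question: start=0, stop=0.03, step=1
t_bad = np.arange(0., 0.03, 1)
print(t_bad)            # only one element, so there is no line to draw

# Two end points are enough for a straight line
t_line = np.linspace(0.0, 0.03, 2)

# Alternatively, keep arange but use a step smaller than the range
t_fine = np.arange(0.0, 0.03, 0.001)

# Either array can then replace t in the question's call, e.g.
# pyplot.plot(array[0], array[1], 'ro', t_line, t_line, 'b--')
```

If the line should cover the whole data range instead, the end points could be taken from the minimum and maximum of the plotted data.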

Related

Filling a 3D Array and Plotting the Values

I would like to write code in Python that evaluates the time evolution of a density distribution, p(x,y). The initial condition is p(t=0,x,y)=exp[-((x-500)^2)/500] and the formula for the solution is in the code below, where t is the time index, i is the space index (x-direction), j is the space index (y-direction), and v=0.8.
My goal is to run the scheme for 10 iterations and plot the results at the final time step (t=9). What I'm getting is a big array just filled with zeros. I think it's because I am not using the 3D arrays correctly, does anyone have any suggestions? Thank you.
My attempt:
import numpy as np
import matplotlib.pyplot as plt
#Input Parameters
Nx = 1000 #number of grid points in x-direction
Ny = 500 #number of grid points in y-direction
T = 10 #number of time steps
v = 0.8
p = np.zeros((T,Nx,Ny))
P = np.zeros((T,Nx,Ny))
for t in range(0,T-1):
    for i in range(0,Nx-1):
        for j in range(0,Ny-1):
            P[t,i,j] = p[t,i,j]-((v/2)*(p[t,i+1,j]-p[t,i,j]))
            p[0,i,j] = np.exp(((-1*(i-500))**2)/500)
x = P[9,i]
y = P[9,j]
print(x)
plt.plot(x,y)
plt.xlim([0,1000])
plt.ylim([0,500])
plt.xlabel('x-direction')
plt.ylabel('y-direction')
plt.title("Density Distribution After 10 Iterations")

It looks like you only fill the values for t in range(0,T-1), which stops at t=8, yet you then read x = P[9,i]. Those entries never get filled, so they are all 0.
Try range(0,T) instead; it loops over 0,1,2,...,T-1. Likewise use range(0,Ny) for j. The i loop has to keep stopping at Nx-2, though, because the formula reads p[t,i+1,j].
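As a rough sketch of how the whole scheme could be laid out, the following fills the initial condition before the time loop and steps forward with array slices instead of nested loops. It rests on assumptions that the question does not confirm: that the formula gives the value at step t+1 from step t, that the last x point (which has no i+1 neighbour) is simply carried over, and that Ny can be reduced to 50 to keep the example fast.

```python
import numpy as np

Nx, Ny, T = 1000, 50, 10   # Ny reduced from 500 only to keep the sketch quick (assumption)
v = 0.8

x = np.arange(Nx)
p = np.zeros((T, Nx, Ny))

# Initial condition from the question text: p(t=0, x, y) = exp(-(x - 500)^2 / 500)
p[0] = np.exp(-((x - 500) ** 2) / 500)[:, None]

# Assumed update rule: step t+1 is computed from step t
for t in range(T - 1):
    p[t + 1, :-1, :] = p[t, :-1, :] - (v / 2) * (p[t, 1:, :] - p[t, :-1, :])
    p[t + 1, -1, :] = p[t, -1, :]   # boundary point carried over unchanged (assumption)

final = p[T - 1]          # the last time step, index 9
print(final.shape, final.max())
```

The array final is two-dimensional, so something like plt.imshow(final.T) or a single slice plt.plot(x, final[:, 0]) would be a more natural way to display it than plt.plot(x, y).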

How to visually represent time evolution in 2-d Brownian motion simulation

I have modeled Brownian motion in both the x and y directions as random walks. I have plotted the data on a 2-d plot but, while it is not so difficult to trace the simulated particle's path from the origin, I want to be able to see the time-evolution of the particle's path visually represented on the plot, whether it be by changing the color of the line over time, or by adding a third dimension to the plot to represent time, or by using some sort of dynamic graph type.
I haven't tried implementing anything, but I have tried to look at what options are available to me. I want to avoid using a 3d plot if possible. That said, I am open to using something other than matplotlib if it makes sense for this situation (like pyqtgraph).
Here is my code:
import random
import numpy as np
import matplotlib.pyplot as plt
#n is how many trajectory evaluations
n = 1000
t= np.linspace(0,10000,num=n)
def brownianMotion(time):
    B = [0]
    for t in range(len(time)-1):
        nrand = random.gauss(0,(time[t+1] - time[t])**.5)
        B.append(B[t]+nrand)
    return B
xpath = brownianMotion(t)
ypath = brownianMotion(t)
def plot(x,y):
    plt.figure()
    xplot = np.insert(x,0,0)
    yplot = np.insert(y,0,0)
    plt.plot(xplot,yplot,'go-',lw=1,ms=.1)
    #np.arange(0,n+1),'go-', lw=1, ms = .1)
    plt.xlim([-150,150])
    plt.ylim([-150,150])
    plt.title('Brownian Motion')
    plt.xlabel('xDisplacement')
    plt.ylabel('yDisplacement')
    plt.show()
plot(xpath,ypath)
All in all, this is just for fun and something I did while bored at work. All suggestions are welcome! Thank you for your time!
Please let me know if I should post a picture of my code's output.
Edit: Additionally, if I wanted to represent multiple particles in the same graph, how could I do that so that the multiple paths are distinguishable? I have modified my code for this purpose as shown below, but currently this code outputs a messy green mixture of particles.
import random
import numpy as np
import matplotlib.pyplot as plt
nparticles = 20
#n is how many trajectory evaluations
n = 100
t= np.linspace(0,1000,num=n)
def brownianMotion(time):
    B = [0]
    for t in range(len(time)-1):
        nrand = random.gauss(0,(time[t+1] - time[t])**.5)
        B.append(B[t]+nrand)
    return B
xs = []
ys = []
for i in range(nparticles):
    xs.append(brownianMotion(t))
    ys.append(brownianMotion(t))
#xpath = brownianMotion(t)
#ypath = brownianMotion(t)
def plot(x,y):
    plt.figure()
    for xpath, ypath in zip(x,y):
        xplot = np.insert(xpath,0,0)
        yplot = np.insert(ypath,0,0)
        plt.plot(xplot,yplot,'go-',lw=1,ms=.1)
        #np.arange(0,n+1),'go-', lw=1, ms = .1)
    plt.xlim([np.amin(x),np.amax(x)])
    plt.ylim([np.amin(y),np.amax(y)])
    plt.title('Brownian Motion')
    plt.xlabel('xDisplacement')
    plt.ylabel('yDisplacement')
    plt.show()
plot(xs,ys)
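One possible way to show time on a 2-d plot, offered only as a sketch, is to split the path into short segments and colour each segment by its time value using matplotlib's LineCollection. The random walk below is a vectorised stand-in for brownianMotion (Gaussian steps with standard deviation sqrt(dt), accumulated with cumsum); the seed, the colormap and the variable names are arbitrary choices, not part of the question.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection

rng = np.random.default_rng(0)
n = 1000
t = np.linspace(0, 10000, num=n)
dt = np.diff(t)

# Each coordinate starts at 0 and accumulates Gaussian steps with std sqrt(dt)
x = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt)))))
y = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt)))))

# Build (n-1) segments, each joining two consecutive points of the path
points = np.column_stack([x, y]).reshape(-1, 1, 2)
segments = np.concatenate([points[:-1], points[1:]], axis=1)

lc = LineCollection(segments, cmap="viridis")
lc.set_array(t[:-1])       # colour each segment by the time at its start

fig, ax = plt.subplots()
ax.add_collection(lc)
ax.autoscale()             # add_collection does not rescale the axes by itself
fig.colorbar(lc, ax=ax, label="time")
ax.set_title("Brownian Motion")
# plt.show()  # uncomment when running interactively
```

For several particles, dropping the 'g' from the format string (for example 'o-') lets matplotlib cycle through its default colours, so each path gets a different one.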

How to properly use semilogy?

For class, we are supposed to calculate the absolute error and the relative error for a series approximation of e^x. In the end, we have to graph both the relative error and the absolute error using a "semilogy" graph. The code itself works fine and produces the numerical results as expected, but the graph does not come close to the expected result.
Any idea of why this isn't working?
An image of the graph has been attached at the end.
import numpy as np
import math as m
import matplotlib.pyplot as plt
Exp_List = []
Relative_Error = []
Absolute_Error = []
def ExpCalc(x,N,t):
    exp = 0.0
    for i in range(N-1):
        fac = m.factorial(i)
        next_term = x**i/fac
        Exp_List.append(next_term)
        exp += next_term
        Relative = abs((sum(Exp_List)-t)/sum(Exp_List))
        Relative_Error.append(Relative)
        Absolute = abs(sum(Exp_List)-t)
        Absolute_Error.append(Absolute)
        i += 1
    return (Absolute_Error, Relative_Error)
A = m.exp(1)
B = m.exp(20)
C = m.exp(100)
print(ExpCalc(1,20,A))
plt.figure()
plt.semilogy(ExpCalc(1,20,A))
plt.show()
The image shows the numerical calculations as well as the graph obtained from the code

If you look at the result of your print(ExpCalc(1,20,A)) it is actually a tuple, which is causing the plotting behaviour you are seeing.
To fix this you can call the function and unpack the values, then do the plotting separately to ensure you are plotting the correct values.
x, y = ExpCalc(1,20,A)
plt.figure()
plt.semilogy(x,y)
plt.show()
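A variation on the same idea, sketched here under a few assumptions, is to plot each error list against the number of terms rather than one error against the other. The helper exp_series_errors is hypothetical; it keeps the question's definition of relative error (divided by the partial sum) but uses local lists, so calling it twice does not accumulate results in module-level lists as ExpCalc does.

```python
import math
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

def exp_series_errors(x, n_terms, true_value):
    """Return (absolute, relative) errors after each partial sum of the e**x series."""
    partial = 0.0
    absolute, relative = [], []
    for i in range(n_terms):
        partial += x**i / math.factorial(i)
        absolute.append(abs(partial - true_value))
        # Same definition as in the question: error relative to the partial sum
        relative.append(abs((partial - true_value) / partial))
    return absolute, relative

abs_err, rel_err = exp_series_errors(1, 20, math.exp(1))
terms = range(1, len(abs_err) + 1)

fig, ax = plt.subplots()
ax.semilogy(terms, abs_err, label="absolute error")
ax.semilogy(terms, rel_err, label="relative error")
ax.set_xlabel("number of terms")
ax.legend()
# plt.show()  # uncomment when running interactively
```

Each curve then shows how quickly the corresponding error shrinks as more terms are added.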

ZeroDivisionError: float division by zero in a code for Surface plot

I have got this code to generate a surface plot. But it gives a zero division error. I am not able to figure out what is wrong. Thank you.
import pylab, csv
import numpy
from mayavi.mlab import *
def getData(fileName):
    try:
        data = csv.reader(open(fileName,'rb'))
    except:
        print 'File not found'
    else:
        data = [[float(row[0]), float(row[1]),float(row[2])] for row in data]
        x = [row[0] for row in data]
        y = [row[1] for row in data]
        z = [row[2] for row in data]
        return (x, y, z)
def plotData(fileName):
    xVals, yVals, zVals = getData(fileName)
    xVals = pylab.array(xVals)
    yVals = pylab.array(yVals)
    zVals = (pylab.array(zVals)*10**3)
    x, y = numpy.mgrid[-0.5:0.5:0.001, -0.5:0.5:0.001]
    s = surf(x, y, zVals)
    return s
plotData('data')

If I have understood the code correctly, there is a problem with zVals in mayavi.mlab.surf.
According to the documentation of the function, s is the elevation matrix, a 2D array, where indices along the first array axis represent x locations, and indices along the second array axis represent y locations. Your file reader seems to return a 1D vector instead of an array.
However, this may not be the most difficult problem. Your file seems to contain triplets of x, y, and z coordinates. You can use mayavi.mlab.surf only if your x and y coordinates in the file form a regular square grid. If this is the case, then you just have to recover that grid and form nice 2D arrays of all three parts. If the points are in the file in a known order, it is easy, otherwise it is rather tricky.
Maybe you would want to start with mayavi.mlab.points3d(xVals, yVals, zVals). That will give you an overall impression of your data. (Or if you already know more about your data, you might give us a hint by editing your question and adding more information!)
Just to give you an idea of a slightly more Pythonic style of writing this, your code is rewritten (with surf replaced by points3d) in the following:
import mayavi.mlab as ml
import numpy
def plot_data(filename):
    data = numpy.loadtxt(filename)
    xvals = data[:,0]
    yvals = data[:,1]
    zvals = data[:,2] * 1000.
    return ml.points3d(xvals, yvals, zvals)
plot_data('data')
(Essential changes: the use of numpy.loadtxt, getting rid of the pylab namespace here, no import *, no CamelCase variable or function names. For more information, see PEP 8.)
If you only need to see the shape of the surface, and the data in the file is ordered row-by-row and with the same number of data points in each row (i.e. fixed number of columns), then you may use:
import mayavi.mlab as ml
import numpy
import matplotlib.pyplot as plt
# whatever you have as the number of points per row
columns = 13
# same file name as in the question
filename = 'data'
data = numpy.loadtxt(filename)
# draw the data points into an XY plane to check that they really form a rectangular grid:
plt.plot(data[:,0], data[:,1])
# draw the surface
zvals = data[:,2].reshape(-1,columns)
ml.surf(zvals, warp_scale='auto')
As you can see, this code allows you to check that your values really are in the right kind of grid. It does not check that they are in the correct order, but at least you can see they form a nice grid. Also, you have to input the number of columns manually. The keyword warp_scale takes care of the surface scaling so that it should look reasonable.
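The reshape step can be tried out without mayavi or a data file. The sketch below builds a small synthetic table of x, y, z triplets in row-by-row order (a hypothetical stand-in for numpy.loadtxt('data')), guesses the number of columns from the distinct x values, and checks that the result really is a regular grid. It assumes the x values repeat exactly from row to row.

```python
import numpy as np

# Synthetic stand-in for the file: a 3-row by 4-column grid stored row by row
xs, ys = np.meshgrid(np.arange(4.0), np.arange(3.0))
data = np.column_stack([xs.ravel(), ys.ravel(), (xs * ys).ravel()])

# Guess the points per row from the distinct x values (assumes exact repeats)
columns = np.unique(data[:, 0]).size

x2d = data[:, 0].reshape(-1, columns)
y2d = data[:, 1].reshape(-1, columns)
z2d = data[:, 2].reshape(-1, columns)

# On a regular row-by-row grid every row has the same x values
# and every entry within a row has the same y value
is_regular = bool(np.allclose(x2d, x2d[0]) and np.allclose(y2d, y2d[:, :1]))
print(columns, z2d.shape, is_regular)
```

Here the first array axis runs along y; since the answer notes that surf expects the first axis to be x, z2d.T may be the array to pass, depending on how the file is ordered.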

get bins coordinates with hexbin in matplotlib

I use matplotlib's method hexbin to compute 2d histograms on my data.
But I would like to get the coordinates of the centers of the hexagons in order to further process the results.
I got the values using get_array() method on the result, but I cannot figure out how to get the bins coordinates.
I tried to compute them given the number of bins and the extent of my data, but I don't know the exact number of bins in each direction. gridsize=(10,2) should do the trick but it does not seem to work.
Any idea?

I think this works.
from __future__ import division
import numpy as np
import math
import matplotlib.pyplot as plt
def generate_data(n):
    """Make random, correlated x & y arrays"""
    points = np.random.multivariate_normal(mean=(0,0),
        cov=[[0.4,9],[9,10]],size=int(n))
    return points
if __name__ =='__main__':
    color_map = plt.cm.Spectral_r
    n = 1e4
    points = generate_data(n)
    xbnds = np.array([-20.0,20.0])
    ybnds = np.array([-20.0,20.0])
    extent = [xbnds[0],xbnds[1],ybnds[0],ybnds[1]]
    fig=plt.figure(figsize=(10,9))
    ax = fig.add_subplot(111)
    x, y = points.T
    # Set gridsize just to make them visually large
    image = plt.hexbin(x,y,cmap=color_map,gridsize=20,extent=extent,mincnt=1,bins='log')
    # Note that mincnt=1 adds 1 to each count
    counts = image.get_array()
    ncnts = np.count_nonzero(np.power(10,counts))
    verts = image.get_offsets()
    # range works on both Python 2 and 3 (the original used xrange)
    for offc in range(verts.shape[0]):
        binx,biny = verts[offc][0],verts[offc][1]
        if counts[offc]:
            plt.plot(binx,biny,'k.',zorder=100)
    ax.set_xlim(xbnds)
    ax.set_ylim(ybnds)
    plt.grid(True)
    cb = plt.colorbar(image,spacing='uniform',extend='max')
    plt.show()

I would love to confirm that the code by Hooked using get_offsets() works, but I tried several iterations of the code above to retrieve the center positions and, as Dave mentioned, get_offsets() remains empty. The workaround I found is to use the non-empty image.get_paths() instead. My code takes the mean of the vertices to find the centers, which makes it a smidge longer, but it does work.
get_paths() returns a set of paths whose embedded x,y vertex coordinates can be looped over and averaged to give the center position of each hexagon.
The code that I have is as follows:
counts=image.get_array() #counts in each hexagon, works great
verts=image.get_offsets() #empty, don't use this
b=image.get_paths() #this does work, gives Path([[]][]) which can be plotted
for x in range(len(b)):
    xav=np.mean(b[x].vertices[0:6,0]) #center in x (RA)
    yav=np.mean(b[x].vertices[0:6,1]) #center in y (DEC)
    plt.plot(xav,yav,'k.',zorder=100)
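Since the two answers above disagree on whether get_offsets() is populated, a quick check on the installed matplotlib can settle which approach applies. The sketch below is only that check, with made-up random data and an arbitrary gridsize and extent; on the recent releases I am aware of, the offsets hold one (x, y) center per hexagon, but the fallback branch covers the case where they do not.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = rng.normal(size=2000)
y = rng.normal(size=2000)

fig, ax = plt.subplots()
image = ax.hexbin(x, y, gridsize=20, extent=[-4, 4, -4, 4])

counts = np.asarray(image.get_array())     # one count per hexagon
centers = np.asarray(image.get_offsets())  # (x, y) per hexagon, if populated

if len(centers) == len(counts):
    occupied = centers[counts > 0]         # centers of the non-empty hexagons
else:
    # Offsets not populated on this version: use get_paths() as in the answer above
    occupied = np.empty((0, 2))

print(len(counts), len(occupied))
```

If the lengths match, centers and counts line up index by index, so no vertex averaging is needed.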

I had this same problem. I think what needs to be developed is a framework with a HexagonalGrid object which can then be applied to many different data sets (and it would be awesome to do it for N dimensions). This is possible, and it surprises me that neither SciPy nor NumPy has anything for it (furthermore there seems to be nothing else like it, except perhaps binify).
That said, I assume you want to use hexbinning to compare multiple binned data sets. This requires some common base. I got this to work using matplotlib's hexbin the following way:
import numpy as np
import matplotlib.pyplot as plt
def get_data (mean,cov,n=1e3):
    """
    Quick fake data builder
    """
    np.random.seed(101)
    points = np.random.multivariate_normal(mean=mean,cov=cov,size=int(n))
    x, y = points.T
    return x,y
def get_centers (hexbin_output):
    """
    about 40% faster than previous post only cause you're not calculating the
    min/max every time
    """
    paths = hexbin_output.get_paths()
    v = paths[0].vertices[:-1] # adds a value [0,0] to the end
    vx,vy = v.T
    idx = [3,0,5,2] # index for [xmin,xmax,ymin,ymax]
    xmin,xmax,ymin,ymax = vx[idx[0]],vx[idx[1]],vy[idx[2]],vy[idx[3]]
    half_width_x = abs(xmax-xmin)/2.0
    half_width_y = abs(ymax-ymin)/2.0
    centers = []
    # range works on both Python 2 and 3 (the original used xrange)
    for i in range(len(paths)):
        cx = paths[i].vertices[idx[0],0]+half_width_x
        cy = paths[i].vertices[idx[2],1]+half_width_y
        centers.append((cx,cy))
    return np.asarray(centers)
# important parts ==>
class Hexagonal2DGrid (object):
    """
    Used to fix the gridsize, extent, and bins
    """
    def __init__ (self,gridsize,extent,bins=None):
        self.gridsize = gridsize
        self.extent = extent
        self.bins = bins
def hexbin (x,y,hexgrid):
    """
    To hexagonally bin the data in 2 dimensions
    """
    fig = plt.figure()
    ax = fig.add_subplot(111)
    # Note mincnt=0 so that it will return a value for every point in the
    # hexgrid, not just those with count>mincnt
    # Basically you fix the gridsize, extent, and bins to keep them the same
    # then the resulting count array is the same
    hexbin = plt.hexbin(x,y, mincnt=0,
                        gridsize=hexgrid.gridsize,
                        extent=hexgrid.extent,
                        bins=hexgrid.bins)
    # you could close the figure if you don't want it
    # plt.close(fig.number)
    counts = hexbin.get_array().copy()
    return counts, hexbin
# Example ===>
if __name__ == "__main__":
    hexgrid = Hexagonal2DGrid((21,5),[-70,70,-20,20])
    x_data,y_data = get_data((0,0),[[-40,95],[90,10]])
    x_model,y_model = get_data((0,10),[[100,30],[3,30]])
    counts_data, hexbin_data = hexbin(x_data,y_data,hexgrid)
    counts_model, hexbin_model = hexbin(x_model,y_model,hexgrid)
    # if you want the centers, they will be the same for both
    centers = get_centers(hexbin_data)
    # if you want to ignore the cells with zeros then use the following mask.
    # But if you want zeros for some bins and not others, I'm not sure of an
    # elegant way to do this without using the centers
    nonzero = counts_data != 0
    # now you can compare the two data sets
    variance_data = counts_data[nonzero]
    square_diffs = (counts_data[nonzero]-counts_model[nonzero])**2
    chi2 = np.sum(square_diffs/variance_data)
    print(" chi2={}".format(chi2))
