Creating a heatmap in Python from a given CSV table

I have a CSV table, let's say this CSV table.
I'm trying to create a heat map where the higher the number, the warmer the color: for example, 20 should be warm and 0 should be blue.
I tried a lot of heat maps, but all of them look blocky, and I want something smoother, like this.
I used matplotlib, but the result looks like squares (image).
I also tried this code, where data is the table shown above:
n = 1e5
Z1 = ml.bivariate_normal(data, data, 2, 2, 0, 0)
Z2 = ml.bivariate_normal(data, data, 4, 1, 1, 1)
ZD = Z2 - Z1
x = data.ravel()
y = data.ravel()
z = ZD.ravel()
gridsize = 10
plt.subplot(111)
plt.hexbin(x, y, C=z, gridsize=gridsize, cmap=cm.jet, bins=None)
plt.axis([x.min(), x.max(), y.min(), y.max()])
cb = plt.colorbar()
cb.set_label('mean value')
plt.show()
but it only shows a straight line (see the linear line image). I don't know why; maybe because the numbers were divided by themselves.

If you look at this answer, you can see how to properly create the heatmap. To get your desired result, just change the interpolation argument from nearest to, for instance, bicubic:
import matplotlib.pyplot as plt
import numpy as np
a = np.random.random((16, 16))
plt.imshow(a, cmap="hot", interpolation="bicubic")
plt.show()
The other available interpolation options are the following:
'antialiased', 'bilinear', 'bicubic', 'spline16', 'spline36', 'hanning', 'hamming', 'hermite', 'kaiser', 'quadric', 'catrom', 'gaussian', 'bessel', 'mitchell', 'sinc', 'lanczos', 'blackman'
To convert your csv file to a numpy array you can use the following code:
from numpy import genfromtxt
my_data = genfromtxt("test.csv", delimiter=",")
but, of course, you can use other data structures, like pandas dataframes (in this case use pandas.read_csv("test.csv")).
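Putting the two steps above together, a minimal sketch could look like the following. The CSV content, the "jet" colormap, and the vmin/vmax range of 0 to 20 are assumptions chosen to match the question (0 shown as blue, 20 as a warm color); the Agg backend line is only there so the sketch runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical CSV content standing in for test.csv (values between 0 and 20)
csv_text = """0,2,5,3
4,10,15,8
6,18,20,12
1,7,9,4
"""

# Same call as above, reading from an in-memory file instead of "test.csv"
data = np.genfromtxt(io.StringIO(csv_text), delimiter=",")

fig, ax = plt.subplots()
# "jet" maps low values to blue and high values to red;
# bicubic interpolation smooths the square cells
im = ax.imshow(data, cmap="jet", interpolation="bicubic", vmin=0, vmax=20)
fig.colorbar(im, ax=ax, label="value")

# Render the figure; call plt.show() or fig.savefig(...) to view it
fig.canvas.draw()
```

Replacing io.StringIO(csv_text) with the path to the real file should be the only change needed, provided the file contains nothing but comma-separated numbers.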

Related

How to plot vector addition in Matplotlib?

I am trying to plot vector addition and I am not getting the expected result. I am completely new to 3D plotting and would really appreciate some help.
My plot looks like this:
What I want is to connect the green line to the head of the two arrows. My code looks something like this:
import numpy as np
import matplotlib.pyplot as plt
u = np.array([1, 2, 3]) # vector u
v = np.array([5, 6, 2]) # vector v:
fig = plt.figure()
ax = plt.axes(projection = "3d")
start = [0,0,0]
ax.quiver(start[0],start[1],start[2],u[0],u[1],u[2],color='red')
ax.quiver(start[0],start[1],start[2],v[0],v[1],v[2])
ax.quiver(v[0],v[1],v[2],u[0],u[1],u[2],color="green")
ax.set_xlim([-1,10])
ax.set_ylim([-10,10])
ax.set_zlim([0,10])
plt.show()
Apologies for any mistakes, thanks.
It's vector addition, so just add the vectors and draw the sum from the start point:
sum_vector = u+v
ax.quiver(start[0], start[1], start[2], sum_vector[0], sum_vector[1], sum_vector[2], color="green")
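For reference, here is a self-contained sketch of the whole figure with that fix applied. It reuses u, v, and start from the question; the extra gray arrow (u drawn from the head of v), the colors, and the axis limits are illustrative assumptions, and the Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np

u = np.array([1, 2, 3])  # vector u
v = np.array([5, 6, 2])  # vector v
start = np.zeros(3)      # common origin
sum_vector = u + v       # vector addition is element-wise addition

fig = plt.figure()
ax = fig.add_subplot(projection="3d")

# u and v drawn from the origin
ax.quiver(*start, *u, color="red")
ax.quiver(*start, *v, color="blue")
# u drawn again from the head of v (tip-to-tail), ending at u + v
ax.quiver(*v, *u, color="gray")
# the sum drawn from the origin; its head meets the head of the gray arrow
ax.quiver(*start, *sum_vector, color="green")

ax.set_xlim([0, 10])
ax.set_ylim([0, 10])
ax.set_zlim([0, 10])

# Render the figure; call plt.show() or fig.savefig(...) to view it
fig.canvas.draw()
```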

Extracting data from an existing plot in pandas

I am trying to extract data from existing plots. The code below works, but the original x-values are not integers, so I end up with a lot of float data that I don't need. I tried using the round() function, but then I get repeated values, which is not the required output. I'm not sure whether it's possible, but is there a way to extract the values from the plot directly at integer positions? Below is a small sample of what I am trying to achieve.
Any help is much appreciated, thanks!
This is the code:
from IPython.display import Image
ax = Image(r'Desktop\comp.png')
ax = plt.gca()
line = ax.lines[0]
x = line.get_xydata()
dataframe=pd.DataFrame(x, columns=['a','b'])
This is the image:
This what I get as a result:
However, I'd like to get something similar to this result:
Assuming you have the plot as a matplotlib.axes object, you can extract the data from ax.lines with the get_xdata() and get_ydata() methods:
line = ax.lines[0]
data_x, data_y = line.get_xdata(), line.get_ydata()
Then, create integer values for the new axis with
import math
new_x = range(math.ceil(min(data_x)), math.floor(max(data_x))+1)
And interpolate values with interp1d to get the corresponding y-values:
f = interp1d(data_x, data_y, kind='linear', bounds_error=False, fill_value=np.nan)
new_y = f(new_x)
The output as pandas DataFrame would look like this:
In [3]: pd.DataFrame(dict(a=new_x, b=new_y))
Out[3]:
   a          b
0  1   1.022186
1  2   4.899643
2  3   9.032727
3  4  16.073667
4  5  25.066514
5  6  36.888971
6  7  49.033702
7  8  64.018056
and as a plot like this:
The full example code would look something like this:
import math
from matplotlib import pyplot as plt
import numpy as np
from scipy.interpolate import interp1d
# Create data for example
data_x = np.array(range(9)) + np.random.rand(9)
data_y = data_x**2
# Create the plot
fig, ax = plt.subplots(nrows=1, ncols=1)
ax.plot(data_x, data_y, marker='s', label='original')
# Extract data from plot (your starting point)
line = ax.lines[0]
data_x, data_y = line.get_xdata(), line.get_ydata()
# Get the x-axis data as integer values
new_x = range(math.ceil(min(data_x)), math.floor(max(data_x))+1)
# Get the y-axis data at these points (interpolate)
f = interp1d(data_x, data_y, kind='linear', bounds_error=False, fill_value=np.nan)
new_y = f(new_x)
plt.plot(new_x, new_y, ls='', marker='o', label='new')
plt.grid()
plt.legend()
plt.show()
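If SciPy is not available, the same linear interpolation step can probably be done with NumPy alone. The sketch below swaps interp1d for np.interp; it assumes the extracted x-values are in increasing order (np.interp requires that), and the sample data is made up for illustration.

```python
import math

import numpy as np

# Made-up stand-in for the values extracted with get_xdata() / get_ydata()
data_x = np.array([0.5, 1.5, 2.5, 3.5])
data_y = data_x ** 2

# Integer x-values that lie inside the range of the original data
new_x = np.arange(math.ceil(data_x.min()), math.floor(data_x.max()) + 1)

# Linear interpolation at the integer positions; NaN outside the data range,
# mirroring fill_value=np.nan in the interp1d version
new_y = np.interp(new_x, data_x, data_y, left=np.nan, right=np.nan)

print(new_x)  # [1 2 3]
print(new_y)  # [1.25 4.25 9.25]
```

interp1d remains the better choice when a different interpolation kind (for example 'cubic') is needed, since np.interp only does piecewise-linear interpolation.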

Matplotlib plot already binned data

I want to plot the mean local binary patterns histograms of a set of images. Here is what I did:
#calculates the lbp
lbp = feature.local_binary_pattern(image, 24, 8, method="uniform")
#Now I calculate the histogram of LBP Patterns
(hist, _) = np.histogram(lbp.ravel(), bins=np.arange(0, 27))
After that I simply sum up all the LBP histograms and take the mean of them. These are the values found, which are saved in a txt file:
2.962000000000000000e+03
1.476000000000000000e+03
1.128000000000000000e+03
1.164000000000000000e+03
1.282000000000000000e+03
1.661000000000000000e+03
2.253000000000000000e+03
3.378000000000000000e+03
4.490000000000000000e+03
5.010000000000000000e+03
4.337000000000000000e+03
3.222000000000000000e+03
2.460000000000000000e+03
2.495000000000000000e+03
2.599000000000000000e+03
2.934000000000000000e+03
2.526000000000000000e+03
1.971000000000000000e+03
1.303000000000000000e+03
9.900000000000000000e+02
7.980000000000000000e+02
8.680000000000000000e+02
1.119000000000000000e+03
1.479000000000000000e+03
4.355000000000000000e+03
3.112600000000000000e+04
I am trying to simply plot these values (don't need to calculate the histogram, because the values are already from a histogram). Here is what I've tried:
import matplotlib
matplotlib.use('Agg')
import numpy as np
import matplotlib.pyplot as plt
import plotly.plotly as py
#load data
data=np.loadtxt('original_dataset1.txt')
#convert to float
data=data.astype('float32')
#define number of Bins
n_bins = data.max() + 1
plt.style.use("ggplot")
(fig, ax) = plt.subplots()
fig.suptitle("Local Binary Patterns")
plt.ylabel("Frequency")
plt.xlabel("LBP value")
plt.bar(n_bins, data)
fig.savefig('lbp_histogram.png')
However, look at the figure these commands produce:
I still don't understand what is happening. I would like to make a figure like the one I produced in Excel using the same data, as follows:
I must confess that I am quite a rookie with matplotlib. So, what was my mistake?
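Try this. The mistake is that plt.bar(n_bins, data) passes a single number (data.max() + 1) as the x position, so every bar is drawn at the same place; bar needs one x position per value. Here, array holds your mean values from the bins.
import numpy as np
import matplotlib.pyplot as plt
array = [2962, 1476, 1128, 1164, 1282, 1661, 2253]
fig, ax = plt.subplots(nrows=1, ncols=1)
# one x position per bar: 1, 2, ..., len(array)
ax.bar(np.arange(len(array)) + 1, array, color='orangered')
ax.grid(axis='y')
# write each value above its bar
for i, v in enumerate(array):
    ax.text(i + 1, v, str(v), color='black', fontweight='bold',
            verticalalignment='bottom', horizontalalignment='center')
plt.savefig('savefig.png', dpi=150)
The plot looks like this.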
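Applied to the full set of 26 mean values from the question, a sketch could look like this. The bar positions 0 to 25 are an assumption based on the histogram bins np.arange(0, 27) used in the question; the values are typed in directly here instead of being loaded with np.loadtxt, and the Agg backend matches the one the question already uses.

```python
import matplotlib
matplotlib.use("Agg")  # same non-interactive backend as in the question
import matplotlib.pyplot as plt
import numpy as np

# The 26 mean histogram values listed in the question
hist_mean = np.array([
    2962, 1476, 1128, 1164, 1282, 1661, 2253, 3378, 4490, 5010,
    4337, 3222, 2460, 2495, 2599, 2934, 2526, 1971, 1303, 990,
    798, 868, 1119, 1479, 4355, 31126,
], dtype=float)

# One x position per already-binned value: 0, 1, ..., 25
bin_positions = np.arange(len(hist_mean))

fig, ax = plt.subplots()
fig.suptitle("Local Binary Patterns")
ax.bar(bin_positions, hist_mean, width=1.0, edgecolor="black")
ax.set_xlabel("LBP value")
ax.set_ylabel("Frequency")

# Render the figure; use fig.savefig('lbp_histogram.png') to write it to disk
fig.canvas.draw()
```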

How to plot a CSV file with Python? My plot comes up blank

I have the code below that seems to run without issues until I try to plot it. A blank plot will show when asked to plot.
import numpy as np
import matplotlib.pyplot as plt
data = np.genfromtxt('/home/oem/Documents/620157.csv', delimiter=',', skip_header=01, skip_footer=01, names=['x', 'y'])
plt.plot(data,'o-')
plt.show()
I'm not sure what your data looks like, but I believe you need to do something like this:
data = np.genfromtxt('/home/oem/Documents/620157.csv',
                     delimiter=',',
                     skip_header=1,
                     skip_footer=1)
name, x, y, a, b = zip(*data)
plt.plot(x, y, 'o-')
As per your comment, the data is currently an array containing tuples of the station name and the x and y data. Using zip with the * symbol assigns them back to individual variables which can then be used for plotting.
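An alternative that avoids unpacking every column is to ask genfromtxt for only the numeric columns. This is a sketch under assumptions: the five-column layout (name, x, y, a, b) is taken from the answer above, and the sample rows are invented, since the real file is not shown.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-in for the CSV file: one header row, one footer row,
# and five columns (name, x, y, a, b)
csv_text = """station,x,y,a,b
S1,0.0,1.0,5,6
S1,1.0,3.0,5,6
S1,2.0,2.0,5,6
end,0,0,0,0
"""

# usecols picks the x and y columns; unpack=True returns one array per column
x, y = np.genfromtxt(io.StringIO(csv_text), delimiter=",",
                     skip_header=1, skip_footer=1,
                     usecols=(1, 2), unpack=True)

fig, ax = plt.subplots()
ax.plot(x, y, "o-")

# Render the figure; call plt.show() or fig.savefig(...) to view it
fig.canvas.draw()
```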

Matplotlib contour plot with intersecting contour lines

I am trying to make a contour plot of the following data using matplotlib in python. The data is of this form -
# x y height
77.23 22.34 56
77.53 22.87 63
77.37 22.54 72
77.29 22.44 88
The data actually consists of nearly 10,000 points, which I am reading from an input file. However, the set of distinct possible values of z is small (integers within 50-90), and I wish to have a contour line for every such distinct z.
Here is my code -
import matplotlib
import numpy as np
import matplotlib.cm as cm
import matplotlib.mlab as mlab
import matplotlib.pyplot as plt
import csv
import sys
# read data from file
data = csv.reader(open(sys.argv[1], 'rb'), delimiter='|', quotechar='"')
x = []
y = []
z = []
for row in data:
    try:
        x.append(float(row[0]))
        y.append(float(row[1]))
        z.append(float(row[2]))
    except Exception as e:
        pass
        #print e
X, Y = np.meshgrid(x, y) # (I don't understand why is this required)
# creating a 2D array of z whose leading diagonal elements
# are the z values from the data set and the off-diagonal
# elements are 0, as I don't care about them.
z_2d = []
default = 0
for i, no in enumerate(z):
    z_temp = []
    for j in xrange(i): z_temp.append(default)
    z_temp.append(no)
    for j in xrange(i+1, len(x)): z_temp.append(default)
    z_2d.append(z_temp)
Z = z_2d
CS = plt.contour(X, Y, Z, list(set(z)))
plt.figure()
CB = plt.colorbar(CS, shrink=0.8, extend='both')
plt.show()
Here is the plot of a small sample of data -
Here is a close look to one of the regions of the above plot (note the overlapping/intersecting lines) -
I don't understand why it doesn't look like a contour plot. The lines are intersecting, which shouldn't happen. What can be possibly wrong? Please help.
Try to use the following code. This might help you -- it's the same thing which was in the Cookbook:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.mlab import griddata
# with this way you can load your csv-file really easy -- maybe you should change
# the last 'dtype' to 'int', because you said you have int for the last column
data = np.genfromtxt('output.csv', dtype=[('x',float),('y',float),('z',float)],
comments='"', delimiter='|')
# just an assigning for better look in the plot routines
x = data['x']
y = data['y']
z = data['z']
# just an arbitrary number for grid point
ngrid = 500
# create an array with same difference between the entries
# you could use x.min()/x.max() for creating xi and y.min()/y.max() for yi
xi = np.linspace(-1,1,ngrid)
yi = np.linspace(-1,1,ngrid)
# create the grid data for the contour plot
zi = griddata(x,y,z,xi,yi)
# plot the contour and a scatter plot for checking if everything went right
plt.contour(xi,yi,zi,20,linewidths=1)
plt.scatter(x,y,c=z,s=20)
plt.xlim(-1,1)
plt.ylim(-1,1)
plt.show()
I created a sample output file with a Gaussian distribution in 2D. My result using the code above:
NOTE:
Maybe you noticed that the edges are somewhat cropped. This is because the griddata function creates masked arrays: the border of the plot is defined by the outermost points, and everything outside that border is masked. If all your points lie on a line, there is no contour to plot, which is logical. I mention it because of the four data points you posted; it seems likely that this applies to them, but maybe it doesn't =)
UPDATE
I edited the code a bit. Your problem was probably that you didn't read your input file correctly. With the following code the plot should work correctly.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.mlab import griddata
import csv
data = np.genfromtxt('example.csv', dtype=[('x',float),('y',float),('z',float)],
comments='"', delimiter=',')
sample_pts = 500
con_levels = 20
x = data['x']
xmin = x.min()
xmax = x.max()
y = data['y']
ymin = y.min()
ymax = y.max()
z = data['z']
xi = np.linspace(xmin,xmax,sample_pts)
yi = np.linspace(ymin,ymax,sample_pts)
zi = griddata(x,y,z,xi,yi)
plt.contour(xi,yi,zi,con_levels,linewidths=1)
plt.scatter(x,y,c=z,s=20)
plt.xlim(xmin,xmax)
plt.ylim(ymin,ymax)
plt.show()
With this code and your small sample I get the following plot:
Try my snippet and adapt it a bit. For example, for the given sample CSV file I had to change the delimiter from | to ,. The code I wrote for you is not really nice, but it's straightforward.
Sorry for the late response.
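One caveat for newer setups: matplotlib.mlab.griddata was removed in Matplotlib 3.1, so the import above fails there. A possible replacement is scipy.interpolate.griddata, which has a different call signature. The sketch below is an adaptation under assumptions, not the code from the answer: the scattered points are synthetic (a 2D Gaussian, as in the sample described above), and the grid size of 100 is arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np
from scipy.interpolate import griddata

# Synthetic scattered data standing in for the x, y, height columns
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 200)
y = rng.uniform(-1, 1, 200)
z = np.exp(-(x**2 + y**2))

# Regular grid spanning the data range
sample_pts = 100
xi = np.linspace(x.min(), x.max(), sample_pts)
yi = np.linspace(y.min(), y.max(), sample_pts)
XI, YI = np.meshgrid(xi, yi)

# SciPy's griddata takes the points as a tuple and the target grid as a tuple;
# grid nodes outside the convex hull of the data come back as NaN
zi = griddata((x, y), z, (XI, YI), method="linear")

fig, ax = plt.subplots()
cs = ax.contour(xi, yi, zi, 20, linewidths=1)
ax.scatter(x, y, c=z, s=20)
fig.colorbar(cs, ax=ax)

# Render the figure; call plt.show() or fig.savefig(...) to view it
fig.canvas.draw()
```

As with the masked arrays mentioned in the note above, the NaN region outside the outermost points is simply not contoured, so the cropped edges look the same.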
