Necessary data structure for making heatmaps in Python

EDIT: I just realized that the way I was parsing the data was dropping numbers, so my array didn't have the correct shape. Thanks mgilson, you provided fantastic answers!
I'm trying to make a heatmap of data using Python. I have found this basic code that works:
import matplotlib.pyplot as plt
import numpy as np
data = np.random.rand(3,3)
fig, ax = plt.subplots()
heatmap = ax.pcolor(data, cmap=plt.cm.Blues)
plt.show()
plt.close(fig)
However, when I try to put in my data, which is currently formatted as a list of lists (data=[[1,2,3],[1,2,3],[1,2,3]]), it gives me the error: AttributeError: 'list' object has no attribute 'shape'.
What data structure does np.random.rand() produce, and what does Python use for heatmaps? How do I convert my list of lists into that data structure? Thanks so much!
This is what my data looks like, if that helps:
[[0.174365079365079, 0.147356200527704, 0.172903394255875, 0.149252948885976, 0.132479381443299, 0.279736780258519, 0.134908163265306, 0.127802340702211, 0.131209302325581, 0.100632627646326, 0.127636363636364, 0.146028409090909],
[0.161473684210526, 0.163691529709229, 0.166841698841699, 0.144, 0.13104, 0.146225563909774, 0.131002409638554, 0.125977358490566, 0.107940372670807, 0.100862068965517, 0.13436641221374, 0.130921518987342],
[0.15640362225097, 0.152472361809045, 0.101713567839196, 0.123847328244275, 0.101428924598269, 0.102045112781955, 0.0999014778325123, 0.11909887359199, 0.186751958224543, 0.216221343873518, 0.353571428571429],
[0.155185378590078, 0.151626168224299, 0.112484210526316, 0.126333764553687, 0.108763358778626],
[0.792675, 0.681526248399488, 0.929269035532995, 0.741649167733675, 0.436010126582278, 0.462519447929736, 0.416332480818414, 0.135318181818182, 0.453331639135959, 0.121893919793014, 0.457028132992327, 0.462558139534884],
[0.779800766283525, 1.02741401273885, 0.893561712846348, 0.710062015503876, 0.425114754098361, 0.388704980842912, 0.415049608355091, 0.228122605363985, 0.128575796178344, 0.113307392996109, 0.404273195876289, 0.414923673997413],
[0.802428754813864, 0.601316326530612, 0.156620689655172, 0.459367588932806, 0.189442875481386, 0.118344827586207, 0.127080939947781, 0.2588, 0.490834196891192, 0.805660574412533, 3.17598959687906],
[0.873314136125655, 0.75143661971831, 0.255721518987342, 0.472793854033291, 0.296584980237154]]

It's a numpy.ndarray. You can construct it easily from your data:
import numpy as np
data = np.array([[1,2,3],[1,2,3],[1,2,3]])
(np.asarray would also work. If given an array, it returns that array unchanged; otherwise it constructs a new one. np.array, by contrast, constructs a new array by default.)
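As a minimal sketch of the conversion (the helper name to_heatmap_array is hypothetical, not part of NumPy), the following checks that every row has the same length before calling np.asarray. A ragged list of lists, like the sample data in the question, cannot form a 2-D array with a usable shape:

```python
import numpy as np

def to_heatmap_array(rows):
    """Convert a list of equal-length rows into a 2-D float ndarray.

    Hypothetical helper for illustration; raises ValueError on ragged input.
    """
    lengths = {len(row) for row in rows}
    if len(lengths) != 1:
        raise ValueError(f"rows have differing lengths: {sorted(lengths)}")
    return np.asarray(rows, dtype=float)

data = to_heatmap_array([[1, 2, 3], [1, 2, 3], [1, 2, 3]])
print(type(data).__name__, data.shape)  # ndarray (3, 3)
```

The resulting array can be passed to ax.pcolor(data, cmap=plt.cm.Blues) as in the question.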

Related

Python Array Copying

I'm trying to figure out how to copy an array output into a new array for multiple iterations. The scenario is to run a function in a for loop with varying inputs then overlay the results on a single plot for comparison. Currently I have it running where I get three arrays from the for loop, but this results in three independent plots.
My coding is not very solid so some guidance would be appreciated. I was reading up on the list copy function but have not been able to get it to do what I want.
for z,wn in mylist:
    G1 = y_numeric(z,wn)
    #np.array(output[i,:])=G1.copy()
    #plt.figure()
    #plt.plot(t,G1[:])
    #print(G1)
    #print(output)

@user158430, here is a simple working example that might help you navigate through your code:
import matplotlib.pyplot as plt
import numpy as np
#creating empty list to append all y variables
all_y = []
#creating random variables
x,y1,y2,y3=np.arange(0,50,1),np.arange(50,100,1),np.arange(100,150,1),np.arange(150,200,1)
#appending only y variables into the created empty list
all_y.extend((y1,y2,y3))
#looping to plot on one single figure
plt.figure() #created once, before the loop, so all lines share one figure; move it inside the loop to get a separate figure per line
for i in all_y:
    plt.plot(x,i)
plt.show()
This should produce a single figure with the three lines overlaid.
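For the original goal of copying each loop result into one array, a possible sketch is to preallocate a 2-D array and fill one row per iteration. Here y_numeric is a hypothetical stand-in (a damped cosine that also takes t), and the (z, wn) pairs in mylist are assumed values, since the question does not show either:

```python
import numpy as np

def y_numeric(z, wn, t):
    # Hypothetical stand-in for the asker's function: a damped cosine.
    return np.exp(-z * wn * t) * np.cos(wn * t)

t = np.linspace(0.0, 5.0, 50)
mylist = [(0.1, 1.0), (0.3, 2.0), (0.5, 3.0)]  # (z, wn) pairs, assumed values

# One row per (z, wn) pair, one column per time sample.
output = np.empty((len(mylist), t.size))
for i, (z, wn) in enumerate(mylist):
    output[i, :] = y_numeric(z, wn, t)  # assignment copies the values into row i

print(output.shape)  # (3, 50)
```

A single call to plt.plot(t, output.T) would then draw every row as its own line on the same axes.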

Declare a function to do exponential smoothing on data

I am trying to do exponential smoothing in Python on some detrended data in a Jupyter notebook. I try to import
from statsmodels.tsa.api import ExponentialSmoothing
but the following error comes up
ImportError: cannot import name 'SimpleExpSmoothing'
I don't know how to solve that problem from a Jupyter notebook, so I am trying to declare a function that does the exponential smoothing.
Let's say the function is called expsmoth(list,a). It takes a list called list and a number a, and returns another list called explist whose elements are given by the following recurrence relation:
explist[0] == list[0]
explist[i] == a*list[i] + (1-a)*explist[i-1]
I am still learning Python. How do I declare a function that takes a list and a number as arguments and returns a list whose elements are given by the above recurrence relation?

A simple solution to your problem would be
def explist(data, a):
    smooth_data = data.copy() # make a copy to avoid changing the original list
    for i in range(1, len(data)):
        smooth_data[i] = a*data[i] + (1-a)*smooth_data[i-1]
    return smooth_data
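The function should work with both native Python lists and NumPy arrays (for NumPy arrays, use a float dtype, because an integer array would truncate the smoothed values).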
import matplotlib.pyplot as plt
import numpy as np
data = np.random.random(100) # some random data
smooth_data = explist(data, 0.2)
plt.plot(data, label='original')
plt.plot(smooth_data, label='smoothed')
plt.legend()
plt.show()
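To check the recurrence by hand, here is a small pure-Python sketch that builds a new list instead of copying the input; the name exp_smooth is only illustrative. With a = 0.5 and input [1, 2, 3], the recurrence gives 1, then 0.5*2 + 0.5*1 = 1.5, then 0.5*3 + 0.5*1.5 = 2.25:

```python
def exp_smooth(values, a):
    """Return a new list holding the exponentially smoothed `values`."""
    if len(values) == 0:
        return []
    smoothed = [float(values[0])]  # explist[0] == list[0]
    for x in values[1:]:
        # explist[i] == a*list[i] + (1-a)*explist[i-1]
        smoothed.append(a * x + (1 - a) * smoothed[-1])
    return smoothed

print(exp_smooth([1, 2, 3], 0.5))  # [1.0, 1.5, 2.25]
```

Smaller values of a weight the previous smoothed value more heavily, so the output follows the input more slowly.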

Choosing the correct values in excel in Python

General Overview:
I am creating a graph of a large data set; however, I have created a sample text document so that the problems are easier to work through.
The data is from an Excel document that will be saved as a CSV.
Problem:
I am able to read the data and graph it (see below). However, the way I pull the data will not work for all of the different Excel sheets I am going to pull from.
More detail on the problem:
The Y-values (labeled 'Value' and 'Value1') are pulled from the Excel sheet using the numbers 26 and 31 (see picture and code).
This is a problem because the values 26 and 31 will not be the same for each graph.
Let's take a look so that this makes more sense.
Here is my code
import pandas as pd
import matplotlib.pyplot as plt
pd.read_csv('CSV_GM_NB_Test.csv').T.to_csv('GM_NB_Transpose_Test.csv', header=False)
df = pd.read_csv('GM_NB_Transpose_Test.csv', skiprows = 2)
DID = df['SN']
Value = df['26']
Value1 = df['31']
x= (DID[16:25])
y= (Value[16:25])
y1= (Value1[16:25])
"""
print(x,y)
print(x,y1)
"""
plt.plot(x.astype(int), y.astype(int))
plt.plot(x.astype(int), y1.astype(int))
plt.show()
Output:
Data Set:
Below in the comments you will find the 0bin link to my data set; this is because I do not have enough reputation to post two links.
As you can see from the data set:
X- DID = Blue
Y-Value = Green
Y-Value1 = Grey
Troublesome Values = Red
The problem, again, is that the data for the Y-values are pulled from rows 10 and 11, using the values 26 and 31 under SN.
Let me know if more information is needed.
Thank you

Not sure why you are creating the transposed CSV version. It is also possible to work directly from your original data. For example:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.read_csv('CSV_GM_NB_Test.csv', skiprows=8)
data = df.iloc[:,19:].T
data.columns = df['SN']
data.plot()
plt.show()
This plots one line per SN value.
You can use pandas.DataFrame.iloc to give you a sliced version of your data using integer positions. The [:,19:] says to give you all rows and columns 19 onwards. The final .T transposes it. You can then apply the values of the SN column as column headings by assigning to .columns.
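The slicing and relabelling steps can be tried on a tiny in-memory table. This is only a sketch: the CSV text and its column names (c1, c2, c3) are made up for illustration, and the slice starts at column 1 rather than 19 because the toy table has a single label column:

```python
import io
import pandas as pd

# Hypothetical stand-in for the CSV: an 'SN' label column followed by numeric columns.
csv_text = """SN,c1,c2,c3
26,1,2,3
31,4,5,6
"""
df = pd.read_csv(io.StringIO(csv_text))

data = df.iloc[:, 1:].T   # all rows, columns from position 1 onwards, then transpose
data.columns = df['SN']   # label each series by its SN value

print(data[26].tolist())  # [1, 2, 3]
```

Because the series are labelled by whatever appears under SN, nothing in the code depends on the specific values 26 and 31.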

Imshow and pcolor throw errors when trying to create test pattern-style bars

I am trying to create an image to use as a test pattern for a new colormap I'm creating. The map is supposed to have nine unique colors with breaks at the integers from 0-8. The colormap itself is fine, but I can't seem to generate the image itself.
I'm using pandas to make the test array like this:
mask=pan.DataFrame(index=np.arange(0,100),columns=np.arange(1,91))
mask.ix[:,1:10]=0.0
mask.ix[:,11:20]=1.0
mask.ix[:,21:30]=2.0
mask.ix[:,31:40]=3.0
mask.ix[:,41:50]=4.0
mask.ix[:,51:60]=5.0
mask.ix[:,61:70]=6.0
mask.ix[:,71:80]=7.0
mask.ix[:,81:90]=8.0
Maybe not the most elegant method, but it creates the array I want.
However, when I try to plot it using either imshow or pcolor I get an error. So:
fig=plt.figure()
ax=fig.add_subplot(111)
image=ax.imshow(mask)
fig.canvas.draw()
yields the error: "TypeError: Image data can not convert to float"
and substituting pcolor for imshow yields this error: "AttributeError: 'float' object has no attribute 'view'"
However, when I replace the values in mask with anything else - say random numbers - it plots just fine:
mask=pan.DataFrame(values=rand(100,90),index=np.arange(0,100),columns=np.arange(1,91))
fig=plt.figure()
ax=fig.add_subplot(111)
image=ax.imshow(mask)
fig.canvas.draw()
yields the standard colored speckle one would expect (no errors).

The problem here is that your dataframe is full of objects, not numbers. You can see it if you do mask.dtypes. If you want to use pandas dataframes, create mask by specifying the data type:
mask=pan.DataFrame(index=np.arange(0,100),columns=np.arange(1,91), dtype='float')
otherwise pandas cannot know which data type you want. After that change your code should work.
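A short sketch of that fix, assuming the library aliases np and pd and using label-based .loc slices in place of the .ix calls from the question (.ix is not available in current pandas releases):

```python
import numpy as np
import pandas as pd

# Declaring dtype up front gives numeric columns instead of object columns.
mask = pd.DataFrame(index=np.arange(0, 100), columns=np.arange(1, 91), dtype='float')

# Label-based .loc slices include both endpoints: columns 1..10 -> 0.0, 11..20 -> 1.0, ...
for band in range(9):
    mask.loc[:, 10 * band + 1 : 10 * band + 10] = float(band)

values = mask.to_numpy()
print(values.dtype, values.shape)  # float64 (100, 90)
```

Checking mask.dtypes before plotting is a quick way to confirm that every column is numeric.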
However, if you want to just test the color maps with integers, then you might be better off using simple numpy arrays:
mask = np.empty((100,90), dtype='int')
mask[:, :10] = 0
mask[:, 10:20] = 1
...
And, of course, there are shorter ways to do that filling as well. For example:
mask[:] = np.arange(90)[None,:] // 10 # integer division maps columns 0-9 to 0, 10-19 to 1, and so on
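As a self-contained variant of the same idea, the sketch below builds the whole test pattern in one step with np.tile; the 100 by 90 size and the ten-column band width are taken from the question:

```python
import numpy as np

# Column j belongs to band j // 10, so 90 columns give the bands 0..8.
bands = np.arange(90) // 10
mask = np.tile(bands, (100, 1))  # repeat the band row 100 times

print(mask.shape, mask.min(), mask.max())  # (100, 90) 0 8
```

The result is an integer array, so it can be handed to imshow or pcolor without any dtype conversion.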

get numpy array of matplotlib tricontourf

I have x, y and height variables for building a contour plot in Python.
I created a triangulation grid using the following, where x, y, height and triang are numpy arrays:
tri = Tri.Triangulation(x, y, triang)
Then I drew a contour using tricontourf:
tricontourf(tri,height)
How can I get the output of tricontourf into a numpy array? I can display the image using pyplot, but I don't want to.
When I tried this:
triout = tricontourf(tri,height)
print(triout)
I got:
<matplotlib.tri.tricontour.TriContourSet instance at 0xa9ab66c>
I need to get the image data, and if I could get it as a numpy array it would be easy for me.
Is it possible to do this?
If it's not possible, can I do what tricontourf does without matplotlib in Python?

You should try this:
cs = tricontourf(tri,height)
for collection in cs.collections:
    for path in collection.get_paths():
        print(path.to_polygons())
as I learned from:
https://github.com/matplotlib/matplotlib/issues/367
(it is better to use path.to_polygons())
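If a regular grid of height values is enough, rather than the contour polygons, one alternative is to interpolate the triangulated data onto a grid with matplotlib.tri.LinearTriInterpolator. This is a different technique from what tricontourf does, and the toy data below (a unit square with a planar height field) is assumed purely for illustration:

```python
import numpy as np
import matplotlib.tri as mtri

# Assumed toy data: four corners of the unit square with a planar height field.
x = np.array([0.0, 1.0, 0.0, 1.0])
y = np.array([0.0, 0.0, 1.0, 1.0])
height = x + 2.0 * y
triangles = np.array([[0, 1, 2], [1, 3, 2]])

tri = mtri.Triangulation(x, y, triangles)
interp = mtri.LinearTriInterpolator(tri, height)

# Sample the triangulated surface on a regular grid of points.
xi, yi = np.meshgrid(np.linspace(0.1, 0.9, 5), np.linspace(0.15, 0.85, 4))
grid = interp(xi, yi)               # masked array; points outside the mesh are masked
image = np.ma.filled(grid, np.nan)  # plain ndarray, NaN where masked

print(image.shape)  # (4, 5)
```

No figure is created here, so nothing is displayed; image is an ordinary numpy array that can be processed further.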
