I am reading the documentation of seaborn.barplot and I read the following.
estimator : string or callable that maps vector -> scalar, optional
Statistical function to estimate within each categorical bin.
I could not understand what callable that maps vector -> scalar means. What does this statement convey?
When I passed estimator = 'mean', I got this error.
TypeError: 'str' object is not callable
What should we pass as a string?
A callable is anything that can be called like a function: a named function, a lambda, or any other object that defines __call__. Here it must take a vector as its argument and return a scalar. Functions are "first-class citizens" in Python, so they can be passed as arguments and generally treated like any other object.
See example here:
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
penguin_data = sns.load_dataset("penguins")
f = plt.figure(figsize=(6, 4))
fig = sns.barplot(x="species", y="body_mass_g", palette="flare",
                  estimator=np.mean, data=penguin_data)
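To make "callable that maps vector -> scalar" concrete, here is a minimal sketch using only the standard library. The helper name value_range and the sample list are made up for illustration. The last part imitates what a plotting library does when it calls the estimator, which is presumably where the 'str' object is not callable error comes from in a seaborn version that does not accept strings.

```python
import statistics


def value_range(values):
    """Hypothetical estimator: takes a vector, returns one number."""
    return max(values) - min(values)


sample = [3.0, 1.0, 4.0, 1.0, 5.0]

# Each of these is a callable that maps a vector to a scalar.
estimators = [
    statistics.mean,
    statistics.median,
    value_range,
    lambda v: sum(v) / len(v),
]
results = [estimate(sample) for estimate in estimators]

# A string is not callable, so using it like a function fails.
estimator = "mean"
try:
    estimator(sample)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
```

Any of these callables could be passed as estimator=..., for example estimator=value_range. As far as I know, newer seaborn releases also accept method names such as "mean" as strings, so upgrading may be an alternative; check the documentation for the version you have installed.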
I am trying to do exponential smoothing in Python on some detrended data in a Jupyter notebook. I try to import
from statsmodels.tsa.api import ExponentialSmoothing
but the following error comes up
ImportError: cannot import name 'SimpleExpSmoothing'
I don't know how to solve that problem from a Jupyter notebook, so I am trying to declare a function that does the exponential smoothing.
Let's say the function is called expsmoth(list,a): it takes a list list and a number a and returns another list, explist, whose elements are given by the following recurrence relation:
explist[0] == list[0]
explist[i] == a*list[i] + (1-a)*explist[i-1]
I am still learning Python. How do I declare a function that takes a list and a number as arguments and returns a list whose elements are given by the above recurrence relation?
A simple solution to your problem would be
def explist(data, a):
    smooth_data = data.copy()  # make a copy to avoid changing the original list
    for i in range(1, len(data)):
        smooth_data[i] = a*data[i] + (1-a)*smooth_data[i-1]
    return smooth_data
The function should work with both native Python lists and NumPy arrays.
import matplotlib.pyplot as plt
import numpy as np
data = np.random.random(100) # some random data
smooth_data = explist(data, 0.2)
plt.plot(data, label='original')
plt.plot(smooth_data, label='smoothed')
plt.legend()
plt.show()
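One caveat with a copy-based version: if data is an integer NumPy array, the copy is an integer array too, so the smoothed values are truncated when they are assigned back. A minimal variant that sidesteps this by building a new list of floats, using the name expsmoth from the question (converting to float is my assumption about what is wanted):

```python
def expsmoth(values, a):
    """Exponentially smooth values with smoothing factor a (0 <= a <= 1)."""
    if len(values) == 0:
        return []
    explist = [float(values[0])]              # explist[0] == values[0]
    for x in values[1:]:
        # explist[i] == a*values[i] + (1-a)*explist[i-1]
        explist.append(a * float(x) + (1 - a) * explist[-1])
    return explist


smoothed = expsmoth([1, 2, 3], 0.5)
```

With a close to 1 the output follows the input closely; with a close to 0 it changes slowly and stays near the first value.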
What's the problem here? I'm using Python 2.7. Thanks.
This is the error:
AssertionError: incompatible sizes: argument 'height' must be length 0 or scalar
from numpy import zeros, random
m=zeros(10,int)
for i in range(10000):
    n=random.random()
    if 0.0<=n and n<0.1: m[0]=m[0]+1
    if 0.1<=n and n<0.2: m[1]=m[1]+1
    if 0.2<=n and n<0.3: m[2]=m[2]+1
    if 0.3<=n and n<0.4: m[3]=m[3]+1
    if 0.4<=n and n<0.5: m[4]=m[4]+1
    if 0.5<=n and n<0.6: m[5]=m[5]+1
    if 0.6<=n and n<0.7: m[6]=m[6]+1
    if 0.7<=n and n<0.8: m[7]=m[7]+1
    if 0.8<=n and n<0.9: m[8]=m[8]+1
    if 0.9<=n and n<1.0: m[9]=m[9]+1
print m
from pylab import *
bar(arange(0.1,0.1),m,width=0.1)
#show()
savefig('5.4graph.png')
This should accomplish what you want, though it might not make complete sense to you:
# this does pretty much what you're trying to do with your for loop
m = map(lambda x: (random.random()/10)+1+.1*x,range(10))
print m
bar(arange(10)+.1,m)
show()
#or savefig('test.png')
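For reference, the assertion in the question is raised because arange(0.1,0.1) is empty, so bar receives zero x positions for ten heights. Below is a sketch of one way to do the binning and get matching positions, written for Python 3 and assuming NumPy's histogram function is acceptable here. The names samples, counts, edges and plot_counts are mine, and the plotting helper is defined but not called.

```python
import numpy as np

rng = np.random.default_rng(0)          # seeded only to make the run repeatable
samples = rng.random(10000)             # uniform numbers in [0, 1)

# Count how many samples fall into each of 10 equal-width bins on [0, 1].
counts, edges = np.histogram(samples, bins=10, range=(0.0, 1.0))


def plot_counts(counts, edges, filename="5.4graph.png"):
    """Draw one bar per bin; x positions and heights have the same length."""
    import matplotlib.pyplot as plt

    widths = np.diff(edges)             # one width per bin
    plt.bar(edges[:-1], counts, width=widths, align="edge")
    plt.savefig(filename)
```

Calling plot_counts(counts, edges) writes the chart to the file name used in the question. The left bin edges (edges[:-1]) have the same length as counts, which is what bar needs.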
import numpy as np
import matplotlib.pyplot as plt
x = np.random.random(1000000)
fig = plt.figure()
ax1 = fig.add_subplot(1, 1, 1)
n, bins, patches = ax1.hist(x, 25, density=True)  # 'normed' was removed in newer matplotlib; 'density' replaces it
ax1.set_title('Distribution from random numbers')
#plt.show()
plt.savefig('histogram1.png')
I tried to plot the output of the function defined below with respect to z. However, I get the error TypeError: unhashable type: 'numpy.ndarray'. Please help.
import numpy as np
import matplotlib.pyplot as plt
import sympy as sp
a=1.48185562
b=0.57081914
c=-0.25098188
H0=70.32724312
z=np.linspace(0.0,1.5,100)
omega_m0=0.3
dlabel= 'w(z) vz z'
def func(z):
    sp.var('z+1')
    H=((2/H0)*((b*(z+1)+c*(z+1)**0.5+2.0-a-b-c)*(1-0.5*a*(z+1)**(-0.5)) - ((z+1)-a*(z+1)**0.5-1.0+a)*(b+c*0.5*(z+1)**(-0.5)))/(b*(z+1)+c*(z+1)**0.5+2.0-a-b-c)**2)**(-1)
    return ((2*(z+1)/3)*(sp.diff(sp.log(H)))-1)/(1-(H/H0)**2*omega_m0*(z+1)**3)
wz=func(z)
plt.plot(z,wz)
plt.xlabel('z')
plt.ylabel('w(z)')
plt.show()
I'm not sure what you want to do with sp.var('z+1'); at least I hope you were not trying to create a variable named z+1. I got the code to run, but I'll let you check that it does what you want and complain if not :)
import numpy as np
import matplotlib.pyplot as plt
import sympy as sp
a=1.48185562
b=0.57081914
c=-0.25098188
H0=70.32724312
x=np.linspace(0.0,1.5,100)
omega_m0=0.3
dlabel= 'w(z) vz z'
sp.var('z')
def func(z):
    H=((2/H0)*((b*(z+1)+c*(z+1)**0.5+2.0-a-b-c)*(1-0.5*a*(z+1)**(-0.5)) - ((z+1)-a*(z+1)**0.5-1.0+a)*(b+c*0.5*(z+1)**(-0.5)))/(b*(z+1)+c*(z+1)**0.5+2.0-a-b-c)**2)**(-1)
    return ((2*(z+1)/3)*(sp.diff(sp.log(H)))-1)/(1-(H/H0)**2*omega_m0*(z+1)**3)
wz = [func(z).evalf(subs = {z : y}) for y in x]
plt.plot(x,wz)
plt.xlabel('z')
plt.ylabel('w(z)')
plt.show()
EDIT: to compute wz, the following is much faster (cf. "Evaluate sympy expression from an array of values"):
from sympy.utilities.lambdify import lambdify
func_np_ready = lambdify(z, func(z),'numpy') # returns a numpy-ready function
wz = func_np_ready(x)
You may be better off tagging your question with sympy: the behaviour of one of its functions is probably what's causing the issue, and someone else might know all about it.
It's probably a good idea to split those really long formulas over multiple lines (at least while debugging) to help you track down the error. Also add some print statements.
I know it's not what you want to achieve, but if I cut out the sympy (I don't have it installed!) and adjust the array lengths, it plots without error:
...
    H=((2/H0)*((b*(z+1)+c*(z+1)**0.5+2.0-a-b-c)*(1-0.5*a*(z+1)**(-0.5)) - ((z+1)-a*(z+1)**0.5-1.0+a)*(b+c*0.5*(z+1)**(-0.5)))/(b*(z+1)+c*(z+1)**0.5+2.0-a-b-c)**2)**(-1)
    return ((2*(z[:-1]+1)/3)*(np.diff(np.log(H)))-1)/(1-(H[:-1]/H0)**2*omega_m0*(z[:-1]+1)**3)
wz=func(z)
plt.plot(z[:-1],wz)
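The [:-1] slicing is needed because np.diff returns one element fewer than its input. It also gives differences between neighbouring samples rather than a derivative with respect to z. If a numerical derivative is good enough, np.gradient may be a simpler fit, since it divides by the grid spacing and keeps the array length. A small sketch on a stand-in function (y = x**2 is just a placeholder, not the H from the question):

```python
import numpy as np

x = np.linspace(0.0, 1.5, 100)
y = x**2                       # placeholder for a quantity such as log(H)

steps = np.diff(y)             # neighbouring differences, one element shorter
dy_dx = np.gradient(y, x)      # derivative estimate, same length as x
```

Here dy_dx can be plotted directly against x without trimming either array. The estimate is least accurate at the two end points, where only a one-sided difference is available.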
EDIT Just realized the way I was parsing in the data was deleting numbers so I didn't have an array for the correct shape. Thanks mgilson, you provided fantastic answers!
I'm trying to make a heatmap of data using python. I have found this basic code that works:
import matplotlib.pyplot as plt
import numpy as np
data = np.random.rand(3,3)
fig, ax = plt.subplots()
heatmap = ax.pcolor(data, cmap=plt.cm.Blues)
plt.show()
f.close()
However, when I try to put in my data, which is currently formatted as a list of lists (data=[[1,2,3],[1,2,3],[1,2,3]]), it gives me the error: AttributeError: 'list' object has no attribute 'shape'.
What is the data structure that np.random.rand() produces and that Python uses for heatmaps? How do I convert my list of lists into that data structure? Thanks so much!
This is what my data looks like, if that helps:
[[0.174365079365079, 0.147356200527704, 0.172903394255875, 0.149252948885976, 0.132479381443299, 0.279736780258519, 0.134908163265306, 0.127802340702211, 0.131209302325581, 0.100632627646326, 0.127636363636364, 0.146028409090909],
[0.161473684210526, 0.163691529709229, 0.166841698841699, 0.144, 0.13104, 0.146225563909774, 0.131002409638554, 0.125977358490566, 0.107940372670807, 0.100862068965517, 0.13436641221374, 0.130921518987342],
[0.15640362225097, 0.152472361809045, 0.101713567839196, 0.123847328244275, 0.101428924598269, 0.102045112781955, 0.0999014778325123, 0.11909887359199, 0.186751958224543, 0.216221343873518, 0.353571428571429],
[0.155185378590078, 0.151626168224299, 0.112484210526316, 0.126333764553687, 0.108763358778626],
[0.792675, 0.681526248399488, 0.929269035532995, 0.741649167733675, 0.436010126582278, 0.462519447929736, 0.416332480818414, 0.135318181818182, 0.453331639135959, 0.121893919793014, 0.457028132992327, 0.462558139534884],
[0.779800766283525, 1.02741401273885, 0.893561712846348, 0.710062015503876, 0.425114754098361, 0.388704980842912, 0.415049608355091, 0.228122605363985, 0.128575796178344, 0.113307392996109, 0.404273195876289, 0.414923673997413],
[0.802428754813864, 0.601316326530612, 0.156620689655172, 0.459367588932806, 0.189442875481386, 0.118344827586207, 0.127080939947781, 0.2588, 0.490834196891192, 0.805660574412533, 3.17598959687906],
[0.873314136125655, 0.75143661971831, 0.255721518987342, 0.472793854033291, 0.296584980237154]]
It's a numpy.ndarray. You can construct it easily from your data:
import numpy as np
data = np.array([[1,2,3],[1,2,3],[1,2,3]])
(np.asarray would also work. If given an array, it just returns it; otherwise it constructs a new one. np.array, by contrast, copies its input by default.)
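Note that np.array only yields a 2-D numeric array when every row has the same length, and the sample data in the question has rows of 12, 11 and 5 values. A minimal sketch of one way to handle that, assuming it is acceptable to pad short rows with NaN; rows, grid and masked are placeholder names:

```python
import numpy as np

rows = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]     # ragged stand-in data

width = max(len(row) for row in rows)
grid = np.full((len(rows), width), np.nan)       # start with all-NaN cells
for i, row in enumerate(rows):
    grid[i, :len(row)] = row                     # fill in the values that exist

masked = np.ma.masked_invalid(grid)              # mask the NaN padding
```

masked can then be passed to ax.pcolor in place of data; as far as I know, masked cells are simply left blank. If the short rows come from a parsing mistake, fixing the parser is of course the better route.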