I am currently plotting my dataset as box plots, using the code below:
plt.figure(figsize=(8,5))
fig = plt.figure()
num_list=Final_dataset.columns.values.tolist()
for i in range(len(num_list)):
    column=num_list[i]
    sns.boxplot(x="label", y=column, data=Final_dataset, palette='Set2')
    plt.savefig('{}.png'.format(i))
    plt.show()
I need to produce a single image that combines the plots for all attributes, as in this figure, rather than several separate figures.
How can I fix it?
Thanks a lot.
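See the subplot function in matplotlib. Select the subplot before drawing each box plot, and save the figure once after the loop.
nrows = 3 # decide how many you want
ncols = 4 # decide how many you want
plt.figure(figsize=(8,5))
num_list=Final_dataset.columns.values.tolist()
for i in range(len(num_list)):
    column=num_list[i]
    # select the subplot first (indices start at 1), then draw into it
    plt.subplot(nrows, ncols, i + 1)
    sns.boxplot(x="label", y=column, data=Final_dataset, palette='Set2')
plt.tight_layout()
plt.savefig('boxplots.png') # one file for the combined figure; the name is arbitrary
plt.show()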
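If you would rather not hard-code nrows and ncols, here is a minimal sketch that derives them from the number of columns. The helper name grid_shape and the default of 4 columns are my own choices, not something from matplotlib.

```python
import math


def grid_shape(n_plots, ncols=4):
    """Return (nrows, ncols) large enough to hold n_plots subplots."""
    if n_plots < 1:
        raise ValueError("n_plots must be at least 1")
    ncols = min(ncols, n_plots)          # never more columns than plots
    nrows = math.ceil(n_plots / ncols)   # enough rows to fit the remainder
    return nrows, ncols


# Example: 10 attributes laid out 4 per row need 3 rows.
nrows, ncols = grid_shape(10, ncols=4)
print(nrows, ncols)  # 3 4
```

You could then call grid_shape(len(num_list)) and pass the result to plt.subplot(nrows, ncols, i + 1) in the loop.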
I am trying to build a heat map. It works well when there are a lot of categorical values, but it does not look good when there are only two or three data points, as shown in the second image. I am looking for a way to auto-adjust the plot based on the number of data points.
Here is the function:
def bivariate(col1,col2,Title,cbar_size):
    temp2=modifiedloan.groupby([col1,col2,'loan_status']).id.agg('count').to_frame('count').reset_index()
    temp3=temp2.pivot_table(index=(col1,col2), columns='loan_status', values='count').fillna(0)
    temp3['default%']=(temp3[0]/(temp3[0]+temp3[1]))
    temp3=temp3.reset_index()
    temp4=temp3.pivot_table(index=col1, columns=col2, values='default%').fillna(0)
    temp5=temp3.pivot_table(index=col1, columns=col2, values=[0]).fillna(0)
    f, (ax1) = plt.subplots(nrows=1, ncols=1, figsize=(12,6))
    cmap = sns.cm.rocket_r
    sns.heatmap(temp4,linewidths=1, ax=ax1,annot=False, fmt='g',cmap=cmap,cbar=True,cbar_kws={"shrink": cbar_size})
    sns.heatmap(temp5, annot=True, annot_kws={'va':'top'}, fmt="", cbar=False,ax=ax1)
    sns.heatmap(temp4, annot=True, fmt=".1%",annot_kws={'va':'bottom'}, cbar=False,cmap=cmap)
    plt.ylim(b, t) # update the ylim(bottom, top) values
    ax1.set_title(Title)
    plt.tight_layout()
I realized the problem was the version of seaborn installed. The same function worked well on one of my colleagues' machines.
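For anyone hitting something similar, a quick way to compare two machines is to print the installed package versions on each. This is only a sketch: the helper name installed_versions and the list of packages are my own choices, and the post does not say which versions were involved.

```python
from importlib import metadata


def installed_versions(packages=("seaborn", "matplotlib", "pandas")):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Run this on both machines and compare the output line by line.
for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```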
I'm pretty new to Python. I'm trying to plot a box plot for some sample data.
I'm trying to plot the mean values of the shared data, and I have that part of the code working. I'm also trying to show the standard error values on this plot using the yerr argument.
My code:
data3=pd.read_csv('demo1.csv')
names=['brow', 'harr', 'hage', 'buch', 'mcre']
d=[data3['brow'].mean(),data3['harr'].mean(),data3['hage'].mean(),data3['buch'].mean(),data3['mcre'].mean()]
N=len(data3['co'])
l=math.sqrt(N)
k=[(data3['brow'].std())/l,(data3['harr'].std())/l,(data3['hage'].std())/l,(data3['buch'].std())/l,(data3['mcre'].std())/l,(data3['phil'].std())/l,(data3['moor'].std())/l]
fig, ax = plt.subplots()
plt.bar(names,d)
plt.bar(len(names),d,yerr=k,align='center',alpha=0.5,ecolor='black',capsize=10)
I'm getting an image like this:
But I want the black lines to be drawn on each bar, not as a new bar in the plot with all of them together. How can I change this? Am I using plt the wrong way? Please help.
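I don't understand what you were trying to do with your second call to plt.bar(). Pass yerr to a single plt.bar() call instead:
import math
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

names=['brow', 'harr', 'hage', 'buch', 'mcre']
# random stand-in for the data read from demo1.csv
data3 = pd.DataFrame({n: np.random.normal(loc=np.random.randint(5,10), scale=np.random.randint(1,10), size=(100,)) for n in names})
d=data3[names].mean()
N=len(data3) # number of observations (100 here)
l=math.sqrt(N)
k=data3[names].std()/l # standard error of the mean for each column
fig, ax = plt.subplots()
# a single call draws the bars and their error bars
plt.bar(names,d,yerr=k,align='center',alpha=0.5,ecolor='black',capsize=10)
plt.show()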
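In case the std()/sqrt(N) step is unclear, here is a small worked example of the standard error of the mean using only the standard library. The sample numbers are made up, and the function name standard_error is my own. It uses the sample standard deviation, which is also the pandas default for std().

```python
import math
import statistics


def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    return statistics.stdev(values) / math.sqrt(n)


sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up measurements
print("mean:", statistics.mean(sample))
print("standard error:", standard_error(sample))
```

The mean is the bar height and the standard error is what gets passed as yerr for that bar.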
I'm trying to draw two different plots, one to the left of the other. I tried to use subplot from matplotlib, but it puts the second plot below the first. I suppose I'm using subplot the wrong way. This is my code:
# Create bars
from matplotlib.pyplot import figure
bar = plt.figure(figsize=(10,5))
plt.subplot(121)
plt.barh(plottl['Nombres'] ,plottl['Probas'])
presunto= plt.figure(figsize=(10,10))
presunto = plt.subplot(122)
img=mpimg.imread((predict+names[0]+ '/'+ onlyfiles[0]))
mgplot = plt.imshow(img)
plt.show()
predictions=[]
Here is a picture of what is happening:
I was hoping you could help me solve this. Thank you all in advance.
Edit: I have added the requested picture here.
You are creating two figures instead of one figure with two subplots. Remove the line presunto= plt.figure(figsize=(10,10)) and it should work.
You are creating two figures instead of two subplots. It is also better to use gridspec when you want to draw subplots with different sizes. Look at this link.
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

bar = plt.figure(figsize=(10,5)) # one figure holds both subplots
plt.subplot(121) # left subplot
plt.barh(plottl['Nombres'] ,plottl['Probas'])
presunto = plt.subplot(122) # right subplot
img=mpimg.imread((predict+names[0]+ '/'+ onlyfiles[0]))
mgplot = plt.imshow(img,aspect="auto")
plt.show()
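The gridspec suggestion can be sketched with plt.subplots and its gridspec_kw argument, which lets the two panels have different widths. The data below is placeholder data standing in for plottl and the image from the question, and the 1:2 width ratio is just an assumption for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np

# Placeholder data (not from the question).
labels = ["a", "b", "c"]
probas = [0.2, 0.5, 0.3]
img = np.random.default_rng(0).random((20, 20))

# One figure, one row, two columns; the right panel is twice as wide.
fig, (ax_bar, ax_img) = plt.subplots(
    1, 2, figsize=(10, 5),
    gridspec_kw={"width_ratios": [1, 2]},
)
ax_bar.barh(labels, probas)        # bar chart on the left
ax_img.imshow(img, aspect="auto")  # image on the right
fig.tight_layout()
fig.canvas.draw()                  # render; use fig.savefig(...) to write a file
```

Drawing through the returned axes objects (ax_bar, ax_img) avoids depending on which subplot is currently active.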
I'm doing some EDA using pandas and seaborn. This is the code I have to plot the histograms of a group of features:
skewed_data = pd.DataFrame.skew(data)
skewed_features =skewed_data.index
fig, axs = plt.subplots(ncols=len(skewed_features))
plt.ticklabel_format(style='sci', axis='both', scilimits=(0,0))
for i,skewed_feature in enumerate(skewed_features):
    g = sns.distplot(data[column])
    sns.distplot(data[skewed_feature], ax=axs[i])
This is the result I'm getting:
It is not readable. How can I avoid that issue?
I know you are concerned about the layout of the figures. However, you first need to decide how to represent your data. Here are two choices for your case:
(1) Multiple lines in one figure, and
(2) Multiple subplots in a 2x2 grid, with each subplot drawing one line.
I am not very familiar with seaborn, but seaborn's plotting is based on matplotlib, so I can give you some basic ideas.
To achieve (1), you can first declare the figure and ax, then add all the lines to this ax. Example code:
import matplotlib.pyplot as plt
import seaborn as sns
fig, ax = plt.subplots()
# YOUR LOOP, use the ax parameter
for i in range(3):
    sns.distplot(data[i], ax=ax)
To achieve (2), do the same as above, but with a different number of subplots, and put each line in a different subplot.
# Four subplots, 2x2
fig, axarr = plt.subplots(2,2)
# YOUR LOOP, use different cell
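As a sketch of how the placeholder loop for option (2) might be filled in: the features below are made-up skewed samples standing in for the columns of data, and I use matplotlib's own ax.hist rather than sns.distplot so the example depends only on matplotlib and numpy.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
# Made-up skewed features (hypothetical names, not from the question).
features = {
    f"feature_{i}": rng.exponential(scale=i + 1, size=200) for i in range(4)
}

fig, axarr = plt.subplots(2, 2, figsize=(8, 6))
# axarr is a 2x2 array of axes; .flat walks it cell by cell.
for ax, (name, values) in zip(axarr.flat, features.items()):
    ax.hist(values, bins=20)
    ax.set_title(name)
fig.tight_layout()  # keeps titles and tick labels from overlapping
fig.canvas.draw()
```

With seaborn, the same loop should work by passing ax=ax to the plotting call, as in option (1).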
You may check the matplotlib subplots demo. Good visualization is tough work, and there is a lot of documentation to read. Checking the matplotlib or seaborn gallery is a good, quick way to understand how particular kinds of visualization are implemented.
Thanks.