I'm trying to increase the number of xticks for each chart in the dataframe.
for c in df:
    fig = plt.figure(figsize=[10,5])
    ax = df[c].plot(kind='hist', color=(0.2,0.4,0.6,0.6), bins=30)
I've tried:
ax.xticks(np.arange(min(c),max(x)+1,1));
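This results in an AttributeError.
Is there a way to increase the number of xticks dynamically, without specifying the ticks explicitly, so that it works for every chart?
min(c) operates on the column name c rather than on the column's data, so it needs to be min(df[c]) (and likewise max(df[c]) instead of max(x)). Also, xticks is a pyplot function, not a method of the Axes object, which is why ax.xticks raises the AttributeError.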
it works this way:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
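for c in df:
    fig = plt.figure(figsize=[10,5])  # one figure per column, as in the question
    ax = df[c].plot(kind='hist', color=(0.2,0.4,0.6,0.6), bins=30)
    # one tick per unit between the column's minimum and maximum
    plt.xticks(np.arange(min(df[c]), max(df[c]), step=1))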
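If the goal is more ticks without computing them from the data at all, one option is a tick locator. The following is a minimal sketch, assuming matplotlib's MaxNLocator is acceptable for this; the sample dataframe and the choice of nbins=20 are made up for illustration and are not from the original post.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.ticker import MaxNLocator

# Hypothetical sample data standing in for the asker's df
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "a": rng.normal(50, 10, 500),
    "b": rng.integers(0, 20, 500),
})

axes_by_column = {}
for c in df:
    fig, ax = plt.subplots(figsize=(10, 5))
    df[c].plot(kind="hist", color=(0.2, 0.4, 0.6, 0.6), bins=30, ax=ax)
    # Allow up to 20 intervals on the x-axis; matplotlib still chooses
    # "nice" tick positions based on each chart's own data range.
    ax.xaxis.set_major_locator(MaxNLocator(nbins=20))
    axes_by_column[c] = ax
```

Because the locator only sets an upper bound on the number of intervals, the same line can be reused for every column regardless of its range.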
I have the following code where I am trying to plot a bar plot in seaborn. (This is sample data; both the x and y variables are continuous.)
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
xvar = [1,2,2,3,4,5,6,8]
yvar = [3,6,-4,4,2,0.5,-1,0.5]
year = [2010,2011,2012,2010,2011,2012,2010,2011]
df = pd.DataFrame()
df['xvar'] = xvar
df['yvar']=yvar
df['year']=year
df
sns.set_style('whitegrid')
fig,ax=plt.subplots()
fig.set_size_inches(10,5)
sns.barplot(data=df,x='xvar',y='yvar',hue='year',lw=0,dodge=False)
It results in the following plot:
Two questions here:
I want to be able to plot the two bars at x = 2 side by side, not overlapped the way they are now.
For the x-labels, the original data has a lot of them. Is there a way I can set xticks to a specific frequency? For instance, in the chart above I only want to see 1, 3 and 6 as x-labels.
Note: If I set dodge = True then the lines become very thin with the original data.
For the first question, get the patches in the bar chart and modify the width of the target patches. The x position of one of them is also shifted so that the two bars sit side by side.
The second question can be handled by building the list of tick labels, either with slices or manually in a specific order.
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
xvar = [1,2,2,3,4,5,6,8]
yvar = [3,6,-4,4,2,0.5,-1,0.5]
year = [2010,2011,2012,2010,2011,2012,2010,2011]
df = pd.DataFrame({'xvar':xvar,'yvar':yvar,'year':year})
fig,ax = plt.subplots(figsize=(10,5))
sns.set_style('whitegrid')
g = sns.barplot(data=df, x='xvar', y='yvar', hue='year', lw=0, dodge=False)
for idx,patch in enumerate(ax.patches):
    current_width = patch.get_width()
    current_pos = patch.get_x()
    if idx == 8 or idx == 15:
        # halve the width of the two overlapping bars
        patch.set_width(current_width/2)
        if idx == 15:
            # shift the second bar right so the pair sits side by side
            patch.set_x(current_pos+(current_width/2))
ax.set_xticklabels([1,'',3,'','',6,''])
plt.show()
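The label list for the second question can be built without any plotting code. Below is a small sketch of both variants mentioned in the answer (slicing and an explicit selection); the helper names thin_labels and keep_only_labels are hypothetical and not part of seaborn or matplotlib.

```python
def thin_labels(labels, step):
    """Keep every `step`-th label (via slicing) and blank out the rest."""
    thinned = [""] * len(labels)
    thinned[::step] = labels[::step]
    return thinned


def keep_only_labels(labels, wanted):
    """Keep only the labels listed in `wanted` and blank out the rest."""
    return [label if label in wanted else "" for label in labels]


# The seven x categories from the example data
categories = [1, 2, 3, 4, 5, 6, 8]

every_other = thin_labels(categories, 2)
selected = keep_only_labels(categories, {1, 3, 6})
```

Passing selected to ax.set_xticklabels() gives the same labels as the hand-written list in the answer.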
I have a situation with my data. I like the behaviour of .plot() on a data frame, but sometimes it doesn't work because the frequency of the time index is not an integer.
Reproducing the plot in matplotlib works; it is just ugly.
The part that bothers me most is the x-axis settings: the tick frequency and the limits. Is there an easy way to reproduce the pandas behaviour in matplotlib?
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Create Data
f = lambda x: np.sin(0.1*x) + 0.1*np.random.randn(1,x.shape[0])
x = np.arange(0,217,0.001)
y = f(x)
# Create DataFrame
data = pd.DataFrame(y.transpose(), columns=['dp'], index=None)
data['t'] = pd.date_range('2021-01-01 14:32:09', periods=len(data['dp']),freq='ms')
data.set_index('t', inplace=True)
# Pandas plot()
data.plot()
# Matplotlib plot (ugly x-axis)
plt.plot(data.index,data['dp'])
EDIT: Basically, what I want to achieve is a similar spacing of the xtick labels and the tight margins around the values. I can handle the legend and axis title myself.
Pandas output
Matplotlib output
Thanks
You can use some matplotlib date utilities:
Figure.autofmt_xdate() to unrotate and center the date labels
Axis.set_major_locator() to change the interval to 1 min
Axis.set_major_formatter() to reformat as %H:%M
import matplotlib.dates as mdates
fig, ax = plt.subplots()
ax.plot(data.index, data['dp'])
fig.autofmt_xdate(rotation=0, ha='center')
ax.xaxis.set_major_locator(mdates.MinuteLocator(interval=1))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
# uncomment to remove the first `xtick`
# ax.set_xticks(ax.get_xticks()[1:])
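The question also asks about the tight x-limits, which the snippet above does not address. Here is a hedged sketch of one way to get them, assuming ax.margins(x=0) is close enough to what pandas does; it also swaps in AutoDateLocator with ConciseDateFormatter as an alternative to the fixed one-minute interval. The smaller data set (2000 samples, 100 ms apart) is an assumption made to keep the example quick.

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Smaller stand-in for the question's data (assumed sizes, not the original)
t = pd.date_range("2021-01-01 14:32:09", periods=2000, freq="100ms")
data = pd.DataFrame({"dp": np.sin(0.01 * np.arange(len(t)))}, index=t)

fig, ax = plt.subplots()
ax.plot(data.index, data["dp"])

# Remove the horizontal padding so the line touches both edges of the axes
ax.margins(x=0)

# Let matplotlib pick the tick interval and use compact date labels
locator = mdates.AutoDateLocator()
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(mdates.ConciseDateFormatter(locator))
```

The automatic locator adapts when the time span changes, whereas MinuteLocator(interval=1) suits the roughly 3.6-minute span of the original data specifically.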
I'm using Seaborn in a Jupyter notebook to plot histograms like this:
import numpy as np
import pandas as pd
from pandas import DataFrame
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
df = pd.read_csv('CTG.csv', sep=',')
sns.distplot(df['LBE'])
I have a list of columns whose values I want to plot as histograms, and I tried plotting one for each of them:
continous = ['b', 'e', 'LBE', 'LB', 'AC']
for column in continous:
    sns.distplot(df[column])
And I get this result - only one plot with (presumably) all histograms:
My desired result is multiple histograms that look like this (one for each variable):
How can I do this?
Insert plt.figure() before each call to sns.distplot().
Here's an example with plt.figure():
Here's an example without plt.figure():
Complete code:
# imports
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = [6, 2]
%matplotlib inline
# sample time series data
np.random.seed(123)
df = pd.DataFrame(np.random.randint(-10,12,size=(300, 4)), columns=list('ABCD'))
datelist = pd.date_range('2014-07-01', periods=300).tolist()  # pd.datetime was removed in pandas 2.0
df['dates'] = datelist
df = df.set_index(['dates'])
df.index = pd.to_datetime(df.index)
df.iloc[0]=0
df=df.cumsum()
# create distplots
for column in df.columns:
    plt.figure() # <==================== here!
    sns.distplot(df[column])
distplot() has been deprecated since seaborn 0.11.0 and is slated for removal in 0.14.0. You can instead use sns.histplot() to plot histograms for every column of the dataframe (numerical features only) in the following way:
fig, axes = plt.subplots(2,5, figsize=(15, 5))
ax = axes.flatten()
for i, col in enumerate(df.columns):
    sns.histplot(df[col], ax=ax[i]) # histogram call
    ax[i].set_title(col)
    # remove scientific notation for both axes
    ax[i].ticklabel_format(style='plain', axis='both')
fig.tight_layout(w_pad=6, h_pad=4) # change padding
plt.show()
If you specifically want to estimate the probability density function of a continuous random variable with kernel density estimation (mimicking the default behavior of sns.distplot()), add kde=True to the sns.histplot() call, and a density curve will overlay each histogram.
Also works when looping with plt.show() inside:
for column in df.columns:
    sns.distplot(df[column])
    plt.show()
I am trying to make a line plot in which every one of the elements from the index appears as an xtick.
import pandas as pd
ind = ['16-12', '17-01', '17-02', '17-03', '17-04',
'17-05','17-06', '17-07', '17-08', '17-09', '17-10', '17-11']
data = [1,3,5,2,3,6,4,7,8,5,3,8]
df = pd.DataFrame(data,index=ind)
df.plot(kind='line',x_compat=True)
However, the resultant plot skips every second element of the index, like so:
My call to plot includes the x_compat=True parameter, which the pandas documentation suggests should stop the automatic tick configuration, but it seems to have no effect.
You need to set a ticker locator on an axis and then pass that axis when plotting.
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
ind = ['16-12', '17-01', '17-02', '17-03', '17-04',
'17-05','17-06', '17-07', '17-08', '17-09', '17-10', '17-11']
data = [1,3,5,2,3,6,4,7,8,5,3,8]
df = pd.DataFrame(data,index=ind)
ax2 = plt.axes()
ax2.xaxis.set_major_locator(ticker.MultipleLocator(1))
df.plot(kind='line', ax=ax2)
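An alternative that avoids the ticker module is to set the tick positions and labels directly after plotting. This is only a sketch, and it assumes the index is plotted at integer positions 0..n-1, which is how pandas handles a string index like this one in a line plot.

```python
import matplotlib.pyplot as plt
import pandas as pd

ind = ['16-12', '17-01', '17-02', '17-03', '17-04',
       '17-05', '17-06', '17-07', '17-08', '17-09', '17-10', '17-11']
data = [1, 3, 5, 2, 3, 6, 4, 7, 8, 5, 3, 8]
df = pd.DataFrame(data, index=ind)

ax = df.plot(kind='line')
# One tick per index entry, labelled with the index values themselves
ax.set_xticks(list(range(len(df.index))))
ax.set_xticklabels(df.index)
```

Unlike MultipleLocator(1), this pins the ticks to exactly the data positions, so no extra ticks appear if the axis limits extend beyond the data.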
[The resolution is described below.]
I'm trying to create a PairGrid. The X-axis has at least 2 different value ranges, although even when 'cvar' below is plotted by itself the x-axis overwrites itself.
My question: is there a way to tilt the x-axis labels to be vertical or have fewer x-axis labels so they don't overlap? Is there another way to solve this issue?
====================
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
columns = ['avar', 'bvar', 'cvar']
index = np.arange(10)
df = pd.DataFrame(columns=columns, index = index)
myarray = np.random.random((10, 3))
for val, item in enumerate(myarray):
    df.loc[val] = item  # .ix was removed in pandas 1.0; .loc is the replacement
df['cvar'] = [400,450,43567,23000,19030,35607,38900,30202,24332,22322]
fig1 = sns.PairGrid(df, y_vars=['avar'],
x_vars=['bvar', 'cvar'],
palette="GnBu_d")
fig1.map(plt.scatter, s=40, edgecolor="white")
# The fix: Add the following to rotate the x axis.
plt.xticks( rotation= -45 )
=====================
The code above produces this image
Thanks!
I finally figured it out. I added "plt.xticks( rotation= -45 )" to the original code above. More can be found on the Matplotlib site here.
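One caveat: plt.xticks() acts on the current axes only, so in a grid with several panels it may rotate the labels of just the last one. A possible variation, sketched here under that assumption, loops over every axes in the grid; the random data is a stand-in and the palette argument from the question is omitted.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Stand-in data with the same column names as the question
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((10, 3)), columns=['avar', 'bvar', 'cvar'])
df['cvar'] = [400, 450, 43567, 23000, 19030, 35607, 38900, 30202, 24332, 22322]

grid = sns.PairGrid(df, y_vars=['avar'], x_vars=['bvar', 'cvar'])
grid.map(plt.scatter, s=40, edgecolor="white")

# Rotate the x tick labels on every panel, not only the current one
for ax in grid.axes.flat:
    ax.tick_params(axis='x', labelrotation=-45)
```

The same loop works unchanged if more variables are added to x_vars or y_vars.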