Python barchart overlapping vertical bars - python

I have two graphs that share the same x-axis. They are both time series with 2880 times (4 months of hourly data). I have an array with the precipitation value for every hour (2880 values). I want to overlay this data as a vertical bar chart on the first graph, so that each bar is 1 hr wide and centered on its corresponding hour.
My issue is that the bars are too wide and overlap each other. I have tried setting width=1/24 in the bar call, with no success (the bars don't appear at all). Here is a snippet of the code where I do not set the width at all.
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter
import matplotlib.dates as mdates
import datetime
import pandas as pd
import numpy as np
t = np.arange(datetime.datetime(2010,1,1,0), datetime.datetime(2010,5,1,0),datetime.timedelta(hours=1)).astype(datetime.datetime)
stn_temp = np.random.rand(2880)
model_temp = stn_temp-0.2
stn_rh = np.random.randint(0,100,2880)
model_rh = stn_rh -1
fig, (ax1,ax2) = plt.subplots(2,1,sharex=True)
ax1.plot(t,stn_temp,'r',linewidth=0.3)
ax1.plot(t,model_temp,'k',linewidth=0.3)
minor_ticks_temp = np.arange(min(stn_temp),max(stn_temp),1)
ax1.set_yticks(minor_ticks_temp, minor=True)
myFmt = mdates.DateFormatter('%m-%d')
ax1.xaxis.set_major_formatter(myFmt)
ax1.legend(loc=0)
ax1.set_ylabel(r'2 m Temperature ($^\circ$C)')
ax1 = ax1.twinx()
prec = np.random.rand(2880)  # placeholder values: the original snippet never defined prec
ax1.bar(t,prec,alpha=0.7,color='g')
ax1.set_ylabel('Accumulated \n Precipitation (mm)')
ax2.plot(t,stn_rh,'b',linewidth=0.3)
ax2.plot(t,model_rh,'k',linewidth=0.3)
ax2.set_ylim([0,100.5])
ax2.set_ylabel('Relative Humidity (%)')
fig.tight_layout()
The widths of the bars should be a lot smaller, only the width of an hour. This image is a zoomed in version to show the bar width issue.
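A possible explanation for the missing bars, assuming the snippet was run under Python 2 (the 01 date literals suggest so): there, 1/24 is integer division and evaluates to 0, so every bar gets zero width. Matplotlib measures date axes in days, so a float width of 1.0/24 corresponds to one hour. The sketch below uses two days of made-up hourly values in place of the real precipitation array.

```python
import datetime

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Two days of hourly timestamps (placeholder for the 2880-hour series)
start = datetime.datetime(2010, 1, 1)
t = [start + datetime.timedelta(hours=h) for h in range(48)]
prec = np.random.default_rng(0).random(len(t))  # made-up precipitation values

fig, ax = plt.subplots()

# Date axes are in units of days, so one hour is 1/24 of a unit.
# Writing 1.0 / 24 forces float division even under Python 2.
hour_in_days = 1.0 / 24
bars = ax.bar(t, prec, width=hour_in_days, align="center", alpha=0.7, color="g")
```

With align="center" each bar spans half an hour on either side of its timestamp. Passing datetime.timedelta(hours=1) as the width is an alternative in recent matplotlib versions.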

Related

how to set the width of daily bar chart in python matplotlib

I want to plot 5 years of daily rainfall data as a bar chart. When the bar width is 1, the bars become lines with no visible width; when I increase the width, the bars overlap each other, as in the image below. I want discrete bars with a good-looking width. This is my code.
import pandas as pd
from datetime import datetime, timedelta
from matplotlib import pyplot as plt
data=pd.read_excel('final.xlsx')
data['Date']=pd.to_datetime(data['Date'])
date = data['Date']
amount = data['Amount']
plt.bar (date, amount, color='gold', edgecolor='blue', align='center', width=5)
plt.ylabel('rainfall amount (mm)')
plt.show()
Just to note, you can also pass a Timedelta to the width parameter; I find this helpful to be explicit about how many units in x (e.g. days here) the bars will take up. Additionally for some time series the int widths are less intuitive:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
#fake data with minute frequency for an hour
dr = pd.date_range('01-01-2016 9:00:00', '01-01-2016 10:00:00', freq='min')
df = pd.DataFrame(np.random.rand(len(dr)), index=dr)
#graph 1, using int width
plt.figure(figsize=(10,2))
plt.bar (df.index, df[0], color='gold', edgecolor='blue', align='center',
         width=1)
#graph 2, using Timedelta width
plt.figure(figsize=(10,2))
plt.bar (df.index, df[0], color='gold', edgecolor='blue', align='center',
         width=pd.Timedelta(minutes=1))
Graph 1:
Graph 2:
This was what came to mind when I saw your issue, but I think the real problem is the number of data points (as @JohanC pointed out). Already when you plot 365 days, you can barely see the yellow anymore (and by 3 or 4 years it's definitely gone):
You can also see in the above that different bars get rendered with different apparent widths, but that is just because there are too few pixels in the space provided to accurately show the bar fill and bar widths the same for each point.
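The pixel argument can be made concrete with rough arithmetic. This is only a back-of-the-envelope sketch: pixels_per_bar is a hypothetical helper, the 10-inch width is taken from the figsize used above, 100 dpi is assumed as the figure resolution, and axes margins are ignored.

```python
def pixels_per_bar(fig_width_in, dpi, n_bars):
    """Rough horizontal pixels available per bar slot, ignoring axes margins."""
    return fig_width_in * dpi / n_bars


# Daily bars on a 10-inch-wide figure at an assumed 100 dpi
for years in (1, 3, 5):
    n_bars = 365 * years
    print(years, "year(s):", round(pixels_per_bar(10, 100, n_bars), 2), "px per bar")
```

Once the value drops to around one pixel or less, the edge color and the fill cannot both be drawn for every bar, which is consistent with the fill disappearing and the widths looking uneven.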

Matplotlib - Overlaying charts but with different box size

I am plotting Revenues and Volume across dates, with Revenues as a line graph and Volume as bars. All I want is for the bars to be plotted in the lower 30% of the plot rather than spanning the entire plot. I think that can be done with matplotlib.transforms.Bbox, but I don't know how. The following is the code:
Data can be found here.
import matplotlib
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize']=(20,10) # set the figure size
plt.style.use('fivethirtyeight') # using the fivethirtyeight matplotlib theme
sales = pd.read_csv('sales.csv') # Read the data in
sales.Date = pd.to_datetime(sales.Date) #set the date column to datetime
sales.set_index('Date', inplace=True) #set the index to the date column
print(sales)
fig, ax1 = plt.subplots()
ax2 = ax1.twinx() # set up the 2nd axis
ax1.plot(sales.Sales_Dollars) #plot the Revenue on axis #1
#ax2.set_position(matplotlib.transforms.Bbox([[0.125,0.1],[0.9,0.32]]))
ax2.bar(sales.index, sales.Quantity,width=20, alpha=0.2, color='orange')
ax2.grid(False) # turn off grid #2
ax1.set_title('Monthly Sales Revenue vs Number of Items Sold Per Month')
ax1.set_ylabel('Monthly Sales Revenue')
ax2.set_ylabel('Number of Items Sold')
plt.show()
print('Done!')
Following is the plot. I want the bars to be plotted only in the red box I have marked (the bottom 30% of the height), instead of spanning the entire height. Maybe I have to do something like ax2.set_position(matplotlib.transforms.Bbox([[...],[...]])), but I don't know how!
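fig, ax1 = plt.subplots()
# create a second axes by hand that occupies the bottom 30% of ax1's area
bb = ax1.get_position()
ax2 = fig.add_axes([bb.x0, bb.y0, bb.width, bb.height*0.3])
# put its y ticks on the right and hide its x-axis and top spine
ax2.yaxis.tick_right()
ax2.xaxis.set_visible(False)
ax2.spines['top'].set_visible(False)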
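For context, here is one way the fragment above might be combined with the plotting calls from the question. It is a sketch under assumptions: the monthly numbers are made up in place of sales.csv, and sharex=ax1 plus the transparent background are additions that do not appear in the fragment.

```python
import datetime

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt

# Made-up monthly figures standing in for sales.csv
dates = [datetime.datetime(2017, month, 1) for month in range(1, 13)]
revenue = [120, 135, 150, 110, 160, 170, 165, 180, 175, 190, 210, 230]
quantity = [30, 32, 40, 28, 45, 47, 44, 50, 49, 55, 60, 66]

fig, ax1 = plt.subplots()
ax1.plot(dates, revenue)
ax1.set_ylabel('Monthly Sales Revenue')

# Second axes: same left edge, bottom edge and width as ax1, 30% of its height.
# sharex=ax1 (an assumption) keeps the two date axes aligned.
bb = ax1.get_position()
ax2 = fig.add_axes([bb.x0, bb.y0, bb.width, bb.height * 0.3], sharex=ax1)
ax2.bar(dates, quantity, width=20, alpha=0.2, color='orange')

ax2.patch.set_alpha(0)  # transparent background so the line underneath stays visible
ax2.yaxis.tick_right()
ax2.yaxis.set_label_position('right')
ax2.set_ylabel('Number of Items Sold')
ax2.xaxis.set_visible(False)
ax2.spines['top'].set_visible(False)
```

Because ax2 is positioned from ax1.get_position(), a later call that rearranges the layout (for example tight_layout) would move ax1 without moving ax2, so any layout adjustment probably belongs before the add_axes call.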

Adjust spacing on X-axis in python boxplots

I plot boxplots using sns.boxplot and pandas.DataFrame.boxplot in python 3.x.
Is it possible to adjust the spacing between the boxes in a boxplot, so that the box of Group_b sits farther to the right of the box of Group_a than in the output figures? Thanks.
Code:
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
dict_a = {'value':[1,2,3,7,8,9],'name':['Group_a']*3+['Group_b']*3}
dataframe = pd.DataFrame(dict_a)
sns.boxplot( y="value" , x="name" , data=dataframe )
Output figure:
dataframe.boxplot("value" ,by = "name" )
Output figure 2:
The distance between the two boxes is determined by the x-axis limits. For a constant distance in data units between the boxes, what makes them appear spaced more or less apart is the fraction of the overall data range shown on the axis that this distance takes up.
For example, in the seaborn case, the first box sits at x=0, the second at x=1. The difference is 1 unit. The maximal distance between the two boxplots is hence achieved by setting the x axis limits to those exact limits,
ax.set_xlim(0, 1)
Of course this will cut half of each box.
So a more useful value would be ax.set_xlim(0-val, 1+val) with val being somewhere in the range of the width of the boxes.
One needs to mention that pandas uses different units. The first box is at x=1, the second at x=2. Hence one would need something like ax.set_xlim(1-val, 2+val).
The following would add a slider to the plot to see the effect of different values.
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
dict_a = {'value':[1,2,3,7,8,9],'name':['Group_a']*3+['Group_b']*3}
dataframe = pd.DataFrame(dict_a)
fig, (ax, ax2, ax3) = plt.subplots(nrows=3,
                                   gridspec_kw=dict(height_ratios=[4,4,1], hspace=1))
sns.boxplot( y="value" , x="name" , data=dataframe, width=0.1, ax=ax)
dataframe.boxplot("value", by = "name", ax=ax2)
from matplotlib.widgets import Slider
slider = Slider(ax3, "", valmin=0, valmax=3)
def update(val):
    ax.set_xlim(-val, 1+val)
    ax2.set_xlim(1-val, 2+val)
slider.on_changed(update)
plt.show()

How to limit the display limits of a colorbar in matplotlib

I am plotting 3 channels of my time series measurements which are more or less centered around (-80). Missing values are filled with (-50) so that they get a bright yellow color and contrast with the rest of the plot. It has no meaning numerically. See the figure and the code below:
f, ax = plt.subplots(figsize=(12.5, 12.5))
sns.heatmap(df.loc[:, ['Ch2', 'Ch3', 'Ch1']].fillna(-50)[:270], cmap='viridis', yticklabels=27, cbar=True, ax=ax)
How can I keep the color range but limit the display scale (i.e the heatmap should stay the same but the color bar ranges only from -70 to -90)?
(Note that the question of how to Set Max value for color bar on seaborn heatmap has already been answered and it is not what I am aiming at, I want vmin and vmax to stay just as they are).
You can set the limits of the colorbar axes similar to any other axes.
ax.collections[0].colorbar.ax.set_ylim(-90,-70)
Complete example:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
data = np.random.rand(82*3)*20-90
data[np.random.randint(1,82*3, size=20)] = np.nan
df = pd.DataFrame(data.reshape(82,3))
ax = sns.heatmap(df, vmin=-90, vmax=-50, cmap="viridis")
ax.set_facecolor("gold")
ax.collections[0].colorbar.ax.set_ylim(-90,-70)
plt.show()

Seaborn/Matplotlib Date Axis barplot minor-major tick formatting

I'm building a Seaborn barplot. The x-axis are dates, and the y-axis are integers.
I'd like to format major/minor ticks for the dates. I'd like Mondays' ticks to be bold and a different color (ie, "major ticks"), with the rest of the week less bold.
I have not been able to get major and minor tick formatting on the x-axis to work with Seaborn barplots. I'm stumped, and thus turning here for help.
I'm starting with the stackoverflow example that answered this question: Pandas timeseries plot setting x-axis major and minor ticks and labels
If I make a simple modification to use a Seaborn barplot, I lose my x-axis ticks:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as dates
import seaborn as sns
idx = pd.date_range('2011-05-01', '2011-07-01')
s = pd.Series(np.random.randn(len(idx)), index=idx)
###########################################
## Swap out these two lines of code:
#fig, ax = plt.subplots()
#ax.plot_date(idx.to_pydatetime(), s, 'v-')
## with this one
ax = sns.barplot(x=idx.to_pydatetime(), y=s)
###########################################
ax.xaxis.set_minor_locator(dates.WeekdayLocator(byweekday=(1),
                                                interval=1))
ax.xaxis.set_minor_formatter(dates.DateFormatter('%d\n%a'))
ax.xaxis.grid(True, which="minor")
ax.yaxis.grid()
ax.xaxis.set_major_locator(dates.MonthLocator())
ax.xaxis.set_major_formatter(dates.DateFormatter('\n\n\n%b\n%Y'))
plt.tight_layout()
## save the result to a png instead of plotting to screen:
myFigure = plt.gcf()
myFigure.autofmt_xdate()
myFigure.set_size_inches(11,3.8)
plt.title('Example Chart', loc='center')
plt.savefig('/tmp/chartexample.png', format='png', bbox_inches='tight')
I've tried a variety of approaches but something in Seaborn seems to be overriding or undoing any attempts at major and minor axis formatting that I've managed to cook up yet beyond some simple styling for all ticks when I use set_xticklabels().
I can sort of get formatting on just the major ticks by using MultipleLocator(), but I can't get any formatting on the minor ticks.
I've also experimented with myFigure.autofmt_xdate() to see if it would help, but it doesn't seem to like mixed major & minor ticks on the same axis either.
I came across this while trying to solve the same problem. Based on the useful pointer from @mwaskom (that categorical plots like boxplots lose their date structure and just become date-named categories), I ended up doing the tick location and formatting in Python, like so:
from datetime import datetime
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as dates
import seaborn as sns
idx = pd.date_range('2011-05-01', '2011-07-01')
s = pd.Series(np.random.randn(len(idx)), index=idx)
fig, ax = plt.subplots(figsize = (12,6))
ax = sns.barplot(x=idx.to_pydatetime(), y=s, ax=ax)
major_ticks = []
major_tick_labels = []
minor_ticks = []
minor_tick_labels = []
for loc, label in zip(ax.get_xticks(), ax.get_xticklabels()):
    when = datetime.strptime(label.get_text(), '%Y-%m-%d %H:%M:%S')
    if when.day == 1:
        major_ticks.append(loc)
        major_tick_labels.append(when.strftime("\n\n\n%b\n%Y"))
    else:
        minor_ticks.append(loc)
        if when.weekday() == 0:
            minor_tick_labels.append(when.strftime("%d\n%a"))
        else:
            minor_tick_labels.append(when.strftime("%d"))
ax.set_xticks(major_ticks)
ax.set_xticklabels(major_tick_labels)
ax.set_xticks(minor_ticks, minor=True)
ax.set_xticklabels(minor_tick_labels, minor=True)
Of course, you don't have to set the ticks based on parsing the labels which were installed from the data, if it's easier to start with the source data and just keep the indices aligned, but I prefer to have a single source of truth.
You can also mess with font weight, rotation, etc, on individual labels by getting the Text objects for the relevant label and calling set_ methods on it.
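As a rough illustration of that last point, the sketch below styles individual tick labels through their Text objects. It uses a plain matplotlib bar chart with integer positions rather than Seaborn, and the dates, values and colors are made up for the example.

```python
import datetime

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt

# Two weeks of made-up daily values, drawn at integer positions like a categorical plot
days = [datetime.date(2011, 5, 1) + datetime.timedelta(days=i) for i in range(14)]
values = [(i * 7) % 5 - 2 for i in range(14)]

fig, ax = plt.subplots(figsize=(12, 3))
positions = list(range(len(days)))
ax.bar(positions, values)
ax.set_xticks(positions)

# set_xticklabels returns the Text objects, one per tick
labels = ax.set_xticklabels([d.strftime("%d\n%a") for d in days])

for day, label in zip(days, labels):
    if day.weekday() == 0:  # Monday
        label.set_fontweight("bold")
        label.set_color("crimson")
    else:
        label.set_color("gray")
```

The same loop could call other set_ methods such as set_rotation or set_fontsize on selected labels.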
