Combining a boxplot and a histogram into one plot

I'm trying to plot a boxplot and a histogram in the same plot, as in the following image.
(Image: boxplot and histogram combination)
This is what I have for the moment:
import matplotlib.pyplot as plt

# heating0 ... heating2000: one data series per cluster (defined elsewhere)
fig, ax = plt.subplots()
fig.set_size_inches(10, 7)
ax.set_title('Heating need [kWh/m^2]')
ax.set_xlabel('Cluster')
ax.set_ylabel('Heating need')
bp1 = ax.boxplot(
    [heating0, heating1945, heating1960, heating1970, heating1980, heating1990, heating2000],
    labels=['<1945', '1945-1959', '1960-1969', '1970-1979', '1980-1989', '1990-1999', '>=2000'],
    showfliers=False, patch_artist=True)
plt.setp(bp1['boxes'], color='blue')
ax.plot([200,200,220,230,230,170,130,100,30,30], label='underline for swiss energetic index')  # lower line for the norms
ax.plot([230,230,250,260,260,200,160,130,60,60], label='upperline for swiss energetic index')  # upper line for the norms
#plt.yticks([0,200,400,600,800])
plt.legend(loc='upper right')
The result is:
and I want to replace the plotted lines with a histogram.

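I think you're looking for plt.bar, which draws bars on the same axes as the boxplot.
A minimal example:
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.boxplot([100, 200, 300, 400, 500], notch=True)  # notched box at x = 1
data = [200, 200]
ax.bar(range(0, len(data)*2, 2), data)  # bars at x = 0 and x = 2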
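
The idea in this answer can be extended so that each bar sits directly behind its box. The following is only a sketch under assumptions: the samples, the reference values, and the three cluster labels are made up for illustration and are not the asker's data. It relies on the `positions` argument of `ax.boxplot` so that boxes and bars share the same x coordinates.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)

# Made-up data: one sample per cluster and one reference value per cluster.
labels = ['<1945', '1945-1959', '1960-1969']
samples = [rng.normal(loc, 20, size=50) for loc in (220, 180, 120)]
reference = [200, 170, 100]

# Use the same x positions for the bars and the boxes.
positions = np.arange(1, len(samples) + 1)

fig, ax = plt.subplots(figsize=(10, 7))
bars = ax.bar(positions, reference, width=0.8, color='lightgray',
              label='reference value (made up)', zorder=1)
bp = ax.boxplot(samples, positions=positions, widths=0.4,
                showfliers=False, patch_artist=True, zorder=2)

ax.set_xticks(positions)
ax.set_xticklabels(labels)
ax.set_xlabel('Cluster')
ax.legend(loc='upper right')
# Call plt.show() to display the figure.
```

The boxes are drawn narrower than the bars and with a higher `zorder`, so both stay visible.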

Related

Plot pandas df into boxplot & histogram

I am trying to combine a boxplot with a histogram in one figure (without using seaborn). I have tried many variations, but I always get skewed graphs.
This was my starting point:
#Histogram
df.hist(column="Q7", bins=20, figsize=(14,6))
#Boxplot
df.boxplot(column="Q7",vert=False,figsize=(14,6))
which resulted in the following graph:
As you can see, the boxplot and its outliers are at the bottom, but I want them to be on top.
Does anyone have an idea?

You can use subplots and set the height ratios so that the boxplot comes first and the histogram sits below it. In the example below, I use 30% of the vertical space for the boxplot and 70% for the histogram. I also adjusted the spacing between the plots and used a common x-axis via sharex. Hope this is what you are looking for...
fig, ax = plt.subplots(2, figsize=(14, 6), sharex=True,  # Common x-axis
                       gridspec_kw={"height_ratios": (.3, .7)})  # boxplot gets 30% of the vertical space
# Boxplot
df.boxplot(column="Q7", vert=False, figsize=(14, 6), ax=ax[0])
# Histogram
df.hist(column="Q7", bins=20, figsize=(14, 6), ax=ax[1])
ax[1].title.set_size(0)  # hide the histogram's automatic title
plt.subplots_adjust(hspace=0.1)  # adjust the gap between the two plots
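
For readers without the original data frame, the same layout can be sketched with Matplotlib alone. This is an illustration under assumptions: `values` is randomly generated and merely stands in for the `Q7` column.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(42)
values = rng.normal(loc=50, scale=10, size=500)  # made-up stand-in for df["Q7"]

# Two stacked axes sharing the x-axis: 30% of the height on top, 70% below.
fig, (ax_box, ax_hist) = plt.subplots(
    2, figsize=(14, 6), sharex=True,
    gridspec_kw={"height_ratios": (0.3, 0.7)})

# Horizontal boxplot on top (newer Matplotlib versions also offer orientation='horizontal').
ax_box.boxplot(values, vert=False)
ax_box.set_yticks([])  # the single box needs no y tick

# Histogram below, on the same x scale.
counts, edges, _ = ax_hist.hist(values, bins=20)

fig.subplots_adjust(hspace=0.1)  # small gap between the two axes
# Call plt.show() to display the figure.
```

Because of `sharex=True`, the quartiles and outliers of the boxplot line up with the histogram bins underneath.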

Problems with pandas boxplot showing points on it

I am plotting a box plot with the following code:
plt.figure(figsize=(7,7))
plt.title("Title")
plt.ylabel('Y-ax')
boxplot = df.boxplot(grid=False, rot=90, fontsize=10)
plt.show()
And I get this plot:
Is there any way to show just the normal boxplot, with the 50/75/90 percentiles, and not those circles? I have no clue what they mean.
The data frame is huge; maybe that is why these points are shown?
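
In a Matplotlib-based boxplot, such circles are normally the "fliers": points lying beyond the whiskers, which by default reach out to the farthest data point within 1.5 times the interquartile range of the box. With a very large data frame, many points fall in that region. The sketch below hides them with `showfliers=False`, the same keyword used in the first question on this page; the data frame is randomly generated for illustration, and `return_type="dict"` is used only so that the drawn artists can be inspected.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
# Made-up data standing in for the asker's (huge) data frame.
df = pd.DataFrame(rng.normal(size=(10_000, 3)), columns=["a", "b", "c"])

fig, ax = plt.subplots(figsize=(7, 7))
# showfliers=False is passed through to Matplotlib and suppresses the circles.
artists = df.boxplot(grid=False, rot=90, fontsize=10,
                     showfliers=False, return_type="dict", ax=ax)
ax.set_title("Title")
ax.set_ylabel("Y-ax")
# Call plt.show() to display the figure.
```

The box, median line, and whiskers are unchanged; only the individual outlier markers are omitted.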

how to convert 168 data points to hourly plot data for weekdays

I have a data frame with 168 hourly values (one week), and I can plot this data as follows:
fig, ax1 = plt.subplots(figsize=(15, 5))
ax1.set(xlabel='hours in a week', ylabel='occupancy ratio(0-1)')
ax1.plot(HoursOfWeek.values, color='g')
plt.show()
How can I make the x-axis of this plot look like the one in the following plot?

After plotting the graph, you can edit the x-ticks with plt.xticks() (see its documentation).
For the tick marks, one every 12 hours from 0 to 168:
plt.xticks(np.r_[0:15]*12, ['00','12']*7+['00'])
For the x-axis labels, plt.xlabel or plt.text should do the job (see the plt.xlabel documentation).
I suggest you use the method from this answer to write the x-label:
How to put text outside python plots? (ImportanceOfBeingErnest's answer)
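
Putting the two suggestions together, a possible sketch looks like this. The occupancy curve is made up, and the day names assume that the week starts on Monday, which the question does not state.

```python
import matplotlib.pyplot as plt
import numpy as np

hours = np.arange(168)  # one value per hour of the week
occupancy = 0.5 + 0.4 * np.sin(2 * np.pi * hours / 24)  # made-up data

fig, ax = plt.subplots(figsize=(15, 5))
ax.plot(hours, occupancy, color='g')
ax.set_ylabel('occupancy ratio(0-1)')

# One tick every 12 hours, labelled alternately '00' and '12'.
tick_positions = np.arange(15) * 12
tick_labels = ['00', '12'] * 7 + ['00']
ax.set_xticks(tick_positions)
ax.set_xticklabels(tick_labels)

# Day names centred under each 24-hour block, placed below the axis.
# Assumption: the week starts on Monday.
day_names = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']
for i, name in enumerate(day_names):
    ax.text(i * 24 + 12, -0.12, name, ha='center', va='top',
            transform=ax.get_xaxis_transform())  # x in data units, y in axes units
# Call plt.show() to display the figure.
```

`ax.get_xaxis_transform()` lets the text follow the data along x while keeping a fixed distance below the axis.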

Matplotlib adjacent plots

I've written a Python library that uses Matplotlib and Astropy to generate spectra from data of solar radio emissions. I'm satisfied with how it plots data from a single FITS file, but now I'm trying to plot data from multiple FITS files adjacently in a single figure. I've read some of Matplotlib's documentation and some related questions, such as "How do I get multiple subplots in matplotlib?".
Here's part of the code that plots data from a single FITS file:
def plot_freq_range_db_above_background(self, start_freq, end_freq):
    plt.figure(1, figsize=(11, 6))
    plt.imshow(self.hdul_dataset['db'] - self.hdul_dataset['db_median'],
               cmap='magma',
               norm=plt.Normalize(self.hdul_dataset['v_min'],
                                  self.hdul_dataset['v_max']),
               aspect='auto',
               extent=[self.hdul_dataset['time_axis'][0],
                       self.hdul_dataset['time_axis'][-1000],
                       self.hdul_dataset['frequency'][-1],
                       self.hdul_dataset['frequency'][0]])
    plt.ylim(start_freq, end_freq)
    plt.gca().invert_yaxis()
    plt.colorbar(label='dB above background')
    plt.xlabel('Time (UT)', fontsize=15)
    plt.ylabel('Frequency (MHz)', fontsize=15)
    plt.title(self.filename, fontsize=16)
    plt.tick_params(labelsize=14)
    plt.show()
And this is an example of a plot generated by the method above:
So, what I'm trying to do now is to plot data from different files and have all of them adjacent to each other in a single figure. The Y-axis (frequency) is the same for every plot, and the X-axis (time) is continuous from one file to the next.
Here's the method I've written trying to accomplish what I just described:
def plot_fits_files_list(files_list, start_freq, end_freq):
    dim = len(files_list)
    plt_index = 1
    plt.figure(1)
    for file in files_list:
        fits_filename = file.split(os.sep)[-1]
        fitsfile = ECallistoFitsFile(fits_filename)
        fitsfile.set_file_path()
        fitsfile.set_hdul_dataset()
        plt.subplot(dim, dim, plt_index)
        plt_index += 1
        plt.imshow(
            fitsfile.hdul_dataset['db'] -
            fitsfile.hdul_dataset['db_median'],
            cmap='magma',
            norm=plt.Normalize(fitsfile.hdul_dataset['v_min'],
                               fitsfile.hdul_dataset['v_max']),
            aspect='auto',
            extent=[fitsfile.hdul_dataset['time_axis'][0],
                    fitsfile.hdul_dataset['time_axis'][-1000],
                    fitsfile.hdul_dataset['frequency'][-1],
                    fitsfile.hdul_dataset['frequency'][0]])
        plt.ylim(start_freq, end_freq)
        plt.gca().invert_yaxis()
        plt.colorbar(label='dB above background')
        plt.xlabel('Time (UT)', fontsize=15)
        plt.ylabel('Frequency (MHz)', fontsize=15)
        plt.title("Multiple Plots Test", fontsize=16)
        plt.tick_params(labelsize=14)
    plt.show()
And here's the plot it's generating at the moment:
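
One way to get panels that touch each other is a single row of axes with a shared y-axis and no horizontal gap, instead of the `dim` by `dim` grid created by `plt.subplot(dim, dim, plt_index)`. The following is a sketch under assumptions, not the library's actual code: `plot_adjacent` is a hypothetical helper, and random arrays stand in for the FITS data.

```python
import matplotlib.pyplot as plt
import numpy as np

def plot_adjacent(spectra, time_ranges, freq_range, vmin, vmax):
    """Hypothetical helper: draw one image per array, side by side."""
    fig, axes = plt.subplots(1, len(spectra), sharey=True, figsize=(11, 6),
                             gridspec_kw={"wspace": 0}, squeeze=False)
    axes = axes[0]  # single row of axes
    norm = plt.Normalize(vmin, vmax)  # same colour scale for every panel
    for ax, data, (t0, t1) in zip(axes, spectra, time_ranges):
        image = ax.imshow(data, cmap='magma', norm=norm, aspect='auto',
                          extent=[t0, t1, freq_range[0], freq_range[1]])
        ax.set_xlabel('Time')
    axes[0].set_ylabel('Frequency (MHz)')
    # One colour bar for all panels.
    fig.colorbar(image, ax=list(axes), label='dB above background')
    return fig, axes

# Made-up data: three consecutive 15-minute blocks.
rng = np.random.default_rng(0)
spectra = [rng.random((200, 900)) for _ in range(3)]
time_ranges = [(0, 15), (15, 30), (30, 45)]
fig, axes = plot_adjacent(spectra, time_ranges, (45, 870), 0, 1)
# Call plt.show() to display the figure.
```

Each panel gets its own time extent, so the x-axis continues from one file to the next while the frequency axis is drawn only once.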

Stacked area chart display even though df contains 0 or NaN

I am using df.plot.area() and am very confused by the result. The dataframe has integers as index. The values to plot are in different columns. One column contains zeros from a specific integer onwards, however I can still see a thin line in the plot which isn't right.
After data processing this is the code I am using to actually plot:
# Start plotting
df.plot(kind='area', stacked=True, color=colors)
plt.legend(loc='best')
plt.xlabel('Year', fontsize=12)
plt.ylabel(mylabel, fontsize=12)
# Reverse Legend
ax = plt.gca()
handles, labels = ax.get_legend_handles_labels()
ax.legend(handles[::-1], labels[::-1])
plt.title(filename[:-4])
plt.tight_layout()
plt.autoscale(enable=True, axis='x', tight=True)
And this is a snapshot of the result. The thin orange line shouldn't be visible, because the value in the dataframe is zero.
Thanks for your support!
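
A possible explanation, offered here as an assumption rather than a confirmed diagnosis: pandas draws an outline around each stacked area, and that outline has a non-zero width even where the area itself has zero height. The sketch below uses made-up data and passes `linewidth=0` so that only the filled regions remain.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up data: column "b" drops to zero from 2002 onwards.
df = pd.DataFrame({"a": [3, 2, 1, 1, 1], "b": [2, 1, 0, 0, 0]},
                  index=[2000, 2001, 2002, 2003, 2004])

# linewidth=0 removes the outline drawn around each area.
ax = df.plot(kind="area", stacked=True, linewidth=0)
ax.set_xlabel("Year", fontsize=12)
# Call plt.show() to display the figure.
```

If the line persists with `linewidth=0`, the cause lies elsewhere, for example in values that are small but not exactly zero.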
