Matplotlib: different stacked bars? - python

I want to create a stacked bar plot with a different number of stacks for each bar. The general example for stacked bars works fine if my data are all homogeneous, but I want something that looks more like the example shown.
This turned out to be a whole other level in Matplotlib (while still easy with an Excel-like tool, as you can see). Is there a convenient way of creating this kind of plot in Matplotlib? Thanks.

I guess you are working directly in matplotlib, but these days plotting data, especially for a quick view, can easily be done with pandas. Following your example (missing segments are filled with NaN), we get:
import matplotlib.pyplot as plt
import matplotlib
matplotlib.style.use("ggplot")
import pandas as pd
import numpy as np
df = pd.DataFrame([pd.Series([10,20,40,10,np.nan]), pd.Series([20,10,30,10,10]), pd.Series([30,40, np.nan, np.nan, np.nan])], index=["Bar1", "Bar2", "Bar3"])
df.plot.bar(stacked=True)
plt.show()
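Since the question asks about Matplotlib itself, here is a minimal sketch of the same plot without pandas. It assumes the ragged data can be padded with zeros, and the `bars` dictionary and the "segment N" legend labels are made up for illustration. The idea is to keep a running `bottom` array and draw one level of segments at a time:

```python
import matplotlib.pyplot as plt
import numpy as np

# Ragged input: each bar has its own number of segments (values from the example above).
bars = {
    "Bar1": [10, 20, 40, 10],
    "Bar2": [20, 10, 30, 10, 10],
    "Bar3": [30, 40],
}

labels = list(bars)
n_levels = max(len(values) for values in bars.values())

# Pad shorter bars with zeros so every bar has the same number of levels.
heights = np.zeros((len(labels), n_levels))
for i, values in enumerate(bars.values()):
    heights[i, :len(values)] = values

fig, ax = plt.subplots()
bottom = np.zeros(len(labels))  # running top of each bar so far
for level in range(n_levels):
    ax.bar(labels, heights[:, level], bottom=bottom, label=f"segment {level}")
    bottom += heights[:, level]
ax.legend()
```

Call plt.show() afterwards to display the figure. Zero-height segments are still drawn but are invisible, which is what makes the padding trick work.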

Related

Matplotlib plt.show() doesn't display anything

I want to create a bar chart for my dataframe but it doesn't show up, so I made this small script to try out some things, and this does display the bar chart the way I want. The dataframe is structured in exactly the same way (I assume) as in my big script where all my data is transformed.
Even if I copy and paste this code into my other script, it doesn't show the plot.
import matplotlib.pyplot as plt
import pandas as pd
df = pd.DataFrame({
'soortfout':['totaalnoodstoppen','aantaltrapopen','aantaltrapdicht','aantalrectdicht','aantalphotocellopen','aantalphotocelldicht','aantalsafetyedgeopen', 'aantalsafetyedgeclose'],
'aantalfouten':[19,9,0,0,10,0,0,0],
})
print(df)
df.plot(kind='bar',x='soortfout',y='aantalfouten')
plt.show()
I can't really paste my other code in here since it's pretty big. But is it possible that other code that doesn't even use anything from matplotlib interferes with plotting a chart?
I've tried most other solutions like:
matplotlib.rcParams['backend'] = "Qt4Agg"
Currently using PyCharm 2.5.
It does work when I use Jupyter notebook.
I was importing modules that I wasn't using, so they were grayed out.
But apparently you shouldn't import pandas_profiling if you want to plot with matplotlib.
Don't import modules that can interfere with plotting, like pandas_profiling.
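One way to check whether an import is interfering is to compare the active matplotlib backend before and after importing the suspect module. This is only a diagnostic sketch: the helper name `backend_after_import` is made up, and it assumes (without having verified it for pandas_profiling) that the interference shows up as a backend change.

```python
import importlib
import matplotlib

def backend_after_import(module_name):
    """Import a module and return the matplotlib backend before and after."""
    before = matplotlib.get_backend()
    try:
        importlib.import_module(module_name)
    except ImportError:
        # Module is not installed here, so there is nothing to compare.
        return before, before
    return before, matplotlib.get_backend()

# "json" is a harmless stand-in; swap in the name of the suspect module.
before, after = backend_after_import("json")
print(f"backend before: {before}, after: {after}")
```

If the two names differ, the import switched the backend, possibly to a non-interactive one such as Agg, in which case plt.show() has no window to open.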

Graphing a scatterplot in Python to compare photometric and spectroscopic redshifts

I have a list of photometric redshifts and spectroscopic redshifts, and I need to make a scatterplot of these numbers to compare them. The problem is that I don't know how to make a scatterplot in python. How do you graph a scatterplot in python?
Simple Approach
First import pandas and matplotlib.
Then use the plot.scatter method of the DataFrame (a pandas wrapper that draws with matplotlib) to create the scatterplot:
import pandas as pd
import matplotlib.pyplot as plt
# In a Jupyter notebook, run "%matplotlib inline" on its own line so the plot is shown inline
your_data = pd.read_csv('your_dataset')  # placeholder file name
data = your_data # to avoid typing your_data each time
scatterplot = data.plot.scatter(x='select_your_x_axis', y='select_your_y_axis')
plt.show()  # display the figure when running as a script
Hope this helps :)
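If the redshifts are already in two lists rather than a CSV file, a sketch using matplotlib directly could look like the following. The redshift values here are randomly generated stand-ins, not real data, and the dashed one-to-one line is an added convenience for judging how well the two estimates agree:

```python
import matplotlib.pyplot as plt
import numpy as np

# Made-up redshifts purely for illustration; replace with your own two lists.
rng = np.random.default_rng(0)
z_spec = rng.uniform(0.0, 2.0, size=200)
z_phot = z_spec + rng.normal(0.0, 0.05, size=200)

fig, ax = plt.subplots()
ax.scatter(z_spec, z_phot, s=10, alpha=0.6)

# One-to-one line: points on it have identical photometric and spectroscopic values.
lim = [0.0, max(z_spec.max(), z_phot.max())]
ax.plot(lim, lim, color="black", linestyle="--", label="z_phot = z_spec")

ax.set_xlabel("spectroscopic redshift")
ax.set_ylabel("photometric redshift")
ax.legend()
```

Call plt.show() (or fig.savefig with a file name) to see the result.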

Rotating parallel coordinate axis-names in Pandas

When using some of the built in visualization tools in Pandas, one that is very helpful for me is the parallel_coordinates visualization. However, since I have around 18 features in the dataframe, the bottom of the parallel_coords plot gets really messy.
Therefore, I was wondering if anyone knew how to rotate the axis-names to be vertical rather than horizontal as shown here:
I did find a way to use parallel_coords in a polar setup, creating a radar chart. While that was helpful for getting the different features to be visible, that solution doesn't quite work, since whenever the values are close to 0 it becomes almost impossible to see the curve. Furthermore, doing it with the polar coordinate frame required me to break from using pandas' dataframe, which is part of what made this method so appealing.
Using plt.xticks(rotation=90) should be enough. Here is an example with the "Iris" dataset:
import matplotlib.pyplot as plt
import pandas as pd
from pandas.plotting import parallel_coordinates
data = pd.read_csv('iris.csv')
parallel_coordinates(data, 'Name')
plt.xticks(rotation=90)
plt.show()
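parallel_coordinates returns the matplotlib Axes it drew on, so the rotation can also be applied to that Axes object instead of going through plt. The sketch below uses a small made-up frame with long column names in place of iris.csv:

```python
import matplotlib.pyplot as plt
import pandas as pd
from pandas.plotting import parallel_coordinates

# Small made-up frame with long feature names; "Name" is the class column.
data = pd.DataFrame({
    "sepal_length_in_cm": [5.1, 7.0, 6.3],
    "sepal_width_in_cm": [3.5, 3.2, 3.3],
    "petal_length_in_cm": [1.4, 4.7, 6.0],
    "Name": ["setosa", "versicolor", "virginica"],
})

ax = parallel_coordinates(data, "Name")      # returns the matplotlib Axes
ax.tick_params(axis="x", labelrotation=90)   # rotate the x tick labels of this Axes only
ax.figure.tight_layout()                     # make room for the vertical labels
```

Working on the Axes is handy when the figure has several subplots, because plt.xticks only affects the current one.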

Python Heatmaps (Basic and Complex)

What's the best way to do a heatmap in python (2.7)? I've found the heatmap.py module, and I was wondering if people have any advice on using it, or if there are other packages that do a good job.
I'm dealing with pretty basic data, like xy = np.random.rand(1000,2) superimposed on an image.
Although there's another thing I want to try, which is doing a heatmap that's scaled to a different heatmap. E.g., I have
attempts = np.random.rand(5000,2)
successes = np.random.rand(500,2)
And I want a heatmap of the successes relative to the density of the attempts. Is this possible?
Seaborn is a pretty widely-used library for making nice-looking plots, and has a heatmap function. Seaborn uses matplotlib under the hood.
import numpy as np
import seaborn as sns
xy = np.random.rand(1000,2)
sns.heatmap(xy, yticklabels=100)
Regarding your second question, I'm not sure what you mean. But my advice would be to create a numpy array or pandas dataframe of "successes [scaled] relative to the density of the attempts", however you mean that, and then pass that scaled array or dataframe to sns.heatmap
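One possible reading of "successes relative to the density of the attempts" is a per-bin success rate: count both point sets on the same grid and divide. The sketch below assumes that reading; the 20 x 20 grid is an arbitrary choice, and it uses matplotlib's imshow for the final image rather than sns.heatmap:

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
attempts = rng.random((5000, 2))
successes = rng.random((500, 2))

# Bin both point sets on the same grid so the counts are comparable.
edges = np.linspace(0.0, 1.0, 21)  # 20 x 20 bins, an arbitrary choice
att_counts, _, _ = np.histogram2d(attempts[:, 0], attempts[:, 1], bins=[edges, edges])
suc_counts, _, _ = np.histogram2d(successes[:, 0], successes[:, 1], bins=[edges, edges])

# Successes per attempt in each bin; bins without attempts stay NaN.
ratio = np.full(att_counts.shape, np.nan)
np.divide(suc_counts, att_counts, out=ratio, where=att_counts > 0)

fig, ax = plt.subplots()
# histogram2d puts x on the first axis, so transpose for imshow.
image = ax.imshow(ratio.T, origin="lower", extent=[0, 1, 0, 1], aspect="auto")
fig.colorbar(image, ax=ax, label="successes / attempts")
```

Because the two arrays in the question are independent random samples, the ratio can exceed 1 in some bins. With real data, where successes are a subset of attempts, it would stay between 0 and 1.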
You can plot very complex heatmaps using the Python package PyComplexHeatmap: https://github.com/DingWB/PyComplexHeatmap
https://github.com/DingWB/PyComplexHeatmap/blob/main/examples.ipynb
The most basic heatmap you can get is an image plot:
import matplotlib.pyplot as plt
import numpy as np
xy = np.random.rand(100,2)
plt.imshow(xy, aspect="auto")
plt.colorbar()
plt.show()
Note that using more points than you have pixels to show the heatmap might not make much sense.
There are of course other methods for drawing heatmaps as well; you may want to go through the matplotlib example gallery and see which plot appeals most to you.

Reproducing default plot behaviour of pandas.DataFrame.plot

As a frequent user of pandas, I often want to plot my data.
Using the df.plot() is very convenient, and I like the layout it gives me.
I often have the problem that when I show the generated graph to someone else, they like it, but want some tiny changes.
This often digresses into me trying to recreate the exact graph in matplotlib, which turns into a couple of hundred rows of code and it still does not work quite the same way as the df.plot()
Is there a way to get the settings for the default plotting behaviour from pandas and just add something to the plot?
Example:
df = pd.DataFrame([1,2,3,6],index=[15,16,17,18], columns=['values'])
df.plot(kind='bar')
This little piece of code makes this pretty graph:
Trying to recreate this with matplotlib turns into a few hours of digging through documentation and still not coming up with quite the right solution.
Not to mention how many lines of configuration code it is.
import matplotlib.pyplot as plt
import matplotlib.ticker as plticker
import matplotlib.patches as mpatches
fig, ax1 = plt.subplots()
ax1.bar(df.index, df['values'], 0.4, align='center')
loc = plticker.MultipleLocator(base=1.0)
ax1.xaxis.set_major_locator(loc)
ax1.xaxis.set_ticklabels(["","15", "16", "17", "18"])
plt.show()
TL;DR
How can I easily copy the behaviour of df.plot() and extend it, without having to recreate everything manually?
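One option, sketched here, relies on the fact that df.plot() returns the matplotlib Axes it drew on. Keeping a handle to that Axes lets you add to the pandas defaults instead of rebuilding them. The label, title, and mean line below are hypothetical tweaks chosen only to show the pattern:

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame([1, 2, 3, 6], index=[15, 16, 17, 18], columns=["values"])

# df.plot returns the matplotlib Axes it drew on, so keep a handle to it.
ax = df.plot(kind="bar")

# Small changes on top of the pandas defaults (labels here are hypothetical).
ax.set_ylabel("count")
ax.set_title("Values per index")
ax.axhline(df["values"].mean(), color="black", linestyle="--", label="mean")
ax.legend()
```

Going the other way also works: create the Axes with plt.subplots() and pass it in as df.plot(kind="bar", ax=ax), which is useful when the plot has to sit in a specific subplot.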
