Plot pandas dataframe with varying number of columns along imshow - python

I want to plot an image and a pandas bar plot side by side in an IPython notebook. This is part of a function, so the dataframe containing the values for the bar chart can vary in its number of columns.
The libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
%matplotlib inline
Dataframe
faces = pd.DataFrame(...) # returns values for 8 characteristics
This returns the bar chart I'm looking for and works for a varying number of columns.
faces.plot(kind='bar').set_xticklabels(result[0]['scores'].keys())
But I didn't find a way to plot it in a pyplot figure also containing the image. This is what I tried:
fig, (ax_l, ax_r) = plt.subplots(nrows=1, ncols=2, figsize=(15, 5))
ax_l.imshow( img )
ax_r=faces.plot(kind='bar').set_xticklabels(result[0]['scores'].keys())
The output I get is the image on the left and an empty plot area, with the correct plot below. There is
ax_r.bar(...)
but I couldn't find a way around having to define the columns to be plotted.

You just need to specify your axes object in your DataFrame.plot calls.
In other words: faces.plot(kind='bar', ax=ax_r)
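The attempt in the question fails for two reasons: without ax=, DataFrame.plot opens a new figure, and chaining .set_xticklabels(...) means ax_r is reassigned to a list of tick labels rather than an axes. Here is a minimal sketch of the fix. The random array standing in for the image and the column names in faces are assumptions for illustration only, not the asker's data.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Stand-ins (assumptions): a random array instead of a real image, and a
# one-row frame whose number of columns could be anything.
img = np.random.rand(64, 64)
faces = pd.DataFrame([[0.1, 0.7, 0.2]],
                     columns=["anger", "happiness", "surprise"])

fig, (ax_l, ax_r) = plt.subplots(nrows=1, ncols=2, figsize=(15, 5))
ax_l.imshow(img)

# Passing ax= draws into the existing right-hand axes instead of a new figure.
returned_ax = faces.plot(kind="bar", ax=ax_r)
```

DataFrame.plot returns the axes it drew on, so any further tweaks (tick labels, titles) can be applied to ax_r afterwards on a separate line.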

Related

Plot pandas all columns from and use their dataframe

I would like to have every column on my x-Axis and every value on my y-Axis.
With plotly and seaborn I could only find a way to plot the values against each other (column 1 on x vs column 2 on y).
For the example shown, the columns would be:
"Import Files", "Defining Variables", "Simulate Cutting Down",...
I would like to have all their values on the y-Axis.
So what I basically want is
import matplotlib.pyplot as plt
import pandas as pd
df = pd.read_csv('timings.csv')
df.T.plot()
plt.show()
but with scatter. Matplotlib, Seaborn or Plotly is fine by me.
This would be an example for a csv File, since I can't upload a file:
Import Files,Defining Variables,Copy All Cutters,Simulate Cutting Down,Calculalte Circle, Simulate Cutting Circle, Calculate Unbalance,Write to CSV,Total Time
0.015956878662109375,0.0009989738464355469,0.022938966751098633,0.1466083526611328,0.0009968280792236328,48.128061294555664,0.0,0.014995098114013672,48.33055639266968
0.015958786010742188,0.0,0.024958133697509766,0.14598894119262695,0.0,49.22848296165466,0.0,0.004987239837646484,49.42037606239319
0.015943288803100586,0.0,0.036900997161865234,0.14561033248901367,0.0,46.80884146690369,0.0,0.004009723663330078,47.011305809020996
I only used the data you provided. As mentioned by others in the comments, a bar plot is better suited to this data, but here it is as a scatter plot:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# df is the DataFrame read from the CSV in the question
fig, ax = plt.subplots(figsize=(16,5))
sns.scatterplot(data=df.melt(), x='variable', y ='value', ax=ax)
ax.set_xlabel('')
ax.set_ylabel('Time in seconds')
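The key step in that answer is df.melt(), which stacks every column into two long columns, 'variable' (the original column name) and 'value'. The sketch below shows the same idea with pandas and matplotlib only (swapping ax.scatter in for sns.scatterplot). The inline CSV is a cut-down, rounded excerpt of the question's data, used purely for illustration.

```python
import io
import pandas as pd
import matplotlib.pyplot as plt

# Three columns and two rows, rounded from the question's sample CSV.
csv_text = """Import Files,Defining Variables,Total Time
0.016,0.001,48.33
0.016,0.0,49.42
"""
df = pd.read_csv(io.StringIO(csv_text))

# Wide -> long: one row per (column name, value) pair.
long_df = df.melt()

fig, ax = plt.subplots(figsize=(8, 4))
# String x values are treated as categories, one position per column name.
ax.scatter(long_df["variable"], long_df["value"])
ax.set_xlabel("")
ax.set_ylabel("Time in seconds")
```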

Creating multiple plot using for loop from dataframe

I am trying to create a figure which contains 9 subplots (3 x 3). X, and Y axis data is coming from the dataframe using groupby. Here is my code:
fig, axs = plt.subplots(3,3)
for index,cause in enumerate(cause_list):
    df[df['CAT']==cause].groupby('RYQ')['NO_CONSUMERS'].mean().axs[index].plot()
    axs[index].set_title(cause)
plt.show()
However, it does not produce the desired output; in fact, it raises an error. If I remove the axs[index] before plot() and pass it inside the call instead, as plot(ax=axs[index]), it runs and produces nine subplots but does not display the data in them (as shown in the figure).
Could anyone point out where I am making the mistake?
You need to flatten axs; otherwise it is a 2D array, and axs[index] is a whole row of axes rather than a single one. You can then pass the ax to the plot function (see the pandas plot documentation). Using an example:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
cause_list = np.arange(9)
df = pd.DataFrame({'CAT':np.random.choice(cause_list,100),
                   'RYQ':np.random.choice(['A','B','C'],100),
                   'NO_CONSUMERS':np.random.normal(0,1,100)})
fig, axs = plt.subplots(3,3,figsize=(8,6))
axs = axs.flatten()
for index,cause in enumerate(cause_list):
    df[df['CAT']==cause].groupby('RYQ')['NO_CONSUMERS'].mean().plot(ax=axs[index])
    axs[index].set_title(cause)
plt.tight_layout()
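To see why flattening matters, it may help to inspect the shape of what plt.subplots returns for a grid. This is a small sketch, independent of the dataframe above; the panel titles are placeholders.

```python
import matplotlib.pyplot as plt

fig, axs = plt.subplots(3, 3)
print(axs.shape)       # 2D grid: axs[0] is a row of three Axes, not one Axes

flat_axs = axs.flatten()
print(flat_axs.shape)  # 1D: flat_axs[index] is a single Axes

# A single running index now reaches every panel in row-major order.
for index, ax in enumerate(flat_axs):
    ax.set_title(f"panel {index}")
```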

Why matplotlib is not displaying the chart with values generated using numpy random array?

I have written following code,
import numpy as np
import matplotlib.pyplot as plt
x=np.random.randint(0,10,[1,5])
y=np.random.randint(0,10,[1,5])
x.sort(),y.sort()
fig, ax=plt.subplots(figsize=(10,10))
ax.plot(x,y)
ax.set( title="random data plot", xlabel="x",ylabel="y")
I am getting a blank figure.
Same code prints chart if I manually assign below value to x and y and not use random function.
x=[1,2,3,4]
y=[11,22,33,44]
Am I missing something or doing something wrong?
x=np.random.randint(0,10,[1,5]) returns a 2D array of shape (1, 5), because that is the size you asked for. You want either x=np.random.randint(0,10,[1,5])[0] or x=np.random.randint(0,10,size=5). See: https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.random.randint.html
Matplotlib doesn't plot markers by default, only a line. As per @Can's comment, matplotlib interprets your (1, 5) array as 5 different datasets with 1 point each, so no line is drawn because no dataset has a second point to connect to.
If you add a marker to your plot function then you can see the data is actually being plotted, just probably not as you wish:
import matplotlib.pyplot as plt
import numpy as np
x=np.random.randint(0,10,[1,5])
y=np.random.randint(0,10,[1,5])
x.sort(),y.sort()
fig, ax=plt.subplots(figsize=(10,10))
ax.plot(x,y, marker='.') # <<< marker for each point added here
ax.set( title="random data plot", xlabel="x",ylabel="y")
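For comparison, here is a sketch of the 1D fix next to the 2D case. Counting the Line2D objects that ax.plot returns makes the difference visible: one line with five points versus five lines with one point each. The second figure exists only to show that contrast.

```python
import numpy as np
import matplotlib.pyplot as plt

# 1D arrays of length 5: one dataset with five points.
x = np.sort(np.random.randint(0, 10, size=5))
y = np.sort(np.random.randint(0, 10, size=5))

fig, ax = plt.subplots(figsize=(10, 10))
lines_1d = ax.plot(x, y)
ax.set(title="random data plot", xlabel="x", ylabel="y")

# The same values reshaped to (1, 5): each column is its own dataset.
fig2, ax2 = plt.subplots()
lines_2d = ax2.plot(x.reshape(1, 5), y.reshape(1, 5))

print(len(lines_1d), len(lines_2d))  # 1 line vs 5 single-point lines
```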

How to plot only one half of a scatter matrix using pandas

I am using pandas scatter_matrix (couldn't get PairGrid in seaborn to work) to plot all combinations of a set of columns in a pandas frame. Each column has 1000 data points and there are nine columns.
I am using the following code:
pandas.plotting.scatter_matrix(df, alpha=0.2, figsize=(8,8))
I get the figure shown below:
This is nice. However, you'll notice that I have a mirror image across the main diagonal. Is it possible to plot only the lower portion, as in the following fake plot I made using Paint:
This is probably not the cleanest way to do it, but it works:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
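# iris is assumed to be a DataFrame of numeric columns (it is not defined in this snippet)
axes = pd.plotting.scatter_matrix(iris, alpha=0.2, figsize=(8,8))
for i in range(np.shape(axes)[0]):
    for j in range(np.shape(axes)[1]):
        if i < j:
            # hide every panel above the main diagonal
            axes[i,j].set_visible(False)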
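A self-contained variant of the same idea, sketched with random placeholder data in place of the real frame. It uses np.triu_indices_from to pick out the panels above the diagonal rather than nested loops; the effect should be the same.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Placeholder data (assumption): 50 rows, 4 numeric columns.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(50, 4)), columns=list("abcd"))

# scatter_matrix returns a 2D array of Axes, one per pair of columns.
axes = pd.plotting.scatter_matrix(df, alpha=0.2, figsize=(8, 8))

# k=1 selects the strictly upper triangle, so the diagonal stays visible.
rows, cols = np.triu_indices_from(axes, k=1)
for i, j in zip(rows, cols):
    axes[i, j].set_visible(False)
```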

How to add black lines to stacked pandas area plot?

I want to add separating black lines to a Python area plot created using pandas. In other words, I want the stacked areas to be separated by black lines.
My current code is the following:
figure1=mydataframe.plot(kind='area', stacked=True)
And I am looking for an additional argument to pass on to the function, such as:
figure1=mydataframe.plot(kind='area', stacked=True, blacklines=TRUE)
Is there a way I can achieve this using pandas or additional matplotlib commands?
Use plt.stackplot(). You can control line width and color with linewidth and edgecolor arguments:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
data = np.random.randn(10,3)
df = pd.DataFrame(abs(data))
plt.stackplot(np.arange(10),[df[0],df[1],df[2]], linewidth=1, edgecolor='black')
plt.show()
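If the number of columns is not known in advance, the whole frame can be handed to stackplot at once. This is a sketch with random placeholder data; the column names are made up, and mydataframe here merely stands in for the frame from the question.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Placeholder data (assumption): 10 rows, 3 non-negative columns.
mydataframe = pd.DataFrame(np.abs(np.random.randn(10, 3)),
                           columns=["a", "b", "c"])

fig, ax = plt.subplots()
# stackplot expects one row per series, hence the transpose.
polys = ax.stackplot(mydataframe.index, mydataframe.to_numpy().T,
                     labels=mydataframe.columns,
                     edgecolor="black", linewidth=1)
ax.legend(loc="upper left")
```

stackplot forwards extra keyword arguments to fill_between, which is why edgecolor and linewidth take effect on each stacked area.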
