Creating multiple plots using a for loop from a dataframe - python

I am trying to create a figure that contains 9 subplots (3 x 3). The X and Y axis data come from a dataframe using groupby. Here is my code:
fig, axs = plt.subplots(3,3)
for index,cause in enumerate(cause_list):
    df[df['CAT']==cause].groupby('RYQ')['NO_CONSUMERS'].mean().axs[index].plot()
    axs[index].set_title(cause)
plt.show()
However, it does not produce the desired output; in fact, it raises an error. If I remove the axs[index] before plot() and instead pass it inside the plot() call, as in plot(ax=axs[index]), then it runs and produces nine subplots but does not display the data in them (as shown in the figure).
Could anyone point out where I am making the mistake?

You need to flatten axs, otherwise it is a 2D array and axs[index] returns a whole row of axes rather than a single one. You can also pass the axes to the plot function through the ax argument (see the pandas plot documentation). Using an example:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
cause_list = np.arange(9)
df = pd.DataFrame({'CAT':np.random.choice(cause_list,100),
                   'RYQ':np.random.choice(['A','B','C'],100),
                   'NO_CONSUMERS':np.random.normal(0,1,100)})
fig, axs = plt.subplots(3,3,figsize=(8,6))
axs = axs.flatten()
for index,cause in enumerate(cause_list):
    df[df['CAT']==cause].groupby('RYQ')['NO_CONSUMERS'].mean().plot(ax=axs[index])
    axs[index].set_title(cause)
plt.tight_layout()
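To see why flattening is needed, it can help to inspect what plt.subplots returns. The following is only a minimal sketch; the variable names (grid_shape, first_row, flat_axs) are illustrative and not part of the original answer:

```python
import matplotlib.pyplot as plt

fig, axs = plt.subplots(3, 3)

# With more than one row and one column, axs is a 2-D NumPy array of Axes.
grid_shape = axs.shape

# A single integer index returns a whole row (an array of 3 Axes), not one Axes,
# which is why axs[index].plot() fails in the question.
first_row = axs[0]

# flatten() gives a 1-D array that a single index can walk row by row.
flat_axs = axs.flatten()
flat_axs[4].set_title("centre panel")  # same Axes object as axs[1, 1]
```

Iterating with zip(axs.flatten(), cause_list) would be an equivalent way to pair each axes with one cause without an explicit index.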

Related

How to plot multiple numpy array in one figure?

I know how to plot a single numpy array
import matplotlib.pyplot as plt
plt.imshow(np.array)
But is there any way to plot multiple numpy array in one figure? I know plt.subplots() can display multiple pictures. But it seems hard in my case. I have tried to use a for loop.
fig, ax = plt.subplots()
for i in range(10):
    ax.imshow(np.array) # i_th numpy array
This seems to plot the numpy arrays one by one on the same axes. Is there any way to plot all 10 of my numpy arrays in one figure?
PS: Each of my 2D numpy arrays represents the pixels of a picture. plot seems to draw lines, which is not suitable in my case.
The documentation for plt.subplots() specifies that it takes nrows and ncols arguments and returns a fig object and an array of ax objects. For your case this would look like this:
fig, axs = plt.subplots(3, 4)
axs now contains a 2D array filled with ax objects that you can use to plot various things, e.g. axs[0,1].plot([1,2],[3,4,]) would plot something on the first row, second column.
If you want to remove a particular ax object you can do that with .remove(), e.g. axs[0,1].remove().
For .imshow it works in exactly the same way as .plot: select the ax you want and call imshow on it.
A full example with simulated image data for your case would be:
import numpy as np
import matplotlib.pyplot as plt
fig, axs = plt.subplots(3, 4)
images = [np.array([[1,2],[3,4]]) for _ in range(10)]
for i, ax in enumerate(axs.flatten()):
    if i < len(images):
        ax.imshow(images[i])
    else:
        ax.remove()
plt.show()
The result is a 3 x 4 grid in which the first ten axes show the images and the last two, unused axes are removed.

Why matplotlib is not displaying the chart with values generated using numpy random array?

I have written following code,
import numpy as np
import matplotlib.pyplot as plt
x=np.random.randint(0,10,[1,5])
y=np.random.randint(0,10,[1,5])
x.sort(),y.sort()
fig, ax=plt.subplots(figsize=(10,10))
ax.plot(x,y)
ax.set( title="random data plot", xlabel="x",ylabel="y")
I am getting a blank figure.
Same code prints chart if I manually assign below value to x and y and not use random function.
x=[1,2,3,4]
y=[11,22,33,44]
Am I missing something or doing something wrong.
x=np.random.randint(0,10,[1,5]) returns a 2D array of shape (1, 5) because the size is given as [1,5]. You probably want a 1D array instead: either x=np.random.randint(0,10,[1,5])[0] or x=np.random.randint(0,10,size = 5). See: https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.random.randint.html
Matplotlib doesn't plot markers by default, only a line. As per @Can's comment, matplotlib interprets your (1, 5) array as 5 different datasets each with 1 point, so no line is drawn because there is no second point to connect to.
If you add a marker to your plot function then you can see the data is actually being plotted, just probably not as you wish:
import matplotlib.pyplot as plt
import numpy as np
x=np.random.randint(0,10,[1,5])
y=np.random.randint(0,10,[1,5])
x.sort(),y.sort()
fig, ax=plt.subplots(figsize=(10,10))
ax.plot(x,y, marker='.') # <<< marker for each point added here
ax.set( title="random data plot", xlabel="x",ylabel="y")
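For comparison, here is a minimal sketch of the size=5 variant mentioned above, next to the (1, 5) case. The names lines_1d, lines_2d, fig2 and ax2 are only illustrative, and np.sort is used here in place of the in-place x.sort():

```python
import numpy as np
import matplotlib.pyplot as plt

# 1-D arrays of 5 sorted random integers (size=5 instead of shape [1, 5])
x = np.sort(np.random.randint(0, 10, size=5))
y = np.sort(np.random.randint(0, 10, size=5))

fig, ax = plt.subplots(figsize=(10, 10))
lines_1d = ax.plot(x, y)  # a single line through the 5 points
ax.set(title="random data plot", xlabel="x", ylabel="y")

# The same values as (1, 5) arrays are treated column by column,
# giving 5 separate lines with a single point each (nothing visible).
fig2, ax2 = plt.subplots()
lines_2d = ax2.plot(x.reshape(1, 5), y.reshape(1, 5))
```

Calling plt.show() afterwards displays both figures; only the first one shows a visible line.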

How do I make subplots with this method

I'm trying to make a subplot of histograms for each of the features in the dataset.
The following code is what I have already tried to fix the problem. Consider train dataset, which has 9 columns and which I want to be plotted in a 3*3 subplot.
import matplotlib.pyplot as plt
fig, ax = plt.subplots(nrows=3, ncols=3)
i=0
for row in ax:
    for col in row:
        train.iloc[:,i].hist()
        i=i+1
I'm getting all histograms in the last subplot.
Here is my suggestion:
import matplotlib.pyplot as plt
import random
for i in range(1, 10):
    # Cut your figure into 3 rows and 3 columns
    # and create the plot in the i-th subplot.
    plt.subplot(3, 3, i)
    # Histogram of 100 random integers, used as stand-in data
    plt.hist([random.randrange(0, 10) for _ in range(100)])
plt.show()
You can find more ideas at this amazing website: The Python Graph Gallery
pandas.DataFrame.hist can take an ax parameter which is the Matplotlib axes to use.
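Applied to the loop from the question, passing ax might look like the sketch below. The train DataFrame here is a hypothetical stand-in with 9 random numeric columns, since the original data is not shown:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for the asker's `train` data: 9 numeric columns
train = pd.DataFrame(np.random.normal(size=(200, 9)),
                     columns=[f"feature_{k}" for k in range(9)])

fig, axes = plt.subplots(nrows=3, ncols=3, figsize=(9, 9))
for i, ax in enumerate(axes.flatten()):
    # Draw each column's histogram on its own Axes instead of the current one
    train.iloc[:, i].hist(ax=ax)
    ax.set_title(train.columns[i])
plt.tight_layout()
```

Without ax, each hist() call draws on the current axes, which after plt.subplots is the last one created; that would explain why all histograms ended up in the last subplot.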

Getting error when plotting a figure with subplots using axes in matplotlib

I tried to plot the subplots using the code below, but I am getting AttributeError: 'numpy.ndarray' object has no attribute 'boxplot'.
If I change it to plt.subplots(1,2), it plots the box plots but then raises an IndexError.
import matplotlib.pyplot as plt
import seaborn as sns
fig = plt.Figure(figsize=(10,5))
x = [i for i in range(100)]
fig , axes = plt.subplots(2,2)
for i in range(4):
    sns.boxplot(x, ax=axes[i])
plt.show();
I expect four subplots to be plotted, but an AttributeError is thrown.
There are a couple of issues in your plot:
You are defining the figure twice, which is not needed. I merged the two into one.
You were looping 4 times using range(4) and using axes[i] to access the subplots. This is wrong for the following reason: your axes is 2-dimensional, so you need 2 indices to access it. Each dimension has length 2 because you have 2 rows and 2 columns, so the only indices you can use are 0 and 1 along each axis, e.g. axes[0,1], axes[1,0], etc.
As @DavidG pointed out, you don't need the list comprehension. You can use range(100) directly.
The solution is to flatten your 2D axes object and then iterate over it directly, which gives you the individual subplots one at a time. The order of the subplots will be row-wise.
Complete working code
import matplotlib.pyplot as plt
import seaborn as sns
x = range(100)
fig , axes = plt.subplots(2,2, figsize=(10,5))
for ax_ in axes.flatten():
    sns.boxplot(x, ax=ax_)
plt.show()
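If you would rather keep the range(4) loop, the two-index access described above can be sketched as follows. This is only an illustration: it uses Matplotlib's own Axes.boxplot instead of seaborn so that it has no extra dependency, and divmod is just one possible way to turn the running index into a (row, column) pair:

```python
import matplotlib.pyplot as plt

x = list(range(100))
fig, axes = plt.subplots(2, 2, figsize=(10, 5))

for i in range(4):
    # Convert the running index into (row, column) for the 2 x 2 grid
    row, col = divmod(i, 2)
    axes[row, col].boxplot(x)
    axes[row, col].set_title(f"axes[{row}, {col}]")
```

The subplots are filled in the same row-wise order as with axes.flatten().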

secondary_y=True changes x axis in pandas

I'm trying to plot two series together in Pandas, from different dataframes.
Both have datetime indexes, so they can be plotted together:
amazon_prices.Close.plot()
data[amazon].BULL_MINUS_BEAR.resample("W").plot()
plt.plot()
Yields:
All fine, but I need the green graph to have its own scale. So I use the secondary_y option:
amazon_prices.Close.plot()
data[amazon].BULL_MINUS_BEAR.resample("W").plot(secondary_y=True)
plt.plot()
This secondary_y creates a problem, as instead of having the desired graph, I have the following:
Any help with this is hugely appreciated.
(Less relevant notes: I'm (evidently) using Pandas, Matplotlib, and all this is in an Ipython notebook)
EDIT:
I've since noticed that removing the resample("W") solves the issue. It is still a problem however as the non-resampled data is too noisy to be visible. Being able to plot sampled data with a secondary axis would be hugely helpful.
import matplotlib.pyplot as plt
import pandas as pd
from numpy.random import random
df = pd.DataFrame(random((15,2)),columns=['a','b'])
df.a = df.a*100
fig, ax1 = plt.subplots(1,1)
df.a.plot(ax=ax1, color='blue', label='a')
ax2 = ax1.twinx()
df.b.plot(ax=ax2, color='green', label='b')
ax1.set_ylabel('a')
ax2.set_ylabel('b')
ax1.legend(loc=3)
ax2.legend(loc=0)
plt.show()
I had the same issue, always getting a strange plot when I wanted a secondary_y.
I don't know why no-one mentioned this method in this post, but here's how I got it to work, using the same example as cphlewis:
import matplotlib.pyplot as plt
import pandas as pd
from numpy.random import random
df = pd.DataFrame(random((15,2)),columns=['a','b'])
ax = df.plot(secondary_y=['b'])
plt.show()
Here's what it'll look like
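For the resampled case from the question, one possible approach is to combine the twinx idea above with an explicit weekly aggregation. This is a sketch under assumptions: the two series are hypothetical random stand-ins for amazon_prices.Close and BULL_MINUS_BEAR, a weekly mean is assumed as the aggregation, and the data is drawn with Matplotlib's ax.plot rather than the pandas plot method so that both series go through the same date handling:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-ins for the two series in the question (daily data)
idx = pd.date_range("2015-01-01", periods=120, freq="D")
close = pd.Series(300 + np.random.normal(size=120).cumsum(), index=idx)
bull_minus_bear = pd.Series(np.random.normal(size=120), index=idx)

# Aggregate the noisy series to weekly means before plotting
weekly = bull_minus_bear.resample("W").mean()

fig, ax1 = plt.subplots()
ax1.plot(close.index, close.values, color="blue")
ax1.set_ylabel("Close")

# Second y axis that shares the x axis of ax1
ax2 = ax1.twinx()
ax2.plot(weekly.index, weekly.values, color="green")
ax2.set_ylabel("BULL_MINUS_BEAR (weekly mean)")
```

Because ax2 is created with twinx, both lines share one date axis while each keeps its own y scale. Call plt.show() to display the figure.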
