Plot multiple columns of a single DataFrame using subplots - Python

I am new to Python.
I have a few pandas DataFrames with different columns.
I am trying to plot different columns as subplots, but I can't work out how to do it and would highly appreciate some help.
I tried a few things (code pasted below), but I get the error 'AxesSubplot' object is not subscriptable.
df = pd.read_csv('D:\data_ana\\file_r.csv')
figure , axis = plt.subplots(1,1)
axis[0,0].df.plot(x = "Age", y = ["K_DistX","K_DistY"],
kind="line", figsize=(5, 5))
axis[0,1].df.plot(x="Age", y=["K_VabsX","K_VabsY"],
kind="line", figsize=(5, 5))
# display plot
plt.show()

Since you are new to Python, here is an explanation of the error you are getting.
AxesSubplot is the type of the object created by the first line below; to be exact, it is your axis variable:
figure , axis = plt.subplots(1,1)
axis[0,0].df.plot(x = "Age", y = ["K_DistX","K_DistY"],
When Python runs the second line, it raises the error. plt.subplots(1,1) returns a single AxesSubplot rather than an array of axes, and a single AxesSubplot cannot be indexed with [0,0] (it is not subscriptable).
So what you need to do is stop indexing the axis object, which gets rid of that specific error. There are other issues in your code, though, such as calling .df on an axes object, which has no such attribute. These are fairly basic, so I would recommend studying pandas and matplotlib plotting a little more.
axis.plot(x_axis_data,y_axis_data)
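A minimal sketch of one way to get two side-by-side subplots from a single DataFrame, assuming the column names from the question. The small DataFrame is made-up stand-in data for the CSV, and the Agg backend is selected only so the sketch runs without a display. Requesting a 1x2 grid makes plt.subplots return an array of axes, and pandas draws onto a chosen axes through the ax argument of DataFrame.plot.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to get a window
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in data (hypothetical values) using the question's column names
df = pd.DataFrame({
    "Age": [1, 2, 3, 4],
    "K_DistX": [0.1, 0.4, 0.9, 1.6],
    "K_DistY": [0.2, 0.3, 0.5, 0.8],
    "K_VabsX": [1.0, 1.5, 1.7, 2.0],
    "K_VabsY": [0.5, 0.6, 0.9, 1.1],
})

# 1 row x 2 columns -> `axes` is a 1-D array holding two Axes objects
fig, axes = plt.subplots(1, 2, figsize=(10, 5))

# Hand the target axes to pandas instead of calling `.df` on the axes
df.plot(x="Age", y=["K_DistX", "K_DistY"], kind="line", ax=axes[0])
df.plot(x="Age", y=["K_VabsX", "K_VabsY"], kind="line", ax=axes[1])

fig.tight_layout()
# In an interactive session, call plt.show() here to display the figure.
```

With a single row the array is one-dimensional, so axes[0] and axes[1] are the right indices; axes[0, 0] would only apply to a grid with more than one row and more than one column.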

Related

How do I create a count plot with multiple columns without the axes being stored in a numpy.ndarray?

I'm new to coding and this is my first post. Sorry if it could be worded better!
I'm taking a free online course, and for one of the projects I have to make a count plot with 2 subplot columns.
I've managed to make a count plot with multiple subplots using the code below, and all of the values are correct.
fig = sns.catplot(x = 'variable', hue = 'value', order = ['active', 'alco', 'cholesterol', 'gluc', 'overweight', 'smoke'], col='cardio', data = df_cat, kind = 'count')
But because of the way I've done it, the fig.axes is stored in a 2 dimensional array. The only difference between both rows of the array is the title (cardio = 0 or cardio = 1). I'm assuming this is because of the col='cardio'. Does the col argument always cause the fig.axes to be stored in a 2D array? Is there a way around this or do I have to completely change how I'm making my graph?
I'm sure it's not usually a problem, but because of this, when I run my program through the test module, it fails since some of the functions in the test module don't work on numpy.ndarrays.
I pass the test if I change the reference from fig.axes[0] to fig.axes[0,0], but obviously I can't just change the test module to pass.
I found something. This is just an implementation detail, so it would be nuts to rely on it. If you set col_wrap, then you get an axes ndarray of a different shape.
Reproduced like this:
import seaborn as sns
# I don't have your data but I have this example
tips = sns.load_dataset("tips")
fig = sns.catplot(x='day', hue='sex', col='time', data=tips, kind='count', col_wrap=2)
fig.axes.shape
And it has shape (2,), i.e. it's 1D. Tested with seaborn==0.11.2.
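Another possible way around the array, sketched under assumptions: the FacetGrid returned by catplot exposes the underlying matplotlib figure as its fig attribute (newer seaborn releases also call it figure), and that figure's own axes attribute is a plain Python list. The tiny df_cat below is invented stand-in data, and whether handing the matplotlib figure to the course's test module is what that module expects is an assumption.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import pandas as pd
import seaborn as sns

# Hypothetical long-format data shaped like the question's df_cat
df_cat = pd.DataFrame({
    "cardio":   [0, 0, 0, 0, 1, 1, 1, 1],
    "variable": ["active", "smoke", "active", "smoke"] * 2,
    "value":    [0, 1, 1, 0, 1, 1, 0, 0],
})

grid = sns.catplot(x="variable", hue="value", col="cardio",
                   data=df_cat, kind="count")

print(type(grid.axes))   # a numpy.ndarray of Axes (2-D without col_wrap)
fig = grid.fig           # the matplotlib Figure behind the FacetGrid
first_ax = fig.axes[0]   # Figure.axes is a list, so plain [0] indexing works
```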

Combining Dataframe plots into single figure

I am trying to merge an arbitrary number of line charts into a single image, and while there are many, many questions about this sort of thing, none of them seem applicable to the code I'm working with.
Unlike a large number of answers, I don't want to have the separate graphs displayed side by side, or above one another, in a single output, but rather, combined together.
For all of these graphs the value of the "y_x" column would be the same, but the "yhat_y" produced during each loop would be different.
Adding subplots = True to the plot method of a DataFrame seems to change the return type to something that is no longer compatible with the code: 'numpy.ndarray' object has no attribute 'get_figure'
#ax = plt.subplot(111) doesn't seem to do anything
for variable in range(max_num):
    forecast = get_forecast(variable)
    cmp1 = forecast.set_index("ds")[["yhat", "yhat_lower", "yhat_upper"]].join(
        both.set_index("ds")
    )
    e.augmented_error[variable]= sklearn.metrics.mean_absolute_error(
        cmp["y"].values, cmp1["yhat"].values
    )
    cmp2=cmp.merge(cmp1,on='ds')
    plot = cmp2[['y_x', 'yhat_y']].plot(title =e)
    fig1 = plot.get_figure()
    plot.set_title("prediction")
    plt.show()
    fig1.savefig('output.pdf', format="pdf")
    plt.close()
The most straightforward way would be to create a reusable ax handle outside the loop, then call ax.plot inside the loop:
fig, ax = plt.subplots() # create reusable `fig` and `ax` handles
for variable in range(max_num):
    ...
    ax.plot(cmp2['y_x'], cmp2['yhat_y']) # use `ax.plot(cmp2...)` instead of `cmp2.plot()`
ax.set_title('predictions')
fig.savefig('output.pdf', format='pdf')
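A small self-contained sketch of the same idea, assuming made-up data in place of get_forecast and the merged cmp2 frame (fake_forecast is a hypothetical helper, not part of the question's code). It is a variation on the approach above: it keeps the pandas plotting call but passes ax=ax, so every loop pass draws onto the same axes, and the shared y_x series is drawn only once.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

def fake_forecast(variable, n=20):
    """Hypothetical stand-in for the merged cmp2 frame of one loop pass."""
    x = np.arange(n)
    return pd.DataFrame({"y_x": np.sin(x / 3.0),
                         "yhat_y": np.sin(x / 3.0) + 0.1 * (variable + 1)})

max_num = 3
fig, ax = plt.subplots()            # one figure and one axes for all passes
for variable in range(max_num):
    cmp2 = fake_forecast(variable)
    if variable == 0:
        # y_x is identical in every pass, so draw it a single time
        cmp2["y_x"].plot(ax=ax, color="k", label="y_x")
    cmp2["yhat_y"].plot(ax=ax, label=f"yhat_y ({variable})")

ax.set_title("predictions")
ax.legend()
fig.savefig("output.pdf", format="pdf")  # save once, after the loop
```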

Python - Multiple Plots in a Single Figure - Loop in Different columns

I'm trying to plot in a single image, multiple columns of a table.
The idea is to optimize the process with a loop.
It is important to note that all the columns share the same y-axis, and that the x scale varies for each column.
The Final result should look something like this:
I've already tried some things, but with no success. My code creates several figures and only plots in the first graph:
def facies_plot_all(logs):
    logs = sort_values(by='y')
    ztop=logs.Y.min(); zbot=logs.Y.max()
    for col in logs.columns:
        numcol = (logs.shape[1])
        f, ax = plt.subplots (nrows=1, ncols=numcol, figsize (20,25))
        ax[x+1].plot(logs[col],logs.Y,'-')
I'm relatively new to programming and still searching for a way to solve this issue.
Any help will be welcome!
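Put subplots outside of the for loop:
logs = logs.sort_values(by='Y')  # sort_values is a DataFrame method
ztop = logs.Y.min(); zbot = logs.Y.max()
numcol = logs.shape[1]
# create the figure and all axes once, sharing the y-axis
f, axes = plt.subplots(nrows=1, ncols=numcol,
                       sharey=True,
                       figsize=(20, 25))
# pair each axes with one column
for ax, col in zip(axes, logs.columns):
    ax.plot(logs[col], logs.Y, '-')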
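Wrapped back into the question's function, the idea might look like the sketch below. The example data, the depth_col parameter, and the choice to skip the depth column itself are assumptions made for illustration. squeeze=False is used so that plt.subplots always returns a 2-D array of axes, even when there is only one column to plot.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd

def facies_plot_all(logs, depth_col="Y"):
    """Plot every column except `depth_col` against depth, one axes each."""
    logs = logs.sort_values(by=depth_col)
    curves = [c for c in logs.columns if c != depth_col]
    # Create the figure once; squeeze=False keeps `axes` 2-D in every case
    fig, axes = plt.subplots(nrows=1, ncols=len(curves), sharey=True,
                             squeeze=False, figsize=(3 * len(curves), 8))
    for ax, col in zip(axes[0], curves):
        ax.plot(logs[col], logs[depth_col], "-")
        ax.set_title(col)
    axes[0, 0].set_ylabel(depth_col)
    return fig, axes

# Hypothetical well-log style data
logs = pd.DataFrame({"Y": [100, 101, 102, 103],
                     "GR": [55.0, 60.2, 58.1, 70.4],
                     "RHOB": [2.31, 2.35, 2.40, 2.38]})
fig, axes = facies_plot_all(logs)
```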

User defined function in Python won't accept a Series/iterable as `bottom` arguments for a matplotlib bar plot

Edited:
I am plotting stacked bars from my pandas DataFrame. The code generally works, but when I put the same code in a user-defined function and try to plot the bars between 2 specific dates, it returns an error. Here is the code:
import pandas as pd
import matplotlib.pyplot as plt
time_table = pd.DataFrame(data =[['2019-09-03',-1.089987,5.085],
                                 ['2019-09-04',-5.982087,-2.7],
                                 ['2019-09-05',-2.887029,57.46659545],
                                 ['2019-09-06',-5.634726,-47.45]] , columns=['Exec_date','Trade_cost', 'Closed_PnL'])
time_table['Exec_date'] = pd.to_datetime(time_table['Exec_date'] )
def between_dates(df_time , start = pd.to_datetime('09/3/2019'), finish = pd.to_datetime('09/6/2019')):
    mask = (df_time['Exec_date'] >= start ) &(df_time['Exec_date'] <= finish)
    df_time = df_time.loc[mask].copy()
    plot_it(df_time)
    return df_time
def plot_it(time_table):
    fig1=plt.figure()
    axes1 = fig1.add_axes([0.1,0.1,1,1])
    # Plots
    axes1.bar(time_table['Exec_date'], time_table['Closed_PnL'], color='0.7')
    axes1.bar(time_table['Exec_date'], time_table['Trade_cost'],bottom = time_table['Closed_PnL'],color='r')
    plt.show();
between_dates(time_table)
The above code works. But if I change the dates in the between_dates function to anything that doesn't cover all of my data, say change start to 09/04/2019, it returns this error: only size-1 arrays can be converted to Python scalars, which, I guess, probably has something to do with functions not being able to pass multiple values for the bottom argument. If I take out bottom or assign a single value to it, my function works without any errors, but the plot is no longer a stacked bar plot. To solve the problem, I am using a plotting loop inside my function:
def plot_it(time_table):
    fig1=plt.figure()
    axes1 = fig1.add_axes([0.1,0.1,1,1])
    # Plots
    axes1.bar(time_table['Exec_date'], time_table['Closed_PnL'], color='0.7',edgecolor = 'k')
    for i in time_table.index:
        axes1.bar(time_table.loc[i,'Exec_date'], time_table.loc[i,'Trade_cost'],bottom = time_table.loc[i,'Closed_PnL'],
                  color='r')
    plt.show();
It works! But doesn't look that great. I am wondering if there is a more elegant way of writing this code. Something that doesn't require going through a for loop and plotting bars one by one. I read a little about itertools.starmap, but can't get it to work in this case yet.
Here is a link to the code on Github for testing.
So, is there any better way of plotting these stacked bars? Appreciate your help and comments!
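One thing that might be worth trying, sketched here as an assumption rather than a confirmed fix: convert the columns to plain NumPy arrays before calling bar, so the filtered DataFrame's index (which no longer starts at 0 after masking) plays no role. Whether this removes the "only size-1 arrays" error depends on the pandas and matplotlib versions in use, which the question does not state. The sketch reuses the question's data and draws both layers with two vectorized bar calls, no loop.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd

time_table = pd.DataFrame(
    data=[['2019-09-03', -1.089987, 5.085],
          ['2019-09-04', -5.982087, -2.7],
          ['2019-09-05', -2.887029, 57.46659545],
          ['2019-09-06', -5.634726, -47.45]],
    columns=['Exec_date', 'Trade_cost', 'Closed_PnL'])
time_table['Exec_date'] = pd.to_datetime(time_table['Exec_date'])

def plot_it(df):
    fig, ax = plt.subplots()
    # Plain arrays: positions are 0..n-1 regardless of the DataFrame index
    dates = df['Exec_date'].to_numpy()
    pnl = df['Closed_PnL'].to_numpy()
    cost = df['Trade_cost'].to_numpy()
    ax.bar(dates, pnl, color='0.7', edgecolor='k')
    ax.bar(dates, cost, bottom=pnl, color='r')  # stacked on top of pnl
    return ax

# Filter to a date range that does not start at the first row
mask = time_table['Exec_date'] >= pd.to_datetime('2019-09-04')
ax = plot_it(time_table.loc[mask])
```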

seaborn/matplotlib change number of columns in legend object

I've seen Creating multi column legend in python seaborn plot but I think my question is a bit different. In short, I've got a dataframe that I'm plotting in seaborn's lmplot and getting a FacetGrid. Trouble is, there are tons of values for hue so I get a super long, single column legend. Code example below:
ers = sns.lmplot(
data=emorb,
x="Pb",
y="Nd",
row="Ridge Sys",
hue="Seg Name",
scatter=True,
fit_reg=False,
scatter_kws={"alpha":0.7, "edgecolor": "w"},
palette=sns.color_palette("bright", 20),
legend=True
)
ers.set(ylim=(0.5122,0.5134))
I can access the legend object that is created by calling ers._legend and this returns an object with type Legend (basically, a matplotlib object). However, I can't then call to this legend object to change the number of columns, e.g., with:
l = ers._legend
l(ncols=9)
Any suggestions, or am I missing something perhaps more obvious, such as a way to redraw the legend and specify any parameters?
Thanks.
Whoops, figured it out:
The FacetGrid object has an attribute fig, i.e.
g = sns.lmplot()
parent_mpl_figure = g.fig
And so if I set legend=False in sns.lmplot(), I can then specify parent_mpl_figure.legend(labels=[], ncol=9, bbox_to_anchor=(1,1)).
Written cleanly:
g = sns.lmplot(legend = False)
parent_mpl_figure = g.fig
parent_mpl_figure.legend(labels = [], ncol = 9, bbox_to_anchor = (1,1))
Hope this is instructive for someone else / now to figure out how to have each Facet span the full color palette so that different hue groups within each Facet group are easier to distinguish...
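A fuller sketch of the approach above, assuming a small invented stand-in for the emorb data. Since labels=[] alone would leave the legend empty, the sketch collects handles and labels from each facet's axes with get_legend_handles_labels and passes them to the figure-level legend; the ncol value of 3 is arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the `emorb` DataFrame
emorb = pd.DataFrame({
    "Pb": [17.9, 18.1, 18.3, 18.0, 18.4, 18.6],
    "Nd": [0.5130, 0.5131, 0.5129, 0.5132, 0.5128, 0.5130],
    "Ridge Sys": ["A", "A", "A", "B", "B", "B"],
    "Seg Name": ["s1", "s2", "s3", "s4", "s5", "s6"],
})

# legend=False stops seaborn from drawing its own single-column legend
g = sns.lmplot(data=emorb, x="Pb", y="Nd", row="Ridge Sys", hue="Seg Name",
               fit_reg=False, legend=False)

# Collect one handle per hue label across all facets
entries = {}
for ax in g.axes.flat:
    handles, labels = ax.get_legend_handles_labels()
    entries.update(zip(labels, handles))

# Draw a multi-column legend on the parent matplotlib figure
legend = g.fig.legend(list(entries.values()), list(entries.keys()),
                      ncol=3, bbox_to_anchor=(1, 1))
```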
