Plot Column from different dataframe into one chart of lines - python

I have two dataframes with the same headers. I want to plot the 'Close' column from both dataframes as two lines on one chart.
So I have:
(aapl.Close).plot()
and
(tsla.Close).plot()
Each call plots what I need, but on two separate charts. I need both lines on one line chart.

Tried the below, literally after I posted.
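import matplotlib.pyplot as plt

fig = plt.figure()
ax = fig.add_subplot()
ax.plot(aapl.Close, label='AAPL')  # both calls draw on the same axes
ax.plot(tsla.Close, label='TSLA')
ax.legend()  # tell the two lines apart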
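
An equivalent approach stays within pandas: Series.plot returns the matplotlib axes it drew on and accepts an ax argument, so the second call can reuse the axes from the first. A minimal sketch, assuming aapl and tsla are dataframes with a 'Close' column; the price values below are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-ins for the real aapl / tsla dataframes (assumption).
dates = pd.date_range("2021-01-04", periods=5, freq="D")
aapl = pd.DataFrame({"Close": [129.4, 131.0, 126.6, 130.9, 132.1]}, index=dates)
tsla = pd.DataFrame({"Close": [243.3, 245.0, 252.0, 272.0, 293.3]}, index=dates)

# The first call creates the axes; the second draws onto the same axes.
ax = aapl["Close"].plot(label="AAPL")
tsla["Close"].plot(ax=ax, label="TSLA")
ax.legend()
ax.set_ylabel("Close")
```

In an interactive session, plt.show() would then display the single chart with both lines.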

Related

Make subplot from multiple dataframes loaded with looping through .dat files

I have multiple .dat files (30). I load them as dataframes and add more columns:
A_files = glob.glob("*.A*.dat")  # find .dat files which contain "A" in their name
for files in A_files:
    df = pd.read_fwf(files, header=None, infer_nrows=300, names=["Time", "Result", "Error"])  # load each file as a dataframe
    df['error_plus'] = df["Result"] + df['Error']  # first curve for the error band
    df['error_minus'] = df["Result"] - df['Error']  # second curve for the error band
Now I'd like to make subplots from the dataframes, where x='Time', y='Result', and 'error_plus' with 'error_minus' will serve for function ax.fill_between. I tried to extend the code above with this:
A_files = glob.glob("*.A*.dat")
for files in A_files:
    df = pd.read_fwf(files, header=None, infer_nrows=300, names=["Time", "Result", "Error"])
    df['error_plus'] = df["Result"] + df['Error']
    df['error_minus'] = df["Result"] - df['Error']
    ax = df.plot(subplots=True, x='Time', y="Result", sharey=True, sharex=True)
    ax.fill_between(df["Time"], df["error_plus"], df["error_minus"], color="r")
However, it didn't make subplots as I expected, and this error was raised: 'numpy.ndarray' object has no attribute 'fill_between' (when I plot just one dataframe without looping, this error doesn't occur).
Is there an easy/elegant way to make subplots from a loop of dataframes, including fill_between to highlight the error and with shared axes?
Thanks.
The problem is in the use of ax=df.plot(). With subplots=True, df.plot() returns a NumPy array of axes rather than a single axes object, which is why ax.fill_between fails. Instead, create the axes first and pass a "matplotlib axes object" to df.plot() through its ax parameter.
You also don't need the subplots parameter. It has a different meaning: it controls whether each of df's columns is plotted in its own subplot or all of them together in one.
See: pandas.DataFrame.plot documentation.
Then pass sharex and sharey to the pyplot.subplots() function instead, because that is where you create the one figure with its subplots.
Here is a mockup of all the changes:
fig, axes = plt.subplots(nrows=5, ncols=6, sharex=True, sharey=True)
axes_flat = axes.flatten()
for file, ax in zip(A_files, axes_flat):
    df = pd.read_fwf(file, ...)
    ...
    df.plot(x='Time', y="Result", ax=ax)
plt.show()
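
The mockup leaves out the error band. Because each ax in the loop is a single axes object, ax.fill_between can be called on it directly. A minimal sketch, assuming synthetic dataframes in place of the .dat files (the hypothetical make_fake_df helper stands in for pd.read_fwf) and a 2x3 grid instead of 5x6 to keep it small:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def make_fake_df(seed):
    """Hypothetical stand-in for pd.read_fwf(file, ...): Time/Result/Error columns."""
    rng = np.random.default_rng(seed)
    time = np.linspace(0, 10, 50)
    return pd.DataFrame({
        "Time": time,
        "Result": np.sin(time) + rng.normal(0, 0.1, time.size),
        "Error": np.full(time.size, 0.3),
    })


fig, axes = plt.subplots(nrows=2, ncols=3, sharex=True, sharey=True, figsize=(9, 5))

for seed, ax in zip(range(6), axes.flatten()):
    df = make_fake_df(seed)
    df["error_plus"] = df["Result"] + df["Error"]
    df["error_minus"] = df["Result"] - df["Error"]
    df.plot(x="Time", y="Result", ax=ax, legend=False)
    # Shade the band between the lower and upper error curves on this subplot.
    ax.fill_between(df["Time"], df["error_minus"], df["error_plus"], color="r", alpha=0.3)

fig.tight_layout()
```

The alpha value is only there so the line stays visible through the band; the question's solid color="r" works the same way.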

Creating a single tidy seaborn plot in a 'for' loop

I'm trying to generate a plot in seaborn using a for loop to plot the contents of each dataframe column on its own row.
The number of columns that need plotting can vary between 1 and 30. However, the loop creates multiple individual plots, each with their own x-axis, which are not aligned and with a lot of wasted space between the plots. I'd like to have all the plots together with a shared x-axis without any vertical spacing between each plot that I can then save as a single image.
The code I have been using so far is below.
comp_relflux = measurements.filter(like='rel_flux_C', axis=1)  # Extracts relevant columns from larger dataframe
comp_relflux = comp_relflux.reindex(comp_relflux.mean().sort_values().index, axis=1)  # Sorts into order based on column mean.
plt.rcParams["figure.figsize"] = [12.00, 1.00]
for column in comp_relflux.columns:
    plt.figure()
    sns.scatterplot(x=(bjd) % 1, y=comp_relflux[column], color='b', marker='.')
This is a screenshot of the resultant plots.
I have also tried using FacetGrid, but this just seems to plot the last column's data.
p = sns.FacetGrid(comp_relflux, height=2, aspect=6, despine=False)
p.map(sns.scatterplot, x=(bjd)%1, y=comp_relflux[column])
To have a single set of x-axis labels instead of one per row, you can use sharex. Also, by calling plt.subplots() with the number of rows set to the number of columns you have, you get one figure containing all the subplots. As no data was provided, I used random numbers below to demonstrate. There are 4 columns of data in my df, but I have kept as much of your code and naming convention as possible. Hope this is what you are looking for...
comp_relflux = pd.DataFrame(np.random.rand(100, 4))  # Random data - 4 columns
bjd = np.linspace(0, 1, 100)  # Series of 100 points - 0 to 1
rows = len(comp_relflux.columns)  # Number of columns = number of subplots
fig, ax = plt.subplots(rows, 1, sharex=True, figsize=(12, 6))  # sharex is set here; the figure size moves here from your rcParams
for i, column in enumerate(comp_relflux.columns):
    sns.scatterplot(x=(bjd) % 1, y=comp_relflux[column], color='b', marker='.', ax=ax[i])
1 output plot with 4 subplots
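
Two details from the question are not covered by that answer: removing the vertical gap between rows, and the single-column case, where plt.subplots returns one axes object rather than an array, so ax[i] fails. A minimal sketch, assuming random data as before and using plain matplotlib ax.scatter in place of sns.scatterplot to keep the example free of the seaborn dependency; squeeze=False and hspace=0 are the two additions:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
comp_relflux = pd.DataFrame(rng.random((100, 4)))  # random data, 4 columns (assumption)
bjd = np.linspace(0, 1, 100)

rows = len(comp_relflux.columns)
# squeeze=False always returns a 2-D array of axes, even when rows == 1.
fig, axes = plt.subplots(rows, 1, sharex=True, squeeze=False, figsize=(12, 1.5 * rows))
# Remove the vertical gap between the stacked subplots.
fig.subplots_adjust(hspace=0)

for ax, column in zip(axes[:, 0], comp_relflux.columns):
    ax.scatter(bjd % 1, comp_relflux[column], color="b", marker=".")
    ax.set_ylabel(str(column))
```

The whole figure can then be written to a single image with fig.savefig.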

Plot multiple bar graph of multiple pandas dataframe columns generated in a for loop

I have a list of data frames, called listofdf. It contains 5 data frames, and for each data frame I am trying to draw a bar chart of 2 of its columns at a certain size. This should produce a total of 10 charts. I am trying to draw the 10 plots using a single for loop. I used subplots in the code below because I couldn't find any way to make separate plots, but separate plots would be ideal.
The columns all contain normal numerical data and I can plot them if I do,
listofdf[0]['col1'].plot(kind='bar', figsize=(20,5))
so the data should be fine.
Here is the code I am trying to use to iterate over all the data frames and the columns I want to display within them:
plotsperloop = 2
fig, ax = plt.subplots(nrows=len(listofdf)*plotsperloop, ncols=1)
for idx, df in enumerate(listofdf):
    idxcount = idx * plotsperloop
    ax[idxcount].plot(df['col1'],kind='bar', figsize=(20,5))
    ax[idxcount+1].plot(df['col2'],kind='bar', figsize=(20,5))
    #plt.show()
plt.show()
However, I am unable to select the kind as bar. I have tried adding the argument kind = 'bar' inside the plot method, but I keep getting an error AttributeError: 'Line2D' object has no property 'kind'.
If I don't include the kind and figsize arguments, I am able to display multiple line graphs in a column. But I need them to be bar graphs of a certain size.
I also tried,
for df in listofdf:
    df.plot(kind='bar', figsize=(20,5))
but I only get 1 plot instead of 10.
What is the proper way to print multiple dynamically generated data frames as plots like this?
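
The error arises because ax[...].plot is matplotlib's Axes.plot, which draws lines and has no kind argument; kind='bar' belongs to the pandas plotting method. One possible approach, sketched here under the assumption that each data frame has numeric 'col1' and 'col2' columns (the data is made up), is to create a new figure per chart and pass its axes to pandas:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Made-up stand-in for the real list of 5 data frames (assumption).
listofdf = [
    pd.DataFrame({"col1": rng.integers(1, 10, 6), "col2": rng.integers(1, 10, 6)})
    for _ in range(5)
]

figures = []
for idx, df in enumerate(listofdf):
    for col in ["col1", "col2"]:
        # One separate figure per chart; figsize is set where the figure is created.
        fig, ax = plt.subplots(figsize=(20, 5))
        # The pandas plot method understands kind='bar' and draws on the given axes.
        df[col].plot(kind="bar", ax=ax, title=f"dataframe {idx}: {col}")
        figures.append(fig)
```

To keep all ten charts in one figure instead, the same ax= argument works with the axes returned by plt.subplots(nrows=10, ncols=1).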

How to plot multiple lines in subplot using python and matplotlib

I've been following the solutions provided by Merge matplotlib subplots with shared x-axis. See solution 35. In each subplot, there is one line, but I would like to have multiple lines in each subplot. For example, the top plot has the price of IBM and a 30 day moving average. The bottom plot has a 180 day and 30 day variance.
To plot multiple lines in my other python programs I used (data).plot(figsize=(10, 7)) where data is a dataframe indexed by date, but in the author's solution he uses line0, = ax0.plot(x, y, color='r') to assign the data series (x,y) to the plot. In the case of multiple lines in solution 35, how does one assign a dataframe with multiple columns to the plot?
You'll need to use (data).plot(ax=ax0) to work with pandas plotting.
For the legend you can use:
handles0, labels0 = ax0.get_legend_handles_labels()
handles1, labels1 = ax1.get_legend_handles_labels()
ax0.legend(handles=handles0 + handles1, labels=labels0 + labels1)
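
Put together, a minimal sketch with two stacked subplots sharing an x-axis, each drawing several dataframe columns. The random-walk prices and the column names (price, ma30, var30, var180) are made up for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
dates = pd.date_range("2020-01-01", periods=400, freq="D")
price = pd.Series(100 + rng.normal(0, 1, dates.size).cumsum(), index=dates)

# Two dataframes indexed by date, each with two columns (made-up data).
top = pd.DataFrame({"price": price, "ma30": price.rolling(30).mean()})
bottom = pd.DataFrame({"var30": price.rolling(30).var(), "var180": price.rolling(180).var()})

fig, (ax0, ax1) = plt.subplots(2, 1, sharex=True, figsize=(10, 7))
# Each call draws every column of the dataframe as its own line on the given axes.
top.plot(ax=ax0)
bottom.plot(ax=ax1)
fig.subplots_adjust(hspace=0)  # stack the two panels without a gap
```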

Show first and last label in pandas plot

I have a DataFrame with 361 columns. I want to plot it but showing only the first and last columns in the legend. For instance:
d = {'col1':[1,2],'col2':[3,4],'col3':[5,6],'col4':[7,8]}
df = pd.DataFrame(data=d)
If I plot through df.plot() all the legends will be displayed, but I only want 'col1' and 'col4' in my legend with the proper color code (I am using a colormap) and legend title.
One way to do this is to plot each column separately through matplotlib without using legends and then plot two more empty plots with only the labels (example below), but I wonder if there is a direct way to do it with pandas.
for columns in df:
    plt.plot(df[columns])
plt.plot([],[],label=df.columns[0])
plt.plot([],[],label=df.columns[-1])
plt.legend()
plt.show()
Let's try extracting the handles and labels from the axes and defining a new legend:
ax = df.plot()
handles, labels = ax.get_legend_handles_labels()
new_handles, new_labels = [], []
for h, l in zip(handles, labels):
    if l in ['col1', 'col4']:
        new_handles.append(h)
        new_labels.append(l)
ax.legend(new_handles, new_labels)
You can try splitting your df into two dfs, with the second one containing only the columns of interest, and then plot both dfs while showing a legend only for the second.
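
A more compact variation on the handles-and-labels idea, assuming the lines are drawn in column order: plot without a legend, then build one from the first and last line objects. The viridis colormap and the legend title are placeholders:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt
import pandas as pd

d = {'col1': [1, 2], 'col2': [3, 4], 'col3': [5, 6], 'col4': [7, 8]}
df = pd.DataFrame(data=d)

# Draw every column but suppress the automatic legend.
ax = df.plot(legend=False, colormap="viridis")
lines = ax.get_lines()  # one Line2D per column, in column order
# Legend built from only the first and last lines, so its colors match the plot.
legend = ax.legend(
    [lines[0], lines[-1]],
    [df.columns[0], df.columns[-1]],
    title="Columns",
)
```

Because the legend reuses the plotted line objects, this should scale to the 361-column case without naming the columns by hand.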
