I'm trying to plot multiple columns of a table in a single image.
The idea is to optimize the process with a loop.
It is important to note that all the columns share the same y-axis, and that the x scale varies for each column.
The Final result should look something like this:
I've already tried a few things without success. My code creates several figures and only plots in the first graph:
def facies_plot_all(logs):
    logs = sort_values(by='y')
    ztop=logs.Y.min(); zbot=logs.Y.max()
    for col in logs.columns:
        numcol = (logs.shape[1])
        f, ax = plt.subplots (nrows=1, ncols=numcol, figsize (20,25))
        ax[x+1].plot(logs[col],logs.Y,'-')
I'm relatively new to programming and still searching for a way to solve this issue.
Any help will be welcome!
Create the subplots once, outside the for loop:
logs = logs.sort_values(by='Y')  # sort_values is a DataFrame method; the column is accessed as logs.Y below
ztop = logs.Y.min(); zbot = logs.Y.max()
numcol = logs.shape[1]
f, axes = plt.subplots(nrows=1, ncols=numcol,
                       sharey=True,
                       figsize=(20, 25))
for ax, col in zip(axes, logs.columns):
    ax.plot(logs[col], logs.Y, '-')
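For reference, here is a minimal self-contained sketch of the same idea. The table contents and the curve names (GR, RHOB, NPHI) are made up for illustration, and the helper name plot_all_columns is hypothetical. It skips the Y column itself when looping and uses squeeze=False so the indexing also works when only one column is left to plot.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical table: a shared "Y" column plus three made-up curves.
rng = np.random.default_rng(0)
logs = pd.DataFrame({
    "Y": np.linspace(1000, 1100, 50),
    "GR": rng.uniform(0, 150, 50),
    "RHOB": rng.uniform(2.0, 2.8, 50),
    "NPHI": rng.uniform(0.0, 0.45, 50),
})

def plot_all_columns(logs, ycol="Y"):
    """Plot every column except `ycol` in its own panel, all sharing the y-axis."""
    logs = logs.sort_values(by=ycol)
    curves = [c for c in logs.columns if c != ycol]
    # One figure, one row of panels; squeeze=False keeps `axes` 2-D in all cases.
    fig, axes = plt.subplots(nrows=1, ncols=len(curves), sharey=True,
                             figsize=(3 * len(curves), 8), squeeze=False)
    for ax, col in zip(axes[0], curves):
        ax.plot(logs[col], logs[ycol], "-")
        ax.set_xlabel(col)  # each panel keeps its own x scale
    axes[0, 0].set_ylabel(ycol)
    return fig, axes[0]

fig, axes = plot_all_columns(logs)
```

Calling fig.savefig("all_columns.png") afterwards would write all panels to a single image.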
Related
I'm trying to generate a plot in seaborn using a for loop to plot the contents of each dataframe column on its own row.
The number of columns that need plotting can vary between 1 and 30. However, the loop creates multiple individual plots, each with their own x-axis, which are not aligned and with a lot of wasted space between the plots. I'd like to have all the plots together with a shared x-axis without any vertical spacing between each plot that I can then save as a single image.
The code I have been using so far is below.
comp_relflux = measurements.filter(like='rel_flux_C', axis=1) # Extracts relevant columns from larger dataframe
comp_relflux=comp_relflux.reindex(comp_relflux.mean().sort_values().index, axis=1) # Sorts into order based on column mean.
plt.rcParams["figure.figsize"] = [12.00, 1.00]
for column in comp_relflux.columns:
    plt.figure()
    sns.scatterplot((bjd)%1, comp_relflux[column], color='b', marker='.')
This is a screenshot of the resultant plots.
I have also tried using FacetGrid, but this just seems to plot the last column's data.
p = sns.FacetGrid(comp_relflux, height=2, aspect=6, despine=False)
p.map(sns.scatterplot, x=(bjd)%1, y=comp_relflux[column])
To have a single set of x-axis labels instead of one per row, you can use sharex. Also, by calling plt.subplots() with the number of rows set to the number of columns you have, you get just one figure with all the subplots inside it. As there is no data available, I used random numbers below to demonstrate. There are 4 columns of data in my df, but I have kept as much of your code and naming convention as possible. Hope this is what you are looking for...
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

comp_relflux = pd.DataFrame(np.random.rand(100, 4)) # Random data - 4 columns
bjd = np.linspace(0, 1, 100) # Series of 100 points - 0 to 1
rows = len(comp_relflux.columns) # Number of columns = number of subplots
fig, ax = plt.subplots(rows, 1, sharex=True, figsize=(12, 6)) # The subplots... sharex is assigned here and the size is moved here from your rcParams as well
for i, column in enumerate(comp_relflux.columns):
    sns.scatterplot(x=bjd % 1, y=comp_relflux[column], color='b', marker='.', ax=ax[i]) # x and y passed as keywords, which recent seaborn versions require
1 output plot with 4 subplots
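The question also asked for no vertical spacing between the rows. A minimal sketch of one way to do that is fig.subplots_adjust(hspace=0). Note that this sketch swaps seaborn for plain ax.scatter to keep it dependency-light, and the data is random stand-in data as above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
comp_relflux = pd.DataFrame(rng.random((100, 4)))  # stand-in data, 4 columns
bjd = np.linspace(0, 1, 100)

n_rows = len(comp_relflux.columns)
# squeeze=False keeps `axes` 2-D, so this also works when there is only 1 column.
fig, axes = plt.subplots(n_rows, 1, sharex=True, squeeze=False,
                         figsize=(12, 1.5 * n_rows))
for ax, column in zip(axes[:, 0], comp_relflux.columns):
    ax.scatter(bjd % 1, comp_relflux[column], color="b", marker=".")

fig.subplots_adjust(hspace=0)  # remove the vertical gap between stacked panels
```

Scaling the figure height with the number of rows is a design choice here, since the number of columns can vary between 1 and 30.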
I am trying to merge an arbitrary number of line charts into a single image, and while there are many, many questions about this sort of thing, none of them seem applicable to the code I'm working with.
Unlike a large number of answers, I don't want to have the separate graphs displayed side by side, or above one another, in a single output, but rather, combined together.
For all of these graphs the value of the "y_x" column would be the same, but the "yhat_y" produced during each loop would be different.
Adding subplots=True to the plot method of a dataframe seems to change the return type to something that is no longer compatible with the code: 'numpy.ndarray' object has no attribute 'get_figure'
#ax = plt.subplot(111) doesnt seem to do anything
for variable in range(max_num):
    forecast = get_forecast(variable)
    cmp1 = forecast.set_index("ds")[["yhat", "yhat_lower", "yhat_upper"]].join(
        both.set_index("ds")
    )
    e.augmented_error[variable]= sklearn.metrics.mean_absolute_error(
        cmp["y"].values, cmp1["yhat"].values
    )
    cmp2=cmp.merge(cmp1,on='ds')
    plot = cmp2[['y_x', 'yhat_y']].plot(title =e)
    fig1 = plot.get_figure()
    plot.set_title("prediction")
    plt.show()
    fig1.savefig('output.pdf', format="pdf")
    plt.close()
The most straightforward way would be to create a reusable ax handle outside the loop, then call ax.plot inside the loop:
fig, ax = plt.subplots() # create reusable `fig` and `ax` handles
for variable in range(max_num):
    ...
    ax.plot(cmp2['y_x'], cmp2['yhat_y']) # use `ax.plot(cmp2...)` instead of `cmp2.plot()`
ax.set_title('predictions')
fig.savefig('output.pdf', format='pdf')
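If you would rather keep the pandas plotting call, a possible variant is to pass the shared handle through the ax= argument of the pandas plot method. The sketch below uses made-up data in place of cmp2 (the real forecasts are not available here), draws the common "y_x" series once, and overlays one hypothetical "yhat_y" line per iteration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up stand-in for the shared "y_x" column, indexed by date.
ds = pd.date_range("2021-01-01", periods=30, freq="D")
actual = pd.Series(np.sin(np.arange(30) / 5.0), index=ds, name="y_x")

fig, ax = plt.subplots()  # one reusable figure and axes
actual.plot(ax=ax, color="k", label="y_x")  # identical in every iteration, so draw it once

for variable in range(3):  # stands in for range(max_num)
    # Hypothetical forecast; the real code would use cmp2["yhat_y"] here.
    yhat = (actual + 0.1 * (variable + 1)).rename(f"yhat_y ({variable})")
    yhat.plot(ax=ax)  # ax=ax draws onto the shared axes instead of a new figure

ax.set_title("predictions")
ax.legend()
```

Because every call targets the same ax, a single fig.savefig call after the loop would write all lines into one file.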
I'm struggling hard with subplot madness. I've made a bunch of bar charts, each of which summarizes a binary variable (usually stacked, but unstacked is ok if it's simpler), and I want to save them to one PDF in sequence. The charts are fine, but when I try fitting them into a grid of subplots I muck it up!
My problems are 1) I'm not iterating through the data properly, and 2) I can't seem to stack one column of charts--only works with 2+.
Sorry for such a lame question, but this is the closest I've gotten! Any suggestions?
df = pd.DataFrame(np.random.randint(0,2,size=(100, 12)), columns=list('ABCDEFGHIJKL')) #load data
key_vars = list('ABCDEFGH') #variables to plot
num_plots = len(key_vars) #number of subplots
fig, ax = plt.subplots(num_plots, 2, sharex='col', sharey='row') #create figure
for i in range(num_plots):
    for j in range(2):
        ax[i,j].barh(df[key_vars[i]].value_counts(),10) #create subplots
fig.savefig('binary_barcharts.pdf') #save to .pdf
Are you looking for something like this:
(df[key_vars].apply(pd.Series.value_counts)
    .T.plot.bar(stacked=True)
)
Output:
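On the second problem in the question (a single column of subplots not working): with one column, plt.subplots returns a 1-D array, so ax[i, j] indexing fails. A minimal sketch of one workaround, assuming you still want one subplot per variable, is to pass squeeze=False so the array stays two-dimensional. The data here is random, as in the question.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
df = pd.DataFrame(rng.integers(0, 2, size=(100, 12)), columns=list("ABCDEFGHIJKL"))
key_vars = list("ABCDEFGH")

# squeeze=False keeps `axes` two-dimensional even with a single column of subplots.
fig, axes = plt.subplots(len(key_vars), 1, sharex=True, squeeze=False,
                         figsize=(6, 1.2 * len(key_vars)))

for ax, var in zip(axes[:, 0], key_vars):
    # Count 0s and 1s; reindex so both values are present even if one is missing.
    counts = df[var].value_counts().reindex([0, 1], fill_value=0)
    left = 0
    for value, n in counts.items():
        ax.barh(var, n, left=left)  # stack the segments inside one horizontal bar
        left += n
```

A fig.savefig('binary_barcharts.pdf') call after the loop would then save the whole stack to one PDF page.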
I have been trying to merge these two plots together but have not found a built-in way to do so in the Matplotlib documentation. I want to show the two bar values next to each other and, for every new entry, add the new entry to the graph while shifting the other entries over to make space. The plots are below.
As stated prior, when I say merge, I do not simply mean just plop Plot A onto Plot B, but rather join the plots together so both bar values are shown in the same graph, like this:
The reasoning for this is that I will be able to log all the entries in a single plot without having to manually do so. By implementing something like this in my code, it would make entries go a lot quicker.
EDIT: I understand that I can graph these two together, but that is not what I am looking for. Once I get the necessary input, my program creates a graph of that data and saves it as a file. I am looking to append any new data to that original file by just shifting the original value over to the left in order to make space.
EDIT 2: How could I extract the data from each plot and after doing so, create a new graph? This would seem to be another acceptable workaround.
Is there anything preventing you from plotting each of them side by side but changing the index?
import matplotlib.pyplot as plt
a, b, c = 2, 5, 3
fig, ax = plt.subplots(1, 1)
count = 0
ax.bar(count, a)
# if the program produces a new output then...
count += 1
ax.bar(count, b) # new index means the new bar is shifted over
# again
count += 1
ax.bar(count, c) # shifted again
This should automatically expand the x-axis anyway. You may have to alter this slightly if you're particularly concerned about the width of these bars.
If this isn't what you wanted, you could consider replotting from the bar container, or even just reading the heights back out of it to reuse.
fig, ax = plt.subplots(1, 1)
count = 0
bar_cont = ax.bar(count, a) # reference to the bar container of interest
print([rect.get_height() for rect in bar_cont]) # a BarContainer holds Rectangle patches; read the height from each one
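To sketch the "extract the data and create a new graph" workaround from EDIT 2: assuming the bar container from the earlier plot is still available in the program, the heights can be read back, extended with the new entry, and drawn again. The labels below are hypothetical placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

labels = ["entry 1", "entry 2"]  # hypothetical labels for the existing bars
fig, ax = plt.subplots()
bars = ax.bar(labels, [2, 5])

# Recover the plotted values from the Rectangle patches in the container.
heights = [float(rect.get_height()) for rect in bars]

# Append the new entry and redraw everything on a fresh axes.
labels.append("entry 3")
heights.append(3.0)
fig2, ax2 = plt.subplots()
ax2.bar(labels, heights)
```

This only works while the Figure (or the underlying values) is still in memory, so keeping the raw values alongside the saved file is probably the simpler design.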
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_axes([0,0,1,1])
people = ['JOHN DOE', 'BOB SMITH']
values = [14,14]
ax.bar(people,values)
plt.show()
This should be the solution. You just have to pass lists instead of single values to the bar() function. More detailed explanation here.
I am working on getting some graphs generated for 4 columns, with the COLUMN_NM being the main index.
The issue I am facing is that the column names are showing along the bottom. This is problematic for 2 reasons: first, there could be dozens of these columns, so the graph would look messy and could stretch too far to the right; second, they are getting cut off (though I am sure that can be fixed).
I would prefer to have the column names listed vertically in the box where 'MAX_COL_LENGTH' currently resides, and have the bars in different colors per column instead.
Any ideas how I would adjust this or suggestions to make this better?
for col in ['DISTINCT_COUNT', 'MAX_COL_LENGTH', 'MIN_COL_LENGTH', 'NULL_COUNT']:
    grid[['COLUMN_NM', col]].set_index('COLUMN_NM').plot.bar(title=col)
    plt.show()
In this case you can plot the bars one by one and set a label for each of them:
import matplotlib.pyplot as plt
from matplotlib import gridspec
gs = gridspec.GridSpec(1,1)
fig = plt.figure(figsize=(5, 5))
ax = fig.add_subplot(gs[:, :])
data = [1,2,3,4,5]
label = ['l1','l2','l3','l4','l5']
for n,(p,l) in enumerate(zip(data,label)):
    ax.bar(n,p,label=l) # each call gets the next color in the cycle and its own legend entry
ax.set_xticklabels([])
ax.legend()
This is the output for the code above:
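Applied to the table from the question, the same approach might look like the sketch below. The values in grid are invented for illustration (only the column names come from the question). Each COLUMN_NM entry gets its own color and a legend entry instead of a tick label.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical profile table; only the column names come from the question.
grid = pd.DataFrame({
    "COLUMN_NM": ["ID", "NAME", "CITY"],
    "DISTINCT_COUNT": [100, 87, 12],
    "MAX_COL_LENGTH": [6, 40, 25],
    "MIN_COL_LENGTH": [1, 3, 4],
    "NULL_COUNT": [0, 2, 9],
})
metrics = ["DISTINCT_COUNT", "MAX_COL_LENGTH", "MIN_COL_LENGTH", "NULL_COUNT"]

fig, axes = plt.subplots(1, len(metrics), figsize=(4 * len(metrics), 4), squeeze=False)
for ax, metric in zip(axes[0], metrics):
    for n, (name, value) in enumerate(zip(grid["COLUMN_NM"], grid[metric])):
        # One bar per row, colored from the default cycle and labeled for the legend.
        ax.bar(n, value, label=name, color=f"C{n % 10}")
    ax.set_title(metric)
    ax.set_xticks([])  # names live in the legend, so drop the x ticks

axes[0, -1].legend(title="COLUMN_NM")  # colors match across panels, so one legend is enough
```

With dozens of columns the legend itself gets long, so passing ncol= to legend() may be worth trying.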