I have a data file which consists of 131 columns and 4 rows. I am plotting it in Python as follows:
df = pd.read_csv('data.csv')
df.plot(figsize = (15,10))
Once it is plotted, all 131 legend entries are stacked in a single column, like a huge tower over the line plots.
Here is the plot I get (linked image; I have clipped the legend after v82 for better understanding).
I have found some solutions on Stack Overflow (SO) for moving the legend anywhere in the plot, but I could not find any solution for breaking this legend tower into several smaller columns placed side by side.
Moreover, I want my plot to look something like this (image of my desired plot).
Any help would be appreciated. Thank you.
You can specify the position of the legend in relative (axes) coordinates using loc, and use the ncol parameter to split the single legend column into multiple columns. To do so, you need the Axes object returned by df.plot:
import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv('data.csv')
ax = df.plot(figsize=(10, 7))
ax.legend(loc=(1.01, 0.01), ncol=4)
plt.tight_layout()
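A minimal, self-contained sketch of the same idea, assuming synthetic data in place of data.csv (the column names v1 to v131, the random values, and the choice of 6 legend columns are made up for illustration):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for an interactive window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for data.csv: 4 rows, 131 columns named v1..v131
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((4, 131)),
                  columns=[f"v{i}" for i in range(1, 132)])

ax = df.plot(figsize=(14, 7))

# Anchor the legend's lower-left corner just right of the axes (axes coordinates)
# and spread its entries over several columns instead of one tall column.
legend = ax.legend(loc=(1.01, 0.01), ncol=6, fontsize="small")

# Ask matplotlib to rearrange the layout so the outside legend has room
plt.tight_layout()
```

Increasing ncol makes the legend wider and shorter, so the value probably needs tuning against the figure size and font size.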
I've been following the solutions provided by Merge matplotlib subplots with shared x-axis. See solution 35. In each subplot, there is one line, but I would like to have multiple lines in each subplot. For example, the top plot has the price of IBM and a 30 day moving average. The bottom plot has a 180 day and 30 day variance.
To plot multiple lines in my other python programs I used (data).plot(figsize=(10, 7)) where data is a dataframe indexed by date, but in the author's solution he uses line0, = ax0.plot(x, y, color='r') to assign the data series (x,y) to the plot. In the case of multiple lines in solution 35, how does one assign a dataframe with multiple columns to the plot?
You'll need to use (data).plot(ax=ax0) to work with pandas plotting.
For the legend you can use:
handles0, labels0 = ax0.get_legend_handles_labels()
handles1, labels1 = ax1.get_legend_handles_labels()
ax0.legend(handles=handles0 + handles1, labels=labels0 + labels1)
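Put together, this could look like the sketch below. It assumes a made-up random-walk price series in place of the IBM data, and the column names are hypothetical; only the ax= argument and the combined legend come from the answer above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for an interactive window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical price series indexed by date (random walk, illustration only)
rng = np.random.default_rng(1)
index = pd.date_range("2020-01-01", periods=400, freq="D")
price = pd.Series(100 + rng.standard_normal(400).cumsum(), index=index)

top = pd.DataFrame({"price": price,
                    "30d mean": price.rolling(30).mean()})
bottom = pd.DataFrame({"30d var": price.rolling(30).var(),
                       "180d var": price.rolling(180).var()})

fig, (ax0, ax1) = plt.subplots(2, 1, sharex=True, figsize=(10, 7))

# Passing ax=... makes pandas draw every column as a line on that subplot
top.plot(ax=ax0, legend=False)
bottom.plot(ax=ax1, legend=False)

# Collect the handles from both subplots into a single legend on the top axes
handles0, labels0 = ax0.get_legend_handles_labels()
handles1, labels1 = ax1.get_legend_handles_labels()
ax0.legend(handles=handles0 + handles1, labels=labels0 + labels1)
```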
I have a 1:1 plot in which the dot colours differ based on a condition (A-F), which comes from a single data frame column.
df is a data frame with data for every 1 min. df60 is a data frame with data for every 1 hour.
plt.figure()
colors = {'A':'green', 'B':'aqua', 'C':'blue','D':'black','E':'yellow','F':'red'}
x = df['Method1'].loc['2020-01-01 00:00':'2020-01-15 23:59'].resample('h').mean()
y = df['Method2'].loc['2020-01-01 00:00':'2020-01-15 23:59'].resample('h').mean()
plt.scatter(x, y, c=df60['Method1'].loc['2020-01-01 00:00':'2020-01-15 23:59'].map(colors))
plt.show()
I have tried to add a legend showing which colour corresponds to each of A-F. However, since the data comes from a single column, the legend does not show what I expect. Is there a way to show the legend properly without splitting the column into several columns?
You can define the legend manually by, for instance:
from matplotlib.lines import Line2D
handles = [Line2D([0], [0], label=k, marker="o", markerfacecolor=v, markeredgecolor=v, linestyle="None") for k, v in colors.items()]
plt.legend(handles=handles)
This should produce:
I hope this helps. Not really sure if there is a more elegant solution, though...
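For reference, here is a self-contained sketch of the manual legend. It assumes a hypothetical hourly data frame with a separate "condition" column holding the letters A-F; the question's real data and column layout are not shown, so these names and values are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for an interactive window
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D
import numpy as np
import pandas as pd

colors = {'A': 'green', 'B': 'aqua', 'C': 'blue',
          'D': 'black', 'E': 'yellow', 'F': 'red'}

# Hypothetical hourly data: two methods plus a condition column (A-F)
rng = np.random.default_rng(2)
n = 120
df60 = pd.DataFrame({
    "Method1": rng.random(n),
    "Method2": rng.random(n),
    "condition": rng.choice(list(colors), size=n),
})

fig, ax = plt.subplots()
ax.scatter(df60["Method1"], df60["Method2"], c=df60["condition"].map(colors))

# One proxy marker per category; these exist only to populate the legend
handles = [Line2D([0], [0], label=k, marker="o", markerfacecolor=v,
                  markeredgecolor=v, linestyle="None")
           for k, v in colors.items()]
legend = ax.legend(handles=handles, title="condition")
```

The proxy handles are never added to the axes, so they do not affect the plotted data.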
I am unable to get the following plot to align properly along the x-axis. Specifically, I want to plot a horizontal line representing the last value in the dataframe on top of a boxplot that describes the full sample. Here is the code; for now I have commented out the line that would plot the boxplot:
index = pd.date_range('1/1/2018', '2/1/2018')
data = pd.DataFrame({'a':np.random.randn(32)}, index=index)
fig, ax = plt.subplots(figsize=(6,3))
ax.hlines(data.iloc[-1],xmin=pd.RangeIndex(stop=len(list(data.columns)))+.15,xmax=pd.RangeIndex(stop=len(list(data.columns)))+.85,
**{'linewidth':1.5})
# ax.boxplot(data.values)
ax.set_xticks(pd.RangeIndex(stop=len(list(data.columns)))+0.5)
ax.set_xticklabels(list(data.columns), rotation=0)
ax.tick_params(axis='x',length=5, bottom=True)
Here is the output from the above (so far so good):
If I uncomment that line, the code produces this, which is misaligned:
Any tips for how to get them to line up?
Apparently you have a very clear idea that the boxplot should be positioned at x=0.5, but you forgot to tell the boxplot about that:
ax.boxplot(data.values, positions=[0.5])
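A simplified, runnable sketch of the aligned plot follows. It assumes a single column 'a' and hard-codes the x positions (0.15, 0.5, 0.85) that the question computes from a RangeIndex; the widths value and the line colour are arbitrary choices.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for an interactive window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

index = pd.date_range('1/1/2018', '2/1/2018')
rng = np.random.default_rng(3)
data = pd.DataFrame({'a': rng.standard_normal(len(index))}, index=index)

fig, ax = plt.subplots(figsize=(6, 3))

# Box for the full sample, centred at x = 0.5 instead of the default x = 1
box = ax.boxplot(data['a'].to_numpy(), positions=[0.5], widths=0.5)

# Horizontal line for the last value, spanning x = 0.15 .. 0.85
ax.hlines(data['a'].iloc[-1], xmin=0.15, xmax=0.85, linewidth=1.5, color='C1')

# boxplot manages the x ticks itself, so set ticks and labels after calling it
ax.set_xticks([0.5])
ax.set_xticklabels(['a'])
ax.set_xlim(0, 1)
```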
I am working on getting some graphs generated for 4 columns, with the COLUMN_NM being the main index.
The issue I am facing is that the column names are shown along the bottom. This is problematic for two reasons. First, there could be dozens of these columns, so the graph would look messy and could stretch too far to the right. Second, the names are getting cut off (though I am sure that can be fixed).
I would prefer to have the column names listed vertically in the legend box where 'MAX_COL_LENGTH' currently resides, and to give each column's bar a different colour instead.
Any ideas how I would adjust this or suggestions to make this better?
for col in ['DISTINCT_COUNT', 'MAX_COL_LENGTH', 'MIN_COL_LENGTH', 'NULL_COUNT']:
    grid[['COLUMN_NM', col]].set_index('COLUMN_NM').plot.bar(title=col)
plt.show()
In this case you can plot the bars one by one and set a label for each bar:
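import matplotlib.pyplot as plt
from matplotlib import gridspec

gs = gridspec.GridSpec(1, 1)
fig = plt.figure(figsize=(5, 5))
ax = fig.add_subplot(gs[:, :])
data = [1, 2, 3, 4, 5]
label = ['l1', 'l2', 'l3', 'l4', 'l5']
# Draw each bar separately so that it gets its own colour and legend entry
for n, (p, l) in enumerate(zip(data, label)):
    ax.bar(n, p, label=l)
ax.set_xticklabels([])  # hide the tick labels; the legend identifies the bars
ax.legend()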
This is the output of the code above.
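Applied to the question's layout, the same approach might look like the sketch below. The grid data frame here is a hypothetical stand-in (the names and numbers are made up), and only two of the four metric columns are shown to keep it short.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for an interactive window
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical profiling table; the names and numbers are made up
grid = pd.DataFrame({
    "COLUMN_NM": ["id", "name", "email", "created_at"],
    "MAX_COL_LENGTH": [10, 64, 128, 19],
    "NULL_COUNT": [0, 3, 12, 0],
})

metrics = ["MAX_COL_LENGTH", "NULL_COUNT"]
fig, axes = plt.subplots(1, len(metrics), figsize=(10, 4))

for ax, col in zip(axes, metrics):
    # One ax.bar call per row: each call takes the next colour in the cycle
    # and contributes its own legend entry.
    for n, (name, value) in enumerate(zip(grid["COLUMN_NM"], grid[col])):
        ax.bar(n, value, label=name)
    ax.set_title(col)
    ax.set_xticks([])  # the legend replaces the x tick labels
    ax.legend(title="COLUMN_NM")
```

With dozens of columns the default colour cycle repeats after ten bars, so a larger colormap would probably be needed in that case.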
Let's look at a swarmplot made with Python 3.5 and Seaborn on some data (stored in a pandas dataframe df, with the column labels stored in another class; this does not matter for now, just look at the plot):
ax = sns.swarmplot(x=self.dte.label_temperature, y=self.dte.label_current, hue=self.dte.label_voltage, data = df)
Now the data is more readable if plotted in log scale on the y-axis because it goes over some decades.
So let's change the scaling to logarithmic:
ax.set_yscale("log")
ax.set_ylim(bottom = 5*10**-10)
However, I have a problem with the gaps in the swarms. I guess they are there because the dot positions were computed when the plot was created with a linear axis in mind, where the dots must not overlap. But now they look rather strange, and there is enough space to form 4 equal-looking swarms.
My question is: How can I force seaborn to recalculate the position of the dots to create better looking swarms?
mwaskom hinted to me in the comments how to solve this.
It is even stated in the swarmplot documentation:
Note that arranging the points properly requires an accurate transformation between data and point coordinates. This means that non-default axis limits should be set before drawing the swarm plot.
Set the axis to log scale first, then pass that axis to swarmplot:
fig = plt.figure() # create figure
rect = 0,0,1,1 # rectangle (left, bottom, width, height) for the new axis
log_ax = fig.add_axes(rect) # create a new axis (or use an existing one)
log_ax.set_yscale("log") # log first
sns.swarmplot(x=self.dte.label_temperature, y=self.dte.label_current, hue=self.dte.label_voltage, data = df, ax = log_ax)
This yields the correct and desired plotting behaviour.
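A self-contained version of the same ordering, as a sketch: the data frame and its column names (temperature, voltage, current) are hypothetical replacements for the labels stored in self.dte, and the axis limits are chosen to fit the synthetic values.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for an interactive window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical measurements: currents spanning several decades
rng = np.random.default_rng(4)
n = 120
df = pd.DataFrame({
    "temperature": rng.choice([20, 40, 60, 80], size=n),
    "voltage": rng.choice([1, 2, 5], size=n),
    "current": 1e-9 * 10 ** rng.uniform(0, 4, size=n),
})

fig, log_ax = plt.subplots()
log_ax.set_yscale("log")      # set the scale before drawing the swarm
log_ax.set_ylim(5e-10, 1e-4)  # likewise for non-default axis limits
sns.swarmplot(x="temperature", y="current", hue="voltage", data=df, ax=log_ax)
```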