Pandas line plot suppresses half of the xticks, how to stop it? - python

I am trying to make a line plot in which every one of the elements from the index appears as an xtick.
import pandas as pd
ind = ['16-12', '17-01', '17-02', '17-03', '17-04',
'17-05','17-06', '17-07', '17-08', '17-09', '17-10', '17-11']
data = [1,3,5,2,3,6,4,7,8,5,3,8]
df = pd.DataFrame(data,index=ind)
df.plot(kind='line',x_compat=True)
However, the resulting plot skips every second element of the index.
My call to plot includes the x_compat=True parameter, which the pandas documentation suggests should stop the automatic tick configuration, but it seems to have no effect.

You need to set a ticker locator on the axis and then pass that axis when plotting.
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
ind = ['16-12', '17-01', '17-02', '17-03', '17-04',
'17-05','17-06', '17-07', '17-08', '17-09', '17-10', '17-11']
data = [1,3,5,2,3,6,4,7,8,5,3,8]
df = pd.DataFrame(data,index=ind)
ax2 = plt.axes()
ax2.xaxis.set_major_locator(ticker.MultipleLocator(1))
df.plot(kind='line', ax=ax2)
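As an alternative sketch (not part of the original answer), the tick positions and labels can also be set explicitly after plotting. This assumes pandas draws a string index at integer positions 0..n-1, which is how it handles non-numeric indexes; the column name 'value' is only illustrative.

```python
import pandas as pd
import matplotlib.pyplot as plt

ind = ['16-12', '17-01', '17-02', '17-03', '17-04', '17-05',
       '17-06', '17-07', '17-08', '17-09', '17-10', '17-11']
data = [1, 3, 5, 2, 3, 6, 4, 7, 8, 5, 3, 8]
df = pd.DataFrame(data, index=ind, columns=['value'])  # 'value' is an illustrative name

fig, ax = plt.subplots()
df.plot(kind='line', ax=ax)

# Assumption: the string index is drawn at positions 0..n-1,
# so one tick per position gives one label per index entry.
positions = list(range(len(df.index)))
ax.set_xticks(positions)
ax.set_xticklabels(list(df.index), rotation=45)
```

Call plt.show() or fig.savefig(...) afterwards to view the result.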

Related

Reproduce simple pandas plot

I have a situation with my data. I like the behaviour of .plot() on a data frame, but sometimes it doesn't work because the frequency of the time index is not an integer.
Reproducing the plot in matplotlib works, but the result is ugly.
The part that bothers me the most is the x-axis settings: the tick frequency and the limits. Is there an easy way to reproduce the pandas behaviour in matplotlib?
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Create Data
f = lambda x: np.sin(0.1*x) + 0.1*np.random.randn(1,x.shape[0])
x = np.arange(0,217,0.001)
y = f(x)
# Create DataFrame
data = pd.DataFrame(y.transpose(), columns=['dp'], index=None)
data['t'] = pd.date_range('2021-01-01 14:32:09', periods=len(data['dp']),freq='ms')
data.set_index('t', inplace=True)
# Pandas plot()
data.plot()
# Matplotlib plot (ugly x-axis)
plt.plot(data.index,data['dp'])
EDIT: Basically, what I want to achieve is similar spacing of the xtick labels and the tight margins around the values. I can handle the legend and axis titles myself.
Pandas output
Matplotlib output
Thanks
You can use some matplotlib date utilities:
Figure.autofmt_xdate() to unrotate and center the date labels
Axis.set_major_locator() to change the interval to 1 min
Axis.set_major_formatter() to reformat as %H:%M
import matplotlib.dates as mdates
fig, ax = plt.subplots()
ax.plot(data.index, data['dp'])
fig.autofmt_xdate(rotation=0, ha='center')
ax.xaxis.set_major_locator(mdates.MinuteLocator(interval=1))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
# uncomment to remove the first `xtick`
# ax.set_xticks(ax.get_xticks()[1:])
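For the tight limits mentioned in the question, one possible approach is Axes.margins(x=0), combined with matplotlib's automatic date locator and concise formatter. This is only a sketch on a smaller synthetic series (the series s and its 100 ms frequency are assumptions made for illustration), and the labels will not be identical to what pandas produces.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Synthetic stand-in for the question's data (assumed, smaller for speed)
rng = np.random.default_rng(0)
t = pd.date_range('2021-01-01 14:32:09', periods=2000, freq='100ms')
s = pd.Series(np.sin(0.01 * np.arange(len(t))) + 0.1 * rng.standard_normal(len(t)),
              index=t)

fig, ax = plt.subplots()
ax.plot(s.index, s.values)

# Remove the horizontal padding so the line touches both edges of the axes
ax.margins(x=0)

# Let matplotlib choose the tick spacing and use compact date labels
locator = mdates.AutoDateLocator()
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(mdates.ConciseDateFormatter(locator))
```

If a fixed interval is preferred, the MinuteLocator from the answer above can be swapped in for the automatic locator.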

Can we create scatter plot with a single data line

I have sample data in dataframe as below
Header=['Date','EmpCount','DeptCount']
2009-01-01,100,200
print(df)
Date EmpCount DeptCount
0 2009-01-01 100 200
Can we generate a scatter plot (or any line chart, etc.) with only this one record?
I tried multiple approaches but I am getting
TypeError: no numeric data to plot
X axis: dates
Y axis: two dots, one for EmpCount and the other for DeptCount
Starting from the-cauchy-criterion's answer, try this:
import pandas as pd
import matplotlib.pyplot as plt
header=['Date','EmpCount','DeptCount']
df = pd.DataFrame([['2009-01-01',100,200]],columns=header)
b=df.set_index('Date')
# plt.plot returns a list of Line2D objects (one per column), not an Axes
lines = plt.plot(b, linewidth=3, markersize=10, marker='.')
plt.show()
What are you using to plot the scatter plot?
Here's how to do it with pyplot.
import pandas as pd
import matplotlib.pyplot as plt
header=['Date','EmpCount','DeptCount']
df = pd.DataFrame([['2009-01-01',100,200]],columns=header)
plt.scatter(*df.iloc[0][1:])
plt.show()
iloc[0] gets the first row, [1:] takes all the columns except the first, and the * operator unpacks them as arguments, so the call is equivalent to plt.scatter(100, 200).
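To get the layout described in the question (date on the X axis, one dot per count), a minimal sketch could draw one scatter series per column. Converting the Date column with pd.to_datetime is an assumption about how the data should be interpreted; it is not taken from either answer.

```python
import pandas as pd
import matplotlib.pyplot as plt

header = ['Date', 'EmpCount', 'DeptCount']
df = pd.DataFrame([['2009-01-01', 100, 200]], columns=header)

# Assumption: treat the Date column as real dates rather than strings
df['Date'] = pd.to_datetime(df['Date'])

fig, ax = plt.subplots()
# One scatter call per count column, so each gets its own colour and legend entry
for col in ['EmpCount', 'DeptCount']:
    ax.scatter(df['Date'], df[col], label=col)
ax.set_xlabel('Date')
ax.legend()
```

Call plt.show() afterwards to display the two dots above the single date.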

Plot Multiple DataFrames into one single plot

I have two dataFrames that I would like to plot into a single graph. Here's a basic code:
#!/usr/bin/python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
scenarios = ['scen-1', 'scen-2']
for index, item in enumerate(scenarios):
    df = pd.DataFrame({'A' : np.random.randn(4)})
    print(df)
df.plot()
plt.ylabel('y-label')
plt.xlabel('x-label')
plt.title('Title')
plt.show()
However, this only plots the last dataFrame. If I use pd.concat() it plots one line with the combined values.
How can I plot two lines, one for the first dataFrame and one for the second one?
You need to put your plot call inside the for loop.
If you want them on a single plot, use plot's ax kwarg to draw them on the same axes. Here I have created fresh axes using subplots, but this could be an already populated axes:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
scenarios = ['scen-1', 'scen-2']
fig, ax = plt.subplots()
for index, item in enumerate(scenarios):
    df = pd.DataFrame({'A' : np.random.randn(4)})
    print(df)
    df.plot(ax=ax)
plt.ylabel('y-label')
plt.xlabel('x-label')
plt.title('Title')
plt.show()
The plot function is only called once, and as you say this is with the last value of df. Put df.plot() inside the loop.
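With the loop above, both lines end up with the same legend label, 'A'. One way to tell them apart, sketched here as a suggestion rather than as part of the original answers, is to rename the column to the scenario name before plotting.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

scenarios = ['scen-1', 'scen-2']
rng = np.random.default_rng(0)

fig, ax = plt.subplots()
for item in scenarios:
    df = pd.DataFrame({'A': rng.standard_normal(4)})
    # Rename the column so the legend shows the scenario name instead of 'A'
    df.rename(columns={'A': item}).plot(ax=ax)

ax.set_xlabel('x-label')
ax.set_ylabel('y-label')
ax.set_title('Title')
```

Call plt.show() once, after the loop, to display both lines on the same axes.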

How to plot a Python Dataframe with category values like this picture?

How can I achieve that using matplotlib?
Here is my code with the data you provided. Since there are no classes (the values are all different, although the first example in your question does have classes), I assigned colors based on the numbers. You can take it from here toward whatever result you want. You just need pandas, seaborn and matplotlib:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# import xls
df=pd.read_excel('data.xlsx')
# exclude Ranking values
df1 = df.iloc[:, 1:-1]  # .ix was removed in pandas 1.0; .iloc selects by position
# for each element it takes the value of the xls cell
df2=df1.applymap(lambda x: float(x.split('\n')[1]))
# now plot it
df_heatmap = df2
fig, ax = plt.subplots(figsize=(15,15))
sns.heatmap(df_heatmap, square=True, ax=ax, annot=True, fmt="1.3f")
plt.yticks(rotation=0, fontsize=16)
plt.xticks(fontsize=12)
plt.tight_layout()
plt.savefig('dfcolorgraph.png')
Which produces the following picture.
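The spreadsheet itself is not shown, so the following sketch uses a small hypothetical frame in which every cell holds a label and a number separated by a newline, which is what the lambda above appears to expect. It uses plain matplotlib imshow instead of seaborn to stay self-contained; the column names and values are made up.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for the spreadsheet: each cell is "label\nvalue"
raw = pd.DataFrame({
    'Q1': ['alpha\n0.50', 'beta\n0.25'],
    'Q2': ['gamma\n0.75', 'delta\n1.00'],
})

# Keep the text after the newline in every cell and convert it to float
values = raw.apply(lambda col: col.str.split('\n').str[1].astype(float))

fig, ax = plt.subplots()
image = ax.imshow(values.to_numpy(), cmap='viridis')
fig.colorbar(image, ax=ax)

# Label the cells with the column names and row index of the frame
ax.set_xticks(list(range(values.shape[1])))
ax.set_xticklabels(list(values.columns))
ax.set_yticks(list(range(values.shape[0])))
ax.set_yticklabels(list(values.index))
```

The numeric frame values can equally be passed to sns.heatmap as in the answer above.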

Python Pandas Matplotlib Plot Colored by type value defined in single column

I have data of the following format:
import pandas as ps
table={'time':[1,2,3,4,5,1,2,3,4,5,1,2,3,4,5],\
'data':[1,1,2,2,2,1,2,3,4,5,1,2,2,2,3],\
'type':['a','a','a','a','a','b','b','b','b','b','c','c','c','c','c']}
df=ps.DataFrame(table,columns=['time','data','type'])
I would like to plot data as a function of time connected as a line, but I would like each line to be a separate color for unique types. In this example, the result would be three lines: a data(time) line for each type a, b, and, c. Any guidance is appreciated.
I have been unable to produce a line with this data: pandas.scatter will produce a plot, while pandas.plot will not. I have been experimenting with loops to produce a plot for each type, but I have not found a straightforward way to do this. My data typically has an unknown number of unique 'type' values. Do pandas and/or matplotlib have a way to create this type of plot?
Pandas plotting capabilities will allow you to do this if everything is indexed properly. However, sometimes it's easier to just use matplotlib directly:
import pandas as pd
import matplotlib.pyplot as plt
table={'time':[1,2,3,4,5,1,2,3,4,5,1,2,3,4,5],
'data':[1,1,2,2,2,1,2,3,4,5,1,2,2,2,3],
'type':['a','a','a','a','a','b','b','b','b','b','c','c','c','c','c']}
df=pd.DataFrame(table, columns=['time','data','type'])
groups = df.groupby('type')
fig, ax = plt.subplots()
for name, group in groups:
    ax.plot(group['time'], group['data'], label=name)
ax.legend(loc='best')
plt.show()
If you'd prefer to use the pandas plotting wrapper, you'll need to override the legend labels:
import pandas as pd
import matplotlib.pyplot as plt
table={'time':[1,2,3,4,5,1,2,3,4,5,1,2,3,4,5],
'data':[1,1,2,2,2,1,2,3,4,5,1,2,2,2,3],
'type':['a','a','a','a','a','b','b','b','b','b','c','c','c','c','c']}
df=pd.DataFrame(table, columns=['time','data','type'])
df.index = df['time']
groups = df[['data', 'type']].groupby('type')
fig, ax = plt.subplots()
groups.plot(ax=ax, legend=False)
names = [item[0] for item in groups]
ax.legend(ax.lines, names, loc='best')
plt.show()
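Another option, offered here as a sketch rather than as part of the original answers, is to reshape the data so that each type becomes its own column; the pandas wrapper then draws one labelled line per column. This assumes every (time, type) pair occurs at most once, since pivot raises an error on duplicates.

```python
import pandas as pd
import matplotlib.pyplot as plt

table = {'time': [1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
         'data': [1, 1, 2, 2, 2, 1, 2, 3, 4, 5, 1, 2, 2, 2, 3],
         'type': ['a'] * 5 + ['b'] * 5 + ['c'] * 5}
df = pd.DataFrame(table, columns=['time', 'data', 'type'])

# One column per type, indexed by time (assumes unique time/type pairs)
wide = df.pivot(index='time', columns='type', values='data')

fig, ax = plt.subplots()
wide.plot(ax=ax)  # one line per column, with the type as legend label
```

This works for any number of unique types without an explicit loop.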
Just to throw in the seaborn solution.
import seaborn as sns
import matplotlib.pyplot as plt
g = sns.FacetGrid(df, hue="type", height=5)  # 'size' was renamed to 'height' in seaborn 0.9
g.map(plt.plot, "time", "data")
g.add_legend()
