This question already has answers here:
How to plot different groups of data from a dataframe into a single figure
(5 answers)
plot multiple pandas dataframes in one graph
(3 answers)
Closed 25 days ago.
I want to superimpose two graphs that share the same x-axis. The first covers the full range, while the second covers only a sub-interval.
test1 = pd.DataFrame(
{
'x': [1,2,3,4,5,6,7,8,9],
'y': [0,1,1,2,1,2,1,1,1]
}
)
test2 = pd.DataFrame(
{
'x': [1,2,4,5,8],
'y': [3,2,2,3,3]
}
)
You can use the xlim() function in matplotlib.
Example:
import matplotlib.pyplot as plt
import pandas as pd
test1 = pd.DataFrame(
{
'x': [1,2,3,4,5,6,7,8,9],
'y': [0,1,1,2,1,2,1,1,1]
}
)
test2 = pd.DataFrame(
{
'x': [1,2,4,5,8],
'y': [3,2,2,3,3]
}
)
plt.plot(test1['x'], test1['y'], 'b-', label='test1')
plt.plot(test2['x'], test2['y'], 'r-', label='test2')
plt.xlim(min(test1['x']), max(test1['x']))
plt.legend()
plt.show()
Result: https://i.stack.imgur.com/bz41W.png
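An alternative, in the spirit of the linked duplicates, is to let pandas draw both frames onto the same axes. This is a minimal sketch rather than part of the answer above; it assumes the two frames from the question and relies on the ax keyword of DataFrame.plot, which reuses an existing axes instead of creating a new figure.

```python
import pandas as pd
import matplotlib.pyplot as plt

test1 = pd.DataFrame({"x": [1, 2, 3, 4, 5, 6, 7, 8, 9],
                      "y": [0, 1, 1, 2, 1, 2, 1, 1, 1]})
test2 = pd.DataFrame({"x": [1, 2, 4, 5, 8],
                      "y": [3, 2, 2, 3, 3]})

# The first call creates the axes; the second draws onto it via ax=.
ax = test1.plot(x="x", y="y", color="blue", label="test1")
test2.plot(x="x", y="y", color="red", label="test2", ax=ax)

# Keep the full range of the first frame on the x-axis.
ax.set_xlim(test1["x"].min(), test1["x"].max())
```

Call plt.show() afterwards to display the figure when running this as a script.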
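IIUC, you want to add the "y" values of the second frame to the "y" values of the first wherever their "x" values match:
plt.plot(test1["x"], test1["y"].add(test1["x"].map(test2.set_index("x")["y"]), fill_value=0))
This draws a single curve in which test1's "y" is raised by test2's "y" at every shared "x"; points with no match in test2 keep their original value.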
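To see what the map-and-add step computes, here is a small sketch that separates it into two stages and prints the resulting values. The variable names aligned and combined are illustrative only.

```python
import pandas as pd

test1 = pd.DataFrame({"x": [1, 2, 3, 4, 5, 6, 7, 8, 9],
                      "y": [0, 1, 1, 2, 1, 2, 1, 1, 1]})
test2 = pd.DataFrame({"x": [1, 2, 4, 5, 8],
                      "y": [3, 2, 2, 3, 3]})

# Look up test2's y for every x in test1; x values absent from test2 become NaN.
aligned = test1["x"].map(test2.set_index("x")["y"])

# fill_value=0 treats those NaN entries as 0, so unmatched rows keep test1's y.
combined = test1["y"].add(aligned, fill_value=0)

print(combined.tolist())
# [3.0, 3.0, 1.0, 4.0, 4.0, 2.0, 1.0, 4.0, 1.0]
```

Without fill_value=0 the unmatched positions would be NaN and the plotted line would have gaps.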
This question already has answers here:
How to plot multiple pandas columns
(3 answers)
Plot multiple columns of pandas DataFrame using Seaborn
(2 answers)
Closed 5 months ago.
My dataframe looks like the following:
df = pd.DataFrame(
{'id': [543476, 539345, 536068, 537710, 538255],
'true_distance': [22836.49,7920.67,720.39,1475.87,35212.81],
'simulated_distance': [19670.69,7811.64,386.67,568.95,24720.94]}
)
df
id true_distance simulated_distance
0 543476 22836.49 19670.69
1 539345 7920.67 7811.64
2 536068 720.39 386.67
3 537710 1475.87 568.95
4 538255 35212.81 24720.94
I need to compare the true distance and simulated distance in a single cdf plot.
EDIT
I want the cdf of true_distance and simulated_distance in one figure (identified by legend).
import pandas as pd
import seaborn as sns
#create a long format of your df
df_long = df.melt(id_vars=["id"],
value_vars= ["true_distance",
"simulated_distance"],
var_name="Variable",
value_name= "Distance",
ignore_index=True,
)
# create lineplot filtered by your two Variables
sns.lineplot(data = df_long,
y = "Distance",
x = "id",
hue = "Variable",
linewidth = 2,
)
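Note that a line plot of Distance against id shows the raw values, not a cumulative distribution. If a CDF is what is needed, one option is to compute the empirical CDF by hand and draw it as a step curve. The sketch below is not part of the answer above; the helper name ecdf is hypothetical, and the data is the frame from the question.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame(
    {"id": [543476, 539345, 536068, 537710, 538255],
     "true_distance": [22836.49, 7920.67, 720.39, 1475.87, 35212.81],
     "simulated_distance": [19670.69, 7811.64, 386.67, 568.95, 24720.94]}
)

def ecdf(values):
    """Return the sorted values and the fraction of observations <= each one."""
    xs = np.sort(np.asarray(values, dtype=float))
    ps = np.arange(1, len(xs) + 1) / len(xs)
    return xs, ps

fig, ax = plt.subplots()
for column in ["true_distance", "simulated_distance"]:
    xs, ps = ecdf(df[column])
    # where="post" keeps each step flat until the next observation.
    ax.step(xs, ps, where="post", label=column)

ax.set_xlabel("Distance")
ax.set_ylabel("Cumulative proportion")
ax.legend()
```

With seaborn 0.11 or later, sns.ecdfplot(data=df_long, x="Distance", hue="Variable") should draw the same kind of curve directly from the long-format frame.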
This question already has answers here:
Passing datetime-like object to seaborn.lmplot
(2 answers)
format x-axis (dates) in sns.lmplot()
(1 answer)
How to plot int to datetime on x axis using seaborn?
(1 answer)
Closed 10 months ago.
I would really appreciate it if you could point me in the right direction. I have been trying for 3 days and still can't find a solution. I need to draw a chart that looks like the one in the first picture, with the dates displayed on the x-axis as in the second chart. I am a complete beginner with seaborn and Python. I first used lineplot, which met only one criterion: it displays the dates on the x-axis. But the lines are sharp, as in the second picture, rather than smooth, as in the first. Then I kept digging and found lmplot. With that I got the smoothed chart I wanted, but when I tried to display the dates on the x-axis I got the error could not convert string to float: '2022-07-27T13:31:00Z'.
Here is the lmplot code, which gives the plot design I want but cannot display dates on the x-axis:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
T = np.array([ "2022-07-27T13:31:00Z",
"2022-08-28T13:31:00Z",
"2022-09-29T13:31:00Z",
"2022-10-30T13:31:00Z",])
power = np.array([10,25,60,42])
df = pd.DataFrame(data = {'T': T, 'power': power})
sns.lmplot(x='T', y='power', data=df, ci=None, order=4, truncate=False)
If I use numbers instead of dates, the output is exactly what I need.
Here is the code that displays all the data correctly, but the plot is not smoothed:
import seaborn as sns
import numpy as np
import scipy
import matplotlib.pyplot as plt
import pandas as pd
from pandas.core.apply import frame_apply
years = ["2022-03-22T13:30:00Z",
"2022-03-23T13:31:00Z",
"2022-04-24T19:27:00Z",
"2022-05-25T13:31:00Z",
"2022-06-26T13:31:00Z",
"2022-07-27T13:31:00Z",
"2022-08-28T13:31:00Z",
"2022-09-29T13:31:00Z",
"2022-10-30T13:31:00Z",
]
feature_1 =[0,
6,
1,
5,
9,
15,
21,
4,
1,
]
data_preproc = pd.DataFrame({
'Period': years,
# 'Feature 1': feature_1,
# 'Feature 2': feature_2,
# 'Feature 3': feature_3,
# 'Feature 4': feature_4,
"Feature 1" :feature_1
})
data_preproc['Period'] = pd.to_datetime(data_preproc['Period'],
format="%Y-%m-%dT%H:%M:%SZ",errors='coerce')
data_preproc['Period'] = data_preproc['Period'].dt.strftime('%b')
# aiAlertPlot =sns.lineplot(x='Period', y='value', hue='variable',ci=None,
# data=pd.melt(data_preproc, ['Period']))
sns.lineplot(x="Period",y="Feature 1",data=data_preproc)
# plt.xticks(np.linspace(start=0, stop=21, num=52))
plt.xticks(rotation=90)
plt.legend(title="features")
plt.ylabel("Alerts")
plt.legend(loc='upper right')
plt.show()
The output is this. Correct data, wrong chart design.
lmplot is a model-based method and requires a numeric x. If the dates are evenly spaced, you can create a numeric helper column (called range here), fit lmplot on that column, and then replace the x-tick labels with the dates.
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
T = np.array([ "2022-07-27T13:31:00Z",
"2022-08-28T13:31:00Z",
"2022-09-29T13:31:00Z",
"2022-10-30T13:31:00Z",])
power = np.array([10,25,60,42])
df = pd.DataFrame(data = {'T': T, 'power': power})
df['range'] = np.arange(df.shape[0])
sns.lmplot(x='range', y='power', data=df, ci=None, order=4, truncate=False)
plt.xticks(df['range'], df['T'], rotation = 45);
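If the dates are not evenly spaced, an integer range distorts the fit. One possible workaround, sketched here under assumptions, is to convert the dates to elapsed days, fit a polynomial with np.polyfit (a stand-in for lmplot's polynomial fit, not lmplot itself), and plot against real datetimes. The degree of 2 and the names days, grid_days and grid_dates are illustrative choices, not taken from the post.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

T = pd.to_datetime([
    "2022-07-27T13:31:00Z",
    "2022-08-28T13:31:00Z",
    "2022-09-29T13:31:00Z",
    "2022-10-30T13:31:00Z",
]).tz_localize(None)  # drop the UTC marker so plotting uses naive datetimes
power = np.array([10, 25, 60, 42])

# Elapsed days since the first timestamp: a numeric x that keeps the real spacing.
days = np.asarray((T - T[0]).total_seconds()) / 86400.0

# Assumed degree; with only four points a higher order would overfit.
coeffs = np.polyfit(days, power, deg=2)

# Evaluate the fit on a dense grid and map the grid back to datetimes.
grid_days = np.linspace(days.min(), days.max(), 200)
grid_dates = T[0] + pd.to_timedelta(grid_days, unit="D")

fig, ax = plt.subplots()
ax.scatter(T, power)
ax.plot(grid_dates, np.polyval(coeffs, grid_days))
fig.autofmt_xdate()  # rotate the date labels so they do not overlap
```

Because the x-axis holds real datetimes, matplotlib picks the tick positions and date labels itself, so no manual xticks call is needed.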
This question already has an answer here:
matplotlib subplots - too many indices for array [duplicate]
(1 answer)
Closed 1 year ago.
I would like to plot pandas dataframes as subplots.
I read this post: How can I plot separate Pandas DataFrames as subplots?
Here is my minimum example where, like the accepted answer in the post, I used the ax keyword:
import pandas as pd
from matplotlib.pyplot import plot, show, subplots
import numpy as np
# Definition of the dataframe
df = pd.DataFrame({'Pressure': {0: 1, 1: 2, 2: 4}, 'Volume': {0: 2, 1: 4, 2: 8}, 'Temperature': {0: 3, 1: 6, 2: 12}})
# Plot
fig,axes = subplots(2,1)
df.plot(x='Temperature', y=['Volume'], marker = 'o',ax=axes[0,0])
df.plot(x='Temperature', y=['Pressure'], marker = 'o',ax=axes[1,0])
show()
Unfortunately, there is a problem with the indices:
df.plot(x='Temperature', y=['Volume'], marker = 'o',ax=axes[0,0])
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed
Please, could you help me ?
If your subplots have only one dimension (a 2 x 1 grid, for example), axes is a 1-D array, so you can just use axes[0] and axes[1]. When the grid is two-dimensional (2 x 3 subplots, for example), you do need to index with two numbers, such as axes[0, 0].
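Applied to the example from the question, the fix could look like the sketch below. It also shows the squeeze=False option of subplots, which always returns a 2-D array so that two-number indexing works for any grid shape; the names fig2 and axes2d are illustrative.

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({"Pressure": [1, 2, 4],
                   "Volume": [2, 4, 8],
                   "Temperature": [3, 6, 12]})

# A 2 x 1 grid is squeezed to a 1-D array of axes by default.
fig, axes = plt.subplots(2, 1)
df.plot(x="Temperature", y=["Volume"], marker="o", ax=axes[0])
df.plot(x="Temperature", y=["Pressure"], marker="o", ax=axes[1])

# With squeeze=False the array stays 2-D, so axes2d[row, col] is valid.
fig2, axes2d = plt.subplots(2, 1, squeeze=False)
df.plot(x="Temperature", y=["Volume"], marker="o", ax=axes2d[0, 0])
df.plot(x="Temperature", y=["Pressure"], marker="o", ax=axes2d[1, 0])
```

Call plt.show() afterwards to display both figures.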
A newbie question, as I am very new to the Altair library. I have dates on the x-axis and a column containing 0 and 1. I want two lines, one for 0 and one for 1, each in a different colour. How can I do it?
You can do this by mapping this column to the color encoding. Here's a short example:
import altair as alt
import pandas as pd
import numpy as np
np.random.seed(0)
df = pd.DataFrame({
'x': pd.date_range('2020-01-01', freq='D', periods=10),
'y': np.random.randn(10).cumsum(),
'z': np.random.randint(0, 2, 10),
})
alt.Chart(df).mark_line().encode(
x='x:T',
y='y:Q',
color='z:O'
)
This question already has answers here:
Adding labels in x y scatter plot with seaborn
(6 answers)
Closed 4 years ago.
I have a Seaborn scatterplot using data from a dataframe. I would like to add data labels to the plot, using other values in the df associated with that observation (row). Please see below - is there a way to add at least one of the column values (A or B) to the plot? Even better, is there a way to add two labels (in this case, both the values in column A and B?)
I have tried to use a for loop using functions like the below per my searches, but have not had success with this scatterplot.
Thank you for your help.
df_so = pd.DataFrame(np.random.randint(0,100,size=(100, 4)), columns=list('ABCD'))
scatter_so=sns.lmplot(x='C', y='D', data=df_so,
fit_reg=False,y_jitter=0, scatter_kws={'alpha':0.2})
fig, ax = plt.subplots() #stuff like this does not work
Use:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df_so = pd.DataFrame(np.random.randint(0,100,size=(20, 4)), columns=list('ABCD'))
scatter_so=sns.lmplot(x='C', y='D', data=df_so,
                      fit_reg=False,y_jitter=0, scatter_kws={'alpha':0.2})
def label_point(x, y, val, ax):
    # Combine coordinates and label text, then annotate each point
    a = pd.concat({'x': x, 'y': y, 'val': val}, axis=1)
    for i, point in a.iterrows():
        ax.text(point['x']+.02, point['y'], str(point['val']))
# Label each point with "(A, B)" on the axes that lmplot just created
label_point(df_so['C'], df_so['D'], '('+df_so['A'].astype(str)+', '+df_so['B'].astype(str)+')', plt.gca())