I can't figure out how to filter a column and then plot it successfully on Seaborn.
The below code works perfectly and plots a line graph with all of the unique columns values separated.
import geopandas as gpd
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import seaborn as sns
sns.set(style='darkgrid')
data = wcsales1.loc[wcsales1.Sales_Year > 2016]
sales_year = data['Sales_Year']
ppa = data['Price_Per_Acre']
dates = data['LATEST_LAND_SALE_DATE']
juris = data['PLANNING_JURISDICTION']
sns.relplot(x = sales_year, y = ppa, ci=None, kind='line', hue=juris)
plt.show()
However, I want to plot the values in the variable 'egs', listed below, which are two of many unique values in the variable 'juris'
I tried the below code but am getting a Value Error, also included below.
import geopandas as gpd
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import seaborn as sns
sns.set(style='darkgrid')
data = wcsales1.loc[wcsales1.Sales_Year > 2016]
data = data.reset_index()
sales_year = data['Sales_Year']
ppa = data['Price_Per_Acre']
dates = data['LATEST_LAND_SALE_DATE']
juris = data['PLANNING_JURISDICTION']
egs = ['HS', 'FV']
south = data.loc[data.PLANNING_JURISDICTION.isin(egs)]
print(type(south))
sns.relplot(x = sales_year, y = ppa, ci=None, kind='line', hue=south)
plt.show()
Error below
Shape of passed values is (19, 3), indices imply (1685, 3)
Thanks for your help!
With sns you should pass the data option and x,y, hue as the columns in the data:
sns.relplot(x='Sales_Year', y='Price_Per_Acre',
hue='PLANNING_JURISDICTION',
data=data.loc[data.PLANNING_JURISDICTION.isin(egs)],
kind='line', ci=None
)
Related
I am trying to connect lines based on a specific relationship associated with the points. In this example the lines would connect the players by which court they played in. I can create the basic structure but haven't figured out a reasonably simple way to create this added feature.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
df_dict={'court':[1,1,2,2,3,3,4,4],
'player':['Bob','Ian','Bob','Ian','Bob','Ian','Ian','Bob'],
'score':[6,8,12,15,8,16,11,13],
'win':['no','yes','no','yes','no','yes','no','yes']}
df=pd.DataFrame.from_dict(df_dict)
ax = sns.boxplot(x='score',y='player',data=df)
ax = sns.swarmplot(x='score',y='player',hue='win',data=df,s=10,palette=['red','green'])
plt.show()
This code generates the following plot minus the gray lines that I am after.
You can use lineplot here:
sns.lineplot(
data=df, x="score", y="player", units="court",
color=".7", estimator=None
)
The player name is converted to an integer as a flag, which is used as the value of the y-axis, and a loop process is applied to each position on the court to draw a line.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
df_dict={'court':[1,1,2,2,3,3,4,4],
'player':['Bob','Ian','Bob','Ian','Bob','Ian','Ian','Bob'],
'score':[6,8,12,15,8,16,11,13],
'win':['no','yes','no','yes','no','yes','no','yes']}
df=pd.DataFrame.from_dict(df_dict)
ax = sns.boxplot(x='score',y='player',data=df)
ax = sns.swarmplot(x='score',y='player',hue='win',data=df,s=10,palette=['red','green'])
df['flg'] = df['player'].apply(lambda x: 0 if x == 'Bob' else 1)
for i in df.court.unique():
dfq = df.query('court == #i').reset_index()
ax.plot(dfq['score'], dfq['flg'], 'g-')
plt.show()
I have around 4475 rows of csv data like below:
,Time,Values,Size
0,1900-01-01 23:11:30.368,2,
1,1900-01-01 23:11:30.372,2,
2,1900-01-01 23:11:30.372,2,
3,1900-01-01 23:11:30.372,2,
4,1900-01-01 23:11:30.376,2,
5,1900-01-01 23:11:30.380,,
6,1900-01-01 23:11:30.380,,
7,1900-01-01 23:11:30.380,,
8,1900-01-01 23:11:30.380,,321
9,1900-01-01 23:11:30.380,,111
.
.
4474,1900-01-01 23:11:32.588,,
When I try to create simple seaborn lineplot with below code. It creates line chart but its continuous chart while my data i.e. 'Values' has many empty/nan values which should show as gap on chart. How can I do that?
[from datetime import datetime
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.read_csv("Data.csv")
sns.set(rc={'figure.figsize':(13,4)})
ax =sns.lineplot(x="Time", y="Values", data=df)
ax.set(xlabel='Time', ylabel='Values')
plt.xticks(rotation=90)
plt.tight_layout()
plt.show()]
As reported in this answer:
I've looked at the source code and it looks like lineplot drops nans from the DataFrame before plotting. So unfortunately it's not possible to do it properly.
So, the easiest way to do it is to use matplotlib in place of seaborn.
In the code below I generate a dataframe like your with 20% of missing values in 'Values' column and I use matplotlib to draw a plot:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.DataFrame({'Time': pd.date_range(start = '1900-01-01 23:11:30', end = '1900-01-01 23:11:30.1', freq = 'L')})
df['Values'] = np.random.randint(low = 2, high = 10, size = len(df))
df['Values'] = df['Values'].mask(np.random.random(df['Values'].shape) < 0.2)
fig, ax = plt.subplots(figsize = (13, 4))
ax.plot(df['Time'], df['Values'])
ax.set(xlabel = 'Time', ylabel = 'Values')
plt.xticks(rotation = 90)
plt.tight_layout()
plt.show()
assuming that I receive multiple dataframes sequentially from some tasks that I perform, how do I create regplots that will show up together - something similar to what's shown below.
assuming these are my codes which generate dfs with random numbers. I would like to show those 4 regplots together
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import seaborn as sns
sns.set()
for i in range(4):
df1 = pd.DataFrame(np.random.randint(1,10,10),columns=['A'])
df2 = pd.DataFrame(np.random.randint(1,10,10),columns=['B'])
df = pd.concat([df1,df2],axis=1)
sns.regplot(df1,df2)
plt.show()
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import seaborn as sns
fig, axs = plt.subplots(2, 2)
for ax, i in zip(axs.flat, range(4)):
df1 = pd.DataFrame(np.random.randint(1,10,10),columns=['A'])
df2 = pd.DataFrame(np.random.randint(1,10,10),columns=['B'])
df = pd.concat([df1,df2],axis=1)
sns.regplot(x='A', y='B', data=df, ax=ax)
plt.tight_layout()
plt.show()
I am having some problems with plotting a Pandas dataframe with repeating range on x-axis after every 17 points. It doesn't start from new line after repetition. How to fix this issue.
import pandas as pd
from matplotlib import pyplot as plt
df = pd.read_excel('BS.xlsx')
plt.plot(df.BZ, df.energy)
plt.show()
Repeating Dataframe
Based on the df provided. You can try as below:
import pandas as pd
from matplotlib import pyplot as plt
df = pd.read_excel('BS.xlsx')
df['range']= df.index//17
ax = plt.axes()
df.groupby('range').apply(lambda x:x.plot(x='BZ', y= 'energy', legend = False, ax=ax))
plt.show()
I would like create an plot with to display the last value on line. But i can not create the plot with the last value on chart. Do you have an idea for to resolve my problem, thanks you !
Input :
DataFrame
Plot
Output :
Cross = Last Value In columns
Output Final
# import eikon as ek
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
import os
import seaborn as sns; sns.set()
import pylab
from scipy import *
from pylab import *
fichier = "P:/GESTION_RPSE/GES - Gestion Epargne Salariale/Dvp Python/Florian/Absolute
Performance/PLOT.csv"
df = pd.read_csv(fichier)
df = df.drop(columns=['Unnamed: 0'])
# sns.set()
plt.figure(figsize=(16, 10))
df = df.melt('Date', var_name='Company', value_name='Value')
#palette = sns.color_palette("husl",12)
ax = sns.lineplot(x="Date", y="Value", hue='Company', data=df).set_title("LaLaLa")
plt.show()
Do you just want to put an 'X' at the end of your lines?
If so, you could pass markerevery=[-1] to the call to lineplot(). However there are a few caveats:
You have to use style= instead of hue= otherwise, there are no markers drawn
Filled markers work better than unfilled markers (like "x"). You can just use markers=True to use the default markers, or pass a list markers=['s','d','o',etc...]
code:
fmri = sns.load_dataset("fmri")
fig, ax = plt.subplots()
ax = sns.lineplot(x="timepoint", y="signal",
style="event", data=fmri, ci=None, markers=True, markevery=[-1], markersize=10)