Python pandas plot linechart with data points

I would like to plot a line chart of column A. Based on column sig, I would like to add markers to that line:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
data = pd.DataFrame(np.random.randn(120), columns=list('A'))
data['sig'] = np.nan
data['sig'] = np.where((data['A'] > 1), data['A'], data['sig'])
data.plot(grid=True)
plt.show()
I tried to add markevery=data['sig'] to the plot() statement, but it gave me several errors. Any hints?

Why not plot directly in matplotlib?
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
data = pd.DataFrame(np.random.randn(120), columns=list('A'))
data['sig'] = np.nan
data['sig'] = np.where((data['A'] > 1), data['A'], data['sig'])
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(data["A"])  # line for column A
ax.scatter(data.index, data["sig"])  # NaN entries in sig are skipped, so only A > 1 gets a marker
plt.show()
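As for the markevery attempt in the question: a likely cause of the errors is that markevery expects positions (an integer, a slice, a list of integer indices, or a boolean mask), not the y values to be marked. A minimal sketch under that assumption, using seeded synthetic data in place of the question's random frame (the name mark_positions is only illustrative):

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic data shaped like the question's frame (seeded for repeatability).
rng = np.random.default_rng(0)
data = pd.DataFrame({"A": rng.standard_normal(120)})

# markevery takes positions along the line, so collect the integer
# positions where A exceeds the threshold.
mark_positions = np.flatnonzero(data["A"].to_numpy() > 1).tolist()

fig, ax = plt.subplots()
# marker="o" turns markers on; markevery restricts them to the chosen positions.
(line,) = ax.plot(data.index, data["A"], marker="o", markevery=mark_positions)
ax.grid(True)
# Call plt.show() to display the figure in an interactive session.
```

This keeps everything in a single line artist, whereas the scatter approach above draws the markers as a separate artist that can be styled independently.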

Related

python remove edges in stacked area chart

I am trying to remove the "edge color" from a stacked area chart in python, as the plots tend to get "overlaid" by the "last" column:
import pandas as pd, numpy as np
df = pd.DataFrame(np.random.randint(0, 10000, size=(10000, 4)), columns=list('ABCD'))
import matplotlib.pyplot as plt, seaborn as sns
def myplot(df):
    fig, ax = plt.subplots(figsize=(4, 4))
    df.plot.area(ax=ax)
myplot(df=df)
The problem is that it looks as if the red variable is the "largest one". But that is a plotting artifact: its borders from one x value to the next overlap those of the other variables. To see what I mean, run this (same kind of data with only 100 observations):
import pandas as pd, numpy as np
df = pd.DataFrame(np.random.randint(0, 100, size=(100, 4)), columns=list('ABCD'))
import matplotlib.pyplot as plt, seaborn as sns
def myplot(df):
    fig, ax = plt.subplots(figsize=(4, 4))
    df.plot.area(ax=ax)
myplot(df=df)
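One possible way to get rid of the edges is to pass linewidth=0 to the area plot. This is a sketch rather than a confirmed answer to the question: it assumes pandas forwards the keyword to the underlying matplotlib line and fill artists, and it uses a smaller, seeded frame than the question's 10000 rows.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Seeded stand-in for the question's random data.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 10000, size=(500, 4)), columns=list("ABCD"))

def myplot(df):
    fig, ax = plt.subplots(figsize=(4, 4))
    # linewidth=0 removes the outline drawn around each stacked band,
    # so the last column no longer paints over its neighbours.
    df.plot.area(ax=ax, linewidth=0)
    return ax

ax = myplot(df)
# Call plt.show() to display the figure in an interactive session.
```

With the outlines gone, the visible area of each color should reflect only the filled bands.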

Circular contour map in python

I have a 120 mm diameter circular disk on which I measure temperature at 20 different locations. These measurement locations are at random places. I am looking for a way to plot it as in the desired plot below. When I used tricontour, it just plots the random points. I am unable to find a way to fill the circle as in the picture attached below. Is there any other way to plot this? I have spent a lot of time searching for it with no success.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Circle  # needed for the outline drawn below
data = {"x": [110,50,-85,20,45,0,-80,-30,-105,80],
        "y": [0,100,75,-90,20,115,-85,-20,-45,-90],
        "z": [10,2,6,4,9,12,2,6,4,12]}
x = data['x']
y = data['y']
z = data['z']
f, ax = plt.subplots(1)
plot = ax.tricontourf(x,y,z, 20)
ax.plot(x,y, 'ko ')
circ1 = Circle((0, 0), 120, facecolor='None', edgecolor='r', lw=5)
ax.add_patch(circ1)
f.colorbar(plot)
Example data :
Desired plot:
What I got from tricontour:
There is not enough data for a really nice contour plot, but here is a solution with your data and an example with a substantially larger dataset:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.tri as tri
data = {"x": [110,50,-85,20,45,0,-80,-30,-105,80],
        "y": [0,100,75,-90,20,115,-85,-20,-45,-90],
        "z": [10,2,6,4,9,12,2,6,4,12]}
df = pd.DataFrame(data)
# A polar axes expects (angle in radians, radius), so convert from Cartesian first
theta = np.arctan2(df["y"], df["x"])
r = np.hypot(df["x"], df["y"])
fig = plt.figure()
ax = fig.add_subplot(projection='polar')
ax.set_title("tricontour")
ax.tricontourf(theta, r, df["z"], 20)
plt.show()
which gives
and for a larger dataframe:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
df= pd.DataFrame(np.random.randint(0,1000,size=(1000, 3)), columns=list('XYZ'))
fig = plt.figure()
ax = fig.add_subplot(projection='polar')
ax.set_title("tricontour")
# On a polar axes the first argument is the angle in radians and the second the radius
ax.tricontourf(df["X"], df["Y"], df["Z"],20)
plt.show()
which returns
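If the goal is to fill the whole disk in Cartesian coordinates, one option is to interpolate the measurements onto a regular grid and mask everything outside the circle. The sketch below swaps in inverse distance weighting, which is not the method used in the answer above; the helper idw, the power of 2 and the 201-point grid are illustrative assumptions. With only ten points the resulting map is a rough estimate at best.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Circle

# Measurement data copied from the question.
x = np.array([110, 50, -85, 20, 45, 0, -80, -30, -105, 80], dtype=float)
y = np.array([0, 100, 75, -90, 20, 115, -85, -20, -45, -90], dtype=float)
z = np.array([10, 2, 6, 4, 9, 12, 2, 6, 4, 12], dtype=float)
radius = 120.0  # same radius as the Circle patch in the question's code

def idw(xq, yq, xs, ys, zs, power=2.0, eps=1e-12):
    """Inverse-distance-weighted estimate of z at the query points (xq, yq)."""
    # Distance from every query point to every sample point.
    d = np.hypot(xq[..., None] - xs, yq[..., None] - ys)
    # eps avoids division by zero when a query point coincides with a sample.
    w = 1.0 / (d + eps) ** power
    return (w * zs).sum(axis=-1) / w.sum(axis=-1)

# Regular grid covering the disk; grid points outside the circle are masked.
g = np.linspace(-radius, radius, 201)
gx, gy = np.meshgrid(g, g)
gz = np.ma.masked_where(np.hypot(gx, gy) > radius, idw(gx, gy, x, y, z))

fig, ax = plt.subplots()
filled = ax.contourf(gx, gy, gz, 20)
ax.plot(x, y, "ko")
ax.add_patch(Circle((0, 0), radius, facecolor="none", edgecolor="r", lw=2))
ax.set_aspect("equal")
fig.colorbar(filled)
# Call plt.show() to display the figure in an interactive session.
```

Unlike tricontourf, which only fills the convex hull of the measurement points, this extrapolates all the way to the rim, so values near the edge should be read with caution.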

How to draw multiple lines with Seaborn?

I am trying to draw a plot with two lines, each with a different color and a different label. This is what I have come up with.
This is the code that I have written:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
data1 = pd.read_csv("/content/drive/MyDrive/Summer-2020/URMC/training_x_total_data_ones.csv", header=None)
data2 = pd.read_csv("/content/drive/MyDrive/Summer-2020/URMC/training_x_total_data_zeroes.csv", header=None)
sns.lineplot(data=data1, color="red")
sns.lineplot(data=data2)
What am I doing wrong?
Edit
This is what my dataset looks like
So, I just added another color in the second line and that seemed to work.
import numpy as np
import seaborn as sns
mu, sigma = 0, 0.1
s = np.random.normal(mu, sigma, 100)
mu1, sigma1 = 0.5, 1
t = np.random.normal(mu1, sigma1, 100)
sns.lineplot(data=s, color="red")
sns.lineplot(data=t, color="blue")
Try specifying x and y in the call to sns.lineplot:
import pandas as pd
import numpy as np
import seaborn as sns
x = np.arange(10)
df1 = pd.DataFrame({'x': x,
                    'y': np.sin(x)})
df2 = pd.DataFrame({'x': x,
                    'y': x**2})
sns.lineplot(data=df1, x='x', y='y', color="red")
sns.lineplot(data=df2, x='x', y='y')
Without doing so, I get a plot similar to yours.

Showing multiple Line Legends in Matplotlib

I am trying to display legend entries for all 4 lines of my line graph, with the column headers serving as the respective legend labels.
Is there an elegant way of executing this without having to write individual lines of code to plot and label each column?
Examples of my current data set are as follows:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
x = pd.Series(np.array([1,2,3,4,5,6,7,8,9,10]))
y = pd.DataFrame(np.random.rand(10,4))
y.columns = ["A","B","C","D"]
fig, ax = plt.subplots(figsize=(10, 7))
ax.plot(x, y, label=True)
Indeed you can use the plot function defined in pandas:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
x = pd.Series(np.array([1,2,3,4,5,6,7,8,9,10]))
y = pd.DataFrame(np.random.rand(10,4))
y.columns = ["A","B","C","D"]
y['x'] = x
fig, ax = plt.subplots(figsize=(10, 7))
y.plot(x='x', ax=ax)  # use column x for the x axis; A, B, C, D become the legend entries
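If you would rather stay with ax.plot, a minimal sketch is to keep the line objects it returns and hand them to ax.legend together with the column names. The data here is random, as in the question.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

x = np.arange(1, 11)
y = pd.DataFrame(np.random.rand(10, 4), columns=["A", "B", "C", "D"])

fig, ax = plt.subplots(figsize=(10, 7))
# Plotting a 2D array draws one line per column and returns them in column order.
lines = ax.plot(x, y.to_numpy())
# Pair each line with its column header.
ax.legend(lines, list(y.columns))
# Call plt.show() to display the figure in an interactive session.
```

This avoids writing one plot call per column while leaving the DataFrame itself unchanged.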

How to add multiple trendlines pandas

I have plotted a graph with two y axes and would now like to add a separate trendline for each of the two y plots.
This is my code:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
%matplotlib inline
amp_costs=pd.read_csv('/Users/Ampicillin_Costs.csv', index_col=None, usecols=[0,1,2])
amp_costs.columns=['PERIOD', 'ITEMS', 'COST PER ITEM']
ax=amp_costs.plot(x='PERIOD', y='COST PER ITEM', color='Blue', style='.', markersize=10)
amp_costs.plot(x='PERIOD', y='ITEMS', secondary_y=True,
color='Red', style='.', markersize=10, ax=ax)
Any guidance as to how to plot these two trend lines to this graph would be much appreciated!
Here is a quick example of how to use sklearn.linear_model.LinearRegression to make the trend line.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
plt.style.use('ggplot')
%matplotlib inline
period = np.arange(10)
items = -2*period +1 + np.random.randint(-2,2,len(period))
cost = 35000*period +15000 + np.random.randint(-25000,25000,len(period))
data = np.vstack((period,items,cost)).T
df = pd.DataFrame(data, columns=['P','ITEMS', 'COST']).set_index('P')
lmcost = LinearRegression().fit(period.reshape(-1,1), cost.reshape(-1,1))
lmitems = LinearRegression().fit(period.reshape(-1,1), items.reshape(-1,1))
df['ITEMS_LM'] = lmitems.predict(period.reshape(-1,1))
df['COST_LM'] = lmcost.predict(period.reshape(-1,1))
fig,ax = plt.subplots()
df.ITEMS.plot(ax = ax, color = 'b')
df.ITEMS_LM.plot(ax = ax,color= 'b', linestyle= 'dashed')
df.COST.plot(ax = ax, secondary_y=True, color ='g')
df.COST_LM.plot(ax = ax, secondary_y=True, color = 'g', linestyle='dashed')
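If pulling in scikit-learn feels heavy for a straight line, np.polyfit with degree 1 gives the same least-squares fit. The sketch below is an alternative to the answer above, not a restatement of it: the data is synthetic, the helper linear_trend is made up for illustration, and it assumes PERIOD is numeric (dates would have to be converted to numbers first).

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def linear_trend(x, y):
    """Least-squares straight line through (x, y), evaluated at x."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return slope * np.asarray(x) + intercept

# Synthetic stand-in for the question's CSV (values are made up).
rng = np.random.default_rng(0)
period = np.arange(10)
df = pd.DataFrame({
    "PERIOD": period,
    "ITEMS": -2 * period + 1 + rng.integers(-2, 2, len(period)),
    "COST PER ITEM": 35000 * period + 15000 + rng.integers(-25000, 25000, len(period)),
})

fig, ax_cost = plt.subplots()
ax_items = ax_cost.twinx()  # second y axis sharing the same x axis

# Points and dashed trend line for each series, each on its own y axis.
ax_cost.plot(df["PERIOD"], df["COST PER ITEM"], "b.", markersize=10)
ax_cost.plot(df["PERIOD"], linear_trend(df["PERIOD"], df["COST PER ITEM"]), "b--")
ax_items.plot(df["PERIOD"], df["ITEMS"], "r.", markersize=10)
ax_items.plot(df["PERIOD"], linear_trend(df["PERIOD"], df["ITEMS"]), "r--")
# Call plt.show() to display the figure in an interactive session.
```

Using twinx directly, instead of secondary_y=True, keeps an explicit handle on each axis, so it is unambiguous which axis each trend line is drawn on.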
