I want to select from a dataframe based on a name. Then I want to plot the data on a single graph using a for loop.
df = pd.read_csv('Kd.csv')
watertype = ['Evian', 'Volvic', 'Buxton']
for type in watertype:
    sdf = df[(df['Water'] == type)]
    Na = sdf.iloc[:, 13]
    Kd = sdf.iloc[:, 2]
    plt.plot(Na, Kd, 'o')
    plt.show()
This produces multiple graphs instead of overlaying the data on a single graph.
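One likely cause is that plt.show() runs inside the loop, so each pass finishes and displays its own figure. A minimal sketch of the usual fix follows: draw every series on one Axes and call plt.show() once, after the loop. The small DataFrame and its column names (Na, Kd) are made-up stand-ins for Kd.csv, and the Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove this line to get a window
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for Kd.csv; the real file and its column layout are not shown.
df = pd.DataFrame({
    "Water": ["Evian", "Evian", "Volvic", "Volvic", "Buxton", "Buxton"],
    "Na": [6.5, 6.8, 11.6, 12.0, 24.0, 23.5],
    "Kd": [0.10, 0.12, 0.20, 0.22, 0.35, 0.33],
})

fig, ax = plt.subplots()  # one figure, one Axes, shared by all series
for water in ["Evian", "Volvic", "Buxton"]:
    sdf = df[df["Water"] == water]
    ax.plot(sdf["Na"], sdf["Kd"], "o", label=water)

ax.set_xlabel("Na")
ax.set_ylabel("Kd")
ax.legend()
plt.show()  # called once, after the loop has added every series
```

Using a loop variable such as water instead of type also avoids shadowing the Python built-in.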
I have a function that creates a HoloViews heatmap. If I save the heatmap using hv.save(heatmap, 'heatmap.html') it works great! I just cannot figure out how to show the plot without having to save it. The same script generates two density plots with Plotly, and calling .show() on those pops each plot up in my browser.
I am NOT using jupyter notebook and have been starting the bokeh server from a DOS prompt. I am working inside PyCharm Community with Python 3.10. Though if I could do it all from inside the script that would be easier.
def gen_heat_map(df: pandas.DataFrame, freq: float) -> holoviews.HeatMap:
    """
    Uses a single frequency upon which to build the heatmap.
    :param df: pandas.Dataframe containing data read from a JSON file
    :param freq: The frequency to build the heatmap out of
    :return: Holoviews Heat Map
    """
    # Select a single frequency upon which to build the heatmap
    single_frq = df[df.centerFrequency == freq].reset_index(drop=True)
    # create a second dataframe from each transmission
    sec_df = pd.DataFrame()
    for index, row in single_frq.iterrows():
        sec_df = sec_df.append(make_by_second(row), ignore_index=True)
    min_df = sec_df.set_index('time').resample('1min').mean().reset_index().replace(np.nan, -160)
    with pd.option_context('display.max_columns', None):
        print(min_df)
    min_df["Minute"] = min_df["time"].dt.strftime("%M")
    min_df["Hour"] = min_df['time'].dt.strftime("%H")
    heatmap = hv.HeatMap(min_df, ['Minute', 'Hour'], ['power', 'time'])
    heatmap.opts(radial=True,
                 width=750,
                 height=750,
                 tools=['hover'],
                 colorbar=True,
                 cmap='bokeh',
                 start_angle=-np.pi * 7 / 24,
                 title='Frequency Power Level Radial Heat Map'
                 )
    return heatmap

heatmap = gen_heat_map(df, 929612500.0)
The function gen_heat_map takes a large pandas DataFrame of data read from a JSON file, plus a single frequency, and generates the heat map. Displaying the resulting heat map is the issue. I can do so through HoloViz's Panel toolkit, but I would like to find a simpler solution.
Suggestions?
Thanks,
Doug
I want to plot some statistics for each region. I have nested for loops: the inner loop generates the statistics for each year, and the outer loop selects the region and plots the corresponding results. I am not sure why my code plots data from all regions into the same figure instead of one figure per region.
Yrstat = []
for j in Regionlist:
    for i in Yrlist:
        dfnew = df.loc[(df['Yr']==i)&(df['Region']==j)]
        if not dfnew.empty:
            #calculate the confidence interval and mean for data in each year
            CI = scipy.stats.norm.interval(alpha=0.95, loc=np.mean(dfnew['FluxTot']), scale=scipy.stats.sem(dfnew['FluxTot']))
            list(CI)
            mean = np.mean(dfnew['FluxTot'])
            Yrstat.append((i, mean, CI[0], CI[1]))
    #convert stats list to a dataframe
    yrfullinfo = pd.DataFrame(Yrstat, columns = ['Yr', 'mean', 'CI-','CI+'])
    #making figures
    fig, ax = plt.subplots()
    ax.plot(yrfullinfo['Yr'], yrfullinfo['mean'], label = 'mean')
    ax.plot(yrfullinfo['Yr'], yrfullinfo['CI-'], label = '95%CI')
    ax.plot(yrfullinfo['Yr'], yrfullinfo['CI+'], label = '95%CI')
    ax.legend()
    #exporting figures
    filename = "C:/Users/Christina/Desktop/python test/Summary in {}.png".format(j)
    fig.savefig(filename)
    plt.close(fig)
The figure handling is not the problem: the script correctly saves one PNG file per region, each from its own figure. The problem is your data.
You initialize Yrstat = [] outside of both loops and then append to it on every pass of the inner loop, across all passes of the outer loop. The DataFrame yrfullinfo built from that list therefore grows with each region and still contains the rows of every region processed before it.
You need a new list of values for each region, which is why I moved Yrstat = [] into the outer loop so that it is reinitialized for every region.
for j in Regionlist:
    Yrstat = []
    for i in Yrlist:
        dfnew = dfmerge.loc[(dfmerge['Yr']==i)&(dfmerge['Region']==j)]
        if not dfnew.empty:
            #get all statistics for data in each year
            CI = st.norm.interval(alpha=0.95, loc=np.mean(dfnew['FluxTot']), scale=st.sem(dfnew['FluxTot']))
            list(CI)
            mean = np.mean(dfnew['FluxTot'])
            Yrstat.append((j, i, mean, CI[0], CI[1]))
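For completeness, here is a self-contained sketch of the whole corrected loop, including the plotting step. It runs on made-up data: the region names, the random FluxTot values, the temporary output folder, and the use of fill_between for the interval are all assumptions for illustration, not part of the original script. The confidence level is passed positionally to norm.interval because the keyword name differs between SciPy versions (alpha in older releases, confidence in newer ones).

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend; figures are only saved here
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import scipy.stats as st

# Hypothetical data: 2 regions x 3 years x 10 observations each.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Region": np.repeat(["North", "South"], 30),
    "Yr": np.tile(np.repeat([2019, 2020, 2021], 10), 2),
    "FluxTot": rng.normal(10.0, 2.0, 60),
})

out_dir = tempfile.mkdtemp()  # assumption: any writable folder will do
rows_per_region = {}

for region in df["Region"].unique():
    yr_stats = []  # reset for every region, so rows never leak between regions
    for yr in sorted(df["Yr"].unique()):
        flux = df.loc[(df["Yr"] == yr) & (df["Region"] == region), "FluxTot"]
        if flux.empty:
            continue
        mean = flux.mean()
        # 95% interval around the mean, using the standard error as scale
        lo, hi = st.norm.interval(0.95, loc=mean, scale=st.sem(flux))
        yr_stats.append((yr, mean, lo, hi))

    info = pd.DataFrame(yr_stats, columns=["Yr", "mean", "CI-", "CI+"])
    rows_per_region[region] = len(info)

    fig, ax = plt.subplots()
    ax.plot(info["Yr"], info["mean"], label="mean")
    ax.fill_between(info["Yr"], info["CI-"], info["CI+"], alpha=0.3, label="95% CI")
    ax.set_title(region)
    ax.legend()
    fig.savefig(os.path.join(out_dir, f"Summary in {region}.png"))
    plt.close(fig)
```

With the list reset inside the outer loop, each region's DataFrame holds one row per year and nothing from earlier regions.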
df = pd_df_total_primary_Y.set_index('EthnicGroups_EthnicGroup1Desc')
df1 = pd_df_total_general_Y.set_index('EthnicGroups_EthnicGroup1Desc')
df[["P_Y_Count20", "P_Y_Count16", "P_Y_Count12", "P_Y_Count08","P_Y_Count04", "P_Y_Count00"]].plot.bar()
plt.title('total_primary_Y');
df1[["G_Y_Count20", "G_Y_Count16", "G_Y_Count12", "G_Y_Count08", "G_Y_Count04", "G_Y_Count00"]].plot.bar()
plt.title('total_general_Y');
I am trying to plot these two graphs on the same row and then add two more graphs below them, but I am struggling to get it to work. How can I do it?
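One common approach is to create a 2x2 grid with plt.subplots and pass each Axes to the pandas plot call through its ax argument. The sketch below assumes two small made-up frames in place of pd_df_total_primary_Y and pd_df_total_general_Y, with only two count columns each. The bottom row is left empty for the two further graphs.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove to display interactively
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical frames indexed by group, standing in for the real data.
groups = ["A", "B", "C"]
df = pd.DataFrame({"P_Y_Count20": [5, 3, 8], "P_Y_Count16": [4, 2, 7]}, index=groups)
df1 = pd.DataFrame({"G_Y_Count20": [6, 4, 9], "G_Y_Count16": [5, 3, 6]}, index=groups)

# 2 rows x 2 columns; axes is a 2D array indexed as axes[row, col]
fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(12, 8))

df.plot.bar(ax=axes[0, 0], title="total_primary_Y")   # top left
df1.plot.bar(ax=axes[0, 1], title="total_general_Y")  # top right
# axes[1, 0] and axes[1, 1] are free for the two additional graphs

fig.tight_layout()
```

Passing ax explicitly keeps pandas from opening a new figure for every plot call.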
I have the following dataframe in pandas:
dfClicks = pd.DataFrame({'clicks': [700,800,550],'date_of_click': ['10/25/1995 03:30','10/25/1995 04:30','10/25/1995 05:30']})
dfClicks['date_of_click'] = pd.to_datetime(dfClicks['date_of_click'])
dfClicks.set_index('date_of_click')
dfClicks.clicks = pd.to_numeric(dfClicks.clicks)
Could you please advise how I can plot the above such that the x-axis shows the date/time and the y axis the number of clicks? I will also need to plot another data frame which includes predicted clicks on the same graph, just to compare. The test could be a replica of above, with minor changes:
dfClicks2 = pd.DataFrame({'clicks': [750,850,500],'date_of_click': ['10/25/1995 03:30','10/25/1995 04:30','10/25/1995 05:30']})
dfClicks2['date_of_click'] = pd.to_datetime(dfClicks2['date_of_click'])
dfClicks2.set_index('date_of_click')
dfClicks2.clicks = pd.to_numeric(dfClicks2.clicks)
Convert the clicks column to numeric, and then:
ax = dfClicks.plot()
dfClicks2.plot(ax=ax)
ax.legend(["Clicks","Clicks2"])
UPDATE:
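There is an error in how you set the index: set_index returns a new DataFrame rather than modifying the original, so the result has to be assigned back. Replace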
dfClicks.set_index('date_of_click')
with:
dfClicks = dfClicks.set_index('date_of_click')
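Putting both pieces together, a minimal end-to-end sketch might look like the following. It reuses the sample values from the question. The variable names actual and predicted and the y-axis label are my own choices, and the Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove to display interactively
import matplotlib.pyplot as plt
import pandas as pd

times = ["10/25/1995 03:30", "10/25/1995 04:30", "10/25/1995 05:30"]
actual = pd.DataFrame({"clicks": [700, 800, 550], "date_of_click": pd.to_datetime(times)})
predicted = pd.DataFrame({"clicks": [750, 850, 500], "date_of_click": pd.to_datetime(times)})

# set_index returns a new frame, so assign the result back
actual = actual.set_index("date_of_click")
predicted = predicted.set_index("date_of_click")

ax = actual.plot()        # x-axis: datetime index, y-axis: clicks
predicted.plot(ax=ax)     # draw the second frame on the same Axes
ax.set_ylabel("clicks")
ax.legend(["Clicks", "Clicks2"])
```

Because the datetime column is now the index, pandas uses it for the x-axis automatically.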
I am trying to remove the overlay text on my boxplot I created using pandas. The code to generate it is as follows (minus a few other modifications):
ax = df.boxplot(column='min2',by=df['geomfull'],ax=axes,grid=False,vert=False, sym='',return_type='dict')
I just want to remove the "boxplot grouped by 0..." etc. and I can't work out what object it is in the plot. I thought it was an overflowing title but I can't find where the text is coming from! Thanks in advance.
EDIT: I found a workaround, which is to construct a new pandas DataFrame containing only the groups I want to box (dropping all other variables).
data = {}
maps = ['BA4','BA5','BB4','CA4','CA5','EA4','EA5','EB4','EC4','EX4','EX5']
for mapi in maps:
    mask = (df['geomfull'] == mapi)
    arr = np.array(df['min2'][mask])
    data[mapi] = arr
dfsub = pd.DataFrame(data)
Then I can use the df.plot routines as per examples....
bp = dfsub.plot(kind='box',ax=ax, vert=False,return_type='dict',sym='',grid=False)
This produces the same plot without the overlay.
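As an alternative to rebuilding the frame: when by is given, pandas places the "Boxplot grouped by ..." text as the figure-level title (the suptitle), not as an Axes title, which is why it is hard to find on the Axes object. A short sketch that clears it follows. The data here is randomly generated for illustration, and the by column is passed by name rather than as a Series.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove to display interactively
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical data using the column names from the question.
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "geomfull": np.repeat(["BA4", "BA5", "BB4"], 20),
    "min2": rng.normal(0.0, 1.0, 60),
})

fig, ax = plt.subplots()
df.boxplot(column="min2", by="geomfull", ax=ax, grid=False)

# The automatic text lives on the figure, so it shows up in fig.texts.
suptitle_before = fig.texts[0].get_text()

fig.suptitle("")  # clear the automatic "Boxplot grouped by ..." text
ax.set_title("")  # optional: also clear the per-Axes title (the column name)

suptitle_after = fig.texts[0].get_text()
```

This keeps the original grouped boxplot call and only removes the overlay text.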