I would like to read text input from a user in Jupyter.
The input is supposed to be CSV (comma-separated values).
Depending on those values, I would like to generate an interactive ipysheet that depends on both the data in another dataframe and the user input.
So:
a) user input.
b) ipysheet = f(user input) filtering df
c) Final result = f(ipysheet)
I cannot get even the first two steps working.
import numpy as np
import pandas as pd
import ipywidgets as widgets
from IPython.display import display
from ipysheet import from_dataframe, to_dataframe
df = pd.DataFrame(np.random.randint(0, 10, size=(10, 4)), columns=list('ABCD'))
csv = widgets.Text(value='',
                   placeholder='type here',
                   description='csv:',
                   disabled=False)
def submit(user_input):
    words = user_input.split(',')  # renamed from "list" to avoid shadowing the built-in
    # here do calculations extracting data from df
    for i, word in enumerate(words):
        print(i, word)
    # if I would get this running here I would create the ipysheet like so:
    sheet = from_dataframe(df)
display(csv)
widgets.interact(submit, user_input=csv)
First: I am not able to link the text input and the submit function.
Second: in the submit function, is declaring global df the only way to access df?
Third: is it possible to link two ipysheets?
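One possible way to approach the first two points, as a minimal sketch rather than a definitive answer: keep the parsing and filtering in a plain function that receives the dataframe as an argument, then bind df with functools.partial so that no global is needed. The helper name filter_by_csv, the choice of column 'A', and the decision to treat the input as integers are assumptions made for illustration; the question does not say how the typed values relate to df.

```python
from functools import partial

import numpy as np
import pandas as pd


def filter_by_csv(df, column, user_input):
    """Return the rows of df whose `column` value appears in the comma-separated input.

    Hypothetical helper: matching integers against one column is an assumption.
    """
    wanted = []
    for token in user_input.split(","):
        try:
            wanted.append(int(token))
        except ValueError:
            continue  # skip blanks and tokens that are not integers
    return df[df[column].isin(wanted)]


rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 10, size=(10, 4)), columns=list("ABCD"))

# Bind the dataframe and the column once; the resulting callable only needs the text.
submit = partial(filter_by_csv, df, "A")
filtered = submit("1, 3,7")
print(filtered)
```

In a notebook, the bound callable could then be passed to interact together with the Text widget from the question (user_input=csv), and the returned frame handed to from_dataframe to build the sheet.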
Related
I'm not quite sure how to formulate the question, as I'm quite new to Python and coding in general.
I have a GUI that displays already available information from a CSV:
def updatetext(self):
    """adds information extracted from database already provided"""
    df_subj = Content.extract_saved_data(self.date)
    self.lineEditFirstDiagnosed.setText(str(df_subj["First_Diagnosed_preop"][0])) \
        if str(df_subj["First_Diagnosed_preop"][0]) != 'nan' else self.lineEditFirstDiagnosed.setText('')
    self.lineEditAdmNeurIndCheck.setText(str(df_subj['Admission_preop'][0])) \
This works great.
Now, if I change values in the GUI, I want them to be updated in the CSV.
I started like this:
def onClickedSaveReturn(self):
    """closes GUI and returns to calling (main) GUI"""
    df_general = Clean.get_GeneralData()
    df_subj = {k: '' for k in Content.extract_saved_data(self.date).keys()}  # extract empty dictionary
    df_subj['ID'] = General.read_current_subj().id[0]
    df_subj['PID'] = df_general['PID_ORBIS'][0]
    df_subj['Gender'] = df_general['Gender'][0]
    df_subj['Diagnosis_preop'] = df_general['diagnosis'][0]
    df_subj["First_Diagnosed_preop"] = self.lineEditFirstDiagnosed.text()
    df_subj['Admission_preop'] = self.lineEditAdmNeurIndCheck.text()
    df_subj['Dismissal_preop'] = self.DismNeurIndCheckLabel.text()
and this is what my boss added now:
    subj_id = General.read_current_subj().id[0]  # reads data from curent_subj (saved in ./tmp)
    df = General.import_dataframe('{}.csv'.format(self.date), separator_csv=',')
    if df.shape[1] == 1:
        df = General.import_dataframe('{}.csv'.format(self.date), separator_csv=';')
    idx2replace = df.index[df['ID'] == subj_id][0]
    # TODO: you need to find a way to turn the dictionary df_subj into a dataframe and replace the data at
    # the index idx2replace of 'df' with df_subj. Later I would suggest to use line 322 to save everything to the
    # file
    df.iloc[idx2replace] = pds.DataFrame([df_subj])
    df.to_csv("preoperative.csv", index=False)
    # df.to_csv(os.path.join(FILEDIR, "preoperative.csv"), index=False)
    self.close()
I'm not really sure how to approach this, or to be honest, what to do at all.
Hope someone can help me.
Thank you
You should load the file only once and keep the DataFrame around (as self.df or something similar). Display it, update the DataFrame every time the user changes a value in the GUI, and when the user clicks save, simply overwrite the existing file with the DataFrame currently in memory.
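The row update itself could look roughly like the sketch below. It assumes the table has a unique ID column and that the dictionary collected from the GUI uses the same keys as the column names; update_subject_row, the sample values, and the in-memory buffer standing in for the CSV file are all made up for illustration.

```python
import io

import pandas as pd


def update_subject_row(df, subj_id, new_values, id_column="ID"):
    """Overwrite the row whose id_column equals subj_id with the entries of new_values."""
    matches = df.index[df[id_column] == subj_id]
    if len(matches) == 0:
        raise KeyError(f"no row with {id_column} == {subj_id!r}")
    row_label = matches[0]
    for column, value in new_values.items():
        if column in df.columns:  # ignore keys the table does not know
            df.loc[row_label, column] = value
    return df


# Stand-in for the table loaded once from the CSV and kept in memory (e.g. as self.df).
df = pd.DataFrame({
    "ID": ["s1", "s2"],
    "Gender": ["f", "m"],
    "Admission_preop": ["2020-01-01", "2020-02-02"],
})

# Stand-in for the dictionary collected from the GUI fields.
df_subj = {"ID": "s2", "Gender": "m", "Admission_preop": "2020-03-03"}
update_subject_row(df, "s2", df_subj)

# On save, write the whole in-memory table back; a StringIO buffer replaces the real file here.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
print(buffer.getvalue())
```

Assigning cell by cell with df.loc sidesteps the shape mismatch that comes from assigning a one-row DataFrame to a single positional row.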
I am well and truly baffled. First time using the dropdown widget so forgive me if this is obvious and thank you for any help you can provide.
Here is the dataframe I want to display and how it was built:
def top_10_venues(data) :
    num_top_venues = 10
    indicators = ['st', 'nd', 'rd']
    # create columns according to number of top venues
    columns = ['Neighborhood']
    for ind in np.arange(num_top_venues):
        try:
            columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
        except:
            columns.append('{}th Most Common Venue'.format(ind+1))
    # create a new dataframe
    neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
    neighborhoods_venues_sorted['Neighborhood'] = data['Neighborhood']
    for ind in np.arange(denver_grouped.shape[0]):
        neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(data.iloc[ind, :], num_top_venues)
    neighborhoods_venues_sorted = neighborhoods_venues_sorted.set_index(['Neighborhood'])
top_10_venues(denver_grouped)
neighborhoods_venues_sorted
Here is my dropdown widget:
#Experimenting with Jupyter dropdown
filtered_df = None
dropdown = widgets.SelectMultiple(
    options=neighborhoods_venues_sorted.index,
    description='Venue',
    disabled=False,
    layout={'height':'100px', 'width':'40%'})
def max_density(widget):
    global filtered_df
    selection = list(widget['new'])
    with out:
        clear_output()
        # select rows by index label; plain [] would look the names up as columns
        display(neighborhoods_venues_sorted.loc[selection])
        filtered_df = neighborhoods_venues_sorted.loc[selection]
out = widgets.Output()
# the callback passed to observe must be the function defined above
dropdown.observe(max_density, names='value')
display(dropdown)
display(out)
Here is what I end up seeing, the unformatted dataframe I ran the function on?
Booyah, figured it out!
Seems my issue was a misunderstanding of what was happening within the cell that created neighborhoods_venues_sorted. I thought I was creating a dataframe. Instead I created a function that built the dataframe internally but never returned it.
First is the sort function:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]
This is the new function instead of a block of code in a cell
#Function to create sorted data frame with top 10 most common venues
def top_ten_venues(df) :
    num_top_venues = 10
    indicators = ['st', 'nd', 'rd']
    # create columns according to number of top venues
    columns = ['Neighborhood']
    for ind in np.arange(num_top_venues):
        try:
            columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
        except IndexError:
            # indicators only covers 1st to 3rd; everything after that gets 'th'
            columns.append('{}th Most Common Venue'.format(ind+1))
    neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
    neighborhoods_venues_sorted['Neighborhood'] = df['Neighborhood']
    # iterate over the rows of the argument rather than the global denver_grouped
    for ind in np.arange(df.shape[0]):
        neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(df.iloc[ind, :], num_top_venues)
    #important to have a return in a function, this is the output that can be attached to a variable
    return neighborhoods_venues_sorted
Next I ran it on my targeted dataframe and assigned the result to a variable. This fixed my issue. I'm still too new to fully understand why the exact same code, when run in a cell, refused to assign the result as a new dataframe.
#creating a variable to hold the df for later access
neighborhoods_venues_sorted = top_ten_venues(denver_grouped)
#reindexing because it's fun
neighborhoods_venues_sorted = neighborhoods_venues_sorted.set_index(['Neighborhood'])
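A likely explanation for the behaviour, sketched with made-up names: a variable assigned inside a function is local to that function, so the dataframe built there disappears when the function ends unless it is returned. The earlier version defined the function and called it without capturing anything, so the notebook-level name was never bound to the new table.

```python
import pandas as pd


def build_without_return(data):
    # "result" is local; it is discarded as soon as the function finishes.
    result = data.copy()
    result["doubled"] = result["value"] * 2


def build_with_return(data):
    result = data.copy()
    result["doubled"] = result["value"] * 2
    return result  # hands the local object back to the caller


source = pd.DataFrame({"value": [1, 2, 3]})

nothing = build_without_return(source)  # a function without return yields None
table = build_with_return(source)       # the caller now holds the new dataframe

print(nothing)
print(table)
```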
I'm writing a function for Jupyter notebooks that lets a user obtain data as a Pandas DataFrame (irrelevant for this question) and display it with filters they can create when needed.
My problem is that I can't "link" the interact filter to the data itself. I had no problem defining the filters manually in the code, but I can't do it from the user side. I went through a lot of questions on Stack Overflow, Google, the project's GitHub and the docs before posting here.
Here is the POC:
import pandas as pd
import ipywidgets as widgets
from ipywidgets import *
from IPython.display import display
import numpy as np
np.random.seed(0)
# Data example
my_columns = list('ABCD')
df = pd.DataFrame(np.random.randint(0,100,size=(100, 4)), columns=list('ABCD'))
# Encapsulate Table inside Output widget
table_out = widgets.Output()
with table_out:
    display(df)
# Filter for ints
def filter_int(column, x):
    return df.loc[df[column] > x]
# Our filter generator
def generate_filter(button):
    # Check if exist before creating
    if not select_definition.value in [ asd.children[0].description for asd in filters.children ]:
        # Not exist. Create this filter
        new_filter = interactive(
            filter_int, # Our filter for ints
            column=fixed(select_definition.value), # Which column as filter
            x=widgets.IntSlider(min=0, max=100, step=1, value=10, description=select_definition.value) # Value from the user
        )
        # Append created filter
        filters.children=tuple(list(filters.children) + [new_filter])
# Define button and event
button = widgets.Button(description="Add")
button.on_click(generate_filter)
# Define Dropdown
select_definition = widgets.Dropdown(options=my_columns, layout=Layout(width='10%'))
# Put Dropdown and button together
choose_filter = widgets.HBox([select_definition, button])
# Where we will put all our filters
filters = widgets.HBox()
display(choose_filter, filters, table_out)
Which will create this:
I'm able to create the filters for the columns dynamically, but I'm not sure how to make them update the table and link together (so the table will be updated based on multiple filters).
The expected result is to be able to create filters for columns A and B and update the table with the values defined by them, as shown in the image below:
Any help is appreciated!
Note: The last image was generated with df.loc[(df['A'] > 22) & (df['B'] > 92)]
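One way the filters might be linked, shown only as a sketch of the data side: keep a dictionary of the active thresholds (one entry per filter the user has added) and rebuild a single boolean mask from it whenever any slider changes. The name apply_filters and the dictionary layout are assumptions; the widget wiring is left out so the block runs without a notebook.

```python
import numpy as np
import pandas as pd


def apply_filters(df, thresholds):
    """Keep rows where every listed column exceeds its threshold.

    thresholds maps column name -> exclusive minimum, e.g. {"A": 22, "B": 92}.
    """
    mask = pd.Series(True, index=df.index)
    for column, minimum in thresholds.items():
        mask &= df[column] > minimum
    return df.loc[mask]


np.random.seed(0)
df = pd.DataFrame(np.random.randint(0, 100, size=(100, 4)), columns=list("ABCD"))

# One entry per filter the user has added so far.
active = {"A": 22, "B": 92}
filtered = apply_filters(df, active)
print(filtered)
```

In the notebook, each slider's observe callback could rebuild the dictionary from the sliders' description and value, call apply_filters, and redraw the result inside table_out after clearing it.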
I am using python3 and pandas to create a script to:
Read unstructured xlsx data of varying column lengths
Total the "this", "last" and "diff" columns
Add Total under the brands columns
Dynamically bold the entire row that contains "total"
On the last point, the challenge I have been struggling with is that the row index changes depending on the data fed into the script. The code provided does not have a solution to this issue. I have tried every variation I can think of using style.applymap(bold), with and without variables.
Example of input
input
Example of desired outcome
outcome
Script:
import pandas as pd
import io
import sys
import warnings
def bold(val):
    return 'font-weight: bold'
excel_file = 'testfile1.xlsx'
df = pd.read_excel(excel_file)
product = (df.loc[df['Brand'] == "widgit"])
product = product.append({'Brand':'Total',
                          'This':product['This'].sum(),
                          'Last':product['Last'].sum(),
                          'Diff':product['Diff'].sum(),
                          '% Chg':product['This'].sum()/product['Last'].sum()
                          },
                         ignore_index=True)
product = product.append({'Brand':' '}, ignore_index=True)
product.fillna(' ', inplace=True)
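Try something like this:
import numpy as np
import pandas as pd
def highlight_max(x):
    # x is one column; bold every cell equal to the value in row 4 (the last row)
    return ['font-weight: bold' if v == x.loc[4] else ''
            for v in x]
df = pd.DataFrame(np.random.randn(5, 2))
df.style.apply(highlight_max)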
output:
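For the question's actual goal, bolding whichever row holds the total regardless of its position, a row-wise variant may be closer. The sketch below assumes the label lives in a Brand column and reads "Total"; bold_total_row, style_totals, and the sample numbers are made up for illustration.

```python
import pandas as pd


def bold_total_row(row, label_column="Brand", label="Total"):
    """Return one CSS string per cell: bold for the whole row if it is the total row."""
    is_total = str(row[label_column]).strip().lower() == label.lower()
    return ["font-weight: bold" if is_total else "" for _ in row]


def style_totals(df):
    # axis=1 makes Styler.apply pass each row as a Series, so the row index never matters.
    return df.style.apply(bold_total_row, axis=1)


# Made-up numbers, only to show the shape of the data.
product = pd.DataFrame({
    "Brand": ["widgit", "widgit", "Total"],
    "This": [10, 20, 30],
    "Last": [5, 15, 20],
})

# Calling the row function directly shows which cells would be bolded.
css = [bold_total_row(row) for _, row in product.iterrows()]
print(css)
```

Calling style_totals(product) in a notebook cell should render the table with the total row in bold, wherever that row ends up.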
I have HoloViews code that is intended to save its output as .html. The code below runs fine, i.e. the HTML is generated and the tags are rendered, but the filters don't work. What am I doing wrong?
def load_data(country, lan_name, **kwargs):
    df = subset
    if country != 'ALL':
        df = df[(df.country == country)]
    if lan_name != 'ALL':
        df = df[(df.lan_name == lan_name)]
    table = format_chars(df['term'], df['hex'])
    #hv.Table(df, ['country', 'lan_name'], [], label='Data Table')
    layout = (table).opts(
        opts.Layout(merge_tools=False),
        opts.Div(width=700, height=400),
    )
    return layout
methods = ['ALL'] + sorted(list(subset['country'].unique()))
models = ['ALL'] + sorted(list(subset['lan_name'].unique()))
dmap = hv.DynamicMap(load_data, kdims=['country', 'lan_name']).redim.values(country=methods, lan_name=models)
hv.save(dmap, 'output.html', backend='bokeh')
By "filters" it sounds like you mean the widgets that select along the country and lan_name dimensions. Each time you select a new value of a widget, a DynamicMap calls the Python function that you provide it (load_data here) to calculate the display (which is what makes it "Dynamic"). There is no Python process available when you have a static HTML file, so the display will never get updated in that case.
To make some limited functionality available in a static HTML file, you can convert the DynamicMap to a HoloMap that contains all the displayed items for some specific combinations of widget values (http://holoviews.org/user_guide/Live_Data.html#Converting-from-DynamicMap-to-HoloMap). The resulting parameter space can quickly get quite large, so you will often need to select a feasible subset of values for this to be a practical option.
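To get a feel for how large that parameter space becomes, the combinations can be enumerated up front. The sketch below uses plain pandas and a made-up stand-in for subset, so it runs without HoloViews; select mirrors the filtering in load_data but returns the dataframe itself. A dictionary keyed by (country, lan_name) tuples like this one, with each value rendered as a HoloViews element, is the kind of mapping a HoloMap is built from.

```python
from itertools import product

import pandas as pd

# Hypothetical stand-in for `subset`; the real data is not shown in the question.
subset = pd.DataFrame({
    "country": ["DE", "DE", "FR"],
    "lan_name": ["German", "English", "French"],
    "term": ["a", "b", "c"],
})


def select(country, lan_name):
    """Same filtering as load_data, but returning the plain dataframe."""
    df = subset
    if country != "ALL":
        df = df[df.country == country]
    if lan_name != "ALL":
        df = df[df.lan_name == lan_name]
    return df


countries = ["ALL"] + sorted(subset["country"].unique())
languages = ["ALL"] + sorted(subset["lan_name"].unique())

# One precomputed frame per widget combination; the count is the product of the option lists.
frames = {key: select(*key) for key in product(countries, languages)}
print(len(frames), "combinations")
```

Combinations that yield an empty frame are natural candidates to drop when trimming the set of values to a feasible subset.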