Sharing dataframe between callbacks - python

I am trying to share a DataFrame between callbacks, but I keep getting the error below. I want to use dcc.Store to hold the data, with one callback filtering the data and the other plotting the graph.
"Callback error updating main_data.data"
My code runs fine if I put everything in one callback, but it stops working once I split it.
import dash
import pathlib
import numpy as np
import dash_core_components as dcc
import dash_html_components as html
import plotly.graph_objs as go
import pandas as pd
import dash_bootstrap_components as dbc
from dash.dependencies import Input, Output, State
from flask import Flask
df = pd.read_csv("salesfunnela.csv")
mgr_options = df["Manager"].unique()
mgr_options = np.insert(mgr_options, 0, 'All Managers')
server = Flask(__name__)
app = dash.Dash(server=server)
app.layout = html.Div([
    dcc.Store(id='main_data'),
    html.Div(
        [
            html.P("Div1", className="control_label"),
            dcc.Dropdown(
                id="Manager",
                options=[{
                    'label': i,
                    'value': i
                } for i in mgr_options],
                value='All Managers'),
        ],
        style={'width': '25%',
               'display': 'inline-block'}),
    dcc.Graph(id='funnel-graph'),
    html.Div(
        [
            html.P("Div2", className="abc"),
        ],
        style={'width': '25%',
               'display': 'inline-block'}),
])
@app.callback(
    dash.dependencies.Output('main_data', 'data'),
    [dash.dependencies.Input('Manager', 'value')])
def update_data(Manager):
    if Manager == "All Managers":
        df_plot = df.copy()
    else:
        df_plot = df[df['Manager'] == Manager]
    return df_plot
@app.callback(
    dash.dependencies.Output('funnel-graph', 'figure'),
    [dash.dependencies.Input('main_data', 'data')])
def update_graph(main_data):
    pv = pd.pivot_table(
        df_plot,
        index=['Name'],
        columns=["Status"],
        values=['Quantity'],
        aggfunc=sum,
        fill_value=0)
    traces = [go.Bar(x=pv.index, y=pv[('Quantity', t[1])], name=t[1]) for t in pv]
    return {
        'data': traces,
        'layout':
        go.Layout(
            title='Customer Order Status for {}'.format(Manager),
            barmode='stack')
    }
if __name__ == '__main__':
    app.run_server(debug=True)

Some time has passed but I hope this might help.
What the previous answer basically suggests is changing def update_graph(main_data) to def update_graph(df_plot) or, alternatively, renaming df_plot inside the function to main_data. That will most likely not solve your problem, though, because update_data cannot store the data in the first place. Storing the filtered data somewhere is probably a good idea, rather than sending it through chained callbacks.
The section on sharing data between callbacks in the docs (https://dash.plotly.com/sharing-data-between-callbacks) says that the data has to be stored as either JSON or base64-encoded binary data. A Pandas DataFrame is neither of those, so it has to be serialized first. If you want to use base64, you should probably convert the DataFrame to a string first and then encode that (e.g. https://docs.python.org/3/library/base64.html). To use JSON in your example code, you would change the return statement to
return df_plot.to_json(date_format='iso', orient='split')
in the update_data function.
Then in update_graph you need to convert the JSON back into a Pandas DataFrame. The first few lines of that function would look like this instead:
def update_graph(main_data):
    df_plot = pd.read_json(main_data, orient='split')
    pv = pd.pivot_table(
        df_plot,
        index=['Name'],
        columns=["Status"],
        values=['Quantity'],
        aggfunc=sum,
        fill_value=0)
I hope this helps, and that it's not too late.
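A minimal sketch of the JSON round trip described in this answer, kept free of Dash so it can be run on its own. The sample data and the helper names filter_to_json and pivot_from_json are made up for illustration; in the app, the body of the first would go in update_data and the body of the second in update_graph. Wrapping the string in io.StringIO is an addition here, since newer pandas releases warn when read_json is given a literal JSON string.

```python
import io

import pandas as pd

# Made-up sample data using the column names from the question.
df = pd.DataFrame({
    "Manager": ["Ann", "Ann", "Bob"],
    "Name": ["Acme", "Acme", "Zeta"],
    "Status": ["won", "pending", "won"],
    "Quantity": [2, 3, 5],
})


def filter_to_json(manager):
    """Filter by manager and serialize, as update_data would return it."""
    if manager == "All Managers":
        df_plot = df.copy()
    else:
        df_plot = df[df["Manager"] == manager]
    # A JSON string is something dcc.Store can hold; a DataFrame is not.
    return df_plot.to_json(date_format="iso", orient="split")


def pivot_from_json(payload):
    """Rebuild the DataFrame from the stored JSON and pivot it."""
    df_plot = pd.read_json(io.StringIO(payload), orient="split")
    return pd.pivot_table(
        df_plot,
        index=["Name"],
        columns=["Status"],
        values=["Quantity"],
        aggfunc="sum",
        fill_value=0,
    )


payload = filter_to_json("Ann")
pv = pivot_from_json(payload)
print(pv)
```

Note that update_graph in the question also uses Manager in the plot title, which is not defined inside that callback, so the selected manager would have to be passed in as well (for example as a second Input).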

You probably want to read more about Chained callbacks...
Docs - https://dash.plotly.com/basic-callbacks
Scroll down to the section: Dash App With Chained Callbacks
In the docs-example, you'll notice that the data is not really passed between two callbacks.
Rather they work like event listeners, listening to updates in the DOM.
In your case, there's nothing called "main-data" in the layout, which the second callback is trying to listen to.
Try experimenting with 'funnel-graph' or 'Div2', or set up another element whose updates these callbacks can track.

Related

Display "5 items chosen" instead of "Red", "Blue", "Yellow" .... in a dropdown Dash Python?

I want to create a table in Dash where it is possible to choose multiple values in a specific column. My goal is to do it with a regular table and then add dropdowns for filtering.
However, when a dropdown is added and the selected choices take up more space than the dropdown is wide, it adds "rows". Is it possible to display the number of choices made instead of the choices themselves?
I wonder if it is possible to do something similar to the Basic example in
https://mdbootstrap.com/docs/standard/extended/multiselect/
I have tried something like this
from dash import Dash, dcc, html, Input, Output
from plotly.express import data
import pandas as pd
df = data.medals_long()
app = Dash(__name__)
app.layout = html.Div([
dcc.Dropdown(df.columns, id='pandas-dropdown-1', multi=True),
html.Div(id='pandas-output-container-1')
])
@app.callback(
    Output('pandas-output-container-1', 'children'),
    Output('pandas-dropdown-1', 'search_value'),
    Input('pandas-dropdown-1', 'value')
)
def update_output(value):
    return f'You have selected {value}', f'{len(value)} values have been chosen'
if __name__ == '__main__':
    app.run_server(debug=True)

Dash Datatable disappearing while switching between Tabs

I have a Dash app with 2 Tabs and on one Tab I have an upload button while on the other Tab the uploaded dataset is being shown. After uploading the data, it is shown on the second tab but when I switch to the first Tab and come back again to the second Tab, the data table is not there anymore. I have tried using persistence and persistence-type but it doesn't work. Here is the code for the data table
@du.callback(
    output=Output('output-datatable', 'children'),
    id='upload-data',
)
def get_a_list(filenames):
    data1 = pd.read_excel(filenames[0])
    return dash_table.DataTable(
        data=data1.to_dict('records'),
        columns=[{'name': i, 'id': i} for i in data1.columns],
        page_size=15, persistence=True, persistence_type='memory')
Instead of persistence, use dcc.Store.
In the layout:
dcc.Store(id='store-data', data=[], storage_type='local')
Callback function:
@du.callback(
    output=Output('store-data', 'data'),
    id='upload-data',
)
Also check out this video: https://www.youtube.com/watch?v=dLykSQNIM1E
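The answer above leaves out the callback bodies. As a rough sketch of the data handling they would need, assuming the upload callback writes plain row dictionaries into 'store-data' and a second callback (with Input('store-data', 'data')) rebuilds the table from them: the helper names to_store and table_props are hypothetical, and the Dash wiring itself is not shown.

```python
import pandas as pd


def to_store(df):
    """Convert a DataFrame into a list of row dicts that a dcc.Store can hold."""
    return df.to_dict("records")


def table_props(records):
    """Build the data/columns arguments for a DataTable from stored rows."""
    # Column ids are taken from the first row; an empty store yields no columns.
    columns = [{"name": c, "id": c} for c in records[0]] if records else []
    return {"data": records, "columns": columns}


# Made-up stand-in for the uploaded spreadsheet.
uploaded = pd.DataFrame({"item": ["a", "b"], "count": [1, 2]})

stored = to_store(uploaded)   # what the upload callback would return
props = table_props(stored)   # what the display callback would pass on
print(props["columns"])
```

Because the store lives in the layout outside the tabs, its contents should still be there when the second tab is rendered again, which is the point of the suggestion.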

dash DataTable reset on page reload

The following code reads a simple model (3 columns 50 rows) from a CSV file for editing in a table in my (larger) dash app. Editing a cell writes the whole table back to file as expected. However, reloading the page displays the table as it was originally loaded from file, thus losing any edits. Any clues about how to keep the edits between page reloads?
df_topic_list = pd.read_csv(model_file)
app.layout = html.Div([
    dcc.Store(id='memory-output'),
    html.Div([
        dash_table.DataTable(
            df_topic_list.to_dict('records'),
            id='memory-table',
            columns=[{"name": i, "id": i} for i in df_topic_list.columns],
            editable=True
        ),
    ])
])
@app.callback(Output('memory-output', 'data'),
              Input('memory-table', 'data'))
def on_data_set_table(data):
    pd.DataFrame(data).to_csv(model_file, index=False)
    return data
app.run_server(port=8052)
When you refresh the page, Dash does not run all of your code again. It only sends app.layout again, which was generated just once and still holds the original data from the file. And when you edit a cell in the table, only the values in the browser are updated (using JavaScript), not app.layout in your code.
However, the table has options to persist values in browser memory, and it should use those values after reloading.
app.layout = html.Div([
    dcc.Store(id='memory-output'),
    html.Div([
        dash_table.DataTable(
            df_topic_list.to_dict('records'),
            id='memory-table',
            columns=[{"name": i, "id": i} for i in df_topic_list.columns],
            editable=True,
            persistence=True,          # <---
            persisted_props=["data"],  # <---
        )
    ])
])
It works for me, but it seems some people have had problems with this.
See the issue: Dash table edited data not persisting · Issue #684 · plotly/dash-table
I found another method that keeps the edits, though.
I assign the table to a separate variable (here, table) and in the callback I replace table.data on it.
from dash import Dash, Input, Output, callback
from dash import dcc, html, dash_table
import pandas as pd
model_file = 'data.csv'
df_topic_list = pd.read_csv(model_file)
app = Dash(__name__)
table = dash_table.DataTable(
    df_topic_list.to_dict('records'),
    id='memory-table',
    columns=[{"name": i, "id": i} for i in df_topic_list.columns],
    editable=True,
)
app.layout = html.Div([
    dcc.Store(id='memory-output'),
    html.Div([table])
])
@app.callback(
    Output('memory-output', 'data'),
    Input('memory-table', 'data')
)
def on_data_set_table(data):
    pd.DataFrame(data).to_csv(model_file, index=False)
    table.data = data  # <--- replace data
    return data
app.run_server(port=8052)
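The underlying effect (data read once at startup versus data re-read on demand) can be shown without Dash. The sketch below uses a temporary CSV as a stand-in for model_file, and load_records is a hypothetical helper. Dash also accepts a function as app.layout and calls it on each page load, so building the table inside such a function from something like load_records would be another option; that wiring is not shown here.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for model_file, written to a temporary directory.
model_file = os.path.join(tempfile.mkdtemp(), "data.csv")
pd.DataFrame({"topic": ["a", "b"], "weight": [1, 2]}).to_csv(model_file, index=False)

# Evaluated once at import time: this is what a static app.layout holds.
records_at_startup = pd.read_csv(model_file).to_dict("records")


def load_records():
    """Re-read the CSV on every call, so saved edits are picked up."""
    return pd.read_csv(model_file).to_dict("records")


# Simulate the callback writing an edited table back to the file.
edited = [{"topic": "a", "weight": 10}, {"topic": "b", "weight": 2}]
pd.DataFrame(edited).to_csv(model_file, index=False)

print(records_at_startup[0]["weight"])  # value captured at startup
print(load_records()[0]["weight"])      # value currently in the file
```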

Define Dependent Dictionaries in Python

I am working on an NLP project analyzing the words spoken by characters in The Office. Part of this project involves making a network diagram of which characters talk to each other for a given episode.
This will be shown in a Dash app by allowing a user to select dropdowns for 4 parameters: season, episode, character1, and character2.
Here is a relevant snippet of my code so far:
#Import libraries
import pandas as pd
import numpy as np
import dash
import dash_core_components as dcc
import dash_html_components as html
import dash_bootstrap_components as dbc
from dash.dependencies import Input, Output, State
#Load data
sheet_url = 'https://docs.google.com/spreadsheets/d/18wS5AAwOh8QO95RwHLS95POmSNKA2jjzdt0phrxeAE0/edit#gid=747974534'
url = sheet_url.replace('/edit#gid=', '/export?format=csv&gid=')
df = pd.read_csv(url)
#Set parameters
choose_season = df['season'].unique()
choose_episode = df['episode'].unique()
choose_character = ['Andy','Angela', 'Darryl', 'Dwight', 'Jan', 'Jim','Kelly','Kevin','Meredith','Michael','Oscar','Pam','Phyllis','Roy','Ryan','Stanley','Toby']
#Define app layout
app = dash.Dash()
server = app.server
app.layout = html.Div([
    dbc.Row([
        dbc.Col(
            dcc.Dropdown(
                id='dropdown1',
                options=[{'label': i, 'value': i} for i in choose_season],
                value=choose_season[0]
            ), width=3
        ),
        dbc.Col(
            dcc.Dropdown(
                id='dropdown2',
                options=[{'label': i, 'value': i} for i in choose_episode],
                value=choose_episode[0]
            ), width=3
        ),
        dbc.Col(
            dcc.Dropdown(
                id='dropdown3',
                options=[{'label': i, 'value': i} for i in choose_character],
                value=choose_character[0]
            ), width=3
        ),
        dbc.Col(
            dcc.Dropdown(
                id='dropdown4',
                options=[{'label': i, 'value': i} for i in choose_character],
                value=choose_character[1]
            ), width=3
        )
    ])
])
if __name__=='__main__':
    app.run_server()
In order to have this work efficiently, I would like to have the following dependencies in the dropdown menus:
1.) The selection of the first dropdown menu updates the second dropdown menu
ie: Season updates possible episodes
2.) The selection of the first two dropdown menus updates the 3rd and 4th dropdown menus
ie: Season, Episode updates possible characters (if a character was not in that episode, they will not appear)
3.) The selection of the third dropdown menu updates the fourth dropdown menu
ie: If a character is selected in the third dropdown menu, they can not be selected in the fourth (can't select the same character twice)
I understand one way to do this is to make a massive season to episode dictionary and then an even larger season to episode to character dictionary.
I've already made the code to process the season to episode dictionary:
@app.callback(
    Output('dropdown2', 'options'),  #--> filter episodes
    Output('dropdown2', 'value'),
    Input('dropdown1', 'value')  #--> choose season
)
def set_episode_options(selected_season):
    return [{'label': i, 'value': i} for i in season_episode_dict[selected_season]], season_episode_dict[selected_season][0]
I can definitely build these dictionaries, but this seems like a really inefficient use of time. Does anyone know of a way to build these dictionaries with just a few lines of code? Not sure how to approach building these in the easiest way possible. Also, if you have an idea for a better way to approach this problem, please let me know that too.
Any help would be appreciated! Thank you!
I think I see what you're asking about now. Something like this should get you a basic dictionary, which you could then modify for the options param for the dropdowns.
df = pd.read_csv(url)
season_episode_character_dictionary = {}
for season in df['season'].unique().tolist():
    df_season = df[df['season'].eq(season)]
    season_episode_character_dictionary[season] = {}
    for episode in df_season['episode'].unique().tolist():
        df_episode = df_season[df_season['episode'].eq(episode)]
        characters = df_episode['characters'].unique().tolist()
        season_episode_character_dictionary[season][episode] = characters
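As a possible shorter variant of the loop above, a single groupby can collect the unique characters for every (season, episode) pair. This is only a sketch on made-up rows: the 'character' column name is an assumption (adjust it to the real column), and character_options is a hypothetical helper that also covers the "can't select the same character twice" requirement from the question.

```python
import pandas as pd

# Made-up sample rows; the 'character' column name is an assumption.
df = pd.DataFrame({
    "season": [1, 1, 1, 2],
    "episode": [1, 1, 2, 1],
    "character": ["Jim", "Pam", "Dwight", "Michael"],
})

# One groupby yields the unique characters for every (season, episode) pair.
grouped = df.groupby(["season", "episode"])["character"].unique()

# Turn the grouped result into a nested season -> episode -> characters dict.
season_episode_characters = {}
for (season, episode), characters in grouped.items():
    season_episode_characters.setdefault(season, {})[episode] = characters.tolist()


def character_options(season, episode, exclude=None):
    """Dropdown options for an episode, optionally leaving one character out."""
    names = season_episode_characters[season][episode]
    return [{"label": n, "value": n} for n in names if n != exclude]


print(season_episode_characters)
print(character_options(1, 1, exclude="Jim"))
```

The season-to-episode mapping needed for the second dropdown then falls out of the same dictionary, for example list(season_episode_characters[season]).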

Passing Figure object to Graph in plotly dash

When I try to pass a Figure object to the dcc.Graph() in my layout, I get an error that says:
dash.exceptions.InvalidCallbackReturnValue: The callback ..graph.figure.. is a multi-output.
Expected the output type to be a list or tuple but got:
Figure({# the content of the figure})
my code is like:
import dash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output
import plotly.express as px
app.layout = html.Div([
    dcc.Graph(
        id='graph'
    )
])
@app.callback(
    [
        Output('graph', 'figure'),
    ],
    [
        Input('my-input', 'value')
    ]
)
def gen_graph(value):
    dff = # my filtered df
    fig = px.line(dff, x='x_var', y='y_var')
    return fig
Feels like I'm missing something in how the Figure should be passed to the dcc.Graph(). Any ideas?
You structured your Output as a list, which makes it a multi-output callback, so Dash expects a list or tuple to be returned. Just change it like this:
@app.callback(
    Output('graph', 'figure'),
    [
        Input('my-input', 'value')
    ]
)
def gen_graph(value):
    ...
Alternatively, you could wrap your output in brackets to make it a list (return [fig]). Either way should work fine.
