Is it possible to create a chart in Python with multiple categories?
For example:
I copied and pasted this into PowerPoint, loaded it, and tried this:
presentation.slides[0].shapes[0].chart.plots[0].categories.flattened_labels
which gave me all the labels as tuples:
(('Oct-2019', 'Advertiser'), ('Oct-2019', '25th percentile'), etc ...)
If I try printing:
presentation.slides[0].shapes[0].chart.plots[0].categories[i]
for i from 0 to 3,
I get the values 'Advertiser', '25th percentile', etc., but I can't find a way to access the 'Oct-2019' value.
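One way to recover the outer level is to read it from the flattened tuples, since each one runs from the outermost level to the leaf. A minimal sketch, using a hand-written tuple in place of the real flattened_labels value (the 'Nov-2019' entries are made up for illustration):

```python
# Stand-in for chart.plots[0].categories.flattened_labels (illustrative data).
flattened_labels = (
    ("Oct-2019", "Advertiser"),
    ("Oct-2019", "25th percentile"),
    ("Nov-2019", "Advertiser"),
    ("Nov-2019", "25th percentile"),
)

# Each tuple runs from the outermost level to the leaf, so the first
# element is the top-level label and the last is the leaf label.
top_level = list(dict.fromkeys(labels[0] for labels in flattened_labels))
leaves = [labels[-1] for labels in flattened_labels]

print(top_level)
print(leaves)
```

dict.fromkeys is used here only to drop repeated top-level labels while keeping their original order.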
When creating the ChartData, I saw that I can also call add_category, but this adds the label to the categories I'm currently accessing (e.g. Advertiser, 25th percentile, etc.). I would like to add 'Nov 2019', which is at another hierarchy level.
This is a bit of a roundabout way of asking whether anyone has created a multi-category chart with python-pptx, and how they would do it at the ChartData level and what that would look like.
Thank you!
import numpy as np
import pandas as pd
from pptx.chart.data import CategoryChartData
#data
pivot_digital_monthly_imp_laydown = pd.pivot_table(digital_monthly_imp_laydown, values='Original Value', index=['Year','Month'],
    columns=['Campaign'], aggfunc=np.sum,fill_value=0,sort=False).reset_index().rename_axis(None, axis=1)
pivot_digital_monthly_imp_laydown.Year=pivot_digital_monthly_imp_laydown.Year.astype(str)
#inserting Impressions laydown in slide
prs=input_ppt
digital_monthly_imp_laydown_chart = prs.slides[3].shapes[6].chart
digital_monthly_imp_laydown_data =CategoryChartData()
#multilevel Categories
cat=list(pivot_digital_monthly_imp_laydown['Year'])
subcat=list(pivot_digital_monthly_imp_laydown['Month'])
b={}
for i,j in zip(cat,subcat):
    key=str(i)
    b.setdefault(key,[])
    b[key].append(j)
main_cat=list(b.keys())
for i in range(len(main_cat)):
    ear=digital_monthly_imp_laydown_data.add_category(str(main_cat[i]))
    for sub in b[main_cat[i]]:
        ear.add_sub_category(str(sub))
#add series data
for col in pivot_digital_monthly_imp_laydown.columns[2:]:
    new = list(pivot_digital_monthly_imp_laydown[col])
    digital_monthly_imp_laydown_data.add_series(str(col),new)
digital_monthly_imp_laydown_chart.replace_data(digital_monthly_imp_laydown_data)
#saving
prs.save("./Output/Sub Brand Output.pptx")
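The grouping step in the code above can be looked at in isolation. A minimal sketch without pandas or python-pptx, where group_subcategories is a hypothetical helper name and the sample rows are made up:

```python
def group_subcategories(pairs):
    """Group (category, sub_category) pairs by category, keeping first-seen order."""
    grouped = {}
    for category, sub_category in pairs:
        grouped.setdefault(str(category), []).append(str(sub_category))
    return grouped

# Illustrative rows; in the code above these come from the Year and Month columns.
pairs = [(2019, "Nov"), (2019, "Dec"), (2020, "Jan")]
grouped = group_subcategories(pairs)

for category, sub_categories in grouped.items():
    # With python-pptx this is where add_category(category) and then
    # add_sub_category(sub) would be called, as in the loop above.
    print(category, sub_categories)
```

Because a dict preserves insertion order, the categories come out in the order they first appear in the data, which is what the chart needs.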
I have defined 10 different DataFrames (A06_df, A07_df, etc.), which pick up six different data point inputs in a daily time series over a number of years. To be able to work with them I need to do some formatting operations such as:
A07_df=A07_df.fillna(0)
A07_df[A07_df < 0] = 0
A07_df.columns = col # col is defined
A07_df['oil']=A07_df['oil']*24
A07_df['water']=A07_df['water']*24
A07_df['gas']=A07_df['gas']*24
A07_df['water_inj']=0
A07_df['gas_inj']=0
A07_df=A07_df[['oil', 'water', 'gas','gaslift', 'water_inj', 'gas_inj', 'bhp', 'whp']]
etc for a few more formatting operations
Is there a nice way to have a for loop or something so I don’t have to write each operation for each dataframe A06_df, A07_df, A08.... etc?
As an example, I have tried
list=[A06_df, A07_df, A08_df, A10_df, A11_df, A12_df, A13_df, A15_df, A18_df, A19_df]
for i in list:
    i=i.fillna(0)
But this does not do the trick.
Any help is appreciated
As i.fillna() returns a new object (an updated copy of your original dataframe), i=i.fillna(0) rebinds the name i to that copy but does not change the DataFrames held in the list (A06_df, A07_df, ...).
I suggest you copy the updated content in a new list like this:
list_raw = [A06_df, A07_df, A08_df, A10_df, A11_df, A12_df, A13_df, A15_df, A18_df, A19_df]
list_updated = []
for i in list_raw:
    i=i.fillna(0)
    # More code here
    list_updated.append(i)
To simplify your future processing, I would recommend using a dictionary of dataframes instead of a list of named variables.
dfs = {}
dfs['A0'] = ...
dfs['A1'] = ...
dfs_updated = {}
for k,i in dfs.items():
    i=i.fillna(0)
    # More code here
    dfs_updated[k] = i
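The rebinding behaviour is not specific to pandas. A minimal sketch with plain lists standing in for the DataFrames (the clean helper and the sample values are hypothetical), showing that assigning to the loop variable leaves the container untouched, while building a new dictionary keeps the results:

```python
def clean(values):
    """Return a new list with None replaced by 0 and negatives clipped to 0."""
    return [0 if v is None or v < 0 else v for v in values]

raw = {"A06": [1, None, -2], "A07": [None, 3, 4]}

# Rebinding the loop variable only changes what the name `values` points to.
for values in raw.values():
    values = clean(values)
print(raw["A06"])  # still the original, uncleaned list

# Storing each result under its key keeps the cleaned copies.
cleaned = {name: clean(values) for name, values in raw.items()}
print(cleaned["A06"])
```

With real DataFrames, the body of clean would hold the fillna, clipping and column operations, and the dictionary comprehension would stay the same.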
The Problem
I wanted to create an interactive hbar plot, where you can switch between 3 different data sources, using a select widget, a python callback and a local bokeh serve. The plot with the default source renders fine, but when I switch to a different source, the y labels stay the same and the plot turns blank. Changing back to the original value on the select widget does not show the plot I started out with and stays blank.
When I hard-code the initial source to a different one in the code, it renders just fine until I switch it using the widget again, so each data source seems to work fine individually.
Am I missing something? I read through many threads, docs and tutorials but can't find anything wrong with my code.
Here is what I have done so far:
I read a .csv and create 3 separate dataframes, then convert them to ColumnDataSources. Every source has 10 data entries with the columns "species", "ci_lower" and "ci_upper".
Here is an example of one source (all three are built exactly the same way, with different taxon classes):
df = pd.read_csv(os.path.join(os.path.dirname(__file__), "AZA_MLE_Jul2018_utf8.csv",), encoding='utf-8')
m_df = df[df["taxon_class"]=="Mammalia"]
m_df = m_df.sort_values(by="mle", ascending=False)
m_df = m_df.reset_index(drop=True)
m_df = m_df.head(10)
m_df = m_df.sort_values(by="species", ascending=False)
m_df = m_df.reset_index(drop=True)
m_source = bp.ColumnDataSource(m_df)
I saved all 3 sources in a dict:
sources_dict={
    "Mammalia": m_source,
    "Aves": a_source,
    "Reptilia": r_source
}
... and then created my variable called "source" that should change interactively with the "Mammalia" source as default:
source = sources_dict["Mammalia"]
Next I created a figure and added a hbar plot with the source variable as follows:
plot = bp.figure(x_range=(0, np.amax(source.data["ci_upper"])+5), y_range=source.data["species"])
plot.hbar(y="species", right="ci_lower", left="ci_upper", height=0.5, fill_color="#b3de69", source=source)
Then I added the select widget with a python callback:
def select_handler(attr, old, new):
    source.data["species"]=sources_dict[new].data["species"]
    source.data["ci_lower"]=sources_dict[new].data["ci_lower"]
    source.data["ci_upper"]=sources_dict[new].data["ci_upper"]
select = Select(title="Taxonomic Class:", value="Mammalia", options=list(sources_dict.keys()))
select.on_change("value", select_handler)
curdoc().add_root(bk.layouts.row(plot, select))
I tried this:
My suspicion was that the error lies within the callback function, so I tried many different variants, all with the same bad result. I will list some of them here:
I tried using a python native dictionary:
new_data= {
    'species': sources_dict[new].data["species"],
    'ci_lower': sources_dict[new].data["ci_lower"],
    'ci_upper': sources_dict[new].data["ci_upper"]
}
source.data=new_data
I tried assigning the whole data source, not just swapping the data
source=sources_dict[new]
I also tried using dict()
source.data = dict(species=sources_dict[new].data["species"], ci_lower=sources_dict[new].data["ci_lower"], ci_upper=sources_dict[new].data["ci_upper"])
Screenshots
Here is a screenshot of the initial plot, when I run the py file with bokeh serve --show file.py
And here one after changing the selected value:
I would greatly appreciate any hints that could help me figure this out.
To answer your question from the comment: changing the data does not change the ranges, because passing y_range=some_thing is just a convenience for creating a proper range object, which happens behind the scenes.
Here's how you can do it manually. Notice that I don't touch x_range at all - by default it's DataRange1d that computes its start/end values automatically.
from bokeh.io import curdoc
from bokeh.layouts import column
from bokeh.models import Select, ColumnDataSource
from bokeh.plotting import figure
d1 = dict(x=[0, 1], y=['a', 'b'])
d2 = dict(x=[8, 9], y=['x', 'y'])
ds = ColumnDataSource(d1)
def get_factors(data):
    return sorted(set(data['y']))
p = figure(y_range=get_factors(d1))
p.circle(x='x', y='y', source=ds)
s = Select(options=['1', '2'], value='1')
def update(attr, old, new):
    if new == '1':
        ds.data = d1
    else:
        ds.data = d2
    p.y_range.factors = get_factors(ds.data)
s.on_change('value', update)
curdoc().add_root(column(p, s))
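The same update can be written with a lookup table instead of the if/else, which scales more easily to the three sources in the question. A sketch of just the data-selection logic, without Bokeh; the names DATASETS and select_data are illustrative:

```python
d1 = dict(x=[0, 1], y=["a", "b"])
d2 = dict(x=[8, 9], y=["x", "y"])
DATASETS = {"1": d1, "2": d2}  # keyed by the Select widget's option values

def get_factors(data):
    return sorted(set(data["y"]))

def select_data(option):
    """Return the data dict and the categorical factors for one option."""
    data = DATASETS[option]
    return data, get_factors(data)

# In the callback, the two results would be assigned to ds.data and
# p.y_range.factors respectively.
data, factors = select_data("2")
print(factors)
```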
I have a dataframe that denotes events that occurred in particular locations.
I am aware that folium does not allow dynamic display of the appearance of the events, so I was thinking of iterating through the dates and saving a PNG of each folium map created.
Unfortunately I am mentally stuck on a two-part problem:
1) how to loop through ranges of dates (for example, one map for each month)
2) an appropriate way to save the generated image for each loop iteration.
This is a dataframe sample for this example:
since = ['2019-07-05', '2019-07-17', '2014-06-12', '2016-03-11']
lats = [38.72572, 38.71362, 38.79263, 38.71931]
longs = [-9.13412, -9.14407, -9.40824, -9.13143]
since_map = {'since' : pd.Series(since), 'lats' : pd.Series(lats), 'longs' : pd.Series(longs)}
since_df = pd.DataFrame(since_map)
I was able to create the base map:
lat_l = 38.736946
long_l = -9.142685
base_l = folium.Map(location=[lat_l,long_l], zoom_start=12)
neigh = folium.map.FeatureGroup()
And add some of the markers to the folium map:
for lat, lon in zip(since_df.lats, since_df.longs):
    neigh.add_child(folium.CircleMarker([lat, lon], radius = 2, color = 'blue', fill = True))
base_l.add_child(neigh)
I am struggling to visualize how to loop through ranges of the dates and save each file. From what I saw here:
https://github.com/python-visualization/folium/issues/35 I actually have to open the saved html and then save it as png for each image.
If you could point me to an example or documentation that could demonstrate how this can be accomplished I would be very appreciative.
If you think I am overcomplicating it or you have a better alternative to what I am thinking I have an open ear to suggestions.
Thank you for your help.
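For the first part, one option is to group the rows by calendar month before drawing anything. A minimal standard-library sketch using the sample values above; the output file-name pattern is an assumption, and the map drawing itself is left as a comment:

```python
from collections import defaultdict
from datetime import date

since = ["2019-07-05", "2019-07-17", "2014-06-12", "2016-03-11"]
lats = [38.72572, 38.71362, 38.79263, 38.71931]
longs = [-9.13412, -9.14407, -9.40824, -9.13143]

# Bucket the coordinates by (year, month) of the event date.
by_month = defaultdict(list)
for day, lat, lon in zip(since, lats, longs):
    d = date.fromisoformat(day)
    by_month[(d.year, d.month)].append((lat, lon))

for (year, month), points in sorted(by_month.items()):
    filename = f"map_{year}-{month:02d}.html"  # hypothetical naming scheme
    # Build a fresh folium map here, add one CircleMarker per point,
    # and write it out with the map's save() method.
    print(filename, len(points))
```

This only covers the looping and naming. Turning each saved HTML file into a PNG is a separate step, as the linked issue describes.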
I have some HoloViews code whose output I want to save as .html. The code below runs, i.e. the HTML is generated and the tags are rendered, but the filters don't work. What am I doing wrong?
def load_data(country, lan_name, **kwargs):
    df = subset
    if country != 'ALL':
        df = df[(df.country == country)]
    if lan_name != 'ALL':
        df = df[(df.lan_name == lan_name)]
    table = format_chars(df['term'], df['hex'])
    #hv.Table(df, ['country', 'lan_name'], [], label='Data Table')
    layout = (table).opts(
        opts.Layout(merge_tools=False),
        opts.Div(width=700, height=400),
    )
    return layout
methods = ['ALL'] + sorted(list(subset['country'].unique()))
models = ['ALL'] + sorted(list(subset['lan_name'].unique()))
dmap = hv.DynamicMap(load_data, kdims=['country', 'lan_name']).redim.values(country=methods, lan_name=models)
hv.save(dmap, 'output.html', backend='bokeh')
By "filters" it sounds like you mean the widgets that select along the country and lan_name dimensions. Each time you select a new value of a widget, a DynamicMap calls the Python function that you provide it (load_data here) to calculate the display (which is what makes it "Dynamic"). There is no Python process available when you have a static HTML file, so the display will never get updated in that case.
To make some limited functionality available in a static HTML file, you can convert the DynamicMap to a HoloMap that contains all the displayed items for some specific combinations of widget values (http://holoviews.org/user_guide/Live_Data.html#Converting-from-DynamicMap-to-HoloMap). The resulting parameter space can quickly get quite large, so you will often need to select a feasible subset of values for this to be a practical option.
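To get a feel for how quickly the parameter space grows, the combinations can be counted before any conversion. A small sketch with made-up widget values (the real lists come from methods and models above):

```python
from itertools import product

# Made-up stand-ins for the widget values.
countries = ["ALL", "DE", "FR", "JP"]
languages = ["ALL", "German", "French", "Japanese", "English"]

# One rendered item is needed per (country, lan_name) pair.
combinations = list(product(countries, languages))
print(len(combinations))

# Restricting one dimension shrinks the space proportionally.
reduced = [(c, l) for c, l in combinations if c in {"ALL", "DE"}]
print(len(reduced))
```

The count is the product of the list lengths, so every extra widget dimension multiplies the number of items that have to be embedded in the HTML file.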
I have been playing around with Pygal, creating some line graphs for a project I am working on. I currently have my y axis set to the value recorded and the x axis set to the date/time the test was conducted. However, I would also like to link the serial number to each data point. At the moment, when you hover over a data point you get the y value in bold, and underneath that you get the date it was recorded.
Does anyone know if it is possible to link information to data points without them being an axis label?
For reference I currently have the serial numbers being added to the list: 'sn_list'.
for row in line_graph_query:
    if str(row.date_time) >= start_date and str(row.date_time) <= end_date :
        min_values.append(float(row.minimum_value))
        max_values.append(float(row.maximum_value))
        recorded_values.append(float(row.recorded_value))
        sn_list.append(row.product_serial_number)
        date_list.append(row.date_time)
        number_of_records = number_of_records + 1
print(min_values)
print(max_values)
print(recorded_values)
distance_x_axis = math.floor(number_of_records/6)
line_chart = pygal.Line(no_data_text='No result found', style=custom_style,x_labels_major_every=distance_x_axis, x_label_rotation=20, show_minor_x_labels=False )
line_chart.title = 'Detailed Results of '+test_name+' tests of '+board_pn
line_chart.x_labels = map(str,date_list)
line_chart.add('Minimum', min_values)
line_chart.add('Maximum', max_values)
line_chart.add('Recorded', recorded_values)
graph_render.append(line_chart.render_data_uri())
graphs_to_render[test_name] = graph_render[-1]
You can set the tooltip to any text you like by providing your data as dicts (see the documentation here). Each value should be represented by a dict that has at least a value key; this is the same value that you were providing to the chart directly. There are a number of other keys you can set, among them label.
You should be able to get the tooltips you want by changing the three lines that append data in your if structure:
min_values.append({"value": float(row.minimum_value),
                   "label": row.product_serial_number})
max_values.append({"value": float(row.maximum_value),
                   "label": row.product_serial_number})
recorded_values.append({"value": float(row.recorded_value),
                        "label": row.product_serial_number})
Unless you are using it somewhere else, this also means that you no longer need sn_list.