How can I configure a line chart using pivot_ui? - python

I would like to create a dynamic line chart in Voila. How can I manipulate the below code to show a standard line graph where the x axis equals column "a" and the y axis equals column "b"? Potentially the user can then dynamically update the output to make the y axis equal to column "c" by drag and drop etc.
from pivottablejs import pivot_ui
import pandas as pd
import IPython
df = pd.DataFrame(("a": [1,2,3], "b": [30,45,60],"c": [100,222,3444]))
display.display(df)
pivot_ui(df,outfile_path='pivottablejs.html',
rendererName="Line Chart",
cols= ["b","c"]
rows= ["a"],
aggregatorName="Sum"
)
display.display(IPython.display.HTML('pivottablejs.html"))
Thank you.

This should be a comment but I cannot post code with comments easily.
Please always test your code, preferably in a new notebook so it runs in a fresh kernel, before you post it.
Your dataframe assignment won't work. Should it be something like below?
That display code won't work in Jupyter notebook classic or JupyterLab presently.
Try something like this for assignment and display:
import pandas as pd
df = pd.DataFrame({"a": [1,2,3], "b": [30,45,60],"c": [100,222,3444]})
display(df)
That works in the classic notebook, JupyterLab, and Voila to make & display the dataframe.
Relatedly, it is advisable to develop for Voila in JupyterLab. JupyterLab's rendering machinery is more modern and so closer to what Voila uses.
You can easily test renderings in launches from the Voila binder example page. Go there and click 'launch binder' for the appropriate rendering. From JupyterLab you can select the Voila icon from the toolbar just above an open notebook and get the Voila rendering on the side.

Related

Can't display LaTeX expressions in pandas using Google Colab

Hello, I have a problem: I can't display a simple pandas table which has LaTeX expressions in it.
I have a simple python script:
import pandas as pd
x = 1
x1 = 2
my_dict = {'Case1': {'Formula1': '$$x^{-1}$$', 'Formula2': '$$x^2$$', 'Formula3': x},
           'Case2': {'Formula1': '$$x^{-2}$$', 'Formula2': '$$x^4$$', 'Formula3': x1}}
df = pd.DataFrame(my_dict)
display(df.transpose())
When I use Jupyter Notebook it renders fine, like this:
[screenshot: example in Jupyter Notebook]
but when I open the same code in Google Colab it shows up like this:
       Formula1    Formula2  Formula3
Case1  $$x^{-1}$$  $$x^2$$   1
Case2  $$x^{-2}$$  $$x^4$$   2
Is there something I can do to make Google Colab display it like Jupyter Notebook?
You could use something like:
from IPython.display import Math
Math('$$x^{-1}$$')
This wouldn't work for the cells inside the table, but I don't think there's any way to override pandas' behavior, so you might have to write your own code to print the table out with the formulas intact.
You could also do
import pandas as pd
from IPython.display import Markdown
v = pd.DataFrame({'Case1':{'Formula1':'$$x^{-1}$$', 'Formula2':'$$x^2$$', 'Formula3':1},
'Case2':{'Formula1':'$$x^{-2}$$', 'Formula2':'$$x^4$$', 'Formula3':2}})
display(Markdown(str(v)))
but the table formatting is broken.

Interactive visualization - select which csv to visualize

I'm writing interactive visualization code in Python.
What I would like to do is create an interactive visualization that lets the user select a file from a dropdown menu (or something like that) and then shows a bar plot of the selected data.
My data folder has the following structure:
+-- it_features
| +-- it_2017-01-20--2017-01-27.csv
| +-- it_2017-01-27--2017-02-03.csv
| +-- it_2017-02-03--2017-02-10.csv
and so on (there are many more files, I'm just reporting few of them for simplicity).
So far I'm able to access and retrieve all the data contained in the folder:
import os
import pandas as pd
path = os.getcwd()
file_folder = os.path.join(path,'it_features')
for csv_file in os.listdir(file_folder):
    print(csv_file)
    file = os.path.join(file_folder, csv_file)
    df = pd.read_csv(file)
    # following code....
What I would like to do is create an interactive visualization which allows the user to select the file name (for example it_2017-02-03--2017-02-10.csv) and plot the data of that file.
I'm able to select "by hand" the file I want and plot its data by inserting its filename in a variable and then retrieving the data, but I would like not to insert it via code and allow the final user to browse and select one of the files using a dropdown menu or something similar.
My simple code:
import os
import pandas as pd
path = os.getcwd()
file_folder = os.path.join(path,'it_features')
file = os.path.join(file_folder,'it_2020-02-07--2020-02-14.csv') # Here I insert my filename
df=pd.read_csv(file)
ax=df.value_counts(subset=['Artist']).head(10).plot(y='number of songs',kind='bar', figsize=(15, 7), title="7-14 February 2020")
ax.set_xlabel("Artist")
ax.set_ylabel("Number of Songs Top 200")
Which generates the following plot:
As I already said, I would like to introduce some kind of dropdown menu that allows the user to select the csv data they want to plot using an interactive plot.
I saw that it's possible to create dropdown menus with Plotly, but in the various examples (https://plotly.com/python/dropdowns/) it doesn't seem to select and then load the data.
I also saw this code (Kaggle code) which seems to do what I wanted to do: you can select the region and plot the data from that region.
The main problem is that he just creates a big unique dataframe with US states, and then creates a trace for each one of them.
What I would like to do (if possible) is select the file name from the dropdown, load the csv and then plot its data, without creating a single giant dataframe with all my files in it.
Is it possible?
EDIT: The solution proposed by gherka works perfectly, but I would like to have a solution inside Plotly using its dropdown menu.
Since you're working in Jupyter Notebook, you have a number of different options.
Some visualisation libraries will have built-in widgets that you can use, however they would often require you to run a server or provide a javascript callback. For a library-agnostic approach, you can use ipywidgets. This library is specifically for creating widgets to be used in Jupyter Notebooks. The documentation is here.
To create a simple dropdown with a static bar plot underneath, you would need three widgets - Label for dropdown description, Dropdown and Output. VBox is for laying them out.
from ipywidgets import VBox, Label, Dropdown, Output
desc = Label("Pick a .csv to plot:")
dropdown = Dropdown(
    options=['None', 'csv1', 'csv2', 'csv3'],
    value='None',
    disabled=False)
output = Output()
dropdown.observe(generate_plot, names="value")
VBox([desc, dropdown, output])
The key element is the generate_plot function. It must have a single parameter that you use to decide what effect the widget action has on your plot. When you interact with the dropdown, the generate_plot function will be called and passed a dictionary with "new" value, "old" value and a few other things.
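To make that concrete, here is a minimal sketch of the two pieces the callback needs: building the option list from the folder, and turning the "new" value into a file path. The helper names csv_options and selected_path are hypothetical, and a plain dict stands in for the change object, on the assumption (as described above) that it can be read like a dictionary.

```python
import os


def csv_options(folder):
    """Build the dropdown options: a 'None' placeholder plus every .csv file name."""
    names = sorted(f for f in os.listdir(folder) if f.lower().endswith(".csv"))
    return ["None"] + names


def selected_path(change, folder):
    """Map a change event to the full path of the chosen file (None for the placeholder)."""
    if change["new"] == "None":
        return None
    return os.path.join(folder, change["new"])


# A plain dict stands in for the event the dropdown passes to the callback.
fake_change = {"name": "value", "old": "None", "new": "it_2017-01-20--2017-01-27.csv"}
print(selected_path(fake_change, "it_features"))
```

Inside generate_plot you would then pass the returned path to pd.read_csv, and skip plotting when it is None.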
Here's a function to generate a basic seaborn bar chart with an adjustable data source. Notice I had to include an explicit plt.show() - plots won't render otherwise.
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

def generate_plot(change):
    with output:
        output.clear_output()  # reset the view
        if change["new"] != "None":
            data = pd.read_csv(...)  # your custom code based on dropdown selection
            sns.catplot(x="Letters", y="Numbers", kind="bar", data=data)
            plt.show()  # render the figure that catplot just created
If you have many large .csv files, one other thing you might want to do is implement a caching system so that you keep the last few user selections in memory and avoid re-reading them on each selection.
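One possible way to get such a cache is functools.lru_cache from the standard library. This is only a sketch: the function name load_rows, the cache size of 3 and the demo file are made up, and it reads with the csv module so that it runs without pandas. You could wrap a pd.read_csv call in the same way.

```python
import csv
import os
import tempfile
from functools import lru_cache


@lru_cache(maxsize=3)  # keep only the three most recently selected files
def load_rows(path):
    """Read one CSV file; repeat calls with the same path come from the cache."""
    with open(path, newline="") as handle:
        return tuple(csv.DictReader(handle))


# Demo with a throwaway file (the name and contents are made up).
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "demo.csv")
with open(demo_path, "w", newline="") as handle:
    handle.write("Artist,Track\nA,one\nB,two\n")

first = load_rows(demo_path)   # reads from disk
second = load_rows(demo_path)  # served from memory
print(load_rows.cache_info())
```

Keep in mind that a cached entry goes stale if the file changes on disk; load_rows.cache_clear() empties the cache.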
For a more in-depth look at how to add interactivity to matplotlib plots using ipywidgets I found this tutorial quite useful.
tkinter is a super common UI framework for python, and is part of the standard library. Based on answers in a similar question, you can use this:
from tkinter.filedialog import askopenfilename
filename = askopenfilename()
which pops up a standard file explorer window.

Display data in pandas dataframe

This code allows me to display pandas DataFrame contents in a Jupyter notebook.
import pandas as pd
# create a simple dataset of people
data = {'Name': ["John", "Anna", "Peter", "Linda"],
        'Location': ["New York", "Paris", "Berlin", "London"],
        'Age': [24, 13, 53, 33]
        }
data_pandas = pd.DataFrame(data)
# IPython.display allows "pretty printing" of dataframes
# in the Jupyter notebook
display(data_pandas)
However, I am not using Jupyter notebook. I am using pycharm and Anaconda (python v3.6). How should I display data_pandas if I am not using Jupyter?
Put data_pandas in a cell and run that cell. It will display the content in the output.
To do the same thing in PyCharm you will have to run a Jupyter notebook from within PyCharm, which works like this: https://www.jetbrains.com/help/pycharm/2016.3/using-ipython-jupyter-notebook-with-pycharm.html
Then it's basically the same as running a normal Jupyter notebook in a different browser.
If you are running a normal Python program and want inline output, it's not going to happen. You will have to at least run it under IPython (interactive Python) to get that.

Dataframe head not shown in PyCharm

I have the following code in PyCharm
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.read_csv("c:/temp/datafile.txt", sep='\t')
df.head(10)
I get the following output:
Process finished with exit code 0
I am supposed to get the first ten rows of my datafile, but these do not appear in PyCharm.
I checked the Project interpreter and all settings seem to be alright there. The right packages are installed (numpy, pandas, matplotlib) under the right Python version.
What am I doing wrong? Thanks.
PyCharm runs your file as a script. Unlike the interactive Python shell, it does not automatically print the result of each expression.
In PyCharm you have to use print() to display anything.
print(df.head(10))
The same applies when you run the script from another IDE or editor, or directly with python script.py
To print all the data:
print(df)
By default, head() returns the first 5 rows.
print(df.head())
If you need 10 rows, write it this way:
print(df.head(10))
I used File > Invalidate Caches / Restart and chose the Invalidate option; after that I was able to get the head.

IPython Notebook output cell is truncating contents of my list

I have a long list (about 4000 items) whose content is suppressed when I try to display it in an ipython notebook output cell. Maybe two-thirds is shown, but the end has a "...]", rather than all the contents of the list. How do I get ipython notebook to display the whole list instead of a cutoff version?
pd.options.display.max_rows = 4000
worked for me
See : http://pandas.pydata.org/pandas-docs/stable/options.html
I know it's a pretty old thread, but I still wanted to post my answer in the hope it helps someone.
You can change the number of max_seq_items shown by configuring the pandas options as follows:
import pandas as pd
pd.options.display.max_seq_items = 2000
This should work:
print(str(mylist))
Simple!
How to disable list truncation in IPython:
Create an IPython config file if you don't already have one:
ipython profile create
Edit the config file to include this line:
c.PlainTextFormatter.max_seq_length = 0
Restart your notebook instance.
The following line prints everything in your list in a readable manner.
print(*lis, sep="\n")
A quick hack if you're using pandas is to do
from pandas import DataFrame
from IPython.display import HTML
HTML(DataFrame(myList).to_html())
For cases where the output of print(mylist) is something like [1, 1, 1, ..., 1, 1, 1] then [*mylist] will expand the items into rows where all items are visible.
Here's a way to display the whole list in the IPython output cell that doesn't require Pandas:
from IPython.display import HTML
x = range(4000)
HTML('<br />'.join(str(y) for y in x))
It is also pretty easy to add additional HTML elements and get a more elaborate display. Clicking to the left of the output cell will now shrink the contents and add a local scroll bar.
Just use print instead of evaluating the list directly, e.g. print(mylist) (or print mylist in Python 2). The output is not truncated then.
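For a plain Python list, a short sketch of the print approach looks like this. The list contents are made up, and pprint.pformat is my own addition for a wrapped layout; it was not suggested anywhere in this thread.

```python
from pprint import pformat

mylist = list(range(4000))  # stand-in for the long list from the question

full_text = str(mylist)  # str() on a plain list includes every item
wrapped = pformat(mylist, width=80, compact=True)  # same items, wrapped over many lines

print(full_text)
print(wrapped)
```

Note that this applies to built-in lists; pandas and NumPy objects have their own display options, as covered in the other answers.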
