What is the difference between plot and iplot in Pandas?

What is the difference between plot() and iplot() in displaying a figure in Jupyter Notebook?

I just started using iplot() in Python (3.6.6). I think it comes from the Cufflinks wrapper, which connects pandas to Plotly; Plotly renders charts with its own JavaScript library (plotly.js), not Matplotlib. It seems to be the easiest way for me to get interactive plots with a single line of code.
It does need some libraries to be set up first, though. For example, the code below works in Jupyter Notebook (5.0.0) on macOS. The plots attached here are PNG and therefore not interactive.
Example: (1) line plot, (2) bar plot (code below)
# Import libraries
import pandas as pd
import numpy as np
from plotly import __version__
%matplotlib inline
import cufflinks as cf
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
init_notebook_mode(connected=True)
cf.go_offline()
# Create random data
df = pd.DataFrame(np.random.randn(100,4), columns='Col1 Col2 Col3 Col4'.split())
df.head(2)
# Plot lines
df.iplot()
# Plot bars
df.iplot(kind='bar')

iplot is an interactive plot. Plotly takes Python code and produces JavaScript plots. These give you a lot of control over how the plots look, and they let you zoom, show information on hover, and toggle which data series are shown on the chart. Tutorial.
The plot command uses Matplotlib, which is more old-school. It creates static charts, so there is no hover information, and you have to rerun the code to change anything. It was modeled after MATLAB, which is an older program, so some people say it looks worse. It has a lot of options, though, and gives you a good amount of control over plots. With a huge data set it will probably render faster than a Plotly chart, but I wouldn't expect a big difference. Tutorial.
Matplotlib is the standard and has been around longer, so there is a lot of information on it. Here is a blog post talking about different plotting packages in Python.

Correct answer provided. I tried to run this code in the PyCharm IDE but could not; a Jupyter notebook was required to display the iplot() output.

iplot() is more sophisticated and more interactive than the plot() method in pandas. iplot() covers what plot() has to offer and adds many features that make the chart interactive.

Related

Jupyter and %matplotlib inline lost axis

I am having a really weird issue with using the %matplotlib inline code in my jupyter notebook for plotting graphs using both pyplot and the pandas plotting function.
The problem is they show up without any axes, and basically just show the graph area without anything aside from data points.
I found adding:
import matplotlib as mpl
mpl.rcParams.update(mpl.rcParamsDefault)
reverses it, but I find it odd that I should have to do that every time, as the effect disappears as soon as I run the %matplotlib inline command again.
An example could be:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
A = np.arange(10)  # example data (assumed; A was not defined in the original post)
plt.scatter(A, A)
plt.tight_layout()
plt.xlabel('here')
plt.show()
This would generate the graph below:
Weirdly enough, if I use savefig, the plot is saved with the axes. If I use right-click -> new output -> save as figure, I also get the graph with the axes!
Like this:
Can anyone help me understand what is wrong, which global setting did I mess up, and how do I revert it?
(I don't remember messing around with any settings aside from some settings for pandas, but don't think they should have had an impact)
As mentioned, running the mpl.rcParams.update(mpl.rcParamsDefault) command does bring it back to normal until I run %matplotlib inline again!
Any help would be much appreciated.
Okay, I am sorry, I think I can answer the question myself now.
The helpful @Mr. T asking for the imgur link made me realize what was going on. I had started using the dark JupyterLab theme, and the plots were generated with a transparent background, i.e. the text and lines were there, but I just couldn't see them.
The trick is to change the background color, preferably globally, but that will be a task for tomorrow.
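One way to change the background globally is through Matplotlib's rcParams. The sketch below is only an illustration of that idea, not the poster's eventual fix. The choice of white is an assumption, and in a notebook the update would have to run after %matplotlib inline, since the question reports that the magic resets the settings.

```python
import matplotlib
matplotlib.use("Agg")  # only for running outside a notebook; drop this line after %matplotlib inline
import matplotlib.pyplot as plt

# Give every new figure an opaque white background so that axes, ticks and
# labels stay visible on top of a dark notebook theme.
plt.rcParams.update({
    "figure.facecolor": "white",
    "axes.facecolor": "white",
    "savefig.facecolor": "white",
})

fig, ax = plt.subplots()
ax.scatter(range(10), range(10))
ax.set_xlabel("here")
```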

Getdist plot not showing up in console

I'm using getdist to plot some simulation results.
In Jupyter, when I write just these lines:
g = plots.getSubplotPlotter()
g.triangle_plot([samples, samples2], filled=True)
Python shows the plots, as we can see here.
Now if we write the same code in a script and run it with IDLE, it does not produce any plot. plt.show() does not work here.
How do I instruct Python or matplotlib to show the plots and save them?
The problem is that getdist sets the backend to Agg (in this line), which is a non-interactive backend and hence cannot produce an interactive figure via plt.show().
This is pretty bad style, because the user should select the backend, not the package. You might want to inform the developers about this design flaw.
In any case, it is possible to switch the backend after importing getdist, via plt.switch_backend(..). As the backend you would need to use any interactive backend you have available, e.g. "Qt5Agg" or "TkAgg".
import numpy as np
from getdist import plots, MCSamples
import matplotlib.pyplot as plt
plt.switch_backend("Qt5Agg")
# .. some code ..
g = plots.getSubplotPlotter()
g.triangle_plot([samples, samples2], filled=True)
plt.show()
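The question also asks about saving. Writing a figure to a file does not need an interactive backend, so it works even while Agg is active. The sketch below is independent of getdist: it uses a plain Matplotlib figure, and the output file name is a hypothetical choice.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # mimics the non-interactive backend that getdist selects
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])

# Saving works under Agg; only plt.show() needs an interactive backend.
out_path = os.path.join(tempfile.gettempdir(), "example_plot.png")  # hypothetical file name
fig.savefig(out_path)

print("active backend:", matplotlib.get_backend())
print("saved to:", out_path)
```

Printing matplotlib.get_backend() after all imports is also a quick way to check which backend is actually in effect.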

Matplotlib plt.show() doesn't display anything

I want to create a bar chart for my dataframe but it doesn't show up, so I made this small script to try out some things, and it does display the bar chart the way I want. The dataframe is structured exactly the same way (I assume) as in my big script where all my data is transformed.
Even if I copy and paste this code into my other script, it doesn't show the plot:
import matplotlib.pyplot as plt
import pandas as pd
df = pd.DataFrame({
'soortfout':['totaalnoodstoppen','aantaltrapopen','aantaltrapdicht','aantalrectdicht','aantalphotocellopen','aantalphotocelldicht','aantalsafetyedgeopen', 'aantalsafetyedgeclose'],
'aantalfouten':[19,9,0,0,10,0,0,0],
})
print(df)
df.plot(kind='bar',x='soortfout',y='aantalfouten')
plt.show()
I can't really paste my other code here since it's pretty big. But is it possible that other code that doesn't even use anything from matplotlib interferes with plotting a chart?
I've tried most other solutions, like:
matplotlib.rcParams['backend'] = "Qt4Agg"
Currently using Pycharm 2.5
It does work when I use Jupyter Notebook.
I was importing modules that I wasn't using, so they were grayed out.
But apparently you shouldn't import pandas_profiling if you want to plot with matplotlib.
Don't import modules that can interfere with plotting, like pandas_profiling.
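One way to find out whether an import interferes is to compare the active Matplotlib backend before and after importing it. The helper below is a hypothetical diagnostic, not something from the original answer, and it is demonstrated with the harmless standard-library module json; substitute the suspect module name to test it.

```python
import importlib

import matplotlib


def backend_after_import(module_name):
    """Import module_name and return the Matplotlib backend before and after."""
    before = matplotlib.get_backend()
    importlib.import_module(module_name)
    after = matplotlib.get_backend()
    return before, after


# "json" is only a stand-in; replace it with the module you suspect.
before, after = backend_after_import("json")
if before.lower() != after.lower():
    print("backend changed from", before, "to", after)
else:
    print("backend unchanged:", after)
```

If the backend ends up as a non-interactive one such as Agg, plt.show() will not open a window.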

plotly: huge number of datapoints

I am trying to plot something with a huge number of data points (2-3 million) using plotly.
When I run
py.iplot(fig, filename='test plot')
I get the following error:
Woah there! Look at all those points! Due to browser limitations, the Plotly SVG drawing functions have a hard time graphing more than 500k data points for line charts, or 40k points for other types of charts. Here are some suggestions:
(1) Use the `plotly.graph_objs.Scattergl` trace object to generate a WebGl graph.
(2) Trying using the image API to return an image instead of a graph URL
(3) Use matplotlib
(4) See if you can create your visualization with fewer data points
If the visualization you're using aggregates points (e.g., box plot, histogram, etc.) you can disregard this warning.
So then I try to save it with this:
py.image.save_as(fig, 'my_plot.png')
But then I get this error:
PlotlyRequestError: Unknown Image Server Error
How do I do this properly? I don't care if it's a still image or an interactive display within my notebook.
Plotly really seems to handle this badly. I am just trying to create a box plot with 5 million points, which is no problem with the simple R function "boxplot", but Plotly computes endlessly for this.
Improving this should be a major priority. Not all of the data has to be saved (and shown) in the Plotly object. This is the main problem, I guess.
One option would be down-sampling your data, though I'm not sure if you'd like that:
https://github.com/devoxi/lttb-py
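As a rough illustration of down-sampling, the sketch below keeps every k-th point so that the result stays under a chosen limit. This is plain stride decimation, not the LTTB algorithm implemented by the linked package, and the function name and the 40,000-point limit (taken from the error message above) are assumptions for the example.

```python
import numpy as np


def decimate(x, y, max_points=40_000):
    """Keep every k-th sample so that at most max_points points remain."""
    x = np.asarray(x)
    y = np.asarray(y)
    step = max(1, int(np.ceil(len(x) / max_points)))
    return x[::step], y[::step]


x = np.arange(2_000_000)
y = np.random.randn(x.size).cumsum()

xs, ys = decimate(x, y)
print(len(x), "points reduced to", len(xs))
```

Stride decimation is cheap but can drop narrow peaks; a shape-preserving method such as LTTB is preferable when those matter.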
I also have problems with Plotly in the browser with large datasets. If anyone has solutions, please write!
Thank you!
You can try the render_mode argument. Example:
import plotly.express as px
import pandas as pd
import numpy as np
N = int(1e6) # Number of points
df = pd.DataFrame(dict(x=np.random.randn(N),
y=np.random.randn(N)))
fig = px.scatter(df, x="x", y="y", render_mode='webgl')
fig.update_traces(marker_line=dict(width=1, color='DarkSlateGray'))
fig.show()
On my computer, N=1e6 takes about 5 seconds until the plot is visible, and the interactivity is still very good. With N=10e6 it takes about 1 minute and the plot is no longer responsive (i.e. it is really slow to zoom, pan, or do anything else).

Matplotlib - Tcl_AsyncDelete: async handler deleted by the wrong thread?

I'm asking this question because I can't solve a problem in Python/Django (in pure Python it's actually OK) which leads to RuntimeError: tcl_asyncdelete async handler deleted by the wrong thread. This is somehow related to how I render matplotlib plots in Django. The way I do it is:
...
import matplotlib.pyplot as plt
...
fig = plt.figure()
...
plt.close()
I have minimized my code as far as possible. But the catch is that even if I have just one line of code:
fig = plt.figure()
I see this RuntimeError happening. I hope I can solve the problem once I know the correct way of closing/cleaning/destroying plots in Python/Django.
By default, matplotlib uses the Tk GUI toolkit. When you render an image without using the toolkit (i.e. into a file or a string), matplotlib still instantiates a window that doesn't get displayed, causing all kinds of problems. To avoid that, you should use the Agg backend. It can be activated like so:
import matplotlib
matplotlib.use('Agg')
from matplotlib import pyplot
For more information, please refer to the matplotlib documentation: http://matplotlib.org/faq/howto_faq.html#matplotlib-in-a-web-application-server
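For a web application, another option is to skip pyplot entirely and build the figure with Matplotlib's object-oriented API, so no GUI toolkit is involved at all. The sketch below shows that approach under some assumptions: the function name render_png is hypothetical, and wiring the returned bytes into a Django response is left out.

```python
import io

from matplotlib.backends.backend_agg import FigureCanvasAgg
from matplotlib.figure import Figure


def render_png(xs, ys):
    """Render a line plot to PNG bytes without touching pyplot or Tk."""
    fig = Figure(figsize=(4, 3))
    FigureCanvasAgg(fig)  # attach a non-GUI Agg canvas to the figure
    ax = fig.add_subplot(111)
    ax.plot(xs, ys)
    buf = io.BytesIO()
    fig.savefig(buf, format="png")
    return buf.getvalue()


png_bytes = render_png([0, 1, 2], [0, 1, 4])
print(len(png_bytes), "bytes of PNG data")
```

Because the figure is never registered with pyplot, there is nothing to close afterwards; it is garbage-collected when the function returns.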
The above (accepted) answer is a solution for a terminal environment. If you debug in an IDE, you still might want to use 'TkAgg' for displaying data. To prevent this issue, apply these two simple rules:
every time you display your data, create a new fig = plt.figure()
don't close old figures manually (e.g. when using debug mode)
Example code:
import matplotlib
matplotlib.use('TkAgg')
from matplotlib import pyplot as plt
fig = plt.figure()
plt.plot(data[:,:,:3])
plt.show()
This proved to be a good intermediate solution under macOS and the PyCharm IDE.
If you don't need to show plots while debugging, the following works:
import matplotlib
matplotlib.use('Agg')
from matplotlib import pyplot as plt
However, if you would like to plot while debugging, you need to do 3 steps:
1. Keep the backend set to 'TkAgg', as follows:
import matplotlib
matplotlib.use('TkAgg')
from matplotlib import pyplot as plt
or simply
import matplotlib.pyplot as plt
2. As Fábio also mentioned, you need to add figN = plt.figure(N) for each figure number N. For example, for plot number 1, add:
fig1 = plt.figure(1)
plt.plot(yourX,yourY)
plt.show()
3. Add breakpoints. You need at least two breakpoints: one somewhere at the beginning of your code (before the first plot), and the other at the point by which you would like all plots (those before the second breakpoint) to be drawn. All figures are plotted and you don't even need to close any figure manually.
For me, this happened due to parallel access to data by both Matplotlib and TensorBoard, after TensorBoard's server had been running for a week straight.
Restarting TensorBoard with tensorboard --logdir . --samples_per_plugin images=100 solved this for me.
I encountered this problem when plotting graphs live with matplotlib in my tkinter application.
The easiest solution I found was to always delete subplots. I found that you don't need to instantiate a new figure; you only need to delete the old subplot (using del subplot) and then recreate it.
Before plotting a new graph, make sure to delete the old subplot.
Example:
f = Figure(figsize=(5,5), dpi=100)
a = f.add_subplot(111)
(For Loop code that updates graph every 5 seconds):
del a #delete subplot
a = f.add_subplot(111) #redefine subplot
Finding this simple solution to fix this "async handler bug" was excruciatingly painful; I hope this helps someone else :)
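Note that del a only removes the Python name; the figure still keeps the old axes in f.axes. A variation on the same idea is to remove the old subplot from the figure explicitly with Figure.delaxes before adding the new one. The sketch below swaps in that call and uses an Agg canvas and a three-step loop as stand-ins for the Tk canvas and the periodic update, so those parts are assumptions.

```python
from matplotlib.backends.backend_agg import FigureCanvasAgg
from matplotlib.figure import Figure

f = Figure(figsize=(5, 5), dpi=100)
FigureCanvasAgg(f)  # stand-in for the Tk canvas used in a tkinter application
a = f.add_subplot(111)

for step in range(3):  # stand-in for the loop that updates the graph every 5 seconds
    f.delaxes(a)            # detach the old subplot from the figure
    a = f.add_subplot(111)  # create a fresh subplot in the same position
    a.plot([0, 1, 2], [step, step + 1, step + 2])

print("axes in figure:", len(f.axes))
```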
