Huge problem loading Plotly plots in Jupyter Notebook Python - python

I have a huge problem with Jupyter Notebook.
When I build graphs with the Plotly package, I have to reload the dataset before I can display the next graph; otherwise an error pops up: TypeError: list indices must be integers or slices, not str. When I reopen the project, none of the graphs created with Plotly are visible, just a white background, and I have to re-run them if I want to see them, each time after reloading the dataset. That is:
1) I open the project in Jupyter Notebook and all graphs created by Plotly are not visible (others, from Seaborn for example, are visible).
2) I re-run the Plotly charts, but only one at a time, because I first have to load the dataset and then the chart; otherwise every Plotly chart fails with the same error: TypeError: list indices must be integers or slices, not str
What can I do to fix this? It seriously hinders my work.
These are my libraries:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
import plotly.offline as py
import plotly.graph_objs as go
import plotly.tools as tls
This is an example of the error; if I load the dataset again and then re-run this plot, everything is fine:
And here is an example of the plot code:
#Distribution of credit risk in dataset (target variable)
#Size of the plot
figsize=(10,5)
#Sums of good and bad credits in the dataset
goodCount = data[data["Risk"]== 'good']["Risk"].value_counts().values
badCount = data[data["Risk"]== 'bad']["Risk"].value_counts().values
#Bar of good credit
trace0 = go.Bar(x = data[data["Risk"]== 'good']["Risk"].value_counts().index.values,
                y = data[data["Risk"]== 'good']["Risk"].value_counts().values,
                name='Good credit',
                text= goodCount,
                textposition="auto",
                marker = dict(color = "green", line=dict(color="black", width=1),),opacity=1)
#Bar of bad credit
trace1 = go.Bar(x = data[data["Risk"]== 'bad']["Risk"].value_counts().index.values,
                y = data[data["Risk"]== 'bad']["Risk"].value_counts().values,
                name='Bad credit',
                text= badCount,
                textposition="auto",
                marker = dict(color = "red", line=dict(color="black", width=1),),opacity=1)
#Creation of bar plot
data = [trace0, trace1]
layout = go.Layout()
layout = go.Layout(yaxis=dict(title='Count'),
                   xaxis=dict(title='Risk variable'),
                   title='Distribution of target variable in the dataset')
fig = go.Figure(data=data, layout=layout)
fig.show()
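One likely cause of the TypeError, judging only from the code above: the cell rebinds the name data from the DataFrame to the list of traces (data = [trace0, trace1]), so the next cell that evaluates data["Risk"] indexes a list with a string. A minimal sketch of the effect, using a made-up four-row dataset and plain dicts as placeholder traces (both are assumptions for illustration):

```python
import pandas as pd

# Hypothetical stand-in for the credit dataset in the question.
data = pd.DataFrame({"Risk": ["good", "bad", "good", "good"]})
good_count = int((data["Risk"] == "good").sum())  # works: data is a DataFrame

# The plotting cell then reuses the same name for the list of traces.
# Plain dicts are used here as placeholders for the go.Bar objects.
data = [
    {"type": "bar", "name": "Good credit"},
    {"type": "bar", "name": "Bad credit"},
]

try:
    data["Risk"]  # what the next plotting cell effectively does
    message = None
except TypeError as err:
    message = str(err)

print(message)
```

Giving the trace list its own name, for example traces = [trace0, trace1] followed by go.Figure(data=traces, layout=layout), leaves the DataFrame intact, so later cells should no longer need the dataset reloaded.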

Related

How do you put the x axis labels on the top of the heatmap created with seaborn? [duplicate]

I have created a heatmap using the seaborn and matplotlib package in python, and while it is perfectly suited for my current needs, I really would prefer to have the labels on the x-axis of the heatmap to be placed at the top of the plot, rather than at the bottom (which seems to be its default).
So an abridged form of my data looks like this:
NP NP1 NP2 NP3 NP4 NP5
identifier
A1BG~P04217 -0.094045 0.012229 0.102279 1.319618 0.002383
A2M~P01023 -0.805089 -0.477339 -0.351341 0.089735 -0.473815
AARS1~P49588 0.081827 -0.099849 -0.287426 0.101588 0.136366
ABCB6~Q9NP58 0.109911 0.458039 -0.039325 -0.484872 1.905586
ABCC1~I3L4X2 -0.560155 0.580285 0.012868 0.291303 -0.407900
ABCC4~O15439 0.055264 0.138630 -0.204665 0.191241 0.304999
ABCE1~P61221 -0.510108 -0.059724 -0.233365 0.078956 -0.651327
ABCF1~Q8NE71 -0.348526 -0.135414 -0.390021 -0.190644 -0.276303
ABHD10~Q9NUJ1 0.237959 -2.060834 0.325901 -0.778036 -4.046345
ABHD11~Q8NFV4 0.294587 1.193258 -0.797294 -0.148064 -1.153391
And when I use the following code:
import seaborn as sns
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(10,30))
ax = sns.heatmap(df_example, annot=True, xticklabels=True)
I get this kind of plot:
https://imgpile.com/i/T3zPH1
I should note that this plot was made from the abridged dataframe above; the actual dataframe has thousands of identifiers, making it very long.
But as you can see, the labels on the x axis only appear at the bottom. I have been trying to get them to appear on the top, but seaborn doesn't seem to allow this kind of formatting.
So I have also tried using plotly express, but while I solve the issue of placing my x-axis labels on top, I have been completely unable to format the heat map as I had before using seaborn. The following code:
import plotly.express as px
fig = px.imshow(df_example, width= 500, height=6000)
fig.update_xaxes(side="top")
fig.show()
yields this kind of plot: https://imgpile.com/i/T3zF42.
I have tried many times to reformat it using the documentation from plotly (https://plotly.com/python/heatmaps/), but I can't seem to get it to work. When one thing is fixed, another problem arises. I really just want to keep using the seaborn based code as above, and just fix the x-axis labels. I'm also happy to have the x-axis labels at both the top and bottom of the plot, but I can't get that to work at present. Can someone advise me on what to do here?
Ok, so I did a bit more research, and it turns out you can add the following code to the seaborn approach:
plt.tick_params(axis='both', which='major', labelsize=10, labelbottom = False, bottom=False, top = False, labeltop=True)
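Because sns.heatmap returns a regular matplotlib Axes, the tick labels can also be moved with the Axes' own methods. A minimal sketch, assuming a small random DataFrame in place of df_example (the row names and values below are made up):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up data standing in for df_example.
rng = np.random.default_rng(0)
df_example = pd.DataFrame(
    rng.normal(size=(4, 3)),
    columns=["NP1", "NP2", "NP3"],
    index=["id_a", "id_b", "id_c", "id_d"],
)

fig, ax = plt.subplots(figsize=(5, 4))
sns.heatmap(df_example, annot=True, xticklabels=True, ax=ax)

# Move the x tick marks and tick labels to the top edge of the heatmap.
ax.xaxis.tick_top()
```

To show the labels on both edges instead, ax.tick_params(labeltop=True, labelbottom=True) can be used, and ax.xaxis.set_label_position("top") moves the axis title if one is set.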
If your data are stored in a CSV file, you can use this code:
import pandas as pd
import plotly.express as px
df = pd.read_csv("file.csv").round(2)
fig = px.imshow(df.iloc[:,1:],
                y = df['identifier'],
                text_auto=True, aspect="auto")
fig.show()
The data in the CSV file are in the following format:
identifier NP1 NP2 NP3 NP4 NP5
A1BG~P04217 -0.094045 0.012229 0.102279 1.319618 0.002383
A2M~P01023 -0.805089 -0.477339 -0.351341 0.089735 -0.473815
AARS1~P49588 0.081827 -0.099849 -0.287426 0.101588 0.136366
ABCB6~Q9NP58 0.109911 0.458039 -0.039325 -0.484872 1.905586
ABCC1~I3L4X2 -0.560155 0.580285 0.012868 0.291303 -0.407900
ABCC4~O15439 0.055264 0.138630 -0.204665 0.191241 0.304999
ABCE1~P61221 -0.510108 -0.059724 -0.233365 0.078956 -0.651327
ABCF1~Q8NE71 -0.348526 -0.135414 -0.390021 -0.190644 -0.276303
ABHD10~Q9NUJ1 0.237959 -2.060834 0.325901 -0.778036 -4.046345
ABHD11~Q8NFV4 0.294587 1.193258 -0.797294 -0.148064 -1.153391
Now let's display the x-axis at the top of the heatmap by adding:
fig.update_layout(xaxis = dict(side ="top"))
Alternative solution if you have an older version of Plotly:
import plotly.graph_objects as go
fig = go.Figure(data=go.Heatmap(
    x=df.columns[1:],
    y=df.identifier,
    z=df.iloc[:,1:],
    text=df.iloc[:,1:],
    texttemplate="%{text}"))
fig.update_layout(xaxis = dict(side ="top"))
fig.show()

Python Plotly scatter 3D plot colormap customization

I am using plotly and I am getting the plot. The problem is that I am using seasons as the colormap: 1 for fall, 2 for winter, ..., 4 for summer. The colorbar shows these numbers and also 1.5, 2.5, etc. I want to show names instead of numbers.
My code:
import plotly.express as px
from plotly.offline import plot
import plotly
fig = px.scatter_3d(df, x=xlbl, y=ylbl, z=zlbl,
                    color=wlbl,opacity=0,
                    color_continuous_scale = plotly.colors.sequential.Viridis)
temp_name = 'Temp_plot.html'
plot(fig, filename = temp_name, auto_open=False,
     image_width=1200,image_height=800)
plot(fig)
Present output:
You can modify the coloraxis by adding the following lines to your code:
cat_labels = ["Fall", "Winter", "Spring", "Summer"]
fig.update_coloraxes(colorbar=dict(ticktext=cat_labels,
                                   tickvals=list(range(1, len(cat_labels)+1))))
Sample output with random data:
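An alternative sketch, not taken from the answer above: map the numeric codes to season names before plotting, so that the color column holds strings. Plotly Express treats a string column passed to color as discrete categories and draws a legend instead of a continuous colorbar. The column name season_code and the sample values are assumptions for illustration.

```python
import pandas as pd

# Made-up data; season_code stands in for the numeric column behind wlbl.
df = pd.DataFrame({
    "x": [0.1, 0.4, 0.7, 0.9],
    "y": [1.0, 0.5, 0.2, 0.8],
    "z": [3.0, 2.0, 1.0, 4.0],
    "season_code": [1, 2, 3, 4],
})

# Same coding as in the question and answer:
# 1 = Fall, 2 = Winter, 3 = Spring, 4 = Summer.
season_names = {1: "Fall", 2: "Winter", 3: "Spring", 4: "Summer"}
df["season"] = df["season_code"].map(season_names)
```

Passing color="season" to px.scatter_3d in place of the numeric column should then give one legend entry per season.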

Python Plotly Express Scatter Plot

I want to create an interactive scatter plot; so I am using the plotly.graph_objects module.
My data has two columns of about 100 points.
When I make a line plot, I have no problem.
But when I try to make a scatter plot, Jupyter seems to hang (message at the bottom says - Local Host not responding)
It takes a while for Jupyter to respond and I still have no plot.
The code I am using is:
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
fig = go.Figure()
var_list = ['cloxth1 ()','cloxth2 ()']
for item in var_list:
    stripped_item = item.replace(' ()','')
    fig.add_trace(go.Scatter(
        x=np.linspace(0,len(df),len(df)),
        y=df[item],
        mode='markers',
        marker={'size':1},
        name = item
    ))
fig.update_layout(title = 'CLOXTH',
                  xaxis_title = 'data samples',
                  yaxis_title = 'mV')
fig.show()
Is there anything wrong with the way I am using go.Scatter?
I tried using px.scatter instead. It seems to work, as in I get a scatter plot. But in the plotly.express case I am unable to have a proper legend for 'cloxth1' and 'cloxth2'; also, both data sets are plotted with the same color.
How can I get around this?
A few rows from the data:
Sample Data
# read in with
df = pd.read_clipboard(sep=',', index_col=[0])
# copy to clipboard
,time(s),Filename,time_stamp,time_vector(ms),time_vector_zerobased(ms),cloxth1(),cloxth2()
0.0,4DRBUP1N8HB706662_Trip-Detail_2020-07-20,00-04-03.csv.zip,04:03.8,0,0,725.9097285,725.9097285
1.001,4DRBUP1N8HB706662_Trip-Detail_2020-07-20,00-04-03.csv.zip,04:04.8,1001,1001,725.9097285,725.9097285
2.001,4DRBUP1N8HB706662_Trip-Detail_2020-07-20,00-04-03.csv.zip,04:05.8,2001,2001,725.9097285,725.9097285
3.002,4DRBUP1N8HB706662_Trip-Detail_2020-07-20,00-04-03.csv.zip,04:06.8,3002,3002,725.9097285,725.9097285
4.0,4DRBUP1N8HB706662_Trip-Detail_2020-07-20,00-04-03.csv.zip,04:07.8,4000,4000,725.9097285,725.9097285
5.002,4DRBUP1N8HB706662_Trip-Detail_2020-07-20,00-04-03.csv.zip,04:08.8,5002,5002,725.9097285,725.9097285
6.002,4DRBUP1N8HB706662_Trip-Detail_2020-07-20,00-04-03.csv.zip,04:09.8,6002,6002,725.9097285,725.9097285
7.001,4DRBUP1N8HB706662_Trip-Detail_2020-07-20,00-04-03.csv.zip,04:10.8,7001,7001,725.9097285,725.9097285
8.003,4DRBUP1N8HB706662_Trip-Detail_2020-07-20,00-04-03.csv.zip,04:11.8,8003,8003,725.9097285,725.9097285
9.002,4DRBUP1N8HB706662_Trip-Detail_2020-07-20,00-04-03.csv.zip,04:12.8,9002,9002,725.9097285,725.9097285
10.0,4DRBUP1N8HB706662_Trip-Detail_2020-07-20,00-04-03.csv.zip,04:13.8,10000,10000,725.9097285,725.9097285
11.005,4DRBUP1N8HB706662_Trip-Detail_2020-07-20,00-04-03.csv.zip,04:14.8,11005,11005,725.9097285,725.9097285
12.0,4DRBUP1N8HB706662_Trip-Detail_2020-07-20,00-04-03.csv.zip,04:15.8,12000,12000,725.9097285,725.9097285
13.001,4DRBUP1N8HB706662_Trip-Detail_2020-07-20,00-04-03.csv.zip,04:16.8,13001,13001,725.9097285,725.9097285
14.003,4DRBUP1N8HB706662_Trip-Detail_2020-07-20,00-04-03.csv.zip,04:17.8,14003,14003,725.9097285,725.9097285
15.0,4DRBUP1N8HB706662_Trip-Detail_2020-07-20,00-04-03.csv.zip,04:18.8,15000,15000,725.9097285,725.9097285
16.002,4DRBUP1N8HB706662_Trip-Detail_2020-07-20,00-04-03.csv.zip,04:19.8,16002,16002,725.9097285,725.9097285
17.0,4DRBUP1N8HB706662_Trip-Detail_2020-07-20,00-04-03.csv.zip,04:20.8,17000,17000,725.9097285,725.9097285
18.0,4DRBUP1N8HB706662_Trip-Detail_2020-07-20,00-04-03.csv.zip,04:21.8,18000,18000,725.9097285,725.9097285
19.003,4DRBUP1N8HB706662_Trip-Detail_2020-07-20,00-04-03.csv.zip,04:22.8,19003,19003,725.9097285,725.9097285
20.001,4DRBUP1N8HB706662_Trip-Detail_2020-07-20,00-04-03.csv.zip,04:23.8,20001,20001,725.9097285,725.9097285
21.0,4DRBUP1N8HB706662_Trip-Detail_2020-07-20,00-04-03.csv.zip,04:24.8,21000,21000,725.9097285,725.9097285
22.005,4DRBUP1N8HB706662_Trip-Detail_2020-07-20,00-04-03.csv.zip,04:25.8,22005,22005,725.9097285,725.9097285
23.0,4DRBUP1N8HB706662_Trip-Detail_2020-07-20,00-04-03.csv.zip,04:26.8,23000,23000,725.9097285,725.9097285
24.002,4DRBUP1N8HB706662_Trip-Detail_2020-07-20,00-04-03.csv.zip,04:27.8,24002,24002,725.9097285,725.9097285

How can I plot an interactive 3d scatterplot from a pandas dataframe?

Pandas provides built-in plotting functionality for DataFrames with several plotting backend engines (matplotlib, etc.). I'd like to plot an interactive 3D scatterplot directly from a dataframe via df.plot(), but I have only managed to get non-interactive plots. I'm thinking of something like what I get with, e.g., plotly. I'd prefer a solution that does not depend on the IDE setup used for exploratory data analysis (e.g. ipywidgets when using JupyterLab). How can I plot interactive 3D scatter plots via df.plot()?
Plotly is the way to go; use Scatter3d:
import plotly as py
import plotly.graph_objs as go
import numpy as np
import pandas as pd
# data
np.random.seed(1)
df = pd.DataFrame(np.random.rand(20, 3), columns=list('ABC'))
trace = go.Scatter3d(
    x=df['A'],
    y=df['B'],
    z=df['C'],
    mode='markers',
    marker=dict(
        size=5,
        # the original used an undefined name c here; coloring by column C is an assumption
        color=df['C'],
        colorscale='Viridis',
    ),
    name= 'test',
    # list comprehension to add text on hover
    text= [f"A: {a}<br>B: {b}<br>C: {c}" for a,b,c in list(zip(df['A'], df['B'], df['C']))],
    # if you do not want to display x,y,z
    hoverinfo='text'
)
layout = dict(title = 'TEST',)
data = [trace]
fig = dict(data=data, layout=layout)
py.offline.plot(fig, filename = 'Test.html')
To stay close to the df.plot() approach, there is also df.iplot(), provided by Cufflinks on top of Plotly.

How to increase the available space for ylabel?

Good morning!
I'm making some bar plots with Seaborn, but I'm having difficulty getting a proper ylabel for them.
Here is a reproducible example:
import pandas as pd
import os
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib
from pdb import set_trace as bp
name = 'test.pdf'
data = pd.DataFrame({'Labels': ['Label', 'Longer label', 'A really really large label'], 'values': [200, 100, 300]})
sns.set_style("dark")
ax = sns.barplot(y = data['Labels'], x = data['values'], data = data)
ax.set(ylabel = 'Labels', xlabel = 'Values')
plt.savefig(name)
plt.close()
As you can see, the second and third labels ('Longer label' and 'A really really large label') are not shown completely, and I can't solve it.
Furthermore, I would like to know how to remove the short black lines at the top and left of the image.
Thank you very much!
You need to specify bbox_inches='tight' while saving the figure as
plt.savefig(name, bbox_inches='tight')
If you are working with Jupyter notebooks, then plt.tight_layout() would work for inline plots, as commented above by @ALollZ.
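A minimal sketch of the bbox_inches='tight' option, using plain matplotlib and in-memory buffers instead of test.pdf (both choices are assumptions made to keep the example self-contained):

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt

labels = ["Label", "Longer label", "A really really large label"]
values = [200, 100, 300]

fig, ax = plt.subplots()
ax.barh(labels, values)
ax.set(ylabel="Labels", xlabel="Values")

# Default save: the canvas keeps its fixed size, so long labels can be cut off.
default_buf = io.BytesIO()
fig.savefig(default_buf, format="png")

# Tight save: the saved area is fitted to the bounding box of everything drawn,
# including the long tick labels.
tight_buf = io.BytesIO()
fig.savefig(tight_buf, format="png", bbox_inches="tight")
plt.close(fig)
```

Comparing the two images side by side should show the long labels clipped in the first and complete in the second.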
