Setting custom tooltip on 3d surface plot in plotly python - python

I’ve been trying for a while to set custom tooltips on a 3d surface plot, but cannot figure it out. I can do something very simple, like make the tooltip the same for each point, but I’m having trouble putting different values for each point in the tooltip, when the fields aren't being graphed.
In my example, I have a dataset of 53 rows (weeks) and 7 columns (days of the week) that I’m graphing on a 3d surface plot, by passing the dataframe in the Z parameter. It’s a year’s worth of data, so each day has its own numeric value that’s being graphed. I’m trying to label each point with the actual date (hence the custom tooltip, since I'm not passing the date itself to the graph), but cannot seem to align the tooltip values correctly.
I tried a simple example to create a "tooltip array" of the same shape as the dataframe, but when I test whether I’m getting the shape right, by using a repeated word, I get an even weirder error where it uses the character values in the word as tooltips (e.g., c or _). Does anyone have any thoughts or suggestions? I can post more code, but tried to replicate my error with a simpler example.
labels=np.array([['test_label']*7]*53)
fig = go.Figure(data=[
go.Surface(z=Z, text=labels, hoverinfo='text'
)],)
fig.show()

I created sample data similar to the data shown in your image: a data frame with randomly generated values for each date of one full year, with the week number and day-of-week number added, pivoted into the Z data. I also added a date-only column and pivoted it the same way, so the hover text displays the date for each point.
import numpy as np
import plotly.graph_objects as go
import pandas as pd
df = pd.DataFrame({'date':pd.to_datetime(pd.date_range('2021-01-01','2021-12-31',freq='1d')),'value':np.random.rand(365)})
df['day_of_week'] = df['date'].dt.weekday
df['week'] = df['date'].dt.isocalendar().week
df['date2'] = df['date'].dt.date
Z = df[['week','day_of_week','value']].pivot(index='week', columns='day_of_week')
labels = df[['week','day_of_week','date2']].pivot(index='week', columns='day_of_week').fillna('')
fig = go.Figure(data=[
go.Surface(z=Z,
text=labels,
hoverinfo='text'
)]
)
fig.update_layout(autosize=False, width=800, height=600)
fig.show()
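To check that the labels line up with the grid, here is a minimal sketch that builds the label array by hand using only the standard library. It assumes rows are counted from the week containing 1 January (not ISO week numbers, so the layout differs from the pivot above) and that columns run Monday=0 to Sunday=6. `build_date_label_grid` is a hypothetical helper name, not part of plotly.

```python
from datetime import date, timedelta

def build_date_label_grid(year):
    """Return one row per week, each with 7 labels (Monday..Sunday).

    Cells that fall outside the year are left as empty strings.
    """
    first = date(year, 1, 1)
    n_days = (date(year + 1, 1, 1) - first).days
    offset = first.weekday()              # column of 1 January, Monday=0
    n_rows = (offset + n_days + 6) // 7   # ceiling division
    grid = [[''] * 7 for _ in range(n_rows)]
    for i in range(n_days):
        row, col = divmod(offset + i, 7)
        grid[row][col] = (first + timedelta(days=i)).isoformat()
    return grid

labels = build_date_label_grid(2021)
print(len(labels), len(labels[0]))
```

The resulting list of lists should be usable as text= in go.Surface, provided Z is laid out with the same rows and columns.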

Related

Datetime plotting

Python beginner here :/!
The csv files can be found here (https://www.waterdatafortexas.org/groundwater/well/8739308)
I'm trying to subset my data and plot it by year or by six-month period, but I just can't make it work. This is my code so far:
import pandas as pd
import matplotlib.pyplot as plt
data=pd.read_csv('Water well.csv')
data["datetime"]=pd.to_datetime(data["datetime"])
data["datetime"]
fig, ax = plt.subplots()
ax.plot(data["datetime"], data["water_level(ft below land surface)"])
ax.set_xticklabels(data["datetime"], rotation= 90)
This is my data (water levels from 2016 to 2021) and the output of the code. As you can see, it only plots 2021.
When you run your script, you get the following warning:
UserWarning: FixedFormatter should only be used together with FixedLocator
ax.set_xticklabels(data["datetime"], rotation= 90)
Your example demonstrates why this warning was added.
Comment out your line
#ax.set_xticklabels(data["datetime"], rotation= 90)
and you have the following (correct) output:
Your code now takes the nine automatically generated x-axis ticks, removes their correct labels, and relabels them with the first nine entries of the dataframe. These labels are obviously wrong, which is why matplotlib issues the warning: either let matplotlib do the labeling automatically, or set both the positions and the labels yourself with FixedLocator and FixedFormatter so that they match.
For more information on Tick locators and formatters consult the matplotlib documentation.
P.S.: You also have to invert the y-axis because the data are in ft below land surface.
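To illustrate keeping tick positions and labels matched, here is a rough sketch that derives both from the same timestamps, so a label can never end up on the wrong tick. `yearly_ticks` is a hypothetical helper, and the sample timestamps are made up to stand in for the real CSV.

```python
from datetime import datetime, timedelta

def yearly_ticks(timestamps):
    """Return (positions, labels) built from the first timestamp of each year."""
    positions, labels = [], []
    seen_years = set()
    for ts in timestamps:
        if ts.year not in seen_years:
            seen_years.add(ts.year)
            positions.append(ts)
            labels.append(ts.strftime('%Y-%m-%d'))
    return positions, labels

# Made-up stand-in data: one reading every 30 days starting mid-2016.
samples = [datetime(2016, 6, 1) + timedelta(days=30 * i) for i in range(60)]
positions, labels = yearly_ticks(samples)
print(labels)
```

With matplotlib you would then call ax.set_xticks(positions) followed by ax.set_xticklabels(labels, rotation=90), which fixes the locations and the labels together.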
The problem is that you have too much data, so you need to simplify it.
As a first step, you can try something like this:
data["datetime"]=pd.to_datetime(data["datetime"])
date = data["datetime"][0::1000][0:10]
temp = data["water_level(ft below land surface)"][0::1000][0:10]
fig, ax = plt.subplots()
ax.plot(date, temp)
ax.set_xticklabels(date, rotation= 90)
date = data["datetime"][0::1000][0:10]
This line means: take index 0, then 1000, then 2000, and so on.
That gives you a new array, and from this new array you take just the first 10 entries.
It's a dirty solution.
The best solution, in my opinion, is to create a new dataset with the average water level for each day or each week, and then plot that.
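A minimal sketch of the averaging idea, using only the standard library: group the readings by calendar day and take the mean of each group. `daily_mean` is a hypothetical helper and the readings below are made up. With pandas, setting the datetime column as the index and calling resample('D').mean() should give the same kind of result.

```python
from collections import defaultdict
from datetime import datetime

def daily_mean(readings):
    """Average (timestamp, value) pairs per calendar day, sorted by day."""
    buckets = defaultdict(list)
    for ts, value in readings:
        buckets[ts.date()].append(value)
    return {day: sum(vals) / len(vals) for day, vals in sorted(buckets.items())}

# Made-up water levels (ft below land surface), several readings per day.
readings = [
    (datetime(2021, 1, 1, 0, 0), 10.0),
    (datetime(2021, 1, 1, 12, 0), 12.0),
    (datetime(2021, 1, 2, 0, 0), 11.0),
]
averaged = daily_mean(readings)
print(averaged)
```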

How to align bars with tick labels in plt or pandas histogram (when plotting multiple columns)

I have started using python for lots of data problems at work and the datasets are always slightly different. I'm trying to explore more efficient ways of plotting data using the inbuilt pandas function rather than individually writing out the code for each column and editing the formatting to get a nice result.
Background: I'm using Jupyter notebook and looking at histograms where the values are all unique integers.
Problem: I want the xtick labels to align with the centers of the histogram bars when plotting multiple columns of data with the one function e.g. df.hist() to get histograms of all columns at once.
Does anyone know if this is possible?
Or is it recommended to do each graph on its own vs. using the inbuilt function applied to all columns?
I can modify them individually following this post: Matplotlib xticks not lining up with histogram
which gives me what I would like but only for one graph and with some manual processing of the values.
Desired outcome example for one graph:
Basic example of data I have:
# Import libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# create list of datapoints
data = [[170,30,210],
[170,50,200],
[180,50,210],
[165,35,180],
[170,30,190],
[170,70,190],
[170,50,190]]
# Create the pandas DataFrame
df = pd.DataFrame(data, columns = ['height', 'width','weight'])
# print dataframe.
df
Code that displays the graphs in the problem statement
df.hist(figsize=(5,5))
plt.show()
Code that displays the graph for weight how I would like it to be for all
df.hist(column='weight',bins=[175,185,195,205,215])
plt.xticks([180,190,200,210])
plt.yticks([0,1,2,3,4,5])
plt.xlim([170, 220])
plt.show()
Any tips or help would be much appreciated!
Thanks
I hope this helps. Take the column and count the frequency of each value (value_counts), then call sort_index so the result is ordered by the value rather than by the frequency, and finally draw it as a bar plot.
data = [[170,30,210],
[170,50,200],
[180,50,210],
[165,35,180],
[170,30,190],
[170,70,190],
[170,50,190]]
# Create the pandas DataFrame
df = pd.DataFrame(data, columns = ['height', 'width','weight'])
df.weight.value_counts().sort_index().plot(kind = 'bar')
plt.show()
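To see what value_counts().sort_index() computes, and to do it for every column at once, here is a small standard-library sketch. `sorted_counts` is a hypothetical helper name. Each table lists (value, frequency) pairs ordered by value, which is what the bar plot draws one bar per entry for.

```python
from collections import Counter

data = [[170, 30, 210],
        [170, 50, 200],
        [180, 50, 210],
        [165, 35, 180],
        [170, 30, 190],
        [170, 70, 190],
        [170, 50, 190]]
columns = ['height', 'width', 'weight']

def sorted_counts(values):
    """Return (value, frequency) pairs ordered by value."""
    return sorted(Counter(values).items())

# One frequency table per column.
tables = {name: sorted_counts([row[i] for row in data])
          for i, name in enumerate(columns)}
for name, table in tables.items():
    print(name, table)
```

Because each bar corresponds to exactly one value, the tick labels sit under the bar centers without any manual bin edges.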

Plotting Bar Graph by Years in Matplotlib

I am trying to plot this DataFrame which records various amounts of money over a yearly series:
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.dates import date2num
jp = pd.DataFrame([1000,2000,2500,3000,3250,3750,4500], index=['2011','2012','2013','2014','2015','2016','2017'])
jp.index = pd.to_datetime(jp.index, format='%Y')
jp.columns = ['Money']
I would simply like to make a bar graph out of this using PyPlot (i.e pyplot.bar).
I tried:
plt.figure(figsize=(15,5))
xvals = date2num(jp.index.date)
yvals = jp['Money']
plt.bar(xvals, yvals, color='black')
ax = plt.gca()
ax.xaxis_date()
plt.show()
But the chart turns out like this:
Only by increasing the width substantially will I start seeing the bars. I have a feeling that this graph is attributing the data to the first date of the year (2011-01-01 for example), hence the massive space between each 'bar' and the thinness of the bars.
How can I plot this properly, knowing that this is a yearly series? Ideally the x-axis would contain only the years. Something tells me that I do not need to use date2num(), since this seems like a very common, ordinary plotting exercise.
My guess as to where I'm stuck is not handling the year correctly. As of now I have them as DateTimeIndex, but maybe there are other steps I need to take.
This has puzzled me for 2 days. All the solutions I found online seem to use DataFrame.plot, but I would rather learn how to use PyPlot properly. I also intend to add two more sets of bars, and it seems like the most common way to do that is through plt.bar().
Thanks everyone.
You can either do
jp.plot.bar()
or plot against the actual years:
plt.bar(jp.index.year, jp.Money)
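As a rough check on why the original bars looked so thin: plt.bar uses a default width of 0.8 in data units, and on a matplotlib date axis one unit is one day. The sketch below only does the arithmetic with the standard library, comparing that width with the spacing between the yearly dates.

```python
from datetime import date

years = range(2011, 2018)
starts = [date(y, 1, 1) for y in years]

# Spacing between consecutive bars, in days (the unit of a date axis).
gaps = [(b - a).days for a, b in zip(starts, starts[1:])]

default_width_days = 0.8  # plt.bar's default width, here meaning 0.8 days
fraction = default_width_days / min(gaps)
print(gaps, f"{fraction:.4f}")
```

Each bar covers well under one percent of the gap to its neighbour, which matches the question's observation that the bars only appear once the width is increased substantially. Plotting against jp.index.year puts the bars one unit apart, so the default width fills most of each slot.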

Python horizontal bar plotly not showing the whole range of timestamp data [duplicate]

This question already has an answer here:
How to show timestamp x-axis in Python Plotly
(1 answer)
Closed 3 years ago.
I want to plot the data availability using Plotly. I got the code from @vestland. My monthly data is here.
In general, the data spans from January 2009 to January 2019. Each variable comes with its own time period.
Below is the code.
import pandas as pd
import plotly.express as px
path = r'C:\Users\....\availability3.txt'
df = pd.read_csv(path)
df = df.drop(['Unnamed: 0'], axis=1)
fig = px.bar(df, x="Timestamp", y="variable", color='value', orientation='h',
hover_data=["Timestamp"],
height=300,
color_continuous_scale=['firebrick', '#2ca02c'],
title='Data Availabiltiy Plot',
template='plotly_white',
)
fig.update_layout(yaxis=dict(title=''),
xaxis=dict(
title='',
showgrid=True,
gridcolor='white',
tickvals=[]
)
)
fig.show()
As you can see below, the plot shows only the first row of the data which is the first day.
What I want is to show the whole range of the data on the x-axis with the corresponding values and colors. The result should show data from January 2009 to January 2019, with values of 0 shown in red and values of 1 in green.
Perhaps this is an issue with timestamp, because when using the number index, the plot is just okay.
Edit
By removing duplicates from the dataset and setting the timestamp as the index, I got almost the expected result. This is the new code.
fig = px.bar(df, y="variable", color='value', orientation='h',
hover_data=[df.index],
height=300,
color_continuous_scale=['firebrick', '#2ca02c'],
title='Data Availabiltiy Plot',
template='plotly_white',
)
Now the whole time span is shown as expected, but the timestamp values on the x-axis are still not showing. I will ask about that in another post.
I checked the documentation for plotly.express.bar and briefly worked with your code. Your data may be stacked one on top of each other.
Setting orientation='v' shows all of the data, but not in any particularly intuitive way, although I believe it does answer the question you asked. Yes, the data for Alice, Thalia, Citra, and Pebaru are all present, but the y-axis needs modification to get the proper labels:
Alternatively, setting orientation='h' and barmode='overlay' shows all of the data when you hover, but not as individual bars. You can see the overlay blur on the right edge of the bars:
There are quite a few arguments for plotly.express.bar in the documentation: https://plot.ly/python-api-reference/generated/plotly.express.bar.html#plotly.express.bar. Experiment around and see what you can come up with.
EDIT:
1) Set the x-axis independently using the Timeframe column.
2) Use .groupby() with an averaging function on value.

Pandas - formatting tick labels

I have a pandas dataframe with dates in column 0 and times in column 1. I wish to plot the data in columns 2, 3, 4, ..., n as a function of the date and time. How do I format the tick labels in the code below so that both the date and the time are displayed in the plot? Thanks in advance. I'm new to Stack Overflow (and to Python, for that matter), so sorry, but I don't have enough reputation to attach the image that I get from my code below.
import pandas as pd
import matplotlib.pyplot as plt
df3 = pd.read_table('filename.txt',
                    sep=',',
                    skiprows=4,
                    na_values=r'N\A',
                    index_col=[0, 1])  # date and time are my indices
# .loc replaces the removed .ix indexer
datedf = df3.loc[['01:07:2013'], ['AOT_1640', 'AOT_870']]
fig, axes = plt.subplots(nrows=2, ncols=1)
# one subplot per selected column
for i, c in enumerate(datedf.columns):
    print(i, c)
    datedf[c].plot(ax=axes[i], figsize=(12, 10), title=c)
plt.savefig('testing123.png', bbox_inches='tight')
You could combine columns 0 and 1 into a single date & time column, set that to your index and then the pandas .plot attribute will automatically use the index as the x-tick labels. Hard to say how it will work with your data set as I can't see it but the main point is that Pandas uses the index for the x-tick labels unless you tell it not to. Be warned that this doesn't work well with hierarchical indexing (at least in my very limited experience).
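A minimal sketch of the combining step, using only the standard library. The date format is assumed to be day:month:year (guessed from '01:07:2013' in the question) and the time format hour:minute:second. Both are assumptions, as are the sample rows and the helper name `combine`.

```python
from datetime import datetime

def combine(date_str, time_str, fmt='%d:%m:%Y %H:%M:%S'):
    """Parse separate date and time strings into one datetime.

    The default format is an assumption about the file's layout.
    """
    return datetime.strptime(f'{date_str} {time_str}', fmt)

# Made-up rows: (date, time, AOT value)
rows = [('01:07:2013', '06:15:00', 0.12),
        ('01:07:2013', '06:30:00', 0.15)]

# A single datetime key per row, ready to serve as the index.
indexed = {combine(d, t): value for d, t, value in rows}
for stamp, value in indexed.items():
    print(stamp.isoformat(), value)
```

In pandas, the same idea would be to concatenate the two string columns and pass the result to pd.to_datetime with a matching format string, then set that as the index before plotting.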
