Plotting time on the x-axis with Python's matplotlib - python

I am reading in data from a text file which contains data in the format (date time; microVolts):
e.g. 07.03.2017 23:14:01,000; 279
And I wish to plot a graph using matplotlib by capturing only the time (x-axis) and plotting it against microVolts (y-axis). So far, I've managed to extract the time element from the string and convert it into datetime format (shown below).
I tried to append each value of time into x to plot, but the program just freezes and displays nothing.
Here is part of the code:
from datetime import datetime
import matplotlib.pyplot as plt
ecg = open(file2).readlines()
x = []
for line in range(len(ecg)):
    ecgtime = ecg[7:][line][:23]
    ecgtime = datetime.strptime(ecgtime, '%d.%m.%Y %H:%M:%S,%f')
    x.append(ecgtime.time())
I'm aware the datetime format is causing the issue but I can't figure out how to convert it into float/int as it says:
'invalid literal for float(): 23:14:01,000'

I don't have enough reputation to comment, so I have to post this as an answer.
datetime.datetime.time() returns a datetime.time object, but you need a float.
Could you try datetime.datetime.timestamp() instead?
See the last line:
from datetime import datetime
import matplotlib.pyplot as plt
ecg = open(file2).readlines()
x = []
for line in range(len(ecg)):
    ecgtime = ecg[7:][line][:23]
    ecgtime = datetime.strptime(ecgtime, '%d.%m.%Y %H:%M:%S,%f')
    x.append(ecgtime.timestamp())
EDIT: timestamp() is available since Python 3.3. For Python 2 you can use
from time import mktime
...
x.append(mktime(ecgtime.timetuple()))
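As a side note, matplotlib can usually plot datetime.datetime objects directly, so converting to a float may not be necessary at all. The following is only a minimal sketch of parsing the line format from the question; the helper name parse_ecg_line and the second sample line are made up for illustration. It also shows elapsed seconds since the first sample as one possible float axis.

```python
from datetime import datetime


def parse_ecg_line(line):
    """Split 'dd.mm.YYYY HH:MM:SS,fff; value' into (datetime, float).

    parse_ecg_line is a hypothetical helper, not part of the original code.
    """
    stamp, value = line.split(";")
    when = datetime.strptime(stamp.strip(), "%d.%m.%Y %H:%M:%S,%f")
    return when, float(value)


# The first line is the example from the question; the second is invented.
sample = [
    "07.03.2017 23:14:01,000; 279",
    "07.03.2017 23:14:01,500; 281",
]

times, volts = zip(*(parse_ecg_line(line) for line in sample))

# Option 1: pass `times` (datetime objects) straight to plt.plot(times, volts).
# Option 2: use seconds elapsed since the first sample as a plain float axis.
elapsed = [(t - times[0]).total_seconds() for t in times]
print(elapsed, volts)
```

Using full datetime objects (not datetime.time) keeps the date information, which matters if a recording runs past midnight.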

Related

How to plot scatterplot using matplotlib from arrays (using strings)? Python

I have been trying to plot a 3D scatterplot from a pandas array (I have tried to convert the data over to numpy arrays and strings to put into the system). However, the error ValueError: s must be a scalar, or float array-like with the same size as x and y keeps popping up. My data for Patient ID is in the format of EMR-001, EMR-002 etc after blanking it out. My data for Discharge Date is converted to become a string of numbers like 20200120. My data for Diagnosis Code is a mix of characters like 001 or 10B.
I have also looked at other examples online but have not been able to identify the problem. Could you point out anything I have missed or suggest code I could use?
I'm using Python 3.9, UTF-8. Thank you in advance!
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
#importing csv data via pd
A = pd.read_csv('input.csv') #import file for current master list
Diagnosis_Des = A["Diagnosis Code"]
Discharge_Date = A["Discharge Date"]
Patient_ID = A["Patient ID"]
B = Diagnosis_Des.to_numpy()
#B1 = np.array2string(B)
#print(B.shape)
C = Discharge_Date.to_numpy() #need to change to data format
#C1 = np.array2string(C)
#print(C1)
D = Patient_ID.to_numpy()
#D1 = np.array2string(D)
#print(D.shape)
from matplotlib import pyplot
from mpl_toolkits.mplot3d import Axes3D
sequence_containing_x_vals = D
sequence_containing_y_vals = B
print(type(sequence_containing_y_vals))
sequence_containing_z_vals = C
print(type(sequence_containing_z_vals))
plt.scatter(sequence_containing_x_vals, sequence_containing_y_vals, sequence_containing_z_vals)
pyplot.show()
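One likely cause of the error: the third positional argument of plt.scatter is s (the marker size), not a z coordinate, so the date array ends up being interpreted as sizes. A 3D scatter needs a 3D axes. The sketch below is only one possible approach, under the assumption that the string columns are encoded as integer codes and the dates are converted to Matplotlib date numbers; the sample rows are made up and stand in for input.csv.

```python
from datetime import datetime

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.dates import date2num

# Made-up rows standing in for the columns read from input.csv.
patient_id = np.array(["EMR-001", "EMR-002", "EMR-001"])
diagnosis = np.array(["001", "10B", "001"])
discharge = np.array(["20200120", "20200122", "20200205"])

# Encode each string column as integer codes; the labels are kept for the ticks.
id_labels, id_codes = np.unique(patient_id, return_inverse=True)
dx_labels, dx_codes = np.unique(diagnosis, return_inverse=True)

# Convert YYYYMMDD strings to Matplotlib's float date numbers.
z = date2num([datetime.strptime(str(d), "%Y%m%d") for d in discharge])

fig = plt.figure()
# projection="3d" without an explicit Axes3D import assumes a reasonably
# recent Matplotlib (3.2 or later).
ax = fig.add_subplot(projection="3d")
ax.scatter(id_codes, dx_codes, z)

# Show the original strings instead of the integer codes.
ax.set_xticks(range(len(id_labels)))
ax.set_xticklabels(id_labels)
ax.set_yticks(range(len(dx_labels)))
ax.set_yticklabels(dx_labels)
ax.set_zlabel("Discharge date (Matplotlib date number)")
# plt.show()  # uncomment in an interactive session
```

If pandas reads the Discharge Date column as integers, the str(d) call above still turns each value into a string that strptime can parse.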

Convert WRF UTC time to Local time

I have to make spatial plots from a bunch of WRFout files. Currently, I am using the following lines of code to print the respective time for each spatial plot:
#..Load packages
import glob
import os
import netCDF4
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap,addcyclic,cm,shiftgrid
from wrf import getvar,get_basemap,to_np,latlon_coords
#..Read the files (glob returns a list, so open the first match)
fpath = sorted(glob.glob("/path/wrfout_d01_2017-03-02_00:00:00"))
with netCDF4.Dataset(fpath[0], 'r') as fin:
    #..Read variables
    p = getvar(fin,'pressure')
    times = getvar(fin,'times',meta=False)
#..Make the pressure plot
fig = plt.figure()
mp = get_basemap(p)
lats,lons = latlon_coords(p)
x,y = mp(to_np(lons),to_np(lats))
cntrs = mp.contourf(x,y,p,cmap='jet')
plt.title(str(to_np(times))[0:-10])
plt.show()
The times variable gives time in the format 2017-03-02T00:00:00.000000000.
The line of code plt.title(str(to_np(times))[0:-10]) prints the time as 2017-03-02T00:00:00, which is a UTC time. But, I want it to be printed as 2017-03-01 17:00:00, which is the local time (UTC- 7 hours).
Thanks in advance, any suggestions will be highly appreciated.
You can use pandas to do the conversion, and you can choose the timezone that works for you.
I have only included the relevant part of the snippet.
import pandas as pd
#..Read variables
...
times = getvar(fin,'times',meta=False)
mountainTime = pd.Timestamp(times,tz='US/Mountain')
#..Make the pressure plot
...
plt.title(str(mountainTime)[0:-6])
This might help.
import datetime
dt=datetime.datetime.strptime("2017-03-02T00:00:00", "%Y-%m-%dT%H:%M:%S") #Get your datetime object
dt=dt.replace(tzinfo=datetime.timezone.utc) #Convert it to an aware datetime object in UTC time.
print(dt) #You do not need this line. For show only :P
dt=dt.astimezone() #Convert it to your local timezone
print(dt.strftime("%Y-%m-%d %H:%M:%S"))
Output:
2017-03-02 00:00:00+00:00
2017-03-02 05:30:00
My timezone is UTC+5:30 (India), so that is what is shown here. On your machine it should give your local time.
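If the machine's own timezone is not the one you want in the title, a fixed offset can be given explicitly. This is a minimal sketch using only the standard library; it assumes a constant UTC-7 offset as stated in the question (no daylight saving handling), and the variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone

utc_string = "2017-03-02T00:00:00"        # the WRF time, already trimmed
local_tz = timezone(timedelta(hours=-7))  # assumed fixed offset, UTC-7

# Parse as naive, mark it as UTC, then convert to the fixed local offset.
utc_dt = datetime.strptime(utc_string, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)
local_dt = utc_dt.astimezone(local_tz)

title = local_dt.strftime("%Y-%m-%d %H:%M:%S")
print(title)  # 2017-03-01 17:00:00
```

The resulting string can be passed to plt.title(title). For a region that observes daylight saving time, a named zone (as in the pandas answer) would be the safer choice.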

Error "'NoneType' object has no attribute 'offset'" when analysing GPX data

I am following this tutorial while learning Python (https://towardsdatascience.com/how-tracking-apps-analyse-your-gps-data-a-hands-on-tutorial-in-python-756d4db6715d).
I am at the step where I want to plot 'time' and 'elevation'. But when I do this with:
plt.plot(df['time'], df['ele'])
plt.show()
I get the error "'NoneType' object has no attribute 'offset'". If I plot 'longitude' and 'latitude' everything works fine.
I cannot find a way to solve this problem by myself.
This is "my" code so far:
import gpxpy
import matplotlib.pyplot as plt
import datetime
from geopy import distance
from math import sqrt, floor
import numpy as np
import pandas as pd
import chart_studio.plotly as py
import plotly.graph_objects as go
import haversine
#Import Plugins
gpx_file = open('01_Karlsruhe_Schluchsee.gpx', 'r')
gpx = gpxpy.parse(gpx_file)
data = gpx.tracks[0].segments[0].points
## Start Position
start = data[0]
## End Position
finish = data[-1]
df = pd.DataFrame(columns=['lon', 'lat', 'ele', 'time'])
for point in data:
    df = df.append({'lon': point.longitude, 'lat' : point.latitude,
                    'ele' : point.elevation, 'time' : point.time}, ignore_index=True)
print(df)
plt.plot(df['time'], df['ele'])
plt.show()
Removing the timezone from your 'time' column might do the trick. You can do this with tz_localize. Note that you have to go through the .dt accessor to reach the datetime properties of the column:
df['time'] = df['time'].dt.tz_localize(None)
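If the .dt accessor is not available (for example because the column holds plain Python datetime objects with object dtype), the timezone can also be stripped element by element. This is only a sketch with the standard library: to_naive_utc is a hypothetical helper, and the sample times use datetime.timezone as a stand-in for the tzinfo that gpxpy attaches.

```python
from datetime import datetime, timedelta, timezone


def to_naive_utc(times):
    """Convert aware datetimes to UTC and drop the tzinfo (hypothetical helper)."""
    return [t.astimezone(timezone.utc).replace(tzinfo=None) for t in times]


# Made-up sample points; real values would come from point.time.
aware = [
    datetime(2021, 5, 1, 8, 0, 0, tzinfo=timezone.utc),
    datetime(2021, 5, 1, 10, 0, 30, tzinfo=timezone(timedelta(hours=2))),
]

naive = to_naive_utc(aware)
print(naive)
```

The naive list can then be assigned back, e.g. df['time'] = to_naive_utc(df['time']), before plotting.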
The problem is that the times in gpxpy have a time zone of SimpleTZ('Z'), which I think is their own implementation of the tzinfo abstract base class. That makes it "aware" as opposed to "naive" but apparently doesn't allow getting the offset.
See https://github.com/tkrajina/gpxpy/issues/209.
I fixed it as follows and also got the time zone for the first location in a track. (The track may have been recorded in a different time zone than the local one.)
import datetime
from zoneinfo import ZoneInfo
from timezonefinder import TimezoneFinder
def get_data(gpx):
    '''Currently Only does the first track and first segment'''
    tzf = TimezoneFinder()
    # Use lists for the data not a DataFrame
    lat = []
    lon = []
    ele = []
    time = []
    n_trk = len(gpx.tracks)
    for trk in range(n_trk):
        n_seg = len(gpx.tracks[trk].segments)
        first = True # Flag to get the timezone for this track
        for seg in range(n_seg):
            points = gpx.tracks[trk].segments[seg].points
            for point in points:
                if(first):
                    # Get the time zone from the first point in first segment
                    tz_name = tzf.timezone_at(lng=point.longitude, lat=point.latitude)
                    first = False
                lat.append(point.latitude)
                lon.append(point.longitude)
                ele.append(point.elevation)
                try:
                    new_time = point.time.astimezone(ZoneInfo(tz_name))
                except:
                    new_time = point.time.astimezone(ZoneInfo('UTC'))
                time.append(new_time)
    return lat, lon, ele, time
With these changes the plots in PyPlot work as expected.
ZoneInfo is only available as of Python 3.9, and on Windows you also have to install tzdata (pip install tzdata) for ZoneInfo to work. For earlier versions you could do essentially the same thing using pytz.

Python datetime switching between US and UK date formats

I'm using matplotlib to plot some data imported from CSV files. These files have the following format:
Date,Time,A,B
25/07/2016,13:04:31,5,25550
25/07/2016,13:05:01,0,25568
....
01/08/2016,19:06:43,0,68425
The dates are formatted as they would be in the UK, i.e. %d/%m/%Y. The end result is to have two plots: one of how A changes with time, and one of how B changes with time. I'm importing the data from the CSV like so:
import matplotlib
matplotlib.use('Agg')
from matplotlib.mlab import csv2rec
import matplotlib.pyplot as plt
from datetime import datetime
import sys
...
def analyze_log(file, y):
    data = csv2rec(open(file, 'rb'))
    fig = plt.figure()
    date_vec = [datetime.strptime(str(x), '%Y-%m-%d').date() for x in data['date']]
    print date_vec[0]
    print date_vec[len(date_vec)-1]
    time_vec = [datetime.strptime(str(x), '%Y-%m-%d %X').time() for x in data['time']]
    print time_vec[0]
    print time_vec[len(time_vec)-1]
    datetime_vec = [datetime.combine(d, t) for d, t in zip(date_vec, time_vec)]
    print datetime_vec[0]
    print datetime_vec[len(datetime_vec)-1]
    y_vec = data[y]
    plt.plot(datetime_vec, y_vec)
    ...
    # formatters, axis headers, etc.
    ...
    return plt
And all was working fine before 01 August. However, since then, matplotlib is trying to plot my 01/08/2016 data points as 2016-01-08 (08 Jan)!
I get a plotting error because it tries to plot from January to July:
RuntimeError: RRuleLocator estimated to generate 4879 ticks from 2016-01-08 09:11:00+00:00 to 2016-07-29 16:22:34+00:00:
exceeds Locator.MAXTICKS * 2 (2000)
What am I doing wrong here? The results of the print statements in the code above are:
2016-07-25
2016-01-08 #!!!!
13:04:31
19:06:43
2016-07-25 13:04:31
2016-01-08 19:06:43 #!!!!
Matplotlib's csv2rec function already parses your dates and tries to be intelligent about it. The function has two options that influence the parsing; dayfirst should help here:
dayfirst: default is False so that MM-DD-YY has precedence over DD-MM-YY.
yearfirst: default is False so that MM-DD-YY has precedence over YY-MM-DD.
See http://labix.org/python-dateutil#head-b95ce2094d189a89f80f5ae52a05b4ab7b41af47 for further information.
You're using strings in %d/%m/%Y format but you've given the format specifier as %Y-%m-%d.
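To illustrate the format point: if the Date and Time columns are read as raw strings (an assumption here, i.e. no automatic date parsing on import), an explicit day-first format removes the ambiguity. A minimal sketch using two rows from the question:

```python
from datetime import datetime

# (Date, Time) pairs taken from the CSV sample in the question.
rows = [
    ("25/07/2016", "13:04:31"),
    ("01/08/2016", "19:06:43"),
]

# %d/%m/%Y states explicitly that the day comes first.
stamps = [datetime.strptime(d + " " + t, "%d/%m/%Y %H:%M:%S") for d, t in rows]

for stamp in stamps:
    print(stamp)
# 2016-07-25 13:04:31
# 2016-08-01 19:06:43
```

With the format spelled out, 01/08/2016 is read as 1 August and not as 8 January.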

How to use datetime.time to plot in Python

I have a list of timestamps in the format HH:MM:SS and want to plot them against some values using datetime.time. It seems Python doesn't like the way I do it. Can someone please help?
import datetime
import matplotlib.pyplot as plt
# random data
x = [datetime.time(12,10,10), datetime.time(12, 11, 10)]
y = [1,5]
# plot
plt.plot(x,y)
plt.show()
TypeError: float() argument must be a string or a number
Well, it's a two-step story to get them to plot really nicely.
Step 1: prepare the data in a proper format,
going from a datetime to a matplotlib-compatible float for dates/times.
As usual, the devil is in the details.
matplotlib's date numbers are almost, but not quite, what you might expect:
# mPlotDATEs.date2num.__doc__
#
# *d* is either a class `datetime` instance or a sequence of datetimes.
#
# Return value is a floating point number (or sequence of floats)
# which gives the number of days (fraction part represents hours,
# minutes, seconds) since 0001-01-01 00:00:00 UTC, *plus* *one*.
# The addition of one here is a historical artifact. Also, note
# that the Gregorian calendar is assumed; this is not universal
# practice. For details, see the module docstring.
So, highly recommended to re-use their "own" tool:
from matplotlib import dates as mPlotDATEs  # helper functions num2date() and date2num() to convert to/from
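A minimal sketch of the round trip with these helpers follows. Note that the docstring quoted above comes from an older Matplotlib release; since Matplotlib 3.3 the default epoch is 1970-01-01, so the absolute float value differs between versions and should not be relied on. The date attached to the time below is arbitrary and only there because date2num needs a full datetime.

```python
import datetime

from matplotlib import dates as mdates

t = datetime.time(12, 10, 10)

# date2num works on datetime objects, so attach an arbitrary date first.
dt = datetime.datetime.combine(datetime.date(2000, 1, 1), t)

num = mdates.date2num(dt)    # float: days since Matplotlib's epoch
back = mdates.num2date(num)  # timezone-aware datetime (UTC by default)

print(num, back)
```

Letting Matplotlib do the conversion in both directions avoids depending on which epoch the installed version uses.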
Step 2: manage axis-labels & formatting & scale (min/max) as a next issue
matplotlib brings you arms for this part too.
Check code in this answer for all details
This is still an issue in Python 3.5.3 and Matplotlib 2.1.0.
A workaround is to use datetime.datetime objects instead of datetime.time ones:
import datetime
import matplotlib.pyplot as plt
# random data
x = [datetime.time(12,10,10), datetime.time(12, 11, 10)]
x_dt = [datetime.datetime.combine(datetime.date.today(), t) for t in x]
y = [1,5]
# plot
plt.plot(x_dt, y)
plt.show()
By default the date part should not be visible. Otherwise you can always use DateFormatter:
import matplotlib.dates as mdates
ax = plt.gca() # current axes; do this before plt.show()
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H-%M-%S'))
I came to this page because I have a similar issue. I have a Pandas DataFrame df with a datetime column df.dtm and a data column df.x, spanning several days, but I want to plot them using matplotlib.pyplot as a function of time of day, not date and time (datetime, datetimeindex). I.e., I want all data points to be folded into the same 24h range in the plot. I can plot df.x vs. df.dtm without issue, but I've just spent two hours trying to figure out how to convert df.dtm to df.time (containing the time of day without a date) and then plotting it. The (to me) straightforward solution does not work:
df.dtm = pd.to_datetime(df.dtm)
ax.plot(df.dtm, df.x)
# Works (with times on different dates; a range >24h)
df['time'] = df.dtm.dt.time
ax.plot(df.time, df.x)
# DOES NOT WORK: ConversionError('Failed to convert value(s) to axis '
matplotlib.units.ConversionError: Failed to convert value(s) to axis units:
array([datetime.time(0, 0), datetime.time(0, 5), etc.])
This does work:
pd.plotting.register_matplotlib_converters() # Needed to plot Pandas df with Matplotlib
df.dtm = pd.to_datetime(df.dtm, utc=True) # NOTE: MUST add a timezone, even if undesired
ax.plot(df.dtm, df.x)
# Works as before
df['time'] = df.dtm.dt.time
ax.plot(df.time, df.x)
# WORKS!!! (with time of day, all data in the same 24h range)
Note that the differences are in the first two lines. The first line allows better collaboration between Pandas and Matplotlib, the second seems redundant (or even wrong), but that doesn't matter in my case, since I use a single timezone and it is not plotted.
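An alternative that avoids plotting datetime.time objects altogether is to move every timestamp onto the same dummy date, so that all points share one 24h range while remaining ordinary datetime objects. This is only a sketch with the standard library; fold_to_one_day and DUMMY_DAY are hypothetical names, and the sample timestamps are made up.

```python
from datetime import date, datetime

DUMMY_DAY = date(1970, 1, 1)  # arbitrary; only the time of day matters


def fold_to_one_day(stamps):
    """Keep each timestamp's time of day but replace its date (hypothetical helper)."""
    return [datetime.combine(DUMMY_DAY, s.time()) for s in stamps]


# Made-up samples spanning two days.
stamps = [
    datetime(2020, 3, 1, 0, 5),
    datetime(2020, 3, 2, 0, 0),
    datetime(2020, 3, 2, 23, 55),
]

folded = fold_to_one_day(stamps)
print(folded)
```

The folded values can be plotted like any other datetimes, with a DateFormatter such as '%H:%M' hiding the dummy date on the axis.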
