I am trying to plot a drone's altitude against time (time on the X-axis, altitude on the Y-axis). I converted my list of timestamps into a Matplotlib-readable format using dates = matplotlib.dates.date2num(timestamps). The altitudes list and the converted timestamps list both have exactly 16587 entries, so there is no length mismatch. The graph came out absolutely horrendous, and I would like to know how to make it readable with this much data. My full code is:
timestamps = []
for stamp in times:  # convert list of timestamp strings to Python datetime objects
    stamp = date + " " + stamp
    stamp = stamp.replace('.', ':')  # We want the milliseconds to be behind a colon so it can be easily formatted to DateTime
    stamp = datetime.strptime(stamp, '%Y-%m-%d %H:%M:%S:%f')
    timestamps.append(stamp)
dates = matplotlib.dates.date2num(timestamps)
for alt in altitudes:
    alt = round(float(alt), 2)
plt.plot_date(dates, altitudes)
plt.show()
The graph is indeed unreadable, although it is not clear what you expect it to look like.
When plotting a huge number of points, it usually helps to also set the alpha parameter, which adds transparency and lets you "see through" clouds of overlapping points.
You can then set your xticks and yticks explicitly (possibly with the rotation parameter) so that fewer of them are shown, and add plt.grid(True).
These are just basic suggestions. Try to be more specific about what "make this readable" means.
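The suggestions above could be sketched roughly as follows. This is only an illustration under stated assumptions: the sample date, times and altitude strings are made up, the time strings are assumed to look like HH:MM:SS.fff, and the helper names are hypothetical. One detail worth checking in the question's code: a for loop that only reassigns its loop variable leaves the altitudes list unchanged, so the sketch builds a new list of floats instead.

```python
from datetime import datetime


def parse_stamps(date, times):
    """Combine a date string with 'HH:MM:SS.fff' strings into datetime objects."""
    return [datetime.strptime(f"{date} {t}", "%Y-%m-%d %H:%M:%S.%f") for t in times]


def to_floats(values, ndigits=2):
    """Return a new list of rounded floats; the input list is not modified."""
    return [round(float(v), ndigits) for v in values]


def plot_altitude(timestamps, altitudes):
    # Imported here so the helpers above can be used without matplotlib installed.
    import matplotlib.pyplot as plt
    import matplotlib.dates as mdates

    fig, ax = plt.subplots()
    # Small, semi-transparent markers keep dense regions readable.
    ax.plot(timestamps, altitudes, ".", markersize=2, alpha=0.3)
    # Limit the number of x ticks and show them as clock times.
    ax.xaxis.set_major_locator(mdates.AutoDateLocator(maxticks=10))
    ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M:%S"))
    fig.autofmt_xdate(rotation=45)  # rotate the tick labels
    ax.grid(True)
    ax.set_xlabel("Time")
    ax.set_ylabel("Altitude")
    return fig


# Hypothetical sample values, standing in for the 16587 real points.
date = "2019-05-01"
times = ["12:00:00.100", "12:00:00.350", "12:00:01.000"]
altitudes = to_floats(["10.126", "10.5", "11"])
timestamps = parse_stamps(date, times)
```

Calling plot_altitude(timestamps, altitudes) followed by plt.show() would display the figure. Matplotlib's plot() accepts datetime objects directly, so the date2num step is not needed in this sketch.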
Related
I measured the seeing index and need to plot it as a function of time, but the time I get from the measurement is a string in the format 02-09-2022_time_11-53-51,045. How can I convert it into something Python can read and that I can use in my plot?
Using pandas, I extracted the time and seeing_index columns from the txt file produced by the measurement. Python correctly plotted the seeing index values on the Y axis, but instead of plotting the time values on the X axis, it numbered each row and plotted the index against the row number. What can I do to plot the index against time?
You may try this:
df.time = pd.to_datetime(df.time, format='%d-%m-%Y_time_%H-%M-%S,%f')
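To check that this format string matches the sample value from the question, here is a small sketch using only the standard library's datetime.strptime, which uses the same directives. The variable names are made up for illustration.

```python
from datetime import datetime

# Same format string as in the pd.to_datetime call above.
FMT = "%d-%m-%Y_time_%H-%M-%S,%f"

sample = "02-09-2022_time_11-53-51,045"
parsed = datetime.strptime(sample, FMT)

# %f reads the digits after the comma as a fraction of a second,
# so ",045" becomes 45 milliseconds.
print(parsed.isoformat())
```

Once the column has been converted, something like df.plot(x='time', y='seeing_index') should put the timestamps on the X axis instead of the row numbers, assuming the columns are named as in the question.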
Hello, I have a list of about 150 dates stored in string format. I would like to set an interval so that there are only 10 ticks along the x-axis, but I am not sure how to do this without changing the type or format.
'1980-06',
'1980-09',
'1980-12',
'1981-03',
'1981-06',
'1981-09',
'1981-12',
...
You can provide the list of ticks you want to show. To build that list, divide your date list into ten parts to work out the step between ticks, then use Python's slicing to pick the tick values. See below:
import math
dates = ['1980-06', '1980-09', '1980-12', '1981-12' ...] # Your date list
date_len = len(dates)
step = int(math.ceil(date_len/10))
ticks = dates[::step] # ticks to show on graph
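A self-contained version of the same idea might look like this. The 150 quarterly strings are generated here purely as stand-in data, and the helper names quarterly_labels and pick_ticks are hypothetical.

```python
import math


def quarterly_labels(start_year, start_month, count):
    """Generate 'YYYY-MM' strings three months apart (stand-in data)."""
    labels = []
    year, month = start_year, start_month
    for _ in range(count):
        labels.append(f"{year}-{month:02d}")
        month += 3
        if month > 12:
            month -= 12
            year += 1
    return labels


def pick_ticks(labels, max_ticks=10):
    """Return (positions, tick labels) with at most max_ticks entries."""
    step = max(1, math.ceil(len(labels) / max_ticks))
    positions = list(range(0, len(labels), step))
    return positions, [labels[i] for i in positions]


dates = quarterly_labels(1980, 6, 150)
positions, ticks = pick_ticks(dates)
```

If the strings were plotted as categories on a Matplotlib Axes ax, then ax.set_xticks(positions) followed by ax.set_xticklabels(ticks) should show only these labels, assuming every date string is unique so that each one sits at its list index.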
I want to plot timelines, my dates are formatted as day/month/year.
When building the index, I take care of that:
# format Date
test['DATA'] = pd.to_datetime(test['DATA'], format='%d/%m/%Y')
test.set_index('DATA', inplace=True)
and with a double check I see months and days are correctly interpreted:
# the month number reflects the month, not the day: correctly imported!
test['Year'] = test.index.year
test['Month'] = test.index.month
test['Weekday Name'] = test.index.day_name()  # weekday_name was removed in pandas 1.0
However, when I plot, I see datapoints get connected erratically (although their distribution seems to be correct, since I expect a seasonality):
# Start and end of the date range to extract
start, end = '2018-01', '2018-04'
# Plot daily, weekly resampled, and 7-day rolling mean time series together
fig, ax = plt.subplots()
ax.plot(test.loc['2018', 'TMIN °C'],
marker='.', linestyle='-', linewidth=0.5, label='Daily')
I suspect it may have to do with misinterpreted dates, or that dates are not put in the right sequence, but could not find a way to verify where an error may be.
Could you help me check how to import my time series correctly?
Oh, it was super simple. I assumed a datetime index was sorted automatically; instead, one must sort it explicitly:
test.loc['2018-01':'2018-03'].sort_index().index #sorted
test.loc['2018-01':'2018-03'].index #not sorted
This question may be deleted or marked as a duplicate; I leave that to the moderators:
Pandas - Sorting a dataframe by using datetimeindex
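A line plot connects points in the order they are given, so rows that are out of date order produce the erratic connections described above. A minimal sketch of how to check for this, using plain Python and made-up rows (the helper is_chronological is hypothetical):

```python
from datetime import date


def is_chronological(dates):
    """True if every date is on or after the one before it."""
    return all(a <= b for a, b in zip(dates, dates[1:]))


# Hypothetical (date, minimum temperature) rows in file order, not date order.
rows = [
    (date(2018, 1, 3), 4.0),
    (date(2018, 1, 1), 2.5),
    (date(2018, 1, 2), 3.1),
]

unsorted_ok = is_chronological([d for d, _ in rows])

# Sorting by date before plotting makes the line run left to right.
rows_sorted = sorted(rows, key=lambda row: row[0])
sorted_ok = is_chronological([d for d, _ in rows_sorted])
```

In pandas, the equivalent check is test.index.is_monotonic_increasing, and test = test.sort_index() puts the rows in date order before plotting.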
I have stock data with OHLC attributes, and I want to plot an RSI indicator calculated from the close values. Because the stock data is sorted by date, the dates must be converted to numbers using date2num. But when the RSI values calculated from the close attribute are plotted against those dates, the line overlaps itself.
I thought the RSI result might not have the same length as the dates, but a test with len(rsi) == len(df['date']) shows the lengths are equal. I then tried replacing the date x-axis with a list of numbers made by range(0, len(df['date'])), and the plot looks as I expected.
# get data
df = df.tail(1000)
# convert date
df['date'] = pd.to_datetime(df['date'])
df['date'] = df['date'].apply(mdates.date2num)
# make indicator with TA-Lib
rsi = ta.RSI(df['close'], timeperiod=14)
# plot RSI indicator against dates and against row numbers
ax1.plot(df['date'], rsi)
ax2.plot(range(0, len(df['date'])), rsi)
# show chart
plt.show()
I expect the output using the x-axis date to be the same as the x-axis list of numbers
Image that shows the difference
It seems that matplotlib chooses the x-ticks to display (when chosen automatically) to show "round" numbers. So in your case of integers, a tick every 200; in your case of dates, every two months.
You seem to expect the dates to follow the same tick steps as the integers, but this will cause the graph to show arbitrary dates in the middle of the month, which isn't a good default behavior.
If that's the behavior you want, try something of this sort:
rng = range(len(df['date']))
ax2.plot(rng, rsi) # Same as in your example
ax2.set_xlim((rng[0], rng[-1])) # Make sure no ticks outside of range
ax2.set_xticklabels(df['date'].iloc[ax2.get_xticks()]) # Show respective dates in the locations of the integers
This behavior can of course be reversed if you wish to show numbers instead of dates, using the same ticks as the dates, but I'll leave that to you.
After several attempts, I found the core of the problem. No data is recorded on weekends, so there are gaps in the dates. A Matplotlib date x-axis still reserves space for the weekends even though there is no data on those days, which is why the line plot overlaps.
I have not found a proper solution yet, so for the time being I use the list of numbers.
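One possible way to keep the evenly spaced integer axis while still showing dates is to plot against row positions and translate each tick position back into a date label. The sketch below is an assumption-laden illustration, not a confirmed fix for the original data: label_for_position and plot_without_gaps are hypothetical helpers, and dates is assumed to be a plain list of date or datetime objects (taken before any date2num conversion). The bounds check matters because tick positions are floats and can fall outside the data range.

```python
from datetime import date


def label_for_position(dates, pos):
    """Return the date label for a row position, or "" if it is out of range."""
    i = int(round(pos))
    if 0 <= i < len(dates):
        return dates[i].strftime("%Y-%m-%d")
    return ""


def plot_without_gaps(dates, values):
    # Imported here so label_for_position can be used without matplotlib installed.
    import matplotlib.pyplot as plt
    from matplotlib.ticker import FuncFormatter

    fig, ax = plt.subplots()
    # Evenly spaced row positions: weekends take up no space on the axis.
    ax.plot(range(len(values)), values)
    # Show the date of the row at each tick instead of the row number.
    ax.xaxis.set_major_formatter(
        FuncFormatter(lambda x, _pos: label_for_position(dates, x))
    )
    fig.autofmt_xdate()
    return fig


# Hypothetical trading days: Thursday, Friday, then Monday and Tuesday.
trading_days = [date(2019, 5, 2), date(2019, 5, 3), date(2019, 5, 6), date(2019, 5, 7)]
```

With this approach the dates on the axis are no longer spaced proportionally to elapsed time, which is the trade-off for removing the weekend gaps.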
I have some data loaded into numpy. I do not have a CSV or any other file containing the range of dates I need, but I know the length of the array.
Currently I am just doing this to draw a simple graph:
t = numpy.arange(0.0, len(data), 1)
pylab.plot(t, data)
Would it be possible to replace t here so that I can specify a start and end date and have the plot show the actual dates? Say I have 365 days in my dataset: the plot would then show actual dates in a format such as DD/MM/YYYY, e.g. 1/1/1999, 1/2/1999, ..., 12/31/1999.
You might want to take a look at plot_date()
and the matplotlib dates API.
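As a rough sketch of how the x values could be built from a start date and the known array length, using only the standard library (the helper daily_dates and the placeholder data are assumptions for illustration):

```python
from datetime import date, timedelta


def daily_dates(start, count):
    """Return `count` consecutive calendar dates beginning at `start`."""
    return [start + timedelta(days=i) for i in range(count)]


data = [0.0] * 365  # placeholder for the real measurements
dates = daily_dates(date(1999, 1, 1), len(data))
```

The resulting list can replace t, for example pylab.plot(dates, data), since Matplotlib accepts date objects on the x axis. Setting the axis formatter to matplotlib.dates.DateFormatter('%d/%m/%Y') should then print the ticks in day/month/year form.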