I have a simple question. I have a program that simulates a week or a month in the life of a shop. For now it handles the cash desks (I don't know if I translated that correctly from my language), which can fail from time to time, after which a specialist has to come to the shop and repair them. At the end of the simulation, the program plots a graph that looks like this:
The 1.0 state occurs when a cash desk has hit an error or broken down; it then waits for a technician to repair it and returns to 0, the working state.
I or rather my project guy would rather see something else than minutes on the x axis. How can I do it? I mean, I would like it to be like Day 1, then an interval, Day 2, etc.
I know about pyplot.xticks() method, but it assigns the labels to the ticks that are in the list in the first argument, so then I would have to make like 2000 labels, with minutes, and I want only 7, with days written on it.
You can set the tick positions with set_ticks and then relabel them with set_xticklabels, both on ax; the approach is inspired by two related questions.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
minutes_in_day = 24 * 60
test = pd.Series(np.random.binomial(1, 0.002, 7 * minutes_in_day))
fig, ax = plt.subplots(1)
test.plot(ax = ax)
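# Place one tick at the start of each day, beginning at minute 0
ticks = np.arange(0, len(test), minutes_in_day)
ax.xaxis.set_ticks(ticks)
# Build the labels from the tick positions instead of parsing the label text
labels = ['Day\n %d' % (tick // minutes_in_day + 1) for tick in ticks]
ax.set_xticklabels(labels)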
I get something like the picture below.
You're on the right track with plt.xticks(). Try this:
import matplotlib.pyplot as plt
# Generate dummy data
x_minutes = range(1, 2001)
y = [i*2 for i in x_minutes]
# Convert minutes to days
x_days = [i/1440.0 for i in x_minutes]
# Plot the data over the newly created days list
plt.plot(x_days, y)
# Create labels using some string formatting
labels = ['Day %d' % (item) for item in range(int(min(x_days)), int(max(x_days)+1))]
# Set the tick strings
plt.xticks(range(len(labels)), labels)
# Show the plot
plt.show()
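Another option, in case it is useful: instead of building the label list by hand, you can leave the data in minutes and let a locator and a formatter do the work. This is only a sketch under my own assumptions; the random data and the names day_label and MINUTES_IN_DAY are made up for illustration and are not from the question.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import FuncFormatter, MultipleLocator

MINUTES_IN_DAY = 24 * 60


def day_label(x, pos=None):
    """Turn a position in minutes into a label such as 'Day 1'."""
    return "Day %d" % (int(x // MINUTES_IN_DAY) + 1)


# Hypothetical data: one 0/1 state per minute for a week
rng = np.random.default_rng(0)
state = rng.binomial(1, 0.002, 7 * MINUTES_IN_DAY)

fig, ax = plt.subplots()
ax.step(np.arange(state.size), state, where="post")
ax.set_xlim(0, state.size)
# One major tick per day, labelled by the formatter above
ax.xaxis.set_major_locator(MultipleLocator(MINUTES_IN_DAY))
ax.xaxis.set_major_formatter(FuncFormatter(day_label))
fig.canvas.draw()
```

The x data stays in minutes, so zooming or changing the simulated period should not require rebuilding any labels.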
My code creates 4 stock graphs (data taken from Yahoo with pandas-datareader) with matplotlib. The problem I SOMETIMES get is that the X and Y axes of the first graph in the sequence of 4, which are created one after the other, look like this:
Matplotlib scales it entirely wrong: the data given to it does not span the years 1970 to 2020, it only covers 5 days. But as stated earlier, it only happens sometimes, and when the code creates the other 3 graphs there are no problems at all. I believe this is due to some issue that occurs when it creates a graph that ranges from 5-10 dollars and then quickly creates another graph that ranges from 50-100 dollars. I do have a sleep between the graphs, but it doesn't seem to help.
I also can't scale it manually every time because the code has to create graphs of various stocks with various ranges on the axis.
#Libraries
import pandas as pd
import pandas_datareader as web
import matplotlib.pyplot as plt
import datetime
date = datetime.date.today()
def Graphmaker(ticker, start, end, interval):
    stock_data = web.DataReader(ticker, data_source = "yahoo", start = start, end = end) #Gets information
    plt.plot(stock_data['Close']) #Plots
    plt.autoscale()
    plt.axis('off') #Removes axis
    graph = plt.gcf()
    #
    plt.switch_backend('agg')
    #
    plt.draw()
    #plt.show()
    date = datetime.date.today()
    file_name = "graph_" + ticker + "_" + interval + "_day_graph_" + str(date) + ".png"
    graph.savefig('Data/test/' + file_name, dpi=100) #Saves graph, Adds the graph image to be tested which later gets moved to 'train'
    plt.clf() #Without this the graphs stack on each other and create incorrect lines
    return(file_name) #Passing the file name back to main which gets passed to predictor
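One possible explanation, which nothing in the question confirms, is that state carries over between calls because everything goes through pyplot's implicit current figure. A sketch that sidesteps this by giving every graph its own figure and axes is shown here. The helper make_graph, the two price series and the temporary output directory are hypothetical stand-ins, and the data is synthetic rather than downloaded.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # backend chosen once, before any figure exists
import matplotlib.pyplot as plt
import pandas as pd


def make_graph(prices, file_name):
    """Plot one price series on its own figure and save it (hypothetical helper)."""
    fig, ax = plt.subplots()   # a new figure and axes for every call
    ax.plot(prices.index, prices.values)
    ax.axis("off")             # hide the axes, as in the question
    fig.savefig(file_name, dpi=100)
    plt.close(fig)             # release this figure instead of clearing a shared one
    return file_name


# Synthetic stand-ins for two stocks with very different price ranges
out_dir = tempfile.mkdtemp()
dates = pd.date_range("2021-01-04", periods=5, freq="D")
cheap = pd.Series([5.0, 6.5, 7.0, 9.0, 10.0], index=dates)
pricey = pd.Series([50.0, 65.0, 70.0, 90.0, 100.0], index=dates)

saved = [
    make_graph(cheap, os.path.join(out_dir, "cheap.png")),
    make_graph(pricey, os.path.join(out_dir, "pricey.png")),
]
```

Because each call autoscales its own axes, no limits from a previous graph can leak into the next one. Whether that is what caused the 1970 to 2020 axis here is a guess.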
I have time-series plots (over 1 year) where the months on the x-axis are of the form Jan, Feb, Mar, etc, but I would like to have just the first letter of the month instead (J,F,M, etc). I set the tick marks using
ax.xaxis.set_major_locator(MonthLocator())
ax.xaxis.set_minor_locator(MonthLocator())
ax.xaxis.set_major_formatter(matplotlib.ticker.NullFormatter())
ax.xaxis.set_minor_formatter(matplotlib.dates.DateFormatter('%b'))
Any help would be appreciated.
The following snippet, based on the official Matplotlib example, works for me.
It uses a function-based index formatter that returns only the first letter of the month, as requested.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.mlab as mlab
import matplotlib.cbook as cbook
import matplotlib.ticker as ticker
datafile = cbook.get_sample_data('aapl.csv', asfileobj=False)
print('loading', datafile)
# Note: mlab.csv2rec is only available in older Matplotlib releases
r = mlab.csv2rec(datafile)
r.sort()
r = r[-365:] # get the last year
# next we'll write a custom formatter
N = len(r)
ind = np.arange(N) # the evenly spaced plot indices
def format_date(x, pos=None):
    thisind = np.clip(int(x+0.5), 0, N-1)
    return r.date[thisind].strftime('%b')[0]
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(ind, r.adj_close, 'o-')
ax.xaxis.set_major_formatter(ticker.FuncFormatter(format_date))
fig.autofmt_xdate()
plt.show()
I tried to make the solution suggested by @Appleman1234 work, but since I wanted a solution that I could save in an external configuration script and import in other programs, I found it inconvenient that the formatter needed variables defined outside the formatter function itself.
I did not solve this but I just wanted to share my slightly shorter solution here so that you and maybe others can take it or leave it.
It turned out to be a little tricky to get the labels in the first place, since you need to draw the axes before the tick labels are set. Otherwise you just get empty strings when you use Text.get_text().
You may want to get rid of the argument minor=True, which was specific to my case.
# ...
# Manipulate tick labels
plt.draw()
ax.set_xticklabels(
    [t.get_text()[0] for t in ax.get_xticklabels(minor=True)], minor=True
)
I hope it helps:)
The original answer uses the index of the dates. This is not necessary. One can instead get the month names from the DateFormatter('%b') and use a FuncFormatter to use only the first letter of the month.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter
from matplotlib.dates import MonthLocator, DateFormatter
x = np.arange("2019-01-01", "2019-12-31", dtype=np.datetime64)
y = np.random.rand(len(x))
fig, ax = plt.subplots()
ax.plot(x,y)
month_fmt = DateFormatter('%b')
def m_fmt(x, pos=None):
    return month_fmt(x)[0]
ax.xaxis.set_major_locator(MonthLocator())
ax.xaxis.set_major_formatter(FuncFormatter(m_fmt))
plt.show()
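For those who want a formatter that depends on nothing outside its own body, so it can live in a separate module and be imported, a variation on the same idea is to convert the tick value back to a datetime inside the function. A minimal sketch, assuming the axis holds ordinary Matplotlib date values and the default (English) locale; the names month_initial and month_initial_formatter are mine:

```python
import datetime

import matplotlib.dates as mdates
from matplotlib.ticker import FuncFormatter


def month_initial(x, pos=None):
    """Return the first letter of the month for a Matplotlib date number."""
    return mdates.num2date(x).strftime("%b")[0]


# Ready-made formatter object that other scripts could import
month_initial_formatter = FuncFormatter(month_initial)

# Quick check with a date in the middle of March
example = mdates.date2num(datetime.datetime(2019, 3, 15))
print(month_initial(example))
```

It would be used as ax.xaxis.set_major_formatter(month_initial_formatter), together with a MonthLocator as in the answer above.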
Following up my previous question: Sorting datetime objects by hour to a pandas dataframe then visualize to histogram
I need to plot 3 bars for each X-axis value, representing viewer counts. Right now the bars show the viewers who watched for under one minute and those who watched for longer. I need a third one showing the overall viewers. I have the DataFrame, but I can't seem to make the bars look right. With just 2 bars I have no problem; it looks exactly the way I want:
The relevant part of the code for this:
# Time and date stamp variables
allviews = int(df['time'].dt.hour.count())
date = str(df['date'][0].date())
hours = df_hist_short.index.tolist()
hours[:] = [str(x) + ':00' for x in hours]
The hours variable that I use to represent the X-axis may be problematic, since I convert it to string so I can make the hours look like 23:00 instead of just the pandas index output 23 etc. I have seen examples where people add or subtract values from the X to change the bars position.
fig, ax = plt.subplots(figsize=(20, 5))
short_viewers = ax.bar(hours, df_hist_short['time'], width=-0.35, align='edge')
long_viewers = ax.bar(hours, df_hist_long['time'], width=0.35, align='edge')
Right now I set align='edge' and use one positive and one negative width value. But I have no idea how to make it look right with 3 bars. I didn't find any positioning arguments for the bars. I have also tried to work with plt.hist(), but I couldn't get the same output as with the plt.bar() function.
So as a result I wish to have a 3rd bar on the graph shown above on the left side, a bit wider than the other two.
pandas will do this alignment for you, if you make the bar plot in one step rather than two (or three). Consider this example (adapted from the docs to add a third bar for each animal).
import pandas as pd
import matplotlib.pyplot as plt
speed = [0.1, 17.5, 40, 48, 52, 69, 88]
lifespan = [2, 8, 70, 1.5, 25, 12, 28]
height = [1, 5, 20, 3, 30, 6, 10]
index = ['snail', 'pig', 'elephant',
'rabbit', 'giraffe', 'coyote', 'horse']
df = pd.DataFrame({'speed': speed,
'lifespan': lifespan,
'height': height}, index=index)
ax = df.plot.bar(rot=0)
plt.show()
In pure matplotlib, instead of using the width parameter to position the bars as you've done, you can adjust the x-values for your plot:
import numpy as np
import matplotlib.pyplot as plt
# Make some fake data:
n_series = 3
n_observations = 5
x = np.arange(n_observations)
data = np.random.random((n_observations,n_series))
# Plotting:
fig, ax = plt.subplots(figsize=(20,5))
# Determine bar widths
width_cluster = 0.7
width_bar = width_cluster/n_series
for n in range(n_series):
    x_positions = x+(width_bar*n)-width_cluster/2
    ax.bar(x_positions, data[:,n], width_bar, align='edge')
In your particular case, seaborn is probably a good option. You should (almost always) try to keep your data in long form, so instead of three separate data frames for short, medium and long, it is much better practice to keep a single data frame and add a column that labels each row as short, medium or long. Use this new column as the hue parameter in seaborn's barplot.
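To make the long-form idea concrete, here is a rough sketch. The viewer counts, the column names (hour, category, viewers) and the helper plot_grouped are invented for illustration; only the general shape of the data matters. seaborn is imported inside the helper so that the data preparation also runs where seaborn is not installed.

```python
import pandas as pd

# Hypothetical viewer counts per hour for the two existing categories
wide = pd.DataFrame(
    {"short": [5, 8, 3], "long": [2, 4, 6]},
    index=pd.Index(["21:00", "22:00", "23:00"], name="hour"),
)
# The third bar: overall viewers per hour
wide["all"] = wide["short"] + wide["long"]

# Long form: one row per (hour, category) pair
long_df = wide.reset_index().melt(
    id_vars="hour", var_name="category", value_name="viewers"
)


def plot_grouped(data):
    """Draw grouped bars, one colour per category (requires seaborn)."""
    import seaborn as sns  # imported lazily; assumed to be installed

    return sns.barplot(data=data, x="hour", y="viewers", hue="category")
```

Calling plot_grouped(long_df) should then give three bars per hour, with the positions worked out by seaborn.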
If I run the following, it appears to work as expected, but the y-axis is limited to the earliest and latest times in the data. I want it to show midnight to midnight. I thought I could do that with the code that's commented out. But when I uncomment it, I get the correct y-axis, yet nothing plots. Where am I going wrong?
from datetime import datetime
import matplotlib.pyplot as plt
data = ['2018-01-01 09:28:52', '2018-01-03 13:02:44', '2018-01-03 15:30:27', '2018-01-04 11:55:09']
x = []
y = []
for i in range(0, len(data)):
    t = datetime.strptime(data[i], '%Y-%m-%d %H:%M:%S')
    x.append(t.strftime('%Y-%m-%d')) # X-axis = date
    y.append(t.strftime('%H:%M:%S')) # Y-axis = time
plt.plot(x, y, '.')
# begin = datetime.strptime('00:00:00', '%H:%M:%S').strftime('%H:%M:%S')
# end = datetime.strptime('23:59:59', '%H:%M:%S').strftime('%H:%M:%S')
# plt.ylim(begin, end)
plt.show()
Edit: I also noticed that the x-axis isn't right either. The data skips Jan 2, but I want that on the axis so the data is to scale.
This is a dramatically simplified version of code dealing with over a year's worth of data with over 2,500 entries.
If Pandas is available to you, consider this approach:
import pandas as pd
data = pd.to_datetime(data, yearfirst=True)
plt.plot(data.date, data.time)
_=plt.ylim(["00:00:00", "23:59:59"])
Update per comments
X-axis date formatting can be adjusted using the Locator and Formatter methods of the matplotlib.dates module. Locator finds the tick positions, and Formatter specifies how you want the labels to appear.
Sometimes Matplotlib/Pandas just gets it right, other times you need to call out exactly what you want using these extra methods. In this case, I'm not sure why those numbers are showing up, but this code will remove them.
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
f, ax = plt.subplots()
data = pd.to_datetime(data, yearfirst=True)
ax.plot(data.date, data.time)
ax.set_ylim(["00:00:00", "23:59:59"])
days = mdates.DayLocator()
d_fmt = mdates.DateFormatter('%m-%d')
ax.xaxis.set_major_locator(days)
ax.xaxis.set_major_formatter(d_fmt)
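A note on why the original attempt shows nothing: strftime returns strings, and Matplotlib treats strings as categories rather than as points on a time scale, so limits given as strings do not behave like a midnight-to-midnight range. If you would rather not rely on how time objects get converted, one alternative is to plot the time of day as a plain number of hours and format the ticks yourself. This is a sketch under my own assumptions; the names stamps, days, hours and hour_label are made up here.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.ticker import FuncFormatter, MultipleLocator

data = ['2018-01-01 09:28:52', '2018-01-03 13:02:44',
        '2018-01-03 15:30:27', '2018-01-04 11:55:09']
stamps = pd.to_datetime(pd.Series(data))

# X: the calendar day as a real datetime, so Jan 2 keeps its place on the axis
days = stamps.dt.normalize()
# Y: time of day as fractional hours (0 to 24)
hours = stamps.dt.hour + stamps.dt.minute / 60 + stamps.dt.second / 3600


def hour_label(value, pos=None):
    """Format fractional hours as HH:MM."""
    total_minutes = int(round(value * 60))
    return "%02d:%02d" % divmod(total_minutes, 60)


fig, ax = plt.subplots()
ax.plot(days.to_numpy(), hours.to_numpy(), '.')
ax.set_ylim(0, 24)  # midnight to midnight
ax.yaxis.set_major_locator(MultipleLocator(3))
ax.yaxis.set_major_formatter(FuncFormatter(hour_label))
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%m-%d'))
fig.canvas.draw()
```

Both axes are numeric underneath, so the limits and the gap for the missing day come out to scale.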
I am really new to Python, but I need to use an already existing IPython notebook from my professor to analyze a dataset (using Python 2). The data I have is in a .txt document and is a list of numbers with a "," as the decimal separator. I managed to import this list and plot it; all good so far.
My problem now is:
I want an index (year) on the x-axis of my chart starting at 563 for the first value going till 1995 for the last value (there are 1,433 data points in total). How can I add this index to the list without touching the original data?
Here is the code I use:
import numpy as np
import random
import matplotlib.pyplot as plt
%matplotlib inline
fig = plt.figure(figsize=(15,4))
import os
D = open(os.path.expanduser("~/MY_FILE_DIRECTORY/Data.txt"))
Dat = D.read().replace(',','.')
Dat = [float(x) for x in Dat.split('\n')]
D.close()
plt.subplot(1, 1, 1)
plt.plot(Dat, 'b-')
cutmin = 0
cutmax = 1420
plt.axvline(cutmin, color = 'red')
plt.axvline(cutmax, color = 'red')
plt.grid()
Please help me! :-)
I suppose that when you say index you mean x-axis labels for your data, which are different from the x-coordinates of your actual data (which you do not want to modify). You also say that these indices are years from 563 to 1995. The xticks() function allows you to change the locations and labels of the tick marks on your x-axis. So you can add these two lines to your code.
index = np.arange(563, 1996, 1, dtype=np.int32)
plt.xticks(np.arange(0, len(Dat), 200), index[::200])  # a year label at every 200th data point
Hope this is what you wanted.
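If you are fine with the x-coordinates themselves being years, an alternative is to build a separate array of years and plot against it; the original list is still left untouched. A small sketch, where values is a made-up stand-in for the 1,433 numbers read from the file:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-in for the 1,433 values read from the file
values = np.sin(np.linspace(0, 20, 1433))

first_year = 563
years = first_year + np.arange(len(values))  # 563, 564, ..., 1995

fig, ax = plt.subplots(figsize=(15, 4))
ax.plot(years, values, 'b-')
# The cut lines are now given in years rather than in list positions
ax.axvline(years[0], color='red')
ax.axvline(years[1420], color='red')
ax.grid()
fig.canvas.draw()
```

With this approach Matplotlib chooses sensible year ticks on its own, at the cost that anything positioned on the x-axis (such as cutmin and cutmax) has to be expressed in years too.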