Plotting Time and float value using python matplotlib from File - python

I have a text file with a time column and a float column. I have heard that it is possible to plot these two columns using matplotlib. I searched similar threads but could not make it work. My code and data are:
import math
import datetime
import matplotlib
import matplotlib.pyplot as plt
import csv
with open('MaxMin.txt','r') as f_input:
    csv_input = csv.reader(f_input, delimiter=' ', skipinitialspace=True)
    x = []
    y = []
    for cols in csv_input:
        x = matplotlib.dates.date2num(cols[0])
        y = [float(cols[1])]
# naming the x axis
plt.xlabel('Real-Time')
# naming the y axis
plt.ylabel('Acceleration (m/s2)')
# giving a title to my graph
plt.title('Accelerometer reading graph!')
# plotting the points
plt.plot(x, y)
# beautify the x-labels
plt.gcf().autofmt_xdate()
# function to show the plot
plt.show()
And part of the data in MaxMin.txt:
23:28:30.137 10.7695982757
23:28:30.161 10.4071263594
23:28:30.187 9.23969855461
23:28:30.212 9.21066485657
23:28:30.238 9.25117645762
23:28:30.262 9.59227680741
23:28:30.287 9.9773536301
23:28:30.312 10.0128275058
23:28:30.337 9.73353441664
23:28:30.361 9.75064993988
23:28:30.387 9.717339267
23:28:30.412 9.72736788911
23:28:30.440 9.62451269364
I am a beginner in Python, using Python 2.7.15 on Windows 10 Pro (64-bit). I have already installed numpy, scipy and scikit-learn. Please help.
Final output graph from the complete data set. Thanks @ImportanceOfBeingErnest

You could use pandas to achieve this. Store your data as a space-delimited .csv file and read it with read_csv; the sample has no header row, so pass header=None (the columns are then labelled 0 and 1):
import matplotlib.pyplot as plt
import pandas as pd  # import this library
df = pd.read_csv("path_to_file.csv", delimiter=' ', header=None, encoding='latin-1')
# .ix was removed in pandas 1.0; .iloc selects columns by position
x = df.iloc[:, 0]
y = df.iloc[:, 1]
# naming the x axis
plt.xlabel('Real-Time')
# naming the y axis
plt.ylabel('Acceleration (m/s2)')
# giving a title to my graph
plt.title('Accelerometer reading graph!')
# plotting the points
plt.plot(x, y)
# beautify the x-labels
plt.gcf().autofmt_xdate()
# function to show the plot
plt.show()
If the first column does not have a datetime format, you can convert it with df[0] = pd.to_datetime(df[0], format='%H:%M:%S.%f')
and then take the hour, for example:
df[0] = df[0].map(lambda t: t.hour)
The output after running the code was like:

The error you made in the original attempt is actually pretty minor. Instead of appending the values from the loop you redefined them.
Also you would need to use datestr2num instead of date2num, because the string read in is not yet a date.
import matplotlib
import matplotlib.dates
import matplotlib.pyplot as plt
import csv
with open('MaxMin.txt','r') as f_input:
    csv_input = csv.reader(f_input, delimiter=' ', skipinitialspace=True)
    x = []
    y = []
    for cols in csv_input:
        # append to the lists instead of overwriting them
        x.append(matplotlib.dates.datestr2num(cols[0]))
        y.append(float(cols[1]))
# naming the x axis
plt.xlabel('Real-Time')
# naming the y axis
plt.ylabel('Acceleration (m/s2)')
# giving a title to my graph
plt.title('Accelerometer reading graph!')
# plotting the points
plt.plot_date(x, y)
# beautify the x-labels
plt.gcf().autofmt_xdate()
# function to show the plot
plt.show()
To make this easier, I would recommend using numpy and converting the input to datetime.
from datetime import datetime
import numpy as np
import matplotlib.pyplot as plt
x,y= np.loadtxt('MaxMin.txt', dtype=str, unpack=True)
x = np.array([datetime.strptime(i, "%H:%M:%S.%f") for i in x])
y = y.astype(float)
plt.plot(x,y)
plt.gcf().autofmt_xdate()
plt.show()
Concerning the ticking of the axes: In order to have ticks every half a second you can use a MicrosecondLocator with an interval of 500000.
import matplotlib.dates
# ...
loc = matplotlib.dates.MicrosecondLocator(500000)
plt.gca().xaxis.set_major_locator(loc)
plt.gca().xaxis.set_major_formatter(matplotlib.dates.AutoDateFormatter(loc))
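Put together as a complete script, the tick setup above might look like the following minimal sketch. The timestamps are synthetic stand-ins for the file contents (an assumption, as is the arbitrary date 2020-01-01), the explicit DateFormatter is swapped in for the AutoDateFormatter so that fractions of a second are always visible, and the Agg backend is chosen only so that the script runs without a display.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.dates as mdates
import matplotlib.pyplot as plt

# Synthetic data: one sample every 25 ms for 2 seconds (stand-in for MaxMin.txt)
start = dt.datetime(2020, 1, 1, 23, 28, 30)
times = [start + dt.timedelta(milliseconds=25 * i) for i in range(80)]
values = [9.8 + 0.05 * (i % 10) for i in range(80)]

fig, ax = plt.subplots()
ax.plot(times, values)

# One major tick every 500000 microseconds, i.e. every half second
loc = mdates.MicrosecondLocator(interval=500000)
ax.xaxis.set_major_locator(loc)
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M:%S.%f"))
fig.autofmt_xdate()

fig.canvas.draw()  # render once; with an interactive backend use plt.show()
```

Working on an Axes object (ax) rather than plt.gca() keeps the locator and formatter tied to one explicit plot, which helps once a figure has several subplots.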

Related

Matplotlib axes confused

I'm trying to figure out where I went wrong with this plot. I want the y-axis to go from 0 to 1, but it is ordered 0.1, 0, 1 instead, and I'm not sure what I'm doing wrong.
The csv file is in the following format:
dishwasher,60,1,1,0,1,0,0.1
import matplotlib.pyplot as plt
import numpy as np
import csv
x = np.array([1,2,3,4,5,6])
with open('Test 5.csv', 'r') as csvfile:
    plots = csv.reader(csvfile, delimiter=',')
    rows = [row for row in plots]
y1=rows[0][2:]
y2=rows[1][2:]
plt.plot(x,y1, label='Washing Machine')
plt.plot(x,y2, label='Dishwasher')
plt.legend()
plt.show()
The plot comes out as follows:
The only solutions I could think of were to invert the axes or to set the scale for the y-axis explicitly, but neither worked.
Your y values are most likely strings, which is why your y-axis is out of order. Convert them to floats before plotting, for example with a list comprehension:
y1=rows[0][2:]
y2=rows[1][2:]
y1 = [float(i) for i in y1] # <--- convert to float
y2 = [float(i) for i in y2] # <--- convert to float
plt.plot(x,y1, label='Washing Machine')
plt.plot(x,y2, label='Dishwasher')
You can also use the map function, as follows:
y1 = list(map(float, y1))
y2 = list(map(float, y2))
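To see why the conversion is needed, here is a small standard-library sketch showing that csv.reader always yields strings. The in-memory text stands in for 'Test 5.csv', and the washing machine row is invented for illustration; only the dishwasher row comes from the question.

```python
import csv
import io

# Stand-in for 'Test 5.csv'; the first row is made up for this example
csv_text = "washing machine,60,0,1,1,0,1,0.5\ndishwasher,60,1,1,0,1,0,0.1\n"

rows = list(csv.reader(io.StringIO(csv_text)))

raw = rows[1][2:]              # values as csv.reader delivers them: strings
y2 = [float(v) for v in raw]   # numeric values that matplotlib can scale

print(raw)   # ['1', '1', '0', '1', '0', '0.1']
print(y2)    # [1.0, 1.0, 0.0, 1.0, 0.0, 0.1]
```

When matplotlib receives strings it treats them as category labels and places them in order of first appearance, which is consistent with the scrambled axis described in the question.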
Try using pandas to import CSV files.
You don't have to pass x = [1,2,3,...] explicitly; by default the x-axis uses the position of each value.
Sample code:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv("Test 5.csv")
print(df.columns)
Let's assume your data frame df has two columns (washing_machine and dishwasher). To plot these columns using matplotlib:
plt.plot(df.washing_machine.values, label='Washing Machine')
plt.plot(df.dishwasher.values, label='Dishwasher')
plt.legend()
plt.show()
Hope this helps. Enjoy coding.

Contour plot from csv file with row being axis

I am trying to make a contour plot from a csv file. I would like the first column to be the x axis, the first row (which has values) to be the y axis, and the rest of the matrix to be what is contoured; see the basic example in the figure below.
Simple table example
What I am really struggling with is getting that first row to be the y axis, and then defining that set of values so that they can be passed to the contourf function. Any help would be very much appreciated, as I am very new to Python and really don't know where to start with this problem.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import csv
import pandas as pd
import numpy as np
from csv import reader
from matplotlib import cm
f = pd.read_csv('/trialforplot.csv',dayfirst=True,index_col=0)
x = f.head()
y = f.columns
X,Y = np.meshgrid(x,y)
z=(x,y)
z=np.array(z)
Z=z.reshape((len(x),len(y)))
plt.contour(Y,X,Z)
plt.colorbar=()
plt.xlabel('Time')
plt.ylable('Particle Size')
plt.show()
I'm stuck at defining the z values and getting my contour plot plotting.
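One possible way to set up x, y and Z for such a table is sketched below. It is not a confirmed solution for the asker's file: the in-memory table is made up, the first column is assumed to be numeric (the real file apparently holds dates there), and the Agg backend is used only so the sketch runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up table: first column = x values, header row = y values
csv_text = """time,10,20,30,40
0,1.0,2.0,3.0,4.0
1,2.0,3.0,5.0,6.0
2,3.0,5.0,8.0,9.0
"""

df = pd.read_csv(io.StringIO(csv_text), index_col=0)

x = df.index.to_numpy(dtype=float)             # first column of the file
y = np.array([float(c) for c in df.columns])   # first row (header labels)
X, Y = np.meshgrid(x, y)                       # both have shape (len(y), len(x))
Z = df.to_numpy().T                            # transpose to match that shape

fig, ax = plt.subplots()
cs = ax.contourf(X, Y, Z)
fig.colorbar(cs)
ax.set_xlabel('Time')
ax.set_ylabel('Particle Size')
fig.canvas.draw()  # render once; with an interactive backend use plt.show()
```

The transpose is the important step: the table stores one row per x value, while contourf expects Z to have one row per y value when X and Y come from np.meshgrid(x, y).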

Using pandas/matplotlib/python, I cannot visualize my csv file as clusters

My csv file is,
https://github.com/camenergydatalab/EnergyDataSimulationChallenge/blob/master/challenge2/data/total_watt.csv
I want to visualize this csv file as clusters.
My ideal result would be the following image.(Higher points (red zone) would be higher energy consumption and lower points (blue zone) would be lower energy consumption.)
I want to set x-axis as dates (e.g. 2011-04-18), y-axis as time (e.g. 13:22:00), and z-axis as energy consumption (e.g. 925.840613752523).
I successfully visualized the csv data file as values per 30 minutes with the following program.
from matplotlib import style
from matplotlib import pylab as plt
import numpy as np
style.use('ggplot')
filename='total_watt.csv'
date=[]
number=[]
import csv
with open(filename, 'rb') as csvfile:
    csvreader = csv.reader(csvfile, delimiter=',', quotechar='|')
    for row in csvreader:
        if len(row) ==2 :
            date.append(row[0])
            number.append(row[1])
number=np.array(number)
import datetime
for ii in range(len(date)):
    date[ii]=datetime.datetime.strptime(date[ii], '%Y-%m-%d %H:%M:%S')
plt.plot(date,number)
plt.title('Example')
plt.ylabel('Y axis')
plt.xlabel('X axis')
plt.show()
I also managed to visualize the csv data file as values per day with the following program.
from matplotlib import style
from matplotlib import pylab as plt
import numpy as np
import pandas as pd
style.use('ggplot')
filename='total_watt.csv'
date=[]
number=[]
import csv
with open(filename, 'rb') as csvfile:
    df = pd.read_csv('total_watt.csv', parse_dates=[0], index_col=[0])
    df = df.resample('1D', how='sum')
import datetime
for ii in range(len(date)):
    date[ii]=datetime.datetime.strptime(date[ii], '%Y-%m-%d %H:%M:%S')
plt.plot(date,number)
plt.title('Example')
plt.ylabel('Y axis')
plt.xlabel('X axis')
df.plot()
plt.show()
Although I could visualize the csv file as values per 30 minutes and per day, I have no idea how to visualize the csv data as clusters in 3D.
How can I program it?
Your main issue is probably just reshaping your data so that you have date along one dimension and time along the other. Once you do that you can use whatever plotting you like best (here I've used matplotlib's mplot3d, but it has some quirks).
What follows takes your data and reshapes it appropriately so you can then plot a surface, which I believe is what you are looking for. The key is using the pivot method, which restructures your data by date and time.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import axes3d
fname = 'total_watt.csv'
# Read in the data, but I skipped setting the index and made sure no data
# is lost to a nonexistent header
df = pd.read_csv(fname, parse_dates=[0], header=None, names=['datetime', 'watt'])
# We want to separate the date from the time, so create two new columns
df['date'] = [x.date() for x in df['datetime']]
df['time'] = [x.time() for x in df['datetime']]
# Now we want to reshape the data so we have dates and times making the result 2D
pv = df.pivot(index='time', columns='date', values='watt')
# Not every date has every time, so fill in the subsequent NaNs or there will be holes
# in the surface
pv = pv.fillna(0.0)
# Now, we need to construct some arrays that matplotlib will like for X and Y values
xx, yy = np.mgrid[0:len(pv),0:len(pv.columns)]
# We can now plot the values directly in matplotlib using mplot3d
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_surface(xx, yy, pv.values, cmap='jet', rstride=1, cstride=1)
ax.grid(False)
# Now we have to adjust the ticks and ticklabels - so turn the values into strings
dates = [x.strftime('%Y-%m-%d') for x in pv.columns]
times = [str(x) for x in pv.index]
# Setting a tick every fifth element seemed about right
ax.set_xticks(xx[::5,0])
ax.set_xticklabels(times[::5])
ax.set_yticks(yy[0,::5])
ax.set_yticklabels(dates[::5])
plt.show()
This gives me (using your data) the following graph:
Note that I've assumed when plotting and making the ticks that your dates and times are linear (which they are in this case). If you have data with uneven samples, you'll have to do some interpolation before plotting.
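As a rough illustration of the reshaping step and of filling a gap by interpolation rather than with zeros, consider the sketch below. The five readings are made up, and the linear interpolation used here treats the rows as equally spaced, which is an assumption that only holds for regularly sampled data.

```python
import io

import pandas as pd

# Made-up readings; the 00:30 value for 2011-04-19 is deliberately missing
csv_text = """2011-04-18 00:00:00,100.0
2011-04-18 00:30:00,200.0
2011-04-18 01:00:00,300.0
2011-04-19 00:00:00,110.0
2011-04-19 01:00:00,330.0
"""

df = pd.read_csv(io.StringIO(csv_text), header=None,
                 names=['datetime', 'watt'], parse_dates=[0])
df['date'] = df['datetime'].dt.date
df['time'] = df['datetime'].dt.time

# Rows become times of day, columns become dates; the missing cell is NaN
pv = df.pivot(index='time', columns='date', values='watt')

# Fill the gap from the neighbouring times in the same column
filled = pv.interpolate(axis=0)
print(filled)
```

Whether zeros or interpolated values are more appropriate depends on what a missing reading means in the data, so this is only one option.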

Is there a ready solution in matplotlib to plot times?

This question has two parts. If I have not searched other sources thoroughly enough, please be patient; that is part of my problem.
I wrote a script using data produced by tespeed. The data has the format "YYYYMMDDhhmm,down rate, up rate,unit,server" (hh:mm of ...).
201309221537,0.28,0.04,"Mbit","['speedtest server']"
201309221542,5.78,-1.00,"Mbit","['speedtest server']"
201309221543,0.15,0.06,"Mbit","[...]"
This script plots the above data:
#!/usr/bin/env python2.7
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import csv
def main():
    x = []
    y = []
    with open('/path/to/my/public_html/stdout_tespeed_log.csv','r') as csvfile:
        strData = csv.reader(csvfile, delimiter=',')
        for row in strData:
            x += [float(row[0])]
            y += [float(row[1])]
    fig = plt.figure()
    plt.plot(x,y,'+', label='Average download')
    plt.gca().xaxis.major.formatter.set_scientific(False)
    plt.gca().xaxis.major.formatter.set_powerlimits((-2,13))
    locs,labels = plt.xticks()
    plt.xticks(locs, map(lambda x: "%12.0f" % x, locs))
    plt.axis([x[0], x[-1],0,6.5])
    plt.xticks(rotation=20)
    plt.xlabel('Date [YYYYMMDDhhmm]')
    fig.subplots_adjust(bottom=0.2)
    # plt.legend(loc=3)
    plt.gcf().autofmt_xdate()
    plt.savefig("/path/to/my/public_html/speed.png")
main()
At the end this produces a plot like this:
The time axis is not well configured. :-/ The periodically appearing gaps are because of the fact that there are no minutes 60 - 99 in every hour.
Is there some elegant way to accomplish this? Maybe a ready to go module? ;-)
Matplotlib accepts datetimes, so you can parse the times with
import datetime
datetime.datetime.strptime(row[0], "%Y%m%d%H%M")
and that should work fine.
The formatting options (.set_scientific(False)) won't work this way, though, and your
plt.xticks(locs, map(lambda x: "%12.0f" % x, locs))
should be replaced with something like
import matplotlib.dates as mdates
...
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%Y/%m/%d %H:%M'))
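The gaps described in the question can be reproduced with the standard library alone. The sketch below uses two invented timestamps one minute apart, on either side of an hour boundary, in the question's YYYYMMDDhhmm format.

```python
from datetime import datetime

stamps = ["201309221559", "201309221600"]  # one minute apart

# Treated as floats, the two stamps appear to be 41 units apart
as_floats = [float(s) for s in stamps]
print(as_floats[1] - as_floats[0])   # 41.0

# Parsed as datetimes, the true spacing is recovered
as_times = [datetime.strptime(s, "%Y%m%d%H%M") for s in stamps]
print(as_times[1] - as_times[0])     # 0:01:00
```

Passing the parsed datetimes to plt.plot instead of the floats puts the points at their real positions in time, so the periodic gaps disappear.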

Python Matplotlib Plotting CSV data, formatting date X label

My data looks as follows:
2012021305, 65217
2012021306, 82418
2012021307, 71316
2012021308, 66833
2012021309, 69406
2012021310, 76422
2012021311, 94188
2012021312, 111817
2012021313, 127002
2012021314, 141099
2012021315, 147830
2012021316, 136330
2012021317, 122252
2012021318, 118619
2012021319, 115763
2012021320, 121393
2012021321, 130022
2012021322, 137658
2012021323, 139363
The first column is the date in YYYYMMDDHH format. I'm trying to graph the data using csv2rec from matplotlib.mlab. I can get the data to graph, but the x axis and labels are not showing up the way I expect them to.
import matplotlib
matplotlib.use('Agg')
from matplotlib.mlab import csv2rec
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from pylab import *
output_image_name='plot1.png'
input_filename="data.log"
input = open(input_filename, 'r')
input.close()
data = csv2rec(input_filename, names=['time', 'count'])
rcParams['figure.figsize'] = 10, 5
rcParams['font.size'] = 8
fig = plt.figure()
plt.plot(data['time'], data['count'])
ax = fig.add_subplot(111)
ax.plot(data['time'], data['count'])
hours = mdates.HourLocator()
fmt = mdates.DateFormatter('%Y%M%D%H')
ax.xaxis.set_major_locator(hours)
ax.xaxis.set_major_formatter(fmt)
ax.grid()
plt.ylabel("Count")
plt.title("Count Log Per Hour")
fig.autofmt_xdate(bottom=0.2, rotation=90, ha='left')
plt.savefig(output_image_name)
I assume this has something to do with the date format. Any suggestions?
You need to convert the x-values to datetime objects.
Something like:
from datetime import datetime
time_vec = [datetime.strptime(str(x), '%Y%m%d%H') for x in data['time']]
plot(time_vec, data['count'])
Currently, you are telling Python to format integers (2012021305) as a date, which it does not know how to do, so it returns an empty string (although I suspect that you are getting errors raised someplace).
You should also check your format string markup: %M means minutes, while month and day of the month are the lowercase %m and %d.
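A quick standard-library check of the conversion and of the format codes, using the first value from the question's data, might look like this:

```python
from datetime import datetime

raw = 2012021305  # integer as read from the file (YYYYMMDDHH)

parsed = datetime.strptime(str(raw), "%Y%m%d%H")
print(parsed)                        # 2012-02-13 05:00:00

# Lowercase %m and %d are month and day of the month
print(parsed.strftime("%Y%m%d%H"))   # 2012021305

# Uppercase %M is minutes, so this prints the year followed by "00"
print(parsed.strftime("%Y%M"))       # 201200
```

The same lowercase codes apply to the string given to mdates.DateFormatter, since it uses strftime-style format strings.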
