Plotting a CSV-file with time using matplotlib - python

I recently started a project in which I need to evaluate and plot data using Python. The CSV files that I have to plot are structured like this:
date,ch1,ch2,ch3,date2
11:56:20.149766,0.909257531,0.909420371,1.140183687, 13:56:20.149980
11:56:20.154008,0.895447016,0.895601869,1.122751355, 13:56:20.154197
11:56:20.157245,0.881764293,0.881911397,1.105638862, 13:56:20.157404
11:56:20.160590,-0.009178977,-0.000108901,-1.486875653, 13:56:20.160750
11:56:20.190473,-1.473576546,-1.477073431,-1.846657276, 13:56:20.190605
11:56:20.193810,-1.460405469,-1.463766813,-1.8300246, 13:56:20.193933
11:56:20.197139,-1.447362065,-1.450844049,-1.813711882, 13:56:20.197262
11:56:20.200480,-1.434574604,-1.437921286,-1.797878742, 13:56:20.200604
11:56:20.203803,-1.422042727,-1.425382376,-1.782045603, 13:56:20.203926
11:56:20.207136,-1.40951097,-1.412971258,-1.7663728, 13:56:20.207258
11:56:20.210472,-0.436505407,-0.438260257,-0.54675138, 13:56:20.210595
11:56:20.213804,0.953246772,0.953690529,1.19551909, 13:56:20.213921
11:56:20.217136,0.93815738,0.938464701,1.176487565, 13:56:20.217252
11:56:20.220472,0.923707485,0.924006522,1.158255577, 13:56:20.220590
11:56:20.223807,0.909385324,0.909676254,1.140343547, 13:56:20.223922
11:56:20.227132,0.895447016,0.895729899,1.122911215, 13:56:20.227248
11:56:20.230466,0.881892085,0.882039428,1.105798721, 13:56:20.230582
I can already read the file and print it using pandas:
df = pd.read_csv (r'F:\Schule\HTL\Diplomarbeit\aw_python\datei_meas.csv')
print (df)
But now I want to plot the data using matplotlib. The first column, date, should go on the x axis, and columns 2, 3 and 4 should be the y-values of separate graphs.
I hope that someone can help me with my problem.
Kind regards
Matthias
Edit:
This is what I have tried in order to convert the date column into a format that matplotlib can read:
import matplotlib.pyplot as plt
import numpy as np
import mplcursors
import pandas as pd
import matplotlib.dates as mdates
df = pd.read_csv (r'F:\Schule\HTL\Diplomarbeit\aw_python\datei_meas.csv')
print (df)
x_list = df.date
y = df.ch1
x = mdates.date2num(x_list)
plt.scatter(x,y)
plt.show()
And this is the error message I get:
d = d.astype('datetime64[us]')
ValueError: Error parsing datetime string " 11:56:20.149766" at position 3
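The error message suggests that the time strings (note the leading blank) cannot be handed to date2num as they are. One possible approach is to parse them with pandas first. This is only a minimal sketch under assumptions: the inline sample stands in for the CSV file, the times are assumed to follow the HH:MM:SS.ffffff pattern, and the Agg backend is selected just so the example runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove for an interactive window
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# A few rows copied from the question stand in for datei_meas.csv.
sample = io.StringIO(
    "date,ch1,ch2,ch3,date2\n"
    "11:56:20.149766,0.909257531,0.909420371,1.140183687, 13:56:20.149980\n"
    "11:56:20.154008,0.895447016,0.895601869,1.122751355, 13:56:20.154197\n"
    "11:56:20.157245,0.881764293,0.881911397,1.105638862, 13:56:20.157404\n"
)
# skipinitialspace drops the blank that follows a comma.
df = pd.read_csv(sample, skipinitialspace=True)

# Parse the time-of-day strings. The format contains no date, so pandas
# fills in 1900-01-01; that is harmless when only the time is displayed.
df["date"] = pd.to_datetime(df["date"].str.strip(), format="%H:%M:%S.%f")

fig, ax = plt.subplots()
for channel in ["ch1", "ch2", "ch3"]:
    ax.plot(df["date"], df[channel], marker="o", label=channel)

# Show only the time part on the x axis.
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M:%S.%f"))
fig.autofmt_xdate()
ax.legend()
plt.show()
```

With the real file, the io.StringIO object would be replaced by the path passed to pd.read_csv.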

Related

Plot Correlation Table imported from excel with Python

So I am trying to plot a correlation matrix (already calculated) in Python. The table looks like this:
And I would like it to look like this:
I am using the following code in Python:
import seaborn as sn
import matplotlib.pyplot as plt
import pandas as pd
data =pd.read_excel('/Desktop/wetchimp_global/corr/correlation_matrix.xlsx')
df = pd.DataFrame(data)
print (df)
corrMatrix = data.corr()
print (corrMatrix)
sn.heatmap(corrMatrix, annot=True)
plt.show()
Note that the matrix is already computed and I don't want to calculate the correlation again, but I could not get that to work. Any suggestions?
You are recalculating the correlation with the following line:
corrMatrix = data.corr()
You then go on to utilize this recalculated variable in the heatmap here:
sn.heatmap(corrMatrix, annot=True)
plt.show()
To resolve this, instead of passing in corrMatrix, which is the recalculated value, pass the raw Excel data, data (or df, since df is built directly from data). Thus, all the code you should need is:
import seaborn as sn
import matplotlib.pyplot as plt
import pandas as pd
data =pd.read_excel('/Desktop/wetchimp_global/corr/correlation_matrix.xlsx')
sn.heatmap(data, annot=True)
plt.show()
Note, however, that this assumes your data is ready for the heatmap, as you suggest. Since we do not have access to your data, we cannot confirm that.
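Whether a loaded table is "ready" can be checked before plotting. The following is a rough sketch, not a definitive test: the numbers are invented placeholders, read_csv on an inline string stands in for read_excel, and looks_like_corr_matrix is a hypothetical helper name. It also shows index_col=0, which keeps the name column as row labels instead of deleting it.

```python
import io

import numpy as np
import pandas as pd

# Invented 3x3 table; the first column carries the row labels.
sample = io.StringIO(
    "name,CLC,GIEMS,GLWD\n"
    "CLC,1.0,0.8,0.5\n"
    "GIEMS,0.8,1.0,0.3\n"
    "GLWD,0.5,0.3,1.0\n"
)
# index_col=0 turns the name column into the row index.
corr = pd.read_csv(sample, index_col=0)


def looks_like_corr_matrix(frame):
    """Return True if frame is square, symmetric, has a unit diagonal
    and all values lie in [-1, 1]."""
    values = frame.to_numpy(dtype=float)
    if values.shape[0] != values.shape[1]:
        return False
    return bool(
        np.allclose(values, values.T)
        and np.allclose(np.diag(values), 1.0)
        and np.all(np.abs(values) <= 1.0)
    )


print(looks_like_corr_matrix(corr))
```

If the check passes, sn.heatmap(corr, annot=True) should take the row and column labels from the frame itself.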
I have deleted the first column (names) and added the names back later, so the code is as below:
import seaborn as sn
import matplotlib.pyplot as plt
import pandas as pd
data =pd.read_excel('/Users/yousefalbuhaisi/Desktop/wetchimp_global/corr/correlation_matrix.xlsx')
fig, ax = plt.subplots(dpi=150)
y_axis_labels = ['CLC','GIEMS','GLWD','LPX_BERN','LPJ_WSL','LPJ_WHyME','SDGVM','DLEM','ORCHIDEE','CLM4ME']
sn.heatmap(data,yticklabels=y_axis_labels, annot=True)
plt.show()
and the results are:

How to plot this graph using Python properly

I am trying to plot the graph below using Python, but I am getting an error.
The Python commands I am using are:
import pandas as pd
import matplotlib.pyplot as plt
data = pd.read_csv('data/filtro_bovespa_final.csv')
data.loc[(data['codigo'] == 'BBAS3') & (data['codigo'] == 'BBDC4')]
data.date = pd.to_datetime(data['date'],format='%Y%m%d')
data.set_index(['date','codigo'])
plt.plot(data.date,data.preco)
plt.show()
The error I am getting is:
I got this graph, but it is not what I need:
The csv file I am using: Bovespa
I need a graph that allows me to compare the price linked with both the codes (BBAS3 and BBDC4) as the first graph I showed.
What else should I do to get the graph I need?
To draw one line per code, we use a pivot to turn the values of the codigo column into separate columns. I've also changed the filter condition from AND to OR, since no row can carry both codes at once.
import pandas as pd
import matplotlib.pyplot as plt
data = pd.read_csv('./Data/filtro_bovespa_final.csv')
data.date = pd.to_datetime(data['date'],format='%Y%m%d')
data = data.loc[(data['codigo'] == 'BBAS3') | (data['codigo'] == 'BBDC4')]
data.set_index('date', inplace=True)
data = data.pivot(columns='codigo')
data.columns = ['BBAS3','BBDC4']
data.plot()
plt.show()
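To see what the filter and the pivot do, here is a small sketch on invented data. The prices and the code "OTHER" are made up for illustration and are not taken from the Bovespa file; the column names date, codigo and preco follow the question.

```python
import pandas as pd

# Invented prices, only to show the reshaping.
data = pd.DataFrame({
    "date": ["20200102", "20200102", "20200102",
             "20200103", "20200103", "20200103"],
    "codigo": ["BBAS3", "BBDC4", "OTHER", "BBAS3", "BBDC4", "OTHER"],
    "preco": [10.0, 20.0, 30.0, 11.0, 21.0, 31.0],
})
data["date"] = pd.to_datetime(data["date"], format="%Y%m%d")

# AND can never match: a single row has exactly one code.
both = data[(data["codigo"] == "BBAS3") & (data["codigo"] == "BBDC4")]

# isin keeps rows whose code is either value (the OR condition).
selected = data[data["codigo"].isin(["BBAS3", "BBDC4"])]

# One row per date, one column per code, prices as the cell values.
wide = selected.pivot(index="date", columns="codigo", values="preco")
print(len(both))
print(wide)
```

Calling wide.plot() would then draw one line per code against the date index.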

Working with Electrodermal data from Empatica E4 - how to plot with time?

I'm working with electrodermal data imported from an Empatica E4. I want to compute descriptive statistics, z-score the data, and then plot it. Here is how far I have got:
# Import packages
import pandas as pd
# Download data
df = pd.read_csv("EDA.csv")
# Plot it
df.plot()
import pandas as pd
from scipy.stats import zscore
df = pd.DataFrame(pd.read_csv('EDA.csv', sep=','))
print(df.describe())
df = df.apply(zscore) # Normalization
print(df.describe())
print (df)
import matplotlib.pyplot as plt
plt.plot(df)
Here's my output:
Descriptives
Z SCORE plot
I want to change the x axis so that it shows time rather than the sample number. What I'm stuck on is how to read in the EDA.csv data at its 4 Hz sample rate and include that in my plot.
Thanks in advance!
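One way this might be approached is to build a time index from the sample rate. The sketch below rests on an assumption about the file layout that should be checked against the actual export: line 1 is the start time as a Unix timestamp in UTC, line 2 is the sample rate in Hz, and the remaining lines are the samples. The values are invented.

```python
import io

import pandas as pd

# Invented stand-in for EDA.csv, ASSUMING the layout:
# line 1 = start time (Unix seconds, UTC), line 2 = sample rate in Hz,
# remaining lines = samples.
sample = io.StringIO("1544027337.0\n4.0\n0.10\n0.12\n0.11\n0.15\n0.14\n")

raw = pd.read_csv(sample, header=None)
start = pd.to_datetime(raw.iloc[0, 0], unit="s", utc=True)
rate_hz = float(raw.iloc[1, 0])

# Everything after the two header lines is signal.
eda = raw.iloc[2:, 0].astype(float).reset_index(drop=True)

# One timestamp per sample, spaced 1/rate seconds apart.
eda.index = start + pd.to_timedelta(eda.index / rate_hz, unit="s")
eda.name = "EDA"

# Z-score by hand with the population standard deviation (ddof=0),
# which is also the default of scipy.stats.zscore.
z = (eda - eda.mean()) / eda.std(ddof=0)
print(z)
```

Because the index now holds timestamps, z.plot() would put time on the x axis instead of the sample number.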

How to plot DataFrames? in Python

I'm trying to plot a DataFrame, but I'm not getting the results I need. This is an example of what I'm trying to do and what I'm currently getting. (I'm new to Python.)
import pandas as pd
import matplotlib.pyplot as plt
my_data = {1965:{'a':52, 'b':54, 'c':67, 'd':45},
1966:{'a':34, 'b':34, 'c':35, 'd':76},
1967:{'a':56, 'b':56, 'c':54, 'd':34}}
df = pd.DataFrame(my_data)
df.plot( style=[])
plt.show()
I'm getting the following graph, but what I need is the years on the x axis, with one line for each of the labels currently on the x axis (a, b, c, d). Thanks for your help!
import pandas as pd
import matplotlib.pyplot as plt
my_data = {1965:{'a':52, 'b':54, 'c':67, 'd':45},
1966:{'a':34, 'b':34, 'c':35, 'd':76},
1967:{'a':56, 'b':56, 'c':54, 'd':34}}
df = pd.DataFrame(my_data)
df.T.plot( kind='bar') # or df.T.plot.bar()
plt.show()
Updates:
If this is what you want:
df = pd.DataFrame(my_data)
df.columns=[str(x) for x in df.columns] # convert year numerical values to str
df.T.plot()
plt.show()
you can do it this way:
import matplotlib.ticker as mtick  # needed for FormatStrFormatter
ax = df.T.plot(linewidth=2.5)
plt.locator_params(nbins=len(df.columns))
ax.xaxis.set_major_formatter(mtick.FormatStrFormatter('%4d'))

plot histogram in python using csv file as input

I have a CSV file with two columns: the first is the fruit name and the second is the count. I need to plot a histogram using this CSV as input to the code below. How can I do that? The file has 100 lines, but I only need to show the first 20 entries, with the fruit names on the x axis and the counts on the y axis.
import matplotlib.pyplot as plt
import pandas as pd
data = pd.read_csv('data.csv', header = None ,quoting=2)
data.hist(bins=10)
plt.xlim([0,100])
plt.ylim([50,500])
plt.title("Data")
plt.xlabel("fruits")
plt.ylabel("Frequency")
plt.show()
I edited the above program to plot a bar chart:
import matplotlib.pyplot as plt
import pandas as pd
data = pd.read_csv('data.csv', sep=',',header=None)
data.values
print data
plt.bar(data[:,0], data[:,1], color='g')
plt.ylabel('Frequency')
plt.xlabel('Words')
plt.title('Title')
plt.show()
but this gives me an 'unhashable type' error. Can anyone help with this?
You can use pandas' built-in plot, although you need to specify that the first column is the index:
import matplotlib.pyplot as plt
import pandas as pd
data = pd.read_csv('data.csv', sep=',',header=None, index_col =0)
data.plot(kind='bar')
plt.ylabel('Frequency')
plt.xlabel('Words')
plt.title('Title')
plt.show()
If you need to use matplotlib directly, it may be easier to convert the data to a dictionary using data.to_dict() and extract the values into a NumPy array or similar.
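As an alternative to the dictionary route, the columns can be passed to matplotlib directly. This is a minimal sketch with invented rows standing in for data.csv; the column names fruit and count are labels chosen here for illustration, and the Agg backend is selected only so the example runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove for an interactive window
import matplotlib.pyplot as plt
import pandas as pd

# Invented rows standing in for data.csv (fruit name, count; no header line).
sample = io.StringIO("apple,120\nbanana,95\ncherry,210\nmango,60\n")
data = pd.read_csv(sample, header=None, names=["fruit", "count"])

# First 20 rows, or fewer if the file is shorter.
top = data.head(20)

fig, ax = plt.subplots()
# Select columns by name; data[:, 0] is NumPy-style indexing and
# is not valid on a DataFrame.
bars = ax.bar(top["fruit"], top["count"], color="g")
ax.set_xlabel("Fruit")
ax.set_ylabel("Count")
ax.set_title("Data")
ax.tick_params(axis="x", labelrotation=90)
plt.show()
```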
