draw line/scatter plot from specific cells in an excel file? - python

I have an Excel file with my data in a sheet named 'main'.
I want to draw a line plot (or scatter plot) from particular cells in the 'main' sheet.
The data I want to use in 'main' is:
X-axis data is in column A, i.e. from A36 to A136
and
Y-axis data is in column G, i.e. from G36 to G136
Here is the code I used to make a simpler version of the plot:
import matplotlib.pyplot as plt
import numpy as np
import matplotlib as mpl
import pandas as pd
x = pd.read_excel('ob_half_cd100_titration.xlsx', 'test', parse_cols='A')
y = pd.read_excel('ob_half_cd100_titration.xlsx', 'test', parse_cols='B')
plt.plot(x, y)
plt.show()
The final figure should look like the following image (made from the 'test' sheet):
Link to the excel file :
https://www.dropbox.com/s/2pq4pzq7y7ng29e/ob_half_cd100_titration.xlsx?dl=0

Use a slice of the data:
plt.plot(x[35:136], y[35:136])
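Which slice bounds match Excel rows 36 to 136 depends on whether pandas consumed the first spreadsheet row as a header. Below is a minimal sketch of that mapping. The helper name excel_rows_to_slice and the stand-in DataFrame are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd


def excel_rows_to_slice(first_row, last_row, header_rows=1):
    """Map 1-based, inclusive Excel row numbers to a positional slice.

    header_rows is the number of spreadsheet rows pandas consumed as column
    headers (1 with the read_excel default, 0 with header=None).
    """
    offset = header_rows + 1
    return slice(first_row - offset, last_row - offset + 1)


# Stand-in for the sheet: spreadsheet row 1 is the header and rows 2..201 are
# data, so each stored value equals the Excel row number it would come from.
df = pd.DataFrame({"A": range(2, 202), "G": range(2, 202)})

rows = excel_rows_to_slice(36, 136)
x = df["A"].iloc[rows]  # cells A36..A136
y = df["G"].iloc[rows]  # cells G36..G136
```

With the real workbook, the frame would probably come from something like pd.read_excel('ob_half_cd100_titration.xlsx', sheet_name='main', usecols='A,G'), and plt.plot(x, y) can then be called as before. If the sheet is read with header=None, pass header_rows=0, which gives the 35:136 bounds.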

Related

Plotting a heatmap using CSV file data in python

I have a nested dictionary output variable called all_count_details_dictionary. Using that variable, I saved the data to a CSV file with the following code.
import pandas as pd
csv_path = '../results_v6/output_01.csv'
# creating pandas dataframe using concat method to extract data from dictionary
df = pd.concat([pd.DataFrame(l) for l in all_count_details_dictionary],axis=1).T
# saving the dataframe to the csv file
df.to_csv(csv_path, index=True)
The output CSV file looks like this:
The CSV file can be downloaded using this link
So I used the following code to plot a graph:
import matplotlib.pyplot as plt
import numpy as np
def extract_csv_gen_plot(csv_path):
    length = 1503 #len(dataframe_colums_list)
    data = np.genfromtxt(csv_path, delimiter=",", skip_header=True, usecols=range(3, (length+1)))
    print(data)
    # renaming data axes
    #fig, ax = plt.subplots()
    #fig.canvas.draw()
    #labels =[item.get_text() for item in ax.get_xticklabels()]
    #labels[1] = 'testing'
    #ax.set_xticklabels(labels)
    #ax.set_xticklabels(list)
    #ax.set_yticklabels(list)
    #plt.setp(ax.get_xticklabels(), rotation = 90)
    plt.imshow(data, cmap='hot',interpolation='nearest')
    plt.show()
I tried to get the column labels and the case details labels onto the graph axes, but it didn't work. Can anyone please tell me whether there is a better way to plot this table as a heatmap?
Thank you!
I would suggest using pandas; the labels are picked up automatically:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
def extract_csv_gen_plot(csv_path):
    data = pd.read_csv(csv_path, index_col=1)
    data = data.drop(data.columns[[0, 1]], axis=1)
    data.index.names = ['Name']
    g = sns.heatmap(data)
    g.set_yticklabels(g.get_yticklabels(), rotation=0)
    g.set_title('Heatmap')
    plt.tight_layout()
    plt.show()
extract_csv_gen_plot("output_01.csv")
I recommend using Seaborn. It has a heatmap plotting function that works very well with pandas DataFrames:
import seaborn as sns
sns.heatmap(data)
https://seaborn.pydata.org/generated/seaborn.heatmap.html
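If you would rather stay with plain matplotlib and imshow, as in the question, the axis labels can be taken from a DataFrame and set by hand. This is only a sketch under assumptions: the small DataFrame below is made up and stands in for the real CSV, whose exact layout is not shown here.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to get a window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up stand-in for the CSV contents: rows are cases, columns are counts.
data = pd.DataFrame(
    np.arange(12).reshape(3, 4),
    index=["case_a", "case_b", "case_c"],
    columns=["w", "x", "y", "z"],
)

fig, ax = plt.subplots()
im = ax.imshow(data.to_numpy(), cmap="hot", interpolation="nearest", aspect="auto")

# One tick per cell, labelled from the DataFrame's own column and row labels.
ax.set_xticks(np.arange(data.shape[1]))
ax.set_xticklabels(data.columns, rotation=90)
ax.set_yticks(np.arange(data.shape[0]))
ax.set_yticklabels(data.index)

fig.colorbar(im, ax=ax)
fig.tight_layout()
# Call plt.show() or fig.savefig(...) here to view the result.
```

With around 1500 columns, as in the question, one label per column will not be readable, so you would probably want to label only every n-th tick.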

Plot Correlation Table imported from excel with Python

I am trying to plot a correlation matrix (already calculated) in Python. The table looks like this:
And I would like it to look like this:
I am using the following code in Python:
import seaborn as sn
import matplotlib.pyplot as plt
import pandas as pd
data =pd.read_excel('/Desktop/wetchimp_global/corr/correlation_matrix.xlsx')
df = pd.DataFrame(data)
print (df)
corrMatrix = data.corr()
print (corrMatrix)
sn.heatmap(corrMatrix, annot=True)
plt.show()
Note that the matrix is ready and I don't want to calculate the correlation again, but I have failed to do that. Any suggestions?
You are recalculating the correlation with the following line:
corrMatrix = data.corr()
You then go on to utilize this recalculated variable in the heatmap here:
sn.heatmap(corrMatrix, annot=True)
plt.show()
To resolve this, instead of passing in corrMatrix, which is the recalculated value, pass the raw Excel data, i.e. data or df (df is just a copy of data). Thus, all the code you should need is:
import seaborn as sn
import matplotlib.pyplot as plt
import pandas as pd
data =pd.read_excel('/Desktop/wetchimp_global/corr/correlation_matrix.xlsx')
sn.heatmap(data, annot=True)
plt.show()
Note, however, that this assumes your data IS ready for the heatmap, as you suggest. Since we do not have access to your data, we cannot confirm that.
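One rough way to check that assumption is to move the name column into the index and verify that what remains is square, symmetric, and has ones on the diagonal. The sketch below uses a made-up three-model table. The column name "name", the helper looks_like_corr_matrix, and all values are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for what read_excel returns: a 'name' column followed
# by a precomputed correlation matrix. The numbers are made up.
raw = pd.DataFrame({
    "name": ["CLC", "GIEMS", "GLWD"],
    "CLC": [1.0, 0.8, 0.3],
    "GIEMS": [0.8, 1.0, 0.5],
    "GLWD": [0.3, 0.5, 1.0],
})

# Move the names into the index so only numbers remain in the body.
matrix = raw.set_index("name")


def looks_like_corr_matrix(m):
    """Rough sanity check: square, symmetric, ones on the diagonal."""
    values = m.to_numpy(dtype=float)
    return bool(
        values.shape[0] == values.shape[1]
        and np.allclose(values, values.T)
        and np.allclose(np.diag(values), 1.0)
    )


ready = looks_like_corr_matrix(matrix)
```

If ready is True, matrix can be passed to sn.heatmap(matrix, annot=True) directly, and seaborn takes the tick labels from the index and column names. Reading the file with pd.read_excel(..., index_col=0) should give the same layout without the separate set_index step, assuming the names are in the first column.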
I deleted the first column (names) and added the names back later as labels, so the code is as below:
import seaborn as sn
import matplotlib.pyplot as plt
import pandas as pd
data =pd.read_excel('/Users/yousefalbuhaisi/Desktop/wetchimp_global/corr/correlation_matrix.xlsx')
fig, ax = plt.subplots(dpi=150)
y_axis_labels = ['CLC','GIEMS','GLWD','LPX_BERN','LPJ_WSL','LPJ_WHyME','SDGVM','DLEM','ORCHIDEE','CLM4ME']
sn.heatmap(data,yticklabels=y_axis_labels, annot=True)
plt.show()
and the results are:

python plotting from two different files

I have two files, named "data1.dat" and "data2.dat". I want to take first column of "data1.dat" as xlabel and third column of "data2.dat" as ylabel and make a plot.
How can I do that?
Help please.
You can read both files and store the required column data in numpy arrays as follows :
import numpy as np
import matplotlib.pyplot as plt
with open('data1.dat','r') as f1:
    x=np.genfromtxt(f1)  # I suppose your data1 file has 1 column
with open('data2.dat','r') as f2:
    y=np.genfromtxt(f2)
y=y[:,2]  # keep only the third column
# plot
plt.figure()
plt.plot(x,y)
plt.show()
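If data1.dat has more than one column, the column can be selected while reading instead of slicing afterwards. Here is a minimal sketch using np.loadtxt with usecols. The file contents below are made up so that the example runs on its own, and whitespace-separated columns are assumed.

```python
import os
import tempfile

import numpy as np

# Write two small made-up files so the sketch is self-contained;
# with real data, point the paths at data1.dat and data2.dat instead.
tmpdir = tempfile.mkdtemp()
path1 = os.path.join(tmpdir, "data1.dat")
path2 = os.path.join(tmpdir, "data2.dat")
with open(path1, "w") as f:
    f.write("0.0 9.0\n1.0 9.0\n2.0 9.0\n")
with open(path2, "w") as f:
    f.write("7 7 10.0\n7 7 20.0\n7 7 30.0\n")

# usecols is zero-based: 0 is the first column, 2 is the third.
x = np.loadtxt(path1, usecols=0)
y = np.loadtxt(path2, usecols=2)

# plt.plot needs one y value per x value, so check before plotting.
if x.shape != y.shape:
    raise ValueError("column lengths differ: %s vs %s" % (x.shape, y.shape))
```

After that, plt.plot(x, y) followed by plt.show() works as in the code above.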

Contour plot from csv file with row being axis

I am trying to make a contour plot from a CSV file. I would like the first column to be the x axis, the first row (which has values) to be the y axis, and the rest of the matrix to be what is contoured; see the basic example in the figure below.
Simple table example
What I am really struggling with is getting that first row to be the y axis, and then defining that set of values so that they can be passed to the contourf function. Any help would be very much appreciated, as I am very new to Python and really don't know where to start with this problem.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import csv
import pandas as pd
import numpy as np
from csv import reader
from matplotlib import cm
f = pd.read_csv('/trialforplot.csv',dayfirst=True,index_col=0)
x = f.head()
y = f.columns
X,Y = np.meshgrid(x,y)
z=(x,y)
z=np.array(z)
Z=z.reshape((len(x),len(y)))
plt.contour(Y,X,Z)
plt.colorbar=()
plt.xlabel('Time')
plt.ylable('Particle Size')
plt.show()
I'm stuck at defining the z values and getting my contour plot plotting.
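One possible way to set this up is sketched below, under assumptions: the CSV text is made up, the first column is treated as plain numbers rather than dates, and the header row is assumed to hold numeric particle sizes. With index_col=0, the DataFrame body is already the z matrix, so x, y and Z can all be taken from the same frame.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to get a window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up stand-in for trialforplot.csv: the first column is time, the header
# row holds particle sizes, and the remaining cells are the values to contour.
csv_text = """time,10,20,30
0,1.0,2.0,3.0
1,2.0,3.0,4.0
2,3.0,4.0,5.0
3,4.0,5.0,6.0
"""
f = pd.read_csv(io.StringIO(csv_text), index_col=0)

x = f.index.to_numpy(dtype=float)        # first column -> x axis
y = f.columns.to_numpy().astype(float)   # header row (read as strings) -> y axis
Z = f.to_numpy().T                       # transposed to shape (len(y), len(x))

X, Y = np.meshgrid(x, y)                 # both have shape (len(y), len(x))

fig, ax = plt.subplots()
cs = ax.contourf(X, Y, Z)
fig.colorbar(cs, ax=ax)
ax.set_xlabel("Time")
ax.set_ylabel("Particle Size")
# Call plt.show() or fig.savefig(...) here to view the result.
```

Two details in the code from the question would also stop it from running: plt.colorbar=() overwrites the function instead of calling it, and plt.ylable is a misspelling of plt.ylabel. If the first column holds dates, it would additionally need parse_dates=True in read_csv, and the time axis would need separate handling.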

plot histogram in python using csv file as input

I have a CSV file with two columns: the first column is the fruit name and the second is the count. I need to plot a histogram using this CSV as input to the code below. How can I do that? I only need to show the first 20 entries of the 100-line CSV file, with fruit names on the x axis and counts on the y axis.
import matplotlib.pyplot as plt
import pandas as pd
data = pd.read_csv('data.csv', header = None ,quoting=2)
data.hist(bins=10)
plt.xlim([0,100])
plt.ylim([50,500])
plt.title("Data")
plt.xlabel("fruits")
plt.ylabel("Frequency")
plt.show()
I edited the above program to plot a bar chart:
import matplotlib.pyplot as plt
import pandas as pd
data = pd.read_csv('data.csv', sep=',',header=None)
data.values
print data
plt.bar(data[:,0], data[:,1], color='g')
plt.ylabel('Frequency')
plt.xlabel('Words')
plt.title('Title')
plt.show()
but this gives me an error, 'Unhashable Type'. Can anyone help with this?
You can use the built-in plot method of pandas, although you need to specify that the first column is the index:
import matplotlib.pyplot as plt
import pandas as pd
data = pd.read_csv('data.csv', sep=',',header=None, index_col =0)
data.plot(kind='bar')
plt.ylabel('Frequency')
plt.xlabel('Words')
plt.title('Title')
plt.show()
If you need to use matplotlib directly, it may be easier to convert the DataFrame to a dictionary using data.to_dict() and extract the data into NumPy arrays or similar.
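Another option is to keep the DataFrame and select columns by position or name. The data[:,0] expression in the question is NumPy-style indexing, which a DataFrame does not accept, and that is where the error comes from. Below is a minimal sketch limited to the first 20 rows. The CSV text and the column names "fruit" and "count" are made up for illustration.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to get a window
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in for data.csv: fruit name, count, no header row.
csv_text = "\n".join("fruit%d,%d" % (i, 100 + i) for i in range(30))
data = pd.read_csv(io.StringIO(csv_text), header=None, names=["fruit", "count"])

first = data.head(20)  # keep only the first 20 rows

fig, ax = plt.subplots()
bars = ax.bar(first["fruit"], first["count"], color="g")
ax.set_xlabel("fruits")
ax.set_ylabel("Frequency")
ax.tick_params(axis="x", rotation=90)  # long category names overlap otherwise
# Call plt.show() or fig.savefig(...) here to view the result.
```

Without the names argument, the same columns can be selected with data.iloc[:20, 0] and data.iloc[:20, 1].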
