How to turn a simple csv into a line graph using matplotlib? - python

I created a simple csv file with numbers that approach pi, and I would like to create the graph and store the output as a png. The csv is very simple: each row contains the numbers I want to graph.
import pandas as pd
import csv
import matplotlib.pyplot as plt
from decimal import Decimal

def create_png():
    df = pd.read_csv('sticks.csv', names=["xstk", "stk"])
    sumdf = df.sum(0)
    num1 = sumdf['xstk']
    num2 = sumdf['stk']
    total = num1 + num2
    aproxpi = [(2*float(total))/num1]
    with open('aproxpi.csv', 'a') as pifile:
        piwriter = csv.writer(pifile, delimiter=' ')
        piwriter.writerow(aproxpi)
    Piplot = pd.read_csv('aproxpi.csv', names=['~Pi'])
    #Piplot.groupby('~Pi')
    Piplot.plot(title='The Buffon Needle Experiment')

if __name__ == "__main__":
    create_png()
When I run this code nothing happens. If I call the show method on the AxesSubplot, an exception is raised. How can this be accomplished?

You need to call plt.show() to actually see the plot.
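Since the goal is a stored png rather than an interactive window, plt.savefig() may be what's actually needed here. A minimal sketch of the tail end of create_png(), assuming the output filename aproxpi.png is acceptable (any name works):

import pandas as pd
import matplotlib.pyplot as plt

def create_png():
    Piplot = pd.read_csv('aproxpi.csv', names=['~Pi'])
    Piplot.plot(title='The Buffon Needle Experiment')
    plt.savefig('aproxpi.png')  # write the current figure to disk as a png
    plt.show()                  # optionally display it as well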

This code seems very incomplete - is there more you can give us?
It may be that Piplot.plot needs to have x and y specified, instead of simply a title. I believe that you need to create a new plot object and pass the data into it, rather than calling data.plot() as you are now. See the documentation.
Additionally, taking a look at this question may help.
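A sketch of that suggestion, using an explicit Figure/Axes pair instead of DataFrame.plot (the column name '~Pi' is taken from the question; the output filename is hypothetical):

import pandas as pd
import matplotlib.pyplot as plt

Piplot = pd.read_csv('aproxpi.csv', names=['~Pi'])

fig, ax = plt.subplots()
ax.plot(Piplot.index, Piplot['~Pi'])  # x: row number, y: the pi estimates
ax.set_title('The Buffon Needle Experiment')
fig.savefig('aproxpi.png')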

Related

Scopus Abstract Retrieval - Value and Type Error only when too many entries are parsed

I am trying to retrieve abstracts via Scopus Abstract Retrieval. I have a file with 3590 EIDs.
import pandas as pd
import numpy as np

file = pd.read_excel(r'C:\Users\Amanda\Desktop\Superset.xlsx', sheet_name='Sheet1')

from pybliometrics.scopus import AbstractRetrieval

for i, row in file.iterrows():
    q = row['EID']
    ab = AbstractRetrieval(q, view='META_ABS')
    file.at[i, "Abstract"] = ab.description
    print(str(i) + ' ' + ab.description)
    print(str(''))
I get a value error -
In response to the value error, I altered the code.
from pybliometrics.scopus import AbstractRetrieval

error_index_valueerror = {}
for i, row in file.iterrows():
    q = row['EID']
    try:
        ab = AbstractRetrieval(q, view='META_ABS')
        file.at[i, "Abstract"] = ab.description
        print(str(i) + ' ' + ab.description)
        print(str(''))
    except ValueError:
        print(f"{i} Value Error")
        error_index_valueerror[i] = row['Title']
        continue
When I trialed this code with 10-15 entries, it worked well and I retrieved all the abstracts. However, when I ran the actual file with 3590 EIDs, the output was a series of 10-12 value errors before a type error ('can only concatenate str (not "NoneType") to str') surfaced.
I am not sure how to tackle this problem moving forward. Any advice on this matter would be greatly appreciated!
(Side note: When I change view='FULL' (as recommended by the documentation), I still get the same outcome.)
Without EIDs to check, it is tough to point to the precise cause. However, I'm 99% certain that your problem is missing abstracts in the .description property. It is enough for the first call to come back empty: that turns the column type into float, and you then try to append a string to it. That's what the error says.
Thus your problem has nothing to do with pybliometrics or Scopus, but with the way you build the code.
Try this instead:
import pandas as pd
import numpy as np
from pybliometrics.scopus import AbstractRetrieval

def parse_abstract(eid):
    """Retrieve Abstract of a document."""
    ab = AbstractRetrieval(eid, view='META_ABS')  # use the eid argument, not a global
    return ab.description or ab.abstract

FNAME = r'C:\Users\Amanda\Desktop\Superset.xlsx'
df = pd.read_excel(FNAME, sheet_name='Sheet1')
df["abstract"] = df["EID"].apply(parse_abstract)
Instead of appending values one-by-one in a loop, which is slow and error-prone, I use pandas' .apply() method.
Also note how I write ab.description or ab.abstract. https://pybliometrics.readthedocs.io/en/stable/classes/AbstractRetrieval.html states that both should yield the same result but can be empty. With this expression, if ab.description is empty (i.e., falsy), ab.abstract is used instead.
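If some EIDs still raise the ValueError you saw, the try/except from your second attempt can move inside the helper. A sketch under that assumption (returning None simply leaves the cell empty):

from pybliometrics.scopus import AbstractRetrieval

def parse_abstract(eid):
    """Retrieve the abstract of a document, or None if Scopus rejects the EID."""
    try:
        ab = AbstractRetrieval(eid, view='META_ABS')
    except ValueError:
        return None  # the row stays empty instead of aborting the whole run
    return ab.description or ab.abstract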

Calling the defined function: I/O operation on closed file

I defined a function to fix and open the FITS files in a list, then call it twice to overlay the FITS pictures. But when I use the values returned by the function to plot, I get "ValueError: I/O operation on closed file". I don't know why it happens. Thank you so much.
The following is the code of the defined function:
import numpy as np
import matplotlib.pyplot as plt
import aplpy
from astropy.io import fits
from astropy.wcs import WCS
from astropy.wcs import FITSFixedWarning
from astropy.coordinates import SkyCoord  # needed for the coord below
import astropy.units as u                 # needed for u.deg below
import warnings
warnings.filterwarnings("ignore", category=FITSFixedWarning)

#### define the function to fix the fits header and get the header info.
def fix_fitshead(filelis):
    with fits.open(filelis.replace('\n', ''), mode="readonly") as hdu:
        da = hdu[0].data[:, :]
        he = hdu[0].header
        if he["NAXIS"] == 4:
            he["NAXIS"] = 2
            for k in ["NAXIS3", "NAXIS4",
                      "CTYPE3", "CRVAL3", "CDELT3", "CROTA3", "CRPIX3",
                      "CTYPE4", "CRVAL4", "CDELT4", "CROTA4", "CRPIX4"]:
                if k in list(he.keys()):
                    he.remove(k)
            hdu = fits.PrimaryHDU(da, he)
            print('WCS=', WCS(he))
        elif he["NAXIS"] == 3:
            he["NAXIS"] = 1
            for k in ["NAXIS3", "CTYPE3", "CRVAL3", "CRPIX3", "CUNIT3", "LBOUND3"]:
                if k in list(he.keys()):
                    he.remove(k)
            if "CDELT3" in he:
                k1 = "CDELT3"
            elif "CD3_3" in he:
                k1 = "CD3_3"
            he.remove(k1)
            hdu = fits.PrimaryHDU(da, he)
        else:
            hdu = hdu[0]
    maxvalue = np.nanmax(da)
    c = WCS(he).wcs_pix2world([[he["NAXIS1"]/2, he["NAXIS2"]/2]], 1)
    coord = SkyCoord(ra=c[0][0]*u.deg, dec=c[0][1]*u.deg)
    if "LINE" in he:
        spec = he["LINE"]
    elif 'MOLECULE' in he:
        spec = he['MOLECULE']
    elif "INSTRUME" in he:
        spec = he['INSTRUME']
    print(he["OBJECT"])
    # print(maxvalue)
    # PRODID = 'reduced-850um'
    return maxvalue, spec, da, he, hdu, coord
##The following code plots the overlay picture by calling the defined function.
file_gray = '/Users/hjma/Desktop/smoothdata/progress/13co_fits.list'
file_cont = '/Users/hjma/Desktop/smoothdata/progress/hcn10_fits.list'

with open(file_gray, 'r') as f_gray:
    gray_list = [row_gray.rstrip('\n') for row_gray in f_gray]
with open(file_cont, 'r') as f_cont:
    cont_list = [row_cont.rstrip('\n') for row_cont in f_cont]

for filegray in gray_list:
    # print('filegray=', filegray)
    #call the defined function
    maxvalue1, spec1, d1, he1, hdu_gray, coord1 = fix_fitshead(filegray)
    for filecont in cont_list:
        # print('filecont=', filecont)
        #call the defined function
        maxvalue2, spec2, d2, he2, hdu_cont, coord2 = fix_fitshead(filecont)
        print('d2=', d2)
        fig = plt.figure()
        fig.set_figwidth(4); fig.set_figheight(4)
        ff = aplpy.FITSFigure(hdu_gray, figure=fig)
        ff.show_colorscale(cmap="Blues")
        ff.show_contour(hdu_cont, colors="red")
        plt.plot([0], label="HCN 1-0", color="r")
        plt.legend()
        plt.tight_layout()
        plt.close()
    break
I got the error:
---> 26 ff = aplpy.FITSFigure(hdu_gray, figure=fig)
....
ValueError: I/O operation on closed file
I added some general comments above on problems with your code that you might want to resolve. I would give you a suggested re-write of your code, except I'm not 100% sure what it's meant to accomplish. It appears that, among other things, you want it to take 2D slices of 3D or 4D arrays, but as I noted above it doesn't actually achieve that goal.
Anyway, the reason for your error is specifically the case where the data is already 2D. In your if/elif/else statement you have:
if he["NAXIS"] == 4:
...
hdu = fits.PrimaryHDU(da, he)
print('WCS=',WCS(he))
elif he["NAXIS"] == 3:
...
hdu = fits.PrimaryHDU(da, he)
else:
hdu = hdu[0]
In the first two cases you read the data from the file before closing it (da = hdu[0].data[:,:]) and created a new HDU object from it. However, in the last case you didn't do this and just passed along the original HDU object from the now-closed file (hdu = hdu[0]). Since the file it came from is closed, the data in this HDU can't be read any longer, so you get "I/O operation on closed file" when you pass it to aplpy.FITSFigure and it tries to read the HDU data.
One way you could work around this is to change that last line to hdu = fits.PrimaryHDU(da, he), like in the other cases, to create a new HDU from the already loaded data.
A better way, which according to your comment I think you might have found, is to refactor your code so that instead of passing fix_fitshead a filename, you pass it an already open HDUList object and use it like:
with fits.open(filename) as hdul:
    maxvalue1, spec1, d1, he1, hdu_gray, coord1 = fix_fitshead(hdul)
    ...
and don't close the file until you're actually done using it. This is a more flexible approach in general because it also allows you to use your code on FITS files that weren't directly opened from files on disk (e.g. for writing tests).
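A minimal sketch of what that refactor could look like; the header-fixing details are elided and the filename is hypothetical, but the key point is that a fresh in-memory HDU is always returned:

from astropy.io import fits

def fix_fitshead(hdul):
    """Work on an already open HDUList instead of a filename."""
    da = hdul[0].data
    he = hdul[0].header
    # ... fix the header here, as in the original function ...
    # Always build a fresh in-memory HDU so nothing keeps a reference
    # to the file handle after it is closed.
    return fits.PrimaryHDU(da, he)

with fits.open('example.fits') as hdul:  # hypothetical filename
    hdu_gray = fix_fitshead(hdul)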

How to modify a set of concatenated traces in one file to a set of

I have a set of traces in one folder Folder_Traces:
Trace1.npy
Trace2.npy
Trace3.npy
Trace4.npy
...
In my code, I must concatenate all traces and put them in one file. Each trace is a table, so the big file where I put all my traces is a table containing a set of tables. This file looks like this: All_Traces=[[Trace1],[Trace2],[Trace3],...[Tracen]]
import numpy as np
import matplotlib.pyplot as plt
sbox=( 0x63,0x7c,0x77,0x7b,0xf2,0x6b..........)
hw = [bin(x).count("1") for x in range(256)]
print (sbox)
print ([hw[s] for s in sbox])
# Start calculating template
# 1: load data
tempTraces = np.load(r'C:\Users\user\2016.06.01-09.41.16_traces.npy')
tempPText = np.load(r'C:\Users\user\2016.06.01-09.41.16_textin.npy')
tempKey = np.load(r'C:\Users\user\2016.06.01-09.41.16_keylist.npy')
print (tempPText)
print (len(tempPText))
print (tempKey)
print (len(tempKey))
plt.plot(tempTraces[0])
plt.show()
tempSbox = [sbox[tempPText[i][0] ^ tempKey[i][0]] for i in range(len(tempPText))]
print (sorted(tempSbox))
So, what I need is to use all my trace files without concatenation, because concatenation causes many memory problems. That means replacing this line: tempTraces = np.load(r'C:\Users\user\2016.06.01-09.41.16_traces.npy') with the path to my folder directly, then loading each trace and running the necessary analysis on it. How can I resolve that, please?
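No answer is recorded here, but one common approach is to iterate over the .npy files with glob and load them one at a time, optionally memory-mapped, so the concatenated table never has to exist in memory. A sketch, with a hypothetical folder path and a placeholder for the real per-trace analysis:

import glob
import os
import numpy as np

trace_dir = r'C:\Users\user\Folder_Traces'  # hypothetical folder path

for path in sorted(glob.glob(os.path.join(trace_dir, 'Trace*.npy'))):
    # mmap_mode='r' reads pages on demand instead of loading the whole array
    trace = np.load(path, mmap_mode='r')
    print(path, trace.shape)  # placeholder for the real per-trace analysis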

How to write data results into an output file in python

I have written this code to analyze and search geological coordinates for proximity of data points. Since I had so many data points, the output in PyCharm was becoming overloaded and gave me a bunch of nonsense. Since then I have worked to solve this issue by writing the True/False results into separate documents on my computer.
The point of this code is to analyze the proximity of coordinates in file1 to all elements in file2, then return any resulting matches of coordinates which share proximity. As you will see below, I wrote a nested for loop to do this, which I understand may be a sort of brute-force tactic, so if anybody has a more elegant solution I would be happy to learn more.
import numpy as np
import math as ma

filename1 = r"C:\Users\Justin\Desktop\file1.data"
data1 = np.genfromtxt(filename1,
                      skip_header=1,
                      usecols=(0, 1))
                      #dtype=[
                      #("x1", "f9"),
                      #("y1", "f9")])
#print "data1", data1

filename2 = r"C:\Users\Justin\Desktop\file2.data"
data2 = np.genfromtxt(filename2,
                      skip_header=1,
                      usecols=(0, 1))
                      #dtype=[
                      #("x2", "f9"),
                      #("y2", "f9")])
#print "data2", data2

def d(a, b):
    d = ma.acos(ma.sin(ma.radians(a[1]))*ma.sin(ma.radians(b[1]))
                + ma.cos(ma.radians(a[1]))*ma.cos(ma.radians(b[1]))*(ma.cos(ma.radians((a[0]-b[0])))))
    return d

results = open("results.txt", "w")
for coor1 in data1:
    for coor2 in data2:
        n = 0
        a = [coor1[0], coor1[1]]
        b = [coor2[0], coor2[1]]
        #print "a", a
        #print "b", b
        if d(a, b) < 0.07865:  # if true what happens
            results.write("\t".join([str(coor1), str(coor2), "True", str(d)]) + "\n")
        else:
            results.write("\t".join([str(coor1), str(coor2), "False", str(d)]) + "\n")
    results.close()  # note: closing here, inside the loop, is what triggers the error below
This is the error message I get when I run the code:
results.write("\t".join([str(coor1), str(coor2), "False", str(d)]) + "\n")
ValueError: I/O operation on closed file
I think my problem is that I don't understand how I am supposed to write, save and organize the files in a meaningful format into my computer. So, again if anybody has any advice or suggestions I would be very grateful for the support!
My suggestion: take your code that writes to the file and wrap it in a context manager. E.g. https://jeffknupp.com/blog/2016/03/07/python-with-context-managers/
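A sketch of that suggestion applied to the loop above. The with block closes results.txt exactly once, after both loops finish, which removes the "I/O operation on closed file" error; the distance is also written as d(a, b), since str(d) would only print the function object:

import numpy as np
import math as ma

def d(a, b):
    """Angular distance between two (lon, lat) points, in radians."""
    return ma.acos(ma.sin(ma.radians(a[1])) * ma.sin(ma.radians(b[1]))
                   + ma.cos(ma.radians(a[1])) * ma.cos(ma.radians(b[1]))
                   * ma.cos(ma.radians(a[0] - b[0])))

data1 = np.genfromtxt(r"C:\Users\Justin\Desktop\file1.data", skip_header=1, usecols=(0, 1))
data2 = np.genfromtxt(r"C:\Users\Justin\Desktop\file2.data", skip_header=1, usecols=(0, 1))

with open("results.txt", "w") as results:   # closed automatically when the block ends
    for coor1 in data1:
        for coor2 in data2:
            dist = d(coor1, coor2)
            match = "True" if dist < 0.07865 else "False"
            results.write("\t".join([str(coor1), str(coor2), match, str(dist)]) + "\n")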

Splitting data file columns into separate arrays in Python

I'm new to python and have been trying to figure this out all day. I have a data file laid out as below,
time I(R_stkb)
Step Information: Temp=0 (Run: 1/11)
0.000000000000000e+000 0.000000e+000
9.999999960041972e-012 8.924141e-012
1.999999992008394e-011 9.623148e-012
3.999999984016789e-011 6.154220e-012
(Note: No empty line between the each data line.)
I want to plot the data using matplotlib functions, so I'll need the two separate columns in arrays.
I currently have
def plotdata():
    Xvals=[], Yvals=[]
    i = open(file,'r')
    for line in i:
        Xvals,Yvals = line.split(' ', 1)
    print Xvals,Yvals
But obviously it's completely wrong. Can anyone give me a simple answer to this? An explanation of what exactly the lines mean would also be helpful. Cheers.
Edit: The first two lines repeat throughout the file.
This is a job for the * operator on the zip function.
>>> asdf
[[1, 2], [3, 4], [5, 6]]
>>> zip(*asdf)
[(1, 3, 5), (2, 4, 6)]
So in the context of your data it might be something like:
handle = open(file,'r')
lines = [line.split() for line in handle if line[:4] not in ('time', 'Step')]
Xvals, Yvals = zip(*lines)
or if you really need to be able to mutate the data afterwards, you could just call the list constructor on each tuple:
Xvals, Yvals = [list(block) for block in zip(*lines)]
One way to do it is:
Xvals=[]; Yvals=[]
i = open(file,'r')
for line in i:
    x, y = line.split(' ', 1)
    Xvals.append(float(x))
    Yvals.append(float(y))
print Xvals,Yvals
Note the call to the float function, which will change the string you get from the file into a number.
This is what numpy.loadtxt is designed for. Try:
import numpy as np
import matplotlib.pyplot as plt
data = np.loadtxt(file, skiprows = 2) # assuming you have time and step information on 2 separate lines
# and you do not want to read them
plt.plot(data[:,0], data[:,1])
plt.show()
EDIT:
if you have time and step information scattered throughout the file and you want to plot the data for every step, you can read the whole file into memory (assuming it's small enough) and then split it on the 'time' strings:
l = open(fname, 'rb').read()
for chunk in l.split('time'):
    data = np.array([s.split() for s in chunk.split('\n')[2:]][:-1], dtype=np.float)
    plt.plot(data[:,0], data[:,1])
plt.show()
Or else you could add the # comment sign to the comment lines and use np.loadtxt.
If you want to plot this file with matplotlib, you might want to check out its plotfile function. See the official documentation here.
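A sketch of that route, assuming an older matplotlib where plotfile still exists (it has since been deprecated and removed) and a hypothetical filename:

import matplotlib.pyplot as plt

# cols picks the (x, y) column indices; skiprows skips the two header lines.
plt.plotfile('data.txt', cols=(0, 1), skiprows=2, delimiter=' ')
plt.show()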
