UV-Vis Spectrum, using readline() method - python

So I have been set this work to list data that is provided as a data file called "Spectrum.dat". If the code is correct, when you run the cell in Jupyter, below you should see the list of energies [2.1674, 2.1724, 2.1774, 2.1824, etc...] and the list of intensities [6.4e-07, 1.26e-06, 2.39e-06, 4.36e-06, etc...].
spect = ___
eV = []
inten = []
for line in spect.readlines():
if not (">" in ___):
tmp = line___
eV.append(float(tmp[0]))
inten.append(float(tmp[1]))
print(eV)
print(inten)
I'm trying to find out what the underscores are. The first bit I added was spect = open("spectrum.dat", "r+") and I am assuming the second underscores is if not (">" in spect): so now i am stuck on the last one.
Below is the Spectrum.dat file:

Related

Write to .txt file the results from FOR looping in python

My program is search the upper and lower value from .txt file according to that input value.
def find_closer():
file = 'C:/.../CariCBABaru.txt'
data = np.loadtxt(file)
x, y = data[:,0], data[:,1]
print(y)
for k in range(len(spasi_baru)):
a = y #[0, 20.28000631, 49.43579604, 78.59158576, 107.7473755, 136.9031652, 166.0589549,
176.5645474, 195.2147447]
b = spasi_baru[k]
# diff_list = []
diff_dict = OrderedDict()
if b in a:
b = input("Number already exists, please enter another number ")
else:
for x in a:
diff = x - b
if diff < 0:
# diff_list.append(diff*(-1))
diff_dict[x] = diff*(-1)
else:
# diff_list.append(diff)
diff_dict[x] = diff
#print("diff_dict", diff_dict)
# print(diff_dict[9])
sort_dict_keys = sorted(diff_dict.keys())
#print(sort_dict_keys)
closer_less = 0
closer_more = 0
#cl = []
#cm = []
for closer in sort_dict_keys:
if closer < b:
closer_less = closer
else:
closer_more = closer
break
#cl.append(closer_less == len(spasi_baru) - 1)
#cm.append(closer_more == len(spasi_baru) - 1)
print(spasi_baru[k],": lower value=", closer_less, "and upper
value =", closer_more)
data = open('C:/.../Batas.txt','w')
text = "Spasi baru:{spasi_baru}, File: {closer_less}, line:{closer_more}".format(spasi_baru=spasi_baru[k], closer_less=closer_less, closer_more=closer_more)
data.write(text)
data.close()
print(spasi_baru[k],": lower value=", closer_less, "and upper value =", closer_more)
find_closer()
The results image is here 1
Then, i want to write these results to file (txt/csv no problem) into rows and columns sequence. But the problem that i have, the file contain just one row or written the last value output in terminal like below,
Spasi baru:400, File: 399.3052727, line: 415.037138
any suggestions to help fix my problem please? I stuck in a several hours to tried any different code algorithms. I'm using Python 3.7
The best solution is to use w+ or a+ mode when you're trying to append into the same test file.
Instead of doing this:
data = open('C:/.../Batas.txt','w')
Do this:
data = open('C:/.../Batas.txt','w+')
or
data = open('C:/.../Batas.txt','a+')
The reason is because you are overwriting the same file over and over inside the loop, so it will keep just the last interaction. Look for ways to save files without overwriting them.
‘r’ – Read mode which is used when the file is only being read
‘w’ – Write mode which is used to edit and write new information to the file (any existing files with the same name will be erased when this mode is activated)
‘a’ – Appending mode, which is used to add new data to the end of the file; that is new information is automatically amended to the end
‘r+’ – Special read and write mode, which is used to handle both actions when working with a file

feed class from list

I am still new to python but using it for my linguistics research.
So I am doing some research into toponyms, and I got a list of input data from a topographic institution, which looks like the following:
Official_Name, tab, Dialect_Name, tab, Administrative_district, Topographic_district, Y_coordinates, X_coordinates, Longitude, Latitude.
So, I defined a class:
class MacroTop:
def __init__(self, Official_Name, Dialect_Name, Adm_District, Topo_District, Y, X, Long, Lat):
self.Official_Name = Official_Name
self.Dialect_Name = Dialect_Name
self.Adm_District = Adm_District
self.Topo_District = Topo_District
self.Y = Y
self.X = X
self.Long = Long
self.Lat = Lat
So, with open(), I wanted to load my .txt file with the data I have to read it into the class using a loop but it did not work.
The result I want is to be able to access a feature of the class, say, Dialect_Name and be able to look through all the entries of that feature. I can do that just in the loop, but I wanted to define a class so I could be able to do more manipulation afterwards.
my loop:
with open("locLuxAll.txt", "r") as topo_list:
lines = topo_list.readlines()
for line in lines:
line = line.split('\t')
print(line)
print(line[0]) # This would access all the data that is characterized as Official_Name
I tried to make another loop:
for i in range(0-len(lines)):
lines[i] = MacroTop(str(line[0]), str(line[1]), str(line[2]), str(line[3]), str(line[4]), str(line[5]), str(line[6]), str(line[7]))
But that did not seem to work.
This line fails:
for i in range(0-len(lines)):
You're trying to loop through negative number I guess, so the output will be an empty list.
In [11]: [i for i in range(-200)]
Out[11]: []
EDIT:
Your code seems unreadable to me, you have for i in range(len(lines)) but in this for loop, you're iterating through line variable, where is it from? First of all I'd not write back to lines list as it comes from readlines. Create new list for that, and you dont need i variable, those lines will be kept in order anyway.
class_lines = []
for line in lines:
class_lines.append(MacroTop(str(line[0]), str(line[1]), str(line[2]), str(line[3]), str(line[4]), str(line[5]), str(line[6]), str(line[7])))
Or even with list comprehension:
class_lines = [MacroTop(str(line[0]), str(line[1]), str(line[2]), str(line[3]), str(
line[4]), str(line[5]), str(line[6]), str(line[7])) for line in lines]

Using split in python on excel to get two parameter

I'm getting really confused with all the information on here using 'split' in python. Basically I want to write a code which opens a spreadsheet (with two columns in it) and the function I write will use the first column as x's and the second column as y's and then it will plot it in the x-y plane.
I thought I would use line.splitlines to cut each line in excel into (x,y) but I keep getting
'ValueError: need more than 1 value to unpack'
I don't know what this means?
Below is what I've written so far, (xdir is an initial condition for a different part of my question):
def plotMo(filename, xdir):
infile = open(filename)
data = []
for line in infile:
x,y = line.splitlines()
x = float(x)
y = float(y)
data.append([x,y])
infile.close()
return data
plt.plot(x,y)
For example with
0 0.049976
0.01 0.049902
0.02 0.04978
0.03 0.049609
0.04 0.04939
0.05 0.049123
0.06 0.048807
I would want to the first point in my plane to be (0, 0.049976) and the second plot to be (0.01, 0.049902).
x,y = line.splitlines() tries to split the current line into several lines.
Since splitlines returns only 1 element, there's an error because python cannot find a value to assign to y.
What you want is x,y = line.split() which will split the line according to 1 or more spaces (like awk would do) if no parameter is specified.
However it depends of the format: if there are blank lines you'll get the "unpack" problem at some point, so to be safe and skip malformed lines, write:
items = line.split()
if len(items)==2: x,y = items
To sum it up, a more pythonic, shorter & safer way of writing your routine would be:
def plotMo(filename):
with open(filename) as infile:
data = []
for line in infile:
items = line.split()
if len(items)==2:
data.append([float(e) for e in items])
return data
(maybe it could be condensed more, but that's good for starters)

Extract multiple arrays from .DAT file with undefined size

I have a device that stores three data sets in a .DAT file, they always have the same heading and number of columns, but the number of rows vary.
They are (n x 4), (m x 4), (L x 3).
I need to extract the three data sets into seperate arrays for plotting.
I have been trying to use numpy.genfromtxt and numpy.loadtxt, but the only way I can get them to work for this format is to manually define the row which each data set starts.
As I will regularly need to deal with this format I have been trying to automate it.
If someone could suggest a method which might work I would greatly appreciate it. I have attached an example file.
example file
Just a quck and dirty solution. At your file size, you might run into performance issues. If you know m, n and L, initialize the output vectors with the respective length.
here is the strategy: Load the whole File in a variable. Read the variable line by line. As soon as you discover a keyword, raise a flag that you are in the specific block. In the next line, read out the line to the correct variables.
isblock1 = isblock2 = isblock3 = False
fout = [] # construct also all the other variables that you want to collect.
with open(file, 'r') as file:
lines = file.readlines() #read all the lines
for line in lines:
if isblock1:
(f, psd, ipj, itj) = line.split()
fout.append(f) #do this also with the other variables
if isblock2:
(t1, p1, p2, p12) = line.split()
if isblock3:
(t2, v1, v2) = line.split()
if 'Frequency' is in line:
isblock1 = True
isblock2 = isblock3 = False
if 'Phasor' is in line:
isblock2 = True
isblock1 = isblock3 = False
if 'Voltage' is in line:
isblock3 = True
isblock1 = isblock2 = False
Hope that helps.

Python: slow for loop performance on reading, extracting and writing from a list of thousands of files

I am extracting 150 different cell values from 350,000 (20kb) ascii raster files. My current code is fine for processing the 150 cell values from 100's of the ascii files, however it is very slow when running on the full data set.
I am still learning python so are there any obvious inefficiencies? or suggestions to improve the below code.
I have tried closing the 'dat' file in the 2nd function; no improvement.
dat = None
First: I have a function which returns the row and column locations from a cartesian grid.
def world2Pixel(gt, x, y):
ulX = gt[0]
ulY = gt[3]
xDist = gt[1]
yDist = gt[5]
rtnX = gt[2]
rtnY = gt[4]
pixel = int((x - ulX) / xDist)
line = int((ulY - y) / xDist)
return (pixel, line)
Second: A function to which I pass lists of 150 'id','x' and 'y' values in a for loop. The first function is called within and used to extract the cell value which is appended to a new list. I also have a list of files 'asc_list' and corresponding times in 'date_list'. Please ignore count / enumerate as I use this later; unless it is impeding efficiency.
def asc2series(id, x, y):
#count = 1
ls_id = []
ls_p = []
ls_d = []
for n, (asc,date) in enumerate(zip(asc, date_list)):
dat = gdal.Open(asc_list)
gt = dat.GetGeoTransform()
pixel, line = world2Pixel(gt, east, nort)
band = dat.GetRasterBand(1)
#dat = None
value = band.ReadAsArray(pixel, line, 1, 1)[0, 0]
ls_id.append(id)
ls_p.append(value)
ls_d.append(date)
Many thanks
In world2pixel you are setting rtnX and rtnY which you don't use.
You probably meant gdal.Open(asc) -- not asc_list.
You could move gt = dat.GetGeoTransform() out of the loop. (Rereading made me realize you can't really.)
You could cache calls to world2Pixel.
You're opening dat file for each pixel -- you should probably turn the logic around to only open files once and lookup all the pixels mapped to this file.
Benchmark, check the links in this podcast to see how: http://talkpython.fm/episodes/show/28/making-python-fast-profiling-python-code

Categories

Resources