python pyfits, reading header information and using in calculation - python

I am running some code with Python and pyfits that reads one value from a header. I am getting the correct value, but because of how it is written in the header it prints with colons separating the numbers I need.
The line I am running is:
print header[0].header['opp']
this prints
34:04:32.04
I need to do a calculation where I add these numbers together, but do not know how to do this as they are separated by colons.

Something like this should solve your problem:
header[0].header['opp'] = "34:04:32.04"
print (sum(float(x) for x in header[0].header['opp'].split(":")))
... which outputs:
70.03999999999999
(EDIT)
Or, if the values actually make up a time in hours, minutes and seconds:
s = "34:04:32.04"
ss = [float(x) for x in s.split(":")]
print (ss[0] + ss[1]/60 + ss[2]/3600)
... which outputs the value in hours:
34.07556666666667
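The second snippet above can be wrapped in a small helper. This is only a sketch: `sexagesimal_to_decimal` is a hypothetical name (it is not part of pyfits), and the handling of a leading minus sign is an assumption that the original answer does not cover.

```python
def sexagesimal_to_decimal(value):
    """Convert 'HH:MM:SS.ss' (or 'DD:MM:SS.ss') to a single decimal number."""
    text = value.strip()
    # Assumption: a leading '-' applies to the whole value, not only the first field.
    sign = -1.0 if text.startswith("-") else 1.0
    whole, minutes, seconds = (abs(float(part)) for part in text.split(":"))
    return sign * (whole + minutes / 60.0 + seconds / 3600.0)


print(sexagesimal_to_decimal("34:04:32.04"))
```

The result is in whatever unit the first field uses (hours or degrees), so it is worth checking what the header keyword actually stores before using the number.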

Related

Finding Min and Max Values in List Pattern in Python

I have a CSV file whose data follows this pattern:
I am importing it from a CSV file. The input data contains some whitespace, which I handle with a regex pattern. For output, I want to write a function that takes this file as input and prints the lowest and highest blood pressure. It should also return the average of all mean values. I am not allowed to use pandas.
I wrote the code block below.
import re

bloods = open(bloodfilename).read().split("\n")
blood_pressure = bloods[4].split(",")[1]
pattern = r"\s*(\d+)\s*\[\s*(\d+)-(\d+)\s*\]"
re.findall(pattern, blood_pressure)
# now extract mean, min and max information from the blood_pressure of each patient
# and write a new file called blood_pressure_modified.csv
pattern = r"\s*(\d+)\s*\[\s*(\d+)-(\d+)\s*\]"
outputfilename = "blood_pressure_modified.csv"
# create a writable file
outputfile = open(outputfilename, "w")
for blood in bloods:
    # use the loop variable (not the whole list) and call strip()
    patient_id, blood_pressure = blood.strip().split(",")
    mean = re.findall(pattern, blood_pressure)[0]
    blood_pressure_modified = re.sub(pattern, "", blood_pressure)
    print(patient_id, blood_pressure_modified, mean, sep=",", file=outputfile)
outputfile.close()
The output should look like this:
Here is a very simple answer to this. No regex, pandas, or anything else.
Let me know whether it works. I can try to improve it for any case where it doesn't.
bloods = open("bloodfilename.csv").read().split("\n")
means = []
'''
Also, rather than having two lists of mins and maxs,
we could have just one and take min and max from that
list later, which would do the same trick. But just for clarity I kept both.
'''
mins = []
maxs = []
for val in bloods[1:]:  # assuming the first line is the header of the csv
    if not val.strip():
        continue  # skip blank lines, e.g. the one after a trailing newline
    mean, ranges = val.split(',')[1].split('[')
    means.append(int(mean.strip()))
    first, second = ranges.split(']')[0].split('-')
    mins.append(int(first.strip()))
    maxs.append(int(second.strip()))
print(f'the lowest and the highest blood pressure are: {min(mins)} {max(maxs)} respectively\naverage of mean values is {sum(means)/len(means)}')
You can also create small functions to handle the stripping. That is usually a better way to code. I wrote this in a bit of a hurry, so don't mind the rough edges.
Maybe this could help with your question,
Suppose you have a CSV file like this, and want to extract only the min and max values,
SN Number
1 135[113-166]
2 140[110-155]
3 132[108-180]
4 40[130-178]
5 133[118-160]
Then,
import pandas as pd

df = pd.read_csv("REPLACE_WITH_YOUR_FILE_NAME.csv")
# turn "135[113-166]" into ["113", "166"]
results = df['YOUR_NUMBER_COLUMN'].apply(lambda x: x.split("[")[1].strip("]").split("-"))
with open("results.csv", mode="w") as f:
    f.write("MIN," + "MAX")
    f.write("\n")
    for i in results:
        f.write(str(i[0]) + "," + str(i[1]))
        f.write("\n")
# no f.close() needed: the with block closes the file
After the snippet has run without errors, there should be a file named results.csv in your current working directory. Open it and you will have the results.
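Since the question rules out pandas, here is a standard-library sketch of the same idea. It assumes each data row contains one cell of the form mean[min-max] with optional whitespace; the function name `summarize_blood_pressure` and the sample rows are made up for illustration.

```python
import re

# Assumed cell format: "mean[min-max]" with optional whitespace, e.g. "135 [113-166]".
PATTERN = re.compile(r"(\d+)\s*\[\s*(\d+)\s*-\s*(\d+)\s*\]")


def summarize_blood_pressure(lines):
    """Return (lowest, highest, average of means) over all matching lines."""
    means, lows, highs = [], [], []
    for line in lines:
        match = PATTERN.search(line)
        if match is None:
            continue  # header, blank line, or malformed row
        mean, low, high = (int(group) for group in match.groups())
        means.append(mean)
        lows.append(low)
        highs.append(high)
    if not means:
        raise ValueError("no blood pressure values found")
    return min(lows), max(highs), sum(means) / len(means)


# Made-up sample rows standing in for the real file.
sample = [
    "patient_id,blood_pressure",
    "1, 135[113-166]",
    "2,140 [ 110-155 ]",
    "3,132[108-180]",
]
lowest, highest, average = summarize_blood_pressure(sample)
print(f"lowest={lowest} highest={highest} average of means={average}")
```

With a real file, the same function can be fed the open file object directly, for example `with open(bloodfilename) as f: summarize_blood_pressure(f)`, because it only iterates over lines.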

Written a little time delta function and want to capture results as a numpy array

I have written some code that takes two lists from data_dict, one containing opening times and one containing closing times.
The function finds the difference between these two times and returns it in hours (X.X hours).
If the opening and closing times in the lists are not in the correct format (00:00:00), the function returns '-1'.
It works perfectly, but I want to be able to capture the results and save them as a numpy array.
The results print like this...
X
Y
Z
A
X
etc...
I am very very new to python and just need some guidance.
Thanks guys.
opening_time_arr = data_dict['Open']
closing_time_arr = data_dict['Close']
if len(opening_time_arr) == len(closing_time_arr):
    resultTime = []
    for idx, closing_time in enumerate(closing_time_arr):
        try:
            FORMAT = '%H:%M:%S'
            tdelta = datetime.strptime(closing_time, FORMAT) - datetime.strptime(opening_time_arr[idx], FORMAT)
            resultTime.append(tdelta)
            tdelta_h = tdelta.total_seconds()/3600
            print(tdelta_h)
        except ValueError:
            print('-1')
The function returns
8.0
8.5
6.5
7.5
and so on... there are about 250 entries.
How can I take these numbers and convert them to a numpy array, without printing the results as my code currently does?
Oliver - I think you were really close! If tdelta_h is your output in hours, then that is what you want to be appending to resultTime. After your for loop finishes, then you can convert the list to a numpy array using np.array(), and then print out the array if you want to make sure it looks OK.
Here's how I think it should look all together:
import numpy as np
from datetime import datetime

opening_time_arr = data_dict['Open']
closing_time_arr = data_dict['Close']
if len(opening_time_arr) == len(closing_time_arr):
    resultTime = []
    for idx, closing_time in enumerate(closing_time_arr):
        try:
            FORMAT = '%H:%M:%S'
            tdelta = (datetime.strptime(closing_time, FORMAT) - datetime.strptime(opening_time_arr[idx], FORMAT))
            tdelta_h = tdelta.total_seconds()/3600
            resultTime.append(tdelta_h)
        except ValueError:
            resultTime.append(-1)
    # np.array returns a new array, so keep the result instead of discarding it
    resultTime = np.array(resultTime)
    print(resultTime)
Hope this helps :)
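The same approach can also be written as a function that pairs the two lists with zip instead of indexing. This is a sketch only: `hours_open` is a hypothetical name, and the sample lists are made up to stand in for data_dict['Open'] and data_dict['Close'].

```python
from datetime import datetime

import numpy as np

FORMAT = "%H:%M:%S"


def hours_open(opening_times, closing_times):
    """Return closing minus opening in hours as a float array; -1 marks bad input."""
    hours = []
    for opened, closed in zip(opening_times, closing_times):
        try:
            delta = datetime.strptime(closed, FORMAT) - datetime.strptime(opened, FORMAT)
            hours.append(delta.total_seconds() / 3600)
        except ValueError:
            hours.append(-1.0)  # string did not match HH:MM:SS
    return np.array(hours)


# Made-up sample data; the third opening time is deliberately malformed.
result = hours_open(["09:00:00", "08:30:00", "closed"],
                    ["17:00:00", "17:00:00", "17:00:00"])
print(result)
```

Note that zip stops at the shorter list, so the length check from the original code is still worth keeping if mismatched lists should be treated as an error.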

Loading delimited file with unequal number of rows at each column

I am trying to load this kind of file but I get the "wrong number of columns" error when I'm doing the following:
import numpy
ux = numpy.loadtxt('ux.txt',delimiter=None)
The file looks like this:
.2496455E-03 -.1076763E-03 .2617193E-03 -.1371510E-03 .2694375E-03
-.1649617E-03 .2751468E-03 -.1895755E-03 .2890017E-03 -.2926575E-03
.1313772E-03
I could have the remainder be loaded as zeros, I don't care that much about it.
Thank you in advance!
What I did was the following, and it worked.
Since I wanted a single column with all the numbers in the end, I did this:
# ux is assumed here to be the open text file, e.g. ux = open('ux.txt')
uxf = []
for line in ux:
    uxs = [float(x) for x in line.split()]
    uxf = numpy.hstack((uxf, uxs))
That way I stacked all the lines, which is what I eventually wanted. The [float(x) for x in line.split()] doesn't care about the number of columns in the line.
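If a single flat array is the goal, the per-line loop can be skipped by splitting the whole text on whitespace at once. The sketch below assumes the file holds only whitespace-separated numbers; `load_flat` is a hypothetical helper name and the sample text is shortened from the question.

```python
import numpy as np


def load_flat(text):
    """Parse every whitespace-separated number in text into one 1-D float array."""
    return np.array([float(token) for token in text.split()])


# Shortened sample with an unequal number of values per line.
sample = """ .2496455E-03 -.1076763E-03 .2617193E-03
-.1649617E-03 .2751468E-03
 .1313772E-03
"""
values = load_flat(sample)
print(values.shape)
```

For the real file this would be called as `with open('ux.txt') as f: ux = load_flat(f.read())`. Building the list once and converting at the end also avoids reallocating the array on every line, which is what repeated hstack calls do.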

Extraction and processing the data from txt file

I am a beginner in Python (and in programming). I have a large file containing a repeating pattern: 3 lines with numbers, then 1 empty line, and so on.
If I print the file it looks like:
1.93202838
1.81608154
1.50676177

2.35787777
1.51866227
1.19643624
...
I want to take each group of three numbers as one vector, do some math operations on it, write the result to a new file, and then move on to the next three lines, which form the next vector. Here is my code (it doesn't work):
import math
inF = open("data.txt", "r+")
outF = open("blabla.txt", "w")
a = []
fin = []
b = []
for line in inF:
    a.append(line)
    if line.startswith(" \n"):
        fin.append(b)
        h1 = float(fin[0])
        k2 = float(fin[1])
        l3 = float(fin[2])
        h = h1/(math.sqrt(h1*h1+k1*k1+l1*l1)+1)
        k = k1/(math.sqrt(h1*h1+k1*k1+l1*l1)+1)
        l = l1/(math.sqrt(h1*h1+k1*k1+l1*l1)+1)
        vector = [str(h), str(k), str(l)]
        outF.write('\n'.join(vector)
        b = a
        a = []
inF.close()
outF.close()
print "done!"
I want to get "vector" from each 3 lines in my file and put it into blabla.txt output file. Thanks a lot!
My 'code comment' answer:
take care to close all parentheses so that they match the opened ones! (an unclosed one is very likely to raise a SyntaxError ;-) )
fin is created as an empty list, and is never filled. Trying to call any value by fin[n] is therefore very likely to break with an IndexError;
k2 and l3 are created but never used;
k1 and l1 are not created but used, this is very likely to break with a NameError;
b is created as a copy of a, so is a list. But you do a fin.append(b): what do you expect in this case by appending (not extending) a list?
Hope this helps!
This is only in the answers section for length and formatting.
Input and output.
Control flow
I know nothing of vectors, you might want to look into the Math module or NumPy.
Those links should hopefully give you all the information you need to at least get started with this problem. As yuvi said, the code won't be written for you, but you can come back when you have something that isn't working as you expected or that you don't fully understand.
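To make the comments above concrete, here is one possible shape for such a script. It is a sketch, not the asker's method: it assumes every block has exactly three numeric lines separated by blank (or whitespace-only) lines, it applies the formula from the question, component / (norm + 1), and the names `transform` and `read_vectors` are made up for illustration.

```python
import math


def transform(vector):
    """Apply the formula from the question: component / (norm + 1)."""
    h1, k1, l1 = vector  # assumes exactly three values per block
    denom = math.sqrt(h1 * h1 + k1 * k1 + l1 * l1) + 1
    return [h1 / denom, k1 / denom, l1 / denom]


def read_vectors(lines):
    """Yield one list of floats per block of consecutive non-blank lines."""
    block = []
    for line in lines:
        if line.strip():
            block.append(float(line))
        elif block:
            yield block
            block = []
    if block:  # last block when the input does not end with a blank line
        yield block


# In-memory sample standing in for the lines of data.txt.
sample = ["1.93202838\n", "1.81608154\n", "1.50676177\n", " \n",
          "2.35787777\n", "1.51866227\n", "1.19643624\n"]
results = [transform(vector) for vector in read_vectors(sample)]
for vector in results:
    print("\n".join(str(x) for x in vector))
    print()
```

With real files, `read_vectors` can be given the open input file, and the print calls can be replaced by writes to the output file inside a with block so that both files are closed automatically.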

Pretty printing a list of list of floats?

Basically, I have to dump a series of temperature readings into a text file. It is a space-delimited list of elements, where each row represents something (I don't know what, and it just gets forced into a Fortran model, shudder). I am more or less handling it from our group's side, which means extracting those temperature readings and dumping them into a text file.
A quick example: I have a list like this (but with a lot more elements):
temperature_readings = [ [1.343, 348.222, 484844.3333], [12349.000002, -2.43333]]
In the past we just dumped this into a file. Unfortunately, some people have an irritating knack for wanting to look directly at the text file, pick out certain columns, and change some things (for testing, I don't really know). They always complain about the columns not lining up properly; they pretty much want the above list to be printed like this:
    1.343     348.222     484844.333
12349.000002   -2.433333
So those wonderful decimals line up. Is there an easy way to do this?
You can right-pad like this:
s = '%-10f' % val
To left-pad:
s = '%10f' % val
Or, in combination, pad and set the precision to 4 decimal places:
s = '%-10.4f' % val
For example:
import sys
rows = [[1.343, 348.222, 484844.3333], [12349.000002, -2.43333]]
for row in rows:
    for val in row:
        sys.stdout.write('%20f' % val)
    sys.stdout.write("\n")
which prints:
            1.343000          348.222000       484844.333300
        12349.000002           -2.433330
The % (string formatting) operator is the older style; it still works, but it is no longer the preferred approach.
You can use str.format to do pretty printing in Python.
Something like this might work for you:
for row in temperature_readings:
    for temp in row:
        print "{0:10.4f}\t".format(temp),
    print
Which prints out the following:
1.3430 348.2220 484844.3333
12349.0000 -2.4333
You can read more about this here: http://docs.python.org/tutorial/inputoutput.html#fancier-output-formatting
If you also want to display a fixed number of decimals (which probably makes sense if the numbers are really temperature readings), something like this gives quite nice output:
for line in temperature_readings:
    for value in line:
        print '%10.2f' % value,
    print
Output:
      1.34     348.22  484844.33
  12349.00      -2.43
In Python 2.*,
for sublist in temperature_readings:
    for item in sublist:
        print '%15.6f' % item,
    print
emits
       1.343000      348.222000   484844.333300
   12349.000002       -2.433330
for your example. Tweak the lengths and number of decimals as you prefer, of course!
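The answers above use Python 2 print statements. A Python 3 sketch of the same fixed-width idea is shown below; `format_rows` is a hypothetical helper name, and the width and precision defaults are simply the values from the last answer.

```python
def format_rows(rows, width=15, precision=6):
    """Return one line per row, each value right-aligned with fixed decimals."""
    return ["".join(f"{value:{width}.{precision}f}" for value in row) for row in rows]


temperature_readings = [[1.343, 348.222, 484844.3333], [12349.000002, -2.43333]]
lines = format_rows(temperature_readings)
print("\n".join(lines))
```

Because every value gets the same width and the same number of decimals, the decimal points line up in each column. The width has to be at least as large as the longest formatted number plus one, otherwise adjacent columns run together.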
