I have to dump a series of temperature readings into a text file. The file is a space-delimited list of elements, where each row represents something (I don't know what; it just gets fed into a Fortran model, shudder). I am more or less handling it from our group's side, which means extracting those temperature readings and dumping them into a text file.
As a quick example, I have a list like this (but with a lot more elements):
temperature_readings = [ [1.343, 348.222, 484844.3333], [12349.000002, -2.43333]]
In the past we just dumped this into a file. Unfortunately, there are some people who have this irritating knack of wanting to look directly at the text file, picking out certain columns and changing some things (for testing, I don't really know). But they always complain about the columns not lining up properly. They pretty much want the above list to be printed like this:
1.343 348.222 484844.333
12349.000002 -2.433333
So those wonderful decimals line up. Is there an easy way to do this?
You can right-pad like this:
s = '%-10f' % val
To left-pad:
s = '%10f' % val
Or, in combination, pad and set the precision to 4 decimal places:
s = '%-10.4f' % val
For example:
import sys
rows = [[1.343, 348.222, 484844.3333], [12349.000002, -2.43333]]
for row in rows:
    for val in row:
        sys.stdout.write('%20f' % val)
    sys.stdout.write("\n")
            1.343000          348.222000       484844.333300
        12349.000002           -2.433330
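For reference, here is a small sketch of what the three format strings above produce for a single value. The variable names are only for illustration, and the f-string lines are the Python 3.6+ spelling of the same format specs.

```python
val = 1.343

left_aligned = '%-10f' % val    # pads on the right
right_aligned = '%10f' % val    # pads on the left
four_places = '%-10.4f' % val   # 4 decimal places, pads on the right

# The same format specs written as f-strings (Python 3.6+).
left_aligned_f = f'{val:<10f}'
right_aligned_f = f'{val:>10f}'
four_places_f = f'{val:<10.4f}'

# repr() makes the padding spaces visible.
for text in (left_aligned, right_aligned, four_places):
    print(repr(text))
```

For lining up decimal points in a column, the right-aligned form with a fixed number of decimals is the one that matters.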
The % (string formatting) operator is the older style; it still works, but str.format is generally preferred in newer code.
You can use str.format to do pretty printing in Python.
Something like this might work for you:
for row in temperature_readings:
    for temp in row:
        print "{0:10.4f}\t".format(temp),  # Python 2: the trailing comma suppresses the newline
    print
Which prints out the following:
1.3430 348.2220 484844.3333
12349.0000 -2.4333
You can read more about this here: http://docs.python.org/tutorial/inputoutput.html#fancier-output-formatting
If you also want to display a fixed number of decimals (which probably makes sense if the numbers are really temperature readings), something like this gives quite nice output:
for line in temperature_readings:
    for value in line:
        print '%10.2f' % value,
    print
Output:
1.34 348.22 484844.33
12349.00 -2.43
In Python 2.*,
for sublist in temperature_readings:
    for item in sublist:
        print '%15.6f' % item,
    print
emits
1.343000 348.222000 484844.333300
12349.000002 -2.433330
for your example. Tweak the lengths and number of decimals as you prefer, of course!
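The answers above use Python 2 print statements. A rough Python 3 sketch of the same idea, writing straight to a text file, might look like this. The function names `format_rows` and `write_rows` and the default width and precision are assumptions for illustration, not anything from the original post.

```python
def format_rows(rows, width=15, decimals=6):
    """Return one right-aligned, fixed-precision text line per row."""
    return [
        "".join(f"{value:{width}.{decimals}f}" for value in row)
        for row in rows
    ]


def write_rows(path, rows, width=15, decimals=6):
    """Write the formatted rows to a text file, one row per line."""
    with open(path, "w") as handle:
        for line in format_rows(rows, width, decimals):
            handle.write(line + "\n")


temperature_readings = [[1.343, 348.222, 484844.3333], [12349.000002, -2.43333]]
for line in format_rows(temperature_readings):
    print(line)
```

Because every field has the same width and the same number of decimals, the decimal points land in the same character column on every row, which is what a fixed-format Fortran reader (and a human scanning the file) would expect.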
I have a CSV file whose data follows a pattern like this:
I am importing it from the CSV file. The input data contains some whitespace, which I handle with the regex pattern shown below. For output, I want to write a function that takes this file as input and prints the lowest and highest blood pressure. It should also return the average of all the mean values. I am not allowed to use pandas.
I wrote the code block below.
import re

bloods=open(bloodfilename).read().split("\n")
blood_pressure=bloods[4].split(",")[1]
pattern=r"\s*(\d+)\s*\[\s*(\d+)-(\d+)\s*\]"
re.findall(pattern,blood_pressure)
# now extract mean, min and max information from the blood_pressure of each patient and write a new file called blood_pressure_modified.csv
pattern=r"\s*(\d+)\s*\[\s*(\d+)-(\d+)\s*\]"
outputfilename="blood_pressure_modified.csv"
# create a writeable file
outputfile=open(outputfilename,"w")
for blood in bloods:
    patient_id, blood_pressure=blood.strip().split(",")
    mean=re.findall(pattern,blood_pressure)[0]
    blood_pressure_modified=re.sub(pattern,"",blood_pressure)
    print(patient_id, blood_pressure_modified, mean, sep=",", file=outputfile)
outputfile.close()
The output should look like this:
This is a very simple answer: no regex, no pandas, nothing fancy.
Let me know whether it works. I can try to improve it for any case where it doesn't.
bloods=open("bloodfilename.csv").read().split("\n")
means = []
'''
Also, rather than having two list of mins and maxs,
we can have just one and taking min and max from this
list later would do the same trick. But just for clarity I kept both.
'''
mins = []
maxs = []
for val in bloods[1:]: #assuming first line is header of the csv
    if not val.strip(): # skip blank lines, e.g. after a trailing newline
        continue
    mean, ranges = val.split(',')[1].split('[')
    means.append(int(mean.strip()))
    first, second = ranges.split(']')[0].split('-')
    mins.append(int(first.strip()))
    maxs.append(int(second.strip()))
print(f'the lowest and the highest blood pressure are: {min(mins)} {max(maxs)} respectively\naverage of mean values is {sum(means)/len(means)}')
You can also create small helper functions for the repeated strip-and-convert steps; that's usually a better way to code. I wrote this in a bit of a hurry, so please excuse the rough edges.
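As a sketch of what such helper functions could look like without pandas: the names `parse_pressure` and `summarise` are hypothetical, the sample rows are made up, and the cell layout "mean[min-max]" with optional spaces is an assumption based on the pattern in the question.

```python
import re

# Assumed cell layout: "mean[min-max]" with optional spaces, e.g. " 140 [ 110-155 ]".
PRESSURE = re.compile(r"\s*(\d+)\s*\[\s*(\d+)\s*-\s*(\d+)\s*\]")


def parse_pressure(cell):
    """Return (mean, low, high) as ints for one blood-pressure cell."""
    match = PRESSURE.match(cell)
    if match is None:
        raise ValueError(f"unrecognised blood pressure value: {cell!r}")
    return tuple(int(group) for group in match.groups())


def summarise(lines):
    """Return (lowest, highest, average of means) for 'id,pressure' lines."""
    parsed = [parse_pressure(line.split(",")[1]) for line in lines if line.strip()]
    means = [mean for mean, _, _ in parsed]
    lowest = min(low for _, low, _ in parsed)
    highest = max(high for _, _, high in parsed)
    return lowest, highest, sum(means) / len(means)


# Made-up sample rows (header line already removed).
sample = ["1,135[113-166]", "2, 140 [ 110-155 ]", "3,132[108-180]"]
print(summarise(sample))
```

Raising `ValueError` on a cell that does not match makes malformed rows visible instead of silently skewing the average.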
Maybe this could help with your question,
Suppose you have a CSV file like this, and want to extract only the min and max values,
SN Number
1 135[113-166]
2 140[110-155]
3 132[108-180]
4 40[130-178]
5 133[118-160]
Then,
import pandas as pd
df = pd.read_csv("REPLACE_WITH_YOUR_FILE_NAME.csv")
results = df['YOUR_NUMBER_COLUMN'].apply(lambda x: x.split("[")[1].strip("]").split("-"))
with open("results.csv", mode="w") as f:
    f.write("MIN," + "MAX")
    f.write("\n")
    for i in results:
        f.write(str(i[0]) + "," + str(i[1]))
        f.write("\n")
# no f.close() needed: the with block closes the file
After the snippet has run without errors, there should be a file named results.csv in your current working directory. Open it and you will have the results.
So I wanna store a long integer which is too big for one line in python. Do I just ignore PEP 8 and just make it longer than 120 characters? Cause if I do it like this:
num="""7316717653133062491922511967442657474235534919493496983520312774506326239578318016984801869478851843
8586156078911294949545950173795833195285320880551112540698747158523863050715693290963295227443043557
6689664895044524452316173185640309871112172238311362229893423380308135336276614282806444486645238749
3035890729629049156044077239071381051585930796086670172427121883998797908792274921901699720888093776
6572733300105336788122023542180975125454059475224352584907711670556013604839586446706324415722155397
5369781797784617406495514929086256932197846862248283972241375657056057490261407972968652414535100474
8216637048440319989000889524345065854122758866688116427171479924442928230863465674813919123162824586
1786645835912456652947654568284891288314260769004224219022671055626321111109370544217506941658960408
0719840385096245544436298123098787992724428490918884580156166097919133875499200524063689912560717606
0588611646710940507754100225698315520005593572972571636269561882670428252483600823257530420752963450"""
and try to access a specific index of that integer or use len() on it I get a length of 1009 instead of the 1000 digits the number actually has. And putting everything into one line would make that line 1004 characters long which doesn't seem that great either.
I would use the following literal over multiple lines in parentheses for cleanliness:
num = (
    '7316717653'
    '1330624919'
    '2251196744'
)
so that len(num) from the above example returns: 30
Another option you have is to put the number into another file (say number.txt) and read it at runtime:
number.txt
7316717653133062491922511967442657474235534919493496983520312774506326239578318016984801869478851843858615607891129494954595017379583319528532088055111254069874715852386305071569329096329522744304355766896648950445244523161731856403098711121722383113622298934233803081353362766142828064444866452387493035890729629049156044077239071381051585930796086670172427121883998797908792274921901699720888093776657273330010533678812202354218097512545405947522435258490771167055601360483958644670632441572215539753697817977846174064955149290862569321978468622482839722413756570560574902614079729686524145351004748216637048440319989000889524345065854122758866688116427171479924442928230863465674813919123162824586178664583591245665294765456828489128831426076900422421902267105562632111110937054421750694165896040807198403850962455444362981230987879927244284909188845801561660979191338754992005240636899125607176060588611646710940507754100225698315520005593572972571636269561882670428252483600823257530420752963450
main.py
with open("number.txt", "r") as f:
    number = f.read().strip()  # strip a trailing newline, if the file has one
I wouldn't use this personally, but one option is to remove the newlines:
num = """
123
456
""".replace('\n', '')
print(repr(num)) # -> '123456'
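A variation on the same idea, as a sketch: `str.split()` with no arguments splits on any run of whitespace, so joining the pieces also removes indentation and trailing spaces, not only newlines. The digits below are a shortened, made-up stand-in for the real number.

```python
raw = """
    12345
    67890
    12345
"""

# split() with no arguments drops newlines, spaces and tabs alike.
digits = "".join(raw.split())

print(len(digits))      # number of digits, whitespace not counted
print(digits[5])        # the digit at index 5
number = int(digits)    # convert when arithmetic is needed
```

This keeps the source lines short while `len()` and indexing see only the digits.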
There's lots of good answers already, but here's one that will give you a bit of extra convenience. You just have to put in a number and the size of the chunks per line, and you can reuse it for lots of long numbers, if needed:
Format your number into multiple strings using a for loop and string concatenation:
x = str(7316717653133062491922511967442657474235534919493496983520312774506326239578318016984801869478851843858615607891129494954595017379583319528532088055111254069874715852386305071569329096329522744304355766896648950445244523161731856403098711121722383113622298934233803081353362766142828064444866452387493035890729629049156044077239071381051585930796086670172427121883998797908792274921901699720888093776657273330010533678812202354218097512545405947522435258490771167055601360483958644670632441572215539753697817977846174064955149290862569321978468622482839722413756570560574902614079729686524145351004748216637048440319989000889524345065854122758866688116427171479924442928230863465674813919123162824586178664583591245665294765456828489128831426076900422421902267105562632111110937054421750694165896040807198403850962455444362981230987879927244284909188845801561660979191338754992005240636899125607176060588611646710940507754100225698315520005593572972571636269561882670428252483600823257530420752963450)
y = []
y.append("long_num = (")
chunksize = 10
for i in range(0, len(x), chunksize):
    y.append("\t"+"\""+x[i:i+chunksize]+"\"")
y.append(")")
for part in y:
    print(part)
This outputs the following string, which you can use in your code (see @blhsing's answer):
long_num = (
"7316717653"
"1330624919"
"2251196744"
"2657474235"
"5349194934"
"9698352031"
"2774506326"
"2395783180"
"1698480186"
...
)
You can take a look at this post: "Is there a way to implement methods like __len__ or __eq__ as classmethods?"
Simply make a class for your long integer and override its __len__ method so that it does not count \n.
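A minimal sketch of that suggestion follows. The class name `DigitString` is hypothetical, and instead of skipping newlines inside `__len__` it strips all whitespace once in the constructor, which gives the same length and also makes indexing consistent.

```python
class DigitString:
    """Hypothetical wrapper that ignores whitespace in a multi-line digit string."""

    def __init__(self, text):
        # Keep only the non-whitespace characters.
        self._digits = "".join(text.split())

    def __len__(self):
        return len(self._digits)

    def __getitem__(self, index):
        return self._digits[index]

    def __int__(self):
        return int(self._digits)


num = DigitString("""73167
17653
13306""")
print(len(num), num[5], int(num))
```

For a one-off constant this is more machinery than the parenthesised string literals shown earlier, so it is probably only worth it when many such numbers are handled.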
I would like to go through a gene and get a list of 10bp long sequences containing the exon/intron borders from each feature.type =='mRNA'. It seems like I need to use compoundLocation, and the locations used in 'join' but I can not figure out how to do it, or find a tutorial.
Could anyone please give me an example or point me to a tutorial?
Assuming all the info is in the exact format you show in the comment, and that you're looking for 20 bp on either side of each intron/exon boundary, something like this might be a start:
Edit: If you're actually starting from a GenBank record, then it's not much harder. Assuming that the full junction string you're looking for is in the CDS feature info, then:
for f in record.features:
    if f.type == 'CDS':
        jct_info = str(f.location)
converts the "location" information into a string and you can continue as below.
(There are ways to work directly with the location information without converting to a string - in particular you can use "extract" to pull the spliced sequence directly out of the parent sequence -- but the steps involved in what you want to do are faster and more easily done by converting to str and then int.)
import re
jct_info = "join{[0:229](+), [11680:11768](+), [11871:12135](+), [15277:15339](+), [16136:16416](+), [17220:17471](+), [17547:17671](+)"
jctP = re.compile(r"\[\d+:\d+\]")  # raw string, so the backslashes reach the regex engine unchanged
jcts = jctP.findall(jct_info)
jcts
['[0:229]', '[11680:11768]', '[11871:12135]', '[15277:15339]', '[16136:16416]', '[17220:17471]', '[17547:17671]']
Now you can loop through the list of start:end values, pull them out of the text and convert them to ints so that you can use them as sequence indexes. Something like this:
for jct in jcts:
    start, end = (int(x) for x in jct.strip('[]').split(':'))
    # Slicing never raises IndexError. A negative start (e.g. where start = 0)
    # would silently wrap around to the end of the sequence, so clamp it at 0.
    # seq is the parent sequence, e.g. record.seq.
    start_20_20 = seq[max(0, start - 20):start + 20]
    # do the same with end to get the window around the other boundary
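Putting the steps above together as a self-contained sketch: it uses plain strings rather than a real GenBank record, the function name `junction_windows` and the toy data are made up for illustration, and it returns a window for every start and end coordinate, including the first start and last end, which are not intron boundaries and may need to be dropped.

```python
import re


def junction_windows(seq, location_text, flank=20):
    """Return the sequence window around every exon start and end coordinate."""
    windows = []
    for start, end in re.findall(r"\[(\d+):(\d+)\]", location_text):
        for position in (int(start), int(end)):
            # Clamp at 0: a negative slice start would wrap around silently.
            left = max(0, position - flank)
            windows.append(seq[left:position + flank])
    return windows


# Toy data, made up for illustration only.
toy_seq = "ACGT" * 10  # 40 bases
toy_location = "join{[0:8](+), [20:28](+)}"
for window in junction_windows(toy_seq, toy_location, flank=5):
    print(window)
```

Slices that run past the end of the sequence are simply shorter, so windows near either end of the parent sequence come back truncated rather than raising an error.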
I want to retrieve hexadecimal data from the user, using Python. How do I retrieve the data from the user and convert it to hex?
# to read variables from Python
STX = '\xF7' #hex(input("enter STX Value"))
Deviceid = hex(input("enter device id"))
subid = hex(input("enter address of the Device and load details"))
Comnd = hex(41)
Data = hex(01)
EorCode = input("enter EOR Code")
ADD_sum = '\xF2' #hex(input("Enter Add sum value"))
tuple = (STX, Deviceid,subid,Comnd,Data,EorCode,ADD_sum)
print tuple
I am reading the above data from the user, but I am getting the following output:
enter device id03
enter address of the Device and load details81
enter EOR Code32
('\xf7', '0x3', '0x51', '0x29', '0x1', '0x20', '\xf2')
But I need the values to be printed as 0x03 and 0x01.
I am very new to Python, please help.
You're looking for string formatting:
>>> "0x{0:04x}".format(42)
'0x002a'
So you'll want to modify your lines like so:
Deviceid = "0x{0:02x}".format(input("enter device id"))
Also, if any other Python developer will be looking at this code you may want to look at the Python style guide, PEP8.
Following the style guide, your code might look like this:
stx = '\xF7' # hex(input("enter STX Value"))
device_id = hex(input("enter device id")) # deviceid might also be fine
sub_id = hex(input("enter address of the Device and load details"))
comnd = hex(41)
data = hex(01)
eor_code = input("enter EOR Code")
add_sum = '\xF2' # hex(input("Enter Add sum value"))
values = (stx, device_id, sub_id, comnd, data, eor_code, add_sum)
print values # tuple is a built-in type - it's best *not* to shadow built-in names if possible
Of course,
A style guide is about consistency. Consistency with this style guide is important. Consistency within a project is more important. Consistency within one module or function is most important.
But most importantly: know when to be inconsistent -- sometimes the style guide just doesn't apply. When in doubt, use your best judgment. Look at other examples and decide what looks best. And don't hesitate to ask!
It seems to me that all you really need is to specify how to print the numbers, but the hex function returns a string.
In Python, '10' is a string, and that is different from 10, which is an int. Python is a dynamically but strongly typed language.
So in order to get the output you want, you can choose from 2 options:
write your own function to convert numbers to hexadecimal in the format you want and use it instead of hex:
def myhex(num):
    return '0x%02x' % num
this 0x%02x means - first, 0x is just normal text which you probably want to prefix all your hexadecimal numbers, %02x means: print argument as hexadecimal number of length 2, prefixed with 0 if it's too short (one-digit hexadecimal number).
do not convert numbers to hex when reading values (it's probably a good thing to work with numbers represented as numbers) and print them formatted to your specification at the end:
print '(' + ', '.join('0x%02x' % x for x in tuple) + ')'
which converts all values in tuple (btw, avoid using built-in names as your variable names when possible) to 2-digit hexadecimal numbers with 0x prefixes, joins them using ', ' and surrounds them with parentheses. But feel free to change it - I'm just building on your example and trying to duplicate your output.
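Here is a rough Python 3 sketch of the second option. The helper name `to_hex_byte` is hypothetical, the list of strings stands in for what `input()` would return, and it assumes the user types decimal numbers, as in the question's sample session.

```python
def to_hex_byte(value):
    """Format an integer as a zero-padded, two-digit hex string such as '0x03'."""
    return "0x{0:02x}".format(value)


# Stand-ins for input(): in Python 3, input() returns str, so convert with
# int() first (assumption: the values are typed as decimal numbers).
raw_values = ["03", "81", "32"]
device_id, sub_id, eor_code = (int(text) for text in raw_values)

# Keep everything as plain ints until it is time to print.
frame = (0xF7, device_id, sub_id, 41, 1, eor_code, 0xF2)
print("(" + ", ".join(to_hex_byte(n) for n in frame) + ")")
```

If the user is meant to type hexadecimal digits instead, `int(text, 16)` would be the conversion to use.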
I am running some code with python and pyfits and I am reading out a line of information from the header. I am getting the correct line but due to how it is written in the header it is printing out with colons separating the numbers I need.
the line I am running is
print header[0].header['opp']
this prints
34:04:32.04
I need to do a calculation where I add these numbers together, but do not know how to do this as they are separated by colons.
Something like this should solve your problem:
header[0].header['opp'] = "34:04:32.04"
print (sum(float(x) for x in header[0].header['opp'].split(":")))
... which outputs:
70.03999999999999
(EDIT)
Or, if the values actually make up a time in hours, minutes and seconds:
s = "34:04:32.04"
ss = [float(x) for x in s.split(":")]
print (ss[0] + ss[1]/60 + ss[2]/3600)
... which outputs the value in hours:
34.07556666666667
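If the header value really is a sexagesimal quantity (hours or degrees, then minutes and seconds), the conversion could be wrapped in a small function. This is a sketch: the name `sexagesimal_to_decimal` is made up, and the sign handling is an assumption about how negative values would be written.

```python
def sexagesimal_to_decimal(text):
    """Convert 'DD:MM:SS.ss' (or 'HH:MM:SS.ss') to a single decimal number."""
    # Take the sign from the text itself so that values like '-00:30:00' work.
    sign = -1.0 if text.strip().startswith("-") else 1.0
    first, minutes, seconds = (abs(float(part)) for part in text.split(":"))
    return sign * (first + minutes / 60 + seconds / 3600)


print(sexagesimal_to_decimal("34:04:32.04"))
print(sexagesimal_to_decimal("-00:30:00"))
```

The result is in the unit of the first field, so '34:04:32.04' comes back as roughly 34.0756 hours (or degrees).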