assign a variable to each data in a line separated by whitespaces - python

I want to use Python to read each value in the last two lines of the following file into its own variable.
The file is of the following form:
a b c
10
10 0 0
2 5
xyz
10 12 13
11 12 12.4
1 34.5 10.8
I want the output to be the following:
d=11, e=12, f=12.4
g=1, h=34.5, i=10.8
How can I loop over the lines if there are, say, 100 lines after xyz, each with three values, and I only need to read the last 3 of them?
The following is what I tried, but it doesn't seem to get anywhere.
p1=open('aaa','r')
im=open('bbb','w')
t=open('test','w')
lines=p1.readlines()
i=0
for line in lines:
    Nj=[]
    Nk=[]
    Cx=Cy=Cz=Nx=Ny=Nz=0
    i=i+1
    if line.strip():
        if i==1:
            t.write(line)
            dummy=line.strip().split()
            a1=dummy[0]
            a2=dummy[1]
            a3=dummy[2]
            print("The atoms present are %s, %s and %s" %(a1, a2,a3))
        if i==2:
            t.write(line)
        if i==3:
            t.write(line)
        if i==4:
            t.write(line)
        if i==5:
            t.write(line)
        if i==6:
            t.write(line)
            dummy=line.strip().split()
            Na1=dummy[0]
            Na2=dummy[1]
            Na3=dummy[2]
            import string
            N1=string.atoi(Na1)
            N2=string.atoi(Na2)
            N3=string.atoi(Na3)
            print("number of %s atoms= %d "%(a1,N1))
            print("number of %s atoms= %d "%(a2,N2))
            print("number of %s atoms= %d "%(a3,N3))
        if i==7:
            t.write(line)
        if i==8:
            t.write(line)
for i, line in enumerate(p1):
    if i==8:
        dummy=line.strip().split()
        Njx=dummy[0]
        Njy=dummy[1]
        Njz=dummy[2]
        import string
        Njx=string.atof(Njx)
        Njy=string.atof(Njy)
        Njz=string.atof(Njz)
        Nj = [Njx, Njy, Njz]
    elif i==9:
        dummy=line.strip().split()
        Nkx=dummy[0]
        Nky=dummy[1]
        Nkz=dummy[2]
        import string
        Nkx=string.atof(Nkx)
        Nky=string.atof(Nky)
        Nkz=string.atof(Nkz)
        Nk = [Nkx, Nky, Nkz]
        break

You can read the file's last two lines with
f = open(file, "r")
lines = f.readlines()[-2:] # change this if you want more than the last two lines
f.close()
split1 = lines[0].split() # split() with no argument handles any run of whitespace; in the example below: lines[0] = "4 5 6\n"
split2 = lines[1].split() # lines[1] = "7 8 9"
Then, you can assign those values to your variables:
d,e,f = [int(x) for x in split1] # use float(x) instead if the values can be decimals, such as 12.4 in the question
g,h,i = [int(x) for x in split2]
This will assign the three values of each line to d,e,f,g,h,i, for example:
(your file)
...
1 2 3
4 5 6
7 8 9
(result)
d = 4
e = 5
f = 6
g = 7
h = 8
i = 9

Here you go:
with open("text.txt", "r") as f:
    # Get the last two lines and remove the trailing '\n'
    contents = map(lambda s : s.rstrip('\n'), f.readlines()[-2:])
    # Get the last three values of each line
    [[d,e,f],[g,h,i]] = map(lambda s : map(float, s.split()[-3:]), contents)
# Check the result
print (d,e,f,g,h,i)
Explanation:
with open("text.txt", "r") as f: is the recommended way of working with files in Python; see the file I/O tutorial for why.
contents = map(lambda s : s.rstrip('\n'), f.readlines()[-2:]) loads the contents of f into a list of strings using readlines(), takes the last two using [-2:], and removes the trailing '\n' from each by mapping lambda s : s.rstrip('\n'). (Slicing with s[:-1] would also cut off the last digit when the final line has no trailing newline.)
At this point, contents holds the last two lines.
The expression map(lambda s : map(float, s.split()[-3:]), contents) splits each of the two lines on whitespace and converts the values to float; the result is then unpacked into [[d,e,f],[g,h,i]]. The [-3:] keeps only the last three values on each line.
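For the case raised in the question (about 100 lines after xyz, of which only the last few are needed), a minimal sketch using collections.deque with maxlen keeps only the tail of the file in memory. The function name last_rows, the in-memory sample, and the choice of float are assumptions made for illustration; they are not part of either answer.

```python
from collections import deque
import io

def last_rows(lines, n):
    """Return the last n non-blank lines as lists of floats."""
    # deque(maxlen=n) keeps only the n most recent items it is given
    tail = deque((ln for ln in lines if ln.strip()), maxlen=n)
    # split() with no argument splits on any run of whitespace
    return [[float(tok) for tok in ln.split()] for ln in tail]

# Stand-in for the question's file; with a real file, pass the open file object
sample = io.StringIO("a b c\n10\n10 0 0\n2 5\nxyz\n10 12 13\n11 12 12.4\n1 34.5 10.8\n")
(d, e, f), (g, h, i) = last_rows(sample, 2)
print(d, e, f, g, h, i)
```

To read the last three lines instead, call last_rows(fh, 3) and unpack three rows.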

Related

How to use Enumerate with Variable data properly?

I am trying to use enumerate on data held in a variable, but the data is enumerated character by character, as a single string. How can I get output in the format below?
I get the expected output when I use a with statement:
with open("sample.txt") as file:
    for num, line in enumerate(file):
        print(num, line)
output
0 sdasd
1 adad
2 adadf
but when I use a string:
data = "adklkahdjsa saljdahsd \nsjdksd"
for num, line in enumerate(data):
    print(num, line)
output
0 a
1 d
2 k
3 l
4 k
5 a
6 h
7 d
8 j
9 s
10 a
11
12 s ... so on
enumerate expects an iterable. In your example it takes the string as the iterable and iterates over each character.
It seems that what you want is to iterate over each word in the text. For that, you first need to split the string into words.
Example:
data.split() # split on any whitespace, including the newline
Full example:
data = "adklkahdjsa saljdahsd \nsjdksd"
for num, line in enumerate(data.split()):
    print(num, line)
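If the goal is to number lines rather than words, so that the string behaves like the file in the question, str.splitlines() is one option. The sketch below reuses the question's data string and shows both variants side by side; the variable names numbered_lines and numbered_words are made up for the example.

```python
data = "adklkahdjsa saljdahsd \nsjdksd"

# splitlines() breaks the string at line boundaries, like iterating over a file
numbered_lines = list(enumerate(data.splitlines()))
for num, line in numbered_lines:
    print(num, line)

# split() with no argument breaks at any run of whitespace, giving words
numbered_words = list(enumerate(data.split()))
for num, word in numbered_words:
    print(num, word)
```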

Print data between positions within a loop

I have one file.
File1 has 3 tab-separated columns.
File1:
2 4 Apple
6 7 Samsung
Suppose I run a loop of 10 iterations. If the iteration number lies between column 1 and column 2 of a line in File1, print the corresponding 3rd column from File1; otherwise print "0".
The lines may or may not be sorted, but the 2nd column is always greater than the 1st. The ranges given by the two columns do not overlap between lines.
The output Result should look like this.
Result:
0
Apple
Apple
Apple
0
Samsung
Samsung
0
0
0
My program in python is here:
chr5_1 = [[]]
for line in file:
    line = line.rstrip()
    line = line.split("\t")
    chr5_1.append([line[0],line[1],line[2]])
# Here I store all position information in chr5_1 list in list
chr5_1.pop(0)
for i in range (1,10):
    for listo in chr5_1:
        L1 = " ".join(str(x) for x in listo[:1])
        L2 = " ".join(str(x) for x in listo[1:2])
        L3 = " ".join(str(x) for x in listo[2:3])
        if int(L1) <= i and int(L2) >= i:
            print(L3)
            break
        else:
            print ("0")
            break
I am confused about the loop iteration and where it breaks.
Try this:
chr5_1 = dict()
for line in file:
    line = line.rstrip()
    _from, _to, value = line.split("\t")
    for i in range(int(_from), int(_to) + 1):
        chr5_1[i] = value
for i in range(1, 11):  # 1 through 10 inclusive
    print(chr5_1.get(i, "0"))
I think this is a job for else:
position_information = []
with open('file1', 'r') as f:
    for line in f:
        position_information.append(line.strip().split('\t'))
for i in range(1, 11):
    for start, through, value in position_information:
        if i >= int(start) and i <= int(through):
            print(value)
            # No need to continue searching for something to print on this line
            break
    else:
        # We never found anything to print on this line, so print 0 instead
        print(0)
This gives the result you're looking for:
0
Apple
Apple
Apple
0
Samsung
Samsung
0
0
0
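The for ... else pattern above can also be wrapped in a function that returns the labels instead of printing them, which makes it easier to check against the expected result. This is a sketch only: the function name labels_for and the in-memory rows are assumptions, standing in for the parsed lines of File1.

```python
def labels_for(rows, limit):
    """rows: iterable of (start, end, value); returns one label per position 1..limit."""
    ranges = [(int(start), int(end), value) for start, end, value in rows]
    result = []
    for i in range(1, limit + 1):
        for start, end, value in ranges:
            if start <= i <= end:
                result.append(value)
                break
        else:
            # runs only when the inner loop finished without hitting break
            result.append("0")
    return result

# Tab-separated sample lines matching File1 in the question
rows = [line.split("\t") for line in ["2\t4\tApple", "6\t7\tSamsung"]]
print("\n".join(labels_for(rows, 10)))
```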
Setup:
import io
s = '''2 4 Apple
6 7 Samsung'''
# Python 2.x
f = io.BytesIO(s)
# Python 3.x
#f = io.StringIO(s)
If the lines of the file are not sorted by the first column:
import csv
reader = csv.reader(f, delimiter = ' ', skipinitialspace = True)
f = list(reader)
# sort numerically; sorting the raw strings would put "10" before "2"
f.sort(key = lambda row: int(row[0]))
Read each line; do some math to figure out what to print and how many of them to print; print stuff; iterate
def print_stuff(thing, n):
    while n > 0:
        print(thing)
        n -= 1
limit = 10
prev_end = 0  # last position already printed
for line in f:
    # if iterating over a file, separate the columns
    begin, end, text = line.strip().split()
    # if iterating over the sorted list of lines
    #begin, end, text = line
    begin, end = map(int, (begin, end))
    # don't exceed the limit
    if begin > limit:
        break
    # how many zeros? (positions strictly between the previous range and this one)
    gap = begin - prev_end - 1
    print_stuff('0', gap)
    # don't exceed the limit
    end = end if end < limit else limit
    # how many words?
    span = (end - begin) + 1
    print_stuff(text, span)
    prev_end = end
    if end == limit:
        break
# any more zeros?
gap = limit - prev_end
print_stuff('0', gap)

Finding Multiple Sequences in a File by Position after Specifically Repeating Characters

I am trying to write this code so that I can get the sequences of different samples in a file, by position, after line breaks. The output is always blank for some reason. Can you help me?
import readline
count = 0
brk = 0
with open("file.txt") as f:
    while (count < 35):
        l = f.readline()[brk + 2]
        sp = raw_input ("Starting Position:")
        sp = int(sp)
        rl = sp + 6
        print(l[sp:rl])
        print(l[-30:0])
        count = count + 1
        brk = brk + 2
print ("Done")
In the line l = f.readline()[brk + 2], the program stores a single character in the variable l. So when you print substrings of l (in the lines print(l[sp:rl]) and print(l[-30:0])), the program prints empty lines. That is the expected result.
To find this, you could add print(l) right after the assignment to l.
It seems that you are trying to read the 2nd, 4th, 6th, etc. lines of the file (counting from zero). To do that, you can do something like this:
count = 0
brk = 0
with open("file.txt") as f:
    f.readline()
    f.readline() #skip both first lines
    while (count < 35):
        l = f.readline()
        f.readline() #skip next line
        sp = raw_input ("Starting Position:")
        sp = int(sp)
        rl = sp + 6
        print(l[sp:rl])
        print(l[-30:0])
        count = count + 1
        brk = brk + 2
Also, print(l[-30:0]) will always print an empty line. It seems that you need print(l[-30:]) (the last 30 characters of the string l).
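The slicing behaviour described above can be checked on a small string. The sequence below is made up for the example (40 characters plus a newline); it only illustrates why a stop index of 0 yields nothing while an omitted stop index yields the tail.

```python
l = "ACGTACGTACGTACGTACGTACGTACGTACGTACGTACGT\n"  # made-up 40-character sequence line

sp = 4
window = l[sp:sp + 6]         # six characters starting at index sp
empty = l[-30:0]              # stop index 0 lies before the start, so nothing is selected
tail = l.rstrip("\n")[-30:]   # last 30 characters, with the newline removed first
print(repr(window), repr(empty), repr(tail))
```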

Python: read line after string is found

I have a file which contains blocks of lines that I would like to separate. Each block contains a number identifier in the block's header: "Block X" is the header line for the X-th block of lines. Like this:
Block X
#L E C A F X M N
11.2145 15 27 29.444444 7.6025229 1539742 29.419783
11.21451 13 28 24.607143 6.8247935 1596787 24.586264
...
Block Y
#L E C A F X M N
11.2145 15 27 29.444444 7.6025229 1539742 29.419783
11.21451 13 28 24.607143 6.8247935 1596787 24.586264
...
I can use "enumerate" to find the header line of the block as follows:
with open(filename,'r') as indata:
    for num, line in enumerate(indata):
        if 'Block X' in line:
            startblock=num
print(startblock)
This will yield the line number of the first line of block #X.
However, my problem is identifying the last line of the block. To do that, I could find the next occurrence of a header line (i.e., the next block) and subtract a few numbers.
My question: how can I find the line number of the next occurrence of a condition (i.e., right after a certain condition was met)?
I tried using enumerate again, this time indicating the starting value, like this:
with open(filename,'r') as indata:
    for num, line in enumerate(indata,startblock):
        if 'Block Y ' in line:
            endscan=num
            break
print(endscan)
That doesn't work, because it still begins reading the file from line 0, NOT from the line number "startblock". Instead, by starting the "enumerate" counter from a different number, the resulting value of the counter, in this case "endscan" is shifted from 0 by the amount "startblock".
Please help! How can I tell Python to disregard the lines before "startblock"?
If you want the groups using Block as the delimiter for each section, you can use itertools.groupby:
from itertools import groupby
with open('test.txt') as f:
    grp = groupby(f,key=lambda x: x.startswith("Block "))
    for k,v in grp:
        if k:
            print(list(v) + list(next(grp, ("", ""))[1]))
Output:
['Block X\n', '#L E C A F X M N \n', '11.2145 15 27 29.444444 7.6025229 1539742 29.419783\n', '11.21451 13 28 24.607143 6.8247935 1596787 24.586264\n']
['Block Y\n', '#L E C A F X M N \n', '11.2145 15 27 29.444444 7.6025229 1539742 29.419783\n', '11.21451 13 28 24.607143 6.8247935 1596787 24.586264']
If Block can appear elsewhere but you want it only when followed by a space and a single char:
import re
from itertools import groupby
with open('test.txt') as f:
    r = re.compile(r"^Block \w$")  # raw string so that \w reaches the regex engine unchanged
    grp = groupby(f, key=lambda x: r.search(x))
    for k, v in grp:
        if k:
            print(list(v) + list(next(grp, ("", ""))[1]))
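You can use the .tell() and .seek() methods of file objects to move around. Note that tell() is not reliable while looping with for line in infile (Python 3 raises an error), so read with readline() instead. So for example:
with open(filename, 'rb') as infile:  # binary mode, so offsets are plain byte counts
    start = None  # offset of the current block's header
    while True:
        pos = infile.tell()  # offset of the line about to be read
        line = infile.readline()
        if line and not line.startswith(b'Block'):
            continue
        if start is not None:
            # we finished a block: remember where we are
            end = infile.tell()
            infile.seek(start)
            # print all the bytes in the block
            print(infile.read(pos - start).decode())
            # now go back to where we were so we iterate correctly
            infile.seek(end)
        if not line:
            break
        # mark the start of the next block
        start = pos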
If the difference between the header lines is uniform throughout the file, just use the distance to increase the indexing variable accordingly.
import numpy as np
file1 = open('file_name','r')
lines = file1.readlines()
file1.close()
numlines = len(lines)
i=0
for line in lines:
    # strip the trailing newline before comparing
    if line.strip() == 'specific header 1':
        line_num1 = i
    if line.strip() == 'specific header 2':
        line_num2 = i
    i+=1
diff = line_num2 - line_num1
Now that we know the difference between the line numbers we use for loops to acquire the data.
k=0
# dtype=object lets the array hold the text of each line
array = np.zeros([numlines, diff], dtype=object)
for i in range(numlines):
    # assumes the first header is the first line of the file
    if k % diff == 0:
        for j in range(diff):
            if i + j < numlines:
                array[i][j] = lines[i+j]
    k+=1
% is the mod operator; k % diff is 0 only when k is a multiple of the difference in line numbers between the two header lines, which happens only when the line is a header line. Once that line is fixed, the second for loop fills the array, so that we end up with a matrix of numlines rows and diff columns. The nonzero rows contain the data between the header lines.
I have not tried this out; I am just writing off the top of my head. Hopefully it helps!
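Another way to answer the original question (ignore everything before the wanted header, stop at the next one) is a single pass with a flag, which needs no line numbers at all. The sketch below is an illustration under assumptions: the function name extract_block is made up, headers are taken to be lines starting with "Block ", and the sample data is shortened from the question.

```python
def extract_block(lines, header):
    """Return the lines of the block whose header line equals `header`."""
    block = []
    inside = False
    for line in lines:
        stripped = line.rstrip("\n")
        if stripped.startswith("Block "):
            if inside:
                break  # reached the next header: the block is complete
            inside = (stripped == header)
        if inside:
            block.append(stripped)
    return block

# Shortened stand-in for the file; an open file object works the same way
sample = [
    "Block X\n", "#L E C A F X M N\n", "11.2145 15 27\n",
    "Block Y\n", "#L E C A F X M N\n", "11.21451 13 28\n",
]
print(extract_block(sample, "Block X"))
```

Because the loop breaks at the next header, the rest of the file is never read once the wanted block is complete.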

Python txt files, average, information

I have a .txt file with this (the names should be random, though):
My Name 4 8 7
Your Name 5 8 7
You U 5 9 7
My My 4 8 5
Y Y 8 7 9
I need to write the names plus the average of the numbers to a text file, results.txt. How do I do that?
with open(r'stuff.txt') as f:
    mylist = list(f)
    i = 0
    sk = len(mylist)
    while i < sk - 4:
        print(mylist[i], mylist[i+1], mylist[i+2], mylist[i+3])
        i = i + 3
First, open both the input and output files:
with open("stuff.txt") as in_file:
    with open("results.txt", "w") as out_file:
Since the problem only needs to work on each line independently, a simple loop over the lines suffices:
        for line in in_file:
Split each line at whitespace into a list of strings (row):
            row = line.split()
The numbers occur after the first two fields:
            str_nums = row[2:]
However, these are still strings, so they must be converted to floating-point numbers to allow arithmetic to be performed on them. This results in a sequence of floats (nums):
            nums = map(float, str_nums)
Now calculate the average:
            avg = sum(nums) / len(str_nums)
Finally, write the names and the average into the output file:
            out_file.write("{} {} {}\n".format(row[0], row[1], avg))
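Put together, the steps above can be sketched as a small generator that works on any iterable of lines. The function name average_lines and the in-memory sample are assumptions for illustration; the skip of short lines is an addition that guards against blank lines.

```python
def average_lines(lines):
    """Yield 'First Last average' for each 'First Last n1 n2 ...' line."""
    for line in lines:
        row = line.split()
        if len(row) < 3:
            continue  # skip blank or incomplete lines
        nums = [float(x) for x in row[2:]]
        avg = sum(nums) / len(nums)
        yield "{} {} {}\n".format(row[0], row[1], avg)

# Two lines taken from the question's sample data
sample = ["My Name 4 8 7\n", "Y Y 8 7 9\n"]
results = list(average_lines(sample))
print("".join(results), end="")
```

With real files, the same generator can be passed to out_file.writelines(average_lines(in_file)) inside the two with blocks shown above.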
What about this?
with open(fname) as f:
    lines = f.readlines()
new_lines = []
for each in lines:
    col = each.split()
    if not col:
        continue  # skip blank lines
    # average of the last three columns
    average = (int(col[-1]) + int(col[-2]) + int(col[-3])) / 3.0
    new_lines.append(col[0] + ' ' + col[1] + ' ' + str(average) + '\n')
# the input file was opened read-only, so write the new lines to results.txt
with open('results.txt', 'w') as out:
    for each in new_lines:
        out.write(each)
I tried, and this worked:
inputtxt=open("stuff.txt", "r")
outputtxt=open("output.txt", "w")
output=""
for i in inputtxt.readlines():
    nums=[]
    name=""
    for k in i.split():  # look at whole tokens, so multi-digit numbers work too
        try:
            nums.append(int(k))
        except ValueError:  # not a number, so it is part of the name
            name+=k+" "
    if not nums:
        continue  # skip lines without numbers
    avrg=0
    for j in nums:
        avrg+=j
    avrg/=float(len(nums))
    line=name+str(avrg)+"\n"
    output+=line
outputtxt.write(output)
inputtxt.close()
outputtxt.close()
