My function doesn't work as it is supposed to. I keep getting 'True' when all line[0] are less than line[2]. I know this is pretty trivial, but it's an exercise I've taken to better understand files and for
def contains_greater_than(filename):
    """
    (str) --> bool
    The text file of which <filename> is the name contains multiple lines.
    Each line consists of two integer numbers, separated by a space.
    This returns True iff in at least one of those lines, the first number
    is larger than the second one.
    """
    lines = open(filename).readlines()
    for line in lines:
        if line[0] > line[2]:
            return True
    return False
my data:
3 6
3 7
3 8
2 9
3 20
Having been thoroughly schooled in my over-thought previous answer, may I offer this far simpler solution which still short-circuits as intended:
for line in lines:
    x, y = line.split()
    if int(x) > int(y): return True
return False
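line[0] is "3", line[1] is " ", and line[2] is only the first character of the second number.
For "3 20" that character is "2", so the comparison is '3' > '2', which is True.
You need to do
split_line = line.split()
then
numbers = [int(x) for x in split_line]
and then look at numbers[0] and numbers[1].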
1) You are comparing strings that you need to convert to integers
2) You will only grab the first and third character (so, you won't get the 0 in 20)
Instead use
first, second = line.split()
if int(first) > int(second):
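To make both points concrete, here is a minimal sketch that contrasts the character comparison with the integer comparison on the last line of the sample data. The helper names `char_compare` and `int_compare` are made up for this illustration and are not part of the original code.

```python
def char_compare(line):
    # What the original code does: compares two single characters.
    return line[0] > line[2]


def int_compare(line):
    # Split on whitespace, convert both fields, then compare the numbers.
    first, second = line.split()
    return int(first) > int(second)


line = "3 20\n"
print(char_compare(line))  # True, because '3' > '2'
print(int_compare(line))   # False, because 3 < 20
```

The character version only ever sees the "2" of "20", which is why the function returned True for this data.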
Here's a whole-hog functional rewrite. Hope this is enlightening ;-)
import functools

def line_iter(fname):
    with open(fname) as inf:
        for line in inf:
            line = line.strip()
            if line:
                yield line

def any_line(fn, fname):
    return any(fn(line) for line in line_iter(fname))

def is_greater_than(line):
    # split first, otherwise int() is applied to single characters
    a,b = [int(i) for i in line.split()]
    return a > b

contains_greater_than = functools.partial(any_line, is_greater_than)
"3 20" is a string, so just do map(int, LINE.split()) before comparing.
But how do you want to compare 2 numbers with 2 numbers?
The main problem is that you are comparing characters of the line, not the values of the two numbers on it. This can be avoided by first splitting the line into whitespace-separated words, and then turning each of those into an integer for the comparison by applying the int() function:
def contains_greater_than(filename):
    with open(filename) as inf:
        for line in inf:
            a, b = map(int, line.split())
            if a > b:
                return True
    return False

print(contains_greater_than('comparison_data.txt'))
This can all be done very succinctly in Python using the built-in any() function with a couple of generator expressions:
def contains_greater_than(filename):
    with open(filename) as inf:
        return any(a > b for a, b in (map(int, line.split()) for line in inf))
I am trying to do a calculation based on the content of a line, but only if another line in the same document satisfies specific criteria. The order of the lines is not consistent.
A file might look like this:
Line A: 200
Line B: 200
Line C: 5
So an example condition would be, if Line C is 6 or greater, add the value from Line A "200" to a counter.
I have tried a variety of if statements, and also tried setting a BOOL. I haven't been able to get either to work. An excerpt of my latest attempt follows:
counter = 0
good = True
for line in text:
    line = line.strip()
    if line.startswith('Line C') :
        rtime = re.findall('[0-9]+:[0-9]+', line)
        for t in rtime:
            if t < 6 :
                good = False
                print("-----To Small. Ignore Line A")
                break
            else :
                good = True
    while good == True :
        if line.startswith('Line A') :
            numstring = re.findall('[0-9]+', line)
            for num in numstring:
                temp = float(num)
                counter = counter + temp
        else : continue
    print("----- good must be False. Should be ignoring Line A")
First, read all the rows from the file into a dictionary so that you have:
{'Line A':200, 'Line B':200, 'Line C':5}
After this it is easy to apply the criteria with conditionals like "if values['Line C'] >= 6:" etc.
I am leaving the implementation to you because it sounds a bit homework-y. Let me know if you need more help!
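For readers who want a starting point, here is a minimal sketch of the dictionary idea. It assumes every non-empty line has the shape "label: integer", as in the sample file; the function name `parse_lines` and the variable `values` are hypothetical, and the rule applied is the example condition from the question (count Line A only if Line C is 6 or greater).

```python
def parse_lines(lines):
    """Build {'Line A': 200, ...} from lines shaped like 'Line A: 200'."""
    values = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # partition splits on the first ':' only
        label, _, number = line.partition(':')
        values[label.strip()] = int(number)
    return values


sample = ["Line A: 200", "Line B: 200", "Line C: 5"]
values = parse_lines(sample)

counter = 0
# Assumed rule: add Line A to the counter only if Line C is 6 or greater.
if values.get('Line C', 0) >= 6:
    counter += values['Line A']
print(counter)  # 0, since Line C is 5 here
```

Because the whole file is read before any condition is checked, the order of the lines no longer matters. With a real file, `parse_lines` can be given the open file object instead of a list.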
Maybe you can use a dictionary if the lines aren't too long. A simple way would just add the lines to a dictionary and then check your condition.
import re

allDataLines = []
allQualifierLines = []
dataFileName = 'testdata.txt'

def chckForQualifierLine(line):
    # lines containing qualifier
    if not line.startswith('Line C'):
        return False
    # do more checks here, if not good just return False
    allQualifierLines.append(line)
    return True

def chckForDataLine(line):
    # lines containing data
    if not line.startswith('Line A'):
        return False
    # do more checks here, if not good just return False
    allDataLines.append(line)
    return True

with open(dataFileName, 'r') as text:
    # Further file processing goes here
    line = text.readline()
    while line != '':
        #print(line, end='')
        if not chckForQualifierLine(line):
            chckForDataLine(line)
        line = text.readline()

for qualifierLine in allQualifierLines:
    # this line is valid qualifier
    print(f"Q: {qualifierLine}")
for dataLine in allDataLines:
    # do with data line(s) what ever is to do here
    print(f"D: {dataLine}")
So I have this problem I need to solve. I have a file that goes something like:
11-08-2012;1485;10184;7,53;31;706;227;29;6;1102
12-08-2012;2531;10272;7,59;25;695;222;26;22;1234
13-08-2012;1800;13418;8,66;46;714;203;50;6;0757
14-08-2012;2009;11237;9,43;81;655;246;49;7;1783
And I should be able to read the "1485" and then the "2531" part and then the "1800" part and go all the way to the end of the file and finally sum them up. How do I do that? I wrote under this text how I tried to approach this problem with while. But I seem to be lost with this one. Anyone can help?
while True:
    f.seek(12)
    text=f.read(4)
    text=f.readline()
    if(text==""):
        break
return text
There are a number of ways to do this: with numpy, pandas, simple coroutines and so on. I am adding the one closest to your approach.
total = 0
with open('exmplefile.txt','r') as f:
    for line in f:
        elements = line.split(';')
        num_of_interest = int(elements[1])
        # you can add a print if you want
        total += num_of_interest
print(total)
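As one more variation on the same idea, the standard-library `csv` module can do the splitting when it is told that the delimiter is a semicolon. This is only a sketch: the function name `sum_second_column` is made up, and `io.StringIO` stands in for the real file so the example runs on its own.

```python
import csv
import io


def sum_second_column(f):
    """Sum the second ';'-separated field of every row in an open file."""
    reader = csv.reader(f, delimiter=';')
    # "if row" skips blank lines, which csv.reader yields as empty lists
    return sum(int(row[1]) for row in reader if row)


sample = io.StringIO(
    "11-08-2012;1485;10184;7,53;31;706;227;29;6;1102\n"
    "12-08-2012;2531;10272;7,59;25;695;222;26;22;1234\n"
    "13-08-2012;1800;13418;8,66;46;714;203;50;6;0757\n"
    "14-08-2012;2009;11237;9,43;81;655;246;49;7;1783\n"
)
print(sum_second_column(sample))  # 7825
```

With a real file, pass the object returned by `open(...)` instead of the `StringIO` stand-in.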
This solution finds the positions of the first and second occurrences of the separator, in this case ;, and converts the text between them.
with open(filename,'r') as file:
    file_list = file.readlines()
total = 0  # avoid shadowing the built-in sum()
for line in file_list:
    loc = line.find(';')
    first_loc = loc + 1
    last_loc = loc +line[loc+1:].find(';')+1
    total = total + int(line[first_loc:last_loc])
print(total)
Try this
mylist = []
for string in file:
    mynum = string.split(';')[1]
    mylist.append(mynum)
sum([int(i) for i in mylist])
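This solution caters for the case where the 4-digit number is not the second item on the line. Note that it adds every 4-digit numeric field, so with the sample data above the last column (1102, 1234, ...) is counted as well.
with open("path/to/file") as f:
    f1 = f.readlines()
sum = 0
for line in f1:
    lineInArray = line.split(';')
    for digit in lineInArray:
        if len(digit.strip()) == 4 and digit.strip().isnumeric():
            sum += int(digit)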
I just started learning Python a few weeks ago and I want to write a function which opens a file, counts and adds up the characters in each line and prints that those equal the total number of characters in the file.
For example, given a file test1.txt:
lineLengths('test1.txt')
The output should be:
15+20+23+24+0=82 (+0 optional)
This is what I have so far:
def lineLengths(filename):
    f=open(filename)
    lines=f.readlines()
    f.close()
    answer=[]
    for aline in lines:
        count=len(aline)
It does what I want it to do, but I don't know how to include all of the numbers added together when I have the function print.
If you only want to print the sum of the length of each line, you can do it like so:
def lineLengths(filename):
    with open(filename) as f:
        answer = []
        for aline in f:
            answer.append(len(aline))
    print("%s = %s" % ("+".join(str(c) for c in answer), sum(answer)))
If you however also need to track lengths of all the individual lines, you can append the length for each line in your answer list by using the append method and then print the sum by using sum(answer)
Try this :
f=open(filename)
mylist = f.read().splitlines()
sum([len(i) for i in mylist])
Simple as this:
sum(map(len, open(filename)))
open(filename) returns an iterator that passes through each line, each of which is run through the len function, and the results are summed.
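A small sketch of the same idea, with the caveat that each line yielded by a file still carries its trailing newline, so the newline is counted too. The function name `total_length` is made up, and `io.StringIO` stands in for a real file so the example is self-contained.

```python
import io


def total_length(lines):
    # len is applied to each line lazily; sum adds up the results.
    return sum(map(len, lines))


fake_file = io.StringIO("abc\nde\n")
print(total_length(fake_file))  # 7: "abc\n" is 4 characters, "de\n" is 3
```

With a real file, `with open(filename) as f: print(total_length(f))` does the same thing and also closes the file when the block ends.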
Once you read lines from file you can count sum using:
sum([len(aline) for aline in lines])
Separate your problem into functions: one responsible for returning the total length of the lines, and another for formatting the length of each line.
def read_file(path):
    with open(path) as f:
        lines = f.readlines()
    return lines

def format_line_sum(lines):
    lengths_as_str = []
    for line in lines:
        lengths_as_str.append(str(len(line)))
    return "+".join(lengths_as_str)

def lines_length(path):
    lines = read_file(path)
    total_sum = 0
    for line in lines:
        total_sum += len(line)
    # total_sum is an int, so convert it before concatenating
    return format_line_sum(lines) + "=" + str(total_sum)

And to use:
print(lines_length('file1.txt'))
Assuming your output is literal, something like this should work.
You can use python sum() function when you figure out how to add numbers to the list
def lineLengths(filename):
    with open(filename) as f:
        line_lengths = [len(l.rstrip()) for l in f]
    summ = '+'.join(map(str, line_lengths)) # can only join strings
    return sum(line_lengths), summ

total_chars, summ = lineLengths(filename)
print("{} = {}".format(summ, total_chars))
This should have the output you want : x+y+z=a
def lineLengths(filename):
    count=[]
    with open(filename) as f: #this is an easier way to open/close a file
        for line in f:
            count.append(len(line))
    print('+'.join(str(x) for x in count) + "=" + str(sum(count)))
Suppose I have a text file containing this, where the number on the left says how many of the characters of the right should be there:
2 a
1 *
3 $
How would I get this output in the fastest time?
aa*$$$
This is my code, but has N^2 complexity:
f = open('a.txt')
line = ''  # start with an empty result string
for item in f:
    item2=item.split()
    num = int(item2[0])
    for i in range(num):
        line+=item2[1]
print(line)
f.close()
KISS
with open('file.txt') as f:
    for line in f:
        count, char = line.strip().split(' ')
        print char * int(count),
Just print immediately:
for item in open('a.txt'):
    num, char = item.strip().split()
    print(int(num) * char, end='')
print() # Newline
You can multiply strings to repeat them in Python:
"foo" * 3 gives you foofoofoo.
parts = []  # separate name, so the loop variable does not overwrite the list
with open("a.txt") as f:
    for line in f:
        n, c = line.rstrip().split(" ")
        parts.append(c * int(n))
print("".join(parts))
You can print directly but the code above lets you get the output you want in a string if you care about that.
Using a list then joining is more efficient than using += on a string because strings are immutable in Python. This means that a new string must be created for each +=. Of course printing immediately avoids this issue.
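The list-then-join pattern described above can be sketched as a small function. The name `expand` is hypothetical, and the input is taken as any iterable of lines (an open file works too) so that the example runs without a file on disk.

```python
def expand(lines):
    """Turn lines like '2 a' into 'aa' and join all pieces once at the end."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        count, char = line.split()
        parts.append(char * int(count))
    # One join at the end instead of a new string for every +=
    return "".join(parts)


print(expand(["2 a\n", "1 *\n", "3 $\n"]))  # aa*$$$
```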
You can try it like this:
f = open('a.txt')
print ''.join(int(item.split()[0]) * item.split()[1] for item in f.readlines())
Your code is actually O(sum(n_i)) where n_i is the number in the row i. You can't do any better and none of the solutions in the other answers do, even if they might be faster than yours.
I would like to take a large file like this in Python 2.7:
123 456 GTHGGGTH
223 567 FGRTHSYS
12933 4656832 GJWSOOOSKKSSJ
.....
and I want to read in the file line by line, disregard the third element, and subtract the first element in each line from the second element. Thus line 1 above would return 333.
I have tried this so far:
def deleteLast(list):
    NewL = list.pop()
    return NewL

f = open(file_name, 'r')
line = f.readline()
while line:
    L = line.split()
    L2 = deleteLast(L)
    L3 = [int(number) for number in L2]
    Length = L3[1]-L3[0]
    print Length
f.close()
But, when I try this the compiler says:
ValueError: invalid literal for int() with base 10: 'T'
All help is appreciated.
That is because list.pop() is returning the "popped off" item, it doesn't return the list again.
Instead of this deleteLast function you have written, it would be better just to use a slice like this:
L2 = line.split()[0:2]
You are going to run into another problem later because your while loop isn't advancing at all. Consider using a for loop instead.
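The points above can be checked with a short sketch. It is written for Python 3 (in 2.7 the `print(...)` calls would be print statements), and the generator name `differences` is made up for the illustration.

```python
fields = "123 456 GTHGGGTH".split()

# pop() returns the removed item, not the shortened list.
popped = list(fields).pop()
print(popped)  # GTHGGGTH

# A slice keeps the first two fields and leaves the original list alone.
start, end = map(int, fields[:2])
print(end - start)  # 333


def differences(lines):
    # A for loop advances through the lines by itself, unlike the while loop.
    for line in lines:
        start, end = map(int, line.split()[:2])
        yield end - start


print(list(differences(["123 456 GTHGGGTH", "223 567 FGRTHSYS"])))  # [333, 344]
```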
You can try something like this :
In [8]: with open("abc") as f: #always use with statement when handling files
   ...:     for line in f:
   ...:         x,y=map(int,line.split()[:2])
   ...:         print y-x
   ...:
333
344
4643899
Try the following:
with open(file_name, 'r') as f:
    for line in f.readlines():
        rowData = line.split()
        left, right = map(int, rowData[:2])
        length = right - left
        print length
Or:
from operator import sub
with open(file_name, 'r') as f:
    for line in f.readlines():
        print sub(*map(int, line.split()[:2])[::-1])

f = open(file_name, 'r')
for line in f.readlines():
    x, y = line.split(' ')[:2]
    print int(y) - int(x)