getting data out of a txt file


I'm only just beginning my journey into Python. I want to build a little program that will calculate shim sizes for when I do the valve clearances on my motorbike. I will have a file containing the target clearances, and I will ask the user to enter the current shim sizes and the current clearances. The program will then spit out the target shim size. It looks simple enough; I have built a spreadsheet that does it, but I want to learn Python, and this seems like a simple enough project...
Anyway, so far I have this:
def print_target_exhaust(f):
    print f.read()
#current_file = open("clearances.txt")
print print_target_exhaust(open("clearances.txt"))
Now, I've got it reading the whole file, but how do I make it ONLY get the value on, for example, line 4. I've tried print f.readline(4) in the function, but that seems to just spit out the first four characters... What am I doing wrong?
I'm brand new, please be easy on me!
-d

To read all the lines:
    lines = f.readlines()
Then, to print the line at index 4:
    print lines[4]
Note that indices in Python start at 0, so that is actually the fifth line in the file; use lines[3] for the fourth line.
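
A minimal sketch of fetching a single line without loading the whole file, assuming Python 3. The helper name read_line and its 1-based line_number parameter are hypothetical and not from the thread:

```python
from itertools import islice


def read_line(path, line_number):
    """Return line `line_number` (1-based) of the file, or None if the file is shorter."""
    with open(path) as f:
        # islice skips the first line_number - 1 lines without storing them
        return next(islice(f, line_number - 1, line_number), None)
```

Calling read_line("clearances.txt", 4) would return the fourth line including its trailing newline, so .strip() is usually wanted before converting it to a number.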

with open('myfile') as myfile:  # Use a with statement so you don't have to remember to close the file
    for line_number, data in enumerate(myfile):  # Use enumerate to get line numbers starting with 0
        if line_number == 3:
            print(data)
            break  # stop looping when you've found the line you want
More information:
with statement
enumerate

Not very efficient, but it should show you how it works. It keeps a running counter of the lines it has read, and when the counter reaches 4 it prints that line.
## Open the file read-only
f = open("clearances.txt", "r")
counter = 0
## Read the first line
line = f.readline()
## Keep reading one line at a time until readline() returns
## an empty string, which marks the end of the file
while line:
    counter = counter + 1
    if counter == 4:
        print line
    line = f.readline()
f.close()

Related

Reading CSV file with python

import os
import time
filename = 'NTS.csv'
mycsv = open(filename, 'r')
mycsv.seek(0, os.SEEK_END)
while 1:
    time.sleep(1)
    where = mycsv.tell()
    line = mycsv.readline()
    if not line:
        mycsv.seek(where)
    else:
        arr_line = line.split(',')
        var3 = arr_line[3]
        print (var3)
I have this Python code, which reads values from a CSV file every time an external program writes a new line to it. My problem is that the CSV file is periodically rewritten completely, and then Python stops reading the new lines. My guess is that Python is stuck at some line number, and the rewritten file can have maybe 50 lines more or fewer. So, for example, Python is now waiting for a new line at line 70 and the new line arrives at line 95. I think the solution is to have mycsv.seek(0, os.SEEK_END) updated, but I'm not sure how to do that.

What you want to do is difficult to accomplish without rewinding the file every time to make sure that you are truly on the last line. If you know approximately how many characters there are on each line, then there is a shortcut you could take using mycsv.seek(-end_buf, os.SEEK_END), as outlined in this answer. So your code could work something like this:
import os
import time
avg_len = 50 # use an appropriate number here
end_buf = 3 * avg_len // 2 # integer division: seek() needs an int offset
filename = 'NTS.csv'
# binary mode, because Python 3 text files cannot seek relative to the end
mycsv = open(filename, 'rb')
mycsv.seek(-end_buf, os.SEEK_END)
last = mycsv.readlines()[-1]
while 1:
    time.sleep(1)
    mycsv.seek(-end_buf, os.SEEK_END)
    line = mycsv.readlines()[-1]
    if not line == last:
        last = line # remember it so the same line is not printed again
        arr_line = line.decode().split(',')
        var3 = arr_line[3]
        print (var3)
Here, in each iteration of the while loop, you seek to a position close to the end of the file, just far back enough that you know for sure the last line will be contained in what remains. Then you read in all the remaining lines (this will probably include a partial amount of the second or third to last lines) and check if the last line of these is different to what you had before.
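
A different way to cope with the rewrite, sketched under stated assumptions: remember the byte position after the last complete line, and restart from the beginning whenever the file has become shorter than that position. The helper name read_new_lines is hypothetical, the file is assumed to be opened in binary mode ('rb'), and the external program is assumed to rewrite the file in place rather than replace it with a new file. A rewrite that leaves the file longer than the remembered position would not be detected.

```python
import os


def read_new_lines(f, position):
    """Return (lines, new_position) for text added to f since `position`.

    If the file is now shorter than `position`, it is assumed to have been
    rewritten, and reading restarts from the beginning.
    """
    size = os.fstat(f.fileno()).st_size
    if size < position:
        position = 0  # file was truncated or rewritten in place
    f.seek(position)
    data = f.read()
    # keep only complete lines; a partial last line is left for the next call
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode().splitlines()
    return lines, position + end
```

In the polling loop, this would be called once per time.sleep(1) iteration, feeding the returned position back in and splitting each returned line on ',' as before.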

There is a simpler way of reading the lines in your program. Instead of using seek to get what you need, try calling readlines on the file object mycsv.
You can do the following:
mycsv = open('NTS.csv', 'r')
csv_lines = mycsv.readlines()
for line in csv_lines:
    arr_line = line.split(',')
    var3 = arr_line[3]
    print(var3)

Python Performing calculation on every x line and place it back into original file

I have a list that looks like this
url1
number1
number2
number3
url2
number1
number2
number3
url3
etc
I want to perform calculations on number2 and replace it with the new value at its original position in the text file. Here is what I have so far:
import itertools
#with open('math.txt', 'r') as f:
#    fourthlines = itertools.islice(f, 2, None, 4)
#    for line in fourthlines:
#        line=float(line)
#        #print line
#        line=((line-100)/2)
#        print line
Issue: this returns the value I want, but I want to place it back into math.txt where it came from (number2's position).

First, writing to another file:
with open("test.txt","r") as rp, open("write.txt","w") as wp:
    for line_count, line in enumerate(rp,-2):
        if line_count % 4 == 0:
            print(line)
            #do something to line
            wp.write('found the line number 2\n')
        else:
            wp.write(line)
Second, using a temp file to modify the original one; a temp file is the safest way of modifying a file:
import tempfile
temp_file = tempfile.NamedTemporaryFile(mode="r+")
with open("test.txt","r") as rp:
    for line_count, line in enumerate(rp,-2):
        if line_count % 4 == 0:
            print(line)
            #do something to line
            line = 'found the line number 2'
            temp_file.write(line + '\n')
        else:
            temp_file.write(line)
temp_file.seek(0) #reset the tempfile's cursor back to the start
with open("test.txt","w") as wp:
    for line in temp_file:
        wp.write(line)
In either case, just change "test.txt" to your text file name, and where I put #do something to line, do whatever you need to do, making sure line is a str at the end.

There is a read/write mode for python file objects.
fo = open("foo.txt",'r+')
However, THIS DOES NOT INSERT NEW TEXT INTO A FILE; instead, it writes over what is already there. For example:
******original text file*****
hello world!
*****************************
fo = open("foo.txt",'r+')
fo.write('spam')
fo.close()
***********result************
spamo world!
*****************************
I would not use this on a plain text file unless I knew 100% that the data I was writing was the same size as the data that I was replacing. Even then, I would hesitate. If you are set on storing your information as plain text then I think you would be better off writing to a separate file.
On the other hand, if you change your information so it is stored in binary form and the original value and the new value have the same byte size then I think that this would be a good option for you to consider.
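
A minimal sketch of the "write to a separate file" route, assuming Python 3. The helper name rewrite_file and its transform(index, line) callback are hypothetical. It writes every line to a temporary file in the same directory and then swaps that file in with os.replace, so the replacement text may be longer or shorter than the original:

```python
import os
import tempfile


def rewrite_file(path, transform):
    """Write transform(index, line) for every line to a temp file, then swap it in."""
    directory = os.path.dirname(os.path.abspath(path))
    with open(path) as src, tempfile.NamedTemporaryFile(
        "w", dir=directory, delete=False
    ) as dst:
        for index, line in enumerate(src):
            dst.write(transform(index, line))
    # both files are closed here; replace the original with the new copy
    os.replace(dst.name, path)
```

For the foo.txt example above, rewrite_file("foo.txt", lambda i, line: "spam\n" if i == 0 else line) would replace the whole first line with "spam" instead of overwriting its first four characters.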

My solution does the calculation and writes the result at the same time, without using a temp file.
The solution has two parts:
First, it collects the lines from the file on which the operation needs to be done.
Second, it reads the file again, performs the operation on the lines collected above, and writes the new values as it goes, i.e. it replaces them in place.
Module: fileinput
This module lets you read from and write to a file at the same time.
Check the fileinput documentation for more info.
Content of math.txt before :
url
1
2
3
url
4
5
6
url
Code:
import fileinput
lines_to_check = []
#First - collect the lines on which the operation is required
with open("math.txt","r") as fh:
    for l_count, line in enumerate(fh,-2):
        if l_count % 4 == 0:
            lines_to_check.append(line.strip())
#Second - calculate and write
#As the calculation, I am just multiplying the value by 100
#Note: lines are matched by value, so any line equal to a collected one is changed too
for line in fileinput.input('math.txt',inplace=1):
    if line.strip() in lines_to_check:
        line=float(line)
        line= line *100
        print(str(line).strip())
    else:
        print(line.strip())
Content of math.txt after executing the code:
url
1
200.0
3
url
4
500.0
6
url

Here is a safe way to do it, which is the approach I'd take, as I am a beginner.
Open the file and read its lines into a list (the file object's readlines() method).
Look up the element's index in the list (lines.index(element) method).
Replace the element in the list (lines[index] = new_element).
Open the file again and [over]write all the lines from your list back into it.
If the file is not extremely long, then this method should be sufficient.
I hope this helps you; if not, then I am sure it will benefit someone even newer than me. (;
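
The steps above can be sketched as follows, assuming Python 3 and the four-line url/number1/number2/number3 layout from the question. The helper name update_every_fourth is hypothetical, and it picks lines by position rather than with lines.index(), so duplicate values are not a problem:

```python
def update_every_fourth(path, offset, func):
    """Apply func to the number on every 4th line, starting at 0-based line `offset`."""
    with open(path) as f:
        lines = f.readlines()
    for i in range(offset, len(lines), 4):
        # replace the list element in place, keeping one value per line
        lines[i] = "{}\n".format(func(float(lines[i])))
    with open(path, "w") as f:
        f.writelines(lines)
```

For the calculation in the question, the call would be update_every_fourth("math.txt", 2, lambda x: (x - 100) / 2), since number2 sits at index 2 within each group of four lines.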

Python 3.4.3: Iterating over each line and each character in each line in a text file

I have to write a program that iterates over each line in a text file and then over each character in each line in order to count the number of entries in each line.
Here is a segment of the text file:
N00000031,B,,D,D,C,B,D,A,A,C,D,C,A,B,A,C,B,C,A,C,C,A,B,D,D,D,B,A,B,A,C,B,,,C,A,A,B,D,D
N00000032,B,A,D,D,C,B,D,A,C,C,D,,A,A,A,C,B,D,A,C,,A,B,D,D
N00000033,B,A,D,D,C,,D,A,C,B,D,B,A,B,C,C,C,D,A,C,A,,B,D,D
N00000034,B,,D,,C,B,A,A,C,C,D,B,A,,A,C,B,A,B,C,A,,B,D,D
The first and last lines are "unusable lines" because they contain the wrong number of entries (more or fewer than 25). I would like to count the number of unusable lines in the file.
Here is my code:
for line in file:
    answers=line.split(",")
    i=0
    for i in answers:
        i+=1
unusable_line=0
for line in file:
    if i!=26:
        unusable_line+=1
print("Unusable lines in the file:", unusable_line)
I tried using this method as well:
alldata=file.read()
for line in file:
    student=alldata.split("\n")
    answer=student.split(",")
My problem is each variable I create doesn't exist when I try to run the program. I get a "students" is not defined error.
I know my coding is awful but I'm a beginner. Sorry!!! Thank you and any help at all is appreciated!!!

A simplified version of your method, using a list, a count, and an if condition.
Code:
unusable_line = 0
for line in file:
    answers = line.strip().split(",")
    if len(answers) != 26:
        unusable_line += 1
print("Unusable lines in the file:", unusable_line)
Notes:
First I create a variable, unusable_line, to store the count of unusable lines.
Then I iterate over the lines of the file object.
Then I split each line at , to create a list.
Then I check whether the length of the list differs from 26 (the ID plus 25 answers). If so, I increment the unusable_line variable.
Finally I print it.

You could use something like this and wrap it in a function. You don't need to iterate over the items in the line again: str.split() returns a list containing your elements, and you can count them with len().
my_file = open('temp.txt', 'r')
lines_count = usable = unusable = 0
for line in my_file:
    lines_count+=1
    if len(line.split(',')) == 26:
        usable+=1
    else:
        unusable+=1
my_file.close()
print("Processed %d lines, %d usable and %d unusable" % (lines_count, usable, unusable))

You can do it much shorter:
with open('my_file.txt') as fobj:
    unusable = sum(1 for line in fobj if len(line.split(',')) != 26)
The line with open('my_file.txt') as fobj: opens the file for reading and closes it automatically after leaving the indented block. I use a generator expression to go through all lines and count those whose number of entries differs from 26.
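
Wrapped in a function, the same count might look like the sketch below, assuming Python 3. The name count_unusable and the expected_fields parameter are hypothetical; the default of 26 assumes each usable line holds the ID plus 25 answers, and blank lines are skipped rather than counted, which is a design choice the thread does not specify:

```python
def count_unusable(path, expected_fields=26):
    """Count non-empty lines whose comma-separated field count differs from expected_fields."""
    with open(path) as f:
        return sum(
            1
            for line in f
            # skip blank lines, then compare the number of fields
            if line.strip() and len(line.strip().split(",")) != expected_fields
        )
```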

Include surrounding lines of text file match in output using Python 2.7.3

I've been working on a program which assists in log analysis. It finds error or fail messages using regex and prints them to a new .txt file. However, it would be much more beneficial if the program included the 4 lines above and below each match. I can't figure out how to do this! Here is a part of the existing program:
def error_finder(filepath):
    source = open(filepath, "r").readlines()
    error_logs = set()
    my_data = []
    for line in source:
        line = line.strip()
        if re.search(exp, line):
            error_logs.add(line)
I'm assuming something needs to be added to the very last line, but I've been working on this for a bit and either am not applying myself fully or just can't figure it out.
Any advice or help on this is appreciated.
Thank you!

Why python?
grep -C4 '^your_regex$' logfile > outfile.txt

Some comments:
I'm not sure why error_logs is a set instead of a list.
Using readlines() will read the entire file in memory, which will be inefficient for large files. You should be able to just iterate over the file a line at a time.
exp (which you're using for re.search) isn't defined anywhere, but I assume that's elsewhere in your code.
Anyway, here's complete code that should do what you want without reading the whole file in memory. It will also preserve the order of input lines.
import re
from collections import deque
exp = r'\d'
# matches digits, change to what you need
def error_finder(filepath, context_lines = 4):
    error_logs = []
    # holds up to context_lines of the most recent non-matching lines
    buffer = deque(maxlen=context_lines)
    lines_after = 0
    with open(filepath, 'r') as source:
        for line in source:
            line = line.strip()
            if re.search(exp, line):
                # add previous lines first
                for prev_line in buffer:
                    error_logs.append(prev_line)
                # clear the buffer
                buffer.clear()
                # add current line
                error_logs.append(line)
                # schedule lines that follow to be added too
                lines_after = context_lines
            elif lines_after > 0:
                # a line that matched the regex came not so long ago
                lines_after -= 1
                error_logs.append(line)
            else:
                buffer.append(line)
    # maybe do something with error_logs? I'll just return it
    return error_logs

I suggest using an index loop instead of a for-each loop. Try this:
error_logs = list()
for i in range(len(source)):
    line = source[i].strip()
    if re.search(exp, line):
        error_logs.append((line,i-4,i+4))
In this case your error log will contain tuples of ('line of error', line index - 4, line index + 4), so you can get those lines later from "source". Note that i-4 can be negative near the start of the file, so clamp it to 0 before slicing.
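
A minimal sketch of the index-based approach that returns the surrounding lines directly, assuming Python 3 and a list of lines already in memory. The helper name matches_with_context and its parameters are hypothetical:

```python
import re


def matches_with_context(lines, pattern, context=4):
    """Return a list of (match_index, surrounding_lines) for each matching line."""
    results = []
    for i, line in enumerate(lines):
        if re.search(pattern, line):
            start = max(0, i - context)  # clamp so a negative index never wraps around
            stop = min(len(lines), i + context + 1)
            results.append((i, lines[start:stop]))
    return results
```

When two matches are close together their context windows overlap, so the same line can appear in more than one result; whether to merge such windows depends on how the output file should read.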

Python, checking data file for certain lines

I've never taken a class that used python, just c, c++, c#, java, etc..
This should be easy but I'm feeling like I'm missing something huge that python reacts to.
All I'm doing is reading in a file, checking for lines that are only digits, counting how many lines like that and displaying it.
So I'm opening, reading, stripping, checking isdigit(), and incrementing. What's wrong?
# variables
sum = 0
switch = "run"
print( "Reading data.txt and counting..." )
# open the file
file = open( 'data.txt', 'r' )
# run through file, stripping lines and checking for numerics, incrementing sum when needed
while ( switch == "run" ):
    line = file.readline()
    line = line.strip()
    if ( line.isdigit() ):
        sum += 1
    if ( line == "" ):
        print( "End of file\ndata.txt contains %s lines of digits" %(sum) )
        switch = "stop"

Checking for an empty line is not the right way to detect the end of a file in Python: a blank line in the middle of the file also strips to an empty string, so the loop would stop early.
Instead, iterate over all the lines in the file, and the loop will end when the end of the file is reached.
num_digits = 0
with open("data.txt") as f:
    for line in f:
        if line.strip().isdigit():
            num_digits += 1
Because files can be iterated over, you can simplify this using a generator expression:
with open("data.txt") as f:
    num_digits = sum( 1 for line in f if line.strip().isdigit() )
I would also recommend against using the names of built-in functions such as sum as variable names (sum is not a reserved keyword, so Python allows it, but the built-in then becomes unavailable). Using string comparisons for flow control, as you're doing, is also needlessly roundabout; a boolean flag or a break is clearer.

sum=0
f=open("file")
for line in f:
    if line.strip().isdigit():
        sum+=1
f.close()

I just tried running your code:
matti#konata:~/tmp$ cat data.txt
1
a
542
dfd
b
42
matti#konata:~/tmp$ python johnredyns.py
Reading data.txt and counting...
End of file
data.txt contains 3 lines of digits
It works fine here. What's in your data.txt?

As several people have said, your code appears to work perfectly. Perhaps your "data.txt" file is in a different directory than your current working directory (which is not necessarily the directory that your script is in)?
However, here's a more "pythonic" way of doing the same thing:
counter = 0
with open('data.txt', 'r') as infile:
    for line in infile:
        if line.strip().isdigit():
            counter += 1
print('There are a total of {0} lines that contain only digits'.format(counter))
You could even make it a one-liner with:
counter = sum([line.strip().isdigit() for line in open('data.txt', 'r')])
I'd avoid that route at first though... It's much less readable!

How are you running the program? Are you sure data.txt has data? Is there an empty line in the file?
Try this:
while 1:
    line = file.readline()
    if not line: break
    line = line.strip()
    if ( line.isdigit() ):
        sum += 1
print( "End of file\ndata.txt contains %s lines of digits" %(sum) )
