Open text file, print new lines only in Python

I am opening a text file which, once created, is constantly being written to, and printing any new lines to a console, as I don't want to reprint the whole file each time. I check whether the file has grown in size; if it has, I print just the new lines. This mostly works, but occasionally it gets confused about which line is new, and new lines appear a few lines up, mixed in with the old ones.
Is there a better way to do this? Below is my current code.
infile = "Null"
while not os.path.exists(self.logPath):
time.sleep(.1)
if os.path.isfile(self.logPath):
infile = codecs.open(self.logPath, encoding='utf8')
else:
raise ValueError("%s isn't a file!" % file_path)
lastSize = 0
lastLineIndex = 0
while True:
wx.Yield()
fileSize = os.path.getsize(self.logPath)
if fileSize > lastSize:
lines = infile.readlines()
newLines = 0
for line in lines[lastLineIndex:]:
newLines += 1
self.running_log.WriteText(line)
lastLineIndex += newLines
if "DBG-X: Returning 1" in line:
self.subject = "FAILED! - "
self.sendEmail(self)
break
if "DBG-X: Returning 0" in line:
self.subject = "PASSED! - "
self.sendEmail(self)
break
fileSize1 = fileSize
infile.flush()
infile.seek(0)
infile.close()
Also my application freezes whilst waiting for the text file to be created, as it takes a couple of seconds to appear, which isn't great.
Cheers.

This solution could help. You'd also have to do a bit of waiting until the file appears, using os.path.isfile and time.sleep.

Maybe you could:
open the file each time you need to read it,
use lastSize as the argument to seek, to jump directly to where you stopped at the last read (see the sketch below).
Additional comment: I don't know if you need some protection, but I think you should not bother testing whether the given filename is a file or not; just open it in a try...except block and catch problems if any.
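A minimal sketch of both ideas combined (log_path stands in for self.logPath; the file is opened in binary mode so the byte count from os.path.getsize can double as a seek offset, and a failed stat is caught instead of testing isfile; partial-line writes are ignored for brevity):

import os
import time

last_size = 0
while True:
    try:
        size = os.path.getsize(log_path)
    except OSError:
        # file not created yet (or removed): wait and retry
        time.sleep(0.1)
        continue
    if size > last_size:
        with open(log_path, 'rb') as infile:
            infile.seek(last_size)  # jump straight past what was already printed
            for raw in infile:
                print(raw.decode('utf8'), end='')
            last_size = infile.tell()
    time.sleep(0.1)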
As for the freezing of your application, you may want to use some kind of threading, for instance: one thread, your main one, handles the GUI, and a second one waits for the file to be created. Once the file is created, the second thread sends signals to the GUI thread, containing the data to be displayed.
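A rough sketch of that split, assuming wxPython (which the question's wx.Yield suggests); frame is assumed to be the wx.Frame owning running_log, as in the question, and wx.CallAfter is the usual way to hand data from a worker thread to the GUI thread:

import codecs
import os
import threading
import time

import wx

def tail_worker(frame, log_path):
    # runs on a background thread: wait for the file, then stream new lines
    while not os.path.isfile(log_path):
        time.sleep(0.1)
    with codecs.open(log_path, encoding='utf8') as infile:
        while True:
            line = infile.readline()
            if line:
                # never touch widgets from this thread; post to the GUI thread instead
                wx.CallAfter(frame.running_log.WriteText, line)
            else:
                time.sleep(0.1)

# started from the GUI code, e.g. in the frame's __init__:
# threading.Thread(target=tail_worker, args=(self, self.logPath), daemon=True).start()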

Related

Reading file continuously and appending new lines to list (python)

I am practicing file reading in Python. I have a text file, and I want to run a program that continuously reads this file and appends the newly written lines to a list. To exit the program and print out the resulting list, the user should press "enter".
The code I have written so far looks like this:
import sys, select, os

data = []
i = 0
while True:
    os.system('cls' if os.name == 'nt' else 'clear')
    with open('test.txt', 'r') as f:
        for line in f:
            data.append(int(line))
    print(data)
    if sys.stdin in select.select([sys.stdin], [], [], 0)[0]:
        line_ = input()
        break
So to break out of the while loop, "enter" should be pressed. To be fair, I just copy-pasted the solution for this from here: Exiting while loop by pressing enter without blocking. How can I improve this method?
But this code just appends all lines to my list again and again.
So, say my text file contains the lines:
1
2
3
So my list is going to look like data = [1,2,3,1,2,3,1,2,3...] and grow in length until I press enter. When I add a line (e.g. 4), it will go data = [1,2,3,1,2,3,1,2,3,1,2,3,4,1,2,3,4...].
So I am looking for some kind of if statement before my append command so that only the newly written lines get appended. But I can't think of anything easy.
I already got some tips, namely:
Continuously checking the file size and only reading the part between old and new size.
Keeping track of line number and skipping to line that is not appended in next iteration.
At the moment, I can't think of a way to do this. I tried fiddling around with enumerate(f) and itertools.islice but can't make it work. I would appreciate some help, as I have not yet adopted the way programmers think.
Store the file position between iterations. This allows you to efficiently fast-forward the file when it is opened again:
data = []
file_position = 0
while True:
    with open('test.txt', 'r') as f:
        f.seek(file_position)  # fast forward beyond content read previously
        for line in f:
            data.append(int(line))
        file_position = f.tell()  # store position at which to resume
I could get it to work on Windows. First of all, exiting the while loop and continuously reading the file are two different questions. I assume that exiting the while loop is not the main problem, and because your select.select() statement doesn't work this way on Windows, I created an exit from the while loop with a try-except clause that triggers on Ctrl-C. (It's just one way of doing it.)
The second part of your question is how to continuously read the file. Well, not by reopening it again and again within the while loop; open it once before the while loop.
Then, as soon as the file is being changed, either a valid or an invalid line is read. I suppose this happens because the iteration over f may sometimes run before the file has been completely written (I'm not quite sure about that). Anyway, it is easy to check the line that was read. I again used a try-except clause, which catches the error if int(line) fails.
Code:
import sys, select, os

data = []
with open('text.txt', 'r') as f:
    try:
        while True:
            os.system('cls' if os.name == 'nt' else 'clear')
            for line in f:
                try:
                    data.append(int(line))
                except ValueError:  # incomplete or non-numeric line
                    pass
            print(data)
    except KeyboardInterrupt:
        print('Quit loop')
print(data)

How to read a csv file in tail -f manner using python?

I want to read a csv file in a manner similar to tail -f, i.e. like reading an error log file.
I can perform this operation in a text file with this code:
while 1:
    where = self.file.tell()
    line = self.file.readline()
    if not line:
        print "No line waiting, waiting for one second"
        time.sleep(1)
        self.file.seek(where)
    if re.search('[a-zA-Z]', line) == False:
        continue
    else:
        response = self.naturalLanguageProcessing(line)
        if response is not None:
            response["id"] = self.id
            self.id += 1
            response["tweet"] = line
            self.saveResults(response)
        else:
            continue
How do I perform the same task for a csv file? I have gone through a link that can give me the last 8 rows, but that is not what I require. The csv file will be getting updated simultaneously, and I need to get the newly appended rows.
Connecting A File Tailer To A csv.reader
In order to plug your code that looks for content newly appended to a file into a csv.reader, you need to put it into the form of an iterator.
I'm not intending to showcase correct code here, but specifically to show how to adapt your existing code into this form, without making assertions about its correctness. In particular, the sleep() would be better replaced with a mechanism such as inotify, to let the operating system assertively inform you when the file has changed; and the seek() and tell() would be better replaced with storing partial lines in memory, rather than backing up and rereading them from the beginning over and over.
import csv
import time

class FileTailer(object):
    def __init__(self, file, delay=0.1):
        self.file = file
        self.delay = delay

    def __iter__(self):
        while True:
            where = self.file.tell()
            line = self.file.readline()
            if line and line.endswith('\n'):  # only emit full lines
                yield line
            else:                             # for a partial line, pause and back up
                time.sleep(self.delay)        # ...not actually a recommended approach.
                self.file.seek(where)

csv_reader = csv.reader(FileTailer(open('myfile.csv')))
for row in csv_reader:
    print("Read row: %r" % (row,))
If you create an empty myfile.csv, start python csvtailer.py, and then echo "first,line" >>myfile.csv from a different window, you'll see the output of Read row: ['first', 'line'] immediately appear.
Finding A Correct File Tailer In Python
For a correctly-implemented iterator that waits for new lines to be available, consider referring to one of the existing StackOverflow questions on the topic:
How to implement a pythonic equivalent of tail -F?
Reading infinite stream - tail
Reading updated files on the fly in Python

Python conditional statement based on text file string

Noob question here. I'm scheduling a cron job to run a Python script every 2 hours, but I want the script to stop running after 48 hours, which is not a feature of cron. To work around this, I record the number of executions at the end of the script by appending a tally mark x to a text file, and I open the text file at the beginning of the script to only run if the count is less than n.
However, my script seems to always run regardless of the conditions. Here's an example of what I've tried:
with open("curl-output.txt", "a+") as myfile:
data = myfile.read()
finalrun = "xxxxx"
if data != finalrun:
[CURL CODE]
with open("curl-output.txt", "a") as text_file:
text_file.write("x")
text_file.close()
I think I'm missing something simple here. Please advise if there is a better way of achieving this. Thanks in advance.
The problem with your original code is that you're opening the file in a+ mode, which seems to set the seek position to the end of the file (try print(data) right after you read the file). If you use r instead, it works. (I'm not sure that's how it's supposed to be: this answer states it should write at the end but read from the beginning, and the documentation isn't terribly clear.)
Some suggestions: instead of comparing against the "xxxxx" string, you could just check the length of the data (if len(data) < 5). Or alternatively, as was suggested, use pickle to store a number, which might look like this:
import pickle

try:
    with open("curl-output.txt", "rb") as myfile:
        num = pickle.load(myfile)
except FileNotFoundError:
    num = 0

if num < 5:
    do_curl_stuff()
    num += 1
    with open("curl-output.txt", "wb") as myfile:
        pickle.dump(num, myfile)
Two more things concerning your original code: You're making the first with block bigger than it needs to be. Once you've read the string into data, you don't need the file object anymore, so you can remove one level of indentation from everything except data = myfile.read().
Also, you don't need to close text_file manually. with will do that for you (that's the point).
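Putting those suggestions together, the original snippet might look something like this (do_curl_stuff() standing in for the CURL code, as in the pickle example above):

try:
    with open("curl-output.txt", "r") as myfile:  # "r" reads from the start of the file
        data = myfile.read()
except FileNotFoundError:                         # first run: no tally file yet
    data = ""

if data != "xxxxx":
    do_curl_stuff()
    with open("curl-output.txt", "a") as text_file:
        text_file.write("x")                      # with closes the file for us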
This sounds more like a job for scheduling with the at command.
See http://www.ibm.com/developerworks/library/l-job-scheduling/ for different job scheduling mechanisms.
The first bug that is immediately obvious to me is that you are appending to the file even if data == finalrun. So when data == finalrun, you don't run curl, but you do append another 'x' to the file. On the next run, data will again not be equal to finalrun, so it will continue to execute the curl code.
The solution is of course to nest the code that appends to the file under the if statement.
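For instance, keeping the question's [CURL CODE] placeholder:

if data != finalrun:
    [CURL CODE]
    # the tally write now runs only when curl actually ran
    with open("curl-output.txt", "a") as text_file:
        text_file.write("x")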
Well, there is probably an end-of-line \n character, which means your file will contain something like xx\n and not simply xx. That is probably why your condition does not work :)
EDIT: If you type the following at the Python command line, you will be able to see whether there is a \n or not:
open('filename.txt', 'r').read()  # where filename is the name of your file
Try using this condition in the if clause instead:
if data.count('x') == 24:
The data string may contain extraneous data like newline characters. Check repr(data) to see whether it is actually 24 x's.

Check if a file is modified in Python

I am trying to create a box that tells me whether a text file has been modified; if it has been modified, it prints out the new text inside it. This should run in an infinite loop (the bot sleeps until the text file is modified).
I have tried this code but it doesn't work.
while True:
    tfile1 = open("most_recent_follower.txt", "r")
    SMRF1 = tfile1.readline()
    if tfile1.readline() == SMRF1:
        print(tfile1.readline())
But this is totally not working... I am new to Python, can anyone help me?
def read_file():
    with open("most_recent_follower.txt", "r") as f:
        SMRF1 = f.readlines()
    return SMRF1

initial = read_file()
while True:
    current = read_file()
    if initial != current:
        for line in current:
            if line not in initial:
                print(line)
        initial = current
Read the file in once, to get its initial state. Then continuously repeat reading of the file. When it changes, print out its contents.
I don't know what bot you are referring to, but this code, and yours, will continuously read the file. It never seems to exit.
I might suggest copying the file to a safe duplicate location, and possibly using a diff program to determine whether the current file is different from the original copy, printing the added lines (as sketched below). If you just want appended lines, you might try a utility like tail.
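If you'd rather stay in Python than shell out to a diff program, here is a sketch using difflib (the duplicate's filename is just an example):

import difflib
import shutil

shutil.copy("most_recent_follower.txt", "follower.orig")  # the safe duplicate

# ... later, after the file may have changed:
with open("follower.orig") as f:
    old = f.readlines()
with open("most_recent_follower.txt") as f:
    new = f.readlines()

for line in difflib.unified_diff(old, new, lineterm=''):
    if line.startswith('+') and not line.startswith('+++'):
        print(line[1:], end='')  # print only the added lines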
You can also use a library like pyinotify to trigger only when the filesystem detects that the file has been modified.
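A minimal pyinotify sketch (Linux only; assumes the pyinotify package is installed, and reuses the question's filename):

import pyinotify

class ModHandler(pyinotify.ProcessEvent):
    def process_IN_MODIFY(self, event):
        # re-read the file here (e.g. with read_file() above) and print what changed
        print("%s was modified" % event.pathname)

wm = pyinotify.WatchManager()
wm.add_watch("most_recent_follower.txt", pyinotify.IN_MODIFY)
notifier = pyinotify.Notifier(wm, ModHandler())
notifier.loop()  # blocks, dispatching to the handler on each modification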
This is the first result on Google for "check if a file is modified in python" so I'm gonna add an extra solution here.
If you're curious if a file is modified in the sense that its contents have changed, OR it was touched, then you can use os.stat:
import os

# st_ctime changes when the file's contents or metadata change;
# use st_mtime instead if you only care about content modifications
get_time = lambda f: os.stat(f).st_ctime

fn = 'file.name'
prev_time = get_time(fn)
while True:
    t = get_time(fn)
    if t != prev_time:
        do_stuff()
        prev_time = t

Data not getting written in file [Python]

final=open("war.txt","w+")
for line in madList:
line=line.split('A ')
dnsreg= line[1]
print dnsreg
final.write(dnsreg)
While printing dnsreg I can see the output, but when I write it to a file, nothing is written. There is no syntax error either. Any idea?
Data written to a file is not written immediately; it's kept in a buffer, and large amounts are written at a time to save on writing-to-disk overhead. However, upon closing a file, all the buffered data is flushed to disk.
So, you can do two things:
Call final.close() when you are done, or
Call final.flush() after final.write() if you don't want to close the file.
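A sketch of the second option, reusing the question's names (madList is assumed from the question):

final = open("war.txt", "w+")
for line in madList:
    dnsreg = line.split('A ')[1]
    final.write(dnsreg)
    final.flush()  # push the buffer to disk without closing the file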
Thanks to @Matt Tanenbaum, a really nice way to handle this in Python is to do the writing inside a with block:
with open("war.txt","w+") as final:
for line in madList:
line=line.split('A ')
dnsreg= line[1]
print dnsreg
final.write(dnsreg)
Doing this, you'll never have to worry about closing the file! But you may need to flush in case of premature termination of the program (e.g. due to exceptions).
You should use the with statement in Python when using resources that have to be set up and torn down, like the opening and closing of files. Something like:
with open("war.txt","w+") as myFile:
for line in madList:
line=line.split('A ')
dnsreg= line[1]
myFile.write(dnsreg)
If you do not want to use with, you will have to close the file manually. In that case, you can use a try...finally block to handle this.
try:
    myFile = open("war.txt", "w+")
    for line in madList:
        line = line.split('A ')
        dnsreg = line[1]
        myFile.write(dnsreg)
finally:
    myFile.close()
The finally block always runs, so your file is closed and the changes are written.
