I have the following Python code. It runs, but as soon as it hits an error it jumps to the last line.
If I remove the problematic line from the file and run the script again, it finds the next problematic line and jumps to the end again.
I want to print all lines without jumping to the end of the script (just skip the bad line and continue with the next one):
import csv
with open('data.tsv', "rb") as f:
    reader = csv.reader( f )
    try:
        for row in reader:
            continue
    except csv.Error, e:
        print reader.line_num, e
        pass
print "End of file!\n"
Iterate manually, hoping that the csv reader object can recover from the exception:
import csv
with open('data.tsv', "r") as f:
    reader = csv.reader( f )
    while True:
        try:
            row = next(reader)
            print(row)
        except csv.Error as e:
            print("line: {}, error: {}".format(reader.line_num, e))
        except StopIteration:
            break
print("End of file!\n")
The StopIteration exception is raised when the csv.reader object reaches the end of the file. At that point, break exits the infinite loop.
Let's test this by inserting a NUL byte in a row. An easy way is to replace f with a list of rows:
data = """hello,world
foo,bar
hi\x00,I'm joe
recovered,yeah
"""
f = data.splitlines()
Now f can be fed to csv.reader with the code above (remove the with block). Note the NUL byte inserted in the third line. Output:
['hello', 'world']
['foo', 'bar']
line: 3, error: line contains NULL byte
['recovered', 'yeah']
End of file!
Yeah! It works (and, as a bonus, the code is compatible with both Python 2 and Python 3).
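The manual loop above can also be wrapped in a small generator so that the calling code stays a plain for loop. This is only a sketch: iter_good_rows and the on_error callback are names made up for illustration, and it relies on the reader being able to carry on after a csv.Error, as the test above suggests. The sample data uses strict=True with a malformed quote to provoke the error.

```python
import csv


def iter_good_rows(reader, on_error=None):
    """Yield rows from a csv reader, skipping rows that raise csv.Error.

    iter_good_rows and on_error are hypothetical names for this sketch.
    """
    while True:
        try:
            row = next(reader)
        except csv.Error as e:
            # Report the problem (if a callback was given) and move on.
            if on_error is not None:
                on_error(reader.line_num, e)
            continue
        except StopIteration:
            return
        yield row


# strict=True makes the reader raise csv.Error on malformed quoting.
lines = ['hello,world', 'foo,"bar"x', 'recovered,yeah']
errors = []
rows = list(iter_good_rows(csv.reader(lines, strict=True),
                           on_error=lambda n, e: errors.append(n)))
```

Here rows ends up holding only the two well-formed rows, and errors records the line number of the skipped one.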
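Move the try inside the for loop. Note that this only catches errors raised in the loop body; a csv.Error raised while the reader fetches the next row comes from the for statement itself and is not caught here: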
import csv
with open('data.tsv', "rb") as f:
    reader = csv.reader( f )
    for row in reader:
        try:
            continue
        except csv.Error, e:
            print reader.line_num, e
            pass
print "End of file!\n"
Related
In Python it is easy to read and parse a CSV file and process it line by line:
reader = csv.reader(open("my_csv_file.csv"))
for row in reader:
    # row is an array or dict
    parsed_data = my_data_parser(row)
where my_data_parser is my own piece of logic that takes the input data, parses it and applies some logic.
If my parser fails, I would like to log the entire original line of the CSV file, but it seems that I no longer have access to it from the csv reader.
Is it possible to retrieve the original raw line data?
It doesn't seem like the csv.reader() exposes the file object it's iterating, however, you could use the reader's line_num attribute to achieve what you want.
For example:
import csv
file = open("my_csv_file.csv")
lines = file.readlines()
reader = csv.reader(lines)
for row in reader:
    # row is an array or dict
    try:
        parsed_data = my_data_parser(row)
    except MyDataParserError:
        print(f"ERROR in line number {reader.line_num}")
        print("Full line:")
        # line_num is 1-based, while list indexes start at 0
        print(lines[reader.line_num - 1])
file.close()
Alternative
If you'd like to avoid always loading the file into memory, you could instead keep your initial way of reading the file and only read the whole file into memory if an error occurred:
import csv
reader = csv.reader(open("my_csv_file.csv"))
for row in reader:
    # row is an array or dict
    try:
        parsed_data = my_data_parser(row)
    except MyDataParserError:
        # Only read the whole file into memory when an error occurred.
        file = open("my_csv_file.csv")
        lines = file.readlines()
        file.close()
        print(f"ERROR in line number {reader.line_num}")
        print("Full line:")
        # line_num is 1-based, while list indexes start at 0
        print(lines[reader.line_num - 1])
You can access the line number of the current row with
reader.line_num
But there seems to be no direct way to access the actual line (according to the docs). Here is an iterative method that avoids reading the whole file into memory at any step:
import csv
class MyException(Exception):
    pass
def super_logic(line): # Some silly logic to get test code running
    if len(line) != 2 or line[1] != '1':
        raise MyException("Invalid value")
    print("Process: %s" % line)
class LastLineReader:
    def __init__(self, fn ):
        self.fid = open(fn)
    def __iter__(self):
        return self
    def __next__(self):
        line = self.fid.readline() # Read a single line and cache it on the object
        if len(line) == 0:
            raise StopIteration()
        self.current_line = line.strip()
        return line
reader_with_lines = LastLineReader( "my_csv_file.csv" )
reader = csv.reader( reader_with_lines )
for line in reader:
    try:
        super_logic(line)
    except MyException as e:
        print("Got exception: %s at line '%s'" % ( e, reader_with_lines.current_line ))
(Edited: removed the other solutions, as they are already covered in other people's posts)
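A variation on the same idea, for files where a quoted field may contain a newline so that one row spans several physical lines. This is a rough sketch: RecordingLines and rows_with_raw_text are hypothetical names, and it assumes csv.reader pulls lines from its source one at a time without reading ahead (which is how the reader appears to behave in CPython).

```python
import csv


class RecordingLines:
    """Iterate over lines and remember those consumed for the current row.

    Hypothetical helper for this sketch; not part of the csv module.
    """

    def __init__(self, lines):
        self._lines = iter(lines)
        self.consumed = []

    def __iter__(self):
        return self

    def __next__(self):
        line = next(self._lines)  # StopIteration propagates at end of input
        self.consumed.append(line)
        return line


def rows_with_raw_text(lines):
    """Yield (row, raw_text) pairs; raw_text may span several lines."""
    source = RecordingLines(lines)
    for row in csv.reader(source):
        raw = ''.join(source.consumed)
        source.consumed = []  # start collecting lines for the next row
        yield row, raw
```

Passing an open file object as lines keeps memory use to roughly one record at a time, and the raw text is at hand if the row later fails to parse.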
As an alternative to reader.line_num:
for index, row in enumerate(reader):
    print(index + 1, row)
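One thing to keep in mind is that the two counters are not always equal: enumerate counts parsed records, while reader.line_num counts physical lines read from the source. A small sketch with made-up sample data, where one record contains a quoted newline:

```python
import csv

# One record spans two physical lines because of the quoted newline.
lines = ['a,"first\n', 'second"\n', 'b,c\n']
reader = csv.reader(lines)

# (record number, reader.line_num, row) for each parsed record
seen = [(index, reader.line_num, row)
        for index, row in enumerate(reader, start=1)]
```

For the first record the record number is 1 while line_num is already 2, so pick whichever counter matches what you want to report.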
I streamed tweets using the following code
class CustomStreamListener(tweepy.StreamListener):
    def on_data(self, data):
        try:
            with open('brasil.json', 'a') as f:
                f.write(data)
                return True
        except BaseException as e:
            print("Error on_data: %s" % str(e))
        return True
Now I have a JSON file (brasil.json). I want to open it in Python to do sentiment analysis, but I can't find a way. I managed to open the first tweet using this:
with open('brasil.json') as f:
    for line in f:
        tweets.append(json.loads(line))
but it doesn't read any of the other tweets. Any ideas?
From the comments: after examining the contents of the JSON data file, all the tweets are in the odd-numbered rows. The even-numbered rows are blank.
The blank rows caused a json.decoder.JSONDecodeError.
There are two ways to handle this error: either read only the odd rows or use exception handling.
Using odd rows:
with open('brasil.json') as f:
    for n, line in enumerate(f, 1):
        if n % 2 == 1: # this line is in an odd-numbered row
            tweets.append(json.loads(line))
Exception handling:
with open('brasil.json', 'r') as f:
    for line in f:
        try:
            tweets.append(json.loads(line))
        except json.decoder.JSONDecodeError:
            pass # skip this line
Try both and see which one works best.
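A third option combines the two: skip blank lines explicitly and only fall back to exception handling for lines that are not valid JSON, keeping track of where they were. This is just a sketch; load_json_lines is a made-up helper name, and it assumes one JSON document per non-blank line.

```python
import json


def load_json_lines(lines):
    """Parse one JSON document per non-blank line; collect bad line numbers."""
    records, bad_lines = [], []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # blank separator line, nothing to parse
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            bad_lines.append(number)  # remember where parsing failed
    return records, bad_lines
```

It could be called as tweets, bad = load_json_lines(f) inside the with block, so that genuinely broken lines are reported rather than silently dropped.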
I am trying to write all the rows that contain a keyword string from a bunch of text files into one file. This is my code:
import os
import glob
import csv
import re
#Defining Keyword
keyword = '2012-07-02'
#Code to merge all relevant LOG files into one file and insert
with open('Combined-01022012.txt' , 'w', newline = '') as combined_file:
    csv_output = csv.writer(combined_file)
    for filename in glob.glob('FAO_Agg_2012_Part_*.txt'):
        with open(filename, 'rt', newline = '') as f_input:
        #with gzip.open((filename.split('.')[0]) + '.gz', 'rt', newline='') as f_input:
            csv_input = csv.reader(f_input)
            for row in csv_input:
                row.insert(0, os.path.basename(filename))
                try:
                    if keyword in row[2]:
                        csv_output.writerow(row)
                    #row.insert(0, os.path.basename(filename))
                    #csv_output.writerow(row)
                except:
                    continue
                continue
Everything seems to be right and the code runs but nothing gets written on to my text file. What could be going wrong?
Your main problem is in the lines:
row.insert(0, os.path.basename(filename))
try:
    if keyword in row[0]:
        csv_output.writerow(row)
except:
    continue
You're essentially inserting the base name of the current file as the first entry of your row, and then on the very next line you're checking whether that entry (row[0]) contains your keyword. Unless the file name contains your keyword (2012-07-02), that condition will never evaluate as True. I'd change this to:
if keyword in row[0]:
    csv_output.writerow([os.path.basename(filename)] + row)
Also, using a bare except is a very, very bad idea. If you're looking to catch a specific exception, name it in your except clause.
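To illustrate the point about naming the exception: the only error expected when indexing a row is IndexError (a short or empty row), so that is the only one worth catching. A minimal sketch, where matching_rows and its column parameter are hypothetical and the keyword is assumed to live in the first column of the original row:

```python
import csv
import os


def matching_rows(lines, filename, keyword, column=0):
    """Yield rows whose `column` contains `keyword`, prefixed with the file name."""
    for row in csv.reader(lines):
        try:
            cell = row[column]
        except IndexError:
            continue  # row too short (for example a blank line)
        if keyword in cell:
            # Prepend the file name only after the keyword check.
            yield [os.path.basename(filename)] + row
```

Any other exception (a typo, a permissions problem, and so on) now surfaces instead of being swallowed.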
Using this Python code, I get the lines of the file printed in UPPERCASE, but the file itself remains unchanged (lowercase).
def open_f():
    while True:
        fname=raw_input("Enter filename:")
        if fname != "done":
            try:
                fhand=open(fname, "r+")
                break
            except:
                print "WRONG!!!"
                continue
        else: exit()
    return fhand
fhand=open_f()
for line in fhand:
    ss=line.upper().strip()
    print ss
    fhand.write(ss)
fhand.close()
Can you please suggest why the file remains unaffected?
Code:
def file_reader(read_from_file):
    with open(read_from_file, 'r') as f:
        return f.read()
def file_writer(read_from_file, write_to_file):
    with open(write_to_file, 'w') as f:
        f.write(file_reader(read_from_file))
Usage:
Create a file named example.txt with the following content:
Hi my name is Dmitrii Gangan.
Create an empty file called file_to_be_written_to.txt.
Add file_writer("example.txt", "file_to_be_written_to.txt") as the last line of your .py Python file.
Run python <your_python_script.py> from the terminal.
NOTE: All of these files must be in the same folder.
Result:
file_to_be_written_to.txt:
Hi my name is Dmitrii Gangan.
This program should do as you requested and allows for modifying the file as it is being read. Each line is read, converted to uppercase, and then written back to the source file. Since it runs on a line-by-line basis, the most extra memory it should need would be related to the length of the longest line.
Example 1
def main():
    with get_file('Enter filename: ') as file:
        while True:
            position = file.tell()          # remember beginning of line
            line = file.readline()          # get the next available line
            if not line:                    # check if at end of the file
                break                       # program is finished at EOF
            file.seek(position)             # go back to the line's start
            file.write(line.upper())        # write the line in uppercase
def get_file(prompt):
    while True:
        try:                                        # run and catch any error
            return open(input(prompt), 'r+t')       # r+t = read, write, text
        except EOFError:                            # see if user is finished
            raise SystemExit()                      # exit the program if so
        except OSError as error:                    # check for file problems
            print(error)                            # report operation errors
if __name__ == '__main__':
    main()
The following is similar to what you see up above but works in binary mode instead of text mode. Instead of operating on lines, it processes the file in chunks based on the given BUFFER_SIZE and can operate more efficiently. The code under the main loop may replace the code in the loop if you wish for the program to check that it is operating correctly. The assert statements check some assumptions.
Example 2
BUFFER_SIZE = 1 << 20
def main():
    with get_file('Enter filename: ') as file:
        while True:
            position = file.tell()
            buffer = file.read(BUFFER_SIZE)
            if not buffer:
                return
            file.seek(position)
            file.write(buffer.upper())
        # The following code will not run but can replace the code in the loop.
        start = file.tell()
        buffer = file.read(BUFFER_SIZE)
        if not buffer:
            return
        stop = file.tell()
        assert file.seek(start) == start
        assert file.write(buffer.upper()) == len(buffer)
        assert file.tell() == stop
def get_file(prompt):
    while True:
        try:
            return open(input(prompt), 'r+b')
        except EOFError:
            raise SystemExit()
        except OSError as error:
            print(error)
if __name__ == '__main__':
    main()
I suggest the following approach:
1) Read and close the file, returning the filename and its content
2) Create a new file with the same filename and the content in UPPERCASE
def open_f():
    while True:
        fname=raw_input("Enter filename:")
        if fname != "done":
            try:
                with open(fname, "r+") as fhand:
                    ss = fhand.read()
                break
            except:
                print "WRONG!!!"
                continue
        else: exit()
    return fname, ss
fname, ss =open_f()
with open(fname, "w+") as fhand:
    fhand.write(ss.upper())
As already alluded to in the comments, you cannot successively read from and write to the same file -- the first write will truncate the file, so you cannot read anything more from the handle at that point.
Fortunately, the fileinput module offers a convenient inplace mode which works exactly like you want.
import fileinput
for line in fileinput.input(somefilename, inplace=True):
    print(line.upper().strip())
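If you would rather not depend on fileinput, a common alternative is to write the uppercased text to a temporary file in the same directory and then swap it into place. A rough sketch, assuming a text file in the default encoding; uppercase_file is just an illustrative name:

```python
import os
import tempfile


def uppercase_file(path):
    """Rewrite `path` with its contents uppercased, via a temporary file."""
    directory = os.path.dirname(os.path.abspath(path))
    with open(path, 'r') as source, tempfile.NamedTemporaryFile(
            'w', dir=directory, delete=False) as target:
        for line in source:
            target.write(line.upper())  # line keeps its own newline
    # Swap the finished copy into place only after both files are closed.
    os.replace(target.name, path)
```

The original file is only replaced once the whole copy has been written, so a crash halfway through leaves it untouched.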
I am stuck on why words.txt is not showing the full grid. Below are the tasks I must carry out:
Write code to prompt the user for a filename, and attempt to open the file whose name is supplied. If the file cannot be opened the user should be asked to supply another filename; this should continue until a file has been successfully opened.
The file will contain on each line a row from the words grid. Write code to read, in turn, each line of the file, remove the newline character and append the resulting string to a list of strings. After the input is complete the grid should be displayed on the screen.
Below is the code I have written so far; any help would be appreciated:
file = input("Enter a filename: ")
try:
    a = open(file)
    with open(file) as a:
        x = [line.strip() for line in a]
    print (a)
except IOError as e:
    print ("File Does Not Exist")
Note: Avoid variable names like file and list, as they shadow Python built-ins (file is a built-in type in Python 2).
while True:
    filename = raw_input(' filename: ')
    try:
        lines = [line.strip() for line in open(filename)]
        print lines
        break
    except IOError as e:
        print 'No file found'
        continue
The below implementation should work:
# loop
while(True):
    # don't use name 'file', it's a data type
    the_file = raw_input("Enter a filename: ")
    try:
        with open(the_file) as a:
            x = [line.strip() for line in a]
        # I think you meant to print x, not a
        print(x)
        break
    except IOError as e:
        print("File Does Not Exist")
You need a while loop?
while True:
    file = input("Enter a filename: ")
    try:
        a = open(file)
        with open(file) as a:
            x = [line.strip() for line in a]
        print (a)
        break
    except IOError:
        pass
This will keep asking until a valid file is provided.
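Putting the pieces from the answers together for Python 3, here is one possible sketch. The function name read_grid and the ask parameter are made up for illustration; passing the prompt function in makes the loop easy to try without typing at a console.

```python
def read_grid(ask=input):
    """Keep asking for a filename until one opens; return its lines."""
    while True:
        filename = ask("Enter a filename: ")
        try:
            with open(filename) as handle:
                # rstrip('\n') removes only the newline, as the task asks.
                return [line.rstrip('\n') for line in handle]
        except OSError as error:
            # Covers a missing file, a directory name, permission problems.
            print("Could not open {!r}: {}".format(filename, error))
```

Calling grid = read_grid() and then printing each element of grid displays the grid one row per line.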