Python OSError: Bad address when reading from large file

I am observing a "Bad address" OSError when reading from a file handle created using the with syntax in Python 3.
The file in question is 39G, but I should have enough RAM available to read the whole file. The error message leads me to believe I am hitting some kind of OS restriction; I am running CentOS 6.9. Can anyone help me understand what might be causing this behavior?
The file is perfectly readable outside of python, e.g. in bash with head or vim.
Simplified code sample producing the error is shown below:
In [2]: with open(filename, 'r', encoding="utf8") as infile:
   ...:     infile.read()
   ...:
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<ipython-input-2-3f216811bec7> in <module>()
      1 with open(filename, 'r', encoding="utf8") as infile:
----> 2     infile.read()
      3
OSError: [Errno 14] Bad address
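No answer is recorded here for this question. One workaround that is sometimes tried when a single huge read() fails is to read the file in bounded pieces, so that no individual read request is very large. The sketch below assumes that approach; whether it avoids Errno 14 on this particular CentOS 6.9 system is not confirmed by the thread. The names iter_chunks and chunk_size, and the small demo file, are hypothetical and only there for illustration.

```python
import os
import tempfile


def iter_chunks(path, chunk_size=1024 * 1024, encoding="utf8"):
    """Yield successive pieces of a text file, each at most chunk_size characters."""
    with open(path, "r", encoding=encoding) as infile:
        while True:
            chunk = infile.read(chunk_size)
            if not chunk:  # empty string means end of file
                break
            yield chunk


# Demo on a small temporary file standing in for the 39G file.
with tempfile.NamedTemporaryFile("w", encoding="utf8", suffix=".txt", delete=False) as tmp:
    tmp.write("abcdefghij" * 10)
    demo_path = tmp.name

total_chars = sum(len(chunk) for chunk in iter_chunks(demo_path, chunk_size=16))
os.remove(demo_path)
print(total_chars)
```

Processing each chunk as it arrives also keeps memory use bounded, which may matter even when the machine nominally has enough RAM for the whole file.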

Related

Getting an error: Line Contains Null, Not sure the cause [duplicate]

I am getting an error: line contains NUL. I think it means there's a strange character in my CSV file. But this program and import file worked on a different machine (both Macs), so I don't know if the cause is a different version of Python or how I am running it. From reading the other entries, I think this line may also be the cause:
reader = csv.reader(open(filePath, 'r', encoding="utf-8-sig", errors="ignore"))
Appreciate any help / advice!
paths CWD=/Users/sternit/Downloads/Ten-code-4, CPD=/Users/sternit/Downloads/Ten-code/
Traceback (most recent call last):
  File "/Users/sternit/Downloads/Ten-code-4/Master.py", line 145, in <module>
    main()
  File "/Users/sternit/Downloads/Ten-code-4/Master.py", line 114, in main
    playerLists = loadFiles(CPD + "PlayerFiles/")
  File "/Users/sternit/Downloads/Ten-code-4/Master.py", line 50, in loadFiles
    for n, row in enumerate(reader):
_csv.Error: line contains NUL

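This should work fine, as long as the file is opened in text mode (in Python 3, a file opened with "rb" yields bytes, which csv.reader cannot parse):
data_initial = open(filePath, "r", encoding="utf-8-sig", errors="ignore", newline="")
data = csv.reader((line.replace('\0', '') for line in data_initial), delimiter=",")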

If the csv module says you have a "NULL" (silly message, should be "NUL") byte in the file you are reading, I would suggest checking what is in your file.
In Python 2, opening the file with rb might make the problem go away:
reader = csv.reader(open(filePath, 'rb'))
In Python 3, binary mode does not accept the encoding or errors arguments and csv.reader needs text, so keep the text-mode open() from the question there.
Depending on how the file was generated, it might contain NUL bytes, so you might need to try one of the following:
Open it in an editor to see whether it is a reasonable CSV file. If the file is too big, use nano or head in the CLI.
Use another library such as pandas, which could be more robust.
If the problem persists, you can replace all the '\x00' bytes with an empty byte string:
fi = open(filePath, 'rb')
data = fi.read()
fi.close()
fo = open('mynew.csv', 'wb')
# data is bytes, so the pattern and the replacement must be bytes too
fo.write(data.replace(b'\x00', b''))
fo.close()
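The "strip the NUL characters before the csv module sees them" idea from the answers above can be sketched in Python 3 as follows. This is a minimal illustration, not the asker's loader: read_csv_without_nul and the in-memory sample data are hypothetical, and an io.StringIO object stands in for the real file opened in text mode.

```python
import csv
import io


def read_csv_without_nul(text_stream, delimiter=","):
    """Parse CSV rows from a text stream, dropping NUL characters first."""
    # The generator cleans one line at a time, so the file is never held twice in memory.
    cleaned_lines = (line.replace("\x00", "") for line in text_stream)
    return list(csv.reader(cleaned_lines, delimiter=delimiter))


# newline="" mirrors the csv module's recommendation for files passed to csv.reader.
sample = io.StringIO("name,score\r\nal\x00ice,10\r\nbob,\x0020\r\n", newline="")
rows = read_csv_without_nul(sample)
print(rows)
```

With a real file, the same function could be called on the object returned by open(filePath, "r", encoding="utf-8-sig", errors="ignore", newline="").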

Stuck with a ValueError, How to solve it?

I am doing an assignment in which I need to extract text from a PDF using PyPDF2, and while trying to do that I am getting this error. How can I fix this?
Can someone help me? Thank you in advance.
import PyPDF2
textFile = open('foo.txt', 'w')
file = open('foo.pdf','rb')
readpdf = PyPDF2.PdfFileReader(file)
print(readpdf.getNumPages())
1
read_pdf = readpdf.getPage(0)
textFile.write(read_pdf.extractText())
--------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-42-5a892ea3012b> in <module>
----> 1 textFile.write(read_pdf.extractText())
ValueError: I/O operation on closed file.
file.close
textFile.close()

I am not sure how you ended up with this error, but this might help:
textFile = open('foo.txt', 'w')
read_pdf = readpdf.getPage(0)
textFile.write(read_pdf.extractText())
Opening the file right before you do something with it seems to work for me, so give it a try and we'll see ;]

When you use with open, you don't need to close the file yourself; it is closed automatically when the block exits, even if an exception is raised:
import PyPDF2
with open('foo.txt', 'w') as textFile:
    with open('foo.pdf', 'rb') as file:
        readpdf = PyPDF2.PdfFileReader(file)
        print(readpdf.getNumPages())
        read_pdf = readpdf.getPage(0)
        textFile.write(read_pdf.extractText())
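To see where the ValueError comes from without involving PyPDF2, the sketch below reproduces it with a plain text file: writing to a handle after close() fails, while a with block keeps the handle open for exactly the duration of the block. The temporary path and variable names are hypothetical.

```python
import os
import tempfile

demo_path = os.path.join(tempfile.mkdtemp(), "foo.txt")

# Writing after close() raises ValueError, as in the question.
text_file = open(demo_path, "w")
text_file.close()
try:
    text_file.write("too late")
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Inside a with block the file is open; afterwards it is closed automatically.
with open(demo_path, "w") as text_file:
    text_file.write("written inside the with block")
    open_inside_block = not text_file.closed
closed_after_block = text_file.closed

print(error_message, open_inside_block, closed_after_block)
```

In an interactive session this usually means an earlier cell already closed textFile, so it has to be reopened before the write is retried.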

File not ready for write after open?

I have the following code:
#!/usr/bin/python
export = open('/sys/class/gpio/export', 'w')
export.write('44\n')
And this code produces the following output:
close failed in file object destructor:
IOError: [Errno 16] Device or resource busy
If I change the code by adding a export.close() to the end, I get this as output:
Traceback (most recent call last):
File "./test.py", line 5, in <module>
export.close()
IOError: [Errno 16] Device or resource busy
However, if I change the code again as such, it works perfectly:
#!/usr/bin/python
from time import sleep
export = open('/sys/class/gpio/export', 'w')
sleep(1)
export.write('44\n')
Note that .close ALWAYS fails, even if I put a long sleep after the write.
Edit:
Changed my code to be the following:
with open('/sys/class/gpio/export', 'w') as export:
    sleep(1)
    export.write('44\n')
    export.flush()
    export.close()
Still gives errors:
Traceback (most recent call last):
  File "./test.py", line 7, in <module>
    export.flush()
IOError: [Errno 16] Device or resource busy
Edit 2:
My main issue turned out to be that you can't export a GPIO that has already been exported. I've updated my code to look like this and it seems to be working:
from os import path
if not path.isdir('/sys/class/gpio/gpio44'):
    with open('/sys/class/gpio/export', 'w') as export:
        export.write('44\n')
if path.exists('/sys/class/gpio/gpio44/direction'):
    with open('/sys/class/gpio/gpio44/direction', 'w') as gpio44_dir:
        gpio44_dir.write('out\n')
if path.exists('/sys/class/gpio/gpio44/value'):
    with open('/sys/class/gpio/gpio44/value', 'w') as gpio44_val:
        gpio44_val.write('1\n')
This code successfully exports a GPIO, sets its direction to "out", and activates it (value to 1).
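The "only export if not already exported" check can be wrapped in a small helper. The following is a sketch in Python 3, not the asker's code: export_gpio and its gpio_root parameter are hypothetical, and gpio_root exists only so the logic can be exercised against a temporary directory instead of the real /sys/class/gpio.

```python
import os
import shutil
import tempfile


def export_gpio(number, gpio_root="/sys/class/gpio"):
    """Write the pin number to <gpio_root>/export unless gpio<number> already exists.

    Returns True if the export file was written, False if the pin was already exported.
    """
    gpio_dir = os.path.join(gpio_root, "gpio%d" % number)
    if os.path.isdir(gpio_dir):
        return False  # exporting again would fail with "Device or resource busy"
    with open(os.path.join(gpio_root, "export"), "w") as export:
        export.write("%d\n" % number)
    return True


# Demo against a temporary directory standing in for sysfs.
fake_root = tempfile.mkdtemp()
first_call = export_gpio(44, gpio_root=fake_root)
os.mkdir(os.path.join(fake_root, "gpio44"))  # on real hardware the kernel creates this
second_call = export_gpio(44, gpio_root=fake_root)
shutil.rmtree(fake_root)
print(first_call, second_call)
```

On a real board the function would be called with the default gpio_root, typically with root privileges.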

PyPDF2 IOError: [Errno 22] Invalid argument on PyPdfFileReader Python 2.7

Goal = Open file, encrypt file, write encrypted file.
I am trying to use the PyPDF2 module to accomplish this. I have verified that "input" is a file type object. I have researched this error and it translates to "file not found". I believe that it is linked somehow to the file or file path, but I am unsure how to debug or troubleshoot it. I am getting the following error:
Traceback (most recent call last):
  File "CommissionSecurity.py", line 52, in <module>
    inputStream = PyPDF2.PdfFileReader(input)
  File "build\bdist.win-amd64\egg\PyPDF2\pdf.py", line 1065, in __init__
  File "build\bdist.win-amd64\egg\PyPDF2\pdf.py", line 1660, in read
IOError: [Errno 22] Invalid argument
Below is the relevant code. I'm not sure how to correct this issue because I'm not really sure what the issue is. Any guidance is appreciated.
for ID in FileDict:
    if ID in EmailDict :
        path = "C:\\Apps\\CorVu\\DATA\\Reports\\AlliD\\Monthly Commission Reports\\Output\\pdcom1\\"
        #print os.listdir(path)
        file = os.path.join(path + FileDict[ID])
        with open(file, 'rb') as input:
            print type(input)
            inputStream = PyPDF2.PdfFileReader(input)
            output = PyPDF2.PdfFileWriter()
            output = inputStream.encrypt(EmailDict[ID][1])
            with open(file, 'wb') as outputStream:
                output.write(outputStream)
    else : continue

I think your problem might be caused by the fact that you use the same filename to both open and write to the file, opening it twice:
with open(file, 'rb') as input :
with open(file, 'wb') as outputStream :
The w mode will truncate the file, thus the second line truncates the input.
I'm not sure what your intention is, because you can't really read from the beginning of the file and overwrite it at the same time. Even if you try to write to the end of the file, you'll have to position the file pointer somewhere.
So create an extra output file that has a different name; you can always rename that output file to your input file after both files are closed, thus overwriting your input file.
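Or you could first read the complete file into memory, then write to it:
import io
with open(file, 'rb') as input:
    data = io.BytesIO(input.read())  # the whole PDF is now in memory
inputStream = PyPDF2.PdfFileReader(data)
output = PyPDF2.PdfFileWriter()
for pageNum in range(inputStream.getNumPages()):
    output.addPage(inputStream.getPage(pageNum))
output.encrypt(EmailDict[ID][1])  # encrypt() is a PdfFileWriter method
with open(file, 'wb') as outputStream:
    output.write(outputStream)
Notes:
in your code, you call encrypt() on the reader; as far as I can tell from the PyPDF2 documentation, encrypt() belongs to PdfFileWriter, so the pages have to be added to the writer first
you assign PdfFileWriter() to output, and then assign something else to output in the next line. Hence, you never used the result from the first output = line.
Please check carefully what you're doing, because it feels like there are numerous other problems with your code.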
Alternatively, here are some other tips that may help:
The documentation suggests that you can also use the filename as first argument to PdfFileReader:
stream – A File object or an object that supports the standard read
and seek methods similar to a File object. Could also be a string
representing a path to a PDF file.
So try:
inputStream = PyPDF2.PdfFileReader(file)
You can also try to set the strict argument to False:
strict (bool) – Determines whether user should be warned of all
problems and also causes some correctable problems to be fatal.
Defaults to True.
For example:
inputStream = PyPDF2.PdfFileReader(file, strict=False)
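The earlier suggestion in this answer, writing to a file with a different name and renaming it over the input once both files are closed, can be sketched without PyPDF2. This is a Python 3 illustration under stated assumptions: rewrite_in_place and the transform callback are hypothetical, and the callback stands in for whatever processing (such as encryption) produces the new bytes.

```python
import os
import tempfile


def rewrite_in_place(path, transform):
    """Replace the contents of path with transform(old_bytes) via a temporary file."""
    with open(path, "rb") as src:
        data = src.read()  # the input is fully read and closed before any write

    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as dst:
            dst.write(transform(data))
        os.replace(tmp_path, path)  # rename over the original once both files are closed
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


# Demo with a trivial transform on a temporary file.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "report.bin")
with open(demo_path, "wb") as handle:
    handle.write(b"hello")

rewrite_in_place(demo_path, lambda data: data.upper())

with open(demo_path, "rb") as handle:
    result = handle.read()
print(result)
```

Because the original is only replaced after the new file has been written completely, a failure in the middle leaves the input untouched instead of truncated.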

Using open(file, 'rb') was causing the issue because PdfFileReader() does that automagically. I just removed this with statement and that corrected the problem:
with open(file, 'rb') as input:
    inputStream = PyPDF2.PdfFileReader(input)

This error was raised because my PDF file was empty.
I first filled the PDF file with some data and then read it using PyPDF2.PdfFileReader, and that solved my problem.

Late, but you may be opening an invalid PDF file, or an empty file that happens to be named x.pdf and that you assume is a PDF file.
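A quick pre-check along the lines of the last two answers is to confirm that the file is non-empty and begins with the PDF header bytes before handing it to PyPDF2. The sketch below is a Python 3 illustration; looks_like_pdf and the demo files are hypothetical, and passing the check does not guarantee that the rest of the file is a valid PDF.

```python
import os
import tempfile


def looks_like_pdf(path):
    """Return True if the file is non-empty and starts with the PDF header b'%PDF-'."""
    if os.path.getsize(path) == 0:
        return False
    with open(path, "rb") as handle:
        return handle.read(5) == b"%PDF-"


# Demo: an empty file named .pdf versus a file that at least has a PDF header.
demo_dir = tempfile.mkdtemp()
empty_path = os.path.join(demo_dir, "empty.pdf")
header_path = os.path.join(demo_dir, "header.pdf")
open(empty_path, "wb").close()
with open(header_path, "wb") as handle:
    handle.write(b"%PDF-1.4\n")

empty_ok = looks_like_pdf(empty_path)
header_ok = looks_like_pdf(header_path)
print(empty_ok, header_ok)
```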

Python ValueError: I/O operation on closed file. sample tutorial not working

I am following a tutorial to learn to read and write to a file.
I am getting the following error. I do not understand why.
C:\Python27\python.exe "C:/Automation/Python/Write to files/test3.py"
Traceback (most recent call last):
File "C:/Automation/Python/Write to files/test3.py", line 8, in <module>
f.read('newfile.txt', 'r')
ValueError: I/O operation on closed file
My code is
f = open("newfile.txt", "w")
f.write("hello world\n")
f.write("Another line\n")
f.close()
f.read('newfile.txt', 'r')
print f.read()
I have tried to put f.close at the bottom of the code but I still get the same error.
The write part works if I comment out the f.read. It is failing on the f.read part.

The line after f.close(), that is f.read('newfile.txt', 'r'), should be f = open('newfile.txt', 'r'). After reading, you need to call f.close() again.
That is:
f = open('newfile.txt', 'r')
print f.read()
f.close()
Small note:
The default value for the second argument of open is 'r', so you can simply write open('newfile.txt').

You can't perform I/O operations on file_obj after closing it, i.e. after
file_obj.close()
So if you want to read the same file again, reopen it first:
if file_obj.closed:
    # reopen for reading; reusing the old mode 'w' would truncate the file
    file_obj = open(file_obj.name, 'r')
print(file_obj.read())
file_obj.close()

As illustrated above, once you have closed the file you need to open it again before you can read it:
f = open('newfile.txt', 'r')
print f.read()
f.close()
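Putting the answers together, the tutorial's write-then-read sequence can be written with two with blocks so that each handle is closed automatically. This is a Python 3 sketch (the thread itself uses Python 2 print statements), and the temporary directory is a hypothetical stand-in for the tutorial's working folder.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "newfile.txt")

# First handle: opened for writing, closed when the block ends.
with open(path, "w") as f:
    f.write("hello world\n")
    f.write("Another line\n")

# Second handle: a fresh open() for reading; the default mode is 'r'.
with open(path) as f:
    contents = f.read()

print(contents)
```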
