PyPDF2 IOError: [Errno 22] Invalid argument on PyPdfFileReader Python 2.7

Goal = Open file, encrypt file, write encrypted file.
Trying to use the PyPDF2 module to accomplish this. I have verified that "input" is a file-type object. I have researched this error, and it apparently translates to "file not found". I believe it is linked somehow to the file or file path, but I am unsure how to debug or troubleshoot it. I am getting the following error:
Traceback (most recent call last):
  File "CommissionSecurity.py", line 52, in <module>
    inputStream = PyPDF2.PdfFileReader(input)
  File "build\bdist.win-amd64\egg\PyPDF2\pdf.py", line 1065, in __init__
  File "build\bdist.win-amd64\egg\PyPDF2\pdf.py", line 1660, in read
IOError: [Errno 22] Invalid argument
Below is the relevant code. I'm not sure how to correct this issue because I'm not really sure what the issue is. Any guidance is appreciated.
for ID in FileDict:
    if ID in EmailDict :
        path = "C:\\Apps\\CorVu\\DATA\\Reports\\AlliD\\Monthly Commission Reports\\Output\\pdcom1\\"
        #print os.listdir(path)
        file = os.path.join(path + FileDict[ID])
        with open(file, 'rb') as input:
            print type(input)
            inputStream = PyPDF2.PdfFileReader(input)
            output = PyPDF2.PdfFileWriter()
            output = inputStream.encrypt(EmailDict[ID][1])
            with open(file, 'wb') as outputStream:
                output.write(outputStream)
    else : continue

I think your problem might be caused by the fact that you use the same filename to both open and write to the file, opening it twice:
with open(file, 'rb') as input :
with open(file, 'wb') as outputStream :
The w mode truncates the file, so the second line truncates the input.
I'm not sure what your intention is, because you can't read from the beginning of the file and overwrite it at the same time. Even if you try to write to the end of the file, you'll have to position the file pointer somewhere.
So create an extra output file with a different name; you can always rename that output file to your input file after both files are closed, thus overwriting your input file.
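The rename approach can be sketched with the standard library alone. This is a minimal sketch, not PyPDF2-specific: `rewrite_in_place` and `transform` are hypothetical names, and `transform` stands in for whatever reads the input stream and writes the encrypted output. It assumes Python 3.3 or later for `os.replace`.

```python
import os
import tempfile

def rewrite_in_place(path, transform):
    """Write transform's output to a temp file, then swap it over `path`.

    `transform(src, dst)` is a hypothetical callback that reads from the
    open binary file `src` and writes the new content to `dst`.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the final rename stays
    # on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as dst, open(path, "rb") as src:
            transform(src, dst)
        # Both files are closed here; replace the original in one step.
        os.replace(tmp_path, path)
    except Exception:
        # On failure, leave the original untouched and drop the temp file.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Because the original is only replaced after the new file has been fully written and closed, a failure partway through leaves the input file intact.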
Or you could first read the complete file into memory, then write to it:
with open(file, 'rb') as input:
    inputStream = PyPDF2.PdfFileReader(input)
    output = PyPDF2.PdfFileWriter()
    output = input.encrypt(EmailDict[ID][1])
with open(file, 'wb') as outputStream:
    output.write(outputStream)
Notes:
you assign inputStream, but never use it
you assign PdfFileWriter() to output, and then assign something else to output in the next line. Hence, you never used the result from the first output = line.
Please check carefully what you're doing, because it feels there are numerous other problems with your code.
Alternatively, here are some other tips that may help:
The documentation suggests that you can also use the filename as first argument to PdfFileReader:
stream – A File object or an object that supports the standard read
and seek methods similar to a File object. Could also be a string
representing a path to a PDF file.
So try:
inputStream = PyPDF2.PdfFileReader(file)
You can also try to set the strict argument to False:
strict (bool) – Determines whether user should be warned of all
problems and also causes some correctable problems to be fatal.
Defaults to True.
For example:
inputStream = PyPDF2.PdfFileReader(file, strict=False)

Using open(file, 'rb') was causing the issue because PdfFileReader() does that automatically. I just removed the with statement and that corrected the problem.
with open(file, 'rb') as input:
    inputStream = PyPDF2.PdfFileReader(input)

This error is raised when the PDF file is empty.
My PDF file was empty, which is why I got the error. Once I filled the PDF file with some data and then read it using PyPDF2.PdfFileReader, the problem was solved.

Late but, you may be opening an invalid PDF file or an empty file that's named x.pdf and you think it's a PDF file
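A quick sanity check along these lines can rule out the empty-file and not-really-a-PDF cases before handing the path to PyPDF2. This is only a sketch: `looks_like_pdf` is a hypothetical helper, and checking the leading `%PDF-` header is a cheap heuristic, not a full validation.

```python
import os

def looks_like_pdf(path):
    """Return True if `path` is non-empty and starts with the PDF magic bytes."""
    if os.path.getsize(path) == 0:
        return False  # an empty file cannot be parsed as a PDF
    with open(path, "rb") as f:
        # Well-formed PDFs start with a "%PDF-" header; this is only a
        # cheap sanity check, not a full validation of the file.
        return f.read(5) == b"%PDF-"
```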

Related

Attempt to use the open() function failing

I'm trying to learn to manipulate files in Python, but I can't get the open function to work. I have made a .txt file called foo that holds the content "hello world!" in my user directory (/home/yonatan) and typed this line into the shell:
open('/home/yonatan/foo.txt')
What I get in return is:
<_io.TextIOWrapper name='/home/yonatan/foo.txt' mode='r' encoding='UTF-8'>
I get what that means, but why don't I get the content?

open() returns a file object.
You then need to use read() to read the whole file
f = open('/home/yonatan/foo.txt', 'r')
contents = f.read()
Or you can use readline() to read just one line
line = f.readline()
and don't forget to close the file at the end
f.close()

An example iterating through the lines of the file (using with, which ensures file.close() gets called at the end of its lexical scope):
file_path = '/home/yonatan/foo.txt'
with open(file_path) as file:
    for line in file:
        print(line)
A great resource on I/O and file handling operations.

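The mode defaults to 'r', so leaving it out is not the problem, though you can pass it explicitly. What you are missing is a call to read() on the returned file object.
Try:
f = open("/home/yonatan/foo.txt", "r")
print(f.read())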

Delete a file after reading

In my code, the user uploads a file, which is saved on the server and read using the server path. I'm trying to delete the file from that path after I'm done reading it, but I get the following error instead:
An error occurred while reading file. [WinError 32] The process cannot access the file because it is being used by another process
I'm reading the file using with, and I've tried f.close() and also f.closed, but it's the same error every time.
This is my code:
f = open(filePath)
with f:
    line = f.readline().strip()
    tempLst = line.split(fileSeparator)
    if(len(lstHeader) != len(tempLst)):
        headerErrorMsg = "invalid headers"
        hjsonObj["Line No."] = 1
        hjsonObj["Error Detail"] = headerErrorMsg
        data['lstErrorData'].append(hjsonObj)
        data["status"] = True
        f.closed
        return data
f.closed
after this code I call the remove function:
os.remove(filePath)
Edit: using with open(filePath) as f: and then trying to remove the file gives the same error.

Instead of:
f.closed
You need to say:
f.close()
closed is just a boolean property on the file object that indicates whether the file is closed.
close() is a method on the file object that actually closes the file.
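The difference is easy to see in a short experiment. This is a sketch using a throwaway temporary file; the variable names are illustrative only.

```python
import os
import tempfile

# Create a throwaway file to experiment with.
fd, path = tempfile.mkstemp()
os.close(fd)

f = open(path)
f.closed                        # merely evaluates the attribute; file stays open
still_open = not f.closed       # nothing has closed the file yet
f.close()                       # actually releases the file handle
closed_after_call = f.closed    # the attribute now reflects the closed state

os.remove(path)                 # attempted only after the handle is released
```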
Side note: attempting a file delete after closing a file handle is not 100% reliable. The file might still be getting scanned by the virus scanner or indexer. Or some other system hook is holding on to the file reference, etc... If the delete fails, wait a second and try again.

Use below code:
import os
os.startfile('your_file.py')
To delete after completion:
os.remove('your_file.py')

This
import os
path = 'path/to/file'
with open(path) as f:
    for l in f:
        print(l, end='')
os.remove(path)
should work: the with statement automatically closes the file after the nested block of code.
If it fails, the file could be in use by some external process. In that case you can retry until the delete succeeds:
import time
while True:
    try:
        os.remove(path)
        break
    except OSError:
        time.sleep(1)

There is probably an application that is opening the file; check and close the application before executing your code:
os.remove(file_path)
Delete files that are not used by another application.

Why do I get "Pickle - EOFError: Ran out of input" reading an empty file?

I am getting an interesting error while trying to use Unpickler.load(), here is the source code:
open(target, 'a').close()
scores = {};
with open(target, "rb") as file:
    unpickler = pickle.Unpickler(file);
    scores = unpickler.load();
    if not isinstance(scores, dict):
        scores = {};
Here is the traceback:
Traceback (most recent call last):
  File "G:\python\pendu\user_test.py", line 3, in <module>:
    save_user_points("Magix", 30);
  File "G:\python\pendu\user.py", line 22, in save_user_points:
    scores = unpickler.load();
EOFError: Ran out of input
The file I am trying to read is empty.
How can I avoid getting this error, and get an empty variable instead?

Most of the answers here have dealt with how to manage EOFError exceptions, which is really handy if you're unsure whether the pickled object is empty or not.
However, if you're surprised that the pickle file is empty, it could be because you opened the file in 'wb' or some other mode that overwrites it.
For example:
filename = 'cd.pkl'
with open(filename, 'wb') as f:
    classification_dict = pickle.load(f)
This will overwrite the pickled file. You might have done this by mistake before using:
...
with open(filename, 'rb') as f:
And then got the EOFError because the previous block of code overwrote the cd.pkl file.
When working in Jupyter or in the console (Spyder), I usually write a wrapper around the reading/writing code and call the wrapper from then on. This avoids common read-write mistakes and saves a bit of time if you're going to be reading the same file multiple times.
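Such a wrapper might look like the following minimal sketch. The names `save_pickle` and `load_pickle` are hypothetical; the point is that each mode string is written exactly once, so a copy-paste slip cannot open the file with 'wb' when reading.

```python
import pickle

def save_pickle(obj, filename):
    """Serialize `obj` to `filename`; the only place that opens with 'wb'."""
    with open(filename, "wb") as f:
        pickle.dump(obj, f)

def load_pickle(filename):
    """Load an object from `filename`; the only place that opens with 'rb'."""
    with open(filename, "rb") as f:
        return pickle.load(f)
```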

I would check that the file is not empty first:
import os
import pickle
scores = {} # scores is an empty dict already
if os.path.getsize(target) > 0:
    with open(target, "rb") as f:
        unpickler = pickle.Unpickler(f)
        # if file is not empty scores will be equal
        # to the value unpickled
        scores = unpickler.load()
Also open(target, 'a').close() is doing nothing in your code and you don't need to use ;.

It is very likely that the pickled file is empty.
It is surprisingly easy to overwrite a pickle file if you're copying and pasting code.
For example the following writes a pickle file:
pickle.dump(df,open('df.p','wb'))
And if you copied this code to reopen it, but forgot to change 'wb' to 'rb' then you would overwrite the file:
df=pickle.load(open('df.p','wb'))
The correct syntax is
df=pickle.load(open('df.p','rb'))

As you can see, that's actually a natural error.
A typical construct for reading from an Unpickler object looks like this:
try:
    data = unpickler.load()
except EOFError:
    data = list() # or whatever you want
EOFError is raised simply because the file being read was empty; it just means End of File.

You can catch that exception and return whatever you want from there.
open(target, 'a').close()
scores = {};
try:
    with open(target, "rb") as file:
        unpickler = pickle.Unpickler(file);
        scores = unpickler.load();
        if not isinstance(scores, dict):
            scores = {};
except EOFError:
    return {}

if path.exists(Score_file):
    try :
        with open(Score_file , "rb") as prev_Scr:
            return Unpickler(prev_Scr).load()
    except EOFError :
        return dict()

Had the same issue. It turns out when I was writing to my pickle file I had not used the file.close(). Inserted that line in and the error was no more.

I have encountered this error many times, and it always occurs because I didn't close the file after writing to it. If we don't close the file, the content stays in the buffer and the file stays empty.
To save the content to the file, either the file should be closed or the file object should go out of scope.
That's why loading gives the "Ran out of input" error: the file is empty. So you have two options:
file_object.close()
file_object.flush(): if you don't want to close the file partway through the program, you can use flush(), which forces the buffered content out to the file.
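The flush() option can be sketched as follows. The temporary file and the sample dictionary are illustrative assumptions; the writer handle stays open while a second handle reads the pickle back.

```python
import os
import pickle
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)

writer = open(path, "wb")
pickle.dump({"Magix": 30}, writer)
writer.flush()  # push buffered bytes to the file without closing it

# A second handle can now read the complete pickle.
with open(path, "rb") as reader:
    scores = pickle.load(reader)

writer.close()
os.remove(path)
```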

This error comes when your pickle file is empty (0 Bytes). You need to check the size of your pickle file first. This was the scenario in my case. Hope this helps!

Note that opening the file in 'a' mode (or any mode containing 'a') also causes this error: append modes position the file pointer at the end of the file, so there is nothing left to read.
pointer = open('makeaafile.txt', 'ab+')
tes = pickle.load(pointer, encoding='utf-8')
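If the file really has to be opened in an append mode, rewinding with seek(0) before loading avoids the error. A small sketch, assuming Python 3 and a throwaway temporary file with sample data:

```python
import os
import pickle
import tempfile

fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    pickle.dump([1, 2, 3], f)

with open(path, "ab+") as f:
    try:
        pickle.load(f)          # position starts at the end of the file
        hit_eof = False
    except EOFError:
        hit_eof = True
    f.seek(0)                   # rewind to the beginning before reading
    data = pickle.load(f)

os.remove(path)
```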

temp_model = os.path.join(models_dir, train_type + '_' + part + '_' + str(pc))
# print(type(temp_model)) # <class 'str'>
filehandler = open(temp_model, "rb")
# print(type(filehandler)) # <class '_io.BufferedReader'>
try:
    pdm_temp = pickle.load(filehandler)
except UnicodeDecodeError:
    # rewind first: the failed attempt has already consumed part of the file
    filehandler.seek(0)
    pdm_temp = pickle.load(filehandler, fix_imports=True, encoding="latin1")

from os.path import getsize as size
from pickle import *
if size(target)>0:
    with open(target,'rb') as f:
        scores={i:j for i,j in enumerate(load(f))}
else: scores={}
Line 1.
We import the function getsize from the os.path module and rename it with 'as' for brevity. The important point is that we load only the single function we need, not the whole library.
Line 2.
The same idea, but when we don't know at the outset which names from a module we will use, we can import all of them with '*'.
Line 3.
A conditional statement: if the size of the file is greater than 0 (meaning it is not empty). 'target' is a variable that must be defined earlier.
Just an example: target=(r'd:\dir1\dir.2..\YourDataFile.bin')
Line 4.
'with open(target) as file:' is a construct for opening any file; you then don't need to call file.close(). It helps to avoid some typical errors such as "Ran out of input" or permission errors.
The 'rb' mode means 'read binary': you can only read (load) the data from your binary file, but you can't modify or rewrite it.
Line 5.
A dict comprehension that builds the dictionary from the loaded data.
Line 6.
If the data file is empty, no error is raised; you just get an empty dictionary.

Confusing Error when Reading from a File in Python

I'm having a problem opening the names.txt file. I have checked that I am in the correct directory. Below is my code:
import os
print(os.getcwd())
def alpha_sort():
    infile = open('names', 'r')
    string = infile.read()
    string = string.replace('"','')
    name_list = string.split(',')
    name_list.sort()
    infile.close()
    return 0
alpha_sort()
And the error I got:
FileNotFoundError: [Errno 2] No such file or directory: 'names'
Any ideas on what I'm doing wrong?

You mention in your question body that the file is "names.txt", however your code shows you trying to open a file called "names" (without the ".txt" extension). (Extensions are part of filenames.)
Try this instead:
infile = open('names.txt', 'r')

As a side note, make sure that when you open text files you use universal newline mode, as Windows and Mac/Unix have different representations of line endings (\r\n vs \n etc.). Universal mode gets Python to handle this, so it's generally a good idea to use it whenever you need to read a text file. In Python 3, text mode already does this translation by default, and the 'U' flag is deprecated (removed in Python 3.11), so it is only needed on Python 2.
On Python 2 the code would just look like this:
infile = open( 'names.txt', 'rU' ) #capital U opens the file in universal newline mode

This doesn't solve that issue, but you might consider using with when opening files:
with open('names', 'r') as infile:
    string = infile.read()
    string = string.replace('"','')
    name_list = string.split(',')
    name_list.sort()
return 0
This closes the file for you and handles exceptions as well.

How can I edit a plain text file in Python?

I've been trying to create a python script that edits a file, but if the file is not already there, it has an error like this:
Traceback (most recent call last):
  File "openorcreatfile.py", line 56, in <module>
    fileHandle = open(pathToFile, 'w')
IOError: [Errno 2] No such file or directory: '/home/me/The_File.txt'
It works fine if the file exists. I've also tried this:
fileHandle = open(pathToFile, 'w+')
But it comes up with the same error. Do I need to explicitly check if the file is there? If so, how do I create the file?
EDIT: Sorry, I realized the folder was missing. I'm an idiot.

The error says "No such file or directory."
Since you're trying to create a file, that must not be what's missing. So you need to create the /home/me/ directory.
See os.makedirs.
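A minimal sketch of that fix: create the missing parent directory before opening the file. The helper name `open_for_writing` is hypothetical.

```python
import os

def open_for_writing(path_to_file):
    """Create any missing parent directories, then open the file with 'w'."""
    parent = os.path.dirname(path_to_file)
    if parent and not os.path.isdir(parent):
        os.makedirs(parent)  # creates intermediate directories as needed
    return open(path_to_file, "w")
```

Called as `open_for_writing('/home/me/The_File.txt')`, this would create /home/me/ first if it does not exist, provided the process has permission to do so.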

fo = open("myfile.txt", "wb")
fo.write('blah')
fo.close()
That's it, this will do the job.

myfile = open('test.txt','w')
myfile.write("This is my first text file written in python\n")
myfile.close()

To check if the file is there you can do:
import os.path
os.path.isfile(pathToFile)
so you can handle it, only if it exists:
if os.path.isfile(pathToFile):
    fileHandle = open(pathToFile, 'w')
else:
    pass #or other thing
There are several ways to create a file in python, but if you want to create a text file, take a look at numpy.savetxt, which I think is one of the easiest and most effective ways

with open("filename.txt", "w") as f:
    f.write("test")
