Unable to load file via pickle - python

I am using an example program from GitHub that imports the pickle module at the beginning. But when it tries to open a file via pickle, it gives an error.
I can't understand the reason for it.
file = open('df_train_train', 'rb')
df_train_train = pickle.load(file)
file.close()
file = open('df_train_test', 'rb')
df_train_test = pickle.load(file)
file.close()
screen shot of my result.

pickle.load can only read a file that already exists and contains pickled data, so you need to dump the object first and then load it. Try the following:
# dump df_train_train
file = open('df_train_train', 'wb')
pickle.dump(df_train_train, file)
file.close()
Then
file = open('df_train_train', 'rb')
df_train_train = pickle.load(file)
file.close()
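A minimal sketch of the full round trip, assuming the data is something picklable that you already have in memory. The dictionary below is a hypothetical stand-in for the real df_train_train, and the temporary directory is only there so the sketch does not touch real files. Using with closes each file even if pickling fails.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for the real df_train_train object.
df_train_train = {"feature": [1, 2, 3], "label": [0, 1, 0]}

# A temporary directory keeps this sketch from touching real files.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "df_train_train")

    # Dump first: this creates the file and writes the pickled bytes.
    with open(path, "wb") as f:
        pickle.dump(df_train_train, f)

    # Load afterwards: the file now exists and holds valid pickle data.
    with open(path, "rb") as f:
        loaded = pickle.load(f)

print(loaded == df_train_train)
```

If the load step runs on its own, without the file having been written by an earlier dump, it fails, which is the situation described in the question.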

Related

Having Trouble Loading a Pickle File

I am trying to create a small game for fun, and I want to save and load the scores from previous runs. I started a test file to mess around and figure out how pickling works. I have a pickle file with a small set of numbers. How do I add numbers to the pickle file and save it for the next run?
Currently I have it like this:
new_score = 9
filename = "scoreTest.pk"
outfile = open(filename,'wb')
infile = open(filename,'rb')
with infile as f:
    scores = pickle.load(f)
scores.add(new_score)
pickle.dump(scores, outfile)
When I run it like this I get this error:
EOFError: Ran out of input
If someone could please tell me what is wrong and how to do it correctly, that would be great. Apologies for any suboptimal code; I'm new to coding.
You are trying to juggle a reader and a writer on the same file at the same time. Opening the file with open(filename, 'wb') truncates it immediately, so there is no data left for the reader, hence the EOFError. You should only open the file when you really need to use it. It's also better to write to a temporary file and then rename it, so that if something goes wrong you haven't lost your data.
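import os
import pickle
new_score = 9
filename = "scoreTest.pk"
tmp_filename = "scoreTest.tmp"
# Load the saved scores, falling back to a default if the file is missing or empty.
try:
    with open(filename, 'rb') as infile:
        scores = pickle.load(infile)
except (IOError, EOFError):
    scores = set()  # assumed default; the question calls scores.add(), so a set fits
scores.add(new_score)
# Write to a temporary file first, then swap it into place.
with open(tmp_filename, 'wb') as outfile:
    pickle.dump(scores, outfile)
os.replace(tmp_filename, filename)  # like os.rename, but also overwrites an existing file on Windows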
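The same idea can be wrapped in a small function so that each run of the game just calls it once. This is only a sketch: add_score is a hypothetical helper name, and it assumes the scores are stored as a set, as the question's scores.add() suggests.

```python
import os
import pickle
import tempfile


def add_score(path, new_score):
    """Load the saved set of scores, add one, and save it back safely."""
    try:
        with open(path, "rb") as infile:
            scores = pickle.load(infile)
    except (FileNotFoundError, EOFError):
        scores = set()  # assumption: scores are kept in a set

    scores.add(new_score)

    # Write the new data next to the target, then swap it into place.
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as outfile:
        pickle.dump(scores, outfile)
    os.replace(tmp_path, path)
    return scores


# Simulate two runs of the game against a throwaway file.
with tempfile.TemporaryDirectory() as tmp_dir:
    score_file = os.path.join(tmp_dir, "scoreTest.pk")
    first_run = add_score(score_file, 9)
    second_run = add_score(score_file, 12)
```

The reader is fully closed before the writer is opened, so the two never compete for the same file.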

How to edit and repickle?

From what I understand, the only way to edit an object in a pickle file is to unpickle each object, edit the desired object, and repickle everything back into the original file.
This is what I tried doing:
pickleWrite = open(fileName, 'wb')
pickleRead = open(fileName, 'rb')
#unpickle objects and put it in dataList
dataList = list()
try:
    while True:
        dataList.append(pickle.load(pickleRead))
except EOFError:
    pass
#change desired pickle object
dataList[0] = some change
#clear pickle file
pickleWrite.truncate(0)
#repickle each item in data list
for data in dataList:
    pickle.dump(data, fileName)
For some reason, this makes the pickle file have some large number of unknown symbols at the front of the file making it unpickleable.
Error when we try to unpickle:
_pickle.UnpicklingError: invalid load key, '\x00'.
I would suggest not opening several handles to the same file like this. Opening the file with 'wb' truncates it immediately, before the read handle has loaded anything.
Instead, you can try:
# Read the contents
with open(filename, 'rb') as file:
    dataList = pickle.load(file)
# Do something to dataList
# Overwrite the pickle file
with open(filename, 'wb') as file:
    pickle.dump(dataList, file)
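If the file really holds several separately pickled objects, as the loop in the question suggests, the same read-then-write order still applies. A minimal sketch, where load_all and dump_all are hypothetical helper names and the strings are placeholder data:

```python
import os
import pickle
import tempfile


def load_all(path):
    """Return every object stored in a pickle file, in order."""
    items = []
    with open(path, "rb") as f:
        while True:
            try:
                items.append(pickle.load(f))
            except EOFError:
                break  # no more pickled objects in the file
    return items


def dump_all(path, items):
    """Overwrite the file with one pickle per item."""
    with open(path, "wb") as f:
        for item in items:
            pickle.dump(item, f)


with tempfile.TemporaryDirectory() as tmp_dir:
    file_name = os.path.join(tmp_dir, "data.pk")
    dump_all(file_name, ["a", "b", "c"])

    data_list = load_all(file_name)   # the read handle is closed after this call
    data_list[0] = "changed"          # edit the desired object
    dump_all(file_name, data_list)    # only now reopen the file for writing

    result = load_all(file_name)
```

Note that pickle.dump takes an open file object as its second argument, not the file name.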

IOError when downloading and decompressing gzip file

I'm trying to download and decompress a gzip file, then convert the resulting TSV file into CSV, which would be easier to parse. I am trying to gather the data from the "Download Table" link in this URL. My code is as follows, using the same idea as in this post. However, I get the error IOError: [Errno 2] No such file or directory: 'file=data/irt_euryld_d.tsv' on the line with open(outFilePath, 'w') as outfile:
import os
import urllib2
import gzip
import StringIO
baseURL = "http://ec.europa.eu/eurostat/estat-navtree-portlet-prod/BulkDownloadListing?"
filename = "D:\Sidney\irt_euryld_d.tsv.gz" #Edited after heinst's comment below
outFilePath = filename[:-3]
response = urllib2.urlopen(baseURL + filename)
compressedFile = StringIO.StringIO()
compressedFile.write(response.read())
compressedFile.seek(0)
decompressedFile = gzip.GzipFile(fileobj=compressedFile, mode='rb')
with open(outFilePath, 'w') as outfile:
    outfile.write(decompressedFile.read())
#Now have to deal with tsv file
import csv
with open(outFilePath,'rb') as tsvin, open('ECB.csv', 'wb') as csvout:
    tsvin = csv.reader(tsvin, delimiter='\t')
    csvout = csv.writer(csvout) #Converting output into CSV Format
Thank You
The path you set filename to is not a valid place to write a file: open() in 'w' mode creates the file itself but not any missing directories, so the directory part of the path must already exist. Change filename = "data/irt_euryld_d.tsv.gz" to a valid path to wherever you want the irt_euryld_d.tsv.gz file to live. For example, if I wanted the file on my desktop, I would set filename = "/Users/heinst/Desktop/data/irt_euryld_d.tsv.gz". Since this is a valid path, Python will no longer raise the No such file or directory error.
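A rough sketch of the whole pipeline with the output directory created up front. It is written for Python 3, whereas the question's code uses the Python 2 modules urllib2 and StringIO, and it replaces the download with a tiny made-up gzipped TSV so that it runs without network access. The column names and values are placeholders, not real Eurostat data.

```python
import csv
import gzip
import io
import os
import tempfile

# Stand-in for the bytes returned by response.read(): a tiny gzipped TSV.
compressed = gzip.compress(b"geo\t2015\t2016\nDE\t0.5\t0.1\nFR\t0.8\t0.4\n")

with tempfile.TemporaryDirectory() as base_dir:
    out_dir = os.path.join(base_dir, "data")
    os.makedirs(out_dir, exist_ok=True)  # open() will not create this directory for us

    tsv_path = os.path.join(out_dir, "irt_euryld_d.tsv")
    csv_path = os.path.join(out_dir, "ECB.csv")

    # Decompress in memory and write the TSV to a path that now exists.
    with gzip.GzipFile(fileobj=io.BytesIO(compressed), mode="rb") as gz:
        with open(tsv_path, "wb") as outfile:
            outfile.write(gz.read())

    # Re-read the TSV and write each row back out as CSV.
    with open(tsv_path, newline="") as tsvin, open(csv_path, "w", newline="") as csvout:
        csv.writer(csvout).writerows(csv.reader(tsvin, delimiter="\t"))

    with open(csv_path, newline="") as f:
        rows = list(csv.reader(f))
```

Keeping the local output path separate from the string appended to the download URL avoids ending up with a path such as 'file=data/...' that names a directory which does not exist.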

Python code to move file over socket

SO family, I am trying to write an application that can transfer files between two computers. I currently have this working using something like this:
On the client side:
file = open(srcfile, 'r')
content = file.read()
file.close()
send_message(srcfile)
send_message(content)
On Server side:
filename = receive_message(message)
content = receive_message(message)
file = open(filename, 'w')
file.write(content)
file.close()
This seems to work for text files, but it doesn't work for other file types.
I'm thinking there has to be a better way. Any suggestions?
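You need to use
file = open(srcfile, 'rb')
on the client and
file = open(filename, 'wb')
on the server, respectively. The b means binary mode, which reads and writes the bytes unchanged instead of treating them as text.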
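To illustrate why the data has to stay as bytes end to end, here is a small sketch that sends arbitrary binary content over a local socket pair with a length prefix. The question does not show how send_message and receive_message are implemented, so send_bytes, recv_exact and recv_bytes below are hypothetical replacements, and content stands in for what open(srcfile, 'rb').read() would return.

```python
import socket
import struct
import threading


def send_bytes(sock, payload):
    """Send a 4-byte big-endian length, then the payload itself."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, size):
    """Keep reading until exactly `size` bytes have arrived."""
    chunks = []
    while size > 0:
        chunk = sock.recv(size)
        if not chunk:
            raise ConnectionError("socket closed before all data arrived")
        chunks.append(chunk)
        size -= len(chunk)
    return b"".join(chunks)


def recv_bytes(sock):
    """Read one length-prefixed message."""
    (size,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, size)


# Bytes that are not valid text, like the contents of an image or archive.
content = bytes(range(256)) * 4

client, server = socket.socketpair()
sender = threading.Thread(target=send_bytes, args=(client, content))
sender.start()
received = recv_bytes(server)
sender.join()
client.close()
server.close()
```

On the receiving side, the bytes would then be written with open(filename, 'wb') so that nothing is re-encoded on the way to disk.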

Python - how to open a file that is not yet written to disk?

I am using a script to strip EXIF data from uploaded JPGs in Python before writing them to disk. I'm using Flask, and the file comes in through the request:
file = request.files['file']
I then strip the EXIF data and save it:
f = open(file)
image = f.read()
f.close()
outputimage = stripExif(image)
f = ('output.jpg', 'w')
f.write(outputimage)
f.close()
f.save(os.path.join(app.config['IMAGE_FOLDER'], filename))
Open isn't working because it only takes a string as an argument, and if I try to just set f=file, it throws an error about tuple objects not having a write attribute. How can I pass the current file into this function before it is read?
file is a FileStorage, described in http://werkzeug.pocoo.org/docs/datastructures/#werkzeug.datastructures.FileStorage
As the doc says, stream represents the stream of data for this file, usually in the form of a pointer to a temporary file, and most functions are proxied.
You probably can do something like:
file = request.files['file']
image = file.read()
outputimage = stripExif(image)
f = open(os.path.join(app.config['IMAGE_FOLDER'], 'output.jpg'), 'wb')  # binary mode, since JPEG data is bytes
f.write(outputimage)
f.close()
Try the io package, which has a BufferedReader(), along these lines:
import io
f = io.BufferedReader(request.files['file'])
...
file = request.files['file']
image = stripExif(file.read())
file.close()
filename = 'whatever' # maybe you want to use request.files['file'].filename
dest_path = os.path.join(app.config['IMAGE_FOLDER'], filename)
with open(dest_path, 'wb') as f:
    f.write(image)
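The pattern in these answers does not depend on Flask itself: anything with a read() method can be processed and saved this way. A minimal sketch, where save_processed_upload is a hypothetical helper, fake_strip_exif is a placeholder for the question's stripExif (its real implementation is not shown), and io.BytesIO stands in for request.files['file']:

```python
import io
import os
import tempfile


def save_processed_upload(upload, dest_path, process):
    """Read an uploaded file-like object, transform its bytes, and save them."""
    data = process(upload.read())     # no open() needed: the upload is already a stream
    with open(dest_path, "wb") as f:  # binary mode, since image data is bytes
        f.write(data)
    return len(data)


def fake_strip_exif(image_bytes):
    """Placeholder for the question's stripExif; returns the bytes unchanged."""
    return image_bytes


# io.BytesIO stands in for request.files['file'], which also exposes read().
upload = io.BytesIO(b"\xff\xd8fake-jpeg-bytes\xff\xd9")

with tempfile.TemporaryDirectory() as image_folder:
    dest = os.path.join(image_folder, "output.jpg")
    written = save_processed_upload(upload, dest, fake_strip_exif)
    with open(dest, "rb") as f:
        saved = f.read()
```

In a real view function, the upload argument would be request.files['file'] and image_folder would be app.config['IMAGE_FOLDER'].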
