This question already has answers here: Saving an Object (Data persistence) (5 answers). Closed 9 years ago.
I'm looking for the easiest way to save and load game data for a mastermind game. At the moment the things that need to be saved are: the number of colours being played with, the number of pegs being played with, how many guesses there have been, and the actual answer. Currently I'm saving all this information to a text file, but I'm having difficulty loading it back into the game. I've come across something called pickling and done some reading on it, but I don't fully understand how it works or how it differs from what I'm doing now.
Thanks!
Pickling is a method of serialization: turning Python objects into a byte stream so they can be persisted to disk.
You can handle the pickling/unpickling manually if you want. But you asked for the easiest way, and Python has that. Just use shelve:
import shelve
d = shelve.open('my_mastermind_shelf')
That's... it. Now just treat d as you would treat any other dict; shelve handles all the pickling behind the scenes. The only caveat: remember to call its .close() method when you're done with it.
As I said in my comment, you can use a simple text file to save the current state of your game. You can use open() to open a (new) file. Usually it's like this:
fname = raw_input("Put in the filename: ")
f = open(fname, 'w') # This opens the text file for write operations.
# If the file already exists, it will truncate it.
# If not, it will create a new file.
f.write("Hey, I'm a hippie coder, ho.") # Or, whatever string you want.
# You don't have to be a hippie coder.
f.close()
Oh, and don't mind my poor example. It's 4 in the morning here, and I'm supposed to be in bed.
This question already has answers here: Hashing a file in Python (9 answers). Closed 10 months ago.
I am currently doing a project where I turn my Pi (Model 4 2GB) into a sort of NAS archive. I decided to learn a bit of Python along the way and wrote a small console app to "manage" my database. One function I added hashes the files in the database so it knows when files are corrupted.
To achieve this I hash a file like this:
from hashlib import sha256

with open(file, "rb") as f:
    rbytes = f.read()
readable_hash = sha256(rbytes).hexdigest()
Now when I run this on smaller files it works just fine, but on large files like videos it spits out a MemoryError. I presume this is because it doesn't have enough RAM to hold the whole file?
I've seen that you can break the read up into chunks but does this also work for hashing? If so, how?
Also I'm not a programmer. I want to learn in the process, so the simpler the solution the better - I want to actually understand the code I use. :) Doesn't need to be a super fast algorithm that squeezes out every millisecond either, as long as it gets the job done.
Thanks for any help in advance!
One solution is to hash the file a part at a time, mixing each part with the hash computed so far; the final hash still depends on the whole file, it just takes a few extra steps.
import hashlib

def hash_file(filename, chunk_size):
    hashed = ""  # the running hash, kept as a string
    with open(filename, 'rb') as f:  # read from the file
        while True:  # read chunk_size bytes at a time until the file is exhausted
            chunk = f.read(chunk_size)
            if chunk:  # as long as "chunk" is not empty
                # hash the old hash together with the newly read chunk
                hashed = str(hashlib.md5(str(chunk).encode() + hashed.encode()).digest())
            else:
                break  # end of file: stop the loop
    return hashed

print(hash_file('file.txt', 1000))
By hashing the contents over and over again, we always create a string that derives from the previous string/hash. This way the running value stays small (because MD5 hashes always have the same size) instead of growing with the whole file, while still depending on all of it.
PS: the chunk_size argument can be anything, but more bytes per chunk means more memory, while fewer bytes means longer compute time. Try what fits your needs; 1000–9000 seems to be a good spot.
This question already has answers here: Why is dumping with `pickle` much faster than `json`? (3 answers). Closed 3 years ago.
I want to write a dictionary to a file. There are three methods I've seen, and all of them seem valid, but I am interested in which one will be most optimized or efficient for reading/writing, especially when I have a large dictionary with many entries, and why.
new_dict = {}
new_dict["city"] = "Boston"

# Writing to the file by string conversion
with open(r'C:\Users\xy243\Documents\pop.txt', 'w') as new_file:
    new_file.write(str(new_dict))

# Writing to the file using pickle (pickle writes bytes, so the file
# must be opened in binary mode)
import pickle
with open(r'C:\Users\xy243\Documents\pop.txt', 'wb') as new_file:
    pickle.dump(new_dict, new_file, protocol=pickle.HIGHEST_PROTOCOL)

# Writing to the file using JSON
import json
with open(r'C:\Users\xy243\Documents\pop.txt', 'w') as new_file:
    json.dump(new_dict, new_file)
Efficiency has pretty much been covered in the comments. However, if your dataset is large and you might want to replicate this approach, it would probably be worth considering a SQL alternative, made easier in Python with SQLAlchemy. That way you can access the data quickly while storing it neatly in a database.
Objects of some Python classes are not JSON-serializable. If your dictionary contains such objects as values, then you can't use json.
Sure, objects of some Python classes are not picklable either (for example, keras/tensorflow objects). Then, again, you can't use pickle.
In my opinion, there are more classes that can't be JSON-serialized than classes that can't be pickled. That being said, pickle may be applicable more widely than json.
Efficiency-wise (assuming your dictionary is both JSON-serializable and picklable), pickle will always win, because no string conversion is involved (numbers to strings while serializing, strings back to numbers while deserializing).
If you are trying to transport the object to another process or server, especially one written in another programming language (Java, etc.), then you have to live with json. This applies even if you write to a file and another process reads from that file.
So ... it depends on your use case.
I have my pickle code working properly:
with open(self._prepared_data_location_scalar, 'wb') as output:
    # company1 = Company('banana', 40)
    pickle.dump(X_scaler, output, pickle.HIGHEST_PROTOCOL)
    pickle.dump(Y_scaler, output, pickle.HIGHEST_PROTOCOL)

with open(self._prepared_data_location_scalar, 'rb') as input_f:
    X_scaler = pickle.load(input_f)
    Y_scaler = pickle.load(input_f)
However, I am very curious: how does pickle know which object to load? Does that mean everything has to be loaded in the same sequence it was dumped?
What you have is fine. It's a documented feature of pickle:
It is possible to make multiple calls to the dump() method of the same Pickler instance. These must then be matched to the same number of calls to the load() method of the corresponding Unpickler instance.
There is no magic here, pickle is a really simple stack-based language that serializes python objects into bytestrings. The pickle format knows about object boundaries: by design, pickle.dumps('x') + pickle.dumps('y') is not the same bytestring as pickle.dumps('xy').
If you're interested in some background on the implementation, this article is an easy read that sheds some light on the Python pickler.
Wow, I did not even know you could do this, and I have been using Python for a very long time, so that's totally awesome in my book. However, you really should not do it; it will be very hard to work with later (especially if it isn't you working on it).
I would recommend just doing
pickle.dump({"X":X_scalar,"Y":Y_scalar},output)
...
data = pickle.load(fp)
print "Y_scalar:",data['Y']
print "X_scalar:",data['X']
unless you have a very compelling reason to save and load the data like you were in your question ...
Edit, to answer the actual question: it loads from the start of the file to the end (i.e., it loads the objects in the same order they were dumped).
Yes, pickle loads objects in the order they were saved. Intuitively, pickle appends to the end of the file when it writes (dump), and reads the contents back sequentially (load). Consequently, order is preserved, allowing you to retrieve your data in the exact order you serialized it.
This question already has answers here: Is explicitly closing files important? (7 answers). Closed 7 years ago.
Looking at Learning Python the Hard Way: Example 17 opens a file, copies it, and then closes it. One of the study drills is to simplify the program down as much as possible. However, when I simplified it, there didn't seem to be any file left to close. What I'm trying to understand with the one-liner is what, if anything, needs to be closed.
For example-
in_file = open(from_file)
indata = in_file.read()
...
in_file.close()
This can be simplified to
indata = open(from_file).read()
I understand it's good practice to close the file after opening it, but here both indata and from_file are strings. From some more digging, I understand the one-liner is unpythonic and should be split into two lines for readability, which would result in a file descriptor. However, there is no open file descriptor here to close. Did I miss something? Should I have a file descriptor to close explicitly?
Instead of doing this:
indata = open(from_file).read()
you should use the with keyword:
with open(from_file) as f:
    indata = f.read()
In the former, you still have the file descriptor open even though you no longer hold any reference to it, and there is no guarantee of when the file will be closed. In the second approach, the file is closed as soon as execution leaves the with block.
The reason for this is that indata does not refer to a file object; it refers to the data that the read() method returns. So when you try to call close() on that data, it will not work.
You need to keep a separate reference to the file object returned by open(), or look into using the with statement.
This question already has answers here: Python how to write to a binary file? (7 answers). Closed 8 years ago.
I am a beginner with Python. I can create ASCII files, but binary files seem harder to get into.
Writing binary files has me confused, because I have not been able to find simple code examples that actually show how it is done.
So here are the things I would like to solve:
python: a=254, write value a to binary file.
file1: FE
file2: 00FE
file3: 000000FE
file4: FE00
file5: FE000000
python: string="00AABBCCDDEEFF"
file: 00AABBCCDDEEFF
python: string="999 This is ASCII"
file: 090909[and the rest same way converted]
So that was the writing side, but how to reverse the process? For example, how to read wwxxyyzz from
file: FFDD0045wwxxyyzzFA23
python: wwxxyyzz (as value or string)
python: zzyyxxww (reversed)
If I could find information at this basic a level, it would help me a lot with finding new things to play with.
As you may see, this is my first post, so I'm very much a newbie...
1st EDIT: Okay, first I want to thank you for the fast answer, but as I am so new here, I could not comment or upvote yet. That example fits my file1 case, but file2-5 will still be hard to figure out, even with the provided links, without an equally clear and small (full) example. Also, my question was quickly marked as a duplicate, but the information there was still not quite clear enough for a newbie like me. I will have to continue with trial and error.
Here's a basic example that will accomplish what you wanted for writing binary files (your file1 case). Note that in Python 3 a file opened in binary mode takes bytes, not str, so use bytes([a]) rather than chr(a):
>>> filename = "file"
>>> f = open(filename, "wb")
>>> a = 254
>>> f.write(bytes([a]))  # writes the single byte 0xFE
1
>>> f.close()
For reading binary files, and more examples:
Reading binary file in Python and looping over each byte
https://docs.python.org/2/tutorial/inputoutput.html#reading-and-writing-files
Binary file IO in python, where to start?