I've made a function to write objects to a file:
def StoreToFile(Thefile,objekt):
utfil=None
utfil=open(Thefile,'wb')
pickle.dump(objekt,utfil)
return True
if utfil is not None:
utfil.close()
And my code to use this function:
for st in Stadion:
StoreToFile(r'C:\pytest\prod.psr',st)
This works like a charm, but how can I put the objects back to a list object?
I have the code to extract the objects, but I'm unable to see how I can iterate through the objects to put them in a new list.
So far I have this:
def ReadFromFile(filename):
infile=None
infile=open(filename,'rb')
objekt=pickle.load(infile)
for st in Stadion:
StoreToFile(r'C:\pytest\prod.psr',st)
This works like a charm.
If you mean "run without errors", then yes, it does "work". This code repeatedly overwrites the file, so it will only contain the last item in the list.
Use this instead:
StoreToFile(r'C:\pytest\prod.psr', Stadion)
Your ReadFromFile() function should work just fine as it is and return a list (assuming above fix).
Also not sure what this does:
return True
if Thefile.close()
Your code is silly the utfil = None business doesn't make sense, because the only way open(...) can fail is with an exception, in which case the rest of the function won't be executed anyway. The right way to do this is with a context manager: the with statement.
Instead, do:
def storeToFile(path, o):
try:
with open(path, "wb") as f:
pickle.dump(o, f)
return True
except pickle.PicklingError, IOError:
return False
You should just pickle the whole list.
To pickle the objects to the same file, use this function:
def storeToFile(fileName, o):
try:
with open(fileName, "a") as file:
cPickle.dump(o, file)
return True
except pickle.PicklingError, IOError:
return False
Note that the file is opened with mode "a", so that new data is appended to the end.
To load the objects again, use this:
def loadEntireFile(fileName):
try:
with open(fileName) as file:
unpickler = cPickle.Unpickler(file)
while True:
yield unpickler.load()
except EOFError:
pass
This function tries to load objects from the file until it encounters EOF, which is indicated by an EOFError. You can use it like this:
foo = [str(x) for x in range(10)]
for x in foo:
storeToFile("test.pickle", x)
foo2 = list(load("test.pickle"))
The list function takes any iterable and builds a list from it. The function loadEntireFile contains a yield statement, making it a generator, so it can be passed to any function taking an iterable.
Related
So far I've had almost no issues with scope in my project, however the one big exception is de-pickling files.
f = open("save_file.dat", "rb+")
Plr = pickle.load(f)[0]
f.close()
^ This code works just fine. However if I try to do the de-pickling inside a function, it seems to turn local.
def load():
f = open("save_file.dat", "rb+")
Plr = pickle.load(f)[0]
f.close()
So I decided to try and make it a method.
def load(self):
...
f = open("save_file.dat", "rb+")
self = pickle.load(f)[0]
f.close()
Usually when I try to use methods, self refers to the instance of the class, and it changes it's value, but here it seems to once again, be local.
EDIT it seems I was being dumb for trying to return the function inside another function, which was not the main one.
Here's something that works:
def load():
f = open("save_file.dat", "rb+")
return pickle.load(f)[0]
And then after that you can change the variable you need to like this:
Plr = load()
I'm writing a unit test for a function that takes an array of dictionaries and ends up saving it in a CSV. I'm trying to mock it with pytest as usual:
csv_output = (
"Name\tSurname\r\n"
"Eve\tFirst\r\n"
)
with patch("builtins.open", mock_open()) as m:
export_csv_func(array_of_dicts)
assert m.assert_called_once_with('myfile.csv', 'wb') is None
[and here I want to gather all output sent to the mock "m" and assert it against "csv_output"]
I cannot get in any simple way all the data sent to the mock during the open() phase by csv to do the comparison in bulk, instead of line by line. To simplify things, I verified that the following code mimics the operations that export_csv_func() does to the mock:
with patch("builtins.open", mock_open()) as m:
with open("myfile.csv", "wb") as f:
f.write("Name\tSurname\r\n")
f.write("Eve\tFirst\r\n")
When I dig into the mock, I see:
>>> m
<MagicMock name='open' spec='builtin_function_or_method' id='4380173840'>
>>> m.mock_calls
[call('myfile.csv', 'wb'),
call().__enter__(),
call().write('Name\tSurname\r\n'),
call().write('Eve\tFirst\r\n'),
call().__exit__(None, None, None)]
>>> m().write.mock_calls
[call('Name\tSurname\r\n'), call('Eve\tFirst\r\n')]
>>> dir(m().write.mock_calls[0])
['__add__'...(many methods), '_mock_from_kall', '_mock_name', '_mock_parent', 'call_list', 'count', 'index']
I don't see anything in the MagickMock interface where I can gather all the input that the mock has received.
I also tried calling m().write.call_args but it only returns the last call (the last element of the mock_calls attribute, i.e. call('Eve\tFirst\r\n')).
Is there any way of doing what I want?
You can create your own mock.call objects and compare them with what you have in the .call_args_list.
from unittest.mock import patch, mock_open, call
with patch("builtins.open", mock_open()) as m:
with open("myfile.csv", "wb") as f:
f.write("Name\tSurname\r\n")
f.write("Eve\tFirst\r\n")
# Create your array of expected strings
expected_strings = ["Name\tSurname\r\n", "Eve\tFirst\r\n"]
write_calls = m().write.call_args_list
for expected_str in expected_strings:
# assert that a mock.call(expected_str) exists in the write calls
assert call(expected_str) in write_calls
Note that you can use the assert call of your choice. If you're in a unittest.TestCase subclass, prefer to use self.assertIn.
Additionally, if you just want the arg values you can unpack a mock.call object as tuples. Index 0 is the *args. For example:
for write_call in write_calls:
print('args: {}'.format(write_call[0]))
print('kwargs: {}'.format(write_call[1]))
Indeed you can't patch builtins.open.write directly since the patch within a with would need to enter the patched method and see that write is not a class method.
There are a bunch of solutions and the one I would think of first would be to use your own mock. See the example:
class MockOpenWrite:
def __init__(self, *args, **kwargs):
self.res = []
# What's actually mocking the write. Name must match
def write(self, s: str):
self.res.append(s)
# These 2 methods are needed specifically for the use of with.
# If you mock using a decorator, you don't need them anymore.
def __enter__(self):
return self
def __exit__(self, exc_type, exc_val, exc_tb):
return
mock = MockOpenWrite
with patch("builtins.open", mock):
with open("myfile.csv", "w") as f:
f.write("Name\tSurname\r\n")
f.write("Eve\tFirst\r\n")
print(f.res)
In that case, the res attribute is linked to the instance. So it disappears after the with closes.
You could eventually stored results somewhere else, like a global array, and check the results beyond the end of with.
Feel free to play around with your actual method.
I had to it this way (Python 3.9). It was quite tedious just to get the mock-args out of the function.
from somewhere import my_thing
#patch("lib.function", return_value=MagicMock())
def test_my_thing(my_mock):
my_thing(value1, value2)
(value1_call_args, value2_call_args) = my_mock.call_args_list[0].args
I've read a little bit about decorators without my puny brain understanding them fully, but I believe this is one of the cases where they would be of use.
I have a main method running some other methods:
def run_pipeline():
gene_sequence_fasta_files_made = create_gene_sequence_fasta_files()
....several other methods taking one input argument and having one output argument.
Since each method takes a long time to run, I'd like to store the result in a json object for each method. If the json file exists I load it, otherwise I run the method and store the result. My current solution looks like this:
def run_pipeline():
gene_sequence_fasta_files_made = _load_or_make(create_gene_sequence_fasta_files, "/home/myfolder/ff.json", method_input=None)
...
Problem is, I find this really ugly and hard to read. If it is possible, how would I use decorators to solve this problem?
Ps. sorry for not showing my attempts. I haven't tried anything since I'm working against a deadline for a client and do not have the time (I could deliver the code above; I just find it aesthetically displeasing).
Psps. definition of _load_or_make() appended:
def _load_or_make(method, filename, method_input=None):
try:
with open(filename, 'r') as input_handle:
data = json.load(input_handle)
except IOError:
if method_input == None:
data = method()
else:
data = method(method_input)
with open(filename, 'w+') as output_handle:
json.dump(data, output_handle)
return data
Here's a decorator that tries loading json from the given filename, and if it can't find the file or the json load fails, it runs the original function, writes the result as json to disk, and returns.
def load_or_make(filename):
def decorator(func):
def wraps(*args, **kwargs):
try:
with open(filename, 'r') as f:
return json.load(input_handle)
except Exception:
data = func(*args, **kwargs)
with open(filename, 'w') as out:
json.dump(data, out)
return data
return wraps
return decorator
#load_or_make(filename)
def your_method_with_arg(arg):
# do stuff
return data
#load_or_make(other_filename)
def your_method():
# do stuff
return data
Note that there is an issue with this approach: if the decorated method returns different values depending on the arguments passed to it, the cache won't behave properly. It looks like that isn't a requirement for you, but if it is, you'd need to pick a different filename depending on the arguments passed in (or use pickle-based serialization, and just pickle a dict of args -> results). Here's an example of how to do it using a pickle approach, (very) loosely based on the memoized decorator Christian P. linked to:
import pickle
def load_or_make(filename):
def decorator(func):
def wrapped(*args, **kwargs):
# Make a key for the arguments. Try to make kwargs hashable
# by using the tuple returned from items() instead of the dict
# itself.
key = (args, kwargs.items())
try:
hash(key)
except TypeError:
# Don't try to use cache if there's an
# unhashable argument.
return func(*args, **kwargs)
try:
with open(filename) as f:
cache = pickle.load(f)
except Exception:
cache = {}
if key in cache:
return cache[key]
else:
value = func(*args, **kwargs)
cache[key] = value
with open(filename, "w") as f:
pickle.dump(cache, f)
return value
return wrapped
return decorator
Here, instead of saving the result as json, we pickle the result as a value in a dict, where the key is the arguments provided to the function. Note that you would still need to use a different filename for every function you decorate to ensure you never got incorrect results from the cache.
Do you want to save the results to disk or is in-memory okay? If so, you can use the memoize decorator / pattern, found here: https://wiki.python.org/moin/PythonDecoratorLibrary#Memoize
For each set of unique input arguments, it saves the result from the function in memory. If the function is then called again with the same arguments, it returns the result from memory rather than trying to run the function again.
It can also be altered to allow for a timeout (depending on how long your program runs for) so that if called after a certain time, it should re-run and re-cache the results.
A decorator is simply a callable that takes a function (or a class) as an argument, does something with/to it, and returns something (usually the function in a wrapper, or the class modified or registered):
Since Flat is better than nested I like to use classes if the decorator is at all complex:
class GetData(object):
def __init__(self, filename):
# this is called on the #decorator line
self.filename = filename
self.method_input = input
def __call__(self, func):
# this is called by Python with the completed def
def wrapper(*args, **kwds):
try:
with open(self.filename) as stored:
data = json.load(stored)
except IOError:
data = func(*args, **kwds)
with open(self.filename, 'w+') as stored:
json.dump(data, stored)
return data
return wrapper
and in use:
#GetData('/path/to/some/file')
def create_gene_sequence_fasta_files('this', 'that', these='those'):
pass
#GetData('/path/to/some/other/file')
def create_gene_sequence_fastb_files():
pass
I am no expert in python's decorator.I just learn it from a tutorial.But i think this can help u.But u may not get more readablity from it.
Decorator is a way to give ur different function the similar solution to deal with things,without make ur code mess or losing their readablity.It seems like transparent to the rest of ur code.
def _load_or_make(filename):
def _deco(method):
def __deco():
try:
with open(filename, 'r') as input_handle:
data = json.load(input_handle)
return data
except IOError:
if method_input == None:
data = method()
else:
data = method(method_input)
with open(filename, 'w+') as output_handle:
json.dump(data, output_handle)
return data
return __deco
return _deco
#_load_or_make(filename)
def method(arg):
#things need to be done
pass
return data
How should I customize unittest.mock.mock_open to handle this code?
file: impexpdemo.py
def import_register(register_fn):
with open(register_fn) as f:
return [line for line in f]
My first attempt tried read_data.
class TestByteOrderMark1(unittest.TestCase):
REGISTER_FN = 'test_dummy_path'
TEST_TEXT = ['test text 1\n', 'test text 2\n']
def test_byte_order_mark_absent(self):
m = unittest.mock.mock_open(read_data=self.TEST_TEXT)
with unittest.mock.patch('builtins.open', m):
result = impexpdemo.import_register(self.REGISTER_FN)
self.assertEqual(result, self.TEST_TEXT)
This failed, presumably because the code doesn't use read, readline, or readlines.
The documentation for unittest.mock.mock_open says, "read_data is a string for the read(), readline(), and readlines() methods of the file handle to return. Calls to those methods will take data from read_data until it is depleted. The mock of these methods is pretty simplistic. If you need more control over the data that you are feeding to the tested code you will need to customize this mock for yourself. read_data is an empty string by default."
As the documentation gave no hint on what kind of customization would be required I tried return_value and side_effect. Neither worked.
class TestByteOrderMark2(unittest.TestCase):
REGISTER_FN = 'test_dummy_path'
TEST_TEXT = ['test text 1\n', 'test text 2\n']
def test_byte_order_mark_absent(self):
m = unittest.mock.mock_open()
m().side_effect = self.TEST_TEXT
with unittest.mock.patch('builtins.open', m):
result = impexpdemo.import_register(self.REGISTER_FN)
self.assertEqual(result, self.TEST_TEXT)
The mock_open() object does indeed not implement iteration.
If you are not using the file object as a context manager, you could use:
m = unittest.mock.MagicMock(name='open', spec=open)
m.return_value = iter(self.TEST_TEXT)
with unittest.mock.patch('builtins.open', m):
Now open() returns an iterator, something that can be directly iterated over just like a file object can be, and it'll also work with next(). It can not, however, be used as a context manager.
You can combine this with mock_open() then provide a __iter__ and __next__ method on the return value, with the added benefit that mock_open() also adds the prerequisites for use as a context manager:
# Note: read_data must be a string!
m = unittest.mock.mock_open(read_data=''.join(self.TEST_TEXT))
m.return_value.__iter__ = lambda self: self
m.return_value.__next__ = lambda self: next(iter(self.readline, ''))
The return value here is a MagicMock object specced from the file object (Python 2) or the in-memory file objects (Python 3), but only the read, write and __enter__ methods have been stubbed out.
The above doesn't work in Python 2 because a) Python 2 expects next to exist, not __next__ and b) next is not treated as a special method in Mock (rightly so), so even if you renamed __next__ to next in the above example the type of the return value won't have a next method. For most cases it would be enough to make the file object produced an iterable rather than an iterator with:
# Python 2!
m = mock.mock_open(read_data=''.join(self.TEST_TEXT))
m.return_value.__iter__ = lambda self: iter(self.readline, '')
Any code that uses iter(fileobj) will then work (including a for loop).
There is a open issue in the Python tracker that aims to remedy this gap.
As of Python 3.6, the mocked file-like object returned by the unittest.mock_open method doesn't support iteration. This bug was reported in 2014 and it is still open as of 2017.
Thus code like this silently yields zero iterations:
f_open = unittest.mock.mock_open(read_data='foo\nbar\n')
f = f_open('blah')
for line in f:
print(line)
You can work around this limitation via adding a method to the mocked object that returns a proper line iterator:
def mock_open(*args, **kargs):
f_open = unittest.mock.mock_open(*args, **kargs)
f_open.return_value.__iter__ = lambda self : iter(self.readline, '')
return f_open
I found the following solution:
text_file_data = '\n'.join(["a line here", "the second line", "another
line in the file"])
with patch('__builtin__.open', mock_open(read_data=text_file_data),
create=True) as m:
# mock_open doesn't properly handle iterating over the open file with for line in file:
# but if we set the return value like this, it works.
m.return_value.__iter__.return_value = text_file_data.splitlines()
with open('filename', 'rU') as f:
for line in f:
print line
In my code, I have a load_dataset function that reads a text file and does some processing. Recently I thought about adding support to file-like objects, and I wondered over the best approach to this. Currently I have two implementations in mind:
First, type checking:
if isinstance(inputelement, basestring):
# open file, processing etc
# or
# elif hasattr(inputelement, "read"):
elif isinstance(inputelement, file):
# Do something else
Alternatively, two different arguments:
def load_dataset(filename=None, stream=None):
if filename is not None and stream is None:
# open file etc
elif stream is not None and filename is None:
# do something else
Both solutions however don't convince me too much, especially the second as I see way too many pitfalls.
What is the cleanest (and most Pythonic) way to accept a file-like object or string to a function that does text reading?
One way of having either a file name or a file-like object as argument is the implementation of a context manager that can handle both. An implementation can be found here, I quote for the sake of a self contained answer:
class open_filename(object):
"""Context manager that opens a filename and closes it on exit, but does
nothing for file-like objects.
"""
def __init__(self, filename, *args, **kwargs):
self.closing = kwargs.pop('closing', False)
if isinstance(filename, basestring):
self.fh = open(filename, *args, **kwargs)
self.closing = True
else:
self.fh = filename
def __enter__(self):
return self.fh
def __exit__(self, exc_type, exc_val, exc_tb):
if self.closing:
self.fh.close()
return False
Possible usage then:
def load_dataset(file_):
with open_filename(file_, "r") as f:
# process here, read only if the file_ is a string
Don't accept both files and strings. If you're going to accept file-like objects, then it means you won't check the type, just call the required methods on the actual parameter (read, write, etc.). If you're going to accept strings, then you're going to end up open-ing files, which means you won't be able to mock the parameters. So I'd say accept files, let the caller pass you a file-like object, and don't check the type.
I'm using a context manager wrapper. When it's a filename (str), close the file on exit.
#contextmanager
def fopen(filein, *args, **kwargs):
if isinstance(filein, str): # filename
with open(filein, *args, **kwargs) as f:
yield f
else: # file-like object
yield filein
Then you can use it like:
with fopen(filename_or_fileobj) as f:
# do sth. with f
Python follows duck typing, you may check that this is file object by function that you need from object. For example hasattr(obj, 'read') against isinstance(inputelement, file).
For converting string to file objects you may also use such construction:
if not hasattr(obj, 'read'):
obj = StringIO(str(obj))
After this code you will safely be able to use obj as a file.