Title says it all. It seems like it ought to be possible (somehow) to implement Python-side pickling for PyObjC objects whose Objective-C classes implement NSCoding, without re-implementing everything from scratch. That said, while value-semantic members would probably be straightforward, by-reference object graphs and conditional coding might be tricky. How might you get the two sides to "collaborate" on the object graph parts?
PyObjC does support writing Python objects to a (keyed) archive (that is, any object that can be pickled implements NSCoding).
That’s probably the easiest way to serialize arbitrary graphs of Python and Objective-C objects.
As I wrote in the comments on another answer, I ran into problems when trying to implement pickle support for any object that implements NSCoding, because NSArchiver and pickle traverse the object graph in incompatible ways (IIRC primarily when restoring the archive).
Shouldn't it be pretty straightforward?
On pickling, call encodeWithCoder on the object using an NSArchiver or something. Have pickle store that string.
On unpickling, use NSUnarchiver to create an NSObject from the pickled string.
Recently, I have been asked to make "our C++ lib work in the cloud".
Basically, the lib is computer intensive (calculating prices), so it would make sense.
I have built a SWIG interface to produce a Python version, with the idea of using MapReduce with MRJob.
I wanted to serialize the objects in a file, and using a mapper, deserialize and calculate the price.
For example:
import dill
from mrjob.job import MRJob

class MRTest(MRJob):
    def mapper(self, key, value):
        # Rebuild the serialized object, then compute its price.
        obj = dill.loads(value)
        yield (key, obj.price())
But now I have reached a dead end, since it seems that dill cannot handle SWIG extension types:
PicklingError: Can't pickle <class 'SwigPyObject'>: it's not found as builtins.SwigPyObject
Is there a way to make this work properly?
I'm the dill author. That's correct, dill can't pickle C++ objects. When you see it's not found as builtin.some_object… that almost invariably means that you are trying to pickle some object that is not written in python, but uses python to bind to C/C++ (i.e. an extension type). You have no hope of directly pickling such objects with a python serializer.
However, since you are interested in pickling a subclass of an extension type, you can actually do it. All you will need to do is to give your object the appropriate state you want to save as an instance attribute or attributes, and provide a __reduce__ method to tell dill (or pickle) how to save the state of your object. This method is how python deals with serializing extension types. See:
https://docs.python.org/2/library/pickle.html#pickling-and-unpickling-extension-types
There are probably better examples, but here's at least one example:
https://stackoverflow.com/a/19874769/4646678
Newbie Python question here - I am writing a little utility in Python to do disk space calculations when given the attributes of 2 different files.
Should I create a 'file' class with methods appropriate to the conversion and then create each file as an instance of that class? I'm pretty new to Python but OK with Perl, and from the examples I have seen (I may be wrong, being self-taught) most Perl is not OO.
Background info - These are IBM z/OS (mainframe) data sets. Given the allocation attributes for a file on a specific disk type and file organisation (its block size), and then the allocation parameters for a different disk type and organisation, the space requirements can vary enormously.
Definition nitpicking preface: Everything in Python is technically an object, even functions and numbers. I'm going to assume you mean classes vs. functions in your question.
Actually, I think one of the great things about Python is that it doesn't embrace classes for absolutely everything, as some other languages do (e.g., Java and C#).
It's perfectly acceptable in Python (and the built-in modules do this a lot) to define module level functions rather than encapsulating all logic in objects.
That said, classes do have their place, for example when you perform multiple actions on a single piece of data, and especially when these actions change the data and you want to keep its state encapsulated.
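As a rough illustration of the module-level style for the utility in the question, here is a sketch. The function names and the blocks-per-track figures are hypothetical placeholders, not real device geometry; substitute the actual values for your disk types.

```python
import math


def blocks_needed(record_count, record_length, block_size):
    """Number of blocks required when whole records are packed per block."""
    records_per_block = block_size // record_length
    if records_per_block == 0:
        raise ValueError("block_size is smaller than record_length")
    return math.ceil(record_count / records_per_block)


def tracks_needed(record_count, record_length, block_size, blocks_per_track):
    """Number of tracks required; blocks_per_track depends on the device."""
    blocks = blocks_needed(record_count, record_length, block_size)
    return math.ceil(blocks / blocks_per_track)


# Same data, two hypothetical allocations (made-up geometry values).
old = tracks_needed(10_000, 80, 800, blocks_per_track=40)
new = tracks_needed(10_000, 80, 8_000, blocks_per_track=6)
print(old, new)
```

Two plain functions are enough here. A class would start to pay off if each data set carried state that several operations needed to modify.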
For your question and your requirements, the short answer is "No".
The use of objects is not in itself "object oriented". Functional programming uses objects, too, just not in the same way. Python can be used in a very FP way, even though Python uses objects heavily behind the scenes.
Overuse of primitives can be a problem, but it's impossible to say whether that applies to your case without more data.
I think of OO as an interface design approach: If you are creating tools that are straightforward to interact with (and substitutable) as objects with predictable methods, then by all means, create objects. But if the interactions are straightforward to describe with module-level functions, then don't try too hard to engineer your code into classes.
First and foremost - Pythonic is a term that needs to disappear, preferably with everyone who uses it. It doesn't mean anything and it's used by people who can't use reason to justify anything, so they need a mandatory term to justify their nonsense.
But to the point: you never HAVE to use object-oriented concepts in your software development, as everything OOP can as easily be written with functions and solid spaghetti stringers. The question is: does the use of objects make sense in my solution?
To understand when and how to use it, you have to ask what exactly object-oriented programming is. This was already explained very well in a very old, but also free, book called Thinking in Java, which I consider to be the 101 bible of thinking in OO terms. I strongly urge you to grab a free copy and read the first couple of chapters.
Because if you don't understand the object oriented approach, how can you apply it properly? When you do - then when to use it, or not use it, becomes clear, because you can clearly translate real life items and interactions into abstract objects. And this is the guideline - when the translation of given action, item or data to OOP model is straightforward and logical - then you should do it.
Python docs mention this word a lot and I want to know what it means.
It simply means it can be serialized by the pickle module. For a basic explanation of this, see What can be pickled and unpickled?. Pickling Class Instances provides more details, and shows how classes can customize the process.
Things that are usually not picklable are, for example, sockets, file handles, database connections, and so on. Everything that's built up (recursively) from basic Python types (dicts, lists, primitives, objects, object references, even circular ones) can be pickled by default.
You can implement custom pickling code that will, for example, store the configuration of a database connection and restore it afterwards, but you will need special, custom logic for this.
All of this makes pickling a lot more powerful than xml, json and yaml (but definitely not as readable)
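The database-connection case mentioned above could look something like the following sketch. It uses the standard sqlite3 module purely for illustration; the class name Store is hypothetical, and the approach assumes the connection can be re-created from the stored path alone.

```python
import pickle
import sqlite3


class Store:
    """Hypothetical wrapper that owns a live database connection."""

    def __init__(self, path):
        self.path = path
        self.conn = sqlite3.connect(path)

    def __getstate__(self):
        # Save only the configuration; the connection cannot be pickled.
        state = self.__dict__.copy()
        del state["conn"]
        return state

    def __setstate__(self, state):
        # Restore the configuration, then open a fresh connection.
        self.__dict__.update(state)
        self.conn = sqlite3.connect(self.path)


store = Store(":memory:")
clone = pickle.loads(pickle.dumps(store))
print(clone.path, clone.conn.execute("SELECT 1").fetchone())
```

Note that the restored object gets a new connection, not the original one. With an in-memory database, as here, that also means a new, empty database.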
These are all great answers, but for anyone who's new to programming and still confused here's the simple answer:
Pickling an object is making it so you can store it as it currently is, long term (often to hard disk). A bit like saving in a video game.
So anything that's actively changing (like a live connection to a database) can't be stored directly (though you could probably figure out a way to store the information needed to create a new connection, and that you could pickle)
Bonus definition: Serializing is packaging it in a form that can be handed off to another program. Unserializing is unpacking something you were sent so that you can use it.
Pickling is the process in which Python objects are converted into a simple binary representation that can be written to a file and stored. This is done to store Python objects and is also called serialization. You can infer from this what de-serialization or unpickling means.
So when we say an object is picklable it means that the object can be serialized using the pickle module of python.
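A quick way to see what "picklable" means in practice is simply to try it. The helper name is_picklable below is made up for this sketch; it is not part of the pickle module.

```python
import pickle
import threading


def is_picklable(obj):
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, TypeError, AttributeError):
        return False
    return True


data = {"name": "example", "values": [1, 2, 3]}
data["self"] = data  # circular references are handled

print(is_picklable(data))
print(is_picklable(threading.Lock()))

# Round trip: the circular reference survives unpickling.
restored = pickle.loads(pickle.dumps(data))
print(restored["self"] is restored)
```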
Common scenario: I have a library that uses other libraries. For example, a math library (let's call it foo) that uses numpy.
Functions of foo can either:
return a numpy object (either pure or an inherited reimplementation)
return a list
return a foo-implemented object that behaves like numpy (performing delegation)
The three solutions can be also restated as:
foo passes through the internally used object, clearly stating that its library dependency is also an API dependency (since it returns objects obeying the interface of the numpy library)
foo makes use of a common subset of objects that are part of the basis of the language.
foo completely hides what it uses internally. Nothing about the underlying libraries escapes from the foo library to the client code.
We are of course in a pros-and-cons scenario. Transparent or opaque? Strong coupling with the underlying tools or not? I know the drill, but I now have to make this choice, and I would like to hear opinions before deciding. Suggestions, ideas, and personal experience are greatly appreciated.
Since you're talking about return values, that's not really about "internal objects" -- you should just document the interfaces your returned objects will support (it's OK if that's a subset of numpy.array or whatever;-). I recommend against returning a reference to your internal mutable attributes and documenting that mutators work to alter your own object indirectly (and NOT documenting it is not much better) -- that leads to way-too-strong coupling down the road.
If you WERE talking about actual internal objects, I'd recommend the Law of Demeter -- in a simplistic reading, if the client's coding a.b.c.d.e.f(), then something is very wrong ("just one dot" may be sometimes extreme, but, "four are Right Out"). Again, the problem is strong coupling -- making it impossible for you to change your internal implementation in even minor ways without breaking a million clients...!
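The advice against handing out references to internal mutable attributes can be sketched as follows. The class Series is a hypothetical stand-in for a foo object, and a plain list stands in for whatever foo uses internally.

```python
class Series:
    """Hypothetical foo object that keeps its storage private."""

    def __init__(self, values):
        self._values = list(values)

    def values(self):
        # Return an immutable snapshot rather than the internal list,
        # so callers cannot mutate this object's state indirectly.
        return tuple(self._values)

    def append(self, value):
        self._values.append(value)


s = Series([1.0, 2.0])
snapshot = s.values()
s.append(3.0)
print(snapshot, s.values())
```

Because callers only ever see a snapshot, the internal representation can later change without breaking them, as long as the documented interface of the returned value stays the same.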
The main question I would think about is how much of your library would return numpy objects. If it's pervasive, I would go with directly returning numpy objects: you're so tied to numpy that you might as well make it explicit. Plus, it will probably make it easier to use other numpy-based libraries. If, on the other hand, only a few methods would return numpy objects, I would go with either the numpy-like object or a list, probably the numpy-like object.