I have a custom class which holds some string data. I want to be able to save this string data to a file, using a file handle's write method. I have implemented __str__(), so I can do str(myobject). What is the equivalent method for making Python consider my object to be a character buffer object?
If you are trying to use your object with library code that expects to be able to write what you give it to a file, then you may have to resort to implementing a "duck file" class that acts like a file but supports your stringable object. Unfortunately, file is not a type that you can subclass easily, at least as of Python 2.6. You will have to implement enough of the file protocol (write, writelines, tell, etc.) to allow the library code to work as expected.
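For illustration, a minimal sketch of such a "duck file" (the class names and the sample data are made up, not from the original question):

class MyData:
    """Stand-in for the questioner's stringable class."""
    def __init__(self, text):
        self.text = text
    def __str__(self):
        return self.text

class StringifyingFile:
    """A minimal "duck file": wraps a real file object and str()-converts
    anything written to it, so library code that only calls write(),
    writelines(), tell(), etc. keeps working."""
    def __init__(self, fileobj):
        self._f = fileobj
    def write(self, obj):
        return self._f.write(str(obj))
    def writelines(self, iterable):
        for obj in iterable:
            self.write(obj)
    def tell(self):
        return self._f.tell()

with open("out.txt", "w") as raw:
    StringifyingFile(raw).write(MyData("some string data"))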
There isn't a single function, but a whole range of them - read, seek, etc.
Why don't you subclass StringIO.StringIO, which is already a string buffer?
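A sketch of that suggestion (the answer names Python 2's StringIO.StringIO; io.StringIO is the closest modern equivalent, and the class name here is made up):

import io

class StringData(io.StringIO):
    # The custom class keeps its string data in the StringIO buffer,
    # so it already behaves like a readable/writable text "file".
    def __str__(self):
        return self.getvalue()

d = StringData("some string data")
print(d.read(4))   # 'some' -- file-like reading works
print(str(d))      # 'some string data' -- __str__ still works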
Related
Regarding syntax in Python
Why do we use open("file") to open but not "file".close() to close it?
Why isn't it "file".open() or inversely close("file")?
It's because open() is a function, and .close() is an object method. "file".open() doesn't make sense, because you're implying that the open() function is actually a class or instance method of the string "file". Not all strings are valid files or devices to be opened, so how the interpreter is supposed to handle "not a file-like device".open() would be ambiguous. We don't use "file".close() for the same reason.
close("file") would require a lookup of the file name, then another lookup to see if there are file handles, owned by the current process, attached to that file. That would be very inefficient and probably has hidden pitfalls that would make it unreliable (for example, what if it's not a file, but a TTY device instead?). It's much faster and simpler to just keep a reference to the opened file or device and close the file through that reference (also called a handle).
Many languages take this approach:
f = open("file") # open a file and return a file object or handle
# stuff...
close(f) # close the file, using the file handle or object as a reference
This looks similar to your close("file") construct, but don't be fooled: it's closing the file through a direct reference to it, not the file name as stored in a string.
The Python developers have chosen to do the same thing, but it looks different because they have implemented it with an object-oriented approach instead. Part of the reason for this is that Python file objects have a lot of methods available to them, such as read(), flush(), seek(), etc. If we used close(f), then we would have to either change all of the rest of the file object methods to functions, or let it be one random function that behaves differently from the rest for no good reason.
TL;DR
The design of open() and file.close() is consistent with OOP principles and good file reference practices. open() is a factory-like function that creates objects that reference files or other devices. Once the object is created, all other operations on that object are done through class or instance methods.
Normally you shouldn't call "file".close() explicitly but use open(file) as a context manager, so the file handle is also closed if an exception happens. End of problem :-)
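For example (the file name is just a placeholder):

# The file is closed automatically when the block exits,
# even if an exception is raised inside it.
with open("example.txt", "w") as f:
    f.write("hello\n")
print(f.closed)   # True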
But to actually answer your question: I assume the reason is that open supports many options and the returned class differs depending on those options (see also the io module). So it would simply be much more complicated for the end user to remember which class they want and then call "class".open on the right class themselves. Note that you can also pass an "integer file descriptor of the file to be wrapped" to open; this would mean that besides having a str.open() method you would also get an int.open(). That would be really bad OO design and also confusing. I wouldn't care to guess what kind of questions would be asked on StackOverflow about that ("door".open(), (1).open())...
However I must admit that there is a pathlib.Path.open function. But if you have a Path it isn't ambiguous anymore.
As to a close() function: Each instance will have a close() method already and there are no differences between the different classes, so why create an additional function? There is simply no advantage.
Only slightly less new than you, but I'll give this one a go. Basically, opening and closing are pretty different actions in a language like Python. When you open the file, what you are really doing is creating an object to be worked with in your application that represents the file; you create it with a function that informs the OS that the file has been opened and that builds an object Python can use to read from and write to the file. When it comes time to close the file, what basically needs to happen is for your app to tell the OS that it is done with the file and to dispose of the object that represented the file from memory, and the easiest way to do that is with a method on the object itself. Also note that a syntax like "file".open would require the string type to include methods for opening files, which would be a very strange design and would require a lot of extensions to the string type for anything else you wanted to implement with that syntax. close(file) would make a bit more sense, but it would still be a clunky way of releasing that object and letting the OS know the file was no longer open, and you would be passing a variable file representing the object created when you opened the file rather than a string pointing to the file's path.
In addition to what has been said, I quote the Python changelog entry about removing the built-in file type. It (briefly) explains why the class-constructor approach using the file type (available in Python 2) was removed in Python 3:
Removed the file type. Use open(). There are now several different kinds of streams that open can return in the io module.
Basically, while file("filename") would create an instance of file, open("filename") can return instances of different stream classes, depending on the mode.
https://docs.python.org/3.4/whatsnew/3.0.html#builtins
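A quick sketch of what that means in practice (the file name is illustrative):

# open() returns different stream classes from the io module,
# depending on the mode and buffering arguments.
with open("example.txt", "w") as f:
    print(type(f))                     # <class '_io.TextIOWrapper'>
with open("example.txt", "rb") as f:
    print(type(f))                     # <class '_io.BufferedReader'>
with open("example.txt", "rb", buffering=0) as f:
    print(type(f))                     # <class '_io.FileIO'>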
The Python docs mention the word "picklable" a lot and I want to know what it means.
It simply means it can be serialized by the pickle module. For a basic explanation of this, see What can be pickled and unpickled?. Pickling Class Instances provides more details, and shows how classes can customize the process.
Things that are usually not picklable are, for example, sockets, file handles, database connections, and so on. Everything that's built up (recursively) from basic Python types (dicts, lists, primitives, objects, object references, even circular ones) can be pickled by default.
You can implement custom pickling code that will, for example, store the configuration of a database connection and restore it afterwards, but you will need special, custom logic for this.
All of this makes pickling a lot more powerful than XML, JSON and YAML (but definitely not as readable).
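For instance, a minimal sketch of that custom-pickling idea using __getstate__/__setstate__ (the class, attribute names, and the use of sqlite3 are illustrative, not from the original answer):

import pickle
import sqlite3

class Store:
    # The live sqlite3 connection itself cannot be pickled, so we store
    # only the database path and reconnect when unpickling.
    def __init__(self, path):
        self.path = path
        self.conn = sqlite3.connect(path)

    def __getstate__(self):
        state = self.__dict__.copy()
        del state["conn"]          # drop the unpicklable connection
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.conn = sqlite3.connect(self.path)   # restore it afterwards

s = Store(":memory:")
s2 = pickle.loads(pickle.dumps(s))   # round-trips; s2 has a fresh connection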
These are all great answers, but for anyone who's new to programming and still confused here's the simple answer:
Pickling an object means making it so you can store it as it currently is, long term (often to hard disk). A bit like saving in a video game.
So anything that's actively changing (like a live connection to a database) can't be stored directly (though you could probably figure out a way to store the information needed to create a new connection, and that you could pickle)
Bonus definition: Serializing is packaging it in a form that can be handed off to another program. Unserializing it is unpacking something you got sent so that you can use it
Pickling is the process in which objects in Python are converted into a simple binary representation (a byte stream) that can be written to a file and stored. This is done to persist Python objects and is also called serialization. You can infer from this what de-serialization, or unpickling, means.
So when we say an object is picklable it means that the object can be serialized using the pickle module of python.
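A minimal example of that round trip (the data and file name are made up):

import pickle

data = {"name": "Ada", "scores": [1, 2, 3]}

blob = pickle.dumps(data)         # serialize ("pickle") to a byte string
restored = pickle.loads(blob)     # deserialize ("unpickle")
assert restored == data

# Or directly to and from a file opened in binary mode:
with open("data.pkl", "wb") as f:
    pickle.dump(data, f)
with open("data.pkl", "rb") as f:
    assert pickle.load(f) == data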
How do I pass a pexpect spawn object as an argument from one Python file to another? I tried to pass it, but the error says it has to be a string. Then I converted the object to a string, but it's not working as expected.
There's no way to pass a complex object directly between two Python programs, even if one spawns the other. You should serialize the object's state and then pass it; the other side should deserialize it before use.
If you want to pass data as a command-line argument, you should use a string object (not bytes, but a string). Please note that repr may not give a string which you can use to create a "clone".
You can also pass data via external files, IPC, TCP, UDP, a FIFO, or many other ways.
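For example, a sketch of passing serialized state as a command-line argument (all names and values are made up; the spawn object itself never crosses the process boundary, only plain data describing it):

import json
import subprocess
import sys

if len(sys.argv) > 1:
    # "Child" side: deserialize the state before use.
    state = json.loads(sys.argv[1])
    print("child received:", state)
else:
    # "Parent" side: serialize plain state and hand it to the child process.
    state = {"command": "ftp example.com", "timeout": 30}
    subprocess.run([sys.executable, __file__, json.dumps(state)])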
Python has the marvelous collections module that has tools to allow you to implement a full dict (for example) from a minimal set of methods. Is there a similar thing for the file interface in Python? If not, what would you recommend as a minimal set of methods to implement for a file-like object for duck-typing purposes?
And how do you deal with things that would like to use your file like object in a with statement, like you can with a regular file, or who want to iterate over it (like you can with a regular file) or who want to be able to call readline or readlines and have it do something intelligent and useful (like you can with a regular file)? Do you have to implement them all yourself? Or are there better options?
I know I can implement each and every single one of these myself, by hand. But the collections interface allows me to implement a dict by implementing just __getitem__, __setitem__, __delitem__, __iter__, and __len__. I get pop, popitem, clear, update, setdefault, __contains__, keys, items, values, get, __eq__, and __ne__ all for free. There is a minimal interface for dict defined, and if I implement it, I get the full dict interface, with all of the extra methods implemented in terms of the minimal interface.
Similarly, I would like to know what the minimal interface for file is that I have to implement in order to get the full interface. Is there a way to get __enter__, __exit__, readline, readlines, __iter__ and next if I just implement read, write and close, or do I have to implement everything myself by hand each and every time I want the full file interface?
The with statement requires a context manager:
http://docs.python.org/library/stdtypes.html#typecontextmanager
The file type is fully defined:
http://docs.python.org/library/stdtypes.html#file-objects
Seems pretty simple.
The documentation lists the methods and attributes of a file and a context manager. Implement those.
What more information do you need?
http://docs.python.org/library/contextlib.html?highlight=context%20manager
If you want all the methods to work, you have to implement all the methods. Unlike the collections, there is no abstract base class for files.
I would look at io.IOBase[1] and io.RawIOBase for >2.6 compatibility. This will keep you moving forward with 3.x (io implements the 3.x file interface).
[1] http://docs.python.org/library/io.html#i-o-base-classes
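A sketch of what that buys you (the class is made up): subclass io.RawIOBase, implement readinto() and readable(), and read(), readline(), readlines(), iteration, and the context-manager protocol all come from the base classes.

import io

class BytesSource(io.RawIOBase):
    # A toy read-only "file" that serves bytes from an in-memory buffer.
    def __init__(self, data):
        self._data = data
        self._pos = 0

    def readable(self):
        return True

    def readinto(self, b):
        # Copy as many bytes as fit into the caller-supplied buffer.
        chunk = self._data[self._pos:self._pos + len(b)]
        b[:len(chunk)] = chunk
        self._pos += len(chunk)
        return len(chunk)

# Works in a with statement and supports iteration, like a real binary file.
with BytesSource(b"first line\nsecond line\n") as f:
    for line in f:
        print(line)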
You kind of answered it yourself. While there is no set of "special" methods you need to implement for the file interface, you can do it just by providing a couple of methods normally associated with files. Duck typing takes care of the rest.
You only really need a read and/or a write method (depending on whether you want it to be readable and/or writable) that behaves the same as on a normal file object. You can have a look at the Python file object reference to see all of the methods of a file object. Basically, the more you implement, the more situations your class will work in place of a file. (For example, if you implement seek, then it will work in any function that performs seeking on a file.) Note that there is a continuum here; there is no absolute "it supports the file protocol or it doesn't." In fact, there is no way to work 100% in all the places that support file-like objects, because some code will access low-level details of the real file type, and yours won't work there.
In summary, any class that implements read and write will work in most situations that require a "file-like object".
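As a concrete sketch (the class name is made up): json.dump only ever calls .write() on whatever you hand it, so an object with nothing but a write() method works fine there.

import json

class CollectingWriter:
    # A "file-like object" with only write(): enough for json.dump.
    def __init__(self):
        self.chunks = []

    def write(self, s):
        self.chunks.append(s)
        return len(s)

w = CollectingWriter()
json.dump({"hello": "world"}, w)
print("".join(w.chunks))   # {"hello": "world"}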
(Note that the special method names like __getitem__ for dicts are really not special, except that they are used by special syntax like [key] -- that's why dict has special method names and file does not.)
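(For comparison, the collections mechanism the question refers to looks roughly like this; the class is illustrative. The abstract methods for a mutable mapping are __getitem__, __setitem__, __delitem__, __iter__, and __len__; the rest comes for free.)

from collections.abc import MutableMapping

class LowerDict(MutableMapping):
    # A dict with case-insensitive string keys, built from the five
    # abstract methods; get, pop, update, setdefault, __contains__,
    # keys, items, values, __eq__, ... are supplied by MutableMapping.
    def __init__(self):
        self._data = {}

    def __getitem__(self, key):
        return self._data[key.lower()]

    def __setitem__(self, key, value):
        self._data[key.lower()] = value

    def __delitem__(self, key):
        del self._data[key.lower()]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

d = LowerDict()
d["Spam"] = 1
print("SPAM" in d, d.get("spam"), list(d.items()))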