Why doesn't Python's file write automatically call __str__?
$ cat person.py
class Person:
    def __init__(self):
        self.age = 22
    def __str__(self):
        return "my age is {}".format(self.age)
When I try to print it, everything goes fine, but writing Person to file fails:
>>> from person import Person
>>> dan = Person()
>>> print(dan)
my age is 22
>>> fl = open("dan.txt","wt")
>>> fl.write(dan)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: write() argument must be str, not Person
That's simply not how the API is designed.
From the definition of object.__str__(self):
Called by str(object) and the built-in functions format() and print() to compute the “informal” or nicely printable string representation of an object.
Beyond those explicitly listed instances, there's no reason to expect that implicit conversion will occur.
And from the definition of io.TextIOBase.write(s):
Write the string s to the stream and return the number of characters written.
So it's explicitly expecting a string, which is confirmed by the error message.
The simplest solution is to perform the conversion yourself, by calling str() on the object before passing it to write() as an argument:
fl.write(str(dan))
When I try to print it, everything goes fine, but writing Person to file fails
Why not use print then?
print(dan, file=fl)
That will behave exactly like the print you are expecting (including a trailing newline), but write to a file instead of stdout.
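The two fixes can be compared side by side. The following is only a minimal sketch: it reuses the Person class from the question and writes to an in-memory io.StringIO buffer instead of dan.txt (an assumption made purely so the snippet leaves no file behind).

```python
import io

class Person:
    def __init__(self):
        self.age = 22

    def __str__(self):
        return "my age is {}".format(self.age)

dan = Person()

# io.StringIO stands in for the file opened with open("dan.txt", "wt").
buf = io.StringIO()

buf.write(str(dan))       # explicit conversion; write() adds no newline
buf.write("\n")
print(dan, file=buf)      # print() calls str() itself and appends "\n"

error_message = None
try:
    buf.write(dan)        # write() never calls __str__, so this still fails
except TypeError as exc:
    error_message = str(exc)

contents = buf.getvalue()
```

Both approaches put the same text in the stream; the only difference is that print() appends the newline for you.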
I'm teaching myself how to code with the help of some online tutorials. I've encountered "decorators". I think I understand how they work, but something bothers me. Here's the code given:
def decor(func):
    def wrap():
        print("-----------")
        func()
        print("-----------")
    return wrap
def print_text():
    print("Hello World")
decorated = decor(print_text)
decorated()
output:
-----------
Hello World
-----------
The things that I want to understand are:
Why do you have to use "return wrap" instead of "return wrap()"? If you use the latter, you get "TypeError: 'NoneType' object is not callable".
When I assigned the value of the decorated variable, how come I also had to use "print_text" rather than "print_text()"? It raises the same TypeError if I do.
When I used the variable "decorated", why did I have to call it like a function (adding () at the end)? When I use just "decorated" or "print(decorated)" it shows something completely different.
Sorry for the dumb questions. But I'm just starting out so please bear with me. Also please make your responses beginner-friendly. Thank you.
In Python, just about everything is an object. Functions are objects too. You can reference them by their name:
>>> def print_text():
...     print("Hello World")
...
>>> print_text # **no** call here, this is just referencing the object
<function print_text at 0x10e3f1c80>
>>> print_text() # With a call, so now we *run* the function
Hello World
Adding () to the name told Python to call the function, which caused it to actually execute the function body. Without the call, Python just shows you what the name references.
You can assign function objects to other names too. Those other names can still be called, invoking the function:
>>> other_name = print_text
>>> other_name
<function print_text at 0x10e3f1c80>
>>> other_name()
Hello World
So other_name is just another reference to the same object, and adding () (a call expression) causes the function object to be executed. print_text() and other_name() do the exact same thing, run the code inside the function.
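The difference between referencing and calling can also be checked in a script. This is just an illustrative sketch; the variable names same_object and result are made up for the example.

```python
def print_text():
    print("Hello World")

other_name = print_text                  # no parentheses: only the reference is copied
same_object = other_name is print_text   # both names point at one function object

result = other_name()                    # parentheses: the body runs and prints the text
# print_text has no return statement, so the call evaluates to None.
```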
That's what the name func inside of decor() refers to; it is a reference to the same function object. You passed it in with decor(print_text). Only later on, inside wrap(), does the expression func() call that function object. If you passed in print_text() instead, you'd pass in the None object that the function returned, and None can't be called:
>>> return_value = print_text()
Hello World
>>> return_value is None
True
>>> return_value()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'NoneType' object is not callable
Next, return wrap returns the newly created wrap function object to the caller. If you did return wrap(), you'd return the result of calling the function, not the function object itself.
The whole point of a decorator is to replace the original function object with a new object that does extra stuff. That is why a decorator returns that replacement object (in your example, wrap): when you later call decorated(), you call that wrapper function, which does something extra before and after calling the original function (via the name func, which references print_text).
So what decor(some_function) does is return a new function object, one that'll print something, call the function object that was passed in, then print something else. That new function object can then be used to replace the old function object.
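Replacing the old function object is what the @ syntax does. Here is a small sketch using the same decor function from the question; writing @decor above the definition is shorthand for print_text = decor(print_text).

```python
def decor(func):
    def wrap():
        print("-----------")
        func()                # call the original function object
        print("-----------")
    return wrap               # return the function itself, not wrap()

@decor                        # same as: print_text = decor(print_text)
def print_text():
    print("Hello World")

print_text()                  # the name print_text now refers to wrap
```

After decoration, print_text.__name__ is "wrap", which shows that the name really has been rebound to the replacement object.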
print.__doc__ outputs:
SyntaxError: invalid syntax
whereas
>>> getattr(__builtin__,"print").__doc__
Outputs:
print(value, ..., sep=' ', end='\n', file=sys.stdout)
Prints the values to a stream, or to sys.stdout by default. Optional keyword arguments:
file : a file-like object (stream); defaults to the current sys.stdout.
sep: string inserted between values, default a space.
end: string appended after the last value, default a newline.
Can anyone help me understand why print.__doc__ is giving a syntax error instead of printing the docstring?
In Python 2 (or Python < 2.6 to be very exact) print is absolutely nothing like a function, and thus does not have a docstring. It doesn't even evaluate all of its arguments before it starts printing:
>>> print 42, a
42
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'a' is not defined
42 was printed before a was evaluated. print is a statement that expects zero or more comma-separated expressions after it, optionally preceded by the construct >> file. The construct print.__doc__ is therefore illegal; it makes as little sense as if.__doc__ or return.__doc__.
However, starting with Python 2.6, the print function is available in the __builtin__ module. It is not used by default, because the print statement collides with it, unless parsing of the print statement is disabled with from __future__ import print_function.
Print isn't globally available as a function in Python 2, so you can't treat it as an object. It's a statement.
In Python 3, or Python 2 with from __future__ import print_function, however, print is a normal function and you can read the __doc__ attribute.
See: https://docs.python.org/2/library/functions.html#print
Note: This function is not normally available as a built-in since the name print is recognized as the print statement. To disable the statement and use the print() function, use this future statement at the top of your module:
from __future__ import print_function
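Once print is a function, its docstring can be read like any other attribute. The sketch below is written for Python 3; the names doc and summary are made up for the example, and the exact docstring text differs between Python versions.

```python
# On Python 2.6/2.7 the module would have to start with:
#     from __future__ import print_function
# On Python 3 print is always a function, so nothing extra is needed.

doc = print.__doc__                     # works because print is an ordinary object here
summary = doc.strip().splitlines()[0]   # first line of the docstring
print(summary)
```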
I am trying to get some args working in a class, I already got it running in a function from How to use *args in a function?.
I'm trying to get that function into a class but I don't seem to understand how to initialize that class which has an init function taking *args. The code is as following :
class classname(object):
    def __init__(self, *args):
        <code-snip>
        ...
        </code-snip>
if __name__ == "__main__":
    start = classname()
    start()
So here I'm confused on what to do with 'start()'. Do I have to use 'start(*sys.argv[1:])' or 'start()'. Both doesn't seem to work. I want to get the *args which is expected in init to be passed properly.
Any pointers please.
Thanks a ton..
======
I'm sorry if I wasn't clear on detailing how it didn't work.
a) While using start(*sys.argv[1:])
Traceback (most recent call last):
  File "check.py", line 126, in <module>
    start(*sys.argv[1:])
TypeError: 'check' object is not callable
b) While using start(), I get :
Traceback (most recent call last):
  File "check.py", line 126, in <module>
    start()
TypeError: 'check' object is not callable
These were the errors which came up.
@alko, yes, you are correct. I was looking for how to get the *args in init passed properly.
Objects are instantiated by passing arguments to the class constructor. They are in turn initialized by the __init__ function. In your example this would be
start = ClassName(*sys.argv[1:])
That expression is processed as follows:
A new instance of ClassName is created with object.__new__(ClassName, *sys.argv[1:]) and bound to the name start in the local namespace. From then on, the start object may be referenced inside your if __name__ == "__main__" block.
Its contents are in turn initialized by invoking start.__init__(*sys.argv[1:]). Note that the arguments to __init__ are the same ones passed to the constructor.
And read PEP 8 for python naming convention. That is:
Class names should normally use the CapWords convention.
Your example contains a class which is first instantiated - which involves calling __init__() - and then called - which is done by calling __call__().
So your stuff should be put in the call start = classname(...) in order to be passed to __init__().
The call to the newly instantiated object will fail, however, unless it contains a __call__() method. That would have been easier to answer if you had told us what
Both doesn't seem to work.
exactly means.
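Both answers can be combined into one small sketch. The class name ClassName follows the PEP 8 suggestion above, and the __call__ method is a hypothetical addition, included only so that the later start() call has something to run.

```python
import sys

class ClassName(object):
    def __init__(self, *args):
        # args is a tuple of every positional argument given to ClassName(...)
        self.args = args

    def __call__(self):
        # Hypothetical: defining __call__ is what makes start() legal.
        return "called with {} argument(s)".format(len(self.args))

if __name__ == "__main__":
    start = ClassName(*sys.argv[1:])   # the arguments go to __init__ here
    print(start())                     # this invokes __call__, not __init__
```

Without the __call__ method, the last line would raise the same "object is not callable" TypeError shown in the question.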
When creating a new class instance, I'm trying to call a method in a different class however can't get it to work. Here's what I have:
class DataProject(object):
    def __init__(self, name=None,input_file=None,datamode=None,comments=None,readnow=True):
        ..............
        # here's where I'm trying to call the method in the other class
        if type(input_file) == str:
            self.input_file_format = self.input_file.split(".")[-1]
            if readnow:
                getattr(Analysis(),'read_'+self.input_file_format)(self,input_file)
class Analysis(object):
    def __init__(self):
        pass # nothing happens here atm
    def read_xlsx(self,parent,input_file):
        """Method to parse xlsx files and dump them into a DataFrame"""
        xl = pd.ExcelFile(input_file)
        for s in sheet_names:
            parent.data[s]=xl.parse(s)
I'm getting a NameError: global name 'read_xlsx' is not defined when I run this with afile.xlxs as input which made me think that I just discovered a massive hole in my Python knowledge (not that there aren't many but they tend to be hard to see, sort of like big forests...).
I would have thought that getattr(Analysis(), ... ) would access the global name space in which it would find the Analysis class and its methods. And in fact print(globals().keys()) shows that Analysis is part of this:
['plt', 'mlab', '__builtins__', '__file__', 'pylab', 'DataProject', 'matplotlib', '__package__', 'W32', 'Helpers', 'time', 'pd', 'pyplot', 'np', '__name__', 'dt', 'Analysis', '__doc__']
What am I missing here?
EDIT:
The full traceback is:
Traceback (most recent call last):
  File "C:\MPython\dataAnalysis\dataAnalysis.py", line 101, in <module>
    a=DataProject(input_file='C:\\MPython\\dataAnalysis\\EnergyAnalysis\\afile.xlxs',readnow=True)
  File "C:\MPython\dataAnalysis\dataAnalysis.py", line 73, in __init__
    getattr(Analysis(),'read_'+self.input_file_format)(self,input_file)
  File "C:\MPython\dataAnalysis\dataAnalysis.py", line 90, in read_xls
    read_xlsx(input_file)
NameError: global name 'read_xlsx' is not defined
My main call is:
if __name__=="__main__":
    a=DataProject(input_file='C:\\MPython\\dataAnalysis\\EnergyAnalysis\\afile.xlx',readnow=True)
From the full traceback, it appears that your DataProject class is calling (successfully) the Analysis.read_xls method, which in turn is trying to call read_xlsx. However, it's calling it as a global function, not as a method.
Probably you just need to replace the code on line 90, turning read_xlsx(input_file) into self.read_xlsx(input_file), though you might need to pass an extra parameter for the parent DataProject instance too.
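A rough sketch of that change follows. The question does not show the body of read_xls, so everything here apart from the self.read_xlsx call is an assumption: Parent is a hypothetical stand-in for DataProject, and read_xlsx only records the file name instead of parsing it with pandas.

```python
class Analysis(object):
    def read_xlsx(self, parent, input_file):
        # Stand-in body; the real method parses the workbook with pandas.
        parent.data["source"] = input_file

    def read_xls(self, parent, input_file):
        # A bare read_xlsx(input_file) is looked up as a global name and
        # raises NameError. Going through self finds the method, and the
        # parent object is passed along as the extra parameter.
        self.read_xlsx(parent, input_file)

class Parent(object):          # hypothetical stand-in for DataProject
    def __init__(self):
        self.data = {}

project = Parent()
getattr(Analysis(), "read_" + "xls")(project, "afile.xls")
```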
getattr() works as you describe it in both Python 2.x and Python 3.x. The bug must be somewhere else.
This modification of your code (none of the core logic is changed) works fine for instance:
class DataProject(object):
    def __init__(self, name="myname",input_file="xlsx",datamode=None,comments=None,readnow=True):
        if type(input_file) == str:
            self.input_file_format = input_file.split(".")[-1]
            if readnow:
                getattr(Analysis(),'read_'+self.input_file_format)(self,input_file)
class Analysis(object):
    def __init__(self):
        pass # nothing happens here atm
    def read_xlsx(self,parent,input_file):
        """Method to parse xlsx files and dump them into a DataFrame"""
        print("hello")
a=DataProject()
Output is:
$ python3 testfn.py
hello
Why using getattr() in this way is usually a bad idea
The way you are using getattr forces a naming convention on your methods (read_someformat). The naming of your methods should not be a core part of your program's logic. You should always be able to change a function's name at every call and definition of that function and leave the behaviour of the program intact.
If a file format needs to be handled by a specific method this logic should be delegated to some unit (e.g a function) with responsibility for this. One way (there are others) of doing this is to have a function which takes the input and decides which function needs to handle it:
def read_file(self, file, format):
    if format == 'xls':
        self.read_xls(file)
    elif format == 'csv':
        self.read_csv(file)
The above snippet does have its issues too (a better way to do it would be the chain of responsibility pattern for example) but it will be fine for small scripts and is much nicer.
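One of the other ways is a lookup table, sketched below. This is a dictionary dispatch, not the chain of responsibility pattern mentioned above, and the class name Reader and the placeholder method bodies are invented for the example.

```python
class Reader(object):                      # hypothetical class name
    def read_xls(self, file):
        return "xls:" + file               # placeholder for real parsing

    def read_csv(self, file):
        return "csv:" + file               # placeholder for real parsing

    def read_file(self, file, format):
        # The mapping is explicit, so methods can be renamed freely as
        # long as this one table is updated.
        handlers = {"xls": self.read_xls, "csv": self.read_csv}
        try:
            handler = handlers[format]
        except KeyError:
            raise ValueError("unsupported format: {!r}".format(format))
        return handler(file)
```

An unknown format now fails with a clear ValueError instead of being silently ignored, as it would be in the if chain.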
When I type help('string') in the Python interpreter I get information about the string class. There, upper() is listed as a function. Yet I can only call it as a method, like "hi".upper(), instead of upper("hi").
So one could assume that any method will be listed as a function in the docstrings of the built-in modules. Yet when I do help('list'), methods of the list class are listed as methods in the docstrings!
Why is this so? Is it only because the person who wrote the docstrings was inconsistent, or because different people wrote them? Or do these methods (the ones called 'functions' versus the ones called 'methods' in the docstrings) actually have different properties?
When you searched for help('string'), you were looking for the docstrings of the string module. If you do help(str) or help('str') you'll get the docstrings of the str type, and here upper appears as a method!
As you can see here, the function upper from the string module is actually a function and not a method:
>>> upper('hi')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'upper' is not defined
>>> 'hi'.upper() # method from the str type
'HI'
>>> from string import upper
>>> upper('hi') # function from the string module
'HI'
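The session above relies on Python 2, where the string module still exported upper. On Python 3 that module-level function is gone and only the str method remains, as this short check suggests (the variable names are made up for the example):

```python
import string

has_module_function = hasattr(string, "upper")   # False on Python 3
via_method = "hi".upper()                        # method called on an instance
via_type = str.upper("hi")                       # same method, called through the type
```

Calling str.upper("hi") is the closest Python 3 equivalent of the old function-style call.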
You mean to do help('str'), not help('string'). str is a type, string is a module providing functions for working with strings.
You are creating an instance of that object and then calling help on that instance.
So these all work:
help(1)
help({})
help([])
help('')
help(dir)
help(help)
Help grabs the docstring for that instance, and gives it back to you.
When you create your own objects, you can put in useful docstrings or whatever you want.
There's nothing wrong with what you see.
>>> help('string')
Will show you the string module documentation. And it looks like there's an upper function inside:
>>> import string
>>> string.upper('hello')
'HELLO'
I'd say that this upper is the same that is called if you do:
>>> 'hello'.upper()
But I'm not sure.
Notice that a string '' is a str type not a string type. This means that you're probably looking for:
>>> help('str')
And here you'll see too the str.upper method.
This is because 'string' is a string. So is 'list'
To get a similar result for lists, try help([])