Python not raising an exception on % without conversion specifier - python

The % operator for string formatting is described here.
Usually, when presented with a string that has no conversion specifier, it will raise TypeError: not all arguments converted during string formatting. For instance, "" % 1 will fail. So far, so good.
Sometimes, it won't fail, though, if the argument on the right of the % operator is something empty: "" % [], or "" % {} or "" % () will silently return the empty string, and it looks fair enough.
The same with "%s" instead of the empty string will convert the empty object into a string, except for the last one, which will fail; I think that's an instance of the problems of the % operator, which are solved by the format method.
There is also the case of a non-empty dictionary, like "" % {"a": 1}, which works because it's really supposed to be used with named type specifiers, like in "%(a)d" % {"a": 1}.
However, there is one case I don't understand: "" % b"x" will return the empty string, no exception raised. Why?

I'm not 100% sure, but after a quick look in the sources, I guess the reason is the following:
when there's only one argument on the right of %, Python checks whether it has the __getitem__ method and, if so, assumes it to be a mapping and expects us to use named formats like %(name)s. Otherwise, Python creates a single-element tuple from the argument and performs positional formatting. The argument count is not checked for mappings; therefore, since bytes and lists do have __getitem__, they won't fail:
>>> "xxx" % b'a'
'xxx'
>>> "xxx" % ['a']
'xxx'
Consider also:
>>> class X: pass
...
>>> "xxx" % X()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: not all arguments converted during string formatting
>>> class X:
...     def __getitem__(self, x): pass
...
>>> "xxx" % X()
'xxx'
Strings are an exception to this rule: they have __getitem__, but are still "tuplified" for positional formatting:
>>> "xxx" % 'a'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: not all arguments converted during string formatting
Of course, this "sequences as mappings" logic doesn't make much sense, because formatting keys are always strings:
>>> "xxx %(0)s" % ['a']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: list indices must be integers, not str
but I doubt anyone is going to fix that, given that % is abandoned anyways.

The offending line is in unicodeobject.c. It treats every object that is a "mapping" and is explicitly not a tuple or a string (or a subclass thereof) as a "dictionary", and for those it is not an error if not all arguments are converted.
The PyMapping_Check is defined as:
int
PyMapping_Check(PyObject *o)
{
    return o && o->ob_type->tp_as_mapping &&
           o->ob_type->tp_as_mapping->mp_subscript;
}
That is, any type that defines tp_as_mapping with an mp_subscript slot is a mapping.
And bytes does define that, as does any other object with __getitem__. Thus, in Python 3.4 at least, no object with __getitem__ other than a tuple or a string will fail as the right-hand argument to the % format operator.
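As a rough illustration of the check described above, the following sketch approximates in pure Python how the right-hand operand is classified. The names treated_as_mapping, try_format and WithGetitem are hypothetical, and hasattr(..., "__getitem__") is only an approximation of the C-level PyMapping_Check, so treat this as a sketch rather than the actual implementation.

```python
def treated_as_mapping(arg):
    """Approximate the C-level test: a mapping that is not a tuple or a str.

    Assumption: hasattr(type(arg), "__getitem__") stands in for
    PyMapping_Check, which really inspects the mp_subscript slot.
    """
    return hasattr(type(arg), "__getitem__") and not isinstance(arg, (tuple, str))


def try_format(template, arg):
    """Return the formatted string, or the TypeError message on failure."""
    try:
        return template % arg
    except TypeError as exc:
        return f"TypeError: {exc}"


class WithGetitem:
    """Hypothetical class whose only feature is a __getitem__ method."""

    def __getitem__(self, key):
        return None


for candidate in (b"x", ["a"], {"a": 1}, WithGetitem(), "a", 1):
    print(
        type(candidate).__name__,
        treated_as_mapping(candidate),
        try_format("xxx", candidate),
    )
```

Operands classified as mappings pass through silently, while the string and the integer are wrapped in a one-element tuple and trigger the argument-count error.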
Now this is a change from Python 2.7. The reason for it is that there is no way to detect all possible types that could be used for %(name)s formatting except by accepting every type that implements __getitem__, though the most obvious mistakes have been taken out. When Python 3 was released, no one added bytes to the excluded types, though it clearly doesn't support strings as arguments to __getitem__; but list is not excluded either.
Another oversight is that a list cannot be used for formatting for positional parameters.
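To make the last point concrete, here is a small sketch (the variable names are illustrative only): a list is handed to % as a single argument, so positional formatting with several specifiers needs an explicit conversion to a tuple.

```python
values = ["a", "b"]

# The whole list is consumed by the first %s, leaving nothing for the second.
try:
    "%s %s" % values
    outcome = "no error"
except TypeError as exc:
    outcome = f"TypeError: {exc}"

# Converting to a tuple makes each element a separate positional argument.
joined = "%s %s" % tuple(values)

# With a single specifier, the list itself is formatted as one value.
single = "%s" % values

print(outcome)
print(joined)
print(single)
```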

Related

Is it possible for a class to define a result for the %d format specifier?

I have an instance of a class A passed as the value to the format specifier %d in the string formatting using the % operator. Without any preparation, this will result in the following error message: TypeError: %d format: a number is required, not A:
class A: pass
'%d' % A()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: %d format: a number is required, not A
If the class A defines a method called __int__, this gets called:
class A:
    def __int__(self): return 42
'%d' % A()
'42'
In my use case I would like formatting with %d to create string representations of the instances of my class that do not look like a number (but instead like an arbitrary string such as n/a, ²³, or similar).
Is this possible?
I was considering returning another special object in the __int__ method but that resulted in a warning (only returning basic ints is allowed, anything else might become illegal in later versions; I'm trying on Python 3.7.4, btw) and no success eventually.
I know it is an easy task using the __format__ method in combination with the '{0}'.format(a) way of formatting strings, but that's not what I'm asking for. I'm specifically and only asking about formatting using the %d specifier in a format string used with the % operator.
Printing an arbitrary string using the %d specifier will not work. As far as I know, there is no built-in way to change this. However, if you want to print a string representation of an object, there are three different ways using the modulo operator (%).
%s - Returns a string using the built-in str() function
%r - Returns a string using the built-in repr() function
%a - Returns a string using the built-in ascii() function
Using these three, you can customize the string through their corresponding dunder methods. For example, %s uses the built-in str() function. To change what str() returns, put this within your class definition:
def __str__(self):
    return "String Representation"
The Python documentation explains in detail why %d can't print strings. Make sure to scroll down to where the % explanation is (about halfway down the page).
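A minimal sketch of the suggestion above, assuming a hypothetical class named Missing: %d goes through __int__ and therefore always yields digits, while %s goes through __str__ and can yield any text.

```python
class Missing:
    """Hypothetical value type that should display as 'n/a'."""

    def __int__(self):
        # %d converts the object with int(), so only digits can come out.
        return 42

    def __str__(self):
        # %s converts the object with str(), so any text is possible.
        return "n/a"


value = Missing()
as_number = "%d" % value
as_text = "%s" % value

print(as_number)
print(as_text)
```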

How to restrict a parameter of python function must be string or a function like lambda expression

For example, in the following function, the first parameter should be restricted to a string and the second parameter should be a function. However, this syntax is wrong for both. Can anyone suggest the correct syntax to impose the type restriction?
def GetArg(msg:String, converter:lambda, default):
    print("{}, the default is '{}':".format(msg, default))
    return converter(stdin.readline().strip())
It gives error
Traceback (most recent call last):
File "E:/stdin_ext.py", line 4, in <module>
def GetArg(msg:String, converter, default:String):
NameError: name 'String' is not defined
and
File "E:/stdin_ext.py", line 4
def GetArg(msg:String, converter:lambda, default:String):
^
SyntaxError: invalid syntax
You can use the typing module.
from sys import stdin
from typing import Callable

def get_arg(msg: str, converter: Callable[[str], str], default) -> str:
    print(f"{msg}, the default is '{default}':")
    return converter(stdin.readline().strip())
assuming your converter function takes a string and returns a string, and that get_arg returns a string. Note that the code has also been modified to follow Python's naming conventions and uses f-strings instead of older-style format strings.
Also note that these type hints are just that: hints. Python does not check them at runtime, although static type checkers such as mypy and certain IDEs will help you ensure they are correct.
You should use str instead of String and Callable instead of lambda:
from sys import stdin
from typing import Callable

def GetArg(msg:str, converter:Callable, default):
    print("{}, the default is '{}':".format(msg, default))
    return converter(stdin.readline().strip())
That is, unless you have a specific class called String and you are expecting an argument of its type.
When you write lambda you are defining a new lambda, not specifying a type for a function, whence the error.
I believe it is also important to point out that type hints in Python are only a useful notation and do not raise any error by themselves. For example, I could still call GetArg(1, 2, 3) and not get any error from the type hints alone (of course, I would get an error afterwards when the function tries to call an int).
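If the restriction really has to hold at runtime, one option is to validate the arguments by hand. The sketch below is only an illustration: the function names and the raw parameter (a stand-in for the line read from stdin) are assumptions, not part of the original code.

```python
from typing import Callable


def get_arg(raw: str, converter: Callable[[str], int]) -> int:
    """Convert raw input, validating the argument types explicitly.

    `raw` is a hypothetical stand-in for the line read from stdin.
    """
    if not isinstance(raw, str):
        raise TypeError(f"raw must be a str, got {type(raw).__name__}")
    if not callable(converter):
        raise TypeError("converter must be callable")
    return converter(raw.strip())


def unchecked(raw: str, converter: Callable[[str], int]) -> int:
    # Same annotations, but no validation: the hints alone stop nothing.
    return converter(raw)


print(get_arg(" 42 \n", int))
print(unchecked(7, lambda x: x + 1))  # an int slips through despite the str hint
```

isinstance and callable are the runtime counterparts of the str and Callable annotations; the annotations themselves remain documentation for readers and static tools.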

String Formatting Confusion

O'Reilly's Learning Python: Powerful Object-Oriented Programming by Mark Lutz teaches different ways to format strings.
The following code has me confused. I am interpreting 'ham' as filling the format placeholder at index zero, and yet it still shows up at index one of the output string. Please help me understand what is actually going on.
Here is the code:
template = '{motto}, {0} and {food}'
template.format('ham', motto='spam', food='eggs')
And here is the output:
'spam, ham and eggs'
I expected:
'ham, spam and eggs'
The only thing you have to understand is that {0} refers to the first (zeroth) unnamed argument sent to format(). We can see this to be the case by removing all unnamed references and trying to use a linear fill-in:
>>> "{motto}".format("boom")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'motto'
You would expect that 'boom' would fill in 'motto' if this is how it works. But, instead, format() looks for a parameter named 'motto'. The key hint here is the KeyError. Similarly, if it were just taking the sequence of parameters passed to format(), then this wouldn't error, either:
>>> "{0} {1}".format('ham', motto='eggs')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: tuple index out of range
Here, format() is looking for the second unnamed argument in the parameter list - but that doesn't exist so it gets a 'tuple index out of range' error. This is just the difference between the unnamed (which are positionally sensitive) and named arguments passed in Python.
See this post to understand the difference between these types of arguments, known as 'args' and 'kwargs'.
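As a small sketch of that distinction (the variable names args, kwargs and manual are illustrative), the same call can be written with an explicit tuple of unnamed arguments and a dictionary of named ones, which makes it visible where each field is looked up.

```python
template = "{motto}, {0} and {food}"

args = ("ham",)                              # unnamed arguments, indexed by position
kwargs = {"motto": "spam", "food": "eggs"}   # named arguments, looked up by name

result = template.format(*args, **kwargs)

# Hand-written equivalent of the lookups format() performs for this template.
manual = "{}, {} and {}".format(kwargs["motto"], args[0], kwargs["food"])

print(result)
print(manual)
```

The position of a field inside the template plays no role; only its name or index decides which argument fills it.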

AttributeError: 'NoneType' object has no attribute 'format'

print ("Hello World")
print ("{} World").format(Hello)
I'm working on my first "Hello World" program and I can get it to work by using the print function and just a simple string text but when I try to use .format it gives me the error:
AttributeError: 'NoneType' object has no attribute 'format'
Is this saying that I need to initialize a variable for .format or am I missing something?
Your brackets are wrong
print("Hello World")
print("{} World".format('Hello'))
Note the errors:
The format method is an attribute of str, so it needs to be called on the string.
Unless it is declared as a variable, Hello should be the string 'Hello'.
For Py2 you can do
print "{} World".format('Hello')
Function print returns None, so that's obviously what you're getting from the start of your second statement, namely
print ("{} World")
On that return value of None, you then call .format(Hello) -- even if a variable named Hello was assigned somewhere in your code (and you're not showing it to us!), you're calling that .format method on the None returned from your print call, which makes no sense.
Rather, you want to call .format on the string "{} World" -- so the closed-paren right after the string and before the dot is clearly a terrible mistake! Move that ) to the end of the statement, after the call to format on that string.
Moreover, is Hello the name of a variable whose value you want to format? I sure hope not, else why haven't you shown us that variable being assigned?! I suspect you want to format a constant string and just absent-mindedly forgot to put it in quotes (to show it's a constant, not the name of a variable!) -- 'Hello', not Hello without quotes! That is what you should be passing to the proper form of the .format call...!
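A short sketch of the point above, for Python 3 (the variable names are illustrative): capturing the return value of print shows where the None comes from, and calling .format on the string itself gives the intended result.

```python
# print() writes its argument to stdout and returns None.
returned = print("{} World")

# Calling .format on the string, before printing, gives the intended text.
greeting = "{} World".format("Hello")
print(greeting)
```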
Even though the original asker was wrong in how they used .format, I believe they are still right about one thing: the behavior in Python 3 is different when one of the values is None.
Example
fmt = '{:^9}|{:^13}|{:^18}'
data = [1, None, 'test']
print(fmt.format(*data))
Python2.7
$ python2.7 test
1 | None | test
In Python3.6
$ python3.6 test
Traceback (most recent call last):
  File "test", line 5, in <module>
    print(fmt.format(*data))
TypeError: unsupported format string passed to NoneType.__format__
But if we remove the column-width part of the format
fmt = '{}|{}|{}'
data = [1, None, 'test']
print(fmt.format(*data))
or convert all values to strings using !s
fmt = '{!s:^9}|{!s:^13}|{!s:^18}'
It works just fine in both versions ...
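The two variants can be compared side by side. This is a sketch for Python 3 only, with illustrative variable names; it records the failure of the plain format and the result of the !s variant.

```python
fmt_plain = "{:^9}|{:^13}|{:^18}"
fmt_str = "{!s:^9}|{!s:^13}|{!s:^18}"
data = [1, None, "test"]

# Without !s, None receives the spec '^13' and rejects it on Python 3.
try:
    fmt_plain.format(*data)
    plain_outcome = "ok"
except TypeError as exc:
    plain_outcome = f"TypeError: {exc}"

# With !s, every value is converted with str() first, so the spec is
# applied to a string and centering works for all three columns.
row = fmt_str.format(*data)

print(plain_outcome)
print(row)
```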

Misleading `ValueError` on bad formatting in python 2.7

When I try the following wrong code:
not_float = [1, 2, 3]
"{:.6f}".format(not_float)
I get the following misleading ValueError:
ValueError: Unknown format code 'f' for object of type 'str'
It is misleading, since it might make me think not_float is a string. The same message occurs for other non-float types, such as NoneType, tuple, etc. Do you have any idea why? And should I expect this error message no matter what the type of not_float is, as long as it does not provide its own formatting method for f?
On the other hand, trying:
non_date = 3
"{:%Y}".format(non_date)
brings
ValueError: Invalid conversion specification
which is less informative but also less misleading.
The str.format() method, and the format() function, call the .__format__() method of the objects that are being passed in, passing along everything after the : colon (.6f in this case).
The default implementation of that method, object.__format__(), calls str(self) and then applies format() to that result. This is implemented in C, but in Python that would look like:
def __format__(self, fmt):
    return format(str(self), fmt)
It is this call that throws the exception. For a list object, for example:
>>> not_float.__format__('.6f')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Unknown format code 'f' for object of type 'str'
because this is functionally the same as:
>>> format(str(not_float), '.6f')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Unknown format code 'f' for object of type 'str'
Integers have a custom .__format__ implementation instead; it does not use str() internally because there are integer-specific formatting options. It turns the value into a string differently. As a result, it throws a different exception because it doesn't recognize %Y as a valid formatting string.
The error message could certainly be improved; an open Python bug discusses this issue. Because of changes in how all this works the problem is no longer an issue in Python 3.4 though; if the format string is empty .__format__() will no longer be called.
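The delegation described above can be reproduced on Python 3 with a class that defines __format__ the same way. This is only a sketch: the class name LegacyFormat is hypothetical and it mimics the Python 2.7 default rather than being the actual C implementation.

```python
class LegacyFormat:
    """Hypothetical class mimicking the Python 2.7 default __format__."""

    def __str__(self):
        return "[1, 2, 3]"

    def __format__(self, spec):
        # Delegate to the string's formatter, as the old default did.
        return format(str(self), spec)


try:
    "{:.6f}".format(LegacyFormat())
    message = "no error"
except ValueError as exc:
    message = str(exc)

# The message names 'str' because the spec was applied to str(self).
print(message)
```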
