How to make function's arguments optional as a group? - python

I was just wondering what would be the preferred way in Python to make a group of arguments of a function optional, but only as the whole group.
Meaning: they have to either all be given, or none.
For an example, let's say I want to make a print function, that takes a message string as first positional argument and optionally a file-like object and an encoding as second and third arguments.
Now I want this function to print to stdout if no file is given, and to the file otherwise.
The tricky bit is this: I want this function to always require an encoding to be specified whenever a file is used. And calling this function with an encoding, but no file should also be forbidden.
In Java, I could overload the function and give implementations for both valid variants:
public void print(string message);
public void print(string message, File f, string encoding);
This allows me to call this function in exactly the two ways I want to be possible, with either one or all three arguments.
In Python, I can make single arguments optional by supplying a default value, but I cannot group them together.
def print(msg, file=None, encoding=None): ...
allows me to call the function by providing a message and none, both or just any one of the other parameters:
print("test")
print("test", file=someFile)
print("test", encoding="utf-8")
print("test", file=someFile, encoding="utf-8")
These are all valid calls to the Python declaration above, even though with my implementation, setting an encoding or file without the other one might make no sense.
I am aware that I could simply check both optionals for an invalid default value and raise an Exception at runtime whenever I find only one is set, but I think that is bad for a couple of reasons:
The Exception is raised only if the invalid call is executed, so it might not occur during testing.
I have no way of telling that both parameters are required as a pair by just looking at the declaration or an auto-generated quick reference without diving into the implementation.
No code analysis tool would be able to warn me about an invalid call.
So is there any better way to syntactically specify that a number of optional arguments are grouped together?
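The runtime check mentioned in the question could look roughly like the sketch below. The name print_message and the choice to write encoded bytes to a binary file object are assumptions made for illustration. It also shows the drawback the question describes: the pairing is enforced only when an invalid call actually executes.

```python
import io
import sys


def print_message(msg, file=None, encoding=None):
    """Write msg to stdout, or encode it and write it to a binary file object.

    Hypothetical helper: file and encoding must be given together or not at all.
    """
    # Exactly one of the two being None means the group was only half supplied.
    if (file is None) != (encoding is None):
        raise TypeError("file and encoding must be given together")
    if file is None:
        sys.stdout.write(msg + "\n")
    else:
        file.write((msg + "\n").encode(encoding))


buffer = io.BytesIO()
print_message("test", file=buffer, encoding="utf-8")  # valid: both given
print_message("test")                                 # valid: neither given
```

Calling print_message("test", encoding="utf-8") raises the TypeError, but only at the moment that line runs.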

Python does not support method overloading, and there is no really good way to simulate an overloading design. The best you can do is branch with if statements on the arguments you received, as you already do in your function.
Alternatively, you can accept **kwargs and run each piece of logic only if the corresponding argument is defined.
def a_very_important_method(**kwargs):
    # .get() returns None instead of raising KeyError when a key is missing
    if kwargs.get("arg1") is not None:
        pass  # logic
    if kwargs.get("arg2") is not None:
        pass  # another logic
a_very_important_method(arg1="value1", arg2="value2")

You could make one parameter expect a tuple as input. For example, a 2D array might have a size attribute that requires input in the shape (x, y). That won't save you from checking at runtime whether the supplied values make sense, though.
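Applied to the print example from the question, the tuple idea might be sketched as follows. The function name print_grouped and the parameter name target are hypothetical, and the file object is assumed to be a binary stream so that the encoding is actually used.

```python
import io
import sys
from typing import BinaryIO, Optional, Tuple


def print_grouped(msg: str, target: Optional[Tuple[BinaryIO, str]] = None) -> None:
    """Print to stdout, or to target = (file, encoding) when it is given."""
    if target is None:
        sys.stdout.write(msg + "\n")
        return
    # Unpacking fails unless target holds exactly two items.
    file, encoding = target
    file.write((msg + "\n").encode(encoding))


buffer = io.BytesIO()
print_grouped("test", (buffer, "utf-8"))
print_grouped("test")
```

The signature now makes it visible that the file and the encoding travel together, although a malformed tuple is still only caught at runtime (or by a type checker that understands the annotation).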

After reading the other answers, it seems to me that the simplest and most readable solution is to write the function with all parameters mandatory, and then add a second "wrapper" function that takes a reduced set of parameters, forwards them to the original function, and supplies default values for the remaining parameters:
import sys
def print(msg, file, encoding):
    # no default values here, so no parameter is optional
    pass
def printout(msg):
    # forward the argument and provide default values for the others
    print(msg, sys.stdout, "")
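One further option, not mentioned in the answers above, is typing.overload from the standard library. It lets a static type checker see exactly the two call shapes the question asks for. The sketch below is only an illustration under assumptions: the name emit is hypothetical, the file is assumed to be a binary stream, and the runtime check is still needed because the overload declarations are ignored when the code runs.

```python
import io
import sys
from typing import BinaryIO, overload


@overload
def emit(msg: str) -> None: ...
@overload
def emit(msg: str, file: BinaryIO, encoding: str) -> None: ...


def emit(msg, file=None, encoding=None):
    """Actual implementation; the overloads above only inform type checkers."""
    if (file is None) != (encoding is None):
        raise TypeError("file and encoding must be given together")
    if file is None:
        sys.stdout.write(msg + "\n")
    else:
        file.write((msg + "\n").encode(encoding))


buffer = io.BytesIO()
emit("test", buffer, "utf-8")
emit("test")
```

A type checker such as mypy would be expected to reject a call like emit("test", encoding="utf-8") because it matches neither declared overload.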

Related

Why would an API author prevent positional parameters in Python?

While implementing a neural network with TensorFlow, I came across a method whose parameters caught my attention: tf.nn.sigmoid_cross_entropy_with_logits (Documentation here).
Its first parameter is _sentinel=None, which, according to the documentation:
_sentinel: Used to prevent positional parameters. Internal, do not use.
I understand that with this parameter in place, the following ones have to be passed by name rather than by position, since this one is not supposed to be used. But my question is: in which cases does preventing positional parameters bring a benefit? What is the main goal of doing this? After all, I could also run
tf.nn.sigmoid_cross_entropy_with_logits(None, my_labels, my_logits)
with all arguments positional. I want to clarify that my question is not specific to TensorFlow; it's just the example I found.
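The sentinel idea described in the question can be sketched with a made-up function. This is not TensorFlow's actual implementation; the name cross_entropy and its body are hypothetical and only show how a leading dummy parameter catches accidental positional calls.

```python
def cross_entropy(_sentinel=None, labels=None, logits=None):
    """Hypothetical function that discourages positional arguments."""
    # Any positional argument lands in _sentinel first, so a non-None value
    # means the caller did not name the arguments.
    if _sentinel is not None:
        raise ValueError("call cross_entropy with named arguments (labels=..., logits=...)")
    if labels is None or logits is None:
        raise ValueError("both labels and logits must be provided")
    return list(zip(labels, logits))


pairs = cross_entropy(labels=[1, 0], logits=[0.5, 0.2])
```

As the question observes, passing None explicitly in the first position still gets through, so this is a guard against mistakes rather than strict enforcement.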
Positional parameters couple the caller and receiver on the order of the parameters. This makes reordering the receiver's parameters more difficult.
For example, if I have
def foo(a, b, c):
    do_stuff(a, b, c)
and I decide, for some reason (perhaps I want to make a partial function), that it would be better to have
def foo(b, a, c):
    do_stuff(a, b, c)
But now I have callers in the wild and it would be very rude to change my contract, so I'm stuck.
Sandi Metz in Practical Object-Oriented Design in Ruby also addresses this. (I know this is python, but oop is oop)
When the code [is changed to use keyword arguments], it lost its dependency
on argument order but it gained a dependency on the names of the keys
in the [keyword arguments]. This change is healthy. The new dependency is
more stable than the old, and thus this code faces less risk of being
forced to change. Additionally, and perhaps unexpectedly, the [keywords]
provides one new, secondary benefit: The key names in the hash furnish
explicit documentation about the arguments. This is a byproduct of
using a hash but the fact that it is unintentional makes it no less
useful. Future maintainers of this code will be grateful for the
information.
Keyword arguments are also nice if you have a lot of parameters. Order is easy to get wrong. It may also make a nicer API in the opinion of the authors.
PEP 3102 also addresses this, but I find the rationale unsatisfying from the perspective of "why would I choose to design something like this":
The current Python function-calling paradigm allows arguments to be
specified either by position or by keyword. An argument can be filled
in either explicitly by name, or implicitly by position.
There are often cases where it is desirable for a function to take a
variable number of arguments. The Python language supports this using
the 'varargs' syntax (*name), which specifies that any 'left over'
arguments be passed into the varargs parameter as a tuple.
One limitation on this is that currently, all of the regular argument
slots must be filled before the vararg slot can be.
This is not always desirable. One can easily envision a function which
takes a variable number of arguments, but also takes one or more
'options' in the form of keyword arguments. Currently, the only way to
do this is to define both a varargs argument, and a 'keywords'
argument (**kwargs), and then manually extract the desired keywords
from the dictionary.
What is the use for keyword only parameters:
For some functions, it is impossible to do otherwise (e.g. print(a, b, end=''))
It prevents you from making silly mistakes, consider the following example:
# if it wasn't made with kw-only parameters, this would return 3
>>> sorted(3, 1)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: sorted expected 1 arguments, got 2
>>> sorted((1,2), reverse=True)
[2, 1]
It allows you to change things later:
# if
def sorted(iterable, reverse=False): ...
# becomes
def sorted(iterable, key=None, reverse=False): ...
# you can guarantee backwards compatibility
First, a caveat: I can't know the intention of the person who wrote that. However, I can offer a reason why “prevent positional parameters” might be desirable.
It's often important that a parameter be keyword-only, that is, it must be used only by name. The parameter is not conceptually an input to the function's purpose; it's more a modifier (change the behaviour in this way), or an external resource (here is the log file to emit your messages to), etc.
For that reason, Python 3 now allows you to define, in the signature of the function, specific parameters as keyword-only parameters. The change is documented in PEP 3102 Keyword-only arguments along with rationale.
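A minimal sketch of the Python 3 syntax: parameters listed after a bare * in the signature can only be passed by name. The function write_log and its parameters are hypothetical and exist only to illustrate the rule.

```python
def write_log(message, *, sink=None, verbose=False):
    """Hypothetical function: sink and verbose are keyword-only."""
    line = message.upper() if verbose else message
    if sink is not None:
        sink.append(line)  # the "external resource" kind of parameter
    return line


collected = []
shouted = write_log("started", sink=collected, verbose=True)

try:
    write_log("started", collected, True)  # modifiers passed by position
except TypeError as exc:
    positional_error = str(exc)
```

Because the modifiers must be named, later reordering them or adding new ones does not break existing callers.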

How Can I Pythonically pass complex arguments to functions?

I have a SOAP web service I have to work with, and one of the commands it supports is a "SQL like" query where I input a select, from, and where statements. I think the "where" clause will be most demonstrative of what I'm trying to do so here:
def sql_soap(tablename, where):
    sql_where = [soap_object(where_statement) for where_statement in where]
    return query
sql_soap('student', where=[{'Condition': '=', 'Field': 'Subject', 'Value': 'Calculus'}])
Basically, the way I've thought to do this is to package a list of where-clause dictionaries. But the dictionaries should always have the same keys. Is there a way to define this type in the function definition? I don't want kwargs or args because I know in advance the data structure.
One thing I looked at was
def sql_soap(tablename, *, where):
Apparently this is only available in newer versions of Python (which I have) but my understanding is the where clause after this is expecting a dictionary, and I want a list of dictionaries.
Generally speaking how do I define a function argument, when I want a dictionary inside of a list, or something else nested? Is there any way besides a dictionary, that I can get a single function parameter (where) to accept all of the arguments I need to make the SOAP where object?
I do not know if this helps, but you could use *where to expect an arbitrary amount of args:
def sql_soap(tablename, *where):
    sql_where = [soap_object(where_statement) for where_statement in where]
    return query
sql_soap('student',
{'Condition':'=','Field':'Subject','Value':'Calculus'},
{'Condition':'=','Field':'Subject2','Value':'Calculus2'},
)
Another thing you can do, although it would probably mean changing a lot of code, is to use a namedtuple instead of dictionaries:
from collections import namedtuple
wheretuple = namedtuple("wheretuple", "field condition value")
sql_soap('student', wheretuple("Subject", "=", "Calculus"))
You have not specified anything about types. The * syntax in a function definition only specifies how a caller can provide arguments for the parameters. Parameters before it can be filled with both positional arguments and keyword arguments, those that follow the * can only be specified with keyword arguments.
Put differently, the following calls are now legal:
sql_soap('student', where=[...]) # one positional, one keyword argument
sql_soap(tablename='student', where=[...]) # two keyword arguments
but the following is not:
sql_soap('student', [...]) # two positional arguments
You'll instead get a TypeError exception, TypeError: sql_soap() takes 1 positional argument but 2 were given.
Using * in a function definition does not say anything about what type of objects the parameter accepts. You can still pass anything you like to the function call.
Perhaps you got confused with the *args and **kwargs syntax in function definitions, where those parameters capture all remaining positional or keyword arguments passed in, which did not address any of the other parameters. They don't say anything about the argument types either; instead they put those remaining argument values in a tuple and dictionary, respectively.
Python does now support type hinting, but a plain dict type hint will not let you specify which keys a dictionary must contain (typing.TypedDict, added in Python 3.8, is the exception).
I'd use named tuples instead here, together with type hints:
from typing import NamedTuple, Sequence
class WhereClause(NamedTuple):
    condition: str
    field: str
    value: str
def sql_soap(tablename: str, where: Sequence[WhereClause]):
    ...
This lets the type checker know that the where argument must be a sequence type (like a list), that contains only WhereClause instances. And those instances will have specific attributes.
Anytime you want to use any of the WhereClause instances, you can use attributes to get at the contents, so whereclause.condition and whereclause.value.
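To make the attribute access concrete, here is a small runnable sketch. The helper render_where is a hypothetical stand-in for whatever builds the real SOAP where objects; it simply turns each clause into a string.

```python
from typing import List, NamedTuple, Sequence


class WhereClause(NamedTuple):
    condition: str
    field: str
    value: str


def render_where(where: Sequence[WhereClause]) -> List[str]:
    """Hypothetical stand-in for constructing the SOAP where objects."""
    return [f"{clause.field} {clause.condition} {clause.value!r}" for clause in where]


clauses = [WhereClause(condition="=", field="Subject", value="Calculus")]
rendered = render_where(clauses)
```

Leaving out one of the three fields when constructing a WhereClause raises a TypeError immediately, which is the structural guarantee a plain dictionary does not give.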

Why a calling function in python contains variable equal to value?

I have started to learn Python, and I would like to ask about something I found a little magical in this language.
I should note that before learning Python I worked with PHP, and I never noticed this there.
I have noticed that some constructor or method calls in Python take this form:
object.call(variable1 = value1, variable2 = value2)
For example, in Flask:
app.run(debug=True, threaded=True)
Is there any reason for this convention? Or is there some semantic reason stemming from the language fundamentals? I haven't seen this in PHP as often as in Python, which is why I'm surprised. I'm curious whether there is some magic here or whether it's only a convention to make code easier to read.
These are called keyword arguments, and they're usually used to make the call more readable.
They can also be used to pass the arguments in a different order from the declared parameters, or to skip over some default parameters but pass arguments to others, or because the function requires keyword arguments… but readability is the core reason for their existence.
Consider this:
app.run(True, False)
Do you have any idea what those two arguments mean? Even if you can guess that the only two reasonable arguments are threading and debugging flags, how can you guess which one comes first? The only way you can do it is to figure out what type app is, and check the app.run method's docstring or definition.
But here:
app.run(debug=True, threaded=False)
It's obvious what it means.
It's worth reading the FAQ What is the difference between arguments and parameters?, and the other tutorial sections near the one linked above. Then you can read the reference on Function definitions for full details on parameters and Calls for full details on arguments, and finally the inspect module documentation on kinds of parameters.
This blog post attempts to summarize everything in those references so you don't have to read your way through the whole mess. The examples at the end should also serve to show why mixing up arguments and parameters in general, keyword arguments and default parameters, argument unpacking and variable parameters, etc. will lead you astray.
Specifying arguments by keyword often creates less risk of error than specifying arguments solely by position. Consider this function to compute loan payments:
def pmt(principal, interest, term):
    return ...  # payment calculation goes here
When one tries to compute the amortization of their house purchase, it might be invoked thus:
payment = pmt(100000, 4.2, 360)
But it is difficult to see which of those values should be associated with which parameter. Without checking the documentation, we might think it should have been:
payment = pmt(360, 4.2, 100000)
Using keyword parameters, the call becomes self-documenting:
payment = pmt(principal=100000, interest=4.2, term=360)
Additionally, keyword parameters allow you to change the order of the parameters at the call site, and everything still works correctly:
# Equivalent to previous example
payment = pmt(term=360, interest=4.2, principal=100000)
See http://docs.python.org/2/tutorial/controlflow.html#keyword-arguments for more information.
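For a runnable version of the loan example, the sketch below fills in the body with the standard fixed-rate annuity formula. The units are assumptions, since the answer does not state them: interest is taken to be an annual percentage (4.2 means 4.2%) and term a number of monthly payments.

```python
def pmt(principal, interest, term):
    """Monthly payment for a fixed-rate loan (assumed units, see above)."""
    monthly_rate = interest / 100 / 12
    if monthly_rate == 0:
        # No interest: the principal is simply spread over the term.
        return principal / term
    return principal * monthly_rate / (1 - (1 + monthly_rate) ** -term)


by_position = pmt(100000, 4.2, 360)
by_keyword = pmt(term=360, interest=4.2, principal=100000)
```

Both calls compute the same payment, but only the keyword form remains correct if someone writes the values in a different order.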
They are arguments passed by keyword. There is no semantic difference between keyword arguments and positional arguments.
They are often used like "options", and provide a much more readable syntax for this circumstance. Think of this:
>>> sorted([2,-1,3], key=lambda x: x**2, reverse=True)
[3, 2, -1]
Versus(python2):
>>> sorted([2,-1,3], None, lambda x: x**2, True)
[3, 2, -1]
In this second example can you tell what's the meaning of None or True?
Note that keyword-only arguments, i.e. arguments that you can only specify using this syntax, were introduced in Python 3. In Python 2 any argument can be specified by position (except when using **kwargs, but that's another issue).
There is no "magic".
A function can take:
Positional arguments (args)
Keyworded arguments (kwargs)
Always in this order.
Try this:
def foo(*args, **kwargs):
    print(args)
    print(kwargs)
foo(1, 2, 3, 4, a=8, b=12)
Output:
(1, 2, 3, 4)
{'a': 8, 'b': 12}
Python stores the positional arguments in a tuple, which is immutable, and the keyword ones in a dictionary.
The main utility of the convention is that it allows for setting certain inputs when there may be some defaults in between. It's particularly useful when a function has many parameters, most of which work fine with their defaults, but a few need to be set to other values for the function to work as desired.
Example:
def foo(i1, i2=1, i3=3, i4=5):
    pass  # does something
foo(1, 2, 3, 4)
foo(1, 2, i4=3)
foo(1, i2=3)
foo(0, i3=1, i2=3, i4=5)

Python default arguments and argument names

I was wondering whether 'a=a' and 'b=b' can lead to problems or unexpected behaviour. The code works fine in this example.
def add_func(a=2, b=3):
    return a + b
a = 4
b = 5
answer = add_func(a=a, b=b)
Thanks
Not that I know of, although I'd love to be proved wrong.
The formal language reference defines the lexical structure of a function call. The important bit is that it defines a "keyword_item" as identifier "=" expression. Also, here's what it says about how the arguments to the call are interpreted:
If keyword arguments are present, they are first converted to
positional arguments, as follows. First, a list of unfilled slots is
created for the formal parameters. If there are N positional
arguments, they are placed in the first N slots. Next, for each
keyword argument, the identifier is used to determine the
corresponding slot (if the identifier is the same as the first formal
parameter name, the first slot is used, and so on). If the slot is
already filled, a TypeError exception is raised. Otherwise, the value
of the argument is placed in the slot, filling it (even if the
expression is None, it fills the slot). When all arguments have been
processed, the slots that are still unfilled are filled with the
corresponding default value from the function definition.
This lists a few possible scenarios.
In the simple case, like you mentioned, where there are two formal arguments (a and b), and if you specify the function call using keyword parameters like add_func(a=a, b=b), here's what happens:
Two slots are created to hold the parameters.
Since you didn't provide any positional arguments in the call (just keyword arguments), none of the slots are filled initially.
Each of your keyword arguments is analyzed individually, and the identifier of the argument (the "a" in the a= part) is compared with the names of all the function's formal parameters (the names given to the parameters when the function was defined; in our case, a and b).
When a match occurs, the value of the keyword arguments (in this case, 4!) is used to fill the corresponding slot.
This repeats until all keyword arguments are analyzed. If all slots aren't filled, then Python tries to assign a default value to the remaining slots if one exists. If not, an error is raised.
So, Python treats the "identifier" in a keyword argument completely differently. This is only true for keyword arguments, though; obviously, if you tried something like add_func(b, a), even though your parameters themselves are called b and a, this would not be mapped to the formal parameters in the function; your parameters would be backwards. However, add_func(b=b, a=a) works fine; the positions don't matter as long as they are keyword arguments.
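The slot-filling rules described above can be observed directly with the function from the question. This is only a small demonstration; the names results and duplicate_error are introduced here for illustration.

```python
def add_func(a=2, b=3):
    return a + b


a = 4
b = 5

results = {
    "keywords": add_func(a=a, b=b),    # both slots filled by name
    "reordered": add_func(b=b, a=a),   # order of keyword arguments is irrelevant
    "one keyword": add_func(b=b),      # slot a falls back to its default, 2
    "defaults": add_func(),            # both slots use their defaults
}

try:
    add_func(a, a=a)  # slot a is filled by position, then again by keyword
except TypeError as exc:
    duplicate_error = str(exc)
```

The identifier on the left of each = is looked up among the parameter names, while the expression on the right is evaluated in the caller's scope, which is why a=a is unambiguous.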
It depends on whether the global objects pointed to are mutable or immutable. Immutable objects such as your integers cannot be changed in place (any operation on them produces a new object), so it's safe. Mutable objects such as lists are modified in place and are NOT safe to use this way: any change to them persists between calls and may (and probably will) cause unexpected behaviour.
This:
a = []
def f(a=a):
    pass
Is the same as:
def f(a=[]):
    pass
Which is a known bad practice in Python programs.
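The pitfall is easy to reproduce. In the sketch below, append_shared and append_fresh are hypothetical names: the first uses a list as its default value, the second uses the common None idiom to create a new list on every call.

```python
def append_shared(item, bucket=[]):
    # The default list is created once, when the function is defined.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # A new list is created on each call that omits bucket.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = append_shared(1)
second = append_shared(2)   # same list object as first, now holding both items

fresh_one = append_fresh(1)
fresh_two = append_fresh(2)  # independent of fresh_one
```

Note that this concerns default values in the function definition; passing a=a as a keyword argument at the call site, as in the question, does not create this sharing by itself.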

Vector in python

I'm working on this project which deals with vectors in python. But I'm new to python and don't really know how to crack it. Here's the instruction:
"Add a constructor to the Vector class. The constructor should take a single argument. If this argument is either an int or a long or an instance of a class derived from one of these, then consider this argument to be the length of the Vector instance. In this case, construct a Vector of the specified length with each element is initialized to 0.0. If the length is negative, raise a ValueError with an appropriate message. If the argument is not considered to be the length, then if the argument is a sequence (such as a list), then initialize with vector with the length and values of the given sequence. If the argument is not used as the length of the vector and if it is not a sequence, then raise a TypeError with an appropriate message.
Next implement the __repr__ method to return a string of python code which could be used to initialize the Vector. This string of code should consist of the name of the class followed by an open parenthesis followed by the contents of the vector represented as a list followed by a close parenthesis."
I'm not sure how to do the class type checking, as well as how to initialize the vector based on the given object. Could someone please help me with this? Thanks!
Your instructor seems not to "speak Python as a native language". ;) The entire concept for the class is pretty silly; real Python programmers just use the built-in sequence types directly. But then, this sort of thing is normal for academic exercises, sadly...
Add a constructor to the Vector class.
In Python, the common "this is how you create a new object and say what it's an instance of" stuff is handled internally by default, and then the baby object is passed to the class' initialization method to make it into a "proper" instance, by setting the attributes that new instances of the class should have. We call that method __init__.
The constructor should take a single argument. If this argument is either an int or a long or an instance of a class derived from one of these
This is tested by using the builtin function isinstance. You can look it up for yourself in the documentation (or try help(isinstance) at the REPL).
In this case, construct a Vector of the specified length with each element is initialized to 0.0.
In our __init__, we generally just assign the starting values for attributes. The first parameter to __init__ is the new object we're initializing, which we usually call "self" so that people understand what we're doing. The rest of the arguments are whatever was passed when the caller requested an instance. In our case, we're always expecting exactly one argument. It might have different types and different meanings, so we should give it a generic name.
When we detect that the generic argument is an integer type with isinstance, we "construct" the vector by setting the appropriate data. We just assign to some attribute of self (call it whatever makes sense), and the value will be... well, what are you going to use to represent the vector's data internally? Hopefully you've already thought about this :)
If the length is negative, raise a ValueError with an appropriate message.
Oh, good point... we should check that before we try to construct our storage. Some of the obvious ways to do it would basically treat a negative number the same as zero. Other ways might raise an exception that we don't get to control.
If the argument is not considered to be the length, then if the argument is a sequence (such as a list), then initialize with vector with the length and values of the given sequence.
"Sequence" is a much fuzzier concept; lists and tuples and what-not don't have a "sequence" base class, so we can't easily check this with isinstance. (After all, someone could easily invent a new kind of sequence that we didn't think of). The easiest way to check if something is a sequence is to try to create an iterator for it, with the built-in iter function. This will already raise a fairly meaningful TypeError if the thing isn't iterable (try it!), so that makes the error handling easy - we just let it do its thing.
Assuming we got an iterator, we can easily create our storage: most sequence types (and I assume you have one of them in mind already, and that one is certainly included) will accept an iterator for their __init__ method and do the obvious thing of copying the sequence data.
Next implement the __repr__ method to return a string of python code which could be used to initialize the Vector. This string of code should consist of the name of the class followed by an open parenthesis followed by the contents of the vector represented as a list followed by a close parenthesis."
Hopefully this is self-explanatory. Hint: you should be able to simplify this by making use of the storage attribute's own __repr__. Also consider using string formatting to put the string together.
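Putting the hints above together, one possible shape for the class is sketched below. It assumes Python 3 (where long no longer exists as a separate type) and a plain list stored in a hypothetical attribute named _data; an instructor may expect different details.

```python
class Vector:
    def __init__(self, arg):
        if isinstance(arg, int):
            # An int (or subclass instance) is treated as the length.
            if arg < 0:
                raise ValueError("Vector length cannot be negative")
            self._data = [0.0] * arg
        else:
            try:
                # list() calls iter() internally and raises TypeError
                # when arg is not iterable.
                self._data = list(arg)
            except TypeError:
                raise TypeError("Vector expects an int length or a sequence") from None

    def __repr__(self):
        # Reuse the list's own repr for the contents.
        return f"{type(self).__name__}({self._data!r})"


zeros = Vector(3)
copied = Vector([1, 2, 3])
```

Evaluating the string returned by repr(copied) recreates an equivalent Vector, which is what the assignment asks of __repr__.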
Everything you need to get started is here:
http://docs.python.org/library/functions.html
There are many examples of how to check types in Python on StackOverflow (see my comment for the top-rated one).
To initialize a class, use the __init__ method:
class Vector(object):
    def __init__(self, sequence):
        self._internal_list = list(sequence)
Now you can call:
my_vector = Vector([1, 2, 3])
And inside other functions in Vector, you can refer to self._internal_list. I put _ before the variable name to indicate that it shouldn't be changed from outside the class.
The documentation for the list function may be useful for you.
You can do the type checking with isinstance.
The initialization of a class is done with an __init__ method.
Good luck with your assignment :-)
This may or may not be appropriate depending on the homework, but in Python programming it's not very usual to explicitly check the type of an argument and change the behaviour based on that. It's more normal to just try to use the features you expect it to have (possibly catching exceptions if necessary to fall back to other options).
In this particular example, a normal Python programmer implementing a Vector that needed to work this way would try using the argument as if it were an integer/long (hint: what happens if you multiply a list by an integer?) to initialize the Vector and if that throws an exception try using it as if it were a sequence, and if that failed as well then you can throw a TypeError.
The reason for doing this is that it leaves your class open to working with other objects types people come up with later that aren't integers or sequences but work like them. In particular it's very difficult to comprehensively check whether something is a "sequence", because user-defined classes that can be used as sequences don't have to be instances of any common type you can check. The Vector class itself is quite a good candidate for using to initialize a Vector, for example!
But I'm not sure if this is the answer your teacher is expecting. If you haven't learned about exception handling yet, then you're almost certainly not meant to use this approach so please ignore my post. Good luck with your learning!
