Unable to understand __lt__ method [closed] - python

Hi, I was solving this question on LeetCode [Given a list of non-negative integers, arrange them such that they form the largest number.] and I saw this solution.
I'm unable to understand how the class LargerNumKey works. Also, what is the purpose of __lt__, and what are the variables x and y?
class LargerNumKey(str):
    def __lt__(x, y):
        return x+y > y+x
class Solution:
    def largestNumber(self, nums):
        largest_num = ''.join(sorted(map(str, nums), key=LargerNumKey))
        return '0' if largest_num[0] == '0' else largest_num

The __lt__ "dunder" method is what allows you to use the < less-than sign for an object. It might make more sense written as follows:
class LargerNumKey(str):
    def __lt__(self, other):
        return self+other > other+self
# This calls into LargerNumKey.__lt__(LargerNumKey('0'), LargerNumKey('1'))
LargerNumKey('0') < LargerNumKey('1')
Behind the scenes when str is subclassed, adding self+other actually generates a str object rather than a LargerNumKey object, so you don't have infinite recursion problems defining an inequality on a type in terms of its own inequality operator.
The reason this works is perhaps more interesting:
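The first fact we need is that for digit strings of equal length, comparing the strings gives the same answer as comparing the integers they represent, i.e. (x>y) == (str(x)>str(y)) whenever x and y have the same number of digits. (This is not true in general: '9' > '10' as strings.) Since self+other and other+self always have the same length, when the custom __lt__ is operating it's actually asking whether the integers represented by those string concatenations are greater or less than each other.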
The second interesting fact is that the new inequality defined as such is actually transitive -- if s<t and t<u then s<u, so the sorted() method is able to place all the numbers in the correct order by just getting the correct answer for each possible pair.
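A minimal sketch to check both facts on a small input. The helper names largest_number and brute_force and the sample list are made up for illustration; the brute-force search is only there as a cross-check and is not part of the original solution.

```python
from itertools import permutations

class LargerNumKey(str):
    def __lt__(self, other):
        # self + other and other + self are plain str objects of equal length
        return self + other > other + self

def largest_number(nums):
    """Join the numbers in the order that gives the largest concatenation."""
    joined = ''.join(sorted(map(str, nums), key=LargerNumKey))
    return '0' if joined[0] == '0' else joined

def brute_force(nums):
    """Try every ordering; only practical for tiny inputs."""
    return str(max(int(''.join(p)) for p in permutations(map(str, nums))))

sample = [3, 30, 34, 5, 9]   # illustrative input

# Equal-length digit strings compare the same way as the integers they spell.
for a in map(str, sample):
    for b in map(str, sample):
        assert (a + b > b + a) == (int(a + b) > int(b + a))

print(largest_number(sample))
print(brute_force(sample))
```

Both calls print the same string, which is what you would expect if the pairwise ordering really is consistent across the whole list.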

__lt__ is a magic method that lets you change the behavior of the < operator. sorted uses the < operator to compare values. So when python is comparing two values with < it checks to see if those objects have the magic method __lt__ defined. If they do, then it uses that method for the comparison. The variables x and y in the example are the two variables being compared. So if you had a line of code like x < y, then x and y would be passed as arguments to __lt__. Sorted presumably does have that line of code. But you don't have to call them 'x' and 'y', you can call them whatever you want. Often you will see them called self and other.
sorted works by comparing two items at a time. For example, let's call them x and y. So somewhere sorted has to compare them, probably with a line that looks like:
if x < y:
However, if you pass sorted a key argument, then it instead compares them more like this:
if key(x) < key(y):
Since the example passes LargerNumKey as the key, it ends up looking like this after python looks up key:
if LargerNumKey(x) < LargerNumKey(y):
When Python then sees the < operator, it looks for the __lt__ method, and because it finds it, it turns the statement into basically:
if LargerNumKey(x).__lt__(LargerNumKey(y)):
Because __lt__ is a method on an object, the object itself becomes the first argument (x in this case). Also, because LargerNumKey is a subclass of str it behaves exactly like a regular string, except for the __lt__ method that you overrode.
This is a useful technique when you want things to be sortable. You can use __lt__ to allow your objects to be sorted in any way you wish. And if the objects you are sorting have the __lt__ method defined, then you don't have to even pass key. But since we are working with different types of objects and don't want to use the default __lt__ method, we use key instead.
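If you want to watch this happen, here is a small sketch. Traced is a hypothetical str subclass written only for this illustration; it records every comparison that sorted() makes through < and otherwise defers to the normal string ordering.

```python
class Traced(str):
    """A str subclass that records each comparison made through <."""
    calls = []

    def __lt__(self, other):
        Traced.calls.append((str(self), str(other)))
        return str.__lt__(self, other)

words = ['pear', 'apple', 'fig']
result = sorted(words, key=Traced)

print(result)                                # the original str objects, sorted
print(len(Traced.calls))                     # how many times sorted() used <
print(all(type(w) is str for w in result))   # keys are only used for comparing
```

Note that sorted() returns the original items, not the key objects: the key class is wrapped around each item only for the purpose of comparison.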
References:
Python Docs
https://rszalski.github.io/magicmethods/#comparisons
Note that while my example pretends that sorted is Python code, it is in fact usually C code. However, since Python is "pseudo code that runs", I think it conveys the idea accurately.

This'd also pass without __lt__:
from functools import cmp_to_key
class Solution:
    def largestNumber(self, nums):
        nums = list(map(str, nums))
        nums.sort(key=cmp_to_key(lambda a, b: 1 if a + b > b + a
                                 else -1 if a + b < b + a else 0), reverse=1)
        return str(int(''.join(nums)))
print(Solution().largestNumber([10, 2]))
print(Solution().largestNumber([3, 30, 34, 5, 9]))
Outputs
210
9534330
References
For additional details, you can see the Discussion Board. There are plenty of accepted solutions with a variety of languages and explanations, efficient algorithms, as well as asymptotic time/space complexity analysis in there.

Related

Is it really necessary to hash the same for classes that compare the same?

Reading this answer it seems, that if __eq__ is defined in custom class, __hash__ needs to be defined as well. This is understandable.
However, it is not clear why - effectively - __eq__ should be the same as self.__hash__()==other.__hash__()
Imagining a class like this:
class Foo:
    ...
    self.Name
    self.Value
    ...
    def __eq__(self,other):
        return self.Value==other.Value
    ...
    def __hash__(self):
        return id(self.Name)
This way class instances could be compared by value, which could be the only reasonable use, but considered identical by name.
This way set could not contain multiple instances with equal name, but comparison would still work.
What could be the problem with such definition?
The reason for defining __eq__, __lt__ and others by Value is to be able to sort instances by Value and to be able to use functions like max. For example, the class should represent a physical output of a device (say, a heating element). Each of these outputs has a unique Name. The Value is the power of the output device. To find the optimal combination of heating elements to turn on, it is useful to be able to compare them by power (Value). In a set or dictionary, however, it should not be possible to have multiple outputs with the same name. Of course, different outputs with different names might easily have equal power.
The problem is that it does not make sense, hash is used to do efficient bucketing of objects. Consequently, when you have a set, which is implemented as a hash table, each hash points to a bucket, which is usually a list of elements. In order to check if an element is in the set (or other hash based container) you go to the bucket pointed by a hash and then you iterate over all elements in the list, comparing them one by one.
In other words - hash is not supposed to be a comparator (as it can, and sometimes should, give you a false positive). In particular, in your example, your set will not work - it will not recognize duplicates, as they do not compare equal to each other.
class Foo:
    def __eq__(self,other):
        return self.Value==other.Value
    def __hash__(self):
        return id(self.Name)
a = set()
el = Foo()
el.Name = 'x'
el.Value = 1
el2 = Foo()
el2.Name = 'x'
el2.Value = 2
a.add(el)
a.add(el2)
print(len(a)) # should be 1, right? Well it is 2
Actually, it is even worse than that: if you have 2 objects with the same values but different names, they are not recognized to be the same either
class Foo:
    def __eq__(self,other):
        return self.Value==other.Value
    def __hash__(self):
        return id(self.Name)
a = set()
el = Foo()
el.Name = 'x'
el.Value = 2
el2 = Foo()
el2.Name = 'a'
el2.Value = 2
a.add(el)
a.add(el2)
print(len(a)) # should be 1, right? Well it is 2 again
while doing it properly (thus, "if a == b, then hash(a) == hash(b)") gives:
class Foo:
    def __eq__(self,other):
        return self.Name==other.Name
    def __hash__(self):
        return id(self.Name)
a = set()
el = Foo()
el.Name = 'x'
el.Value = 1
el2 = Foo()
el2.Name = 'x'
el2.Value = 2
a.add(el)
a.add(el2)
print(len(a)) # is really 1
Update
There is also a non-deterministic part, which is hard to reproduce, but essentially the hash does not uniquely define a bucket. Usually it is like
bucket_id = hash(object) % size_of_allocated_memory
Consequently, things that have different hashes can still end up in the same bucket. As a result, you can get two elements treated as equal to each other (inside a set) due to equality of Values even though their Names are different, as well as the other way around, depending on the actual internal implementation, memory constraints etc.
In general there are many more examples where things can go wrong, as hash is defined as a function h : X -> Z such that x == y => h(x) == h(y), thus people implementing their containers, authorization protocols, and other tools are free to assume this property. If you break it - every single tool using hashes can break. Furthermore, it can break in time, meaning that you update some library and your code will stop working, as a valid update to the underlying libraries (using the above assumption) can lead to exploiting your violation of this assumption.
Update 2
Finally, in order to solve your issue - you simply should not define your eq, lt operators to handle sorting. It is about actual comparison of the elements, which should be compatible with the rest of the behaviours. All you have to do is define a separate comparator and use it in your sorting routines (sorting in python accepts any comparator, you do not need to rely on <, > etc.). The other way around is to instead have valid <, >, = defined on values, but in order to keep names unique - keep a set with... well... names, and not objects themselves. Whichever path you choose - the crucial element here is:
equality and hashing have to be compatible, that's all.
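A minimal sketch of the first option, assuming a hypothetical Output class with name and value attributes (these names are mine, not from the question). Equality and hashing both use the name, so they stay compatible, while ordering by power is handed to sorted() and max() through a key function instead of through __lt__.

```python
from operator import attrgetter

class Output:
    """Hypothetical heating-element output: unique name, comparable power."""
    def __init__(self, name, value):
        self.name = name
        self.value = value

    # Equality and hashing both look at the name, so they agree.
    def __eq__(self, other):
        return isinstance(other, Output) and self.name == other.name

    def __hash__(self):
        return hash(self.name)

    def __repr__(self):
        return f"Output({self.name!r}, {self.value})"

outputs = set()
for candidate in (Output('x', 1), Output('x', 2), Output('y', 2)):
    outputs.add(candidate)       # the second 'x' is rejected as a duplicate

print(len(outputs))              # names are unique inside the set

by_power = sorted(outputs, key=attrgetter('value'))   # ordering lives in the key
strongest = max(outputs, key=attrgetter('value'))
print(by_power, strongest)
```

The point of the design is that the set never needs to know about power, and the sort never needs to know about names.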
It is possible to implement your class like this and not have any problems. However, you have to be 100% sure that no two different objects will ever produce the same hash. Consider the following example:
class Foo:
    def __init__(self, name, value):
        self.name = name
        self.value = value
    def __eq__(self, other):
        return self.value == other.value
    def __hash__(self):
        return hash(self.name[0])
s = set()
s.add(Foo('a', 1))
s.add(Foo('b', 1))
print(len(s)) # output: 2
But you have a problem if a hash collision occurs:
s.add(Foo('abc', 1))
print(len(s)) # output: 2
In order to prevent this, you would have to know exactly how the hashes are generated (which, if you rely on functions like id or hash, might vary between implementations!) and also the values of the attribute(s) used to generate the hash (name in this example). That's why ruling out the possibility of a hash collision is very difficult, if not impossible. It's basically like begging for unexpected things to happen.

Python3 style sorting -- old cmp method functionality in new key mechanism?

I read about the wrapper function to move a cmp style comparison into a key style comparison in Python 3, where the cmp capability was removed.
I'm having a heck of a time wrapping my head around how a Python3 straight key style sorted() function, with, at least as I understand it, just one item specified for the key, can allow you to properly compare, for instance, two IPs for ordering. Or ham calls.
Whereas with cmp there was nothing to it: sorted() and sort() called you with the two ips, you looked at the appropriate portions, made your decisions, done.
def ipCompare(dqA,dqB):
    ...
ipList = sorted(ipList,cmp=ipCompare)
Same thing with ham radio calls. The sorting isn't alphabetic; the calls are generally letter(s)+number(s)+letter(s); the first sorting priority is the number portion, then the first letter(s), then the last letter(s.)
Using cmp... no sweat.
def callCompare(callA,callB):
    ...
hamlist = sorted(hamlist,cmp=callCompare)
With Python3... without going through the hoop jumping of the wrapper... and being passed one item... I think... how can that be done?
And if the wrapper is absolutely required... then why remove cmp within Python3 in the first place?
I'm sure I'm missing something. I just can't see it. :/
ok, now I know what I was missing. Solutions for IPs were given in the answers below. Here's a key I came up with for sorting ham calls of the common prefix, region, postfix form:
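import re
def callKey(txtCall):
    # '+' rather than '*': since Python 3.7, re.split also splits on empty
    # matches, so '([0-9]*)' would split at position 0 instead of at the digits
    l = re.split('([0-9]+)',txtCall.upper(),1)
    return l[1],l[0],l[2]
hamList = ['N4EJI','W1AW','AA7AS','na1a']
sortedHamList = sorted(hamList,key=callKey)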
sortedHamList result is ['na1a','W1AW','N4EJI','AA7AS']
Detail:
AA7AS comes out of callKey() as 7,AA,AS
N4EJI comes out of callKey() as 4,N,EJI
W1AW comes out of callKey() as 1,W,AW
na1a comes out of callKey() as 1,NA,A
First, if you haven't read the Sorting HOWTO, definitely read that; it explains a lot that may not be obvious at first.
For your first example, two IPv4 addresses, the answer is pretty simple.
To compare two addresses, one obvious thing to do is convert them both from dotted-four strings into tuples of 4 ints, then just compare the tuples:
def cmp_ip(ip1, ip2):
    ip1 = map(int, ip1.split('.'))
    ip2 = map(int, ip2.split('.'))
    return cmp(ip1, ip2)
An even better thing to do is convert them to some kind of object that represents an IP address and has comparison operators. In 3.4+, the stdlib has such an object built in; let's pretend 2.7 did as well:
def cmp_ip(ip1, ip2):
    return cmp(ipaddress.ip_address(ip1), ipaddress.ip_address(ip2))
It should be obvious that these are both even easier as key functions:
def key_ip(ip):
    # tuple(), because a Python 3 map object cannot be compared with <
    return tuple(map(int, ip.split('.')))
def key_ip(ip):
    return ipaddress.ip_address(ip)
For your second example, ham radio callsigns: In order to write a cmp function, you have to be able to break each ham address into the letters, numbers, letters portions, then compare the numbers, then compare the first letters, then compare the second letters. In order to write a key function, you have to be able to break down a ham address into the letters, numbers, letters portions, then return a tuple of (numbers, first letters, second letters). Again the key function is actually easier, not harder.
And really, this is the case for most examples anyone was able to come up with. Most complicated comparisons ultimately come down to a complicated conversion into some sequence of parts, and then simple lexicographical comparison of that sequence.
That's why cmp functions were deprecated way back in 2.4 and finally removed in 3.0.
Of course there are some cases where a cmp function is easier to read—most of the examples people try to come up with turn out to be wrong, but there are some. And there's also code which has been working for 20 years and nobody wants to rethink it in new terms for no benefit. For those cases, you've got cmp_to_key.
There's actually another reason cmp was deprecated, on top of this one, and maybe a third.
In Python 2.3, types had a __cmp__ method, which was used for handling all of the operators. In 2.4, they grew the six methods __lt__, __eq__, etc. as a replacement. This allows for more flexibility; e.g., you can have types that aren't total-ordered. So, when 2.3 compared a < b, it was actually doing a.__cmp__(b) < 0, which maps in a pretty obvious way to a cmp argument. But in 2.4+, a < b does a.__lt__(b), which doesn't. This confused a lot of people over the years, and removing both __cmp__ and the cmp argument to sort functions removed that confusion.
Meanwhile, if you read the Sorting HOWTO, you'll notice that before we had cmp, the only way to do this kind of thing was decorate-sort-undecorate (DSU). Notice that it's blindingly obvious how to map a good key function to a good DSU sort and vice-versa, but it's definitely not obvious with a cmp function. I don't remember anyone explicitly mentioning this one on the py3k list, but I suspect people may have had it in their heads when deciding whether to finally kill cmp for good.
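To make the DSU correspondence concrete, here is a small sketch. The ip_key helper and the sample addresses are made up for illustration; the idea is only that the same key function drives both the hand-written decorate-sort-undecorate and the key= argument.

```python
ips = ['10.0.0.5', '9.255.0.1', '10.0.0.12']   # illustrative data

def ip_key(ip):
    """Turn a dotted-quad string into a tuple of ints."""
    return tuple(int(part) for part in ip.split('.'))

# Decorate: pair each item with its key. Sort: tuples compare element-wise.
# Undecorate: throw the keys away again.
decorated = [(ip_key(ip), ip) for ip in ips]
decorated.sort()
dsu_result = [ip for _, ip in decorated]

# The key= argument does the same bookkeeping for you.
key_result = sorted(ips, key=ip_key)

print(dsu_result)
print(dsu_result == key_result)
```

One difference worth knowing: when two keys tie, the hand-written version falls back to comparing the original items, whereas key= never compares the items themselves.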
To use the new key argument, simply decompose the comparison to another object that already implements a well-ordered comparison, such as to a tuple or list (eg. a sequence of integers). These types work well because they are sequence-wise ordered.
def ip_as_components (ip):
    # tuple(), because a Python 3 map object cannot be compared with <
    return tuple(map(int, ip.split('.')))
sorted_ips = sorted(ips, key=ip_as_components)
The ordering of each of the components is the same as the individual tests found in a traditional compare-and-then-compare-by function.
Looking at the HAM ordering it may look like:
def ham_to_components (ham_code):
    # .. decompose components based on ordering of each
    return (prefix_letters, numbers, postfix_letters)
The key approach (similar to "order by" found in other languages) is generally a simpler and more natural construct to deal with - assuming that the original types are not already well-ordered. The main drawback with this approach is that partially reversed (eg. asc then desc) ordering can be tricky, but that is solvable by returning nested tuples etc.
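For the mixed ascending/descending case, here is a sketch of two common workarounds. These are my own choice of techniques (negating a numeric component, and sorting in two stable passes) rather than the nested-tuple approach mentioned above, and the records are made up.

```python
records = [('bob', 30), ('alice', 30), ('carol', 25)]   # (name, age), made up

# Numeric component: negate it to reverse just that part of the key.
by_age_desc_name_asc = sorted(records, key=lambda r: (-r[1], r[0]))

# Any component: rely on sort stability and sort in two passes,
# least significant criterion first.
two_pass = sorted(records, key=lambda r: r[0])                  # name ascending
two_pass = sorted(two_pass, key=lambda r: r[1], reverse=True)   # age descending

print(by_age_desc_name_asc)
print(two_pass == by_age_desc_name_asc)
```

The two-pass version works because Python's sort is stable, and reverse=True keeps equal elements in their existing order.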
In Py3.0, the cmp parameter was removed entirely (as part of a larger effort to simplify and unify the language, eliminating the conflict between rich comparisons and the __cmp__() magic method).
If absolutely needing sorted with a custom "cmp", cmp_to_key can be trivially used.
sorted_ips = sorted(ips, key=functools.cmp_to_key(ip_compare))
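A self-contained version of that one-liner, as a sketch: ip_compare here is a hypothetical old-style comparison function (the answer does not show one), written to return a negative, zero or positive number the way Python 2's cmp did.

```python
import functools

def ip_compare(a, b):
    """Old-style comparison: negative, zero or positive."""
    ta = tuple(map(int, a.split('.')))
    tb = tuple(map(int, b.split('.')))
    return (ta > tb) - (ta < tb)

ips = ['192.168.1.10', '10.0.0.1', '192.168.1.9']   # sample data
sorted_ips = sorted(ips, key=functools.cmp_to_key(ip_compare))
print(sorted_ips)
```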
According to official docs - https://docs.python.org/3/howto/sorting.html#the-old-way-using-the-cmp-parameter
When porting code from Python 2.x to 3.x, the situation can arise when you have the user supplying a comparison function and you need to convert that to a key function. The following wrapper makes that easy to do:
def cmp_to_key(mycmp):
    'Convert a cmp= function into a key= function'
    class K:
        def __init__(self, obj, *args):
            self.obj = obj
        def __lt__(self, other):
            return mycmp(self.obj, other.obj) < 0
        def __gt__(self, other):
            return mycmp(self.obj, other.obj) > 0
        def __eq__(self, other):
            return mycmp(self.obj, other.obj) == 0
        def __le__(self, other):
            return mycmp(self.obj, other.obj) <= 0
        def __ge__(self, other):
            return mycmp(self.obj, other.obj) >= 0
        def __ne__(self, other):
            return mycmp(self.obj, other.obj) != 0
    return K
To convert to a key function, just wrap the old comparison function:
>>> def reverse_numeric(x, y):
...     return y - x
>>> sorted([5, 2, 4, 1, 3], key=cmp_to_key(reverse_numeric))
[5, 4, 3, 2, 1]

In Python, is there a good "magic method" to represent object similarity?

I want instances of my custom class to be able to compare themselves to one another for similarity. This is different than the __cmp__ method, which is used for determining the sorting order of objects.
Is there a magic method that makes sense for this? Is there any standard syntax for doing this?
How I imagine this could look:
>>> x = CustomClass("abc")
>>> y = CustomClass("abd")
>>> z = CustomClass("xyz")
>>> x.__<???>__(y)
0.75
>>> x <?> y
0.75
>>> x.__<???>__(z)
0.0
>>> x <?> z
0.0
Where <???> is the magic method name and <?> is the operator.
Take a look at the numeric types emulation in the datamodel and pick an operator hook that suits you.
I don't think there is currently an operator that is an exact match though, so you'll end up surprising some poor hapless future code maintainer (could even be you) that you overloaded a standard operator.
For a Levenshtein Distance I'd just use a regular method instead. I'd find a one.similarity(other) method a lot clearer when reading the code.
Well, you could override __eq__ to mean both boolean logical equality and 'fuzzy' similarity, by returning a sufficiently weird result from __eq__:
class FuzzyBool(object):
    def __init__(self, quality, tolerance=0):
        self.quality, self._tolerance = quality, tolerance
    def __nonzero__(self):
        return self.quality <= self._tolerance
    __bool__ = __nonzero__  # Python 3 looks up __bool__ instead of __nonzero__
    def tolerance(self, tolerance):
        return FuzzyBool(self.quality, tolerance)
    def __repr__(self):
        return "sorta %s" % bool(self)
class ComparesFuzzy(object):
    def __init__(self, value):
        self.value = value
    def __eq__(self, other):
        return FuzzyBool(abs(self.value - other.value))
    def __hash__(self):
        return hash((ComparesFuzzy, self.value))
>>> a = ComparesFuzzy(1)
>>> b = ComparesFuzzy(2)
>>> a == b
sorta False
>>> (a == b).tolerance(3)
sorta True
the default behavior of the comparator should be that it is Truthy only if the compared values are exactly equal, so that normal equality is unaffected
No, there is not. You can make a class method, but I don't think there is any intuitive operator to overload that would do what you're looking for. And, to avoid confusion, I would avoid overloading unless it is obviously intuitive.
I would simply call it CustomClass.similarity(y)
I don't think there is a magic method (and corresponding operator) that would make sense for this in any context.
However, if, with a bit of fantasy, your instances can be seen as vectors, then checking for similarity could be analogous to calculating the scalar product. It would make sense then to use __mul__ and multiplication sign for this (unless you have already defined product for CustomClass instances).
No magic function/operator for that.
When I think of "similarity" for ints and floats, I think of the difference being lower than a certain threshold. Perhaps that's something you might use?
E.g. being able to calculate the "difference" between your objects might be suitable in the __sub__ method.
In the example you've cited, I would use difflib. This conducts spell-check like comparisons between strings. But in general, if you really are comparing objects rather than strings, then I agree with the others; you should probably create something context-specific.
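As a sketch of the regular-method approach combined with difflib: the similarity method name and the text attribute are my own choices, and the ratio that SequenceMatcher computes will not necessarily match the 0.75 imagined in the question, since that depends on which similarity measure you pick.

```python
from difflib import SequenceMatcher

class CustomClass:
    def __init__(self, text):
        self.text = text

    def similarity(self, other):
        """Return a ratio in [0.0, 1.0]; 1.0 means the texts are identical."""
        return SequenceMatcher(None, self.text, other.text).ratio()

x, y, z = CustomClass("abc"), CustomClass("abd"), CustomClass("xyz")
print(x.similarity(y))   # somewhere between 0 and 1: "ab" is shared
print(x.similarity(z))   # 0.0: no characters in common
```

A named method also leaves room for extra parameters later (a threshold, a choice of metric), which an operator could not take.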

Which of `if x:` or `if x != 0:` is preferred in Python?

Assuming that x is an integer, the construct if x: is functionally the same as if x != 0: in Python. Some languages' style guides explicitly forbid the former -- for example, ActionScript/Flex's style guide states that you should never implicitly cast an int to bool for this sort of thing.
Does Python have a preference? A link to a PEP or other authoritative source would be best.
The construct: if x: is generally used to check against boolean values.
For ints the use of the explicit x != 0 is preferred - along the lines of explicit is better than implicit (PEP 20 - Zen of Python).
There's no hard and fast rule here. Here are some examples where I would use each:
Suppose that I'm interfacing to some function that returns -1 on error and 0 on success. Such functions are pretty common in C, and they crop up in Python frequently when using a library that wraps C functions. In that case, I'd use if x:.
On the other hand, if I'm about to divide by x and I want to make sure that x isn't 0, then I'm going to be explicit and write if x != 0.
As a rough rule of thumb, if I treat x as a bool throughout a function, then I'm likely to use if x: -- even if I can prove that x will be an int. If in the future I decide I want to pass a bool (or some other type!) to the function, I wouldn't need to modify it.
On the other hand, if I'm genuinely using x like an int, then I'm likely to spell out the 0.
Typically, I read:
if(x) to be a question about existence.
if( x != 0) to be a question about a number.
It depends on what you want; if x is an integer, they're equivalent, but you should write the code that matches your exact intention.
if x:
    # x is anything that evaluates to a True value
if x != 0:
    # x is anything that is not equal to 0
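To see where the two tests part ways once x is not guaranteed to be an integer, here is a quick sketch over a handful of sample values (the list of samples is arbitrary):

```python
samples = [0, 0.0, 1, -1, None, '', [], [0]]

# Each row: (value, result of the implicit test, result of the explicit test)
rows = [(x, bool(x), x != 0) for x in samples]

for value, implicit, explicit in rows:
    marker = '' if implicit == explicit else '   <- the two tests disagree'
    print(f"{value!r:>6}  if x: {implicit!s:<5}  if x != 0: {explicit!s:<5}{marker}")
```

For numbers the two columns agree; for None and for empty containers, if x: is false while x != 0 is true.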
If you want to test x in a boolean context:
if x:
More explicit, for x validity (doesn't match empty containers):
if x is not None:
If you want to test strictly in integer context:
if x != 0:
This last one is actually implicitly comparing types.
Might I suggest that the amount of bickering over this question is enough to answer it?
Some argue that "if x" should only be used for Z, others for Y, others for X.
If such a simple statement is able to create such a fuss, to me it is clear that the statement is not clear enough. Write what you mean.
If you want to check that x is equal to 0, then write "if x == 0". If you want to check if x exists, write "if x is not None".
Then there is no confusion, no arguing, no debate.
Wouldn't if x is not 0: be the preferred method in Python, compared to if x != 0:?
Yes, the former is a bit longer to write, but I was under the impression that is and is not are preferred over == and !=. This makes Python easier to read as a natural language than as a programming language.

Declaring Unknown Type Variable in Python?

I have a situation in Python(cough, homework) where I need to multiply EACH ELEMENT in a given list of objects a specified number of times and return the output of the elements. The problem is that the sample inputs given are of different types. For example, one case may input a list of strings whose elements I need to multiply while the others may be ints. So my return type needs to vary. I would like to do this without having to test what every type of object is. Is there a way to do this? I know in C# i could just use "var" but I don't know if such a thing exists in Python?
I realize that variables don't have to be declared, but in this case I can't see any way around it. Here's the function I made:
def multiplyItemsByFour(argsList):
    output = ????
    for arg in argsList:
        output += arg * 4
    return output
See how I need to add to the output variable. If I just try to take away the output assignment on the first line, I get an error that the variable was not defined. But if I assign it a 0 or a "" for an empty string, an exception could be thrown since you can't add 3 to a string or "a" to an integer, etc...
Here are some sample inputs and outputs:
Input: ('a','b') Output: 'aaaabbbb'
Input: (2,3,4) Output: 36
Thanks!
def fivetimes(anylist):
    return anylist * 5
As you see, if you're given a list argument, there's no need for any assignment whatsoever in order to "multiply it a given number of times and return the output". You talk about a given list; how is it given to you, if not (the most natural way) as an argument to your function? Not that it matters much -- if it's a global variable, a property of the object that's your argument, and so forth, this still doesn't necessitate any assignment.
If you were "homeworkically" forbidden from using the * operator of lists, and just required to implement it yourself, this would require assignment, but no declaration:
def multiply_the_hard_way(inputlist, multiplier):
    outputlist = []
    for i in range(multiplier):
        outputlist.extend(inputlist)
    return outputlist
You can simply make the empty list "magically appear": there's no need to "declare" it as being anything whatsoever, it's an empty list and the Python compiler knows it as well as you or any reader of your code does. Binding it to the name outputlist doesn't require you to perform any special ritual either, just the binding (aka assignment) itself: names don't have types, only objects have types... that's Python!-)
Edit: OP now says output must not be a list, but rather int, float, or maybe string, and he is given no indication of what. I've asked for clarification -- multiplying a list ALWAYS returns a list, so clearly he must mean something different from what he originally said, that he had to multiply a list. Meanwhile, here's another attempt at mind-reading. Perhaps he must return a list where EACH ITEM of the input list is multiplied by the same factor (whether that item is an int, float, string, list, ...). Well then:
def multiply_each_item(somelist, multiplier):
    return [item * multiplier for item in somelist]
Look ma, no hands^H^H^H^H^H assignment. (This is known as a "list comprehension", btw).
Or maybe (unlikely, but my mind-reading hat may be suffering interference from my tinfoil hat, will need to go to the mad hatter's shop to have them tuned) he needs to (say) multiply each list item as if they were the same type as the first item, but return them as their original type, so that for example
>>> mystic(['zap', 1, 23, 'goo'], 2)
['zapzap', 11, 2323, 'googoo']
>>> mystic([23, '12', 15, 2.5], 2)
[46, '24', 30, 4.0]
Even this highly-mystical spec COULD be accommodated...:
>>> def mystic(alist, mul):
...     multyp = type(alist[0])
...     return [type(x)(mul*multyp(x)) for x in alist]
...
...though I very much doubt it's the spec actually encoded in the mysterious runes of that homework assignment. Just about ANY precise spec can be either implemented or proven to be likely impossible as stated (by requiring you to solve the Halting Problem or demanding that P==NP, say;-). That may take some work ("prove the 4-color theorem", for example;-)... but still less than it takes to magically divine what the actual spec IS, from a collection of mutually contradictory observations, no examples, etc. Though in our daily work as software developer (ah for the good old times when all we had to face was homework!-) we DO meet a lot of such cases of course (and have to solve them to earn our daily bread;-).
EditEdit: finally seeing a precise spec I point out I already implemented that one, anyway, here it goes again:
def multiplyItemsByFour(argsList):
return [item * 4 for item in argsList]
EditEditEdit: finally/finally seeing a MORE precise spec, with (luxury!-) examples:
Input: ('a','b') Output: 'aaaabbbb' Input: (2,3,4) Output: 36
So then what's wanted is the summation (and you can't use sum as it wouldn't work on strings) of the items in the input list, each multiplied by four. My preferred solution:
def theFinalAndTrulyRealProblemAsPosed(argsList):
    items = iter(argsList)
    output = next(items, []) * 4
    for item in items:
        output += item * 4
    return output
If you're forbidden from using some of these constructs, such as the built-ins next and iter, there are many other possibilities (slightly inferior ones) such as:
def theFinalAndTrulyRealProblemAsPosed(argsList):
    if not argsList: return None
    output = argsList[0] * 4
    for item in argsList[1:]:
        output += item * 4
    return output
For an empty argsList, the first version returns [], the second one returns None -- not sure what you're supposed to do in that corner case anyway.
Very easy in Python. You need to get the type of the data in your list - use the type() function on the first item - type(argsList[0]). Then to initialize output (where you now have ????) you need the 'zero' or null value for that type. So just as int() or float() or str() returns the zero or null for their type, so too will type(argsList[0])() return the zero or null value for whatever type you have in your list.
So, here is your function with one minor modification:
def multiplyItemsByFour(argsList):
    output = type(argsList[0])()
    for arg in argsList:
        output += arg * 4
    return output
Works with:
argsList = [1, 2, 3, 4] or [1.0, 2.0, 3.0, 4.0] or "abcdef" ... etc,
Are you sure this is for Python beginners? To me, the cleanest way to do this is with reduce() and lambda, both of which are not typical beginner tools, and sometimes discouraged even for experienced Python programmers:
from functools import reduce  # built in on Python 2; required import on Python 3
def multiplyItemsByFour(argsList):
    if not argsList:
        return None
    newItems = [item * 4 for item in argsList]
    return reduce(lambda x, y: x + y, newItems)
Like Alex Martelli, I've thrown in a quick test for an empty list at the beginning which returns None. Note that if you are using Python 3, you must import functools to use reduce().
Essentially, the reduce(lambda...) solution is very similar to the other suggestions to set up an accumulator using the first input item, and then processing the rest of the input items; but is simply more concise.
My guess is that the purpose of your homework is to expose you to "duck typing". The basic idea is that you don't worry about the types too much, you just worry about whether the behaviors work correctly. A classic example:
def add_two(a, b):
    return a + b
print(add_two(1, 2)) # prints 3
print(add_two("foo", "bar")) # prints foobar
print(add_two([0, 1, 2], [3, 4, 5])) # prints [0, 1, 2, 3, 4, 5]
Notice that when you def a function in Python, you don't declare a return type anywhere. It is perfectly okay for the same function to return different types based on its arguments. It's considered a virtue, even; consider that in Python we only need one definition of add_two() and we can add integers, add floats, concatenate strings, and join lists with it. Statically typed languages would require multiple implementations, unless they had an escape such as variant, but Python is dynamically typed. (Python is strongly typed, but dynamically typed. Some will tell you Python is weakly typed, but it isn't. In a weakly typed language such as JavaScript, the expression 1 + "1" silently converts the number and gives you the string "11"; in Python this expression just raises a TypeError exception.)
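The strong-typing point is easy to check for yourself. A small sketch (the add helper and the outcome variable are just names for this illustration): mixing an int and a str raises, and you have to say explicitly which conversion you mean.

```python
def add(a, b):
    return a + b

try:
    add(1, "1")
except TypeError as exc:
    outcome = f"TypeError: {exc}"
else:
    outcome = "no error"

print(outcome)
print(add(1, int("1")))    # explicit conversion to int: 2
print(add(str(1), "1"))    # explicit conversion to str: 11
```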
It is considered very poor style to try to test the arguments to figure out their types, and then do things based on the types. If you need to make your code robust, you can always use a try block:
def safe_add_two(a, b):
    try:
        return a + b
    except TypeError:
        return None
See also the Wikipedia page on duck typing.
Python is dynamically typed: you don't need to declare the type of a variable, because a variable doesn't have a type, only values do. (Any variable can store any value; a value never changes its type during its lifetime.)
def do_something(x):
    return x * 5
This will work for any x that supports multiplication by an integer, and the actual result depends on the type of the value in x. If x contains a number it will just do regular multiplication; if it contains a string, the string will be repeated five times in a row; for lists and such it will repeat the list five times, and so on. For custom types (classes) it depends on whether the class has an operation defined for the multiplication operator.
You don't need to declare variable types in Python; a variable has the type of whatever's assigned to it.
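As an illustration of the custom-class case, here is a minimal sketch. The Money class is invented for this example; the only thing that matters is that it defines __mul__, which is the method Python calls for the * operator.

```python
class Money(object):
    """Hypothetical class, used only to show operator overloading."""

    def __init__(self, cents):
        self.cents = cents

    def __mul__(self, factor):
        # Called for `money * factor`; returns a new Money, leaving self untouched.
        return Money(self.cents * factor)


def do_something(x):
    return x * 5


print(do_something(3))                # 15
print(do_something("ab"))             # ababababab
print(do_something([1, 2]))           # [1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
print(do_something(Money(20)).cents)  # 100
```

A class without __mul__ would make do_something() raise TypeError, which is the duck-typing contract in action.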
EDIT:
To solve the re-stated problem, try this:
def multiplyItemsByFour(argsList):
    output = argsList.pop(0) * 4
    for arg in argsList:
        output += arg * 4
    return output
(This is probably not the most pythonic way of doing this, but it should at least start off your output variable as the right type, assuming the whole list is of the same type)
You gave these sample inputs and outputs:
Input: ('a','b') Output: 'aaaabbbb'
Input: (2,3,4) Output: 36
I don't want to write the solution to your homework for you, but I do want to steer you in the correct direction. But I'm still not sure I understand what your problem is, because the problem as I understand it seems a bit difficult for an intro to Python class.
The most straightforward way to solve this requires that the arguments be passed in a list. Then, you can look at the first item in the list, and work from that. Here is a function that requires the caller to pass in a list of two items:
def handle_list_of_len_2(lst):
    return lst[0] * 4 + lst[1] * 4
Now, how can we make this extend past two items? Well, in your sample code you weren't sure what to assign to your variable output. How about assigning lst[0]? Then it always has the correct type. Then you could loop over all the other elements in lst and accumulate to your output variable using += as you wrote. If you don't know how to loop over a list of items but skip the first thing in the list, Google search for "python list slice".
Now, how can we make this not require the user to pack up everything into a list, but just call the function? What we really want is some way to accept whatever arguments the user wants to pass to the function, and make a list out of them. Perhaps there is special syntax for declaring a function where you tell Python you just want the arguments bundled up into a list. You might check a good tutorial and see what it says about how to define a function.
Now that we have covered (very generally) how to accumulate an answer using +=, let's consider other ways to accumulate an answer. If you know how to use a list comprehension, you could use one of those to return a new list based on the argument list, with the multiply performed on each argument; you could then somehow reduce the list down to a single item and return it. Python 2.3 and newer have a built-in function called sum() and you might want to read up on that. [EDIT: Oh drat, sum() only works on numbers. See note added at end.]
I hope this helps. If you are still very confused, I suggest you contact your teacher and ask for clarification. Good luck.
P.S. Python 2.x has a built-in function called reduce() and it is possible to implement sum() using reduce(). However, the creator of Python thinks it is better to just use sum() and in fact he removed reduce() from the builtins in Python 3.0 (well, he moved it into a module called functools).
P.P.S. If you get the list comprehension working, here's one more thing to think about. If you use a list comprehension and then pass the result to sum(), you build a list to be used once and then discarded. Wouldn't it be neat if we could get the result, but instead of building the whole list and then discarding it we could just have the sum() function consume the list items as fast as they are generated? You might want to read this: Generator Expressions vs. List Comprehension
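Here is roughly what that difference looks like, as a sketch for numeric input only (sum() starts from 0, so strings will not work here). Both function names are made up for illustration.

```python
def sum_times_four_list(numbers):
    """Hypothetical name: builds the whole list first, then sums it."""
    return sum([n * 4 for n in numbers])


def sum_times_four_genexp(numbers):
    """Hypothetical name: sum() consumes each value as it is generated."""
    return sum(n * 4 for n in numbers)


print(sum_times_four_list((2, 3, 4)))    # 36
print(sum_times_four_genexp((2, 3, 4)))  # 36
```

The only visible change is dropping the square brackets; the generator expression never holds more than one multiplied value at a time.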
EDIT: Oh drat, I assumed that Python's sum() builtin would use duck typing. Actually it is documented as intended for numbers: it starts accumulating from 0 by default, and it explicitly refuses to sum strings. I'm disappointed! I'll have to search and see if there were any discussions about that, and see why they did it the way they did; they probably had good reasons. Meanwhile, you might as well use your += solution. Sorry about that.
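For the curious, a quick sketch of what sum() actually accepts. The try_sum() helper is hypothetical; it just reports the error instead of stopping the script.

```python
def try_sum(items, *start):
    """Hypothetical helper: return sum()'s result, or the name of the error raised."""
    try:
        return sum(items, *start)
    except TypeError:
        return "TypeError"


print(try_sum([2, 3, 4]))              # 9
print(try_sum(["aaaa", "bbbb"]))       # TypeError (0 + "aaaa" is not allowed)
print(try_sum(["aaaa", "bbbb"], ""))   # TypeError (sum() rejects a string start value)
print(try_sum([[1], [2]], []))         # [1, 2]
```

So sum() will add other types if you hand it a matching start value, but strings are deliberately refused, which is why the += loop remains the practical choice for this problem.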
EDIT: Okay, reading through other answers, I now notice two ways suggested for peeling off the first element in the list.
For simplicity, because you seem like a Python beginner, I suggested simply using output = lst[0] and then using list slicing to skip past the first item in the list. However, Wooble in his answer suggested using output = lst.pop(0), which is a very clean solution: it gets the zeroth thing on the list, and then you can just loop over the list and you automatically skip the zeroth thing. However, this "mutates" the list! It's better if a function like this does not have "side effects" such as modifying the list passed to it. (Unless the list is a special list made just for that function call, such as one built from *args.) Another way would be to use the "list slice" trick to make a copy of the list that has the first item removed. Alex Martelli provided an example of how to make an "iterator" using the Python built-in iter(), and then using that iterator to get the "next" thing. Since the iterator hasn't been used yet, the next thing is the zeroth thing in the list. That's not really a beginner solution, but it is the most elegant way to do this in Python; you could pass a really huge list to the function, and Alex Martelli's solution will neither mutate the list nor waste memory by making a copy of the list.
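A side-by-side sketch of the three approaches may make the trade-offs easier to see. The function names are invented for this comparison; each one returns the first item plus "the rest".

```python
def first_and_rest_pop(lst):
    """Hypothetical name: pop(0) removes the item from the caller's list."""
    first = lst.pop(0)
    return first, lst


def first_and_rest_slice(lst):
    """Hypothetical name: lst[1:] copies everything after the first item."""
    return lst[0], lst[1:]


def first_and_rest_iter(lst):
    """Hypothetical name: an iterator walks the list without copying or mutating."""
    items = iter(lst)
    first = next(items)
    return first, items


data = [1, 2, 3]
print(first_and_rest_slice(data), data)  # (1, [2, 3]) [1, 2, 3]

first, rest = first_and_rest_iter(data)
print(first, list(rest), data)           # 1 [2, 3] [1, 2, 3]

print(first_and_rest_pop(data), data)    # (1, [2, 3]) [2, 3]
```

Only the pop(0) version leaves the caller's list shorter than it started.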
No need to test the objects, just multiply away!
'this is a string' * 6
14 * 6
[1,2,3] * 6
all just work
Try this:
def timesfourlist(items):
    # times_four(x) is assumed to be defined elsewhere and to return x * 4
    nextstep = map(times_four, items)
    return sum(nextstep)
map applies the function passed in to each element of the list (returning a new list in Python 2, or an iterator in Python 3), and then sum does the += over the results.
If you just want to fill in the blank in your code, you could try setting output = argsList[0].__class__() to give it the zero-equivalent value of that class.
>>> def multiplyItemsByFour(argsList):
...     output = argsList[0].__class__()
...     for arg in argsList:
...         output += arg * 4
...     return output
...
>>> multiplyItemsByFour('ab')
'aaaabbbb'
>>> multiplyItemsByFour((2,3,4))
36
>>> multiplyItemsByFour((2.0,3.3))
21.199999999999999
This will crash if the list is empty, but you can check for that case at the beginning of the function and return whatever you feel appropriate.
Thanks to Alex Martelli, you have the best possible solution:
def theFinalAndTrulyRealProblemAsPosed(argsList):
    items = iter(argsList)
    output = next(items, []) * 4
    for item in items:
        output += item * 4
    return output
This is beautiful and elegant. First we create an iterator with iter(), then we use next() to get the first object in the list. Then we accumulate as we iterate through the rest of the list, and we are done. We never need to know the type of the objects in argsList, and indeed they can be of different types as long as operator + can be applied between them. This is duck typing.
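A few calls show how far that goes. This is only a sketch that repeats the function above so it can run on its own; the inputs are chosen for illustration.

```python
def theFinalAndTrulyRealProblemAsPosed(argsList):
    items = iter(argsList)
    output = next(items, []) * 4   # the [] default covers an empty input
    for item in items:
        output += item * 4
    return output


# Mixed int and float: + works between them, so this is fine.
print(theFinalAndTrulyRealProblemAsPosed([1, 2.5]))             # 14.0

# Empty input: next() falls back to [], and [] * 4 is still [].
print(theFinalAndTrulyRealProblemAsPosed([]))                   # []

# Any iterable works, not just lists, because iter() accepts them all.
print(theFinalAndTrulyRealProblemAsPosed(c for c in "ab"))      # aaaabbbb
```

Mixing types that cannot be added, such as an int and a str, still raises TypeError, as it should.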
For a moment there last night I was confused and thought that you wanted a function that, instead of taking an explicit list, just took one or more arguments.
def four_x_args(*args):
    return theFinalAndTrulyRealProblemAsPosed(args)
The *args argument to the function tells Python to gather up all arguments to this function and make a tuple out of them; then the tuple is bound to the name args. You can easily make a list out of it, and then you could use the .pop(0) method to get the first item from the list. This costs the memory and time to build the list, which is why the iter() solution is so elegant.
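If you want to see the tuple for yourself, a tiny sketch like this will do. show_args() is a made-up function that just reports what it received.

```python
def show_args(*args):
    """Hypothetical function: return the type name of args and args itself."""
    return type(args).__name__, args


print(show_args(1, 'a'))      # ('tuple', (1, 'a'))
print(show_args())            # ('tuple', ())

# The same * spelling at the call site unpacks an existing list into arguments.
print(show_args(*[2, 3, 4]))  # ('tuple', (2, 3, 4))
```

Because args is a tuple, it has no pop() method, which is why the list() conversion is needed before calling .pop(0).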
def four_x_args(*args):
    argsList = list(args)  # convert from tuple to list
    output = argsList.pop(0) * 4
    for arg in argsList:
        output += arg * 4
    return output
This is just Wooble's solution, rewritten to use *args.
Examples of calling it:
print(four_x_args(1))  # prints 4
print(four_x_args(1, 2))  # prints 12
print(four_x_args('a'))  # prints aaaa
print(four_x_args('ab', 'c'))  # prints ababababcccc
Finally, I'm going to be malicious and complain about the solution you accepted. That solution depends on the object's base class having a sensible null or zero, but not all classes have this. int() returns 0, and str() returns '' (null string), so they work. But how about this:
class NaturalNumber(int):
    """
    Exactly like an int, but only values >= 1 are possible.
    """
    def __new__(cls, initial_value=1):
        try:
            n = int(initial_value)
            if n < 1:
                raise ValueError
        except ValueError:
            raise ValueError("NaturalNumber() initial value must be an int() >= 1")
        return super(NaturalNumber, cls).__new__(cls, n)

argList = [NaturalNumber(n) for n in range(1, 4)]
print(theFinalAndTrulyRealProblemAsPosed(argList))  # prints correct answer: 24
print(NaturalNumber())  # prints 1
print(type(argList[0])())  # prints 1, same as previous line
print(multiplyItemsByFour(argList))  # prints 25!
Good luck in your studies, and I hope you enjoy Python as much as I do.
