Convenient way to make new tuples/strings extended by 1 - python

In part of a project, given a single integer/character x and a tuple/string a, I need to create a new tuple/string that's the original tuple/string extended by that element. My current code looks like:
def extend(x, a):
    if type(a) is tuple:
        return a + (x,)
    if type(a) is str:
        return a + x
Is there a better way to code this, either to make it shorter or more generalizable to other data types?

It is not clear why you want to extend tuples and strings in the same part of the code; maybe you need to refactor. list seems to be the correct type for such operations, and it already has .append(x) for this.
If you are sure you need different types, your function seems OK. But just add
raise TypeError()
at the end of it, so you can be sure an unexpected data type won't slip through silently.
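A minimal sketch of the guarded version (the error message is an illustrative addition):
def extend(x, a):
    if type(a) is tuple:
        return a + (x,)
    if type(a) is str:
        return a + x
    # anything else is an unpredicted data type: fail loudly
    raise TypeError("cannot extend %s" % type(a).__name__)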

How about this way:
def extend(x, a):
    return a + {tuple: (x,), str: x, list: [x]}[type(a)]
Of course, keep in mind that the number of data types is really big and a one-size-fits-all approach does not exist.
So take another hard look at whatever code comes before this, and if you really need it, use this dictionary approach.
EDIT
If you need to do this many, many times, do it with an if block instead. As @chepner says, building a dictionary on every call renders this approach too smart for its own good.
If subclassing is also an issue, you should change from type to isinstance, as @Jean-FrancoisFabre says.

Ev Kounis's use of a dictionary is original, but if the aim was to gain speed, it doesn't really work, because the dictionary is rebuilt on every call (x varies).
A slight modification is to use a dictionary of conversion functions. That dictionary is fixed, so it is built only once.
Pass the parameter to the proper conversion function and it's done:
# this is built once
d = {tuple: lambda x: (x,), str: lambda x: x, list: lambda x: [x]}

def extend(x, a):
    return a + d[type(a)](x)
(This still doesn't work when passing a subclass of str, tuple, or whatever, but if you're sure you won't, it works.)
Honestly, using a dictionary for 3 keys won't be that fast, and a chain of ifs is just as efficient.
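For the subclass issue, here is a minimal sketch of the equivalent isinstance-based chain (same logic as the dictionary, but it also accepts subclasses of tuple, str, and list):
def extend(x, a):
    if isinstance(a, tuple):
        return a + (x,)
    if isinstance(a, str):
        return a + x
    if isinstance(a, list):
        return a + [x]
    raise TypeError("cannot extend %s" % type(a).__name__)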

Related

Pythonic way of looping over variable that is either an element or a list

I am trying to use a for loop with a variable that is either an element (of a list) or a list. The code I have now I find very ugly:
for x in test if isinstance(test, list) else [test]:
    print(x)
Any pythonic way of improving on this?
First, you might want to include different types (rather than list) in your check, and a quick way of doing that would be:
def is_iterable(x):
    return type(x) in [list, tuple]  # or just isinstance(x, list)
With that, I would probably end up doing something like:
if is_iterable(test):
    for x in test:
        do_stuff(x)
else:
    do_stuff(test)
Or if you expect a return value:
if is_iterable(test):
    return [do_stuff(x) for x in test]
else:
    return [do_stuff(test)]
I don't know whether this is the most pythonic way of doing it, but for me it is the most readable one. If you really, really want to reduce space, your option is probably the way to go, as it is the best practical way of getting a one-liner. However, I don't think there is any performance improvement (maybe just the opposite).
One last option, if your do_stuff is not defined as a function and you therefore don't want to copy-paste code (never do that), would be to pull the assignment out:
test = test if is_iterable(test) else [test]
for x in test:
    do_stuff(x)
    ...
But this is in essence the same as what you already have. In my personal experience, it is usually useful to get all the preprocessing out of the calculation step and make sure all the parameters have valid types. Then just perform whatever operation you need to do on them.
import collections

[x if isinstance(x, collections.Iterable) else [x] for x in test]
This gives a list where each element is unchanged if it is already iterable and is wrapped in a list if it is not.
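For example (a quick sketch with made-up data; note that strings are iterable too, so they would be left unwrapped):
>>> import collections
>>> test = [[1, 2], 3, [4]]
>>> [x if isinstance(x, collections.Iterable) else [x] for x in test]
[[1, 2], [3], [4]]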

Given an arbitrary collection, is there a way to tell if it is ordered?

Here's what I have so far:
def is_ordered(collection):
    if isinstance(collection, set):
        return False
    if isinstance(collection, list):
        return True
    if isinstance(collection, dict):
        return False
    raise Exception("unknown collection")
Is there a much better way to do this?
NB: I do mean ordered and not sorted.
Motivation:
I want to iterate over an ordered collection. e.g.
def most_important(priorities):
    for p in priorities:
        print p
In this case the fact that priorities is ordered is important; what kind of collection it is is not. I'm trying to live by duck typing here. I have frequently been dissuaded from type checking by Pythonistas.
If the collection is truly arbitrary (meaning it can be of any class whatsoever), then the answer has to be no.
Basically, there are two possible approaches:
1. know about every possible class that can be presented to your method, and whether it's ordered; or
2. test the collection yourself, by inserting every possible combination of keys and seeing whether the ordering is preserved.
The latter is clearly infeasible. The former is along the lines of what you already have, except that you have to know about every derived class such as collections.OrderedDict; checking for dict is not enough.
Frankly, I think the whole is_ordered check is a can of worms. Why do you want to do this anyway?
Update: In essence, you are trying to unittest the argument passed to you. Stop doing that, and unittest your own code. Test your consumer (make sure it works with ordered collections), and unittest the code that calls it, to ensure it is getting the right results.
In a statically-typed language you would simply restrict yourself to specific types. If you really want to replicate that, specify the only types you accept, test for those, and raise an exception if anything else is passed. It's not pythonic, but it reliably achieves what you want.
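A minimal sketch of that whitelist, reusing the asker's most_important() (the accepted types here are an illustrative assumption, not a prescribed list):
def most_important(priorities):
    # accept only types known to preserve order; anything else is an error
    if not isinstance(priorities, (list, tuple)):
        raise TypeError("ordered collection required, got %s"
                        % type(priorities).__name__)
    for p in priorities:
        print p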
Well, you have two possible approaches:
1. Anything with an append method is almost certainly ordered; and
2. if it only has an add method, you can try adding a nonce value, then iterating over the collection to see whether the nonce appears at the end (or perhaps at one end); you could add a second nonce and check again, just to be more confident (see the sketch below).
Of course, this won't work where e.g. the collection is empty, or where the ordering function doesn't result in addition at the ends.
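A minimal sketch of that nonce probe (heuristic only; it mutates the collection, and the discard() cleanup assumes a set-like interface):
def appears_ordered(collection):
    if hasattr(collection, 'append'):
        return True   # append implies a sequence, almost certainly ordered
    if not hasattr(collection, 'add'):
        return False  # nothing to probe with
    nonce1, nonce2 = object(), object()
    collection.add(nonce1)
    collection.add(nonce2)
    items = list(collection)
    # in an ordered collection, both nonces should land together at one end
    ordered = items[-2:] == [nonce1, nonce2] or items[:2] == [nonce2, nonce1]
    collection.discard(nonce1)
    collection.discard(nonce2)
    return ordered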
Probably a better solution is simply to specify that your code requires ordered collections, and only pass it ordered collections.
I think that enumerating the 90% case is about as good as you're going to get (if using Python 3, replace basestring with str). You'll probably also want to consider how you would handle generator expressions and similar ilk (again, if using Python 3, skip the xrangor):
import collections

generator = type((i for i in xrange(0)))
enumerator = type(enumerate(range(0)))
xrangor = type(xrange(0))
is_ordered = lambda seq: isinstance(seq, (tuple, list, collections.OrderedDict,
                                          basestring, generator, enumerator, xrangor))
If your callers start using itertools, then you'll also need to add itertools types as returned by islice, imap, groupby. But the sheer number of these special cases really starts to point to a code smell.
What if the list is not ordered, e.g. [1,3,2]?

Dynamic typing design : is recursivity for dealing with lists a good design?

Lacking experience with maintaining dynamically-typed code, I'm looking for the best way to handle this kind of situation:
(The example is in Python, but the question applies to any dynamically-typed language.)
def some_function(object_that_could_be_a_list):
    if isinstance(object_that_could_be_a_list, list):
        for element in object_that_could_be_a_list:
            some_function(element)
    else:
        # do stuff that expects the object to have certain
        # properties a list would not have
        ...
I'm quite uneasy with this, since I think a method should do only one thing, and I don't find it as readable as it should be. I'd be tempted to make three functions: a first one that takes any object and dispatches between the other two, one for lists and another for "simple" objects. Then again, that would add some complexity.
What is the most sustainable solution here, the one that guarantees ease of maintenance? Is there a Python idiom for these situations that I'm unaware of? Thanks in advance.
Don't type check: do what you want to do, and if it won't work, it'll throw an exception which you can catch and handle.
The Python mantra is "ask for forgiveness, not permission". Type checking takes extra time, and most of the time it's pointless. It also doesn't make much sense in a duck-typed environment: if it works, who cares what type it is? Why limit yourself to lists when other iterables will work too?
E.g.:
def some_function(object_that_could_be_a_list):
    try:
        for element in object_that_could_be_a_list:
            some_function(element)
    except TypeError:
        ...
This is more readable, will work in more cases (it handles any other iterable, not just lists, and there are a lot of them) and will often be faster.
Note that you are getting terminology mixed up. Python is dynamically typed, but not weakly typed. Weak typing means objects change type as needed: for example, if you add a string and an int, the string is converted to an int to do the addition. Python does not do this. Dynamic typing means you don't declare a type for a variable, so it may contain a string at some point and an int later.
Duck typing is a term used to describe the use of an object without caring about its type. If it walks like a duck, and quacks like a duck, it's probably a duck.
Now, this is a general rule, and if you think your code will get the "wrong" type of object more often than the "right" one, you might want to type check for speed. Note that this is rare, and it's always best to avoid premature optimisation. Do it by catching exceptions, and then test; if you find it's a bottleneck, then optimise.
A common practice is to implement the multiple interface by using different parameters for the different kinds of input:
def foo(thing=None, thing_seq=None):
    if thing_seq is not None:
        for _thing in thing_seq:
            foo(thing=_thing)
    if thing is not None:
        print "did foo with", thing
Rather than doing it recursively, I tend to do it this way:
def foo(x):
    if not isinstance(x, list):
        x = [x]
    for y in x:
        do_something(y)
You can use decorators in this case to make it more maintainable:
from mm import multimethod

@multimethod(int, int)
def foo(a, b):
    ...code for two ints...

@multimethod(float, float)
def foo(a, b):
    ...code for two floats...

@multimethod(str, str)
def foo(a, b):
    ...code for two strings...
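The mm module is whichever multimethod implementation you have at hand; as a rough sketch of the idea, such a decorator might keep a registry keyed by argument types (hypothetical and simplified, not the real mm code):
_registry = {}

def multimethod(*types):
    def register(func):
        # one dispatch table per function name, shared across decorations
        table = _registry.setdefault(func.__name__, {})
        table[types] = func
        def dispatch(*args):
            key = tuple(type(a) for a in args)
            try:
                return table[key](*args)
            except KeyError:
                raise TypeError("no multimethod match for %r" % (key,))
        return dispatch
    return register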

Best / most pythonic way to get an ordered list of unique items

I have one or more unordered sequences of (immutable, hashable) objects with possible duplicates and I want to get a sorted sequence of all those objects without duplicates.
Right now I'm using a set to quickly gather all the elements discarding duplicates, convert it to a list and then sort that:
result = set()
for s in sequences:
    result = result.union(s)
result = list(result)
result.sort()
return result
It works but I wouldn't call it "pretty". Is there a better way?
This should work:
sorted(set(itertools.chain.from_iterable(sequences)))
I like your code just fine. It is straightforward and easy to understand.
We can shorten it just a little bit by using sorted() in place of the list()/sort() pair:
result = set()
for s in sequences:
    result = result.union(s)
return sorted(result)
I really have no desire to try to boil it down beyond that, but you could do it with reduce():
result = reduce(lambda s, x: s.union(x), sequences, set())
return sorted(result)
Personally, I think this is harder to understand than the above, but people steeped in functional programming might prefer it.
EDIT: @agf is much better at this reduce() stuff than I am. From the comments below:
return sorted(reduce(set().union, sequences))
I had no idea this would work. If I correctly understand how this works, we are giving reduce() a callable which is really a method function on one instance of a set() (call it x for the sake of discussion, but note that I am not saying that Python will bind the name x with this object). Then reduce() will feed this function the first two iterables from sequences, returning x, the instance whose method function we are using. Then reduce() will repeatedly call the .union() method and ask it to take the union of x and the next iterable from sequences. Since the .union() method is likely smart enough to notice that it is being asked to take the union with its own instance and not bother to do any work, it should be just as fast to call x.union(x, some_iterable) as to just call x.union(some_iterable). Finally, reduce() will return x, and we have the set we want.
This is a bit tricky for my personal taste. I had to think this through to understand it, while the itertools.chain() solution made sense to me right away.
EDIT: @agf made it less tricky:
return sorted(reduce(set.union, sequences, set()))
What this is doing is much simpler to understand! If we call the instance returned by set() by the name of x again (and just like above with the understanding that I am not claiming that Python will bind the name x with this instance); and if we use the name n to refer to each "next" value from sequences; then reduce() will be repeatedly calling set.union(x, n). And of course this is exactly the same thing as x.union(n). IMHO if you want a reduce() solution, this is the best one.
--
If you want it to be fast, ask yourself: is there any way we can apply itertools to this? There is a pretty good way:
from itertools import chain
return sorted(set(chain(*sequences)))
itertools.chain() called with *sequences serves to "flatten" the list of lists into a single iterable. It's a little bit tricky, but only a little bit, and it's a common idiom.
EDIT: As @Jbernardo wrote in the most popular answer, and as @agf observes in comments, itertools.chain has a .from_iterable() class method, and the documentation says it evaluates the iterable lazily. The * notation forces building a list, which may consume considerable memory if the iterable is a long sequence of sequences. In fact, you could have a never-ending generator, and with itertools.chain.from_iterable() you would be able to pull values from it for as long as you want to run your program, while the * notation would just run out of memory.
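A quick sketch of that laziness point (endless_sequences is a made-up illustration):
from itertools import chain, count

def endless_sequences():
    # a never-ending generator of sequences
    for i in count():
        yield [i, i]

lazy = chain.from_iterable(endless_sequences())
first_six = [next(lazy) for _ in range(6)]  # [0, 0, 1, 1, 2, 2]
# chain(*endless_sequences()) would never finish unpacking the arguments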
As @Jbernardo wrote:
sorted(set(itertools.chain.from_iterable(sequences)))
This is the best answer, and I already upvoted it.

Way to Treat Python single vals and lists of vals identically?

I'm running into this problem often: I'm creating a function that needs to perform a series of operations on a value, whether that value be a single value or a list of values.
Is there an elegant way to do this:
def convert_val(val):
    # do a series of things to each value, whether list or single val
    # return answer or list of answers
    ...
rather than what I've been doing:
def convert_val(val):
    if isinstance(val, list):
        ...  # do a series of things to each list item, return a list of answers
    else:
        ...  # do the same series, just on a single value, return a single answer
One solution would be to create a sub_convert() that does the series of actions, and then call it once or iteratively, depending on the type passed in to convert_val().
Another would be to create a single convert_val() that accepts the arguments (val, sub_convert).
Any other suggestions that would be more compact, elegant, and preferably all in one function?
(I've done several searches here to see if my issue has already been addressed. My apologies if it has.)
Thanks,
JS
You need to fix your design to make all uses of the function actually correct.
Ralph Waldo Emerson. "A foolish consistency is the hobgoblin of little minds, adored by little statesmen and philosophers and divines."
We're not talking about a foolish consistency. You have what might be a design problem based on inconsistent use of this function.
Option 1. Don't call convert_val( x ) where x is a non-list; call convert_val( [x] ) instead. Don't fix your function, fix all the places that use your function. Consistency helps reduce bugs.
Option 2. Change the design of convert_val to use multiple positional arguments. This doesn't generalize well.
def convert_val( *args ):
    ...  # whatever it's supposed to do to the arguments
Then fix all the places that provide a list so they call convert_val( *someList ). That's okay, and may be closer to your intent.
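A quick sketch of the two call sites under Option 2 (the squaring body is a made-up stand-in):
def convert_val( *args ):
    return [a * a for a in args]  # stand-in for the real work

some_list = [1, 2, 3]
assert convert_val( *some_list ) == [1, 4, 9]  # list call site, unpacked
assert convert_val( 5 ) == [25]                # single-value call site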
Note.
You can find your design errors using the warnings module.
import collections
import warnings

def convert_val( arg ):
    if isinstance( arg, collections.Sequence ):
        return convert_val_list( arg )
    else:
        warnings.warn( "Fix this" )
        return convert_val_list( [arg] )[0]

def convert_val_list( arg ):
    assert isinstance( arg, collections.Sequence )
    ...  # the original processing
Once you've fixed all the design problems, you can then do this:
convert_val = convert_val_list
And delete the original function.
If the function makes sense for a single value, as well as for a list, then logically the function's result for a certain list item will not depend on the other items in the list.
For example, a and b should end up identical:
items = [1, 2]
a = convert_val(items)
b = map(convert_val, items)
This example already hints at the solution: the caller knows whether a list or a single value is passed in. When passing a single value, the function can be used as-is. When passing a list, a map invocation is easily added, and makes it clearer what's happening on the side of the caller.
Hence, the function you describe should not exist in the first place!
I'm late to the party here and I'm not sure if this is what the OP wants.
I much prefer to keep the implementation details hidden inside the function. The caller shouldn't care about what happens inside.
def convert_val(val):
    values = []
    if isinstance(val, list):
        values.extend(val)
    else:
        values.append(val)
    for value in values:
        ...  # do things to each item, collect a list of answers
This makes convert_val put val itself into the values list (if it is not a list) or all the values of val into it (if it is).
In addition, you predictably get a list back, since the same logic runs either way.
In the end:
assert convert_val([1]) == convert_val(1)
