Indexing a list with nested lists [duplicate] - python

The list.index(x) function returns the index in the list of the first item whose value is x.
Is there a function, list_func_index(), similar to index(), that takes a function, f(), as a parameter? The function f() is run on every element, e, of the list until f(e) returns True. Then list_func_index() returns the index of e.
Codewise:
>>> def list_func_index(lst, func):
...     for i in range(len(lst)):
...         if func(lst[i]):
...             return i
...     raise ValueError('no element making func True')
>>> l = [8,10,4,5,7]
>>> def is_odd(x): return x % 2 != 0
>>> list_func_index(l,is_odd)
3
Is there a more elegant solution? (and a better name for the function)

You could do that in a one-liner using generators:
next(i for i,v in enumerate(l) if is_odd(v))
The nice thing about generators is that they only compute up to the requested amount. So requesting the first two indices is (almost) just as easy:
y = (i for i,v in enumerate(l) if is_odd(v))
x1 = next(y)
x2 = next(y)
Expect a StopIteration exception after the last index, though (that is how generators work). This is also convenient in your "take-first" approach, as a signal that no such value was found; the list.index() function would raise ValueError here instead.
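If raising StopIteration is undesirable, note that next() also accepts a default value to return when the generator is exhausted. A small sketch, reusing the list and predicate from the question:

```python
def is_odd(x):
    return x % 2 != 0

l = [8, 10, 4, 5, 7]

# next() takes an optional second argument, returned when the
# generator yields nothing instead of raising StopIteration
first_odd = next((i for i, v in enumerate(l) if is_odd(v)), None)
no_big = next((i for i, v in enumerate(l) if v > 100), None)

print(first_odd)  # 3
print(no_big)     # None
```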

One possibility is the built-in enumerate function:
def index_of_first(lst, pred):
    for i, v in enumerate(lst):
        if pred(v):
            return i
    return None
It's typical to refer to a function like the one you describe as a "predicate"; it returns true or false for some question. That's why I call it pred in my example.
I also think it would be better form to return None, since that's the real answer to the question. The caller can choose to explode on None, if required.

@Paul's accepted answer is best, but here's a little lateral-thinking variant, mostly for amusement and instruction purposes...:
>>> class X(object):
...     def __init__(self, pred): self.pred = pred
...     def __eq__(self, other): return self.pred(other)
...
>>> l = [8,10,4,5,7]
>>> def is_odd(x): return x % 2 != 0
...
>>> l.index(X(is_odd))
3
essentially, X's purpose is to change the meaning of "equality" from the normal one to "satisfies this predicate", thereby allowing the use of predicates in all kinds of situations that are defined as checking for equality -- for example, it would also let you code, instead of if any(is_odd(x) for x in l):, the shorter if X(is_odd) in l:, and so forth.
Worth using? Not when a more explicit approach like the one taken by @Paul is just as handy (especially when changed to use the new, shiny built-in next function rather than the older, less appropriate .next method, as I suggest in a comment to that answer), but there are other situations where it (or other variants of the idea "tweak the meaning of equality", and maybe other comparators and/or hashing) may be appropriate. Mostly, it's worth knowing about the idea, to avoid having to invent it from scratch one day ;-).
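To make the membership trick mentioned above concrete, here is a self-contained sketch of the same X class, showing both index() and the `in` operator driven by a predicate:

```python
class X(object):
    def __init__(self, pred):
        self.pred = pred

    def __eq__(self, other):
        # equality now means "other satisfies the predicate";
        # int.__eq__ returns NotImplemented against X, so Python
        # falls back to this reflected comparison
        return self.pred(other)

def is_odd(x):
    return x % 2 != 0

l = [8, 10, 4, 5, 7]
print(l.index(X(is_odd)))          # 3
print(X(is_odd) in l)              # True: some element is odd
print(X(lambda x: x > 100) in l)   # False: no element exceeds 100
```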

Not one single function, but you can do it pretty easily (note this relies on Python 2's map, which returns a list; in Python 3 you would need list(map(test, data)).index(True)):
>>> test = lambda c: c == 'x'
>>> data = ['a', 'b', 'c', 'x', 'y', 'z', 'x']
>>> map(test, data).index(True)
3
>>>
If you don't want to evaluate the entire list at once you can use itertools (Python 2 shown here), but it's not as pretty:
>>> from itertools import imap, ifilter
>>> from operator import itemgetter
>>> test = lambda c: c == 'x'
>>> data = ['a', 'b', 'c', 'x', 'y', 'z']
>>> ifilter(itemgetter(1), enumerate(imap(test, data))).next()[0]
3
>>>
Just using a generator expression is probably more readable than itertools though.
Note that in Python 3, map and filter return lazy iterators and you can just use:
from operator import itemgetter
test = lambda c: c == 'x'
data = ['a', 'b', 'c', 'x', 'y', 'z']
next(filter(itemgetter(1), enumerate(map(test, data))))[0] # 3

A variation on Alex's answer. This avoids having to type X every time you want to use is_odd or whichever predicate:
>>> class X(object):
...     def __init__(self, pred): self.pred = pred
...     def __eq__(self, other): return self.pred(other)
...
>>> L = [8,10,4,5,7]
>>> is_odd = X(lambda x: x%2 != 0)
>>> L.index(is_odd)
3
>>> less_than_six = X(lambda x: x<6)
>>> L.index(less_than_six)
2

You could do this with a list comprehension:
l = [8,10,4,5,7]
filterl = [a for a in l if a % 2 != 0]
Then filterl will contain all members of the list fulfilling the expression a % 2 != 0 (note that this gives the matching values themselves, not their indices). I would say a more elegant method...
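Since the question asks for an index rather than values, the same comprehension idea can be pointed at enumerate instead. A sketch, reusing the list from the question:

```python
l = [8, 10, 4, 5, 7]

# indices of all elements satisfying the condition
odd_indices = [i for i, a in enumerate(l) if a % 2 != 0]

print(odd_indices)     # [3, 4]
print(odd_indices[0])  # 3, the index asked for in the question
```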

Intuitive one-liner solution:
i = list(map(lambda value: value > 0, data)).index(True)
Explanation:
We use the map function to create a list containing True or False, based on whether each element in our list passes the condition in the lambda or not.
Then we convert the map output to a list.
Then, using the index function, we get the index of the first True, which is the same as the index of the first value passing the condition.

How do I map multiple functions over a list?

It's come to this:
items = ['this', 'that', 100]
item_types = (type(i) for i in items)
items_gen = map(next(item_types), items)
This does not work, neither do many other things I've tried.
What am I missing?
I get either the first type mapped all over the entire list, or the type applied to the first item and itself cut into character snippets...
If this is a dupe - sorry, can't find this question in any reasonable way being asked here due to a million ways being asked.
I am going to switch out the items for input()'s so this is just a crude example, but I want to apply types to input values.
Expected output:
I want to call next on this item_gen object and get: 'this', 'that', 100
Not: 'this', 'that', '100'
[see EDIT below]
I think you need one more generator in the chain, if what you're looking for is a type conversion (or verification?) system:
def mapper(typedefs, target):
    igen = (i for i in typedefs)
    item_types = (type(i) for i in igen)
    return map(next(item_types), target)
so then if you say:
list(mapper(['a','b','c'],[1,2,3]))
You'd get:
['1','2','3']
This will throw ValueError exceptions in the reverse conversion, however.
[EDIT]: that's incorrect above. This looks good:
def foo(types, target):
    if len(types) == len(target):
        gen1 = (type(i) for i in types)
        gen2 = ([i] for i in target)  # key is to make this a list
        gen3 = (next(map(next(gen1), next(gen2))) for _ in types)
        yield from gen3
Now we get item-by-item type conversion (attempts):
bar = foo(['a', True, 3.14, 1], [1, 1, 1, 1])
list(bar)
['1',True,1.0,1]
You can just zip 2 iterable and apply each function to each element. Example:
>>> def f1(x): return x+1
...
>>> def f2(x): return x+2
...
>>> def f3(x): return x+3
...
>>> functions = [f1,f2,f3]
>>> elements = [1,1,1]
>>> [f(el) for f,el in zip(functions,elements)]
[2, 3, 4]
Which in your case becomes:
>>> [f(el) for f,el in zip(item_types,items)]
['this', 'that', 100]
To use map here, you need to map function application. Then you can use the multi-argument form of map, something like:
>>> funcs = [lambda x: x+1, lambda x: x*2, lambda x: x**3]
>>> data = [1, 2, 3]
>>> def apply(f, x): return f(x)
...
>>> list(map(apply, funcs, data))
[2, 4, 27]
Note: passing next(item_types) makes no sense; that gives you the next item of item_types, which in this case is the first type. That explains what you were seeing before.
Or, with your example:
>>> items = ['this', 'that', 100]
>>> list(map(apply, map(type, items), items))
['this', 'that', 100]
What you seem to want to do is create a generator out of a list (I am not sure what the whole type thing there is for).
To do just that you can just call iter:
item_gen = iter(items)
res = []
res.append(next(item_gen))
res.append(next(item_gen))
res.append(next(item_gen))
print(res)
will print (and the types will be the original ones):
['this', 'that', 100]
Assuming you have just one set of types in a types list and you want them to get applied you can do the following thing:
types = [str, str, int]
item_gen = (tp(i) for tp, i in zip(types, items))

Hash function for collection of items that disregards ordering

I am using the hash() function to get the hash value of my object, which contains two integers and two strings. Moreover, I have a dictionary where I store these objects; the process is that I check if the object exists via the hash value; if yes, I update it, and if not, I insert the new one.
The thing is that when creating the objects, I do not know the order of the object variables and I want to treat the objects as same no matter the order of these variables.
Is there an alternative function to the hash() function that does not consider the order of the variables?
Consequently, what I want is:
hash((int1,str1,int2,str2)) == hash((int2,str2,int1,str1))
You could use a frozenset instead of a tuple:
>>> hash(frozenset([1, 2, 'a', 'b']))
1190978740469805404
>>>
>>> hash(frozenset([1, 'a', 2, 'b']))
1190978740469805404
>>>
>>> hash(frozenset(['a', 2, 'b', 1]))
1190978740469805404
However, the removal of duplicates from the iterable presents a subtle problem:
>>> hash(frozenset([1,2,1])) == hash(frozenset([1,2,2]))
True
You can fix this by creating a counter from the iterable using collections.Counter, and calling frozenset on the counter's items, thus preserving the count of each item from the original iterable:
>>> from collections import Counter
>>>
>>> hash(frozenset(Counter([1,2,1]).items()))
-307001354391131208
>>> hash(frozenset(Counter([1,1,2]).items()))
-307001354391131208
>>>
>>> hash(frozenset(Counter([1,2,1]).items())) == hash(frozenset(Counter([1,2,2]).items()))
False
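Putting the Counter + frozenset trick into a class's __hash__ might look like the following sketch (the Thing class and its fields are hypothetical, not from the question):

```python
from collections import Counter

class Thing:
    def __init__(self, *parts):
        # parts may repeat and may arrive in any order
        self.parts = parts

    def _key(self):
        # frozenset of (item, count) pairs:
        # order-insensitive, but duplicate-aware
        return frozenset(Counter(self.parts).items())

    def __hash__(self):
        return hash(self._key())

    def __eq__(self, other):
        return isinstance(other, Thing) and self._key() == other._key()

a = Thing(1, 'x', 2, 'y')
b = Thing(2, 'y', 1, 'x')
print(hash(a) == hash(b))  # True
print(a == b)              # True
```

Defining __eq__ alongside __hash__ matters here: dict lookup first compares hashes, then falls back to equality, so both must agree on what "same object" means.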
Usually for things like this it helps immeasurably if you post some sample code, but I'll assume you've got something like this:
class Foo():
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def __hash__(self):
        return hash((self.x, self.y))
You're taking a hash of a tuple there, which does care about order. If you want your hash to not care about the order of the ints, then just use a frozenset:
def __hash__(self):
    return hash(frozenset([self.x, self.y]))
If the range of the values is not too great you could add the hashes together; that way the order is disregarded, though it does increase the possibility of two inputs hashing to the same value:
def hash_list(items):
    value = 0
    for item in items:
        value += hash(item)
    return value

>>> hash_list(['a', 'b', 'c'])
8409777985338339540
>>> hash_list(['b', 'a', 'c'])
8409777985338339540
(The exact values differ between interpreter sessions because string hashing is randomized, but the two calls agree within a session.)

Powerset with frozenset in Python

I'm sitting here for almost 5 hours trying to solve the problem and now I'm hoping for your help.
Here is my Python Code:
def powerset3(a):
    if (len(a) == 0):
        return frozenset({})
    else:
        s = a.pop()
        b = frozenset({})
        b |= frozenset({})
        b |= frozenset({s})
        for subset in powerset3(a):
            b |= frozenset({str(subset)})
            b |= frozenset({s+subset})
        return b
If I run the program with:
print(powerset3(set(['a', 'b'])))
I get following solution
frozenset({'a', 'b', 'ab'})
But I want to have
{frozenset(), frozenset({'a'}), frozenset({'b'}), frozenset({'b', 'a'})}
I don't want to use libraries and it should be recursive!
Thanks for your help
Here's a slightly more readable implementation using itertools. If you don't want to use a lib for the combinations, you can replace the combinations call with its pure-Python equivalent, e.g. from https://docs.python.org/2/library/itertools.html#itertools.combinations
import itertools

def powerset(l):
    result = [()]
    for i in range(len(l)):
        result += itertools.combinations(l, i+1)
    return frozenset([frozenset(x) for x in result])
Testing on IPython, with different lengths
In [82]: powerset(['a', 'b'])
Out[82]:
frozenset({frozenset(),
frozenset({'b'}),
frozenset({'a'}),
frozenset({'a', 'b'})})
In [83]: powerset(['x', 'y', 'z'])
Out[83]:
frozenset({frozenset(),
frozenset({'x'}),
frozenset({'x', 'z'}),
frozenset({'y'}),
frozenset({'x', 'y'}),
frozenset({'z'}),
frozenset({'y', 'z'}),
frozenset({'x', 'y', 'z'})})
In [84]: powerset([])
Out[84]: frozenset({frozenset()})
You sort of have the right idea. If a is non-empty, then the powerset of a can be formed by taking some element s from a; let's call what's left over rest. Then build up the powerset of a from the powerset of rest by adding to it, for each subset in powerset3(rest), both subset itself and subset | frozenset({s}).
That last bit, doing subset | frozenset({s}) instead of string concatenation, is half of what's missing from your solution. The other problem is the base case. The powerset of the empty set is not the empty set; it is the one-element set containing the empty set.
One more issue with your solution is that you're trying to use frozenset, which is immutable, in mutable ways (e.g. pop(), b |= something, etc.)
Here's a working solution:
from functools import partial, reduce

def helper(x, accum, subset):
    return accum | frozenset({subset}) | frozenset({frozenset({x}) | subset})

def powerset(xs):
    if len(xs) == 0:
        return frozenset({frozenset({})})
    else:
        # this loop is the only way to access elements of a frozenset; notice
        # it always returns out of the first iteration
        for x in xs:
            return reduce(partial(helper, x), powerset(xs - frozenset({x})), frozenset({}))
a = frozenset({'a', 'b'})
print(powerset(a))
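If the reduce machinery feels heavy, here is an alternative recursive sketch with no imports at all, in keeping with the question's no-library constraint. It uses the same "each subset either omits x or includes it" recurrence:

```python
def powerset(s):
    s = frozenset(s)
    if not s:
        # base case: the powerset of the empty set
        # is the set containing only the empty set
        return frozenset({frozenset()})
    x = next(iter(s))          # pick an arbitrary element
    rest = powerset(s - {x})   # powerset of the remainder
    # every subset of s either omits x (rest) or includes it
    return rest | frozenset(sub | {x} for sub in rest)

print(powerset({'a', 'b'}))
```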

Test if set is a subset, considering the number (multiplicity) of each element in the set

I know I can test if set1 is a subset of set2 with:
{'a','b','c'} <= {'a','b','c','d','e'} # True
But the following is also True:
{'a','a','b','c'} <= {'a','b','c','d','e'} # True
How do I have it consider the number of times an element in the set occurs so that:
{'a','b','c'} <= {'a','b','c','d','e'} # True
{'a','a','b','c'} <= {'a','b','c','d','e'} # False since 'a' is in set1 twice but set2 only once
{'a','a','b','c'} <= {'a','a','b','c','d','e'} # True because both sets have two 'a' elements
I know I could do something like:
A, B, C = ['a','a','b','c'], ['a','b','c','d','e'], ['a','a','b','c','d','e']
all([A.count(i) == B.count(i) for i in A]) # False
all([A.count(i) == C.count(i) for i in A]) # True
But I was wondering if there was something more succinct, like set(A).issubset(B, count=True), or a way to stay away from list comprehensions. Thanks!
Since Python 3.10, "Counters support rich comparison operators for equality, subset, and superset":
from collections import Counter

def issubset(X, Y):
    return Counter(X) <= Counter(Y)
Old: As stated in the comments, a possible solution using Counter:
from collections import Counter

def issubset(X, Y):
    return len(Counter(X) - Counter(Y)) == 0
The short answer to your question is there is no set operation that does this, because the definition of a set does not provide those operations. IE defining the functionality you're looking for would make the data type not a set.
Sets by definition have unique, unordered, members:
>>> print {'a', 'a', 'b', 'c'}
set(['a', 'c', 'b'])
>>> {'a', 'a', 'b', 'c'} == {'a', 'b', 'c'}
True
Combining previous answers gives a solution which is as clean and fast as possible:
def issubset(X, Y):
    return all(v <= Y[k] for k, v in X.items())
No new Counter instances are created, versus 3 in @A.Rodas' version (both arguments must already be of type Counter, since this is the Pythonic way to handle multisets).
Early return (short-circuit) as soon as predicate is falsified.
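To make that assumption explicit: both arguments must already be Counters. A usage sketch:

```python
from collections import Counter

def issubset(X, Y):
    # X and Y must already be Counters;
    # short-circuits on the first element whose count in X exceeds Y's
    return all(v <= Y[k] for k, v in X.items())

print(issubset(Counter('abc'), Counter('abcde')))    # True
print(issubset(Counter('aabc'), Counter('abcde')))   # False: two 'a's vs one
print(issubset(Counter('aabc'), Counter('aabcde')))  # True
```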
As @DSM deleted his solution, I will take the opportunity to provide a prototype based on which you can expand:
>>> class Multi_set(Counter):
...     def __le__(self, rhs):
...         return all(v == rhs[k] for k,v in self.items())
...
>>> Multi_set(['a','b','c']) <= Multi_set(['a','b','c','d','e'])
True
>>> Multi_set(['a','a','b','c']) <= Multi_set(['a','b','c','d','e'])
False
>>> Multi_set(['a','a','b','c']) <= Multi_set(['a','a','b','c','d','e'])
True
>>>
For those that are interested in the usual notion of multiset inclusion, the easiest way to test for multiset inclusion is to use intersection of multisets:
from collections import Counter
def issubset(X, Y):
    return X & Y == X
issubset(Counter("ab"), Counter("aab")) # returns True
issubset(Counter("abc"), Counter("aab")) # returns False
This is a standard idea used in idempotent semirings.

Filtering lists

I want to filter out repeated elements in my list.
For instance:
foo = ['a','b','c','a','b','d','a','d']
I am only interested in:
['a','b','c','d']
What would be the most efficient way to achieve this?
Cheers
list(set(foo)) if you are using Python 2.5 or greater, but that doesn't maintain order.
Cast foo to a set, if you don't care about element order.
Since there isn't an order-preserving answer with a list comprehension, I propose the following:
>>> temp = set()
>>> [c for c in foo if c not in temp and (temp.add(c) or True)]
['a', 'b', 'c', 'd']
which could also be written as
>>> temp = set()
>>> filter(lambda c: c not in temp and (temp.add(c) or True), foo)
['a', 'b', 'c', 'd']
Depending on how many elements are in foo, you might have faster results through repeated hash lookups instead of repeated iterative searches through a temporary list.
c not in temp verifies that temp does not have an item c; and the or True part forces c to be emitted to the output list when the item is added to the set.
>>> bar = []
>>> for i in foo:
...     if i not in bar:
...         bar.append(i)
...
>>> bar
['a', 'b', 'c', 'd']
This would be the most straightforward way of removing duplicates from the list and preserving the order as much as possible (even though "order" here is an inherently fuzzy concept).
If you care about order, a readable way is the following:
def filter_unique(a_list):
    characters = set()
    result = []
    for c in a_list:
        if c not in characters:
            characters.add(c)
            result.append(c)
    return result
Depending on your requirements for speed, maintainability, and space consumption, you could find the above unfitting. In that case, specify your requirements and we can try to do better :-)
If you write a function to do this, I would use a generator; it just wants to be used in this case.
def unique(iterable):
    yielded = set()
    for item in iterable:
        if item not in yielded:
            yield item
            yielded.add(item)
Inspired by Francesco's answer, rather than making our own filter()-type function, let's make the builtin do some work for us:
def unique(a, s=set()):
    if a not in s:
        s.add(a)
        return True
    return False
Usage:
uniq = filter(unique, orig)
This may or may not perform faster or slower than an answer that implements all of the work in pure Python. Benchmark and see. Of course, this only works once, but it demonstrates the concept. The ideal solution is, of course, to use a class:
class Unique(set):
    def __call__(self, a):
        if a not in self:
            self.add(a)
            return True
        return False
Now we can use it as much as we want:
uniq = filter(Unique(), orig)
Once again, we may (or may not) have thrown performance out the window; the gains of using a built-in function may be offset by the overhead of a class. I just thought it was an interesting idea.
This is what you want if you need a sorted list at the end:
>>> foo = ['a','b','c','a','b','d','a','d']
>>> bar = sorted(set(foo))
>>> bar
['a', 'b', 'c', 'd']
import numpy as np
np.unique(foo)
(Note that np.unique returns a sorted array, so the original order is not preserved.)
You could do a sort of ugly list comprehension hack.
[l[i] for i in range(len(l)) if l.index(l[i]) == i]
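As an aside not from the original answers: on Python 3.7+ (where dict preserves insertion order), dict.fromkeys gives an order-preserving one-liner with no helper set:

```python
foo = ['a', 'b', 'c', 'a', 'b', 'd', 'a', 'd']
# dict keys are unique and, since Python 3.7, keep insertion order
print(list(dict.fromkeys(foo)))  # ['a', 'b', 'c', 'd']
```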
