How might one map a function over certain values in a dictionary and also update those values in the dictionary?
dic1 = { 1 : [1, 2, 3, 4], 2 : [2, 3, 5, 5], 3 : [6, 3, 7, 2] ... }
map(func, (data[col] for data in dic1.itervalues()))
This is sort of what I'm looking for, but I need a way to reinsert the new func(val) back into each respective slot in the dict. The function works fine, and printed it returns all the proper index values with the func applied, but I can't think of a good way to update the dictionary. Any ideas?
You don't want to use map for updating any kind of sequence; that's not what it's for. map is for generating a new sequence:
dict2 = dict(map(func, dict1.iteritems()))
Of course func has to take a (key, old_value) and return (key, new_value) for this to work as-is. If it just returns, say, new_value, you need to wrap it up in some way. But at that point, you're probably better off with a dictionary comprehension than a map call and a dict constructor:
dict2 = {key: func(value) for key, value in dict1.iteritems()}
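(For completeness, the wrapped map version would be something like dict2 = dict(map(lambda kv: (kv[0], func(kv[1])), dict1.iteritems())) — but the comprehension above says the same thing more clearly.)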
If you want to use map, and you want to mutate, you could do it by creating a function that updates things, like this:
from functools import partial

def func_wrapped(d, key):
    d[key] = func(d[key])

map(partial(func_wrapped, dict1), dict1)
(This could even be done as a one-liner by using partial with d.__setitem__ if you really wanted.)
But that's a silly thing to do. For one thing, it means you're building a list of values just to throw them away. (Or, in Python 3, you're not actually doing anything unless you write some code that iterates over the map iterator.) But more importantly, you're making things harder for yourself for no good reason. If you don't need to modify things in-place, don't do it. If you do need to modify things in place, use a for loop.
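For example, the in-place version is just a plain loop over the keys, using the same func as above:
for key in dict1:
    dict1[key] = func(dict1[key])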
PS, I'm not saying this was a silly question to ask. There are other languages that do have map-like mutating functions, so it wouldn't be a silly thing to do in, say, C++. It's just not pythonic, and you wouldn't know that without asking.
Your function can mutate the list:
>>> d = {1:[1],2:[2],3:[3]}
>>> def func(lst):
... lst.append(lst[0])
... return lst
...
>>> x = map(func,d.values())
>>> x
[[1, 1], [2, 2], [3, 3]]
>>> d
{1: [1, 1], 2: [2, 2], 3: [3, 3]}
However, please note that this really isn't idiomatic Python and should be considered for instructional/educational purposes only ... Usually, if a function mutates its arguments, it's polite to have it return None.
I'm taking a data structures course in Python, and a suggestion for a solution includes this code which I don't understand.
This is a sample of a dictionary:
vc_metro = {
'Richmond-Brighouse': set(['Lansdowne']),
'Lansdowne': set(['Richmond-Brighouse', 'Aberdeen'])
}
It is suggested that to remove some of the elements in the value, we use this code:
vc_metro['Lansdowne'] -= set(['Richmond-Brighouse'])
I have never seen such a structure, and using it in a basic situation such as:
my_list = [1, 2, 3, 4, 5, 6]
other_list = [1, 2]
my_list -= other_list
doesn't work. Where can I learn more about this recommended strategy?
You can't subtract lists, but you can subtract set objects meaningfully. Sets are hash tables, somewhat similar to dict.keys(), and they hold at most one instance of each object.
The -= operator is equivalent to the difference method, except that it works in place (like the difference_update method). It removes from the left operand all the elements that are present in both operands.
Your simple example with sets would look like this:
>>> my_set = {1, 2, 3, 4, 5, 6}
>>> other_set = {1, 2}
>>> my_set -= other_set
>>> my_set
{3, 4, 5, 6}
Curly braces with commas but no colons are interpreted as a set object. So the direct constructor call
set(['Richmond-Brighouse'])
is equivalent to
{'Richmond-Brighouse'}
Notice that you can't do set('Richmond-Brighouse'): that would add all the individual characters of the string to the set, since strings are iterable.
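For example (sorting the result to get a stable display, since set order is arbitrary):
>>> sorted(set('abc'))
['a', 'b', 'c']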
The reason to use -=/difference instead of remove is that differencing only removes existing elements, and silently ignores others. The discard method does this for a single element. Differencing allows removing multiple elements at once.
The original line vc_metro['Lansdowne'] -= set(['Richmond-Brighouse']) could be rewritten as
vc_metro['Lansdowne'].discard('Richmond-Brighouse')
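To see the difference between remove and discard on a missing element:
>>> s = {1, 2}
>>> s.discard(3)   # missing element is silently ignored
>>> s.remove(3)    # missing element raises
Traceback (most recent call last):
  ...
KeyError: 3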
While practicing Python (3.7.3), I find myself wanting to sort the keys of a dict. But I am walking up against something I don't understand, and can't find explained on SO.
edit: I know that the sort() method changes the list itself, while sorted() leaves the original list intact and returns a new one. But can someone explain why the list() constructor doesn't seem to return the list anymore when I call its sort() method?
Can someone explain why this doesn't return anything:
>>> md = {5: 3, 2: 1, 8: 9}
>>> ml = list(md.keys()).sort()
>>> ml
>>>
While if I do it in two separate steps, it does work:
>>> ml = list(md.keys())
>>> ml
[5, 2, 8]
>>> ml.sort()
>>> ml
[2, 5, 8]
>>>
Also, I found that doing it in one line using sorted(), it works as well:
>>> sorted(list(md.keys()))
[2, 5, 8]
sort sorts the list in place but returns None, and that None is what gets assigned to ml. That's why the REPL does not show anything.
On the contrary, sorted returns a new sorted list built from the original iterable.
sort() sorts your list in place, while sorted() returns a new list. (Docs)
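Note that since iterating over a dict yields its keys, you can skip the list()/keys() calls entirely and pass the dict straight to sorted:
>>> md = {5: 3, 2: 1, 8: 9}
>>> sorted(md)
[2, 5, 8]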
When helping my co-worker troubleshoot a problem, I saw something I was unaware Python could do. Compared to other ways of doing this, I am curious how the performance and time complexity stack up, and what the best approach is for the sake of performance.
What my co-worker did that prompted this question:
list_of_keys = []
test_dict = {'foo': 1, 'bar': [1, 2, 3, 4, 5]}
list_of_keys.extend(test_dict)
print(list_of_keys)
['foo', 'bar']
vs other examples I have seen:
list_of_keys = []
test_dict = {'foo': 1, 'bar': [1, 2, 3, 4, 5]}
for i in test_dict.keys():
    list_of_keys.append(i)
and
keys = list(test_dict)
Which one of these is the most beneficial and the most pythonic for simply collecting the keys? Which one yields the best performance?
As the docs explain, s.extend(t):
extends s with the contents of t (for the most part the same as s[len(s):len(s)] = t)
OK, so that isn't very clear as to whether it should be faster or slower than calling append in a loop. But it is a little faster—the looping is happening in C rather than in Python, and it can use some special optimized code for adding onto the list because it knows you're not touching the list at the same time.
More importantly, it's a lot simpler, more readable, and harder to get wrong.
As for starting with an empty list and then extending it (or appending to it), there's no good reason to do that. If you already have a list with some values in it, and want to add the dict keys, then use extend. But if you just want to create a list of the keys, just do list(d).
As for d.keys() vs. d, there's really no difference at all. Whether you iterate over a dict or its dict_keys view, you get the exact same values iterated, even using the exact same dict_keyiterator. The extra call to keys() does make things a tiny bit slower, but that's a fixed cost, not once per element, so unless your dicts are tiny, you won't see any noticeable difference.
So, do whichever one seems more readable in the circumstances. Generally speaking, the only reason you want to loop over d.keys() is when you want to make it clear that you're iterating over a dict's keys, but it isn't obvious from the surrounding code that d is a dict.
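A quick check that the two iterate identically:
>>> d = {'foo': 1, 'bar': 2}
>>> list(d) == list(d.keys())
True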
Among other things, you also asked about complexity.
All of these solutions have the same (linear) complexity, because they all do the same thing under the covers: for every key in the dictionary, append it to the end of a list. That's one step per key, and each step is amortized constant time (because Python lists grow geometrically), so the total time is O(N), where N is the number of keys in the dict.
After #thebjorn mentioned the timeit module, it seems that calling extend is the fastest.
It seems that list() is the most pythonic for the sake of readability and cleanliness.
The most beneficial seems dependent on the use case, but more or less doing this is redundant, as mentioned in a comment. This was discovered from a mistake, and I got curious.
timeit.timeit("for i in {'foo': 1, 'bar': [1, 2, 3, 4, 5]}.keys():[].append(i)", number=1000000)
0.6147394659928977
timeit.timeit("[].extend({'foo': 1, 'bar': [1, 2, 3, 4, 5]})", number=1000000)
0.36140396299015265
timeit.timeit("list({'foo': 1, 'bar': [1, 2, 3, 4, 5]})", number=1000000)
0.4726199270080542
Suppose I have a list containing (among other things) sublists of different types:
[1, 2, [3, 4], {5, 6}]
that I'd like to flatten in a selective way, depending on the type of its elements (i.e. I'd like to only flatten sets, and leave the rest unflattened):
[1, 2, [3, 4], 5, 6]
My current solution is a function, but just for my intellectual curiosity, I wonder if it's possible to do it with a single list comprehension?
List comprehensions aren't designed for flattening (since they don't have a way to combine the values corresponding to multiple input items).
While you can get around this with nested list comprehensions, this requires each element in your top level list to be iterable.
Honestly, just use a function for this. It's the cleanest way.
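For instance, a minimal generator-based sketch of such a function (the name flatten_sets is purely illustrative):
def flatten_sets(items):
    # Yield elements one by one, expanding only sets in place.
    for x in items:
        if isinstance(x, set):
            for y in x:
                yield y
        else:
            yield x
so that list(flatten_sets([1, 2, [3, 4], {5, 6}])) gives [1, 2, [3, 4], 5, 6] (the order of the elements pulled out of a set is arbitrary).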
Amber is probably right that a function is preferable for something like this. On the other hand, there's always room for a little variation. I'm assuming the nesting is never more than one level deep -- if it is ever more than one level deep, then you should definitely prefer a function for this. But if not, this is a potentially viable approach.
>>> from itertools import chain
>>> from collections.abc import Set
>>> list(chain.from_iterable(x if isinstance(x, Set) else (x,) for x in l))
[1, 2, [3, 4], 5, 6]
The non-itertools way to do this would involve nested list comprehensions. Better to break that into two lines:
>>> packaged = (x if isinstance(x, Set) else (x,) for x in l)
>>> [x for y in packaged for x in y]
[1, 2, [3, 4], 5, 6]
I don't have a strong intuition about whether either of these would be faster or slower than a straightforward function. These create lots of singleton tuples -- that's kind of a waste -- but they also happen at LC speed, which is usually pretty good.
You can use the flatten function from the funcy library:
from funcy import flatten, isa
flat_list = flatten(your_list, follow=isa(set))
You can also peek at its implementation.
I'm given a problem where I have to filter out dupes from a list such as:
a = [1,1,4,5,6,5]
This is my code:
def unique(a):
    uni = []
    for value in a:
        if value[0] not in found:
            yield value
            found.add(value[0])
    print list(unique(a))
However, when I define the list, a, and try unique(a), I get this output:
<generator object unique at 0x0000000002891750>
Can someone tell me what I'm doing wrong? Why can't I get the list?
EDIT, NEW PROBLEM:
I was able to get it to print out the filtered list, but I lose the order of the list.
How can I prevent this?
def unique(a):
    s = set()
    for i in a:
        if i not in s:
            s.add(i)
    return s
You have to keep track of all the elements that have been seen. The best way is to use a set, since its lookup complexity is O(1).
>>> def unique(it):
...     s = set()
...     for el in it:
...         if el not in s:
...             s.add(el)
...             yield el
>>> list(unique(a))
[1, 4, 5, 6]
If you don't need to keep the order of the elements, you can use the set constructor and then convert the result back to a list. This removes all the duplicates but destroys the order of the elements:
list(set(a))
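If you do need to keep the order, note that on Python 3.7+ dicts preserve insertion order, so dict.fromkeys gives an order-preserving dedupe:
>>> list(dict.fromkeys([1, 1, 4, 5, 6, 5]))
[1, 4, 5, 6]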
First of all, to remove duplicates, use a set:
>>> a = [1, 1, 4, 5, 6, 5]
>>> set(a)
{1, 4, 5, 6}
>>> list(set(a)) # if you really _need_ a list, you can convert it back
[1, 4, 5, 6]
Second, the output you get, generator object unique at 0x..., means that you have a generator object instead of a plain list as the return value. And this is what you should expect after using yield in the function: yield turns any function into a generator, which produces its results only when you request them (by iterating over it). If you just want the full result, you can call list() on the object to create a list from the generator object: list(unique(a)).
However, then you will notice the errors your function gives you: TypeError: 'int' object is not subscriptable. The reason for that is the value[0] you use. value is an element from the list (you iterate over the list) and as such is an integer. You cannot get the first element from the integer, so you probably meant just value there.
Next, you add elements to found although you defined the list as uni first, so you should settle on one of the two names. Also, the list method is append, not add.
Finally, you should really not recursively call the function with the same argument inside itself, as this just fills up the stack without providing any benefit, so take the print out of the function.
Then, you end up with this, which works just fine:
>>> def unique(a):
...     found = []  # better: use a set() here
...     for value in a:
...         if value not in found:
...             yield value
...             found.append(value)
>>> list(unique(a))
[1, 4, 5, 6]
But still, this is not really a good solution, and you should really just use set instead, as it will also give you further methods to work with that set once it's created (e.g. a quick membership test).
I'm also required to get the answer just by inputting unique(a)
In that case, just remove the yield value from your function, and return the found list at the end of it.
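That is, something along these lines (a sketch of that change):
def unique(a):
    found = []
    for value in a:
        if value not in found:
            found.append(value)
    return found
With the sample list, unique(a) then returns [1, 4, 5, 6] directly.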
This is a well known classic:
>>> def unique(xs):
... seen = set()
... seen_add = seen.add
... return [x for x in xs if x not in seen and not seen_add(x)]
...
>>> unique([1, 2, 3, 3, 4, 1, 3, 5, 5, 4, 6])
[1, 2, 3, 4, 5, 6]
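(This works because set.add returns None, so not seen_add(x) is always true and merely records x as seen; binding seen.add to a local name up front just skips a repeated attribute lookup inside the list comprehension.)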
The usual way to do this is list(set(a)):
def unique(a):
    return list(set(a))
Now, coming to your question: yield returns a generator that you must iterate over, not print. So if you have a function with a yield in it, iterate over it like for return_value in function_that_yields(): ...
There are more problems with your code: you have not defined found, and you are indexing into value, which may not be a container.