Good afternoon.
I'm sorry if my question may seem dumb or if it has already been posted (I looked for it but didn't seem to find anything. If I'm wrong, please let me know: I'm new here and I may not be the best at searching for the correct questions).
I was wondering if it was possible to remove (pop) a generic item from a dictionary in python.
The idea came from the following exercise:
Write a function to find the sum of the VALUES in a given dictionary.
Obviously there are many ways to do it: summing dictionary.values(), creating a variable for the sum and iterating through the dict while updating it, etc. But I was trying to solve it with recursion, with something like:
def total_sum(dictionary):
    if dictionary == {}:
        return 0
    return dictionary.pop() + total_sum(dictionary)
The problem with this idea is that we don't know a priori which could be the "first" key of a dict since it's unordered: if it was a list, the index 0 would have been used and it all would have worked.
Since I don't care about the order in which the items are popped, it would be enough to have a way to delete any of the items (a "generic" item). Do you think something like this is possible or should I necessarily make use of some auxiliary variable, losing the whole point of the use of recursion, whose advantage would be a very concise and simple code?
I actually found the following solution, which though, as you can see, makes the code more complex and harder to read: I reckon it could still be interesting and useful if there was some built-in, simple and direct solution to that particular problem of removing the "first" item of a dict, although many "artificious", alternative solutions could be found.
def total_sum(dictionary):
    if dictionary == {}:
        return 0
    return dictionary.pop(list(dictionary.keys())[0]) + total_sum(dictionary)
Here is a simple example dictionary on which the function can be applied, if you want to run some simple tests.
ex_dict = {"milk":5, "eggs":2, "flour": 3}
ex_dict.popitem()
This removes the last (most recently added) item from the dictionary.
(k := next(iter(d)), d.pop(k))
will remove the leftmost (first) item (if it exists) from a dict object.
And if you want to remove the rightmost (most recently added) item from the dict:
d.popitem()
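A minimal sketch of the two removals described above, side by side. It assumes Python 3.7 or later, where a dict preserves insertion order, so "first" and "last" are well defined; the variable names are only illustrative.

```python
d = {"milk": 5, "eggs": 2, "flour": 3}

# Remove the first (oldest) item: look up its key, then pop it.
first_key = next(iter(d))
first_value = d.pop(first_key)

# Remove the last (most recently added) item in one call.
last_key, last_value = d.popitem()

print(first_key, first_value)  # milk 5
print(last_key, last_value)    # flour 3
print(d)                       # {'eggs': 2}
```

Both operations mutate d, and both fail on an empty dict: next(iter(d)) raises StopIteration and popitem() raises KeyError.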
You can pop items from a dict, but it gets destroyed in the process. If you want to find the sum of values in a dict, it's probably easiest to just pass the values straight to sum.
sum(ex_dict.values())
Instead of thinking in terms of popping values, a more pythonic approach (as far as recursion is pythonic here) is to use an iterator. You can turn the dict's values into an iterator and use that for recursion. This will be memory efficient, and give you a very clean stopping condition for your recursion:
ex_dict = {"milk":5, "eggs":2, "flour": 3}

def sum_rec(it):
    if isinstance(it, dict):
        it = iter(it.values())
    try:
        v = next(it)
    except StopIteration:
        return 0
    return v + sum_rec(it)

sum_rec(ex_dict)
# 10
This doesn't really answer the question about popping values, but that really shouldn't be an option because you can't destroy the input dict, and making a copy just to get the sum, as you noted in the comment, could be pretty expensive.
Using popitem() would be almost the same code. You would just catch a different exception and expect the tuple from the pop. (And of course understand you emptied the dict as a side effect):
ex_dict = {"milk":5, "eggs":2, "flour": 3}

def sum_rec(d):
    try:
        k,v = d.popitem()
    except KeyError:
        return 0
    return v + sum_rec(d)

sum_rec(ex_dict)
# 10
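If emptying the caller's dict is not acceptable and the cost of a copy is tolerable, one possible variation is to run the popitem() recursion on a shallow copy. This is only a sketch: the inner helper _consume is a made-up name, and the copy costs O(n) extra memory, which is exactly the expense mentioned above.

```python
def total_sum(dictionary):
    """Recursively sum the values without emptying the caller's dict."""
    remaining = dict(dictionary)  # shallow copy, so the input survives

    def _consume(d):
        # Base case: nothing left to pop.
        if not d:
            return 0
        _key, value = d.popitem()
        return value + _consume(d)

    return _consume(remaining)


ex_dict = {"milk": 5, "eggs": 2, "flour": 3}
print(total_sum(ex_dict))  # 10
print(ex_dict)             # {'milk': 5, 'eggs': 2, 'flour': 3}
```

Each item adds one stack frame, so a dict with more entries than the interpreter's recursion limit (1000 by default in CPython) would raise RecursionError with this or any of the recursive versions.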
We can use:
dict.pop('keyname')
I came across this piece of code:
for i, s_point in enumerate(sampled_dict["points"]):
    s_box = sampled_dict["gt_bboxes_3d"][i]
    continue # i is not needed afterwards
I thought of another way to do it:
for s_box, s_point in zip(sampled_dict["gt_bboxes_3d"], sampled_dict["points"]):
    continue
Which one would be faster?
Is there a more efficient way than these two?
(Does the conclusion depend on the type of involved iterators? Such as, deque or tuple instead of list)
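One way to settle this for a particular container type is to measure it. The sketch below uses the standard timeit module on made-up data (two lists of 1000 integers standing in for the real points and boxes); with_enumerate and with_zip are illustrative names, and the numbers will depend on the machine, the Python version and the container types involved.

```python
import timeit

# Hypothetical stand-in data; replace with the real sampled_dict.
sampled_dict = {
    "points": list(range(1000)),
    "gt_bboxes_3d": list(range(1000, 2000)),
}


def with_enumerate():
    pairs = []
    for i, s_point in enumerate(sampled_dict["points"]):
        s_box = sampled_dict["gt_bboxes_3d"][i]
        pairs.append((s_box, s_point))
    return pairs


def with_zip():
    pairs = []
    for s_box, s_point in zip(sampled_dict["gt_bboxes_3d"], sampled_dict["points"]):
        pairs.append((s_box, s_point))
    return pairs


# Time each variant over the same number of runs and compare.
t_enumerate = timeit.timeit(with_enumerate, number=200)
t_zip = timeit.timeit(with_zip, number=200)
print(f"enumerate + index: {t_enumerate:.4f}s")
print(f"zip:               {t_zip:.4f}s")
```

To check the deque or tuple case, swap the container type in the stand-in data and rerun; indexing into a deque is not constant time in the middle, so the enumerate variant is the one more likely to be affected.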
I am trying to understand what is the simplest way to check if an object is in either the keys of a dictionary or in the values of a dictionary. I've tried using .items() but with no results.
Now I am using this solution but I wonder if there is a better solution:
zdict = { 'a':1,'b':2,'c':3}
print(list(zdict.values()) + list(zdict.keys()))
'b' in list(zdict.values()) + list(zdict.keys())
Stop unnecessarily making list objects out of the views returned by .keys and .values. To check if an object is a dictionary key, you simply use some_object in some_dict; to check if it is in the values, you use some_object in some_dict.values(). So, combining both:
some_object in some_dict or some_object in some_dict.values()
This is fundamentally going to be a linear operation altogether, but checking if it is in the keys is constant-time, it's a hash-lookup, so you should check that first to take advantage of short-circuiting behavior. Note, if you make a list out of the keys then you force a linear search.
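The check above could be wrapped in a small helper, sketched here; in_keys_or_values is a made-up name, not a built-in.

```python
def in_keys_or_values(obj, mapping):
    # Key test first: it is a hash lookup, and `or` short-circuits,
    # so the linear scan over the values only runs on a key miss.
    return obj in mapping or obj in mapping.values()


zdict = {'a': 1, 'b': 2, 'c': 3}
print(in_keys_or_values('b', zdict))  # True (found among the keys)
print(in_keys_or_values(3, zdict))    # True (found among the values)
print(in_keys_or_values('z', zdict))  # False
```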
I wouldn't say this is simpler but maybe:
any('b'==k or 'b'==v for k,v in zdict.items())
"b" in sum(zdict.items(), ())
We flatten the (key, value) tuples into a single tuple of all keys and values, using sum with an empty tuple as its start value.
Edit: A commenter pointed out that the above is a quadratic operation. A linear version might be:
from itertools import chain
"b" in chain(*zdict.items())
Linearity is thanks to the lazy evaluation of chain.
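A small variation on the same idea, offered as a sketch: chain(*zdict.items()) still unpacks every item into an argument list before chain starts, whereas chain.from_iterable pulls the pairs one at a time, so the membership test can stop at the first match without touching the rest.

```python
from itertools import chain

zdict = {'a': 1, 'b': 2, 'c': 3}

# Lazily flattens ('a', 1), ('b', 2), ... into 'a', 1, 'b', 2, ...
print("b" in chain.from_iterable(zdict.items()))  # True
print(3 in chain.from_iterable(zdict.items()))    # True
print("z" in chain.from_iterable(zdict.items()))  # False
```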
This is not much different than others but quite different syntax:
any([ k for k, v in zdict.items() if k=='b' or v=='b'])
I just wanna make sure that in Python dictionaries there's no way to get just a key (with no specific quality or relation to a certain value) but doing iteration. As much as I found out you have to make a list of them by going through the whole dictionary in a loop. Something like this:
list_keys=[k for k in dic.keys()]
The thing is I just need an arbitrary key if the dictionary is not empty and don't care about the rest. I guess iterating over a long dictionary in addition to creation of a long list for just randomly getting a key is a whole lot overhead, isn't it?
Is there a better trick somebody can point out?
Thanks
A lot of the answers here produce a random key but the original question asked for an arbitrary key. There's quite a difference between those two. Randomness has a handful of mathematical/statistical guarantees.
Python dictionaries are not ordered in any meaningful way. So, yes, accessing an arbitrary key requires iteration. But for a single arbitrary key, we do not need to iterate the entire dictionary. The built-in functions next and iter are useful here:
key = next(iter(mapping))
The iter built-in creates an iterator over the keys in the mapping. The iteration order will be arbitrary. The next built-in returns the first item from the iterator. Iterating the whole mapping is not necessary for an arbitrary key.
If you're going to end up deleting the key from the mapping, you may instead use dict.popitem. Here's the docstring:
D.popitem() -> (k, v), remove and return some (key, value) pair as a 2-tuple;
but raise KeyError if D is empty.
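Both approaches fail on an empty mapping (next raises StopIteration, popitem raises KeyError), so here is a hedged sketch of a helper that handles that case. arbitrary_key and its default parameter are made-up names; the only library feature used is the optional second argument of the built-in next.

```python
def arbitrary_key(mapping, default=None):
    # next() returns `default` instead of raising StopIteration
    # when the iterator is already exhausted (empty mapping).
    return next(iter(mapping), default)


print(arbitrary_key({'x': 1, 'y': 2}))  # x
print(arbitrary_key({}))                # None
```

Only one step of iteration happens, regardless of how large the mapping is.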
You can use random.choice
rand_key = random.choice(dict.keys())
This will only work in Python 2.x. In Python 3.x, dict.keys returns a view object rather than a list, so you'll have to convert it into a list first -
rand_key = random.choice(list(dict.keys()))
So, for example -
import random
d = {'rand1':'hey there', 'rand2':'you love python, I know!', 'rand3' : 'python has a method for everything!'}
random.choice(list(d.keys()))
Output -
rand1
You are correct: there is not a way to get a random key from an ordinary dict without using iteration. Even solutions like random.choice must iterate through the dictionary in the background.
However you could use a sorted dict:
import random
from sortedcontainers import SortedDict as sd
d = sd(dic)
i = random.randrange(len(d))
ran_key = d.iloc[i]
More here:
http://www.grantjenks.com/docs/sortedcontainers/sorteddict.html
Note that whether or not using something like SortedDict will result in any efficiency gains is going to be entirely dependent upon the actual implementation. If you are creating a lot of SD objects, or adding new keys very often (which have to be sorted), and are only getting a random key occasionally in relation to those other two tasks, you are unlikely to see much of a performance gain.
How about something like this:
import random
arbitrary_key = random.choice( dic.keys() )
BTW, your use of a list comprehension there really makes no sense:
dic.keys() == [k for k in dic.keys()]
Check the length of the dictionary first, like this:
import random
if len(yourdict) > 0:
    randomKey = random.sample(list(yourdict), 1)
    print(randomKey[0])
else:
    pass  # do something
random.sample returns a list; since we passed 1, it is a list with a single key, which we then get with randomKey[0].
When iterating through a dictionary, I want to skip an item if it has a particular key. I tried something like mydict.next(), but I got an error message 'dict' object has no attribute 'next'
for key, value in mydict.iteritems():
    if key == 'skipthis':
        mydict.next()
    # for others do some complicated process
I am using Python 2.7 if that matters.
Use continue:
for key, value in mydict.iteritems():
    if key == 'skipthis':
        continue
Also see:
Are break and continue bad programming practices?
I think you want to call mydict.iteritems().next(), however you should just filter the list before iterating.
To filter your list, you could use a generator expression:
r = ((k, v) for k, v in mydict.iteritems() if k != 'skipthis')
for k,v in r:
    #do something complicated to filtered items
Because this is a generator expression, it has the property of only traversing the original dict once, leading to a boost in performance over other alternatives which iterate the dictionary, and optionally copy elements to a new one or delete existing elements from it. Generators can also be chained, which can be a powerful concept when iterating.
More info on generator expressions:
http://www.python.org/dev/peps/pep-0289/
Another alternative is this:
for key, value in mydict.iteritems():
    if key != 'skipthis':
        # Do whatever
It does the same thing as skipping the key with continue. The code under the if statement will only run if the key is not 'skipthis'.
The advantage of this method is that it is cleaner and saves lines. It is also a little easier to read, in my opinion.
You should ask yourself why you need to do this. One unit of code should do one thing, so in this case the dict should have been 'cleaned' before the loop reaches it.
Something along these lines:
def dict_cleaner(my_dict):
    #make a dict of stuff you want your loop to deal with
    return clean_dict

for key, value in dict_cleaner(mydict).iteritems():
    #Do the stuff the loop actually does, no worrying about selecting items from it.
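One possible way to fill in dict_cleaner, as a sketch only. It is written for Python 3, where items() replaces the Python 2 iteritems() used in this thread; the skip parameter and the sample data are assumptions added for illustration.

```python
def dict_cleaner(my_dict, skip=('skipthis',)):
    # Build a new dict holding only the entries the loop should process.
    return {key: value for key, value in my_dict.items() if key not in skip}


mydict = {'keep': 1, 'skipthis': 2, 'also_keep': 3}

for key, value in dict_cleaner(mydict).items():
    # The loop body no longer has to decide which keys to ignore.
    print(key, value)
```

Unlike the generator expression shown earlier, this builds a second dict, so it trades some memory for a loop body with no filtering logic in it.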
I have the following code in Python:
def point_to_index(point):
    if point not in points:
        points.append(point)
    return points.index(point)
This code is awfully inefficient, especially since I expect points to grow to hold a few million elements.
If the point isn't in the list, I traverse the list 3 times:
1. look for it and decide it isn't there
2. go to the end of the list and add a new element
3. go to the end of the list until I find the index
If it is in the list, I traverse it twice:
1. look for it and decide it is there
2. go almost to the end of the list until I find the index
Is there any more efficient way to do this? For instance, I know that:
I'm more likely to call this function with a point that isn't in the list.
If the point is in the list, it's likelier to be near the end than in the beginning.
So if I could have the line:
if point not in points:
search the list from the end to the beginning it would improve performance when the point is already in the list.
However, I don't want to do:
if point not in reversed(points):
because I imagine that reversed(points) itself will come at a huge cost.
Nor do I want to add new points to the beginning of the list (assuming I knew how to do that in Python) because that would change the indices, which must remain constant for the algorithm to work.
The only improvement I can think of is to implement the function with only one pass, if possible from the end to the beginning. The bottom line is:
Is there a good way to do this?
Is there a better way to optimize the function?
Edit: I've gotten suggestions for implementing this with only one pass. Is there any way for index() to go from the end to the beginning?
Edit: People have asked why the index is critical. I'm trying to describe a 3D surface using the OFF file format. This format describes a surface using its vertices and faces. First the vertices are listed, and the faces are described using a list of indices of vertices. That's why once I add a vertex to the list, its index must not change.
Edit: There have been some suggestions (such as igor's) to use a dict. This is a good solution for scanning the list. However, when I'm done I need to print out the list in the same order it was created. If I use a dict, I need to print out its keys sorted by value. Is there a good way to do that?
Edit: I implemented www.brool.com's suggestion. This was the simplest and fastest. It is essentially an ordered Dict, but without the overhead. The performance is great!
You want to use a set:
>>> x = set()
>>> x
set([])
>>> x.add(1)
>>> x
set([1])
>>> x.add(1)
>>> x
set([1])
A set contains only one instance of any item you add, and it will be a lot more efficient than iterating a list manually.
This wikibooks page looks like a good primer if you haven't used sets in Python before.
This will traverse at most once:
def point_to_index(point):
    try:
        return points.index(point)
    except ValueError:
        points.append(point)
        return len(points)-1
You may also want to try this version, which takes into account that matches are likely to be near the end of the list. Note that reversed() has almost no cost even on very large lists - it does not create a copy and does not traverse the list more than once.
def point_to_index(point):
    for index, this_point in enumerate(reversed(points)):
        if point == this_point:
            return len(points) - (index+1)
    else:
        points.append(point)
        return len(points)-1
You might also consider keeping a parallel dict or set of points to check for membership, since both of those types can do membership tests in O(1). There would be, of course, a substantial memory cost.
Obviously, if the points were ordered somehow, you would have many other options for speeding this code up, notably using a binary search for membership tests.
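To illustrate the binary search remark, here is a sketch using the standard bisect module. It assumes a separate list that is kept sorted, which is not the case for the insertion-ordered points list in the question; contains_sorted and sorted_points are made-up names.

```python
import bisect


def contains_sorted(sorted_points, point):
    # bisect_left finds the leftmost position where `point` could be
    # inserted while keeping the list sorted, in O(log n) comparisons.
    i = bisect.bisect_left(sorted_points, point)
    return i < len(sorted_points) and sorted_points[i] == point


sorted_points = [(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 1)]
print(contains_sorted(sorted_points, (1, 0, 0)))  # True
print(contains_sorted(sorted_points, (0, 0, 1)))  # False
```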
If you're worried about memory usage, but want to optimize the common case, keep a dictionary with the last n points and their indexes. points_dict = dictionary, max_cache = size of the cache.
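def point_to_index(point):
    # Check the cache first; fall back to a linear search only on a miss.
    if point in points_dict:
        return points_dict[point]
    try:
        return points.index(point)
    except ValueError:
        if len(points) >= max_cache:
            # Evict the oldest cached point (no-op if it is not cached).
            points_dict.pop(points[len(points)-max_cache], None)
        points.append(point)
        points_dict[point] = len(points)-1
        return len(points)-1

def point_to_index(point):
    try:
        return points.index(point)
    except ValueError:
        points.append(point)
        return len(points)-1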
Update: Added in Nathan's exception code.
As others said, consider using set or dict. You don't explain why you need the indices. If they are needed only to assign unique ids to the points (and I can't easily come up with another reason for using them), then dict will indeed work much better, e.g.,
points = {}
def point_to_index(point):
    if point in points:
        return points[point]
    else:
        points[point] = len(points)
        return len(points) - 1
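A compact sketch of the same dict idea, which also shows one way to list the points in creation order afterwards by sorting the keys by their stored index. It assumes the points are hashable (tuples here); the sample coordinates are invented.

```python
points = {}  # maps point -> index


def point_to_index(point):
    # len(points) is evaluated before the call, so a new point gets the
    # next free index; an existing point keeps the index it already has.
    return points.setdefault(point, len(points))


a = point_to_index((0.0, 0.0, 0.0))  # new point
b = point_to_index((1.0, 0.0, 0.0))  # new point
c = point_to_index((0.0, 0.0, 0.0))  # seen before
print(a, b, c)  # 0 1 0

# Keys sorted by their stored index, i.e. in creation order.
in_order = sorted(points, key=points.get)
print(in_order)  # [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
```

On Python 3.7 and later a plain dict also iterates in insertion order, so list(points) gives the same sequence without sorting.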
What you really want is an ordered dict (key insertion determines the order):
Recipe: http://code.activestate.com/recipes/107747/
PEP: http://www.python.org/dev/peps/pep-0372/