Lambda that searches list and increments - python

Using a Python lambda can you check whether an element exists in another list (of maps) and also increment a variable? I'm attempting to optimise/refactor my code using a lambda but I've gone and confused myself.
Below is my existing code that I want to convert to a lambda. Is it possible to do this using one lambda, or will I need to use two? Any advice on how I can convert it to a lambda (or lambdas)?
current_orders = auth.get_orders()
# returns [{'id': 'foo', 'price': 1.99, ...}, ...]
deleted_orders = auth.cancel_orders()
# returns id's of all cancelled orders [{'id': 'foo'}, {'id': 'bar'}, ...]
# Attempting to convert to lambda
n_deleted = 0
for del_order in deleted_orders:
    for order in current_orders:
        if del_order['id'] == order['id']:
            n_deleted += 1
# lambda
n_deleted = filter(lambda order, n: n += order['id'] in current_orders, deleted_orders)
# end
if n_deleted != len(current_orders):
    logger.error("Failed to cancel all limit orders")
Note: I know I can say if len(deleted_orders) < len(current_orders): logger.error("Failed to delete ALL orders") but I want to expand my lambda eventually to say ...: logger.error("Failed to delete ORDER with ID: %s")

You can't use += (or assignment of any kind) in a lambda at all, and using filter for side-effects is a terrible idea (this pattern looks kind of like how reduce is used, but it's hard to tell what you're trying to do).
It looks like you're trying to count how many order['id'] values appear in current_orders. You shouldn't use a lambda for this at all. To improve efficiency, get the ids out as sets and use set operations to check whether the same ids appear in both lists:
# On Py2 only, uncomment the next line to get a generator-based map:
# from future_builtins import map
from operator import itemgetter
# ... rest of your code ...
getid = itemgetter('id')
# Creating the `set`s requires a single linear pass, and comparison is
# roughly linear as well; your original code had quadratic performance.
if set(map(getid, current_orders)) != set(map(getid, deleted_orders)):
    logger.error("Failed to cancel all limit orders")
If you want to know which orders weren't canceled, make a slight tweak, replacing the if check and logger output with:
for oid in set(map(getid, current_orders)).difference(map(getid, deleted_orders)):
    logger.error("Failed to cancel order ID %s", oid)
If you want the error logs ordered by oid, wrap the set.difference call in sorted. If you want them in the same order as returned in current_orders, change to:
from itertools import filterfalse # On Py2, it's ifilterfalse
# Could inline deletedids creation in filterfalse if you prefer; frozenset optional
deletedids = frozenset(map(getid, deleted_orders))
for oid in filterfalse(deletedids.__contains__, map(getid, current_orders)):
    logger.error("Failed to cancel order ID %s", oid)
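As a self-contained illustration of the set-based check, here is a minimal sketch with made-up order data. The sample ids and the `uncancelled_ids` helper name are assumptions for the example, not part of the original code.

```python
from operator import itemgetter

getid = itemgetter('id')

def uncancelled_ids(current_orders, deleted_orders):
    """Return ids from current_orders that are missing from deleted_orders,
    in the order they appear in current_orders."""
    deleted_ids = frozenset(map(getid, deleted_orders))
    return [oid for oid in map(getid, current_orders) if oid not in deleted_ids]

# Hypothetical sample data standing in for auth.get_orders() / auth.cancel_orders()
current_orders = [
    {'id': 'foo', 'price': 1.99},
    {'id': 'bar', 'price': 2.50},
    {'id': 'baz', 'price': 0.75},
]
deleted_orders = [{'id': 'foo'}, {'id': 'bar'}]

missing = uncancelled_ids(current_orders, deleted_orders)
for oid in missing:
    print("Failed to cancel order ID %s" % oid)
```

Building the frozenset once keeps each membership test cheap, so the whole check stays roughly linear in the number of orders.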

It is possible to hack around it, but lambdas should not mutate; they should return a new result. Also, you should not overcomplicate lambdas: they are meant for short, quick functions such as a key for a sort method.

You should probably be using a comprehension, e.g.:
deleted_order_ids = {order['id'] for order in deleted_orders}
not_del = [order for order in current_orders if order['id'] not in deleted_order_ids]
for order in not_del:
    logger.error("Failed to delete ORDER with ID: %s", order['id'])

Related

Python, django filter by kwargs or list, inclusive output

I want to get the account ids associated with a given list of ids. Currently I filter by exactly one id, and I would like to pass in several ids so I can get a wider result.
My code:
from typing import List
from project import models
def get_followers_ids(system_id) -> List[int]:
    return list(models.Mapper.objects.filter(system_id__id=system_id
                                             ).values_list('account__id', flat=True))
If I run the code, I get the ids associated with the main id; the output is a list of ids related to the main one (let's say, "with connection to"):
Example use:
system_id = 12350
utility_ids = get_followers_ids(system_id)
print(utility_ids)
output:
>>> [14338, 14339, 14341, 14343, 14344, 14346, 14347, 14348, 14349, 14350, 14351]
But I would like to pass in more values without falling back on a for loop, which would be slow because it would make many requests to the server.
The input I would like to use is a list or similar, so that several arguments can be passed at a time.
The output should be a list of relations (using the fewest possible requests to the DB), for example:
if id=1 is related to [3,4,5,6]
and if id=2 is related to [5,6,7,8]
The output should be [3,4,5,6,7,8]
You can use field lookups; in your case, use the "in" lookup:
# system_ids is now a list
def get_followers_ids(system_ids) -> List[int]:
    # append "__in" to the field name used in filter()
    return list(models.Mapper.objects.filter(system_id__id__in=system_ids
                                             ).values_list('account__id', flat=True))
system_ids = [12350, 666]
utility_ids = get_followers_ids(system_ids)
print(utility_ids)
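One caveat, offered as an assumption about the desired output rather than something the answer states: with several system ids, an account id related to more than one of them can appear more than once in the result, whereas the example output in the question lists each id once. Django querysets have a `.distinct()` method for this. The sketch below shows the equivalent de-duplication in plain Python, with `rows` as hypothetical data standing in for the values returned by the query and `unique_in_order` as a made-up helper name.

```python
def unique_in_order(ids):
    """Drop repeated ids while keeping the first occurrence of each."""
    # dict keys are unique and (on Python 3.7+) keep insertion order
    return list(dict.fromkeys(ids))

# Hypothetical flat result: id=1 relates to [3, 4, 5, 6], id=2 to [5, 6, 7, 8]
rows = [3, 4, 5, 6, 5, 6, 7, 8]
utility_ids = unique_in_order(rows)
print(utility_ids)
```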

Is there a way to extract only the values from a dictionary object to add to a list, using a lambda function?

I am experimenting with lambda functions to create a list that contains only the values of a particular key.
I have the following:
names = None
names = list(map(lambda restaurant: dict(name=restaurant['name']
                                         ).values(), yelp_restaurants))
names
# This is what I want for the list:
# ['Fork & Fig',
# 'Salt And Board',
# 'Frontier Restaurant',
# 'Nexus Brewery',
# "Devon's Pop Smoke",
# 'Cocina Azul',
# 'Philly Steaks',
# 'Stripes Biscuit']
What I get however, is the following:
[dict_values(['Fork & Fig']),
dict_values(['Salt And Board']),
dict_values(['Frontier Restaurant']),
dict_values(['Nexus Brewery']),
dict_values(["Devon's Pop Smoke"]),
dict_values(['Cocina Azul']),
dict_values(['Philly Steaks']),
dict_values(['Stripes Biscuit'])]
Is there a way to pass only the values, and eliminate the redundant 'dict_values' prefix?
The function you are using to create names is a bit redundant:
names = list(map(lambda restaurant: dict(name=restaurant['name']
                                         ).values(), yelp_restaurants))
The control flow that you have outlined is "from a list of dict entries called yelp_restaurants, I want to create a dict of each name and grab the values from each dict and put that in a list."
Why? Don't get hung up on lambda functions yet. Start simple, like a for loop:
names = []
for restaurant in yelp_restaurants:
    names.append(restaurant['name'])
Look at how much simpler that is. It does exactly what you want, just get the name and stuff it into a list. You can put that in a list comprehension:
names = [restaurant['name'] for restaurant in yelp_restaurants]
Or, if you really need to use lambda, now it's much easier to see what you actually want to accomplish
names = list(map(lambda x: x['name'], yelp_restaurants))
Remember, the x in lambda x: is each member of the iterable yelp_restaurants, so x is a dict. With that in mind, you are using direct access on name to extract what you want.
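If the goal is simply to avoid writing the lambda by hand, `operator.itemgetter` from the standard library performs the same lookup. A minimal sketch, using a shortened, made-up `yelp_restaurants` list (the extra `rating` key is an assumption for illustration):

```python
from operator import itemgetter

# Hypothetical sample data; the real list comes from the asker's Yelp results
yelp_restaurants = [
    {'name': 'Fork & Fig', 'rating': 4.5},
    {'name': 'Salt And Board', 'rating': 4.5},
    {'name': 'Frontier Restaurant', 'rating': 4.0},
]

# itemgetter('name') builds a callable equivalent to lambda x: x['name']
names = list(map(itemgetter('name'), yelp_restaurants))
print(names)
```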

sort/graph traversal order of a list of objects which "depend" on each other [duplicate]

I'm trying to work out whether my problem is solvable using the builtin sorted() function or whether I need to do it myself. Old school, using cmp, it would have been relatively easy.
My data-set looks like:
x = [
('business', Set('fleet','address'))
('device', Set('business','model','status','pack'))
('txn', Set('device','business','operator'))
....
The sort rule should basically be: for all values of N and Y where Y > N, x[N][0] not in x[Y][1]
Although I'm using Python 2.6 where the cmp argument is still available I'm trying to make this Python 3 safe.
So, can this be done using some lambda magic and the key argument?
-== UPDATE ==-
Thanks Eli & Winston! I didn't really think using key would work, or if it could I suspected it would be a shoe horn solution which isn't ideal.
Because my problem was for database table dependencies I had to make a minor addition to Eli's code to remove an item from its list of dependencies (in a well designed database this wouldn't happen, but who lives in that magical perfect world?)
My Solution:
def topological_sort(source):
    """perform topo sort on elements.
    :arg source: list of ``(name, set(names of dependencies))`` pairs
    :returns: list of names, with dependencies listed first
    """
    pending = [(name, set(deps)) for name, deps in source]
    emitted = []
    while pending:
        next_pending = []
        next_emitted = []
        for entry in pending:
            name, deps = entry
            deps.difference_update(set((name,)), emitted) # <-- pop self from dep, req Py2.6
            if deps:
                next_pending.append(entry)
            else:
                yield name
                emitted.append(name) # <-- not required, but preserves original order
                next_emitted.append(name)
        if not next_emitted:
            raise ValueError("cyclic dependency detected: %s %r" % (name, (next_pending,)))
        pending = next_pending
        emitted = next_emitted
What you want is called a topological sort. While it's possible to implement using the builtin sort(), it's rather awkward, and it's better to implement a topological sort directly in python.
Why is it going to be awkward? If you study the two algorithms on the wiki page, they both rely on a running set of "marked nodes", a concept that's hard to contort into a form sort() can use, since key=xxx (or even cmp=xxx) works best with stateless comparison functions, particularly because timsort doesn't guarantee the order the elements will be examined in. I'm (pretty) sure any solution which does use sort() is going to end up redundantly calculating some information for each call to the key/cmp function, in order to get around the statelessness issue.
The following is the algorithm I've been using (to sort some JavaScript library dependencies):
edit: reworked this greatly, based on Winston Ewert's solution
def topological_sort(source):
    """perform topo sort on elements.
    :arg source: list of ``(name, [list of dependencies])`` pairs
    :returns: list of names, with dependencies listed first
    """
    pending = [(name, set(deps)) for name, deps in source] # copy deps so we can modify set in-place
    emitted = []
    while pending:
        next_pending = []
        next_emitted = []
        for entry in pending:
            name, deps = entry
            deps.difference_update(emitted) # remove deps we emitted last pass
            if deps: # still has deps? recheck during next pass
                next_pending.append(entry)
            else: # no more deps? time to emit
                yield name
                emitted.append(name) # <-- not required, but helps preserve original ordering
                next_emitted.append(name) # remember what we emitted for difference_update() in next pass
        if not next_emitted: # all entries have unmet deps, one of two things is wrong...
            raise ValueError("cyclic or missing dependency detected: %r" % (next_pending,))
        pending = next_pending
        emitted = next_emitted
Sidenote: it is possible to shoe-horn a cmp() function into key=xxx, as outlined in this python bug tracker message.
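For reference, the standard library ships this shoe-horn as `functools.cmp_to_key` (available since Python 2.7 and 3.2). The sketch below only shows the mechanics on a made-up comparison (`by_length_then_alpha` is a hypothetical name). It does not by itself give a topological sort, because dependency order is only a partial order and a comparison sort needs a consistent total order.

```python
from functools import cmp_to_key

def by_length_then_alpha(a, b):
    """Old-style cmp function: negative, zero, or positive, like cmp(a, b)."""
    if len(a) != len(b):
        return len(a) - len(b)
    return (a > b) - (a < b)  # portable replacement for the removed cmp() builtin

words = ['txn', 'device', 'business', 'fleet', 'pack']
# cmp_to_key wraps the comparison so it can be passed as key=
ordered = sorted(words, key=cmp_to_key(by_length_then_alpha))
print(ordered)
```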
I do a topological sort something like this:
# TopologicalSortFailure is not defined in the original snippet;
# a plain Exception subclass is assumed here.
class TopologicalSortFailure(Exception):
    pass
def topological_sort(items):
    provided = set()
    while items:
        remaining_items = []
        emitted = False
        for item, dependencies in items:
            if dependencies.issubset(provided):
                yield item
                provided.add(item)
                emitted = True
            else:
                remaining_items.append( (item, dependencies) )
        if not emitted:
            raise TopologicalSortFailure()
        items = remaining_items
I think it's a little more straightforward than Eli's version; I don't know about efficiency.
Overlooking the bad formatting and this strange Set type... (I've kept them as tuples and delimited the list items correctly...) ... and using the networkx library to make things convenient...
x = [
    ('business', ('fleet','address')),
    ('device', ('business','model','status','pack')),
    ('txn', ('device','business','operator'))
]
import networkx as nx
g = nx.DiGraph()
for key, vals in x:
    for val in vals:
        g.add_edge(key, val)
# list() because newer networkx versions return a generator here
print(list(nx.topological_sort(g)))
This is Winston's suggestion, with a docstring and a tiny tweak: replacing dependencies.issubset(provided) with provided.issuperset(dependencies). That change permits you to pass the dependencies in each input pair as an arbitrary iterable rather than necessarily a set.
My use case involves a dict whose keys are the item strings, with the value for each key being a list of the item names on which that key depends. Once I've established that the dict is non-empty, I can pass its iteritems() to the modified algorithm.
Thanks again to Winston.
def topological_sort(items):
    """
    'items' is an iterable of (item, dependencies) pairs, where 'dependencies'
    is an iterable of the same type as 'items'.
    If 'items' is a generator rather than a data structure, it should not be
    empty. Passing an empty generator for 'items' (zero yields before return)
    will cause topological_sort() to raise TopologicalSortFailure.
    An empty iterable (e.g. list, tuple, set, ...) produces no items but
    raises no exception.
    """
    provided = set()
    while items:
        remaining_items = []
        emitted = False
        for item, dependencies in items:
            if provided.issuperset(dependencies):
                yield item
                provided.add(item)
                emitted = True
            else:
                remaining_items.append( (item, dependencies) )
        if not emitted:
            raise TopologicalSortFailure()
        items = remaining_items
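On Python 3.9 and later, the standard library's `graphlib.TopologicalSorter` covers the same ground. A minimal sketch using the data from the question, rewritten as a dict of plain sets (an assumption about the intended structure). Names that appear only as dependencies, such as 'fleet', are treated as nodes with no dependencies of their own, and a cycle raises `graphlib.CycleError`.

```python
from graphlib import TopologicalSorter

# Each key depends on the names in its set
deps = {
    'business': {'fleet', 'address'},
    'device': {'business', 'model', 'status', 'pack'},
    'txn': {'device', 'business', 'operator'},
}

# static_order() yields every node after all of its dependencies
order = list(TopologicalSorter(deps).static_order())
print(order)
```

The exact order among independent names is not specified, so only the relative position of each name and its dependencies should be relied on.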

How to efficiently select entries by date in python?

I have emails and dates. I can use two nested for loops to choose emails sent on the same date, but how can I do it the 'smart way', i.e. efficiently?
# list of tuples - (email,date)
for entry in list_emails_dates:
    current_date = entry[1]
    for next_entry in list_emails_dates:
        if current_date == next_entry[1]:
            list_one_date_emails.append(next_entry)
I know it can be written in shorter code, but I don't know itertools, or maybe use map, xrange?
You can just convert this to a dictionary, by collecting all emails related to a date into the same key.
To do this, you need to use defaultdict from collections. It is an easy way to give a new key in a dictionary a default value.
Here we are passing in the function list, so that each new key in the dictionary will get a list as the default value.
from collections import defaultdict
emails = defaultdict(list)
for email, email_date in list_of_tuples:
    emails[email_date].append(email)  # key by date, collect the emails
Now, you have emails['2013-14-07'] which will be a list of emails for that date.
If we don't use a defaultdict, and do a dictionary comprehension like this:
emails = {x[1]:x[0] for x in list_of_tuples}
You'll have one entry for each date, which will be the last email for that date, since assigning to the same key overwrites its value. A dictionary is the most efficient way to look up a value by a key. A list is good if you want to look up a value by its position in a series of values (assuming you know its position).
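To make the difference concrete, here is a minimal sketch with made-up data (the addresses and dates in `list_of_tuples` are assumptions for illustration) comparing the defaultdict grouping with the dict comprehension:

```python
from collections import defaultdict

# Hypothetical (email, date) pairs
list_of_tuples = [
    ('a@example.com', '2014-07-01'),
    ('b@example.com', '2014-07-01'),
    ('c@example.com', '2014-07-03'),
]

# Grouping: every date maps to a list of all emails sent on that date
grouped = defaultdict(list)
for email, email_date in list_of_tuples:
    grouped[email_date].append(email)

# Dict comprehension: a later email overwrites an earlier one for the same date
last_only = {email_date: email for email, email_date in list_of_tuples}

print(dict(grouped))
print(last_only)
```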
If for some reason you are not able to refactor it, you can use this template method, which will create a generator:
def find_by_date(haystack, needle):
    for email, email_date in haystack:
        if email_date == needle:
            yield email
Here is how you would use it:
>>> email_list = [('foo#bar.com','2014-07-01'), ('zoo#foo.com', '2014-07-01'), ('a#b.com', '2014-07-03')]
>>> all_emails = list(find_by_date(email_list, '2014-07-01'))
>>> all_emails
['foo#bar.com', 'zoo#foo.com']
Or, you can do this:
>>> july_first = find_by_date(email_list, '2014-07-01')
>>> next(july_first)
'foo#bar.com'
>>> next(july_first)
'zoo#foo.com'
I would use itertools.groupby (and it's good to try using itertools):
itertools.groupby(list_of_tuples, lambda x: x[1])
which gives you the emails grouped by date (x[1]). Note that the input has to be sorted on the same component first (sorted(list_of_tuples, key=lambda x: x[1])).
One nice thing (other than telling the reader that we are grouping) is that it works lazily and, if the list is long, its cost is dominated by the n log n of the sort instead of the n^2 of the nested loop.
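A minimal sketch of the sort-then-group approach, with made-up `(email, date)` pairs (the data and the `emails_by_date` name are assumptions for illustration):

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical (email, date) pairs, deliberately not sorted by date
list_of_tuples = [
    ('a@example.com', '2014-07-03'),
    ('b@example.com', '2014-07-01'),
    ('c@example.com', '2014-07-01'),
]

by_date = itemgetter(1)
emails_by_date = {}
# groupby only merges adjacent items, so sort on the same key first
for date, group in groupby(sorted(list_of_tuples, key=by_date), key=by_date):
    emails_by_date[date] = [email for email, _ in group]

print(emails_by_date)
```

Each `group` is a lazy iterator that is only valid until the next step of the outer loop, which is why it is consumed into a list right away.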

some issue with **keywords_arguments in python

I have an issue trying to define keywords_arguments.
I'm trying to define a function that returns all the objects matching *_control in the scene when nothing is specified, but I'd like to be able to choose whether it returns the 'left' or the 'right' ones.
Below you can find my function. I don't understand where the error is.
from maya import cmds
def correct_value(selection=None, **keywords_arguments):
    if selection is None:
        selection = cmds.ls ('*_control')
    if not isinstance(selection, list):
        selection = [selection]
    for each in keywords_arguments:
        keywords_list = []
        if each.startswith('right','left'):
            selection.append(each)
    return selection
correct_value()
Keyword arguments are passed as a dictionary. You can print them, or verify the type with the type() function. That lets you experiment with a dictionary in isolation and work out how to solve the problem yourself.
Now, when you have a dictionary x = {1:2}, iterating over it with for will give you just 1, i.e. it only iterates over the keys(!), not the corresponding values. To get both, use for key, value in dictionary.items(), and then use the value if key in ('right', 'left').
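A small sketch of that behaviour, runnable without Maya. The `show_flags` function is a hypothetical name used only for illustration:

```python
def show_flags(**kwargs):
    """Collect what plain iteration and .items() each give for keyword arguments."""
    keys_only = [key for key in kwargs]  # iterating a dict yields its keys only
    pairs = [(key, value) for key, value in kwargs.items()]
    # keep only the 'left'/'right' flags that were passed with a true value
    enabled = [key for key, value in kwargs.items()
               if key in ('right', 'left') and value]
    return keys_only, pairs, enabled

keys_only, pairs, enabled = show_flags(left=True, right=False)
print(keys_only)
print(pairs)
print(enabled)
```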
The code you have would add 'right' or 'left' on to the end of the list.
I think you want something like this:
def find_controls(*selection, **kwargs): # with *args you can pass one item, several items, or a list
    selection = selection or cmds.ls("*_control") or [] # supplied objects, or the ls command, or an empty list
    if not kwargs:
        return list(selection) # no flags? return the whole list
    lefty = lambda ctrl: ctrl.lower().startswith("left") # this will filter for items with left
    righty = lambda ctrl: ctrl.lower().startswith("right") # this will filter for items with right
    filters = []
    if kwargs.get('left'): # safe way to ask 'is this key here and does it have a true value?'
        filters.append(lefty)
    if kwargs.get('right'):
        filters.append(righty)
    result = []
    for each_filter in filters:
        result += filter (each_filter, selection)
    return result
find_controls (left=True, right=True)
# Result: [u'left_control', u'right_control'] #
find_controls (left=True, right =False) # or just left=True
# Result: [u'left_control'] #
find_controls()
# Result: [u'left_control', u'middle_control', u'right_control'] #
The trick here is to use the lambdas (which are basically just functions in a shorter format) and the built-in filter function (which applies a function to everything in a list and returns the items for which the function gives a non-zero, non-false answer). It's easy to see how you could extend it just by adding more keywords and corresponding lambdas.
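To try the same idea outside Maya, the sketch below swaps `cmds.ls` for a hard-coded list. `SCENE_CONTROLS` is a made-up stand-in for the scene contents, and the table-driven `PREFIX_FILTERS` dict is one possible way to organise the lambdas, not the answer's original layout.

```python
# Hypothetical stand-in for cmds.ls("*_control")
SCENE_CONTROLS = ['left_control', 'middle_control', 'right_control']

# One predicate per supported keyword; add entries here to support more flags
PREFIX_FILTERS = {
    'left': lambda ctrl: ctrl.lower().startswith('left'),
    'right': lambda ctrl: ctrl.lower().startswith('right'),
}

def find_controls(*selection, **kwargs):
    """Return the selection, narrowed by any flags passed with a true value."""
    selection = list(selection) or list(SCENE_CONTROLS)
    if not kwargs:
        return selection  # no flags at all: return everything
    result = []
    for flag, predicate in PREFIX_FILTERS.items():
        if kwargs.get(flag):
            result.extend(filter(predicate, selection))
    return result

print(find_controls(left=True, right=True))
print(find_controls(left=True))
print(find_controls())
```

Supporting another keyword then only means adding one more entry to the dict.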
