Aggregate nested lists without nested loops in Python 3.x - python

I'm about to work my way into Python 3.5's lambda notation and I'm wondering whether nested loops can simply be replaced with a lambda one-liner.
e.g.:
I have this simple dummy class hierarchy:
class Resource:
    def __init__(self, name="foo"):
        self.name = name
class Course:
    def __init__(self):
        self.resources = list()
class College:
    def __init__(self):
        self.courses = list()
I have an instance of College with multiple Courses and Resources as my starting point.
college = College()
Now if I want a list of all the Resources in my College I could easily do this with 2 for-loops:
all_resources = list()
for course in college.courses:
    for resource in course.resources:
        all_resources.append(resource)
This is indeed very simple but I wondered whether I could also achieve this by doing something like this:
all_resources = list(map(lambda r: r, [c.resources for c in college.courses]))
But unfortunately this gives me a list of lists and not a list of Resources, which is what I wanted to achieve. Is lambda suitable for something like that?
What would be the most pythonic way for an operation like this?

First of all, please stop creating empty lists by calling list(); it's both simpler and quicker to use a literal empty list [] (check this with timeit if you don't believe me).
Secondly, there is absolutely no need to use lambda to create the list you want. A simple list comprehension will do the job:
all_resources = [r for course in college.courses for r in course.resources]
Don't overthink it. Keep it as simple as you can.

You shouldn't be using lambdas for this. You should be using list comprehensions.
all_resources = [resource for course in college.courses for resource in course.resources]
Lambdas would be completely unnecessary here. If you absolutely wanted to, you could do something like:
all_resources = []
[(lambda c: [all_resources.append(r) for r in c.resources])(x) for x in college.courses]
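But I'd recommend the first one. Note that the order of the for clauses matters: they read left to right, like the nested loops. Swapping them does not give a different ordering; it fails with a NameError (or silently uses a stale course variable from the enclosing scope), because course.resources is evaluated before course is bound:
all_resources = [resource for resource in course.resources for course in college.courses]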
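For reference, here is a minimal self-contained sketch of the flattening discussed in this thread. The class definitions repeat the question's; the sample data and the itertools.chain.from_iterable variant are additions for illustration and not part of the original answers.

```python
from itertools import chain


class Resource:
    def __init__(self, name="foo"):
        self.name = name


class Course:
    def __init__(self):
        self.resources = []


class College:
    def __init__(self):
        self.courses = []


# Hypothetical sample data: two courses holding two and one resources.
college = College()
for names in (["a", "b"], ["c"]):
    course = Course()
    course.resources = [Resource(n) for n in names]
    college.courses.append(course)

# The for clauses read left to right, exactly like the nested loops.
all_resources = [r for c in college.courses for r in c.resources]

# Standard-library alternative: chain the inner lists together.
chained = list(chain.from_iterable(c.resources for c in college.courses))

print([r.name for r in all_resources])
```

Both forms visit the courses in order and keep each course's resources together, so they should produce the same list as the two nested for-loops.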

Related

Quick way to convert all instance variables in a class, to a list (Python)

I have created a class with more than 100 instance variables (as it will be used in a function to do something else).
Is there a way to put all the instance variables into a list without manually appending each one?
For instance:
class CreateHouse(object):
    def __init__(self):
        self.name = "Foobar"
        self.title = "FooBarTest"
        self.value = "FooBarValue"
        # ...
        # ...
        # (100 more instance variables)
Is there a quicker way to append all these items to a list, quicker than:
theList = []
theList.append(self.name)
theList.append(self.title)
theList.append(self.value)
# ... (x100 elements)
The list would be used to perform another task, in another class/method.
The only solution (without totally rethinking your whole design, which FWIW might be an option to consider, cf. my comments on your question) is to keep a list of the attribute names (in the order you want them in the final list) and use getattr:
class MonstruousGodClass(object):
    _fields_list = ["name", "title", "value"]  # etc.
    def as_list(self):
        return [getattr(self, fieldname) for fieldname in self._fields_list]
Now since, as I mentioned in a comment, a list is NOT the right datatype here (semantically, at least), you may want to use a dict instead, which makes the code much simpler:
import copy
class MonstruousGodClass(object):
    def as_dict(self):
        # we return a deepcopy to avoid unexpected side-effects
        return copy.deepcopy(self.__dict__)
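A runnable sketch of both approaches follows. The class name House, the three sample fields and the _fields tuple are assumptions made for illustration. vars(self) is the built-in way to reach the same dictionary as self.__dict__, and dict(...) makes a shallow copy where the answer used a deep one.

```python
class House:
    # Explicit field order for the list form (hypothetical helper attribute).
    _fields = ("name", "title", "value")

    def __init__(self):
        self.name = "Foobar"
        self.title = "FooBarTest"
        self.value = "FooBarValue"

    def as_list(self):
        # Look each attribute up by name, in the declared order.
        return [getattr(self, field) for field in self._fields]

    def as_dict(self):
        # vars(self) is self.__dict__; dict() copies it (shallowly).
        return dict(vars(self))


house = House()
print(house.as_list())
print(house.as_dict())
```

Because _fields is a class attribute, it does not show up in the instance dictionary returned by as_dict.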

Python: how to automatically create instance object?

I want to create instance objects automatically as I explained in the following:
class MyClass:
    def __init__(self, x):
        self.x = x
list = ["A","B"]
I want to create the following automatically, that is, loop through the list and create an identical object for each element:
A = MyClass(text)
B = MyClass(text)
For example, something like the following, which doesn't work:
# this doesn't work but explains more what I need
for i in list:
    i = MyClass(text)
Thanks to all of your help!
In general, you can't and shouldn't shove things into your namespace like that. It's better to store those instances in a dict or a list:
class MyClass:
    def __init__(self, x):
        self.x = x
lst = ["A", "B"]  # don't use list as an identifier
myclasses = {k: MyClass(text) for k in lst}
Now your instances are
myclasses['A'], myclasses['B'] etc.
If you really want to create a handful of variables in your namespace:
A, B = (MyClass(text) for x in range(2))
Note that this means you need to be explicit: you can't get the names A and B from a file, user input, etc.
Don't be tempted to use exec to pull this off. It's probably the wrong way to go about solving your problem. Tell us why you think you need to do it instead.
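The dict-based approach can be tried end to end with a small sketch. The value of text is a placeholder, since the question never defines it, and the names list stands in for data that could equally come from a file or user input.

```python
class MyClass:
    def __init__(self, x):
        self.x = x


names = ["A", "B"]   # could come from a file or user input
text = "some text"   # placeholder; the question does not define `text`

# One independent instance per name, looked up by key instead of by variable.
instances = {name: MyClass(text) for name in names}

print(instances["A"].x)
```

Each key maps to its own object, so changing instances["A"].x leaves instances["B"] untouched.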

Python - Proper way of serially reassign/update class members

I have a class whose members are lists of numbers built by accumulating values from experimental data, like
class MyClass:
    def __init__(self):
        self.container1 = []
        self.container2 = []
        ...
    def accumulate_from_dataset(self, dataset):
        for entry in dataset:
            self.container1.append(foo(entry))
            self.container2.append(bar(entry))
            ...
    def process_accumulated_data(self):
        '''called when all the data is gathered
        '''
        process1(self.container1)
        process2(self.container2)
        ...
Issue: it would be beneficial if I could convert all the lists into numpy arrays.
What I tried: the simple conversion
self.container1 = np.array(self.container1)
works. Although, if I would like to consider "more fields in one shot", like
lists_to_convert = [self.container1, self.container2, ...]
def converter(lists_to_convert):
    for list in lists_to_convert:
        list = np.array(list)
there is no effective change, since the references to the class members are passed by value.
I am thus wondering if there is a smart approach/workaround to handle the whole conversion process.
Any help appreciated
From The Pragmatic Programmer:
Ask yourself: "Does it have to be done this way? Does it have to be done at all?"
Maybe you should rethink your data structure? Maybe some dictionary or a simple list of lists would be easier to handle?
Note that in the example presented, container1 and container2 are just transformations on the initial dataset. It looks like a good place for list comprehension:
foo_data = [foo(d) for d in dataset]
# or even
foo_data = map(foo, dataset)
# or generator version
foo_data_iter = (foo(d) for d in dataset)
If you really want to operate on the instance variables as in the example, have a look at the getattr and hasattr built-in functions.
There isn't an easy way to do this because, as you say, Python passes "by-reference-by-value".
You could add a to_numpy method to your class:
class MyClass:
    def __init__(self):
        self.container1 = []
        self.container2 = []
        ...
    def to_numpy(self, container):
        # look the attribute up by name, then rebind it to an array
        values = getattr(self, container)
        setattr(self, container, np.array(values))
        ...
And then do something like:
obj = MyClass()  # avoid shadowing the built-in object
lists_to_convert = ["container1", "container2" ...]
def converter(lists_to_convert):
    for name in lists_to_convert:
        obj.to_numpy(name)
But it's not very pretty and this sort of code would normally make me take a step back and think about my design.
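The name-based rebinding can be sketched without NumPy so that it runs anywhere. Here the class name Accumulator, the method convert_fields and the arithmetic standing in for foo and bar are all assumptions, and tuple is passed where np.array would go in the real code.

```python
class Accumulator:
    def __init__(self):
        self.container1 = []
        self.container2 = []

    def accumulate(self, dataset):
        for entry in dataset:
            self.container1.append(entry * 2)  # stands in for foo(entry)
            self.container2.append(entry + 1)  # stands in for bar(entry)

    def convert_fields(self, names, convert):
        # Rebind each attribute by name. Rebinding a loop variable instead
        # would leave the attributes themselves unchanged.
        for name in names:
            setattr(self, name, convert(getattr(self, name)))


acc = Accumulator()
acc.accumulate([1, 2, 3])
# With NumPy installed, np.array would be passed instead of tuple.
acc.convert_fields(["container1", "container2"], tuple)
print(acc.container1, acc.container2)
```

Passing the converter as an argument keeps the loop in one place, at the cost of referring to attributes by string name.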

is this the right way to delete object inside dict

I wrote a class inheriting from dict, with a member method to remove objects.
import time
class RoleCOList(dict):
    def __init__(self):
        dict.__init__(self)
    def recyle(self):
        '''
        remove roles too long no access
        '''
        checkTime = time.time()-60*30
        l = [k for k,v in self.items() if v.lastAccess>checkTime]
        for x in l:
            self.pop(x)
Isn't this too inefficient? I used two loops over the list, but I couldn't find another way.
At the SciPy conference last year, I attended a talk where the speaker said that any() and all() are fast ways to do a task in a loop. It makes sense; a for loop rebinds the loop variable on each iteration, whereas any() and all() simply consume the value.
Clearly, you use any() when you want to run a function that always returns a false value such as None. That way, the whole loop will run to the end.
checkTime = time.time() - 60*30
# use any() as a fast way to run a loop
# The .__delitem__() method always returns `None`, so this runs the whole loop
lst = [k for k in self.keys() if self[k].lastAccess > checkTime]
any(self.__delitem__(k) for k in lst)
What about this?
# list() takes a snapshot, so popping while looping is also safe in Python 3
_ = [self.pop(k) for k, v in list(self.items()) if v.lastAccess > checkTime]
Since you don't need the list you generated, you could use generators and a snippet from this consume recipe. In particular, use collections.deque to run through a generator for you.
checkTime = time.time()-60*30
# Create a generator for all the values you will age off
age_off = (self.pop(k) for k in self.keys() if self[k].lastAccess>checkTime)
# Let deque handle iteration (in one shot, with little memory footprint)
collections.deque(age_off,maxlen=0)
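Since the dictionary is changed during the iteration of age_off, iterate over a snapshot of the keys. In Python 2, self.keys() returns a list; in Python 3 it returns a live view, so use list(self.keys()) there. (Iterating over self.iteritems(), or over a Python 3 view, will raise a RuntimeError.)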
My (completely unreadable) solution:
from operator import delitem
map(lambda k: delitem(self,k), filter(lambda k: self[k].lastAccess<checkTime, iter(self)))
but at least it should be quite time and memory efficient ;-)
If performance is an issue, and if you will have large volumes of data, you might want to look into using a Python front-end for a system like memcached or redis; those can handle expiring old data for you.
http://memcached.org/
http://pypi.python.org/pypi/python-memcached/
http://redis.io/
https://github.com/andymccurdy/redis-py
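For comparison with the one-liners in this thread, here is a plain two-step version as a self-contained sketch. It assumes the docstring's intent (drop roles that have not been accessed for too long), so it compares with < rather than the > in the question's code. The Role class, the max_idle and now parameters, and the spelling recycle are hypothetical names added so the example can run.

```python
import time


class Role:
    def __init__(self, last_access):
        self.lastAccess = last_access


class RoleCOList(dict):
    def recycle(self, max_idle=60 * 30, now=None):
        """Remove roles not accessed within the last max_idle seconds."""
        now = time.time() if now is None else now
        cutoff = now - max_idle
        # Collect the stale keys first so the dict is not mutated mid-iteration.
        stale = [k for k, v in self.items() if v.lastAccess < cutoff]
        for k in stale:
            del self[k]
        return stale


roles = RoleCOList()
roles["old"] = Role(last_access=0)
roles["fresh"] = Role(last_access=10000)
removed = roles.recycle(now=10000)
print(removed, list(roles))
```

Two passes over the data are hard to avoid here, since a dict cannot change size while it is being iterated; the second pass only touches the keys that are actually removed.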

Is there a way to get the return value of a function and test whether it is "nonzero" at the same time?

I have code that looks like this:
if(func_cliche_start(line)):
    a=func_cliche_start(line)
    #... do stuff with 'a' and line here
elif(func_test_start(line)):
    a=func_test_start(line)
    #... do stuff with a and line here
elif(func_macro_start(line)):
    a=func_macro_start(line)
    #... do stuff with a and line here
...
Each of the func_blah_start functions returns either None or a string (based on the input line). I don't like the redundant call to func_blah_start as it seems like a waste (func_blah_start is "pure", so we can assume no side effects). Is there a better idiom for this type of thing, or is there a better way to do it?
Perhaps I'm wrong (my C is rusty), but I thought that you could do something like this in C:
int a;
if(a=myfunc(input)){ /*do something with a and input here*/ }
Is there a Python equivalent?
Why don't you assign the result of func_cliche_start(line) to the variable a before the if statement?
a = func_cliche_start(line)
if a:
    pass # do stuff with 'a' and line here
The body of the if statement is skipped if func_cliche_start(line) returns None.
You can create a wrapper function to make this work.
def assign(value, lst):
    lst[0] = value
    return value
a = [None]
if assign(func_cliche_start(line), a):
    pass  #... do stuff with 'a[0]' and line here
elif assign(func_test_start(line), a):
    pass  #...
You can just loop through your processing functions, which is easier and takes fewer lines :). If you want to do something different in each case, wrap that in a function and call it, e.g.
for func, proc in [(func_cliche_start, cliche_proc), (func_test_start, test_proc), (func_macro_start, macro_proc)]:
    a = func(line)
    if a:
        proc(a, line)
        break
I think you should put those blocks of code in functions. That way you can use a dispatcher-style approach. If you need to modify a lot of local state, use a class and methods. (If not, just use functions; but I'll assume the class case here.) So something like this:
class LineHandler(object):
    def __init__(self, state):
        self.state = state
    def handle_cliche_start(self, line):
        pass  # modify state
    def handle_test_start(self, line):
        pass  # modify state
    def handle_macro_start(self, line):
        pass  # modify state
line_handler = LineHandler(initial_state)
handlers = [line_handler.handle_cliche_start,
            line_handler.handle_test_start,
            line_handler.handle_macro_start]
tests = [func_cliche_start,
         func_test_start,
         func_macro_start]
# list() so the pairs can be reused for every line (zip is lazy in Python 3)
handlers_tests = list(zip(handlers, tests))
for line in lines:
    handler_iter = ((h, t(line)) for h, t in handlers_tests)
    handler_filter = ((h, l) for h, l in handler_iter if l is not None)
    handler, line = next(handler_filter, (None, None))
    if handler:
        handler(line)
This is a bit more complex than your original code, but I think it compartmentalizes things in a much more scalable way. It does require you to maintain separate parallel lists of functions, but the payoff is that you can add as many as you want without having to write long if statements -- or calling your function twice! There are probably more sophisticated ways of organizing the above too -- this is really just a roughed-out example of what you could do. For example, you might be able to create a sorted container full of (priority, test_func, handler_func) tuples and iterate over it.
In any case, I think you should consider refactoring this long list of if/elif clauses.
You could take a list of functions, make it a generator and return the first truthy result:
functions = [func_cliche_start, func_test_start, func_macro_start]
functions_gen = (f(line) for f in functions)
a = next((x for x in functions_gen if x), None)
Still seems a little strange, but much less repetition.
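Since Python 3.8 there is also a direct counterpart to the C idiom from the question: the assignment expression operator := (PEP 572), which binds a name and yields the value in one step. The sketch below requires Python 3.8 or later. The two test functions are hypothetical stand-ins that return the text after a prefix or None, and classify is an illustrative wrapper, not code from the thread.

```python
def func_cliche_start(line):
    # Hypothetical test: return the payload after "cliche:", else None.
    return line[len("cliche:"):] if line.startswith("cliche:") else None


def func_test_start(line):
    # Hypothetical test: return the payload after "test:", else None.
    return line[len("test:"):] if line.startswith("test:") else None


def classify(line):
    # Each test function is called once; its result is bound to `a`
    # and checked for truthiness in the same expression.
    if a := func_cliche_start(line):
        return ("cliche", a)
    elif a := func_test_start(line):
        return ("test", a)
    return (None, None)


print(classify("test:abc"))
```

As with the original if/elif chain, an empty string counts as false here, so a test that can legitimately return "" would need an explicit comparison such as (a := f(line)) is not None.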
