I need to run several functions in a module as follows:
mylist = open('filing2.txt').read()
noTables = remove_tables(mylist)
newPassage = clean_text_passage(noTables)
replacement = replace(newPassage)
ncount = count_words(replacement)
riskcount = risk_count(ncount)
Is there any way that I can run all the functions at once? Should I make all the functions into a big function and run that big function?
Thanks.
You should make a new function in the module which executes the common sequence being used. This will require you to figure out what input arguments are required and what results to return. So given the code you posted, the new function might look something like this -- I just guessed as to what final results you might be interested in. Also note that I opened the file within a with statement to ensure that it gets closed after reading it.
def do_combination(file_name):
    with open(file_name) as input:
        mylist = input.read()
    noTables = remove_tables(mylist)
    newPassage = clean_text_passage(noTables)
    replacement = replace(newPassage)
    ncount = count_words(replacement)
    riskcount = risk_count(ncount)
    return replacement, riskcount
Example of usage:
replacement, riskcount = do_combination('filing2.txt')
If you simply store these lines in a Python (.py) file, you can execute them by running that file.
Or am I missing something here?
Wrapping them in a function also makes them easy to call, though:
def main():
    mylist = open('filing2.txt').read()
    noTables = remove_tables(mylist)
    newPassage = clean_text_passage(noTables)
    replacement = replace(newPassage)
    ncount = count_words(replacement)
    riskcount = risk_count(ncount)
main()
As far as I understand, you need function composition. There is no special function for this in the Python stdlib, but you can do it with the reduce function (in Python 3, import it from functools):
from functools import reduce
funcs = [remove_tables, clean_text_passage, replace, count_words, risk_count]
do_all = lambda args: reduce(lambda prev, f: f(prev), funcs, args)
Usage:
with open('filing2.txt') as f:
    riskcount = do_all(f.read())
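To check that the reduce-based pipeline applies the functions in list order, here is a minimal, self-contained sketch. The three functions strip_text, lower_text and count_words_demo are hypothetical stand-ins, since the module's real functions are not shown in the question.

```python
from functools import reduce

# Hypothetical stand-ins for the module's real processing functions.
def strip_text(text):
    return text.strip()

def lower_text(text):
    return text.lower()

def count_words_demo(text):
    return len(text.split())

funcs = [strip_text, lower_text, count_words_demo]

def do_all(value):
    # Feed the result of each function into the next one, in list order.
    return reduce(lambda prev, f: f(prev), funcs, value)

sample = "  Risk Factors Apply  "
nested = count_words_demo(lower_text(strip_text(sample)))
piped = do_all(sample)
print(piped == nested)  # the pipeline matches the hand-nested calls
```

Writing do_all with def rather than as a lambda assigned to a name is only a style choice; the behavior is the same.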
Here's another approach.
You could write a general function somewhat like the one shown in the First-class composition section of the Wikipedia article on Function composition. Note that, unlike in the article, the functions are applied in the order they are listed in the call to compose().
try:
    from functools import reduce  # Python 3 compatibility
except ImportError:
    pass

def compose(*funcs, **kwargs):
    """Compose a series of functions (...(f3(f2(f1(*args, **kwargs))))) into
    a single composite function which passes the result of each
    function as the argument to the next, from the first to last
    given.
    """
    return reduce(lambda f, g:
                      lambda *args, **kwargs: f(g(*args, **kwargs)),
                  reversed(funcs))
Here's a trivial example illustrating what it does:
f = lambda x: 'f({!r})'.format(x)
g = lambda x: 'g({})'.format(x)
h = lambda x: 'h({})'.format(x)
my_composition = compose(f, g, h)
print(my_composition('X'))
Output:
h(g(f('X')))
Here's how it could be applied to the series of functions in your module:
my_composition = compose(remove_tables, clean_text_passage, replace,
                         count_words, risk_count)
with open('filing2.txt') as input:
    riskcount = my_composition(input.read())
I'm reading about generators at http://www.dabeaz.com/generators/
(which is a very fine, informative article even though it's a slide deck).
It has the following section about creating generators:
Any single-argument function is easy to turn into a generator function
def generate(func):
    def gen_func(s):
        for item in s:
            yield func(item)
    return gen_func
• Example:
gen_sqrt = generate(math.sqrt)
for x in gen_sqrt(range(100)):
    print(x)
I don't see the point of this slide. (It's on page 114 of the slides.)
Isn't it just (math.sqrt(e) for e in range(100))?
What is he accomplishing with the generate function?
The point of such higher-order functions is to allow multiple inputs to a function to be chosen at different times/places:
def filter_lines(f,filt):
    with open(f) as f:
        for l in f:
            print(' '.join(list(filt(map(float,l.split())))))
This can accept any kind of iterable-transformer as filt, like
def ints(it):
    for f in it:
        if f==int(f): yield f
or the result of generate:
filter_lines("…",ints)
filter_lines("…",list) # the identity: print all
filter_lines("…",generate(math.sqrt))
filter_lines("…",generate(abs))
Therefore we can see that generate transforms a function of one element into a function of iterables of elements. (This is what is meant by “turn into a generator function”.) We can go one further:
import functools
filter_lines("…",functools.partial(map,math.sqrt))
from which we can conclude that generate itself is equivalent to functools.partial(functools.partial,map). Applying partial twice like that splits a parameter list in two, changing a normal function into a higher-order function.
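The claimed equivalence can be checked directly. A minimal sketch, assuming Python 3 (where map is lazy); the name generate_via_partial is made up for illustration:

```python
import functools
import math

def generate(func):
    # The helper from the slide: lift a one-element function to iterables.
    def gen_func(s):
        for item in s:
            yield func(item)
    return gen_func

# partial(partial, map)(func) builds partial(map, func), i.e. s -> map(func, s).
generate_via_partial = functools.partial(functools.partial, map)

squares = [0, 1, 4, 9, 16]
from_slide = list(generate(math.sqrt)(squares))
from_partial = list(generate_via_partial(math.sqrt)(squares))
print(from_slide == from_partial)  # both yield the same square roots
```

Both versions produce their results lazily; list() is used here only to compare them.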
Seeking guidance to understand a lambda-map function. In the code below, I see that the file "feedback.txt" is read line by line and stored in a list "feedback". I'm unable to get my head around the variable x. I don't see the variable "x" declared anywhere. Can someone help me understand the statement? Thanks in advance.
f = open('feedback.txt','r')
feedback = list(map(lambda x:x[:-1],f.readlines()))
f.close()
The map function will execute the given function for every element in the list.
In your code, map is given the function lambda x:x[:-1].
You can read that as: for every x in f.readlines(), return everything except the last character of x.
So x will be each line of the file in turn. You can think of lambda x: as def thing(x):.
I replaced the lambda with a standard function:
def read_last(x): #x means a line
    return x[:-1]
f = open('feedback.txt','r')
feedback = list(map(read_last, f.readlines()))
f.close()
Maybe it will help.
A lambda function is a small anonymous function that can take any number of arguments but has only one expression.
lambda arguments : expression
It is anonymous because the expression itself does not bind the function to a name.
For example, f and g below behave the same:
def f(x):
    # take a string and return all but the last character
    return x[:-1]
g = lambda x: x[:-1]
so:
f('hello') == g('hello') #True ->'hell'
But g is not how we would normally use lambda. The whole aim is to avoid assigning ;)
Now map takes a function and applies it to each item of an iterable. In Python 3 it returns a lazy iterator (a map object), so list() is used to turn that iterator into a list:
data = ['we are 101','you are 102','they are 103']
print(list(map(lambda x:x[:-1],data)))
#->['we are 10','you are 10','they are 10']
In principle, same as passing a function:
data = ['we are 101','you are 102','they are 103']
print(list(map(f,data)))
but more compact when the function is only needed once. I love lambdas.
Keep in mind that, while lambda has been explained here, it is not the implementation of choice for your particular example. Suggestion:
f = open('feedback.txt', 'r')
feedback = f.read().splitlines()
f.close()
See also 'Reading a file without newlines'.
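One practical reason to prefer splitlines() over x[:-1]: the slice removes the last character unconditionally, so a final line without a trailing newline loses a real character. A small sketch, using io.StringIO with made-up contents in place of feedback.txt:

```python
import io

# Made-up file contents; note that the last line has no trailing newline.
text = "good\nbad\nok"

# Slicing drops the last character of every line, newline or not.
sliced = [line[:-1] for line in io.StringIO(text).readlines()]

# splitlines() removes only the line terminators.
split = io.StringIO(text).read().splitlines()

print(sliced)  # the final entry has lost its last character
print(split)
```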
I am trying to generate some random expressions in the form f(g(x)). I'd like to be able to replace g with something like sin(x) or x**2 and f with something like cos(x) or log(x). So I'd get something like sin(cos(x)) or log(x**2) (but randomized).
The part of this task I'm having trouble with is replacing both an outer and inner function.
Here's my code:
import sympy
from sympy import abc
x = abc.x
f = sympy.Function('f')(x)
g = sympy.Function('g')(x)
full=f.subs(x, g)
newExpr = sympy.sin(x)
newExpr2 = sympy.cos(x)
print(full)
replaced_inner = full.subs(g, newExpr)
print(replaced_inner)
both = replaced_inner.subs(f, newExpr2)
print(both)
full prints f(g(x)) so that works
replaced_inner prints f(sin(x)) so that works as well
both prints f(sin(x)) when I want it to print cos(sin(x))
I've tried using args[0] and f.func but haven't made progress.
How can I replace both the inner and outer functions (and eventually more complex things like f(g(h(x))))?
I could simply create cos(sin(x)) but I want to do it using variables so I can randomize what function gets replaced.
The problem is a confusion between functions like sympy.Function('f') and expressions like sympy.Function('f')(x). By defining f = sympy.Function('f')(x) you made f the expression f(x). Since the expression f(g(x)) does not have f(x) as a subexpression, the attempted substitution does nothing.
All of this is fixed if you work with actual functions instead of plugging in x prematurely.
f = sympy.Function('f')
g = sympy.Function('g')
full = f(g(x))
newExpr = sympy.sin
newExpr2 = sympy.cos
print(full)
replaced_inner = full.subs(g, newExpr)
print(replaced_inner)
both = replaced_inner.subs(f, newExpr2)
print(both)
This prints
f(g(x))
f(sin(x))
cos(sin(x))
Aside: you may also be interested in the replace method, which supports certain patterns. It is not necessary here, but may be needed for more advanced replacements.
I have code that looks like this:
if(func_cliche_start(line)):
    a=func_cliche_start(line)
    #... do stuff with 'a' and line here
elif(func_test_start(line)):
    a=func_test_start(line)
    #... do stuff with a and line here
elif(func_macro_start(line)):
    a=func_macro_start(line)
    #... do stuff with a and line here
...
Each of the func_blah_start functions returns either None or a string (based on the input line). I don't like the redundant call to func_blah_start, as it seems like a waste (func_blah_start is "pure", so we can assume no side effects). Is there a better idiom for this type of thing, or is there a better way to do it?
Perhaps I'm wrong (my C is rusty), but I thought that you could do something like this in C:
int a;
if(a=myfunc(input)){ /*do something with a and input here*/ }
Is there a Python equivalent?
Why don't you assign the result of func_cliche_start(line) to the variable a before the if statement?
a = func_cliche_start(line)
if a:
    pass # do stuff with 'a' and line here
The body of the if statement is skipped if func_cliche_start(line) returns None.
You can create a wrapper function to make this work.
def assign(value, lst):
    lst[0] = value
    return value
a = [None]
if assign(func_cliche_start(line), a):
    pass #... do stuff with 'a[0]' and line here
elif assign(func_test_start(line), a):
    pass #...
You can just loop through your test functions, which is easier and takes fewer lines :). If you want to do something different in each case, wrap that in a function and call it, e.g.
for func, proc in [(func_cliche_start, cliche_proc), (func_test_start, test_proc), (func_macro_start, macro_proc)]:
    a = func(line)
    if a:
        proc(a, line)
        break
I think you should put those blocks of code in functions. That way you can use a dispatcher-style approach. If you need to modify a lot of local state, use a class and methods. (If not, just use functions; but I'll assume the class case here.) So something like this:
class LineHandler(object):
    def __init__(self, state):
        self.state = state
    def handle_cliche_start(self, line):
        pass  # modify state
    def handle_test_start(self, line):
        pass  # modify state
    def handle_macro_start(self, line):
        pass  # modify state
line_handler = LineHandler(initial_state)
handlers = [line_handler.handle_cliche_start,
            line_handler.handle_test_start,
            line_handler.handle_macro_start]
tests = [func_cliche_start,
         func_test_start,
         func_macro_start]
# list() so the pairs can be reused for every line (zip is a one-shot iterator in Python 3)
handlers_tests = list(zip(handlers, tests))
for line in lines:
    handler_iter = ((h, t(line)) for h, t in handlers_tests)
    handler_filter = ((h, l) for h, l in handler_iter if l is not None)
    handler, line = next(handler_filter, (None, None))
    if handler:
        handler(line)
This is a bit more complex than your original code, but I think it compartmentalizes things in a much more scalable way. It does require you to maintain separate parallel lists of functions, but the payoff is that you can add as many as you want without having to write long if statements -- or calling your function twice! There are probably more sophisticated ways of organizing the above too -- this is really just a roughed-out example of what you could do. For example, you might be able to create a sorted container full of (priority, test_func, handler_func) tuples and iterate over it.
In any case, I think you should consider refactoring this long list of if/elif clauses.
You could take a list of functions, turn it into a generator of results, and return the first truthy one:
functions = [func_cliche_start, func_test_start, func_macro_start]
functions_gen = (f(line) for f in functions)
a = next((x for x in functions_gen if x), None)
Still seems a little strange, but much less repetition.
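On Python 3.8 and later, assignment expressions (the := operator, PEP 572) give a direct equivalent of the C idiom from the question. A minimal sketch; the three func_*_start functions below are hypothetical stand-ins that return a string or None, as described in the question, and classify is a made-up wrapper:

```python
# Hypothetical stand-ins: each returns a string on a match, otherwise None.
def func_cliche_start(line):
    return line[7:] if line.startswith("cliche:") else None

def func_test_start(line):
    return line[5:] if line.startswith("test:") else None

def func_macro_start(line):
    return line[6:] if line.startswith("macro:") else None

def classify(line):
    # Each test function is called at most once; its result is bound to `a`.
    if (a := func_cliche_start(line)):
        return ("cliche", a)
    elif (a := func_test_start(line)):
        return ("test", a)
    elif (a := func_macro_start(line)):
        return ("macro", a)
    return (None, None)

print(classify("test:alpha"))
```

As with the other approaches, an empty-string result counts as falsy here; compare against None explicitly if empty strings are valid matches.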
I have an unknown number of functions in my python script (well, it is known, but not constant) that start with site_...
I was wondering if there's a way to go through all of these functions in some main function that calls for them.
something like:
foreach function_that_has_site_ as coolfunc
    if coolfunc(blabla,yada) == true:
        return coolfunc(blabla,yada)
so it would go through them all until it gets something that's true.
thanks!
The inspect module, already mentioned in other answers, is especially handy because you get to easily filter the names and values of objects you care about. inspect.getmembers takes two arguments: the object whose members you're exploring, and a predicate (a function returning bool) which will accept (return True for) only the objects you care about.
To get "the object that is this module" you need the following well-known idiom:
import sys
this_module = sys.modules[__name__]
In your predicate, you want to select only objects which are functions and have names that start with site_:
import inspect
def function_that_has_site(f):
    return inspect.isfunction(f) and f.__name__.startswith('site_')
With these two items in hand, your loop becomes:
for n, coolfunc in inspect.getmembers(this_module, function_that_has_site):
    result = coolfunc(blabla, yada)
    if result: return result
I have also split the loop body so that each function is called only once (which both saves time and is a safer approach, avoiding possible side effects)... as well as rewording it in Python;-)
Have you tried using the inspect module?
http://docs.python.org/library/inspect.html
The following will return the members of an object, including its methods:
inspect.getmembers
Then you could invoke one with:
methodobjToInvoke = getattr(classObj, methodName)
methodobjToInvoke("arguments")
This method goes through all properties of the current module and executes all functions it finds with a name starting with site_:
import sys
import types
for elm in dir():
    f = getattr(sys.modules[__name__], elm)
    if isinstance(f, types.FunctionType) and f.__name__[:5] == "site_":
        f()
The function-type check is unnecessary if only functions have names starting with site_.
def run():
    for f_name, f in globals().items():
        if not f_name.startswith('site_'):
            continue
        x = f()
        if x:
            return x
It's best to use a decorator to enumerate the functions you care about:
_funcs = []
def enumfunc(func):
    _funcs.append(func)
    return func
@enumfunc
def a():
    print('foo')
@enumfunc
def b():
    print('bar')
@enumfunc
def c():
    print('baz')
if __name__ == '__main__':
    for f in _funcs:
        f()
Try dir(), globals() or locals(). Or inspect module (as mentioned above).
def site_foo():
    pass
def site_bar():
    pass
for name, f in list(globals().items()):  # copy: the loop variables are themselves new globals
    if name.startswith("site_"):
        print(name, f())