Problems with 'lambda' expression in Python

I have the following problem-solving question:
Please write a program using generator to print the even numbers
between 0 and n in comma separated form while n is input by console.
Example: If the following n is given as input to the program: 10 Then,
the output of the program should be: 0,2,4,6,8,10.
And below is my answer:
n=int(input("enter the number of even numbers needed:"))
eve=''
st=(lambda x:(for i in range(0,x))[(str(i)) if i%2==0 else (",")])(n)
However, I have a problem with the third line, the one with the lambda.

Take full advantage of Python 3's features by creating a generator with generator-expression syntax, and let range()'s third parameter handle the even-number stepping.
This would be much briefer:
>>> n = 12
>>>
>>> fn = lambda x: (f"{i}," for i in range(0, x + 1, 2))
>>>
>>> ''.join(list(fn(n)))[:-1] + '.'
'0,2,4,6,8,10,12.'
>>>
>>> fn(10)
<generator object <lambda>.<locals>.<genexpr> at 0x107f67660>
What looks like a tuple comprehension is actually called a "generator expression". Note that in the last line above the interpreter is indicating that the type returned by the lambda is indeed a generator.
Even briefer, you could do it this way:
>>> fn = lambda x: ','.join( (f"{i}" for i in range(0, x + 1, 2)) ) + '.'
>>>
>>> fn(n)
'0,2,4,6,8,10,12.'
>>>
Looks like you might have been on the right track in your question post.
A function that uses the yield keyword also creates a generator. So the other poster is correct in that regard.
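For completeness, since the problem statement asks for a generator, here is a minimal yield-based sketch (the function name and prompt text are my own, not from the question):
def evens_up_to(n):
    """Yield the even numbers from 0 through n inclusive."""
    for i in range(0, n + 1, 2):
        yield i

n = int(input("Enter n: "))
print(",".join(str(i) for i in evens_up_to(n)))  # for n = 10 this prints 0,2,4,6,8,10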


Is there a way to replace a variable with a value in a sympy function without automatically evaluating it?

I'm trying to write a simple script to do some math, and print out some specific intermediate steps in LaTeX. I've found there's no real simple way to do this using sympy, so I've started manually writing functions to print out each step I want to see.
I need to take a SymPy expression and format it so that every variable is replaced by its associated value; I've been accessing the values through a dictionary.
basically,
import sympy as sym
x,a = sym.symbols("x a")
var_Values = {'x':3, 'a':2}
f=(x+1)/a
print(some_Function(f, var_Values))
so that the print statement reads \frac{3+1}{2}.
I've already tried two methods. Using f.subs() to replace the variables with their values prints out 2 in this case, since it evaluates the expression.
I've also tried this textual method:
def some_Function(f, var_Values):
    expression = sym.latex(f)
    for variable in f.free_symbols:
        expression = expression.replace(variable.name, str(var_Values.get(variable.name)))
    return expression
which is even worse, as it returns \fr2c{3+1}{2}, replacing more than I wanted with numbers. It might be possible to get around this by using variable names that don't appear in the LaTeX commands I'm using, but that approach is impossible for me, as I don't choose my variable names.
SymPy is not great at leaving an expression unchanged, because it automatically evaluates and canonicalizes expressions to make future computations faster; subs and replace also try to simplify the expression afterwards.
Here is the best I can think of:
import sympy as sym

x, a = sym.symbols("x a")
var_values = {x: 3, a: 2}
f = (x + 1) / a

def some_function(func: sym.Expr, var_dict: dict) -> sym.Expr:
    new_dict = {key: sym.Symbol(str(val)) for key, val in var_dict.items()}
    result = func.subs(new_dict)
    return result

print(some_function(f, var_values))
This produces (3 + 1)/2. It should work for most cases. Sadly, it does not work in general, since with addition SymPy will sort the terms on its own. That means (x + y + 1) / a with var_values = {x: 3, y: 1, a: 2} produces (1 + 3 + 1)/2, which is not right. I have opened an issue about this.
The reason the second method does not produce valid LaTeX is that expression is a string and has nothing to do with SymPy variables. Running in debug mode or in interactive mode, you'll see that it is "\frac{x + 1}{a}". When you replace "a" with 2, "\frac" becomes "\fr2c". It is best to keep in mind what type each object is, and to use SymPy Symbols instead of strings when replacing variables in an expression.
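To see this concretely, here is a quick REPL illustration of the string-replacement pitfall (same expression as in the question):
>>> import sympy as sym
>>> x, a = sym.symbols("x a")
>>> s = sym.latex((x + 1) / a)
>>> s
'\\frac{x + 1}{a}'
>>> s.replace("a", "2")
'\\fr2c{x + 1}{2}'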
What about
>>> with evaluate(False):
...     u = f.subs(var_Values)
That gives the unevaluated result (1 + 3)/2. If you want the order in which you wrote the terms to be respected, then you have to do two things: 1) create the expression and do the replacement in an unevaluated context, and 2) use a printer that doesn't re-order the args:
>>> with evaluate(False):
... a=(1 + x).subs(var_Values)
... b=(x + 1).subs(var_Values)
>>> a,b
(1 + 3, 1 + 3)
>>> from sympy.printing.str import StrPrinter
>>> p=StrPrinter(dict(order='none'))
>>> p.doprint(a),p.doprint(b)
(1 + 3, 3 + 1)
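For the record, to get the exact LaTeX string the question asked for (\frac{3+1}{2}), the placeholder-Symbol trick from the first answer combines nicely with sym.latex; a quick sketch (the term-reordering caveat from that answer still applies once there are several terms):
>>> import sympy as sym
>>> x, a = sym.symbols("x a")
>>> f = (x + 1) / a
>>> sym.latex(f.subs({x: sym.Symbol("3"), a: sym.Symbol("2")}))
'\\frac{3 + 1}{2}'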

Print inside lambda

I'm currently using syntax like this:
print(*list(map(lambda a: [something to do], input())))
It's working fine if the return type is a string but not for a number.
For example, the script below takes the cube root of the input (note the 1/3 exponent, despite the SQRT name) in two different versions:
print(*list(map(lambda a: int(a)**(1/3), input())))
and
SQRT = lambda a: a**(1/3)
print(SQRT(int(input())))
When I input 9, both return 2.080083823051904 (which is correct), but when I input 10, the first one returns 1.0 0.0 while the second one returns 2.154434690031884.
Is there any way I can print directly from a lambda that returns a number without causing the problem described above?
You are focusing on the wrong issue here. This is not a problem with printing; removing the print() function from the equation will give you the same results.
You are iterating over the individual characters of the input string, producing the cube root of 1 and 0 respectively when entering '10' into the input prompt, or of 9 when you enter '9':
>>> list(map(lambda a: int(a)**(1/3), '10'))
[1.0, 0.0] # [1 ** (1/3), 0 ** (1/3)]
>>> list(map(lambda a: int(a)**(1/3), '9'))
[2.080083823051904] # [9 ** (1/3)]
input() returns a string object, and strings are iterables; a sequence of the individual characters. For '10' iteration gives you '1' and '0':
>>> list('10') # just iteration, pulling out the separate parts
['1', '0']
Your second code snippet applies int() to the whole input() string, so then you get SQRT() of the whole number, 10 or 9:
>>> SQRT(int('10'))
2.154434690031884
>>> SQRT(int('9'))
2.080083823051904
If you wanted the cube root of the input, don't use map():
>>> CBRT = lambda a: a ** (1/3)
>>> CBRT(int('10'))
2.154434690031884
Side note: * works on any iterable, including the iterator object that map() produces, so print(*map(...)) works just as well as print(*list(map(...))), but without creating a list object first that then is discarded again.
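For example, here is the side note in action with the snippet from your question (no intermediate list needed):
>>> print(*map(lambda a: int(a) ** (1/3), '10'))
1.0 0.0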
By using map on the string returned by input(), you are treating it as a sequence of individual characters, hence the cube roots of 1 and 0 when you input '10'.
You should use the second method you posted, since all the lambda, map, and unpacking machinery adds no value to the problem you're solving.
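That said, if you only want the side effect, print() is an ordinary function in Python 3, so it can be the entire body of the lambda; a throwaway sketch (the plain second method above is still the clearer choice):
>>> show_cbrt = lambda a: print(int(a) ** (1/3))  # prints the value, returns None
>>> show_cbrt('10')
2.154434690031884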

Python fluent filter, map, etc

I love Python. However, one thing that bugs me a bit is that I don't know how to express functional operations in a fluent manner like I can in JavaScript.
Example (made up on the spot): can you help me convert this to Python in a fluent-looking manner?
var even_set = [1,2,3,4,5]
  .filter(function(x){ return x % 2 === 0; })
  .map(function(x){
    console.log(x); // prints it for fun
    return x;
  })
  .reduce(function(num_set, val) {
    num_set[val] = true;
    return num_set;
  }, {});
I'd like to know if there are fluent options. Maybe a library?
In general, I've been using list comprehensions for most things, but it's a real problem if I want to print. For example, how can I print every even number between 1 and 5 in Python 2.x using a list comprehension? (Python 3 has print() as a function, but Python 2 doesn't.) It's also a bit annoying that a list is constructed and returned; I'd rather just use a for loop.
Update: Here's yet another library/option, one that I adapted from a gist and that is available on PyPI as infixpy:
from infixpy import *

a = (Seq(range(1, 51))
     .map(lambda x: x * 4)
     .filter(lambda x: x <= 170)
     .filter(lambda x: len(str(x)) == 2)
     .filter(lambda x: x % 20 == 0)
     .enumerate()
     .map(lambda x: 'Result[%d]=%s' % (x[0], x[1]))
     .mkstring(' .. '))
print(a)
pip3 install infixpy
Older
I am looking now at an answer that strikes closer to the heart of the question:
fluentpy https://pypi.org/project/fluentpy/ :
Here is the kind of method chaining for collections that a streams programmer (in scala, java, others) will appreciate:
import fluentpy as _

(
    _(range(1, 50 + 1))
    .map(_.each * 4)
    .filter(_.each <= 170)
    .filter(lambda each: len(str(each)) == 2)
    .filter(lambda each: each % 20 == 0)
    .enumerate()
    .map(lambda each: 'Result[%d]=%s' % (each[0], each[1]))
    .join(',')
    .print()
)
And it works fine:
Result[0]=20,Result[1]=40,Result[2]=60,Result[3]=80
I am just now trying this out. It will be a very good day if this works as shown above.
Update: Look at this: maybe Python can start to be more reasonable for one-line shell scripts:
python3 -m fluentpy "lib.sys.stdin.readlines().map(str.lower).map(print)"
Here is it in action on command line:
$ echo -e "Hello World line1\nLine 2\nLine 3\nGoodbye" \
  | python3 -m fluentpy "lib.sys.stdin.readlines().map(str.lower).map(print)"
hello world line1
line 2
line 3
goodbye
There is an extra newline that should be cleaned up - but the gist of it is useful (to me anyways).
Generators, iterators, and itertools give added powers to chaining and filtering actions. But rather than remember (or look up) rarely used things, I gravitate toward helper functions and comprehensions.
For example in this case, take care of the logging with a helper function:
def echo(x):
    print(x)
    return x
Selecting even values is easy with the if clause of a comprehension. And since the final output is a dictionary, use that kind of comprehension (here s = [1, 2, 3, 4, 5], as in your example):
In [118]: d={echo(x):True for x in s if x%2==0}
2
4
In [119]: d
Out[119]: {2: True, 4: True}
or to add these values to an existing dictionary, use update.
new_set.update({echo(x):True for x in s if x%2==0})
Another way to write this is with an intermediate generator:
{y:True for y in (echo(x) for x in s if x%2==0)}
Or combine the echo and the filter in one generator:
def even(s):
    for x in s:
        if x%2==0:
            print(x)
            yield(x)
followed by a dict comp using it:
{y:True for y in even(s)}
Comprehensions are the fluent python way of handling filter/map operations.
Your code would be something like:
def evenize(input_list):
    return [x for x in input_list if x % 2 == 0]
Comprehensions don't work well with side effects like console logging, so do that in a separate loop. Chaining function calls isn't really that common an idiom in python. Don't expect that to be your bread and butter here. Python libraries tend to follow the "alter state or return a value, but not both" pattern. Some exceptions exist.
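Here is a minimal sketch of that "side effects in a separate loop" advice, applied to your example data (the variable names are my own):
numbers = [1, 2, 3, 4, 5]
evens = [x for x in numbers if x % 2 == 0]   # the filter step as a comprehension
for x in evens:                              # console logging kept in its own loop
    print(x)
even_set = {x: True for x in evens}          # the reduce step as a dict comprehension
# even_set == {2: True, 4: True}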
Edit: On the plus side, python provides several flavors of comprehensions, which are awesome:
List comprehension: [x for x in range(3)] == [0, 1, 2]
Set comprehension: {x for x in range(3)} == {0, 1, 2}
Dict comprehension: {x: x**2 for x in range(3)} == {0: 0, 1: 1, 2: 4}
Generator comprehension (or generator expression): (x for x in range(3)) == <generator object <genexpr> at 0x10fc7dfa0>
With the generator comprehension, nothing has been evaluated yet, so it is a great way to prevent blowing up memory usage when pipelining operations on large collections.
For instance, if you try to do the following, even with python3 semantics for range:
for number in [x**2 for x in range(10000000000000000)]:
    print(number)
you will get a memory error trying to build the initial list. On the other hand, change the list comprehension into a generator comprehension:
for number in (x**2 for x in range(10**20)):
    print(number)
and there is no memory issue (it just takes forever to run). What happens is that the range object gets built (it only stores the start, stop, and step values: 0, 10**20, and 1), and then the for-loop begins iterating over the genexp object. Effectively, the for-loop calls
GENEXP_ITERATOR = iter(genexp)
number = next(GENEXP_ITERATOR)
# run the loop one time
number = next(GENEXP_ITERATOR)
# run the loop one time
# etc.
(Note the GENEXP_ITERATOR object is not visible at the code level)
next(GENEXP_ITERATOR) tries to pull the first value out of genexp, which then starts iterating on the range object, pulls out one value, squares it, and yields out the value as the first number. The next time the for-loop calls next(GENEXP_ITERATOR), the generator expression pulls out the second value from the range object, squares it, and yields it out for the second pass of the for-loop. Earlier numbers are no longer held in memory.
This means that no matter how many items in the generator comprehension, the memory usage remains constant. You can pass the generator expression to other generator expressions, and create long pipelines that never consume large amounts of memory.
import os
from pathlib import Path

def pipeline(filenames):
    basepath = Path('/usr/share/stories')
    fullpaths = (basepath / fn for fn in filenames)
    realfiles = (fn for fn in fullpaths if os.path.exists(fn))
    openfiles = (open(fn) for fn in realfiles)

    def read_and_close(file):
        output = file.read(100)
        file.close()
        return output

    prefixes = (read_and_close(file) for file in openfiles)
    noncliches = (prefix for prefix in prefixes
                  if not prefix.startswith('It was a dark and stormy night'))
    return {prefix[:32]: prefix for prefix in noncliches}
At any time, if you need a data structure for something, you can pass the generator comprehension to another comprehension type (as in the last line of this example), at which point, it will force the generators to evaluate all the data they have left, but unless you do that, the memory consumption will be limited to what happens in a single pass over the generators.
The biggest dealbreaker to the code you wrote is that Python doesn't support multiline anonymous functions. In Python 2 the return value of filter or map is a list (in Python 3 they return lazy iterators), so you can continue to chain them if you so desire. However, you'll either have to define the functions ahead of time, or use a lambda.
Arguments against doing this notwithstanding, here is a translation into Python of your JS code.
from __future__ import print_function
from functools import reduce

def print_and_return(x):
    print(x)
    return x

def is_even(x):
    return x % 2 == 0

def add_to_dict(d, x):
    d[x] = True
    return d

even_set = list(reduce(add_to_dict,
                       map(print_and_return,
                           filter(is_even, [1, 2, 3, 4, 5])), {}))
It should work on both Python 2 and Python 3.
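For reference, here is roughly what a REPL run of that translation looks like (my own sketch, using the definitions above):
>>> even_set = list(reduce(add_to_dict,
...                        map(print_and_return,
...                            filter(is_even, [1, 2, 3, 4, 5])), {}))
2
4
>>> even_set
[2, 4]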
There's a library that already does exactly what you are looking for, i.e. the fluent syntax and lazy evaluation, with the order of operations matching how it's written, as well as many other goodies like multiprocess or multithreaded Map/Reduce.
It's named pyxtension, and it's production-ready and maintained on PyPI.
Your code would be rewritten in this form:
from pyxtension.streams import stream

def console_log(x):
    print(x)
    return x

even_set = stream([1, 2, 3, 4, 5])\
    .filter(lambda x: x % 2 == 0)\
    .map(console_log)\
    .reduce(lambda num_set, val: num_set.__setitem__(val, True))
Replace map with mpmap for multiprocessed map, or fastmap for multithreaded map.
We can use Pyterator for this (disclaimer: I am the author).
We define the function that prints and returns (which I believe you can omit completely however).
def print_and_return(x):
    print(x)
    return x
then
from pyterator import iterate
even_dict = (
    iterate([1, 2, 3, 4, 5])
    .filter(lambda x: x % 2 == 0)
    .map(print_and_return)
    .map(lambda x: (x, True))
    .to_dict()
)
# {2: True, 4: True}
where I have converted your reduce into a sequence of tuples that can be converted into a dictionary.

Init method; len() of self object

class Sequence:
    def __init__(self,emps=str(""),l=[">"]):
        self.str=emps
        self.bl=l
    def fromFile(self,seqfile):
        opf=open(seqfile,'r')
        s=opf.read()
        opf.close()
        lisst=s.split(">")
        if s[0]==">":
            lisst.pop(0)
        nlist=[]
        for x in lisst:
            splitenter=x.split('\n')
            splitenter.pop(0)
            splitenter.pop()
            splitstring="".join(splitenter)
            nlist.append(splitstring)
        nstr=">".join(nlist)
        nstr=nstr.split()
        nstr="".join(nstr)
        for i in nstr:
            self.bl.append(i)
        self.str=nstr
        return nstr
    def getSequence(self):
        print self.str
        print self.bl
        return self.str
    def GpCratio(self):
        pgenes=[]
        nGC=[]
        for x in range(len(self.lb)):
            if x==">":
                pgenes.append(x)
        for i in range(len(pgenes)):
            if i!=len(pgenes)-1:
                c=krebscyclus[pgenes[i]:pgenes[i+1]].count('c')+0.000
                g=krebscyclus[pgenes[i]:pgenes[i+1]].count('g')+0.000
                ratio=(c+g)/(len(range(pgenes[i]+1,pgenes[i+1])))
                nGC.append(ratio)
        return nGC

s = Sequence()
s.fromFile('D:\Documents\Bioinformatics\sequenceB.txt')
print 'Sequence:\n', s.getSequence(), '\n'
print "G+C ratio:\n", s.GpCratio(), '\n'
I don't understand why it gives the error:
in GpCratio, for x in range(len(self.lb)): AttributeError: Sequence instance has no attribute 'lb'.
When I print the list in getSequence it prints the correctly sequenced DNA list, but I can not use the list for searching for nucleotides. My university only allows me to pass in one file and not to use any arguments in the method definitions other than "self".
By the way, it is a class called Sequence, but the site refused to let me post that line.
Looks like a typo. You define self.bl in your __init__() routine, then try to access self.lb.
(Also, emps=str("") is redundant - emps="" works just as well.)
But even if you correct that typo, the loop won't work:
for x in range(len(self.bl)): # This iterates over a list like [0, 1, 2, 3, ...]
    if x==">":                # This condition will never be True
        pgenes.append(x)
You probably need to do something like
pgenes=[]
for x in self.bl:
    if x==">": # Shouldn't this be != ?
        pgenes.append(x)
which can also be written as a list comprehension:
pgenes = [x for x in self.bl if x==">"]
In Python, you hardly ever need len(x) or for n in range(...); instead, you iterate directly over the sequence or iterable.
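Since pgenes is later used as slice indices, what GpCratio probably wants is the positions of the ">" separators rather than the characters themselves; enumerate() gives you indices while still iterating directly. A small sketch with made-up data (inside the method it would be self.bl):
>>> bl = ['>', 'A', 'T', '>', 'G', 'C']
>>> [i for i, ch in enumerate(bl) if ch == ">"]
[0, 3]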
Since your program is incomplete and lacking sample data, I can't run it here to find all its other deficiencies. Perhaps the following can point you in the right direction. Assuming a string that contains the characters ATCG and >:
>>> gene = ">ATGAATCCGGTAATTGGCATACTGTAG>ATGATAGGAGGCTAG"
>>> pgene = ''.join(x for x in gene if x!=">")
>>> pgene
'ATGAATCCGGTAATTGGCATACTGTAGATGATAGGAGGCTAG'
>>> ratio = float(pgene.count("G") + pgene.count("C")) / (pgene.count("A") + pgene.count("T"))
>>> ratio
0.75
If, however, you don't want to look at the entire string but at separate genes (where > is the separator), use something like this:
>>> gene = ">ATGAATCCGGTAATTGGCATACTGTAG>ATGATAGGAGGCTAG"
>>> genes = [g for g in gene.split(">") if g !=""]
>>> genes
['ATGAATCCGGTAATTGGCATACTGTAG', 'ATGATAGGAGGCTAG']
>>> nGC = [float(g.count("G")+g.count("C"))/(g.count("A")+g.count("T")) for g in genes]
>>> nGC
[0.6875, 0.875]
However, if you want to calculate GC content, then of course you don't want (G+C)/(A+T) but (G+C)/(A+T+G+C) --> nGC = [float(g.count("G")+g.count("C"))/len(g) for g in genes].

Using math.factorial in a lambda function with reduce()

I'm attempting to write a function that calculates the number of unique permutations of a string. For example aaa would return 1 and abc would return 6.
I'm writing the method like this:
(Pseudocode:)
len(string)! / (A!*B!*C!*...)
where A,B,C are the number of occurrences of each unique character. For example, the string 'aaa' would be 3! / 3! = 1, while 'abc' would be 3! / (1! * 1! * 1!) = 6.
My code so far is like this:
def permutations(n):
    '''
    returns the number of UNIQUE permutations of n
    '''
    from math import factorial
    lst = []
    n = str(n)
    for l in set(n):
        lst.append(n.count(l))
    return factorial(len(n)) / reduce(lambda x,y: factorial(x) * factorial(y), lst)
Everything works fine, except when I try to pass a string that has only one unique character, i.e. aaa - I get the wrong answer:
>>> perm('abc')
6
>>> perm('aaa')
2
>>> perm('aaaa')
6
Now, I can tell the problem is in running the lambda function with factorials on a list of length 1. I don't know why, though. Most other lambda functions work on a list of length 1 even if they expect two elements:
>>> reduce(lambda x,y: x * y, [3])
3
>>> reduce(lambda x,y: x + y, [3])
3
This one doesn't:
>>> reduce(lambda x,y: ord(x) + ord(y), ['a'])
'a'
>>> reduce(lambda x,y: ord(x) + ord(y), ['a','b'])
195
Is there something I should be doing differently? I know I can rewrite the function in many different ways that will circumvent this, (e.g. not using lambda), but I'm looking for why this specifically doesn't work.
See the documentation for reduce(): there is an optional 'initializer' argument that is placed before all other elements in the list, so that the behavior for one-element lists is consistent. For example, for your ord() lambda you could set the initializer to the character with an ord() of 0:
>>> reduce(lambda x, y: ord(x) + ord(y), ['a'], chr(0))
97
Python's reduce() can't guess what the default (initial) value should be, but it does take an optional third argument for exactly that. Supply a sensible initial value and your reduce will work beautifully.
Also, as noted in the comments, you should apply factorial only to the second argument in your lambda:
reduce(lambda x,y: x * factorial(y), lst, 1)
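Putting that together, here is a rough sketch of the whole function with the initializer fix (the name unique_permutations and the use of // integer division are my own choices, not from the original post):
from math import factorial
from functools import reduce  # reduce is a builtin on Python 2

def unique_permutations(s):
    """Count the distinct permutations of the characters of s."""
    s = str(s)
    counts = [s.count(c) for c in set(s)]
    # start the fold at 1 so a single-element counts list still gets factorial() applied
    return factorial(len(s)) // reduce(lambda acc, n: acc * factorial(n), counts, 1)

print(unique_permutations('aaa'))  # 1
print(unique_permutations('abc'))  # 6
print(unique_permutations('aab'))  # 3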
If you want len(s)! / (A!*B!*C!), then this use of reduce() won't work, as it will calculate factorial(factorial(A)*factorial(B))*factorial(C): at each step the lambda takes the factorial of the running result again, as if it were a fresh element.
Instead, you'll need to generate the list of factorials, then multiply them together:
import operator
reduce(operator.mul, [factorial(x) for x in lst])
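A quick sanity check on the 'aaa' case that tripped up the original code (my own REPL sketch; the counts list there is [3], and I use // to keep the result an integer):
>>> import operator
>>> from math import factorial
>>> from functools import reduce   # builtin on Python 2
>>> reduce(operator.mul, [factorial(x) for x in [3]])
6
>>> factorial(3) // reduce(operator.mul, [factorial(x) for x in [3]])
1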
reduce() works by first computing the result for the first two elements in the sequence and then folding in the remaining elements one at a time. A list of size 1 is a special case: the single element is returned as-is, without your function ever being called.
I would use a product over a comprehension here (math.prod exists in Python 3.8+; on older versions use a reduce or numpy.prod):
from math import prod, factorial
prod([factorial(val) for val in lst])
Good luck!
