I have a sample dataset df like:
tel_no
tel:+1-860-752-8792
tel:+1-949-722-8838
The goal is to get the output as:
tel_no
18607528792
19497228838
Here is my attempt;
df['tel_no'].apply(lambda x: x.replace(i, '') for i in ['+','-','tel:'])
But this gives an error message;
TypeError: 'generator' object is not callable
I am aware that it can be done in 3 separate lines, one for each character. But I was wondering whether we can do it in one line as above. Help is appreciated.
An easier way to go would be to use pandas str methods, namely findall (to find all digits using the regex \d+) and join (to join the resulting list of digit substrings together):
>>> df.tel_no.str.findall(r"\d+").str.join("")
0 18607528792
1 19497228838
Name: tel_no, dtype: object
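The same digit extraction can be tried on plain strings with the standard re module, which is roughly what the pandas str methods do per element. This is only a sketch; the helper name digits_only is made up for illustration:

```python
import re

def digits_only(text):
    """Return only the digit characters of text, in order."""
    # findall collects each run of digits; join glues the runs together
    return "".join(re.findall(r"\d+", text))

samples = ["tel:+1-860-752-8792", "tel:+1-949-722-8838"]
cleaned = [digits_only(s) for s in samples]
print(cleaned)

# Equivalent formulation: delete every non-digit character instead
stripped = [re.sub(r"\D", "", s) for s in samples]
print(stripped == cleaned)
```

On a reasonably recent pandas, the second form should correspond to df['tel_no'].str.replace(r'\D', '', regex=True).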
I agree that using regex matching is a good solution to your problem, but I can at least address the problem with your code.
Your current code is:
df['tel_no'].apply(lambda x: x.replace(i, '') for i in ['+','-','tel:'])
Python parses this (perhaps surprisingly) as:
df['tel_no'].apply(
    (
        (lambda x: x.replace(i, ''))
        for i in ['+','-','tel:']
    )
)
That is, you have written a generator expression that creates a new anonymous function at each iteration of the loop. You have not created a single anonymous function with a generator expression inside it!
apply tries to call the object it was given, and generators are not callable, which is what caused the error.
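The parse can be confirmed without pandas. A small sketch, where funcs and error_message are just illustrative names:

```python
# A generator expression that yields one lambda per bad substring
funcs = (lambda x: x.replace(i, '') for i in ['+', '-', 'tel:'])

# The generator object itself is not a function
print(callable(funcs))

try:
    funcs('tel:+1-860-752-8792')
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

Calling funcs raises the same kind of TypeError that apply reported.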
Your attempt reflects two additional misunderstandings:
Comprehension syntax cannot be used outside of an actual comprehension. Perhaps you meant to write lambda x: (x.replace(i, '') for i in ['+','-','tel:']), which would at least be one function that contains a generator expression.
String functions like str.replace do not modify the string. They return a new string. See the example below.
s1 = 'hello'
s2 = s1.replace('e', 'f')
# s1 will be unchanged
assert s1 == 'hello'
# s2 will be changed
assert s2 == 'hfllo'
To write this as a function, you would need to use def, not lambda:
def clean_tel(x):
    for bad_string in ['+', '-', 'tel:']:
        x = x.replace(bad_string, '')
    return x

df['tel_no'].apply(clean_tel)
Or you can omit the loop and write it like this:
df['tel_no'].apply(
    lambda x: x.replace('+', '').replace('-', '').replace('tel:', '')
)
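If you really want the loop inside a single lambda, functools.reduce is one way to chain the replacements. This is a sketch rather than a recommendation; BAD_STRINGS and strip_bad are names I made up:

```python
from functools import reduce

BAD_STRINGS = ['+', '-', 'tel:']

# reduce feeds the result of each replace into the next one,
# so a single expression applies all three replacements in turn
strip_bad = lambda x: reduce(lambda s, bad: s.replace(bad, ''), BAD_STRINGS, x)

result = [strip_bad(t) for t in ['tel:+1-860-752-8792', 'tel:+1-949-722-8838']]
print(result)
```

df['tel_no'].apply(strip_bad) should then behave like the def version, although the def version is easier to read.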
I'm working on getting a better grasp of Python 3 fundamentals, specifically objects and modifying them in the context of a list (for now).
I created a simple class called MyThing() that just has a number, letter, and instance method for incrementing the number. My goal with this program was to create a list of 3 "MyThings", and manipulate the list in various ways. To start, I iterated through the list (obj_list_1) and incremented each number using each object's instance method. Easy enough.
What I'm trying to figure out how to do is perform the same operation in one line using the map function and lambda expressions (obj_list_2).
#!/usr/bin/env py
import copy

class MyThing:
    def __init__(self, letter='A', number=0):
        self.number = number
        self.letter = letter

    def __repr__(self) -> str:
        return("(letter={}, number={})".format(self.letter, self.number))

    def incr_number(self, incr=0):
        self.number += incr

# Test program to try different ways of manipulating lists
def main():
    obj1 = MyThing('A', 1)
    obj2 = MyThing('B', 2)
    obj3 = MyThing('C', 3)
    obj_list_1 = [obj1, obj2, obj3]
    obj_list_2 = copy.deepcopy(obj_list_1)

    # Show the original list
    print("Original List: {}".format(obj_list_1))
    # output: [(letter=A, number=1), (letter=B, number=2), (letter=C, number=3)]

    # Standard iterating over a list and incrementing each object's number.
    for obj in obj_list_1:
        obj.incr_number(1)
    print("For loop over List, adding one to each number:\n{}".format(obj_list_1))
    # output: [(letter=A, number=2), (letter=B, number=3), (letter=C, number=4)]

    # Try using map function with lambda
    obj_list_2 = list(map(lambda x: x.incr_number(1), obj_list_2))
    print("Using maps with incr_number instance method:\n{}".format(obj_list_2))
    # actual output: [None, None, None] <--- If I don't re-assign obj_list_2...it shows the proper sequence
    # expected output: [(letter=A, number=2), (letter=B, number=3), (letter=C, number=4)]

if __name__ == "__main__":
    main()
What I can't figure out is how to get map() to return the correct type, a list of "MyThing"s.
I understand that between Python 2 and Python 3, map changed to return an iterable instead of a list, so I made sure to cast the output. What I get is a list of 'None' objects.
What I noticed, though, is that if I don't re-assign obj_list_2, and instead just call list(map(lambda x: x.incr_number(1), obj_list_2)), then print obj_list_2 in the next line, the numbers get updated as I expect.
However, if I don't cast the map iterable and just do map(lambda x: x.incr_number(1), obj_list_2), the following print statement shows the list as having not been updated. I read in some documentation that the map function is lazy and doesn't operate until it's used by something...so this makes sense.
Is there a way that I can get the output of list(map(lambda x: x.incr_number(1), obj_list_2)) to actually return my list of objects?
Are there any other cool one-liner solutions for updating a list of objects with their instance methods that I'm not thinking of?
TL;DR: Just use the for-loop. There's no advantage to using a map in this case.
Firstly:
You're getting a list of Nones because the mapped function returns None. That is, MyThing.incr_number() doesn't return anything, so it returns None implicitly.
Fewer lines is not necessarily better. Two simple lines are often easier to read than one complex line.
Notice that you're not creating a new list in the for-loop, you're only modifying the elements of the existing list.
list(map(lambda)) is longer and harder to read than a list comprehension:
[x.incr_number(1) for x in obj_list_2]
vs
list(map(lambda x: x.incr_number(1), obj_list_2))
Now, take a look at Is it Pythonic to use list comprehensions for just side effects? The top answer says no, it creates a list that never gets used. So there's your answer: just use the for-loop instead.
This is because your incr_number doesn't return anything. Change it to:
def incr_number(self, incr=0):
    self.number += incr
    return self
The loop is clearly better, but here's another way anyway. Your incr_number doesn't return anything, or rather returns the default None, which is a false value. So if you simply append or x, you get the modified object instead of None.
Change
list(map(lambda x: x.incr_number(1), obj_list_2))
to this:
list(map(lambda x: x.incr_number(1) or x, obj_list_2))
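Both behaviours can be seen side by side in a small sketch. Counter here is a hypothetical stand-in for MyThing, not the class from the question:

```python
class Counter:
    """Hypothetical stand-in for MyThing with a mutating method."""
    def __init__(self, number=0):
        self.number = number

    def incr_number(self, incr=0):
        self.number += incr  # mutates in place, returns None implicitly

things = [Counter(1), Counter(2), Counter(3)]

# map collects the return values of the lambda, i.e. None each time
returned = list(map(lambda x: x.incr_number(1), things))

# "None or x" evaluates to x, so the objects themselves are collected
same_objects = list(map(lambda x: x.incr_number(1) or x, things))

# Each object has now been incremented twice, once per map call
numbers = [t.number for t in things]
print(returned, numbers)
```

Note that the second list holds the very same objects as things, so the original list is still mutated; no copies are made.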
Seeking guidance to understand a lambda-map function. In the code below, I see that the file "feedback" is read line by line and stored in a list "feedback". I'm unable to get my head around the variable x. I don't see the variable "x" declared anywhere. Can someone help me understand the statement? Thanks in advance.
f = open('feedback.txt','r')
feedback = list(map(lambda x:x[:-1],f.readlines()))
f.close()
The map function will execute the given function for every element in the list.
In your code the map function will get lambda x:x[:-1].
You can read that like: for every x in f.readlines() return everything except the last element of x.
So x will be every line of the file. You could read lambda x: as def thing(x):.
I replaced lambda with a standard func:
def read_last(x):  # x means a line
    return x[:-1]  # everything except the last character

f = open('feedback.txt','r')
feedback = list(map(read_last, f.readlines()))
f.close()
Maybe it will help.
A lambda function is a simple anonymous function that takes any number of arguments, but has only one expression.
lambda arguments : expression
It is anonymous because we have not assigned it to an object, and thus it has no name.
For example, f and g below are roughly equivalent:
def f(x):
    # take a string and return all but the last character
    return x[:-1]

g = lambda x: x[:-1]
so:
f('hello') == g('hello') #True ->'hell'
But g is not how we would use lambda. The whole aim is to avoid assigning ;)
Now map takes a function and applies it to an iterable. In Python 3 it returns a lazy iterator (a map object), so list() is used to convert that iterator to a list:
data = ['we are 101','you are 102','they are 103']
print(list(map(lambda x:x[:-1],data)))
#->['we are 10','you are 10','they are 10']
In principle, same as passing a function:
data = ['we are 101','you are 102','they are 103']
print(list(map(f,data)))
but often more concise. I love lambdas
Keep in mind, while explaining lambda is solved here, it is not the implementation of choice for your particular example. Suggestion:
f = open('feedback.txt', 'r')
feedback = f.read().splitlines()
f.close()
See also 'Reading a file without newlines'.
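One practical difference between the two approaches shows up when the last line has no trailing newline. A small sketch, using io.StringIO as a stand-in for the opened file and made-up contents:

```python
import io

# io.StringIO stands in for the opened file; the last line has no "\n"
fake_file = io.StringIO("good\nbad\nugly")

# x[:-1] blindly drops the last character of every line
sliced = list(map(lambda x: x[:-1], fake_file.readlines()))

# splitlines only removes actual line endings
fake_file.seek(0)
split = fake_file.read().splitlines()

print(sliced)
print(split)
```

The slicing version chops a real character off the final line, while splitlines leaves it intact.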
I would like to output a user input expression to a string.
The reason is that the input expression is user defined. I want to output the result of the expression, and print the statement which lead to this result.
import sys
import shutil
expression1 = sys.path
expression2 = shutil.which
def get_expression_str(expression):
    if callable(expression):
        return expression.__module__ +'.'+ expression.__name__
    else:
        raise TypeError('Could not convert expression to string')
#print(get_expression_str(expression1))
# returns : builtins.TypeError: Could not convert expression to string
#print(get_expression_str(expression2))
# returns : shutil.which
#print(str(expression1))
#results in a list like ['/home/bernard/clones/it-should-work/unit_test', ... ,'/usr/lib/python3/dist-packages']
#print(repr(expression1))
#results in a list like ['/home/bernard/clones/it-should-work/unit_test', ... ,'/usr/lib/python3/dist-packages']
I looked into the Python inspect module but even
inspect.iscode(sys.path)
returns False
For those who wonder why it is the reverse of a string parsed to an expression using functools.partial see parse statement string
Background.
A program should work. It should, but it does not always, because a program needs specific resources: OS, OS version, other packages, files, etc. Every program has different requirements (resources) to function properly.
Which specific requirements are needed cannot be predicted. The system knows best which resources are and are not available. So instead of manually checking all settings and configurations, let a helper program do this for you.
So the user, or the developer of a program, specifies his requirements together with statements on how to retrieve this information: expressions, which could be executed using eval. Could. As mentioned on StackOverflow, eval is evil.
Use of eval is hard to make secure using a blacklist, see : http://nedbatchelder.com/blog/201206/eval_really_is_dangerous.html
Using multiple tips from SO, I use a namedtuple with a string, to compare with the user input string, and a function.
A white-list is better than a blacklist. Only if the parsed expression string matches a "bare_expression" is an expression returned.
This white-list contains more information on how to process, e.g., the "unit_of_measurement". It goes too far to explain what and why, but this is needed. The list of namedtuples is much more than just a white-list and is defined as:
Expr_UOfM = collections.namedtuple('Expr_UOfM', ['bare_expression', 'keylist', 'function', 'unit_of_measurement', 'attrlist'])
The namedtuples that make up a (very limited) list:
Exp_list = [Expr_UOfM('sys.path', '' , sys.path, un.STR, []),
Expr_UOfM('shutil.which', '', shutil.which, None, [])]
This list may be very long and the content is crucial for further correct processing. Note that the first and third fields are very similar. There should be a single point of reference, but for me this is not possible at the moment. Note that the string 'sys.path' is equal to (a part of) the user input, and the expression sys.path is part of the namedtuple list. A good separation, limiting possible abuse.
If the string and the expression are not 100% identical weird behavior may occur which is very hard to debug.
So I want to use the get_expression_str function to check whether the first and third fields are identical, just for total robustness of the program.
I use Python 3.4
You could use inspect.getsource() and wrap your expression in a lambda. Then you can get an expression with this function:
import inspect

def lambda_to_expr_str(lambda_fn):
    """c.f. https://stackoverflow.com/a/52615415/134077"""
    if not lambda_fn.__name__ == "<lambda>":
        raise ValueError('Tried to convert non-lambda expression to string')
    else:
        lambda_str = inspect.getsource(lambda_fn).strip()
        expression_start = lambda_str.index(':') + 1
        expression_str = lambda_str[expression_start:].strip()
        if expression_str.endswith(')') and '(' not in expression_str:
            # i.e. l = lambda_to_expr_str(lambda x: x + 1) => x + 1)
            expression_str = expression_str[:-1]
        return expression_str
Usage:
$ lambda_to_expr_str(lambda: sys.executable)
> 'sys.executable'
OR
$ f = lambda: sys.executable
$ lambda_to_expr_str(f)
> 'sys.executable'
And then eval with
$ eval(lambda_to_expr_str(lambda: sys.executable))
> '/usr/bin/python3.5'
Note that you can take parameters with this approach and pass them with the locals param of eval.
$ l = lambda_to_expr_str(lambda x: x + 1) # now l == 'x + 1'
$ eval(l, None, {'x': 1})
> 2
Here be Dragons. There are many pathological cases with this approach:
$ l, z = lambda_to_expr_str(lambda x: x + 1), 1234
$ l
> 'x + 1), 1234'
This is because inspect.getsource gets the entire line of code the lambda was declared on. Getting the source of functions declared with def would avoid this problem; however, passing a function body to eval is not possible as there could be side effects, i.e. setting variables, etc... Lambdas can produce side effects as well in Python 2, so even more dragons lie in pre-Python-3 land.
Why not use eval?
>>> exp1 = "sys.path"
>>> exp2 = "[x*x for x in [1,2,3]]"
>>> eval(exp1)
['', 'C:\\Python27\\lib\\site-packages\\setuptools-0.6c11-py2.7.egg', 'C:\\Pytho
n27\\lib\\site-packages\\pip-1.1-py2.7.egg', 'C:\\Python27\\lib\\site-packages\\
django_celery-3.1.1-py2.7.egg', 'C:\\Python27\\lib\\site-packages\\south-0.8.4-p
y2.7.egg', 'C:\\Windows\\system32\\python27.zip', 'C:\\Python27\\DLLs', 'C:\\Pyt
hon27\\lib', 'C:\\Python27\\lib\\plat-win', 'C:\\Python27\\lib\\lib-tk', 'C:\\Py
thon27', 'C:\\Python27\\lib\\site-packages', 'C:\\Python27\\lib\\site-packages\\
PIL']
>>> eval(exp2)
[1, 4, 9]
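Since the question explicitly wants to avoid eval, here is an eval-free sketch for the narrower task of checking that a white-list string and its object agree. It is not taken from any of the answers; resolve_dotted and pairs are hypothetical names, and it assumes every bare_expression has the simple form 'module.attribute':

```python
import importlib
import shutil
import sys

def resolve_dotted(name):
    """Resolve 'module.attr' to the object it names, without eval."""
    module_name, _, attr = name.rpartition('.')
    module = importlib.import_module(module_name)
    return getattr(module, attr)

# (bare_expression, object) pairs as in the question's white-list
pairs = [('sys.path', sys.path), ('shutil.which', shutil.which)]

# Collect every entry whose string does not resolve to the stored object
mismatches = [name for name, obj in pairs if resolve_dotted(name) is not obj]
print(mismatches)
```

An empty mismatches list means the first and third fields are consistent. Unlike get_expression_str, this also works for non-callables such as sys.path.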
I am trying to make a decrypter that decrypts code from the encrypter I made. I am getting this type error when I run the code though
getcrypt = ''.join(map(Decrypt.get,split_up_into_sixteen_chars(x_str)))
TypeError: split_up_into_sixteen_chars() takes 0 positional arguments but 1 was given
I'm fairly new to programming and not sure what's causing this.
Here's my code:
Decrypt = {'1s25FF5ML10IF7aC' : 'A', '1s2afF5ML10I7ac' : 'a'} #I obviously have more than this but I'm trying to make it as simplified as possible
def split_up_into_sixteen_chars():
    while len(x_str)>0:
        v = x_str[:16]
        print(v)

x_str = (input())
getcrypt = ''.join(map(Decrypt.get,split_up_into_sixteen_chars(x_str)))
print(getcrypt)
You have defined a function that takes no parameters:
def split_up_into_sixteen_chars():
yet you are passing it one:
split_up_into_sixteen_chars(x_str)
You need to tell Python that the function takes one parameter here, and name it:
def split_up_into_sixteen_chars(x_str):
The name used does not have to match the name that you pass in for the function call, but it does have to match what you use inside the function. The following function would also work; all I did was rename the parameter:
def split_up_into_sixteen_chars(some_string):
    while len(some_string) > 0:
        v = some_string[:16]
        print(v)
This works because the parameter some_string becomes a local name, local to the function. It only exists inside of the function, and is gone again once the function completes.
Note that your function creates an infinite loop; the length of some_string will either always be 0, or always be longer than 0. The length does not change in the body of the loop.
The following would work better:
def split_up_into_sixteen_chars(some_string):
    while len(some_string) > 0:
        v = some_string[:16]
        print(v)
        some_string = some_string[16:]
because then we replace some_string with a shorter version of itself each time.
Your next problem is that the function doesn't return anything; Python then takes a default return value of None. Printing is something else entirely, print() writes the data to your console or IDE, but the caller of the function does not get to read that information.
In this case, you really want a generator function, and use yield. Generator functions return information in chunks; you can ask a generator for the next chunk one by one, and that is exactly what map() would do. Change the function to:
def split_up_into_sixteen_chars(some_string):
    while len(some_string) > 0:
        v = some_string[:16]
        yield v
        some_string = some_string[16:]
or even:
def split_up_into_sixteen_chars(some_string):
    while some_string:
        yield some_string[:16]
        some_string = some_string[16:]
because an empty string is 'false-y' when it comes to boolean tests as used by while and if.
As your map(Decrypt.get, ...) stands, if split_up_into_sixteen_chars() yields anything that is not present as a key in Decrypt, a None is produced (the default value for dict.get() if the key is not there), and ''.join() won't like that. The latter method can only handle strings.
One option would be to return a string default instead:
''.join(map(lambda chunk: Decrypt.get(chunk, ''), split_up_into_sixteen_chars(x_str)))
Now '', the empty string, is returned for chunks that are not present in Decrypt. This makes the whole script work for whatever string input you have:
>>> x_str='Hello world!'
>>> ''.join(map(lambda chunk: Decrypt.get(chunk, ''), split_up_into_sixteen_chars(x_str)))
''
>>> x_str = '1s25FF5ML10IF7aC'
>>> ''.join(map(lambda chunk: Decrypt.get(chunk, ''), split_up_into_sixteen_chars(x_str)))
'A'
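As an alternative formulation, the chunking can also be written with a stepped range instead of repeatedly shortening the string. This is only a sketch; chunks_of_sixteen and decrypt are names I made up, and the table is cut down to a single entry:

```python
Decrypt = {'1s25FF5ML10IF7aC': 'A'}  # shortened table from the question

def chunks_of_sixteen(text, size=16):
    """Yield consecutive slices of text, each at most size characters."""
    for start in range(0, len(text), size):
        yield text[start:start + size]

def decrypt(text):
    # Unknown chunks map to '' so that ''.join() only ever sees strings
    return ''.join(Decrypt.get(chunk, '') for chunk in chunks_of_sixteen(text))

decoded = decrypt('1s25FF5ML10IF7aC' * 2)
pieces = list(chunks_of_sixteen('a' * 20))
print(decoded, pieces)
```

The stepped range avoids creating a new, shorter copy of the string on every iteration, which may matter for long inputs.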
Assume you have a function, that sometimes returns a value, and sometimes doesn't, because there really is nothing you could return in this case, not even a default value or something. Now you want to do something with the result, but of course only when there is one.
Example:
result = function_call(params)
if result:
    print result
Is there a way to write this in a more pythonic way, maybe even in one line?
Like that:
print function_call(params) or #nothing
(Note that I mean it shouldn't print "nothing" or "None". It should actually just not print at all, if the result is None)
No; in Python, name binding is a statement and so cannot be used as an expression within a statement. Since print is also a statement you're going to require 3 lines; in Python 3 you could write:
result = function_call(params)
print(result) if result else None
This isn't quite true for name binding within a comprehension or generator, where name binding is a syntax item that has statement-like semantics:
[print(result) for result in generator_call(params) if result]
As Kos says, you can abuse this to create a one-element comprehension:
[print(result) for result in (function_call(params), ) if result]
Another syntax item that performs name binding and can similarly be abused is the lambda expression:
(lambda result: print(result) if result else None)(function_call(params))
Note that in both these cases the operation on the return value must be an expression and not a statement.
I think the more Pythonic version is actually closer to your original:
result = function_call(params)
if result is not None:
    do_something(result)
Checking for is (not) None seems very idiomatic to me - I've used it several times myself and I've also seen it used elsewhere[citation-needed].
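For readers on Python 3.8 or later, assignment expressions (the := operator) allow the binding and the None check to share one line. A sketch, where function_call is a hypothetical stand-in that sometimes returns None and printed only exists to record what was output:

```python
def function_call(params):
    """Hypothetical function: returns a value only for non-empty input."""
    return params.upper() if params else None

printed = []

for params in ['hello', '']:
    # := binds result inside the condition (Python 3.8 or later)
    if (result := function_call(params)) is not None:
        print(result)
        printed.append(result)
```

Nothing is printed for the empty input, because function_call returns None in that case.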
From the answers up to now I would do that:
>>> from __future__ import print_function #if Python2.7
>>> def filtered_print(txt):
...     txt and print(txt)
...
>>> filtered_print('hello world')
hello world
>>> filtered_print('None')
None
>>> filtered_print(None)
>>>
If someone else has a better solution in mind, I am still open for alternatives, though!