How can I simplify my function and make it more pythonic? - python

I have written a Python function for some post processing in my text recognition algorithm. It works fine, but it seems to be kind of lengthy and has many if-else-conditions. I am looking for a way to simplify my code:
def postProcessing(new_s): #new_s is a list
    import string
    new_s=removeFrontLetters(new_s) #This function has to be called first
    if all(ch in string.digits for ch in new_s[:4]): #checking if first 4 elements are digits.
        if all(ch in string.ascii_letters for ch in new_s[4:5]): #checking if the [4] and [5] are letters
            if len(new_s)>=7:
                if new_s[6]=='C':
                    new_s=new_s[:7] #if length>=7 and the [6] =='C' the reversed of the first 7 elements has to be returned.
                    new_s=list(reversed(new_s))
                    return(new_s)
                else:
                    new_s=new_s[:6] #if length>=7 but the [6] =!'C' the reversed of the first 6 elements has to be returned.
                    new_s=list(reversed(new_s))
                    return(new_s)
            else:
                new_s=list(reversed(new_s)) #if length<7, the reversed of the given list has to be returned.
                return(new_s)
        else:
            print('not valid') #if the [4] and [5] are not letters, it is not valid
    else:
        print('not valid') #if the [:4] are not digits, it is not valid
This seems very beginner-level and lengthy. I am a beginner, but I am trying to improve my function. Do you have suggestions?

You can invert your if statements and use early returns to reduce the indentation of your code.
def postProcessing(new_s):  # new_s is a list
    import string
    new_s = removeFrontLetters(new_s)  # This function has to be called first
    if not all(ch in string.digits for ch in new_s[:4]):  # the first 4 elements must be digits
        raise ValueError("First four elements must be digits")
    if not all(ch in string.ascii_letters for ch in new_s[4:6]):  # elements [4] and [5] must be letters
        raise ValueError("Elements 4 and 5 must be letters")
    if len(new_s) < 7:
        return list(reversed(new_s))  # if length<7, the reversed of the given list is returned
    if new_s[6] == 'C':
        return list(reversed(new_s[:7]))  # length>=7 and [6]=='C': reversed of the first 7 elements
    return list(reversed(new_s[:6]))  # length>=7 but [6]!='C': reversed of the first 6 elements

It's quite neat that you ask 'the world' for advice. Did you know there's a dedicated Stack Exchange site for this? https://codereview.stackexchange.com/.
Unless you insist on writing plain Python code for this, it seems that you need a regular expression here.
So some tips:
use regex for pattern matching
use variables to express what an expression means
use exceptions instead of an 'invalid value' string
separate the 'parsing' from the processing to keep functions small and focused
use doctest to document and test small functions
def post_process(new_s):
    """
    reverse a string (with a twist)
    what follows is a doctest. You can run it with
    $ python -m doctest my_python_file.py
    >>> post_process('1234abCdef')
    'Cba4321'
    >>> post_process('1234abdef')
    'ba4321'
    """
    cmds = {
        'C': cmd_c,
        '': cmd_none
    }
    command, content = command_and_content(new_s)
    process = cmds[command]
    return process(content)

def cmd_c(content):
    return 'C' + "".join(reversed(content))

def cmd_none(content):
    return "".join(reversed(content))
The command_and_content function replaces the parsing logic:
import re

def command_and_content(new_s):
    # get help on https://regex101.com/ to find the
    # regular expression for four digits and two letters
    digits_then_ascii = re.compile(r"(\d{4}[a-z]{2})(C?)(.*)")
    # the := operator needs Python 3.8 or newer
    if match := digits_then_ascii.match(new_s):
        content = match.group(1)
        command = match.group(2)
        return command, content
    # pylint will ask you to not use an else clause after a return
    # Also, Python advises exceptions for notifying erroneous input
    raise ValueError(new_s)
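To see what the three groups of that pattern capture, here is a small self-contained sketch. The helper name split_groups is hypothetical and exists only for this illustration; the pattern is the one used above.

```python
import re

# Same pattern as above: four digits, two lowercase letters,
# an optional 'C' command, then whatever is left over.
DIGITS_THEN_ASCII = re.compile(r"(\d{4}[a-z]{2})(C?)(.*)")


def split_groups(text):
    """Hypothetical helper: return (content, command, rest) or raise ValueError."""
    match = DIGITS_THEN_ASCII.match(text)
    if match is None:
        raise ValueError(text)
    return match.groups()


print(split_groups("1234abCdef"))  # ('1234ab', 'C', 'def')
print(split_groups("1234abdef"))   # ('1234ab', '', 'def')
```

Note that [a-z] only matches lowercase letters, so an input such as '1234ABCdef' is rejected with a ValueError. If uppercase letters should be accepted in positions 4 and 5 as well, [A-Za-z] would be the class to use.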

From the context you provided, I assume that all this processing can happen in-place (i.e. without the need to allocate additional memory). The benefit of lists is that they are mutable, so you can actually do all your operations in-place.
This adheres to the style conventions (PEP 8) and uses correct type annotations (PEP 484):
from string import digits, ascii_letters

def remove_front_letters(new_s: list[str]) -> None:
    ...
    raise NotImplementedError

def post_processing(new_s: list[str]) -> None:
    remove_front_letters(new_s)
    if any(ch not in digits for ch in new_s[:4]):
        raise ValueError("The first 4 characters must be digits")
    if any(ch not in ascii_letters for ch in new_s[4:6]):
        raise ValueError("The 5th and 6th characters must be letters")
    if len(new_s) >= 7:
        if new_s[6] == 'C':
            del new_s[7:]
        else:
            del new_s[6:]
    new_s.reverse()
If you do want a new list, you can just call this function with a .copy() of your input list.
References: list methods; del statement
PS: If you use Python version 3.8 or lower, instead of list[str] you'll need to use typing.List[str].
Also someone mentioned the possibility of replacing the iteration via all() (or any()) with a "".join(...).isdigit() for example. While this is certainly also correct and technically less code, I am not sure it is necessarily more readable. More importantly it creates a new string in the process, which I don't think is necessary.
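As a rough illustration of that trade-off, the sketch below compares the two checks. The function names are made up for this example. Besides building a temporary string, str.isdigit() also differs in two edge cases: it is False for an empty input, and it accepts some non-ASCII digit characters.

```python
from string import digits


def all_ascii_digits(chars):
    """True if every element is one of '0'..'9' (vacuously True when empty)."""
    return all(ch in digits for ch in chars)


def joined_isdigit(chars):
    """Alternative: build a temporary string and ask str.isdigit()."""
    return "".join(chars).isdigit()


print(all_ascii_digits(list("1234")), joined_isdigit(list("1234")))  # True True
print(all_ascii_digits([]), joined_isdigit([]))                      # True False
# "\u00b2" is the superscript two character
print(all_ascii_digits(["\u00b2"]), joined_isdigit(["\u00b2"]))      # False True
```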
By the way, you could even reduce that conditional deletion of list elements to a one liner like this:
    ...
    if len(new_s) >= 7:
        del new_s[7 if new_s[6] == 'C' else 6:]
    new_s.reverse()
But I would argue that this is worse because it is less readable. Personal preference I guess.

Related

Python algorithm in list

Given a list of N strings, implement an algorithm that outputs the largest n such that the first n characters are the same in every string (i.e., print out how many characters at the front of all given strings match).
My code:
def solution(a):
    import numpy as np
    for index in range(0,a):
        if np.equal(a[index], a[index-1]) == True:
            i += 1
            return solution
        else:
            break
    return 0
# Test code
print(solution(['abcd', 'abce', 'abchg', 'abcfwqw', 'abcdfg'])) # 3
print(solution(['abcd', 'gbce', 'abchg', 'abcfwqw', 'abcdfg'])) # 0
Some comments on your code:
There is no need to use numpy if it is only used for string comparison
i is undefined when i += 1 is about to be executed, so that will not run. There is no actual use of i in your code.
index-1 is -1 in the first iteration of the loop, which Python resolves to the last element of the list rather than to a preceding one
solution is your function, so return solution will return a function object. You need to return a number.
The if condition is only comparing complete words, so there is no attempt to only compare a prefix.
A possible way to do this, is to be optimistic and assume that the first word is a prefix of all other words. Then as you detect a word where this is not the case, reduce the size of the prefix until it is again a valid prefix of that word. Continue like that until all words have been processed. If at any moment you find the prefix is reduced to an empty string, you can actually exit and return 0, as it cannot get any less than that.
Here is how you could code it:
def solution(words):
    prefix = words[0]  # if there was only one word, this would be the prefix
    for word in words:
        while not word.startswith(prefix):
            prefix = prefix[:-1]  # reduce the size of the prefix
            if not prefix:  # is there any sense in continuing?
                return 0  # ...: no.
    return len(prefix)
The description is somewhat convoluted but it does seem that you're looking for the length of the longest common prefix.
You can get the length of the common prefix between two strings using the next() function. It can find the first index where characters differ which will correspond to the length of the common prefix:
def maxCommon(S):
    cp = S[0] if S else ""  # first string is common prefix (cp)
    for s in S[1:]:  # go through other strings (s)
        # index of the first mismatch; if there is none, the shorter
        # of the two strings is the common part
        cs = next((i for i,(a,b) in enumerate(zip(s,cp)) if a!=b), min(len(s),len(cp)))
        cp = cp[:cs]  # truncate to new common size (cs)
    return len(cp)  # return length of common prefix
output:
print(maxCommon(['abcd', 'abce', 'abchg', 'abcfwqw', 'abcdfg'])) # 3
print(maxCommon(['abcd', 'gbce', 'abchg', 'abcfwqw', 'abcdfg'])) # 0
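For comparison, here are two other ways to get the same number, offered as a sketch rather than as a replacement for the answers above. The function names are made up for this example. os.path.commonprefix is a standard library function that compares character by character (it does not care whether the strings are real paths), and zip(*words) walks the strings column by column.

```python
import os.path


def common_prefix_length(words):
    """Length of the longest common prefix, via the standard library."""
    return len(os.path.commonprefix(words))


def common_prefix_length_zip(words):
    """Same result, counting leading columns whose characters all agree."""
    count = 0
    for column in zip(*words):  # stops at the shortest string
        if len(set(column)) > 1:
            break
        count += 1
    return count


print(common_prefix_length(['abcd', 'abce', 'abchg', 'abcfwqw', 'abcdfg']))      # 3
print(common_prefix_length_zip(['abcd', 'gbce', 'abchg', 'abcfwqw', 'abcdfg']))  # 0
```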

Contradictory outputs in simple recursive function

Note: Goal of the function is to remove duplicate(repeated) characters.
Now for the same given recursive function, different output pops out for different argument:
def rd(x):
    if x[0]==x[-1]:
        return x
    elif x[0]==x[1]:
        return rd(x[1: ])
    else:
        return x[0]+rd(x[1: ])
print("Enter a sentence")
r=raw_input()
print("simplified: "+rd(r))
This functions works well for the argument only if the duplicate character is within the starting first six characters of the string, for example:
if r=abcdeeeeeeefghijk or if r=abcdeffffffghijk
but if the duplicate character is after the first six character then the output is same as the input,i.e, output=input. That means with the given below value of "r", the function doesn't work:
if r=abcdefggggggggghijkde (repeating characters are after the first six characters)
Your function fails because of its first test, if x[0]==x[-1]. It compares the first and last characters of the current substring and returns that substring unchanged whenever they match, which lets many inputs slip through, such as affffffa or asdkkkkkk. Let's see why:
example 1: 'affffffa'
Here it is obvious: the first and last characters are both 'a', so the whole string is returned untouched.
example 2: 'asdkkkkkk'
Here we hit case 3 of your function, again and again:
'a' +rd('sdkkkkkk')
'a'+'s' +rd('dkkkkkk')
'a'+'s'+'d' +rd('kkkkkk')
and when we reach 'kkkkkk' it stops, because the first and last characters are the same.
example 3: 'asdfhhhhf'
This is the same as example 2: the recursion chain arrives at fhhhhf, whose first and last characters are the same, so it is left untouched.
How to fix it? Simple: as others have shown already, check the length of the string first.
def rd(x):
    if len(x)<2: #if my string is 1 or less character long leave it untouched
        return x
    elif x[0]==x[1]:
        return rd(x[1: ])
    else:
        return x[0]+rd(x[1: ])
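To check the three examples side by side, the sketch below runs the question's version and the length-based version on the same inputs. The names rd_original and rd_fixed are labels chosen only for this comparison.

```python
def rd_original(x):
    """The version from the question: stops as soon as first == last."""
    if x[0] == x[-1]:
        return x
    elif x[0] == x[1]:
        return rd_original(x[1:])
    else:
        return x[0] + rd_original(x[1:])


def rd_fixed(x):
    """The version with the length check as the base case."""
    if len(x) < 2:
        return x
    elif x[0] == x[1]:
        return rd_fixed(x[1:])
    else:
        return x[0] + rd_fixed(x[1:])


for text in ("affffffa", "asdkkkkkk", "asdfhhhhf"):
    print(text, "->", rd_original(text), "vs", rd_fixed(text))
# affffffa -> affffffa vs afa
# asdkkkkkk -> asdkkkkkk vs asdk
# asdfhhhhf -> asdfhhhhf vs asdfhf
```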
Here is an alternative, iterative way of doing the same: you can use the unique_justseen recipe from the itertools recipes.
from itertools import groupby
from operator import itemgetter

def unique_justseen(iterable, key=None):
    "List unique elements, preserving order. Remember only the element just seen."
    # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
    # unique_justseen('ABBCcAD', str.lower) --> A B C A D
    return map(next, map(itemgetter(1), groupby(iterable, key)))

def clean(text):
    return "".join(unique_justseen(text))
test
>>> clean("abcdefggggggggghijk")
'abcdefghijk'
>>> clean("abcdefghijkkkkkkkk")
'abcdefghijk'
>>> clean("abcdeffffffghijk")
'abcdefghijk'
>>>
and if you don't want to import anything, here is another way
def clean(text):
    result=""
    last=""
    for c in text:
        if c!=last:
            last = c
            result += c
    return result
The only issue I found with your code was the first if statement. I assume you used it to make sure that the string was at least 2 characters long. That can be done with the built-in len() function; in fact the whole function can be written without recursion, but we will leave it recursive for the OP's sake.
def rd(x):
    if len(x) < 2: #Modified to return if len < 2. accomplishes same as original code and more
        return x
    elif x[0]==x[1]:
        return rd(x[1: ])
    else:
        return x[0]+rd(x[1: ])
r=raw_input("Enter a sentence: ")
print("simplified: "+rd(r))
I would however recommend not making the function recursive and instead building a new string as follows. Note that this drops every repeated character in the string, not only consecutive repeats.
from collections import OrderedDict
def rd(string):
    #assuming order does matter we will use OrderedDict, no longer recursive
    return "".join(OrderedDict.fromkeys(string)) #creates an ordered dict keyed by character, e.g. {'a': None, ...}; duplicate keys are removed because it is a dict
    #iterating over the dict yields the keys in insertion order
    #joins all keys with '', becomes string
    #returns string
r=raw_input("Enter a sentence: ")
print("simplified: "+rd(r))
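The difference between removing all repeated characters and removing only consecutive repeats is easy to miss, so here is a small sketch of both. The function names are made up for this example; groupby comes from itertools and groups runs of equal neighbours.

```python
from collections import OrderedDict
from itertools import groupby


def drop_all_repeats(text):
    """Keep only the first occurrence of each character."""
    return "".join(OrderedDict.fromkeys(text))


def drop_consecutive_repeats(text):
    """Collapse each run of identical neighbouring characters to one."""
    return "".join(key for key, _group in groupby(text))


print(drop_all_repeats("aabbaa"))          # ab
print(drop_consecutive_repeats("aabbaa"))  # aba
```

Which of the two is wanted depends on how "duplicate characters" in the question is meant; the recursive rd versions behave like drop_consecutive_repeats.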
Your function is almost correct, but for the last letter to be handled properly the base case has to check the length of the string:
def rd(x):
    if len(x)<=1: #also covers an empty input, where x[0] would fail
        return x
    elif x[0]==x[1]:
        return rd(x[1: ])
    else:
        return x[0]+rd(x[1: ])
print("Enter a sentence")
r=raw_input()
print("simplified: "+rd(r))

Python: Assertion error when converting a string to binary

I am getting AssertionError when testing some basic functions in my python program. This is the scenario:
I have written a function that converts a single letter into binary from ASCII:
def ascii8Bin(letter):
    conv = ord(letter)
    return '{0:08b}'.format(conv)
Next there's a function that uses the previous function to convert all the letters in a word/sentence into binary:
def transferBin(string):
    l = list(string)
    for c in l:
        print ascii8Bin(c)
Now when I try to assert this function like this:
def test():
    assert transferBin('w') == '01110111'
    print "You test yielded no errors"
print test()
It throws an AssertionError. Now I looked up the binary alphabet and triple-checked: w in binary is definitely 01110111. I tried calling just transferBin('w') and it yielded 01110111 like it should.
I am genuinely interested in why the assertion fails. Any insight is very much appreciated.
Your function doesn't return anything. You use print, which writes text to stdout, but you are not testing for that.
As such, transferBin() returns None, the default return value for functions without an explicit return statement.
You'll have to collect the result of each ascii8Bin() call into a list and join the results into a string:
def transferBin(string):
    results = [ascii8Bin(c) for c in string]
    return '\n'.join(results)
This'll use newlines to separate the result for each character; for your single character 'w' string that'll give you the expected string.
Note that you don't need to turn the string into a list; you can iterate over strings directly, and you'll get individual characters either way.
You need to return, you are comparing to None not the string. python will return None from a function if you don't specify a return value.
def transferBin(s):
    l = list(s)
    for c in l:
        return ascii8Bin(c) # return
You are basically doing assert None == '01110111', also if you are only going to have single character string simply return ascii8Bin(string), having a loop with a return like in your code will return after the first iteration so the loop is redundant.
If you actually have multiple characters just use join and just iterate over the string you don't need to call list on it to iterate over a string:
def transferBin(s):
    return "".join(ascii8Bin(ch) for ch in s)
You can also just do it all in your transferBin function:
def transferBin(s):
    return "".join('{0:08b}'.format(ord(ch)) for ch in s)
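The difference between printing and returning can be made visible with a short sketch. It is written in Python 3 syntax (the question uses Python 2 print statements), and the snake_case names are chosen just for this example. Standard output is captured so that the printed text and the return value can be inspected separately.

```python
import contextlib
import io


def ascii8_bin(letter):
    """8-bit binary representation of a single character."""
    return "{0:08b}".format(ord(letter))


def print_bin(text):
    """Like the question's version: prints, but returns nothing."""
    for ch in text:
        print(ascii8_bin(ch))


def transfer_bin(text):
    """Returns the joined result, so it can be compared in an assert."""
    return "".join(ascii8_bin(ch) for ch in text)


buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    result = print_bin("w")

print(result)                   # None
print(repr(buffer.getvalue()))  # '01110111\n'
print(transfer_bin("w"))        # 01110111
```

The assertion in the question compares the first of these values (None) with '01110111', which is why it fails even though the right digits appear on screen.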

Python reversing a string using recursion

I want to use recursion to reverse a string in python so it displays the characters backwards (i.e. "Hello" will become "olleh"/"o l l e h").
I wrote one that does it iteratively:
def Reverse( s ):
result = ""
n = 0
start = 0
while ( s[n:] != "" ):
while ( s[n:] != "" and s[n] != ' ' ):
n = n + 1
result = s[ start: n ] + " " + result
start = n
return result
But how exactly do I do this recursively? I am confused on this part, especially because I don't work with python and recursion much.
Any help would be appreciated.
def rreverse(s):
    if s == "":
        return s
    else:
        return rreverse(s[1:]) + s[0]
(Very few people do heavy recursive processing in Python, the language wasn't designed for it.)
To solve a problem recursively, find a trivial case that is easy to solve, and figure out how to get to that trivial case by breaking the problem down into simpler and simpler versions of itself.
What is the first thing you do in reversing a string? Literally the first thing? You get the last character of the string, right?
So the reverse of a string is the last character, followed by the reverse of everything but the last character, which is where the recursion comes in. The last character of a string can be written as x[-1] while everything but the last character is x[:-1].
Now, how do you "bottom out"? That is, what is the trivial case you can solve without recursion? One answer is the one-character string, which is the same forward and reversed. So if you get a one-character string, you are done.
But the empty string is even more trivial, and someone might actually pass that in to your function, so we should probably use that instead. A one-character string can, after all, also be broken down into the last character and everything but the last character; it's just that everything but the last character is the empty string. So if we handle the empty string by just returning it, we're set.
Put it all together and you get:
def backward(text):
    if text == "":
        return text
    else:
        return text[-1] + backward(text[:-1])
Or in one line:
backward = lambda t: t[-1] + backward(t[:-1]) if t else t
As others have pointed out, this is not the way you would usually do this in Python. An iterative solution is going to be faster, and using slicing to do it is going to be faster still.
Additionally, Python imposes a limit on stack size, and there's no tail call optimization, so a recursive solution would be limited to reversing strings of only about a thousand characters. You can increase Python's stack size, but there would still be a fixed limit, while other solutions can always handle a string of any length.
I just want to add some explanations based on Fred Foo's answer.
Let's say we have a string called 'abc', and we want to return its reverse which should be 'cba'.
def reverse(s):
    if s == "":
        return s
    else:
        return reverse(s[1:]) + s[0]
s = "abc"
print (reverse(s))
This is how the code works when we call the function:
reverse('abc') #s = abc
=reverse('bc') + 'a' #s[1:] = bc s[0] = a
=reverse('c') + 'b' + 'a' #s[1:] = c s[0] = b
=reverse('') + 'c' + 'b' + 'a' #s[1:] = '' s[0] = c
='cba'
If this isn't just a homework question and you're actually trying to reverse a string for some greater goal, just do s[::-1].
def reverse_string(s):
    if s: return s[-1] + reverse_string(s[0:-1])
    else: return s
or
def reverse_string(s):
    return s[-1] + reverse_string(s[0:-1]) if s else s
I know it's too late to answer original question and there are multiple better ways which are answered here already. My answer is for documentation purpose in case someone is trying to implement tail recursion for string reversal.
def tail_rev(in_string,rev_string):
    if in_string=='':
        return rev_string
    else:
        rev_string+=in_string[-1]
        return tail_rev(in_string[:-1],rev_string)
in_string=input("Enter String: ")
rev_string=tail_rev(in_string,'')
print(f"Reverse of {in_string} is {rev_string}")
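Because Python does not optimise tail calls, a tail-recursive function like this still uses one stack frame per character. A tail call can be rewritten mechanically as a loop, though: the arguments of the recursive call become the new values of the loop variables. The sketch below shows that rewrite; loop_rev is a name made up for this example.

```python
def tail_rev(in_string, rev_string=""):
    """Tail-recursive version: the recursive call is the last thing done."""
    if in_string == "":
        return rev_string
    return tail_rev(in_string[:-1], rev_string + in_string[-1])


def loop_rev(in_string):
    """Same steps as tail_rev, with the tail call turned into a loop."""
    rev_string = ""
    while in_string != "":
        # the right-hand side is evaluated first, using the old in_string
        in_string, rev_string = in_string[:-1], rev_string + in_string[-1]
    return rev_string


print(tail_rev("Hello"))  # olleH
print(loop_rev("Hello"))  # olleH
```

Unlike tail_rev, loop_rev is not bounded by the recursion limit, so it also works for strings of several thousand characters.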
s = input("Enter your string: ")
def rev(s):
    if len(s) == 1:
        print(s[0])
        exit()
    else:
        #print the last char in string
        #end="" prints all chars in string on same line
        print(s[-1], end="")
        """Next line replaces whole string with same
        string, but with 1 char less"""
        return rev(s.replace(s, s[:-1]))
rev(s)
If you do not want to return a value, you can use this solution, which reverses the list in place. This question is part of LeetCode.
from typing import List

class Solution:
    i = 0  # position counter; it is never reset, so use a fresh Solution object for each list
    def reverseString(self, s: List[str]) -> None:
        """
        Do not return anything, modify s in-place instead.
        """
        if self.i >= (len(s)//2):
            return
        s[self.i], s[len(s)-self.i-1] = s[len(s)-self.i-1], s[self.i]
        self.i += 1
        self.reverseString(s)
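As a variation on the same idea, the position can be passed as a parameter instead of being stored on the object, which keeps repeated calls independent of each other. This is only a sketch; reverse_in_place and its left parameter are names invented for this example and are not part of the LeetCode interface.

```python
from typing import List


def reverse_in_place(s: List[str], left: int = 0) -> None:
    """Swap the outermost pair, then recurse one step further inwards."""
    right = len(s) - 1 - left
    if left >= right:  # the two positions met or crossed: nothing left to swap
        return
    s[left], s[right] = s[right], s[left]
    reverse_in_place(s, left + 1)


chars = list("hello")
reverse_in_place(chars)
print(chars)  # ['o', 'l', 'l', 'e', 'h']
```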
