I am just learning programming in Python for fun. I was writing a palindrome program and I thought of how I can make further improvements to it.
The first thing that came to mind was to prevent the program from having to go through the entire word both ways, since we are just checking for a palindrome. Then I realized that the loop can be broken as soon as the first and the last character don't match.
I then implemented them in a class so I can just call a word and get back true or false.
This is how the program stands as of now:
class my_str(str):
    def is_palindrome(self):
        a_string = self.lower()
        length = len(self)
        for i in range(length/2):
            if a_string[i] != a_string[-(i+1)]:
                return False
        return True

this = my_str(raw_input("Enter a string: "))
print this.is_palindrome()
Are there any other improvements that I can make to make it more efficient?
I think the best way to write a palindrome-checking function in Python is as follows:
def is_palindrome(s):
    return s == s[::-1]
(Add the lower() call as required.)
What about something way simpler? Why are you creating a class instead of a simple function?
>>> is_palindrome = lambda x: x.lower() == x.lower()[::-1]
>>> is_palindrome("ciao")
False
>>> is_palindrome("otto")
True
While other answers have been given talking about how best to approach the palindrome problem in Python, let's look at what you are doing.
You are looping through the string by using indices. While this works, it's not very Pythonic. Python for loops are designed to loop over the objects you want, not simply over numbers, as in other languages. This allows you to cut out a layer of indirection and make your code clearer and simpler.
So how can we do this in your case? Well, what you want to do is loop over the characters in one direction, and in the other direction - at the same time. We can do this nicely using the zip() and reversed() builtins. reversed() allows us to get the characters in the reversed direction, while zip() allows us to iterate over two iterators at once.
>>> a_string = "something"
>>> for first, second in zip(a_string, reversed(a_string)):
...     print(first, second)
...
s g
o n
m i
e h
t t
h e
i m
n o
g s
This is the better way to loop over the characters in both directions at once. Naturally this isn't the most effective way of solving this problem, but it's a good example of how you should approach things differently in Python.
Building on Lattyware's answer - by using the Python builtins appropriately, you can avoid doing things like a_string[-(i+1)], which takes a second to understand - and, when writing more complex things than palindromes, is prone to off-by-one errors.
The trick is to tell Python what you mean to do, rather than how to achieve it - so, the most obvious way, per another answer, is to do one of the following:
s == s[::-1]
list(s) == list(reversed(s))
s == ''.join(reversed(s))
Or various other similar things. All of them say: "is the string equal to the string backwards?".
If for some reason you really do need the optimisation that you know you've got a palindrome once you're halfway (you usually shouldn't, but maybe you're dealing with extremely long strings), you can still do better than index arithmetic. You can start with:
halfway = len(s) // 2
(where // forces the result to an integer, even in Py3 or if you've done from __future__ import division). This leads to:
s[:halfway] == ''.join(reversed(s[halfway:]))
This will work for all even-length s, but fail for odd length s because the RHS will be one element longer. But, you don't care about the last character in it, since it is the middle character of the string - which doesn't affect its palindromeness. If you zip the two together, it will stop after the end of the short one - and you can compare a character at a time like in your original loop:
for f, b in zip(s[:halfway], reversed(s[halfway:])):
    if f != b:
        return False
return True
And you don't even need ''.join or list or such. But you can still do better - this kind of loop is so idiomatic that Python has a builtin function called all just to do it for you:
all(f == b for f, b in zip(s[:halfway], reversed(s[halfway:])))
This says "all the characters in the front half of the string are the same as the ones in the back half of the string written backwards".
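Putting the pieces above together, a minimal sketch of a complete function might look like the following. The name is_palindrome_half is made up for illustration, and the lower() call is an assumption carried over from the question rather than part of this answer.

```python
def is_palindrome_half(s):
    """Compare the front half of s with the back half read backwards."""
    s = s.lower()            # case-insensitive, as in the question (assumption)
    halfway = len(s) // 2    # integer division on both Python 2 and 3
    # zip stops at the shorter input, so the middle character of an
    # odd-length string is never compared.
    return all(f == b for f, b in zip(s[:halfway], reversed(s[halfway:])))


print(is_palindrome_half("Otto"))     # True
print(is_palindrome_half("racecar"))  # True
print(is_palindrome_half("ciao"))     # False
```

An empty string gives True here, because all() of an empty sequence is True.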
One improvement I can see would be to use xrange instead of range.
It probably isn't a faster implementation, but you could use a recursive test. Since you're learning, this construction is very useful in many situations:
def is_palindrome(word):
    if len(word) < 2:
        return True
    if word[0] != word[-1]:
        return False
    return is_palindrome(word[1:-1])
Since this is a rather simple (light) function, this construction might not be the fastest because of the overhead of calling the function multiple times, but in other cases where the computations are more intensive it can be a very effective construction.
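One caveat worth keeping in mind with the slicing version: each word[1:-1] builds a new string, and a very long input can run into Python's recursion limit. As a rough sketch (the function name is hypothetical, not from the answers here), the same first-versus-last comparison can be done with two indices and no copies:

```python
def is_palindrome_two_pointers(word):
    """Compare characters from both ends, moving inwards."""
    left, right = 0, len(word) - 1
    while left < right:
        if word[left] != word[right]:
            return False   # first mismatch: stop early
        left += 1
        right -= 1
    return True            # the indices met without a mismatch


print(is_palindrome_two_pointers("otto"))  # True
print(is_palindrome_two_pointers("ciao"))  # False
```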
Just my two cents.
I am confused about when using a while-loop or for-loop is better? I am particularly worried about producing optimized answers to coding questions. I find myself solving problems just to find out a while-loop would've been faster but am confused about what leads people to choose to use it instead of a for-loop, like what criteria should I be looking for?
Here's an example of a coding question I answered which checks if a string of parentheses is balanced.
def parenCheck(str):
    stack = Stack()
    for x in str:
        if x == '(':
            stack.push(x)
        else:
            if stack.isEmpty():
                return False
            else:
                stack.pop()
    return stack.isEmpty()
Here is the answer I saw for it, which I know is faster because it doesn't use a for-loop:
def parChecker(symbolString):
    s = Stack()
    balanced = True
    index = 0
    while index < len(symbolString) and balanced:
        symbol = symbolString[index]
        if symbol == "(":
            s.push(symbol)
        else:
            if s.isEmpty():
                balanced = False
            else:
                s.pop()
        index = index + 1
    if balanced and s.isEmpty():
        return True
    else:
        return False
For the same number of iterations and the same content, my guess is that a for loop and a while loop have practically the same speed. But I haven't tested that - mostly I answer numpy questions where we try to avoid either loop (preferring iterations in compiled code).
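One way to check that guess is with timeit. The sketch below is only illustrative: the two helper functions, the list size and the repeat count are arbitrary choices made here, and the numbers will vary by machine and Python version.

```python
import timeit

data = list(range(1000))


def sum_for(items):
    # Iterate directly over the items
    total = 0
    for x in items:
        total += x
    return total


def sum_while(items):
    # Same work, but stepping an index by hand
    total = 0
    i = 0
    while i < len(items):
        total += items[i]
        i += 1
    return total


for fn in (sum_for, sum_while):
    elapsed = timeit.timeit(lambda: fn(data), number=200)
    print(fn.__name__, elapsed)
```

Both functions do the same number of iterations, so any difference comes from the loop machinery itself (the index increment, the len() call and the indexing in the while version).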
Your examples show the basic case where for results in cleaner, if not faster, code:
for x in an_iterator:
    <do something>
versus
i = 0
while i < len(an_iterator):
    x = an_iterator[i]
    <do something>
    i += 1
If you must have an index as well, you can use:
for i, x in enumerate(an_iterator):
    ....
That ability to iterate directly on things like lists and dictionaries was the main neat feature that caught my attention when I first saw Python in the 1990s.
A common subcase of a for loop accumulates values, so common, that Python provides the popular list comprehension, and extended the syntax to generator expressions and dictionary comprehensions.
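To make the accumulation case concrete, here is a small sketch with made-up data showing the explicit loop next to the comprehension forms mentioned above:

```python
words = ["otto", "ciao", "level"]

# Explicit accumulation with a for loop
lengths = []
for w in words:
    lengths.append(len(w))

# The same result as a list comprehension
lengths_lc = [len(w) for w in words]

# Generator expression: values are produced lazily and consumed by sum()
total = sum(len(w) for w in words)

# Dictionary comprehension
length_by_word = {w: len(w) for w in words}

print(lengths_lc, total, length_by_word)
```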
while still has its uses. Other languages have do while and do until variants, which attempt to streamline stepping the variable and testing. Python has just the one version with separate steps. The new walrus operator has the potential of cleaning that up:
https://docs.python.org/3/whatsnew/3.8.html#assignment-expressions
while (block := f.read(256)) != '':
    process(block)
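A self-contained version of that loop, as a sketch: it needs Python 3.8 or later, uses io.StringIO as a stand-in for a real file, and the helper name read_blocks is invented for this example.

```python
import io


def read_blocks(f, size=256):
    """Collect fixed-size blocks from a text-mode file object."""
    blocks = []
    # In text mode read() returns '' at end of file (b'' in binary mode)
    while (block := f.read(size)) != '':
        blocks.append(block)
    return blocks


f = io.StringIO("x" * 600)
print([len(b) for b in read_blocks(f)])  # [256, 256, 88]
```

Without the walrus operator the same loop needs either a read before the loop plus another at the end of the body, or a while True with a break.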
while is most useful when the steps aren't regular, or don't come from a well-defined iterator. for can break and continue, but otherwise the sequence is fixed (at the start of iteration).
In addition to enumerate, zip lets us iterate on several things at once.
The convenience of list comprehensions encourages us to decompose a complicated task into a sequence of comprehensions. And to make that even more convenient (and faster/more memory efficient), Python provides generators and all sorts of itertools. In Python 3, range and dictionary keys/items/values all became generator-like objects.
Your code alone cannot establish that the while loop ran faster than the for loop. As for when to use while and when to use for, the choice depends on whether the number of iterations is known in advance. If it is known, a for loop is preferred; a while loop is preferred when the number of iterations is indefinite.
for (int i : mynumber_list)
Iterator it = mynumber_list.iterator()
while (it.hasNext())
As can be seen from the above, the for loop is more readable, simpler, and easier to use for traversal. Moreover, a while loop over iterator.hasNext() can easily turn into an infinite loop, for example if the body never calls next().
while can also be useful for user choice input questions, like keep executing the program until the user presses any other key except y. This is difficult to achieve with for loop.
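A minimal sketch of that kind of loop follows. The function name and the ask parameter are assumptions made here so that the example can be driven without a keyboard; by default it uses the built-in input.

```python
def run_until_declined(ask=input):
    """Repeat while the user answers 'y'; return how many rounds ran."""
    rounds = 0
    while True:
        rounds += 1
        # ... one round of the program's work would go here ...
        if ask("Continue? (y/n) ").strip().lower() != "y":
            break   # any answer other than y ends the loop
    return rounds


# Simulated answers instead of real keyboard input
answers = iter(["y", "Y", "n"])
print(run_until_declined(lambda prompt: next(answers)))  # 3
```

The number of iterations is not known before the loop starts, which is exactly the situation where while reads more naturally than for.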
In python, there is such a feature - True and False can be added, subtracted, etc
Are there any examples where this can be useful?
Is there any real benefit from this feature, for example, when:
it increases productivity
it makes the code more concise (without losing speed)
etc
While in most cases it would just be confusing and completely unwarranted to (ab)use this functionality, I'd argue that there are a few cases that are exceptions.
One example would be counting. True casts to 1, so you can count the number of elements that pass some criteria in this fashion, while remaining concise and readable. An example of this would be:
valid_elements = sum(is_valid(element) for element in iterable)
As mentioned in the comments, this could be accomplished via:
valid_elements = list(map(is_valid, iterable)).count(True)
but to use .count(...), the object must be a list, which imposes a linear space complexity (iterable may have been a constant space generator for all we know).
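Here is a small runnable sketch of the counting idiom. The is_valid criterion (even numbers) is invented purely for illustration; the point is that the generator is consumed one item at a time and never stored as a list.

```python
def is_valid(element):
    # Hypothetical criterion for illustration: even numbers count as valid
    return element % 2 == 0


iterable = (n for n in range(10))   # a generator, not a list
valid_elements = sum(is_valid(element) for element in iterable)
print(valid_elements)  # 5
```

This works because bool is a subclass of int, with True equal to 1 and False equal to 0.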
Another case where this functionality might be usable is as a play on the ternary operator for sequences, where you either want the sequence or an empty sequence depending on the value. Say you want to return the resulting list if a condition holds, otherwise an empty list:
return result_list * (not return_empty)
or if you are doing a conditional string concatenation:
result = str1 + str2 * do_concatenate
Of course, both of these could be solved by using Python's ternary operator:
return [] if return_empty else result_list
...
result = str1 + str2 if do_concatenate else str1
The point being, this behavior does provide other options in a few scenarios that aren't all that unreasonable. It's just a matter of using your best judgement as to whether it'll cause confusion for future readers (yourself included).
I would avoid it at all cost. It is confusing and goes against typing. Python being permissive does not mean you should do it ...
What is an efficient way to check that a string s in Python consists of just one character, say 'A'? Something like all_equal(s, 'A') which would behave like this:
all_equal("AAAAA", "A") = True
all_equal("AAAAAAAAAAA", "A") = True
all_equal("AAAAAfAAAAA", "A") = False
Two seemingly inefficient ways would be to: first convert the string to a list and check each element, or second to use a regular expression. Are there more efficient ways or are these the best one can do in Python? Thanks.
This is by far the fastest, several times faster than even count(); just time it with mgilson's excellent timing suite:
s == len(s) * s[0]
Here all the checking is done inside the Python C code which just:
allocates len(s) characters;
fills the space with the first character;
compares two strings.
The longer the string, the greater the time savings. However, as mgilson writes, it creates a copy of the string, so if your string length is many millions of symbols, it may become a problem.
As we can see from the timing results, the fastest ways to solve the task generally do not execute any Python code for each symbol. The set() solution also does all its work inside the C code of the Python library, yet it is still slow, probably because it handles the string's characters through the Python object interface.
UPD: Concerning the empty string case. What to do with it strongly depends on the task. If the task is "check if all the symbols in a string are the same", s == len(s) * s[0] is a valid answer (no symbols mean an error, and exception is ok). If the task is "check if there is exactly one unique symbol", empty string should give us False, and the answer is s and s == len(s) * s[0], or bool(s) and s == len(s) * s[0] if you prefer receiving boolean values. Finally, if we understand the task as "check if there are no different symbols", the result for empty string is True, and the answer is not s or s == len(s) * s[0].
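The three readings of the task can be written out as small functions. This is only a sketch; the function names are invented here to label the three cases.

```python
def all_same_strict(s):
    # "Check if all the symbols are the same": an empty string raises IndexError
    return s == len(s) * s[0]


def exactly_one_unique(s):
    # "Exactly one unique symbol": an empty string gives False
    return bool(s) and s == len(s) * s[0]


def no_different_symbols(s):
    # "No different symbols": an empty string gives True
    return not s or s == len(s) * s[0]


print(exactly_one_unique(""), no_different_symbols(""))  # False True
print(all_same_strict("AAAAA"), all_same_strict("AAfAA"))  # True False
```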
>>> s = 'AAAAAAAAAAAAAAAAAAA'
>>> s.count(s[0]) == len(s)
True
This doesn't short circuit. A version which does short-circuit would be:
>>> all(x == s[0] for x in s)
True
However, I have a feeling that due to the optimized C implementation, the non-short-circuiting version will probably perform better on some strings (depending on size, etc.).
Here's a simple timeit script to test some of the other options posted:
import timeit
import re

def test_regex(s,regex=re.compile(r'^(.)\1*$')):
    return bool(regex.match(s))

def test_all(s):
    return all(x == s[0] for x in s)

def test_count(s):
    return s.count(s[0]) == len(s)

def test_set(s):
    return len(set(s)) == 1

def test_replace(s):
    return not s.replace(s[0],'')

def test_translate(s):
    return not s.translate(None,s[0])

def test_strmul(s):
    return s == s[0]*len(s)

tests = ('test_all','test_count','test_set','test_replace','test_translate','test_strmul','test_regex')

print "WITH ALL EQUAL"
for test in tests:
    print test, timeit.timeit('%s(s)'%test,'from __main__ import %s; s="AAAAAAAAAAAAAAAAA"'%test)
    if globals()[test]("AAAAAAAAAAAAAAAAA") != True:
        print globals()[test]("AAAAAAAAAAAAAAAAA")
        raise AssertionError

print
print "WITH FIRST NON-EQUAL"
for test in tests:
    print test, timeit.timeit('%s(s)'%test,'from __main__ import %s; s="FAAAAAAAAAAAAAAAA"'%test)
    if globals()[test]("FAAAAAAAAAAAAAAAA") != False:
        print globals()[test]("FAAAAAAAAAAAAAAAA")
        raise AssertionError
On my machine (OS-X 10.5.8, core2duo, python2.7.3) with these contrived (short) strings, str.count smokes set and all, and beats str.replace by a little, but is edged out by str.translate and strmul is currently in the lead by a good margin:
WITH ALL EQUAL
test_all 5.83863711357
test_count 0.947771072388
test_set 2.01028490067
test_replace 1.24682998657
test_translate 0.941282987595
test_strmul 0.629556179047
test_regex 2.52913498878
WITH FIRST NON-EQUAL
test_all 2.41147494316
test_count 0.942595005035
test_set 2.00480484962
test_replace 0.960338115692
test_translate 0.924381017685
test_strmul 0.622269153595
test_regex 1.36632800102
The timings could be slightly (or even significantly?) different between different systems and with different strings, so that would be worth looking into with an actual string you're planning on passing.
Eventually, if you hit the best case for all enough, and your strings are long enough, you might want to consider that one. It's a better algorithm ... I would avoid the set solution though as I don't see any case where it could possibly beat out the count solution.
If memory could be an issue, you'll need to avoid str.translate, str.replace and strmul as those create a second string, but this isn't usually a concern these days.
You could convert to a set and check there is only one member:
len(set("AAAAAAAA")) == 1
Try using the built-in function all:
all(c == 'A' for c in s)
If you need to check that all the characters in the string are the same and equal to a given character, remove all duplicates and check whether the result equals that single character.
>>> set("AAAAA") == set("A")
True
If you only need to check that all the characters are the same, whatever that character is, just check the length:
>>> len(set("AAAAA")) == 1
True
Adding another solution to this problem
>>> not "AAAAAA".translate(None,"A")
True
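Note that the two-argument form translate(None, "A") is the Python 2 str API. On Python 3, a roughly equivalent sketch builds a deletion table with str.maketrans; the helper name only_char is made up for this example.

```python
def only_char(s, ch):
    """True if deleting every ch from s leaves nothing behind."""
    # The third argument of str.maketrans lists characters to delete
    return not s.translate(str.maketrans('', '', ch))


print(only_char("AAAAAA", "A"))  # True
print(only_char("AAAfAA", "A"))  # False
```

Like the original one-liner, this returns True for an empty string.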
Interesting answers so far. Here's another:
flag = True
for c in 'AAAAAAAfAAAA':
    if c != 'A':
        flag = False
        break
The only advantage I can think of to mine is that it doesn't need to traverse the entire string if it finds an inconsistent character.
not len("AAAAAAAAA".replace('A', ''))
I have a string list
[str1, str2, str3.....]
and I also have a def to check the format of the strings, something like:
def CheckIP(strN):
    if(formatCorrect(strN)):
        return True
    return False
Now I want to check every string in list, and of course I can use for to check one by one. But could I use map() to make code more readable...?
You can map your list to your function and then use all to check if it returns True for every item:
if all(map(CheckIP, list_of_strings)):
    # All strings are good
Actually, it would be cleaner to just get rid of the CheckIP function and use formatCorrect directly:
if all(map(formatCorrect, list_of_strings)):
    # All strings are good
Also, as an added bonus, all uses lazy-evaluation. Meaning, it only checks as many items as are necessary before returning a result.
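The short-circuiting is easy to observe with a stand-in checker that records its calls. In this sketch the body of formatCorrect is invented for illustration (it just accepts strings of digits), and a generator expression is used so the behaviour is the same on Python 2 and 3.

```python
calls = []


def formatCorrect(s):
    # Stand-in check for illustration only: record the call, accept digit strings
    calls.append(s)
    return s.isdigit()


list_of_strings = ["10", "x", "20", "30"]
result = all(formatCorrect(x) for x in list_of_strings)
print(result, calls)  # False ['10', 'x']
```

The last two strings are never checked, because all stops at the first False.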
Note however that a more common approach would be to use a generator expression instead of map:
if all(formatCorrect(x) for x in list_of_strings):
In my opinion, generator expressions are always better than map because:
They are slightly more readable.
They are just as fast if not faster than using map. Also, in Python 2.x, map creates a list object that is often unnecessary (wastes memory). Only in Python 3.x does map use lazy-computation like a generator expression.
They are more powerful. In addition to just mapping items to a function, generator expressions allow you to perform operations on each item as they are produced. For example:
sum(x * 2 for x in (1, 2, 3))
They are preferred by most Python programmers. Keeping with convention is important when programming because it eases maintenance and makes your code more understandable.
There is talk of removing functions like map, filter, etc. from a future version of the language. Though this is not set in stone, it has come up many times in the Python community.
Of course, if you are a fan of functional programming, there isn't much chance you'll agree to points one and four. :)
Here is an example of how you could do it:
in_str = ['str1', 'str2', 'str3', 'not']
in_str2 = ['str1', 'str2', 'str3']

def CheckIP(strN):
    # different than yours, just to show example.
    if 'str' in strN:
        return True
    else:
        return False

print(all(map(CheckIP, in_str)))   # prints False
print(all(map(CheckIP, in_str2)))  # prints True
L = [str1, str2, str3.....]
answer = list(map(CheckIP, L))
answer is a list of booleans such that answer[i] is CheckIP(L[i]). If you want to further check if all of those values are True, you could use all:
all(answer)
This returns True if and only if all the values in answer are True. However, you may do this without listifying:
all(map(CheckIP, L))
In Python 3, map returns an iterator, not a list, so you don't waste space turning everything into a list. You also save time, because the first False value makes all return False, which stops map from computing any remaining values.
This is my first effort on solving the exercise. I gotta say, I'm kind of liking Python. :D
# D. verbing
# Given a string, if its length is at least 3,
# add 'ing' to its end.
# Unless it already ends in 'ing', in which case
# add 'ly' instead.
# If the string length is less than 3, leave it unchanged.
# Return the resulting string.
def verbing(s):
    if len(s) >= 3:
        if s[-3:] == "ing":
            s += "ly"
        else:
            s += "ing"
        return s
    else:
        return s
    # +++your code here+++
    return
What do you think I could improve on here?
def verbing(s):
    if len(s) >= 3:
        if s.endswith("ing"):
            s += "ly"
        else:
            s += "ing"
    return s
How about this little rewrite:
def verbing(s):
    if len(s) < 3:
        return s
    elif s.endswith('ing'):
        return s + 'ly'
    else:
        return s + 'ing'
I would use s.endswith("ing") in the if, which is also a bit faster, because it doesn't create a new string for the comparison.
And second, I would use docstrings for commenting. This way, you can see your description when you do a help(yourmodule) or when you use some autodoc-tool like Sphinx to create a handbook describing your API. Example:
def verbing(s):
    """Given a string, if its length is at least 3, add 'ing' to its end.
    Unless it already ends in 'ing', in which case add 'ly' instead.
    If the string length is less than 3, leave it unchanged."""
    # rest of the function
Third, it's often considered bad practice to change input parameters. You can do it for dict or list parameters, which can also act as output parameters. But strings are input parameters only (that's why you have the return). The code you have written is valid, of course, but it can be confusing. Other languages often have a final or const keyword to avoid this confusion, but Python doesn't. So I would recommend either using a second variable, result = s + "ing", and doing a return result afterwards, or writing return s + "ing".
The rest is perfectly fine. There are of course some constructs in Python which are shorter to write (you will learn them with the time), but they are often not so readable. Therefore I would stay with your solution.
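As one example of such a shorter construct, here is a sketch that folds the inner if/else into a conditional expression and also avoids reassigning the parameter. Whether it is more readable than the original is a matter of taste.

```python
def verbing(s):
    """Add 'ing' to strings of length >= 3, or 'ly' if s already ends in 'ing'."""
    if len(s) < 3:
        return s
    # Pick the suffix with a conditional expression; s itself is never modified
    return s + ('ly' if s.endswith('ing') else 'ing')


print(verbing("hail"))     # hailing
print(verbing("swiming"))  # swimingly
print(verbing("do"))       # do
```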
Pretty good for a beginner! Yes, I would say this is the Pythonic way of doing things. I especially like the way you have commented exactly what the function does. Good work there.
Keep working with Python, though. You're doing fine.
def add_string(str1):
    if(len(str1)>=3):
        if (str1[-3::] == 'ing'):
            str1+='ly'
        else:
            str1+='ing'
    return str1

str1="com"
print(add_string(str1))