Have I found a bug in Python's str.endswith()?

According to the Python documentation:
str.endswith(suffix[, start[, end]])
Return True if the string ends with the specified suffix, otherwise return False. suffix can also be a tuple of suffixes to look for. With optional start, test beginning at that position. With optional end, stop comparing at that position.
Changed in version 2.5: Accept tuples as suffix.
The following code should return True, but it returns False in Python 2.7.3:
"hello-".endswith(('.', ',', ':', ';', '-' '?', '!'))
It seems str.endswith() ignores anything beyond the fourth tuple element:
>>> "hello-".endswith(('.', ',', ':', '-', ';' '?', '!'))
True
>>> "hello;".endswith(('.', ',', ':', '-', ';' '?', '!'))
False
Have I found a bug, or am I missing something?

or am I missing something?
You're missing a comma after the ';' in your tuple:
>>> "hello;".endswith(('.', ',', ':', '-', ';' '?', '!'))  # comma missing between ';' and '?'
False
Because of this, ';' and '?' are concatenated into the single string ';?'. A string ending with ;? therefore returns True:
>>> "hello;?".endswith(('.', ',', ':', '-', ';' '?', '!'))
True
After adding a comma, it would work as expected:
>>> "hello;".endswith(('.', ',', ':', '-', ';', '?', '!'))
True

If you write the tuple as
>>> tuple_example = ('.', ',', ':', '-', ';' '?', '!')
then it actually becomes
>>> tuple_example
('.', ',', ':', '-', ';?', '!')
The fifth element is ';?' because ';' and '?' were concatenated. That is why "hello;".endswith(tuple_example) returns False.
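A quick way to confirm this is to count the elements of the tuple. The following is only an illustrative sketch; the names suffixes_missing_comma and suffixes_fixed are made up for this example and do not come from the question.

```python
# Adjacent string literals are merged when the code is compiled,
# so the first tuple has six elements instead of seven.
suffixes_missing_comma = ('.', ',', ':', '-', ';' '?', '!')
suffixes_fixed = ('.', ',', ':', '-', ';', '?', '!')

print(len(suffixes_missing_comma))      # 6
print(len(suffixes_fixed))              # 7
print(';?' in suffixes_missing_comma)   # True

print("hello;".endswith(suffixes_missing_comma))  # False
print("hello;".endswith(suffixes_fixed))          # True
```

Comparing len() against the number of suffixes you meant to write is a cheap check for this kind of typo.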

It has already been pointed out that adjacent string literals are concatenated, but I wanted to add a little additional information and context.
This is a feature that is shared with (and borrowed from) C.
Additionally, this doesn't act like a concatenation operator such as '+'. The literals are treated exactly as if they had been written as one literal in the source, without any additional overhead.
For example:
>>> 'a' 'b' * 2
'abab'
Whether this is a useful feature or an annoying design is really a matter of opinion, but it does allow string literals to be broken across multiple lines by enclosing them in parentheses.
>>> print("I don't want to type this whole string"
"literal all on one line.")
I don't want to type this whole stringliteral all on one line.
That type of usage (along with being used with #defines) is why it was useful in C in the first place and was subsequently brought along in Python.
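To make the difference from '+' concrete, here is a small sketch (the variable names are only for illustration). It also shows that the pieces of a multi-line literal are joined with nothing in between, so any space has to be written inside one of the pieces.

```python
# Implicit concatenation happens before any operator is applied.
implicit = 'a' 'b' * 2      # same as 'ab' * 2
explicit = 'a' + 'b' * 2    # '*' binds tighter than '+', so 'a' + 'bb'

print(implicit)  # abab
print(explicit)  # abb

# Note the trailing space inside the first piece.
message = ("I don't want to type this whole string "
           "literal all on one line.")
print(message)
```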

Related

How to add single backslash to a Python list? [duplicate]

This question already has an answer here:
Why does printing a tuple (list, dict, etc.) in Python double the backslashes?
I want to add a single backslash element to my list. I used print("\\") and it printed a single backslash; however, when I add "\\" to my list, the list shows a double backslash. How can I solve this problem?
You can see the code below:
signs=["+","x","÷","=","/","\\","$","€","£","#","*","!","#",":",";","&","-","(",")","_","'","\"",".",",","?"]
print("Signs:",signs)
I use Python 3.7.3 with IDLE as my IDE.
Thanks in advance for your attention!

Using a Python REPL
>>> ll = [1,2,3]
>>> ll.append('\\')
>>> ll
[1, 2, 3, '\\']
>>> ll[3]
'\\'
>>> print(ll[3])
\
>>>
When Python displays a string (its repr), it has to escape the backslash, but if you print the element it shows a single backslash.

Your code prints the list as a whole:
signs=["+","x","÷","=","/","\\","$","€","£","#","*","!","#",":",";","&","-","(",")","_","'","\"",".",",","?"]
print("Signs:",signs)
gives:
Signs: ['+', 'x', '÷', '=', '/', '\\', '$', '€', '£', '#', '*', '!', '#', ':', ';', '&', '-', '(', ')', '_', "'", '"', '.', ',', '?']
Use * to print it 'one by one'. * is the unpacking operator that turns a list into positional arguments, print(*[a,b,c]) is the same as print(a,b,c).
signs=["+","x","÷","=","/","\\","$","€","£","#","*","!","#",":",";","&","-","(",")","_","'","\"",".",",","?"]
print("Signs:",*signs)
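The doubled backslash only exists in the displayed form, not in the data. A minimal sketch to check that (the variable names here are just for illustration):

```python
backslash = "\\"

print(len(backslash))    # 1, the string holds a single character
print(repr(backslash))   # '\\'  (escaped form, as shown inside a list)
print(backslash)         # \

items = ["/", "\\", "$"]
as_list_text = str(items)          # uses repr() for every element
as_joined_text = " ".join(items)   # uses the elements themselves

print(as_list_text)    # ['/', '\\', '$']
print(as_joined_text)  # / \ $
```

Printing a list always goes through repr() for each element, which is why joining or unpacking the elements gives the single backslash you expect.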

\n in return statement of function prints \n instead of new line in python

Consider the Python code below. I am very new to Python, so please help me with this.
This function returns
'\n56'
but I need
--new line--
56
def fun_ret(num):
    return '\n'+str(num)
if __name__=='__main__':
    a=fun_ret(56)
I have gone through similar posts, but they were all about the print statement.
The actual scenario is to pass the doctest of the __str__ method in a class.
def __str__(self):
    """(Maze)-->str
    The parameter represents a maze. Return a string representation of the maze
    >>> maze=Maze([['#', '#', '#', '#', '#', '#', '#'],['#', 'J', '.', '.', 'P', '.', '#'], ['#', '.', '#', '#', '#', '.', '#'],['#', '.', '.', '#', '#', '.', '#'],['#', '#', '#', '.', '#', '.', '#'], ['#', '#', '#', '#', '#', '#', '#']], Rat('J', 1, 1),Rat('P', 1, 4))
    >>> str(maze)
    "#######
    #J..P.#
    #.###.#
    #..##.#
    ###.#.#
    #######
    J at (1, 1) ate 0 sprouts.
    P at (1, 4) ate 0 sprouts."
    """
    result=''
    for outer_list in self.maze_content:
        for inner_list in outer_list:
            result +='{0}'.format(str(inner_list))
        result+='\n'
    result += '\n'+'{0} at ({1}, {2}) ate {3} sprouts.'.format(self.rat_1.symbol,self.rat_1.row,self.rat_1.col,self.rat_1.num_sprouts_eaten)
    result += '\n'+('{0} at ({1}, {2}) ate {3} sprouts.'.format(self.rat_2.symbol,self.rat_2.row,self.rat_2.col,self.rat_2.num_sprouts_eaten))
    return result
if __name__=='__main__':
    import doctest
    doctest.testmod()

Are you working in a line-oriented command shell? If so, I can see your confusion. A typical command shell has a special routine for simply displaying a variable value, one that does not convert output formatting characters. For instance, the default Python interactive shell gives us this:
>>> def fun_ret(num):
...     return '\n'+str(num)
...
>>> a=fun_ret(56)
>>> a
'\n56'
>>> print(a)

56
>>>
Note the difference: evaluating the bare name a displays the repr of the string, in which control characters appear in their escaped short-hand form, such as \n. The print function writes the string itself, which is where the \n character is turned into an actual "new line" operation.
Does that help clear up the problem?
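Since the actual goal is a passing doctest, it may help to know that doctest compares against exactly what the interactive shell would display. An example that evaluates a string therefore has to expect the escaped repr, while an example that calls print can expect the real lines. The sketch below shows both; EXAMPLES and run_examples are hypothetical names for this illustration, and <BLANKLINE> is doctest's marker for an empty output line.

```python
import doctest

def fun_ret(num):
    return '\n' + str(num)

# Raw string so the backslash reaches doctest unchanged.
EXAMPLES = r"""
>>> fun_ret(56)
'\n56'
>>> print(fun_ret(56))
<BLANKLINE>
56
"""

def run_examples():
    # Parse the examples above and run them with fun_ret available.
    parser = doctest.DocTestParser()
    test = parser.get_doctest(EXAMPLES, {'fun_ret': fun_ret},
                              'fun_ret_examples', '<examples>', 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)  # TestResults(failed=..., attempted=...)

results = run_examples()
print(results)
```

For a __str__ method, writing the doctest as print(maze) instead of str(maze) is usually the simpler way to state the expected multi-line output.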

I know this is a really old question at this point, but I found it while trying to solve a similar problem myself (I was doing a challenge in a course in which I had to print a string containing multiple lines out of a dictionary of strings). Since some of the comments led me to my answer, I'd like to share it here in case someone else has the same problem later.
Instead of just returning the string, as in OP's case:
def fun_ret(num):
    return '\n'+str(num)
you can print the string, like so:
def fun_ret(num):
    print('\n'+str(num))
This won't work if you need to assign the function's result to a variable somewhere, because the function now returns None. In that case you're better off using the first definition and calling it inside print:
print(fun_ret(num))
I'm not sure if this will help with the actual situation specified by OP (pretty new to Python myself), but it will cause the function call to print a new line, rather than just return the value, if that's what someone needs for their use case.

Python regex split into characters except if followed by parentheses

I have a string like "F(230,24)F[f(22)_(23);(2)%[+(45)FF]]", where each character except for parentheses and what they enclose represents a kind of instruction. A character can be followed by an optional list of arguments specified in optional parentheses.
I would like to split such a string into
['F(230,24)', 'F', '[', 'f(22)', '_(23)', ';(2)', '%', '[', '+(45)', 'F', 'F', ']', ']'], however at the moment I only get ['F(230,24)', 'F', '[', 'f(22)_(23);(2)', '%', '[', '+(45)', 'F', 'F', ']', ']'] (a substring was not split correctly).
Currently I am using list(filter(None, re.split(r'([A-Za-z\[\]\+\-\^\&\\\/%_;~](?!\())', string))), which is just a mess of characters and a negative lookahead for (. list(filter(None, <list>)) is used to remove empty strings from the result.
I am aware that this is likely caused by Python's re.split having been designed not to split on a zero-length match, as discussed here.
However, I was wondering what would be a good solution. Is there a better way than re.findall?
Thank you.
EDIT: Unfortunately I am not allowed to use third-party packages like the regex module.

You can use re.findall to find every single character optionally followed by a pair of parentheses:
import re
s = "F(230,24)F[f(22)_(23);(2)%[+(45)FF]]"
re.findall(r"[^()](?:\([^()]*\))?", s)
['F(230,24)',
'F',
'[',
'f(22)',
'_(23)',
';(2)',
'%',
'[',
'+(45)',
'F',
'F',
']',
']']
[^()] matches a single character other than a parenthesis;
(?:\([^()]*\))? is a non-capturing group, (?:...), that matches a pair of parentheses and whatever they enclose; the trailing ? makes the group optional.

I am aware that this is likely caused by Python's re.split having been designed not to split on a zero length match
You can use the VERSION1 flag of the regex module. Taking that example from the thread you've linked - see how split() produces zero-width matches as well:
>>> import regex as re
>>> re.split(r"\s+|\b", "Split along words, preserve punctuation!", flags=re.V1)
['', 'Split', 'along', 'words', ',', 'preserve', 'punctuation', '!']
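If the third-party regex module is not an option, note that since Python 3.7 the standard re.split also splits on patterns that match an empty string. The sketch below first repeats the example from the re documentation and then applies a zero-width lookahead to the string from the question. It rests on an assumption that is not stated in the question: argument lists contain only digits and commas.

```python
import re

# Python 3.7 or later: re.split() also splits on empty matches.
words = re.split(r'\b', 'Words, words, words.')
print(words)  # ['', 'Words', ', ', 'words', ', ', 'words', '.']

s = "F(230,24)F[f(22)_(23);(2)%[+(45)FF]]"
# Split in front of every character that cannot be part of an argument list.
# Assumption: arguments consist only of digits and commas.
tokens = [t for t in re.split(r'(?=[^\d,()])', s) if t]
print(tokens)
```

The list comprehension drops the empty string produced by the match at position 0. The re.findall approach above is still more robust, because it does not depend on what the arguments look like.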

Another solution. This time the pattern recognizes strings with the structure SYMBOL[(NUMBER[,NUMBER...])]. The function parse_it returns True and the tokens if the whole string matches the regular expression, and False and an empty string if it doesn't.
import re
def parse_it(string):
    '''
    Input: String to parse
    Output: True|False, Tokens|empty_string
    '''
    # Raw string so the backslashes reach the regex engine unchanged.
    pattern = re.compile(r'[A-Za-z\[\]\+\-\^\&\\\/%_;~](?:\(\d+(?:,\d+)*\))?')
    tokens = pattern.findall(string)
    # If joining the tokens does not rebuild the input, some text went unmatched.
    if ''.join(tokens) == string:
        res = (True, tokens)
    else:
        res = (False, '')
    return res
good_string = 'F(230,24)F[f(22)_(23);(2)%[+(45)FF]]'
bad_string = 'F(2a30,24)F[f(22)_(23);(2)%[+(45)FF]]' # There is an 'a' in a bad place.
print(parse_it(good_string))
print(parse_it(bad_string))
Output:
(True, ['F(230,24)', 'F', '[', 'f(22)', '_(23)', ';(2)', '%', '[', '+(45)', 'F', 'F', ']', ']'])
(False, '')

Multiple symbols replace not working

I need to check a string for some symbols and replace them with a whitespace. My code:
string = 'so\bad'
symbols = ['•', '!', '"', '#', '$', '%', '&', '\'', '(', ')', '*', '+', ',', '-', '.', '/', ':', ';', '<', '>', '=', '?', '#', '[', ']', '\\', '^', '_', '`', '{', '}', '~', '|', '"', '⌐', '¬', '«', '»', '£', '$', '°', '§', '–', '—']
for symbol in symbols:
    string = string.replace(symbol, ' ')
print string
>> sad
Why does it replace o\b with nothing?

This is because \b is the ASCII backspace character:
>>> string = 'so\bad'
>>> print string
sad
You can find it and all the other escape sequences in the Python Reference Manual.
In order to get the behavior you expect, escape the backslash character or use a raw string:
# Both contain a literal backslash, so the loop above turns them into 'so bad'
string = 'so\\bad'
string = r'so\bad'

The issue you are facing is the use of \ as an escape character.
\b is a special character (backspace).
Use a string literal with the prefix r.
With the r prefix, backslashes \ are treated as literal characters:
string = r'so\bad'

You are not replacing anything. "\b" is a backspace, which moves your cursor one step to the left.

Note that even if you omit the symbols list and your for symbol in symbols: code, you will always get the result "sad" when you print string. This is because \b is interpreted as a single ASCII character (backspace), not as a backslash followed by the letter b.
Check out this stackoverflow answer for a solution on how to work around this issue: How can I print out the string "\b" in Python
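To see what each literal really contains, it can help to look at the length and the repr instead of the printed text. A small sketch (the names plain and raw are only for illustration):

```python
plain = 'so\bad'   # \b is one character: backspace (0x08)
raw = r'so\bad'    # a backslash followed by the letter b

print(len(plain), repr(plain))  # 5 'so\x08ad'
print(len(raw), repr(raw))      # 6 'so\\bad'

# Only the raw version contains a backslash that replace() can find.
print(raw.replace('\\', ' '))    # so bad
print(plain.replace('\b', ' '))  # so ad
```

The terminal shows "sad" for the first string only because the backspace moves the cursor back over the "o" before "ad" is written.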

Removing punctuation in lists in Python

I am creating a Python program that converts a string to a list, uses a loop to remove any punctuation, then converts the list back into a string and prints the sentence without punctuation.
punctuation=['(', ')', '?', ':', ';', ',', '.', '!', '/', '"', "'"]
str=input("Type in a line of text: ")
alist=[]
alist.extend(str)
print(alist)
#Use loop to remove any punctuation (that appears on the punctuation list) from the list
print(''.join(alist))
This is what I have so far. I tried using something like: alist.remove(punctuation) but I get an error saying something like list.remove(x): x not in list. I didn't read the question properly at first and realized that I needed to do this by using a loop so I added that in as a comment and now I'm stuck. I was, however, successful in converting it from a list back into a string.

import string
punct = set(string.punctuation)
''.join(x for x in 'a man, a plan, a canal' if x not in punct)
Out[7]: 'a man a plan a canal'
Explanation: string.punctuation is pre-defined as:
'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
The rest is a straightforward comprehension. A set is used to speed up the filtering step.
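In Python 3, the same filtering can also be done with str.translate and a deletion table. This is an alternative sketch, not part of the answer above; the names table and cleaned are only for illustration.

```python
import string

# The third argument of str.maketrans lists characters to delete.
table = str.maketrans('', '', string.punctuation)
cleaned = 'a man, a plan, a canal'.translate(table)
print(cleaned)  # a man a plan a canal
```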

I found an easy way to do it:
punctuation = ['(', ')', '?', ':', ';', ',', '.', '!', '/', '"', "'"]
str = raw_input("Type in a line of text: ")
for i in punctuation:
    str = str.replace(i,"")
print str
This way you will not get any error.

punctuation=['(', ')', '?', ':', ';', ',', '.', '!', '/', '"', "'"]
result = ""
for character in str:
    if(character not in punctuation):
        result += character
print result
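For the exercise as stated in the question (string to list, loop, back to string), a sketch in Python 3 could look like the following. The function name strip_punctuation and the sample sentence are made up for this example. It also shows why alist.remove(punctuation) fails: remove() looks for one element equal to the whole punctuation list, and no single character is equal to a list.

```python
punctuation = ['(', ')', '?', ':', ';', ',', '.', '!', '/', '"', "'"]

def strip_punctuation(text):
    alist = list(text)           # convert the string to a list of characters
    kept = []
    for character in alist:      # loop over the list, as the exercise asks
        if character not in punctuation:
            kept.append(character)
    return ''.join(kept)         # convert the list back into a string

print(strip_punctuation("Hello, world! (test)"))  # Hello world test

# remove() searches for an element equal to its argument, here a whole list.
try:
    list("a,b").remove(punctuation)
except ValueError as error:
    print("ValueError:", error)
```

Building a new list avoids removing items from a list while iterating over it, which tends to skip elements.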

Here is how to tokenize the given statements using Python. The Python version I used is 3.4.4.
Assume that I have text saved as one.txt, and that my Python program is saved in the same directory as that file. The following is my Python program:
with open('one.txt','r') as myFile:
    str1=myFile.read()
print(str1) # print the given statements with punctuation (before removal)
# The following is the list of punctuation marks to remove; add more if I forgot any
punctuation = ['(', ')', '?', ':', ';', ',', '.', '!', '/', '"', "'"]
for i in punctuation:
    str1 = str1.replace(i," ") # replace each punctuation mark with a space
myList=[]
myList.extend(str1.split(" "))
print (str1) # print the given statements without punctuation (after removal)
for i in myList:
    # print ("____________")
    print(i,end='\n')
    print ("____________")
Next I will post how to remove stop words. Until then, please comment if this is useful.
Thank you
