How to check if a string has at least one of a certain character?
If the string is cool = "Sam!", how do I check whether it contains at least one "!"?
Use the in operator
>>> cool = "Sam!"
>>> '!' in cool
True
>>> '?' in cool
False
As you can see, '!' in cool returns a boolean that can be used further in your code.
In Python, a string is a sequence (like an array); therefore, the in operator can be used to check for the existence of a character in a Python string. The in operator tests membership in a sequence such as a string, list, or tuple.
cool = "Sam!"
'!' in cool # True
Alternately you can use any of the following for more information:
cool.find("!") # Returns the index of the first "!", which is 3 (or -1 if not found)
cool.index("!") # Same as find(), but raises ValueError if not found
cool.count("!") # Returns the number of occurrences of "!" in cool, which is 1
More info that you may find helpful:
http://www.tutorialspoint.com/python/python_strings.htm
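As a rough illustration of how the four options above differ when the character is missing, here is a minimal sketch. The helper name char_report is made up for this example and is not part of Python.

```python
def char_report(text, ch):
    """Summarise whether and where `ch` occurs in `text` (hypothetical helper)."""
    return {
        "present": ch in text,          # membership test, gives a bool
        "first_index": text.find(ch),   # -1 when `ch` is absent
        "count": text.count(ch),        # number of non-overlapping occurrences
    }

print(char_report("Sam!", "!"))
print(char_report("Sam!", "?"))

# index() behaves like find(), but raises instead of returning -1.
try:
    "Sam!".index("?")
except ValueError:
    print("'?' is not in the string")
```

If all you need is a yes/no answer, the in operator is the simplest choice; the other methods are only worth it when you need the position or the count.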
Use the following
cool = "Sam!"
if "!" in cool:
    pass # your code
Or just:
It_Is = "!" in cool
# some code
if It_Is:
    DoSmth()
else:
    DoNotDoSmth()
Related
I'm looking for a case-insensitive string comparison in Python.
I tried with:
if line.find('mandy') >= 0:
but it does not ignore case. I need to find a set of words in a given text file. I am reading the file line by line. The word on a line can be mandy, Mandy, MANDY, etc. (I don't want to use toupper/tolower, etc.).
I'm looking for the Python equivalent of the Perl code below.
if ($line=~/^Mandy Pande:/i)
If you don't want to use str.lower(), you can use a regular expression:
import re
if re.search('mandy', 'Mandy Pande', re.IGNORECASE):
    pass # the returned match object is truthy, so this branch runs
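Since the question mentions searching for a set of words, line by line, here is a minimal sketch of one way to do that with a single compiled pattern. The helper name make_word_matcher and the sample lines are assumptions made up for this example.

```python
import re

def make_word_matcher(words):
    """Build a case-insensitive pattern matching any of `words` (hypothetical helper)."""
    # re.escape keeps characters such as '.' or ':' in a word literal.
    alternatives = "|".join(re.escape(word) for word in words)
    return re.compile(alternatives, re.IGNORECASE)

matcher = make_word_matcher(["mandy", "pande"])

# Stand-in for lines read from a file.
lines = ["Mandy Pande: hello", "nothing here", "call MANDY later"]
matching = [line for line in lines if matcher.search(line)]
print(matching)
```

To mimic the anchored Perl pattern /^Mandy Pande:/i, re.match('Mandy Pande:', line, re.IGNORECASE) should do, because re.match only matches at the start of the string.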
There's another post here. Try looking at this.
BTW, you're looking for the .lower() method:
string1 = "hi"
string2 = "HI"
if string1.lower() == string2.lower():
    print("Equals!")
else:
    print("Different!")
One can use the in operator after applying str.casefold to both strings.
str.casefold is the recommended method for use in case-insensitive comparison.
Return a casefolded copy of the string. Casefolded strings may be used for caseless matching.
Casefolding is similar to lowercasing but more aggressive because it is intended to remove all case distinctions in a string. For example, the German lowercase letter 'ß' is equivalent to "ss". Since it is already lowercase, lower() would do nothing to 'ß'; casefold() converts it to "ss".
The casefolding algorithm is described in section 3.13 of the Unicode Standard.
New in version 3.3.
For case-insensitive substring search:
needle = "TEST"
haystack = "testing"
if needle.casefold() in haystack.casefold():
    print('Found needle in haystack')
For case-insensitive string comparison:
a = "test"
b = "TEST"
if a.casefold() == b.casefold():
    print('a and b are equal, ignoring case')
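To make the difference between lower() and casefold() concrete, here is a small sketch built on the 'ß' example from the documentation quoted above. The function name contains_caseless is made up for this illustration.

```python
upper = "STRASSE"
mixed = "Straße"

# lower() leaves 'ß' alone, so the two strings still differ.
print(upper.lower() == mixed.lower())
# casefold() turns 'ß' into 'ss', so the two strings compare equal.
print(upper.casefold() == mixed.casefold())

def contains_caseless(needle, haystack):
    """Case-insensitive substring test using casefold() (hypothetical helper)."""
    return needle.casefold() in haystack.casefold()

print(contains_caseless("STRASSE", "Hauptstraße 5"))
```

For plain ASCII text the two methods give the same result, so the choice mostly matters when the input may contain non-English text.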
Try:
if haystackstr.lower().find(needlestr.lower()) != -1:
    pass # needlestr was found, ignoring case
a = "MandY"
alow = a.lower()
if "mandy" in alow:
    print("true")
As a workaround, you can also use: s.lower() in str.lower()
You can use the in operator in conjunction with the lower() method of strings.
if "mandy" in line.lower():
    ...
import re
if re.search('(?i)Mandy Pande:', line):
    ...
See this.
In [14]: re.match("mandy", "MaNdY", re.IGNORECASE)
Out[14]: <_sre.SRE_Match object at 0x23a08b8>
If it is a pandas series, you can mention case=False in the str.contains
data['Column_name'].str.contains('abcd', case=False)
OR if it is just two string comparisons try the other method below
You can use the casefold() method. It returns a case-folded copy of the string, so comparing the casefolded versions of two strings ignores case.
firstString = "Hi EVERYONE"
secondString = "Hi everyone"
if firstString.casefold() == secondString.casefold():
    print('The strings are equal.')
else:
    print('The strings are not equal.')
Output:
The strings are equal.
I want to get a new string from the third character to the end of the string, e.g. myString[2:end]. If omitting the second part means 'to the end', does omitting the first part mean it starts from the start?
>>> x = "Hello World!"
>>> x[2:]
'llo World!'
>>> x[:2]
'He'
>>> x[:-2]
'Hello Worl'
>>> x[-2:]
'd!'
>>> x[2:-2]
'llo Worl'
Python calls this concept "slicing" and it works on more than just strings. Take a look here for a comprehensive introduction.
Just for completeness as nobody else has mentioned it. The third parameter to an array slice is a step. So reversing a string is as simple as:
some_string[::-1]
Or selecting alternate characters would be:
"H-e-l-l-o- -W-o-r-l-d"[::2] # outputs "Hello World"
The ability to step forwards and backwards through the string maintains consistency with being able to array slice from the start or end.
Substr() normally (e.g. in PHP and Perl) works this way:
s = Substr(s, beginning, LENGTH)
So the parameters are beginning and LENGTH.
But Python's behaviour is different; it expects beginning and one after END (!). This is easy for beginners to miss. So the correct replacement for Substr(s, beginning, LENGTH) is
s = s[ beginning : beginning + LENGTH]
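If you port a lot of such calls, a tiny wrapper may be convenient. This is only a sketch: the function name substr is chosen here to mirror the PHP/Perl name, and it assumes a non-negative beginning (negative offsets would need extra handling).

```python
def substr(s, beginning, length):
    """Return `length` characters of `s` starting at index `beginning`.

    Hypothetical helper; assumes `beginning` is not negative.
    """
    return s[beginning:beginning + length]

print(substr("Hello World!", 6, 5))
# A length that runs past the end is simply cut off at the end of the string.
print(substr("Hello", 3, 10))
```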
A common way to achieve this is by string slicing.
MyString[a:b] gives you a substring from index a to (b - 1).
One example seems to be missing here: full (shallow) copy.
>>> x = "Hello World!"
>>> x
'Hello World!'
>>> x[:]
'Hello World!'
>>> x==x[:]
True
>>>
The [:] slice is a common idiom for creating a shallow copy of a sequence type such as a list (for strings, which are immutable, a real copy is not needed). See Python list slice syntax used for no obvious reason.
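A short sketch of what "shallow" means for a list copied with [:] may help here. The variable names are made up for this example.

```python
original = [[1, 2], [3, 4]]
shallow = original[:]          # new outer list, same inner lists

print(shallow == original)     # equal contents
print(shallow is original)     # but a different list object

shallow.append([5, 6])         # only changes the copy
shallow[0].append(99)          # changes the inner list shared by both

print(original)
print(shallow)
```

When the nested objects must be independent as well, copy.deepcopy from the standard library is the usual tool.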
Is there a way to substring a string in Python, to get a new string from the 3rd character to the end of the string?
Maybe like myString[2:end]?
Yes, this actually works if you assign, or bind, the name end to the constant singleton None:
>>> end = None
>>> myString = '1234567890'
>>> myString[2:end]
'34567890'
Slice notation has 3 important arguments:
start
stop
step
Their defaults when not given are None - but we can pass them explicitly:
>>> stop = step = None
>>> start = 2
>>> myString[start:stop:step]
'34567890'
If leaving the second part means 'till the end', if you leave the first part, does it start from the start?
Yes, for example:
>>> start = None
>>> stop = 2
>>> myString[start:stop:step]
'12'
Note that we include start in the slice, but we only go up to, and not including, stop.
When step is None, by default the slice uses 1 for the step. If you step with a negative integer, Python is smart enough to go from the end to the beginning.
>>> myString[::-1]
'0987654321'
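One way to see how the None defaults are filled in is the indices() method of slice objects, which resolves a slice against a given length. The following is a small sketch; the variable names are chosen for this example.

```python
my_string = "1234567890"

examples = [
    slice(2, None),           # like my_string[2:]
    slice(None, 2),           # like my_string[:2]
    slice(None, None, -1),    # like my_string[::-1]
]

# indices(length) returns the concrete (start, stop, step) that will be used.
resolved = [s.indices(len(my_string)) for s in examples]

for s, concrete in zip(examples, resolved):
    print(s, "->", concrete, "->", repr(my_string[s]))
```

Note how the negative step flips the defaults: start becomes the last index and stop becomes -1, meaning "just before index 0".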
I explain slice notation in great detail in my answer to Explain slice notation Question.
I would like to add two points to the discussion:
You can use None instead of an empty space to specify "from the start" or "to the end":
'abcde'[2:None] == 'abcde'[2:] == 'cde'
This is particularly helpful in functions, where you can't provide an empty space as an argument:
def substring(s, start, end):
    """Return the part of string `s` from index `start` up to,
    but not including, index `end` (`None` means "no limit").
    Examples
    --------
    >>> substring('abcde', 0, 3)
    'abc'
    >>> substring('abcde', 1, None)
    'bcde'
    """
    return s[start:end]
Python has slice objects:
idx = slice(2, None)
'abcde'[idx] == 'abcde'[2:] == 'cde'
You've got it right there except for "end". It's called slice notation. Your example should read:
new_sub_string = myString[2:]
If you leave out the second parameter it is implicitly the end of the string.
text = "StackOverflow"
# using Python slicing, you can get different subsets of the above string
# reverse of the string
text[::-1] # 'wolfrevOkcatS'
# first five characters
text[:5] # 'Stack'
# last five characters
text[-5:] # 'rflow'
# 3rd character to the fifth character
text[2:5] # 'ack'
# characters at even positions (2nd, 4th, ...)
text[1::2] # 'tcOefo'
If myString contains an account number that begins at offset 6 and has length 9, then you can extract the account number this way: acct = myString[6:][:9].
If the OP accepts that, they might want to try, in an experimental fashion,
myString[2:][:999999]
It works - no error is raised, and no default 'string padding' occurs.
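The reason this is safe is that slices are clamped to the string's bounds, whereas indexing a single position is not. A minimal sketch follows; the helper name safe_char is invented for this example.

```python
my_string = "1234567890"

# Slicing past the end is clamped, so no error is raised.
print(repr(my_string[2:][:999999]))
print(repr(my_string[50:]))

def safe_char(s, i, default=""):
    """Return s[i], or `default` when `i` is out of range (hypothetical helper)."""
    try:
        return s[i]
    except IndexError:
        return default

# Indexing a single out-of-range position raises IndexError without the guard.
print(repr(safe_char(my_string, 50)))
```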
Well, I got a situation where I needed to translate a PHP script to Python, and it had many usages of substr(string, beginning, LENGTH).
If I chose Python's string[beginning:end] I'd have to calculate a lot of end indexes, so the easier way was to use string[beginning:][:length], it saved me a lot of trouble.
>>> str1 = 'There you are'
>>> str1[:]
'There you are'
>>> str1[1:]
'here you are'
# To print alternate characters, skipping one element in between
>>> str1[::2]
'Teeyuae'
# To print the last element
>>> str1[:-2:-1]
'e'
# Using the slice datatype
>>> str1 = 'There you are'
>>> s1 = slice(2, 6)
>>> str1[s1]
'ere '
Maybe I missed it, but I couldn't find a complete answer on this page to the original question(s) because variables are not further discussed here. So I had to go on searching.
Since I'm not yet allowed to comment, let me add my conclusion here. I'm sure I was not the only one interested in it when accessing this page:
>>> myString = 'Hello World'
>>> end = 5
>>> myString[2:end]
'llo'
If you leave out the first part, you get
>>> myString[:end]
'Hello'
And if you leave out the : in the middle as well, you get the simplest substring, a single character: the one at index 5 (counting starts at 0, so it's the blank in this case):
>>> myString[end]
' '
Using hardcoded indexes can get messy.
To avoid that, Python offers the built-in slice() object.
string = "my company has 1000$ on profit, but I lost 500$ gambling."
Suppose we want to know how much money I have left.
Normal solution:
final = int(string[15:19]) - int(string[43:46])
print(final) # 500
Using slices:
EARNINGS = slice(15, 19)
LOSSES = slice(43, 46)
final = int(string[EARNINGS]) - int(string[LOSSES])
print(final) # 500
Using named slices improves readability.
a = "Helloo"
print(a[:-1])
In the above code, [:-1] selects everything from the start up to, but not including, the last character.
OUTPUT:
Hello
Note: a[:-1] is the same as a[0:-1] and a[0:len(a)-1].
a = "I Am Siva"
print(a[2:])
OUTPUT:
Am Siva
In the above code, a[2:] selects everything from index 2 through the last element.
Remember that if you set the upper limit of a slice to x, the result runs up to index x - 1, and that the indexes of a list or string always start from 0.
I have a simpler solution using a for loop to find a given substring in a string.
Let's say we have two string variables,
main_string = "lullaby"
match_string = "ll"
If you want to check whether the given match string exists in the main string, you can do this,
match_string_len = len(match_string)
for index, value in enumerate(main_string):
    sub_string = main_string[index:match_string_len + index]
    if sub_string == match_string:
        print("match string found in main string")
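The same sliding-window idea can be wrapped in a function that stops at the first hit and reports its position. This is only a sketch, and the name find_substring is made up here. In everyday code, match_string in main_string or main_string.find(match_string) does the same job.

```python
def find_substring(main_string, match_string):
    """Return the index of the first occurrence of `match_string`, or -1.

    Hypothetical helper illustrating the manual window comparison.
    """
    window = len(match_string)
    # Only start positions that leave room for a full window are worth checking.
    for index in range(len(main_string) - window + 1):
        if main_string[index:index + window] == match_string:
            return index
    return -1

print(find_substring("lullaby", "ll"))
print(find_substring("lullaby", "zz"))
```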
I am looking for a function that combines the methods isalpha() and isspace() into a single method.
I want to check if a given string only contains letters and/or spaces, for example:
"This is text".isalpha_or_space()
# True
However, with the 2 methods, I get:
"This is text".isalpha() or "This is text".isspace()
# False
as the string is not only alpha nor space.
Of course, I could iterate over every character and check it for space or alpha.
I could also compare the string with ("abcdefghijklmnopqrstuvwxyz" + " ")
However, both of these approaches don't seem very pythonic to me - convince me otherwise.
The most Pythonic approach is to define a function for this:
def isalpha_or_space(self):
    if self == "":
        return False
    for char in self:
        if not (char.isalpha() or char.isspace()):
            return False
    return True
It is not easy to contribute this as a method on str, since Python does not encourage the monkeypatching of built-in types. My recommendation is just to leave this as a module level function.
Nonetheless, it is still possible to mimic the interface of a method, since most namespaces in Python are writable if you know where to find them. The suggestion below is not Pythonic, and relies on implementation detail.
>>> import gc
>>> def monkeypatch(type_, func):
...     gc.get_referents(type_.__dict__)[0][func.__name__] = func
...
>>> monkeypatch(str, isalpha_or_space)
>>> "hello world".isalpha_or_space()
True
Use a regular expression (regex):
>>> import re
>>> result = re.match(r'[a-zA-Z\s]+$', "This is text")
>>> bool(result)
True
Breakdown:
re - Python's regex module
[a-zA-Z\s] - Any letter or whitespace
+ - One or more of the previous item
$ - End of string
The above works with ASCII letters. For the full Unicode range on Python 3, unfortunately the regex is a bit complicated:
>>> result = re.match(r'([^\W\d_]|\s)+$', 'un café')
Breakdown:
(x|y) - x or y
[^\W\d_] - Any word character except a number or an underscore
From Mark Tolonen's answer on How to match all unicode alphabetic characters and spaces in a regex?
You can use the following solution:
s != '' and all(c.isalpha() or c.isspace() for c in s)
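Wrapped in a module-level function, the one-liner can be tried on a few inputs. This is a sketch; the function name is borrowed from the question.

```python
def isalpha_or_space(s):
    """True if `s` is non-empty and holds only letters and/or whitespace."""
    return s != "" and all(c.isalpha() or c.isspace() for c in s)

for sample in ["This is text", "un café", "abc 123", "", "   "]:
    print(repr(sample), isalpha_or_space(sample))
```

Note that a string made only of spaces passes this check; if that is unwanted, an extra condition such as not s.isspace() could be added.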
I have several strings in Python. Let's assume each string is associated with a variable. These strings are only composed of characters and integers:
one = '74A76B217C'
two = '8B7A1742B'
three = '8123A9A8B'
I would like a conditional in my code which checks these strings if 'A' exists first, and if so, return the integer.
So, in the example above, the first integer and first character is: for one, 74 and A; for two, 8 and B; for three, 8123 and A.
For the function, one would return True, and 74; two would be False, and three would be 8123 and A.
My problem is, I am not sure how to efficiently parse the strings in order to check for the first integer and character.
In Python, there are methods to check whether the character exists in the string, e.g.
if 'A' in one:
print('contains A')
But this doesn't take order into account. What is the most efficient way to access the first character and first integer, or at least check whether the first character to occur in a string is of a certain identity?
Try this as an alternative of regex:
def check(s):
    i = s.find('A')
    if i > 0 and s[:i].isdigit():
        return int(s[:i]), True
    return False
# check(one)   -> (74, True)
# check(two)   -> False
# check(three) -> (8123, True)
Use regex: ^(\d+)A
import re
def check_string(s):
    m_test = re.search(r"^(\d+)A", s)
    if m_test:
        return m_test.group(1)
See online regex tester:
https://regex101.com/r/LmSbly/1
Try a regular expression.
>>> import re
>>> re.search('[0-9]+', '74A76B217C').group(0)
'74'
>>> re.search('[A-Za-z]', '74A76B217C').group(0)
'A'
You can use a regex:
>>> re.match('^([0-9]+)A', string)
For example:
import re
for s in ['74A76B217C', '8B7A1742B', '8123A9A8B']:
    match = re.match('^([0-9]+)A', s)
    print(match is not None and match.group(1))
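A slightly more general sketch of the same pattern returns the leading integer as an int, or None when the string does not start with digits followed by the given letter. The function name leading_number_before and its letter parameter are assumptions made for this example.

```python
import re

def leading_number_before(s, letter="A"):
    """Return the integer at the start of `s` if `letter` follows it, else None.

    Hypothetical helper; re.match only matches at the start of the string.
    """
    match = re.match(r"(\d+)" + re.escape(letter), s)
    return int(match.group(1)) if match else None

for s in ["74A76B217C", "8B7A1742B", "8123A9A8B"]:
    print(s, leading_number_before(s))
```

Returning None rather than False keeps a legitimate result of 0 (as in "0A") distinguishable from "no match".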
I am new to regular expressions and wrote this to check whether idno has digits from 0 to 9 in the first nine characters and V, v, X or x as the last. Is the syntax correct? It raises an error saying two arguments are required.
Another problem is that it should be only 10 characters long. I used separate code to validate that, but can I integrate it into this too?
if len(idno) is 10:
    if re.match("[0-9]{9}[VvXx],idno") == true:
        print "Valid"
You have more wrong there than right, I'm afraid. Note the following:
You should really compare integers by equality (== 10) not identity (is 10) - CPython interns small integers, so your current code will work, but that's an implementation detail you shouldn't rely on;
If you add $ (end of string) to the end the regular expression will only match strings ten characters long, making the len check unnecessary anyway;
The quotes are in the wrong place, so you're passing a single string to re.match, rather than the pattern and the name you want to try to match it in - the comma and idno are all part of the pattern parameter;
'true' != 'True': Python is case-sensitive, and the booleans start with capital letters;
re.match returns either an SRE_Match object or None, neither of which == True. However, it's pretty awkward to write == True even where you're only getting True or False, and you can use the fact that Match is truth-y and None is false-y to write the much neater if some_thing: rather than if some_thing == True:; and
Regular expressions already have a case covering [0-9], you can just use \d (digit).
Your code should therefore be:
if re.match(r'\d{9}[VvXx]$', idno):
    # note the 'raw' string (r'...'), to avoid escaping the backslash
    print "Valid"
You could simplify further using the re.IGNORECASE flag and making the group for the last character [vx]. A few examples:
>>> import re
>>> for test in ('123456789x', '123456789a', '123abc456x', '123456789xa'):
...     # re.I is a shorter version of re.IGNORECASE
...     print test, re.match(r'\d{9}[vx]$', test, re.I)
...
123456789x <_sre.SRE_Match object at 0x10041e308> # valid
123456789a None # wrong final letter
123abc456x None # non-digits in first nine characters
123456789xa None # start matches but ends with additional character
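On Python 3.4 or later, re.fullmatch offers another way to require that the whole string fits the pattern, without a trailing $. The following is a sketch under that assumption; the names ID_PATTERN and is_valid_id are made up for this example.

```python
import re

# Nine digits followed by a single v or x, in either case.
ID_PATTERN = re.compile(r"\d{9}[vx]", re.IGNORECASE)

def is_valid_id(idno):
    """True if the whole of `idno` matches the ID format (hypothetical helper)."""
    return ID_PATTERN.fullmatch(idno) is not None

for test in ("123456789x", "123456789V", "123456789a", "123abc456x", "123456789xa"):
    print(test, is_valid_id(test))
```

Because fullmatch must consume the entire string, a trailing newline such as in "123456789x\n" is rejected too, which a pattern ending in $ would accept.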