Most of the questions I've found are about looking for letters in numbers, whereas I'm looking for numbers in what I'd like to be a numberless string.
I need to enter a string, check whether it contains any numbers, and reject it if it does.
The function isdigit() only returns True if ALL of the characters are numbers. I just want to see whether the user has entered a number anywhere, as in a sentence like "I own 1 dog".
Any ideas?
You can use the any function together with str.isdigit, like this:
def has_numbers(inputString):
    return any(char.isdigit() for char in inputString)
has_numbers("I own 1 dog")
# True
has_numbers("I own no dog")
# False
Alternatively, you can use a regular expression, like this:
import re
def has_numbers(inputString):
    return bool(re.search(r'\d', inputString))
has_numbers("I own 1 dog")
# True
has_numbers("I own no dog")
# False
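One detail worth knowing about both versions above: str.isdigit is Unicode-aware, so it also accepts characters such as superscript digits. The sketch below (function names are made up for illustration) contrasts that with a check restricted to the characters 0-9, in case only plain ASCII digits should cause a rejection.

```python
import re

def has_any_digit(text):
    """True if any character is a digit according to str.isdigit (Unicode-aware)."""
    return any(ch.isdigit() for ch in text)

def has_ascii_digit(text):
    """True only if the text contains one of the characters 0-9."""
    return re.search(r'[0-9]', text) is not None

print(has_any_digit('I own 1 dog'), has_ascii_digit('I own 1 dog'))  # True True
print(has_any_digit('x\u00b2'), has_ascii_digit('x\u00b2'))          # True False
```

Which of the two is appropriate depends on what counts as a "number" for your input.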
You can use a combination of any and str.isdigit:
def num_there(s):
    return any(i.isdigit() for i in s)
The function will return True if a digit exists in the string, otherwise False.
Demo:
>>> king = 'I shall have 3 cakes'
>>> num_there(king)
True
>>> servant = 'I do not have any cakes'
>>> num_there(servant)
False
Use the Python method str.isalpha(). This function returns True if all characters in the string are alphabetic and there is at least one character; returns False otherwise.
Python Docs: https://docs.python.org/3/library/stdtypes.html#str.isalpha
https://docs.python.org/2/library/re.html
A regular expression is a better choice here; in the timings below it is faster, especially when precompiled.
import re
def f1(string):
    return any(i.isdigit() for i in string)
def f2(string):
    return re.search(r'\d', string)
# if you compile the regex first, it's even faster
RE_D = re.compile(r'\d')
def f3(string):
    return RE_D.search(string)
# Output from iPython
# In [18]: %timeit f1('assdfgag123')
# 1000000 loops, best of 3: 1.18 µs per loop
# In [19]: %timeit f2('assdfgag123')
# 1000000 loops, best of 3: 923 ns per loop
# In [20]: %timeit f3('assdfgag123')
# 1000000 loops, best of 3: 384 ns per loop
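To repeat this comparison outside IPython, the standard timeit module can be used. This is a minimal sketch; the helper name time_all and the candidate labels are assumptions for illustration, and the numbers will differ by machine and Python version.

```python
import re
import timeit

RE_D = re.compile(r'\d')  # precompiled pattern, as in f3 above

def with_any(s):
    return any(ch.isdigit() for ch in s)

def with_search(s):
    return re.search(r'\d', s) is not None

def with_compiled(s):
    return RE_D.search(s) is not None

def time_all(sample, number=100_000):
    """Return the best-of-3 total time in seconds for each candidate."""
    candidates = {'any': with_any, 'search': with_search, 'compiled': with_compiled}
    return {
        name: min(timeit.repeat(lambda f=func: f(sample), number=number, repeat=3))
        for name, func in candidates.items()
    }
```

Calling time_all('assdfgag123') returns a dict of timings that can be printed or compared. Results for a single short string may not carry over to longer inputs.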
You could apply the function isdigit() on every character in the String. Or you could use regular expressions.
Also I found How do I find one number in a string in Python? with very suitable ways to return numbers. The solution below is from the answer in that question.
number = re.search(r'\d+', yourString).group()
Alternatively:
number = filter(str.isdigit, yourString)
For further information, take a look at the regex documentation: http://docs.python.org/2/library/re.html
Edit: This returns the actual numbers, not a boolean value, so the answers above are better suited to your case.
The first method will return the first digit and subsequent consecutive digits. Thus 1.56 will be returned as 1. 10,000 will be returned as 10. 0207-100-1000 will be returned as 0207.
The second method does not work.
To extract all digits, dots and commas, and not lose non-consecutive digits, use:
re.sub(r'[^\d.,]', '', yourString)
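The three extraction variants discussed above can be put side by side. This is a sketch for Python 3 with made-up function names; note that re.search returns None when nothing matches, so the first helper guards against that before calling group().

```python
import re

def first_number(text):
    """Return the first run of consecutive digits, or None if there is none."""
    match = re.search(r'\d+', text)
    return match.group() if match else None

def all_digits(text):
    """Return every digit in the text joined into one string."""
    # In Python 3, filter returns an iterator, so the result is joined explicitly.
    return ''.join(filter(str.isdigit, text))

def digits_dots_commas(text):
    """Keep only digits, dots and commas."""
    return re.sub(r'[^\d.,]', '', text)

print(first_number('I own 1.56 dogs'))   # 1
print(all_digits('0207-100-1000'))       # 02071001000
print(digits_dots_commas('about 10,000'))  # 10,000
```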
I'm surprised that no one mentioned this combination of any and map:
def contains_digit(s):
    isdigit = str.isdigit
    return any(map(isdigit, s))
In Python 3 it's probably the fastest option (except maybe for regexes), because it doesn't contain any explicit loop (and aliasing the function avoids looking it up in str).
Don't use it in Python 2, where map returns a list, which defeats the short-circuiting of any.
You can accomplish this as follows:
if a_string.isdigit():
    do_this()
else:
    do_that()
https://docs.python.org/2/library/stdtypes.html#str.isdigit
Using .isdigit() also means not having to resort to exception handling (try/except) in cases where you need to use list comprehension (try/except is not possible inside a list comprehension).
You can use NLTK for this.
This will find both '1' and 'One' in the text:
import nltk
def existence_of_numeric_data(text):
    text = nltk.word_tokenize(text)
    pos = nltk.pos_tag(text)
    for word, pos_tag in pos:
        if pos_tag == 'CD':  # CD is the cardinal-number tag
            return True
    return False
existence_of_numeric_data('We are going out. Just five you and me.')
You can use range with str.count to count how many digits appear in the string:
def count_digit(a):
    total = 0
    for i in range(10):
        total += a.count(str(i))
    return total
ans = count_digit("apple3rh5")
print(ans)
# This prints 2
import string
import random
n = 10
p = ''
# Regenerate until p holds at least one uppercase letter, one lowercase letter and one digit.
while not (any(c in string.ascii_uppercase for c in p)
           and any(c in string.ascii_lowercase for c in p)
           and any(c in string.digits for c in p)):
    p = ''
    for _ in range(n):
        state = random.randint(0, 2)
        if state == 0:
            p = p + chr(random.randint(97, 122))  # lowercase a-z
        elif state == 1:
            p = p + chr(random.randint(65, 90))  # uppercase A-Z
        else:
            p = p + str(random.randint(0, 9))  # digit 0-9
print(p)
This code generates a sequence of size n that contains at least one uppercase letter, one lowercase letter, and one digit. The while loop regenerates the sequence until that condition holds.
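An alternative that avoids retrying is to guarantee the three character classes by construction. The following is a sketch under that assumption; make_sequence is a hypothetical name, and random is not suitable for security-sensitive values.

```python
import random
import string

def make_sequence(n=10):
    """Build a length-n string with at least one uppercase, lowercase, and digit."""
    if n < 3:
        raise ValueError('n must be at least 3 to fit one character of each kind')
    # Start with one guaranteed character from each class.
    chars = [
        random.choice(string.ascii_uppercase),
        random.choice(string.ascii_lowercase),
        random.choice(string.digits),
    ]
    # Fill the remaining positions from the combined alphabet.
    pool = string.ascii_letters + string.digits
    chars += random.choices(pool, k=n - 3)
    random.shuffle(chars)  # so the guaranteed characters are not always first
    return ''.join(chars)

print(make_sequence(10))
```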
any and ord can be combined to serve the purpose as shown below.
>>> def hasDigits(s):
...     return any(48 <= ord(char) <= 57 for char in s)
...
>>> hasDigits('as1')
True
>>> hasDigits('as')
False
>>> hasDigits('as9')
True
>>> hasDigits('as_')
False
>>> hasDigits('1as')
True
>>>
A couple of points about this implementation:
any is a good fit because it short-circuits, like a logical expression in C, and returns as soon as the result is determined; for the string 'a1bbbbbbc', the 'b's and the 'c' are never examined.
ord is useful because it gives more flexibility, such as checking only for digits between '0' and '5' or any other range. For example, a validator for the hexadecimal representation of numbers would accept letters only in the range 'A' to 'F'.
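The hexadecimal case mentioned above could look roughly like this. It is a sketch; is_hex is a made-up name, and accepting lowercase 'a' to 'f' as well is an assumption beyond what the answer states.

```python
def is_hex(s):
    """Return True if s is non-empty and every character is 0-9, A-F, or a-f."""
    def is_hex_char(ch):
        code = ord(ch)
        return (48 <= code <= 57        # '0'..'9'
                or 65 <= code <= 70     # 'A'..'F'
                or 97 <= code <= 102)   # 'a'..'f' (assumed to be acceptable)
    # all() short-circuits on the first invalid character, like any() above.
    return bool(s) and all(is_hex_char(ch) for ch in s)

print(is_hex('1A3f'))  # True
print(is_hex('1G'))    # False
```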
What about this one?
import string
def containsNumber(line):
    res = False
    try:
        for val in line.split():
            if (float(val.strip(string.punctuation))):
                res = True
                break
    except ValueError:
        pass
    return res
containsNumber('234.12 a22') # returns True
containsNumber('234.12L a22') # returns False
containsNumber('234.12, a22') # returns True
I'll make @zyxue's answer a bit more explicit:
import re
RE_D = re.compile(r'\d')
def has_digits(string):
    res = RE_D.search(string)
    return res is not None
has_digits('asdf1')
Out: True
has_digits('asdf')
Out: False
This is the fastest of the solutions that @zyxue benchmarked in that answer.
Also, you could use regex findall. It's a more general solution since it adds more control over the length of the number. It could be helpful in cases where you require a number with minimal length.
s = '67389kjsdk'
contains_digit = len(re.findall(r'\d+', s)) > 0
A simpler way to solve it:
s = '1dfss3sw235fsf7s'
count = 0
for item in s:
    if item.isdigit():
        count = count + 1
print(count)
alp_num = [x for x in string.split() if x.isalnum() and re.search(r'\d', x) and
           re.search(r'[a-z]', x)]
print(alp_num)
This returns every word in string that contains both letters and digits. isalnum() on its own would also accept words made up of only digits or only letters.
This too will work.
if any(i.isdigit() for i in s):
    print("True")
You can also use set.intersection
It is quite fast, better than regex for small strings.
def contains_number(string):
    return True if set(string).intersection('0123456789') else False
An iterator approach. It consumes characters until a digit is met. The second argument of next sets the default value to return when the iterator is exhausted. Here it is set to False, but '' also works, since it is evaluated as a boolean in the if.
def has_digit(string):
    str_iter = iter(string)
    while True:
        char = next(str_iter, False)
        # check if iterator is empty
        if char:
            if char.isdigit():
                return True
        else:
            return False
or by looking only at the first item of a generator expression:
def has_digit(string):
    return next((True for char in string if char.isdigit()), False)
I'm surprised nobody has used the python operator in. Using this would work as follows:
foo = '1dfss3sw235fsf7s'
bar = 'lorem ipsum sit dolor amet'
def contains_number(string):
    for i in range(10):
        if str(i) in list(string):
            return True
    return False
print(contains_number(foo)) #True
print(contains_number(bar)) #False
Or we could use the function isdigit():
foo = '1dfss3sw235fsf7s'
bar = 'lorem ipsum sit dolor amet'
def contains_number(string):
    for i in list(string):
        if i.isdigit():
            return True
    return False
print(contains_number(foo)) #True
print(contains_number(bar)) #False
These functions basically just convert s into a list, and check whether the list contains a digit. If it does, it returns True, if not, it returns False.
Related
Given 2 strings, each containing a DNA sequence, the function returns a bool to show if a contiguous sub-fragment of length 5 or above exists in string1 that can pair w/ a fragment of str2.
Here is what I tried using the functions "complement" and "reverese_complement" that I created but it doesn't give the right result:
def complement5(sequence1,sequence2):
    for i in range(len(sequence1) - 4):
        substr1 = sequence1[i:i+5]
        if complement(substr1) in sequence2 or reverese_complement(substr1) in sequence2:
            print(f'{substr1, complement(substr1), reverese_complement(substr1)} in {sequence2}')
            return True
        else:
            return False
Then when I try:
complement5('CAATTCC','CTTAAGG')
it gives False instead of True
I personally would try to identify the whole complement first, then try to use the python function find(). See the example below
s = "This be a string"
if s.find("is") == -1:
    print("No 'is' here!")
else:
    print("Found 'is' in the string.")
So... for your genetics problem here:
Given str1 = "AGAGAG..."
Given str2 = "TCTCTC..."
Iterate through str1 sub-chunks (of length 5?) with a for loop and identify possible complements.
str1_complement_chunk = "TCTCTC"  # use a method here; obviously I'm simplifying
if str2.find(str1_complement_chunk) == -1:
    print("dang couldn't find it")
else:
    print('There exists a complement at <index>')
Additionally, you may reverse the chunk if you need to do so. A string can be reversed with slicing, as in str1[::-1].
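Putting the steps above together, a sketch might look like the following. The complement and reverse_complement helpers are assumptions here, since the asker's own versions are not shown; they use standard A-T and C-G pairing. Checking chunks of exactly length 5 is enough, because any longer match contains a length-5 match.

```python
# Assumed pairing table; the original helper functions are not shown in the question.
PAIRS = str.maketrans('ACGT', 'TGCA')

def complement(seq):
    return seq.translate(PAIRS)

def reverse_complement(seq):
    return complement(seq)[::-1]

def has_pairing_fragment(seq1, seq2, length=5):
    """Return True if any length-sized chunk of seq1 pairs with part of seq2."""
    for i in range(len(seq1) - length + 1):
        chunk = seq1[i:i + length]
        if (seq2.find(complement(chunk)) != -1
                or seq2.find(reverse_complement(chunk)) != -1):
            return True
    # Only give up after every chunk has been tried.
    return False

print(has_pairing_fragment('CAATTCC', 'CTTAAGG'))  # True
```

The difference from the code in the question is that False is returned only after the loop finishes, not on the first chunk that fails to match.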
I'm a beginner and I have a question. Is there any possibility to compare characters inside strings?
I made a function:
def animal_crackers(text):
    text1 = text.split()
    a = ''
    count = 0
    for a in text1:
        for char in enumerate(a):
            if char[0] == char[1]:
                return True
            else:
                return False
Result:
>>> animal_crackers('Spam Spam')
>>> False
The logic is that I'm trying to split a string consisting of two words. Then I set those words with 1st "for" cycle and then I'm trying to get inside the string with the 2nd and this "char in enumerate(a)".
It should return True if both words start with the same letter.
This is basically not working, so I'm wondering: can you give me advice rather than ready-made code? Or maybe you can tell me where the mistake is.
You can also have a look at the Levenshtein distance for strings. It is really basic, but it is both a good lesson for starters and a reasonable method of comparing spellings.
While strings are not the same as lists, their elements can be accessed like lists.
salami = 'Salami'
spam = 'Spam'
cheese = 'Cheese'
salami[0] == spam[0] # True
salami[0] == cheese[0] # False
This is probably what you need:
def animal_crackers(text):
    text1 = text.split()
    for i in range(len(text1)-1):
        if text1[i][0] == text1[i+1][0]:
            print(True)
        else:
            print(False)
    return
I can see where the mistake is: it is at enumerate(a). enumerate returns (index, character) pairs, so on the first iteration it gives (0, 'S'), i.e. char[0] = 0 and char[1] = 'S'. char[0] == char[1] is therefore False, and the two values are different data types. Instead, try indexing like a list, since text.split() returns a list. I hope it helps.
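To make the enumerate point concrete, here is a small sketch. same_first_letter is a made-up name, and treating input with fewer than two words as False is an assumption.

```python
def same_first_letter(text):
    """Return True if the first two words of text start with the same letter."""
    words = text.split()
    if len(words) < 2:
        return False  # assumption: fewer than two words counts as no match
    return words[0][0] == words[1][0]

# enumerate pairs each character with its position, not with another character.
pairs = list(enumerate('Spam'))
print(pairs)                           # [(0, 'S'), (1, 'p'), (2, 'a'), (3, 'm')]
print(same_first_letter('Spam Spam'))  # True
```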
If it were just checking whether letters in a test_string are also in a control_string,
I would not have had this problem.
I will simply use the code below.
if set(test_string.lower()) <= set(control_string.lower()):
    return True
But I also face a rather convoluted task of discerning whether the overlapping letters in the
control_string are in the same sequential order as those in test_string.
For example,
test_string = 'Dih'
control_string = 'Danish'
True
test_string = 'Tbl'
control_string = 'Bottle'
False
I thought of using a for loop to compare the indices of the letters, but it is quite hard to think of an appropriate algorithm.
for i in test_string.lower():
    for j in control_string.lower():
        if i==j:
            index_factor = control_string.index(j)
My plan is to compare the primary index factor to the next factor, and if primary index factor turns out to be larger than the other, the function returns False.
I am stuck on how to compare those index_factors in a for loop.
How should I approach this problem?
You could just join the characters in your test string to a regular expression, allowing for any other characters .* in between, and then re.search that pattern in the control string.
>>> test, control = "Dih", "Danish"
>>> re.search('.*'.join(test), control) is not None
True
>>> test, control = "Tbl", "Bottle"
>>> re.search('.*'.join(test), control) is not None
False
Without using regular expressions, you can create an iter from the control string and use two nested loops,1) breaking from the inner loop and else returning False until all the characters in test are found in control. It is important to create the iter, even though control is already iterable, so that the inner loop will continue where it last stopped.
def check(test, control):
    it = iter(control)
    for a in test:
        for b in it:
            if a == b:
                break
        else:
            return False
    return True
You could even do this in one (well, two) lines using all and any:
def check(test, control):
    it = iter(control)
    return all(any(a == b for b in it) for a in test)
Complexity for both approaches should be O(n), with n being the max number of characters.
1) This is conceptually similar to what @jpp does, but IMHO a bit clearer.
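The reason the shared iterator matters can be seen in an even shorter variant that relies on the in operator. On an iterator, in consumes items until it finds a match, so each search resumes where the previous one stopped. This is a sketch with a made-up function name; lowercasing both strings follows the question.

```python
def is_subsequence(test, control):
    """Return True if the characters of test appear in control in the same order."""
    it = iter(control.lower())
    # `ch in it` advances the shared iterator until it finds ch (or runs out),
    # so each later search starts where the previous one stopped.
    return all(ch in it for ch in test.lower())

print(is_subsequence('Dih', 'Danish'))  # True
print(is_subsequence('Tbl', 'Bottle'))  # False
```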
Here's one solution. The idea is to iterate through the control string first and yield a value if it matches the next test character. If the total number of matches equals the length of test, then your condition is satisfied.
def yield_in_order(x, y):
    iterstr = iter(x)
    current = next(iterstr)
    for i in y:
        if i == current:
            yield i
            current = next(iterstr)
def checker(test, control):
    x = test.lower()
    return sum(1 for _ in zip(x, yield_in_order(x, control.lower()))) == len(x)
test1, control1 = 'Tbl', 'Bottle'
test2, control2 = 'Dih', 'Danish'
print(checker(test1, control1)) # False
print(checker(test2, control2)) # True
@tobias_k's answer has a cleaner version of this. If you want some additional information, e.g. how many letters align before a break is found, you can trivially adjust the checker function to return sum(1 for _ in zip(x, yield_in_order(...))).
You can use find(letter, last_index) to find an occurrence of the desired letter after the letters already processed.
def same_order_in(test, control):
    index = 0
    control = control.lower()
    for i in test.lower():
        index = control.find(i, index)
        if index == -1:
            return False
        # index += 1 # uncomment to check multiple occurrences of same letter in test string
    return True
If the test string has duplicate letters, like:
test_string = 'Diih'
control_string = 'Danish'
With the line commented out, same_order_in(test_string, control_string) == True
and with the line uncommented, same_order_in(test_string, control_string) == False
Recursion is a natural way to solve such problems.
Here's one that checks for sequential ordering.
def sequentialOrder(test_string, control_string, len1, len2):
    if len1 == 0:  # base case 1
        return True
    if len2 == 0:  # base case 2
        return False
    if test_string[len1 - 1] == control_string[len2 - 1]:
        return sequentialOrder(test_string, control_string, len1 - 1, len2 - 1)  # Recursion
    return sequentialOrder(test_string, control_string, len1, len2-1)
test_string = 'Dih'
control_string = 'Danish'
print(sequentialOrder(test_string, control_string, len(test_string), len(control_string)))
Outputs:
True
and False for
test_string = 'Tbl'
control_string = 'Bottle'
Here's an iterative approach that does the same thing:
def sequentialOrder(test_string,control_string,len1,len2):
    i = 0
    j = 0
    while j < len1 and i < len2:
        if test_string[j] == control_string[i]:
            j = j + 1
        i = i + 1
    return j==len1
test_string = 'Dih'
control_string = 'Danish'
print(sequentialOrder(test_string,control_string,len(test_string) ,len(control_string)))
An elegant solution using a generator:
def foo(test_string, control_string):
    if all(c in control_string for c in test_string):
        gen = (char for char in control_string if char in test_string)
        if all(x == test_string[i] for i, x in enumerate(gen)):
            return True
    return False
print(foo('Dzn','Dahis')) # False
print(foo('Dsi','Dahis')) # False
print(foo('Dis','Dahis')) # True
First check if all the letters in the test_string are contained in the control_string. Then check if the order is similar to the test_string order.
A simple way is to make use of the key argument of sorted, which serves as the key for the sort comparison:
def seq_order(l1, l2):
    intersection = ''.join(sorted(set(l1) & set(l2), key = l2.index))
    return intersection == l1
This computes the intersection of the two sets of characters and sorts it by position in the longer string. You then only need to compare the result with the shorter string to see whether they are the same.
The function returns True or False accordingly. Using your examples:
seq_order('Dih', 'Danish')
#True
seq_order('Tbl', 'Bottle')
#False
seq_order('alp','apple')
#False
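One caveat with the set-based approach is that sets discard repeated letters, so a test string with duplicates is never matched even when its letters do appear in order. The sketch below compares seq_order with an iterator-based check; in_order is a made-up name used only for this comparison.

```python
def seq_order(l1, l2):
    intersection = ''.join(sorted(set(l1) & set(l2), key=l2.index))
    return intersection == l1

def in_order(test, control):
    # Iterator-based check: each lookup resumes where the previous one stopped.
    it = iter(control)
    return all(ch in it for ch in test)

print(seq_order('Dih', 'Danish'), in_order('Dih', 'Danish'))  # True True
print(seq_order('ana', 'banana'), in_order('ana', 'banana'))  # False True
```

Whether that difference matters depends on whether repeated letters can occur in your test strings.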
[Edit: as someone pointed out, I had misused the palindrome concept; I have now edited the question with the correct functions. I have also made some optimizations in the first and third examples, in which the for statement only runs over half of the string.]
I have coded three different versions of a method that checks whether a string is a palindrome. The methods are implemented as extensions of the class "str".
The methods also convert the string to lowercase and delete all punctuation and spaces. Which one is better (faster, more Pythonic)?
Here are the methods:
1) This one is the first solution that I thought of:
def palindrom(self):
    lowerself = re.sub("[ ,.;:?!]", "", self.lower())
    n = len(lowerself)
    for i in range(n//2):
        if lowerself[i] != lowerself[n-(i+1)]:
            return False
    return True
I think that this one is the fastest, because there are no transformations or reversals of the string, and the for statement breaks at the first differing element, but I don't think it's an elegant and Pythonic way to do it.
2) In the second version I do a transformation with the solution found here on Stack Overflow (using advanced slicing, string[::-1]):
# more compact
def pythonicPalindrom(self):
    lowerself = re.sub("[ ,.;:?!]", "", self.lower())
    lowerReversed = lowerself[::-1]
    if lowerself == lowerReversed:
        return True
    else:
        return False
But I think that the slicing and the comparison between the strings make this solution slower.
3) The third solution that I thought of uses an iterator:
# with iterator
def iteratorPalindrom(self):
    lowerself = re.sub("[ ,.;:?!]", "", self.lower())
    iteratorReverse = reversed(lowerself)
    for char in lowerself[0:len(lowerself)//2]:
        if next(iteratorReverse) != char:
            return False
    return True
which I think is far more elegant than the first solution, and more efficient than the second solution.
So, I decided to just timeit, and find which one was the fastest. Note that the final function is a cleaner version of your own pythonicPalindrome. It is defined as follows:
def palindrome(s, o):
    return re.sub("[ ,.;:?!]", "", s.lower()) == re.sub("[ ,.;:?!]", "", o.lower())[::-1]
Methodology
I ran 10 distinct tests per function. In each test run, the function was called 10000 times, with arguments self="aabccccccbaa", other="aabccccccbaa". The results can be found below.
Run     palindrom    iteratorPalindrome  pythonicPalindrome  palindrome
1       0.131656638  0.108762937         0.071676536         0.072031984
2       0.140950052  0.109713793         0.073781851         0.071860462
3       0.126966087  0.109586756         0.072349792         0.073776719
4       0.125113136  0.108729573         0.094633969         0.071474645
5       0.130878159  0.108602964         0.075770395         0.072455015
6       0.133569472  0.110276694         0.072811747         0.071764222
7       0.128642812  0.111065438         0.072170571         0.072285204
8       0.124896702  0.110218949         0.071898959         0.071841214
9       0.123841905  0.109278358         0.077430437         0.071747112
10      0.124083576  0.108184210         0.080211147         0.077391086
AVG     0.129059854  0.109441967         0.076273540         0.072662766
STDDEV  0.005387429  0.000901370         0.007030835         0.001781309
It would appear that the cleaner version of your pythonicPalindrome is marginally faster, but both functions clearly outclass the alternatives.
It seems that you want to know the execution time of your blocks of code and compare them.
You can use the timeit module.
Here's a quick way:
import timeit
start = timeit.default_timer()
# Your code here
stop = timeit.default_timer()
print(stop - start)
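For short functions a single start/stop measurement is noisy, so timeit.repeat, which runs the call many times, usually gives steadier numbers. The following is a sketch; the two palindrome functions are simplified stand-ins for the ones in the question, and best_time is a made-up helper.

```python
import re
import timeit

def pythonic_palindrome(s):
    cleaned = re.sub(r"[ ,.;:?!]", "", s.lower())
    return cleaned == cleaned[::-1]

def loop_palindrome(s):
    cleaned = re.sub(r"[ ,.;:?!]", "", s.lower())
    n = len(cleaned)
    return all(cleaned[i] == cleaned[n - 1 - i] for i in range(n // 2))

def best_time(func, arg, number=10_000, repeat=5):
    """Return the fastest total time in seconds over `repeat` runs of `number` calls."""
    return min(timeit.repeat(lambda: func(arg), number=number, repeat=repeat))

for func in (pythonic_palindrome, loop_palindrome):
    print(func.__name__, best_time(func, "aabccccccbaa", number=1_000))
```

The printed times depend on the machine and Python version, so only the relative order is meaningful.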
You could also time this one-liner, which uses itertools instead of re:
import itertools
def isPalindrom(self):
    return all(i==j for i, j in itertools.zip_longest((i.lower() for i in self if i not in " ,.;:?!"), (j.lower() for j in self[::-1] if j not in " ,.;:?!")))
Or, explained in more detail:
def isPalindrom(self):
    # using generators to save memory
    stripped_self = (i.lower() for i in self if i not in " ,.;:?!")
    reversed_stripped_self = (j.lower() for j in self[::-1] if j not in " ,.;:?!")
    return all(self_char==reversed_char for self_char, reversed_char in itertools.zip_longest(stripped_self, reversed_stripped_self))
Recall that filter works on strings (in Python 2, where it returns a string when given one):
>>> st="One string, with punc. That also needs lowercase!"
>>> filter(lambda c: c not in " ,.;:?!", st.lower())
'onestringwithpuncthatalsoneedslowercase'
So your test can be a one liner that is obvious in function:
>>> str
'!esacrewol sdeen osla tahT .cnup htiw ,gnirts enO'
>>> filter(lambda c: c not in " ,.;:?!", st.lower())==filter(lambda c: c not in " ,.;:?!", str.lower()[::-1])
True
Or, if you are going to use a regex, just reverse the result with the idiomatic str[::-1]:
>>> "123"[::-1]
'321'
>>> re.sub(r'[ ,.;:?!]', '', st.lower())==re.sub(r'[ ,.;:?!]', '', str.lower())[::-1]
True
The fastest may be to use string.translate (Python 2) to delete the characters:
>>> import string
>>> string.translate(st, None, " ,.;:?!")
'OnestringwithpuncThatalsoneedslowercase'
>>> string.translate(st, None, " ,.;:?!")==string.translate(str, None, " ,.;:?!")[::-1]
True
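The string.translate function used above exists only in Python 2. In Python 3 a comparable effect can be had with str.maketrans and str.translate. This is a sketch with made-up names, and it lowercases the text as well, which the translate example above does not do.

```python
# Translation table that deletes the same characters the regex above removes.
STRIP_TABLE = str.maketrans('', '', ' ,.;:?!')

def normalize(text):
    """Lowercase the text and drop spaces and the listed punctuation."""
    return text.lower().translate(STRIP_TABLE)

def is_palindrome(text):
    cleaned = normalize(text)
    return cleaned == cleaned[::-1]

st = "One string, with punc. That also needs lowercase!"
print(normalize(st))  # onestringwithpuncthatalsoneedslowercase
print(is_palindrome("A man, a plan, a canal: Panama!"))  # True
```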
When we pass a word, the code reverses it and compares the result with the original. If they match, it prints "This is a Palindrome"; otherwise it prints "This is NOT a Palindrome".
def reverse(word):
    x = ''
    for i in range(len(word)):
        x += word[len(word)-1-i]
    return x
word = input('give me a word:\n')
x = reverse(word)
if x == word:
    print('This is a Palindrome')
else:
    print('This is NOT a Palindrome')
Why not use a more Pythonic way?
def palindrome_checker(string):
    string = string.lower()
    return string == string[::-1]  # returns a boolean
Given a string, how do we check if any anagram of it can be a palindrome?
For example let us consider the string "AAC". An anagram of it is "ACA" which is a palindrome. We have to write a method which takes a string and outputs true if we can form a palindrome from any anagram of the given string. Otherwise outputs false.
This is my current solution:
from collections import defaultdict
def check(s):
    newdict = defaultdict(int)
    for e in s:
        newdict[e] += 1
    times = 0
    for e in newdict.values():
        if times == 2:
            return False
        if e == 1:
            times += 1
    return True
Any shorter solutions using the python library?
Here is a shorter solution that uses the standard library, with a corrected algorithm (all the character counts must be even, except for at most one):
from collections import Counter
def check(s):
    return sum(1 for count in Counter(s).values() if count % 2 == 1) <= 1
This is short but "slow", as the program goes through all the odd counts instead of stopping as soon as two are found. A faster solution that stops as soon as possible is:
def check(s):
    odd_counts = (count for count in Counter(s).values() if count % 2 == 1)
    try:
        next(odd_counts)  # Fails if there is no odd count
        next(odd_counts)  # Fails if there is one odd count
    except StopIteration:
        return True
    else:
        return False
This is probably a better fit for code golf, but it is quite trivial.
Observe that a palindrome needs balanced sides, so in general each character must occur an even number of times. A single odd item can sit in the middle, however, so at most one character may have an odd count. This can be checked with a single list comprehension:
>>> from collections import Counter
>>> def is_palindrome(letters):
...     return len([v for v in Counter(letters).values() if v % 2]) <= 1
...
>>> is_palindrome('level')
True
>>> is_palindrome('levels')
False
>>> is_palindrome('levelss')
True
Oh wait, someone else beat me to it with a solution, but that's what I got.
Without using Counter:
>>> def isit(s):
...     ls = [x % 2 for x in [s.count(x) for x in set(s)]]
...     return [False, True][not any(ls) or ls.count(1) == 1]
...
>>> isit('abc')
False
>>> isit('abb')
True
>>> isit('abbd')
False
>>> isit('abbdd')
True
>>> isit('abbdda')
True
>>>
Even though it's not algorithmically the best (it tries every permutation, so it is only practical for short strings), I'd like to provide a more readable solution.
from itertools import permutations
def has_palindrome(s):
    return any(c == c[::-1] for c in permutations(s,len(s)))