Python anagram strings


I'm supposed to write a function that determines whether two strings are anagrams.
The code I've written so far doesn't work reliably.
For example, one of the sample inputs is
Tom Cruise
So I'm cuter
The output should be True, but my code keeps returning False.
As another example, when the input is
the eyes
they see
my code returns True, which is the right answer.
So I have no idea why my code only works for certain inputs.
Can anyone help?
def anagram(a, b):
    if(sorted(a)==sorted(b)):
        return True
    else:
        return False

You need to remove all the non-alphabetic characters and then convert all letters to lower case:
import re
regex = re.compile('[^a-zA-Z]')
def anagram(a, b):
    return sorted(regex.sub('', a).lower()) == sorted(regex.sub('', b).lower())

As I mentioned in the comments above, you need to remove any symbols such as ' and convert each letter to either uppercase or lowercase to avoid case mismatches. So your code should look like this:
def anagram(a, b):
    newA = ''.join(elem.lower() for elem in a if elem.isalpha())
    newB = ''.join(elem.lower() for elem in b if elem.isalpha())
    if(sorted(newA)==sorted(newB)):
        return True
    else:
        return False
a = "Tom Cruise"
b = "So I'm cuter"
print(anagram(a,b))
This will give you:
True

Check this one...
s1=input("Enter first string:")
s2=input("Enter second string:")
a=''.join(t for t in s1 if t.isalnum())
b=''.join(t for t in s2 if t.isalnum())
a=a.lower()
b=b.lower()
if(sorted(a)==sorted(b)):
    print("The strings are anagrams.")
else:
    print("The strings aren't anagrams.")

def anagram(a, b):
    a = ''.join(e for e in a if e.isalpha()).lower()
    b = ''.join(e for e in b if e.isalpha()).lower()
    return sorted(a)==sorted(b)
a = "Tom Cruise"
b = "So I'm cuter"
anagram(a,b)
You have to remove the non-alphabetic characters from the strings and convert both strings to the same case.
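
Another way to make the same comparison, shown here only as a sketch, is to count letters with collections.Counter instead of sorting. The names normalize and is_anagram are made up for this example. The assumption, as in the answers above, is that only alphabetic characters matter and that case is ignored.

```python
from collections import Counter

def normalize(text):
    # Keep only alphabetic characters and fold them to lower case.
    return [ch.lower() for ch in text if ch.isalpha()]

def is_anagram(a, b):
    # Two strings are anagrams when every letter occurs equally often in both.
    return Counter(normalize(a)) == Counter(normalize(b))

print(is_anagram("Tom Cruise", "So I'm cuter"))  # True
print(is_anagram("the eyes", "they see"))        # True
print(is_anagram("hello", "world"))              # False
```

Counting avoids the sort, but for short strings like these either approach should be fine.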

Related

How to compare characters inside string

I'm a beginner and I have a question. Is there any way to compare characters inside strings?
I wrote a function:
def animal_crackers(text):
    text1 = text.split()
    a = ''
    count = 0
    for a in text1:
        for char in enumerate(a):
            if char[0] == char[1]:
                return True
            else:
                return False
Result:
>>> animal_crackers('Spam Spam')
False
The idea is to split a string consisting of two words. The first "for" loop goes over those words, and the second one, "char in enumerate(a)", is my attempt to get inside each word.
It should return True if both words start with the same letter.
This basically isn't working, so I'm wondering: can you give me some advice rather than ready-made code? Or maybe tell me where the mistake is.

You can also have a look at the Levenshtein distance for strings. This is really basic, but both a good lesson for starters and a reasonable method of comparing typography.

While strings are not the same as lists, their elements can be accessed like lists.
salami = 'Salami'
spam = 'Spam'
cheese = 'Cheese'
salami[0] == spam[0] # True
salami[0] == cheese[0] # False

This is probably what you need:
def animal_crackers(text):
    text1 = text.split()
    for i in range(len(text1)-1):
        if text1[i][0] == text1[i+1][0]:
            print(True)
        else:
            print(False)
    return

I can see where the mistake is: it is at "enumerate(a)". When you use enumerate, it returns pairs of (index, item). On the first iteration it gives (0, 'S'), i.e. char[0] = 0 and char[1] = 'S', so char[0] == char[1] is False, and the two values are not even the same data type. Instead, try indexing the words like a list, since text.split() returns a list. I hope it helps.
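
To make the enumerate point concrete, here is a small sketch. The helper name same_first_letter is made up for this example, and it assumes the input consists of exactly two non-empty words separated by whitespace.

```python
# enumerate yields (index, character) pairs, not pairs of characters.
print(list(enumerate("Spam")))  # [(0, 'S'), (1, 'p'), (2, 'a'), (3, 'm')]

def same_first_letter(text):
    # Assumes exactly two whitespace-separated words.
    first, second = text.split()
    # Index 0 picks the first character of each word.
    return first[0] == second[0]

print(same_first_letter("Spam Spam"))  # True
print(same_first_letter("Spam Eggs"))  # False
```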

How does comparing two chars (within a string) work in Python

I am starting to learn Python and looked at following website: https://www.w3resource.com/python-exercises/string/
I work on #4 which is "Write a Python program to get a string from a given string where all occurrences of its first char have been changed to '$', except the first char itself."
str="restart"
char=str[0]
print(char)
strcpy=str
i=1
for i in range(len(strcpy)):
    print(strcpy[i], "\n")
    if strcpy[i] is char:
        strcpy=strcpy.replace(strcpy[i], '$')
print(strcpy)
I would expect "resta$t" but the actual result is: $esta$t
Thank you for your help!

There are two issues. First, you are not starting the iteration where you think you are:
i = 1 # great, i is 1
for i in range(5):
    print(i)
0
1
2
3
4
i has been overwritten by the value tracking the loop.
Second, the is operator tests object identity, not value equality. Value equality is what the == operator is for. Simple types such as int and str can make it seem as if is works that way, but other types do not behave like this:
a, b = 5, 5
a is b
True
a, b = "5", "5"
a is b
True
a==b
True
### This doesn't work
a, b = [], []
a is b
False
a == b
True
As @Kevin pointed out in the comments, 99% of the time, is is not the operator you want.
As far as your code goes, str.replace will replace all instances of the first argument with the second argument, unless you give it an optional number of instances to replace. To avoid replacing the first character, grab the first char separately, like val = somestring[0], then replace the rest using a slice, with no need for iteration:
somestr = 'restart' # don't use str as a variable name
val = somestr[0] # val is 'r'
# somestr[1:] gives 'estart'
x = somestr[1:].replace(val, '$')
print(val+x)
# resta$t
If you still want to iterate, you can do that over the slice as well:
# collect your letters into a list
letters = []
char = somestr[0]
for letter in somestr[1:]: # No need to track an index here
    if letter == char: # don't use is, use == for value comparison
        letter = '$' # change letter to a different value if it is equal to char
    letters.append(letter)
# Then use join to concatenate back to a string
print(char + ''.join(letters))
# resta$t
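
As a small illustration of the optional count argument of str.replace mentioned above, the sketch below compares replacing every occurrence with replacing only the first one. The variable names are made up for this example.

```python
text = "restart"

# Without a count, every occurrence is replaced.
all_replaced = text.replace("r", "$")

# With a count of 1, only the first occurrence (from the left) is replaced.
first_replaced = text.replace("r", "$", 1)

# Keeping the first character and replacing in the rest gives the wanted result.
wanted = text[0] + text[1:].replace(text[0], "$")

print(all_replaced)    # $esta$t
print(first_replaced)  # $estart
print(wanted)          # resta$t
```

Note that the count always applies from the left, so it cannot be used on its own to skip the first occurrence; that is why the slice is needed.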

Your code needs some modification.
Change it as shown below.
strcpy="restart"
i=1
for i in range(len(strcpy)):
    strcpy=strcpy.replace(strcpy[0], '$')[:]
print(strcpy)
# $esta$t
Also, it is good practice in Python to put code in a function. You can modify your code as shown below, or use this function directly.
def charreplace(s):
    return s.replace(s[0],'$')[:]
charreplace("restart")
#'$esta$t'
Hope this helps.

Validate both numbers and letters

There have been cases where I have needed to validate a string made up of numbers and letters, and I want to know the easiest way to do it.
For example, in Tic Tac Toe / Noughts and Crosses, I need to make sure that the position the user has entered is within "1-3" and "a-c".
For better understanding of what I am asking:
pos = "2c"
>>> Input is valid
pos = "1z"
>>> Input is invalid: Letters outside range a-c
pos = "5b"
>>> Input is invalid: Numbers outside range 1-3

There are only 9 possible valid inputs, so you could just check them all, or you could use a regular expression to see if the input matches all the valid inputs.
import re
pattern = re.compile(r'^[123][abc]$')
m = pattern.match("2b")
if m:
    print("It's a match!")
The regular expression r'^[123][abc]$' looks for the start of a string, followed by 1, 2, or 3, followed by a, b, or c, followed by the end of the string. No inputs outside that range (or that are longer than two characters) should match.

Without regular expressions, you could have something like:
def validate(i):
    if type(i) != str or len(i) != 2:
        return False
    d, char = int(i[0]), i[1]
    return d >= 1 and d <= 3 and char in 'abc'
print(validate('1c')) # True
print(validate('3a')) # True
print(validate('2b')) # True
print(validate('36')) # False
print(validate('106')) # False
print(validate('10c')) # False
print(validate(10)) # False
This does, however, assume that the first character in your input can be converted to an int.

Use a regex as follows:
import re
example_str = '1c'
p = re.compile('^[1-3][a-c]$')
if p.match(example_str):
    print('Valid')
else:
    print('Invalid')
You can use - to specify a character range in the pattern.
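
The question also asks for a message saying which part of the input is out of range. One possible sketch is below. The function name check_position and the wording of the first message are made up for this example, and it assumes a position is always one digit followed by one lowercase letter.

```python
import re

def check_position(pos):
    # Hypothetical helper: returns messages similar to those in the question.
    if not re.fullmatch(r'[0-9][a-z]', pos):
        return "Input is invalid: expected one digit followed by one letter"
    if pos[0] not in '123':
        return "Input is invalid: Numbers outside range 1-3"
    if pos[1] not in 'abc':
        return "Input is invalid: Letters outside range a-c"
    return "Input is valid"

for pos in ("2c", "1z", "5b", "10c"):
    print(pos, "->", check_position(pos))
```

Checking the overall shape first keeps the later index lookups safe, since pos is then known to have exactly two characters.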

Checking the first few characters in a string

I would like to check the first few characters (the numbers are variable) in a string.
E.g.
a = '+6221-123-4567'
and I would like to check if the first few characters are in
b = ['021', '+6221', '(021)', '(+62)']
I would like to do it programmatically, without separating manually based on the number of characters:
if a[:3] in ['021']: print('yes')
if a[:5] in ['+6221', '(021)', '(+62)']: print('yes')
Thank you!

str.startswith(prefix[, start[, end]])
Return True if string starts with the prefix, otherwise return False.
prefix can also be a tuple of prefixes to look for.
Try this:
a.startswith(tuple(b))
Full code:
if a.startswith(tuple(b)):
    print("yes")
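
If you also need to know which prefix matched, a sketch along these lines might help. The variable names has_known_prefix and matched are made up for this example; a and b are the values from the question.

```python
a = '+6221-123-4567'
b = ['021', '+6221', '(021)', '(+62)']

# startswith accepts a tuple, so one call covers every prefix.
has_known_prefix = a.startswith(tuple(b))

# To find out which prefix matched, test them one at a time
# and stop at the first hit (None if nothing matches).
matched = next((prefix for prefix in b if a.startswith(prefix)), None)

print(has_known_prefix)  # True
print(matched)           # +6221
```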

You can try this:
a = '+6221-123-4567'
b = ['021', '+6221', '(021)', '(+62)']
b = [i[1:-1] if "(" in i else i for i in b]
# collect every prefix of a (up to 5 characters) that appears in b
new_list = [a[:5][:i+1] for i in range(5) if a[:5][:i+1] in b]
print(new_list)
if len(new_list) > 0:
    print("yes")
else:
    print("no")
Output:
['+62', '+6221']
yes

Check if a string contains a number

Most of the questions I've found are biased on the fact they're looking for letters in their numbers, whereas I'm looking for numbers in what I'd like to be a numberless string.
I need to enter a string and check to see if it contains any numbers and if it does reject it.
The function isdigit() only returns True if ALL of the characters are numbers. I just want to see if the user has entered a number so a sentence like "I own 1 dog" or something.
Any ideas?

You can use the any function with the str.isdigit function, like this:
def has_numbers(inputString):
    return any(char.isdigit() for char in inputString)
has_numbers("I own 1 dog")
# True
has_numbers("I own no dog")
# False
Alternatively, you can use a regular expression, like this:
import re
def has_numbers(inputString):
    return bool(re.search(r'\d', inputString))
has_numbers("I own 1 dog")
# True
has_numbers("I own no dog")
# False

You can use a combination of any and str.isdigit:
def num_there(s):
    return any(i.isdigit() for i in s)
The function will return True if a digit exists in the string, otherwise False.
Demo:
>>> king = 'I shall have 3 cakes'
>>> num_there(king)
True
>>> servant = 'I do not have any cakes'
>>> num_there(servant)
False

Use the Python method str.isalpha(). This function returns True if all characters in the string are alphabetic and there is at least one character; returns False otherwise.
Python Docs: https://docs.python.org/3/library/stdtypes.html#str.isalpha

https://docs.python.org/2/library/re.html
You are better off using a regular expression. It was faster in the timings below.
import re
def f1(string):
    return any(i.isdigit() for i in string)
def f2(string):
    return re.search(r'\d', string)
# if you compile the regex string first, it's even faster
RE_D = re.compile(r'\d')
def f3(string):
    return RE_D.search(string)
# Output from iPython
# In [18]: %timeit f1('assdfgag123')
# 1000000 loops, best of 3: 1.18 µs per loop
# In [19]: %timeit f2('assdfgag123')
# 1000000 loops, best of 3: 923 ns per loop
# In [20]: %timeit f3('assdfgag123')
# 1000000 loops, best of 3: 384 ns per loop

You could apply the function isdigit() to every character in the string. Or you could use regular expressions.
I also found How do I find one number in a string in Python?, which has very suitable ways to return numbers. The solution below is from an answer to that question.
number = re.search(r'\d+', yourString).group()
Alternatively:
number = filter(str.isdigit, yourString)
For further information take a look at the regex documentation: http://docs.python.org/2/library/re.html
Edit: This returns the actual numbers, not a boolean value, so the answers above are more correct for your case.
The first method will return the first digit and any consecutive digits that follow it. Thus 1.56 will be returned as 1, 10,000 will be returned as 10, and 0207-100-1000 will be returned as 0207.
The second method does not work.
To extract all digits, dots and commas, and not lose non-consecutive digits, use:
re.sub(r'[^\d.,]' , '', yourString)
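
The difference between these calls is easier to see side by side. The sketch below runs them on the three sample strings mentioned above. The helper name summarize is made up for this example, and re.findall is added here for comparison only; it is not part of the answer above.

```python
import re

def summarize(text):
    first_run = re.search(r'\d+', text).group()  # first run of consecutive digits
    all_runs = re.findall(r'\d+', text)          # every run of digits
    kept = re.sub(r'[^\d.,]', '', text)          # digits, dots and commas only
    return first_run, all_runs, kept

for sample in ('1.56', '10,000', '0207-100-1000'):
    print(sample, '->', summarize(sample))
# 1.56 -> ('1', ['1', '56'], '1.56')
# 10,000 -> ('10', ['10', '000'], '10,000')
# 0207-100-1000 -> ('0207', ['0207', '100', '1000'], '02071001000')
```

Note that re.search returns None when there is no digit at all, so summarize assumes the input contains at least one digit.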

I'm surprised that no one mentioned this combination of any and map:
def contains_digit(s):
    isdigit = str.isdigit
    return any(map(isdigit,s))
In Python 3 it is probably the fastest option here (except maybe for regexes), because it doesn't contain an explicit loop (and aliasing the function avoids looking it up in str).
Don't use it in Python 2: there map returns a list, which defeats the short-circuiting of any.

You can accomplish this as follows:
if a_string.isdigit():
    do_this()
else:
    do_that()
https://docs.python.org/2/library/stdtypes.html#str.isdigit
Using .isdigit() also means not having to resort to exception handling (try/except) in cases where you need to use list comprehension (try/except is not possible inside a list comprehension).

You can use NLTK method for it.
This will find both '1' and 'One' in the text:
import nltk
def existence_of_numeric_data(text):
    text=nltk.word_tokenize(text)
    pos = nltk.pos_tag(text)
    count = 0
    for i in range(len(pos)):
        word , pos_tag = pos[i]
        if pos_tag == 'CD':
            return True
    return False
existence_of_numeric_data('We are going out. Just five you and me.')

You can use range with count to check how many digits appear in the string, by counting each digit from 0 to 9 in turn:
def count_digit(a):
    sum = 0
    for i in range(10):
        sum += a.count(str(i))
    return sum
ans = count_digit("apple3rh5")
print(ans)
# This prints 2

import string
import random
n = 10
p = ''
while not (any(c.isupper() for c in p) and any(c.islower() for c in p) and any(c.isdigit() for c in p)):
    p = ''  # start over until all three character classes are present
    for _ in range(n):
        state = random.randint(0, 2)
        if state == 0:
            p = p + chr(random.randint(97, 122))
        elif state == 1:
            p = p + chr(random.randint(65, 90))
        else:
            p = p + str(random.randint(0, 9))
print(p)
This code generates a sequence of length n that contains at least one uppercase letter, one lowercase letter, and one digit. The while loop regenerates the sequence until all three are present.

any and ord can be combined to serve the purpose as shown below.
>>> def hasDigits(s):
...     return any(48 <= ord(char) <= 57 for char in s)
...
>>> hasDigits('as1')
True
>>> hasDigits('as')
False
>>> hasDigits('as9')
True
>>> hasDigits('as_')
False
>>> hasDigits('1as')
True
>>>
A couple of points about this implementation:
any is better because it short-circuits, like a logical expression in C, and returns the result as soon as it can be determined; i.e. for the string 'a1bbbbbbc' the 'b's and the 'c' won't even be examined.
ord is better because it provides more flexibility, such as checking for digits only between '0' and '5' or in any other range. For example, if you were writing a validator for the hexadecimal representation of numbers, you would want the string to contain letters in the range 'A' to 'F' only.
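
As a rough sketch of the hexadecimal validator idea, the code below combines ord ranges with all. The name is_hex_string is made up for this example, and it assumes that only uppercase 'A' to 'F' count as valid letters and that an empty string is invalid.

```python
def is_hex_string(s):
    def is_hex_char(ch):
        code = ord(ch)
        # 48-57 are '0'-'9', 65-70 are 'A'-'F'
        return 48 <= code <= 57 or 65 <= code <= 70
    # all() short-circuits on the first character that fails the check.
    return len(s) > 0 and all(is_hex_char(ch) for ch in s)

print(is_hex_string("1A3F"))  # True
print(is_hex_string("1G"))    # False
print(is_hex_string("ff"))    # False, lowercase is rejected by assumption
```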

What about this one?
import string
def containsNumber(line):
    res = False
    try:
        for val in line.split():
            if (float(val.strip(string.punctuation))):
                res = True
                break
    except ValueError:
        pass
    return res
containsNumber('234.12 a22') # returns True
containsNumber('234.12L a22') # returns False
containsNumber('234.12, a22') # returns True

I'll make @zyxue's answer a bit more explicit:
import re
RE_D = re.compile(r'\d')
def has_digits(string):
    res = RE_D.search(string)
    return res is not None
has_digits('asdf1')
Out: True
has_digits('asdf')
Out: False
This is the solution with the fastest benchmark among those that @zyxue proposed in that answer.

Also, you could use regex findall. It's a more general solution since it adds more control over the length of the number. It could be helpful in cases where you require a number with minimal length.
import re
s = '67389kjsdk'
contains_digit = len(re.findall(r'\d+', s)) > 0

A simpler way to solve it:
s = '1dfss3sw235fsf7s'
count = 0
temp = list(s)
for item in temp:
    if(item.isdigit()):
        count = count + 1
    else:
        pass
print(count)

alp_num = [x for x in string.split() if x.isalnum() and re.search(r'\d',x) and
           re.search(r'[a-z]',x)]
print(alp_num)
This returns every whitespace-separated token that contains both letters and digits. isalnum() on its own would also accept tokens made up only of digits or only of letters, which is why the two re.search checks are needed.

This too will work.
if any(i.isdigit() for i in s):
    print("True")

You can also use set.intersection
It is quite fast, better than regex for small strings.
def contains_number(string):
    return True if set(string).intersection('0123456789') else False

An iterator approach. It consumes characters until a digit is found. The second argument of next sets the default value to return when the iterator is exhausted. Here it is set to False, but '' would also work, since it is converted to a boolean in the if.
def has_digit(string):
    str_iter = iter(string)
    while True:
        char = next(str_iter, False)
        # check if iterator is empty
        if char:
            if char.isdigit():
                return True
        else:
            return False
Or by looking only at the first item of a generator expression:
def has_digit(string):
    return next((True for char in string if char.isdigit()), False)

I'm surprised nobody has used the Python operator in. Using it would work as follows:
foo = '1dfss3sw235fsf7s'
bar = 'lorem ipsum sit dolor amet'
def contains_number(string):
    for i in range(10):
        if str(i) in list(string):
            return True
    return False
print(contains_number(foo)) #True
print(contains_number(bar)) #False
Or we could use the function isdigit():
foo = '1dfss3sw235fsf7s'
bar = 'lorem ipsum sit dolor amet'
def contains_number(string):
    for i in list(string):
        if i.isdigit():
            return True
    return False
print(contains_number(foo)) #True
print(contains_number(bar)) #False
These functions basically just convert the string into a list and check whether the list contains a digit. If it does, they return True; if not, they return False.
