How to get the count of discrepancy values between two given strings? - python

I have two strings, Paula and Pole. Checking Paula against Pole gives three discrepancies: a, u and a are present in Paula but not in Pole, so the function should return 3.
Input:
enter string1: Paula
enter string2: Pole
Expected Output:
3
String 1 is always the correct one here, for a row of names.
So far I have tried something like this:
def compare(string1, string2, no_match_c=' ', match_c='|'):
    # Make string1 the shorter of the two
    if len(string2) < len(string1):
        string1, string2 = string2, string1
    result = ''
    n_diff = 0
    # zip replaces the Python 2 only itertools.izip
    for c1, c2 in zip(string1, string2):
        if c1 == c2:
            result += match_c
        else:
            result += no_match_c
            n_diff += 1
    delta = len(string2) - len(string1)
    result += delta * no_match_c
    n_diff += delta
    return (result, n_diff)

def main():
    string1 = 'paula'
    string2 = 'pole'
    result, n_diff = compare(string1, string2, no_match_c='_')
    print(n_diff)

main()
The answer should be in a function.
Another example:
string1 = Michelle
string2 = Michele
Output: 1

This is a simple approach, assuming you always want the number of chars in string 1 that do not occur in string 2. Because it works on sets, a character that appears in both strings is ignored entirely, so Michelle vs. Michele gives 0 rather than 1.
def compare(str1, str2):
    # Return count of chars in str1 not in str2
    s1 = set(str1)
    s2 = set(str2)
    ms = s1 - s2  # the chars in str1 that never occur in str2
    rslt = 0
    for v in ms:
        rslt += str1.count(v)
    return rslt

You may try Counter: it counts the occurrences of each character in both strings and supports subtraction between counts.
from collections import Counter

def diff(s1, s2):
    c1 = Counter(s1)
    c2 = Counter(s2)
    # Subtraction keeps only the positive counts
    return sum((c1 - c2).values())

print(diff("Paula", "Pole")) # Output: 3
print(diff("Pole", "Paula")) # Output: 2
print(diff("Michelle", "Michele")) # Output: 1
print(diff("Michele", "Michelle")) # Output: 0

You can try this, using a list of 256 zeros as a counter for each character code. This covers ASCII and other 8-bit codes; a character with ord(c) >= 256 would raise an IndexError.
def compare(string1, string2):
    chars_counter = [0]*256
    # Count every character of string1
    for c1 in string1:
        chars_counter[ord(c1)] += 1
    # Cancel one occurrence for each matching character of string2
    for c2 in string2:
        if chars_counter[ord(c2)] != 0:
            chars_counter[ord(c2)] -= 1
    return sum(chars_counter)
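The 256-slot list only covers code points 0 to 255, so a name containing a character such as Ł would fail. A minimal sketch of the same counting idea with a dict instead of a fixed-size list; the name compare_any is hypothetical and not taken from the answers above.

```python
def compare_any(string1, string2):
    """Count characters of string1 left unmatched by string2 (any code point)."""
    remaining = {}
    for ch in string1:
        remaining[ch] = remaining.get(ch, 0) + 1
    for ch in string2:
        # Cancel one occurrence only while some are still unmatched
        if remaining.get(ch, 0) > 0:
            remaining[ch] -= 1
    return sum(remaining.values())


print(compare_any("Paula", "Pole"))        # 3
print(compare_any("Michelle", "Michele"))  # 1
```

The dict grows only with the characters actually seen, so no upper bound on the character code has to be assumed.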

Related

How to know if a string contains more numeric pattern than alphabetic pattern?

I want to know whether my string contains more digits or more letters.
I have tried using a regex in Python with a condition in between the two patterns.
search_3 = '(\d) > (\D)'
words["aplha_or_numeric_mark"] = words["Words"].str.findall(search_3)
print(words)
The actual result is just an empty list on each row
Expected results :
123ABCD should output 1 since alphabets > numbers.
1234ABC should output 0 since alphabets < numbers.
This should work.
string = "ABCD12345"
num_count = 0
word_count = 0
for i in string:
    if i.isalpha():
        word_count += 1
    elif i.isdigit():
        num_count += 1
if word_count > num_count:
    print(1)
else:
    print(0)
You can do it using zip on a generator. Note that this version returns 1 on a tie, since it compares with >=, and it expects a non-empty string:
def is_alpha_more(s):
    total_alphas, total_nums = zip(*((x.isalpha(), x.isdigit()) for x in s))
    return 1 if sum(total_alphas) >= sum(total_nums) else 0
Sample run:
>>> s = '12,"BCD'
>>> is_alpha_more(s)
1
>>> s = '1234A,":B'
>>> is_alpha_more(s)
0
Why not just use re.findall to find the count of both and get the results?
import re
s = '123ABCD'
numAlphabets = len(re.findall(r'[a-zA-Z]', s))
numDigits = len(re.findall(r'\d', s))
if numAlphabets > numDigits:
    print('More alphabets than digits')
elif numDigits > numAlphabets:
    print('More digits than alphabets')
else:
    print('Same numbers for both')
For this case it prints,
More alphabets than digits
Also, if all you want is to return 1 when there are more alphabets than digits and 0 otherwise, you can use this function,
import re

def has_more_alphabets(s):
    if len(re.findall(r'[a-zA-Z]', s)) > len(re.findall(r'\d', s)):
        return 1
    else:
        return 0

print(has_more_alphabets('123ABCD'))
print(has_more_alphabets('123##334ABCD'))
print(has_more_alphabets('123###ad553353455ABCD'))
print(has_more_alphabets('123BCD'))
Prints the following,
1
0
0
0
There are many ways to accomplish what you ask. Regular expressions are meant for searching, or searching and replacing, in strings; here you need to count. One example would be something like:
def test_string(text):
    count_letters = 0
    count_digits = 0
    for character in text:
        if character.isalpha():
            count_letters += 1
        elif character.isdigit():
            count_digits += 1
    if count_letters > count_digits:
        return 1
    return 0
You still haven't defined what should happen if the two counts are equal, but that should be an easy case to add.
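As a sketch of how the tie case could be added: the question does not say what a tie should return, so the on_tie parameter and the name compare_counts below are assumptions made for illustration only.

```python
def compare_counts(text, on_tie=0):
    """Return 1 if letters outnumber digits, 0 if digits win, on_tie if equal."""
    letters = sum(ch.isalpha() for ch in text)
    digits = sum(ch.isdigit() for ch in text)
    if letters == digits:
        # Hypothetical choice: let the caller decide what a tie means
        return on_tie
    return 1 if letters > digits else 0


print(compare_counts("123ABCD"))            # 1
print(compare_counts("1234ABC"))            # 0
print(compare_counts("123ABC", on_tie=-1))  # -1
```

Keeping the tie value as a parameter avoids hard-coding a decision the original question left open.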

Removing all instances of the second string from the first

The question states: Write code that takes two strings from the user, and returns what is left over if all instances of the second string are removed from the first. The second string is guaranteed to be no longer than two characters.
I started off with the following:
def remove(l1,l2):
    string1 = l1
    string2 = l2
    result = ""
    ctr = 0
    while ctr < len(l1):
Since it cannot be longer than 2 characters I think I have to put in an if statement as such:
        if len(sub) == 2:
            if (ctr + 1) < len(string) and string[ctr] == sub[0]
You could just use the replace method to remove all occurrences of the second string from the first:
def remove(s1, s2):
    return s1.replace(s2, "")

print(remove("hello this is a test", "l"))
For a manual method, you can use the following. Note that it drops every character that appears in s2, so it agrees with replace only when s2 is a single character:
def remove(s1, s2):
    newString = []
    if len(s2) > 2:
        return "The second argument cannot exceed two characters"
    for c in s1:
        if c not in s2:
            newString.append(c)
    return "".join(newString)

print(remove("hello this is a test", "l"))
Yields: heo this is a test
The code looks like this:
def remove(l1,l2):
    string1 = l1
    string2 = l2
    ctr = 0
    result = ""
    while ctr < len(string1):
        if string1[ctr : ctr + len(string2)] == string2:
            ctr += len(string2)
        else:
            result += string1[ctr]
            ctr += 1
    return result
I got it resolved; just took me a little bit of time.
You could use a list comprehension (this filters out every character of st2, not whole substrings):
st1 = "Hello how are you"
st2 = "This is a test"
st3 = [i for i in st1 if i not in st2]
print(''.join(st3))
Using solely the slice method:
def remove_all(substr, theStr):
    # Nothing to remove
    if not substr or theStr.find(substr) < 0:
        return theStr
    for i in range(len(theStr)):
        if theStr[i:i+len(substr)] == substr:
            # Cut out this occurrence, then handle the rest of the string
            return theStr[0:i] + remove_all(substr, theStr[i+len(substr):])
s1 = input()
s2 = input()
# get length of each string
l_s1, l_s2 = len(s1), len(s2)
# list to store the answer
ans = list()
i = 0
# continue while characters are left in s1 to be compared
# and the remaining part of s1 is at least as long as s2
while i < l_s1 and l_s1 - i >= l_s2:
    j = 0
    # compare the substring from s1 with s2
    while j < l_s2 and s1[i+j] == s2[j]:
        j += 1
    # if the substring matches, discard it from the solution
    # and update the pointer i accordingly
    if j == l_s2:
        i += j
    # otherwise append the ith character to the ans list
    else:
        ans.append(s1[i])
        i += 1
# append any remaining characters
while i < l_s1:
    ans.append(s1[i])
    i += 1
print(''.join(ans))
'''
Sample Testcase
1.
kapil
kd
kapil
2.
devansh
dev
ansh
3.
adarsh
ad
arsh
'''
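One edge case worth noting: the loop-based versions above never advance when the second string is empty, because a zero-length match always succeeds. A minimal sketch of the same left-to-right scan wrapped in a function with a guard for that case; the name remove_all_manual is hypothetical.

```python
def remove_all_manual(s1, s2):
    """Remove every non-overlapping occurrence of s2 from s1, scanning left to right."""
    if not s2:
        # Nothing to remove; also avoids a loop that never advances
        return s1
    kept = []
    i = 0
    while i < len(s1):
        if s1.startswith(s2, i):
            i += len(s2)        # skip the whole match
        else:
            kept.append(s1[i])
            i += 1
    return "".join(kept)


print(remove_all_manual("devansh", "dev"))  # ansh
print(remove_all_manual("kapil", "kd"))     # kapil
print(remove_all_manual("kapil", ""))       # kapil
```

For non-empty second strings this should give the same result as s1.replace(s2, ""), which is a convenient way to cross-check it.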

How to convert numbers in a string without using lists?

My prof wants me to create a function that returns the sum of the numbers in a string, but without using any lists or list methods.
The function should look like this when operating:
>>> sum_numbers('34 3 542 11')
590
Usually a function like this would be easy to create when using lists and list methods. But trying to do so without using them is a nightmare.
I tried the following code but it doesn't work:
>>> def sum_numbers(s):
...     for i in range(len(s)):
...         int(i)
...         total = s[i] + s[i]
...         return total
...
>>> sum_numbers('1 2 3')
'11'
Instead of getting 1, 2, and 3 all converted into integers and added together, I instead get the string '11'. In other words, the numbers in the string still have not been converted to integers.
I also tried using the map() function but I just got the same result:
>>> def sum_numbers(s):
...     for i in range(len(s)):
...         map(int, s[i])
...         total = s[i] + s[i]
...         return total
...
>>> sum_numbers('1 2 3')
'11'
Totally silly of course, but for fun:
s = '34 3 542 11'
n = ""; total = 0
for c in s:
    if c == " ":
        total = total + int(n)
        n = ""
    else:
        n = n + c
# add the last number
total = total + int(n)
print(total)
> 590
This assumes all characters (apart from single separating spaces) are digits.
You've clearly put some effort in here, but one part of your approach won't work as-is: you're iterating over the characters in the string, but you keep trying to treat each character as its own number. I've written a (very commented) method that accomplishes what you want without using any lists or list methods:
def sum_numbers(s):
    """
    Convert a string of numbers into a sum of those numbers.
    :param s: A string of numbers, e.g. '1 -2 3.3 4e10'.
    :return: The floating-point sum of the numbers in the string.
    """
    def convert_s_to_val(s):
        """
        Convert a string into a number. Will handle anything that
        Python could convert to a float.
        :param s: A number as a string, e.g. '123' or '8.3e-18'.
        :return: The float value of the string.
        """
        if s:
            return float(s)
        else:
            return 0
    # These will serve as placeholders.
    # (`total` rather than `sum`, to avoid shadowing the builtin.)
    total = 0
    current = ''
    # Iterate over the string character by character.
    for c in s:
        # If the character is a space, we convert the current `current`
        # into its numeric representation.
        if c.isspace():
            total += convert_s_to_val(current)
            current = ''
        # For anything else, we accumulate into `current`.
        else:
            current = current + c
    # Add `current`'s last value to the total and return.
    total += convert_s_to_val(current)
    return total
Personally, I would use this one-liner, but it uses str.split():
def sum_numbers(s):
    return sum(map(float, s.split()))
No lists were used (nor harmed) in the production of this answer:
def sum_string(string):
    total = 0
    if len(string):
        j = string.find(" ") % len(string) + 1
        total += int(string[:j]) + sum_string(string[j:])
    return total
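The slice index in the recursive version relies on str.find returning -1 when no space is left: -1 % len(string) is len(string) - 1, so adding 1 gives the full length and the last number is taken whole. A small sketch that prints the slice taken at each step; the helper name split_point is hypothetical.

```python
def split_point(string):
    """Index just past the first space, or len(string) if there is no space."""
    return string.find(" ") % len(string) + 1


rest = "34 3 542 11"
while rest:
    j = split_point(rest)
    # int() tolerates the trailing space kept in each slice
    print(repr(rest[:j]), int(rest[:j]))
    rest = rest[j:]
```

Each slice except the last keeps its trailing space, which is harmless because int() ignores surrounding whitespace.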
If the string is noisier than the OP indicates, then this should be more robust:
import re

def sum_string(string):
    pattern = re.compile(r"[-+]?\d+")
    total = 0
    match = pattern.search(string)
    while match:
        total += int(match.group())
        match = pattern.search(string, match.end())
    return total
EXAMPLES
>>> sum_string('34 3 542 11')
590
>>> sum_string(' 34 4 ')
38
>>> sum_string('lksdjfa34adslkfja4adklfja')
38
>>> # and I threw in signs for fun
...
>>> sum_string('34 -2 45 -8 13')
82
>>>
If you want to be able to handle floats and negative numbers:
def sum_numbers(s):
    sm = i = 0
    while i < len(s):
        t = ""
        # Collect characters up to the next whitespace
        while i < len(s) and not s[i].isspace():
            t += s[i]
            i += 1
        if t:
            sm += float(t)
        else:
            i += 1
    return sm
Which works for these cases:
In [9]: sum_numbers('34 3 542 11')
Out[9]: 590.0
In [10]: sum_numbers('1.93 -1 23.12 11')
Out[10]: 35.05
In [11]: sum_numbers('')
Out[11]: 0
In [12]: sum_numbers('123456')
Out[12]: 123456.0
Or a variation taking slices:
def sum_numbers(s):
    prev = sm = i = 0
    while i < len(s):
        while i < len(s) and not s[i].isspace():
            i += 1
        if i > prev:
            sm += float(s[prev:i])
            prev = i
        i += 1
    return sm
You could also use itertools.groupby, which uses no lists, with a set of allowed chars to group by:
from itertools import groupby

def sum_numbers(s):
    allowed = set("0123456789-.")
    return sum(float("".join(v)) for k,v in groupby(s, key=allowed.__contains__) if k)
which gives you the same output:
In [14]: sum_numbers('34 3 542 11')
Out[14]: 590.0
In [15]: sum_numbers('1.93 -1 23.12 11')
Out[15]: 35.05
In [16]: sum_numbers('')
Out[16]: 0
In [17]: sum_numbers('123456')
Out[17]: 123456.0
If you only have to consider positive ints, you could just use str.isdigit as the key:
def sum_numbers(s):
    return sum(int("".join(v)) for k,v in groupby(s, key=str.isdigit) if k)
Try this:
def sum_numbers(s):
    total = 0
    # This string will represent each number
    number_str = ''
    for i in s:
        if i == ' ':
            # if it is a whitespace it means
            # that we have a number so we increase the total
            total += int(number_str)
            number_str = ''
            continue
        number_str += i
    else:
        # add the last number
        total += int(number_str)
    return total
You could write a generator:
def nums(s):
    idx = 0
    while idx < len(s):
        ns = ''
        # Collect one run of digits
        while idx < len(s) and s[idx].isdigit():
            ns += s[idx]
            idx += 1
        yield int(ns)
        # Skip the separator characters
        while idx < len(s) and not s[idx].isdigit():
            idx += 1
>>> list(nums('34 3 542 11'))
[34, 3, 542, 11]
Then just sum that:
>>> sum(nums('34 3 542 11'))
590
or, you could use re.finditer with a regular expression and a generator construction:
>>> import re
>>> sum(int(m.group(1)) for m in re.finditer(r'(\d+)', '34 3 542 11'))
590
No lists used...
def sum_numbers(s):
    total = 0
    gt = 0  # grand total
    l = len(s)
    for i in range(l):
        if s[i] != ' ':  # build up each number digit by digit
            total = int(s[i]) + total*10
        if s[i] == ' ' or i == l-1:  # add to the grand total, including the last number
            gt += total
            total = 0
    return gt

print(sum_numbers('1 2 3'))
Here each substring is converted to a number and added to the grand total.
If we ignore the fact that eval is evil, we can solve the problem with it.
def sum_numbers(s):
    s = s.replace(' ', '+')
    return eval(s)
Yes, that simple. But I wouldn't put that thing in production.
And sure we need to test that:
from hypothesis import given
import hypothesis.strategies as st

@given(list_num=st.lists(st.integers(), min_size=1))
def test_that_thing(list_num):
    assert sum_numbers(' '.join(str(i) for i in list_num)) == sum(list_num)

test_that_thing()
And it would raise nothing.

Write a program that prints the number of times the string contains a substring

s = "bobobobobobsdfsdfbob"
count = 0
for x in s:
    if x == "bob":
        count += 1
print(count)
I want to count how many times bob occurs in string s, but the result of this gives me 17.
What's wrong with my code? I'm new to Python.
When you loop over the string, the loop variable holds one character at a time, so in your loop x is never equal to bob.
If you want to count the non-overlapping occurrences you can simply use str.count:
In [52]: s.count('bob')
Out[52]: 4
For overlapping substrings you can use a lookahead in a regex:
In [57]: import re
In [59]: len(re.findall(r'(?=bob)', s))
Out[59]: 6
You can use str.count, for example:
s = "bobobobobobsdfsdfbob"
count = s.count("bob")
print(count)
I'm not giving the best solution, just trying to correct your code.
First, understand what a for-each loop (a.k.a. range-based for) does in your case:
for c in "Hello":
    print(c)
Outputs:
H
e
l
l
o
In each iteration you are comparing a single character to the whole string "bob", which is never equal and so gives a wrong answer.
Try something like this.
Without overlapping (i.e. no span):
s = "bobobobobobsdfsdfbob"
w = "bob"
count = 0
i = 0
while i <= len(s) - len(w):
    if s[i:i+len(w)] == w:
        count += 1
        i += len(w)
    else:
        i += 1
print(count)
Output:
4
With overlapping:
s = "bobobobobobsdfsdfbob"
w = "bob"
count = 0
for i in range(len(s) - len(w) + 1):
    if s[i:i+len(w)] == w:
        count += 1
print(count)
Output:
6
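The two loops above differ only in how far the scan moves after a match. A minimal sketch that folds both into one function built on str.find; the name count_substring and the overlapping flag are hypothetical.

```python
def count_substring(s, w, overlapping=False):
    """Count occurrences of w in s, optionally allowing overlaps."""
    if not w:
        raise ValueError("substring must not be empty")
    count = 0
    i = s.find(w)
    while i != -1:
        count += 1
        # Step by 1 to allow overlaps, otherwise skip past the whole match
        i = s.find(w, i + (1 if overlapping else len(w)))
    return count


s = "bobobobobobsdfsdfbob"
print(count_substring(s, "bob"))                    # 4
print(count_substring(s, "bob", overlapping=True))  # 6
```

Letting str.find jump to the next candidate position avoids comparing a slice at every index.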

Count the matching characters between two inputs given by the user

How do I get this Python output, counting matches and mismatches?
String1: aaabbbccc #aaabbbccc is user input
String2: aabbbcccc #aabbbcccc is user input
Matches: ?
MisMatches: ?
String1: aaAbbBccc #mismatches are capitalize
String2: aaBbbCccc
s1 = 'aaabbbccc'
s2 = 'aabbbcccc'
# zip replaces the Python 2 only itertools.izip
print("Matches:", sum(c1 == c2 for c1, c2 in zip(s1, s2)))
print("Mismatches:", sum(c1 != c2 for c1, c2 in zip(s1, s2)))
print("String 1:", ''.join(c1 if c1 == c2 else c1.upper() for c1, c2 in zip(s1, s2)))
print("String 2:", ''.join(c2 if c1 == c2 else c2.upper() for c1, c2 in zip(s1, s2)))
This produces:
Matches: 7
Mismatches: 2
String 1: aaAbbBccc
String 2: aaBbbCccc
Assuming you have gotten the strings from a file or user input, what about:
s1 = 'aaabbbccc'
s2 = 'aabbbcccc'
# This will only consider n characters, where n = min(len(s1), len(s2))
match_indices = [i for (i,(c1, c2)) in enumerate(zip(s1, s2)) if c1 == c2]
num_matches = len(match_indices)
num_misses = min(len(s1), len(s2)) - num_matches
print("Matches: %d" % num_matches)
print("Mismatches: %d" % num_misses)
print("String 1: %s" % ''.join(c if i in match_indices else c.upper() for (i,c) in enumerate(s1)))
print("String 2: %s" % ''.join(c if i in match_indices else c.upper() for (i,c) in enumerate(s2)))
Output:
Matches: 7
Mismatches: 2
String 1: aaAbbBccc
String 2: aaBbbCccc
If you wanted to count strings of uneven length (where extra characters count as misses), you could change:
num_misses = min(len(s1), len(s2)) - num_matches
# to
num_misses = max(len(s1), len(s2)) - num_matches
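You can try:
# String1 and String2 hold the two inputs (assumed to be of equal length)
mismatches = 0
index = 0
for letter in String1:
    if String1[index] != String2[index]:
        mismatches += 1
    index += 1
print("Matches: " + str(len(String1) - mismatches))
print("Mismatches: " + str(mismatches))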
You could try the below.
>>> import itertools
>>> s1 = 'aaabbbccc'
>>> s2 = 'aabbbcccc'
>>> match = 0
>>> mismatch = 0
>>> for i, j in itertools.zip_longest(s1, s2):
...     if i == j:
...         match += 1
...     else:
...         mismatch += 1
...
In Python 2 use itertools.izip_longest instead of itertools.zip_longest.
If you want to consider a and A as a match, then change the if condition to,
if i.lower() == j.lower():
Finally, get the match and mismatch counts from the variables match and mismatch.
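When the strings differ in length, zip_longest (izip_longest in Python 2) pads the shorter one with None, so calling .lower() on the padding raises an AttributeError. A minimal sketch, assuming Python 3 and that extra characters should count as mismatches; count_matches and ignore_case are hypothetical names.

```python
from itertools import zip_longest


def count_matches(s1, s2, ignore_case=False):
    """Return (matches, mismatches) for a position-by-position comparison."""
    if ignore_case:
        # Lower-case the whole strings up front, before any padding is added
        s1, s2 = s1.lower(), s2.lower()
    match = mismatch = 0
    # None never equals a real character, so padding counts as a mismatch
    for a, b in zip_longest(s1, s2):
        if a == b:
            match += 1
        else:
            mismatch += 1
    return match, mismatch


print(count_matches("aaabbbccc", "aabbbcccc"))                # (7, 2)
print(count_matches("aaAbbb", "aaabbbcc", ignore_case=True))  # (6, 2)
```

Normalising case before the loop keeps the comparison itself unchanged and sidesteps the None padding entirely.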
>>> s = list('aaabbbccc')
>>> s1 = list('aabbbcccc')
>>> match = 0
>>> mismatch = 0
>>> for i in range(0, len(s)):
...     if s[i] == s1[i]:
...         match += 1
...     else:
...         mismatch += 1
...         s[i] = s[i].upper()
...         s1[i] = s1[i].upper()
...
>>> print('Matches:' + str(match))
>>> print('MisMatches:' + str(mismatch))
>>> print('String 1:' + ''.join(s))
>>> print('String 2:' + ''.join(s1))
