Occurrence of characters in common in two strings - python

I want to use a for loop to calculate the number of times a character in one string occurs in another string.
e.g. if string1 = 'python' and string2 = 'boa constrictor' then it should calculate to 6 (2 t's, 3 o's, 1 n)
Does anyone know how to do this?

Pretty straightforward:
count = 0
for letter in set(string1):
    count += string2.count(letter)
print(count)

Use a dict comprehension: {ch: string2.count(ch) for ch in string1 if ch in string2}
I forgot that you need a for loop and the sum over all letters:
count = 0
# Note: a character that is repeated in string1 is counted once per repetition
for ch in string1:
    if ch in string2:
        count += string2.count(ch)
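If the explicit loop is not a hard requirement, the same total can be sketched with collections.Counter, which tallies string2 once instead of calling count for every letter. The function name count_shared is made up for this illustration.

```python
from collections import Counter

def count_shared(string1, string2):
    """Total occurrences in string2 of each distinct character of string1."""
    tallies = Counter(string2)  # one pass over string2
    # set(string1) so that a letter repeated in string1 is only counted once
    return sum(tallies[ch] for ch in set(string1))

print(count_shared('python', 'boa constrictor'))  # 6
```

A Counter returns 0 for missing keys, so letters of string1 that never appear in string2 need no special handling.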

Related

How to count all substrings in a string, and then sort based on if they begin with a vowel

I tried doing it like this:
# the system waits for the user's string input
word = input()
# defining vowels
vowels = "aeiou"
#// BEGIN_TODO [count_substrings] counting all substrings in a string
count_c: int = 0
count_v: int = 0
total_substrings = (len(word)*(len(word)+1)) // 2
for letter in word:
    if letter in vowels:
        for i in range(total_substrings):
            for j in range(i, total_substrings):
                count_v += 1
    if letter not in vowels:
        for i in range(total_substrings):
            for j in range(i, total_substrings):
                count_c += 1
print('number of substrings starting with Vowels: {} - number of substrings starting with Consonants: {}'.format(count_v, count_c))
#// END_TODO [count_substrings]
But the code outputs strange numbers. Any help on how to approach this problem would be appreciated.
Looks like you're trying to do the same thing twice: You use a formula to calculate the number of substrings, but then you also try to count them with loops. One of those is unnecessary. Or you could combine a little calculation and a little counting like this:
VOWELS = "aeiou"
word = 'test'
n = len(word)
count_v: int = 0
count_c: int = 0
for i, letter in enumerate(word):
    # number of substrings starting at i, not counting the empty string:
    n_substrings = n - i
    if letter in VOWELS:
        count_v += n_substrings
    else:
        count_c += n_substrings
print(f"number of substrings starting with vowels: {count_v}")
print(f"number of substrings starting with consonants: {count_c}")
number of substrings starting with vowels: 3
number of substrings starting with consonants: 7
Note, however, that this approach will overcount if there are identical substrings that can be started at different points in the string. For example, if word = 'eel', the substring 'e' would be counted twice. If you don't want that, it gets more complicated. You could extract all the substrings and collect them in two sets (one for those starting with a vowel, and one for the rest), to remove the duplicates.
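The set-based idea described above could look roughly like this. It is only a sketch: the function name distinct_substring_counts is invented here, and it assumes lowercase input and treats every non-vowel character as a consonant.

```python
VOWELS = "aeiou"

def distinct_substring_counts(word):
    """Return (vowel_count, consonant_count) of distinct non-empty substrings."""
    vowel_subs, consonant_subs = set(), set()
    for i in range(len(word)):
        # Pick the set that matches the first character of the substring
        target = vowel_subs if word[i] in VOWELS else consonant_subs
        for j in range(i + 1, len(word) + 1):
            target.add(word[i:j])  # duplicates are dropped by the set
    return len(vowel_subs), len(consonant_subs)

print(distinct_substring_counts("eel"))  # (4, 1)
```

For 'eel' the vowel set is {'e', 'ee', 'eel', 'el'}, so the second 'e' is no longer counted twice. Building every substring takes quadratic time and memory, which is fine for short words only.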

Count the number of times a letter appears in the reverse position of the string

I am trying to figure out a way that you can count how many times a letter appears in the exact opposite position. For example:
word = 'ABXCEEVHBA' --> The correct output is 3, because A is first and last, B is second and second from last, and so forth.
I have found an answer that gives me the correct result, but I was wondering if there is a more elegant way to do this, ideally with no modules.
word = 'ABXCEEVHBA'
reverse = ''.join(reversed(word))
sum = 0
for i in range(len(word)):
    if word[i] == reverse[i]:
        sum += 1
print(int(sum/2))
I believe this should do it:
>>> count = 0
>>> for i in range(len(word)//2):  # meet half-way.
...     if word[i] == word[~i]:
...         count += 1
...
>>> count
3
You can do it with zip by pairing the string with the reverse of itself:
sum(a==b for a,b in zip(word,reversed(word)))//2
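As a variation on the zip idea, one could pair only the first half of the word with the reversed word, which avoids the final division by two. This is a sketch, and count_mirrored is a hypothetical name.

```python
def count_mirrored(word):
    """Count positions whose letter equals the letter at the mirrored position."""
    half = len(word) // 2
    # zip stops after `half` pairs, so the middle letter of an
    # odd-length word is never compared with itself
    return sum(a == b for a, b in zip(word[:half], reversed(word)))

print(count_mirrored('ABXCEEVHBA'))  # 3
```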

How to find vowels in the odd positions of a string?

For my code, I have to make a function that counts the number of vowels in the odd positions of a string.
For example, the following will produce an output of 2.
st = "xxaeixxAU"
res = countVowelsOdd(st)
print (res)
For my code, the only problem I have is figuring out how to tell python to count the vowels in the ODD positions.
This is found in the second part of the "if statement" in my code where I tried to make the index odd by putting st[i] %2 == 1. I get all types of errors trying to fix this.
Any idea how to resolve this?
def countVowelsOdd(st):
    vowels = "aeiouAEIOU"
    count = 0
    for i, ch in enumerate(st):
        if i in vowels and st[i] % 2 == 1:
            count += 1
    return count
if i in vowels ...
i is the index, you want the letter
if ch in vowels ...
and then since you have the index, that is what you find the modulo on
if ch in vowels and i % 2 == 1:
enumerate gives you the position as the first item, i, and the character as the second, ch.
def countVowelsOdd(st):
    vowels = "aeiouAEIOU"
    count = 0
    for i, ch in enumerate(st):
        if ch in vowels and i % 2 == 1:
            count += 1
    return count
I don't know if your assignment/project precludes the use of regex, but if you are open to it, here is one option. We can first do a regex replacement that keeps the first character of every pair and drops the second. Then a second replacement removes all non-vowel characters. The length of what remains is the vowel count. Note that this keeps the characters at indices 0, 2, 4, ... (the odd positions when counting from 1), so for this input it reports 3 rather than the 2 that the 0-based index test in the question gives.
import re
st = "xxaeixxAU"
st = re.sub(r'(.).', '\\1', st)
print(st)
st = re.sub(r'[^aeiou]', '', st, flags=re.IGNORECASE)
print(len(st))
This prints:
xaixU
3
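If the positions are meant to be 0-based indices, as in the question's i % 2 == 1 test, the first replacement can keep the second character of each pair instead. The following is a sketch under that assumption; the function name count_vowels_odd_index is invented for the example.

```python
import re

def count_vowels_odd_index(st):
    # Keep only the second character of each pair: indices 1, 3, 5, ...
    # (.?) lets a trailing unpaired character be dropped as well.
    odd_chars = re.sub(r'.(.?)', r'\1', st, flags=re.DOTALL)
    # Remove everything that is not a vowel and count what is left
    return len(re.sub(r'[^aeiou]', '', odd_chars, flags=re.IGNORECASE))

print(count_vowels_odd_index("xxaeixxAU"))  # 2
```

For the sample input the intermediate string is 'xexA', which contains the two vowels 'e' and 'A'.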
Please have a look at this:
In [1]: a = '01234567'
In [2]: print(*(c for c in a[0::2]))
0 2 4 6
In [3]: print(*(c for c in a[1::2]))
1 3 5 7
In [4]: print(*(c in '12345' for c in a[1::2]))
True True True False
In [5]: print(sum(c in '12345' for c in a[1::2]))
3
Does it help with your problem?
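Applied to the original question, the slicing trick from the session above might be written as follows. This is a sketch that assumes "odd positions" means odd 0-based indices, which is what the expected output of 2 implies.

```python
def countVowelsOdd(st):
    vowels = "aeiouAEIOU"
    # st[1::2] is the characters at indices 1, 3, 5, ...
    # True counts as 1 when summed
    return sum(ch in vowels for ch in st[1::2])

print(countVowelsOdd("xxaeixxAU"))  # 2
```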

Count the total number of unique letters that occur exactly once in a string in Python?

a = 'abhishek'
count = 0
for x in a:
    if x in a:
        count += 1
print(count)
I have tried this, but it gives me the total number of letters. I want only the letters that occur exactly once.
len(set(a)) will give you the unique count of letters
Edit: add explanation
set(a) returns a container of all the unique characters (Python calls this the set) in the string a. Then len() gets the count of that set, which corresponds to the count of unique chars in string a.
You are iterating the string and checking the letter in the string itself, so your if condition is always True in this case.
What you need is to maintain a separate list of all the letters you have already seen while iterating the string. Like this,
uniq_list = []
a = 'abhishek'
count = 0
for x in a:
    if x not in uniq_list:  # check if the letter is already seen.
        count += 1  # increase the counter only when the letter is not seen.
        uniq_list.append(x)  # add the letter in the list to mark it as seen.
print(count)
a = 'abhishek'
count = 0
uls = set()
nls = set()
for x in a:
    if x not in uls:
        uls.add(x)
    else:
        nls.add(x)
print(len(uls - nls))
This prints the number of characters that occur only once.
Output: 6
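The same "seen once versus seen again" bookkeeping can also be sketched with collections.Counter from the standard library. The function name count_singletons is made up for this example.

```python
from collections import Counter

def count_singletons(text):
    """Number of distinct characters that occur exactly once in text."""
    return sum(1 for n in Counter(text).values() if n == 1)

print(count_singletons('abhishek'))  # 6
```

In 'abhishek' only 'h' occurs twice, so six of the seven distinct letters qualify.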
Why not just:
a = 'abhishek'
a.count('a') # or any other letter you want to count.
1
Is this what you want?

Count number of occurrences of a substring in a string

How can I count the number of times a given substring is present within a string in Python?
For example:
>>> 'foo bar foo'.numberOfOccurrences('foo')
2
To get indices of the substrings, see How to find all occurrences of a substring?.
string.count(substring), like in:
>>> "abcdabcva".count("ab")
2
This is for non overlapping occurrences.
If you need to count overlapping occurrences, you'd better check the answers here, or just check my other answer below.
s = 'arunununghhjj'
sb = 'nun'
results = 0
sub_len = len(sb)
for i in range(len(s)):
    if s[i:i+sub_len] == sb:
        results += 1
print(results)
Depending on what you really mean, I propose the following solutions:
You mean a list of space-separated substrings and want to know the position of one substring among them:
s = 'sub1 sub2 sub3'
s.split().index('sub2')
>>> 1
You mean the char-position of the sub-string in the string:
s.find('sub2')
>>> 5
You mean the (non-overlapping) count of appearances of a substring:
s.count('sub2')
>>> 1
s.count('sub')
>>> 3
The best way to find overlapping sub-strings in a given string is to use a regular expression. With lookahead, it will find all the overlapping matches using the regular expression library's findall(). Here, left is the substring and right is the string to match.
>>> len(re.findall(r'(?=aa)', 'caaaab'))
3
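To make the effect of the lookahead concrete, the sketch below compares a plain pattern with the lookahead form. The function name count_overlapping and the use of re.escape (so that the substring is taken literally) are additions for illustration, not part of the answer above.

```python
import re

def count_overlapping(text, sub):
    """Count occurrences of sub in text, including overlapping ones."""
    if not sub:
        return 0  # assumption: an empty substring counts as zero matches
    # (?=...) is zero-width, so the scan only advances one character
    # after each match and overlapping occurrences are all found
    pattern = '(?=' + re.escape(sub) + ')'
    return sum(1 for _ in re.finditer(pattern, text))

print(len(re.findall('aa', 'caaaab')))    # 2, matches consume characters
print(count_overlapping('caaaab', 'aa'))  # 3, matches may overlap
```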
To find overlapping occurrences of a substring in a string in Python 3, this algorithm will do:
def count_substring(string, sub_string):
    l = len(sub_string)
    count = 0
    for i in range(len(string) - l + 1):
        if string[i:i+l] == sub_string:
            count += 1
    return count
I myself checked this algorithm and it worked.
You can count the frequency using two ways:
Using the count() in str:
a.count(b)
Or, you can use:
len(a.split(b))-1
Where a is the string and b is the substring whose frequency is to be calculated.
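A quick check that the two expressions agree, as a sketch. The helper name count_by_split is hypothetical, and the comparison assumes b is non-empty, because str.split raises ValueError for an empty separator while str.count('') returns len(a) + 1.

```python
def count_by_split(a, b):
    # Splitting on b produces one more piece than there are
    # non-overlapping occurrences of b
    return len(a.split(b)) - 1

a = "abcdabcva"
b = "ab"
print(a.count(b))            # 2
print(count_by_split(a, b))  # 2
```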
Scenario 1: Occurrence of a word in a sentence.
e.g. str1 = "This is an example and is easy", counting the occurrences of "is". Let str2 = "is":
count = str1.count(str2)  # 3: count matches substrings, so the "is" inside "This" is included
Scenario 2 : Occurrence of pattern in a sentence.
string = "ABCDCDC"
substring = "CDC"
def count_substring(string,sub_string):
    len1 = len(string)
    len2 = len(sub_string)
    j = 0
    counter = 0
    while(j < len1):
        if(string[j] == sub_string[0]):
            if(string[j:j+len2] == sub_string):
                counter += 1
        j += 1
    return counter
Thanks!
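For Scenario 1, str.count matches substrings rather than whole words. If only standalone words should be counted, a regular expression with word boundaries is one option. This is a sketch; count_word is an invented helper name.

```python
import re

def count_word(sentence, word):
    """Count occurrences of word as a whole word in sentence."""
    # \b matches at word boundaries; re.escape keeps the word literal
    return len(re.findall(r'\b' + re.escape(word) + r'\b', sentence))

str1 = "This is an example and is easy"
print(str1.count("is"))        # 3, includes the "is" inside "This"
print(count_word(str1, "is"))  # 2
```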
The current best answer, which uses the count method, does not count overlapping occurrences and does not special-case empty substrings either.
For example:
>>> a = 'caatatab'
>>> b = 'ata'
>>> print(a.count(b)) #overlapping
1
>>> print(a.count('')) #empty string
9
The first answer should be 2, not 1, if we consider overlapping substrings.
As for the second answer, it's better if an empty substring returns 0 as the answer.
The following code takes care of these things.
def num_of_patterns(astr,pattern):
    astr, pattern = astr.strip(), pattern.strip()
    if pattern == '': return 0
    ind, count, start_flag = 0,0,0
    while True:
        try:
            if start_flag == 0:
                ind = astr.index(pattern)
                start_flag = 1
            else:
                ind += 1 + astr[ind+1:].index(pattern)
            count += 1
        except ValueError:  # raised by index() when there is no further match
            break
    return count
Now when we run it:
>>> num_of_patterns('caatatab', 'ata') #overlapping
2
>>> num_of_patterns('caatatab', '') #empty string
0
>>> num_of_patterns('abcdabcva','ab') #normal
2
The question isn't very clear, but I'll answer what you are, on the surface, asking.
A string S, which is L characters long, and where S[1] is the first character of the string and S[L] is the last character, has the following substrings:
The null string ''. There is one of these.
For every value A from 1 to L, for every value B from A to L, the string S[A]..S[B]
(inclusive). There are L + L-1 + L-2 + ... 1 of these strings, for a
total of 0.5*L*(L+1).
Note that the second item includes S[1]..S[L],
i.e. the entire original string S.
So, there are 0.5*L*(L+1) + 1 substrings within a string of length L. Render that expression in Python, and you have the number of substrings present within the string.
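Rendered in Python, the expression can be checked against a brute-force enumeration. This is a sketch with made-up function names; note that it counts substrings by position, so identical substrings at different positions are counted separately.

```python
def substring_total(s):
    """Closed form: L*(L+1)/2 non-empty substrings plus the empty string."""
    L = len(s)
    return L * (L + 1) // 2 + 1

def substring_total_brute(s):
    total = 1  # the empty string
    for a in range(len(s)):
        for b in range(a, len(s)):
            total += 1  # the substring s[a..b], inclusive
    return total

print(substring_total("test"))        # 11
print(substring_total_brute("test"))  # 11
```

Integer division is used instead of multiplying by 0.5 so that the result stays an int.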
One way is to use re.subn. For example, to count the number of
occurrences of 'hello' in any mix of cases you can do:
import re
_, count = re.subn(r'hello', '', astring, flags=re.I)
print('Found', count, 'occurrences of "hello"')
How about a one-liner with a list comprehension? Technically it's 93 characters long, so spare me the PEP 8 purism. The regex findall answer is the most readable if it's a high-level piece of code. If you're building something low-level and don't want dependencies, this one is pretty lean and mean. I'm giving the overlapping answer. Obviously, just use count like the highest-scored answer if there isn't overlap.
def count_substring(string, sub_string):
    return len([i for i in range(len(string)) if string[i:i+len(sub_string)] == sub_string])
If you want to count all occurrences of the substring (including overlapping ones), then use this method.
import re
def count_substring(string, sub_string):
    # re.escape keeps characters such as '.' or '*' in sub_string literal
    regex = '(?=' + re.escape(sub_string) + ')'
    return len(re.findall(regex, string))
I will keep my accepted answer as the "simple and obvious way to do it", however, it does not cover overlapping occurrences.
Finding out those can be done naively, with multiple checking of the slices - as in:
sum("GCAAAAAGH"[i:].startswith("AAA") for i in range(len("GCAAAAAGH")))
which yields 3.
Or it can be done by a tricky use of regular expressions, as can be seen at How to use regex to find all overlapping matches; it can also make for fine code golfing.
This is my "hand made" count for overlapping occurrences of patterns in a string, which tries not to be extremely naive (at least it does not create new string objects at each iteration):
from array import array
def find_matches_overlapping(text, pattern):
    lpat = len(pattern) - 1
    matches = []
    text = array("u", text)
    pattern = array("u", pattern)
    indexes = {}  # start index of a candidate match -> last matched pattern offset
    # Scan the whole text so that candidates near the end can still complete
    for i in range(len(text)):
        if text[i] == pattern[0]:
            indexes[i] = -1
        for index, counter in list(indexes.items()):
            counter += 1
            if text[i] == pattern[counter]:
                if counter == lpat:
                    matches.append(index)
                    del indexes[index]
                else:
                    indexes[index] = counter
            else:
                del indexes[index]
    return matches
def count_matches(text, pattern):
    return len(find_matches_overlapping(text, pattern))
For an overlapping count we can use:
def count_substring(string, sub_string):
    count = 0
    beg = 0
    while string.find(sub_string, beg) != -1:
        count = count + 1
        beg = string.find(sub_string, beg)
        beg = beg + 1
    return count
For non-overlapping case we can use count() function:
string.count(sub_string)
Overlapping occurrences:
def olpcount(string, pattern, case_sensitive=True):
    if not case_sensitive:
        string = string.lower()
        pattern = pattern.lower()
    l = len(pattern)
    ct = 0
    for c in range(0, len(string)):
        if string[c:c+l] == pattern:
            ct += 1
    return ct
test = 'my maaather lies over the oceaaan'
print(test)
print(olpcount(test, 'a'))
print(olpcount(test, 'aa'))
print(olpcount(test, 'aaa'))
Results:
my maaather lies over the oceaaan
6
4
2
Here's a solution that counts occurrences including overlapping ones. Note that a substring can overlap itself whenever one of its proper prefixes equals one of its suffixes (for example 'aa', 'aba' or 'abab'), so checking only whether the first and last characters match is not sufficient. The version below therefore restarts the search one character after the start of each match.
def substr_count(st, sub):
    # An empty substring would match forever, so treat it as zero occurrences
    if not sub:
        return 0
    # Work on a shrinking slice of the source string: after each match,
    # drop everything up to and including the first character of that
    # match, so the next search can find overlapping occurrences
    _st = st
    cnt = 0
    while True:
        try:
            start = _st.index(sub)
        except ValueError:
            # No further occurrence of the substring
            return cnt
        cnt += 1
        _st = _st[start + 1:]
If you're looking for a robust solution that works in every case, this function should work:
def count_substring(string, sub_string):
    ans = 0
    for i in range(len(string)-(len(sub_string)-1)):
        if sub_string == string[i:len(sub_string)+i]:
            ans += 1
    return ans
If you want to find the count of a substring inside any string, please use the code below.
The code is easy to understand, which is why I skipped the comments. :)
string = input()
sub_string = input()
start = 0
answer = 0
length = len(string)
index = string.find(sub_string, start, length)
while index != -1:
    start = index + 1
    answer = answer + 1
    index = string.find(sub_string, start, length)
print(answer)
You could use the startswith method:
def count_substring(string, sub_string):
    x = 0
    for i in range(len(string)):
        if string[i:].startswith(sub_string):
            x += 1
    return x
def count_substring(string, sub_string):
    inc = 0
    for i in range(0, len(string)):
        slice_object = slice(i,len(sub_string)+i)
        count = len(string[slice_object])
        if(count == len(sub_string)):
            if(sub_string == string[slice_object]):
                inc = inc + 1
    return inc
if __name__ == '__main__':
    string = input().strip()
    sub_string = input().strip()
    count = count_substring(string, sub_string)
    print(count)
def count_substring(string, sub_string):
    k=len(string)
    m=len(sub_string)
    i=0
    l=0
    count=0
    while l<k:
        if string[l:l+m]==sub_string:
            count=count+1
        l=l+1
    return count
if __name__ == '__main__':
    string = input().strip()
    sub_string = input().strip()
    count = count_substring(string, sub_string)
    print(count)
2+ others have already provided this solution, and I even upvoted one of them, but mine is probably the easiest for newbies to understand.
def count_substring(string, sub_string):
    slen = len(string)
    sslen = len(sub_string)
    range_s = slen - sslen + 1
    count = 0
    for i in range(range_s):
        if string[i:i+sslen] == sub_string:
            count += 1
    return count
I'm not sure if this is something looked at already, but I thought of this as a solution for a word that is 'disposable':
count = 0
for i in range(len(word)):
    if word[:len(term)] == term:
        count += 1
    word = word[1:]  # drop the first character; this consumes word
print(count)
Here word is the word you are searching in and term is the term you are looking for.
string = "abc"
mainstr = "ncnabckjdjkabcxcxccccxcxcabc"
count = 0
# stop early enough that mainstr[i+k] never runs past the end
for i in range(0, len(mainstr) - len(string) + 1):
    k = 0
    while k < len(string):
        if string[k] == mainstr[i+k]:
            k += 1
        else:
            break
    if k == len(string):
        count += 1
print(count)
my_string = """Strings are amongst the most popular data types in Python.
We can create the strings by enclosing characters in quotes.
Python treats single quotes the same as double quotes."""
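# Use distinct names so the second count does not overwrite the first;
# split() with no argument also splits on the newlines
count_string = my_string.lower().split().count("string")
count_strings = my_string.lower().split().count("strings")
print("The number of occurrences of the word 'string' is:", count_string)
print("The number of occurrences of the word 'strings' is:", count_strings)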
For a simple string with space delimitation, using a dict would be quite fast; please see the code below:
def getStringCount(mnstr: str, sbstr: str = '') -> int:
    """ Assumes two input strings: the string to search and the
    space-delimited word to count.
    Returns the number of occurrences of that word.
    """
    sbstr = sbstr.strip()  # strip first so the dict key matches the lookup below
    x = dict()
    x[sbstr] = 0
    for st in mnstr.split(' '):
        if st not in [sbstr]:
            continue
        try:
            x[st] += 1
        except KeyError:
            x[st] = 1
    return x[sbstr]
s = 'foo bar foo test one two three foo bar'
getStringCount(s, 'foo')
The logic below removes all whitespace from both strings before counting, so it also matches across word boundaries and works with special characters:
def cnt_substr(inp_str, sub_str):
    inp_join_str = ''.join(inp_str.split())
    sub_join_str = ''.join(sub_str.split())
    return inp_join_str.count(sub_join_str)
print(cnt_substr("the sky is $blue and not greenthe sky is $blue and not green", "the sky"))
Here's the solution in Python 3 and case insensitive:
s = 'foo bar foo'.upper()
sb = 'foo'.upper()
results = 0
sub_len = len(sb)
for i in range(len(s)):
    if s[i:i+sub_len] == sb:
        results += 1
print(results)
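# The function header and the initial values of count and i were missing
# from the post and are assumed here
def count_substring(string, sub_string):
    count = 0
    i = 0
    j = 0
    while i < len(string):
        sub_string_out = string[i:len(sub_string)+j]
        if sub_string == sub_string_out:
            count += 1
        i += 1
        j += 1
    return count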
# counting occurrences of a substring in another string (overlapping/non-overlapping)
s = input('enter the main string: ')  # e.g. 'bobazcbobobegbobobgbobobhaklpbobawanbobobobob'
p = input('enter the substring: ')  # e.g. 'bob'
counter = 0
for i in range(len(s)-len(p)+1):
    c = 0  # characters matched at this start position; reset for every i
    for j in range(len(p)):
        if s[i+j] == p[j]:
            c = c + 1
            if c == len(p):
                counter += 1
        else:
            break
print('number of occurrences of the substring in the main string is: ', counter)
