Removing substring from string in Python

Removing substring from string in Python - python

I want to write a function that takes 2 inputs: a string and a substring, then the function will remove that part of the substring from the string.
def remove_substring(s, substr):
"""(str, str) -> NoneType
Returns string without string substr
remove_substring_from_string("Im in cs", "cs")
Im in
"""
other_s = ''
for substr in s:
if substr in s:
continue
How do I continue on from here? Assuming my logic is sound.

Avoiding the use of Python functions.
Method 1
def remove_substring_from_string(s, substr):
'''
find start index in s of substring
remove it by skipping over it
'''
i = 0
while i < len(s) - len(substr) + 1:
# Check if substring starts at i
if s[i:i+len(substr)] == substr:
break
i += 1
else:
# break not hit, so substr not found
return s
# break hit
return s[:i] + s[i+len(substr):]
Method 2
If the range function can be used, the above can be written more compactly as follows.
def remove_substring_from_string(s, substr):
'''
find start index in s of substring
remove it by skipping over it
'''
for i in range(len(s) - len(substr) + 1):
if s[i:i+len(substr)] == substr:
break
else:
# break not hit, so substr not found
return s
return s[:i] + s[i+len(substr):]
Test
print(remove_substring_from_string("I have nothing to declare except my genuis", " except my genuis"))
# Output: I have nothing to declare'

This approach is based on the KMP algorithm:
def KMP(s):
n = len(s)
pi = [0 for _ in range(n)]
for i in range(1, n):
j = pi[i - 1]
while j > 0 and s[i] != s[j]:
j = pi[j - 1]
if s[i] == s[j]:
j += 1
pi[i] = j
return pi
# Removes all occurences of t in s
def remove_substring_from_string(s, t):
n = len(s)
m = len(t)
# Calculate the prefix function using KMP
pi = KMP(t + '\x00' + s)[m + 1:]
r = ""
i = 0
while i + m - 1 < n: # Before the remaining string is smaller than the substring
if pi[i + m - 1] == m: # If the substring is here, skip it
i += m
else: # Otherwise, add the current character and move to the next
r += s[i]
i += 1
# Add the remaining string
r += s[i:]
return r
It runs in O(|s| + |t|), but it has a few downsides:
The code is long and unintuitive.
It requires that there is no null (\x00) in the input strings.
Its constant factors are pretty bad for short s and t.
It doesn't handle overlapping strings how you might want it: remove_substring_from_string("aaa", "aa") will return "a". The only guarantee made is that t in remove_substring_from_string(s, t) is False for any two strings s and t.
A C++ example and further explanation for the KMP algorithm can be found here. The remove_substring_from_string function then only checks if the entire substring is matched at each position and if so, skips over the substring.

I would do this using re.
import re
def remove_substring(s, substr):
# type: (str, str) -> str
return re.subn(substr, '', s)[0]
remove_substring('I am in cs', 'cs')
# 'I am in '
remove_substring('This also removes multiple substr that are found. Even if that substr is repeated like substrsubstrsubstr', 'substr')
# 'This also removes multiple that are found. Even if that is repeated like '

def remove_substring(s, substr):
while s != "":
if substr in s:
s = s.replace(substr, "")
else:
return s
if s == "":
return "Empty String"
The idea here is that we replace all occurrences of substr within s, by replacing the first instance of substr and then looping until we are done.

Related

Need to print the first occurrence of k-size subsequence, but my code prints the last

You are given a string my_string and a positive integer k. Write code that prints the first substring of my_string of length k all of whose characters are identical (lowercase and uppercase are different). If none such exists, print an appropriate error message (see Example 4 below). In particular, the latter holds when my_string is empty.
Example: For the input my_string = “abaadddefggg”, k = 3 the output is For length 3, found the substring ddd!
Example: For the input my_string = “abaadddefggg”, k = 9 the output is Didn't find a substring of length 9
this is my attempt:
my_string = 'abaadddefggg'
k = 3
s=''
for i in range(len(my_string) - k + 1):
if my_string[i:i+k] == my_string[i] * k:
s = my_string[i:i+k]
if len(s) > 0:
print(f'For length {k}, found the substring {s}!')
else:
print(f"Didn't find a substring of length {k}")

You need break. When you find the first occurrence you need to exit from the for-loop. You can do this with break. If you don't break you continue and maybe find the last occurrence if exists.
my_string = 'abaadddefggg'
k = 3
s=''
for i in range(len(my_string) - k + 1):
if my_string[i:i+k] == my_string[i] * k:
s = my_string[i:i+k]
break
if len(s) > 0:
print(f'For length {k}, found the substring {s}!')
else:
print(f"Didn't find a substring of length {k}")
You can use itertools.groupby and also You can write a function and use return and then find the first occurrence and return result from the function.
import itertools
def first_occurrence(string, k):
for key, group in itertools.groupby(string):
lst_group = list(group)
if len(lst_group) == k:
return ''.join(lst_group)
return ''
my_string = 'abaadddefggg'
k = 3
s = first_occurrence(my_string, k)
if len(s) > 0:
print(f'For length {k}, found the substring {s}!')
else:
print(f"Didn't find a substring of length {k}")

One alternative approach using an iterator:
my_string = 'abaadddefggg'
N= 3
ou = next((my_string[i:i+N] for i in range(len(my_string)-N)
if len(set(my_string[i:i+N])) == 1), 'no match')
Output: 'ddd'

Scalable solution to check if string a a substring of large number of strings

How can I efficiently check if a string is a substring of any a given strings?
The naive way would be:
def is_substring_of_any(phrase, known_phrases):
for known_phrase in known_phrases:
if phrase in known_phrase:
return True
return False
However, it scales badly for large number of strings:
known_phrases = [phrase_generator(15) for _ in tqdm(range(10 ** 6), desc="Generating known phrases")]
phrases_to_check = [phrase_generator(7) for _ in tqdm(range(10 ** 5), desc="Generating phrases to check")]
for phrase in tqdm(phrases_to_check):
is_substring_of_any(phrase, known_phrases)
giving >1h to process them all:
Generating known phrases: 100%|██████████| 1000000/1000000 [00:13<00:00, 73370.36it/s]
Generating phrases to check: 100%|██████████| 100000/100000 [00:00<00:00, 137534.76it/s]
6%|▌ | 5991/100000 [04:23<1:11:20, 21.96it/s]
Is there a way to run it faster without setting up additional infrastructure?

Boyer–Moore string-search algorithm
Boyer–Moore string-search algorithm is an efficient string-searching algorithm that is the standard benchmark for practical string-search literature.
The Boyer–Moore algorithm searches for occurrences of $P$ in $T$ by performing explicit character comparisons at different alignments. Instead of a brute-force search of all alignments $m-n+1$ Boyer–Moore uses information gained by preprocessing P to skip as many alignments as possible.
The key insight in this algorithm is that if the end of the pattern is compared to the text, then jumps along the text can be made rather than checking every character of the text. The reason that this works is that in lining up the pattern against the text, the last character of the pattern is compared to the character in the text. If the characters do not match, there is no need to continue searching backwards along the text. If the character in the text does not match any of the characters in the pattern, then the next character in the text to check is located n characters farther along the text, where n is the length of the pattern. If the character in the text is in the pattern, then a partial shift of the pattern along the text is done to line up along the matching character and the process is repeated. Jumping along the text to make comparisons rather than checking every character in the text decreases the number of comparisons that have to be made, which is the key to the efficiency of the algorithm.
More formally, the algorithm begins at alignment $k=n$, so the start of P is aligned with the start of T. Characters in P and T are then compared starting at index n in P and k in T, moving backward. The strings are matched from the end of P to the start of P. The comparisons continue until either the beginning of P is reached (which means there is a match) or a mismatch occurs upon which the alignment is shifted forward (to the right) according to the maximum value permitted by a number of rules. The comparisons are performed again at the new alignment, and the process repeats until the alignment is shifted past the end of T, which means no further matches will be found.
The shift rules are implemented as constant-time table lookups, using tables generated during the preprocessing of P.
Python Implementation:
from typing import *
# This version is sensitive to the English alphabet in ASCII for case-insensitive matching.
# To remove this feature, define alphabet_index as ord(c), and replace instances of "26"
# with "256" or any maximum code-point you want. For Unicode you may want to match in UTF-8
# bytes instead of creating a 0x10FFFF-sized table.
ALPHABET_SIZE = 26
def alphabet_index(c: str) -> int:
"""Return the index of the given character in the English alphabet, counting from 0."""
val = ord(c.lower()) - ord("a")
assert val >= 0 and val < ALPHABET_SIZE
return val
def match_length(S: str, idx1: int, idx2: int) -> int:
"""Return the length of the match of the substrings of S beginning at idx1 and idx2."""
if idx1 == idx2:
return len(S) - idx1
match_count = 0
while idx1 < len(S) and idx2 < len(S) and S[idx1] == S[idx2]:
match_count += 1
idx1 += 1
idx2 += 1
return match_count
def fundamental_preprocess(S: str) -> List[int]:
"""Return Z, the Fundamental Preprocessing of S.
Z[i] is the length of the substring beginning at i which is also a prefix of S.
This pre-processing is done in O(n) time, where n is the length of S.
"""
if len(S) == 0: # Handles case of empty string
return []
if len(S) == 1: # Handles case of single-character string
return [1]
z = [0 for x in S]
z[0] = len(S)
z[1] = match_length(S, 0, 1)
for i in range(2, 1 + z[1]): # Optimization from exercise 1-5
z[i] = z[1] - i + 1
# Defines lower and upper limits of z-box
l = 0
r = 0
for i in range(2 + z[1], len(S)):
if i <= r: # i falls within existing z-box
k = i - l
b = z[k]
a = r - i + 1
if b < a: # b ends within existing z-box
z[i] = b
else: # b ends at or after the end of the z-box, we need to do an explicit match to the right of the z-box
z[i] = a + match_length(S, a, r + 1)
l = i
r = i + z[i] - 1
else: # i does not reside within existing z-box
z[i] = match_length(S, 0, i)
if z[i] > 0:
l = i
r = i + z[i] - 1
return z
def bad_character_table(S: str) -> List[List[int]]:
"""
Generates R for S, which is an array indexed by the position of some character c in the
English alphabet. At that index in R is an array of length |S|+1, specifying for each
index i in S (plus the index after S) the next location of character c encountered when
traversing S from right to left starting at i. This is used for a constant-time lookup
for the bad character rule in the Boyer-Moore string search algorithm, although it has
a much larger size than non-constant-time solutions.
"""
if len(S) == 0:
return [[] for a in range(ALPHABET_SIZE)]
R = [[-1] for a in range(ALPHABET_SIZE)]
alpha = [-1 for a in range(ALPHABET_SIZE)]
for i, c in enumerate(S):
alpha[alphabet_index(c)] = i
for j, a in enumerate(alpha):
R[j].append(a)
return R
def good_suffix_table(S: str) -> List[int]:
"""
Generates L for S, an array used in the implementation of the strong good suffix rule.
L[i] = k, the largest position in S such that S[i:] (the suffix of S starting at i) matches
a suffix of S[:k] (a substring in S ending at k). Used in Boyer-Moore, L gives an amount to
shift P relative to T such that no instances of P in T are skipped and a suffix of P[:L[i]]
matches the substring of T matched by a suffix of P in the previous match attempt.
Specifically, if the mismatch took place at position i-1 in P, the shift magnitude is given
by the equation len(P) - L[i]. In the case that L[i] = -1, the full shift table is used.
Since only proper suffixes matter, L[0] = -1.
"""
L = [-1 for c in S]
N = fundamental_preprocess(S[::-1]) # S[::-1] reverses S
N.reverse()
for j in range(0, len(S) - 1):
i = len(S) - N[j]
if i != len(S):
L[i] = j
return L
def full_shift_table(S: str) -> List[int]:
"""
Generates F for S, an array used in a special case of the good suffix rule in the Boyer-Moore
string search algorithm. F[i] is the length of the longest suffix of S[i:] that is also a
prefix of S. In the cases it is used, the shift magnitude of the pattern P relative to the
text T is len(P) - F[i] for a mismatch occurring at i-1.
"""
F = [0 for c in S]
Z = fundamental_preprocess(S)
longest = 0
for i, zv in enumerate(reversed(Z)):
longest = max(zv, longest) if zv == i + 1 else longest
F[-i - 1] = longest
return F
def string_search(P, T) -> List[int]:
"""
Implementation of the Boyer-Moore string search algorithm. This finds all occurrences of P
in T, and incorporates numerous ways of pre-processing the pattern to determine the optimal
amount to shift the string and skip comparisons. In practice it runs in O(m) (and even
sublinear) time, where m is the length of T. This implementation performs a case-insensitive
search on ASCII alphabetic characters, spaces not included.
"""
if len(P) == 0 or len(T) == 0 or len(T) < len(P):
return []
matches = []
# Preprocessing
R = bad_character_table(P)
L = good_suffix_table(P)
F = full_shift_table(P)
k = len(P) - 1 # Represents alignment of end of P relative to T
previous_k = -1 # Represents alignment in previous phase (Galil's rule)
while k < len(T):
i = len(P) - 1 # Character to compare in P
h = k # Character to compare in T
while i >= 0 and h > previous_k and P[i] == T[h]: # Matches starting from end of P
i -= 1
h -= 1
if i == -1 or h == previous_k: # Match has been found (Galil's rule)
matches.append(k - len(P) + 1)
k += len(P) - F[1] if len(P) > 1 else 1
else: # No match, shift by max of bad character and good suffix rules
char_shift = i - R[alphabet_index(T[h])][i]
if i + 1 == len(P): # Mismatch happened on first attempt
suffix_shift = 1
elif L[i + 1] == -1: # Matched suffix does not appear anywhere in P
suffix_shift = len(P) - F[i + 1]
else: # Matched suffix appears in P
suffix_shift = len(P) - 1 - L[i + 1]
shift = max(char_shift, suffix_shift)
previous_k = k if shift >= i + 1 else previous_k # Galil's rule
k += shift
return matches
For more Information:
https://en.m.wikipedia.org/wiki/Boyer%E2%80%93Moore_string-search_algorithm

Python is not recognizing null in a list

I am new in Python and I have a question. What should I put in line 5? I want to see if the character c doesn't exist in groups, then I want to create one and assign it to 0. And then, increase it as much as this character has appeared
def firstUniqChar(self, s):
groups = {}
for i in range(0, len(s) - 1):
c = s[i]
if groups[c] == null:
groups[c] = 0
else:
groups[c] = groups[c] + 1
for j in range(0, len(s) - 1):
if groups[s[i]] == 1:
return j
return -1

Rather than null you need to write None in python.
You can write it like
if groups[c] is None:

How to handle long strings on hackerrank in python

How to handle long strings in hackerrank in python as it shows an error:
Terminated due to timeout.
Written code in python:
s = 'NANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANAN'
def calculate(s):
result = []
for num in range(1,len(s)+1):
for i in range(0,len(s)):
temp = s[i:i+num]
if(len(temp) == num):
result.append(temp)
return(result)
resultlist = calculate(s)
print(resultlist)
resultset = set(resultlist)
resultdict = {}
for elem in resultset:
resultdict[elem] = resultlist.count(elem)
countVowel = 0
countcons = 0
for elem in resultdict.keys():
if elem[0] in 'AEIOU':
countVowel += resultdict[elem]
else:
countcons += resultdict[elem]
if(countVowel > countcons):
print("Kevin "+str(countVowel))
elif(countVowel < countcons):
print("Stuart "+str(countcons))
else:
print("draw")
The expected result should print string, but it's showing an error of terminated due to timeout.

There is no problem with long strings in Python, only with algorithms with O(n³) complexity (two nested loops plus string slicing) that are applied to inputs with size n=5000.
It seems like you want to compare how many substrings of the input s start with a vowel vs. how many start with a non-vowel. For this you do not really need to calculate all those substrings -- you only need to know how many substrings there are starting at the current position!
countVowel = 0
countcons = 0
for i, a in enumerate(s):
if a in 'AEIOU':
countVowel += len(s) - i
else:
countcons += len(s) - i
And that's all you need. No calculate, not resultlist, -set and -dict. For a shorter test input, this yielded the same result as your algorithm, just much, much faster, in O(n).
For example, if you have the string ABCDE and you are currently at position B, you have the substrings [B, BC, BCD, BCDE]. You do not have to actually calculate all those substrings to know that (a) there are four of them (the length of the string minus the current position) and (b) all of those start with a B. Thus, just iterate the characters in the string, calculate the number of substrings from that position, and add that number to the counter corresponding to the current character.

Try this: (global s is renamed to inp)
inp = 'NANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANANNANAN'
def calculate(s):
result = {}
s_len = len(s)
for num in range(1, s_len + 1):
for i in range(0, s_len - num + 1):
temp = s[i:i + num]
count = result.setdefault(temp, 0)
result[temp] = count + 1
return result
resultdict = calculate(inp)
countVowel = 0
countcons = 0
for elem in resultdict.keys():
if elem[0] in 'AEIOU':
countVowel += resultdict[elem]
else:
countcons += resultdict[elem]
if countVowel > countcons:
print("Kevin " + str(countVowel))
elif countVowel < countcons:
print("Stuart " + str(countcons))
else:
print("draw")
I got result Stuart 7501500
If you provide more input examples and expected results - it could be possible to check :)

first time programming-exercise

First time programming ever... I'm trying to do this exercise to.. :
Write a program that prints the longest substring of s in which the letters occur in alphabetical order. For example, if s = 'azcbobobegghakl', then your program should print
Longest substring in alphabetical order is: beggh
I was here..before starting to freak out:
s = 'abcdezcbobobegghakl'
n = len(s)
x = 0
x += 1
lengh = s[x-1]
if s[x] >= s[x-1]:
lengh = lengh + s[x]
if s[x+1] < s[x]:
n = len(lengh)
if x > n:
break
print('Longest substring in alphabetical order is: ' + str(lengh))
I know this code is bad..I m trying to find substring in alphabetical order and is some way keep the longest one! I know could be normal, because I never programmed before but i feel really frustrated...any good idea/help??

First try to decompose your problem into little problems (do not optimize ! until your problem is solved), if you have learned about functions they are a good way to decompose execution flow into readable and understandable snippets.
An example to start would be :
def get_sequence_size(my_string, start):
# Your code here
return size_of_sequence
current_position = 0
while current_position < len(my_string):
# Your code here using get_sequence_size() function

The following code solves the problem using the reduce method:
solution = ''
def check(substr, char):
global solution
last_char = substr[-1]
substr = (substr + char) if char >= last_char else char
if len(substr) > len(solution):
solution = substr
return substr
def get_largest(s):
global solution
solution = ''
reduce(check, list(s))
return solution

def find_longest_substr(my_str):
# string var to hold the result
res = ""
# candidate for the longest sub-string
candidate = ""
# for each char in string
for char in my_str:
# if candidate is empty, just add the first char to it
if not candidate:
candidate += char
# if last char in candidate is "lower" than equal to current char, add char to candidate
elif candidate[-1] <= char:
candidate += char
# if candidate is longer than result, we found new longest sub-string
elif len(candidate) > len(res):
res= candidate
candidate = char
# reset candidate and add current char to it
else:
candidate = char
# last candidate is the longest, update result
if len(candidate) > len(res):
res= candidate
return res
def main():
str1 = "azcbobobegghaklbeggh"
longest = find_longest_substr(str1)
print longest
if __name__ == "__main__":
main()

These are all assuming you have a string (s) and are needing to find the longest substring in alphabetical order.
Option A
test = s[0] # seed with first letter in string s
best = '' # empty var for keeping track of longest sequence
for n in range(1, len(s)): # have s[0] so compare to s[1]
if len(test) > len(best):
best = test
if s[n] >= s[n-1]:
test = test + s[n] # add s[1] to s[0] if greater or equal
else: # if not, do one of these options
test = s[n]
print "Longest substring in alphabetical order is:", best
Option B
maxSub, currentSub, previousChar = '', '', ''
for char in s:
if char >= previousChar:
currentSub = currentSub + char
if len(currentSub) > len(maxSub):
maxSub = currentSub
else: currentSub = char
previousChar = char
print maxSub
Option C
matches = []
current = [s[0]]
for index, character in enumerate(s[1:]):
if character >= s[index]: current.append(character)
else:
matches.append(current)
current = [character]
print "".join(max(matches, key=len))
Option D
def longest_ascending(s):
matches = []
current = [s[0]]
for index, character in enumerate(s[1:]):
if character >= s[index]:
current.append(character)
else:
matches.append(current)
current = [character]
matches.append(current)
return "".join(max(matches, key=len))
print(longest_ascending(s))

def longest(s):
buff = ''
longest = ''
s += chr(255)
for i in range(len(s)-1):
buff += s[i]
if not s[i] < s[i+1]:
if len(buff) > len(longest):
longest = buff
buff = ''
if len(buff) > len(longest):
longest = buff
return longest

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Removing substring from string in Python - python

def remove_substring(s, substr): while s != "": if substr in s: s = s.replace(substr, "") else: return s if s == "": return "Empty String" The idea here is that we replace all occurrences of substr within s, by replacing the first instance of substr and then looping until we are done.

Related

Need to print the first occurrence of k-size subsequence, but my code prints the last

Scalable solution to check if string a a substring of large number of strings

Python is not recognizing null in a list

How to handle long strings on hackerrank in python

first time programming-exercise

Categories

Resources