finding the minimum window substring - python

The problem says to create a string, take 3 non-consecutive characters from it, put them into a substring, and print the positions in the string of the substring's first and last characters.
str="subliminal"
sub="bmn"
n = len(str)-3
for i in range(0, n):
    print(str1[i:i+4])
    if sub1 in str1:
        print(sub1[i])
This should print 3 to 8, because b is the 3rd letter and n is the 8th letter.
I also don't know how to make the code work for substrings that aren't 3 characters long without rewriting it entirely.

Not sure if this is what you meant. I assume the substring is already valid, meaning its letters appear in the string in order and are not consecutive. I take the first and last letters of the substring and build a list of all the letters in the string with a list comprehension. Then I loop through the letters and record where the first and last letters occur. If anything is missing, let me know.
sub = "bmn"
string = "subliminal"
first_letter = sub[0]
last_letter = sub[-1]
start = None
end = None
letters = [let for let in string]
for i, letter in enumerate(letters):
    if letter == first_letter:
        start = i
    if letter == last_letter:
        end = i
# Compare against None so that a match at index 0 is not treated as "not found".
if start is not None and end is not None:
    print(f"From {start + 1} to {end + 1}.")  # Output: From 3 to 8.

Some recursion for good health:
def minimum_window_substring(strn, sub, beg=0, fin=0, firstFound=False):
    if len(sub) == 0 or len(strn) == 0:
        return f'From {beg + 1} to {fin}'
    elif strn[0] == sub[0]:
        return minimum_window_substring(strn[1:], sub[1:], beg, fin + 1, True)
    if not firstFound:
        beg += 1
    return minimum_window_substring(strn[1:], sub, beg, fin + 1, firstFound)
Explanation:
The base case is reached when either the remaining string or the remaining substring is empty; we then stop and return the beginning and the end of the window in the original string.
If the first letter of the remaining string equals the first letter of the remaining substring, we fix the beginning (beg stops advancing once the firstFound flag is set) and keep incrementing fin until we finish (the substring or the original string is empty).
Something to think about / more explanation:
The function always reports the first occurrence. For example, if the original string is "sububusubulum" and the substring is "sbl", the window starts at the first "s": any later "sbl" inside the original string still supplies the remaining letters, so they are matched to that first "s". In other words, if the substring occurs twice, the first occurrence is picked no matter what.
Note: This function does not check whether the substring's letters are consecutive, and it does not check whether the characters occur in the string at all, because you said the characters are taken from the original string. On the plus side, the substring can be longer or shorter than 3 characters.
By "original string" I mean "subliminal" (or any other input).
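If the shortest window is wanted rather than the first occurrence, here is a minimal sketch, not a definitive implementation. The name shortest_window and the choice to return None when there is no match are my own assumptions; positions are 1-based, as in the question.

```python
def shortest_window(text, sub):
    """Return 1-based (first, last) positions of the shortest window of
    text that contains sub as a subsequence, or None if there is none."""
    if not sub:
        return None
    best = None
    for start, ch in enumerate(text):
        if ch != sub[0]:
            continue
        # Greedily match the remaining letters as early as possible.
        pos = start
        matched = 1
        while matched < len(sub):
            pos = text.find(sub[matched], pos + 1)
            if pos == -1:
                break
            matched += 1
        else:
            # Every letter was matched; keep the window if it is shorter.
            if best is None or pos - start < best[1] - best[0]:
                best = (start + 1, pos + 1)
    return best


print(shortest_window("subliminal", "bmn"))
print(shortest_window("sububusubulum", "sbl"))
```

For "sububusubulum" and "sbl" this picks the window starting at the second "s", where the recursive version above reports the one starting at the first "s".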

There are many different ways you could do it; here is one solution:
import re
def Func(String, SubString):
    # Each character except the last must be followed by one or more letters.
    patt = "".join([char + "[A-Za-z]+" for char in SubString[:-1]] + [SubString[-1]])
    MatchedString = re.findall(patt, String)[0]
    FirstIndex = String.find(MatchedString) + 1
    LastIndex = FirstIndex + len(MatchedString) - 1
    return FirstIndex, LastIndex
string = "subliminal"
sub = "bmn"
FirstIndex, LastIndex = Func(string, sub)
This returns 3, 8. It works for substrings of any length, and it reports only the first match.
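A variation on the regex idea, sketched under a few assumptions: the helper name first_window is hypothetical, sub is non-empty, and letters of sub are allowed to sit next to each other in the string (the pattern above requires at least one letter between them). It uses re.escape so that special characters in sub are matched literally, and a lazy .*? between the letters.

```python
import re


def first_window(text, sub):
    # Join the escaped letters with a lazy "anything" so the match ends
    # as early as possible after it starts.
    pattern = ".*?".join(re.escape(ch) for ch in sub)
    match = re.search(pattern, text)
    if match is None:
        return None
    # match.start() is 0-based; match.end() is one past the last index,
    # which equals the 1-based position of the last character.
    return match.start() + 1, match.end()


print(first_window("subliminal", "bmn"))
```

Reading the positions from the match object avoids the separate String.find call.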

Related

String incrementation

I've just started to learn Python and I'm doing some exercises on Codewars. The instructions are simple: if the string already ends with a number, the number should be incremented by 1.
If the string does not end with a number, the number 1 should be appended to the new string.
I wrote this:
if strng[-1].isdigit():
    return strng.replace(strng[-1],str(int(strng[-1])+1))
else:
    return strng + "1"
return(strng)
It works sometimes (for example 'foobar001' -> 'foobar002', 'foobar' -> 'foobar1'). But in other cases it adds 1 to each digit at the end (for example 'foobar11' -> 'foobar22'). I would like the code to add 1 only to the ending number taken as a whole, so that 'foobar99' becomes 'foobar100'. I would be grateful for advice for a beginner :)!
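A quick way to see why 'foobar11' turns into 'foobar22': str.replace swaps every occurrence of the matched character, not just the last one. The snippet below is only an illustration (the variable names are made up for it), and slicing off the last character is not a full fix either, since 'foobar99' still needs the whole number handled as a unit.

```python
s = "foobar11"
last = s[-1]

# replace() touches both '1' characters, not only the final one.
replaced_everywhere = s.replace(last, str(int(last) + 1))

# Slicing changes only the final character, but still treats it as a single digit.
replaced_last_only = s[:-1] + str(int(last) + 1)

print(replaced_everywhere)
print(replaced_last_only)
```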
First, you have to make some assumptions.
Assume that the numerical value is always at the end of the string, and that the first non-numeric character from the right marks the end of the non-number part, i.e.
>>> input = "foobar123456"
>>> output = 123456 + 1
Second, we need to decide what to do when no number exists at the end of the string.
If we encounter a string without a number, either the Python code throws an error and does not try to add 1:
>>> input = "foobar"
Or we automatically generate a 0 digit, which requires something like
input = input if input[-1].isdigit() else input + "0"
Let's assume the latter for simplicity of the explanation.
Next, we read the digits from the right until we reach a non-digit.
Let's use reversed() to flip the string and then a for-loop to read the characters until we reach a non-digit, i.e.
>>> s = "foobar123456"
>>> output = 123456
>>> for character in reversed(s):
...     if not character.isdigit():
...         break
...     else:
...         print(character)
...
6
5
4
3
2
1
Now, let's use a list to keep the digit characters
>>> digits_in_reverse = []
>>> for character in reversed(s):
...     if not character.isdigit():
...         break
...     else:
...         digits_in_reverse.append(character)
...
>>> digits_in_reverse
['6', '5', '4', '3', '2', '1']
Then we reverse it:
>>> ''.join(reversed(digits_in_reverse))
'123456'
And convert it into an integer:
>>> int(''.join(reversed(digits_in_reverse)))
123456
Now the +1 increment would be easy!
How do we find the string preceding the number?
# The input string.
s = "foobar123456"
s = s if s[-1].isdigit() else s + "0"
# Keep a list of the digits in reverse.
digits_in_reverse = []
# Iterate through each character from the right.
for character in reversed(s):
    # If we meet a character that is not a digit, stop.
    if not character.isdigit():
        break
    # Otherwise, keep collecting the digits.
    else:
        digits_in_reverse.append(character)
# Reverse the reversed digits, then convert them into an integer.
number_str = "".join(reversed(digits_in_reverse))
number = int(number_str)
print(number)
# End of the string preceding the number.
end = s.rindex(number_str)
print(s[:end])
# Increment +1
print(s[:end] + str(number + 1))
[output]:
123456
foobar
foobar123457
Bonus: Can you do it with a one-liner?
Not exactly one line, but close:
import itertools
s = "foobar123456"
s = s if s[-1].isdigit() else s + "0"
number_str = "".join(itertools.takewhile(lambda ch: ch.isdigit(), reversed(s)))[::-1]
end = s.rindex(number_str)
print(s[:end] + str(int(number_str) + 1))
Bonus: But how about regex?
Yeah, with regex it's pretty magical. You still make the same assumptions we started with, and to keep the regex as simple as possible you add one more: the characters preceding the number can only be a-z or A-Z.
Then you can do this:
import re
s = "foobar123456"
s = s if s[-1].isdigit() else s + "0"
alpha, numeric = re.match(r"([a-zA-Z]+)(\d+)", s).groups()
print(alpha + str(int(numeric) + 1))
But you have to understand the regex, which might be a steep learning curve; see https://regex101.com/r/9iiaCW/1
One simple solution would be:
Have two empty variables head (=non-numeric prefix) and tail (numeric suffix). Iterate the string normally, from left to right. If the current character is a digit, add it to tail. Otherwise, join head and tail, add the current char to head and empty tail. Once complete, increment tail and return head + tail:
def foo(s):
    head = tail = ''
    for char in s:
        if char.isdigit():
            tail += char
        else:
            head += tail + char
            tail = ''
    tail = int(tail or '0')
    return head + str(tail + 1)
Leading zeroes (x001 -> x002), if needed, left as an exercise ;)
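One possible take on the leading-zero exercise, as a sketch rather than the intended solution. The name increment_keep_zeros is made up here, and it assumes only the ASCII digits 0-9 count as part of the number. It splits off the numeric suffix with rstrip and pads the incremented value back to the suffix's width with zfill.

```python
def increment_keep_zeros(s):
    # Everything left after stripping trailing digits is the head.
    head = s.rstrip("0123456789")
    tail = s[len(head):]
    if not tail:
        return head + "1"
    # zfill pads with zeros up to the original width; a longer result
    # (e.g. 99 -> 100) is left as it is.
    return head + str(int(tail) + 1).zfill(len(tail))


print(increment_keep_zeros("x001"))
print(increment_keep_zeros("foobar99"))
```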
First check whether the string is alphanumeric. If it is, check whether the last character is a digit.
If both conditions hold, find the index of the first digit of the number that ends the string.
Once you have the index, separate the character part from the numeric part.
Then convert the numeric part to an integer and add 1, and join the character part and the numeric part. That is your answer.
string = 'randomstring2345'
index = len(string)  # start of the numeric suffix; len(string) means "no suffix"
if string.isalnum() and string[-1].isdigit():
    index = len(string) - 1
    # Walk left while the characters are digits.
    while index >= 0 and string[index].isdigit():
        index -= 1
    index += 1  # step back onto the first digit of the suffix
char_part = string[:index]
int_part = string[index:]
integer = 0
if int_part:
    integer = int(int_part)
modified_int = integer + 1
new_string = ''.join([char_part, str(modified_int)])
print(new_string)
output
randomstring2346
Regex can be a useful tool in Python. Here I make two groups: the first, (.*?), matches as few of any characters as possible, while the second, (\d*$), matches as many digits at the end of the string as possible. For a more in-depth explanation see regexr.
import re
def increment(s):
    word, digits = re.match(r'(.*?)(\d*$)', s).groups()
    digits = str(int(digits) + 1).zfill(len(digits)) if digits else '1'
    return word + digits
print(increment('foobar001'))
print(increment('foobar009'))
print(increment('foobar19'))
print(increment('foobar20'))
print(increment('foobar99'))
print(increment('foobar'))
print(increment('1a2c1'))
print(increment(''))
print(increment('01'))
Output:
foobar002
foobar010
foobar20
foobar21
foobar100
foobar1
1a2c2
1
02
Source
def solve(data):
    result = None
    if len(data) == 0 or not data[-1].isdigit():
        result = data + str(1) #appending 1
    else:
        lin = 0
        for index, ch in enumerate(data[::-1]):
            if ch.isdigit():
                lin = len(data) - index -1
            else:
                break
        result = data[0 : lin] + str(int(data[lin:]) + 1) # incrementing result
    return result
print(solve("Hey123"))
print(solve("aaabbbzzz"))
output :
Hey124
aaabbbzzz1

How to find the amount of equal characters that are next to eachother in a string?

I just started using Python and I'm a beginner.
This is an example of the string I have to work with: "--+-+++----------------+-+"
The program needs to find the longest "+" chain, that is, how many times + appears consecutively. I don't really know how to explain this, but it should find the chain of 3 + symbols so I can print that the longest + chain contains 3 + symbols.
a = "--+-+++----------------+-+"
count = 0
most = 0
for x in range(len(a)):
    if a[x] == "+":
        count+=1
    else:
        count = 0
    if count > most:
        most = count
print(f"longest + chain includes {most} symbols")
There might be a better way, but this one is fairly self-explanatory.
Try this. It uses regular expressions and a list comprehension, so you may need to read about them.
But the idea is to find all the + chains, calculate their lengths and get the maximum length
import re
s = '+++----------------+-+'
occurs = re.findall(r'\++', s)
print(max([len(i) for i in occurs]))
Output:
3
You can use a regular expression to specify "one or more + characters". The character for specifying this kind of repetition in a regex is itself +, so to specify the actual + character you have to escape it.
import re
haystack = "--+-+++----------------+-+"
needle = re.compile(r"\++")
Now we can use findall to find all the occurrences of this pattern in the original string, and max to find the longest of these.
longest = max(len(x) for x in needle.findall(haystack))
If you instead need the position of the longest sequence in the target string, you can use:
pos = haystack.index(max(needle.findall(haystack), key=len))
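As a sketch of getting the position and the length in one go, re.finditer yields match objects that already carry their own start and end. The function name longest_plus_run and the None return for "no + at all" are assumptions made for this example.

```python
import re


def longest_plus_run(text):
    # Pick the match with the greatest length; default=None covers the
    # case where the text contains no "+" at all.
    best = max(
        re.finditer(r"\++", text),
        key=lambda m: m.end() - m.start(),
        default=None,
    )
    if best is None:
        return None
    return best.start(), best.end() - best.start()


print(longest_plus_run("--+-+++----------------+-+"))
```

The result is a (0-based start index, length) pair; on ties, max keeps the earliest run.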
A simple solution is to iterate over the string one character at a time. When the character is the same as the previous one, add one to a counter; each time the character differs from the previous one, restart the count at 1.
s = "--+-+++----------------+-+"
p = s[0]
longest, count = 0, 0
for c in s:
    if c == p:
        count = count + 1
    else:
        count = 1
    if count > longest:
        longest = count
    p = c
s is the string, c is the character being checked, p is the previous character, count is the counter, and longest is the highest value found. Note that this measures the longest run of any character; to count only + runs, also require c == "+" before updating longest.
If the only other character in your string is a minus sign, you can split the string on the minus sign and get maximum length of the resulting substrings:
a = "--+-+++----------------+-+"
r = max(map(len,a.split('-')))
print(r) # 3
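If the string may contain characters other than + and -, itertools.groupby gives a similar one-pass result without relying on split. This is just a sketch; the function name longest_run and its target parameter are made up for the example.

```python
from itertools import groupby


def longest_run(text, target="+"):
    # groupby yields one group per run of equal neighbouring characters;
    # count the items in each group whose character is the target.
    return max(
        (sum(1 for _ in group) for ch, group in groupby(text) if ch == target),
        default=0,
    )


print(longest_run("--+-+++----------------+-+"))
```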

Remove string character after run of n characters in string

Suppose you have a given string and an integer, n. Every time a character appears in the string more than n times in a row, you want to remove some of the characters so that it only appears n times in a row. For example, for the case n = 2, we would want the string 'aaabccdddd' to become 'aabccdd'. I have written this crude function that compiles without errors but doesn't quite get me what I want:
def strcut(string, n):
    for i in range(len(string)):
        for j in range(n):
            if i + j < len(string)-(n-1):
                if string[i] == string[i+j]:
                    beg = string[:i]
                    ends = string[i+1:]
                    string = beg + ends
    print(string)
These are the outputs for strcut('aaabccdddd', n):
n    output     expected
1    'abcdd'    'abcd'
2    'acdd'     'aabccdd'
3    'acddd'    'aaabccddd'
I am new to python but I am pretty sure that my error is in line 3, 4 or 5 of my function. Does anyone have any suggestions or know of any methods that would make this easier?
This may not answer why your code does not work, but here's an alternate solution using regex:
import re
def strcut(string, n):
    return re.sub(fr"(.)\1{{{n-1},}}", r"\1"*n, string)
How it works: First, the pattern formatted is "(.)\1{n-1,}". If n=3 then the pattern becomes "(.)\1{2,}"
(.) is a capture group that matches any single character
\1 matches the first capture group
{2,} matches the previous token 2 or more times
The replacement string is the first capture group repeated n times
For example: str = "aaaab" and n = 3. The first "a" is the capture group (.). The next 3 "aaa" matches \1{2,} - in this example a{2,}. So the whole thing matches "a" + "aaa" = "aaaa". That is replaced with "aaa".
regex101 can explain it better than me.
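To see the doubled braces at work, here is a small sketch that builds the pattern on its own and applies it to the "aaaab" example. The helper name run_limit_pattern is hypothetical; it only wraps the same f-string used above.

```python
import re


def run_limit_pattern(n):
    # In an f-string, "{{" and "}}" produce literal braces, so for n = 3
    # this evaluates to the text (.)\1{2,}
    return fr"(.)\1{{{n - 1},}}"


pattern = run_limit_pattern(3)
print(pattern)
print(re.sub(pattern, r"\1" * 3, "aaaab"))
```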
You can implement this with a stack.
The idea: for each new character, check whether it is the same as the character on top of the stack. If it is, push it and increase the counter only while the counter is still below the limit. If the new character differs from the top of the stack, push it and reset the counter to 1.
def func(string, n):
    stack = []
    counter = None
    for i in string:
        if not stack:
            counter = 1
            stack.append(i)
        elif stack[-1]==i:
            if counter+1<=n:
                stack.append(i)
                counter+=1
        elif stack[-1]!=i:
            stack.append(i)
            counter = 1
    return ''.join(stack)
print(func('aaabbcdaaacccdsdsccddssse', 2)=='aabbcdaaccdsdsccddsse')
print(func('aaabccdddd',1 )=='abcd')
print(func('aaabccdddd',2 )=='aabccdd')
print(func('aaabccdddd',3 )=='aaabccddd')
output
True
True
True
True
The method I would use is to create a new empty string at the start of the function and, whenever a character's run exceeds n in the input string, simply not append it to the output string. This is efficient because it makes a single pass over the input:
def strcut(string,n) :
    if not string :  # nothing to do for an empty string
        return ""
    new_string = ""
    first_c, s = string[0], 0
    for c in string :
        if c != first_c :
            first_c, s= c, 0
        s += 1
        if s > n : continue
        else : new_string += c
    return new_string
print(strcut("aabcaaabbba",2)) # output : #aabcaabba
Simply, to answer the question
appears in the string more than n times in a row
the following code is small and simple, and works fine :-)
def strcut(string: str, n: int) -> None:
    tmp = "*" * (n+1)
    for char in string:
        if tmp[len(tmp) - n:] != char * n:
            tmp += char
    print(tmp[n+1:])
strcut("aaabccdddd", 1)
strcut("aaabccdddd", 2)
strcut("aaabccdddd", 3)
Output:
abcd
aabccdd
aaabccddd
Notes:
The character "*" in the line tmp = "*" * (n+1) can be any character that is not in the string; it's just a placeholder to handle the start case, when there are no characters yet.
The print(tmp[n+1:]) line simply removes the "*" characters added at the beginning.
You don't need nested loops. Keep track of the current character and its count. include characters when the count is less or equal to n, reset the current character and count when it changes.
def strcut(s,n):
    result = ''                         # resulting string
    char,count = '',0                   # initial character and count
    for c in s:                         # only loop once on the characters
        if c == char: count += 1        # increase count
        else: char,count = c,1          # reset character/count
        if count<=n: result += c        # include character if count is ok
    return result
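The same "track the current run" idea can also be sketched with itertools, assuming the standard library is acceptable here. The name strcut_groupby is made up to keep it apart from the versions above: groupby splits the string into runs and islice keeps at most n characters of each.

```python
from itertools import groupby, islice


def strcut_groupby(s, n):
    # One group per run of equal characters; keep the first n of each run.
    return "".join("".join(islice(group, n)) for _, group in groupby(s))


print(strcut_groupby("aaabccdddd", 2))
```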
Just to give some ideas, here is a different approach. I didn't like that the inner loop over n re-checks characters: even after looking ahead from i, the outer loop still moves to the next i and checks those characters again. And because you compare against the next n characters while deleting from the string, the indices stop lining up with the original order. Here is a rough method that I find easier to read. Note that it uses string.count, so it caps the total number of occurrences of each character at n rather than the length of each run; the two agree only when every character appears in a single run, as in 'aaabccdddd'.
def strcut(string, n):
    for i in range(len(string)-1,0,-1): # I go backwards assuming you want to keep the front characters
        if string.count(string[i]) > n:
            string = remove(string,i)
    print(string)
def remove(string, i):
    if i > len(string):
        return string[:i]
    return string[:i] + string[i+1:]
strcut('aaabccdddd',2)

How to define a function that counts the lower cases letters until reaching a value?

Define a function countLowerFromUntil(...) which receives one string (st) and an integer value (start) as input. The string may also be the empty string. This function should return how many lower case letters there are in the input string st, starting to count at (and including) the start position and advancing one position at a time until reaching the end of the string or until reaching a digit (if there is one). The string may contain letters or digits. If the start value is out of the string range the function should return 0. Note: Keep in mind the string method islower(), which returns True if applied to a character or a string containing only lower case letters.
For example countLowerFromUntil("ABCxAxx1aa", 0) should return 3, because there are three lower case letters (3 "x"'s) before reaching the digit 1
As an example, the following code fragment:
val = countLowerFromUntil("ABCxAxx1aa",0)
print (val)
should produce the output:
3
So far I have this, but I get an error:
def countLowerFromUntil(st,ch):
    s = st().strip()
    count = 1
    for i in s:
        if i.islower():
            count = count + 1
    return count
You need to take into account start. Start iterating from the start index. You can slice what you need. You don't need to strip (stripping spaces might invalidate start)
The algorithm goes like this:
counts begin from 0
if you encounter a digit, stop counting!
if you encounter a lower-case character, increment your counter.
return count at the end of the function
def f(string, start):
    count = 0
    for c in string[start:]:
        if c.isdigit():
            break
        elif c.islower():
            count += 1
    return count
>>> f("ABCxAxx1aa", 0)
3
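For completeness, here is a sketch under the assignment's own name that also spells out the "start out of range returns 0" rule. Treating a negative start as out of range is my assumption; the task statement does not say, and plain slicing would count from the end of the string instead.

```python
def countLowerFromUntil(st, start):
    # Assumption: negative positions count as out of range, like positions
    # at or past the end of the string.
    if start < 0 or start >= len(st):
        return 0
    count = 0
    for ch in st[start:]:
        if ch.isdigit():
            break  # stop at the first digit
        if ch.islower():
            count += 1
    return count


print(countLowerFromUntil("ABCxAxx1aa", 0))
```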

alphabetical order in Python

Okay I have questions regarding the following code:
s = "wxyabcd"
myString = s[0]
longest = s[0]
for i in range(1, len(s)):
    if s[i] >= myString[-1]:
        myString += s[i]
        if len(myString) > len(longest):
            longest = myString
    else:
        myString = s[i]
print longest
Answer: "abcd"
w
wx
wxy
a
ab
abc
abcd
I am new to Python and I am trying to learn how some of these loops work, but I am very confused. This code finds the longest substring in alphabetical order. The answer is "abcd", and I know the process got there one character at a time.
Question: Can someone please guide me through the code so I can understand it better? Since there are 7 characters, I assume it starts by saying: "for each i in range 1-7, if s[i] is 'more' than myString[-1], which is 'w', then append s[i], which in this case is 'x'."
I get lost right after this. So from a to z, is a > z? Is that how it works? And when s[i] != myString[-1], how did it skip to start again from 'a' in s[i]?
Sorry, I am all over the place. I've tried searching online to learn this, but some things are just hard. I know that in a few months I'll get the hang of it and hopefully be more fluent.
Thank you!
Here's a bit of an explanation of the control flow and what's going on with Python's indexing, hope it helps:
s = "wxyabcd"
myString = s[0] # 'w'
longest = s[0] # 'w' again; holds the longest run found so far
for i in range(1, len(s)): # from 1 to 7, exclusive of 7, so 2nd index to last
    if s[i] >= myString[-1]: # compare the chars, e.g. z > a, so x and y => True
        myString += s[i] # concatenate on to 'w'
        if len(myString) > len(longest): # evident?
            longest = myString # reassign longest to myString
    else:
        myString = s[i] # reassign myString to where you are in s.
print longest
# s is a 7 character string
s = "wxyabcd"
# set `myString` to be the first character of s, 'w'
myString = s[0]
# set `longest` to be the first character of s, 'w'
longest = s[0]
# loop from 1 up to and not including length of s (7)
# Which equals for i in (1,2,3,4,5,6):
for i in range(1, len(s)):
    # Compare the character at i with the last character of `myString`
    if s[i] >= myString[-1]:
        # If it is greater or equal (in alphabetical sense)
        # append the character at i to `myString`
        myString += s[i]
        # If this makes `myString` longer than the previous `longest`,
        # set `myString` to be the new `longest`
        if len(myString) > len(longest):
            longest = myString
    # Otherwise set `myString` to be a single character string
    # and start checking from index i
    else:
        myString = s[i]
# `longest` will be the longest `myString` that was formed,
# using only characters in ascending alphabetic order
print longest
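For anyone following along in Python 3, the same loop can be sketched as a function. The name longest_ordered_run is made up, and print is a function here, unlike the Python 2 print statement above. On the "a > z?" question: single characters compare by their code points, so "a" < "z" is True and "x" >= "w" is True.

```python
def longest_ordered_run(s):
    if not s:
        return ""
    current = longest = s[0]
    for ch in s[1:]:
        if ch >= current[-1]:
            current += ch      # still in non-decreasing order: extend the run
        else:
            current = ch       # order broken: start a new run at this character
        if len(current) > len(longest):
            longest = current
    return longest


print("a" < "z", "x" >= "w", "a" >= "y")
print(longest_ordered_run("wxyabcd"))
```

The first print shows why the run "wxy" breaks at "a": "a" >= "y" is False, so the loop falls into the else branch and restarts from "a".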
Two approaches I can think of (quickly)
def approach_one(text): # I approve of this method!
    all_substrings = list()
    this_substring = ""
    for letter in text:
        if len(this_substring) == 0 or letter > this_substring[-1]:
            this_substring+=letter
        else:
            all_substrings.append(this_substring)
            this_substring = letter
    all_substrings.append(this_substring)
    return max(all_substrings,key=len)
def approach_two(text):
    # forthcoming
    pass
