I'm trying to write a Python program which would take a string and print the longest substring in it which is also in alphabetical order. For example:
the_string = "abcdefgghhisdghlqjwnmonty"
The longest substring in alphabetical order here would be "abcdefgghhis"
I'm not allowed to define my own functions and can't use lists. So here's what I came up with:
def in_alphabetical_order(string):
    for letter in range(len(string) - 1):
        if string[letter] > string[letter + 1]:
            return False
    return True

s = "somestring"
count = 0
for char in range(len(s)):
    i = 0
    while i <= len(s):
        sub_string = s[char : i]
        if (len(sub_string) > count) and (in_alphabetical_order(sub_string)):
            count = len(sub_string)
            longest_string = sub_string
        i += 1
print("Longest substring in alphabetical order is: " + longest_string)
This obviously contains a function that is not built-in. How can I check whether the elements of the substring candidate are in alphabetical order without defining this function? In other words: how can I fold what this function does into the main code (e.g. by using another for loop somewhere)?
Just going by your code, you can move the body of the function into the loop and use a variable to store what would have been the return value.
I would also recommend following Bill the Lizard's advice on how to approach the problem.
s = "somestring"
count = 0
longest_string = ''
for char in range(len(s)):
    i = 0
    while i <= len(s):
        sub_string = s[char : i]
        in_order = True
        for letter in range(len(sub_string) - 1):
            if sub_string[letter] > sub_string[letter + 1]:
                in_order = False
                break
        if (len(sub_string) > count) and (in_order):
            count = len(sub_string)
            longest_string = sub_string
        i += 1
print("Longest substring in alphabetical order is: " + longest_string)
You don't need to check the whole substring with a function call to see if it is alphabetical. You can just check one character at a time.
Start at the first character. If the next character is the same or later in the alphabet, keep moving along in the string. When you reach a character that's earlier in the alphabet than the previous character, you've found the longest non-decreasing substring starting at the first character. Save it and start over from the second character.
Each time you find the longest substring starting at character N, check to see if it is longer than the previous longest substring. If it is, replace the old one.
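The steps above can be sketched without any helper function or list, assuming "alphabetical order" allows repeated letters (as in "gg" in the example). The names start, end, candidate and longest are illustrative and not taken from the question:

```python
s = "abcdefgghhisdghlqjwnmonty"
longest = ""
for start in range(len(s)):
    end = start + 1
    # Extend the candidate while each character is not earlier than the one before it
    while end < len(s) and s[end] >= s[end - 1]:
        end += 1
    candidate = s[start:end]
    # Strict '>' keeps the first substring when two candidates tie
    if len(candidate) > len(longest):
        longest = candidate
print("Longest substring in alphabetical order is: " + longest)
```

This restarts from every position, exactly as described; a single pass that only restarts where the order breaks would find the same result with less work.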
Here's a solution based off of what you had:
s = 'abcdefgghhisdghlqjwnmonty'
m, n = '', ''  # m: longest run found so far, n: current run
for c in s:
    if n and c < n[-1]:
        n = ''  # c breaks the order, so start a new run
    n += c
    if len(n) > len(m):
        m = n
Output:
m
'abcdefgghhis'
I need help on a problem that I am doing in a course. The exact details are below:
Assume s is a string of lower case characters.
Write a program that prints the longest substring of s in which the letters occur in alphabetical order.
For example, if s = 'azcbobobegghakl', then your program should print
"Longest substring in alphabetical order is: beggh"
In the case of ties, print the first substring. For example, if s = 'abcbcd', then your program should print
"Longest substring in alphabetical order is: abc"
I have written some code that achieves some correct answers but not all and I am unsure why.
This is my code
s = 'vettmlxvn'
alphabet = "abcdefghijklmnopqrstuvwxyz"
substring = ""
highest_len = 0
highest_string = ""
counter = 0
for letter in s:
    counter += 1
    if s.index(letter) == 0:
        substring = substring + letter
        highest_len = len(substring)
        highest_string = substring
    else:
        x = alphabet.index(substring[-1])
        y = alphabet.index(letter)
        if y >= x:
            substring = substring + letter
            if counter == len(s) and len(substring) > highest_len:
                highest_len = len(substring)
                highest_string = substring
        else:
            if len(substring) > highest_len:
                highest_len = len(substring)
                highest_string = substring
                substring = "" + letter
            else:
                substring = "" + letter
print("Longest substring in alphabetical order is: " + highest_string)
When I test for this specific string it gives me "lxv" instead of the correct answer: "ett". I do not know why this is and have even tried drawing a trace table so I can trace variables and I should be getting "ett".
Maybe I have missed something simple but can someone explain why it is not working.
I know there are probably easier ways to do this problem but I am a beginner in python and have been working on this problem for a long time.
Just want to know what is wrong with my code.
Thanks.
I solved your problem with an alternative, simpler approach.
You can compare two characters like numbers: a comes before b in the alphabet, so the expression 'a' < 'b' is True.
def longest_substring(s):
    # Initialize the longest substring
    longest = ""
    # Initialize the current substring
    current = ""
    # Loop through the string
    for i in range(len(s)):
        if i == 0:
            # If it's the first letter, add it to the current substring
            current += s[i]
        else:
            if s[i] >= s[i-1]:
                # If the current letter is greater than or equal to the previous letter,
                # add it to the current substring
                current += s[i]
            else:
                # If the current letter is less than the previous letter,
                # check if the current substring is longer than the longest substring
                # and update the longest substring if it is
                if len(current) > len(longest):
                    longest = current
                current = s[i]
    # Once the loop is done,
    # check again if the current substring is longer than the longest substring
    if len(current) > len(longest):
        longest = current
    return longest

print(longest_substring("azcbobobegghakl"))
print(longest_substring("abcbcd"))
print(longest_substring("abcdefghijklmnopqrstuvwxyz"))
print(longest_substring("vettmlxvn"))
Output:
beggh
abc
abcdefghijklmnopqrstuvwxyz
ett
I haven't figured out what's wrong with your code yet. I will update this once I have figured it out.
Edit
So I commented some lines out and changed one thing.
Here's the working code:
s = 'vettmlxvn'
alphabet = "abcdefghijklmnopqrstuvwxyz"
substring = ""
highest_len = 0
highest_string = ""
counter = 0
for letter in s:
    counter += 1
    if s.index(letter) == 0:
        substring = substring + letter
        # highest_len = len(substring)
        # highest_string = substring
    else:
        x = alphabet.index(substring[-1])
        y = alphabet.index(letter)
        if y >= x:
            substring = substring + letter
            if counter == len(s) and len(substring) > highest_len:
                highest_len = len(substring)
                highest_string = substring
        else:
            if len(substring) > len(highest_string): # changed highest_len to len(highest_string)
                #highest_len = len(substring)
                highest_string = substring
                substring = "" + letter
            else:
                substring = "" + letter
print("Longest substring in alphabetical order is: " + highest_string)
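You are overwriting highest_string at the wrong time. Only overwrite it when a substring ends, and only after checking that it is longer than the longest one found before. The wrong overwrite is triggered because s.index(letter) returns the position of the first occurrence of letter, so s.index(letter) == 0 is also true for the second 'v' in 'vettmlxvn'.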
Outputs:
Longest substring in alphabetical order is: ett
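One way to avoid relying on s.index(letter), which returns the position of the first occurrence of a letter and is therefore 0 again for the second 'v', is to track the position with enumerate. This is a minimal sketch under that assumption rather than a line-by-line repair of the original code; position is an illustrative name:

```python
s = 'vettmlxvn'
substring = ""
highest_string = ""
for position, letter in enumerate(s):
    # position is the real index of this letter, even if the letter occurs earlier too
    if position == 0 or letter >= substring[-1]:
        substring += letter
    else:
        substring = letter  # order broken: start a new run with this letter
    # Strict '>' keeps the first substring when two runs tie
    if len(substring) > len(highest_string):
        highest_string = substring
print("Longest substring in alphabetical order is: " + highest_string)
```

Comparing the letters directly also makes the separate alphabet string unnecessary, since lower-case characters already compare in alphabetical order.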
Alternate Approach
Break input string into substrings
Where each substring is increasing
Find the longest substring
Code
def longest_increasing_substring(s):
    # Break s into substrings which are increasing
    incr_subs = []
    for c in s:
        # Compare the last letter of the current substring with the current letter
        if not incr_subs or incr_subs[-1][-1] > c:
            incr_subs.append('')  # Start a new substring since c is out of order
        incr_subs[-1] += c  # Append to the last substring
    return max(incr_subs, key=len)  # Substring with max length
Test
for s in ['vettmlxvn', 'azcbobobegghakl', 'abcbcd']:
    print(f'{s} -> {longest_increasing_substring(s)}')
Output
vettmlxvn -> ett
azcbobobegghakl -> beggh
abcbcd -> abc
Suppose you have a given string and an integer, n. Every time a character appears in the string more than n times in a row, you want to remove some of the characters so that it only appears n times in a row. For example, for the case n = 2, we would want the string 'aaabccdddd' to become 'aabccdd'. I have written this crude function that runs without errors but doesn't quite get me what I want:
def strcut(string, n):
    for i in range(len(string)):
        for j in range(n):
            if i + j < len(string)-(n-1):
                if string[i] == string[i+j]:
                    beg = string[:i]
                    ends = string[i+1:]
                    string = beg + ends
    print(string)
These are the outputs for strcut('aaabccdddd', n):
n   output    expected
1   'abcdd'   'abcd'
2   'acdd'    'aabccdd'
3   'acddd'   'aaabccddd'
I am new to python but I am pretty sure that my error is in line 3, 4 or 5 of my function. Does anyone have any suggestions or know of any methods that would make this easier?
This may not answer why your code does not work, but here's an alternate solution using regex:
import re
def strcut(string, n):
    return re.sub(fr"(.)\1{{{n-1},}}", r"\1"*n, string)
How it works: first, the formatted pattern is "(.)\1{n-1,}". If n=3, the pattern becomes "(.)\1{2,}".
(.) is a capture group that matches any single character
\1 matches the same text as the first capture group
{2,} matches the previous token 2 or more times
The replacement string is the first capture group repeated n times
For example, take string = "aaaab" and n = 3. The first "a" is the capture group (.). The next three characters, "aaa", match \1{2,} (in this example a{2,}). So the whole pattern matches "a" + "aaa" = "aaaa", which is replaced with "aaa".
regex101 can explain it better than me.
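To see what the f-string actually produces, the sketch below prints the generated pattern for a few values of n next to the result. build_pattern is a hypothetical helper introduced only for this illustration:

```python
import re

def build_pattern(n):
    # Doubled braces are literal braces in an f-string; {n-1} is interpolated
    return fr"(.)\1{{{n-1},}}"

def strcut(string, n):
    # Replace each run of n or more equal characters with exactly n of them
    return re.sub(build_pattern(n), r"\1" * n, string)

for n in (1, 2, 3):
    print(n, build_pattern(n), strcut("aaabccdddd", n))
```

Note that . does not match a newline by default, so runs of newlines would be left untouched by this pattern.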
You can implement this with a stack data structure.
The idea: for each new character, check whether it is the same as the character on top of the stack. If it is, push it and increment the counter only if that keeps the counter within the limit n; otherwise skip it. If the new character differs from the one on top of the stack, push it and reset the counter to 1.
# your code goes here
def func(string, n):
    stack = []
    counter = None
    for i in string:
        if not stack:
            counter = 1
            stack.append(i)
        elif stack[-1]==i:
            if counter+1<=n:
                stack.append(i)
                counter+=1
        elif stack[-1]!=i:
            stack.append(i)
            counter = 1
    return ''.join(stack)

print(func('aaabbcdaaacccdsdsccddssse', 2)=='aabbcdaaccdsdsccddsse')
print(func('aaabccdddd',1 )=='abcd')
print(func('aaabccdddd',2 )=='aabccdd')
print(func('aaabccdddd',3 )=='aaabccddd')
output
True
True
True
True
The method I would use is to create a new empty string at the start of the function and, whenever a character has already appeared n times in a row, simply not add it to the output string. This needs only a single pass over the input string:
def strcut(string, n):
    new_string = ""
    # string[:1] instead of string[0] so that an empty input does not raise IndexError
    first_c, s = string[:1], 0
    for c in string:
        if c != first_c:
            first_c, s = c, 0
        s += 1
        if s > n:
            continue
        else:
            new_string += c
    return new_string
print(strcut("aabcaaabbba", 2))  # output: aabcaabba
Simply, to answer the question
appears in the string more than n times in a row
the following code is small and simple, and will work fine :-)
def strcut(string: str, n: int) -> None:
    tmp = "*" * (n+1)
    for char in string:
        if tmp[len(tmp) - n:] != char * n:
            tmp += char
    print(tmp[n+1:])

strcut("aaabccdddd", 1)
strcut("aaabccdddd", 2)
strcut("aaabccdddd", 3)
Output:
abcd
aabccdd
aaabccddd
Notes:
The character "*" in the line tmp = "*" * (n+1) can be any character that is not in the string; it's just a placeholder to handle the start case when there are no characters yet.
The print(tmp[n+1:]) line simply removes the "*" characters added at the beginning.
You don't need nested loops. Keep track of the current character and its count. Include characters while the count is less than or equal to n, and reset the current character and count when the character changes.
def strcut(s,n):
    result = ''                        # resulting string
    char,count = '',0                  # initial character and count
    for c in s:                        # only loop once on the characters
        if c == char: count += 1       # increase count
        else: char,count = c,1         # reset character/count
        if count<=n: result += c       # include character if count is ok
    return result
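As an alternative to counting by hand, the standard library's itertools.groupby can split the string into runs of equal characters, and each run can then be cut to at most n characters. This swaps in a different technique from the answer above; strcut_groupby is a hypothetical name:

```python
from itertools import groupby

def strcut_groupby(s, n):
    # groupby yields one group per run of consecutive equal characters;
    # each run is joined back into a string and truncated to n characters
    return "".join("".join(run)[:n] for _, run in groupby(s))

print(strcut_groupby("aaabccdddd", 2))
```

The explicit loop is easier to step through when learning, while the groupby version keeps the "one run at a time" idea visible in a single expression.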
Just to give some ideas, this is a different approach. I didn't like how the inner loop over n runs every time: even if I am on i=3 with n=2, I still move on to i=4 although I already checked that character while going through n. And since you are checking the next n characters in the string, your method doesn't fit with keeping the string in order. Here is a rough method that I find easier to read.
def strcut(string, n):
    for i in range(len(string)-1,0,-1): # I go backwards assuming you want to keep the front characters
        # note: count() counts every occurrence in the string, not only consecutive ones
        if string.count(string[i]) > n:
            string = remove(string,i)
    print(string)

def remove(string, i):
    if i > len(string):
        return string[:i]
    return string[:i] + string[i+1:]

strcut('aaabccdddd',2)
I have written a program to find the duplicate character of a string, as below:
string = "India"
print("Duplicate characters in a given string: ")
# Counts each character present in the string
for i in range(0, len(string)):
    count = 1
    for j in range(i+1, len(string)):
        if string[i] == string[j] and string[i] != ' ':
            count = count + 1
            # Set string[j] to 0 to avoid printing visited character
            string = string[:j] + '0' + string[j+1:]
    # A character is considered as duplicate if count is greater than 1
    if count > 1 and string[i] != '0':
        print(string[i])
But that's not all. I want to add one extra occurrence of each duplicate character.
For example:
input: "India" (here the duplicate character is 'i')
output: "IIndiia" (one extra occurrence of the duplicate character)
Can anyone help me solve this?
Collect all counts in one go, and do so case-insensitively:
from collections import Counter
string = "India"
c = Counter(string.lower())
"".join(x * (1 + (c[x.lower()] > 1)) for x in string)
'IIndiia'
One easy way is to use standard python string methods:
line = "some string"
result = ""
for letter in line:
    result = result + letter
    if line.upper().count(letter.upper()) > 1:
        result = result + letter
My code for finding the longest substring in alphabetical order using Python
What do I mean by the longest substring in alphabetical order?
If the input is "asdefvbrrfqrstuvwxffvd", the output will be "qrstuvwx".
# we will treat the strings as arrays, so don't be confused
s='abcbcd'
# append spaces, which will act as our end markers
h=s+' (many spaces) '
# create the outputs
g=''
g2=''
# list of the letters of the alphabet
abc='abcdefghijklmnopqrstuvwxyz'
# x is the position of the character that we examine, and limit is its upper bound
limit=len(s)
# start from 1 because we subtract one in the rest of the code
x=1
while (x<limit):
    # y is the cursor that we move along the abc array
    y=0
    # putting our break condition first
    if ((h[x]==' ') or (h[x-1]==' ')):
        break
    for y in range(0,26):
        # for the second character, x=1
        if ((h[x]==abc[y]) and (h[x-1]==abc[y-1]) and (x==1)):
            g=g+abc[y-1]+abc[y]
            x+=1
        # for the third to the last character, x>1
        if ((h[x]==abc[y]) and (h[x-1]==abc[y-1]) and (x!=1)):
            g=g+abc[y]
            x+=1
        if (h[x]==' '):
            break
print ("Longest substring in alphabetical order is:" +g )
It doesn't end, as if it's in an infinite loop.
What should I do?
I am a beginner, so I would like something with for loops, not functions from libraries.
Thanks in advance
To avoid the infinite loop, add x += 1 at the very end of your while loop. With that change your code terminates, but it still gives wrong results in the general case.
The reason is that you use only one variable, g, to store the result. Use at least two variables so you can compare the previously found substring with the newly found one, or use a list to remember all substrings and then choose the longest.
s = 'abcbcdiawuhdawpdijsamksndaadhlmwmdnaowdihasoooandalw'
longest = ''
current = ''
for idx, item in enumerate(s):
    if idx == 0 or item > s[idx-1]:
        current = current + item
    else:
        current = item  # start the new run with this character instead of dropping it
    if len(current)>len(longest):
        longest = current
print(longest)
Output:
adhlmw
For your understanding: 'b' > 'a' is True, 'a' > 'b' is False, etc.
Edit:
For longest consecutive substring:
s = 'asdefvbrrfqrstuvwxffvd'
abc = 'abcdefghijklmnopqrstuvwxyz'
longest = ''
current = ''
for idx, item in enumerate(s):
    if idx == 0 or abc.index(item) - abc.index(s[idx-1]) == 1:
        current = current + item
    else:
        current = item
    if len(current)>len(longest):
        longest = current
print(longest)
Output:
qrstuvwx
def sub_strings(string):
    substring = ''
    string +='\n'
    i = 0
    string_dict ={}
    while i < len(string)-1:
        substring += string[i]
        if ord(substring[-1])+1 != ord(string[i+1]):
            string_dict[substring] = len(substring)
            substring = ''
        i+=1
    return string_dict

s='abcbcd'
sub_strings(s)
{'abc': 3, 'bcd': 3}
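To find the longest you can do max(sub_strings(s), key=len). Plain max(sub_strings(s)) would compare the keys alphabetically rather than by length.
Here 'abc' and 'bcd' tie, so which one do you want to be taken as the longest? That is something you would need to decide.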
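How ties are resolved depends on how max is called. A small sketch, assuming Python 3.7+ (where dicts keep insertion order) and using the dictionary shown above as literal data; first_longest and last_longest are illustrative names:

```python
runs = {'abc': 3, 'bcd': 3}

# max returns the first maximal item it encounters, so ties go to the earliest run
first_longest = max(runs, key=len)

# Iterating over the keys in reverse order makes ties go to the latest run instead
last_longest = max(reversed(list(runs)), key=len)

print(first_longest, last_longest)
```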
You can iterate through the string and keep comparing to the last character and append to the potentially longest string if the current character is greater than the last character by one ordinal number:
def longest_substring(s):
    last = None
    current = longest = ''
    for c in s:
        if not last or ord(c) - ord(last) == 1:
            current += c
        else:
            if len(current) > len(longest):
                longest = current
            current = c
        last = c
    if len(current) > len(longest):
        longest = current
    return longest
so that:
print(longest_substring('asdefvbrrfqrstuvwxffvd'))
would output:
qrstuvwx
Okay I have questions regarding the following code:
s = "wxyabcd"
myString = s[0]
longest = s[0]
for i in range(1, len(s)):
    if s[i] >= myString[-1]:
        myString += s[i]
        if len(myString) > len(longest):
            longest = myString
    else:
        myString = s[i]
print(longest)
Answer: "abcd"
w
wx
wxy
a
ab
abc
abcd
I am new to Python and I am trying to learn how some of these loops work but I am very confused. This found what the longest string in alphabetical order was... The actual answer was "abcd" but I know that the process it went through was one by one.
Question: Can someone please guide me through the code so I can understand it better? Since there are 7 characters I am assuming it starts by saying: "For each item in range 1-7 if the item is 'more' than myString [-1] which is 'w' then I add the letter plus the item in i which in this case it would be 'x'.
I get lost right after this... So from a - z : a > z? Is that how it is? And how then when s[i] != myString[-1] did it skip to start from 'a' in s[i].
Sorry, I am all over the place. Anyway, I've tried to search online for help learning this, but some things are just hard. I know that in a few months I'll get the hang of it and hopefully be more fluent.
Thank you!
Here's a bit of an explanation of the control flow and what's going on with Python's indexing, hope it helps:
s = "wxyabcd"
myString = s[0] # 'w'
longest = s[0] # 'w' again, for collecting other chars
for i in range(1, len(s)): # from 1 to 7, exclusive of 7, so 2nd character to last
    if s[i] >= myString[-1]: # compare the chars, e.g. z > a, so x and y => True
        myString += s[i] # concatenate on to 'w'
        if len(myString) > len(longest): # is the current run longer than the best so far?
            longest = myString # reassign longest to myString
    else:
        myString = s[i] # reassign myString to where you are in s.
print(longest)
# s is a 7 character string
s = "wxyabcd"
# set `myString` to be the first character of s, 'w'
myString = s[0]
# set `longest` to be the first character of s, 'w'
longest = s[0]
# loop from 1 up to and not including length of s (7)
# Which equals for i in (1,2,3,4,5,6):
for i in range(1, len(s)):
    # Compare the character at i with the last character of `myString`
    if s[i] >= myString[-1]:
        # If it is greater or equal (in alphabetical sense)
        # append the character at i to `myString`
        myString += s[i]
        # If this makes `myString` longer than the previous `longest`,
        # set `myString` to be the new `longest`
        if len(myString) > len(longest):
            longest = myString
    # Otherwise set `myString` to be a single character string
    # and start checking from index i
    else:
        myString = s[i]
# `longest` will be the longest `myString` that was formed,
# using only characters in ascending (non-decreasing) alphabetic order
print(longest)
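To watch the loop work one step at a time, the sketch below records myString after every iteration. The trace list is an addition for illustration only, and print is written as a function so the snippet runs on Python 3:

```python
s = "wxyabcd"
myString = s[0]
longest = s[0]
trace = [myString]  # value of myString before the loop and after each iteration
for i in range(1, len(s)):
    if s[i] >= myString[-1]:
        myString += s[i]      # still in order: extend the current run
        if len(myString) > len(longest):
            longest = myString
    else:
        myString = s[i]       # 'a' < 'y', so the run restarts at 'a'
    trace.append(myString)
print(trace)
print(longest)
```

The restart happens because 'a' >= 'y' is False, not because s[i] differs from myString[-1]; characters compare by their position in the alphabet.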
Two approaches I can think of (quickly)
def approach_one(text): # I approve of this method!
    all_substrings = list()
    this_substring = ""
    for letter in text:
        if len(this_substring) == 0 or letter > this_substring[-1]:
            this_substring+=letter
        else:
            all_substrings.append(this_substring)
            this_substring = letter
    all_substrings.append(this_substring)
    return max(all_substrings,key=len)

def approach_two(text):
    #forthcoming
    pass