Python, replacing characters in a string while preserving original string

More specifically:
Given a string and a non-empty word string, return a version of the original String where all chars have been replaced by pluses ("+"), except for appearances of the word string which are preserved unchanged.
def plusOut(base, word):
plusOut("12xy34", "xy") → "++xy++"
plusOut("12xy34", "1") → "1+++++"
plusOut("12xy34xyabcxy", "xy") → "++xy++xy+++xy"
My original thought was this:
def main():
    x = base.split(word)
    y = ''.join(x)
    print(y.replace(y,'+')*len(y))
From here I have trouble reinserting the word back into the str in the correct places. Any help is appreciated.

You can use any string to join (instead of the empty string '' like you have).
def plusOut(s, word):
    x = s.split(word)
    y = ['+' * len(z) for z in x]
    final = word.join(y)
    return final
Edit: I've removed the regex, but I'm keeping the function across multiple lines to more closely match your original code.
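For comparison, a regex-based variant could look like the sketch below. This is not the code that was removed from this answer; the name plus_out_regex is made up for illustration. The pattern tries to match the word first and falls back to any single character, and the replacement callback keeps the word and turns everything else into "+".

```python
import re

def plus_out_regex(s, word):
    # Hypothetical helper: match `word` (captured) or, failing that, any one character.
    pattern = re.compile(f"({re.escape(word)})|.", re.DOTALL)
    # group(1) is set only when the word itself matched; otherwise emit "+".
    return pattern.sub(lambda m: m.group(1) or "+", s)

print(plus_out_regex("12xy34", "xy"))
print(plus_out_regex("12xy34", "1"))
print(plus_out_regex("12xy34xyabcxy", "xy"))
```

re.escape is there so that a word containing regex metacharacters (for example ".") is still matched literally.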

A regex is not required. We can solve this without any libraries, iterating through exactly once.
We want to iterate through the indices i of the string, yielding the word and jumping ahead by len(word) if the slice of len(word) starting at i matches the word, and by yielding '+' and incrementing by one otherwise.
def replace_chars_except_word(string, word):
    def generate_chars():
        i = 0
        while i < len(string):
            if string[i:i + len(word)] == word:
                i += len(word)
                yield word
            else:
                yield '+'
                i += 1
    return ''.join(generate_chars())
if __name__ == '__main__':
    test_string = 'stringabcdefg string11010string1'
    result = replace_chars_except_word(test_string, word='string')
    assert result == 'string++++++++string+++++string+'
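I use an inner generator function to yield the pieces, but you could collect them in a list buffer instead of using the inner function (this is slightly less memory efficient):
def replace_chars_except_word(string, word):
    buffer = []
    i = 0
    while i < len(string):
        if string[i:i + len(word)] == word:
            buffer.append(word)
            i += len(word)
        else:
            buffer.append('+')
            i += 1
    return ''.join(buffer)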

Related

String exercise in Python which detects certain letters

I am trying to create a function in Python that tells me whether a string contains a letter "y" at the beginning of a word and immediately before a consonant. For example, the sentence "The word yes is correct but the word yntelligent is incorrect" contains the "y" of the word "yntelligent", so the function has to return True. It also has to return True if the "y" is a capital letter and meets the same conditions.
I have done it in the following way and the program appears to work, but I was asked to use the string method find and I haven't been able to include it. Any hint about how to do it using find? Thank you very much.
def function(string):
    resultado=False
    consonants1="bcdfghjklmnñpqrstvwxyz"
    consonants2="BCDFGHJKLMNÑPQRSTVWXYZ"
    for i in range(0,len(string)):
        if string[i]=="y" and string[i-1]==" " and string[i+1] in consonants1:
            resultado=True
            break
        if string[i]=="Y" and string[i-1]==" " and string[i+1] in consonants2:
            resultado=True
            break
    return resultado
print(function("The word yes is correct but the word yntelligent is incorrect"))

Basically it is better to use re, but it can be done with find:
consonants1="BCDFGHJKLMNÑPQRSTVWXYZ"
for i in consonants1:
    if (a:= string.upper().find(f' Y{i}')) != -1:
        print(...)
        break

I think the function you want isn't find, but finditer from the package 're' (find will only give you the first instance of y, while finditer will return all instances of y)
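import re
import string
# all lowercase ASCII letters except the vowels
vowels = ['a', 'e', 'i', 'o', 'u']
consonants = [c for c in string.ascii_lowercase if c not in vowels]
def func(text):
    # lowercase once so that "Y" is handled too
    text = text.lower()
    for x in re.finditer('y', text):
        # guard against a "y" at the very end of the text
        if x.start() + 1 < len(text) and text[x.start() + 1] in consonants:
            return True
    return False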

The function find returns the index at which the string first begins or is found. So, it returns the first index, else -1. This won't work for your use cases, unless you make it a bit more complicated.
Method One: Check every combination with find.
You need two results: one for a match at the very start of the string, and one for a match at the start of any later word. Return True if either one hits; otherwise return False.
def function(string):
    consonants1="bcdfghjklmnñpqrstvwxyz"
    string = string.lower()
    for c in consonants1:
        result1 = string.find(" y" + c)  # "y" + consonant at the start of a later word
        result2 = string.find("y" + c)   # index 0 means the string itself starts with it
        if result1 != -1 or result2 == 0:
            return True
    return False
Method Two: loop through find results.
You can use .find, but it will be counter-intuitive. Call .find in a loop on the part of the string that comes after the previous "y/Y", and do a check each time you find one. I would also convert the string with .lower() so that you don't have to worry about case sensitivity.
def function(string):
    consonants1="bcdfghjklmnñpqrstvwxyz"
    string = string.lower()
    start_index = 0
    while start_index < len(string):
        temp_string = string[start_index:] ## everything after the previous y
        found_index = temp_string.find("y")
        if found_index == -1: return False
        og_index = start_index + found_index ## position of this y in the full string
        ## check to see if there's a " y" + consonant combo or it's the first word without a space
        starts_word = og_index == 0 or string[og_index - 1] == " "
        if starts_word and og_index + 1 < len(string) and string[og_index + 1] in consonants1:
            return True
        else:
            start_index = og_index + 1 ## add the 1 so that you don't include the past y
    return False

Here's how I would go about solving it:
Look up what the find function does. I found this resource online which says that find will return the index of the first occurrence of value (what's passed into the function). If one doesn't exist, it returns -1.
Since we're looking for combinations of y and any consonant, I'd just change the arrays of your consonants to be a list of all the combinations that I'm looking for:
# Note that each one of the strings has a space in the beginning so that
# it always appears in the start of the word
incorrect_strings = [" yb", " yc", ...]
But this won't quite work because it doesn't take into account all the permutations of lowercase and uppercase letters. However, there is a handy trick for handling lowercase vs. uppercase (making the entire string lowercase).
string = string.lower()
Now we just have to see if any of the incorrect strings appear in the string:
string = string.lower()
incorrect_strings = [" yb", " yc", ...]
for incorrect_string in incorrect_strings:
    if string.find(incorrect_string) >= 0:
        # We can early return here since it contains at least one incorrect string
        return True
return False
To be honest, since you're only returning a True/False value, I'm not too sure why you need to use the find function. Doing if incorrect_string in string: would work better in this case.
EDIT
@Barmar mentioned that this wouldn't correctly check the first word in the string. One way to get around this is to remove the " " from all the incorrect_strings, and then have the if check whether the string starts with incorrect_string (find returns 0) or contains f" {incorrect_string}" somewhere (find returns 0 or more):
string = string.lower()
incorrect_strings = ["yb", "yc", ...]
for incorrect_string in incorrect_strings:
    if string.find(incorrect_string) == 0 or string.find(f" {incorrect_string}") >= 0:
        # We can early return here since it contains at least one incorrect string
        return True
return False
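Putting the pieces of this answer together, a complete function might look like the sketch below. The name has_y_before_consonant is made up, and the consonant list is copied from the question (so it includes "y" and "ñ"). Instead of checking the first word separately, this version assumes it is acceptable to prepend a space so that the first word looks like every other word.

```python
def has_y_before_consonant(text):
    # Hypothetical name; consonant list copied from the question (it includes "y").
    consonants = "bcdfghjklmnñpqrstvwxyz"
    # Lowercase once, and pad with a space so the first word is treated like any other.
    padded = " " + text.lower()
    for consonant in consonants:
        if padded.find(" y" + consonant) != -1:
            return True
    return False

print(has_y_before_consonant("The word yes is correct but the word yntelligent is incorrect"))
print(has_y_before_consonant("Yttrium is an element"))
print(has_y_before_consonant("yes, say nothing"))
```

The first two calls find " yn" and " yt" respectively; the third finds no word-initial "y" followed by a consonant.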

Identifying the substring when the number of characters in between doesn't matter

def checkPattern(x, string):
    e = len(string)
    if len(x) < e:
        return False
    for i in range(e - 1):
        x = string[i]
        y = string[i + 1]
        last = x.rindex(x)
        first = x.index(y)
        if last == -1 or first == -1 or last > first:
            return False
    return True
if __name__ == "__main__":
    x = str(input())
    string = "hello"
    if checkPattern(x, string) is True:
        print('YES')
    if checkPattern(x, string) is False:
        print('NO')
So basically the code is supposed to identify a substring when the number of characters between the substring's letters doesn't matter. string = "hello" is supposed to be the substring. While the characters in between don't matter, the order still matters, so if I type "h.e.l.l.o", for example, it's a YES, but if it's something like "hlelo" it's a NO. I sort of copied the base of the code and I'm still a little new to Python, so sorry if the question and code aren't clear.

Assuming I understand, and the input hlelo is No and the input h.e..l.l.!o is Yes, then the following code should work:
def checkPattern(x, string):
    assert x and string, "Error. Both inputs should be non-empty. "
    count_idx = 0 # index which counts where you are.
    for letter in x:
        if letter == string[count_idx]:
            count_idx += 1 # increment to check the next string
            if count_idx == len(string):
                return True
                # pattern was found if counter found matches equal to string length
    return False
if __name__ == "__main__":
    inp = input()
    string = "hello"
    if checkPattern(inp, string) is True:
        print('YES')
    if checkPattern(inp, string) is False:
        print('NO')
Explanation: Regardless of the input string x, you loop through its characters and check whether the characters of the search string hello appear in the correct order. My solution counts how many of the characters h, e, l, l, o it has found so far, starting from 0. If it finds a match for h, it moves on to check for a match for e, and so on. If you search through the entire string x and the counter never reaches the length of the search string (i.e. you could not find all the hello characters), it returns False.
EDIT: Small fix to the way the return worked. The function now returns as soon as the counter reaches the length of the search string. Also added more examples given in comments.

Here is my solution to this problem:
pattern = "hello"
def patternCheck(word, pattern) -> bool:
    plist = list(pattern)
    wlist = list(word)
    for p in plist:
        if p in wlist:
            for _ in range(wlist.index(p) , -1, -1):
                wlist.pop(_)
        else:
            return False
    return True
print(patternCheck("h.e.l.l.o", pattern))
print(patternCheck("aalohel", pattern))
print(patternCheck("hhhhheeelllooo", pattern))
Explanation
First we convert our strings to a list
plist = list(pattern)
wlist = list(word)
Now we check using a for loop if every element in our pattern list is in the word list.
for p in plist:
    if p in wlist:
If yes, then we remove all the elements from index 0 up to and including the index of that element.
        for _ in range(wlist.index(p) , -1, -1):
            wlist.pop(_)
We remove the elements in decreasing order of their indices to protect ourselves from IndexError: pop index out of range.
If the for loop ends normally, there was a match and we return True. If an element was not found in the word list, we return False because there is no match.
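The "remove everything up to the match" idea above can also be expressed without mutating a list, by sharing one iterator over the word. The sketch below is an alternative, not this answer's code, and the name is_subsequence is made up. Each `in` test consumes the iterator up to and including the first match, so the next letter is only searched for in what remains.

```python
def is_subsequence(pattern, word):
    # Hypothetical helper name. A single iterator over `word` is shared by all
    # membership tests, so each search resumes where the previous one stopped.
    remaining = iter(word)
    return all(letter in remaining for letter in pattern)

for candidate in ["h.e.l.l.o", "hlelo", "hhhhheeelllooo", "aalohel"]:
    print(candidate, "YES" if is_subsequence("hello", candidate) else "NO")
```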

How to use a list of numbers as index inputs

So I have a list of numbers (answer_index) which correspond to the index locations (indices) of a character (char) in a word (word). I would like to use the numbers in the list as index inputs (indexes) later on in the code to replace every character except my chosen character (char) with "*", so that the final print (new_word) in this instance would be (****ee) instead of (coffee). It is important that (word) maintains its original value while (new_word) becomes the modified version. Does anyone have a solution for turning a list into valid index inputs? I will also accept easier ways to meet my goal. (Note: I am extremely new to Python so I'm sure my code looks horrendous.) Code below:
word = 'coffee'
print(word)
def find(string, char):
    for i, c in enumerate(string):
        if c == char:
            yield i
string = word
char = "e"
indices = (list(find(string, char)))
answer_index = (list(indices))
print(answer_index)
for t in range(0, len(answer_index)):
    answer_index[t] = int(answer_index[t])
indexes = [(answer_index)]
new_character = '*'
result = ''
for i in indexes:
    new_word = word[:i] + new_character + word[i+1:]
print(new_word)

You hardly ever need to work with indices directly:
string = "coffee"
char_to_reveal = "e"
censored_string = "".join(char if char == char_to_reveal else "*" for char in string)
print(censored_string)
Output:
****ee
If you're trying to implement a game of hangman, you might be better off using a dictionary which maps characters to other characters:
string = "coffee"
map_to = "*" * len(string)
mapping = str.maketrans(string, map_to)
translated_string = string.translate(mapping)
print(f"All letters are currently hidden: {translated_string}")
char_to_reveal = "e"
del mapping[ord(char_to_reveal)]
translated_string = string.translate(mapping)
print(f"'{char_to_reveal}' has been revealed: {translated_string}")
Output:
All letters are currently hidden: ******
'e' has been revealed: ****ee

The easiest and fastest way to replace all characters except some is to use regular expression substitution. In this case, it would look something like:
import re
re.sub('[^e]', '*', 'coffee') # returns '****ee'
Here, [^...] is a pattern for negative character match. '[^e]' will match (and then replace) anything except "e".
Other options include decomposing the string into an iterable of characters (@PaulM's answer) or working with a bytearray instead.
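The bytearray option is only mentioned in passing, so here is a rough sketch of what it could look like. It assumes the text is plain ASCII (one byte per character); the variable names are illustrative.

```python
word = "coffee"
char_to_keep = "e"

# A bytearray is mutable, so positions can be overwritten in place.
# Assumption: the text is ASCII, so one byte corresponds to one character.
buffer = bytearray(word, "ascii")
for i, byte in enumerate(buffer):
    if byte != ord(char_to_keep):
        buffer[i] = ord("*")

new_word = buffer.decode("ascii")
print(word)      # the original string is untouched
print(new_word)
```

Because str objects are immutable, word keeps its value; only the bytearray copy is modified.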

In Python, it's often not idiomatic to use indexes, unless you really want to do something with them. I'd avoid them for this problem and instead just iterate over the word, read each character and create a new word:
word = "coffee"
char_to_keep = "e"
new_word = ""
for char in word:
    if char == char_to_keep:
        new_word += char_to_keep
    else:
        new_word += "*"
print(new_word)
# prints: ****ee
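If the list of indexes from the question has to be used after all, one possible way is to start from a fully masked list of characters and restore only the positions in the list. This is a sketch; mask_except is a made-up helper name, and answer_index is rebuilt here with a list comprehension rather than the question's generator.

```python
def mask_except(word, keep_indices, mask="*"):
    # Hypothetical helper: start fully masked, then restore the kept positions.
    chars = [mask] * len(word)
    for i in keep_indices:
        chars[i] = word[i]
    return "".join(chars)

word = "coffee"
answer_index = [i for i, c in enumerate(word) if c == "e"]
new_word = mask_except(word, answer_index)
print(answer_index, new_word, word)
```

Each element of answer_index is already an int, so it can be used directly as an index; the loop in the question failed because it wrapped the whole list in another list.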

Remove string character after run of n characters in string

Suppose you have a given string and an integer, n. Every time a character appears in the string more than n times in a row, you want to remove some of the characters so that it only appears n times in a row. For example, for the case n = 2, we would want the string 'aaabccdddd' to become 'aabccdd'. I have written this crude function that compiles without errors but doesn't quite get me what I want:
def strcut(string, n):
    for i in range(len(string)):
        for j in range(n):
            if i + j < len(string)-(n-1):
                if string[i] == string[i+j]:
                    beg = string[:i]
                    ends = string[i+1:]
                    string = beg + ends
    print(string)
These are the outputs for strcut('aaabccdddd', n):
n    output     expected
1    'abcdd'    'abcd'
2    'acdd'     'aabccdd'
3    'acddd'    'aaabccddd'
I am new to python but I am pretty sure that my error is in line 3, 4 or 5 of my function. Does anyone have any suggestions or know of any methods that would make this easier?

This may not answer why your code does not work, but here's an alternate solution using regex:
import re
def strcut(string, n):
    return re.sub(fr"(.)\1{{{n-1},}}", r"\1"*n, string)
How it works: First, the pattern formatted is "(.)\1{n-1,}". If n=3 then the pattern becomes "(.)\1{2,}"
(.) is a capture group that matches any single character
\1 matches the first capture group
{2,} matches the previous token 2 or more times
The replacement string is the first capture group repeated n times
For example: str = "aaaab" and n = 3. The first "a" is the capture group (.). The next 3 "aaa" matches \1{2,} - in this example a{2,}. So the whole thing matches "a" + "aaa" = "aaaa". That is replaced with "aaa".
regex101 can explain it better than me.
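To see how the f-string turns into the pattern described above, the pattern can be built separately and printed. This is only an illustration; build_pattern is a made-up helper, not part of the answer's code.

```python
import re

def build_pattern(n):
    # Doubled braces are literal braces in an f-string, so n=3 gives (.)\1{2,}
    return fr"(.)\1{{{n-1},}}"

for n in (1, 2, 3):
    pattern = build_pattern(n)
    result = re.sub(pattern, r"\1" * n, "aaabccdddd")
    print(n, pattern, result)
```

For n = 1 the quantifier becomes {0,}, so every run of any length matches and is replaced by a single copy of its character.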

You can implement a stack data structure.
The idea is to push each new character onto the stack. If it is the same as the character on top of the stack, push it and increase the counter only if the counter stays within the limit n; otherwise skip it. If the new character is not the same as the one on top, push it and set the counter to 1.
def func(string, n):
    stack = []
    counter = None
    for i in string:
        if not stack:
            counter = 1
            stack.append(i)
        elif stack[-1]==i:
            if counter+1<=n:
                stack.append(i)
                counter+=1
        elif stack[-1]!=i:
            stack.append(i)
            counter = 1
    return ''.join(stack)
print(func('aaabbcdaaacccdsdsccddssse', 2)=='aabbcdaaccdsdsccddsse')
print(func('aaabccdddd',1 )=='abcd')
print(func('aaabccdddd',2 )=='aabccdd')
print(func('aaabccdddd',3 )=='aaabccddd')
output
True
True
True
True

The method I would use is to create a new empty string at the start of the function and then, whenever a character's run exceeds n, simply not append it to the output string. This is computationally efficient because it takes a single pass over the input:
def strcut(string,n) :
    new_string = ""
    first_c, s = string[0], 0
    for c in string :
        if c != first_c :
            first_c, s= c, 0
        s += 1
        if s > n : continue
        else : new_string += c
    return new_string
print(strcut("aabcaaabbba",2)) # output : aabcaabba

Simply, to answer the question
appears in the string more than n times in a row
the following code is small and simple, and will work fine :-)
def strcut(string: str, n: int) -> str:
    tmp = "*" * (n+1)
    for char in string:
        if tmp[len(tmp) - n:] != char * n:
            tmp += char
    print(tmp[n+1:])
strcut("aaabccdddd", 1)
strcut("aaabccdddd", 2)
strcut("aaabccdddd", 3)
Output:
abcd
aabccdd
aaabccddd
Notes:
The character "*" in the line tmp = "*" * (n+1) can be any character that is not in the string; it's just a placeholder to handle the start case when there are no characters yet.
The print(tmp[n+1:]) line simply removes the "*" characters added at the beginning.

You don't need nested loops. Keep track of the current character and its count. Include characters when the count is less than or equal to n, and reset the current character and count when it changes.
def strcut(s,n):
    result = ''                          # resulting string
    char,count = '',0                    # initial character and count
    for c in s:                          # only loop once on the characters
        if c == char: count += 1         # increase count
        else: char,count = c,1           # reset character/count
        if count<=n: result += c         # include character if count is ok
    return result
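The same "track the current run" idea can be delegated to the standard library. The sketch below swaps in itertools.groupby and itertools.islice; it is an alternative, not this answer's code, and the name strcut_groupby is made up.

```python
from itertools import groupby, islice

def strcut_groupby(s, n):
    # Hypothetical name. groupby yields one group per run of equal characters;
    # islice keeps at most the first n characters of each run.
    return "".join("".join(islice(run, n)) for _, run in groupby(s))

for n in (1, 2, 3):
    print(n, strcut_groupby("aaabccdddd", n))
```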

Just to give some ideas, here is a different approach. I didn't like how the inner loop over n runs every time: even if I am at i=3 with n=2, I still move on to i=4 although I already checked that character while going through n. And since you are checking the next n characters in the string, your method doesn't fit with keeping the string's characters in order. Here is a rough method that I find easier to read.
def strcut(string, n):
    for i in range(len(string)-1,0,-1): # I go backwards assuming you want to keep the front characters
        # note: count() counts every occurrence in the string, not only the current run
        if string.count(string[i]) > n:
            string = remove(string,i)
    print(string)
def remove(string, i):
    if i > len(string):
        return string[:i]
    return string[:i] + string[i+1:]
strcut('aaabccdddd',2)

Longest Common Prefix with Python

I am trying to figure out an easy leetcode question and I do not know why my answer does not work.
Problem:
Write a function to find the longest common prefix string amongst an array of strings.
If there is no common prefix, return an empty string "".
My Code:
shortest=min(strs,key=len)
strs.remove(shortest)
common=shortest
for i in range(1,len(shortest)):
    comparisons=[common in str for str in strs]
    if all(comparisons):
        print(common)
        break
    else:
        common=common[:-i]
The above trial does not work when the length of the strings in the list are same but works for other cases.
Thank you very much.

Friend, try to make it as 'pythonic' as possible, just like you would in real life.
In real life, what do you see? You see words, and maybe you look for the shortest word and compare it to all the others. Okay, let's do that: let's find the longest word and then the shortest.
First we create an empty string, where the characters that are the same in both strings will be stored.
prefix = ''
# 'key=len' is a necessary parameter, because otherwise max and min would compare the strings alphabetically, and the result is not always the longest or shortest in terms of length
max_sentense = max(strings, key=len)
min_sentense = min(strings, key=len)
Okay, now what would we do in real life?
Loop through both one by one from the beginning. Is that possible in Python? Yes, with zip():
for i, o in zip(max_sentense, min_sentense):
The 'i' will go through the longest string and the 'o' will go through the shortest string.
Now it's easy: we just have to stop going through them when 'i' and 'o' are not the same character.
for i, o in zip(max_sentense, min_sentense):
    if i == o:
        prefix += i
    else:
        break
Keep in mind that this only compares the longest and the shortest string; the strings in between are not checked.
Full code:
prefix = ''
max_sentense = max(strings, key=len)
min_sentense = min(strings, key=len)
for i, o in zip(max_sentense, min_sentense):
    if i == o:
        prefix += i
    else:
        break
print(prefix)

It's quickest to compare the first characters of all the words, and then the second characters, etc. Otherwise you're doing unnecessary comparisons.
def longestCommonPrefix(self, strs):
    prefix = ''
    for char in zip(*strs):
        if len(set(char)) == 1:
            prefix += char[0]
        else:
            break
    return prefix

You can do this fairly efficiently in a single iteration over the list. I've made this a little verbose so that it's easier to understand.
import itertools
def get_longest_common_prefix(strs):
    longest_common_prefix = strs.pop()
    for string in strs:
        pairs = zip(longest_common_prefix, string)
        longest_common_prefix_pairs = itertools.takewhile(lambda pair: pair[0] == pair[1], pairs)
        longest_common_prefix = (x[0] for x in longest_common_prefix_pairs)
    return ''.join(longest_common_prefix)

In your code you cross check with the shortest string which can be one of the shortest strings if multiple same length strings are present. Furthermore the shortest might not have the longest common prefix.
This is not a very clean code but it does the job
common, max_cnt = "", 0
for i, s1 in enumerate(strs[:-2]):
    for s2 in strs[i+1:]:
        for j in range(1, min(len(s1), len(s2))+1):
            if s1[:j] == s2[:j]:
                if j > max_cnt:
                    max_cnt = j
                    common = s1[:j]

This function takes any number of positional arguments.
If no argument is given, it returns "".
If just one argument is given, it is returned.
from itertools import zip_longest
def common_prefix(*strings) -> str:
    length = len(strings)
    if not length:
        return ""
    if length == 1:
        return strings[0]
    # as pointed in another answer, 'key=len' is necessary because otherwise
    # the strings will be compared according to lexicographical order,
    # instead of their length
    shortest = min(strings, key=len)
    longest = max(strings, key=len)
    # we use zip_longest instead of zip because `shortest` might be a prefix
    # of the longest; that is, the longest common prefix might be `shortest`
    # itself
    for i, chars in enumerate(zip_longest(shortest, longest)):
        if chars[0] != chars[1]:
            return shortest[:i]
    # if it didn't return by now, `shortest` and `longest` are identical,
    # so the whole string is the common prefix
    return shortest
if __name__ == "__main__":
    for args in [
        ("amigo", "amiga", "amizade"),
        tuple(),
        ("teste",),
        ("amigo", "amiga", "amizade", "atm"),
    ]:
        print(*args, sep=", ", end=": ")
        print(common_prefix(*args))

Simple python code
def longestCommonPrefix(self, arr):
    arr.sort(reverse = False)
    print(arr)
    n= len(arr)
    str1 = arr[0]
    str2 = arr[n-1]
    n1 = len(str1)
    n2 = len(str2)
    result = ""
    j = 0
    i = 0
    while(i <= n1 - 1 and j <= n2 - 1):
        if (str1[i] != str2[j]):
            break
        result += (str1[i])
        i += 1
        j += 1
    return (result)
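The sorting trick above works because, in lexicographic order, the first and last strings are the pair that diverges earliest, so any prefix they share is shared by everything in between. A sketch of the same idea without a full sort is shown below; the name common_prefix_minmax is made up. For cross-checking, the standard library's os.path.commonprefix also compares strings character by character.

```python
import os.path

def common_prefix_minmax(strs):
    # Hypothetical name. min() and max() without a key compare lexicographically,
    # which is the same first/last pair a full sort would produce.
    if not strs:
        return ""
    low, high = min(strs), max(strs)
    for i, ch in enumerate(low):
        if i >= len(high) or ch != high[i]:
            return low[:i]
    return low

words = ["flower", "flow", "flight"]
print(common_prefix_minmax(words))
print(os.path.commonprefix(words))
```

Unlike the in-place sort, min() and max() leave the input list unchanged.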
