The problem at hand is that given a string S, we can transform every letter individually to be lowercase or uppercase to create another string.
The desired result is a list of all possible strings we could create.
E.g.:
Input:
S = "a1b2"
Output:
["a1b2", "a1B2", "A1b2", "A1B2"]
I can see that the code below generates the correct result, but I'm a beginner in Python. Can you help me understand how the loops on lines 5 and 7, which assign a value to res, work?
def letterCasePermutation(self, S):
    res = ['']
    for ch in S:
        if ch.isalpha():
            res = [i+j for i in res for j in [ch.upper(), ch.lower()]]
        else:
            res = [i+ch for i in res]
    return res
res is a list of all possible strings up to this point. Each iteration of the loop handles the next character.
If the character is a non-letter (line 7), the comprehension simply adds that character to each string in the list.
If the character is a letter, then the new list contains two strings for each one in the input: one with the upper-case version added, one for the lower-case version.
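To make the comprehension on line 5 less opaque, here is a sketch of the same step written as explicit nested loops. The helper name extend_with_letter is made up for illustration and is not part of the original code.

```python
def extend_with_letter(res, ch):
    """Explicit-loop equivalent of:
    [i + j for i in res for j in [ch.upper(), ch.lower()]]
    """
    new_res = []
    for i in res:                           # outer loop: each string built so far
        for j in [ch.upper(), ch.lower()]:  # inner loop: both cases of the letter
            new_res.append(i + j)
    return new_res

print(extend_with_letter(['A1', 'a1'], 'b'))
# ['A1B', 'A1b', 'a1B', 'a1b']
```

The order of the two for clauses in the comprehension matches the nesting order of the loops: the leftmost for is the outermost loop.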
If you're still confused, then I strongly recommend that you make an attempt to understand this with standard debugging techniques. Insert a couple of useful print statements to display the values that confuse you.
def letterCasePermutation(self, S):
    res = ['']
    for ch in S:
        print("char = ", ch)
        if ch.isalpha():
            res = [i+j for i in res for j in [ch.upper(), ch.lower()]]
        else:
            res = [i+ch for i in res]
        print(res)
    return res

letterCasePermutation(None, "a1b2")
Output:
char = a
['A', 'a']
char = 1
['A1', 'a1']
char = b
['A1B', 'A1b', 'a1B', 'a1b']
char = 2
['A1B2', 'A1b2', 'a1B2', 'a1b2']
The best way to analyze this code is to include the line:
print(res)
at the end of the outer for loop, as the first answer suggests.
Then run it with the string '123' and the string 'abc' which will isolate the two conditionals. This gives the following output:
['1']
['12']
['123']
and
['A', 'a']
['AB', 'Ab', 'aB', 'ab']
['ABC', 'ABc', 'AbC', 'Abc', 'aBC', 'aBc', 'abC', 'abc']
Here we can see the loop is just taking the previously generated list as its input, and if the next string char is not a letter, is simply tagging the number/symbol onto the end of each string in the list, via string concatenation. If the next char in the initial input string is a letter, however, then the list is doubled in length by creating two copies for each item in the list, while simultaneously appending an upper version of the new char to the first copy, and a lower version of the new char to the second copy.
For an interesting result, see how the code fails if this change is made at line 2:
res = []
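A quick way to see the effect is sketched below. The seed parameter is hypothetical (it is not in the original code) and simply stands in for the initial value of res.

```python
def letter_case_permutation(s, seed):
    # seed plays the role of the initial value assigned to res on line 2
    res = list(seed)
    for ch in s:
        if ch.isalpha():
            res = [i + j for i in res for j in [ch.upper(), ch.lower()]]
        else:
            res = [i + ch for i in res]
    return res

print(letter_case_permutation("a1b2", ['']))  # ['A1B2', 'A1b2', 'a1B2', 'a1b2']
print(letter_case_permutation("a1b2", []))    # []
```

Both comprehensions iterate over res, so an empty starting list gives them nothing to extend and the result stays empty. The single empty string in [''] is what the first character gets attached to.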
In a python script, I need to assess whether a string contains duplicates of a specific character (e.g., "f") and, if so, remove all but the first instance of that character. Other characters in the string may also have duplicates, but the script should not remove any duplicates other than those of the specified character.
This is what I've got so far. The script runs, but it is not accomplishing the desired task. I modified the reduce() line from the top answer to this question, but it's a little more complex than what I've learned at this point, so it's difficult for me to tell what part of this is wrong.
import re
from functools import reduce

string = "100 ffeet"
dups = ["f", "t"]
for char in dups:
    if string.count(char) > 1:
        lst = list(string)
        reduce(lambda acc, el: acc if re.match(char, el) and el in acc else acc + [el], lst, [])
        string = "".join(lst)
Let's create a function that receives a string s and a character c as parameters, and returns a new string where all but the first occurrence of c in s are removed.
We'll be making use of the following methods from the Python standard library:
str.find(sub): Return the lowest index in the string where substring sub is found.
str.replace(old, new): Return a copy of the string with all occurrences of substring old replaced by new.
The idea is straightforward:
Find the first index of c in s
If none is found, return s
Make a substring of s starting from the next character after c
Remove all occurrences of c in the substring
Concatenate the first part of s with the updated substring
Return the final string
In Python:
def remove_all_but_first(s, c):
    i = s.find(c)
    if i == -1:
        return s
    i += 1
    return s[:i] + s[i:].replace(c, '')
Now you can use this function to remove all the characters you want.
def main():
    s = '100 ffffffffeet'
    dups = ['f', 't', 'x']
    print('Before:', s)
    for c in dups:
        s = remove_all_but_first(s, c)
    print('After:', s)

if __name__ == '__main__':
    main()
Here is one way that you could do it
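string = "100 ffeet"
dups = ["f", "t"]
# Walk backwards so that deleting a character does not shift the
# indices we still have to visit. Index 0 can never be a repeat.
for s in range(len(string)-1,0,-1):
    # Drop the character only if the same one already occurs earlier,
    # so the first occurrence is the one that survives.
    if string[s] in dups and string[s] in string[:s]:
        string = string[:s] + string[s+1:]
print(string)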
So I need to code a program which, for example, given the input 3[a]2[b], prints "aaabb", or given 3[ab]2[c], prints "abababcc" (basically, it prints each bracketed string that many times, in the given order). I tried to use a for loop to iterate over the input and detect the "[" characters so it knows what to print repeatedly, but I don't know how to make it also understand where that string ends.
Also, this is as far as I could get, which probably isn't too useful:
string=input()
string=string[::-1]
bulundu=6
for i in string:
    if i!="]":
        if i!="[":
            lst.append(i)
    if i=="[":
        break
The approach I took is to remove the brackets, split the items into a list, then walk the list, and if the item is a number, add that many repeats of the next item to the result for output:
import re

data = "3[a]2[b]"
# Remove brackets and convert to a list
data = re.sub(r'[\[\]]', ' ', data).split()
result = []
for i, item in enumerate(data):
    # If item is a number, add that many repeats of the next item
    if item.isdigit():
        result.append(data[i+1] * int(item))
print(''.join(result))
# aaabb
A different approach, inspired by Subbu's use of re.findall. This approach finds all 'pairs' of numbers and letters using match groups, then multiplies them to produce the required text:
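import re

data = "3[a]2[b]"
# raw string so that \d and \[ reach the regex engine unchanged
matches = re.findall(r'(\d+)\[([a-zA-Z]+)\]', data)
# [('3', 'a'), ('2', 'b')]  (the counts are still strings here)
for x in matches:
    print(x[1] * int(x[0]), end='')
# aaabb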
Lengthy and documented version using NO regex, only simple string and list manipulation:
first split the input into parts that are numbers and texts
then recombine them again
I opted to document with inline comments
This could be done like so:
# testcases are tuples of input and correct result
testcases = [ ("3[a]2[b]","aaabb"),
              ("3[ab]2[c]","abababcc"),
              ("5[12]6[c]","1212121212cccccc"),
              ("22[a]","a"*22)]

# now we use our algo for all those testcases
for inp,res in testcases:
    split_inp = []   # list that takes the split values of the input
    num = 0          # accumulator variable for more-than-1-digit numbers
    in_text = False  # bool that tells us if we are currently collecting letters

    # go over all letters : O(n)
    for c in inp:
        # when a [ is reached our num is complete and we need to store it
        # we collect all further letters until next ] in a list that we
        # add at the end of split_inp
        if c == "[":
            split_inp.append(num)  # add the completed number
            num = 0                # and reset it to 0
            in_text = True         # now in text
            split_inp.append([])   # add a list to collect letters
        # done collecting letters
        elif c == "]":
            in_text = False        # no longer collecting, convert letters
            split_inp[-1] = ''.join(split_inp[-1])  # to text
        # between [ and ] ... simply add letter to list at end
        elif in_text:
            split_inp[-1].append(c)  # add letter
        # currently collecting numbers
        else:
            num *= 10      # increase current number by factor 10
            num += int(c)  # add newest digit

    print(repr(inp), split_inp, sep="\n")  # debugging output for parsing part

    # now we need to build the string from our parsed data
    amount = 0
    result = []  # intermediate list to join ['aaa','bb']
    # iterate the list, if int remember it, if text, build composite
    for part in split_inp:
        if isinstance(part, int):
            amount = part
        else:
            result.append(part*amount)
    # join the parts
    result = ''.join(result)

    # check if all worked out
    if result == res:
        print("CORRECT: ", result + "\n")
    else:
        print (f"INCORRECT: should be '{res}' but is '{result}'\n")
Result:
'3[a]2[b]'
[3, 'a', 2, 'b']
CORRECT: aaabb
'3[ab]2[c]'
[3, 'ab', 2, 'c']
CORRECT: abababcc
'5[12]6[c]'
[5, '12', 6, 'c']
CORRECT: 1212121212cccccc
'22[a]'
[22, 'a']
CORRECT: aaaaaaaaaaaaaaaaaaaaaa
This will also handle cases like '5[12]', which some of the other solutions won't.
You can capture both the number of repetitions n and the pattern to repeat v in one go using the pattern below. It matches any sequence of digits, which is the first group we need to capture (the reason \d+ is in parentheses), followed by a [, followed by anything (this is the second group of interest, hence it is also in parentheses), followed by a ].
findall finds all these matches in the passed line; the first group of each match (the number) is cast to an int and used as a multiplier for the string pattern. The list of int(n) * v values is then joined with an empty string. Malformed patterns may throw exceptions or return nothing.
Anyway, in code:
import re

# raw string so the backslashes are passed to the regex engine as written
pattern = re.compile(r"(\d+)\[(.*?)\]")

def func(x): return "".join([v*int(n) for n,v in pattern.findall(x)])

print(func("3[a]2[b]"))
print(func("3[ab]2[c]"))
OUTPUT
aaabb
abababcc
FOLLOW UP
Another solution which achieves the same result, without using regular expression (ok, not nice at all, I get it...):
def func(s): return "".join([int(x[0])*x[1] for x in map(lambda x:x.split("["), s.split("]")) if len(x) == 2])
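For readers who find the one-liner hard to follow, here is a sketch of the same split-based idea unrolled into steps. The name func_expanded is made up for illustration, and like the one-liner it assumes well-formed, non-nested input.

```python
def func_expanded(s):
    pieces = []
    # "3[ab]2[c]".split("]") gives ["3[ab", "2[c", ""]
    for chunk in s.split("]"):
        parts = chunk.split("[")      # "3[ab" -> ["3", "ab"]
        if len(parts) == 2:           # skips the trailing empty chunk
            count, text = parts
            pieces.append(int(count) * text)
    return "".join(pieces)

print(func_expanded("3[a]2[b]"))   # aaabb
print(func_expanded("3[ab]2[c]"))  # abababcc
```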
I am not much more than a beginner and looking at the other answers, I thought understanding regex might be a challenge for a new contributor such as yourself since I myself haven't really dealt with regex.
The beginner friendly way to do this might be to loop through the input string and use string functions like isnumeric() and isalpha()
data = "3[a]2[b]"
chars = []
nums = []
substrings = []
for i, char in enumerate(data):
    if char.isnumeric():
        nums.append(char)
    if char.isalpha():
        chars.append(char)
for i, char in enumerate(chars):
    substrings.append(char * int(nums[i]))
string = "".join(substrings)
print(string)
OUTPUT:
aaabb
And on trying different values for data:
data = "0[a]2[b]3[p]"
OUTPUT bbppp
data = "1[a]1[a]2[a]"
OUTPUT aaaa
NOTE: In case you're not familiar with the above functions, they are string methods, which are fairly self-explanatory. They are used as <your_string_here>.isalpha(), which returns True if and only if the string is non-empty and every character in it is a letter (whitespace, digits, and symbols return False).
And, similarly for isnumeric()
For example,
"]".isnumeric() and "]".isalpha() return False
"a".isalpha() returns True
IF YOU NEED ANY CLARIFICATION ON A FUNCTION USED, PLEASE DO NOT HESITATE TO LEAVE A COMMENT
I'm trying to compress a string in a way that any sequence of letters in strict alphabetical order is swapped with the first letter plus the length of the sequence.
For example, the string "abcdefxylmno", would become: "a6xyl4"
Single letters that aren't in order with the one before or after just stay the way they are.
How do I check that two letters are successors (a,b) and not simply in alphabetical order (a,c)? And how do I keep iterating on the string until I find a letter that doesn't meet this requirement?
I'm also trying to do this in a way that makes it easier to write an inverse function (that given the result string gives me back the original one).
EDIT :
I've managed to get the function working, thanks to your suggestion of using the alphabet string as comparison; now I'm very much stuck on the inverse function: given "a6xyl4" expand it back into "abcdefxylmno".
After quite some time I managed to split the string every time there's a number and I made a function that expands a 2 char string, but it fails to work when I use it on a longer string:
from string import ascii_lowercase as abc

def subString(start,n):
    L=[]
    ind = abc.index(start)
    newAbc = abc[ind:]
    for i in range(len(newAbc)):
        while i < n:
            L.append(newAbc[i])
            i+=1
    res = ''.join(L)
    return res

def unpack(S):
    for i in range(len(S)-1):
        if S[i] in abc and S[i+1] not in abc:
            lett = str(S[i])
            num = int(S[i+1])
            return subString(lett,num)

def separate(S):
    lst = []
    for i in S:
        lst.append(i)
    for el in lst:
        if el.isnumeric():
            ind = lst.index(el)
            lst.insert(ind+1,"-")
    a = ''.join(lst)
    L = a.split("-")
    if S[-1].isnumeric():
        L.remove(L[-1])
        return L
    else:
        return L

def inverse(S):
    L = separate(S)
    for i in L:
        return unpack(i)
Each of these functions work singularly, but inverse(S) doesn't output anything. What's the mistake?
You can use the ord() function, which returns an integer representing the Unicode code point of a character. Sequential letters in alphabetical order differ by 1. With that, you can implement a simple function:
def is_successor(a,b):
    # check for marginal cases if we don't ensure
    # input restriction somewhere else
    # (range() excludes its end, so + 1 is needed to accept 'z' and 'Z')
    if ord(a) not in range(ord('a'), ord('z') + 1) and ord(a) not in range(ord('A'), ord('Z') + 1):
        return False
    if ord(b) not in range(ord('a'), ord('z') + 1) and ord(b) not in range(ord('A'), ord('Z') + 1):
        return False
    # returns true if they are sequential
    return ((ord(b) - ord(a)) == 1)
You can use chr(int) method for your reversing stage as it returns a string representing a character whose Unicode code point is an integer given as argument.
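As a rough illustration of how ord() and chr() could serve the reversing stage, the sketch below rebuilds a run of consecutive letters from its first letter and its length. The name expand_run is hypothetical, and the sketch assumes the run never goes past the end of the alphabet.

```python
def expand_run(start, n):
    """Return n consecutive characters beginning at start.

    Assumption: the caller guarantees the run stays inside the
    alphabet; no bounds check is performed here.
    """
    return ''.join(chr(ord(start) + k) for k in range(n))

print(ord('b') - ord('a'))  # 1
print(expand_run('a', 6))   # abcdef
print(expand_run('l', 4))   # lmno
```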
This builds on the idea that acceptable subsequences will be substrings of the ABC:
from string import ascii_lowercase as abc  # 'abcdefg...'

text = 'abcdefxylmno'
stack = []
cache = ''
# collect subsequences
for char in text:
    if cache + char in abc:
        cache += char
    else:
        stack.append(cache)
        cache = char
# if present, append the last sequence
if cache:
    stack.append(cache)
# stack is now ['abcdef', 'xy', 'lmno']
# Build the final string 'a6x2l4'
result = ''.join(f'{s[0]}{len(s)}' if len(s) > 1 else s for s in stack)
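For the inverse direction asked about in the question, one possible sketch is shown below. It is not taken from any of the answers: the function name expand and the use of re.findall are assumptions, and it expects lowercase letters each optionally followed by a count.

```python
import re
from string import ascii_lowercase as abc

def expand(packed):
    """Expand e.g. 'a6x2l4' or 'a6xyl4' back into 'abcdefxylmno'."""
    parts = []
    # Each match is one letter followed by an optional run of digits.
    for letter, count in re.findall(r'([a-z])(\d*)', packed):
        start = abc.index(letter)
        length = int(count) if count else 1  # a bare letter stands for itself
        parts.append(abc[start:start + length])
    return ''.join(parts)

print(expand('a6x2l4'))  # abcdefxylmno
print(expand('a6xyl4'))  # abcdefxylmno
```

Because the count is matched with \d*, two-digit lengths such as 'a26' are read as one number rather than digit by digit.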
I am trying to collapse words that consist of a single repeated character using regex in Python, for example:
good => good
gggggggg => g
What I have tried so far is the following:
re.sub(r'([a-z])\1+', r'\1', 'ffffffbbbbbbbqqq')
The problem with the above solution is that it changes good to god, and I only want to collapse words that consist of a single repeated character.
A better approach here is to use a set
def modify(s):
    #Create a set from the string
    c = set(s)
    #If you have only one character in the set, convert set to string
    if len(c) == 1:
        return ''.join(c)
    #Else return original string
    else:
        return s

print(modify('good'))
print(modify('gggggggg'))
If you want to use regex, mark the start and end of the string in the regex with ^ and $ (inspired by #bobblebubble's comment)
import re

def modify(s):
    #Create the sub string with a regex which only matches if a single character is repeated
    #Marking the start and end of string as well
    out = re.sub(r'^([a-z])\1+$', r'\1', s)
    return out

print(modify('good'))
print(modify('gggggggg'))
The output will be
good
g
If you do not want to use a set in your method, this should do the trick:
def simplify(s):
    l = len(s)
    if l>1 and s.count(s[0]) == l:
        return s[0]
    return s

print(simplify('good'))
print(simplify('abba'))
print(simplify('ggggg'))
print(simplify('g'))
print(simplify(''))
output:
good
abba
g
g
Explanations:
You compute the length of the string
you count the number of characters that are equal to the first one and you compare the count with the initial string length
depending on the result you return the first character or the whole string
You can use the Trim method:
Take a look at this example:
"ggggggg".Trim('g');
Update:
and for characters which are in the middle of the string use this function, thanks to this answer
in C#:
public static string RemoveDuplicates(string input)
{
    return new string(input.ToCharArray().Distinct().ToArray());
}
in python:
used = set()
unique = [x for x in mylist if x not in used and (used.add(x) or True)]
But I think none of these answers handles a string like aaaaabbbbbcda: it has an a at the end that does not appear in the result (abcd). For this kind of situation, use this function, which I wrote:
In:
def unique(s):
    used = set()
    ret = list()
    s = list(s)
    for x in s:
        if x not in used:
            ret.append(x)
            used = set()
        used.add(x)
    return ret

print(unique('aaaaabbbbbcda'))
out:
['a', 'b', 'c', 'd', 'a']
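The same "collapse each run of adjacent duplicates" behaviour can also be sketched with itertools.groupby from the standard library. This is an alternative technique, not what the answer above uses, and the name collapse_runs is made up for illustration.

```python
from itertools import groupby

def collapse_runs(s):
    # groupby yields one (key, group) pair per run of equal adjacent items,
    # so keeping only the keys keeps one character per run.
    return [key for key, _group in groupby(s)]

print(collapse_runs('aaaaabbbbbcda'))
# ['a', 'b', 'c', 'd', 'a']
```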
def password(passlist):
    listt = []
    for i in range(0, len(passlist)):
        temp = passlist[i]
    for j in range(0, len(temp)/2):
        if((j+2)%2 == 0) :
            t = temp[j]
            temp.replace(temp[j], temp[j+2])
            temp.replace(temp[j+2], t)
    listt.append(temp)
I am passing a list of strings,
for example ["abcd", "bcad"]. For each string, I want to swap the ith character with the jth character if (i+j)%2 == 0.
My code goes out of the bounds of the string.
Please suggest a better approach to this problem.
Here's how I'd do it:
def password(passlist):
    def password_single(s):
        temp = list(s)
        for j in range(0, len(temp) // 2, 2):
            temp[j], temp[j+2] = temp[j+2], temp[j]
        return ''.join(temp)
    return [password_single(s) for s in passlist]

print(password(["abcd", "bcad"]))
Define a function that operates on a single list element (password_single). It's easier to develop and debug that way. In this case, I made it an inner function but it doesn't have to be.
Use three-argument range calls, since it's the same as doing the two-argument + if(index%2 == 0)
Convert strings to lists, perform the swapping and convert back.
Use a "swap" type operation instead of two replaces.
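Two of the points above can be checked in isolation with a small sketch. The variable names here are illustrative only and do not come from the answer's code.

```python
# A three-argument range visits the same indices as a two-argument
# range filtered with index % 2 == 0.
n = 7
filtered = [j for j in range(0, n) if j % 2 == 0]
stepped = list(range(0, n, 2))
print(filtered)  # [0, 2, 4, 6]
print(stepped)   # [0, 2, 4, 6]

# Tuple assignment swaps two list items without a temporary variable.
temp = list("abcd")
temp[0], temp[2] = temp[2], temp[0]
swapped = ''.join(temp)
print(swapped)   # cbad
```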
Strings are immutable in python, therefore you cannot swap the characters in place. You have to build a new string.
Moreover, your code does not work for each string in passlist. You iterate through the string in passlist in the first for block, but then you use the temp variable outside that block. This means that the second for loop only iterates on the last string.
Now, a way to do what you want, might be:
for i in range(len(passlist)):
    pass_ = passlist[i]
    new_pass = [c for c in pass_]  # convert the string in a list of chars
    for j in range(len(pass_) // 2):  # // keeps the bound an int; / gives a float in Python 3
        new_pass[j], new_pass[j+2] = new_pass[j+2], new_pass[j]  # swap
    listt.append(''.join(new_pass))  # convert the list of chars back to string