LeetCode problem 14: Longest Common Prefix (Python)

I tried to solve this problem (you can read the description here: https://leetcode.com/problems/longest-common-prefix/), and the following is the code I came up with.
It initializes prefix from the first string in the strs list, then compares prefix with every string in the list, popping all characters that are not equal.
class Solution:
    def longestCommonPrefix(self, strs: List[str]) -> str:
        prefix = strs[0][0]
        for i in range(len(strs)):
            for j in range(len(prefix)):
                if strs[i][j] != prefix[j]:
                    prefix.pop(prefix[j])
        return prefix
But this code fails on the very first test case, where strs = ["flower","flow","flight"].
The expected output is "fl", while my code returns just "f".
I am struggling to find what is going wrong in my solution. Can you help?
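A note on why the code above returns "f": prefix starts as strs[0][0], which is only the first character, so the inner loop never looks past index 0. Python strings are also immutable and have no pop method, so that line would raise AttributeError if it were ever reached. Below is a minimal sketch of the same compare-and-shrink idea, assuming it is fine to start from the whole first string; the name longest_common_prefix is only illustrative.

```python
from typing import List


def longest_common_prefix(strs: List[str]) -> str:
    # An empty input has no common prefix.
    if not strs:
        return ""
    # Start from the whole first string, not just its first character.
    prefix = strs[0]
    for s in strs[1:]:
        # Strings are immutable, so shrink by slicing instead of pop().
        # The loop always ends: every string starts with "".
        while not s.startswith(prefix):
            prefix = prefix[:-1]
    return prefix


print(longest_common_prefix(["flower", "flow", "flight"]))  # fl
```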

Iterate over the characters in parallel with zip:
strs = ["flower", "flow", "flight"]
n = 0
for chars in zip(*strs):
    if len(set(chars)) > 1:
        break
    n += 1
# length
print(n) # 2
# prefix
print(strs[0][:n]) # fl
Similar approach as a one-liner using itertools.takewhile:
from itertools import takewhile
prefix = ''.join([x[0] for x in takewhile(lambda x: len(set(x)) == 1, zip(*strs))])
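To make the takewhile idea easier to follow, here is the same logic unpacked into a small function. This is only a sketch; the name common_prefix is made up for illustration. zip(*strs) yields one tuple per character position and stops at the shortest string, and takewhile keeps the tuples for as long as every string agrees.

```python
from itertools import takewhile


def common_prefix(strs):
    # One tuple per column: ('f','f','f'), ('l','l','l'), ('o','o','i'), ...
    columns = zip(*strs)
    # Keep columns only while all characters in the column are identical.
    matching = takewhile(lambda col: len(set(col)) == 1, columns)
    return "".join(col[0] for col in matching)


print(common_prefix(["flower", "flow", "flight"]))  # fl
```

With an empty list, zip(*strs) yields nothing and the function returns an empty string.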

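Alternatively, you could use os.path.commonprefix from the standard library, which compares the strings character by character:
(commonprefix has long been available; it is os.path.commonpath that was added in Python 3.5)
import os
from typing import List
def longestCommonPrefix(strs: List[str]) -> str:
    return os.path.commonprefix(strs)
strs = ["flower","flow","flight"]
print(longestCommonPrefix(strs))  # fl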
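One thing worth knowing about os.path.commonprefix is that, despite living in os.path, it works on plain characters rather than path components. A short illustration using only standard-library calls:

```python
import os

# Character-wise comparison works on any strings.
print(os.path.commonprefix(["flower", "flow", "flight"]))  # fl

# The result is not necessarily a valid path: it stops mid-component.
print(os.path.commonprefix(["/usr/lib", "/usr/local"]))  # /usr/l

# An empty list gives an empty string.
print(repr(os.path.commonprefix([])))  # ''
```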

Related

How can I remove specific duplicates from a list, rather than remove all duplicates indiscriminately?

In a python script, I need to assess whether a string contains duplicates of a specific character (e.g., "f") and, if so, remove all but the first instance of that character. Other characters in the string may also have duplicates, but the script should not remove any duplicates other than those of the specified character.
This is what I've got so far. The script runs, but it is not accomplishing the desired task. I modified the reduce() line from the top answer to this question, but it's a little more complex than what I've learned at this point, so it's difficult for me to tell what part of this is wrong.
import re
from functools import reduce
string = "100 ffeet"
dups = ["f", "t"]
for char in dups:
    if string.count(char) > 1:
        lst = list(string)
        reduce(lambda acc, el: acc if re.match(char, el) and el in acc else acc + [el], lst, [])
        string = "".join(lst)
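One likely reason the script appears to do nothing: reduce() returns a new list and does not modify lst, so its result is thrown away and string is rebuilt from the unchanged lst. A minimal sketch that keeps the reduce() approach but uses the return value (drop_repeats is a hypothetical helper name, and plain equality replaces re.match, which is an assumption about the intent):

```python
from functools import reduce


def drop_repeats(string, char):
    # Skip `char` once it is already in the accumulator; keep everything else.
    kept = reduce(
        lambda acc, el: acc if el == char and el in acc else acc + [el],
        string,
        [],
    )
    return "".join(kept)


string = "100 ffeet"
for char in ["f", "t"]:
    string = drop_repeats(string, char)
print(string)  # 100 feet
```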
Let's create a function that receives a string s and a character c as parameters, and returns a new string where all but the first occurrence of c in s are removed.
We'll be making use of the following functions from the Python standard library:
- str.find(sub): Return the lowest index in the string where substring sub is found.
- str.replace(old, new): Return a copy of the string with all occurrences of substring old replaced by new.
The idea is straightforward:
1. Find the first index of c in s
2. If none is found, return s
3. Make a substring of s starting from the next character after c
4. Remove all occurrences of c in the substring
5. Concatenate the first part of s with the updated substring
6. Return the final string
In Python:
def remove_all_but_first(s, c):
    i = s.find(c)
    if i == -1:
        return s
    i += 1
    return s[:i] + s[i:].replace(c, '')
Now you can use this function to remove all the characters you want.
def main():
    s = '100 ffffffffeet'
    dups = ['f', 't', 'x']
    print('Before:', s)
    for c in dups:
        s = remove_all_but_first(s, c)
    print('After:', s)
if __name__ == '__main__':
    main()
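As a variation on the same idea, str.partition can do the "split at the first occurrence" step in one call. This is a sketch of an alternative, not the answer's own code; it assumes c is a non-empty string, since partition raises ValueError for an empty separator.

```python
def remove_all_but_first(s, c):
    # partition returns (text before c, c itself, text after c);
    # if c is absent it returns (s, '', ''), so s comes back unchanged.
    head, sep, tail = s.partition(c)
    return head + sep + tail.replace(c, "")


print(remove_all_but_first("100 ffffffffeet", "f"))  # 100 feet
```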
Here is one way that you could do it:
string = "100 ffeet"
dups = ["f", "t"]
seen = []
kept = []
for ch in string:
    # skip a targeted character once it has already been seen
    if ch in dups and ch in seen:
        continue
    if ch in dups:
        seen.append(ch)
    kept.append(ch)
string = "".join(kept)
print(string)

Transform "4CA2CACA" to "CCCCACCACA"

I have already tried this (as somebody told me on another question):
import re
def decode(txt):
    list = []
    for cnt, char in re.findall(r"([\d*])([^d])", txt):
        list.extend((char * (int(cnt) if cnt else 1)))
    list = "".join(list)
    return list
Example:
print(decode("2CACA2CACA3CACACA3CAC"))
This is what I get
CCCCCCCCCC
And this is what I need
CCACACCACACCCACACACCCAC
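It may help to look at what the pattern in the question actually matches. Inside square brackets, `[\d*]` means "one digit or a literal asterisk" and `[^d]` means "anything except the letter d", so findall only returns digit-plus-character pairs and every character without a digit in front is dropped. A sketch of a corrected pattern follows; the regex `(\d*)(\D)` is my own suggestion, not taken from the thread.

```python
import re

txt = "2CACA2CACA3CACACA3CAC"

# The original pattern only finds the characters that follow a digit.
print(re.findall(r"([\d*])([^d])", txt))
# [('2', 'C'), ('2', 'C'), ('3', 'C'), ('3', 'C')]


def decode(txt):
    # (\d*) is an optional count, (\D) is the non-digit character to repeat.
    parts = []
    for cnt, char in re.findall(r"(\d*)(\D)", txt):
        parts.append(char * (int(cnt) if cnt else 1))
    return "".join(parts)


print(decode(txt))  # CCACACCACACCCACACACCCAC
```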
re.sub can take a named function or lambda as its second argument, and you can use this to accomplish your goal. Using this approach you simply don't do any substitution when a letter does not have a number in front of it.
import re
def decode(s):
    return re.sub(r'(\d+)([a-zA-Z])',
                  lambda m: m.group(2) * int(m.group(1)),
                  s)
decode("2CACA2CACA3CACACA3CAC")
# 'CCACACCACACCCACACACCCAC'
What you are missing is characters without a digit in front. This will include those:
import re
def decode(txt):
    _list = []
    for cnt, char, single_char in re.findall(r"(\d)([^\d])|([^\d])", txt):
        if single_char:
            _list.extend(single_char)
        else:
            _list.extend((char * (int(cnt) if cnt else 1)))
    _list = "".join(_list)
    return _list
print(decode("2CACA2CACA3CACACA3CAC"))
You can do this easily with functools and re. 1 line of code does all the work.
import re, functools
#create a partial of the `sub` method with its `repl` arg already bound
#the `lambda` takes the match, and multiplies the letter by the number
decode = functools.partial(re.compile(r'(\d+)([a-z])', re.I).sub,
                           lambda m: m.group(2)*int(m.group(1)))
rle = '4CA2CACA2CACACACA'
print(decode(rle)) #CCCCACCACACCACACACA
If you want to do it without any imports, then you could do something like this:
x = '2CACA2CACA3CACACA3CAC0M'
int_string = ""
char_list = []
for char in x:
    if char.isnumeric():
        int_string += char
        continue
    else:
        if not int_string:
            char_list.append(char)
        else:
            char_list.append(char * int(int_string))
            int_string = ""
print("".join(char_list))
This will work for any non-negative integer count, including zero, as you can see in the above example (0M drops the M).

Function to remove more than 2 consecutive repetitions of a string not working

Here's my function:
def remove_more_than_two_reps(text):
    result = list(text)
    for idx,char in enumerate(text):
        if(result[:idx].count(char) > 2):
            result.remove(char)
    return ''.join(result)
expected result:
text = 'teeeexxxxt'
result = remove_more_than_two_reps(text)
>'teexxt'
My function just returns the original string, what is the problem?
Try using append which is O(1) instead of remove which is O(n):
def remove_more_than_two_reps(text: str) -> str:
    result = []
    for ch in text:
        if len(result) < 2 or result[-1] != ch or result[-2] != ch:
            result.append(ch)
    return ''.join(result)
text = 'teeeexxxxt'
result = remove_more_than_two_reps(text)
print(result)
Output:
teexxt
Another option is a regex: the pattern (.)\1{2,} matches the same character three or more times in a row, and the replacement writes the captured character twice:
import re
def remove_more_than_two_reps(text):
    return re.sub(r'(.)\1{2,}', r'\1\1', text)
text = 'teeeexxxxt'
print(remove_more_than_two_reps(text))
Output
teexxt
Wanted to share an itertools solution, useful when you have particularly big strings (since it avoids allocating an enormous list):
import itertools as it
def remove_more_than_two_reps(text: str) -> str:
    reps_of_at_most_two = (it.islice(reps, 2) for _, reps in it.groupby(text))
    return ''.join(it.chain.from_iterable(reps_of_at_most_two))
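If the limit might change later, the groupby approach generalizes with very little effort. The sketch below is an illustration only: cap_repeats and its max_reps parameter are hypothetical names, not part of the answer above.

```python
import itertools as it


def cap_repeats(text, max_reps=2):
    # groupby yields one group per run of identical consecutive characters;
    # islice keeps at most max_reps characters from each run.
    return "".join(
        ch
        for _, run in it.groupby(text)
        for ch in it.islice(run, max_reps)
    )


print(cap_repeats("teeeexxxxt"))     # teexxt
print(cap_repeats("teeeexxxxt", 1))  # text
```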

How can we remove word with repeated single character?

I am trying to collapse words that consist of a single repeated character, using regex in Python. For example:
good => good
gggggggg => g
What I have tried so far is the following:
re.sub(r'([a-z])\1+', r'\1', 'ffffffbbbbbbbqqq')
The problem with the above solution is that it changes good to god; I only want to change words made up entirely of one repeated character.
A better approach here is to use a set
def modify(s):
    #Create a set from the string
    c = set(s)
    #If you have only one character in the set, convert set to string
    if len(c) == 1:
        return ''.join(c)
    #Else return original string
    else:
        return s
print(modify('good'))
print(modify('gggggggg'))
If you want to use regex, mark the start and end of the string in the regex with ^ and $ (inspired by bobblebubble's comment):
import re
def modify(s):
    #Create the sub string with a regex which only matches if a single character is repeated
    #Marking the start and end of string as well
    out = re.sub(r'^([a-z])\1+$', r'\1', s)
    return out
print(modify('good'))
print(modify('gggggggg'))
The output will be
good
g
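If the input is a whole sentence rather than a single word (an assumption on my part; the question only shows single words), word boundaries can play the role that ^ and $ play above. A minimal sketch, with collapse_words as a made-up name:

```python
import re


def collapse_words(text):
    # \b ... \b restricts the match to a whole word, so "good" is left alone
    # while a word made of one repeated letter collapses to that letter.
    return re.sub(r"\b([a-z])\1+\b", r"\1", text)


print(collapse_words("good gggggggg book zzz"))  # good g book z
```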
If you do not want to use a set in your method, this should do the trick:
def simplify(s):
    l = len(s)
    if l>1 and s.count(s[0]) == l:
        return s[0]
    return s
print(simplify('good'))
print(simplify('abba'))
print(simplify('ggggg'))
print(simplify('g'))
print(simplify(''))
output:
good
abba
g
g
Explanation:
1. Compute the length of the string.
2. Count the characters that are equal to the first one, and compare the count with the string length.
3. Depending on the result, return the first character or the whole string.
You can use the Trim method:
take a look at this example (note that Trim strips every leading and trailing 'g', so a string made up only of 'g' becomes empty):
"ggggggg".Trim('g');
Update:
and for characters which are in the middle of the string, use this function, thanks to this answer
in C# (needs using System.Linq):
public static string RemoveDuplicates(string input)
{
    return new string(input.ToCharArray().Distinct().ToArray());
}
in Python:
used = set()
unique = [x for x in mylist if x not in used and (used.add(x) or True)]
However, I think none of these answers handles a string like aaaaabbbbbcda: it has an a at the end, which does not appear in the result (abcd). For this kind of situation, use this function, which I wrote:
In:
def unique(s):
    used = set()
    ret = list()
    s = list(s)
    for x in s:
        if x not in used:
            ret.append(x)
            used = set()
            used.add(x)
    return ret
print(unique('aaaaabbbbbcda'))
out:
['a', 'b', 'c', 'd', 'a']
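For what it's worth, collapsing runs of consecutive duplicates like this is what itertools.groupby does out of the box, so the same result can be had in one line. A sketch, with unique_runs as an illustrative name:

```python
from itertools import groupby


def unique_runs(s):
    # groupby groups consecutive equal items; keep one key per run.
    return [key for key, _ in groupby(s)]


print(unique_runs("aaaaabbbbbcda"))  # ['a', 'b', 'c', 'd', 'a']
```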

Python: loop over consecutive characters?

In Python (specifically Python 3.0 but I don't think it matters), how do I easily write a loop over a sequence of characters having consecutive character codes? I want to do something like this pseudocode:
for Ch from 'a' to 'z' inclusive: #
    f(Ch)
Example: how about a nice "pythonic" version of the following?
def Pangram(Str):
    ''' Returns True if Str contains the whole alphabet, else False '''
    for Ch from 'a' to 'z' inclusive: #
        M[Ch] = False
    for J in range(len(Str)):
        Ch = lower(Str[J])
        if 'a' <= Ch <= 'z':
            M[Ch] = True
    return reduce(and, M['a'] to M['z'] inclusive) #
The lines marked # are pseudocode. Of course reduce() is real Python!
Dear wizards (especially old, gray-bearded wizards), perhaps you can tell that my favorite language used to be Pascal.
You have a constant in the string module called ascii_lowercase, try that out:
>>> from string import ascii_lowercase
Then you can iterate over the characters in that string.
>>> for i in ascii_lowercase:
...     f(i)
For your pangram question, there is a very simple way to find out if a string contains all the letters of the alphabet. Using ascii_lowercase as before,
>>> def pangram(str):
...     return set(ascii_lowercase).issubset(set(str))
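Note that the subset test is case-sensitive, so a sentence that only has a capital "T" would fail it. A small sketch that lowercases the input first, as the pseudocode in the question does (is_pangram is an illustrative name):

```python
from string import ascii_lowercase


def is_pangram(text):
    # issubset accepts any iterable, so the lowercased string can be passed directly.
    return set(ascii_lowercase).issubset(text.lower())


print(is_pangram("The quick brown fox jumps over the lazy dog"))  # True
print(is_pangram("hi mom"))  # False
```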
Iterating over a constant with all the characters you need is very Pythonic. However, if you don't want to import anything and are only working in Unicode, use the built-ins ord() and its inverse chr().
for code in range(ord('a'), ord('z') + 1):
    print(chr(code))
You've got to leave the Pascal-isms behind and learn Python with a fresh perspective.
>>> ascii_lowercase
'abcdefghijklmnopqrstuvwxyz'
>>> def pangram( source ):
...     return all(c in source for c in ascii_lowercase)
>>> pangram('hi mom')
False
>>> pangram(ascii_lowercase)
True
By limiting yourself to what Pascal offered, you're missing the things Python offers.
And... try to avoid reduce. It often leads to terrible performance problems.
Edit. Here's another formulation; this one collects the letters that are not used (in effect, a set difference).
>>> def pangram( source ):
...     notused = [ c for c in ascii_lowercase if c not in source ]
...     return len(notused) == 0
This one gives you a piece of diagnostic information for determining what letters are missing from a candidate pangram.
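To actually get at that diagnostic information, the unused letters can be returned instead of only being counted. A minimal sketch using real set objects; missing_letters is a hypothetical name, and lowercasing the input is my assumption.

```python
from string import ascii_lowercase


def missing_letters(text):
    # Set difference: letters of the alphabet that never occur in text.
    return sorted(set(ascii_lowercase) - set(text.lower()))


print(missing_letters("The quick brown fox jumps over the lazy dog"))  # []
print(missing_letters("The quick brown fox jumps over the lazy cat"))  # ['d', 'g']
```

An empty result means the text is a pangram.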
A more abstract answer would be something like:
>>> x="asdf"
>>> for i in range(len(x)):
...     print(x[i])
Hacky
method_1 = [chr(x) for x in range(ord('a'), ord('z')+1)]
print(method_1)
Neat
# this is the recommended method generally
from string import ascii_lowercase
method_2 = [x for x in ascii_lowercase]
print(method_2)
I would write a function similar to Python's range
def alpha_range(*args):
    if len(args) == 1:
        start, end, step = ord('a'), ord(args[0]), 1
    elif len(args) == 2:
        start, end, step = ord(args[0]), ord(args[1]), 1
    else:
        start, end, step = ord(args[0]), ord(args[1]), args[2]
    # range() in Python 3 (this was xrange in Python 2)
    return (chr(i) for i in range(start, end, step))
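Like range, alpha_range excludes its end point, while the question asks for 'a' to 'z' inclusive. Here is a sketch of an inclusive variant; char_range is a made-up name, and including the last character is the only intended difference.

```python
from string import ascii_lowercase


def char_range(first, last, step=1):
    # Adding 1 to the upper bound makes the end character inclusive.
    for code in range(ord(first), ord(last) + 1, step):
        yield chr(code)


print("".join(char_range("a", "e")))  # abcde
print("".join(char_range("a", "z")) == ascii_lowercase)  # True
```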
