Find common characters between two strings - python

I am trying to print the common letters from two different user inputs using a for loop. (I need to do it using a for loop.) I am running into two problems: 1. My statement "if char not in output..." is not pulling unique values. 2. The output is a list of individual letters rather than a single string. I tried to split the output, but split raised a type error.
wrd = 'one'
sec_wrd = 'toe'
def unique_letters(x):
    output = []
    for char in x:
        if char not in output and char != " ":
            output.append(char)
    return output
final_output = (unique_letters(wrd) + unique_letters(sec_wrd))
print(sorted(final_output))

What you want is a set intersection, and Python has the set.intersection method for exactly that. You can use it here as:
>>> word_1 = 'one'
>>> word_2 = 'toe'
# ''.join(...) turns the intersection back into a string.
# There is no need to convert word_2 to a set first:
# set.intersection accepts any iterable.
>>> ''.join(set(word_1).intersection(word_2))
'oe'
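set(word_1) holds the unique characters of the string, and set.intersection returns the characters common to both. Sets are unordered, so the letters may come out in a different order (for example 'eo'); wrap the intersection in sorted() if you need a stable order.
If a for loop is a must, you can use a list comprehension: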
>>> unique_1 = [w for w in set(word_1) if w in word_2]
# OR
# >>> unique_2 = [w for w in set(word_2) if w in word_1]
>>> ''.join(unique_1) # Or, ''.join(unique_2)
'oe'
The same result can also be achieved with an explicit for loop:
my_str = ''
for w in set(word_1):
    if w in word_2:
        my_str += w
# `my_str` now holds `'oe'` (letter order may vary, since sets are unordered)
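If the letters should come out in a predictable order, a variant of the explicit loop can walk the first word directly instead of a set. This is only a minimal sketch; the function name common_letters is made up for illustration and is not part of the original answer.

```python
def common_letters(first, second):
    """Return the letters found in both strings, in order of first appearance in `first`."""
    result = ''
    for char in first:
        # Skip spaces, letters missing from the other word, and letters already collected.
        if char != ' ' and char in second and char not in result:
            result += char
    return result

print(common_letters('one', 'toe'))  # oe
```

Because it returns a string built with a for loop, it addresses both points raised in the question without relying on set ordering.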

For this kind of problem, you're probably better off using sets:
wrd = 'one'
sec_wrd = 'toe'
wrd = set(wrd)
sec_wrd = set(sec_wrd)
print(''.join(sorted(wrd.intersection(sec_wrd))))

I solved this today on CodeSignal, and it passed all tests. Note that it returns the number of common characters, counting repeats:
def solution(s1, s2):
    common_char = ""
    for i in s1:
        if i not in common_char:
            i_in_s1 = s1.count(i)
            i_in_s2 = s2.count(i)
            comm_num = []
            comm_num.append(i_in_s1)
            comm_num.append(i_in_s2)
            comm_i = min(comm_num)
            new_char = i * comm_i
            common_char += new_char
    return len(common_char)

Function to solve the problem:
def find_common_characters(msg1, msg2):
    # set() is used to remove duplicates
    set1 = set(msg1)
    set2 = set(msg2)
    # subtract this set if you wish to exclude spaces
    remove = {" "}
    set3 = (set1 & set2) - remove
    msg = ''.join(set3)
    return msg
Providing input and calling the function (try different values for msg1 and msg2 to test your program):
msg1 = "python"
msg2 = "Python"
common_characters = find_common_characters(msg1, msg2)
print(common_characters)

Here is a one-liner if you want the number of common characters between them:
def solution(s1, s2):
    return sum(min(s1.count(x), s2.count(x)) for x in set(s1))
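The s1.count(x) and s2.count(x) calls rescan both strings for every distinct character. If the inputs may be long, a possible alternative is collections.Counter, which counts each string in one pass and whose & operator keeps the smaller count per character. The name common_count below is hypothetical.

```python
from collections import Counter

def common_count(s1, s2):
    # Counter & Counter keeps each character with the smaller of its two counts.
    shared = Counter(s1) & Counter(s2)
    return sum(shared.values())

print(common_count('aabcc', 'adcaa'))  # 3
```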

Related

How do I check if the next item in a string is the alphabetical successor of the one before? + Inverse

I'm trying to compress a string in a way that any sequence of letters in strict alphabetical order is swapped with the first letter plus the length of the sequence.
For example, the string "abcdefxylmno", would become: "a6xyl4"
Single letters that aren't in order with the one before or after just stay the way they are.
How do I check that two letters are successors (a,b) and not simply in alphabetical order (a,c)? And how do I keep iterating on the string until I find a letter that doesn't meet this requirement?
I'm also trying to do this in a way that makes it easier to write an inverse function (that given the result string gives me back the original one).
EDIT :
I've managed to get the function working, thanks to your suggestion of using the alphabet string as comparison; now I'm very much stuck on the inverse function: given "a6xyl4" expand it back into "abcdefxylmno".
After quite some time I managed to split the string every time there's a number and I made a function that expands a 2 char string, but it fails to work when I use it on a longer string:
from string import ascii_lowercase as abc
def subString(start,n):
    L=[]
    ind = abc.index(start)
    newAbc = abc[ind:]
    for i in range(len(newAbc)):
        while i < n:
            L.append(newAbc[i])
            i+=1
        res = ''.join(L)
        return res
def unpack(S):
    for i in range(len(S)-1):
        if S[i] in abc and S[i+1] not in abc:
            lett = str(S[i])
            num = int(S[i+1])
            return subString(lett,num)
def separate(S):
    lst = []
    for i in S:
        lst.append(i)
    for el in lst:
        if el.isnumeric():
            ind = lst.index(el)
            lst.insert(ind+1,"-")
    a = ''.join(lst)
    L = a.split("-")
    if S[-1].isnumeric():
        L.remove(L[-1])
        return L
    else:
        return L
def inverse(S):
    L = separate(S)
    for i in L:
        return unpack(i)
Each of these functions works on its own, but inverse(S) doesn't output anything. What's the mistake?
You can use the ord() function, which returns the integer Unicode code point of a character. Consecutive letters of the alphabet have code points that differ by 1, so you can implement a simple function:
def is_successor(a, b):
    # Reject anything that is not an ASCII letter, in case the
    # input is not validated somewhere else.
    # range() excludes its end point, hence the + 1 to include 'z' and 'Z'.
    if ord(a) not in range(ord('a'), ord('z') + 1) and ord(a) not in range(ord('A'), ord('Z') + 1):
        return False
    if ord(b) not in range(ord('a'), ord('z') + 1) and ord(b) not in range(ord('A'), ord('Z') + 1):
        return False
    # True only if b comes immediately after a
    return (ord(b) - ord(a)) == 1
For the reverse step you can use chr(), which returns the one-character string for the Unicode code point given as its argument.
This builds on the idea that acceptable subsequences will be substrings of the ABC:
from string import ascii_lowercase as abc  # 'abcdefg...'
text = 'abcdefxylmno'
stack = []
cache = ''
# collect subsequences
for char in text:
    if cache + char in abc:
        cache += char
    else:
        stack.append(cache)
        cache = char
# if present, append the last sequence
if cache:
    stack.append(cache)
# stack is now ['abcdef', 'xy', 'lmno']
# Build the final string 'a6x2l4'
result = ''.join(f'{s[0]}{len(s)}' if len(s) > 1 else s for s in stack)
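For the inverse that the question asks about, one possible approach is to read the packed string as a series of "letter plus optional number" tokens and slice each run back out of the alphabet. The sketch below assumes lowercase ASCII input only; the name expand is made up for illustration. A letter with no number after it is treated as a run of length 1, so both 'a6xyl4' and 'a6x2l4' expand to the same text.

```python
import re
from string import ascii_lowercase as abc

def expand(packed):
    parts = []
    # Each token is one lowercase letter followed by zero or more digits.
    for letter, digits in re.findall(r'([a-z])(\d*)', packed):
        length = int(digits) if digits else 1
        start = abc.index(letter)
        # Slice the run of consecutive letters out of the alphabet.
        parts.append(abc[start:start + length])
    return ''.join(parts)

print(expand('a6xyl4'))  # abcdefxylmno
print(expand('a6x2l4'))  # abcdefxylmno
```

Unlike the inverse function in the question, this collects every expanded piece before returning, rather than returning from inside the loop on the first piece.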

How can we remove word with repeated single character?

I am trying to collapse words that consist of a single repeated character using regex in Python, for example:
good => good
gggggggg => g
What I have tried so far is the following:
re.sub(r'([a-z])\1+', r'\1', 'ffffffbbbbbbbqqq')
The problem with the above solution is that it changes good to god, and I only want to change words that consist of a single repeated character.
A better approach here is to use a set:
def modify(s):
    # Create a set from the string
    c = set(s)
    # If the set holds only one character, convert the set to a string
    if len(c) == 1:
        return ''.join(c)
    # Else return the original string
    else:
        return s
print(modify('good'))
print(modify('gggggggg'))
If you want to use regex, mark the start and end of the string in the regex with ^ and $ (inspired by #bobblebubble's comment):
import re
def modify(s):
    # Substitute with a regex that matches only if the whole string is a single repeated character;
    # ^ and $ mark the start and end of the string
    out = re.sub(r'^([a-z])\1+$', r'\1', s)
    return out
print(modify('good'))
print(modify('gggggggg'))
The output will be
good
g
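If the input is a whole sentence rather than a single word (an assumption about the use case, since the question only shows single words), one option is to anchor on word boundaries with \b instead of ^ and $. The function name collapse_repeated_words is hypothetical.

```python
import re

def collapse_repeated_words(text):
    # \b([a-z])\1+\b matches a whole word made of one repeated lowercase letter.
    return re.sub(r'\b([a-z])\1+\b', r'\1', text)

print(collapse_repeated_words('good gggggggg egg'))  # good g egg
```

Words such as good and egg are left alone because the repeated letters do not span the whole word.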
If you do not want to use a set in your method, this should do the trick:
def simplify(s):
    l = len(s)
    if l > 1 and s.count(s[0]) == l:
        return s[0]
    return s
print(simplify('good'))
print(simplify('abba'))
print(simplify('ggggg'))
print(simplify('g'))
print(simplify(''))
output:
good
abba
g
g
Explanation:
Compute the length of the string.
Count the characters that are equal to the first one and compare that count with the string length.
Depending on the result, return the first character or the whole string.
You can use a trim function. Take a look at this example (C#):
"ggggggg".Trim('g');
Update:
For characters in the middle of the string, use this function (thanks to this answer).
In C#:
public static string RemoveDuplicates(string input)
{
    return new string(input.ToCharArray().Distinct().ToArray());
}
in python:
used = set()
unique = [x for x in mylist if x not in used and (used.add(x) or True)]
However, I think none of these answers handles a string like aaaaabbbbbcda: the a at the end of the string does not appear in the result (abcd). For that situation, use this function, which I wrote:
In:
def unique(s):
    used = set()
    ret = list()
    s = list(s)
    for x in s:
        if x not in used:
            ret.append(x)
            # remember only the most recent character, so a later run of it is kept
            used = set()
            used.add(x)
    return ret
print(unique('aaaaabbbbbcda'))
Out:
['a', 'b', 'c', 'd', 'a']
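Collapsing each run of consecutive equal characters, which is what the function above does, can also be sketched with itertools.groupby from the standard library. The name collapse_runs is made up for illustration, and this version returns a string rather than a list.

```python
from itertools import groupby

def collapse_runs(s):
    # groupby yields one (key, group) pair per run of equal consecutive characters.
    return ''.join(key for key, _group in groupby(s))

print(collapse_runs('aaaaabbbbbcda'))  # abcda
```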

Print the first, second occurred character in a list

I am working on a simple algorithm that prints the first character to occur a second time.
For example:
string ='abcabc'
output = a
string = 'abccba'
output = c
string = 'abba'
output = b
What I have done is:
string = 'abcabc'
s = []
for x in string:
    if x in s:
        print(x)
        break
    else:
        s.append(x)
output: a
But its time complexity is O(n^2), how can I do this in O(n)?
Change s = [] to s = set() (and the corresponding append to add). A membership test with in on a set is O(1) on average, whereas on a list it is a linear scan.
Alternatively, with regular expressions (O(n^2), but rather fast and easy):
import re
match = re.search(r'(.).*\1', string)
if match:
    print(match.group(1))
The regular expression (.).*\1 means "any character, which we'll remember for later, any number of intervening characters, then the remembered character again". Note that because the regexp is scanned left to right, it finds a in "abba" rather than the b the question asks for.
Use dictionaries:
string = 'abcabc'
s = {}
for x in string:
    if x in s:
        print(x)
        break
    else:
        s[x] = 0
or use sets:
string = 'abcabc'
s = set()
for x in string:
    if x in s:
        print(x)
        break
    else:
        s.add(x)
Both dictionaries and sets are hash-based, so membership tests take O(1) time on average.
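The set-based loop can be wrapped in a function so that it returns the character instead of printing it. This is a small sketch; the name first_repeated and the choice to return None when nothing repeats are assumptions, not part of the original answers.

```python
def first_repeated(text):
    seen = set()
    for char in text:
        # The first character we have already seen is the answer.
        if char in seen:
            return char
        seen.add(char)
    # No character occurs twice.
    return None

print(first_repeated('abcabc'))  # a
print(first_repeated('abccba'))  # c
print(first_repeated('abba'))    # b
```

Each character is visited once and each set lookup is O(1) on average, so the whole scan is O(n).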

Python lists and for loops. How do I communicate to the for loop that I intend to work on subsequent items and not the first one only?

I am new to Python and am working on a function that takes a string like abcd and returns something like A-Bb-Ccc-Dddd.
I have created the following.
`
def mumbler(s):
    chars = list(s)
    mumbled = []
    result = []
    for char in chars:
        caps = char.upper()
        num = chars.index(char)
        low = char.lower()
        mumbled.append( caps+ low*num)
    for i in mumbled:
        result.append(i+'-')
        result = ''.join(result)
    return result[:-1]
`
It works for most cases. However, when I pass a string like Abcda, it fails to return the expected output, which in this case is A-Bb-Ccc-Dddd-Aaaaa.
How should I go about solving this?
Thank you for taking the time to answer this.
You can do it in a much simpler way using list comprehension and enumerate
>>> s = 'abcd'
>>> '-'.join([c.upper() + c.lower()*i for i,c in enumerate(s)])
'A-Bb-Ccc-Dddd'
If you want to make your own code work, you just need to convert the result list to a string outside your second for loop:
def mumbler(s):
    chars = list(s)
    mumbled = []
    result = []
    for char in chars:
        caps = char.upper()
        num = chars.index(char)
        low = char.lower()
        mumbled.append( caps+ low*num)
    for i in mumbled:
        result.append(i+'-')
    result = ''.join(result)
    return result[:-1]
mumbler('Abcda')
'A-Bb-Ccc-Dddd-Aaaaa'
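One thing to watch in the loop version: chars.index(char) returns the position of the first matching element, so a letter that repeats in the same case (the second a in 'abcda', for example) gets the wrong repeat count. A possible fix, sketched here while keeping the question's function name, is to take the position from enumerate instead.

```python
def mumbler(s):
    parts = []
    # enumerate supplies the real position, even when a letter repeats.
    for position, char in enumerate(s):
        parts.append(char.upper() + char.lower() * position)
    # Joining with '-' avoids having to strip a trailing hyphen afterwards.
    return '-'.join(parts)

print(mumbler('abcda'))  # A-Bb-Ccc-Dddd-Aaaaa
```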
Go for a simple 1-liner - next() on count for maintaining the times to repeat and title() for title-casing:
from itertools import count
s = 'Abcda'
i = count(1)
print('-'.join([(x * next(i)).title() for x in s]))
# A-Bb-Ccc-Dddd-Aaaaa

Python - making a function that would add "-" between letters

I'm trying to make a function, f(x), that would add a "-" between each letter:
For example:
f("James")
should output as:
J-a-m-e-s-
I would love it if you could use simple python functions as I am new to programming. Thanks in advance. Also, please use the "for" function because it is what I'm trying to learn.
Edit:
yes, I do want the "-" after the "s".
You can try it like this:
>>> def f(n):
...     return '-'.join(n)
...
>>> f('james')
'j-a-m-e-s'
>>>
I am not sure whether you require the last hyphen.
Edit:
If you do want the trailing '-', you can do it like this:
def f(n):
    return '-'.join(n) + '-'
As a learner, it is worth knowing that str.join(iterable) is the better way to concatenate more than two strings in Python, whereas the + operator is fine for appending one string to another.
Please read following posts to explore further:
Any reason not to use + to concatenate two strings?
which is better to concat string in python?
How slow is Python's string concatenation vs. str.join?
Also, please use the "for" function because it is what I'm trying to learn
>>> def f(s):
...     m = s[0]
...     for i in s[1:]:
...         m += '-' + i
...     return m
...
>>> f("James")
'J-a-m-e-s'
m = s[0] character at the index 0 is assigned to the variable m
for i in s[1:]: iterate from the second character and
m += '-' + i append - + char to the variable m
Finally return the value of variable m
If you want a - at the end as well, you could do it like this:
>>> def f(s):
...     m = ""
...     for i in s:
...         m += i + '-'
...     return m
...
>>> f("James")
'J-a-m-e-s-'
text_list = [c+"-" for c in text]
text_strung = "".join(text_list)
As a function that takes a string as input:
def dashify(input):
    output = ""
    for ch in input:
        output = output + ch + "-"
    return output
Given you asked for a solution that uses for and a final -, simply iterate over the message and add the character and '-' to an intermediate list, then join it up. This avoids the use of string concatenations:
>>> def f(message):
...     l = []
...     for c in message:
...         l.append(c)
...         l.append('-')
...     return "".join(l)
...
>>> print(f('James'))
J-a-m-e-s-
I'm sorry, but I just have to take Alexander Ravikovich's answer a step further:
f = lambda text: "".join([c+"-" for c in text])
print(f('James')) # J-a-m-e-s-
It is never too early to learn about list comprehension.
"".join(a_list) is self-explanatory: glueing elements of a list together with a string (empty string in this example).
lambda... well that's just a way to define a function in a line. Think
square = lambda x: x**2
square(2) # returns 4
square(3) # returns 9
Python is fun, it's not {enter-a-boring-programming-language-here}.
