Comparing two strings in python - python

I need to compare two strings that are almost the same and find the point at which they differ, using Python. Any help?
For example, take two strings A and B:
A = 'oooooSooooooooooooooooooRoMooooooAooooooooooooooo'
B = 'oooooSooooooooooooooooooooMooooooAooooooooooooooo'
Thanks

I'd suggest using difflib, which ships with every standard Python installation. It provides the handy function ndiff.
>>> import difflib
>>> print("\n".join(difflib.ndiff([A], [B])))
- oooooSooooooooooooooooooRoMooooooAooooooooooooooo
? ^
+ oooooSooooooooooooooooooooMooooooAooooooooooooooo
? ^
>>>

For strings of the same length, or if only the length of the shorter string matters:
def diffindex(string1, string2):
    for i, (char1, char2) in enumerate(zip(string1, string2)):
        if char1 != char2:
            return i
    return -1
For strings of different lengths, import zip_longest:
from itertools import zip_longest
Then replace the for line with this one:
for i, (char1, char2) in enumerate(zip_longest(string1, string2)):
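Putting the two pieces together, a minimal sketch of the zip_longest variant might look like this. The function name first_difference and the sample strings are made up for illustration; zip_longest pads the shorter string with None, so a length mismatch is reported at the end of the shorter string:

```python
from itertools import zip_longest

def first_difference(string1, string2):
    """Index of the first differing position, or -1 if the strings are equal."""
    # zip_longest pads the shorter string with None, so a length
    # mismatch shows up as a difference at the end of the shorter string.
    for i, (char1, char2) in enumerate(zip_longest(string1, string2)):
        if char1 != char2:
            return i
    return -1

print(first_difference('spam', 'spin'))    # 2
print(first_difference('spam', 'spammy'))  # 4
print(first_difference('spam', 'spam'))    # -1
```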

Some hints.
Strings have a length:
print(len(A))
You can access individual characters by index:
print(A[0])
The range function generates a sequence of integers:
for i in range(10):
    print(i)
You can check whether two characters are equal:
'a' == 'a'  # True
'a' == 'b'  # False

1. Treat the two strings as sequences of letters, A[] and B[].
2. Inside a loop, compare the letters at the same index.
3. Run the loop from 0 to the length of the string, keeping a count of the current index.
4. The count at which the comparison first fails (A[count] == B[count] is False) is the point where the strings differ.
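The steps above can be sketched as a plain index loop. The function name first_mismatch is made up for illustration, and returning the length of the shorter string when one string is a prefix of the other is an assumption, since the steps do not cover strings of different lengths:

```python
def first_mismatch(a, b):
    """Follow the numbered steps with an explicit index loop."""
    shared = min(len(a), len(b))
    # Only compare positions that exist in both strings.
    for count in range(shared):
        if a[count] != b[count]:
            return count
    # No mismatch in the shared part: the strings differ only if
    # their lengths differ (assumption: report the shorter length).
    return -1 if len(a) == len(b) else shared

print(first_mismatch('hello', 'help'))  # 3
print(first_mismatch('abc', 'abc'))     # -1
```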

Related

How does comparing two chars (within a string) work in Python

I am starting to learn Python and looked at following website: https://www.w3resource.com/python-exercises/string/
I work on #4 which is "Write a Python program to get a string from a given string where all occurrences of its first char have been changed to '$', except the first char itself."
str="restart"
char=str[0]
print(char)
strcpy=str
i=1
for i in range(len(strcpy)):
    print(strcpy[i], "\n")
    if strcpy[i] is char:
        strcpy=strcpy.replace(strcpy[i], '$')
print(strcpy)
I would expect "resta$t" but the actual result is: $esta$t
Thank you for your help!
There are two issues. First, you are not starting the iteration where you think you are:
i = 1 # great, i is 1
for i in range(5):
    print(i)
0
1
2
3
4
The loop variable overwrites i, so the iteration starts at 0, not 1.
Second, is does not test value equality; it tests whether two names refer to the same object. Value equality is what the == operator is for. Small ints and short strings can make it seem like is works this way, because CPython often reuses those objects, but other types do not behave like that:
>>> a, b = 5, 5
>>> a is b
True
>>> a, b = "5", "5"
>>> a is b
True
>>> a == b
True
>>> # This doesn't work the same way for lists
>>> a, b = [], []
>>> a is b
False
>>> a == b
True
As @Kevin pointed out in the comments, 99% of the time the is operator is not the one you want.
As far as your code goes, str.replace replaces every occurrence of its first argument with the second, unless you pass an optional count of occurrences to replace. To avoid replacing the first character, grab it separately, like val = somestring[0], then run replace on the rest of the string using a slice. No iteration is needed:
somestr = 'restart' # don't use str as a variable name
val = somestr[0] # val is 'r'
# somestr[1:] gives 'estart'
x = somestr[1:].replace(val, '$')
print(val+x)
# resta$t
If you still want to iterate, you can do that over the slice as well:
# collect your letters into a list
letters = []
char = somestr[0]
for letter in somestr[1:]: # No need to track an index here
    if letter == char: # don't use is, use == for value comparison
        letter = '$' # change letter to a different value if it is equal to char
    letters.append(letter)
# Then use join to concatenate back to a string
print(char + ''.join(letters))
# resta$t
Your code needs a small modification: run replace only on the part of the string after the first character. No loop is needed.
strcpy = "restart"
strcpy = strcpy[0] + strcpy[1:].replace(strcpy[0], '$')
print(strcpy)
# resta$t
It is also good practice in Python to wrap such logic in a function. You can use this one:
def charreplace(s):
    return s[0] + s[1:].replace(s[0], '$')
charreplace("restart")
# 'resta$t'
Hope this helps.

Occurrence of a letter case sensitive

I am trying to count the occurrences of the letters 'b' and 'B'. The code I have written works. Is there a better way to do this?
My code:
def count_letter_b(string):
    #TODO: Your code goes here
    a = int(string.count('B'))
    b = int(string.count('b'))
    return a + b
print(count_letter_b("Bubble Bungle"))
You can turn the string to uppercase (or lowercase), then count the occurrences:
string.upper().count('B')
So, overall, your code will look like this:
def count_letter_b(string):
    return string.upper().count('B')
Note: no need to cast to int(..) as the result of str.count is already an int
If you want to apply the same computation to a varying set of letters, you may want to pass them as arguments (count_letter(s, letters)). In any case, here is a more functional example:
def count_letter_b(string):
    return sum(map(string.count, 'Bb'))
This uses the str.count method that is bound to your input string instance.
Note that using string as a parameter name shadows the name of the standard library's string module inside the function.
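The count_letter(s, letters) idea mentioned above could look roughly like the following. This is only a sketch: the name count_letters, the parameter name text (chosen to avoid shadowing string) and the use of casefold for case-insensitive matching are assumptions, not part of the original answer:

```python
def count_letters(text, letters):
    """Count the characters of text that match any of letters, ignoring case."""
    wanted = set(letters.casefold())
    return sum(1 for ch in text.casefold() if ch in wanted)

print(count_letters("Bubble Bungle", "b"))   # 4
print(count_letters("Bubble Bungle", "bu"))  # 6
```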
You could do
# count in upper string, upper character
def countInvariantChars(c,s):
    return s.upper().count(c.upper())
# list comprehensions + length
def countInvariantChars2(c,s):
    return len([x for x in s if c.upper() == x.upper()])
# sum ones of list comprehension
def countInvariantChars3(c,s):
    return sum([1 for x in s if c.upper() == x.upper()])
print(countInvariantChars("b","Bubble Bungle"))
print(countInvariantChars2("b","Bubble Bungle"))
print(countInvariantChars3("b","Bubble Bungle"))
Output (pyfiddle.io):
4
4
4
Use this:
def count_letter_b(string):
    return string.lower().count('b')
print(count_letter_b("Bubble Bungle"))

How would one alternately add 2 characters into a string in python?

Like, for example, I have the string '12345' and the string '+*' and I want to make it so that the new string would be '1+2*3+4*5', alternating between the two characters in the second string. I know how to do it with one character using join(), but I just can't figure out how to do it with both alternating. Any help would be greatly appreciated. Thanks!
You could use itertools.cycle() to forever alternate between the characters:
from itertools import cycle
result = ''.join([c for pair in zip(inputstring, cycle('+*')) for c in pair])[:-1]
You do need to remove the trailing character that gets added after the last digit (a + here), but otherwise this works just fine:
>>> from itertools import cycle
>>> inputstring = '12345'
>>> ''.join([c for pair in zip(inputstring, cycle('+*')) for c in pair])[:-1]
'1+2*3+4*5'
import itertools
s = '12345'
op = '+*'
answer = ''.join(itertools.chain.from_iterable(zip(s, itertools.cycle(op))))[:-1]
print(answer)
Output:
1+2*3+4*5
You could use this code:
string = "12345"
separator = "+*"
result = ""
for i, c in enumerate(string):  # enumerate yields (index, character) pairs
    result += c  # append the character
    if i == len(string) - 1:  # last character: no separator after it
        break
    if i % 2 == 0:  # even index
        result += separator[0]  # append +
    else:
        result += separator[1]  # append *
print(result)  # output: 1+2*3+4*5
The following works without having to trim the end for an odd-length input such as '12345' (written for Python 3, where izip_longest is called zip_longest):
from itertools import zip_longest
''.join(a + b for a, b in zip_longest('12345', '+*' * (len('12345') // 2), fillvalue=''))
From the Python documentation:
itertools.zip_longest(*iterables, fillvalue=None): Make an iterator that aggregates elements from each of the iterables. If the iterables are of uneven length, missing values are filled-in with fillvalue. Iteration continues until the longest iterable is exhausted.
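For inputs of any length, one way to avoid both the trimming and the length arithmetic is to add a separator only before each character after the first. A minimal sketch, where the function name interleave is made up for illustration and separators is assumed to be non-empty:

```python
from itertools import cycle

def interleave(text, separators):
    """Put the separators, taken in rotation, between the characters of text."""
    if not text:
        return text
    ops = cycle(separators)  # assumes separators is not empty
    pieces = [text[0]]
    for ch in text[1:]:
        pieces.append(next(ops))  # separator goes before each later character
        pieces.append(ch)
    return ''.join(pieces)

print(interleave('12345', '+*'))  # 1+2*3+4*5
print(interleave('1234', '+*'))   # 1+2*3+4
```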

How to remove duplicates only if consecutive in a string? [duplicate]

For a string such as '12233322155552', by removing the duplicates, I can get '1235'.
But what I want to keep is '1232152', only removing the consecutive duplicates.
import re
# Only repeated numbers
answer = re.sub(r'(\d)\1+', r'\1', '12233322155552')
# Any repeated character
answer = re.sub(r'(.)\1+', r'\1', '12233322155552')
You can use itertools; here is a one-liner:
>>> import itertools
>>> s = '12233322155552'
>>> ''.join(i for i, _ in itertools.groupby(s))
'1232152'
Microsoft / Amazon job interview type of question:
This is the pseudocode, the actual code is left as exercise.
for each char in the string do:
    if the current char is equal to the next char:
        delete next char
    else
        continue
return string
At a higher level, try (not an actual implementation):
for s in string:
    if s == s+1: ## check until the end of the string
        delete s+1
Hint: the itertools module is super-useful. One function in particular, itertools.groupby, might come in really handy here:
itertools.groupby(iterable[, key])
Make an iterator that returns consecutive keys and groups from
the iterable. The key is a function computing a key value for each
element. If not specified or is None, key defaults to an identity
function and returns the element unchanged. Generally, the iterable
needs to already be sorted on the same key function.
So since strings are iterable, what you could do is:
use groupby to collect neighbouring elements
extract the keys from the iterator returned by groupby
join the keys together
which can all be done in one clean line..
First of all, you can't remove anything from a string in Python (google "Python immutable string" if this is not clear).
My first approach would be:
foo = '12233322155552'
bar = ''
for ch in foo:
    if bar == '' or ch != bar[-1]:
        bar += ch
or, using the itertools hint from above:
from itertools import groupby
bar = ''.join(k for k, _ in groupby(foo))
+1 for groupby. Off the cuff, something like:
from itertools import groupby
def remove_dupes(arg):
    # create generator of distinct characters, ignore grouper objects
    unique = (i[0] for i in groupby(arg))
    return ''.join(unique)
Cooks for me in Python 2.7.2
number = '12233322155552'
temp_list = []
for item in number:
    if len(temp_list) == 0:
        temp_list.append(item)
    elif len(temp_list) > 0:
        if temp_list[-1] != item:
            temp_list.append(item)
print(''.join(temp_list))
This would be a way:
def fix(a):
    result = []  # don't use list as a variable name
    for element in a:
        # start the list with the first element
        if len(result) == 0:
            result.append(element)
        # append only if it differs from the last element kept
        if result[-1] != element:
            result.append(element)
    print(''.join(result))
a = 'GGGGiiiiniiiGinnaaaaaProtijayi'
fix(a)
# output => GiniGinaProtijayi
import re
t = '12233322155552'
for i in t:
    dup = re.escape(i + i)  # escape in case the character is special in a regex
    t = re.sub(dup, i, t)
The final value of t is 1232152.

Swapping every second character in a string in Python

I have the following problem: I would like to write a function in Python which, given a string, returns a string where every group of two characters is swapped.
For example given "ABCDEF" it returns "BADCFE".
The length of the string would be guaranteed to be an even number.
Can you help me do this in Python?
To add another option:
>>> s = 'abcdefghijkl'
>>> ''.join([c[1] + c[0] for c in zip(s[::2], s[1::2])])
'badcfehgjilk'
import re
print(re.sub(r'(.)(.)', r'\2\1', "ABCDEF"))
from itertools import chain, zip_longest
''.join(chain.from_iterable(zip_longest(s[1::2], s[::2], fillvalue='')))
You can also use islice objects instead of regular slices if you have very large strings or just want to avoid the copying.
Works for odd length strings even though that's not a requirement of the question.
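The islice variant mentioned above might be sketched as follows. The function name swap_pairs is made up for illustration; each islice call walks the string with its own iterator, so no sliced copies of the string are created:

```python
from itertools import chain, islice, zip_longest

def swap_pairs(s):
    """Swap every pair of characters; an unpaired last character stays in place."""
    odds = islice(s, 1, None, 2)   # characters at indices 1, 3, 5, ...
    evens = islice(s, 0, None, 2)  # characters at indices 0, 2, 4, ...
    return ''.join(chain.from_iterable(zip_longest(odds, evens, fillvalue='')))

print(swap_pairs('ABCDEF'))  # BADCFE
print(swap_pairs('abc'))     # bac
```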
While the above solutions do work, there is a simpler solution in "layman's" terms. Someone still learning Python and strings can use the other answers, but may not understand what each part of the code is doing without a fuller explanation. The following swaps every second character in a string and is easy for beginners to follow.
It iterates through the string in steps of two, starting from index 0, and builds a new string (swapped_pair) by adding the character at the current index + 1 (the second character of the pair) followed by the character at the current index (the first character). In other words, index 1 is put at index 0 and index 0 at index 1, and this repeats through the whole string.
The swap itself only works on complete pairs, so the code also shows two ways to deal with an odd-length string: drop the last character, or keep it unchanged at the end.
string = "abcdefghijklmnopqrstuvwxyz123"
# Option 1: uncomment to drop the last character if the string must be of even length
# if len(string) % 2 != 0:
#     string = string[:-1]
# swap every pair of characters; a trailing unpaired character is skipped
swapped_pair = ""
for i in range(0, len(string) - 1, 2):
    swapped_pair += string[i + 1] + string[i]
# Option 2: for odd-length strings, keep the unpaired last character
if len(string) % 2 != 0:
    swapped_pair += string[-1]
print(swapped_pair)
# badcfehgjilknmporqtsvuxwzy21   (output with option 1 enabled)
# badcfehgjilknmporqtsvuxwzy213  (output as written, using option 2)
Here's a nifty recursive solution:
def swapem(s):
    if len(s) < 2:
        return s
    return "%s%s%s" % (s[1], s[0], swapem(s[2:]))
for text in ("", "a", "ab", "abcdefgh", "abcdefghi"):
    print("[%s] -> [%s]" % (text, swapem(text)))
though possibly not suitable for large strings :-)
Output is:
[] -> []
[a] -> [a]
[ab] -> [ba]
[abcdefgh] -> [badcfehg]
[abcdefghi] -> [badcfehgi]
If you prefer one-liners (in Python 3, reduce has to be imported from functools):
from functools import reduce
''.join(reduce(lambda x,y: x+y,[[s[1+(x<<1)],s[x<<1]] for x in range(0,len(s)>>1)]))
Here's another simple solution:
"".join([s[i:i+2][::-1] for i in range(0, len(s), 2)])
