Python Regex to find all characters between two index values

Looking for a way to use a Python regex to extract all the characters in a string between two indexes. My code is below:
import re
txt = "Hula hoops are fun."
x = re.search(r"hoops", txt)
c = x.span()
a = c[0]
b = c[1]
print(a) # prints 5
print(b) # prints 10
txt2 = "Hula loops are fun."
z = re.???(a, b, txt2) #<------ Incorrect
print(z)
What I am trying to figure out is to somehow use a and b to get z = "loops" in txt2 (the rewrite of txt). Is there a python regex command to do this?

You can use z = txt2[a:b] to extract all characters between indices a and b.

Why not use slices (the obvious way)?
z = txt2[a:b]
print(z) # loops
If you really want to use regex, you need to consume a . character a times to reach index a, because regex has no direct indexing. Then match the next b - a characters. In your case you end up with the pattern (?<=.{5}).{5}, where (?<=.{5}) is a positive lookbehind assertion. Note that . does not match a newline unless re.DOTALL is set.
pat = rf"(?<=.{{{str(a)}}}).{{{str(b - a)}}}"
print(re.search(pat, txt2))
output:
<re.Match object; span=(5, 10), match='loops'>
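As a further option, compiled pattern objects accept optional pos and endpos arguments, so a match can be limited to an index window without building the quantifiers by hand. The sketch below compares that with plain slicing. The helper names between_slice and between_regex are made up for illustration and are not part of the re module.

```python
import re

def between_slice(text, start, end):
    # Plain slicing: the usual way to take characters by index.
    return text[start:end]

def between_regex(text, start, end):
    # Pattern.match(string, pos, endpos) starts matching at pos and
    # treats the string as if it ended at endpos.
    # re.DOTALL lets "." match newline characters as well.
    m = re.compile(r".*", re.DOTALL).match(text, start, end)
    return m.group() if m else ""

txt = "Hula hoops are fun."
a, b = re.search(r"hoops", txt).span()
txt2 = "Hula loops are fun."
print(between_slice(txt2, a, b))
print(between_regex(txt2, a, b))
```

Both calls return the same substring here; slicing remains the simpler choice unless a real pattern has to be matched inside the window.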

import re
txt = "Hula hoops are fun."
x = re.search(r"hoops", txt)
c = x.span()
a = c[0]
b = c[1]
print(a) # prints 5
print(b) # prints 10
txt2 = "Hula loops are fun."
txt3 = list(txt2)
xy = txt3[a:b]
z = ""
for item in xy:
    z = z + item
print(z)

Related

For cycle does not replace first symbol in a given range

I tried to run some Python code to remove the characters at indices 0, 3, 6, 9, etc. I decided to use a for loop for it. Question: why does the code not remove the first character?
>>> s = 'Python'
>>> a = len(s)
>>> a
6
>>> for i in range (0, a, 3):
...     b = s.replace(s[i], '')
...
>>> b
'Pyton'
>>>

You are overwriting b on each iteration. A one-liner that helps avoid this kind of mistake could be:
b = "".join([l for i, l in enumerate(s) if i % 3 != 0])
Example:
In [6]: s = "Python"
In [7]: b = "".join([l for i, l in enumerate(s) if i % 3 != 0])
In [8]: b
Out[8]: 'yton'

If you edit your code to print the variables after every loop, you'll figure out what's happening:
s = 'Python'
a = len(s)
for i in range (0, a, 3):
    b = s.replace(s[i], '')
    print(i, s, b)
print(">", b)
prints out
0 Python ython
3 Python Pyton
> Pyton
This is because you're assigning to b, but every call to replace still starts from the original, unmodified s.
You'll get closer by reassigning to s instead:
s = 'Python'
a = len(s)
for i in range (0, a, 3):
    s = s.replace(s[i], '')
    print(i, s)
print(">", s)
prints out
0 ython
3 ythn
> ythn
However, note that since each iteration shortens the string (strings are immutable, so s is rebound to a new, shorter string), the indices have shifted, and you're not removing the characters you might think you are. More so if the same character occurs more than once, as replace will remove every occurrence.
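The two pitfalls described above can be seen side by side in a small sketch. The function names drop_every_third and naive_replace are hypothetical; the first filters by index, the second mimics the replace-based loop on a string with repeated characters.

```python
def drop_every_third(s):
    # Keep only the characters whose index is not a multiple of 3.
    return "".join(ch for i, ch in enumerate(s) if i % 3 != 0)

def naive_replace(s):
    # Mimics the replace-based loop: str.replace removes EVERY
    # occurrence of the character, not just the one at index i.
    out = s
    for i in range(0, len(s), 3):
        out = out.replace(s[i], "")
    return out

print(drop_every_third("Python"))
print(drop_every_third("banana"))
print(naive_replace("banana"))
```

With "banana", the index-based version drops only positions 0 and 3, while the replace-based version also deletes every other "a".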

In each cycle you are overwriting the previous b.
Put a print call inside your loop and you will see it.
In the first cycle you get ython, and in the second cycle you get Pyton.

Generalization of Caesar Algorithm decryption for 0 to 25 shifts

I've written a small Caesar cipher decrypter (to test myself) which only works if the encrypter shifts all the characters one position to the right in the English alphabet, and that's the problem. I would appreciate any help on how to make it work with more than one shift. (I tried a few approaches but none worked, which is quite frustrating.)
text = input('ENTER CIPHER HERE:')
def decrypt(text):
    b = ''
    for i in range(len(text)):
        s1 = chr(((ord(text[i]) - 1 -65 )+ 26)%26 + 65)
        i = (i + 1)
        b = b + s1
    if len(b) == len(text):
        print(b)
decrypt(text)

You can use the built-in translate method of strings for such a simple substitution cypher.
First, we generate the alphabet (lowercase only):
In [1]: lc = ''.join(chr(j) for j in range(97, 123))
Out[1]: 'abcdefghijklmnopqrstuvwxyz'
Then we define a function to roll it:
In [2]: def roll_left(s, n):
   ...:     return s[n:] + s[:n]
   ...:
In [3]: roll_left(lc, 13)
Out[3]: 'nopqrstuvwxyzabcdefghijklm'
In [4]: roll_left(lc, 5)
Out[4]: 'fghijklmnopqrstuvwxyzabcde'
We use the original alphabet and the rolled version to make a translation table.
In [5]: t5 = str.maketrans(lc, roll_left(lc, 5))
To "encrypt" we apply that table to a string.
In [6]: 'this is a test'.translate(t5)
Out[6]: 'ymnx nx f yjxy'
To decrypt, make an inverse translation table:
In [7]: tinv5 = str.maketrans(roll_left(lc, 5), lc)
In [8]: 'ymnx nx f yjxy'.translate(tinv5)
Out[8]: 'this is a test'
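Applied to the question, the same translation-table idea can be wrapped in a function that takes the shift as a parameter and then tried for every shift from 0 to 25. This is only a sketch under the assumption that the ciphertext uses uppercase A-Z (as the 65 offset in the question suggests); caesar_decrypt and all_shifts are names invented here.

```python
import string

UPPER = string.ascii_uppercase

def caesar_decrypt(text, shift):
    # Undo a right shift of `shift` positions on uppercase A-Z;
    # any other character passes through unchanged.
    k = shift % 26
    rolled = UPPER[k:] + UPPER[:k]  # alphabet as the encrypter saw it
    return text.translate(str.maketrans(rolled, UPPER))

def all_shifts(text):
    # One candidate plaintext per shift, 0 through 25.
    return [caesar_decrypt(text, k) for k in range(26)]

for k, candidate in enumerate(all_shifts("KHOOR")):
    print(k, candidate)
```

Scanning the 26 printed candidates by eye is usually enough to spot the readable one when the shift is unknown.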

How to implement an encoder and decoder of ROT-13? [duplicate]

I am searching for a short and cool rot13 function in Python ;-)
I've written this function:
def rot13(s):
    chars = "abcdefghijklmnopqrstuvwxyz"
    trans = chars[13:]+chars[:13]
    rot_char = lambda c: trans[chars.find(c)] if chars.find(c)>-1 else c
    return ''.join( rot_char(c) for c in s )
Can anyone make it better? E.g., by supporting uppercase characters.

It's very simple:
>>> import codecs
>>> codecs.encode('foobar', 'rot_13')
'sbbone'

maketrans()/translate() solutions…
Python 2.x
import string
rot13 = string.maketrans(
"ABCDEFGHIJKLMabcdefghijklmNOPQRSTUVWXYZnopqrstuvwxyz",
"NOPQRSTUVWXYZnopqrstuvwxyzABCDEFGHIJKLMabcdefghijklm")
string.translate("Hello World!", rot13)
# 'Uryyb Jbeyq!'
Python 3.x
rot13 = str.maketrans(
'ABCDEFGHIJKLMabcdefghijklmNOPQRSTUVWXYZnopqrstuvwxyz',
'NOPQRSTUVWXYZnopqrstuvwxyzABCDEFGHIJKLMabcdefghijklm')
'Hello World!'.translate(rot13)
# 'Uryyb Jbeyq!'

This works on Python 2 (but not Python 3):
>>> 'foobar'.encode('rot13')
'sbbone'

The maketrans and translate methods of str are handy for this type of thing.
Here's a general solution:
import string
def make_rot_n(n):
    lc = string.ascii_lowercase
    uc = string.ascii_uppercase
    trans = str.maketrans(lc + uc,
                          lc[n:] + lc[:n] + uc[n:] + uc[:n])
    return lambda s: str.translate(s, trans)
rot13 = make_rot_n(13)
rot13('foobar')
# 'sbbone'

From the builtin module this.py (import this):
s = "foobar"
d = {}
for c in (65, 97):
    for i in range(26):
        d[chr(i+c)] = chr((i+13) % 26 + c)
print("".join([d.get(c, c) for c in s])) # sbbone

As of Python 3.1, string.translate and string.maketrans no longer exist. However, these methods can be used with bytes instead.
Thus, an up-to-date solution directly inspired from Paul Rubel's one, is:
rot13 = bytes.maketrans(
b"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ",
b"nopqrstuvwxyzabcdefghijklmNOPQRSTUVWXYZABCDEFGHIJKLM")
b'Hello world!'.translate(rot13)
Conversion from string to bytes and vice versa can be done with the str.encode and bytes.decode methods.
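A minimal sketch of that round trip, assuming the input is ASCII-only text (anything else makes the encode step raise UnicodeEncodeError). The names ROT13_TABLE and rot13_text are illustrative.

```python
ROT13_TABLE = bytes.maketrans(
    b"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ",
    b"nopqrstuvwxyzabcdefghijklmNOPQRSTUVWXYZABCDEFGHIJKLM")

def rot13_text(text):
    # str -> bytes, translate with the bytes table, bytes -> str.
    # Assumption: text contains only ASCII characters.
    return text.encode("ascii").translate(ROT13_TABLE).decode("ascii")

print(rot13_text("Hello world!"))
print(rot13_text(rot13_text("Hello world!")))
```

Applying the function twice returns the original text, since ROT13 is its own inverse.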

Try this:
import codecs
codecs.encode("text to be rot13()'ed", "rot_13")

In Python 3, the str codec that @amber mentioned has moved to the codecs standard-library module:
> import codecs
> codecs.encode('foo', 'rot13')
sbb

The following function rot(s, n) encodes a string s with ROT-n encoding for any integer n, with n defaulting to 13. Both upper- and lowercase letters are supported. Values of n over 26 or negative values are handled appropriately, e.g., shifting by 27 positions is equal to shifting by one position. Decoding is done with invrot(s, n).
import string
def rot(s, n=13):
    '''Encode string s with ROT-n, i.e., by shifting all letters n positions.
    When n is not supplied, ROT-13 encoding is assumed.
    '''
    upper = string.ascii_uppercase
    lower = string.ascii_lowercase
    upper_start = ord(upper[0])
    lower_start = ord(lower[0])
    out = ''
    for letter in s:
        if letter in upper:
            out += chr(upper_start + (ord(letter) - upper_start + n) % 26)
        elif letter in lower:
            out += chr(lower_start + (ord(letter) - lower_start + n) % 26)
        else:
            out += letter
    return(out)
def invrot(s, n=13):
    '''Decode a string s encoded with ROT-n-encoding
    When n is not supplied, ROT-13 is assumed.
    '''
    return(rot(s, -n))

A one-liner to rot13 a string S:
S.translate({a : a + (lambda x: 1 if x>=0 else -1)(77 - a) * 13 for a in range(65, 91)})

For arbitrary values, something like this works for 2.x
from string import ascii_uppercase as uc, ascii_lowercase as lc, maketrans
rotate = 13 # ROT13
rot = "".join([(x[:rotate][::-1] + x[rotate:][::-1])[::-1] for x in (uc,lc)])
def rot_func(text, encode=True):
    ascii = uc + lc
    src, trg = (ascii, rot) if encode else (rot, ascii)
    trans = maketrans(src, trg)
    return text.translate(trans)
text = "Text to ROT{}".format(rotate)
encode = rot_func(text)
decode = rot_func(encode, False)

This works for uppercase and lowercase. I don't know how elegant you deem it to be.
def rot13(s):
    rot=lambda x:chr(ord(x)+13) if chr(ord(x.lower())+13).isalpha()==True else chr(ord(x)-13)
    s=[rot(i) for i in filter(lambda x:x!=',',map(str,s))]
    return ''.join(s)

You can support uppercase letters on the original code posted by Mr. Walter by alternating the upper case and lower case letters.
chars = "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz"
Notice that the indices of the uppercase letters are all even, while the indices of the lowercase letters are odd.
A = 0, a = 1,
B = 2, b = 3,
C = 4, c = 5,
...
This odd-even pattern allows us to safely add the amount needed without having to worry about the case.
trans = chars[26:] + chars[:26]
The reason you add 26 is because the string has doubled in letters due to the upper case letters. However, the shift is still 13 spaces on the alphabet.
The full code:
def rot13(s):
    chars = "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz"
    trans = chars[26:]+chars[:26]
    rot_char = lambda c: trans[chars.find(c)] if chars.find(c) > -1 else c
    return ''.join(rot_char(c) for c in s)
OUTPUT (Tested with python 2.7):
print rot13("Hello World!") --> Uryyb Jbeyq!

Interesting exercise ;-) i think i have the best solution because:
no modules needed, uses only built-in functions --> no deprecation
it can be used as a one liner
based on ascii, no mapping dicts/strings etc.
Python 2 & 3 (probably Python 1):
def rot13(s):
    return ''.join([chr(ord(n) + (13 if 'Z' < n < 'n' or n < 'N' else -13)) if n.isalpha() else n for n in s])
def rot13_verbose(s):
    x = []
    for n in s:
        if n.isalpha():
            # 'n' is the 14th character in the alphabet so if a character is bigger we can subtract 13 to get rot13
            ort = 13 if 'Z' < n < 'n' or n < 'N' else -13
            x.append(chr(ord(n) + ort))
        else:
            x.append(n)
    return ''.join(x)
# crazy .min version (99 characters) disclaimer: not pep8 compatible^
def r(s):return''.join([chr(ord(n)+(13if'Z'<n<'n'or'N'>n else-13))if n.isalpha()else n for n in s])

def rot13(s):
    lower_chars = ''.join(chr(c) for c in range (97,123)) #ASCII a-z
    upper_chars = ''.join(chr(c) for c in range (65,91)) #ASCII A-Z
    lower_encode = lower_chars[13:] + lower_chars[:13] #shift 13 bytes
    upper_encode = upper_chars[13:] + upper_chars[:13] #shift 13 bytes
    output = "" #outputstring
    for c in s:
        if c in lower_chars:
            output = output + lower_encode[lower_chars.find(c)]
        elif c in upper_chars:
            output = output + upper_encode[upper_chars.find(c)]
        else:
            output = output + c
    return output
Another solution with shifting. Maybe this code helps other people to understand rot13 better.
Haven't tested it completely.

from string import maketrans, lowercase, uppercase
def rot13(message):
    lower = maketrans(lowercase, lowercase[13:] + lowercase[:13])
    upper = maketrans(uppercase, uppercase[13:] + uppercase[:13])
    return message.translate(lower).translate(upper)

I found this post when I started wondering about the easiest way to implement
rot13 into Python myself. My goals were:
Works in both Python 2.7.6 and 3.3.
Handle both upper and lower case.
Not use any external libraries.
This meets all three of those requirements. That being said, I'm sure it's not winning any code golf competitions.
def rot13(string):
    CLEAR = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
    ROT13 = 'NOPQRSTUVWXYZABCDEFGHIJKLMnopqrstuvwxyzabcdefghijklm'
    TABLE = {x: y for x, y in zip(CLEAR, ROT13)}
    return ''.join(map(lambda x: TABLE.get(x, x), string))
if __name__ == '__main__':
    CLEAR = 'Hello, World!'
    R13 = 'Uryyb, Jbeyq!'
    r13 = rot13(CLEAR)
    assert r13 == R13
    clear = rot13(r13)
    assert clear == CLEAR
This works by creating a lookup table and simply returning the original character for any character not found in the lookup table.
Update
I got to worrying about someone wanting to use this to encrypt an arbitrarily-large file (say, a few gigabytes of text). I don't know why they'd want to do this, but what if they did? So I rewrote it as a generator. Again, this has been tested in both Python 2.7.6 and 3.3.
def rot13(clear):
    CLEAR = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
    ROT13 = 'NOPQRSTUVWXYZABCDEFGHIJKLMnopqrstuvwxyzabcdefghijklm'
    TABLE = {x: y for x, y in zip(CLEAR, ROT13)}
    for c in clear:
        yield TABLE.get(c, c)
if __name__ == '__main__':
    CLEAR = 'Hello, World!'
    R13 = 'Uryyb, Jbeyq!'
    r13 = ''.join(rot13(CLEAR))
    assert r13 == R13
    clear = ''.join(rot13(r13))
    assert clear == CLEAR

I couldn't leave this question here without a single statement using the modulo operator.
def rot13(s):
    return ''.join([chr(x.islower() and ((ord(x) - 84) % 26) + 97
                        or x.isupper() and ((ord(x) - 52) % 26) + 65
                        or ord(x))
                    for x in s])
This is not pythonic nor good practice, but it works!
>> rot13("Hello World!")
Uryyb Jbeyq!

You can also use this:
def n3bu1A(n):
    o=""
    key = {
        'a':'n', 'b':'o', 'c':'p', 'd':'q', 'e':'r', 'f':'s', 'g':'t', 'h':'u',
        'i':'v', 'j':'w', 'k':'x', 'l':'y', 'm':'z', 'n':'a', 'o':'b', 'p':'c',
        'q':'d', 'r':'e', 's':'f', 't':'g', 'u':'h', 'v':'i', 'w':'j', 'x':'k',
        'y':'l', 'z':'m', 'A':'N', 'B':'O', 'C':'P', 'D':'Q', 'E':'R', 'F':'S',
        'G':'T', 'H':'U', 'I':'V', 'J':'W', 'K':'X', 'L':'Y', 'M':'Z', 'N':'A',
        'O':'B', 'P':'C', 'Q':'D', 'R':'E', 'S':'F', 'T':'G', 'U':'H', 'V':'I',
        'W':'J', 'X':'K', 'Y':'L', 'Z':'M'}
    for x in n:
        v = x in key.keys()
        if v == True:
            o += (key[x])
        else:
            o += x
    return o
Yes = n3bu1A("N zhpu fvzcyre jnl gb fnl Guvf vf zl Zragbe!!")
print(Yes)

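Short solution (Python 2 only: range(65, 91)+range(97, 123) concatenates two lists, which raises a TypeError on Python 3):
def rot13(text):
    return "".join([x if ord(x) not in range(65, 91)+range(97, 123) else
                    chr(((ord(x)-97+13)%26)+97) if x.islower() else
                    chr(((ord(x)-65+13)%26)+65) for x in text])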

What is the simplest way to swap each pair of adjoining chars in a string with Python?

I want to swap each pair of characters in a string. '2143' becomes '1234', 'badcfe' becomes 'abcdef'.
How can I do this in Python?

oneliner:
>>> s = 'badcfe'
>>> ''.join([ s[x:x+2][::-1] for x in range(0, len(s), 2) ])
'abcdef'
s[x:x+2] returns string slice from x to x+2; it is safe for odd len(s).
[::-1] reverses the string in Python
range(0, len(s), 2) returns 0, 2, 4, 6 ... while x < len(s)

The usual way to swap two items in Python is:
a, b = b, a
So it would seem to me that you would just do the same with an extended slice. However, it is slightly complicated because strings aren't mutable; so you have to convert to a list and then back to a string.
Therefore, I would do the following:
>>> s = 'badcfe'
>>> t = list(s)
>>> t[::2], t[1::2] = t[1::2], t[::2]
>>> ''.join(t)
'abcdef'
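The extended-slice assignment above needs both slices to have the same length, so it raises a ValueError on odd-length input. One possible way around that, sketched here with a hypothetical helper swap_pairs_slices, is to restrict both slices to the even-length prefix and leave a trailing character in place.

```python
def swap_pairs_slices(s):
    t = list(s)
    n = len(t) - len(t) % 2  # length of the even prefix
    # Both slices now cover exactly n // 2 items, so the swap is safe.
    t[0:n:2], t[1:n:2] = t[1:n:2], t[0:n:2]
    return "".join(t)

print(swap_pairs_slices("badcfe"))
print(swap_pairs_slices("badcf"))
```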

Here's one way...
>>> s = '2134'
>>> def swap(c, i, j):
...     c = list(c)
...     c[i], c[j] = c[j], c[i]
...     return ''.join(c)
...
>>> swap(s, 0, 1)
'1234'
>>>

''.join(s[i+1]+s[i] for i in range(0, len(s), 2)) # 10.6 usec per loop
or
''.join(x+y for x, y in zip(s[1::2], s[::2])) # 10.3 usec per loop
or if the string can have an odd length:
''.join(x+y for x, y in itertools.izip_longest(s[1::2], s[::2], fillvalue=''))
Note that this won't work with old versions of Python (if I'm not mistaken, older than 2.5).
The benchmark was run on python-2.7-8.fc14.1.x86_64 and a Core 2 Duo 6400 CPU with s='0123456789'*4.
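On Python 3, itertools.izip_longest is spelled itertools.zip_longest. A short sketch of the odd-length-safe variant under that assumption, with swap_pairs_zip as an illustrative name:

```python
from itertools import zip_longest

def swap_pairs_zip(s):
    # Pair each odd-index character with the even-index character
    # before it; fillvalue="" covers a trailing unpaired character.
    return "".join(x + y for x, y in zip_longest(s[1::2], s[::2], fillvalue=""))

print(swap_pairs_zip("badcfe"))
print(swap_pairs_zip("abcde"))
```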

If performance or elegance is not an issue, and you just want clarity and have the job done then simply use this:
def swap(text, ch1, ch2):
    text = text.replace(ch2, '!',)
    text = text.replace(ch1, ch2)
    text = text.replace('!', ch1)
    return text
This allows you to swap or simply replace chars or substring.
For example, to swap 'ab' <-> 'de' in a text:
_str = "abcdefabcdefabcdef"
print swap(_str, 'ab','de') #decabfdecabfdecabf

Loop over length of string by twos and swap:
def oddswap(st):
    s = list(st)
    for c in range(0,len(s),2):
        t=s[c]
        s[c]=s[c+1]
        s[c+1]=t
    return "".join(s)
giving:
>>> s
'foobar'
>>> oddswap(s)
'ofbora'
and fails on odd-length strings with an IndexError exception.

There is no need to make a list. The following works for even-length strings:
r = ''
for i in range(0, len(s), 2):
    r += s[i + 1] + s[i]
s = r

A more general answer... you can do any single pairwise swap with tuples or strings using this approach:
# item can be a string or tuple and swap can be a list or tuple of two
# indices to swap
def swap_items_by_copy(item, swap):
    s0 = min(swap)
    s1 = max(swap)
    if isinstance(item,str):
        return item[:s0]+item[s1]+item[s0+1:s1]+item[s0]+item[s1+1:]
    elif isinstance(item,tuple):
        return item[:s0]+(item[s1],)+item[s0+1:s1]+(item[s0],)+item[s1+1:]
    else:
        raise ValueError("Type not supported")
Then you can invoke it like this:
>>> swap_items_by_copy((1,2,3,4,5,6),(1,2))
(1, 3, 2, 4, 5, 6)
>>> swap_items_by_copy("hello",(1,2))
'hlelo'
>>>
Thankfully python gives empty strings or tuples for the cases where the indices refer to non existent slices.

To swap the characters at positions l and r of a string a (with l < r):
def swap(a, l, r):
    a = a[0:l] + a[r] + a[l+1:r] + a[l] + a[r+1:]
    return a
Example:
swap("aaabcccdeee", 3, 7) returns "aaadcccbeee"

Do you want the digits sorted? Or are you swapping odd/even indexed digits? Your example is totally unclear.
Sort:
s = '2143'
p=list(s)
p.sort()
s = "".join(p)
s is now '1234'. The trick is here that list(string) breaks it into characters.

Like so:
>>> s = "2143658709"
>>> ''.join([s[i+1] + s[i] for i in range(0, len(s), 2)])
'1234567890'
>>> s = "badcfe"
>>> ''.join([s[i+1] + s[i] for i in range(0, len(s), 2)])
'abcdef'

re.sub(r'(.)(.)',r"\2\1",'abcdef1234')
However re is a bit slow.
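def swap(s):
    i=iter(s)
    while True:
        try:
            a,b=next(i),next(i)
        except StopIteration:
            # Python 3.7+ (PEP 479): end the generator explicitly;
            # a trailing unpaired character is dropped
            return
        yield b
        yield a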
''.join(swap("abcdef1234"))

One more way:
>>> s='123456'
>>> ''.join([''.join(el) for el in zip(s[1::2], s[0::2])])
'214365'

>>> import ctypes
>>> s = 'abcdef'
>>> mutable = ctypes.create_string_buffer(s)
>>> for i in range(0,len(s),2):
...     mutable[i], mutable[i+1] = mutable[i+1], mutable[i]
...
>>> s = mutable.value
>>> print s
badcfe

def revstr(a):
    b=''
    if len(a)%2==0:
        for i in range(0,len(a),2):
            b += a[i + 1] + a[i]
        a=b
    else:
        c=a[-1]
        for i in range(0,len(a)-1,2):
            b += a[i + 1] + a[i]
        b=b+a[-1]
        a=b
    return b
a=raw_input('enter a string')
n=revstr(a)
print n

A bit late to the party, but there is actually a pretty simple way to do this:
The index sequence you are looking for can be expressed as the sum of two sequences:
0 1 2 3 ...
+1 -1 +1 -1 ...
Both are easy to express. The first one is just range(N). A sequence that toggles for each i in that range is i % 2. You can adjust the toggle by scaling and offsetting it:
i % 2 -> 0 1 0 1 ...
1 - i % 2 -> 1 0 1 0 ...
2 * (1 - i % 2) -> 2 0 2 0 ...
2 * (1 - i % 2) - 1 -> +1 -1 +1 -1 ...
The entire expression simplifies to i + 1 - 2 * (i % 2), which you can use to join the string almost directly:
result = ''.join(string[i + 1 - 2 * (i % 2)] for i in range(len(string)))
This will work only for an even-length string, so you can check for overruns using min:
N = len(string)
result = ''.join(string[min(i + 1 - 2 * (i % 2), N - 1)] for i in range(N))
Basically a one-liner, doesn't require any iterators beyond a range over the indices, and some very simple integer math.
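To make the index arithmetic above concrete, the following sketch lists the source index used for each output position and then builds the swapped string from it. The function names swapped_indices and swap_by_index are made up for this illustration.

```python
def swapped_indices(n):
    # For position i, take the neighbour: i + 1 for even i, i - 1 for odd i.
    # min(..., n - 1) keeps the last index in range when n is odd.
    return [min(i + 1 - 2 * (i % 2), n - 1) for i in range(n)]

def swap_by_index(string):
    return "".join(string[j] for j in swapped_indices(len(string)))

print(swapped_indices(6))
print(swapped_indices(5))
print(swap_by_index("badcfe"))
print(swap_by_index("abcde"))
```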

While the above solutions work, here is a very simple one in layman's terms. Someone still learning Python and strings can use the other answers, but may not understand how they work or what each part of the code does without a full explanation from the poster, as opposed to "this works". The following swaps every pair of adjacent characters in a string and is easy for beginners to follow.
It iterates through the string by twos (starting from index 0) and builds a new string (swapped_pair) by adding the character at the current index + 1 (the second character of the pair) and then the character at the current index (the first character). In other words, index 1 is put at index 0, index 0 is put at index 1, and this repeats for every pair in the string.
I also added code to make sure the string is of even length, since the loop only works on complete pairs.
DrSanjay Bhakkad post above is also a good one that works for even or odd strings and is basically doing the same function as below.
string = "abcdefghijklmnopqrstuvwxyz123"
# use this prior to below iteration if string needs to be even but is possibly odd
if len(string) % 2 != 0:
    string = string[:-1]
# iteration to swap every second character in string
# (stops before an unpaired last character, so odd lengths do not raise IndexError)
swapped_pair = ""
for i in range(0, len(string) - 1, 2):
    swapped_pair += (string[i + 1] + string[i])
# use this after above iteration for any even or odd length of strings
# (only has an effect when the "needs to be even" block above is left out)
if len(string) % 2 != 0:
    swapped_pair += string[-1]
print(swapped_pair)
badcfehgjilknmporqtsvuxwzy21 # output if the "needs to be even" code used
badcfehgjilknmporqtsvuxwzy213 # output if the "even or odd" code used

One of the easiest ways to swap the first two characters of a string is:
inputString = '2134'
extractChar = inputString[0:2]
swapExtractedChar = extractChar[::-1]  # Reverse the order of string
swapFirstTwoChar = swapExtractedChar + inputString[2:]
# swapFirstTwoChar = inputString[0:2][::-1] + inputString[2:]  # For one line code
print(swapFirstTwoChar)

#Works on even/odd size strings
s = '2143657'  # renamed from str to avoid shadowing the built-in
new_s = ''
for i in range(len(s)//2):
    new_s += s[i*2+1] + s[i*2]
if len(s)%2 != 0:
    new_s += s[-1]
print(new_s)

#Think about how index works with string in Python,
>>> a = "123456"
>>> a[::-1]
'654321'

Using a caesarian cipher on a string of text in python?

I'm trying to slowly knock out all of the intricacies of python. Basically, I'm looking for some way, in python, to take a string of characters and push them all over by 'x' characters.
For example, inputting abcdefg will give me cdefghi (if x is 2).

My first version:
>>> key = 2
>>> msg = "abcdefg"
>>> ''.join( map(lambda c: chr(ord('a') + (ord(c) - ord('a') + key)%26), msg) )
'cdefghi'
>>> msg = "uvwxyz"
>>> ''.join( map(lambda c: chr(ord('a') + (ord(c) - ord('a') + key)%26), msg) )
'wxyzab'
(Of course it works as expected only if msg is lowercase...)
edit: I definitely second David Raznick's answer:
>>> import string
>>> alphabet = "abcdefghijklmnopqrstuvwxyz"
>>> key = 2
>>> tr = string.maketrans(alphabet, alphabet[key:] + alphabet[:key])
>>> "abcdefg".translate(tr)
'cdefghi'

I think your best bet is to look at string.translate. You may have to use string.maketrans to make the mapping you like.

I would do it this way (for conceptual simplicity):
def encode(s):
    l = [ord(i) for i in s]
    return ''.join([chr(i + 2) for i in l])
Point being that you convert the letter to ASCII, add 2 to that code, convert it back, and "cast" it into a string (create a new string object). This also makes no conversions based on "case" (upper vs. lower).
Potential optimizations/research areas:
Use of StringIO module for large strings
Apply this to Unicode (not sure how)

This solution works for both lowercase and uppercase:
from string import lowercase, uppercase
def caesar(text, key):
    result = []
    for c in text:
        if c in lowercase:
            idx = lowercase.index(c)
            idx = (idx + key) % 26
            result.append(lowercase[idx])
        elif c in uppercase:
            idx = uppercase.index(c)
            idx = (idx + key) % 26
            result.append(uppercase[idx])
        else:
            result.append(c)
    return "".join(result)
Here is a test:
>>> caesar("abcdefg", 2)
'cdefghi'
>>> caesar("z", 1)
'a'
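The names lowercase and uppercase can only be imported from string on Python 2. A rough Python 3 counterpart, assuming ASCII letters only, could build a single translation table for both cases; caesar_py3 is a name chosen here for illustration and is not from the answer above.

```python
import string

def caesar_py3(text, key):
    lc, uc = string.ascii_lowercase, string.ascii_uppercase
    k = key % 26  # also normalises negative and large keys
    table = str.maketrans(lc + uc, lc[k:] + lc[:k] + uc[k:] + uc[:k])
    # Characters missing from the table (digits, punctuation) stay unchanged.
    return text.translate(table)

print(caesar_py3("abcdefg", 2))
print(caesar_py3("Hello, World!", 3))
print(caesar_py3(caesar_py3("Hello, World!", 3), -3))
```

Shifting by the negated key reverses the encoding, so the same function covers decryption.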

Another version. Allows for definition of your own alphabet, and doesn't translate any other characters (such as punctuation). The ugly part here is the loop, which might cause performance problems. I'm not sure about python but appending strings like this is a big no in other languages like Java and C#.
def rotate(data, n):
    alphabet = list("abcdefghijklmnopqrstuvwxyz")
    n = n % len(alphabet)
    target = alphabet[n:] + alphabet[:n]
    translation = dict(zip(alphabet, target))
    result = ""
    for c in data:
        if c in translation:
            result += translation[c]
        else:
            result += c
    return result
print rotate("foobar", 1)
print rotate("foobar", 2)
print rotate("foobar", -1)
print rotate("foobar", -2)
Result:
gppcbs
hqqdct
ennazq
dmmzyp
The maketrans() solution suggested by others is the way to go here.
