Split string into tuple (UPPER, lower), e.g. 'ABCDefgh' (Python 2.7.6)

my_string = 'ABCDefgh'
desired = ('ABCD','efgh')
The only way I can think of is a for loop that checks each character of the string individually, appends it to one of two strings, and then builds the tuple. Is there a more efficient way to do this?
The input will always be in the format UPPERlower.

import re
print(re.split("([A-Z]+)", my_string)[1:])
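A short sketch of why the `[1:]` slice is needed, plus a variant that returns a tuple directly. It is written in Python 3 syntax; `split_upper_lower` is a hypothetical helper name that does not appear in the question, and it assumes plain ASCII input in the UPPERlower format.

```python
import re

# A capturing group makes re.split keep the matched text; a match at
# position 0 also leaves an empty string in front, hence the [1:] slice.
parts = re.split("([A-Z]+)", "ABCDefgh")
print(parts)  # ['', 'ABCD', 'efgh']


def split_upper_lower(text):
    """Return (uppercase run, lowercase run); assumes ASCII UPPERlower input."""
    match = re.fullmatch(r"([A-Z]*)([a-z]*)", text)
    if match is None:
        raise ValueError("expected UPPERlower input, got %r" % (text,))
    return match.groups()


print(split_upper_lower("ABCDefgh"))  # ('ABCD', 'efgh')
```

Unlike the split, this version fails loudly when the input does not have the promised shape.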

Simple way (two passes):
>>> import itertools
>>> my_string = 'ABCDefgh'
>>> desired = (''.join(itertools.takewhile(lambda c:c.isupper(), my_string)), ''.join(itertools.dropwhile(lambda c:c.isupper(), my_string)))
>>> desired
('ABCD', 'efgh')
Efficient way (one pass):
>>> my_string = 'ABCDefgh'
>>> uppers = []
>>> done = False
>>> i = 0
>>> while not done and i < len(my_string):
...     c = my_string[i]
...     if c.isupper():
...         uppers.append(c)
...         i += 1
...     else:
...         done = True
...
>>> lowers = my_string[i:]
>>> desired = (''.join(uppers), lowers)
>>> desired
('ABCD', 'efgh')
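The same one-pass idea can be packaged as a function that only tracks an index and slices once at the end. This is a sketch in Python 3 syntax, and `split_at_first_non_upper` is a hypothetical name chosen for illustration.

```python
def split_at_first_non_upper(text):
    """Return (leading uppercase run, remainder) in a single scan."""
    i = 0
    # Stop at the first non-uppercase character or at the end of the string.
    while i < len(text) and text[i].isupper():
        i += 1
    # Two slices replace the per-character list and the join.
    return text[:i], text[i:]


print(split_at_first_non_upper('ABCDefgh'))  # ('ABCD', 'efgh')
```

The length check in the loop condition means an all-uppercase or empty string returns an empty second element instead of raising IndexError.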

Because I throw itertools.groupby at everything:
>>> my_string = 'ABCDefgh'
>>> from itertools import groupby
>>> [''.join(g) for k,g in groupby(my_string, str.isupper)]
['ABCD', 'efgh']
(A little overpowered here, but scales up to more complicated problems nicely.)

import re
my_string = 'ABCDefg'
desired = (re.search('[A-Z]+', my_string).group(0), re.search('[a-z]+', my_string).group(0))
print(desired)

A more robust approach without using re:
>>> import string
>>> txt = "ABCeUiioualfjNLkdD"
>>> tup = (''.join([char for char in txt if char in string.ascii_uppercase]),
...        ''.join([char for char in txt if char not in string.ascii_uppercase]))
>>> tup
('ABCUNLD', 'eiioualfjkd')
Testing char not in string.ascii_uppercase instead of char in string.ascii_lowercase means you never lose data if your string has non-letters in it. That can save you from chasing errors that only show up 20 function calls later, when the mangled input is finally rejected.
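To make the difference between the two membership tests concrete, here is a small sketch in Python 3 syntax. The sample string `txt` is made up for illustration and deliberately contains a digit, a hyphen and a space.

```python
import string

txt = "AB1c-D e"

uppers = ''.join(ch for ch in txt if ch in string.ascii_uppercase)
# Complement test: everything that is not an ASCII uppercase letter is kept.
rest = ''.join(ch for ch in txt if ch not in string.ascii_uppercase)
# Stricter test: digits, punctuation and spaces are silently dropped.
lowers_only = ''.join(ch for ch in txt if ch in string.ascii_lowercase)

print((uppers, rest))         # ('ABD', '1c- e')
print((uppers, lowers_only))  # ('ABD', 'ce')
```

With the complement test, the two halves always add up to the length of the input.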

Related

How to replace characters in a spliced list?

My code right now is very simple
sentence = input("Input a sentence: ")
print(sentence[::2])
My goal is that, instead of the slice dropping every other character, it replaces each of them with another character such as 'A'.
Some things I've tried are:
print(sentence[::2].replace("A", "B"))
print(sentence[::2], sep = "A")
print(sentence[::2], "A")
One solution with print:
s = 'test'
print(*s[::2], sep='A', end='\n' if len(s) % 2 else 'A\n')
Prints:
tAsA
If s='tes':
tAs
You have the right idea with sep, but the wrong function: use str.join rather than print.
1. Extract alternate characters with [::2].
2. Convert to a list of individual chars.
3. Join them with the desired separator, A.
Step by step:
>>> s = "Hello, world"
>>> s[::2]
'Hlo ol'
>>> list(s[::2])
['H', 'l', 'o', ' ', 'o', 'l']
>>> 'A'.join(list(s[::2]))
'HAlAoA AoAl'
Q.E.D.
Thanks to kaya3 for the bug catching. Kludge solution:
>>> new = 'A'.join(list(s[::2]))
>>> if len(new) < len(s):
...     new += 'A'
...
>>> new
'HAlAoA AoAlA'
>>>
The join method almost does this, but fails on even-length strings:
>>> 'A'.join('hello'[::2])
'hAlAo'
>>> 'A'.join('test'[::2])
'tAs'
To solve this we can add on an extra A if the length is even:
def replace_alternating(s, sep):
    result = sep.join(s[::2])
    if len(s) % 2 == 0:
        result += sep
    return result
Here's an alternative solution using a regex to replace pairs of characters:
>>> import re
>>> re.sub('(.).', r'\1A', 'hello')
'hAlAo'
>>> re.sub('(.).', r'\1A', 'test')
'tAsA'
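Another way to avoid the even-length special case is to walk the string by index and substitute at odd positions. This is a sketch in Python 3 syntax; `replace_odd_positions` and its `filler` parameter are hypothetical names, not part of the answers above.

```python
def replace_odd_positions(s, filler='A'):
    """Keep characters at even indices; put `filler` at odd indices."""
    return ''.join(ch if i % 2 == 0 else filler for i, ch in enumerate(s))


print(replace_odd_positions('hello'))  # hAlAo
print(replace_odd_positions('test'))   # tAsA
```

Because every position produces exactly one output character, the result always has the same length as the input.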

Moving parts of string around python

I have a string, well, several actually. The strings are simply:
string.a.is.this
or
string.a.im
in that fashion.
and what I want to do is make those strings become:
this.is.a.string
and
im.a.string
What I've tried:
new_string = string.split('.')
new_string = (new_string[3] + '.' + new_string[2] + '.' + new_string[1] + '.' + new_string[0])
Which works fine for making:
string.a.is.this
into
this.is.a.string
but gives me an 'out of range' error if I try it on:
string.a.im
yet if I do:
new_string = (new_string[2] + '.' + new_string[1] + '.' + new_string[0])
that works fine to make:
string.a.im
into
im.a.string
but obviously does not work for:
string.a.is.this
since it is not set up for 4 indices. I was trying to figure out how to make the extra index optional, or find any other workaround or better method. Thanks.
You can use str.join, str.split, and [::-1]:
>>> mystr = 'string.a.is.this'
>>> '.'.join(mystr.split('.')[::-1])
'this.is.a.string'
>>> mystr = 'string.a.im'
>>> '.'.join(mystr.split('.')[::-1])
'im.a.string'
>>>
To explain better, here is a step-by-step demonstration with the first string:
>>> mystr = 'string.a.is.this'
>>>
>>> # Split the string on .
>>> mystr.split('.')
['string', 'a', 'is', 'this']
>>>
>>> # Reverse the list returned above
>>> mystr.split('.')[::-1]
['this', 'is', 'a', 'string']
>>>
>>> # Join the strings in the reversed list, separating them by .
>>> '.'.join(mystr.split('.')[::-1])
'this.is.a.string'
>>>
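Wrapped in a function, the split/reverse/join steps work for any number of fields. This is a sketch in Python 3 syntax; `reverse_fields` and its `sep` parameter are hypothetical names introduced for illustration.

```python
def reverse_fields(text, sep='.'):
    """Reverse the order of the sep-delimited fields in text."""
    return sep.join(reversed(text.split(sep)))


print(reverse_fields('string.a.is.this'))  # this.is.a.string
print(reverse_fields('string.a.im'))       # im.a.string
```

Since nothing is indexed by position, the 'out of range' error from the question cannot occur.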
You could also do it with Python's re module:
>>> import re
>>> mystr = 'string.a.is.this'
>>> regex = re.findall(r'([^.]+)', mystr)
>>> '.'.join(regex[::-1])
'this.is.a.string'

Get values in string - Python

I am new to Python so I have lots of doubts. For instance I have a string:
string = "xtpo, example1=x, example2, example3=thisValue"
For example, is it possible to get the values next to the equals signs in example1 and example3, knowing only the keywords and not what comes after the =?
You can use regex:
>>> import re
>>> strs = "xtpo, example1=x, example2, example3=thisValue"
>>> key = 'example1'
>>> re.search(r'{}=(\w+)'.format(key), strs).group(1)
'x'
>>> key = 'example3'
>>> re.search(r'{}=(\w+)'.format(key), strs).group(1)
'thisValue'
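The pattern above puts the key into the regex unescaped, and calling .group(1) raises AttributeError when the key has no value (as with example2), because re.search returns None. A slightly more defensive sketch in Python 3 syntax follows; `get_value` is a hypothetical helper, and the `\b` anchor assumes keys start with a word character.

```python
import re


def get_value(text, key):
    """Return the value after 'key=' in text, or None if key has no value."""
    # re.escape guards against keys containing regex metacharacters;
    # \b stops 'example1' from matching inside e.g. 'myexample1'.
    match = re.search(r'\b{}=(\w+)'.format(re.escape(key)), text)
    return match.group(1) if match else None


strs = "xtpo, example1=x, example2, example3=thisValue"
print(get_value(strs, 'example1'))  # x
print(get_value(strs, 'example2'))  # None
```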
Spacing things out for clarity:
>>> Sstring = "xtpo, example1=x, example2, example3=thisValue"
>>> items = Sstring.split(',')  # Get the comma-separated items
>>> for i in items:
...     Pair = i.split('=')  # Try splitting on =
...     if len(Pair) > 1:  # Did split
...         print(Pair)  # or whatever you would like to do
...
[' example1', 'x']
[' example3', 'thisValue']
>>>
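If the goal is to look values up by keyword, the same split can feed a dictionary. Note that the output above keeps the leading space in ' example1', so this sketch also strips whitespace. It uses Python 3 syntax, and `parse_pairs` is a hypothetical name.

```python
def parse_pairs(text):
    """Map each key to its value for every 'key=value' item in text."""
    pairs = {}
    for item in text.split(','):
        # partition splits on the first '=' only and never raises.
        key, sep, value = item.partition('=')
        if sep:  # skip items without '='
            pairs[key.strip()] = value.strip()
    return pairs


values = parse_pairs("xtpo, example1=x, example2, example3=thisValue")
print(values['example3'])  # thisValue
```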

Python Regular expression repeat

I have a string like this
--x123-09827--x456-9908872--x789-267504
I am trying to get all the values, like
123:09827
456:9908872
789:267504
I've tried (--x([0-9]+)-([0-9])+)+
but it only gives me the last pair. I am testing it in Python:
>>> import re
>>> x = "--x123-09827--x456-9908872--x789-267504"
>>> p = "(--x([0-9]+)-([0-9]+))+"
>>> re.match(p,x)
>>> re.match(p,x).groups()
('--x789-267504', '789', '267504')
How should I write a pattern with nested repetition?
Thanks a lot!
David
Code it like this:
import re
x = "--x123-09827--x456-9908872--x789-267504"
p = "--x(?:[0-9]+)-(?:[0-9]+)"
print(re.findall(p, x))
Just use the .findall method instead, it makes the expression simpler.
>>> import re
>>> x = "--x123-09827--x456-9908872--x789-267504"
>>> r = re.compile(r"--x(\d+)-(\d+)")
>>> r.findall(x)
[('123', '09827'), ('456', '9908872'), ('789', '267504')]
You can also use .finditer which might be helpful for longer strings.
>>> [m.groups() for m in r.finditer(x)]
[('123', '09827'), ('456', '9908872'), ('789', '267504')]
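The original pattern returned only the last pair because a capturing group inside a repetition keeps just its final iteration. Matching one pair at a time avoids that, and the tuples can then be formatted as requested. This is a sketch in Python 3 syntax; `PAIR` and `format_pairs` are hypothetical names.

```python
import re

# One pair per match, so each match carries its own two groups.
PAIR = re.compile(r"--x(\d+)-(\d+)")


def format_pairs(text):
    """Return 'first:second' for every --x<digits>-<digits> in text."""
    return ['{}:{}'.format(a, b) for a, b in PAIR.findall(text)]


x = "--x123-09827--x456-9908872--x789-267504"
print(format_pairs(x))  # ['123:09827', '456:9908872', '789:267504']
```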
Use re.finditer or re.findall. Then you don't need the extra pair of parentheses that wrap the entire expression. For example,
>>> import re
>>> x = "--x123-09827--x456-9908872--x789-267504"
>>> p = "--x([0-9]+)-([0-9]+)"
>>> for m in re.finditer(p,x):
...     print('{0} {1}'.format(m.group(1),m.group(2)))
Try this:
p='--x([0-9]+)-([0-9]+)'
re.findall(p,x)
No need to use regex:
>>> "--x123-09827--x456-9908872--x789-267504".replace('--x',' ').replace('-',':').strip()
'123:09827 456:9908872 789:267504'
You don't need regular expressions for this. Here is a simple one-liner, non-regex solution:
>>> input = "--x123-09827--x456-9908872--x789-267504"
>>> [ x.replace("-", ":") for x in input.split("--x")[1:] ]
['123:09827', '456:9908872', '789:267504']
If this is an exercise on regex, here is a solution that uses the repetition (technically), though the findall(...) solution may be preferred:
>>> import re
>>> input = "--x123-09827--x456-9908872--x789-267504"
>>> regex = '--x(.+)'
>>> [ x.replace("-", ":") for x in re.match(regex*3, input).groups() ]
['123:09827', '456:9908872', '789:267504']

How do I coalesce a sequence of identical characters into just one?

Suppose I have this:
My---sun--is------very-big---.
I want to replace all multiple hyphens with just one hyphen.
import re
astr='My---sun--is------very-big---.'
print(re.sub('-+','-',astr))
# My-sun-is-very-big-.
If you want to replace any run of consecutive characters, you can use
>>> import re
>>> a = "AA---BC++++DDDD-EE$$$$FF"
>>> print(re.sub(r"(.)\1+",r"\1",a))
A-BC+D-E$F
If you only want to coalesce non-word-characters, use
>>> print(re.sub(r"(\W)\1+",r"\1",a))
AA-BC+DDDD-EE$FF
If it's really just hyphens, I recommend unutbu's solution.
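Between "any character" and "any non-word character" there is a middle ground: coalescing only a chosen set of characters. This is a sketch in Python 3 syntax; `coalesce_chars` is a hypothetical helper and it assumes `chars` is a non-empty string.

```python
import re


def coalesce_chars(text, chars):
    """Collapse runs of any character in `chars` to a single occurrence."""
    # re.escape keeps characters such as '-' or ']' literal inside [...].
    pattern = r'([{}])\1+'.format(re.escape(chars))
    return re.sub(pattern, r'\1', text)


print(coalesce_chars("AA---BC++++DDDD-EE$$$$FF", "-+"))  # AA-BC+DDDD-EE$$$$FF
```

The backreference `\1` ensures a run is collapsed only when the same character repeats, so a mixed sequence like `-+` is left alone.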
If you really only want to coalesce hyphens, use the other suggestions. Otherwise you can write your own function, something like this:
>>> def coalesce(x):
...     n = []
...     for c in x:
...         if not n or c != n[-1]:
...             n.append(c)
...     return ''.join(n)
...
>>> coalesce('My---sun--is------very-big---.')
'My-sun-is-very-big-.'
>>> coalesce('aaabbbccc')
'abc'
As usual, there's a nice itertools solution, using groupby:
>>> from itertools import groupby
>>> s = 'aaaaa----bbb-----cccc----d-d-d'
>>> ''.join(key for key, group in groupby(s))
'a-b-c-d-d-d'
How about:
>>> import re
>>> re.sub("-+", "-", "My---sun--is------very-big---.")
'My-sun-is-very-big-.'
The regular expression "-+" matches one or more "-".
re.sub('-+', '-', "My---sun--is------very-big---")
How about an alternate without the re module:
'-'.join(filter(lambda w: len(w) > 0, 'My---sun--is------very-big---.'.split("-")))
Or going with Tim and FogleBird's previous suggestion, here's a more general method:
def coalesce_factory(x):
    return lambda sent: x.join(filter(lambda w: len(w) > 0, sent.split(x)))
hyphen_coalesce = coalesce_factory("-")
hyphen_coalesce('My---sun--is------very-big---.')
Though personally, I would use the re module first :)
mcpeterson
Another simple solution is the str.replace method:
while '--' in astr:
    astr = astr.replace('--','-')
If you don't want to use regular expressions:
my_string = my_string.split('-')
my_string = filter(None, my_string)
my_string = '-'.join(my_string)
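One behavioural difference is worth knowing before choosing between the split-based and regex-based versions: splitting and filtering also drops hyphens at the very start and end of the string. A small sketch in Python 3 syntax, using a made-up sample string:

```python
import re

s = '--a---b--'

# The regex collapses each run but keeps one hyphen at both ends.
via_regex = re.sub('-+', '-', s)
# split/filter/join discards the empty pieces, including those at the ends.
via_split = '-'.join(filter(None, s.split('-')))

print(via_regex)  # -a-b-
print(via_split)  # a-b
```

For the example sentence in the question the two agree, since it neither starts nor ends with a hyphen.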
I have
my_str = 'a, b,,,,, c, , , d'
I want
'a,b,c,d'
Remove all the blanks (the replace call), split on the comma, keep only the non-empty pieces, and join them with a comma in between:
my_str_2 = ','.join([i for i in my_str.replace(" ", "").split(',') if i])
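The one-liner above removes every space, including any inside a field. If fields may contain interior spaces, a variant that strips each field instead could look like the sketch below (Python 3 syntax; `compact_csv` is a hypothetical name).

```python
def compact_csv(text):
    """Strip whitespace around each comma-separated field; drop empty ones."""
    fields = (field.strip() for field in text.split(','))
    return ','.join(field for field in fields if field)


print(compact_csv('a, b,,,,, c, , , d'))  # a,b,c,d
```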
