my_string = 'ABCDefgh'
desired = ('ABCD','efgh')
The only way I can think of is to loop over the string, check each character individually, build up the two strings, and then create the tuple. Is there a more efficient way to do this?
The input will always be in the format UPPERlower.
import re
print(re.split("([A-Z]+)", my_string)[1:])  # ['ABCD', 'efgh']
Simple way (two passes):
>>> import itertools
>>> my_string = 'ABCDefgh'
>>> desired = (''.join(itertools.takewhile(lambda c:c.isupper(), my_string)), ''.join(itertools.dropwhile(lambda c:c.isupper(), my_string)))
>>> desired
('ABCD', 'efgh')
Efficient way (one pass):
>>> my_string = 'ABCDefgh'
>>> uppers = []
>>> done = False
>>> i = 0
>>> while not done:
...     c = my_string[i]
...     if c.isupper():
...         uppers.append(c)
...         i += 1
...     else:
...         done = True
...
>>> lowers = my_string[i:]
>>> desired = (''.join(uppers), lowers)
>>> desired
('ABCD', 'efgh')
Because I throw itertools.groupby at everything:
>>> my_string = 'ABCDefgh'
>>> from itertools import groupby
>>> [''.join(g) for k,g in groupby(my_string, str.isupper)]
['ABCD', 'efgh']
(A little overpowered here, but scales up to more complicated problems nicely.)
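As a rough sketch of how the groupby approach scales up, the same expression handles any number of alternating runs. The helper name case_runs and the sample string are made up for illustration:

```python
from itertools import groupby

def case_runs(text):
    """Split text into consecutive runs of uppercase / non-uppercase characters."""
    # groupby starts a new group every time str.isupper changes its answer
    return [''.join(group) for _, group in groupby(text, str.isupper)]

print(case_runs('ABCDefgh'))     # ['ABCD', 'efgh']
print(case_runs('ABCdefGHIjk'))  # ['ABC', 'def', 'GHI', 'jk']
```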
import re
my_string = 'ABCDefg'
desired = (re.search('[A-Z]+', my_string).group(0), re.search('[a-z]+', my_string).group(0))
print(desired)
A more robust approach without using re:
>>> import string
>>> txt = "ABCeUiioualfjNLkdD"
>>> tup = (''.join([char for char in txt if char in string.ascii_uppercase]),
...        ''.join([char for char in txt if char not in string.ascii_uppercase]))
>>> tup
('ABCUNLD', 'eiioualfjkd')
Testing char not in string.ascii_uppercase, rather than char in string.ascii_lowercase, means you never lose data if the string contains non-letters. That can spare you errors that only surface 20 function calls later, when the mangled input is finally rejected.
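Since the question states the input is always UPPERlower, a single anchored regex with two groups is another option. This is only a sketch: the helper name split_upper_lower is hypothetical, and it assumes ASCII letters only.

```python
import re

def split_upper_lower(text):
    """Split 'UPPERlower' into ('UPPER', 'lower'); hypothetical helper."""
    # fullmatch requires the whole string to fit the UPPERlower shape
    match = re.fullmatch(r'([A-Z]+)([a-z]+)', text)
    if match is None:
        raise ValueError('expected uppercase letters followed by lowercase letters')
    return match.groups()

print(split_upper_lower('ABCDefgh'))  # ('ABCD', 'efgh')
```

Raising on unexpected input is a design choice here; it surfaces bad data immediately instead of returning a partial split.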
My code right now is very simple
sentence = input("Input a sentence: ")
print(sentence[::2])
Instead of having the slice drop every other character, I want those characters replaced with another character, such as 'A'.
Some things I've tried are:
print(sentence[::2].replace("A", "B"))
print(sentence[::2], sep = "A")
print(sentence[::2], "A")
One solution with print:
s = 'test'
print(*s[::2], sep='A', end='\n' if len(s) % 2 else 'A\n')
Prints:
tAsA
If s='tes':
tAs
You have the right idea with sep, but the wrong function.
extract alternate characters with [::2]
convert to a list of individual chars
join them with the desired separator, A
Step by step:
>>> s = "Hello, world"
>>> s[::2]
'Hlo ol'
>>> list(s[::2])
['H', 'l', 'o', ' ', 'o', 'l']
>>> 'A'.join(list(s[::2]))
'HAlAoA AoAl'
Q.E.D.
Thanks to kaya3 for the bug catching. Kludge solution:
>>> new = 'A'.join(list(s[::2]))
>>> if len(new) < len(s):
...     new += 'A'
...
>>> new
'HAlAoA AoAlA'
>>>
The join method almost does this, but fails on even-length strings:
>>> 'A'.join('hello'[::2])
'hAlAo'
>>> 'A'.join('test'[::2])
'tAs'
To solve this we can add on an extra A if the length is even:
def replace_alternating(s, sep):
    result = sep.join(s[::2])
    if len(s) % 2 == 0:
        result += sep
    return result
Here's an alternative solution using a regex to replace pairs of characters:
>>> import re
>>> re.sub('(.).', r'\1A', 'hello')
'hAlAo'
>>> re.sub('(.).', r'\1A', 'test')
'tAsA'
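Another way to avoid the even-length special case is to overwrite every second position directly with extended-slice assignment. A minimal sketch; the function name replace_every_other is hypothetical:

```python
def replace_every_other(text, filler):
    """Replace the characters at odd indices of text with filler."""
    chars = list(text)
    # Extended-slice assignment needs exactly one item per slot being replaced
    slots = len(chars[1::2])
    chars[1::2] = [filler] * slots
    return ''.join(chars)

print(replace_every_other('hello', 'A'))  # hAlAo
print(replace_every_other('test', 'A'))   # tAsA
```

Because the positions are overwritten in place, odd and even lengths are handled by the same code path.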
I have a string, well, several actually. The strings are simply:
string.a.is.this
or
string.a.im
in that fashion.
What I want to do is turn those strings into:
this.is.a.string
and
im.a.string
What I've tried:
new_string = string.split('.')
new_string = (new_string[3] + '.' + new_string[2] + '.' + new_string[1] + '.' + new_string[0])
Which works fine for making:
string.a.is.this
into
this.is.a.string
but gives me an 'out of range' error if I try it on:
string.a.im
yet if I do:
new_string = (new_string[2] + '.' + new_string[1] + '.' + new_string[0])
that works fine to make:
string.a.im
into
im.a.string
but obviously does not work for:
string.a.is.this
since it is not set up for 4 indices. I was trying to figure out how to make the extra index optional, or find some other workaround or better method. Thanks.
You can use str.join, str.split, and [::-1]:
>>> mystr = 'string.a.is.this'
>>> '.'.join(mystr.split('.')[::-1])
'this.is.a.string'
>>> mystr = 'string.a.im'
>>> '.'.join(mystr.split('.')[::-1])
'im.a.string'
>>>
To explain better, here is a step-by-step demonstration with the first string:
>>> mystr = 'string.a.is.this'
>>>
>>> # Split the string on .
>>> mystr.split('.')
['string', 'a', 'is', 'this']
>>>
>>> # Reverse the list returned above
>>> mystr.split('.')[::-1]
['this', 'is', 'a', 'string']
>>>
>>> # Join the strings in the reversed list, separating them by .
>>> '.'.join(mystr.split('.')[::-1])
'this.is.a.string'
>>>
You could also do it with Python's re module:
>>> import re
>>> mystr = 'string.a.is.this'
>>> regex = re.findall(r'([^.]+)', mystr)
>>> '.'.join(regex[::-1])
'this.is.a.string'
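Either answer can be wrapped in a small function so the number of dotted parts no longer matters. A sketch, with the hypothetical name reverse_dotted:

```python
def reverse_dotted(name, sep='.'):
    """Reverse the order of the sep-delimited parts of name."""
    # split, reverse, and re-join; works for any number of parts
    return sep.join(reversed(name.split(sep)))

print(reverse_dotted('string.a.is.this'))  # this.is.a.string
print(reverse_dotted('string.a.im'))       # im.a.string
```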
I am new to Python, so I have a lot of questions. For instance, I have a string:
string = "xtpo, example1=x, example2, example3=thisValue"
Is it possible to get the values after the equals signs for example1 and example3, knowing only the keywords and not what comes after the =?
You can use regex:
>>> import re
>>> strs = "xtpo, example1=x, example2, example3=thisValue"
>>> key = 'example1'
>>> re.search(r'{}=(\w+)'.format(key), strs).group(1)
'x'
>>> key = 'example3'
>>> re.search(r'{}=(\w+)'.format(key), strs).group(1)
'thisValue'
Spacing things out for clarity
>>> Sstring = "xtpo, example1=x, example2, example3=thisValue"
>>> items = Sstring.split(',') # Get the comma separated items
>>> for i in items:
...     Pair = i.split('=')  # Try splitting on =
...     if len(Pair) > 1:  # Did split
...         print(Pair)  # or whatever you would like to do
...
[' example1', 'x']
[' example3', 'thisValue']
>>>
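If several keys need to be looked up, the split-based approach can collect everything into a dict first. A sketch under the assumption that items are comma-separated and keys contain no '='; parse_pairs is a hypothetical name:

```python
def parse_pairs(text):
    """Collect key=value items from a comma-separated string into a dict."""
    pairs = {}
    for item in text.split(','):
        key, sep, value = item.partition('=')
        if sep:  # only keep items that actually contained '='
            pairs[key.strip()] = value.strip()
    return pairs

values = parse_pairs("xtpo, example1=x, example2, example3=thisValue")
print(values['example1'])  # x
print(values['example3'])  # thisValue
```

Stripping the key removes the leading space that shows up in the output above.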
I have a string like this
--x123-09827--x456-9908872--x789-267504
I am trying to get all value like
123:09827
456:9908872
789:267504
I've tried (--x([0-9]+)-([0-9]+))+
but it only gives me the last pair. I am testing it in Python:
>>> import re
>>> x = "--x123-09827--x456-9908872--x789-267504"
>>> p = "(--x([0-9]+)-([0-9]+))+"
>>> re.match(p,x)
>>> re.match(p,x).groups()
('--x789-267504', '789', '267504')
How should I write a pattern with nested repetition?
Thanks a lot!
David
Code it like this:
import re
x = "--x123-09827--x456-9908872--x789-267504"
p = "--x(?:[0-9]+)-(?:[0-9]+)"
print(re.findall(p, x))  # ['--x123-09827', '--x456-9908872', '--x789-267504']
Just use the .findall method instead, it makes the expression simpler.
>>> import re
>>> x = "--x123-09827--x456-9908872--x789-267504"
>>> r = re.compile(r"--x(\d+)-(\d+)")
>>> r.findall(x)
[('123', '09827'), ('456', '9908872'), ('789', '267504')]
You can also use .finditer which might be helpful for longer strings.
>>> [m.groups() for m in r.finditer(x)]
[('123', '09827'), ('456', '9908872'), ('789', '267504')]
Use re.finditer or re.findall. Then you don't need the extra pair of parentheses that wrap the entire expression. For example,
>>> import re
>>> x = "--x123-09827--x456-9908872--x789-267504"
>>> p = "--x([0-9]+)-([0-9]+)"
>>> for m in re.finditer(p, x):
...     print('{0} {1}'.format(m.group(1), m.group(2)))
...
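The reason the original pattern returned only the last pair is that a capturing group inside a repetition keeps just its final iteration. A short sketch contrasting the two behaviours on the string from the question:

```python
import re

x = "--x123-09827--x456-9908872--x789-267504"

# Repeated group: the match spans all three pairs, but each group
# remembers only the text from its last iteration.
repeated = re.match(r"(--x([0-9]+)-([0-9]+))+", x)
print(repeated.groups())  # ('--x789-267504', '789', '267504')

# Unrepeated pattern with findall: one tuple per occurrence.
pairs = re.findall(r"--x([0-9]+)-([0-9]+)", x)
print(pairs)  # [('123', '09827'), ('456', '9908872'), ('789', '267504')]
```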
Try this:
p='--x([0-9]+)-([0-9]+)'
re.findall(p,x)
No need to use regex:
>>> "--x123-09827--x456-9908872--x789-267504".replace('--x',' ').replace('-',':').strip()
'123:09827 456:9908872 789:267504'
You don't need regular expressions for this. Here is a simple one-liner, non-regex solution:
>>> input = "--x123-09827--x456-9908872--x789-267504"
>>> [ x.replace("-", ":") for x in input.split("--x")[1:] ]
['123:09827', '456:9908872', '789:267504']
If this is an exercise on regex, here is a solution that uses the repetition (technically), though the findall(...) solution may be preferred:
>>> import re
>>> input = "--x123-09827--x456-9908872--x789-267504"
>>> regex = '--x(.+)'
>>> [ x.replace("-", ":") for x in re.match(regex*3, input).groups() ]
['123:09827', '456:9908872', '789:267504']
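To get the exact 123:09827 format asked for while still using findall, the captured pairs can be joined afterwards. A sketch; the helper name extract_pairs is hypothetical:

```python
import re

def extract_pairs(text):
    """Return 'id:value' strings for every --x<id>-<value> occurrence."""
    return ['{}:{}'.format(left, right)
            for left, right in re.findall(r'--x(\d+)-(\d+)', text)]

print(extract_pairs("--x123-09827--x456-9908872--x789-267504"))
# ['123:09827', '456:9908872', '789:267504']
```

Unlike the regex*3 variant, this does not need to know in advance how many pairs the string contains.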
Suppose I have this:
My---sun--is------very-big---.
I want to replace all multiple hyphens with just one hyphen.
import re
astr='My---sun--is------very-big---.'
print(re.sub('-+','-',astr))
# My-sun-is-very-big-.
If you want to replace any run of consecutive characters, you can use
>>> import re
>>> a = "AA---BC++++DDDD-EE$$$$FF"
>>> print(re.sub(r"(.)\1+",r"\1",a))
A-BC+D-E$F
If you only want to coalesce non-word-characters, use
>>> print(re.sub(r"(\W)\1+",r"\1",a))
AA-BC+DDDD-EE$FF
If it's really just hyphens, I recommend unutbu's solution.
If you really only want to coalesce hyphens, use the other suggestions. Otherwise you can write your own function, something like this:
>>> def coalesce(x):
...     n = []
...     for c in x:
...         if not n or c != n[-1]:
...             n.append(c)
...     return ''.join(n)
...
>>> coalesce('My---sun--is------very-big---.')
'My-sun-is-very-big-.'
>>> coalesce('aaabbbccc')
'abc'
As usual, there's a nice itertools solution, using groupby:
>>> from itertools import groupby
>>> s = 'aaaaa----bbb-----cccc----d-d-d'
>>> ''.join(key for key, group in groupby(s))
'a-b-c-d-d-d'
How about:
>>> import re
>>> re.sub("-+", "-", "My---sun--is------very-big---.")
'My-sun-is-very-big-.'
The regular expression "-+" matches one or more consecutive "-" characters.
re.sub('-+', '-', "My---sun--is------very-big---")
How about an alternate without the re module:
'-'.join(filter(lambda w: len(w) > 0, 'My---sun--is------very-big---.'.split("-")))
Or going with Tim and FogleBird's previous suggestion, here's a more general method:
def coalesce_factory(x):
    return lambda sent: x.join(filter(lambda w: len(w) > 0, sent.split(x)))
hyphen_coalesce = coalesce_factory("-")
hyphen_coalesce('My---sun--is------very-big---.')
Though personally, I would use the re module first :)
Another simple solution is the string's replace method, applied until no double hyphens remain:
while '--' in astr:
    astr = astr.replace('--', '-')
If you don't want to use regular expressions:
my_string = my_string.split('-')
my_string = filter(None, my_string)
my_string = '-'.join(my_string)
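One difference worth knowing: the split/filter approaches drop leading and trailing hyphens entirely, while re.sub keeps one at each end. A small sketch comparing the two; the function names and the sample string are made up for illustration:

```python
import re

def collapse_with_re(text):
    """Collapse each run of hyphens into a single hyphen."""
    return re.sub('-+', '-', text)

def collapse_with_split(text):
    """Split on hyphens, drop empty pieces, and re-join."""
    return '-'.join(filter(None, text.split('-')))

sample = '--My---sun--'
print(collapse_with_re(sample))     # -My-sun-
print(collapse_with_split(sample))  # My-sun
```

For the string in the question both give the same result, because it neither starts nor ends with a hyphen.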
I have
my_str = 'a, b,,,,, c, , , d'
I want
'a,b,c,d'
Remove all the blanks (the replace call), split on the comma, drop the empty strings (the if i filter), and join what remains with commas:
my_str_2 = ','.join([i for i in my_str.replace(" ", "").split(',') if i])
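Note that replace(" ", "") also removes spaces inside an item. If that matters, stripping each piece after the split is one alternative. A sketch, with the hypothetical name compact_csv:

```python
def compact_csv(text):
    """Drop empty items and surrounding whitespace from a comma-separated string."""
    parts = (part.strip() for part in text.split(','))
    return ','.join(part for part in parts if part)

print(compact_csv('a, b,,,,, c, , , d'))  # a,b,c,d
print(compact_csv('new york, , paris'))   # new york,paris
```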