Suppose I have this list:
lis = ['a','b','c','d']
If I do 'x'.join(lis) the result is:
'axbxcxd'
What would be a clean, simple way to get this output?
'xaxbxcxdx'
I could write a helper function:
def joiner(s, it):
    return s+s.join(it)+s
and call it like joiner('x',lis) which returns xaxbxcxdx, but it doesn't look as clean as it could be. Is there a better way to get this result?
>>> '{1}{0}{1}'.format(s.join(lis), s)
'xaxbxcxdx'
You can join a list that begins and ends with an empty string:
>>> 'x'.join(['', *lis, ''])
'xaxbxcxdx'
You can use an f-string:
s = 'x'
f'{s}{s.join(lis)}{s}'
In Python 3.8 and later you can also use the walrus operator:
f"{(s:='x')}{s.join(lis)}{s}"
or
(s:='x') + s.join(lis) + s
You can use str.replace() to interleave the characters:
>>> lis = ['a','b','c','d']
>>> ''.join(lis).replace('', 'x')
'xaxbxcxdx'
On the other hand, your original solution (or a trivial modification with string formatting) is IMO actually pretty clean and readable.
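One caveat with the replace trick, as far as I can tell: it inserts the separator between every character of the joined string, so it only agrees with the join-based versions when every element is a single character. A quick sketch with made-up data (the `multi` list is only for illustration):

```python
single = ['a', 'b', 'c', 'd']
multi = ['ab', 'cd']  # made-up example with longer elements

# Same result when each element is one character.
replaced_single = ''.join(single).replace('', 'x')
joined_single = 'x' + 'x'.join(single) + 'x'
print(replaced_single)  # xaxbxcxdx
print(joined_single)    # xaxbxcxdx

# Different results once elements are longer than one character.
replaced_multi = ''.join(multi).replace('', 'x')
joined_multi = 'x' + 'x'.join(multi) + 'x'
print(replaced_multi)   # xaxbxcxdx
print(joined_multi)     # xabxcdx
```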
You may also do it as
'x'.join([''] + lis + [''])
But I'm not sure if it's cleaner.
Note that for an empty list this produces only one separator, instead of the two that the helper in the question produces.
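A quick check of that edge case, as a sketch: `joiner` is the helper from the question, and `padded_join` is just a name made up here for the padded-list variant.

```python
def joiner(s, it):
    # Helper from the question: separator, joined items, separator.
    return s + s.join(it) + s


def padded_join(s, items):
    # Hypothetical name for the padded-list variant.
    return s.join([''] + list(items) + [''])


print(joiner('x', ['a', 'b']))       # xaxbx
print(padded_join('x', ['a', 'b']))  # xaxbx
print(joiner('x', []))               # xx
print(padded_join('x', []))          # x
```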
A generator can interleave the characters, and the result can be joined without having to create intermediate strings.
L = list('abcd')
def mixer(chars, insert):
    yield insert
    for char in chars:
        yield char
        yield insert
result = ''.join(mixer(L, 'x')) # -> 'xaxbxcxdx'
While it isn't a one-liner, I think it is clean and simple, unlike these itertools creations that I came up with initially:
from itertools import repeat, starmap, zip_longest
from operator import add
# L must have a len, so doesn't work with generators
''.join(a for b in zip_longest(repeat('x', len(L) + 1),
                               L, fillvalue='')
        for a in b)
# As above, and worse still creates lots of intermediate strings
''.join(starmap(add, zip_longest(repeat('x', len(L) + 1), L, fillvalue='')))
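For what it's worth, the generator version never calls len(), so it should also accept a generator as input. A small sketch, with mixer repeated so the snippet runs on its own (the `letters` generator is made up for the example):

```python
def mixer(chars, insert):
    # Yield the separator before the first item and after every item.
    yield insert
    for char in chars:
        yield char
        yield insert


letters = (c for c in 'abcd')  # a generator has no len()
result = ''.join(mixer(letters, 'x'))
print(result)  # xaxbxcxdx

empty_result = ''.join(mixer([], 'x'))
print(empty_result)  # x
```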
Arguably there is a much simpler approach to be found here - just add the character before and after your join function. Others have suggested f-strings, which is a fancy way of achieving the same thing. String concatenation is also fine:
lis = ['a','b','c','d']
lis_str = 'x' + 'x'.join(lis) + 'x'
If your separator is long and you don't want to repeat it multiple times, you can put it into a variable and do the same thing:
lis = ['a','b','c','d']
join_str = 'x-marks-the-spot'
lis_str = join_str + join_str.join(lis) + join_str
So, SO, I am trying to "merge" a string (a) and a list of strings (b):
a = '1234'
b = ['+', '-', '']
to get the desired output (c):
c = '1+2-34'
The characters in the desired output string alternate in origin between the string and the list. Also, the list will always contain one element fewer than the string has characters. I was wondering what the fastest way to do this is.
What I have so far is the following:
c = a[0]
for i in range(len(b)):
    c += b[i] + a[1:][i]
print(c) # prints -> 1+2-34
But I kind of feel like there is a better way to do this.
You can use itertools.zip_longest to zip the two sequences and keep iterating even after the shorter sequence has run out of items. Once it runs out, zip_longest pads with None, so in that case just emit the remaining characters of the string on their own.
>>> from itertools import chain
>>> from itertools import zip_longest
>>> ''.join(i+j if j else i for i,j in zip_longest(a, b))
'1+2-34'
As @deceze suggested in the comments, you can also pass a fillvalue argument to zip_longest, which will insert empty strings. I'd suggest his method since it's a bit more readable.
>>> ''.join(i+j for i,j in zip_longest(a, b, fillvalue=''))
'1+2-34'
A further optimization suggested by @ShadowRanger is to remove the temporary string concatenations (i+j) and replace those with an itertools.chain.from_iterable call instead:
>>> ''.join(chain.from_iterable(zip_longest(a, b, fillvalue='')))
'1+2-34'
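If you need this in more than one place, the last version wraps up naturally into a function. A minimal sketch; the name `interleave` and its parameters are made up here, not part of itertools:

```python
from itertools import chain, zip_longest


def interleave(chars, seps):
    # Hypothetical helper: alternate items from both inputs, padding the
    # shorter one with empty strings so nothing is dropped.
    return ''.join(chain.from_iterable(zip_longest(chars, seps, fillvalue='')))


print(interleave('1234', ['+', '-', '']))  # 1+2-34
```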
Python programs are often short and concise, and what usually requires a bunch of lines in other programming languages (that I know of) can be accomplished in a line or two in Python.
One such program I am trying to write extracts every other letter from a string.
I have this working code, but I am wondering whether a more concise way is possible.
>>> s
'abcdefg'
>>> b = ""
>>> for i in range(len(s)):
...     if (i%2)==0:
...         b+=s[i]
...
>>> b
'aceg'
>>> 'abcdefg'[::2]
'aceg'
Use Python's slice notation:
>>> 'abcdefg'[::2]
'aceg'
The format for slice notation is [start:stop:step]. So, [::2] is telling Python to step through the string by 2's (which will return every other character).
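A few more slices of the same string, as a small sketch of how start, stop and step interact (the variable names are just for the example):

```python
s = 'abcdefg'

evens = s[::2]      # every other character, starting at index 0
odds = s[1::2]      # every other character, starting at index 1
middle = s[1:5]     # index 1 up to, but not including, index 5
backwards = s[::-1] # a negative step walks the string in reverse

print(evens)      # aceg
print(odds)       # bdf
print(middle)     # bcde
print(backwards)  # gfedcba
```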
The right way to do this is to just slice the string, as in the other answers.
But if you want a more concise way to write your code, which will work for similar problems that aren't as simple as slicing, there are two tricks: comprehensions, and the enumerate function.
First, this loop:
for i in range(len(foo)):
    value = foo[i]
    something with value and i
… can be written as:
for i, value in enumerate(foo):
    something with value and i
So, in your case:
for i, c in enumerate(s):
    if (i%2)==0:
        b+=c
Next, any loop that starts with an empty object, goes through an iterable (string, list, iterator, etc.), and puts values into a new iterable, possibly running the values through an if filter or an expression that transforms them, can be turned into a comprehension very easily.
While Python has comprehensions for lists, sets, dicts, and iterators, it doesn't have comprehensions for strings—but str.join solves that.
So, putting it together:
b = "".join(c for i, c in enumerate(s) if i%2 == 0)
Not nearly as concise or readable as b = s[::2]… but a lot better than what you started with—and the same idea works when you want to do more complicated things, like if i%2 and i%3 (which doesn't map to any obvious slice), or doubling each letter with c*2 (which could be done by zipping together two slices, but that's not immediately obvious), etc.
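To make those two "more complicated" cases concrete, here is a short sketch using the same join-plus-generator pattern (the variable names are only for illustration):

```python
s = 'abcdefg'

# Keep characters whose index is divisible by neither 2 nor 3.
picked = ''.join(c for i, c in enumerate(s) if i % 2 and i % 3)
print(picked)   # bf

# Double every letter.
doubled = ''.join(c * 2 for c in s)
print(doubled)  # aabbccddeeffgg
```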
Here is another example, for both a string and a list:
sentence = "The quick brown fox jumped over the lazy dog."
sentence[::2]
Here we are saying: take the entire string from the beginning to the end and return every 2nd character.
This returns the following:
'Teqikbonfxjme vrtelz o.'
You can do the same for a list:
colors = ["red", "orange", "yellow", "green", "blue"]
colors[1:4]
would return:
['orange', 'yellow', 'green']
The way I read a slice such as sentence[1:4] is: start at index 1 (remember the starting position is index 0) and stop BEFORE index 4.
You could try using slice and join:
>>> k = list(s)
>>> "".join(k[::2])
'aceg'
Practically, slicing is the best way to go. However, there are also ways you could improve your existing code, not by making it shorter, but by making it more Pythonic:
>>> s
'abcdefg'
>>> b = []
>>> for index, value in enumerate(s):
...     if index % 2 == 0:
...         b.append(value)
...
>>> b = "".join(b)
or even better:
>>> b = "".join(value for index, value in enumerate(s) if index % 2 == 0)
This can be easily extended to more complicated conditions:
>>> b = "".join(value for index, value in enumerate(s) if index % 2 == index % 3 == 0)
Suppose I want to change 'abc' to 'bac' in Python. What would be the best way to do it?
I am thinking of the following
tmp = list('abc')
tmp[0],tmp[1] = tmp[1],tmp[0]
result = ''.join(tmp)
You are never editing a string "in place"; strings are immutable.
You could do it with a list but that is wasting code and memory.
Why not just do:
x = 'abc'
result = x[1] + x[0] + x[2:]
or (personal fav)
import re
re.sub('(.)(.)', r'\2\1','abc')
This might be cheating, but if you really want to edit in place and are using Python 2, you can use MutableString (it was deprecated in 2.6 and removed in 3.0).
from UserString import MutableString
x = MutableString('abc')
x[1], x[0] = x[0], x[1]
# x is now 'bac'
With that being said, solutions are generally not as simple as turning 'abc' into 'bac'. You might want to give us more details on how you need to split up your string. Is it always just swapping the first two characters?
You cannot modify strings in place; they are immutable. If you want to modify a list in place, you can do it like in your example, or you could use slice assignment if the elements you want to replace can be accessed with a slice:
tmp = list('abc')
tmp[0:2] = tmp[1::-1] # replace tmp[0:2] with tmp[1::-1], which is ['b', 'a']
result = ''.join(tmp)
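If the positions to swap vary, the list round trip can be wrapped in a small function. This is only a sketch; `swap_chars` is a name made up for the example, and it returns a new string because the original cannot be changed:

```python
def swap_chars(text, i, j):
    # Hypothetical helper: return a copy of text with the characters at
    # positions i and j exchanged. The original string is left untouched.
    chars = list(text)
    chars[i], chars[j] = chars[j], chars[i]
    return ''.join(chars)


print(swap_chars('abc', 0, 1))     # bac
print(swap_chars('abcdef', 1, 4))  # aecdbf
```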
Is there a pythonic way to insert an element into every 2nd element in a string?
I have a string: 'aabbccdd' and I want the end result to be 'aa-bb-cc-dd'.
I am not sure how I would go about doing that.
>>> s = 'aabbccdd'
>>> '-'.join(s[i:i+2] for i in range(0, len(s), 2))
'aa-bb-cc-dd'
Assuming the string's length is always an even number:
>>> s = '12345678'
>>> t = iter(s)
>>> '-'.join(a+b for a,b in zip(t, t))
'12-34-56-78'
The t can also be eliminated with
>>> '-'.join(a+b for a,b in zip(s[::2], s[1::2]))
'12-34-56-78'
The algorithm is to group the string into pairs, then join them with the - character.
The code works like this. First, the string is split into the characters at even indices and the characters at odd indices.
>>> s[::2], s[1::2]
('1357', '2468')
Then the zip function is used to combine them into an iterable of tuples.
>>> list( zip(s[::2], s[1::2]) )
[('1', '2'), ('3', '4'), ('5', '6'), ('7', '8')]
But tuples aren't what we want; this should be a list of strings. That is the purpose of the list comprehension:
>>> [a+b for a,b in zip(s[::2], s[1::2])]
['12', '34', '56', '78']
Finally we use str.join() to combine the list.
>>> '-'.join(a+b for a,b in zip(s[::2], s[1::2]))
'12-34-56-78'
The first piece of code is the same idea, but consumes less memory if the string is long.
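The zip(t, t) form works because both arguments are the same iterator, so each tuple pulls the next two characters in turn. The same idea stretches to groups of any size; a sketch, where `group_join` and its parameters are names invented for the example:

```python
def group_join(s, n=2, sep='-'):
    # Hypothetical helper: zip n references to one iterator, so each
    # tuple takes the next n characters. A trailing partial group is
    # dropped, exactly as with zip(t, t).
    it = iter(s)
    return sep.join(''.join(group) for group in zip(*[it] * n))


print(group_join('12345678'))        # 12-34-56-78
print(group_join('123456789', n=3))  # 123-456-789
print(group_join('1234567'))         # 12-34-56
```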
If you want to preserve the last character when the string has an odd length, then you can modify KennyTM's answer to use itertools.zip_longest (izip_longest in Python 2):
>>> s = "aabbccd"
>>> from itertools import zip_longest
>>> '-'.join(a+b for a,b in zip_longest(s[::2], s[1::2], fillvalue=""))
'aa-bb-cc-d'
or
>>> t = iter(s)
>>> '-'.join(a+b for a,b in zip_longest(t, t, fillvalue=""))
'aa-bb-cc-d'
I tend to rely on a regular expression for this, as it seems less verbose and is usually faster than all the alternatives. Aside from having to face down the conventional wisdom regarding regular expressions, I'm not sure there's a drawback.
>>> import re
>>> s = 'aabbccdd'
>>> '-'.join(re.findall('..', s))
'aa-bb-cc-dd'
This version is strict about actual pairs though:
>>> t = s + 'e'
>>> '-'.join(re.findall('..', t))
'aa-bb-cc-dd'
... so with a tweak you can be tolerant of odd-length strings:
>>> '-'.join(re.findall('..?', t))
'aa-bb-cc-dd-e'
Usually you're doing this more than once, so maybe get a head start by creating a shortcut ahead of time:
PAIRS = re.compile('..').findall
out = '-'.join(PAIRS(src))  # src is the input string
Or what I would use in real code:
def rejoined(src, sep='-', _split=re.compile('..').findall):
    return sep.join(_split(src))
>>> rejoined('aabbccdd', sep=':')
'aa:bb:cc:dd'
I use something like this from time to time to create MAC address representations from 6-byte binary input:
>>> addr = b'\xdc\xf7\x09\x11\xa0\x49'
>>> rejoined(addr[::-1].hex(), sep=':')
'49:a0:11:09:f7:dc'
Here is a list-comprehension approach that picks each value according to the parity of its enumeration index; a trailing odd character ends up in a group of its own:
for s in ['aabbccdd','aabbccdde']:
    print(''.join([char if not ind or ind % 2 else '-' + char
                   for ind, char in enumerate(s)]))
""" Output:
aa-bb-cc-dd
aa-bb-cc-dd-e
"""
This one-liner does the trick. It will drop the last character if your string has an odd number of characters.
"-".join([''.join(item) for item in zip(mystring1[::2],mystring1[1::2])])
As PEP 8 states:
Do not rely on CPython's efficient implementation of in-place string concatenation for statements in the form a += b or a = a + b. This optimization is fragile even in CPython (it only works for some types) and isn't present at all in implementations that don't use refcounting.
A pythonic way of doing this that avoids this kind of concatenation, and allows you to join iterables other than strings could be:
':'.join(f'{s[i:i+2]}' for i in range(0, len(s), 2))
And another more functional-like way could be:
':'.join(map('{}{}'.format, *(s[::2], s[1::2])))
This second approach has a particular feature (or bug) of only joining pairs of letters. So:
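>>> s = 'abcdefghij'
>>> ':'.join(map('{}{}'.format, *(s[::2], s[1::2])))
'ab:cd:ef:gh:ij'
and:
>>> s = 'abcdefghi'
>>> ':'.join(map('{}{}'.format, *(s[::2], s[1::2])))
'ab:cd:ef:gh'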
Howdy, codeboys and codegirls!
I have come across a simple problem with a seemingly easy solution. But being a Python neophyte, I feel that there is a better approach somewhere.
Say you have a list of mixed strings. There are two basic types of strings in the sack: ones with "=" in them (a=potato) and ones without (Lady Jane). What you need is to sort them into two lists.
The obvious approach is to:
for arg in arguments:
    if '=' in arg:
        equal.append(arg)
    else:
        plain.append(arg)
Is there any other, more elegant way into it? Something like:
equal = [arg for arg in arguments if '=' in arg]
but to sort into multiple lists?
And what if you have more than one type of data?
Try
for arg in arguments:
    lst = equal if '=' in arg else plain
    lst.append(arg)
or (holy ugly)
for arg in arguments:
    (equal if '=' in arg else plain).append(arg)
A third option: Create a class which offers append() and which sorts into several lists.
You can use itertools.groupby() for this:
import itertools
f = lambda x: '=' in x
groups = itertools.groupby(sorted(data, key=f), key=f)  # data is the list of strings
for k, g in groups:
    print(k, list(g))
I would just go for two list comprehensions. While that does incur some overhead (two loops over the list), it is more Pythonic to use a list comprehension than a for loop. It's also (in my mind) much more readable than using all sorts of really cool tricks that fewer people know about.
def which_list(s):
    if "=" in s:
        return 1
    return 0
lists = [[], []]
for arg in arguments:
    lists[which_list(arg)].append(arg)
plain, equal = lists
If you have more types of data, add an if clause to which_list, and initialize lists to more empty lists.
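For instance, with a third kind of string the same pattern might look like the sketch below. The "+" category and the sample data are assumptions made up purely for illustration:

```python
def which_list(s):
    # Return the index of the bucket this string belongs to.
    if '=' in s:
        return 1
    if '+' in s:  # assumed third category, purely for illustration
        return 2
    return 0


arguments = ['a=potato', 'Lady Jane', 'a+b']  # made-up sample data
lists = [[], [], []]
for arg in arguments:
    lists[which_list(arg)].append(arg)

plain, equal, plus = lists
print(plain)  # ['Lady Jane']
print(equal)  # ['a=potato']
print(plus)   # ['a+b']
```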
I would go for Edan's approach, e.g.
equal = [arg for arg in arguments if '=' in arg]
plain = [arg for arg in arguments if '=' not in arg]
I read somewhere here that you might be interested in a solution that will work for more than two identifiers (equals sign and space).
The following solution just requires you to update the uniques set with anything you would like to match. The results are placed in a dictionary of lists with the identifier as the key.
uniques = set('= ')
matches = dict((key, []) for key in uniques)
for arg in args:
    key = set(arg) & uniques
    try:
        matches[key.pop()].append(arg)
    except KeyError:
        # code to handle where arg does not contain = or ' '.
        pass
Now the above code assumes that you will only have a single match for your identifier in your arg, i.e. that you don't have an arg that looks like this: 'John= equalspace'.
You will also have to think about how you would like to treat cases that don't match anything in the set (a KeyError occurs).
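One possible way to handle the no-match case is to collect those args in a separate list rather than catching KeyError. This is only a sketch; the `unmatched` list and the sample data are assumptions, not part of the code above:

```python
uniques = set('= ')
args = ['a=potato', 'Lady Jane', 'plain']  # made-up sample data

matches = {key: [] for key in uniques}
unmatched = []  # assumed policy: collect args with no identifier

for arg in args:
    found = set(arg) & uniques
    if found:
        # Still assumes at most one identifier per arg.
        matches[found.pop()].append(arg)
    else:
        unmatched.append(arg)

print(matches['='])  # ['a=potato']
print(matches[' '])  # ['Lady Jane']
print(unmatched)     # ['plain']
```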
Another approach is to use the filter function, although it's not the most efficient solution.
Example:
>>> l = ['a=s','aa','bb','=', 'a+b']
>>> l2 = list(filter(lambda s: '=' in s, l))
>>> l3 = list(filter(lambda s: '+' in s, l))
>>> l2
['a=s', '=']
>>> l3
['a+b']
I put this together, and then saw that Ned Batchelder was already on this same tack. I chose to package the splitting method instead of the list chooser, though, and to just use the implicit 0/1 values for False and True.
def split_on_condition(source, condition):
    ret = [],[]
    for s in source:
        ret[condition(s)].append(s)
    return ret
src = "z=1;q=2;lady jane;y=a;lucy in the sky".split(';')
plain,equal = split_on_condition(src, lambda s:'=' in s)
Your approach is the best one. For sorting just into two lists it can't get clearer than that. If you want it to be a one-liner, encapsulate it in a function:
def classify(arguments):
    equal, plain = [], []
    for arg in arguments:
        if '=' in arg:
            equal.append(arg)
        else:
            plain.append(arg)
    return equal, plain
equal, plain = classify(lst)
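If there may be more than two kinds of data, one option is to key the lists by whatever a classifier function returns. A rough sketch using collections.defaultdict; `classify_by` is a name invented for the example, not an existing library function:

```python
from collections import defaultdict


def classify_by(items, key):
    # Hypothetical generalisation: one pass, one list per key value.
    groups = defaultdict(list)
    for item in items:
        groups[key(item)].append(item)
    return groups


src = 'z=1;q=2;lady jane;y=a;lucy in the sky'.split(';')
groups = classify_by(src, lambda s: '=' in s)
print(groups[True])   # ['z=1', 'q=2', 'y=a']
print(groups[False])  # ['lady jane', 'lucy in the sky']
```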