Howdy, codeboys and codegirls!
I have came across a simple problem with seemingly easy solution. But being a Python neophyte I feel that there is a better approach somewhere.
Say you have a list of mixed strings. There are two basic types of strings in the sack - ones with "=" in them (a=potato) and ones without (Lady Jane). What you need is to sort them into two lists.
The obvious approach is to:
for arg in arguments:
if '=' in arg:
equal.append(arg)
else:
plain.append(arg)
Is there any other, more elegant way into it? Something like:
equal = [arg for arg in arguments if '=' in arg]
but to sort into multiple lists?
And what if you have more than one type of data?
Try
for arg in arguments:
lst = equal if '=' in arg else plain
lst.append(arg)
or (holy ugly)
for arg in arguments:
(equal if '=' in arg else plain).append(arg)
A third option: Create a class which offers append() and which sorts into several lists.
You can use itertools.groupby() for this:
import itertools
f = lambda x: '=' in x
groups = itertools.groupby(sorted(data, key=f), key=f)
for k, g in groups:
print k, list(g)
I would just go for two list comprehensions. While that does incur some overhead (two loops on the list), it is more Pythonic to use a list comprehension than to use a for. It's also (in my mind) much more readable than using all sorts of really cool tricks, but that less people know about.
def which_list(s):
if "=" in s:
return 1
return 0
lists = [[], []]
for arg in arguments:
lists[which_list(arg)].append(arg)
plain, equal = lists
If you have more types of data, add an if clause to which_list, and initialize lists to more empty lists.
I would go for Edan's approach, e.g.
equal = [arg for arg in arguments if '=' in arg]
plain = [arg for arg in arguments if '=' not in arg]
I read somewhere here that you might be interested in a solution that
will work for more than two identifiers (equals sign and space).
The following solution just requires you update the uniques set with
anything you would like to match, the results are placed in a dictionary of lists
with the identifier as the key.
uniques = set('= ')
matches = dict((key, []) for key in uniques)
for arg in args:
key = set(arg) & uniques
try:
matches[key.pop()].append(arg)
except KeyError:
# code to handle where arg does not contain = or ' '.
Now the above code assumes that you will only have a single match for your identifier
in your arg. I.e that you don't have an arg that looks like this 'John= equalspace'.
You will have to also think about how you would like to treat cases that don't match anything in the set (KeyError occurs.)
Another approach is to use the filter function, although it's not the most efficient solution.
Example:
>>> l = ['a=s','aa','bb','=', 'a+b']
>>> l2 = filter(lambda s: '=' in s, l)
>>> l3 = filter(lambda s: '+' in s, l)
>>> l2
['a=s', '=']
>>> l3
['a+b']
I put this together, and then see that Ned Batchelder was already on this same tack. I chose to package the splitting method instead of the list chooser, though, and to just use the implicit 0/1 values for False and True.
def split_on_condition(source, condition):
ret = [],[]
for s in source:
ret[condition(s)].append(s)
return ret
src = "z=1;q=2;lady jane;y=a;lucy in the sky".split(';')
plain,equal = split_on_condition(src, lambda s:'=' in s)
Your approach is the best one. For sorting just into two lists it can't get clearer than that. If you want it to be a one-liner, encapsulate it in a function:
def classify(arguments):
equal, plain = [], []
for arg in arguments:
if '=' in arg:
equal.append(arg)
else:
plain.append(arg)
return equal, plain
equal, plain = classify(lst)
Related
i have list of strings
lst = ["/foo/dir/c-.*.txt","/foo/dir2/d-.*.svc","/foo/dir3/es-.*.info"]
and i have prefix string :
/root
is there any pythonic way to add the prefix string to each element in the list
so the end result will look like this:
lst = ["/root/foo/dir/c-.*.txt","/root/foo/dir2/d-.*.svc","/root/foo/dir3/es-.*.info"]
if it can be done without iterating and creating new list ...
used:
List Comprehensions
List comprehensions provide a concise way to create lists. Common
applications are to make new lists where each element is the result of
some operations applied to each member of another sequence or
iterable, or to create a subsequence of those elements that satisfy a
certain condition.
F=Strings
F-strings provide a way to embed expressions inside string literals,
using a minimal syntax. It should be noted that an f-string is really
an expression evaluated at run time, not a constant value. In Python
source code, an f-string is a literal string, prefixed with 'f', which
contains expressions inside braces. The expressions are replaced with
their values.
lst = ["/foo/dir/c-.*.txt","/foo/dir2/d-.*.svc","/foo/dir3/es-.*.info"]
prefix = '/root'
lst =[ f'{prefix}{path}' for path in lst]
print(lst)
I am not sure of pythonic, but this will be also on possible way
list(map(lambda x: '/root' + x, lst))
Here there is comparison between list comp and map List comprehension vs map
Also thanks to #chris-rands learnt one more way without lambda
list(map('/root'.__add__, lst))
Use list comprehensions and string concatenation:
lst = ["/foo/dir/c-.*.txt","/foo/dir2/d-.*.svc","/foo/dir3/es-.*.info"]
print(['/root' + p for p in lst])
# ['/root/foo/dir/c-.*.txt', '/root/foo/dir2/d-.*.svc', '/root/foo/dir3/es-.*.info']
Just simply write:
lst = ["/foo/dir/c-.*.txt","/foo/dir2/d-.*.svc","/foo/dir3/es-.*.info"]
prefix="/root"
res = [prefix + x for x in lst]
print(res)
A simple list comprehension -
lst = ["/foo/dir/c-.*.txt","/foo/dir2/d-.*.svc","/foo/dir3/es-.*.info"]
prefix = '/root'
print([prefix + string for string in lst]) # You can give it a variable if you want
When working with a function that returns multiple values with a tuple, I will often find myself using the following idiom to unpack the results from inside a list comprehension.
fiz, buz = zip(*[f(x) for x in input])
Most of the time this works fine, but it throws a ValueError: need more than 0 values to unpack if input is empty. The two ways I can think of to get around this are
fiz = []
buz = []
for x in input:
a, b = f(x)
fiz.append(a)
buz.append(b)
and
if input:
fiz, buz = zip(*[f(x) for x in input])
else:
fiz, buz = [], []
but neither of these feels especially Pythonic—the former is overly verbose and the latter doesn't work if input is a generator rather than a list (in addition to requiring an if/else where I feel like one really shouldn't be needed).
Is there a good simple way to do this? I've mostly been working in Python 2.7 recently, but would also be interested in knowing any Python 3 solutions if they are different.
If f = lambda x: (x,x**2) then this works
x,y = zip(*map(f,input)) if len(input) else ((),())
If input=[], x=() and y=().
If input=[2], x=(2,) and y=(4,)
If input=[2,3], x=(2,3) and y=(4,9)
They're tuples (not lists), but thats thats pretty easy to change.
I would consider using collections.namedtuple() for this sort of thing. I believe the named tuples are deemed more pythonic, and should avoid the need for complicated list comprehensions and zipping / unpacking.
From the documentation:
>>> p = Point(11, y=22) # instantiate with positional or keyword arguments
>>> p[0] + p[1] # indexable like the plain tuple (11, 22)
33
>>> x, y = p # unpack like a regular tuple
>>> x, y
(11, 22)
>>> p.x + p.y # fields also accessible by name
33
>>> p # readable __repr__ with a name=value style
Point(x=11, y=22)
You could use:
fiz = []
buz = []
results = [fiz, buz]
for x in input:
list(map(lambda res, val: res.append(val), results, f(x)))
print(results)
Note about list(map(...)): in Python3, map returns a generator, so we must use it if we want the lambda to be executed.list does it.
(adapted from my answer to Pythonic way to append output of function to several lists, where you could find other ideas.)
I was trying to use a list comprehension to replace multiple possible string values in a list of values.
I have a list of column names which are taken from a cursor.description;
['UNIX_Time', 'col1_MCA', 'col2_MCA', 'col3_MCA', 'col1_MCB', 'col2_MCB', 'col3_MCB']
I then have header_replace;
{'MCB': 'SourceA', 'MCA': 'SourceB'}
I would like to replace the string values for header_replace.keys() found within the column names with the values.
I have had to use the following loop;
headers = []
for header in cursor.description:
replaced = False
for key in header_replace.keys():
if key in header[0]:
headers.append(str.replace(header[0], key, header_replace[key]))
replaced = True
break
if not replaced:
headers.append(header[0])
Which gives me the correct output;
['UNIX_Time', 'col1_SourceA', 'col2_SourceA', 'col3_SourceA', 'col1_SourceB', 'col2_SourceB', 'col3_SourceB']
I tried using this list comprehension;
[str.replace(i[0],k,header_replace[k]) if k in i[0] else i[0] for k in header_replace.keys() for i in cursor.description]
But it meant that items were duplicated for the unmatched keys and I would get;
['UNIX_Time', 'col1_MCA', 'col2_MCA', 'col3_MCA', 'col1_SourceA', 'col2_SourceA', 'col3_SourceA',
'UNIX_Time', 'col1_SourceB', 'col2_SourceB', 'col3_SourceB', 'col1_MCB', 'col2_MCB', 'col3_MCB']
But if instead I use;
[str.replace(i[0],k,header_replace[k]) for k in header_replace.keys() for i in cursor.description if k in i[0]]
#Bakuriu fixed syntax
I would get the correct replacement but then loose any items that didn't need to have an string replacement.
['col1_SourceA', 'col2_SourceA', 'col3_SourceA', 'col1_SourceB', 'col2_SourceB', 'col3_SourceB']
Is there a pythonesque way of doing this or am I over stretching list comprehensions? I certainly find them hard to read.
[str.replace(i[0],k,header_replace[k]) if k in i[0] for k in header_replace.keys() for i in cursor.description]
this is a SyntaxError, because if expressions must contain the else part. You probably meant:
[i[0].replace(k, header_replace[k]) for k in header_replace for i in cursor.description if k in i[0]]
With the if at the end. However I must say that list-comprehension with nested loops aren't usually the way to go.
I would use the expanded for loop. In fact I'd improve it removing the replaced flag:
headers = []
for header in cursor.description:
for key, repl in header_replace.items():
if key in header[0]:
headers.append(header[0].replace(key, repl))
break
else:
headers.append(header[0])
The else of the for loop is executed when no break is triggered during the iterations.
I don't understand why in your code you use str.replace(string, substring, replacement) instead of string.replace(substring, replacement). Strings have instance methods, so you them as such and not as if they were static methods of the class.
If your data is exactly as you described it, you don't need nested replacements and can boil it down to this line:
l = ['UNIX_Time', 'col1_MCA', 'col2_MCA', 'col3_MCA', 'col1_MCB', 'col2_MCB', 'col3_MCB']
[i.replace('_MC', '_Source') for i in l]
>>> ['UNIX_Time',
>>> 'col1_SourceA',
>>> 'col2_SourceA',
>>> 'col3_SourceA',
>>> 'col1_SourceB',
>>> 'col2_SourceB',
>>> 'col3_SourceB']
I guess a function will be more readable:
def repl(key):
for k, v in header_replace.items():
if k in key:
return key.replace(k, v)
return key
print map(repl, names)
Another (less readable) option:
import re
rx = '|'.join(header_replace)
print [re.sub(rx, lambda m: header_replace[m.group(0)], name) for name in names]
Suppose I have this list:
lis = ['a','b','c','d']
If I do 'x'.join(lis) the result is:
'axbxcxd'
What would be a clean, simple way to get this output?
'xaxbxcxdx'
I could write a helper function:
def joiner(s, it):
return s+s.join(it)+s
and call it like joiner('x',lis) which returns xaxbxcxdx, but it doesn't look as clean as it could be. Is there a better way to get this result?
>>> '{1}{0}{1}'.format(s.join(lis), s)
'xaxbxcxdx'
You can join a list that begins and ends with an empty string:
>>> 'x'.join(['', *lis, ''])
'xaxbxcxdx'
You can use f-string:
s = 'x'
f'{s}{s.join(lis)}{s}'
In Python 3.8 you can also use the walrus operator:
f"{(s:='x')}{s.join(lis)}{s}"
or
(s:='x') + s.join(lis) + s
You can use str.replace() to interleave the characters:
>>> lis = ['a','b','c','d']
>>> ''.join(lis).replace('', 'x')
'xaxbxcxdx'
On the other hand, your original solution (or a trivial modification with string formatting) is IMO actually pretty clean and readable.
You may also do it as
'x'.join([''] + lis + [''])
But I'm not sure if it's cleaner.
It will produce only 1 separator on empty list instead of 2 as one in the question.
A generator can interleave the characters, and the result can be joined without having to create intermediate strings.
L = list('abcd')
def mixer(chars, insert):
yield insert
for char in chars:
yield char
yield insert
result = ''.join(mixer(L, 'x')) # -> 'xaxbxcxdx'
While it isn't a one-liner, I think it is clean and simple, unlike these itertools creations that I came up with initially:
from itertools import repeat, starmap, zip_longest
from operator import add
# L must have a len, so doesn't work with generators
''.join(a for b in itertools.zip_longest(repeat('x', len(L) + 1),
L, fillvalue='')
for a in b)
# As above, and worse still creates lots of intermediate strings
''.join(starmap(add, zip_longest(repeat('x', len(L) + 1), L, fillvalue='')))
Arguably there is a much simpler approach to be found here - just add the character before and after your join function. Others have suggested f-strings, which is a fancy way of achieving the same thing. String concatenation is also fine:
lis = ['a','b','c','d']
lis_str = 'x' + 'x'.join(lis) + 'x'
If your string is long and you don't want to repeat it multiple times, you can just put this into a variable and do the same thing
lis = ['a','b','c','d']
join_str = 'x-marks-the-spot'
lis_str = join_str + join_str.join(lis) + join_str
I want to check if a value is in a list, no matter what the case of the letters are, and I need to do it efficiently.
This is what I have:
if val in list:
But I want it to ignore case
check = "asdf"
checkLower = check.lower()
print any(checkLower == val.lower() for val in ["qwert", "AsDf"])
# prints true
Using the any() function. This method is nice because you aren't recreating the list to have lowercase, it is iterating over the list, so once it finds a true value, it stops iterating and returns.
Demo : http://codepad.org/dH5DSGLP
If you know that your values are all of type str or unicode, you can try this:
if val in map(str.lower, list):
...Or:
if val in map(unicode.lower, list):
If you really have just a list of the values, the best you can do is something like
if val.lower() in [x.lower() for x in list]: ...
but it would probably be better to maintain, say, a set or dict whose keys are lowercase versions of the values in the list; that way you won't need to keep iterating over (potentially) the whole list.
Incidentally, using list as a variable name is poor style, because list is also the name of one of Python's built-in types. You're liable to find yourself trying to call the list builtin function (which turns things into lists) and getting confused because your list variable isn't callable. Or, conversely, trying to use your list variable somewhere where it happens to be out of scope and getting confused because you can't index into the list builtin.
You can lower the values and check them:
>>> val
'CaSe'
>>> l
['caSe', 'bar']
>>> val in l
False
>>> val.lower() in (i.lower() for i in l)
True
items = ['asdf', 'Asdf', 'asdF', 'asjdflk', 'asjdklflf']
itemset = set(i.lower() for i in items)
val = 'ASDF'
if val.lower() in itemset: # O(1)
print('wherever you go, there you are')