Suppose I have the string 'Hello World' and want to use a dictionary to map each character to its frequency. The following code works, but if I need to use a list comprehension, how can I use 'if' and 'else'? Please provide your solutions.
s = 'Hello World'
d = {}
for i in s:
    if i in d:
        d[i] = d[i] + 1
    else:
        d[i] = 1
You can use a dictionary comprehension (it doesn't make sense to use a list comprehension to build a dictionary):
s = 'Hello world'
d = {char: s.count(char) for char in set(s)}
The set(s) is a set of the unique characters in your string, and the comprehension creates a dictionary with the character as key and the number of occurrences (using str.count) as value.
But you don't need to use a comprehension at all: Python comes with "batteries included". In this case the battery is collections.Counter:
import collections
collections.Counter(s)
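As a quick check, here is a minimal sketch that runs Counter on the example string. The variable name counts is only illustrative.

```python
from collections import Counter

s = 'Hello world'
counts = Counter(s)  # maps each character to how often it occurs

print(counts['l'])            # 3
print(counts.most_common(1))  # [('l', 3)]
print(counts['z'])            # 0: missing keys count as zero instead of raising KeyError
```

A Counter is a dict subclass, so it can be used wherever the hand-built dictionary d would be.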
In case you really want to use a list comprehension (my personal opinion: you don't and shouldn't!) you need to work with side-effects, for example:
s = 'Hello world'
d = {}
[d.__setitem__(i, d[i]+1) if i in d else d.__setitem__(i, 1) for i in s]
print(d)
The __setitem__ calls are necessary because d[i] = 1 and d[i] = d[i] + 1 are assignment statements, and a comprehension may only contain expressions. Calling __setitem__ is the expression-based equivalent of the assignment.
I have a code snippet that groups a list of dicts by key: every dict with the same ObjectID is added to a list stored under that ID.
The code below works, but I am trying to convert it to a dictionary comprehension.
# group together subblocks if they have equal ObjectID
output = {}
subblkDBF : list[dict]
for row in subblkDBF:
    if row["OBJECTID"] not in output:
        output[row["OBJECTID"]] = []
    output[row["OBJECTID"]].append(row)
Using a comprehension is possible, but likely inefficient in this case, since you need to (a) check if a key is in the dictionary at every iteration, and (b) append to, rather than set the value. You can, however, eliminate some of the boilerplate using collections.defaultdict:
from collections import defaultdict
output = defaultdict(list)
for row in subblkDBF:
    output[row['OBJECTID']].append(row)
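To see the defaultdict version run end to end, here is a small sketch. The sample rows and the NAME field are invented for illustration and are not from the question.

```python
from collections import defaultdict

# Hypothetical sample data standing in for subblkDBF
subblkDBF = [
    {"OBJECTID": 1, "NAME": "a"},
    {"OBJECTID": 2, "NAME": "b"},
    {"OBJECTID": 1, "NAME": "c"},
]

output = defaultdict(list)
for row in subblkDBF:
    # A missing key is created with an empty list on first access
    output[row["OBJECTID"]].append(row)

print({k: [r["NAME"] for r in rows] for k, rows in output.items()})
# {1: ['a', 'c'], 2: ['b']}
```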
The problem with using a comprehension is that if you really want a one-liner, you have to nest a list comprehension that traverses the entire list multiple times (once for each key):
{k: [d for d in subblkDBF if d['OBJECTID'] == k] for k in set(d['OBJECTID'] for d in subblkDBF)}
Iterating over subblkDBF in both the inner and outer loop leads to O(n^2) complexity, which is pointless, especially given how illegible the result is.
As the other answer shows, these problems go away if you're willing to sort the list first, or better yet, if it is already sorted.
If rows are sorted by Object ID (or all rows with equal Object ID are at least next to each other, no matter the overall order of those IDs) you could write a neat dict comprehension using itertools.groupby:
from itertools import groupby
from operator import itemgetter
output = {k: list(g) for k, g in groupby(subblkDBF, key=itemgetter("OBJECTID"))}
However, if this is not the case, you'd have to sort by the same key first, making this a lot less neat, and less efficient than above or the loop (O(nlogn) instead of O(n)).
key = itemgetter("OBJECTID")
output = {k: list(g) for k, g in groupby(sorted(subblkDBF, key=key), key=key)}
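The reason sorting matters is that groupby only merges adjacent items with equal keys. The sketch below, using made-up rows, shows what the dict comprehension silently does when equal keys are not adjacent.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical rows where the two OBJECTID 1 entries are not adjacent
rows = [
    {"OBJECTID": 1, "NAME": "a"},
    {"OBJECTID": 2, "NAME": "b"},
    {"OBJECTID": 1, "NAME": "c"},
]
key = itemgetter("OBJECTID")

# Without sorting, groupby yields two separate runs for key 1,
# and the later run overwrites the earlier one in the dict.
unsorted_result = {k: list(g) for k, g in groupby(rows, key=key)}
print(len(unsorted_result[1]))  # 1

# Sorting first brings equal keys together, so nothing is lost.
sorted_result = {k: list(g) for k, g in groupby(sorted(rows, key=key), key=key)}
print(len(sorted_result[1]))  # 2
```

No error is raised in the unsorted case, so the data loss is easy to miss.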
You can add an else block to skip the redundant append for new keys and slightly improve performance:
output = {}
subblkDBF : list[dict]
for row in subblkDBF:
    if row["OBJECTID"] not in output:
        output[row["OBJECTID"]] = [row]
    else:
        output[row["OBJECTID"]].append(row)
I have a list of strings:
lst = ["/foo/dir/c-.*.txt","/foo/dir2/d-.*.svc","/foo/dir3/es-.*.info"]
and I have a prefix string:
/root
Is there a Pythonic way to add the prefix string to each element in the list,
so that the end result looks like this:
lst = ["/root/foo/dir/c-.*.txt","/root/foo/dir2/d-.*.svc","/root/foo/dir3/es-.*.info"]
Ideally it would be done without iterating and creating a new list ...
used:
List Comprehensions
List comprehensions provide a concise way to create lists. Common
applications are to make new lists where each element is the result of
some operations applied to each member of another sequence or
iterable, or to create a subsequence of those elements that satisfy a
certain condition.
F-Strings
F-strings provide a way to embed expressions inside string literals,
using a minimal syntax. It should be noted that an f-string is really
an expression evaluated at run time, not a constant value. In Python
source code, an f-string is a literal string, prefixed with 'f', which
contains expressions inside braces. The expressions are replaced with
their values.
lst = ["/foo/dir/c-.*.txt","/foo/dir2/d-.*.svc","/foo/dir3/es-.*.info"]
prefix = '/root'
lst = [f'{prefix}{path}' for path in lst]
print(lst)
I am not sure how Pythonic it is, but this is another possible way:
list(map(lambda x: '/root' + x, lst))
Here is a comparison between list comprehensions and map: List comprehension vs map
Also, thanks to @chris-rands, I learnt one more way that avoids the lambda:
list(map('/root'.__add__, lst))
Use list comprehensions and string concatenation:
lst = ["/foo/dir/c-.*.txt","/foo/dir2/d-.*.svc","/foo/dir3/es-.*.info"]
print(['/root' + p for p in lst])
# ['/root/foo/dir/c-.*.txt', '/root/foo/dir2/d-.*.svc', '/root/foo/dir3/es-.*.info']
Just simply write:
lst = ["/foo/dir/c-.*.txt","/foo/dir2/d-.*.svc","/foo/dir3/es-.*.info"]
prefix="/root"
res = [prefix + x for x in lst]
print(res)
A simple list comprehension -
lst = ["/foo/dir/c-.*.txt","/foo/dir2/d-.*.svc","/foo/dir3/es-.*.info"]
prefix = '/root'
print([prefix + string for string in lst]) # You can give it a variable if you want
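On the "without creating a new list" part of the question: every approach has to visit each element, but slice assignment at least updates the existing list object in place, so other references to it see the change. A minimal sketch (the name alias is only there to demonstrate this):

```python
lst = ["/foo/dir/c-.*.txt", "/foo/dir2/d-.*.svc", "/foo/dir3/es-.*.info"]
alias = lst  # second reference to the same list object
prefix = "/root"

# Slice assignment replaces the contents of the existing list
# instead of rebinding the name to a new list.
lst[:] = [prefix + path for path in lst]

print(alias is lst)  # True
print(alias[0])      # /root/foo/dir/c-.*.txt
```

A temporary list is still built on the right-hand side, so this changes which object ends up holding the result, not the amount of work done.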
Suppose I have a function with input 'raw_data'
raw_data consists of lines such as:
key1: str1
key2: str2
...
Where strx is of the form aa:bb:cc:dd... - that is a ':' separated string
There is a helper function which does something to the strings, converting them to values, let's call it get_value()
What would be the most pythonic way to return a dict?
def to_dict(raw_data):
    list_data = raw_data.splitlines()
    return {key.replace(':', ''): get_value(str) for key, str in (line.split() for line in list_data)}
or
def to_dict(raw_data):
    list_data = raw_data.splitlines()
    mydict = {}
    for line in list_data:
        key, str = line.split()
        key = key.replace(':', '')
        mydict.update({key: get_value(str)})
    return mydict
Or is there some much more pythonic way of doing this?
I'm aware this might seem opinion based question, but there seems to be a consensus about what is 'more pythonic' or 'less pythonic' way of doing things, I just don't know what the consensus is in this case.
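How about the following, which splits each line only on the first ':' (the values themselves contain ':', so an unlimited split would fail to unpack into two names):
{k: v.strip() for l in raw_data.splitlines() for (k, v) in [l.split(":", 1)]}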
Your first expression is more "Pythonic". If you're worried about readability, you can take the generator expression out onto its own line.
def to_dict(raw_data):
    data = (line.split(': ', 1) for line in raw_data.splitlines())
    return {key: get_value(st) for key, st in data}
line.split(': ', 1) will split line on only the first ": ", so "a: b: c" becomes ['a', 'b: c']
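For a runnable version of this approach, the sketch below supplies a placeholder get_value. The question never defines that helper, so the body used here (splitting on ':') is purely an assumption, as is the sample input.

```python
def get_value(text):
    # Hypothetical stand-in for the question's helper:
    # here it simply splits the ':'-separated string into a list.
    return text.split(':')


def to_dict(raw_data):
    # Split each line once, on the first ": ", into key and raw value
    data = (line.split(': ', 1) for line in raw_data.splitlines())
    return {key: get_value(st) for key, st in data}


raw_data = "key1: aa:bb:cc\nkey2: dd:ee"
print(to_dict(raw_data))
# {'key1': ['aa', 'bb', 'cc'], 'key2': ['dd', 'ee']}
```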
A list comprehension is a handy, concise way to write a for loop that builds a list in a single line of code, and it is often faster than the equivalent explicit loop. Comprehensions are not unique to lists in Python. Dictionaries, one of the most commonly used data structures in data science, support them too; this is called a dict comprehension or dictionary comprehension.
Remember that in Python a list is written with square brackets [] and a dictionary with curly braces {}. The same idea carries over to comprehensions: a dict comprehension uses the same syntax as a list comprehension, but with curly braces and a key: value pair as the expression.
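A short side-by-side sketch of the two forms, using a made-up list of words:

```python
words = ["apple", "fig", "cherry"]

# List comprehension: square brackets, one expression per element
lengths = [len(w) for w in words]

# Dict comprehension: curly braces, a key: value pair per element
length_by_word = {w: len(w) for w in words}

# An optional trailing 'if' filters elements in both forms
long_words = {w: len(w) for w in words if len(w) > 3}

print(lengths)         # [5, 3, 6]
print(length_by_word)  # {'apple': 5, 'fig': 3, 'cherry': 6}
print(long_words)      # {'apple': 5, 'cherry': 6}
```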
I was trying to use a list comprehension to replace multiple possible string values in a list of values.
I have a list of column names which are taken from a cursor.description;
['UNIX_Time', 'col1_MCA', 'col2_MCA', 'col3_MCA', 'col1_MCB', 'col2_MCB', 'col3_MCB']
I then have header_replace;
{'MCB': 'SourceA', 'MCA': 'SourceB'}
I would like to replace the string values for header_replace.keys() found within the column names with the values.
I have had to use the following loop;
headers = []
for header in cursor.description:
    replaced = False
    for key in header_replace.keys():
        if key in header[0]:
            headers.append(str.replace(header[0], key, header_replace[key]))
            replaced = True
            break
    if not replaced:
        headers.append(header[0])
Which gives me the correct output;
['UNIX_Time', 'col1_SourceA', 'col2_SourceA', 'col3_SourceA', 'col1_SourceB', 'col2_SourceB', 'col3_SourceB']
I tried using this list comprehension;
[str.replace(i[0],k,header_replace[k]) if k in i[0] else i[0] for k in header_replace.keys() for i in cursor.description]
But it meant that items were duplicated for the unmatched keys and I would get;
['UNIX_Time', 'col1_MCA', 'col2_MCA', 'col3_MCA', 'col1_SourceA', 'col2_SourceA', 'col3_SourceA',
'UNIX_Time', 'col1_SourceB', 'col2_SourceB', 'col3_SourceB', 'col1_MCB', 'col2_MCB', 'col3_MCB']
But if instead I use;
[str.replace(i[0],k,header_replace[k]) for k in header_replace.keys() for i in cursor.description if k in i[0]]
#Bakuriu fixed syntax
I would get the correct replacement but then lose any items that didn't need a string replacement.
['col1_SourceA', 'col2_SourceA', 'col3_SourceA', 'col1_SourceB', 'col2_SourceB', 'col3_SourceB']
Is there a pythonesque way of doing this or am I over stretching list comprehensions? I certainly find them hard to read.
[str.replace(i[0],k,header_replace[k]) if k in i[0] for k in header_replace.keys() for i in cursor.description]
This is a SyntaxError, because a conditional expression (x if cond else y) must include the else part. You probably meant:
[i[0].replace(k, header_replace[k]) for k in header_replace for i in cursor.description if k in i[0]]
With the if at the end. However, I must say that list comprehensions with nested loops usually aren't the way to go.
I would use the expanded for loop. In fact I'd improve it removing the replaced flag:
headers = []
for header in cursor.description:
    for key, repl in header_replace.items():
        if key in header[0]:
            headers.append(header[0].replace(key, repl))
            break
    else:
        headers.append(header[0])
The else of the for loop is executed when no break is triggered during the iterations.
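The for/else behaviour can be tried without a database. In this sketch a plain list of one-element tuples stands in for cursor.description (an assumption made so the example is self-contained), and header_replace is taken from the question.

```python
header_replace = {'MCB': 'SourceA', 'MCA': 'SourceB'}
# Hypothetical stand-in for cursor.description: one tuple per column,
# with the column name in position 0
description = [('UNIX_Time',), ('col1_MCA',), ('col1_MCB',)]

headers = []
for column in description:
    name = column[0]
    for key, repl in header_replace.items():
        if key in name:
            headers.append(name.replace(key, repl))
            break  # skips the else clause below
    else:
        # Runs only when the inner loop finished without break
        headers.append(name)

print(headers)  # ['UNIX_Time', 'col1_SourceB', 'col1_SourceA']
```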
I don't understand why your code uses str.replace(string, substring, replacement) instead of string.replace(substring, replacement). Strings have instance methods, so use them as such and not as if they were static methods of the class.
If your data is exactly as you described it, you don't need nested replacements and can boil it down to this line:
l = ['UNIX_Time', 'col1_MCA', 'col2_MCA', 'col3_MCA', 'col1_MCB', 'col2_MCB', 'col3_MCB']
[i.replace('_MC', '_Source') for i in l]
# ['UNIX_Time', 'col1_SourceA', 'col2_SourceA', 'col3_SourceA',
#  'col1_SourceB', 'col2_SourceB', 'col3_SourceB']
I guess a function will be more readable:
def repl(key):
    for k, v in header_replace.items():
        if k in key:
            return key.replace(k, v)
    return key
print(list(map(repl, names)))
Another (less readable) option:
import re
rx = '|'.join(header_replace)
print([re.sub(rx, lambda m: header_replace[m.group(0)], name) for name in names])
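A self-contained sketch of the regex variant follows. The names list is sample data, and the re.escape call is an addition that the original one-liner does not have; it only matters if a key contains regex metacharacters.

```python
import re

header_replace = {'MCB': 'SourceA', 'MCA': 'SourceB'}
names = ['UNIX_Time', 'col1_MCA', 'col1_MCB']  # sample column names

# re.escape guards against keys that contain regex metacharacters
pattern = re.compile('|'.join(map(re.escape, header_replace)))

result = [pattern.sub(lambda m: header_replace[m.group(0)], name) for name in names]
print(result)  # ['UNIX_Time', 'col1_SourceB', 'col1_SourceA']
```

Unlike the loop with break, this replaces every occurrence of every key within a name, which may or may not be what you want.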
I have a dictionary where each key has a list (vector) of items:
from collections import defaultdict
dict = defaultdict(list)
dict[133] = [2,4,64,312]
dict[4] = [2,3,5,12,45,32]
dict[54] = [12,2,443,223]
def getTotalVectorItems(items):
    total = 0
    for v in items.values():
        total += len(v)
    return total
print(getTotalVectorItems(dict))
This will print:
14 # number of items in the 3 dict keys.
Is there an easier more pythonic way other than creating this "getTotalVectorItems" function? I feel like there is a quick way to do this already.
You are looking for the sum() built-in with a generator expression:
sum(len(v) for v in items.values())
The sum() function totals the values of the given iterable, and the generator expression yields the length of each list stored in the dictionary.
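Applied to the data from the question (stored here in a plain dict named items, to avoid shadowing the built-in dict), the one-liner gives the same total:

```python
items = {
    133: [2, 4, 64, 312],
    4: [2, 3, 5, 12, 45, 32],
    54: [12, 2, 443, 223],
}

total = sum(len(v) for v in items.values())
print(total)  # 14

# Equivalent form using map instead of a generator expression
print(sum(map(len, items.values())))  # 14
```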
Note that calling lists vectors is probably confusing to most Python programmers, unless you are using the term vector in the context of the domain of the problem.
print(sum(map(len, dic.values())))