Typically I have used list comprehensions to iterate over and filter data (i.e. dicts etc.) without the need to write multi-line for loops.
[x['a']["b"] for x in s["items"] if x["car"] == "ford"]
However this returns a list such as :
[False]
Not a massive problem as I can write
[x['a']["b"] for x in s["items"] if x["car"] == "ford"][0]
However, is there a way, either with a list comprehension or otherwise, to write a for loop with an if condition so that I only get a string returned?
Edit : In other words how can I place the following onto a single line and return a string,
for x in s["items"]:
    if x["car"] == "ford":
        print x['a']['b']
Thanks,
If I understand correctly, you want to short-circuit at the first match. Use next along with a generator expression:
>>> s = {'items': [{'a': {'b': 'potato'}, 'car': 'ford'}, {'a': {'b': 'spam'}, 'car': 'honda'}]}
>>> next(x['a']['b'] for x in s['items'] if x['car'] == "ford")
'potato'
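One caveat: with a single argument, next raises StopIteration when nothing matches. A minimal sketch of passing a default as the second argument instead (the 'tesla' lookup is only an illustrative non-match, and None is an arbitrary choice of default):

```python
s = {'items': [{'a': {'b': 'potato'}, 'car': 'ford'},
               {'a': {'b': 'spam'}, 'car': 'honda'}]}

# The generator expression needs its own parentheses when a default is passed.
found = next((x['a']['b'] for x in s['items'] if x['car'] == 'ford'), None)
# No item has car == 'tesla', so next() falls back to the default.
missing = next((x['a']['b'] for x in s['items'] if x['car'] == 'tesla'), None)

print(found)    # potato
print(missing)  # None
```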
As you have not shown the dict s, I tested it with plausible data and it works fine:
>>> s = {'items': [{'a': {'b': 1}, 'car': 'ford'}, {'a': {'b': 1}, 'car': 'honda'}]}
>>> print [x['a']['b'] for x in s['items'] if x['car'] == "ford"]
[1]
There is nothing in the syntax of your problem that guarantees that there is only one value in s that satisfies the criterion (i.e., for an arbitrary dict s, there could be more than one).
You may be able to guarantee that to be the case, but that is external to (this part of) the code.
Hence python isn't going to be able to automatically enforce that.
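If you do want the code itself to check that assumption, one option is single-element unpacking, which fails loudly when the comprehension yields zero matches or more than one. A small sketch, reusing the sample data from the answer above as an assumption about what s looks like:

```python
s = {'items': [{'a': {'b': 'potato'}, 'car': 'ford'},
               {'a': {'b': 'spam'}, 'car': 'honda'}]}

matches = [x['a']['b'] for x in s['items'] if x['car'] == 'ford']
# Unpacking into a one-element tuple raises ValueError unless len(matches) == 1.
(value,) = matches
print(value)  # potato
```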
Related
Say I have
d = {"a":0,"b":0,"c":0}
Is there a way to update the keys a and b at the same time, instead of looping over them, something like
update_keys = ["a","b"]
d.some_function(update_keys) +=[10,5]
print(d)
{"a":10,"b":5,"c":0}
Yes, you can use update like this:
d.update({'a':10, 'b':5})
Thus, your code would look this way:
d = {"a":0,"b":0,"c":0}
d.update({'a':10, 'b':5})
print(d)
and prints:
{'a': 10, 'b': 5, 'c': 0}
If you mean a function that can add a new value to the existing value without an explicit loop, you can definitely do it like this:
add_value = lambda d,k,v: d.update(zip(k,list(map(lambda _k,_v:d[_k]+_v,k,v)))) or d
and you can use it like this
>>> d = {"a":2,"b":3}
>>> add_value(d,["a","b"],[2,-3])
{'a': 4, 'b': 0}
There is nothing tricky here: I just replace the loop with a map and a lambda to do the update job, and wrap the result in list so that Python evaluates the map immediately. Then I use zip to create the updated key-value pairs and use dict's update method to update the dictionary. However, I really doubt this has any practical use, since it is definitely more complex than a for loop and makes the code harder to read.
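For comparison, here is roughly what the plain-loop version of the same idea looks like. The function name add_values is made up for this sketch:

```python
def add_values(d, keys, increments):
    """Add each increment to the value stored under the matching key, in place."""
    for key, increment in zip(keys, increments):
        d[key] += increment
    return d

d = {"a": 2, "b": 3}
print(add_values(d, ["a", "b"], [2, -3]))  # {'a': 4, 'b': 0}
```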
Update values of multiple keys in dictionary
d = {"a":0,"b":0,"c":0}
d.update({'a': 40, 'b': 41, 'c': 89})
print(d)
{'a': 40, 'b': 41, 'c': 89}
If you are just storing integer values, then you can use the Counter class provided by the python module "collections":
from collections import Counter
d = Counter({"a":0,"b":0,"c":0})
result = d + Counter({"a":10, "b":5})
'result' will have the value of
Counter({'a': 10, 'b': 5})
And since Counter is a subclass of dict, you probably do not have to change anything else in your code.
>>> isinstance(result, dict)
True
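You do not see the 'c' key in the result because the + operator drops every entry whose resulting count is zero or negative. (A Counter can hold zero counts; it is the binary operators that discard them.)
You can read more about Counter in the documentation for the collections module.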
Storing other numeric types is supported, with some conditions:
"For in-place operations such as c[key] += 1, the value type need only support addition and subtraction. So fractions, floats, and decimals would work and negative values are supported. The same is also true for update() and subtract() which allow negative and zero values for both inputs and outputs."
Performing the inverse operation of "+" requires using the method "subtract", which is a noteworthy "gotcha": unlike the "-" operator, it modifies the Counter in place and keeps zero and negative counts.
>>> d = Counter({"a":10, "b":5})
>>> result.subtract(d)
>>> result
Counter({'a': 0, 'b': 0})
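If the goal is to add to the existing counts while keeping keys such as 'c' that stay at zero, the in-place update method may be a better fit than +. A short sketch using the dictionary from the question:

```python
from collections import Counter

d = Counter({"a": 0, "b": 0, "c": 0})
# update() adds the given counts in place and keeps entries that are zero.
d.update({"a": 10, "b": 5})
print(dict(d))  # {'a': 10, 'b': 5, 'c': 0}
```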
I don't really understand the concept of a Python dictionary; can anyone help me? I want my program to have functionality similar to append on a Python list.
d = {'key': ['value']}
print(d)
# {'key': ['value']}
d['key'] = ['mynewvalue']
print(d)
# {'key': ['mynewvalue']}
The output I want from the program is either:
print(d)
#{'key': ['value'],'key': ['mynewvalue']}
or :
print(d)
#{'key': ['value','mynewvalue']}
Sure. First things first: you can't have two identical keys in a dictionary. So:
{'key': 'myfirstvalue', 'key': 'mysecondvalue'}
wouldn't do what you want: Python keeps only one entry for 'key', and the later value silently replaces the earlier one. If a key has multiple values, then the key's value should be a list of values, as in your last option. Just as in a real dictionary, you won't find word: definition, word: another definition, but word: a list of definitions.
In this regard, you could kind of think of a dictionary as a collection of variables: you can't assign two values to a variable except by assigning a list of values to the variable.
x = 4
x = 5
is working code, but the first line is rendered meaningless. x is only equal to 5, not both 4 and 5. You could, however, say:
x = [4, 5]
I often use dictionaries for trees of data. For example, I'm working on a project involving counties for every state in the US. I have a dictionary with a key for each state, and the value of each key is another dictionary, with a key for each county, and the value for each of those dictionaries is another dictionary with the various data points for that county.
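As a rough illustration of that kind of nesting, here is a tiny tree. All names and numbers below are hypothetical placeholders, not real data:

```python
# Hypothetical placeholder names and numbers, for illustration only.
data = {
    "state_a": {
        "county_1": {"population": 1000, "area_sq_mi": 50},
        "county_2": {"population": 2500, "area_sq_mi": 80},
    },
    "state_b": {
        "county_3": {"population": 400, "area_sq_mi": 120},
    },
}

# Each lookup descends one level: state, then county, then data point.
print(data["state_a"]["county_2"]["population"])  # 2500

# Nested values can be modified in place, just like any other dict entry.
data["state_b"]["county_3"]["population"] += 1
```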
That said, you can interact with your dictionary just like you would with variables.
mylist = [1, 2, 3, 4]
mylist.append(5)
print(mylist)
will print:
[1, 2, 3, 4, 5]
But also:
mydict = {'mylist': [1,2,3,4]}
mydict['mylist'].append(5)
does the same thing.
mydict['mylist']
is the same as
mylist
in the first example. Both refer to a list that starts as [1, 2, 3, 4] and becomes [1, 2, 3, 4, 5] after the append.
You cannot have the same key multiple times in a dict in Python, so the first output scenario you gave is invalid. The value in a dict can be any object; in your case it is a list, so it can be accessed and modified just like any other list. You can modify the code as shown below to get the output desired in scenario number 2.
d = {'key': ['value']}
print(d)
# {'key': ['value']}
d['key'].append('mynewvalue')
print(d)
# {'key': ['value', 'mynewvalue']}
You can try this:
d = {'key': ['value']}
d['key'].append("mynewvalue")
print(d)
Output will be: {'key': ['value', 'mynewvalue']}
For the first implementation you want, I think you are violating the entire idea of a dictionary: we cannot have multiple keys with the same name.
For the second implementation you could write a function like this:
def updateDict(mydict,value):
    mydict['key'].append(value)
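That function assumes 'key' already exists and holds a list. A slightly more general sketch uses dict.setdefault so that a missing key starts out as an empty list; the name append_value and the extra 'other' key are made up for the example:

```python
def append_value(mydict, key, value):
    """Append value to the list stored under key, creating the list if needed."""
    # setdefault returns the existing list, or inserts and returns a new empty one.
    mydict.setdefault(key, []).append(value)

d = {'key': ['value']}
append_value(d, 'key', 'mynewvalue')
append_value(d, 'other', 'first')
print(d)  # {'key': ['value', 'mynewvalue'], 'other': ['first']}
```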
I use a special class of objects and some method which returns me structures such as:
{'items': [{'_from': 12,
            'bla': 3713,
            'ff': 0.0,
            'd': 45755,
            'fdef': 1536},
           {'_from': None,
            'bla': 3712,
            'ff': 0.0,
            'd': 45838,
            'fdef': 1536}]}
Sometimes this structure is empty, and then I get the following:
{'items': []}
How can I check in my program whether the returned structure is empty? It has no attribute such as length. It seems that I can access single elements of the structure only via a loop (so nothing like structure['items']['bla'] is possible):
for k in myStructure.items:
    idd=k.bla
How can I perform such a check in an elegant way?
Empty lists evaluate to False when used in an if-statement.
if myStructure.items:
    for k in myStructure.items:
        idd=k.bla
Example:
>>> if []:
...     print('here')
...
>>>
You can iterate directly over values. As I show below, you can get the length of the empty list, which is 0, or you can simply use if i which will be True if the list is not empty.
myStructure = {'items': []}
for i in myStructure.values():
    if not i:
        print ("list is empty")
        print (len(i))
I'd like to compare all entries in a dict with all other entries – if the values are within a close enough range, I want to merge them under a single key and delete the other key. But I cannot figure out how to iterate through the dict without errors.
An example version of my code (not the real set of values, but you get the idea):
things = { 'a': 1, 'b': 2, 'c': 22 }
for me in things.iteritems():
    for other in things.iteritems():
        if me == other:
            continue
        # me and other are (key, value) tuples, so compare the values
        if abs(me[1] - other[1]) < 5:
            print 'merge!', me, other
            # merge the two into 'a'
            # delete 'b'
I'd hope to then get:
>> { 'a': [ 1, 2 ], 'c': 22 }
But if I run this code, I get the first two that I want to merge:
>> merge! ('a', 1) ('b', 2)
Then the same one in reverse (which I want to have merged already):
>> duplicate! ('b', 2) ('a', 1)
If I use del things['b'] I get an error that I'm trying to modify the dict while iterating. I see lots of "how to remove items from a dict" questions, and lots about comparing two separate dicts, but not this particular problem (as far as I can tell).
EDIT
Per feedback in the comments, I realized my example is a little misleading. I want to merge two items if their values are similar enough.
So, to do this in linear time (but requiring extra space) use an intermediate dict to group the keys by value:
>>> things = { 'fruit': 'tomato', 'vegetable': 'tomato', 'grain': 'wheat' }
>>> from collections import defaultdict
>>> grouper = defaultdict(list)
>>> for k, v in things.iteritems():
...     grouper[v].append(k)
...
>>> grouper
defaultdict(<type 'list'>, {'tomato': ['vegetable', 'fruit'], 'wheat': ['grain']})
Then, you simply take the first item from your list of values (that used to be keys), as the new key:
>>> {v[0]:k for k, v in grouper.iteritems()}
{'vegetable': 'tomato', 'grain': 'wheat'}
Note, dictionaries are inherently unordered, so if order is important, you should have been using an OrderedDict from the beginning.
Note that your result will depend on the direction of the traversal. Since you are bucketing data depending on distance (in the metric sense), either the right neighbor or the left neighbor can claim the data point.
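For the original numeric version of the question (values within a distance of 5), one possible sketch is to sort by value and build a new dict, which avoids deleting from the dict being iterated. The function name merge_close, the threshold parameter, and the choice to compare each value with the previous one in ascending order are assumptions made for this example; traversing in the other direction, or comparing against the first value of each group, can produce different groups.

```python
def merge_close(things, threshold=5):
    """Group values that lie within `threshold` of the previous value (ascending)."""
    merged = {}
    current_key = None
    previous_value = None
    # Build a new dict instead of deleting from `things` while iterating over it.
    for key, value in sorted(things.items(), key=lambda kv: kv[1]):
        if current_key is not None and abs(value - previous_value) < threshold:
            merged[current_key].append(value)
        else:
            current_key = key
            merged[current_key] = [value]
        previous_value = value
    return merged

result = merge_close({'a': 1, 'b': 2, 'c': 22})
print(result)  # {'a': [1, 2], 'c': [22]}
```

Unlike the desired output in the question, every value here is a list, including groups with a single member, which keeps the result uniform.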
I'm trying to write a script to calculate all of the possible fuzzy string matches for a short string, or 'kmer', and the same code that works in Python 2.7.X gives me a non-deterministic answer with Python 3.3.X, and I can't figure out why.
I iterate over a dictionary, itertools.product, and itertools.combinations in my code, but I iterate over all of them to completion with no breaks or continues. In addition, I store all of my results in a separate dictionary instead of the one I'm iterating over. In short - I'm not making any mistakes that are obvious to me, so why is the behavior different between Python2 and Python3?
Sample, slightly simplified code below:
import itertools
def find_best_fuzzy_kmer( kmers ):
    for kmer, value in kmers.items():
        for similar_kmer in permute_string( kmer, m ):
            # Tabulate Kmer
def permute_string( query, m ):
    query_list = list(query)
    output = set() # hold output
    for i in range(m+1):
        # pre-calculate the possible combinations of new bases
        base_combinations = list(itertools.product('AGCT', repeat=i))
        # for each combination `idx` in idxs, replace str[idx]
        for positions in itertools.combinations(range(len(query_list)), i):
            for bases in base_combinations:
                # Generate Permutations and add to output
    return output
If by "non-deterministic" you mean the order in which dictionary keys appear (when you iterate over a dictionary) changes from run to run, and the dictionary keys are strings, please say so. Then I can help. But so far you haven't said any of that ;-)
Assuming that's the problem, here's a little program:
d = dict((L, i) for i, L in enumerate('abcd'))
print(d)
and the output from 4 runs under Python 3.3.2:
{'d': 3, 'a': 0, 'c': 2, 'b': 1}
{'d': 3, 'b': 1, 'c': 2, 'a': 0}
{'d': 3, 'a': 0, 'b': 1, 'c': 2}
{'a': 0, 'b': 1, 'c': 2, 'd': 3}
The cause is hinted at by this part of the python -h output:
Other environment variables:
...
PYTHONHASHSEED: if this variable is set to 'random', a random value is used
to seed the hashes of str, bytes and datetime objects. It can also be
set to an integer in the range [0,4294967295] to get hash values with a
predictable seed.
This is a half-baked "security fix", intended to help prevent DOS attacks based on constructing dict inputs designed to provoke quadratic-time behavior. "random" is the default in Python3.
You can turn that off by setting the envar PYTHONHASHSEED to an integer (your choice - pick 0 if you don't care). Then iterating a dict with string keys will produce them in the same order across runs.
As @AlcariTheMad said in a comment, you can enable the Python 3 default behavior under Python 2 via python -R ....
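If changing the environment is not an option, another way to get the same order on every run is to sort the keys (or the set of strings) before iterating, so that the order no longer depends on hash values. A minimal sketch; the k-mer strings are invented sample values:

```python
# Invented sample values; a set of str iterates in hash-dependent order.
kmers = {"ACGT", "AGGT", "TCGT", "ACGA"}

# sorted() returns a list in lexicographic order, independent of the hash seed.
ordered = sorted(kmers)
print(ordered)  # ['ACGA', 'ACGT', 'AGGT', 'TCGT']
```

Sorting costs O(n log n), so this trades a little speed for reproducibility.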