Access list of tuples - python

I have a list that contains several tuples, like:
[('a_key', 'a value'), ('another_key', 'another value')]
where the first tuple-values act as dictionary-keys.
I'm now searching for a python-like way to access the key/value-pairs, like:
"mylist.a_key" or "mylist['a_key']"
without iterating over the list. Any ideas?

You can't do it without any iteration. You will either need to iterate once to convert the list into a dict, after which key access is possible without iteration, or you will need to iterate over the list for each key access. Converting to a dict seems the better idea: in the long run it is more efficient, and, more importantly, it represents how you actually see this data structure, as pairs of keys and values.
>>> x = [('a_key', 'a value'), ('another_key', 'another value')]
>>> y = dict(x)
>>> y['a_key']
'a value'
>>> y['another_key']
'another value'

If you're generating the list yourself, you might be able to create it as a dictionary at source (which allows for key, value pairs).
Otherwise, Van Gale's defaultdict is the way to go I would think.
Edit:
As mentioned in the comments, defaultdict is not required here unless you need to deal with corner cases like several values with the same key in your list. Still, if you can originally generate the "list" as a dictionary, you save yourself having to iterate back over it afterwards.
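To illustrate the corner case mentioned above: if the same key appears more than once, dict() silently keeps only the last value. Here is a minimal sketch of the defaultdict approach, using made-up sample data (the pairs list and variable names are assumptions for illustration only):

```python
from collections import defaultdict

# Hypothetical sample data with a repeated key.
pairs = [('a_key', 'first'), ('another_key', 'other'), ('a_key', 'second')]

# dict() keeps only the last value seen for each key.
last_wins = dict(pairs)

# defaultdict(list) collects every value under its key instead.
grouped = defaultdict(list)
for key, value in pairs:
    grouped[key].append(value)

print(last_wins['a_key'])   # second
print(grouped['a_key'])     # ['first', 'second']
```

If keys are known to be unique, the plain dict(x) conversion is simpler and sufficient.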

Related

Get reference to Python dict key

In Python (3.7 and above) I would like to obtain a reference to a dict key. More precisely, let d be a dict where the keys are strings. In the following code, the value of k is potentially stored at two distinct locations in memory (one pointed to by the dict and one pointed to by k), whereas the value of v is stored at only one location (the one pointed to by the dict).
# d is a dict
# k is a string dynamically constructed, in particular not from iterating over d's keys
if k in d:
    v = d[k]
# Now store k and v in other data structures
In my case, the dict is very large and the string keys are very long. To keep memory usage down I would like to replace k with a pointer to the corresponding string used by d before storing k in other data structures. Is there a straightforward way of doing this, that is using the keys of the dict as a string pool?
(Footnote: this may seem like premature optimisation, and perhaps it is, but being an old-school C programmer I sleep better at night doing "memory tricks". Joking aside, I genuinely would like to know the answer out of curiosity, and I am indeed going to run my code on a Raspberry Pi and will probably face memory issues.)
Where does the key k come from? Is it dynamically constructed by something like str.join, + , slicing another string, bytes.decode etc? Is it read from a file or input()? Did you get it from iterating over d at some point? Or does it originate from a literal somewhere in your source code?
In the last two cases, you don't need to worry about it since it is going to be a single instance anyway.
If not, you could use sys.intern to intern your keys. If a == b then sys.intern(a) is sys.intern(b).
Another possible solution, in case you want the strings to be garbage collected at some point or you want to intern non-string values such as tuples of strings, is the following:
# create this dictionary once after `d` has all the right keys
canonical_keys = {key: key for key in d}
k = canonical_keys.get(k, k) # use the same instance if possible
I recommend reading up on Python's data model.
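Both suggestions can be checked with the `is` operator. The following is a small sketch, not a benchmark; the dict contents and the way the strings are built are assumptions chosen only so that equal strings are constructed separately:

```python
import sys

# Hypothetical dict whose key is built at runtime.
d = {"".join(["long_", "key"]): 1}

# An equal string, constructed separately (as in the question).
k = "".join(["long", "_key"])

# Option 1: use the dict's own keys as a string pool.
canonical_keys = {key: key for key in d}
pooled = canonical_keys.get(k, k)   # falls back to k if the key is unknown

stored_key = next(iter(d))
print(k == stored_key)        # True: equal values
print(pooled is stored_key)   # True: the very same object as the dict key

# Option 2: sys.intern returns one shared object per distinct string value.
a = sys.intern("".join(["x", "y", "zzz"]))
b = sys.intern("".join(["xy", "zzz"]))
print(a is b)                 # True
```

After either step, storing pooled (or the interned string) in other data structures adds only a reference, not a second copy of the text.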

Python convert named string fields to tuple

Similar to this question: Tuple declaration in Python
I have this function:
import os

def get_mouse():
    # Get: x:4631 y:506 screen:0 window:63557060
    mouse = os.popen("xdotool getmouselocation").read().splitlines()
    print(mouse)
    return mouse
When I run it it prints:
['x:2403 y:368 screen:0 window:60817757']
I can split the line and create 4 separate fields in a list but from Python code examples I've seen I feel there is a better way of doing it. I'm thinking something like x:= or window:=, etc.
I'm not sure how to properly define these "named tuple fields" nor how to reference them in subsequent commands?
I'd like to read more on the whole subject if there is a reference link handy.
It seems it would be a better option to use a dictionary here. Dictionaries allow you to set a key, and a value associated to that key. This way you can call a key such as dictionary['x'] and get the corresponding value from the dictionary (if it exists!)
data = ['x:2403 y:368 screen:0 window:60817757'] #Your return data seems to be stored as a list
result = dict(d.split(':') for d in data[0].split())
result['x']
#'2403'
result['window']
#'60817757'
You can read more on a few things here, such as:
Comprehensions
Dictionaries
Happy learning!
Try:
dict(el.split(':') for el in mouse[0].split())
This should give you a dict (rather than tuples; note that dicts are mutable and also require hashable keys):
{'x': '2403', 'y': '368', ...}
Also the splitlines is probably not needed, as you are only reading one line. You could do something like:
mouse = [os.popen( "xdotool getmouselocation" ).read()]
Though I don't know what xdotool getmouselocation does or if it could ever return multiple lines.
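For the "named fields" the question asks about, collections.namedtuple from the standard library is one option. This is a sketch under assumptions: the MouseLocation type and parse_mouse helper are hypothetical names, and it assumes the line always contains exactly the four integer fields shown in the sample output.

```python
from collections import namedtuple

# Hypothetical record type; field names taken from the sample output.
MouseLocation = namedtuple("MouseLocation", ["x", "y", "screen", "window"])

def parse_mouse(line):
    """Parse a line like 'x:2403 y:368 screen:0 window:60817757'."""
    fields = dict(part.split(":") for part in line.split())
    # Convert every value to int and pass the fields by name.
    return MouseLocation(**{name: int(value) for name, value in fields.items()})

loc = parse_mouse("x:2403 y:368 screen:0 window:60817757")
print(loc.x, loc.window)   # 2403 60817757
```

Unlike the dict, the result is immutable and its fields are read as attributes (loc.x) rather than by key (result['x']).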

How can I access to a dictionary element indexed with a string?

I want to access an element of a dictionary with a string.
For example, I have a dictionary like this:
data = {"masks": {"id": "valore"}}
I have one string, campo="masks,id". I want to split it with campo.split(','), which gives ['masks', 'id'], and use the result to access the element data["masks"]["id"].
This dictionary is an example; my real dictionaries are more complex. The point is that I want to access the element data["masks"]["id"] with the input string "masks,id", the element data["masks"] with the string "masks", the element data["masks"]["id"]["X"] with the input string "masks,id,X", and so on.
How can I do this?
I wouldn't recommend the following method, since a Python dict is not meant to be accessed the way you want. But since Python lets you rebind a name to an object of a different type at your own risk, here is a snippet that gets the work done.
What I do is iterate over the keys and, at each iteration, fetch the child dictionary if it is present, or an empty dictionary otherwise; the .get() method returns the empty dict default if the key is not found.
data = {"masks": {"id": "valore"}}
text = "masks, id"
nested_keys = text.split(", ")
nested_dict = data
for key in nested_keys:
    nested_dict = nested_dict.get(key, {})
if isinstance(nested_dict, str):
    print(nested_dict)
The point is that you are coming up with requirements that do not match the capability of the python-built-in dictionaries.
If you want to have nested maps that do this kind of automated "splitting" of a single key string like "masks, id, X" then ... you will have to implement that yourself.
In other words: the answer is - the built-in dictionary can't do that for you.
So, the "real" thing to do here: step back and carefully look into your requirements to understand exactly what you want to do; and why you want to do that. And going from there look for the best design to support that.
From an implementation side, I think what you "need" would roughly look like:
check if the provided "key" matches "key1,key2,key3"
if so, split that key into its sub-keys
then check if the "outer dictionary" has a value for key1
then check, if the value for key1 is a dictionary
then check if that "inner" dictionary has a value for key2
...
and so on.
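The steps above can be sketched as a small helper. The name get_nested and its sep parameter are hypothetical, and raising KeyError for a missing or non-dict level is one possible design choice, not something the built-in dict provides:

```python
def get_nested(data, path, sep=","):
    """Follow a separator-delimited key path through nested dicts."""
    current = data
    for key in path.split(sep):
        key = key.strip()  # tolerate "masks, id" as well as "masks,id"
        # Each level must be a dict that contains the next sub-key.
        if not isinstance(current, dict) or key not in current:
            raise KeyError(path)
        current = current[key]
    return current

data = {"masks": {"id": "valore"}}
print(get_nested(data, "masks,id"))   # valore
print(get_nested(data, "masks"))      # {'id': 'valore'}
```

With this data, get_nested(data, "masks,id,X") raises KeyError, because "valore" is a string and not a dictionary.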

Python, checksum of a dict

I'm thinking of creating a checksum of a dict to know whether it was modified.
For the moment I have this:
>>> import hashlib
>>> import pickle
>>> d = {'k': 'v', 'k2': 'v2'}
>>> z = pickle.dumps(d)
>>> hashlib.md5(z).hexdigest()
'8521955ed8c63c554744058c9888dc30'
Perhaps a better solution exists?
Note: I want to create a unique id for a dict to use as a good ETag.
EDIT: I can have abstract data in the dict.
Something like this:
from functools import reduce  # needed in Python 3
reduce(lambda x, y: x ^ y, [hash(item) for item in d.items()])
Take the hash of each (key, value) tuple in the dict and XOR them all together.
@katrielalex: If the dict contains unhashable items you could do this:
hash(str(d))
or maybe even better
hash(repr(d))
In Python 3 (3.3 and later), the hashes of str and bytes objects are salted with a random value that differs between interpreter sessions unless PYTHONHASHSEED is set. If that is not acceptable for the intended application, use e.g. zlib.adler32 to build the checksum for a dict:
import zlib
d = {'key1': 'value1', 'key2': 'value2'}
checksum = 0
for item in d.items():
    c1 = 1
    for t in item:
        c1 = zlib.adler32(bytes(repr(t), 'utf-8'), c1)
    checksum = checksum ^ c1
print(checksum)
I would recommend an approach very similar to the one you propose, but with some extra guarantees:
import hashlib, json
hashlib.md5(json.dumps(d, sort_keys=True, ensure_ascii=True).encode('utf-8')).hexdigest()
sort_keys=True: keep the same hash if the order of your keys changes
ensure_ascii=True: in case you have some non-ascii characters, to make sure the representation does not change
We use this for our ETag.
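A quick way to see what sort_keys=True buys is to hash two equal dicts built in different orders. The dict_etag wrapper below is a hypothetical name around the one-liner above, and it assumes every value in the dict is JSON-serializable:

```python
import hashlib
import json

def dict_etag(d):
    """MD5 hex digest of a canonical JSON dump (assumes JSON-serializable values)."""
    payload = json.dumps(d, sort_keys=True, ensure_ascii=True)
    return hashlib.md5(payload.encode("utf-8")).hexdigest()

a = {"k": "v", "k2": "v2"}
b = {"k2": "v2", "k": "v"}   # same content, different insertion order

print(dict_etag(a) == dict_etag(b))            # True
print(dict_etag(a) == dict_etag({"k": "v"}))   # False
```

Values that json cannot serialize (arbitrary objects, sets, and so on) would need a custom default= function or a different serialization.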
I don't know whether pickle guarantees that the dict is serialized the same way every time.
If you only have dictionaries, I would go for a combination of calls to keys() and sorted(), build a string based on the sorted key/value pairs, and compute the checksum on that.
I think you may not realise some of the subtleties that go into this. The first problem is that two equal dicts can list their items in different orders (a dict preserves insertion order since Python 3.7, and before that the order was not defined at all). This means that simply asking for str of a dict doesn't work, because you could have
str(d1) == "{'a': 1, 'b': 2}"
str(d2) == "{'b': 2, 'a': 1}"
and these will hash to different values. If you have only hashable items in the dict, you can hash them and then join up their hashes, as @Bart does, or simply
hash(tuple(sorted(hash(x) for x in d.items())))
Note the sorted, because you have to ensure that the hashed tuple comes out in the same order irrespective of which order the items appear in the dict. If you have dicts in the dict, you could recurse this, but it will be complicated.
BUT it would be easy to break any implementation like this if you allow arbitrary data in the dictionary, since you can simply write an object with a broken __hash__ implementation and use that. And you can't use id, because then you might have equal items which compare different.
The moral of the story is that hashing dicts isn't supported in Python for a reason.
As you said, you want to generate an ETag based on the dictionary content, so an OrderedDict, which preserves the order of the dictionary, may be a better candidate here. Just iterate through the key, value pairs and construct your ETag string.

Using 'in' to match an attribute of Python objects in an array

I don't remember whether I was dreaming or not but I seem to recall there being a function which allowed something like,
foo in iter_attr(array of python objects, attribute name)
I've looked over the docs but this kind of thing doesn't fall under any obvious listed headers
Using a list comprehension would build a temporary list, which could eat all your memory if the sequence being searched is large. Even if the sequence is not large, building the list means iterating over the whole of the sequence before in could start its search.
The temporary list can be avoided by using a generator expression:
foo = 12
foo in (obj.id for obj in bar)
Now, as long as obj.id == 12 near the start of bar, the search will be fast, even if bar is infinitely long.
As @Matt suggested, it's a good idea to use hasattr if any of the objects in bar can be missing an id attribute:
foo = 12
foo in (obj.id for obj in bar if hasattr(obj, 'id'))
Are you looking to get a list of objects that have a certain attribute? If so, a list comprehension is the right way to do this.
result = [obj for obj in listOfObjs if hasattr(obj, 'attributeName')]
You could always write one yourself:
def iterattr(iterator, attributename):
    for obj in iterator:
        yield getattr(obj, attributename)
This will work with anything that iterates, be it a tuple, list, or whatever.
I love Python: it makes stuff like this very simple and no more of a hassle than necessary, and in use stuff like this is hugely elegant.
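Used with `in`, such a generator gives exactly the syntax from the question. A minimal sketch follows; it uses the question's name iter_attr, and the SimpleNamespace objects are stand-ins for whatever objects bar really holds:

```python
from types import SimpleNamespace

def iter_attr(iterable, attribute_name):
    """Lazily yield the named attribute of each object."""
    for obj in iterable:
        yield getattr(obj, attribute_name)

# Stand-in objects with an `id` attribute (an assumption for the example).
bar = [SimpleNamespace(id=n) for n in (3, 12, 7)]

print(12 in iter_attr(bar, "id"))   # True
print(99 in iter_attr(bar, "id"))   # False
```

Because the generator is lazy, the `in` test stops pulling items as soon as it finds a match.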
No, you were not dreaming. Python has a pretty excellent list comprehension system that lets you manipulate lists pretty elegantly, and depending on exactly what you want to accomplish, this can be done a couple of ways. In essence, what you're doing is saying "For item in list if criteria.matches", and from that you can just iterate through the results or dump the results into a new list.
I'm going to crib an example from Dive Into Python here, because it's pretty elegant and they're smarter than I am. Here they're getting a list of files in a directory, then filtering the list for all files that match a regular expression criteria.
import os
import re

files = os.listdir(path)
test = re.compile(r"test\.py$", re.IGNORECASE)
files = [f for f in files if test.search(f)]
You could do this without regular expressions, for your example, for anything where your expression at the end returns true for a match. There are other options like using the filter() function, but if I were going to choose, I'd go with this.
Eric Sipple
The function you are thinking of is probably operator.attrgetter. For example, to get a list that contains the value of each object's "id" attribute:
import operator
ids = list(map(operator.attrgetter("id"), bar))
If you want to check whether the list contains an object with an id == 12, then a neat and efficient (i.e. doesn't iterate the whole list unnecessarily) way to do it is:
any(obj.id == 12 for obj in bar)
If you want to use 'in' with attrgetter, while still retaining lazy iteration of the list:
import operator
foo = 12
# In Python 3, map() is already lazy; in Python 2 use itertools.imap instead.
foo in map(operator.attrgetter("id"), bar)
What I was thinking of can be achieved using list comprehensions, but I thought that there was a function that did this in a slightly neater way.
i.e. 'bar' is a list of objects, all of which have the attribute 'id'
The mythical functional way:
foo = 12
foo in iter_attr(bar, 'id')
The list comprehension way:
foo = 12
foo in [obj.id for obj in bar]
In retrospect the list comprehension way is pretty neat anyway.
If you plan on searching anything of remotely decent size, your best bet is going to be to use a dictionary or a set. Otherwise, you basically have to iterate through every element of the iterator until you get to the one you want.
If this isn't necessarily performance sensitive code, then the list comprehension way should work. But note that it is fairly inefficient because it goes over every element of the iterator and then goes BACK over it again until it finds what it wants.
Remember, python has one of the most efficient hashing algorithms around. Use it to your advantage.
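As a rough sketch of that advice, the ids can be collected once into a set or dict, after which each membership test is a hash lookup instead of a scan. The SimpleNamespace objects and the names ids and by_id are assumptions for illustration:

```python
from types import SimpleNamespace

# Stand-in objects with an `id` attribute.
bar = [SimpleNamespace(id=n) for n in range(1000)]

# Build the lookup structures once (a single pass over bar).
ids = {obj.id for obj in bar}           # membership only
by_id = {obj.id: obj for obj in bar}    # membership plus access to the object

print(12 in ids)       # True
print(5000 in ids)     # False
print(by_id[12].id)    # 12
```

This pays off when the same collection is searched many times; for a single search, the generator expression shown earlier avoids building anything.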
I think:
#!/bin/python
bar in dict(Foo)
is what you are thinking of. When checking whether a certain key exists within a dictionary in Python (Python's version of a hash table) there are two ways to check. The first is the has_key() method attached to the dictionary (Python 2 only; it was removed in Python 3) and the second is the example given above. It will return a boolean value.
That should answer your question.
And now a little off topic to tie this in to the list comprehension answer previously given (for a bit more clarity). List Comprehensions construct a list from a basic for loop with modifiers. As an example (to clarify slightly), a way to use the in dict language construct in a list comprehension:
Say you have a two dimensional dictionary foo and you only want the second dimension dictionaries which contain the key bar. A relatively straightforward way to do so would be to use a list comprehension with a conditional as follows:
#!/bin/python
baz = dict([(key, value) for key, value in foo.items() if bar in value])
Note the if bar in value at the end of the statement: this is a modifying clause which tells the list comprehension to only keep those key-value pairs which meet the conditional. In this case baz is a new dictionary which contains only the dictionaries from foo which contain bar. (Hopefully I didn't miss anything in that code example. You may have to take a look at the list comprehension documentation found in the docs.python.org tutorials and at secnetix.de; both sites are good references if you have questions in the future.)
