python dictionaries and items()


My understanding is that .items() is only available for Python dictionaries.
However, in the following bit of code, which runs perfectly, it appears that .items() is being called on a string. (This code is for the preprocessing stage of doc2vec.)
I have looked at this for a while and I can't figure out why .items() works in this piece of code.
In the code, 'sources' is just an attribute of an instance, yet it is able to call .items().
What am I missing here?
class LabeledLineSentence(object):
    def __init__(self, sources):
        self.sources = sources
        flipped = {}
        # make sure that keys are unique
        for key, value in sources.items():
            if value not in flipped:
                flipped[value] = [key]
            else:
                raise Exception('Non-unique prefix encountered')

The given code only specifies that sources is an attribute of an instance. It doesn't specify its type. In fact, it can be of any type; the type is determined by whatever is passed in when an instance of LabeledLineSentence is created.
i1 = LabeledLineSentence('sample text') # sources is now a string. Raises AttributeError!
i2 = LabeledLineSentence({}) # sources is now a dictionary. No error!
Note that the LabeledLineSentence implementation expects the sources parameter to be a dictionary.
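To make the difference concrete, here is a minimal sketch in Python 3 that reuses the class from the question. The file names and prefixes in the dictionary are hypothetical and only serve as sample data.

```python
class LabeledLineSentence(object):
    def __init__(self, sources):
        self.sources = sources
        flipped = {}
        # make sure that keys are unique
        for key, value in sources.items():
            if value not in flipped:
                flipped[value] = [key]
            else:
                raise Exception('Non-unique prefix encountered')


# A dictionary has an items() method, so construction succeeds.
# (Hypothetical file names and prefixes.)
ok = LabeledLineSentence({'train.txt': 'TRAIN', 'test.txt': 'TEST'})

# A string has no items() method, so the constructor fails.
try:
    LabeledLineSentence('sample text')
    error = None
except AttributeError as exc:
    error = exc

print(type(ok.sources).__name__)
print(type(error).__name__)
```

So if the original code runs without errors, whatever is being passed as sources is probably not a plain string.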

.items() is available for any class with an items method. For instance, I can define
class MyClass:
    def items(self):
        return [1,2,3,4]
and then run
mc = MyClass()
for i in mc.items(): print(i)
Presumably your sources object is of a class that has such a method. But we don't know which class, since it's an argument to the constructor of LabeledLineSentence.
Can you point us to the full source code? Then we might be able to see what is being passed in.

Related

Static methods for recursive functions within a class?

I'm working with nested dictionaries on Python (2.7) obtained from YAML objects and I have a couple of questions that I've been trying to get an answer to by reading, but have not been successful. I'm somewhat new to Python.
One of the simplest functions is one that reads the whole dictionary and outputs a list of all the keys that exist in it. I use an underscore at the beginning since this function is later used by others within a class.
class Myclass(object):
    @staticmethod
    def _get_key_list(d,keylist):
        for key,value in d.iteritems():
            keylist.append(key)
            if isinstance(value,dict):
                Myclass._get_key_list(d.get(key),keylist)
        return list(set(keylist))
    def diff(self,dict2):
        keylist = []
        all_keys1 = self._get_key_list(self.d,keylist)
        all_keys2 = self._get_key_list(dict2,keylist)
        ... # More code
Question 1: Is this a correct way to do this? I am not sure whether it's good practice to use a static method for this purpose. Since self._get_key_list(d,keylist) is recursive, I don't want "self" to be the first argument once the function is recursively called, which is what would happen for a regular instance method.
I have a bunch of static methods that I'm using, but I've read in a lot of places that they may not be good practice when used a lot. I also thought I could make them module functions, but I wanted them to be tied to the class.
Question 2: Instead of passing the argument keylist to self._get_key_list(d,keylist), how can I initialize an empty list inside the recursive function and update it? Initializing it inside would reset it to [] every time.

I would eliminate keylist as an explicit argument:
def _get_keys(d):
    keyset = set()
    for key, value in d.iteritems():
        keyset.add(key)
        if isinstance(value, dict):
            # merge in the keys of the nested dict
            keyset.update(_get_keys(value))
    return keyset
Let the caller convert the set to a list if they really need a list, rather than an iterable.
Often, there is little reason to declare something as a static method rather than a function outside the class.
If you are concerned about efficiency (e.g., getting lots of repeat keys from a dict), you can go back to threading a single set/list through the calls as an explicit argument, but don't make it optional; just require that the initial caller supply the set/list to update. To emphasize that the second argument will be mutated, just return None when the function returns.
def _get_keys(d, result):
    for key, value in d.iteritems():
        result.add(key)
        if isinstance(value, dict):
            _get_keys(value, result)
result = set()
_get_keys(d1, result)
_get_keys(d2, result)
# etc
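Both variants above can be tried out with the following Python 3 sketch (it uses items() instead of the Python 2 iteritems()). The function names get_keys and collect_keys and the sample dictionaries are made up for illustration.

```python
def get_keys(d):
    """Return the set of all keys in d, including keys of nested dicts."""
    keys = set()
    for key, value in d.items():
        keys.add(key)
        if isinstance(value, dict):
            keys.update(get_keys(value))
    return keys


def collect_keys(d, result):
    """Add all keys of d (and of nested dicts) to result; returns None."""
    for key, value in d.items():
        result.add(key)
        if isinstance(value, dict):
            collect_keys(value, result)


# Hypothetical sample data.
config = {'a': 1, 'b': {'c': 2, 'd': {'e': 3}}}
other = {'a': 0, 'f': {'g': 1}}

# Variant 1: each call builds and returns its own set.
print(sorted(get_keys(config)))

# Variant 2: one set supplied by the caller is updated by every call.
shared = set()
collect_keys(config, shared)
collect_keys(other, shared)
print(sorted(shared))
```

The first variant is easier to reason about; the second avoids building intermediate sets when many dictionaries feed into the same result.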

There's no good reason to make a recursive function in a class a static method unless it is meant to be invoked outside the context of an instance.
To initialize a parameter, we usually assign it a default value in the parameter list. But when the default needs to be a mutable object, such as the empty list in this case, you need to default it to None and then initialize it inside the function, so that the same list object won't be reused in the next call:
class Myclass(object):
    def _get_key_list(self, d, keylist=None):
        if keylist is None:
            keylist = []
        for key, value in d.iteritems():
            keylist.append(key)
            if isinstance(value, dict):
                self._get_key_list(d.get(key), keylist)
        return list(set(keylist))
    def diff(self, dict2):
        all_keys1 = self._get_key_list(self.d)
        all_keys2 = self._get_key_list(dict2)
        ... # More code
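The reason for the None default can be seen in a small Python 3 sketch. The two function names below are hypothetical; the only difference between them is how the default list is created.

```python
def append_shared(item, bucket=[]):
    # The default list is created once, when the function is defined,
    # and the same object is reused by every call that omits bucket.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # A new list is created on each call that omits bucket.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = append_shared('a')
second = append_shared('b')
print(first is second)   # both names refer to the one shared default list
print(second)

print(append_fresh('a'))
print(append_fresh('b'))
```

With the mutable default, the second call still sees the item added by the first call, which is exactly the stale state the None idiom avoids.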

Why does Python look for members rather than the requested field?

I have a class that defines its own __getattr__() in order to interact with an XML tree that its instances contain. This hides the XML structure from the user and allows them to set tag values, etc. as if they were normal fields on the object. It works fine for all fields except one: the one named field. Here's how it looks:
>>> q = MyQuery()
>>> q.file = "database"
>>> print(q)
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<requestCollection xmlns="http://dwd.de/sky">
<read>
<select>
<referenceDate>
<value></value>
</referenceDate>
</select>
<transfer>
<file name="my file"/>
</transfer>
</read>
</requestCollection>
>>> q.file
That works fine, and the side effects that should happen do so. But if I try to read the field named field, I get a string that the method shouldn't be returning. For clarity, this is a simplified version of my __getattr__:
def __getattr__(self, key):
    logging.info("Looking up value for key {}.".format(key))
    if key == "something":
        return self.method_with_side_effect(key)
    if key in field_list:
        logging.info("Key is in the field list.")
        return self.other_method_with_side_effects(key)
ensemble_member and field are both in field_list. Check this out:
>>> q = MyQuery()
>>> q.ensemble_member
Looking up value for key __members__.
Looking up value for key __methods__.
Looking up value for key ensemble_member.
Key is in the field list.
... Side effects ...
>>> q.field
'station'
Looking up value for key __members__.
Looking up value for key __methods__.
The behavior for ensemble_member is correct, for field it's totally incorrect. Why is that?
I have no methods nor class / object members named field.
Another interesting thing is, if I put this on the first line of __getattr__:
def __getattr__(self, key):
    if key == "field":
        raise ValueError
The following still happens:
>>> q = MyQuery()
>>> q.field
'station'
Looking up value for key __members__.
Looking up value for key __methods__.
What's going on?

I've got it - the offending code was, in the end, these lines:
class SkyQuery(object):
    _unique_fields = ["parameter",
                      "ensemble",
                      "forecast",
                      "station"]
    _field_tag_values = [field + "_value" for field in _unique_fields]
Naming the temporary variable "field" in my list comprehension was causing the problem. In Python 2, the loop variable of a list comprehension leaks into the enclosing scope, and because the comprehension sits in the class body, field ended up as a class attribute holding the last value, "station". Normal attribute lookup finds that attribute, so __getattr__ is never called for it. This behavior is consistent, I just wasn't expecting it.
I see three solutions here (the third was suggested by user4815162342). I implemented the third.
1. Rename the temporary variable to x rather than field. Then I still have x as a temporary variable floating around in my code, but because no members should be called x it doesn't bother me.
2. Call del(field) to delete the variable. I don't like calling del() and I thought it would clutter up my code, but it would work if I really needed to be able to access an attribute with that name later on.
3. Replace the list comprehension with a generator expression, list(field + "_value" for field in _unique_fields), which does not share this problem.
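The effect can be reproduced with a short sketch. It is written for Python 3, where list comprehensions no longer leak their loop variable, so a plain for loop in the class body stands in for the Python 2 comprehension. The class names Leaky and Clean and the simplified __getattr__ are assumptions made for the demonstration.

```python
class Leaky(object):
    _unique_fields = ["parameter", "ensemble", "forecast", "station"]
    _field_tag_values = []
    # A for loop in a class body leaves its loop variable behind as a
    # class attribute, just like a Python 2 list comprehension did.
    for field in _unique_fields:
        _field_tag_values.append(field + "_value")

    def __getattr__(self, key):
        # Only called when normal attribute lookup fails.
        return "looked up " + key


class Clean(object):
    _unique_fields = ["parameter", "ensemble", "forecast", "station"]
    # In Python 3 the comprehension has its own scope, so nothing leaks.
    _field_tag_values = [field + "_value" for field in _unique_fields]


q = Leaky()
print(q.field)               # found as a class attribute, __getattr__ is skipped
print(q.ensemble)            # not found normally, so __getattr__ handles it
print('field' in vars(Clean))
```

Because __getattr__ is only a fallback, any leftover class attribute with the same name silently takes precedence over it.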

PYTHON - Unhashable List

I am trying to display some information in a GUI list-box. I have written a test method in a model only portion of my MVC which outputs the information I want; however, when I transfer that code to my full GUI, it throws me an error.
Here are the two pieces of code:
Model: (note that this method is written for a class Products())
def test(self):
    for key in self._items_list:
        print self.get_item(key) # this refers to the get_item function of the Products class:
def get_item(self, key):
    return self._items_list[key] # items_list is a dictionary
So, this returns the output I would like to put in my list-box.
Here is how I transfer the code to my GUI (this is in a class I defined which inherits from Listbox):
def refreshData(self):
    for keys in self._productslist: # this productslist is equivalent to items_list
        disp = self._products.get_item(keys) # so I can call the method from the Products class
        self.insert(END, disp)
This throws me the following error when I try to open and display the file:
...in get_item
return self._items_list[key]
TypeError: unhashable type: 'list'
Sorry, this is long and probably very confusing, but essentially I want to know why I get the error for the method in the full version of the code and not in the isolated model.
All the relevant code is identical as far as I know.
Any ideas would be greatly appreciated!

You can't hash lists; dictionary keys must be hashable, and the built-in mutable containers such as lists are not. Although you could define a __hash__ method on a subclass of list, the reasoning behind the rule is that when you look something up in a dictionary, you expect the entries' keys not to change after they were inserted. The error means that one of the keys you pass to get_item is a list. As another answer said, use a tuple instead.

Use tuples instead:
http://wiki.python.org/moin/DictionaryKeys
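A minimal Python 3 sketch of the error and of the tuple workaround follows. The dictionary contents are hypothetical sample data, not the asker's product list.

```python
# Hypothetical product dictionary keyed by (name, id) tuples.
items = {('apple', 1): 'Apple, 1 kg'}

list_key = ['apple', 1]

# Looking up a list fails before the dictionary is even searched,
# because the key has to be hashed first.
try:
    items[list_key]
    message = None
except TypeError as exc:
    message = str(exc)
print(message)

# Converting the list to a tuple gives a hashable key.
found = items[tuple(list_key)]
print(found)
```

If the keys in the GUI version come from a different container than in the model-only test, printing type(keys) inside the loop is a quick way to confirm which lookup receives a list.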

Python - Dictionary - Modify getitem?

OK, so I've built my own variable handler which has a __getitem__ function for use when accessing data via data[key]. It works great except when trying to access a chain of items:
data["key"]["subkey"]
def __getitem__(self, key, **args):
    print key
    ...
    return self.dict[key]
When trying to access a subkey that doesn't exist, Python simply raises a KeyError without printing "subkey". Why is this, and how can I get Python to print out what I'm actually trying to get?
I know that I've probably misunderstood the mechanics but is there a way to emulate a dictionary AND follow the string of data that's being requested?
Mainly so I can dynamically log the missing variables in a dictionary flow...
This obviously works (but it's not the native syntax that I like):
data["key:subkey"]
def __getitem__(self, key, **args):
    for slice in key.split(':'):
        print key
        ...
The goal is to emulate the following,
Works:
data = {'key' : {'subkey' : 1}}
print data["key"]["subkey"]
Will not work, but I want to catch the exception within __getitem__ and then create the missing key automatically or just log the missing subkey:
data = {'key' : {}}
print data["key"]["subkey"]
Solution:
class Var():
    def __init__(self, d=None):
        # wrap the given dict, or start with the default one
        self.dict = {'test' : {}} if d is None else d
    def __getitem__(self, var, **args):
        print ':',var
        if var in self.dict:
            v = Var(self.dict[var])
            return v
vHandle = Var()
print vHandle['test']['down']
Output:
: test
: down
None

The fact is that when Python encounters an expression such as data["key"]["subkey"], what is done internally is (data["key"])["subkey"]. That is, the first part of the expression is resolved: the retrieval of the item "key" from the object "data". Then, Python tries to call __getitem__ on the resulting object of that expression.
If that resulting object does not have a __getitem__ method of its own, or is a plain dict that lacks the key, there is your error; your own __getitem__ is never involved in the second lookup.
There are two possible workarounds: you should either work with "tuple indexes", like data["key", "subkey"] (and then test in your __getitem__ method whether you got a tuple instance as the key), or make __getitem__ return a specialized object that also features a __getitem__ method, even if all it does is log the requested keys.

Remember: tmp = foo['bar']['baz'] is the same as tmp = foo['bar']; tmp = tmp['baz']
So to allow arbitrary depths your __getitem__ method must return a new object that also contains such a __getitem__ method.
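One way to sketch the "return an object that also has __getitem__" approach in Python 3 is shown below. The class name LoggingDict, its attributes, and the choice to create missing keys as empty dicts are all assumptions made for illustration, not the asker's actual handler.

```python
class LoggingDict(object):
    """Wraps a nested dict and records every missing key path."""

    def __init__(self, data=None, path=(), missing=None):
        self._data = {} if data is None else data
        self._path = path
        # All wrappers created from one root share the same log list.
        self.missing = [] if missing is None else missing

    def __getitem__(self, key):
        path = self._path + (key,)
        if key not in self._data:
            self.missing.append(path)   # log the missing key path
            self._data[key] = {}        # and create it automatically
        value = self._data[key]
        if isinstance(value, dict):
            # Return a wrapper so the next [...] is intercepted as well.
            return LoggingDict(value, path, self.missing)
        return value


data = LoggingDict({'key': {'subkey': 1}})
value = data['key']['subkey']
data['key']['other']['deep']
print(value)
print(data.missing)
```

Each level of indexing produces a new wrapper that remembers how it was reached, which is what lets the full path of a missing subkey be logged.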

what's the right way to put *arg in a tuple that can be sorted?

I want a dict or tuple I can sort based on attributes of the objects I'm using as arguments for *arg. The way I've been trying to do it just gives me AttributeErrors, which leads me to believe I'm doing it weird.
def function(*arg):
    items = {}
    for thing in arg:
        items.update({thing.name:thing})
    while True:
        for thing in items:
            ## lots of other code here, basically just a game loop.
            ## Problem is that the 'turn order' is based on whatever
            ## Python decides the order of arguments is inside "items".
            ## I'd like to be able to sort the dict based on each object's
            ## attributes (ie, highest 'thing.speed' goes first in the while loop)
The problem is when I try to sort "items" based on an attribute of the objects I put into function(), it gives me "AttributeError: 'str' object has no attribute 'attribute'". Which leads me to believe I'm either unpacking *arg in a lousy way, or I'm trying to do something the wrong way.
while True:
    for thing in sorted(items, key=attrgetter('attribute')):
...doesn't work either, keeps telling me I'm trying to manipulate a 'str' object. What am I not doing here?

arg is already a tuple, which you can sort by an attribute of each item:
from operator import attrgetter
def function(*args):
    for thing in sorted(args, key=attrgetter('attribute')):
When you iterate over a dict, as sorted is doing, you just get the keys, not the values. So, if you want to use a dict, you need to do:
def function(*args):
    # or use a dict comprehension on 2.7+
    items = dict((thing.name, thing) for thing in args)
    # or just items.values on 3+
    for thing in sorted(items.itervalues(), key=attrgetter('attribute')):
to actually sort the args by an attribute. If you want the keys of the dict available as well (not necessary here because the key is also an attribute of the item), use something like:
for name, thing in sorted(items.iteritems(), key=lambda item: item[1].attribute):
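Put together as a runnable Python 3 sketch, with a hypothetical Unit class and made-up speed values standing in for the asker's game objects:

```python
from operator import attrgetter


class Unit(object):
    """Hypothetical game object with a name and a speed."""

    def __init__(self, name, speed):
        self.name = name
        self.speed = speed


def turn_order(*args):
    # args is already a tuple of objects, so it can be sorted directly.
    ordered = sorted(args, key=attrgetter('speed'), reverse=True)
    return [thing.name for thing in ordered]


def turn_order_from_dict(*args):
    # If a dict is needed for lookups, sort its values, not the dict itself:
    # iterating over the dict would yield only the string keys.
    items = {thing.name: thing for thing in args}
    ordered = sorted(items.values(), key=attrgetter('speed'), reverse=True)
    return [thing.name for thing in ordered]


units = (Unit('knight', 3), Unit('scout', 9), Unit('mage', 5))
order = turn_order(*units)
print(order)
print(turn_order_from_dict(*units))
```

Here reverse=True puts the highest speed first, matching the "highest thing.speed goes first" requirement from the question.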

Your items is a dict, and you can't sort a dict in place. When you use it as an iterable, it silently yields its keys, which here are strings; that is why sorted complains about a 'str' object. You also never use arg again after creating the dict.
If you don't need dict lookup, since you just iterate through it, you can replace the dict with a list of 2-tuples (thing.name, thing), sort it by any attribute and iterate through it. You can also use collections.OrderedDict from Python 2.7 (it exists as a separate ordereddict package for earlier versions) if you really want both dict lookup and ordering.

{edit} Thanks to agf, I now understand the problem. What I wrote below is a valid answer in itself, but not to the question above. I'm leaving it here for the record.
Looking at the other answers, I may not have understood the question. But here's my understanding: args is a tuple of the arguments you give to your function, and nothing guarantees that each of them is an object with a name attribute. Judging by the errors you report, you're passing string arguments.
Maybe some illustration will help my description:
>>> # defining a function using name attribute
>>> def f(*args):
...     for arg in args:
...         print arg.name
>>> # defining an object with a name attribute
>>> class o(object):
...     def __init__(self, name):
...         self.name = name
>>> # now applying the function on the previous object, and on a string
>>> f( o('arg 1'), 'arg 2' )
arg 1
Traceback (most recent call last):
  File "<pyshell#9>", line 1, in <module>
    f(o('arg 1'), 'ets')
  File "<pyshell#3>", line 3, in f
    print arg.name
AttributeError: 'str' object has no attribute 'name'
This is failing as strings have no such attribute.
As I see it, the mistake in your code is that you use the name attribute on your inputs without ever verifying that they have such an attribute. Maybe you should test with hasattr first:
>>> if hasattr(arg, 'name'):
...     print arg.name
... else:
...     print arg
or inspect the input to verify that it is an instance of a given class known to have the requested attribute.
