Mixing List and Dict in Python

I was wondering how to mix a list and a dict together in Python. I know that in PHP I can do something like this:
$options = array(
"option1",
"option2",
"option3" => array("meta1", "meta2", "meta3"),
"option4"
);
The problem is that Python has a different bracket for each container type: () for tuple, [] for list, and {} for dict. There doesn't seem to be any way to mix them, and I keep getting syntax errors.
I am using Python 2.7 now. Please advise how to do it correctly.
Much thanks,
Rufas
Update 1:
I'll elaborate slightly on what I'm trying to do. I am trying to write a simple Python script to do some API requests here:
http://www.diffbot.com/products/automatic/article/
The relevant part is the fields query parameter. It is something like ...&fields=meta,querystring,images(url,caption)... . So the above array can be written as (in PHP)
$fields = array(
'meta',
'querystring',
'images' => array('url', 'caption')
);
The $fields array is then passed to a method for processing, and the result is returned like this:
$json = $diffbot->get("article", $url, $fields);
The thing is, I have no problem writing it in PHP, but when I try to write it in Python it is not as easy as it seems...

You can do it this way:
options = {
"option1": None,
"option2": None,
"option3": ["meta1", "meta2", "meta3"],
"option4": None,
}
But options is a dictionary in this case.
If you need the order in the dictionary you can use OrderedDict.
How can you use OrderedDict?
from collections import OrderedDict
options = OrderedDict([
("option1", None),
("option2", None),
("option3", ["meta1", "meta2", "meta3"]),
("option4", None),
])
print options["option3"]
print options.items()[2][1]
print options.items()[3][1]
Output:
['meta1', 'meta2', 'meta3']
['meta1', 'meta2', 'meta3']
None
Here you can access options either using keys (like option3), or indexes (like 2 and 3).
Disclaimer. I must stress that this solution is not a one-to-one mapping between PHP and Python. PHP is another language, with other data structures and other semantics, so you can't map its data structures one-to-one onto Python's. Please also consider the answer of Hyperboreus (I gave +1 to him). It shows another way to mix lists and dictionaries in Python. Please also read our discussion below.
Update1.
How can you process such structures?
You must check the type of each value.
If it is a list (type(v) == type([])) you can join it;
otherwise you can use it as it is.
Here I convert the structure to a URL-like string:
options = {
    "option1": None,
    "option2": None,
    "option3": ["meta1", "meta2", "meta3"],
    "option4": "str1",
}
res = []
for (k,v) in options.items():
    # skip options that carry no value
    if v is None:
        continue
    if type(v) == type([]):
        # a list of values is joined with "+"
        res.append("%s=%s" % (k,"+".join(v)))
    else:
        res.append("%s=%s" % (k,v))
print "&".join(res)
Output:
option4=str1&option3=meta1+meta2+meta3
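For readers on Python 3, the same idea can be sketched as follows. This is only a minimal variant of the snippet above, assuming Python 3.7+ (where a plain dict keeps insertion order) and using isinstance instead of comparing types; the variable name query is introduced here for illustration.

```python
# Assumes Python 3.7+, where plain dicts preserve insertion order.
options = {
    "option1": None,
    "option2": None,
    "option3": ["meta1", "meta2", "meta3"],
    "option4": "str1",
}

res = []
for k, v in options.items():
    if v is None:
        # Options without a value are left out of the string.
        continue
    if isinstance(v, list):
        # A list of values is joined with "+".
        res.append("%s=%s" % (k, "+".join(v)))
    else:
        res.append("%s=%s" % (k, v))

query = "&".join(res)
print(query)  # option3=meta1+meta2+meta3&option4=str1
```

Because the order is now the insertion order, option3 comes before option4, unlike the Python 2 output above.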

This seems to do the same thing:
options = {0: 'option1',
           1: 'option2',
           2: 'option4',
           'option3': ['meta1', 'meta2', 'meta3'] }
More generally:
[] denote lists, i.e. ordered collections: [1, 2, 3] or [x ** 2 for x in [1, 2, 3]]
{} denote sets, i.e. unordered collections of unique (hashable) elements, and dictionaries, i.e. mappings between unique (hashable) keys and values: {1, 2, 3}, {'a': 1, 'b': 2}, {x: x ** 2 for x in [1, 2, 3]}
() denote (among other things) tuples, i.e. immutable ordered collections: (1, 2, 3)
() also denote generator expressions: (x ** 2 for x in (1, 2, 3))
You can mix them any way you like (as long as elements of a set and keys of a dictionary are hashable):
>>> a = {(1,2): [2,2], 2: {1: 2}}
>>> a
{(1, 2): [2, 2], 2: {1: 2}}
>>> a[1,2]
[2, 2]
>>> a[1,2][0]
2
>>> a[2]
{1: 2}
>>> a[2][1]
2

I'm pretty sure there were 3 answers to my question, and while one of them received a -1 vote, it was the closest to what I want. It is very strange that it is gone now that I want to pick it as the "accepted answer" :(
To recap, the removed answer suggested I should do this:
options = [
"option1",
"option2",
{"option3":["meta1", "meta2", "meta3"]},
"option4"
]
And that fits nicely with how I want to process each item in the list. I just loop through all the values and check the type of each one. If it is a string, I process it as normal. If it is a dict or list, it is handled differently.
Ultimately, I managed to make it work and I get what I want.
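The loop described above might look roughly like this. It is only a sketch of the type-checking idea, not the code actually used: the function name build_fields is hypothetical, and the output format is taken from the fields example in the question (meta,querystring,images(url,caption)).

```python
def build_fields(fields):
    """Render a mixed list of strings and one-key dicts as a fields string.

    build_fields is a hypothetical helper written for this illustration.
    """
    parts = []
    for item in fields:
        if isinstance(item, dict):
            # A dict entry maps a field name to its list of sub-fields.
            for name, subfields in item.items():
                parts.append("%s(%s)" % (name, ",".join(subfields)))
        else:
            # A plain string is used as it is.
            parts.append(item)
    return ",".join(parts)


fields = ["meta", "querystring", {"images": ["url", "caption"]}]
print(build_fields(fields))  # meta,querystring,images(url,caption)
```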
Special thanks to Igor Chubin and Hyperboreus for providing suggestions and ideas for me to test and discover the answer I've been looking for. Greatly appreciated.
Thank you!
Rufas

Related

Compare value in dict with other values

I'd like to compare all entries in a dict with all other entries – if the values are within a close enough range, I want to merge them under a single key and delete the other key. But I cannot figure out how to iterate through the dict without errors.
An example version of my code (not the real set of values, but you get the idea):
things = { 'a': 1, 'b': 2, 'c': 22 }
for me in things.iteritems():
    for other in things.iteritems():
        if me == other:
            continue
        # me and other are (key, value) tuples, so compare the values
        if abs(me[1] - other[1]) < 5:
            print 'merge!', me, other
            # merge the two into 'a'
            # delete 'b'
I'd hope to then get:
>> { 'a': [ 1, 2 ], 'c': 22 }
But if I run this code, I get the first two that I want to merge:
>> merge! ('a', 1) ('b', 2)
Then the same one in reverse (which I want to have merged already):
>> merge! ('b', 2) ('a', 1)
If I use del things['b'] I get an error that I'm trying to modify the dict while iterating. I see lots of "how to remove items from a dict" questions, and lots about comparing two separate dicts, but not this particular problem (as far as I can tell).
EDIT
Per feedback in the comments, I realized my example is a little misleading. I want to merge two items if their values are similar enough.
So, to do this in linear time (but requiring extra space) use an intermediate dict to group the keys by value:
>>> things = { 'fruit': 'tomato', 'vegetable': 'tomato', 'grain': 'wheat' }
>>> from collections import defaultdict
>>> grouper = defaultdict(list)
>>> for k, v in things.iteritems():
...     grouper[v].append(k)
...
>>> grouper
defaultdict(<type 'list'>, {'tomato': ['vegetable', 'fruit'], 'wheat': ['grain']})
Then, you simply take the first item from your list of values (that used to be keys), as the new key:
>>> {v[0]:k for k, v in grouper.iteritems()}
{'vegetable': 'tomato', 'grain': 'wheat'}
Note, dictionaries are inherently unordered, so if order is important, you should have been using an OrderedDict from the beginning.
Note that your result will depend on the direction of the traversal. Since you are bucketing data depending on distance (in the metric sense), either the right neighbor or the left neighbor can claim the data point.
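For numeric values that only need to be "close enough", one possible approach is to sort the items by value and walk them once, so nothing is deleted while iterating. The sketch below is an assumption about what the asker wants, not the answer's method: the name merge_close and the threshold of 5 are illustrative, every value is wrapped in a list, and each value is compared with the previous one in sorted order, so a chain of close values ends up in one group (this is the traversal-direction effect mentioned above).

```python
def merge_close(things, threshold=5):
    """Group values lying within `threshold` of the previous sorted value.

    merge_close is a hypothetical helper; a new dict is built instead of
    modifying `things` while iterating over it.
    """
    merged = {}
    current_key = None
    previous_value = None
    # Sorting by value puts neighbours next to each other.
    for key, value in sorted(things.items(), key=lambda kv: kv[1]):
        if previous_value is not None and abs(value - previous_value) < threshold:
            # Close to the previous value: fold it into the current group.
            merged[current_key].append(value)
        else:
            # Too far away: this key starts a new group.
            current_key = key
            merged[current_key] = [value]
        previous_value = value
    return merged


things = {'a': 1, 'b': 2, 'c': 22}
print(merge_close(things))  # {'a': [1, 2], 'c': [22]}
```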

TypeError 'set' object does not support item assignment

I need to know why it won't let me increase the value of the assignment by 1:
keywords = {'states' : 0, 'observations' : 1, 'transition_probability' : 2, 'emission_probability' : 3}
keylines = {-1,-1,-1,-1}
lines = file.readlines()
for i in range(0,len(lines)):
    line = lines[i].rstrip()
    if line in keywords.keys():
        keylines[keywords[line]] = i + 1  # <-- this is where it is giving me the error
I ran it as a class and it worked fine, but now as an in-line code piece it gives me this error.
I'm surprised at the lack of answers here, so adding one that hits on one of the more common ways this manifests here in case anyone else hits this from Search and needs to know.
It's unclear if the OP is trying to use it as a set or a second dictionary, but in my case I encountered the error due to a common pitfall in Python.
It hinges on two things:
In Python, Dictionaries and sets both use { }. (as of python 3.10.2)
You cannot create a dictionary with only keys and no values. Due to this, if you just do a_dict = {1, 2, 3} you don't make a dict, you make a set, as shown in this example:
In [1]: test_dict = {1, 2, 3}
...: test_set = {1, 2, 3}
...:
...: print(f"{type(test_dict)=} {type(test_set)=}")
type(test_dict)=<class 'set'>
type(test_set)=<class 'set'>
If you want to declare a dictionary with just its keys, you still have to supply a value for every key:value pair. You can use None as a placeholder:
In [4]: test_dict = {1: None, 2: None, 3: None}
...: test_set = {1, 2, 3}
...:
...: print(f"{type(test_dict)=} {type(test_set)=}")
type(test_dict)=<class 'dict'>
type(test_set)=<class 'set'>
Or use dict.fromkeys([1, 2, 3]), which also sets every value to None.
You're using a set, but you want a list, which is created with square brackets:
keylines = [-1,-1,-1,-1]
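Put together, the corrected loop could look like the sketch below (Python 3). The io.StringIO object is only a stand-in for the asker's file, and its contents are made up for the illustration.

```python
import io

keywords = {'states': 0, 'observations': 1,
            'transition_probability': 2, 'emission_probability': 3}
# A list supports item assignment; the set literal {-1, -1, -1, -1} does not.
keylines = [-1] * len(keywords)

# Hypothetical stand-in for the input file.
sample = io.StringIO("states\nfoo\nobservations\n")

for i, line in enumerate(sample):
    line = line.rstrip()
    if line in keywords:
        # Store the 1-based line number where the keyword was found.
        keylines[keywords[line]] = i + 1

print(keylines)  # [1, 3, -1, -1]
```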

In Python how to obtain a partial view of a dict?

Is it possible to get a partial view of a dict in Python, analogous to pandas df.tail()/df.head()? Say you have a very long dict and you just want to check some of its elements (the beginning, the end, etc.). Something like:
dict.head(3) # To see the first 3 elements of the dictionary.
{[1,2], [2, 3], [3, 4]}
Thanks
Kinda strange desire, but you can get that by using this
from itertools import islice
# Python 2.x
dict(islice(mydict.iteritems(), 0, 2))
# Python 3.x
dict(islice(mydict.items(), 0, 2))
or for short dictionaries
# Python 2.x
dict(mydict.items()[0:2])
# Python 3.x
dict(list(mydict.items())[0:2])
Edit:
in Python 3.x:
Without using libraries it's possible to do it this way. Use the method:
.items()
which returns a view of the dictionary's (key, value) pairs.
The view must be converted to a list first; otherwise slicing raises TypeError: 'dict_items' object is not subscriptable. Slice the list with square brackets, then convert the result back to a dictionary.
dict(list(my_dict.items())[:3])
import itertools
def glance(d):
    # Python 2: take the first three (key, value) pairs
    return dict(itertools.islice(d.iteritems(), 3))
>>> x = {1:2, 3:4, 5:6, 7:8, 9:10, 11:12}
>>> glance(x)
{1: 2, 3: 4, 5: 6}
However:
>>> x['a'] = 2
>>> glance(x)
{1: 2, 3: 4, u'a': 2}
Notice that inserting a new element changed what the "first" three elements were in an unpredictable way. This is what people mean when they tell you dicts aren't ordered. You can get three elements if you want, but you can't know which three they'll be.
I know this question is 3 years old, but here is a pythonic version (maybe simpler than the above methods) for Python 3.*:
[print(v) for i, v in enumerate(my_dict.items()) if i < n]
It will print the first n elements of the dictionary my_dict
One-upping @Neb's solution with a Python 3 dict comprehension:
{k: v for i, (k, v) in enumerate(my_dict.items()) if i < n}
It returns a dict rather than printouts
For those who would rather solve this problem with pandas dataframes: just stuff your dictionary mydict into a dataframe, rotate it, and get the first few rows:
pd.DataFrame(mydict, index=[0]).T.head()
0 hi0
1 hi1
2 hi2
3 hi3
4 hi4
From the documentation:
CPython implementation detail: Keys and values are listed in an
arbitrary order which is non-random, varies across Python
implementations, and depends on the dictionary’s history of insertions
and deletions.
I've only toyed around at best with other Python implementations (eg PyPy, IronPython, etc), so I don't know for certain if this is the case in all Python implementations, but the general idea of a dict/hashmap/hash/etc is that the keys are unordered.
That being said, you can use an OrderedDict from the collections library. OrderedDicts remember the order of the keys as you entered them.
If the keys are sortable, you can do this:
head = dict([(key, myDict[key]) for key in sorted(myDict.keys())[:3]])
Or perhaps:
head = dict(sorted(mydict.items(), key=lambda x: x[0])[:3])
Where x[0] is the key of each key/value pair.
list(reverse_word_index.items())[:10]
Change the number from 10 to however many items of the dictionary reverse_word_index you want to preview
A quick and short solution can be this:
import pandas as pd
d = {"a": [1,2], "b": [2, 3], "c": [3, 4]}
pd.Series(d).head()
a [1, 2]
b [2, 3]
c [3, 4]
dtype: object
This gives back a dictionary:
dict(list(my_dictname.items())[0:n])
If you just want to have a glance of your dict, then just do:
list(freqs.items())[0:n]
Order of items in a dictionary is preserved in Python 3.7+, so this question makes sense.
To get a dictionary with only 10 items from the start you can use pandas:
d = {"a": [1,2], "b": [2, 3], "c": [3, 4]}
import pandas as pd
result = pd.Series(d).head(10).to_dict()
print(result)
This will produce a new dictionary.
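Since insertion order is preserved in Python 3.7+, head and tail helpers can also be sketched without pandas. The function names head and tail below are hypothetical, chosen only to mirror the pandas methods mentioned in the question.

```python
from itertools import islice


def head(d, n=5):
    """Return a new dict with the first n items (insertion order, Python 3.7+)."""
    return dict(islice(d.items(), n))


def tail(d, n=5):
    """Return a new dict with the last n items (insertion order, Python 3.7+)."""
    items = list(d.items())
    # max() guards against n being larger than the dict.
    return dict(items[max(len(items) - n, 0):])


d = {"a": 1, "b": 2, "c": 3, "d": 4}
print(head(d, 2))  # {'a': 1, 'b': 2}
print(tail(d, 2))  # {'c': 3, 'd': 4}
```

head stops after n items without copying the whole dict, while tail builds a full list of items first, which may matter for very large dictionaries.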
d = {"a": 1,"b": 2,"c": 3}
for k, v in list(d.items())[:2]:
    print('{}:{}'.format(k, v))
a:1
b:2

Using lists and dictionaries to store temporary information

I will have a lot of similar objects with similar parameters. An example of an object's parameters would be something like:
name, boolean, number and list.
The name must be unique among all the objects, while the values of the boolean, number and list parameters need not be.
I could store the data as a list of dictionaries, I guess. Like this:
list = [
    {'name':'a', 'bool':True, 'number':123, 'list':[1, 2, 3]},
    {'name':'b', 'bool':False, 'number':143, 'list':[1, 3, 5]},
    {'name':'c', 'bool':False, 'number':123, 'list':[1, 4, 5, 18]},
]
What would be the fastest way to check whether a name already exists in the list of dictionaries before I create another dictionary in that list? Do I have to loop through the list and check the value of list[i]['name']? What would be the fastest and least memory-consuming way to hold and process that information, assuming that different similar lists might be processed simultaneously in different threads/tasks and that their size could be anywhere between 100 and 100,000 dictionaries per list? Should I store those lists in a database instead of in memory?
I understand that perhaps I should not be thinking about optimizing (storing the info and threads) before the project is working, so please answer the unique name lookup question first :)
Thanks,
Alan
If the name is the actual (unique) identifier of each inner data, you could just use a dictionary for the outer data as well:
data = {
    'a' : { 'bool':True, 'number':123, 'list':[1, 2, 3] },
    'b' : { 'bool':False, 'number':143, 'list':[1, 3, 5] },
    'c' : { 'bool':False, 'number':123, 'list':[1, 4, 5, 18] },
}
Then you could easily check if the key exists or not.
By the way, don't name your variables list or dict, as that shadows the built-in names.
Once you come around to using a dict instead of a list, the fastest way to perform the check that you want is:
if 'newkey' not in items:
    # create a new record
Since you want to be able to access these records from multiple threads, I would keep a collection of locks. BTW, this is the sort of thing that you design in the beginning, as it's part of the application design and not an optimization.
import threading
class DictLock(dict):
    def __init__(self):
        super(DictLock, self).__init__()
        self._lock = threading.Lock()
    def __getitem__(self, key):
        # lock to prevent two threads trying to create the same
        # entry at the same time. Then they would get different locks and
        # both think that they could access the key guarded by that lock
        with self._lock:
            # "in self" is a hash lookup and does not call __getitem__
            if key not in self:
                self[key] = threading.Lock()
            return super(DictLock, self).__getitem__(key)
Now if you want to modify your items, you can use the locks to keep it safe.
locks = DictLock()
with locks['a']:
    # modify a.
    pass
Or, to insert a new element:
with locks['z']:
    # we are now the only ones (playing by the rules) accessing the 'z' key
    items['z'] = create_new_item()
What you want is an "intrusive" dictionary - something that looks for keys inside values. Unfortunately, I don't know of any implementation in Python. Boost's multi_index comes close.
If you don't want to change the data structure you have, then you can use the following. Otherwise, poke's answer is the way to go.
>>> my_list = [
... {'name':'a', 'bool':True, 'number':123, 'list':[1, 2, 3]},
... {'name':'b', 'bool':False, 'number':143, 'list':[1, 3, 5]},
... {'name':'c', 'bool':False, 'number':123, 'list':[1, 4, 5, 18]},
... ]
>>> def is_present(data, name):
...     return any(name == d["name"] for d in data)
...
>>> is_present(my_list, "a")
True
>>> is_present(my_list, "b")
True
>>> is_present(my_list, "c")
True
>>> is_present(my_list, "d")
False
If you pass any() an iterable, it returns True if at least one of its elements is truthy.
(name == d["name"] for d in data) creates a generator. Each time somebody (in this case, any) requests the next element, it does so by getting the next element, d, from data and transforms it by the expression name == d["name"]. Since generators are lazy i.e. the transformation is done when the next element is requested, this should use relatively little memory (and should use the same amount of memory regardless of the size of the list).
Store the objects in a dictionary with the name as the key:
objects = {'a' : {'bool':True, 'number':123, 'list':[1, 2, 3]},
           'b' : {'bool':False, 'number':143, 'list':[1, 3, 5]},
           'c' : {'bool':False, 'number':123, 'list':[1, 4, 5, 18]}}
This way you ensure that the names are unique, since all the keys in a dictionary are unique. Checking if a name is in the dictionary is also easy:
name in objects
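As a small illustration of that membership test, the sketch below only inserts a record when its name is not taken yet. The helper add_object is a hypothetical name, and the sketch is not thread-safe on its own; with several threads, the check and the insert would need to happen under a lock.

```python
def add_object(registry, name, **params):
    """Insert a record under `name` unless that name already exists.

    add_object is a hypothetical helper; it returns True when the record
    was added and False when the name was already taken.
    """
    if name in registry:  # hash lookup, no loop over the records needed
        return False
    registry[name] = params
    return True


objects = {}
print(add_object(objects, 'a', bool=True, number=123, list=[1, 2, 3]))   # True
print(add_object(objects, 'a', bool=False, number=143, list=[1, 3, 5]))  # False
print(objects['a']['number'])  # 123
```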

Is there a more pythonic way to build this dictionary?

What is the "most pythonic" way to build a dictionary where I have the values in a sequence and each key will be a function of its value? I'm currently using the following, but I feel like I'm just missing a cleaner way. NOTE: values is a list that is not related to any dictionary.
new_dict = {}
for value in values:
    new_dict[key_from_value(value)] = value
At least it's shorter:
dict((key_from_value(value), value) for value in values)
>>> l = [ 1, 2, 3, 4 ]
>>> dict( ( v, v**2 ) for v in l )
{1: 1, 2: 4, 3: 9, 4: 16}
In Python 2.7 and 3.x you can use a "dict comprehension", which is basically a shorthand for the above:
{ v : v**2 for v in l }
Py3K:
{ key_from_value(value) : value for value in values }
This method avoids the list comprehension syntax:
dict(zip(map(key_from_value, values), values))
I will never claim to be an authority on "Pythonic", but this way feels like a good way.
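To compare the variants side by side, here is a small self-contained sketch. The key_from_value function below (lower-casing a string) and the sample values are made up for the illustration; the asker's real key function is not shown in the question.

```python
def key_from_value(value):
    # Hypothetical key function used only for this illustration.
    return value.lower()


values = ["Alpha", "Beta", "Gamma"]

# Explicit loop, as in the question.
by_loop = {}
for value in values:
    by_loop[key_from_value(value)] = value

# Dict comprehension.
by_comprehension = {key_from_value(value): value for value in values}

# zip/map variant.
by_zip = dict(zip(map(key_from_value, values), values))

print(by_loop == by_comprehension == by_zip)  # True
print(by_comprehension)  # {'alpha': 'Alpha', 'beta': 'Beta', 'gamma': 'Gamma'}
```

All three build the same dictionary; if two values produce the same key, the later value wins in each variant.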
