New to Stack Overflow, apologies if the title isn't that clear.
Effectively I am working with two Excel-to-CSV files, both converted into nested dictionaries using the to_dict method, where the index is the key for each outer dictionary and the columns are the keys for each nested dictionary.
i.e.
DICTA = {0: {x:1, y:2, v:3}, 1: {x:5, y:6, v:7}, 2: {x:8, y:9, v:10}}
DICTB = {0: {a:3, b:12, c:13, d:14}, 1: {a:15, b:16, c:17, d:18}, 2: {a:19, b:20, c:21, d:22}}
Values are arbitrary for the example above (length of both dictionaries will always be the same, nested dictionaries have different number of keys)
Each nested dictionary in DICTB can only be used once to update a nested dictionary in DICTA, i.e. each nested dict in DICTA 'belongs' to a nested dict in DICTB, but not in any specific order.
My aim is to update values of the nested dicts in DICTA with values from DICTB (the keys differ between the two) if other conditions are met. What I have so far:
for k, v in DICTA.items():
    i=0
    h=0
    if DICTA[i].get('v') in (DICTB[h].get('a'), DICTB[h].get('b')):
        if DICTB[h].get('a') != '15': #another condition I need to put in
            DICTA[i].update({'x': DICTB[h].get('c')})
            DICTA[i].update({'y': DICTB[h].get('d')})
            i+=1
        else:
            DICTA[i].update({'y': DICTB[h].get('c')})
            DICTA[i].update({'x': DICTB[h].get('d')})
            i+=1
    else:
        h+=1
Actual output:
In: DICTA
Out: {0: {x:13, y:14, v:3}, 1: {x:5, y:6, v:7}, 2: {x:8, y:9, v:10}}
Expected Output for the above:
In: DICTA
Out: {0: {x:13, y:14, v:3}, 1: {x:18, y:17, v:7}, 2: {x:21, y:22, v:10}}
My issue is that this works for the first DICTA entry but then fails to update the next two i.e. this clearly doesn't update i or h correctly to loop through the next nested dictionary.
Fully aware the above might be painfully un-pythonic and am very much open to easier ways of solving this.
Thanks guys appreciate any help with the above.
If I understand you correctly this should work:
for row_a, row_b in zip(DICTA.values(), DICTB.values()):
    if row_a.get('v') in (row_b.get('a'), row_b.get('b')):
        if row_b.get('a') != '15':
            row_a.update({
                'x': row_b.get('c'),
                'y': row_b.get('d')
            })
        else:
            row_a.update({
                'y': row_b.get('c'),
                'x': row_b.get('d')
            })
Also instead of:
row_a.update({
    'x': row_b.get('c'),
    'y': row_b.get('d')
})
You could use:
row_a['x'] = row_b.get('c')
row_a['y'] = row_b.get('d')
but that's a question of preference.
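Note that zip pairs the rows strictly by position, while the question says the matching DICTB row may sit at any position. A minimal sketch of one way to handle that, assuming the same matching rule as above (v equals a or b), integer values, and that each DICTB row may be consumed at most once; the function name update_from_matches and the sample data are hypothetical:

```python
def update_from_matches(dict_a, dict_b):
    """Update rows of dict_a in place from the first unused matching row of dict_b."""
    unused = dict(dict_b)  # shallow copy so dict_b itself keeps all its rows
    for row_a in dict_a.values():
        for key_b, row_b in unused.items():
            if row_a.get('v') in (row_b.get('a'), row_b.get('b')):
                if row_b.get('a') != 15:
                    row_a['x'], row_a['y'] = row_b.get('c'), row_b.get('d')
                else:
                    row_a['y'], row_a['x'] = row_b.get('c'), row_b.get('d')
                del unused[key_b]  # each dict_b row is used only once
                break  # leave the inner loop right after mutating `unused`
    return dict_a


# Hypothetical sample data: the matching rows are deliberately out of order.
DICTA = {0: {'x': 1, 'y': 2, 'v': 3}, 1: {'x': 5, 'y': 6, 'v': 7}}
DICTB = {0: {'a': 15, 'b': 7, 'c': 17, 'd': 18}, 1: {'a': 3, 'b': 12, 'c': 13, 'd': 14}}
update_from_matches(DICTA, DICTB)
print(DICTA)
```

This scans the unused DICTB rows for every DICTA row, so it is quadratic in the worst case; for large inputs an index keyed on the a and b values would probably be a better fit.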
I don't really understand the concept of a Python dictionary; can anyone help me? I want the program to have functionality similar to append on a Python list.
d = {'key': ['value']}
print(d)
# {'key': ['value']}
d['key'] = ['mynewvalue']
print(d)
# {'key': ['mynewvalue']}
The output I want from the program is either:
print(d)
#{'key': ['value'],'key': ['mynewvalue']}
or :
print(d)
#{'key': ['value','mynewvalue']}
Sure: first thing first, you can't have two identical keys in a dictionary. So:
{'key': 'myfirstvalue', 'key': 'mysecondvalue'}
wouldn't work. If a key has multiple values, then the key's value should be a list of values, like in your last option. Like in a real dictionary, you won't find "word: definition, word: another definition" but "word: a list of definitions".
In this regard, you could think of a dictionary as a collection of variables: you can't assign two values to a variable except by assigning a list of values to that variable.
x = 4
x = 5
is working code, but the first line is rendered meaningless. x is only equal to 5, not both 4 and 5. You could, however, say:
x = [4, 5]
I often use dictionaries for trees of data. For example, I'm working on a project involving counties for every state in the US. I have a dictionary with a key for each state, and the value of each key is another dictionary, with a key for each county, and the value for each of those dictionaries is another dictionary with the various data points for that county.
That said, you can interact with your dictionary just like you would with variables.
mylist = [1, 2, 3, 4]
mylist.append(5)
print(mylist)
will print:
[1, 2, 3, 4, 5]
But also:
mydict = {'mylist': [1,2,3,4]}
mydict['mylist'].append(5)
does the same thing.
mydict['mylist']
is the same as
mylist
in the first example. Both start out as the list [1, 2, 3, 4], and both become [1, 2, 3, 4, 5] after the append.
You cannot have the same key multiple times in a dict in Python, so the first output scenario you gave is invalid. The value stored under a key can be any data; in your case it is a list, so it can be accessed and modified just like any other list. You can modify the code as shown below to get the output desired in scenario number 2.
d = {'key': ['value']}
print(d)
# {'key': ['value']}
d['key'].append('mynewvalue')
print(d)
# {'key': ['value', 'mynewvalue']}
You can try this:
d = {'key': ['value']}
d['key'].append("mynewvalue")
print(d)
Output will be: {'key': ['value', 'mynewvalue']}
For the first implementation you want, I think you are violating the entire idea of a dictionary: we cannot have multiple keys with the same name.
For the second implementation you could write a function like this:
def updateDict(mydict,value):
    mydict['key'].append(value)
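The snippets above assume that 'key' already holds a list; calling append on a key that is missing raises a KeyError. A small sketch of two standard ways to create the list on first use, using dict.setdefault and collections.defaultdict (the variable names d and dd are just for illustration):

```python
from collections import defaultdict

# Plain dict: setdefault creates the list the first time a key is seen.
d = {}
d.setdefault('key', []).append('value')
d.setdefault('key', []).append('mynewvalue')
print(d)  # {'key': ['value', 'mynewvalue']}

# defaultdict(list) does the same without the explicit setdefault call.
dd = defaultdict(list)
dd['key'].append('value')
dd['key'].append('mynewvalue')
print(dict(dd))  # {'key': ['value', 'mynewvalue']}
```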
I am not used to coding in Python, but I have to do this one with it. What I am trying to do is reproduce the result of an SQL statement like this:
SELECT T2.item, AVG(T1.Value) AS MEAN FROM TABLE_DATA T1 INNER JOIN TABLE_ITEMS T2 ON T1.ptid = T2.ptid GROUP BY T2.item
In Python, I have two lists of dictionaries with the common key 'ptid'. dctData contains around 100,000 ptid values and dctItems around 7,000. Using a comparison like [i for i in dctData for j in dctItems if i['ptid'] == j['ptid']] takes forever. The lists are built as follows:
ptid = 1
for line in lines[6:]: # Skipping header
    data = line.split()
    for d in data:
        dctData.append({'ptid' : ptid, 'Value': float(d)})
        ptid += 1
dctData = [{'ptid':1,'Value': 0}, {'ptid':2,'Value': 2}, {'ptid':3,'Value': 2}, {'ptid':4,'Value': 5}, {'ptid':5,'Value': 3}, {'ptid':6,'Value': 2}]
for line in lines[1:]: # Skipping header
    data = line.split(';')
    dctItems.append({'ptid' : int(data[1]), 'item' : data[3]})
dctItems = [{'item':21, 'ptid':1}, {'item':21, 'ptid':2}, {'item':21, 'ptid':6}, {'item':22, 'ptid':2}, {'item':22, 'ptid':5}, {'item':23, 'ptid':4}]
What I would like as a result is a third list that presents the average value for each item in dctItems, where the link between the two lists is the 'ptid' value.
For example, for item 21 it would calculate the mean value of 1.3 from the values (0, 2, 2) of ptid 1, 2 and 6.
The result would look something like this, where the key Value holds the calculated mean:
dctResults = [{'id':21, 'Value':1.3}, {'id':22, 'Value':2.5}, {'id':23, 'Value':5}]
How can I achieve this?
Thank you all for your help.
Given those data structures that you use, this is not trivial, but it will become much easier if you use a single dictionary mapping items to their values, instead.
First, let's try to re-structure your data in that way:
values = {entry['ptid']: entry['Value'] for entry in dctData}
items = {}
for item in dctItems:
    items.setdefault(item['item'], []).append(values[item['ptid']])
Now, items has the form {21: [0, 2, 2], 22: [2, 3], 23: [5]}. Of course, it would be even better if you could create the dictionary in this form in the first place.
Now, we can pretty easily calculate the average for all those lists of values:
avg = lambda lst: float(sum(lst))/len(lst)
result = {item: avg(values) for item, values in items.items()}
This way, result is {21: 1.3333333333333333, 22: 2.5, 23: 5.0}
Or if you prefer your "list of dictionaries" style:
dctResult = [{'id': item, 'Value': avg(values)} for item, values in items.items()]
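One detail worth noting: values[item['ptid']] raises a KeyError if dctItems contains a ptid that has no row in dctData, whereas the SQL INNER JOIN would silently drop such rows. A self-contained sketch of the same idea with that case handled, assuming each ptid appears at most once in dctData and using collections.defaultdict and statistics.mean from the standard library; the extra ptid 99 entry is made up for illustration:

```python
from collections import defaultdict
from statistics import mean

dctData = [{'ptid': 1, 'Value': 0}, {'ptid': 2, 'Value': 2}, {'ptid': 3, 'Value': 2},
           {'ptid': 4, 'Value': 5}, {'ptid': 5, 'Value': 3}, {'ptid': 6, 'Value': 2}]
dctItems = [{'item': 21, 'ptid': 1}, {'item': 21, 'ptid': 2}, {'item': 21, 'ptid': 6},
            {'item': 22, 'ptid': 2}, {'item': 22, 'ptid': 5}, {'item': 23, 'ptid': 4},
            {'item': 23, 'ptid': 99}]  # hypothetical ptid with no data row

# Lookup table ptid -> Value, built once so each join step is a dict lookup.
values = {entry['ptid']: entry['Value'] for entry in dctData}

# Group the joined values by item.
grouped = defaultdict(list)
for entry in dctItems:
    ptid = entry['ptid']
    if ptid in values:  # INNER JOIN semantics: skip ptids without a data row
        grouped[entry['item']].append(values[ptid])

dctResults = [{'id': item, 'Value': mean(vals)} for item, vals in grouped.items()]
print(dctResults)
```

Building the lookup dictionary first means the work grows with the sum of the two list lengths rather than their product, which is what makes the nested comprehension from the question so slow.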
I have a dictionary of "documents" in python with document ID numbers as keys and dictionaries (again) as values. These internal dictionaries each have a 'weight' key that holds a floating-point value of interest. In other words:
documents[some_id]['weight'] = ...
What I want to do is obtain a list of my document IDs sorted in descending order of the 'weight' value. I know that dictionaries are inherently unordered (and there seem to be a lot of ways to do things in Python), so what is the most painless way to go? It feels like kind of a messy situation...
I would convert the dictionary to a list of tuples and sort it based on weight (in reverse order for descending), then keep only the keys:
l = list(documents.items())  # list() is needed in Python 3, where items() returns a view
l.sort(key=lambda x: x[1]['weight'], reverse=True)
result = [d[0] for d in l]
I took the approach that you might want the keys as well as the rest of the object:
# Build a random dictionary
from random import randint
ds = {} # A |D|ata |S|tructure
for i in range(20,1,-1):
    ds[i]={'weight':randint(0,100)}
sortedDS = sorted(ds.keys(),key=lambda x:ds[x]['weight'])
for i in sortedDS :
    print(i,ds[i]['weight'])
sorted() is a Python built-in that takes an iterable and returns a new sorted list; it can also take a key function that it uses to determine the rank of each object. In the above case it uses the 'weight' value as the key to sort on (in ascending order; pass reverse=True for descending).
The advantage of this over Ameer's answer is that it returns the order of keys rather than the items. It's an extra step, but it means you can refer back into the original data structure.
This seems to work for me. The inspiration for it came from OrderedDict and question #9001509
from collections import OrderedDict
d = {
    14: {'weight': 90},
    12: {'weight': 100},
    13: {'weight': 101},
    15: {'weight': 5}
}
sorted_dict = OrderedDict(sorted(d.items(), key=lambda rec: rec[1].get('weight')))
print(sorted_dict)
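For the result the question actually asks for (just the document IDs, heaviest first), a short sketch in Python 3 using sorted() directly on the dictionary; the sample documents dict is made up for illustration:

```python
# Hypothetical sample data in the shape described in the question.
documents = {
    14: {'weight': 90.0},
    12: {'weight': 100.0},
    13: {'weight': 101.0},
    15: {'weight': 5.0},
}

# Iterating a dict yields its keys, so sorted() returns a list of IDs.
sorted_ids = sorted(documents, key=lambda doc_id: documents[doc_id]['weight'], reverse=True)
print(sorted_ids)  # [13, 12, 14, 15]
```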
I have a Dictionary below:
colors = {
    "blue" : "5",
    "red" : "6",
    "yellow" : "8",
}
How do I index the first entry in the dictionary?
colors[0] will raise a KeyError for obvious reasons.
If anybody is still looking at this question, the currently accepted answer is now outdated:
Since Python 3.7*, dictionaries are order-preserving, that is they now behave like collections.OrderedDicts. Unfortunately, there is still no dedicated method to index into keys() / values() of the dictionary, so getting the first key / value in the dictionary can be done as
first_key = list(colors)[0]
first_val = list(colors.values())[0]
or alternatively (this avoids building a list of all the keys):
def get_first_key(dictionary):
    for key in dictionary:
        return key
    raise IndexError
first_key = get_first_key(colors)
first_val = colors[first_key]
If you need an n-th key, then similarly
def get_nth_key(dictionary, n=0):
    if n < 0:
        n += len(dictionary)
    for i, key in enumerate(dictionary.keys()):
        if i == n:
            return key
    raise IndexError("dictionary index out of range")
* CPython 3.6 already included insertion-ordered dicts, but this was only an implementation detail. The language specification includes insertion-ordered dicts from 3.7 onwards.
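As an alternative to the explicit loop, the same lookup can be sketched with itertools.islice, which skips ahead in the key iterator without building a list. This assumes Python 3.7+ ordering and a non-negative n (islice rejects negative indices); the name nth_key is only illustrative:

```python
from itertools import islice


def nth_key(dictionary, n):
    """Return the n-th key (0-based, n >= 0) in insertion order."""
    try:
        # islice skips the first n keys lazily; next() takes the one after them.
        return next(islice(dictionary, n, None))
    except StopIteration:
        raise IndexError("dictionary index out of range") from None


colors = {"blue": "5", "red": "6", "yellow": "8"}
print(nth_key(colors, 0))  # blue
print(nth_key(colors, 2))  # yellow
```

It is still a linear scan up to position n; it only avoids the intermediate list.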
Dictionaries are unordered in Python versions up to and including Python 3.6. If you do not care about the order of the entries and want to access the keys or values by index anyway, you can create a list of keys for a dictionary d using keys = list(d), and then access keys in the list by index keys[i], and the associated values with d[keys[i]].
If you do care about the order of the entries, starting with Python 2.7 you can use collections.OrderedDict. Or use a list of pairs
l = [("blue", "5"), ("red", "6"), ("yellow", "8")]
if you don't need access by key. (Why are your numbers strings by the way?)
In Python 3.7, normal dictionaries are ordered, so you don't need to use OrderedDict anymore (but you still can – it's basically the same type). The CPython implementation of Python 3.6 already included that change, but since it's not part of the language specification, you can't rely on it in Python 3.6.
Addressing an element of a dictionary by position is like sitting on a donkey and enjoying the ride.
As a rule, in Python versions before 3.7 a dictionary is orderless.
Suppose there is
dic = {1: "a", 2: "aa", 3: "aaa"}
Now if I do dic[10] = "b", it will not always be added at the end like this:
dic = {1: "a", 2: "aa", 3: "aaa", 10: "b"}
It may be like
dic = {1: "a", 2: "aa", 10: "b", 3: "aaa"}
Or
dic = {1: "a", 10: "b", 2: "aa", 3: "aaa"}
Or any such combination.
So a rule of thumb is that, before Python 3.7, a dictionary is orderless!
If you need an ordered dictionary, you can use odict.
Oh, that's a tough one. What you have here, basically, is two values for each item, and you are trying to look them up with a number as the key. Unfortunately, one of your values is already set as the key!
Try this:
colors = {1: ["blue", "5"], 2: ["red", "6"], 3: ["yellow", "8"]}
Now you can call the keys by number as if they are indexed like a list. You can also reference the color and number by their position within the list.
For example,
colors[1][0]
# returns 'blue'
colors[3][1]
# returns '8'
Of course, you will have to come up with another way of keeping track of which location each color is in. Maybe you can have another dictionary that stores each color's key as its value.
colors_key = {'blue': 1, 'red': 2, 'yellow': 3}
Then, you will be able to also look up the colors key if you need to.
colors[colors_key['blue']][0] will return 'blue'
Something like that.
And then, while you're at it, you can make a dict with the number values as keys so that you can always use them to look up your colors, you know, if you need.
values = {5: [1, 'blue'], 6: [2, 'red'], 8: [3, 'yellow']}
Then, (colors[colors_key[values[5][1]]][0]) will return 'blue'.
Or you could use a list of lists.
Good luck!
Actually, I found a solution that really helped me out: if you are especially concerned with the index of a certain value in a list or data set, you can just set the value of the dictionary entry to that index:
Just watch:
letters = ['a', 'b', 'c']  # renamed from `list` to avoid shadowing the built-in
dictionary = {}
counter = 0
for i in letters:
    dictionary[i] = counter
    counter += 1
print(dictionary) # dictionary = {'a':0, 'b':1, 'c':2}
Now, through the power of hash maps, you can look up the index of your entries in constant time (a whole lot faster than searching the list).
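The counter loop above can also be written with enumerate() and a dict comprehension. A brief sketch (the names letters and index_of are just for illustration); note that if the list contains duplicates, the last occurrence's index wins:

```python
letters = ['a', 'b', 'c']

# enumerate() yields (index, value) pairs; the comprehension flips them.
index_of = {value: index for index, value in enumerate(letters)}
print(index_of)       # {'a': 0, 'b': 1, 'c': 2}
print(index_of['c'])  # 2
```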
Consider why you are indexing
First, I would say to make sure you really need to index into the dict. A dict was originally not intended to have an order at all, so perhaps there is an alternate way to resolve the need to index that uses the strengths of the existing base Python data types.
For example, if you have a list of colors that are needed in a certain order, just store the list of colors, then index into those, and feed them into the dict to get the values.
color_order = [ 'blue', 'yellow', 'yellow', 'blue' ]
value_0 = colors[color_order[0]]
On the other hand, if you need some default color value as index 0, consider using a separate value to store the default, or add an additional entry that sets the default value that you can just key into instead of having to index:
default_color = 'blue'
default_value = colors[default_color]
colors = { 'default': '5', 'blue': '5', 'red': '6', 'yellow': '8' }
default_value = colors['default']
Find the index with a function
You can find a dict index by counting into the dict.keys() with a loop. If you use the enumerate() function, it will generate the index values automatically.
This is the most straight-forward, but costs a little more CPU every time you look up the index. This assumes an ordered dict (Python 3.7+ guarantees this).
To find the key at a given index:
def key_at_index(mydict, index_to_find):
    for index, key in enumerate(mydict.keys()):
        if index == index_to_find:
            return key
    return None # the index doesn't exist
To find the index of a key:
def index_of_key(mydict, key_to_find):
    for index, key in enumerate(mydict.keys()):
        if key == key_to_find:
            return index
    return None # the key doesn't exist
Create a list of keys
If you need a solution that will be accessed a lot, you can create a duplicate list of the keys that mirrors the keys in your current dictionary, then index into the list if you know the index, or use the list.index(item) method of the list to find the index. A list is preferable to creating a dict with the indexes, because a list inherently already has indexes, and built-in functions are typically much faster and more likely to correctly handle edge and corner cases.
There is extra overhead with this method, but it could be worth it if you are doing a lot of data analysis and need to access the indexes regularly.
# Note: you don't actually need the `.keys()`, but it's easier to understand
colors_i = list(colors.keys())
index_blue = colors_i.index('blue')  # search the key list; a dict has no .index()
index0 = colors_i[0]
value0 = colors[index0]
print(f'colors: {colors}\ncolors_i: {colors_i}')
print(f'index_blue = {index_blue}, index0 = "{index0}", value0 = "{value0}"')
# colors: {'blue': '5', 'red': '6', 'yellow': '8'}
# colors_i: ['blue', 'red', 'yellow']
# index_blue = 0, index0 = "blue", value0 = "5"
Note: This is static, and will not be updated if your source dictionary gets updated. You will need to add new items to both the list and the dict to keep them in sync.
Function to update the dict and list
The below is a function that will update your dict and index list at the same time. If an item already exists, it will update the value and not add it to the list (otherwise there will be a duplicate entry in the list, while the dict will only update the existing entry).
This approach could be extended into a class if doing large amounts of processing, especially if other extended functions are required on top of this.
def index_add_item(mydict, index_list, key, value):
    # Note: The dict and list are mutable objects, so updating them here changes the caller's objects
    try: # in case key doesn't exist
        existing_value = mydict[key]
    except KeyError: # key does not exist, update dict and list
        mydict.update({key: value})
        index_list.append(key)
    else: # key already exists, just update value
        mydict[key] = value
index_add_item(colors, colors_i, 'purple', '99')
print(f'colors: {colors}\ncolors_i: {colors_i}')
# colors: {'blue': '5', 'red': '6', 'yellow': '8', 'purple': '99'}
# colors_i: ['blue', 'red', 'yellow', 'purple']
index_add_item(colors, colors_i, 'blue', '1')
print(f'colors: {colors}\ncolors_i: {colors_i}')
# colors: {'blue': '1', 'red': '6', 'yellow': '8', 'purple': '99'}
# colors_i: ['blue', 'red', 'yellow', 'purple']
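As mentioned, this could be wrapped in a class so the dict and the key list cannot drift apart. A rough sketch of what that might look like; the class name IndexedDict and its methods are hypothetical, and deletion is deliberately left out:

```python
class IndexedDict:
    """Hypothetical helper: a dict plus a list of its keys in insertion order."""

    def __init__(self, initial=None):
        self._data = {}
        self._keys = []
        for key, value in (initial or {}).items():
            self.add(key, value)

    def add(self, key, value):
        # Only new keys are appended, so the key list never holds duplicates.
        if key not in self._data:
            self._keys.append(key)
        self._data[key] = value

    def key_at(self, index):
        return self._keys[index]

    def index_of(self, key):
        return self._keys.index(key)  # raises ValueError if the key is absent

    def __getitem__(self, key):
        return self._data[key]


colors = IndexedDict({'blue': '5', 'red': '6', 'yellow': '8'})
colors.add('purple', '99')  # new key: appended to the key list
colors.add('blue', '1')     # existing key: value replaced, position kept
print(colors.key_at(0), colors['blue'])  # blue 1
print(colors.index_of('purple'))         # 3
```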
In Python versions before 3.7 you can't, since dict is unordered. You can use .popitem() to get an item (an arbitrary one before 3.7, the most recently inserted one from 3.7 onwards), but that will remove it from the dict.
I took LightCC's answer a step further:
def key_value(mydict, find_code, find_key, return_value):
    for key in mydict:
        if key[find_code] == find_key:
            return key[return_value]
    return None
and I am not sure whether this function could be shortened further (ideally to nearly a one-liner).
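One possible near one-liner uses next() with a generator expression and a default. This sketch assumes the first argument is an iterable of records such as a list of dicts, which is what the key[find_code] indexing suggests; the records sample data is made up:

```python
def key_value(records, find_code, find_key, return_value):
    # next() returns the first match, or None if the generator is exhausted.
    return next((rec[return_value] for rec in records if rec[find_code] == find_key), None)


# Hypothetical sample data.
records = [{'name': 'blue', 'code': '5'}, {'name': 'red', 'code': '6'}]
print(key_value(records, 'name', 'red', 'code'))    # 6
print(key_value(records, 'name', 'green', 'code'))  # None
```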
Given a dict mydict in Python 3.7 and later, after dict became ordered by order of insertion, one can do:
next(iter(mydict.items())) to retrieve the first key, value pair that was inserted.
next(iter(mydict.keys())) to retrieve the first key that was inserted.
next(iter(mydict.values())) to retrieve the first value that was inserted.
This approach does not require iterating through all the elements of the dictionary.
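A short sketch of these calls on the colors dictionary from the question, assuming Python 3.7+. On an empty dict, next() raises StopIteration unless a default is passed as its second argument:

```python
colors = {"blue": "5", "red": "6", "yellow": "8"}

first_key = next(iter(colors))             # iterating a dict yields its keys
first_value = next(iter(colors.values()))
first_item = next(iter(colors.items()))
print(first_key, first_value, first_item)  # blue 5 ('blue', '5')

# Passing a default avoids StopIteration on an empty dict.
empty = {}
print(next(iter(empty), None))  # None
```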
Simple code that works.
# Example dictionary
d = {
    'a': 5,
    'b': 6,
    'c': 7,
    'd': 8,
    'e': 9,
}
# Index you want
index = 3
# Use the fact that d.keys() is ordered the same as d.values()
value = d[list(d.keys())[index]]
print(value)
Will print
8
Keys and values are ordered the same according to this question