I am trying to build a dictionary based on a larger input of text. From this input, I will create nested dictionaries which will need to be updated as the program runs. The structure ideally looks like this:
nodes = {}
node_name: {
    inc_name: inc_capacity,
    inc_name: inc_capacity,
    inc_name: inc_capacity,
}
Because of the nature of this input, I would like to use variables to dynamically create dictionary keys (or access them if they already exist). But I get a KeyError if the key doesn't already exist. I assume I could use try/except, but I was wondering whether there is a 'cleaner' way to do this in Python. The next best solution I found is illustrated below:
test_dict = {}
inc_color = 'light blue'
inc_cap = 2
test_dict[f'{inc_color}'] = inc_cap
# test_dict returns >>> {'light blue': 2}
Try this code for large-scale input, such as input read from a file.
Let me give you an example of what I think you are aiming for.
File.txt
Person1: 115.5
Person2: 128.87
Person3: 827.43
Person4:'18.9
Numerical Validation Function
def is_number(a):
    try:
        float(a)
    except ValueError:
        return False
    else:
        return True
Code for dictionary File.txt
adict = {}
with open("File.txt") as data:
    adict = {
        line[:line.index(':')]: line[line.index(':') + 1:].strip(' \n')
        for line in data
        if is_number(line[line.index(':') + 1:].strip('\n'))
    }
print(adict)
Output
{'Person1': '115.5', 'Person2': '128.87', 'Person3': '827.43'}
For more explanation, see the related question "How to fix the errors in my code for making a dictionary from a file".
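The same idea can be sketched without the repeated line.index calls. This is only a sketch: the helper name parse_numeric_lines and the in-memory sample list are assumptions made for illustration, and unlike the code above it stores the values as floats rather than strings.

```python
def parse_numeric_lines(lines):
    """Return {name: float_value} for lines shaped like 'name: number'."""
    result = {}
    for line in lines:
        # partition splits on the first ':' only and never raises
        name, sep, raw = line.partition(':')
        if not sep:
            continue  # no colon on this line, skip it
        try:
            result[name.strip()] = float(raw.strip())
        except ValueError:
            continue  # value is not numeric, e.g. "'18.9"
    return result


# Hypothetical sample mirroring File.txt above
sample = ["Person1: 115.5\n", "Person2: 128.87\n", "Person3: 827.43\n", "Person4:'18.9\n"]
print(parse_numeric_lines(sample))
```

To read from a real file, an open file object can be passed in place of the sample list, since iterating over a file yields its lines.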
As already mentioned in the comments section, you can use setdefault.
Here's how I would implement it.
Assume I want to add values to the dictionary node_name, with the keys and values held in two lists: the keys are in inc_names and the values are in inc_ccity. The code below loads them. Note that the key inc_name2 appears twice in the key list, so its second occurrence is skipped instead of overwriting the existing entry.
node_name = {}
inc_names = ['inc_name1','inc_name2','inc_name3','inc_name2']
inc_ccity = ['inc_capacity1','inc_capacity2','inc_capacity3','inc_capacity4']
for i, names in enumerate(inc_names):
    node = node_name.setdefault(names, inc_ccity[i])
    if node != inc_ccity[i]:
        print('Key=', names, 'already exists with value', node, '. New value=', inc_ccity[i], 'skipped')
print('\nThe final list of values in the dict node_name are :')
print(node_name)
The output of this will be:
Key= inc_name2 already exists with value inc_capacity2 . New value= inc_capacity4 skipped
The final list of values in the dict node_name are :
{'inc_name1': 'inc_capacity1', 'inc_name2': 'inc_capacity2', 'inc_name3': 'inc_capacity3'}
This way you can add values into a dictionary using variables.
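For the nested structure described in the question, setdefault can also create the inner dictionary on first use. A minimal sketch, assuming the parsed input arrives as (node_name, inc_name, inc_capacity) triples; the records list and its values are made up for illustration:

```python
# Hypothetical parsed input: (node_name, inc_name, inc_capacity) triples
records = [
    ('light blue', 'dark red', 2),
    ('light blue', 'muted yellow', 4),
    ('bright white', 'shiny gold', 1),
]

nodes = {}
for node_name, inc_name, inc_capacity in records:
    # setdefault returns the existing inner dict, or stores and returns a new empty one
    nodes.setdefault(node_name, {})[inc_name] = inc_capacity

print(nodes)
```

No KeyError is raised for a new node_name, and an existing inner dictionary is updated in place.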
Related
[How do I iterate through a dict to find a pair of similar keys, output them, and then delete them from the dict?]
My intention is to generate random output labels and store them in a dictionary. I then want to iterate through the dictionary, take the first key, and search the dictionary for a similar key. For example, Light1on and Light1off both contain Light1. I want to get the values of both keys and store them in a table, each in its respective column.
For example:
Dict = {Light1on, Light2on, Light1off, ...}
Take the value for Light1on, then iterate through the dictionary to find, e.g., Light1off, and store Light1on:value1 and Light1off:value2 in a table or DataFrame with the columns On: value1 and Off: value2.
As I don't know how to insert the code as code, I can only provide an image. Sorry for the trouble; it's my first time asking a question here. Thanks.
from collections import defaultdict
import difflib, random
olist = []
input = 10
olist1 = ['Light1on','Light2on','Fan1on','Kettle1on','Heater1on']
olist2 = ['Light2off','Kettle1off','Light1off','Fan1off','Heater1off']
events = list(range(input + 1))
for i in range(len(olist1)):
    output1 = random.choice(olist1)
    print(output1,'1')
    olist1.remove(output1)
    output2 = random.choice(olist2)
    print(output2,'2')
    olist2.remove(output2)
    olist.append(output1)
    olist.append(output2)
print(olist,'3')
outputList = {olist[i]:events[i] for i in range(10)}
print (str(outputList),'4')
# Iterating through the keys finding a pair match
for s in range(5):
    for i in outputList:
        if i == list(outputList)[0]:
            skeys = difflib.get_close_matches(i, outputList, n=2, cutoff=0.75)
            print(skeys,'5')
            del outputList[skeys]
# Modified Dictionary
difflib.get_close_matches('anlmal', ['car', 'animal', 'house', 'animaltion'])
['animal']
Update: I was unable to delete the pair of similar keys from the dictionary after finding the pair.
You're probably getting an error about a dictionary changing size during iteration. That's because you're deleting keys from a dictionary you're iterating over, and Python doesn't like that:
d = {1:2, 3:4}
for i in d:
    del d[i]
That will throw:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
RuntimeError: dictionary changed size during iteration
To work around that, one solution is to store a list of the keys you want to delete, then delete all those keys after you've finished iterating:
keys_to_delete = []
d = {1:2, 3:4}
for i in d:
    if i%2 == 1:
        keys_to_delete.append(i)
for i in keys_to_delete:
    del d[i]
Ta-da! Same effect, but this way avoids the error.
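Two other common ways to get the same effect are to loop over a snapshot of the keys, or to build a new dictionary that keeps only what you want. A small sketch with made-up data:

```python
d = {1: 2, 3: 4, 6: 8}

# list(d) copies the keys first, so deleting from d inside the loop is safe
for key in list(d):
    if key % 2 == 1:
        del d[key]

print(d)  # {6: 8}

# Alternative: leave the original alone and build a filtered copy
original = {1: 2, 3: 4, 6: 8}
evens_only = {k: v for k, v in original.items() if k % 2 == 0}
print(evens_only)  # {6: 8}
```

The snapshot approach mutates the dictionary in place; the comprehension leaves the original untouched, which is often easier to reason about.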
Also, your code above doesn't call the difflib.get_close_matches function properly. You can run help(difflib.get_close_matches) to see how you are meant to call that function. You need to provide a second argument that indicates the items against which your first argument is compared for possible matches.
All of that said, I have a feeling that you can accomplish your fundamental goals much more simply. If you spend a few minutes describing what you're really trying to do (this shouldn't involve any references to data types, it should just involve a description of your data and your goals), then I bet someone on this site can help you solve that problem much more simply!
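If the real goal is simply to pair each ...on label with its ...off counterpart, fuzzy matching may not be needed at all. The sketch below is only a guess at the intent: it assumes every label ends in "on" or "off", and the events dictionary and its values are made up for illustration.

```python
from collections import defaultdict

# Hypothetical data: label -> event value
events = {'Light1on': 0, 'Light2off': 1, 'Light1off': 2, 'Light2on': 3, 'Fan1on': 4}

pairs = defaultdict(dict)
for label, value in events.items():
    for state in ('on', 'off'):
        if label.endswith(state):
            device = label[:-len(state)]   # e.g. 'Light1on' -> 'Light1'
            pairs[device][state] = value
            break

for device, states in pairs.items():
    print(device, states.get('on'), states.get('off'))
```

Each entry of pairs then holds the "on" and "off" values for one device, which maps directly onto a two-column table; a device with only one of the two labels simply has the other value missing.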
Summary of issue: I'm trying to create a nested Python dictionary, with keys defined by pre-defined variables and strings, and I'm populating the dictionary from regular expression output. This mostly works, but I'm getting an error because the nested dictionary (not the main one) doesn't accept a string as a key; it wants an integer. This is confusing me, so I'd like to ask how I can get a nested Python dictionary with string keys.
Below I'll walk you through the steps of what I've done. What is working, and what isn't. Starting from the top:
# Regular expressions module
import re
# Read text data from a file
file = open("dt.cc", "r")
dtcc = file.read()
# Create a list of stations from regular expression matches
stations = sorted(set(re.findall(r"\n(\w+)\s", dtcc)))
The result is good, and looks something like this:
stations = ['AAAA','BBBB','CCCC','DDDD']
# Initialize a new dictionary
rows = {}
# Loop over each station in the station list, and start populating
for station in stations:
    rows[station] = re.findall("%s\s(.+)" %station, dtcc)
The result is good, and is something like this:
rows['AAAA'] = ['AAAA 0.1132 0.32 P',...]
However, when I try to create a sub-dictionary with a string key:
for station in stations:
    rows[station] = re.findall("%s\s(.+)" %station, dtcc)
    rows[station]["dt"] = re.findall("%s\s(\S+)" %station, dtcc)
I get the following error.
"TypeError: list indices must be integers, not str"
It doesn't seem to like that I'm specifying the second dictionary key as "dt". If I give it a number instead, it works just fine. But then my dictionary key name is a number, which isn't very descriptive.
Any thoughts on how to get this working?
The issue is that by doing
rows[station] = re.findall(...)
you are creating a dictionary with the station names as keys and the return values of re.findall as values, and those return values are lists. So when you then write
rows[station]["dt"] = re.findall(...)
the rows[station] on the left-hand side is a list, which is indexed by integers; that is what the TypeError is complaining about. For example, rows[station][0] would give you the first match from the regex. You said you want a nested dictionary. You could do
rows[station] = dict()
rows[station]["dt"] = re.findall(...)
To make it a bit nicer, a data structure that you could use instead is a defaultdict from the collections module.
The defaultdict is a dictionary that takes a factory for its default values as its argument, typically a type constructor. For example, dictlist = defaultdict(list) defines a dictionary whose values are lists. Doing dictlist[key].append(item1) straight away is then legal, because an empty list is created automatically the first time a missing key is accessed.
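A tiny illustration of that behaviour, with made-up keys and values:

```python
from collections import defaultdict

dictlist = defaultdict(list)
dictlist['AAAA'].append('0.1132')  # key is missing, so an empty list is created first
dictlist['AAAA'].append('0.32')    # key exists now, so this appends to the same list
dictlist['BBBB'].append('0.5')

print(dict(dictlist))
```

One thing to keep in mind is that merely reading a missing key, as in dictlist['CCCC'], also creates an entry for it.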
In your case you could do
from collections import defaultdict
rows = defaultdict(dict)
for station in stations:
    rows[station]["bulk"] = re.findall("%s\s(.+)" %station, dtcc)
    rows[station]["dt"] = re.findall("%s\s(\S+)" %station, dtcc)
Note that you have to assign the first regex result to a new key, "bulk" here, but you can call it whatever you like. Hope this helps.
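To try this without the real file, here is a self-contained sketch. The sample text standing in for dt.cc is made up, so the actual file layout is an assumption; the patterns are written as raw strings and the station name is passed through re.escape, which are small additions not present in the code above.

```python
import re
from collections import defaultdict

# Made-up stand-in for the contents of dt.cc
dtcc = "\nAAAA 0.1132 0.32 P\nBBBB 0.2 0.5 S\nAAAA 0.3 0.1 S\n"

stations = sorted(set(re.findall(r"\n(\w+)\s", dtcc)))

rows = defaultdict(dict)
for station in stations:
    name = re.escape(station)  # guard against regex metacharacters in a station name
    rows[station]["bulk"] = re.findall(name + r"\s(.+)", dtcc)
    rows[station]["dt"] = re.findall(name + r"\s(\S+)", dtcc)

print(rows["AAAA"]["dt"])
```

Both inner keys are strings, and no TypeError is raised because rows[station] is a dictionary, not a list.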
I am currently working with a dataframe consisting of a column of 13 letter strings ('13mer') paired with ID codes ('Accession') as such:
However, I would like to create a dictionary in which the Accession codes are the keys with values being the 13mers associated with the accession so that it looks as follows:
{'JO2176': ['IGY....', 'QLG...', 'ESS...', ...],
'CYO21709': ['IGY...', 'TVL...',.............],
...}
Which I've accomplished using this code:
Accession_13mers = {}
for group in grouped:
    Accession_13mers[group[0]] = []
    for item in group[1].iteritems():
        Accession_13mers[group[0]].append(item[1])
However, I would now like to go back and iterate through the 13mers under each Accession code, running a function I've defined as find_match_position(reference_sequence, 13mer), which finds the 13mer in a reference sequence and returns its position. I would then like to store the position as the value, with the 13mer as the key.
If anyone has any ideas for how I can expedite this process that would be extremely helpful.
Thanks,
Justin
I would suggest creating a new dictionary whose values are themselves dictionaries, that is, a nested dictionary.
position_nmers = {}
for key in H1_Access_13mers:
    position_nmers[key] = {}  # one inner dictionary per accession key
    for value in H1_Access_13mers[key]:
        # store the position returned by your function for this 13mer
        position_nmers[key][value] = find_match_position(reference_sequence, value)
To introspect the dictionary and make sure it's okay:
print(position_nmers)
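A runnable sketch of the same nested structure follows. The find_match_position below is only a hypothetical stand-in built on str.find, since the asker's real function isn't shown, and the reference sequence, accession codes, and short "mers" are made up for illustration.

```python
def find_match_position(reference_sequence, mer):
    # Hypothetical stand-in: index of the first match, or -1 if absent
    return reference_sequence.find(mer)


reference_sequence = 'MKTIIALSYIFCLVFA'
accession_mers = {
    'ACC1': ['KTII', 'YIFC'],
    'ACC2': ['ALSY', 'ZZZZ'],
}

# Outer key: accession code; inner key: the mer; inner value: its position
position_nmers = {
    accession: {mer: find_match_position(reference_sequence, mer) for mer in mers}
    for accession, mers in accession_mers.items()
}

print(position_nmers)
```

The nested comprehension is equivalent to the two explicit loops; which form reads better is a matter of taste.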
You can iterate over the groupby more cleanly by unpacking:
d = {}
for key, s in df.groupby('Accession')['13mer']:
    d[key] = list(s)
This also makes it much clearer where you should put your function!
... However, I think that it might be better suited to an enumerate:
d2 = {}
for pos, val in enumerate(df['13mer']):
    d2[val] = pos
Python dictionaries really have me stumped today. I've been poring over Stack Overflow, trying to find a way to do a simple append of a new value to an existing key in a Python dictionary, and I'm failing at every attempt, even though I'm using the same syntax I see on here.
This is what i am trying to do:
# cursor search an xls file
definitionQuery_Dict = {}
for row in arcpy.SearchCursor(xls):
    # set some source paths from strings in the xls file
    dataSourcePath = str(row.getValue("workspace_path")) + "\\" + str(row.getValue("dataSource"))
    dataSource = row.getValue("dataSource")
    # add items to dictionary. The keys are the dataSource table and the values will be definition (SQL) queries. First test is to see if a definition query exists in the row and if it does, we want to add the key,value pair to a dictionary.
    if row.getValue("Definition_Query") <> None:
        # if key already exists, then append a new value to the value list
        if row.getValue("dataSource") in definitionQuery_Dict:
            definitionQuery_Dict[row.getValue("dataSource")].append(row.getValue("Definition_Query"))
        else:
            # otherwise, add a new key, value pair
            definitionQuery_Dict[row.getValue("dataSource")] = row.getValue("Definition_Query")
I get an attribute error:
AttributeError: 'unicode' object has no attribute 'append'
But I believe I am doing the same as the answer provided here
I've tried various other methods with no luck with various other error messages. i know this is probably simple and maybe I couldn't find the right source on the web, but I'm stuck. Anyone care to help?
Thanks,
Mike
The issue is that you're originally setting the value to be a string (i.e. the result of row.getValue) but then trying to append to it if the key already exists. You need to set the original value to a list containing a single string. Change the last line to this:
definitionQuery_Dict[row.getValue("dataSource")] = [row.getValue("Definition_Query")]
(notice the brackets round the value).
ndpu has a good point about using defaultdict, but if you use that, you should always append, i.e. replace the whole if/else statement with the append you're currently doing in the if clause.
Your dictionary has keys and values. If you want to add to the values as you go, then each value has to be a type that can be extended/expanded, like a list or another dictionary. Currently each value in your dictionary is a string, where what you want instead is a list containing strings. If you use lists, you can do something like:
mydict = {}
records = [('a', 2), ('b', 3), ('a', 4)]
for key, data in records:
    # If this is a new key, create a list to store
    # the values
    if key not in mydict:
        mydict[key] = []
    mydict[key].append(data)
Output:
mydict
Out[4]: {'a': [2, 4], 'b': [3]}
Note that even though 'b' only has one value, that single value still has to be put in a list, so that it can be added to later on.
Use collections.defaultdict:
from collections import defaultdict
definitionQuery_Dict = defaultdict(list)
# ...
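A sketch of how the loop might look with defaultdict follows. It is written for Python 3 and uses a plain list of dicts in place of the arcpy cursor, so the rows data and its query strings are made up for illustration.

```python
from collections import defaultdict

# Hypothetical rows standing in for the cursor results
rows = [
    {'dataSource': 'roads', 'Definition_Query': "TYPE = 'highway'"},
    {'dataSource': 'parcels', 'Definition_Query': None},
    {'dataSource': 'roads', 'Definition_Query': "LANES > 2"},
]

definitionQuery_Dict = defaultdict(list)
for row in rows:
    query = row['Definition_Query']
    if query is not None:
        # no if/else needed: a missing key starts out as an empty list
        definitionQuery_Dict[row['dataSource']].append(query)

print(dict(definitionQuery_Dict))
```

Because the append happens only inside the None check, a data source with no definition query never gets an entry.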
I have some Python dictionaries like this:
A = {id: {idnumber: condition},....
e.g.
A = {1: {11 : 567.54}, 2: {14 : 123.13}, .....
I need to check whether the dictionary contains any idnumber == 11 and, if it does, calculate something with the condition. But if no idnumber == 11 exists anywhere in the dictionary, I need to continue with the next dictionary.
This is my try:
for id, idnumber in A.iteritems():
    if 11 in idnumber.keys():
        calculate = ......
    else:
        break
You're close.
idnum = 11
# The loop and 'if' are good
# You just had the 'break' in the wrong place
for id, idnumber in A.iteritems():
    if idnum in idnumber.keys(): # you can skip '.keys()', it's the default
        calculate = some_function_of(idnumber[idnum])
        break # if we find it we're done looking - leave the loop
    # otherwise we continue to the next dictionary
else:
    # this is the for loop's 'else' clause
    # if we don't find it at all, we end up here
    # because we never broke out of the loop
    calculate = your_default_value
    # or whatever you want to do if you don't find it
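The same for/else pattern, sketched for Python 3 (where iteritems is spelled items, and values is enough here). The function name find_condition and the second dictionary B are assumptions added for illustration.

```python
def find_condition(nested, idnum, default=None):
    """Return the condition stored under idnum in any inner dict, else default."""
    for inner in nested.values():
        if idnum in inner:
            result = inner[idnum]
            break               # found it, stop looking
    else:
        result = default        # loop finished without a break
    return result


A = {1: {11: 567.54}, 2: {14: 123.13}}
B = {1: {12: 1.0}}              # hypothetical dictionary with no idnumber 11

print(find_condition(A, 11))    # 567.54
print(find_condition(B, 11))    # None
```

Calling the function once per dictionary covers the "continue with the next dictionary" case: a None result means that dictionary had no matching idnumber.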
If you need to know how many 11s there are as keys in the inner dicts, you can:
idnum = 11
print sum(idnum in idnumber for idnumber in A.itervalues())
This works because a key can appear in each dict only once, so you just have to test whether the key exists. The in operator returns True or False, which are equal to 1 and 0, so the sum is the number of occurrences of idnum.
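In Python 3 the same count can be sketched with values() in place of itervalues(); the third inner dictionary here is made up so that the count is more than one.

```python
A = {1: {11: 567.54}, 2: {14: 123.13}, 3: {11: 9.0, 14: 2.5}}
idnum = 11

# Each membership test is True (1) or False (0), so the sum is a count
count = sum(idnum in inner for inner in A.values())
print(count)  # 2
```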
dpath to the rescue.
http://github.com/akesterson/dpath-python
dpath lets you search by globs, which will get you what you want.
$ easy_install dpath
>>> import dpath.util
>>> for (path, value) in dpath.util.search(MY_DICT, '*/11', yielded=True):
...     # 'value' will contain your condition; now do something with it.
It yields every matching condition in the dictionary, so no special looping constructs are required.
See also
how do i traverse nested dictionaries (python)?
How to do this - python dictionary traverse and search
Access nested dictionary items via a list of keys?
Find all occurrences of a key in nested python dictionaries and lists
Traverse a nested dictionary and get the path in Python?
Find all the keys and keys of the keys in a nested dictionary
Searching for keys in a nested dictionary
Python: Updating a value in a deeply nested dictionary
Is there a query language for JSON?
Chained, nested dict() get calls in python