I have a script where I read a file line by line and save some information to lists and dictionary. Namely, I store the keys (say ley1 and key2) to pass to the dictionary and a list to be stored as item in the dictionary.
It happens that I have to update a dictionary only if some conditions are met, i.e.:
myDict = {}
if mylist:
myDict[key1][key2] = mylist
Of course this will raise a KeyError if key2 does not exist already. Therefore, I introduced the following function:
def updateDict2keys(myDict,mykey1,mykey2,myitems):
"""
updates a dictionary by appending values at given keys (generating key2 if not given)
input: key1, key2 and items to append
output: dictionary orgnanized as {mykey1:{mykey2:myitems}}
"""
try:
myDict[mykey1][mykey2] = myitems
except KeyError:
myDict[mykey1] = {mykey2:myitems}
# output
return myDict
My question is: is it "safe" to call such a function in the main code inside a for loop like this?
with open(os.path.join(path+myfile)) as ntwkf:
# read file
rdlistcsv = csv.reader(ntwkf)
rowslist = [line for line in rdlistcsv]
ntwkJuncDict = {}
for idx,f in enumerate(rowslist): # loop over file lines
if f:
lineelmnt = f[0].split()[0]
else:
continue
# flags
isBranchName = True if lineelmnt=='definitions' else False
isJunction = True if lineelmnt=='connections' else False
# update dictionary
if isBranchName:
reachname = f[0].split()[2].replace("'","")
if isJunction:
usreach = f[0].split()[2].replace("'","")
uschain = float(f[1].replace("'","").replace(" ",""))
if usreach:
uslist = [usreach, uschain]
todivide.append(uslist)
ntwkJuncDict = updateDict2keys(ntwkJuncDict, reachname, 'upstream', uslist)
I must say, my code works pretty well, I am just asking myself (and yourselves of course!) if I am doing everything the python way and if there are smarter solutions.
Instead of accessing the primary key, use dict.setdefault with a default empty dict on it.
def updateDict2keys(myDict,mykey1,mykey2,myitems):
myDict.setdefault(mykey1, {})[mykey2] = myitems
return myDict
Note that your initial approach is also safe. Only lookup, not assignment, throws KeyError. In the statement myDict[mykey1][mykey2] = myitems, a KeyError can only be thrown of mykey1 is not set. Thus, setting it to an empty dict on a KeyError does not overwrite anything.
Related
I am trying to build a dictionary based on a larger input of text. From this input, I will create nested dictionaries which will need to be updated as the program runs. The structure ideally looks like this:
nodes = {}
node_name: {
inc_name: inc_capacity,
inc_name: inc_capacity,
inc_name: inc_capacity,
}
Because of the nature of this input, I would like to use variables to dynamically create dictionary keys (or access them if they already exist). But I get KeyError if the key doesn't already exist. I assume I could do a try/except, but was wondering if there was a 'cleaner' way to do this in python. The next best solution I found is illustrated below:
test_dict = {}
inc_color = 'light blue'
inc_cap = 2
test_dict[f'{inc_color}'] = inc_cap
# test_dict returns >>> {'light blue': 2}
Try this code, for Large Scale input. For example file input
Lemme give you an example for what I am aiming for, and I think, this what you want.
File.txt
Person1: 115.5
Person2: 128.87
Person3: 827.43
Person4:'18.9
Numerical Validation Function
def is_number(a):
try:
float (a)
except ValueError:
return False
else:
return True
Code for dictionary File.txt
adict = {}
with open("File.txt") as data:
adict = {line[:line.index(':')]: line[line.index(':')+1: ].strip(' \n') for line in data.readlines() if is_number(line[line.index(':')+1: ].strip('\n')) == True}
print(adict)
Output
{'Person1': '115.5', 'Person2': '128.87', 'Person3': '827.43'}
For more explanation, please follow this issue solution How to fix the errors in my code for making a dictionary from a file
As already mentioned in the comments sections, you can use setdefault.
Here's how I will implement it.
Assume I want to add values to dict : node_name and I have the keys and values in two lists. Keys are in inc_names and values are in inc_ccity. Then I will use the below code to load them. Note that inc_name2 key exists twice in the key list. So the second occurrence of it will be ignored from entry into the dictionary.
node_name = {}
inc_names = ['inc_name1','inc_name2','inc_name3','inc_name2']
inc_ccity = ['inc_capacity1','inc_capacity2','inc_capacity3','inc_capacity4']
for i,names in enumerate(inc_names):
node = node_name.setdefault(names, inc_ccity[i])
if node != inc_ccity[i]:
print ('Key=',names,'already exists with value',node, '. New value=', inc_ccity[i], 'skipped')
print ('\nThe final list of values in the dict node_name are :')
print (node_name)
The output of this will be:
Key= inc_name2 already exists with value inc_capacity2 . New value= inc_capacity4 skipped
The final list of values in the dict node_name are :
{'inc_name1': 'inc_capacity1', 'inc_name2': 'inc_capacity2', 'inc_name3': 'inc_capacity3'}
This way you can add values into a dictionary using variables.
[I had problem on how to iter through dict to find a pair of similar words and output it then the delete from dict]
My intention is to generate a random output label then store it into dictionary then iter through the dictionary and store the first key in the list or some sort then iter through the dictionary to search for similar key eg Light1on and Light1off has Light1 in it and get the value for both of the key to store into a table in its respective columns.
such as
Dict = {Light1on,Light2on,Light1off...}
store value equal to Light1on the iter through the dictionary to get eg Light1 off then store its Light1on:value1 and Light1off:value2 into a table or DF with columns name: On:value1 off:value2
As I dont know how to insert the code as code i can only provide the image sry for the trouble,its my first time asking question here thx.
from collections import defaultdict
import difflib, random
olist = []
input = 10
olist1 = ['Light1on','Light2on','Fan1on','Kettle1on','Heater1on']
olist2 = ['Light2off','Kettle1off','Light1off','Fan1off','Heater1off']
events = list(range(input + 1))
for i in range(len(olist1)):
output1 = random.choice(olist1)
print(output1,'1')
olist1.remove(output1)
output2 = random.choice(olist2)
print(output2,'2')
olist2.remove(output2)
olist.append(output1)
olist.append(output2)
print(olist,'3')
outputList = {olist[i]:events[i] for i in range(10)}
print (str(outputList),'4')
# Iterating through the keys finding a pair match
for s in range(5):
for i in outputList:
if i == list(outputList)[0]:
skeys = difflib.get_close_matches(i, outputList, n=2, cutoff=0.75)
print(skeys,'5')
del outputList[skeys]
# Modified Dictionary
difflib.get_close_matches('anlmal', ['car', 'animal', 'house', 'animaltion'])
['animal']
Updated: I was unable to delete the pair of similar from the list(Dictionary) after founding par in the dictionary
You're probably getting an error about a dictionary changing size during iteration. That's because you're deleting keys from a dictionary you're iterating over, and Python doesn't like that:
d = {1:2, 3:4}
for i in d:
del d[i]
That will throw:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
RuntimeError: dictionary changed size during iteration
To work around that, one solution is to store a list of the keys you want to delete, then delete all those keys after you've finished iterating:
keys_to_delete = []
d = {1:2, 3:4}
for i in d:
if i%2 == 1:
keys_to_delete.append(i)
for i in keys_to_delete:
del d[i]
Ta-da! Same effect, but this way avoids the error.
Also, your code above doesn't call the difflib.get_close_matches function properly. You can use print(help(difflib.get_close_matches)) to see how you are meant to call that function. You need to provide a second argument that indicates the items to which you wish to compare your first argument for possible matches.
All of that said, I have a feeling that you can accomplish your fundamental goals much more simply. If you spend a few minutes describing what you're really trying to do (this shouldn't involve any references to data types, it should just involve a description of your data and your goals), then I bet someone on this site can help you solve that problem much more simply!
I’m having a problem with a dictionary. I"m using Python3. I’m sure there’s something easy that I’m just not seeing.
I’m reading lines from a file to create a dictionary. The first 3 characters of each line are used as keys (they are unique). From there, I create a list from the information in the rest of the line. Each 4 characters make up a member of the list. Once I’ve created the list, I write to the directory with the list being the value and the first three characters of the line being the key.
The problem is, each time I add a new key:value pair to the dictionary, it seems to overlay (or update) the values in the previously written dictionary entries. The keys are fine, just the values are changed. So, in the end, all of the keys have a value equivalent to the list made from the last line in the file.
I hope this is clear. Any thoughts would be greatly appreciated.
A snippet of the code is below
formatDict = dict()
sectionList = list()
for usableLine in formatFileHandle:
lineLen = len(usableLine)
section = usableLine[:3]
x = 3
sectionList.clear()
while x < lineLen:
sectionList.append(usableLine[x:x+4])
x += 4
formatDict[section] = sectionList
for k, v in formatDict.items():
print ("for key= ", k, "value =", v)
formatFileHandle.close()
You always clear, then append and then insert the same sectionList, that's why it always overwrites the entries - because you told the program it should.
Always remember: In Python assignment never makes a copy!
Simple fix
Just insert a copy:
formatDict[section] = sectionList.copy() # changed here
Instead of inserting a reference:
formatDict[section] = sectionList
Complicated fix
There are lots of things going on and you could make it "better" by using functions for subtasks like the grouping, also files should be opened with with so that the file is closed automatically even if an exception occurs and while loops where the end is known should be avoided.
Personally I would use code like this:
def groups(seq, width):
"""Group a sequence (seq) into width-sized blocks. The last block may be shorter."""
length = len(seq)
for i in range(0, length, width): # range supports a step argument!
yield seq[i:i+width]
# Printing the dictionary could be useful in other places as well -> so
# I also created a function for this.
def print_dict_line_by_line(dct):
"""Print dictionary where each key-value pair is on one line."""
for key, value in dct.items():
print("for key =", key, "value =", value)
def mytask(filename):
formatDict = {}
with open(filename) as formatFileHandle:
# I don't "strip" each line (remove leading and trailing whitespaces/newlines)
# but if you need that you could also use:
# for usableLine in (line.strip() for line in formatFileHandle):
# instead.
for usableLine in formatFileHandle:
section = usableLine[:3]
sectionList = list(groups(usableLine[3:]))
formatDict[section] = sectionList
# upon exiting the "with" scope the file is closed automatically!
print_dict_line_by_line(formatDict)
if __name__ == '__main__':
mytask('insert your filename here')
You could simplify your code here by using a with statement to auto close the file and chunk the remainder of the line into groups of four, avoiding the re-use of a single list.
from itertools import islice
with open('somefile') as fin:
stripped = (line.strip() for line in fin)
format_dict = {
line[:3]: list(iter(lambda it=iter(line[3:]): ''.join(islice(it, 4)), ''))
for line in stripped
}
for key, value in format_dict.items():
print('key=', key, 'value=', value)
I have two text files name weburl.txt and imageurl.txt, weburl.txt contain URLs of website and imageurl.txt contain all images URLs I want to create a dictionary that read a line of weburl.txt and make key of a dictionary and imageurl.txt line as a value.
weburl.txt
url1
url2
url3
url4
url5......
imageurl.txt
imgurl1
imgurl2
imgurl3
imgurl4
imgurl5
required output is
{'url1': imgurl1, 'url2': imgurl2, 'url3': imgurl3......}
I am using this code
with open('weburl.txt') as f :
key = f.readlines()
with open('imageurl.txt') as g:
value = g.readlines()
dict[key] = [value]
print dict
I am not getting the required results
you can write something like
with open('weburl.txt') as f, \
open('imageurl.txt') as g:
# we use `str.strip` method
# to remove newline characters
keys = (line.strip() for line in f)
values = (line.strip() for line in g)
result = dict(zip(keys, values))
print(result)
more info about zip at docs
There are problems with the statement dict[key] = [value] on so many levels that I get a kind of vertigo as we drill down through them:
The apparent intention to use a variable called dict (a bad idea because it would overshadow Python's builtin reference to the dict class). Let's call it d instead.
Not initializing the dictionary instance first. If you had called it something liked this oversight would earn you an easy-to-understand NameError. However since you're calling it dict, Python will actually be attempting to set items in the dict class itself (which doesn't support __setitem__) instead of inside a dict instance, so you'll get a different, more-confusing error.
Attempting to make a dict entry assignment where the key is a non-hashable type (key is alist). You could convert thelist to the hashable type tuple easily enough, but that's not what you want because you'd still be...
Attempting to assign bunch of values to their respective keys all at once. This can't be done with d[key] = value syntax. It could be done all in one relatively simple statement, i.e. d=dict(zip(key,value)) but unfortunately that doesn't get around the fact that you're...
Not stripping the newline character off the end of each key and value.
Instead, this line:
d = dict((k.strip(), v.strip()) for k, v in zip(key, value))
will do what you appear to want.
So basically I'm traversing through the nested dictionary extension like so:
extension['value1']['value2']['value3']['value4']
However, sometimes the dict file can be a little different:
extension['value1']['value2']['blah1']['value4']
How can I account for this scenario? I don't have to worry about a large number of scenarios, the key will only ever be value3 or blah1
You could write a function to get the first key that exists:
def get_first_item(items, keys):
for k in keys:
if k in items:
return items[k]
raise KeyError
And then use it like this:
get_first_item(extension['value1']['value2'], ['value3', 'blah1'])['value4']
You have to explicitly check for the key and then fetch its value. For example:
optional_keys = ['value3', 'blah1']
value = None
for optional_key in optional_keys:
if optional_key in extension['value1']['value2']:
value = extension['value1']['value2'][optional_key]['value4']
break
I think the above two answers can well fix your problem. Since your key will be either value3 or blah1, instead of a function, you may as well use the following code when you loop through the dictionary:
try:
value = extension['value1']['value2']['value3']['value4']
except Exception as e: # except KeyError:
# print(e)
value = extension['value1']['value2']['blah1']['value4']