Python dictionaries really have me stumped today. I've been poring over Stack Overflow, trying to find a way to do a simple append of a new value to an existing key in a Python dictionary, and I'm failing at every attempt, even though I'm using the same syntax I see on here.
This is what I am trying to do:
# cursor search an xls file
definitionQuery_Dict = {}
for row in arcpy.SearchCursor(xls):
    # set some source paths from strings in the xls file
    dataSourcePath = str(row.getValue("workspace_path")) + "\\" + str(row.getValue("dataSource"))
    dataSource = row.getValue("dataSource")
    # add items to dictionary. The keys are the data source table and the values will be definition (SQL) queries. First test is to see if a definition query exists in the row and, if it does, we want to add the key, value pair to the dictionary.
    if row.getValue("Definition_Query") is not None:
        # if key already exists, then append a new value to the value list
        if row.getValue("dataSource") in definitionQuery_Dict:
            definitionQuery_Dict[row.getValue("dataSource")].append(row.getValue("Definition_Query"))
        else:
            # otherwise, add a new key, value pair
            definitionQuery_Dict[row.getValue("dataSource")] = row.getValue("Definition_Query")
I get an attribute error:
AttributeError: 'unicode' object has no attribute 'append'
But I believe I am doing the same as the answer provided here
I've tried various other methods with no luck, only various other error messages. I know this is probably simple and maybe I just couldn't find the right source on the web, but I'm stuck. Anyone care to help?
Thanks,
Mike
The issue is that you're originally setting the value to be a string (i.e. the result of row.getValue) but then trying to append to it if the key already exists. You need to set the original value to a list containing a single string. Change the last line to this:
definitionQuery_Dict[row.getValue("dataSource")] = [row.getValue("Definition_Query")]
(Notice the brackets around the value.)
ndpu has a good point about using defaultdict, but if you're using that, you should always append, i.e. replace the whole if/else statement with the append you're currently doing in the if clause.
Your dictionary has keys and values. If you want to add to the values as you go, then each value has to be a type that can be extended/expanded, like a list or another dictionary. Currently each value in your dictionary is a string, where what you want instead is a list containing strings. If you use lists, you can do something like:
mydict = {}
records = [('a', 2), ('b', 3), ('a', 4)]
for key, data in records:
    # If this is a new key, create a list to store
    # the values
    if key not in mydict:
        mydict[key] = []
    mydict[key].append(data)
Output:
mydict
Out[4]: {'a': [2, 4], 'b': [3]}
Note that even though 'b' only has one value, that single value still has to be put in a list, so that it can be added to later on.
Use collections.defaultdict:
from collections import defaultdict
definitionQuery_Dict = defaultdict(list)
# ...
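For illustration, here is a minimal sketch of how the whole loop might look with defaultdict. The `rows` list and its values are hypothetical stand-ins for what the arcpy cursor returns; only the append logic is the point.

```python
from collections import defaultdict

# Hypothetical stand-in for the rows read by arcpy.SearchCursor:
# (dataSource, Definition_Query) pairs, where the query may be None.
rows = [
    ("roads", "TYPE = 'highway'"),
    ("parcels", None),
    ("roads", "LANES > 2"),
]

definitionQuery_Dict = defaultdict(list)
for data_source, definition_query in rows:
    if definition_query is not None:
        # A missing key gets an empty list automatically, so we always append.
        definitionQuery_Dict[data_source].append(definition_query)

print(dict(definitionQuery_Dict))
```

With defaultdict(list) there is no if/else on key existence at all; the only remaining test is whether the row has a query.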
Related
I am trying to build a dictionary based on a larger input of text. From this input, I will create nested dictionaries which will need to be updated as the program runs. The structure ideally looks like this:
nodes = {}
node_name: {
    inc_name: inc_capacity,
    inc_name: inc_capacity,
    inc_name: inc_capacity,
}
Because of the nature of this input, I would like to use variables to dynamically create dictionary keys (or access them if they already exist). But I get KeyError if the key doesn't already exist. I assume I could do a try/except, but was wondering if there was a 'cleaner' way to do this in python. The next best solution I found is illustrated below:
test_dict = {}
inc_color = 'light blue'
inc_cap = 2
test_dict[f'{inc_color}'] = inc_cap
# test_dict returns >>> {'light blue': 2}
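One way to avoid the KeyError without try/except is dict.setdefault, which returns the existing inner dictionary or creates it on first use. The sketch below assumes the input has already been parsed into (node_name, inc_name, inc_capacity) tuples; the `records` list is made up for illustration.

```python
nodes = {}

# Hypothetical (node_name, inc_name, inc_capacity) records parsed from the input.
records = [
    ("light blue", "dark red", 2),
    ("light blue", "faded green", 5),
    ("dark red", "muted yellow", 1),
]

for node_name, inc_name, inc_capacity in records:
    # setdefault returns the existing inner dict, or inserts and returns a new one.
    nodes.setdefault(node_name, {})[inc_name] = inc_capacity

print(nodes)
```

A collections.defaultdict(dict) would behave the same way here; setdefault just keeps `nodes` a plain dict.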
Try this code for large-scale input, for example input read from a file.
Let me give you an example of what I am aiming for; I think this is what you want.
File.txt
Person1: 115.5
Person2: 128.87
Person3: 827.43
Person4:'18.9
Numerical Validation Function
def is_number(a):
    try:
        float(a)
    except ValueError:
        return False
    else:
        return True
Code for dictionary File.txt
adict = {}
with open("File.txt") as data:
    for line in data:
        # Split on the first ':' into key and value
        key, _, value = line.partition(':')
        value = value.strip()
        # Keep only the entries whose value is numeric
        if is_number(value):
            adict[key] = value
print(adict)
Output
{'Person1': '115.5', 'Person2': '128.87', 'Person3': '827.43'}
For more explanation, see the answer to "How to fix the errors in my code for making a dictionary from a file".
As already mentioned in the comments sections, you can use setdefault.
Here's how I will implement it.
Assume I want to add values to the dict node_name, and I have the keys and values in two lists: the keys are in inc_names and the values are in inc_ccity. Then I will use the code below to load them. Note that the key inc_name2 appears twice in the key list, so its second occurrence will not be entered into the dictionary.
node_name = {}
inc_names = ['inc_name1','inc_name2','inc_name3','inc_name2']
inc_ccity = ['inc_capacity1','inc_capacity2','inc_capacity3','inc_capacity4']
for i, names in enumerate(inc_names):
    node = node_name.setdefault(names, inc_ccity[i])
    if node != inc_ccity[i]:
        print('Key=', names, 'already exists with value', node, '. New value=', inc_ccity[i], 'skipped')
print('\nThe final list of values in the dict node_name are :')
print(node_name)
The output of this will be:
Key= inc_name2 already exists with value inc_capacity2 . New value= inc_capacity4 skipped
The final list of values in the dict node_name are :
{'inc_name1': 'inc_capacity1', 'inc_name2': 'inc_capacity2', 'inc_name3': 'inc_capacity3'}
This way you can add values into a dictionary using variables.
I am trying to find a way to remove duplicates from a list of dicts. I don't have to test the entire object contents because the "name" value in a given object is enough to identify duplication (i.e., duplicate name = duplicate object). My current attempt is this:
newResultArray = []
for i in range(0, len(resultArray)):
    for j in range(0, len(resultArray)):
        if(i != j):
            keyI = resultArray[i]['name']
            keyJ = resultArray[j]['name']
            if(keyI != keyJ):
                newResultArray.append(resultArray[i])
This is wildly incorrect. I'd be grateful for any suggestions. Thank you.
If name is unique, you should just use a dictionary to store your inner dictionaries, with name being the key. Then you won't even have the issue of duplicates, and you can remove an entry from the dictionary in O(1) time.
Since I don't have access to the code that populates resultArray, I'll simply show how you can convert it into a dictionary in linear time, although the best option would be to use a dictionary instead of resultArray in the first place, if possible.
new_dictionary = {}
for item in resultArray:
    new_dictionary[item['name']] = item
If you must have a list in the end, then you can convert the dictionary back into a list as such:
new_list = list(new_dictionary.values())
Since "name" provides uniqueness... and assuming "name" is a hashable object, you can build an intermediate dictionary keyed by "name". Any like-named dicts will simply overwrite their predecessor in the dict, giving you a list of unique dictionaries.
tmpDict = {result["name"]:result for result in resultArray}
newArray = list(tmpDict.values())
del tmpDict
You could shrink that down to
newArray = list({result["name"]:result for result in resultArray}.values())
which may be a bit obscure.
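Note that the dictionary approach keeps the last dict seen for each name. If the first occurrence should win instead, a set of already-seen names is one possible variant. This is only a sketch; the sample `resultArray` contents are hypothetical.

```python
# Hypothetical sample data; only the "name" field decides duplication.
resultArray = [
    {"name": "alpha", "value": 1},
    {"name": "beta", "value": 2},
    {"name": "alpha", "value": 3},
]

seen_names = set()
newResultArray = []
for result in resultArray:
    # Keep a dict only the first time its name shows up.
    if result["name"] not in seen_names:
        seen_names.add(result["name"])
        newResultArray.append(result)

print(newResultArray)
```

This stays linear in the length of the list and preserves the original order.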
Summary of issue: I'm trying to create a nested Python dictionary, with keys defined by pre-defined variables and strings. And I'm populating the dictionary from regular expressions outputs. This mostly works. But I'm getting an error because the nested dictionary - not the main one - doesn't like having the key set to a string, it wants an integer. This is confusing me. So I'd like to ask you guys how I can get a nested python dictionary with string keys.
Below I'll walk you through the steps of what I've done. What is working, and what isn't. Starting from the top:
# Regular expressions module
import re
# Read text data from a file
file = open("dt.cc", "r")
dtcc = file.read()
# Create a list of stations from regular expression matches
stations = sorted(set(re.findall(r"\n(\w+)\s", dtcc)))
The result is good, and is something like this:
stations = ['AAAA','BBBB','CCCC','DDDD']
# Initialize a new dictionary
rows = {}
# Loop over each station in the station list, and start populating
for station in stations:
    rows[station] = re.findall(r"%s\s(.+)" % station, dtcc)
The result is good, and is something like this:
rows['AAAA'] = ['AAAA 0.1132 0.32 P',...]
However, when I try to create a sub-dictionary with a string key:
for station in stations:
    rows[station] = re.findall(r"%s\s(.+)" % station, dtcc)
    rows[station]["dt"] = re.findall(r"%s\s(\S+)" % station, dtcc)
I get the following error.
"TypeError: list indices must be integers, not str"
It doesn't seem to like that I'm specifying the second dictionary key as "dt". If I give it a number instead, it works just fine. But then my dictionary key name is a number, which isn't very descriptive.
Any thoughts on how to get this working?
The issue is that by doing
rows[station] = re.findall(...)
you are creating a dictionary with the station names as keys and the return values of re.findall as values, which happen to be lists. So when you then do
rows[station]["dt"] = re.findall(...)
rows[station] on the left-hand side is a list, and lists are indexed by integers, which is what the TypeError is complaining about. You could do rows[station][0], for example, to get the first match from the regex. You said you want a nested dictionary, so you could do
rows[station] = dict()
rows[station]["dt"] = re.findall(...)
To make it a bit nicer, a data structure that you could use instead is a defaultdict from the collections module.
The defaultdict is a dictionary that takes a factory for its default values: you pass a type constructor (or any other zero-argument callable) as its argument. For example, dictlist = defaultdict(list) defines a dictionary whose values are lists. Then immediately doing dictlist[key].append(item1) is legal, as the list is created automatically the first time a missing key is accessed.
In your case you could do
from collections import defaultdict
rows = defaultdict(dict)
for station in stations:
    rows[station]["bulk"] = re.findall(r"%s\s(.+)" % station, dtcc)
    rows[station]["dt"] = re.findall(r"%s\s(\S+)" % station, dtcc)
Here you have to assign the first regex result to a new key, "bulk" in this case, but you can call it whatever you like. Hope this helps.
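To try the defaultdict(dict) approach without the original file, here is a self-contained sketch. The `dtcc` string is a made-up stand-in for the contents of dt.cc, and the use of re.escape is an extra precaution that the question's code does not have.

```python
import re
from collections import defaultdict

# Hypothetical stand-in for the contents of dt.cc.
dtcc = "\nAAAA 0.1132 0.32 P\nBBBB 0.2471 0.55 S\n"

stations = sorted(set(re.findall(r"\n(\w+)\s", dtcc)))

rows = defaultdict(dict)
for station in stations:
    # re.escape is not strictly needed for \w+ names, but keeps the pattern safe.
    name = re.escape(station)
    rows[station]["bulk"] = re.findall(r"%s\s(.+)" % name, dtcc)
    rows[station]["dt"] = re.findall(r"%s\s(\S+)" % name, dtcc)

print(rows["AAAA"]["dt"])
```

Each station now maps to an inner dictionary with string keys, so rows["AAAA"]["dt"] and rows["AAAA"]["bulk"] can be read independently.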
The current code I have is category1[name]=(number). However, if the same name comes up again, the value in the dictionary is replaced by the new number. How would I make it so that, instead of the value being replaced, the original value is kept and the new value is also added, giving the key two values? Thanks.
You would have to make the dictionary point to lists instead of numbers, for example if you had two numbers for category cat1:
categories["cat1"] = [21, 78]
To make sure you add the new numbers to the list rather than replacing them, check it's in there first before adding it:
cat_val = 12  # Some value
if cat_key in categories:
    categories[cat_key].append(cat_val)
else:
    # Initialise it to a list containing one item
    categories[cat_key] = [cat_val]
To access the values, you simply use categories[cat_key], which would return [12] if the key had a single value of 12, and [12, 95] if there were two values for that key.
Note that if you don't want to store duplicate values you can use a set rather than a list:
cat_val = 12  # Some value
if cat_key in categories:
    categories[cat_key].add(cat_val)
else:
    # Initialise it to a set containing one item;
    # set(cat_val) would fail for a number, so use a set literal
    categories[cat_key] = {cat_val}
A key only has one value, so you would need to make that value a tuple, list, etc.
If you know you are going to have multiple values for a key, then I suggest you make the values capable of handling this when they are created.
It's a little hard to understand your question.
I think you want this:
>>> d[key] = [4]
>>> d[key].append(5)
>>> d[key]
[4, 5]
Depending on what you expect, you could check if name - a key in your dictionary - already exists. If so, you might be able to change its current value to a list, containing both the previous and the new value.
I didn't test this, but maybe you want something like this:
mydict = {'key_1' : 'value_1', 'key_2' : 'value_2'}
another_key = 'key_2'
another_value = 'value_3'
if another_key in mydict:
    # another_key does already exist in mydict
    mydict[another_key] = [mydict[another_key], another_value]
else:
    # another_key doesn't exist in mydict
    mydict[another_key] = another_value
Be careful when doing this more than one time! If it could happen that you want to store more than two values, you might want to add another check - to see if mydict[another_key] already is a list. If so, use .append() to add the third, fourth, ... value to it.
Otherwise you would get a collection of nested lists.
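The extra check described above could look roughly like this. The helper name `add_value` is hypothetical, and the sketch assumes the stored values are never lists themselves; otherwise a single list value would be mistaken for a collection of values.

```python
def add_value(mapping, key, value):
    """Store value under key, turning the entry into a list on the second value."""
    if key not in mapping:
        # First value for this key: store it as-is.
        mapping[key] = value
    elif isinstance(mapping[key], list):
        # Already a list of values: just append.
        mapping[key].append(value)
    else:
        # Second value: wrap the old and the new value in a list.
        mapping[key] = [mapping[key], value]


mydict = {'key_1': 'value_1', 'key_2': 'value_2'}
add_value(mydict, 'key_2', 'value_3')
add_value(mydict, 'key_2', 'value_4')
print(mydict)
```

Storing every value as a list from the start avoids this type check altogether, which is why most of the other answers recommend it.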
You can create a dictionary that maps each key to a list of values, and then add each new value to the list stored at that key.
d = {}
d["name"] = [1]
x = d["name"]
d["name"] = x + [2]
I guess this is the easiest way:
category1 = {}
category1['firstKey'] = [7]
category1['firstKey'] += [9]
category1['firstKey']
should give you:
[7, 9]
So, just use lists of numbers instead of numbers.
I have a for loop that goes through two lists and combines them in dictionary. Keys are strings (web page headers) and values are lists (containing links).
Sometimes I get the same key from the loop that already exists in the dictionary. Which is fine. But the value is different (new links) and I'd like to update the key's value in a way where I append the links instead of replacing them.
The code looks something like that below. Note: issue_links is a list of URLs
for index, link in enumerate(issue_links):
    issue_soup = BeautifulSoup(urllib2.urlopen(link))
    image_list = []
    for image in issue_soup.findAll('div', 'mags_thumb_article'):
        issue_name = issue_soup.findAll('h1','top')[0].text
        image_list.append(the_url + image.a['href'])
    download_list[issue_name] = image_list
Currently the new links (image_list) that belong to the same issue_name key get overwritten. I'd like instead to append them. Someone told me to use collections.defaultdict module but I'm not familiar with it.
Note: I'm using enumerate because the index gets printed to the console (not included in the code).
Something like this:
from collections import defaultdict
d = defaultdict(list)
d["a"].append(1)
d["a"].append(2)
d["b"].append(3)
Then:
print(d)
defaultdict(<class 'list'>, {'a': [1, 2], 'b': [3]})
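Applied to the scraping loop from the question, the idea might look like the sketch below. The `scraped_pages` list is a hypothetical stand-in for what the BeautifulSoup loop produces per page, and the URLs are placeholders.

```python
from collections import defaultdict

# Hypothetical stand-in for the scraped pages: (issue_name, image_list) pairs.
scraped_pages = [
    ("Issue 1", ["http://example.com/a.jpg", "http://example.com/b.jpg"]),
    ("Issue 2", ["http://example.com/c.jpg"]),
    ("Issue 1", ["http://example.com/d.jpg"]),
]

download_list = defaultdict(list)
for issue_name, image_list in scraped_pages:
    # extend adds each link individually; append would nest the whole list.
    download_list[issue_name].extend(image_list)

print(dict(download_list))
```

Using extend rather than append keeps each value a flat list of links, even when the same issue name turns up on several pages.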
if issue_name in download_list:
    # extend keeps one flat list of links; append would nest image_list inside it
    download_list[issue_name].extend(image_list)
else:
    download_list[issue_name] = image_list
Is this what you want? If the key already exists, extend its list with the new links.