How to assign a dictionary with empty or null value - python

I have some code that uses a dict to store a set of key-value pairs.
The number of key-value pairs is not fixed: one iteration may produce 2 pairs, another 4 or 5.
I currently start with an empty dict, like this:
cost_dict = {}
When a regular expression pattern is found in a text string, I extract parts of that text as key-value pairs and populate the dict with them.
Where the pattern is not found, I catch the resulting AttributeError, and in that case I want the dict to be assigned a null or blank value.
For example:
cost_dict = {}
try:
    cost_breakdown = re.search(regex, output).group()
except AttributeError:
    cost_dict = ' '  # this part I am not sure how to do
... (if pattern matches extract the text and populate the above dict as key-value)
But I am not sure how to assign a null or blank value to the dict, since the line above obviously rebinds cost_dict to a string instead of leaving it as the empty dict defined earlier.
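One possible way to handle the no-match case is sketched below. It relies on the fact that re.search returns None when nothing matches, so the dict can simply be left empty (or the function could return None if "no match" has to be distinguishable from "matched but empty"). The function name extract_costs, the sample pattern, and the key=value layout of the matched text are assumptions made for illustration; they are not taken from the question.

```python
import re

def extract_costs(output, regex):
    """Return a dict of key-value pairs, or an empty dict when nothing matches."""
    cost_dict = {}
    match = re.search(regex, output)
    if match is None:
        # Pattern not found: leave the dict empty instead of rebinding it to a string.
        return cost_dict
    # Hypothetical format: the matched text contains "name=number" items.
    for key, value in re.findall(r"(\w+)=(\d+)", match.group()):
        cost_dict[key] = int(value)
    return cost_dict

print(extract_costs("cost: a=1 b=2", r"cost:.*"))
print(extract_costs("nothing here", r"cost:.*"))
```

An empty dict is falsy, so callers can still write `if not cost_dict:` to detect the no-match case.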

Related

How to create a nested python dictionary with keys as strings?

Summary of issue: I'm trying to create a nested Python dictionary, with keys defined by pre-defined variables and strings. And I'm populating the dictionary from regular expressions outputs. This mostly works. But I'm getting an error because the nested dictionary - not the main one - doesn't like having the key set to a string, it wants an integer. This is confusing me. So I'd like to ask you guys how I can get a nested python dictionary with string keys.
Below I'll walk you through the steps of what I've done. What is working, and what isn't. Starting from the top:
# Regular expressions module
import re
# Read text data from a file
file = open("dt.cc", "r")
dtcc = file.read()
# Create a list of stations from regular expression matches
stations = sorted(set(re.findall(r"\n(\w+)\s", dtcc)))
The result is good, and looks something like this:
stations = ['AAAA','BBBB','CCCC','DDDD']
# Initialize a new dictionary
rows = {}
# Loop over each station in the station list, and start populating
for station in stations:
    rows[station] = re.findall(r"%s\s(.+)" % station, dtcc)
The result is good, and is something like this:
rows['AAAA'] = ['AAAA 0.1132 0.32 P',...]
However, when I try to create a sub-dictionary with a string key:
for station in stations:
    rows[station] = re.findall(r"%s\s(.+)" % station, dtcc)
    rows[station]["dt"] = re.findall(r"%s\s(\S+)" % station, dtcc)
I get the following error.
"TypeError: list indices must be integers, not str"
It doesn't seem to like that I'm specifying the second dictionary key as "dt". If I give it a number instead, it works just fine. But then my dictionary key name is a number, which isn't very descriptive.
Any thoughts on how to get this working?
The issue is that by doing
rows[station] = re.findall(...)
you are creating a dictionary with the station names as keys and the return values of re.findall as values, and those return values are lists. So when you then write
rows[station]["dt"] = re.findall(...)
rows[station] on the left-hand side is a list, which can only be indexed by integers, and that is what the TypeError is complaining about. rows[station][0], for example, would give you the first match from the regex. You said you want a nested dictionary, so you could do
rows[station] = dict()
rows[station]["dt"] = re.findall(...)
To make it a bit nicer, you could instead use a defaultdict from the collections module.
A defaultdict is a dictionary that takes a default factory as its first argument: a callable (typically a type constructor) used to create the value for any key that is missing. For example, dictlist = defaultdict(list) defines a dictionary whose values default to lists. Doing dictlist[key].append(item1) straight away is then legal, because the empty list is created automatically the first time the missing key is accessed.
In your case you could do
from collections import defaultdict
rows = defaultdict(dict)
for station in stations:
    rows[station]["bulk"] = re.findall(r"%s\s(.+)" % station, dtcc)
    rows[station]["dt"] = re.findall(r"%s\s(\S+)" % station, dtcc)
Note that you have to assign the first regex result to a new key, "bulk" here, but you can call it whatever you like. Hope this helps.
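For reference, here is a self-contained version of the defaultdict(dict) approach that can be run without the dt.cc file. The inline sample text stands in for the file contents and is made up for illustration; the regular expressions are the ones from the question.

```python
import re
from collections import defaultdict

# Made-up sample standing in for the contents of dt.cc.
dtcc = "\nAAAA 0.1132 0.32 P\nBBBB 0.2 0.5 S\nAAAA 0.3 0.1 P\n"

stations = sorted(set(re.findall(r"\n(\w+)\s", dtcc)))

# Each missing station key gets a fresh inner dict on first access.
rows = defaultdict(dict)
for station in stations:
    rows[station]["bulk"] = re.findall(r"%s\s(.+)" % station, dtcc)
    rows[station]["dt"] = re.findall(r"%s\s(\S+)" % station, dtcc)

print(rows["AAAA"]["dt"])  # ['0.1132', '0.3']
```

If station names could contain regex metacharacters, wrapping them in re.escape before interpolation would be safer; with \w+ names as here it makes no difference.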

Parsing string through dictionary

ab='TS_Automation=Manual;TS_Method=Test;TS_Priority=1;TS_Tested_By=rjrjjn;TS_Written_By=SUN;TS_Review_done=No;TS_Regression=No;'
a={'TS_Automation':'Automated','TS_Tested_By':'qz9ghv','TS_Review_done':'yes'}
I have a string and a dictionary. I need to change the values in the string based on the keys of the dictionary. If a key is not in the dictionary, its key-value pair needs to be removed from the string. For example, TS_Method is not in the dictionary, so it needs to be removed from the string ab.
Am I correct in understanding that you don't want to keep key-value pairs in the string if they don't occur in the dictionary? If that's the case, you can simply parse the dictionary to that particular string format. In your case it's simply in the form key=value; for each entry in the dictionary:
ab = ''
for key, value in a.items():
    ab += "{}={};".format(key, value)
You would have to create a new string.
I would do it with the find method, searching for each key=value pair built from the dictionary.
If the pair being searched for exists in the string, I would append it to a new string:
s = ''
for val in a:
    word = val + '=' + a[val]
    wordLen = len(word)
    x = ab.find(word)
    if x != -1:
        # slice from the match position to the end of the matched pair
        s += ab[x:x + wordLen]
myvalue = ''
for key, value in a.items():
    myvalue = myvalue + "{}={};".format(key, value)
ab = myvalue
Just convert the dict to the desired formatted string and use that. There is no need to remove keys from the string, since your requirement amounts to using the dict as-is in string format.
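If the order of the entries in the original string matters, a variation on the answers above is to walk the string's own pairs and keep only the keys present in the dictionary, taking the new value from the dictionary. This is a sketch under the assumption that every entry in ab has the form key=value and is terminated by a semicolon; the names pairs and result are introduced here for illustration.

```python
ab = ('TS_Automation=Manual;TS_Method=Test;TS_Priority=1;TS_Tested_By=rjrjjn;'
      'TS_Written_By=SUN;TS_Review_done=No;TS_Regression=No;')
a = {'TS_Automation': 'Automated', 'TS_Tested_By': 'qz9ghv', 'TS_Review_done': 'yes'}

# Split "k=v;k=v;" into [key, value] pairs, skipping the empty tail after the last ';'.
pairs = [item.split('=', 1) for item in ab.split(';') if item]

# Keep only keys found in the dictionary, substituting the dictionary's value.
result = ''.join('{}={};'.format(key, a[key]) for key, _ in pairs if key in a)

print(result)  # TS_Automation=Automated;TS_Tested_By=qz9ghv;TS_Review_done=yes;
```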

Adding Multiple Values to a Single Key in Python Dictionary

Python dictionaries really have me stumped today. I've been poring over Stack Overflow, trying to find a way to do a simple append of a new value to an existing key in a Python dictionary, and I'm failing at every attempt, even though I'm using the same syntax I see on here.
This is what I am trying to do:
# cursor search of an xls file
definitionQuery_Dict = {}
for row in arcpy.SearchCursor(xls):
    # set some source paths from strings in the xls file
    dataSourcePath = str(row.getValue("workspace_path")) + "\\" + str(row.getValue("dataSource"))
    dataSource = row.getValue("dataSource")
    # Add items to the dictionary. The keys are the data source tables and the
    # values will be definition (SQL) queries. First test whether a definition
    # query exists in the row; if it does, add the key-value pair to the dictionary.
    if row.getValue("Definition_Query") is not None:
        # if key already exists, then append a new value to the value list
        if row.getValue("dataSource") in definitionQuery_Dict:
            definitionQuery_Dict[row.getValue("dataSource")].append(row.getValue("Definition_Query"))
        else:
            # otherwise, add a new key, value pair
            definitionQuery_Dict[row.getValue("dataSource")] = row.getValue("Definition_Query")
I get an attribute error:
AttributeError: 'unicode' object has no attribute 'append'
But I believe I am doing the same as the answer provided here
I've tried various other methods with no luck, only various other error messages. I know this is probably simple and maybe I just couldn't find the right source on the web, but I'm stuck. Anyone care to help?
Thanks,
Mike
The issue is that you're originally setting the value to be a string (i.e. the result of row.getValue) but then trying to append to it if the key already exists. You need to set the original value to a list containing a single string. Change the last line to this:
definitionQuery_Dict[row.getValue("dataSource")] = [row.getValue("Definition_Query")]
(notice the brackets round the value).
ndpu has a good point with the use of defaultdict: but if you're using that, you should always do append - ie replace the whole if/else statement with the append you're currently doing in the if clause.
Your dictionary has keys and values. If you want to add to the values as you go, then each value has to be a type that can be extended/expanded, like a list or another dictionary. Currently each value in your dictionary is a string, where what you want instead is a list containing strings. If you use lists, you can do something like:
mydict = {}
records = [('a', 2), ('b', 3), ('a', 4)]
for key, data in records:
    # If this is a new key, create a list to store
    # the values
    if key not in mydict:
        mydict[key] = []
    mydict[key].append(data)
Output:
mydict
Out[4]: {'a': [2, 4], 'b': [3]}
Note that even though 'b' only has one value, that single value still has to be put in a list, so that it can be added to later on.
Use collections.defaultdict:
from collections import defaultdict
definitionQuery_Dict = defaultdict(list)
# ...
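To make the defaultdict suggestion concrete, here is a minimal sketch using the same sample records as the list-based answer above rather than the arcpy cursor. The variable name grouped is introduced for illustration. With defaultdict(list) the if/else disappears, because a missing key gets an empty list the first time it is accessed.

```python
from collections import defaultdict

records = [('a', 2), ('b', 3), ('a', 4)]

grouped = defaultdict(list)
for key, data in records:
    # No membership test needed: a missing key starts out as [].
    grouped[key].append(data)

print(dict(grouped))  # {'a': [2, 4], 'b': [3]}
```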

eliminate '\n' in dictionary

I have a dictionary that looks like this; the DNA sequences are the keys and the quality values are the values:
{'TTTGTTCTTTTTGTAATGGGGCCAGATGTCACTCATTCCACATGTAGTATCCAGATTGAAATGAAATGAGGTAGAACTGACCCAGGCTGGACAAGGAAGG\n':
'eeeecdddddaaa`]eceeeddY\\cQ]V[F\\\\TZT_b^[^]Z_Z]ac_ccd^\\dcbc\\TaYcbTTZSb]Y]X_bZ\\a^^\\S[T\\aaacccBBBBBBBBBB\n',
'ACTTATATTATGTTGACACTCAAAAATTTCAGAATTTGGAGTATTTTGAATTTCAGATTTTCTGATTAGGGATGTACCTGTACTTTTTTTTTTTTTTTTT\n':
'dddddd\\cdddcdddcYdddd`d`dcd^dccdT`cddddddd^dddddddddd^ddadddadcd\\cda`Y`Y`b`````adcddd`ddd_dddadW`db_\n',
'CTGCCAGCACGCTGTCACCTCTCAATAACAGTGAGTGTAATGGCCATACTCTTGATTTGGTTTTTGCCTTATGAATCAGTGGCTAAAAATATTATTTAAT\n':
'deeee`bbcddddad\\bbbbeee\\ecYZcc^dd^ddd\\\\`]``L`ccabaVJ`MZ^aaYMbbb__PYWY]RWNUUab`Y`BBBBBBBBBBBBBBBBBBBB\n'}
I want to write a function so that if I query a DNA sequence, it returns a tuple of that DNA sequence and its corresponding quality value.
I wrote the following function, but it gives me an error message that says "list indices must be integers, not str":
def query_sequence_id(self, dna_seq=''):
    """Overrides the query_sequence_id so that it optionally returns both the sequence and the quality values.
    If DNA sequence does not exist in the class, return a string error message"""
    list_dna = []
    for t in self.__fastqdict.keys():
        list_dna.append(t.rstrip('\n'))
    self.dna_seq = dna_seq
    if self.dna_seq in list_dna:
        return (self.dna_seq,self.__fastqdict.values()[self.dna_seq + "\n"])
    else:
        return "This DNA sequence does not exist"
so I want something like if I print
query_sequence_id("TTTGTTCTTTTTGTAATGGGGCCAGATGTCACTCATTCCACATGTAGTATCCAGATTGAAATGAAATGAGGTAGAACTGACCCAGGCTGGACAAGGAAGG"),
I would get
('TTTGTTCTTTTTGTAATGGGGCCAGATGTCACTCATTCCACATGTAGTATCCAGATTGAAATGAAATGAGGTAGAACTGACCCAGGCTGGACAAGGAAGG',
'eeeecdddddaaa`]eceeeddY\\cQ]V[F\\\\TZT_b^[^]Z_Z]ac_ccd^\\dcbc\\TaYcbTTZSb]Y]X_bZ\\a^^\\S[T\\aaacccBBBBBBBBBB')
I want to get rid of "\n" for both keys and values, but my code failed. Can anyone help me fix my code?
The newline characters aren't your problem, though they are messy. You're trying to index the result of dict.values() with a string. In Python 2 that result is a plain list, which is why the error mentions list indices; in Python 3 it is a view. Either way it is an iterable of values, not a mapping like a dict, so indexing it by key is not what you want and defeats the whole purpose of using a dictionary in the first place. Just look up the value in the dictionary the normal way:
return (self.dna_seq, self.__fastqdict[self.dna_seq + "\n"])
As for the newlines, why not just take them out when you build the dictionary in the first place?
To modify the dictionary you can just do the following:
myNewDict = {}
for var in myDict:
    myNewDict[var.strip()] = myDict[var].strip()
You can remove those pesky newlines from your dictionary's keys and values like this (assuming your dictionary is stored in a variable named dna):
dna = {k.rstrip(): v.rstrip() for k, v in dna.items()}
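Putting the two suggestions together, a minimal standalone sketch might look like the following. The short sequences, the names fastq, clean and query_sequence, and the use of a plain function instead of the class method are all simplifications made for illustration; the error string is the one from the question.

```python
# Shortened stand-in data; real keys and values would be full reads.
fastq = {'ACGT\n': 'eeee\n', 'TTGA\n': 'dd`d\n'}

# Strip the trailing newlines once, when building the lookup table.
clean = {seq.rstrip('\n'): qual.rstrip('\n') for seq, qual in fastq.items()}

def query_sequence(dna_seq, table=clean):
    """Return (sequence, quality) if the sequence is known, else an error string."""
    if dna_seq in table:
        # Direct dictionary lookup; no need to go through .values().
        return (dna_seq, table[dna_seq])
    return "This DNA sequence does not exist"

print(query_sequence('ACGT'))  # ('ACGT', 'eeee')
```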

Dynamic Python Lists

In the Python code below, I am dynamically creating lists:
g['quest_{0}'.format(random(x))] = []
where random(x) is a random number. How do I print the lists (i.e. get the names of the dynamically created lists)?
To get a list of all the keys of your dictionary :
list(g.keys())
Generating the keys dynamically makes no difference: g is still a regular dictionary.
Note that you can also use any hashable object as a key, such as a tuple:
g[('quest', random(x))] = []
Which will let you get a list of all your quest numbers easily :
[number for tag, number in g.keys() if tag == "quest"]
With this technique, you can actually loop through the tag ('quest'), the number and the value in one loop:
for (tag, number), value in g.items():
    pass  # do something with tag, number and value
Unpacking is your best friend in Python.
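The tuple-key idea can be tried out with a small runnable sketch. Since random(x) in the question is not a real function, random.randint from the standard library is used here as a stand-in, and the extra ('other', 1) entry is made up to show the tag filter doing something.

```python
import random

random.seed(0)  # fixed seed only so repeated runs behave the same

g = {}
for _ in range(3):
    # Tuple key: a tag plus a number, instead of a formatted string.
    g[('quest', random.randint(1, 100))] = []
g[('other', 1)] = ['x']  # hypothetical non-quest entry

# All quest numbers, recovered by unpacking the tuple keys.
quest_numbers = [number for tag, number in g.keys() if tag == 'quest']

for (tag, number), value in g.items():
    print(tag, number, value)
```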
You can iterate over the g dictionary with a for loop, like this
for key, value in g.items():
    print(key, value)
This will print all the keys and their corresponding lists.
