Create nested dictionaries during for loop - python

I have created a list during a for loop which works well but I want create a dictionary instead.
from System.Collections.Generic import List
#Collector
viewPorts = list(FilteredElementCollector(doc).OfClass(Viewport))
#create a dictionary
viewPortDict = {}
#add Sheet Number, View Name and boxoutline to dictionary
for vp in viewPorts:
sheet = doc.GetElement(vp.SheetId)
view = doc.GetElement(vp.ViewId)
vbox = vp.GetBoxOutline()
viewPortDict = {view.ViewName : {'sheetNum': sheet.SheetNumber, 'viewBox' : vbox}}
print(viewPortDict)
The output from this is as follows:
{'STEEL NOTES': {'viewBox': <Autodesk.Revit.DB.Outline object at 0x000000000000065A [Autodesk.Revit.DB.Outline]>, 'sheetNum': 'A0.07'}}
Which the structure is perfect but I want it to grab everything as while it does the for loop it seems to stop on the first loop. Why is that? And how can I get it to keep the loop going?
I have tried various things like creating another list of keys called "Keys" and list of values called "viewPortList" like:
dict.fromkeys(Keys, viewPortList)
But I always have the same problem I am not able to iterate over all elements. For full disclosure I am successful when I create a list instead. Here is what that looks like.
from System.Collections.Generic import List
#Collector
viewPorts = list(FilteredElementCollector(doc).OfClass(Viewport))
#create a dictionary
viewPortList = []
#add Sheet Number, View Name and boxoutline to dictionary
for vp in viewPorts:
sheet = doc.GetElement(vp.SheetId)
view = doc.GetElement(vp.ViewId)
vbox = vp.GetBoxOutline()
viewPortList.append([sheet.SheetNumber, view.ViewName, vbox])
print(viewPortList)
Which works fine and prints the below (only portion of a long list)
[['A0.01', 'APPLICABLE CODES', <Autodesk.Revit.DB.Outline object at 0x000000000000060D [Autodesk.Revit.DB.Outline]>], ['A0.02', etc.]
But I want a dictionary instead. Any help would be appreciated. Thanks!

In your list example, you are appending to the list. In your dictionary example, you are creating a new dictionary each time (thus removing the data from previous iterations of the loop). You can do the equivalent of appending to it as well by just assigning to a particular key in the existing dictionary.
viewPortDict[view.ViewName] = {'sheetNum': sheet.SheetNumber, 'viewBox' : vbox}

Related

Need help adding value to a variable

Here is the whole code section
for entry in auth_log:
# timestamp is converted to milliseconds for CEF
# repr is used to keep '\\' in the domain\username
extension = {
'rt=': str(time.ctime(int(entry['timestamp']))),
'src=': entry['ip'],
'dhost=': entry['host'],
'duser=': repr(entry['username']).lstrip("u").strip("'"),
'outcome=': entry['result'],
'cs1Label=': 'new_enrollment',
'cs1=': str(entry['new_enrollment']),
'cs2Label=': 'factor',
'cs2=': entry['factor'],
'ca3Label=': 'integration',
'cs3=': entry['integration'],
}
log_to_cef(entry['eventtype'], entry['eventtype'], **extension)
In line 5 (rt=), I would like to add the timestamp output to a variable where I can call it later in the script.
You can access the value from the dictionary directly with extension["rt="]?
If you are looking for a way to have a list of all the variables outside of your loop you can use this method.
Before your loop you should make an empty list like this:
extensionRt = []
Then after extension is created inside each loop use:
extensionRt.append(extension["rt="])
You can then access the values in this list by index:
extensionRt[YOUR INDEX HERE]

Parsing and arranging text in python

I'm having some trouble figuring out the best implementation
I have data in file in this format:
|serial #|machine_name|machine_owner|
If a machine_owner has multiple machines, I'd like the machines displayed in a comma separated list in the field. so that.
|1234|Fred Flinstone|mach1|
|5678|Barney Rubble|mach2|
|1313|Barney Rubble|mach3|
|3838|Barney Rubble|mach4|
|1212|Betty Rubble|mach5|
Looks like this:
|Fred Flinstone|mach1|
|Barney Rubble|mach2,mach3,mach4|
|Betty Rubble|mach5|
Any hints on how to approach this would be appreciated.
You can use dict as temporary container to group by name and then print it in desired format:
import re
s = """|1234|Fred Flinstone|mach1|
|5678|Barney Rubble|mach2|
|1313|Barney Rubble||mach3|
|3838|Barney Rubble||mach4|
|1212|Betty Rubble|mach5|"""
results = {}
for line in s.splitlines():
_, name, mach = re.split(r"\|+", line.strip("|"))
if name in results:
results[name].append(mach)
else:
results[name] = [mach]
for name, mach in results.items():
print(f"|{name}|{','.join(mach)}|")
You need to store all the machines names in a list. And every time you want to append a machine name, you run a function to make sure that the name is not already in the list, so that it will not put it again in the list.
After storing them in an array called data. Iterate over the names. And use this function:
data[i] .append( [ ] )
To add a list after each machine name stored in the i'th place.
Once your done, iterate over the names and find them in in the file, then append the owner.
All of this can be done in 2 steps.

How to reference a Class List with another class?

I am setting up a script that will extract data from excel and return it in lists. Right now I am trying to be able to reorganized the data into smaller lists that have a common attribute. (Such as: A list that has the indices of the rows that contained, 'Pencil') I keep having the smaller list returning None.
I've checked and the lists that extract the data are working fine. But I cant seem to get the smaller lists working.
#Create a class for the multiple lists of Columns
class Data_Column(list):
def Fill_List (self,col): #fills the list
for i in range(sheet.nrows):
self.append(sheet.cell_value(i,col))
#Create a class for a specific list that has data of a common artifact
class Specific_List(list):
def Find_And_Fill (self, listy, word):
for i in range (sheet.nrows):
if listy[i] == word:
self.append(I)
#Initiate and Populate lists from excel spreadsheet
date = Data_Column()
date.Fill_List(0)
location = Data_Column()
location.Fill_List(1)
name = Data_Column()
name.Fill_List(2)
item = Data_Column()
item.Fill_List(3)
specPencil = Specific_List()
print(specPencil.Find_And_Fill(item,'Pencil'))
I expected a List that contained the indices where 'Pencil' was found such as [1,6,12,14,19].
The actual output was: None
I needed to take the print out of the very last line.
specPencil.Find_And_Fill(item,'Pencil')
print(specPencil)
I knew it was a simple fix

How to create a nested python dictionary with keys as strings?

Summary of issue: I'm trying to create a nested Python dictionary, with keys defined by pre-defined variables and strings. And I'm populating the dictionary from regular expressions outputs. This mostly works. But I'm getting an error because the nested dictionary - not the main one - doesn't like having the key set to a string, it wants an integer. This is confusing me. So I'd like to ask you guys how I can get a nested python dictionary with string keys.
Below I'll walk you through the steps of what I've done. What is working, and what isn't. Starting from the top:
# Regular expressions module
import re
# Read text data from a file
file = open("dt.cc", "r")
dtcc = file.read()
# Create a list of stations from regular expression matches
stations = sorted(set(re.findall(r"\n(\w+)\s", dtcc)))
The result is good, and is as something like this:
stations = ['AAAA','BBBB','CCCC','DDDD']
# Initialize a new dictionary
rows = {}
# Loop over each station in the station list, and start populating
for station in stations:
rows[station] = re.findall("%s\s(.+)" %station, dtcc)
The result is good, and is something like this:
rows['AAAA'] = ['AAAA 0.1132 0.32 P',...]
However, when I try to create a sub-dictionary with a string key:
for station in stations:
rows[station] = re.findall("%s\s(.+)" %station, dtcc)
rows[station]["dt"] = re.findall("%s\s(\S+)" %station, dtcc)
I get the following error.
"TypeError: list indices must be integers, not str"
It doesn't seem to like that I'm specifying the second dictionary key as "dt". If I give it a number instead, it works just fine. But then my dictionary key name is a number, which isn't very descriptive.
Any thoughts on how to get this working?
The issue is that by doing
rows[station] = re.findall(...)
You are creating a dictionary with the station names as keys and the return value of re.findall method as values, which happen to be lists. So by calling them again by
rows[station]["dt"] = re.findall(...)
on the LHS row[station] is a list that is indexed by integers, which is what the TypeError is complaining about. You could do rows[station][0] for example, you would get the first match from the regex. You said you want a nested dictionary. You could do
rows[station] = dict()
rows[station]["dt"] = re.findall(...)
To make it a bit nicer, a data structure that you could use instead is a defaultdict from the collections module.
The defaultdict is a dictionary that accepts a default type as a type for its values. You enter the type constructor as its argument. For example dictlist = defaultdict(list) defines a dictionary that has as values lists! Then immediately doing dictlist[key].append(item1) is legal as the list is automatically created when setting the key.
In your case you could do
from collections import defaultdict
rows = defaultdict(dict)
for station in stations:
rows[station]["bulk"] = re.findall("%s\s(.+)" %station, dtcc)
rows[station]["dt"] = re.findall("%s\s(\S+)" %station, dtcc)
Where you have to assign the first regex result to a new key, "bulk" here but you can call it whatever you like. Hope this helps.

use slice in for loop to build a list

I would like to build up a list using a for loop and am trying to use a slice notation. My desired output would be a list with the structure:
known_result[i] = (record.query_id, (align.title, align.title,align.title....))
However I am having trouble getting the slice operator to work:
knowns = "output.xml"
i=0
for record in NCBIXML.parse(open(knowns)):
known_results[i] = record.query_id
known_results[i][1] = (align.title for align in record.alignment)
i+=1
which results in:
list assignment index out of range.
I am iterating through a series of sequences using BioPython's NCBIXML module but the problem is adding to the list. Does anyone have an idea on how to build up the desired list either by changing the use of the slice or through another method?
thanks zach cp
(crossposted at [Biostar])1
You cannot assign a value to a list at an index that doesn't exist. The way to add an element (at the end of the list, which is the common use case) is to use the .append method of the list.
In your case, the lines
known_results[i] = record.query_id
known_results[i][1] = (align.title for align in record.alignment)
Should probably be changed to
element=(record.query_id, tuple(align.title for align in record.alignment))
known_results.append(element)
Warning: The code above is untested, so might contain bugs. But the idea behind it should work.
Use:
for record in NCBIXML.parse(open(knowns)):
known_results[i] = (record.query_id, None)
known_results[i][1] = (align.title for align in record.alignment)
i+=1
If i get you right you want to assign every record.query_id one or more matching align.title. So i guess your query_ids are unique and those unique ids are related to some titles. If so, i would suggest a dictionary instead of a list.
A dictionary consists of a key (e.g. record.quer_id) and value(s) (e.g. a list of align.title)
catalog = {}
for record in NCBIXML.parse(open(knowns)):
catalog[record.query_id] = [align.title for align in record.alignment]
To access this catalog you could either iterate through:
for query_id in catalog:
print catalog[query_id] # returns the title-list for the actual key
or you could access them directly if you know what your looking for.
query_id = XYZ_Whatever
print catalog[query_id]

Categories

Resources