Why am I seeing "KeyError: 'rating_count_total' " in this code - python

I am trying to iterate through two Series derived from ios_mod2_apps:
list_of_genre = ios_mod2_apps['prime_genre'].unique()
list_of_app = ios_mod2_apps['track_name']
Then I iterate through the two series and run the following code inside the two for loops:
app_rating_percentage[app] = ios_mod2_apps['rating_count_total'][ios_mod2_apps['track_name']==app].sum() /(ios_mod2_apps['rating_count_tot'][ios_mod2_apps['prime_genre']==genre].sum())
Basically, the code above calculates the sum of the 'rating_count_total' series for the rows whose track_name equals the app variable in that iteration.
When I run this code I get the error:
KeyError: 'rating_count_total'
I have tried to understand this error but could not. I could use some help if someone has a clue about what is wrong.
Full code
#initiating an empty dictionary to take the values that will be created on the for loops
app_rating_percentage = {}
#create a list of unique genres
#create a list of apps these two lists will be used to iterate
list_of_genre = ios_mod2_apps['prime_genre'].unique()
list_of_app = ios_mod2_apps['track_name']
#iterate using the two lists above, and calculate the rating_count_total (of that app) divided by the sum of rating_count_total in that genre
for genre in list_of_genre:
    for app in list_of_app:
        app_rating_percentage[app] = ios_mod2_apps['rating_count_total'][ios_mod2_apps['track_name']==app].sum() /(ios_mod2_apps['rating_count_tot'][ios_mod2_apps['prime_genre']==genre].sum())
app_rating_percentage
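A KeyError on a DataFrame column lookup means that no column with exactly that name exists. The denominator in the code above reads 'rating_count_tot', which suggests (an assumption, since the DataFrame's columns are not shown) that the column is really named 'rating_count_tot' and that 'rating_count_total' in the numerator is a typo. The sketch below uses a small made-up DataFrame to check the column names, then computes each app's share of its own genre's ratings with groupby rather than nested loops.

```python
import pandas as pd

# Made-up stand-in for ios_mod2_apps; the real data is not shown in the question.
ios_mod2_apps = pd.DataFrame({
    "track_name": ["App A", "App B", "App C"],
    "prime_genre": ["Games", "Games", "Music"],
    "rating_count_tot": [30, 10, 5],
})

# Check the exact column names first; a KeyError means the name is absent.
print(list(ios_mod2_apps.columns))
has_total = "rating_count_total" in ios_mod2_apps.columns
has_tot = "rating_count_tot" in ios_mod2_apps.columns

# Each app's rating count divided by the total for its own genre.
genre_totals = ios_mod2_apps.groupby("prime_genre")["rating_count_tot"].transform("sum")
share = ios_mod2_apps["rating_count_tot"] / genre_totals
app_rating_percentage = dict(zip(ios_mod2_apps["track_name"], share))
print(app_rating_percentage)
```

Separately from the KeyError, the nested loops in the question overwrite app_rating_percentage[app] once per genre, so every app ends up divided by the last genre's total. Grouping by genre, as in the sketch, avoids that.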

Related

Create nested dictionaries during for loop

I have created a list during a for loop, which works well, but I want to create a dictionary instead.
from System.Collections.Generic import List
#Collector
viewPorts = list(FilteredElementCollector(doc).OfClass(Viewport))
#create a dictionary
viewPortDict = {}
#add Sheet Number, View Name and boxoutline to dictionary
for vp in viewPorts:
    sheet = doc.GetElement(vp.SheetId)
    view = doc.GetElement(vp.ViewId)
    vbox = vp.GetBoxOutline()
    viewPortDict = {view.ViewName : {'sheetNum': sheet.SheetNumber, 'viewBox' : vbox}}
print(viewPortDict)
The output from this is as follows:
{'STEEL NOTES': {'viewBox': <Autodesk.Revit.DB.Outline object at 0x000000000000065A [Autodesk.Revit.DB.Outline]>, 'sheetNum': 'A0.07'}}
The structure is perfect, but I want it to capture every viewport; the for loop seems to stop after the first iteration. Why is that, and how can I keep the loop going?
I have tried various things like creating another list of keys called "Keys" and list of values called "viewPortList" like:
dict.fromkeys(Keys, viewPortList)
But I always have the same problem: I am not able to iterate over all elements. For full disclosure, I am successful when I create a list instead. Here is what that looks like.
from System.Collections.Generic import List
#Collector
viewPorts = list(FilteredElementCollector(doc).OfClass(Viewport))
#create a list
viewPortList = []
#add Sheet Number, View Name and boxoutline to list
for vp in viewPorts:
    sheet = doc.GetElement(vp.SheetId)
    view = doc.GetElement(vp.ViewId)
    vbox = vp.GetBoxOutline()
    viewPortList.append([sheet.SheetNumber, view.ViewName, vbox])
print(viewPortList)
Which works fine and prints the below (only portion of a long list)
[['A0.01', 'APPLICABLE CODES', <Autodesk.Revit.DB.Outline object at 0x000000000000060D [Autodesk.Revit.DB.Outline]>], ['A0.02', etc.]
But I want a dictionary instead. Any help would be appreciated. Thanks!
In your list example, you are appending to the list. In your dictionary example, you are creating a new dictionary each time (thus removing the data from previous iterations of the loop). You can do the equivalent of appending to it as well by just assigning to a particular key in the existing dictionary.
viewPortDict[view.ViewName] = {'sheetNum': sheet.SheetNumber, 'viewBox' : vbox}
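The difference between rebinding the dictionary and assigning to a key can be seen without Revit. The sketch below is only an illustration: the tuples are hypothetical stand-ins for the viewport, sheet, and view objects, which are not available outside Revit.

```python
# Hypothetical stand-in data: (view name, sheet number, box outline) tuples
# replace the Revit viewport objects, which are unavailable outside Revit.
viewports = [
    ("APPLICABLE CODES", "A0.01", "outline-1"),
    ("STEEL NOTES", "A0.07", "outline-2"),
]

# Rebinding the name on each pass discards the previous dictionary.
overwritten = {}
for view_name, sheet_num, vbox in viewports:
    overwritten = {view_name: {"sheetNum": sheet_num, "viewBox": vbox}}

# Assigning to a key keeps earlier entries and adds one per viewport.
accumulated = {}
for view_name, sheet_num, vbox in viewports:
    accumulated[view_name] = {"sheetNum": sheet_num, "viewBox": vbox}

print(len(overwritten))  # 1
print(len(accumulated))  # 2
```

One caveat with this design: if two views share the same name, the later one replaces the earlier one, because dictionary keys are unique.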

How to iterate to create variables in a list

Suppose I have the following code:
classifiers_name_all = [('AdaBoostClassifier', AdaBoostClassifier(), 'AdaBoost'),
('BernoulliNB', BernoulliNB(), 'Bernoulli Naive Bayes'),
('DummyClassifier', DummyClassifier(), 'Dummy Classifier')]
clf_values = []
for clf_na in classifiers_name_all:
    clf_values.append((locals()['score_'+clf_na[0]+'_mean'], locals()['score_'+clf_na[0]+'_stddev']))
clf_values
The code above doesn't quite work.
I want to get a list which contains the variables:
clf_values = [(score_AdaBoostClassifier_mean, score_AdaBoostClassifier_stddev),
(score_BernoulliNB_mean, score_BernoulliNB_stddev),
(score_DummyClassifier_mean, score_DummyClassifier_stddev)]
How do I do this? Many thanks.
From the information given so far, I infer that there are no KeyErrors and that the resulting list contains None values.
That can only mean your code works, but the variables you are trying to access have None assigned to them. Find out why those variables are None; once that is fixed, the list will contain the values you want.
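Building variable names as strings and looking them up in locals() is fragile: a missing name raises a KeyError, and a name bound to None passes through silently. One common alternative, sketched below, is to keep the scores in a dictionary keyed by classifier name. The numbers are hypothetical placeholders, not results from the question.

```python
# Hypothetical scores keyed by classifier name; the real values would come
# from whatever evaluation produced score_<name>_mean and score_<name>_stddev.
scores = {
    "AdaBoostClassifier": {"mean": 0.81, "stddev": 0.02},
    "BernoulliNB": {"mean": 0.74, "stddev": 0.03},
    "DummyClassifier": {"mean": 0.50, "stddev": 0.01},
}

classifier_names = ["AdaBoostClassifier", "BernoulliNB", "DummyClassifier"]

# Build the list of (mean, stddev) pairs in the same order as the names.
clf_values = [
    (scores[name]["mean"], scores[name]["stddev"])
    for name in classifier_names
]
print(clf_values)
```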

Using tqdm to track progress in slow REST-requests: progress for filling lists

I have a function where I use REST-requests. The function requests data from a remote Confluence server in chunks. Each chunk requests 25 results from a REST API.
for result in response.json()['results']:
    user = dict()
    user['group'] = result['name']
    groups.append(user)
size = response.json()['size']
start += chunk
return groups
The result variable contains these requested 25 results from a REST-request. Results are returned in a dictionary. So I get values from the returned dictionary and then store them in a new dictionary.
Then I make the next query and get following 25 results as a new user dictionary.
Finally, I add all dictionaries to the groups list. As a result, the function returns a list consisting of multiple dictionaries.
More on this:
I use this function to collect data in a dataframe and output it to a file. Because my function retrieves a list of dictionaries, I expand the result into a single list as follows.
groups = list()
print('Retrieving groups:')
groups.extend(get_groups())
df = pd.DataFrame(
groups,
columns = ['group']
)
The question is: how do I track the progress of the groups list being filled with dictionaries? Currently the step after print('Retrieving groups:') takes a long time, and I want a progress bar to track it.
I thought of
tqdm.pandas(desc="desc")
for user in tqdm(groups):
    pass
Within the get_groups() function, but this doesn't seem like the right way to do it. I'd appreciate your help. TIA.
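One possible approach, offered as a sketch rather than a tested answer for Confluence, is to create the bar without an iterable and advance it manually after each chunk arrives. Here fetch_page is a hypothetical stand-in for the real REST request, and stopping on a short page is an assumption about how the last chunk is detected.

```python
from tqdm import tqdm


def fetch_page(start, limit):
    """Hypothetical stand-in for one REST request; returns one chunk."""
    all_names = ["group-%d" % i for i in range(60)]
    results = [{"name": n} for n in all_names[start:start + limit]]
    return {"results": results, "size": len(results)}


def get_groups(chunk=25):
    groups = []
    start = 0
    # The total is unknown up front, so the bar shows a running count and rate.
    with tqdm(desc="Retrieving groups", unit="group") as bar:
        while True:
            page = fetch_page(start, chunk)
            for result in page["results"]:
                groups.append({"group": result["name"]})
            bar.update(page["size"])  # advance by the items just received
            if page["size"] < chunk:  # assumed: a short page means no more data
                break
            start += chunk
    return groups


groups = get_groups()
```

Wrapping the finished list in tqdm(groups), as in the question, only measures a loop over data that has already been downloaded, so the bar has to live inside the loop that makes the requests.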

How to reference a Class List with another class?

I am setting up a script that will extract data from Excel and return it in lists. Right now I am trying to reorganize the data into smaller lists that share a common attribute (such as a list of the indices of the rows that contained 'Pencil'). The smaller list keeps coming back as None.
I've checked, and the lists that extract the data are working fine. But I can't seem to get the smaller lists working.
#Create a class for the multiple lists of Columns
class Data_Column(list):
    def Fill_List (self,col): #fills the list
        for i in range(sheet.nrows):
            self.append(sheet.cell_value(i,col))
#Create a class for a specific list that has data of a common artifact
class Specific_List(list):
    def Find_And_Fill (self, listy, word):
        for i in range (sheet.nrows):
            if listy[i] == word:
                self.append(i)
#Initiate and Populate lists from excel spreadsheet
date = Data_Column()
date.Fill_List(0)
location = Data_Column()
location.Fill_List(1)
name = Data_Column()
name.Fill_List(2)
item = Data_Column()
item.Fill_List(3)
specPencil = Specific_List()
print(specPencil.Find_And_Fill(item,'Pencil'))
I expected a List that contained the indices where 'Pencil' was found such as [1,6,12,14,19].
The actual output was: None
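I needed to take the print out of the very last line. Find_And_Fill appends to the list in place and has no return statement, so the call itself evaluates to None. Calling it first and then printing the list works: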
specPencil.Find_And_Fill(item,'Pencil')
print(specPencil)
I knew it was a simple fix.
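As an illustration of why the original call printed None, and of two ways around it, here is a minimal sketch. The item list is hypothetical sample data standing in for the spreadsheet column, and the class and method names are renamed variants, not the original code.

```python
class SpecificList(list):
    def find_and_fill(self, values, word):
        # enumerate yields (index, value) pairs, so no separate row count is needed.
        for i, value in enumerate(values):
            if value == word:
                self.append(i)
        return self  # returning self lets the call be printed or chained


# Hypothetical column data standing in for the spreadsheet's item column.
item = ["Pen", "Pencil", "Eraser", "Pencil"]

spec_pencil = SpecificList()
print(spec_pencil.find_and_fill(item, "Pencil"))

# The same indices with a plain list comprehension, no subclass required.
pencil_rows = [i for i, value in enumerate(item) if value == "Pencil"]
print(pencil_rows)
```

Returning self is a design choice, not a requirement: many Python methods that mutate in place (list.append, list.sort) deliberately return None, which is the behavior the original code ran into.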

Python - Iterating through python list using another list

I'm stuck on the following problem:
I have a list with a ton of duplicative data. This includes entry numbers and names.
The following gives me a list of unique (non duplicative) names of people from the Data2014 table:
tablequery = c.execute("SELECT * FROM Data2014")
tablequery_results = list(tablequery)
people2014_count = len(tablequery_results)
people2014_list = []
for i in tablequery_results:
    if i[1] not in people2014_list:
        people2014_list.append(i[1])
people2014_count = len(people2014_list)
# for i in people2014_list:
#     print(i)
Now that I have a list of people, I need to iterate through tablequery_results again; this time I need to find the number of unique entry numbers each person has. There are tons of duplicates in the tablequery_results list. Without creating a block of code for each individual person's name, is there a way to iterate through tablequery_results using the names from people2014_list as the unique identifier? I can replicate the code from above to get a list of unique entry numbers, but I can't seem to match the names with the unique entry numbers.
Please let me know if that does not make sense.
Thanks in advance!
I discovered my answer after delving into SQL a bit more. This gives me a list with two columns. The person's name in the first column, and then the numbers of entries that person has in the second column.
def people_data():
    data_fetch = c.execute("SELECT person, COUNT(*) AS `NUM` FROM Data2014 WHERE ACTION='UPDATED' GROUP BY Person ORDER BY NUM DESC")
    people_field_results = list(data_fetch)
    people_field_results_count = len(people_field_results)
    for i in people_field_results:
        print(i)
    print(people_field_results_count)
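The query above counts rows per person. If the goal is the number of distinct entry numbers per person, as the question describes, COUNT(DISTINCT ...) is one option. The sketch below assumes SQLite and uses a hypothetical two-column schema (entry, person), since the real Data2014 columns are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
c = conn.cursor()
# Hypothetical schema: the real Data2014 columns are not shown in the question.
c.execute("CREATE TABLE Data2014 (entry INTEGER, person TEXT)")
c.executemany(
    "INSERT INTO Data2014 (entry, person) VALUES (?, ?)",
    [(1, "Ann"), (1, "Ann"), (2, "Ann"), (3, "Bob"), (3, "Bob")],
)

# COUNT(DISTINCT entry) ignores repeated entry numbers within each person.
rows = c.execute(
    "SELECT person, COUNT(DISTINCT entry) AS num "
    "FROM Data2014 GROUP BY person ORDER BY num DESC, person"
).fetchall()
print(rows)
conn.close()
```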
