I am trying to iterate through two series that are derived from ios_mod2_apps:
list_of_genre = ios_mod2_apps['prime_genre'].unique()
list_of_app = ios_mod2_apps['track_name']
Then I iterate through the two series and run the following code inside the two for loops:
app_rating_percentage[app] = ios_mod2_apps['rating_count_total'][ios_mod2_apps['track_name']==app].sum() /(ios_mod2_apps['rating_count_tot'][ios_mod2_apps['prime_genre']==genre].sum())
Basically, the code above sums the 'rating_count_total' series over the rows whose track_name equals the app variable in that iteration, and divides it by the corresponding sum for the genre.
When I run this code I get the error:
KeyError: 'rating_count_total'
I have tried to understand this error but could not. I could use some help if someone has a clue about what is wrong.
Full code:
#initiating an empty dictionary to take the values that will be created on the for loops
app_rating_percentage = {}
#create a list of unique genres
#create a list of apps these two lists will be used to iterate
list_of_genre = ios_mod2_apps['prime_genre'].unique()
list_of_app = ios_mod2_apps['track_name']
#iterate using the two lists above, and calculate the rating_count_total (of that app) divided by the sum of rating_count_total in that genre
for genre in list_of_genre:
    for app in list_of_app:
        app_rating_percentage[app] = ios_mod2_apps['rating_count_total'][ios_mod2_apps['track_name']==app].sum() /(ios_mod2_apps['rating_count_tot'][ios_mod2_apps['prime_genre']==genre].sum())
app_rating_percentage
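A possible reading of the error, offered as a sketch rather than a confirmed diagnosis: the line spells the column two ways, 'rating_count_total' and 'rating_count_tot'. pandas raises a KeyError naming the label when that label is not a column, so the message suggests 'rating_count_total' does not exist in the frame. The example below assumes 'rating_count_tot' is the real column name and uses a small made-up DataFrame in place of ios_mod2_apps. It also assumes the intent is each app's share of its own genre total, which a groupby can compute without nested loops (in the nested loops, every app is overwritten once per genre, so only the last genre's denominator would survive).

```python
import pandas as pd

# Hypothetical stand-in for ios_mod2_apps; the values are made up.
apps = pd.DataFrame({
    'track_name': ['A', 'B', 'C'],
    'prime_genre': ['Games', 'Games', 'Music'],
    'rating_count_tot': [30, 70, 50],
})

# Check which spelling is actually a column before indexing with it.
print('rating_count_total' in apps.columns)
print('rating_count_tot' in apps.columns)

# Total ratings of each row's own genre, aligned with the original rows.
genre_totals = apps.groupby('prime_genre')['rating_count_tot'].transform('sum')

# Each app's share of its genre total, keyed by app name.
share = apps['rating_count_tot'] / genre_totals
app_rating_percentage = dict(zip(apps['track_name'], share))
print(app_rating_percentage)
```

Printing ios_mod2_apps.columns on the real data is the quickest way to confirm which spelling applies.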
Related
I have created a list during a for loop, which works well, but I want to create a dictionary instead.
from System.Collections.Generic import List
#Collector
viewPorts = list(FilteredElementCollector(doc).OfClass(Viewport))
#create a dictionary
viewPortDict = {}
#add Sheet Number, View Name and boxoutline to dictionary
for vp in viewPorts:
    sheet = doc.GetElement(vp.SheetId)
    view = doc.GetElement(vp.ViewId)
    vbox = vp.GetBoxOutline()
    viewPortDict = {view.ViewName : {'sheetNum': sheet.SheetNumber, 'viewBox' : vbox}}
print(viewPortDict)
The output from this is as follows:
{'STEEL NOTES': {'viewBox': <Autodesk.Revit.DB.Outline object at 0x000000000000065A [Autodesk.Revit.DB.Outline]>, 'sheetNum': 'A0.07'}}
The structure is perfect, but I want it to capture every viewport; as it stands, the dictionary ends up with a single entry, as if the loop stopped after one pass. Why is that? And how can I get it to keep every entry?
I have tried various things, like creating another list of keys called "Keys" and a list of values called "viewPortList", like:
dict.fromkeys(Keys, viewPortList)
But I always have the same problem: I am not able to iterate over all elements. For full disclosure, I am successful when I create a list instead. Here is what that looks like.
from System.Collections.Generic import List
#Collector
viewPorts = list(FilteredElementCollector(doc).OfClass(Viewport))
#create a list
viewPortList = []
#add Sheet Number, View Name and boxoutline to list
for vp in viewPorts:
    sheet = doc.GetElement(vp.SheetId)
    view = doc.GetElement(vp.ViewId)
    vbox = vp.GetBoxOutline()
    viewPortList.append([sheet.SheetNumber, view.ViewName, vbox])
print(viewPortList)
This works fine and prints the following (only a portion of a long list):
[['A0.01', 'APPLICABLE CODES', <Autodesk.Revit.DB.Outline object at 0x000000000000060D [Autodesk.Revit.DB.Outline]>], ['A0.02', etc.]
But I want a dictionary instead. Any help would be appreciated. Thanks!
In your list example, you are appending to the list. In your dictionary example, you are creating a new dictionary each time (thus removing the data from previous iterations of the loop). You can do the equivalent of appending to it as well by just assigning to a particular key in the existing dictionary.
viewPortDict[view.ViewName] = {'sheetNum': sheet.SheetNumber, 'viewBox' : vbox}
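To make the difference concrete, here is a minimal sketch that runs outside Revit. The tuples are hypothetical stand-ins for the viewport objects, and the outline strings are placeholders.

```python
# Hypothetical records standing in for (sheet number, view name, box outline).
viewports = [
    ('A0.01', 'APPLICABLE CODES', 'outline-1'),
    ('A0.07', 'STEEL NOTES', 'outline-2'),
]

# Rebinding the name on every pass throws away the previous dictionary,
# so only the entry from the final iteration is left.
overwritten = {}
for sheet_num, view_name, vbox in viewports:
    overwritten = {view_name: {'sheetNum': sheet_num, 'viewBox': vbox}}

# Assigning to a key adds an entry to the one existing dictionary.
accumulated = {}
for sheet_num, view_name, vbox in viewports:
    accumulated[view_name] = {'sheetNum': sheet_num, 'viewBox': vbox}

print(len(overwritten), len(accumulated))
```

Note that two views with the same name would share a key, so the later one would replace the earlier one.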
Suppose I have the following code:
classifiers_name_all = [('AdaBoostClassifier', AdaBoostClassifier(), 'AdaBoost'),
                        ('BernoulliNB', BernoulliNB(), 'Bernoulli Naive Bayes'),
                        ('DummyClassifier', DummyClassifier(), 'Dummy Classifier')]
clf_values = []
for clf_na in classifiers_name_all:
    clf_values.append((locals()['score_'+clf_na[0]+'_mean'], locals()['score_'+clf_na[0]+'_stddev']))
clf_values
The code above doesn't quite work.
I want to get a list which contains the variables:
clf_values = [(score_AdaBoostClassifier_mean, score_AdaBoostClassifier_stddev),
              (score_BernoulliNB_mean, score_BernoulliNB_stddev),
              (score_DummyClassifier_mean, score_DummyClassifier_stddev)]
How do I do this? Many thanks.
From the information you have given so far, I infer that there are no key errors and that the resulting list contains only None values.
If so, your code works, but the variables you are trying to access have None assigned to them. Check why those variables hold None; once that is fixed, the list will contain the desired values.
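As an aside that is not part of the answer above: a lookup through locals() returns whatever is bound to that name at that moment, so the result is only as good as the variables behind it. One alternative is to keep the scores in a dictionary keyed by classifier name, which makes a missing or None entry easy to spot. The sketch below assumes that layout; the numbers are made up and the `scores` dictionary is hypothetical.

```python
classifier_names = ['AdaBoostClassifier', 'BernoulliNB', 'DummyClassifier']

# Hypothetical results; in real code these would be filled in where the
# score_<name>_mean and score_<name>_stddev variables are computed today.
scores = {
    'AdaBoostClassifier': {'mean': 0.81, 'stddev': 0.02},
    'BernoulliNB': {'mean': 0.74, 'stddev': 0.03},
    'DummyClassifier': {'mean': 0.50, 'stddev': 0.01},
}

# Build the list of (mean, stddev) pairs in the order of classifier_names.
clf_values = [(scores[name]['mean'], scores[name]['stddev'])
              for name in classifier_names]
print(clf_values)
```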
I have a function where I use REST-requests. The function requests data from a remote Confluence server in chunks. Each chunk requests 25 results from a REST API.
for result in response.json()['results']:
    user = dict()
    user['group'] = result['name']
    groups.append(user)
size = response.json()['size']
start += chunk
return groups
response.json()['results'] holds the 25 results of one REST request, and each result is a dictionary. I take the name from each result and store it in a new dictionary.
Then I make the next query and get the following 25 results as new user dictionaries.
Finally, I add all the dictionaries to the groups list. As a result, the function returns a list consisting of multiple dictionaries.
More on this:
I use this function to collect data in a dataframe and output it to a file. Because my function retrieves a list of dictionaries, I extend a single list with the result as follows.
groups = list()
print('Retrieving groups:')
groups.extend(get_groups())
df = pd.DataFrame(
groups,
columns = ['group']
)
The question is: how do I track the progress as the groups list fills with dictionaries? Currently the step after print('Retrieving groups:') takes a long time, and I want a progress bar to track it.
I thought of putting
tqdm.pandas(desc="desc")
for user in tqdm(groups):
    pass
within the get_groups() function, but this doesn't seem like the right way to do it. I'd appreciate your help. TIA.
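One possible approach, sketched under assumptions: report progress from inside the paging loop, once per chunk, rather than iterating over the finished list afterwards. The example uses only the standard library so it runs anywhere. `fetch_page`, `on_progress` and the fake server are hypothetical names introduced for illustration, and the stop condition (a chunk smaller than the limit) is an assumption about how the paging ends.

```python
def get_groups(fetch_page, chunk=25, on_progress=None):
    """Collect all groups chunk by chunk, reporting progress after each chunk.

    fetch_page(start, limit) is a hypothetical stand-in for the REST call and
    must return {'results': [{'name': ...}, ...], 'size': <results returned>}.
    """
    groups = []
    start = 0
    while True:
        page = fetch_page(start, chunk)
        for result in page['results']:
            groups.append({'group': result['name']})
        if on_progress is not None:
            # Tell the caller how many items this chunk contributed.
            on_progress(len(page['results']))
        if page['size'] < chunk:
            # Assumed stop condition: a short chunk means no more data.
            break
        start += chunk
    return groups


# Fake server with 60 groups, so the sketch runs without network access.
FAKE_NAMES = ['group-%d' % i for i in range(60)]

def fake_fetch(start, limit):
    names = FAKE_NAMES[start:start + limit]
    return {'results': [{'name': n} for n in names], 'size': len(names)}

progress_log = []
groups = get_groups(fake_fetch, on_progress=progress_log.append)
print(len(groups), progress_log)
```

With tqdm installed, the same hook could be fed by a bar's update method, for example `with tqdm(unit='group') as bar: get_groups(fake_fetch, on_progress=bar.update)`. Because the total is not known up front, the bar would show a running count rather than a percentage.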
I am setting up a script that will extract data from Excel and return it in lists. Right now I am trying to reorganize the data into smaller lists that share a common attribute (for example, a list of the indices of the rows that contain 'Pencil'). The smaller list keeps coming back as None.
I've checked, and the lists that extract the data are working fine. But I can't seem to get the smaller lists working.
#Create a class for the multiple lists of Columns
class Data_Column(list):
    def Fill_List (self,col): #fills the list
        for i in range(sheet.nrows):
            self.append(sheet.cell_value(i,col))
#Create a class for a specific list that has data of a common artifact
class Specific_List(list):
    def Find_And_Fill (self, listy, word):
        for i in range (sheet.nrows):
            if listy[i] == word:
                self.append(i)
#Initiate and Populate lists from excel spreadsheet
date = Data_Column()
date.Fill_List(0)
location = Data_Column()
location.Fill_List(1)
name = Data_Column()
name.Fill_List(2)
item = Data_Column()
item.Fill_List(3)
specPencil = Specific_List()
print(specPencil.Find_And_Fill(item,'Pencil'))
I expected a list containing the indices where 'Pencil' was found, such as [1,6,12,14,19].
The actual output was: None
I needed to take the method call out of the print on the very last line:
specPencil.Find_And_Fill(item,'Pencil')
print(specPencil)
I knew it was a simple fix.
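The reason this works: a method with no return statement returns None, so printing the call prints None even though the list itself was filled. A minimal sketch without the spreadsheet, where the `items` list is a made-up stand-in for the item column:

```python
class SpecificList(list):
    def find_and_fill(self, values, word):
        # Append the index of every element equal to word.
        for i, value in enumerate(values):
            if value == word:
                self.append(i)
        # No return statement here, so the call evaluates to None.


items = ['Pen', 'Pencil', 'Eraser', 'Pencil']  # hypothetical column data
spec_pencil = SpecificList()
result = spec_pencil.find_and_fill(items, 'Pencil')

print(result)       # the method's return value
print(spec_pencil)  # the list that the method filled in place
```

Adding `return self` at the end of the method would be another way to make the original print line show the list.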
I'm stuck on the following problem:
I have a list with a ton of duplicative data. This includes entry numbers and names.
The following gives me a list of unique (non duplicative) names of people from the Data2014 table:
tablequery = c.execute("SELECT * FROM Data2014")
tablequery_results = list(tablequery)
people2014_count = len(tablequery_results)
people2014_list = []
for i in tablequery_results:
    if i[1] not in people2014_list:
        people2014_list.append(i[1])
people2014_count = len(people2014_list)
# for i in people2014_list:
# print(i)
Now that I have a list of people, I need to iterate through tablequery_results again; this time, I need to find the number of unique entry numbers each person has. There are tons of duplicates in the tablequery_results list. Without creating a block of code for each individual person's name, is there a way to iterate through tablequery_results using the names from people2014_list as the unique identifier? I can replicate the code above to get a list of unique entry numbers, but I can't seem to match the names with the unique entry numbers.
Please let me know if that does not make sense.
Thanks in advance!
I discovered my answer after delving into SQL a bit more. This gives me a list with two columns: the person's name in the first column, and the number of entries that person has in the second column.
def people_data():
    data_fetch = c.execute("SELECT person, COUNT(*) AS `NUM` FROM Data2014 WHERE ACTION='UPDATED' GROUP BY Person ORDER BY NUM DESC")
    people_field_results = list(data_fetch)
    people_field_results_count = len(people_field_results)
    for i in people_field_results:
        print(i)
    print(people_field_results_count)
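Note that COUNT(*) counts rows, so duplicate entry numbers are counted more than once. If the goal is the number of unique entry numbers per person, as the question asks, COUNT(DISTINCT ...) may be closer. The sketch below uses an in-memory SQLite database; the column names `entry` and `person` and the sample rows are assumptions, since the real Data2014 schema is not shown.

```python
import sqlite3

# In-memory database with a hypothetical two-column version of Data2014.
conn = sqlite3.connect(':memory:')
c = conn.cursor()
c.execute("CREATE TABLE Data2014 (entry INTEGER, person TEXT)")
c.executemany(
    "INSERT INTO Data2014 VALUES (?, ?)",
    [(1, 'Ann'), (1, 'Ann'), (2, 'Ann'), (3, 'Bob'), (3, 'Bob')],
)

# Count each person's distinct entry numbers, most entries first.
rows = c.execute(
    "SELECT person, COUNT(DISTINCT entry) AS num "
    "FROM Data2014 GROUP BY person ORDER BY num DESC"
).fetchall()

for row in rows:
    print(row)
conn.close()
```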