How do you simplify .str.contains code with multiple contains variables?


I am using the Series.str.contains() function to find if a value in a dictionary is contained in a string from a specific column. My code works, but I am trying to simplify it.
I tried using a for loop to go through the values in a list. My attempt is down below.
total_joined_list=[joined_groc_lst,joined_util_lst]
for i in range(len(total_joined_list)):
    groc_amount,util_amount=abs(round(df.loc[df['Description'].str.contains('\b'+i+'\b',na=False),'Amount'].sum(),2))
Here is the current working code.
joined_groc_lst = '|'.join(expenses_dict['Groceries'])
joined_util_lst = '|'.join(expenses_dict['Utilities'])
groc_amount=abs(round(df.loc[df['Description'].str.contains('\b'+joined_groc_lst+'\b',na=False),
                             'Amount'].sum(),2))
util_amount=abs(round(df.loc[df['Description'].str.contains('\b'+joined_util_lst+'\b',na=False),
                             'Amount'].sum(),2))
I expect my code to create two variables, groc_amount and util_amount, so that I can print the results. I got the error "TypeError: can only concatenate str (not "int") to str", then wrapped the i in my for loop in str() and got the error "TypeError: cannot unpack non-iterable int object".

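total_joined_list=[joined_groc_lst,joined_util_lst]
groc_amount, util_amount = (abs(round(df.loc[df['Description'].str.contains(rf'\b(?:{e})\b',na=False),'Amount'].sum(),2)) for e in total_joined_list)
I have changed your for loop into a generator expression, which can be unpacked so that each name receives the result for one item in the list. Your loop failed because range(len(...)) yields integers rather than the joined strings, and a single number cannot be unpacked into two names. Note the raw string prefix (r), without which '\b' is a backspace character rather than a word boundary, and the (?:...) group, which makes the boundaries apply to every alternative in the joined pattern.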
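If more categories are added later, a dictionary of totals may scale better than one variable per category. The following is a minimal sketch under stated assumptions: the sample DataFrame, the contents of expenses_dict, and the helper name category_total are all hypothetical and only stand in for the asker's data.

```python
import re

import pandas as pd

# Hypothetical sample data standing in for the asker's DataFrame and dictionary.
df = pd.DataFrame({
    "Description": ["WALMART STORE 12", "CITY POWER BILL", "KROGER 455", None, "POWERADE KIOSK"],
    "Amount": [-50.25, -80.00, -20.10, -5.00, -3.00],
})
expenses_dict = {
    "Groceries": ["WALMART", "KROGER"],
    "Utilities": ["POWER", "WATER"],
}


def category_total(frame, keywords):
    """Sum 'Amount' for rows whose 'Description' contains any keyword as a whole word."""
    # re.escape guards against keywords that contain regex metacharacters.
    alternation = "|".join(re.escape(keyword) for keyword in keywords)
    # Raw f-string: \b stays a word boundary, and the group covers every alternative.
    pattern = rf"\b(?:{alternation})\b"
    mask = frame["Description"].str.contains(pattern, na=False)
    return abs(round(frame.loc[mask, "Amount"].sum(), 2))


# One entry per category, built in a single dict comprehension.
totals = {category: category_total(df, keywords) for category, keywords in expenses_dict.items()}
groc_amount, util_amount = totals["Groceries"], totals["Utilities"]
print(totals)
```

In this sample, "POWERADE KIOSK" is not counted under Utilities because the word boundary after POWER does not match inside POWERADE.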

Related

for loops - why does range(len()) cause TypeError: int not subscriptable in this case?

I'm working on looping through a JSON response and I'm trying to figure out why in this case below I get a TypeError: 'int' not subscriptable.
list = []
for i in range(len(json_data['MRData']['RaceTable']['Races'][0]['Results'])):
    list.append(i['FastestLap']['Time']['time'])
print(list)
I got around this by just doing a try/except block but I would rather know the length of what I'm iterating over. I tried reading some of the posts on here from other folks regarding this but couldn't make sense of it.

This is because i is an integer in this line: list.append(i['FastestLap']['Time']['time'])
Try this:
results = json_data['MRData']['RaceTable']['Races'][0]['Results']
for i in range(len(results)):
    list.append(results[i]['FastestLap']['Time']['time'])
It will get the i-th item from the list.

You have put len inside range, so i takes integer values from 0 to len(whateverItIs) - 1, yet inside the loop you are treating i as a dict. The "TypeError: 'int' object is not subscriptable" error is raised when you try to index an integer as if it were a subscriptable object, such as a list or a dictionary.
That's why you are getting the error on this line:
list.append(i['FastestLap']['Time']['time'])
To overcome the problem, iterate over the elements directly:
dict_val = json_data['MRData']['RaceTable']['Races'][0]['Results']
for element in dict_val:
    list.append(element['FastestLap']['Time']['time'])
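Since the asker also wants to know the length of what is being iterated over, here is a small sketch that reports it and uses enumerate to keep the index alongside each element. The json_data literal is a hypothetical stand-in for the real response, and the entry without a 'FastestLap' key is an assumption about why a try/except seemed necessary.

```python
# Hypothetical stand-in for the API response; the last result has no
# 'FastestLap' entry, the kind of gap a bare try/except would hide.
json_data = {
    "MRData": {
        "RaceTable": {
            "Races": [
                {
                    "Results": [
                        {"FastestLap": {"Time": {"time": "1:31.447"}}},
                        {"FastestLap": {"Time": {"time": "1:32.090"}}},
                        {"status": "Retired"},
                    ]
                }
            ]
        }
    }
}

results = json_data["MRData"]["RaceTable"]["Races"][0]["Results"]
print(len(results))  # number of results being iterated over

fastest_laps = []
for index, result in enumerate(results):
    lap = result.get("FastestLap")  # None when the key is missing
    if lap is None:
        print(f"result {index} has no fastest lap")
        continue
    fastest_laps.append(lap["Time"]["time"])

print(fastest_laps)
```

Using a name such as fastest_laps instead of list also avoids shadowing the built-in list type.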

TypeError: tuple indices must be integers or slices, not str using Python Core API?

I am trying to filter some data using the Python Core API of Apache Spark, but I keep running into this error and cannot work out how to solve it for the data I have:
TypeError: tuple indices must be integers or slices, not str
Now, this is a sample of my data structure:
This is the code I am using to filter my data, but it keeps giving me that error. I am simply trying to return the business_id, city and stars from my dataset.
(my_rdd
    .filter(lambda x: x['city']=='Toronto')
    .map(lambda x: (x['business_id'], x['city'], x['stars']))
).take(5)
Any guidance on how to filter my data would be helpful.
Thanks.

Since your data is nested in tuples, you need to specify the tuple indices in your filter and map:
result = (my_rdd
    .filter(lambda x: x[1][1]['city']=='Toronto')
    .map(lambda x: (x[1][1]['business_id'], x[1][1]['city'], x[1][1]['stars']))
)
print(result.collect())
[('7v91woy8IpLrqXsRvxj_vw', 'Toronto', 3.0)]

I think you are misusing filter and map here. Both build a new collection from an existing one rather than modifying it.
Both take a function as a parameter (that is the case in the method form used here; there is also a functional form that takes the input collection as the second parameter) and apply it to each item of the input to build the output. What differs is how they use the function:
filter uses it to select items. The function should return a boolean that indicates whether or not to include the item in the output.
map uses it to build a new collection of the same length as the old one, with each value transformed by the provided function.
That being said, I believe you get the error TypeError: tuple indices must be integers or slices, not str when you try to filter the data.
On the first item, filter runs the function against the first element, which is the tuple ('7v91woy8IpLrqXsRvxj_vw', (({'average_stars': 3.41, 'compliment_cool': 9, ...}))). The problem is that you are trying to access a value of this tuple using a string, as if it were a dictionary, which Python does not permit.
To extract the data you need from a single record, I would go with something much simpler:
item = my_rdd.first()
(item[1][1]['business_id'], item[1][1]['city'], item[1][1]['stars'])
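The indexing can be tried without a Spark cluster. The sketch below uses a plain Python list and the built-in filter and map instead of an RDD. The records are hypothetical and only mimic the nesting implied by the x[1][1] indices above, and details is a hypothetical helper introduced for readability.

```python
# Hypothetical records mimicking the nesting used above: (key, (user_dict, business_dict)).
records = [
    (
        "7v91woy8IpLrqXsRvxj_vw",
        (
            {"average_stars": 3.41},
            {"business_id": "7v91woy8IpLrqXsRvxj_vw", "city": "Toronto", "stars": 3.0},
        ),
    ),
    (
        "hypothetical_id_2",
        (
            {"average_stars": 4.0},
            {"business_id": "hypothetical_id_2", "city": "Calgary", "stars": 4.5},
        ),
    ),
]

# Indexing the outer tuple with a string reproduces the error from the question.
try:
    records[0]["city"]
except TypeError as exc:
    error_message = str(exc)
print(error_message)


def details(record):
    """Return the dictionary that holds business_id, city and stars."""
    return record[1][1]


# Built-in filter and map are lazy in Python 3, so list() materialises the result.
toronto = list(
    map(
        lambda r: (details(r)["business_id"], details(r)["city"], details(r)["stars"]),
        filter(lambda r: details(r)["city"] == "Toronto", records),
    )
)
print(toronto)
```

The same lambdas can be passed to the RDD's filter and map methods once the indices match the real record layout.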

Python unittest list comparison repr problem

I'm writing some tests for my Python application for a company and have got stuck on the following problem:
I need to compare two lists of lists, and I always get an error when the second, automatically generated list is converted: TypeError: __repr__ returned non-string (type dict). I take this to mean that the list I'm comparing with self.assertListEqual(l1, l2) contains another sub-list. I have already checked the structure and always get the same result: there is no sub-list in the list. I have printed everything and evaluated the content multiple times, and I still get the same error, so I'm stuck and don't know how to proceed. This is the code I used to generate the expected list, which is correct and should be structurally the same as the list the function generates:
expected.append([])
expected[0].append(openers[0])
expected[0].extend(locks[0:5])
expected.append([])
expected[1].append(openers[1])
expected[1].extend(locks[6:10])
expected.append([other])
This hard-coded list is then compared to a dynamically created list, which should be exactly the same.
Thanks for any help. If more code is required, I will append it here.
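The code that builds the generated list is not shown, so the cause cannot be confirmed from the question alone. One way this message can arise, independent of any sub-lists, is an element whose __repr__ method returns a dict instead of a string; formatting a list for display calls repr on every element. The classes below are hypothetical and only illustrate that mechanism.

```python
class BadRepr:
    """Hypothetical class whose __repr__ returns a dict instead of a str."""

    def __init__(self, data):
        self.data = data

    def __repr__(self):
        return self.data  # bug: __repr__ must return a string


class GoodRepr(BadRepr):
    """Same data, but __repr__ returns a proper string."""

    def __repr__(self):
        return f"GoodRepr({self.data!r})"


# repr() of a list calls __repr__ on each element, so the bad element raises.
try:
    repr([BadRepr({"id": 1})])
except TypeError as exc:
    message = str(exc)

print(message)
print(repr([GoodRepr({"id": 1})]))
```

If the elements of the generated list are instances of a custom class, checking what their __repr__ returns may be a useful first step.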

Using User Input to Index List in Python

I can't understand what I am doing wrong with this method. It is called from another method in the class like this:
def equip_menu(self): # this is not the actual method, but equip_choice is used only in the following lines
    #snipped code
    equip_choice = Input("Input the number of the item you want to equip:\n")
    self.select_from_list_equip(equip_choice)
And this is the method that throws the error:
def select_from_list_equip(self, equip_choice): # Trying to select item in list self.backpack
    item_to_equip = self.backpack[equip_choice]
    print("*DEBUG* Equip chosen:", item_to_equip.name)
    playeritems.equip(item_to_equip)
I get the error:
"classes.py", line 109, in select_from_list_equip
item_to_equip = self.backpack[count]
TypeError: list indices must be integers or slices, not str"
I only ever type digits without decimals, yet I still get the error. I was of the opinion that I am not using a string as a list index, but obviously I am wrong. So I tried to force equip_choice to become an integer like this:
def select_from_list_equip(self, equip_choice):
    int(equip_choice)
    item_to_equip = self.backpack[equip_choice]
    print("*DEBUG* Equip chosen:", item_to_equip.name)
    playeritems.equip(item_to_equip)
But I still get the same identical error. Why can't I use the value of equip_choice as list index? I must be missing something very obvious and basic I am blind to?

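Input() returns a string. You need int() to convert it to an integer, and because int() returns a new value rather than changing its argument in place, you must assign the result, for example equip_choice = int(equip_choice).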
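A minimal sketch of the conversion, assuming the choice arrives as text. The function name select_from_list and the sample backpack are hypothetical, and the validation steps are an addition not present in the original method.

```python
def select_from_list(backpack, equip_choice):
    """Return the item at the position typed by the user, or None if the choice is invalid."""
    try:
        index = int(equip_choice)  # int() returns a new value, so it must be stored
    except ValueError:
        return None  # the text was not a whole number
    if not 0 <= index < len(backpack):
        return None  # the number is outside the list
    return backpack[index]


backpack = ["sword", "shield", "potion"]  # hypothetical items
print(select_from_list(backpack, "1"))
print(select_from_list(backpack, "abc"))
```

Calling int(equip_choice) on a line by itself, as in the question, computes the integer and then discards it, which is why the string was still used as the index.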

"TypeError: float argument must be a string or a number, not a list." converting a list of strings to a list of floats

I am trying to convert a list of strings into a list of floats. I have tried list comprehension, mapping, and simply writing it out in a for loop. I don't really want to use mapping since I can't seem to get it back into a proper list even with list(map).
So far none of my attempts have worked because I am having trouble finding the correct syntax for Python 3.x. My latest attempt seems to show promise, but I keep getting the following error.
Traceback (most recent call last):
  File "G:/test.py", line 56, in <module>
    heartdis_flt.append(float(item))
TypeError: float() argument must be a string or a number, not 'list'
This is the code I am using:
heartdis = heartdis[5:]
heartdis_flt = []
for item in heartdis:
    heartdis_flt.append(float(item))
print(heartdis_flt)
heartdis is a list of strings created from a CSV file.
Can someone explain the correct syntax or maybe some flaw in my logic?

I found something that works. I used itertools to flatten the list of lists into one list, then converted everything to floats.
import itertools

heartdis = heartdis[5:]
heartdis_flt = []
heartdis2 = list(itertools.chain.from_iterable(heartdis))
for item in heartdis2:
    heartdis_flt.append(float(item))
print(heartdis_flt)

As @utdemir commented, the problem with your code is that you are treating a list as a string. You could indeed use itertools.chain, but you may want to change the way you read heartdis in the first place. I don't know how you are reading your CSV file, but if you are using the csv module, I reckon you should not be getting a list of lists as output. In any case, it is something you should check, in my opinion.
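How the file is read is not shown, so the following is only a sketch under assumptions: it uses the csv module on a small hypothetical two-column file held in memory. csv.reader yields each row as a list of strings, so one way to end up with a flat list of floats is to convert the fields row by row, or to pick a single column.

```python
import csv
import io

# Hypothetical CSV content standing in for the real file.
csv_text = "age,chol\n63,233.0\n37,250.5\n"

rows = list(csv.reader(io.StringIO(csv_text)))  # each row is a list of strings
data_rows = rows[1:]  # skip the header row

# Flatten every field of every row into one list of floats.
flat = [float(value) for row in data_rows for value in row]

# Or keep a single column (index 1 here) as a flat list of floats.
chol = [float(row[1]) for row in data_rows]

print(flat)
print(chol)
```

The nested comprehension does the same job as itertools.chain.from_iterable followed by a conversion loop.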
