Counting Occurrences of a String within elements of a list? - python

I'm trying to count the number of times "a class" occurs within each "h2 class" section, so I split the parsed text by "h2 class", but I'm struggling with the second part. This is where I'm at:
#splitting parsed text by header
parsed.split("h2 class")
#creating the list for the a value count to be stored
aValCount = []
#counting amount of items per header
for i in range(len(parsed)):
    aValCount = aValCount + ((parsed[i]).count("a class"))
The error I'm getting is
TypeError: can only concatenate list (not "int") to list
but I can't figure out how to do this without getting some sort of error.
Edited: Thought I should add, I want it to be a list of the counts from the strings, so the count from element one in parsed should be element 1 in aValCount.

The issue is that aValCount is a list and ((parsed[i]).count("a class")) is an int, and Python cannot concatenate an int to a list.
To add the count to aValCount, wrap it in a list of its own:
aValCount = aValCount + [((parsed[i]).count("a class"))]
Adding the [...] should do it.
Or you can also do
aValCount.append((parsed[i]).count("a class"))
Hope that helps.
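Note that split() returns a new list and leaves parsed unchanged, so store the result and loop over that:
results = parsed.split("h2 class")
aValCountList = []
for i in range(len(results)):
    aValCountList.append((results[i]).count("a class"))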
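The same per-section count can also be written as a list comprehension. This is only a minimal sketch: the function name count_per_section and the sample string are made up for illustration and are not the poster's real data.

```python
def count_per_section(parsed, header="h2 class", needle="a class"):
    """Return one count of `needle` for each piece of `parsed` split on `header`."""
    sections = parsed.split(header)  # split() returns a new list
    return [section.count(needle) for section in sections]


# Hypothetical sample text standing in for the parsed HTML.
sample = 'intro <h2 class="x"> <a class="1"> <a class="2"> <h2 class="y"> <a class="3">'
counts = count_per_section(sample)
print(counts)
```

The first element counts whatever text comes before the first header, so slicing the result with [1:] may be what you want if only the headed sections matter.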

Related

"TypeError: can only join an iterable" when attempting to add list entries to a string

I have a class with two lists as variables. It has a method which is supposed to add every element in the lists to a (quite lengthy) string, which is then returned to the main program, to eventually be printed. I'm iterating through the list with a for-loop and using .join() to add every object to the string, but I'm getting a TypeError: "can only join an iterable".
The lists contain prices of what has been purchased in a restaurant, so just floating numbers.
class A:
    def __init__(self, etc.):
        self.__foods = []
        self.__drinks = []
I then have a method which is supposed to build a receipt, with a predetermined form, which is then passed on to the main program as a string.
class A:
    ...
    def create_receipt(self):
        food_price_string = "" # What is eventually joined to the main string
        food_prices = self.__foods # What is iterated
        for price in food_prices:
            food_price_string.join(price) # TypeError here
            food_price_string.join("\n") # For the eventual print
Here's where I get the TypeError: the program refuses to join the 'price' variable to the string created above. I'm supposed to do the same thing for the drink prices too, both of which would then be joined to the rest of the string.
There are two problems here:
str.join does not alter the string (strings are immutable); it returns a new string, and
it takes as input an iterable of strings to join together; it does not append a single string.
The fact that food_prices is iterable does not matter: inside the for loop, price is a single element of food_prices (a float), so you are passing join one non-iterable item rather than a sequence of strings.
You can rewrite the method like this:
def create_receipt(self):
    food_prices = self.__foods
    food_price_string = '\n'.join(str(price) for price in food_prices)
    food_price_string += '\n' # (optional) add a new line at the end
    # ... continue processing food_price_string (or return it)
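For reference, here is one way the whole thing might fit together. It is a sketch under assumptions: the class name Receipt, the constructor arguments, the "Food:"/"Drinks:" labels and the two-decimal formatting are all invented for the example, since the question does not show the real receipt layout.

```python
class Receipt:
    """Hypothetical stand-in for the poster's class A."""

    def __init__(self, foods=None, drinks=None):
        self.__foods = list(foods or [])
        self.__drinks = list(drinks or [])

    def create_receipt(self):
        # str.join needs an iterable of strings, so format each float first.
        food_lines = "\n".join("{:.2f}".format(p) for p in self.__foods)
        drink_lines = "\n".join("{:.2f}".format(p) for p in self.__drinks)
        # join returns a new string, so the result has to be kept or returned.
        return "Food:\n" + food_lines + "\nDrinks:\n" + drink_lines + "\n"


receipt_text = Receipt(foods=[9.5, 12.0], drinks=[3.25]).create_receipt()
print(receipt_text)
```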

Web2py comparing part of a request.vars element

I have a form with a table with rows containing SELECTs with _names with IDs attached, like this:
TD_list.append(TD(SELECT(lesson_reg_list, _name='lesson_reg_' + str(student[4]))))
When the form is submitted I want to extract both the student[4] value and the value held by request.vars.lesson_reg_student[4].
I've tried something like:
for item in request.vars:
    if item[0:9] == "lesson_reg":
        enrolment_id = int(item[10:])
        code = request.vars.item
I also tried treating request.vars like a dictionary by using:
for key, value in request.vars:
    if key[0:9] == "lesson_reg":
        enrolment_id = int(key[10:])
        code = value
but then I got 'too many values to unpack'. How do I retrieve the value of a request.vars item when the last part of its name could be any number, plus a substring of the item name itself?
Thanks in advance for helping me.
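In Python, when slicing, the ending index is excluded, so your two slices should be [0:10] and [11:] ("lesson_reg" is 10 characters long, and the ID starts after the underscore at index 10). To simplify, you can also use .startswith for the first one. Also note that request.vars.item looks up a variable literally named "item"; use request.vars[item] to look up the name held in the loop variable:
for item in request.vars:
    if item.startswith('lesson_reg'):
        enrolment_id = int(item[11:])
        code = request.vars[item]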
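The 'too many values to unpack' error in the second attempt comes from iterating over the mapping directly, which yields only keys; .items() yields (key, value) pairs. The sketch below shows the idea with a plain dict standing in for request.vars, so it runs outside web2py. The function name extract_lesson_regs and the sample form data are made up for the example.

```python
def extract_lesson_regs(form_vars, prefix="lesson_reg_"):
    """Map each enrolment ID (the numeric suffix of the name) to its submitted value."""
    result = {}
    # .items() yields (key, value) pairs; iterating the dict alone yields keys only.
    for key, value in form_vars.items():
        if key.startswith(prefix):
            enrolment_id = int(key[len(prefix):])  # everything after the prefix
            result[enrolment_id] = value
    return result


# Hypothetical submitted form data; a plain dict stands in for request.vars.
fake_vars = {"lesson_reg_17": "A1", "lesson_reg_42": "B2", "submit": "Save"}
print(extract_lesson_regs(fake_vars))
```

Using len(prefix) instead of a hard-coded index avoids the off-by-one problem if the prefix ever changes.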

Python list index not found in loading list from text file

The assignment was to get a user to input 4 numbers, then store them in a text file, open that text file, show the 4 numbers on different lines, then get the average of those numbers and display it to the user.
Here is my code so far:
__author__ = 'Luca Sorrentino'
numbers = open("Numbers", 'r+')
numbers.truncate() #OPENS THE FILE AND DELETES THE PREVIOUS CONTENT
# Otherwise it prints out all the inputs into the file ever
numbers = open("Numbers", 'a') #Opens the file so that it can be added to
liist = list() #Creates a list called liist
def entry(): #Defines a function called entry, to enable the user to enter numbers
    try:
        inputt = float(input("Please enter a number")) #Stores the users input as a float in a variable
        liist.append(inputt) #Appends the input into liist
    except ValueError: #Error catching that loops until input is correct
        print("Please try again. Ensure your input is a valid number in numerical form")
        entry() #Runs entry function again to enable the user to retry.
x = 0
while x < 4: # While loop so that the program collects 4 numbers
    entry()
    x = x + 1
for inputt in liist:
    numbers.write("%s\n" % inputt) #Writes liist into the text file
numbers.close() #Closes the file
numbers = open("Numbers", 'r+')
output = (numbers.readlines())
my_list = list()
my_list.append(output)
print(my_list)
print(my_list[1])
The problem is loading the numbers back from the text file and then storing each one as a variable so that I can get the average of them.
I can't seem to find a way to specifically locate each number, just each byte which is not what I want.
Your list (my_list) has only 1 item - a list with the items you want.
You can see this if you try print(len(my_list)), so your print(my_list[1]) is out of range because the item with index = 1 does not exist.
When you create an empty list and append output, you are adding one item to the list, which is what the variable output holds for a value.
To get what you want just do
my_list = list(output)
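A quick way to see the difference is to compare lengths. In this small sketch the contents of output are invented to mimic what readlines() would return for four stored numbers.

```python
output = ["1.0\n", "2.0\n", "3.0\n", "4.0\n"]  # made-up readlines() result

nested = []
nested.append(output)    # one element: the whole list, nested inside

flat = list(output)      # shallow copy: four separate elements

extended = []
extended.extend(output)  # also four elements, added one by one

print(len(nested), len(flat), len(extended))
```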
You'll have two main problems.
First, .append() is for adding an individual item to a list, not for adding one list to another. Because you used .append() you've ended up with a list containing one item, and that item is itself a list... not what you want, and the explanation for your error message. For concatenating one list to another .extend() or += would work, but you should ask yourself whether that is even necessary in your case.
Second, your list elements are strings and you want to work with them as numbers. float() will convert them for you.
In general, you should investigate the concept of "list comprehensions". They make operations like this very convenient. The following example creates a new list whose members are the respectively float()ed versions of your .readlines() output:
my_list = [float(x) for x in output]
The ability to add conditionals into a list comprehension is also a real complexity-saver. For example, if you wanted to skip any blank/whitespace lines that had crept into your file:
my_list = [float(x) for x in output if len(x.strip())]
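Putting the conversion and the average together might look like the following. It is a sketch only: the helper name average_from_lines and the sample lines are assumptions, and it would raise ZeroDivisionError if every line were blank.

```python
def average_from_lines(lines):
    """Convert the non-blank lines to floats and return their mean."""
    # float() ignores surrounding whitespace, including the trailing newline.
    values = [float(line) for line in lines if line.strip()]
    return sum(values) / len(values)


# Made-up lines, shaped like the output of readlines(), with one blank line.
sample_lines = ["1.0\n", "2.0\n", "\n", "3.0\n", "4.0\n"]
print(average_from_lines(sample_lines))
```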
You can change the end of your program a little and it will work:
output = numbers.readlines()
# this line uses a list comprehension to make
# a new list without new lines
output = [i.strip() for i in output]
for num in output:
    print(num)
1.0
2.0
3.0
4.0
print(sum(float(i) for i in output))
10.0

Remove duplicates from user input

I want to ignore any duplicate entry given by the user as input. I have the code below:
def pITEMName():
    global ITEMList,fITEMList
    pITEMList = []
    fITEMList = []
    ITEMList = str(raw_input('Enter pipe separated list of ITEMS : ')).upper().strip()
    items = ITEMList.split("|")
    count = len(items)
    print 'Total Distint ITEM Count : ', count
    pipelst = [i.replace('-mc','').replace('-MC','').replace('$','').replace('^','') for i in ITEMList.split('|')]
    filepath = '/location/data.txt'
    f = open(filepath, 'r')
    for lns in f:
        split_pipe = lns.split(':', 1)
        if split_pipe[0] in pipelst:
            index = pipelst.index(split_pipe[0])
            pITEMList=split_pipe[0]+"|"
            fITEMList.append(pITEMList)
            del pipelst[index]
    for lns in pipelst:
        print bcolors.red + lns,' is wrong ITEM Name' + bcolors.ENDC
    f.close()
When I execute the above code, it prompts me for user input:
Enter pipe separated list of items :
And if I provide the input as :
Enter pipe separated list of items : AAA|IFA|AAA
After pressing enter I am getting the result as :
Enter pipe separated list of Items : AAA|IFA|AAA
Total Distint Item Count : 3
AAA is wrong Item Name
Items Belonging to other Centers :
Other Centers :
Item Count From Other Center = 0
Items Belonging to Current Centers :
Active Items in US1:
^IFA$
Active Items in US2 :
^AAA$|^AAA$
Ignored Item Count From Current Center = 0
You Have Entered ItemList belonging to this Center as:
^IFA$|^AAA$|^AAA$
Active Item Count : 3
Do You Want To Continue [YES|Y|NO|N] :
In the above result you will notice that I entered AAA twice, so it is being counted as a wrong item. I want duplicate entries to be ignored. I also want the comparison to be case-insensitive: if I give AAA|aaa|ifa, one 'aaa' should be ignored.
Please help me with how I can implement this.
First, you're doing ITEMList.split("|") several times. You should just use your already calculated items.
Second, you probably want:
items = set(ITEMList.lower().split("|"))
This way you get a set with unique, all lowercase elements.
I assume this doesn't matter since you can discard either uppercase or lowercase.
If item order is not important, then a set will do this very well.
items = set(ITEMList.split("|"))
Lots of great answers here; throwing my hat into the ring as well. One straightforward way to do this:
items = list(set(ITEMList.split("|")))
items.sort()
This preserves your items object as a list and orders it (which is something you may or may not prefer in this case).
If you decide later that you want to return an element of your items list in your code, you will be able to do it by referring to the list index (this functionality doesn't exist with sets).
If you want to preserve the value of the variable count, you could implement the code as:
items = ITEMList.split("|")
count = len(items)
items = list(set(ITEMList.split("|")))
items.sort()
You will also want to adjust this line:
pipelst = [i.replace('-mc','').replace('-MC','').replace('$','').replace('^','') for i in ITEMList.split('|')]
to this:
pipelst = [i.replace('-mc','').replace('-MC','').replace('$','').replace('^','') for i in items]
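If order is important, this is one way to do it (it relies on Python 3.7+, where Counter keeps keys in first-seen order; on older versions use collections.OrderedDict.fromkeys instead):
import collections
my_list = "^IFA$|^AAA$|^AAA$"
"|".join(collections.Counter(my_list.upper().split("|")).keys())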
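For a case-insensitive, order-preserving version, something along these lines could work. It is a sketch written for Python 3.7+ (the question's own code is Python 2), and the helper name unique_items is made up for the example.

```python
def unique_items(raw, sep="|"):
    """Drop case-insensitive duplicates from `raw`, keeping first-seen order."""
    cleaned = (part.strip().upper() for part in raw.split(sep))
    # dict keys are unique and keep insertion order on Python 3.7+.
    return list(dict.fromkeys(item for item in cleaned if item))


print(unique_items("AAA|aaa|ifa"))
```

Normalising to upper case before deduplicating matches what the original code already does with .upper(), so 'AAA' and 'aaa' collapse into one entry.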

Splitting json data in python

I'm trying to manipulate a list of items in Python but I'm getting the error "AttributeError: 'list' object has no attribute 'split'".
I understand that a list does not support .split, but I don't know what else to do. Below is a copy-paste of the relevant part of my code.
tourl = 'http://data.bitcoinity.org/chart_data'
tovalues = {'timespan':'24h','resolution':'hour','currency':'USD','exchange':'all','mining_pool':'all','compare':'no','data_type':'price_volume','chart_type':'line_bar','smoothing':'linear','chart_types':'ccacdfcdaa'}
todata = urllib.urlencode(tovalues)
toreq = urllib2.Request(tourl, todata)
tores = urllib2.urlopen(toreq)
tores2 = tores.read()
tos = json.loads(tores2)
tola = tos["data"]
for item in tola:
    ting = item.get("values")
    ting.split(',')[2]  # <----- ERROR
    print(ting)
To understand what I'm trying to do you will also need to see the JSON data. ting outputs this:
[
[1379955600000L, 123.107310846774], [1379959200000L, 124.092526428571],
[1379962800000L, 125.539504822835], [1379966400000L, 126.27024617931],
[1379970000000L, 126.723474983766], [1379973600000L, 126.242406356837],
[1379977200000L, 124.788410570987], [1379980800000L, 126.810084904632],
[1379984400000L, 128.270580796748], [1379988000000L, 127.892411269036],
[1379991600000L, 126.140579640523], [1379995200000L, 126.513705084746],
[1379998800000L, 128.695124951923], [1380002400000L, 128.709738051044],
[1380006000000L, 125.987767097378], [1380009600000L, 124.323433535528],
[1380013200000L, 123.359378559603], [1380016800000L, 125.963250678733],
[1380020400000L, 125.074618194444], [1380024000000L, 124.656345088853],
[1380027600000L, 122.411303435449], [1380031200000L, 124.145747100372],
[1380034800000L, 124.359452274881], [1380038400000L, 122.815357211394],
[1380042000000L, 123.057706915888]
]
[
[1379955600000L, 536.4739135], [1379959200000L, 1235.42506637],
[1379962800000L, 763.16329656], [1379966400000L, 804.04579319],
[1379970000000L, 634.84689741], [1379973600000L, 753.52716718],
[1379977200000L, 506.90632968], [1379980800000L, 494.473732950001],
[1379984400000L, 437.02095093], [1379988000000L, 176.25405034],
[1379991600000L, 319.80432715], [1379995200000L, 206.87212398],
[1379998800000L, 638.47226435], [1380002400000L, 438.18036666],
[1380006000000L, 512.68490443], [1380009600000L, 904.603705539997],
[1380013200000L, 491.408088450001], [1380016800000L, 670.275397960001],
[1380020400000L, 767.166941339999], [1380024000000L, 899.976089609997],
[1380027600000L, 1243.64963909], [1380031200000L, 1508.82429811],
[1380034800000L, 1190.18854705], [1380038400000L, 546.504592349999],
[1380042000000L, 206.84883264]
]
And ting[0] outputs this:
[1379955600000L, 123.187067936508]
[1379955600000L, 536.794013499999]
What I'm really trying to do is add up the values from ting[0-24] that come AFTER the second comma. This made me try to do a split, but that does not work.
You already have a list; the commas are put there by Python to delimit the values only when printing the list.
Just access element 2 directly:
print ting[2]
This prints:
[1379962800000, 125.539504822835]
Each of the entries in item['values'] (so ting) is a list of two numbers (a timestamp and a value), so you can address each of those with index 0 and 1:
>>> print ting[2][0]
1379962800000
>>> print ting[2][1]
125.539504822835
To get a list of all the second values, you could use a list comprehension:
second_vals = [t[1] for t in ting]
When you load the data with json.loads, it is already parsed into a real list that you can slice and index as normal. If you want the data starting with the third element, just use ting[2:]. (If you just want the third element by itself, just use ting[2].)
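Since the goal was to add the values up, sum() over the second column does it directly. A minimal sketch, using a few made-up rows shaped like one item["values"] list rather than the real API response:

```python
# Made-up sample: [timestamp_ms, value] pairs, like one item["values"] list.
ting = [
    [1379955600000, 123.0],
    [1379959200000, 124.5],
    [1379962800000, 125.5],
]

second_vals = [pair[1] for pair in ting]               # value column only
total = sum(second_vals)                               # sum over every row
total_from_third = sum(pair[1] for pair in ting[2:])   # rows from index 2 onward

print(total, total_from_third)
```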
