Separating tuples in a list in Python

I want to separate the tuples in a list that comes from a SQLite database, but I can't work out how to do it.
Here is the output: [('3:45',), ('4:52',), ('5:42',), ('6:52',)]
I'm pulling that output from the SQLite database like this:
asking = "Select SongTimes from Song_List"
self.cursor.execute(asking)
times = list(self.cursor.fetchall())
print(times)
After that I want to sum all of the song times in that list. To do that I first need to get each one as a plain string like "3:45". Then I can split it into 3 and 45, and the rest is fairly easy. But as I said, I first need to get the strings out of these tuples. I guess that trailing "," is the troublemaker.

You can do the following:
times = [('3:45',), ('4:52',), ('5:42',), ('6:52',)]
seconds = 0  # Total number of seconds
# Iterate over all the tuples
for time in times:
    time = time[0]  # Get first element of tuple
    m, s = time.split(":")  # split string in two removing ':'
    seconds += 60 * int(m) + int(s)  # Convert str to int and add to total sum
print(seconds)
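For context, the trailing comma in ('3:45',) is just how Python displays a one-element tuple: fetchall() returns one tuple per row, even when the query selects a single column. The sketch below wraps the same idea in a function and turns the total back into minutes and seconds. The name total_seconds is my own, and the rows are the sample values from the question rather than a live database query.

```python
def total_seconds(rows):
    """Sum 'M:SS' strings stored in one-element tuples, as fetchall() returns them."""
    total = 0
    for (time_text,) in rows:  # unpack the single column of each row
        minutes, seconds = time_text.split(":")
        total += 60 * int(minutes) + int(seconds)
    return total


rows = [('3:45',), ('4:52',), ('5:42',), ('6:52',)]
total = total_seconds(rows)
minutes, seconds = divmod(total, 60)  # split the total back into M and SS
print(f"{total} seconds = {minutes}:{seconds:02d}")  # 1271 seconds = 21:11
```

Unpacking with (time_text,) also fails loudly if a row ever has more than one column, which indexing with [0] would silently hide.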

Related

CS50 'DNA': Ways to speed up my Week 6 'dna.py' program?

So for this problem I had to create a program that takes in two arguments. A CSV database like this:
name,AGATC,AATG,TATC
Alice,2,8,3
Bob,4,1,5
Charlie,3,2,5
And a DNA sequence like this:
TAAAAGGTGAGTTAAATAGAATAGGTTAAAATTAAAGGAGATCAGATCAGATCAGATCTATCTATCTATCTATCTATCAGAAAAGAGTAAATAGTTAAAGAGTAAGATATTGAATTAATGGAAAATATTGTTGGGGAAAGGAGGGATAGAAGG
My program works by first getting the "Short Tandem Repeat" (STR) headers from the database (AGATC, etc.), then counting the highest number of times each STR repeats consecutively within the sequence. Finally, it compares these counted values to the values of each row in the database, printing out a name if a match is found, or "No match" otherwise.
The program definitely works, but it is ridiculously slow when run against the larger database provided, to the point where the terminal pauses for an entire minute before returning any output. Unfortunately this causes the 'check50' marking system to time out and return a negative result when testing with this large database.
I'm presuming the slowdown is caused by the nested loops within the 'STR_count' function:
def STR_count(sequence, seq_len, STR_array, STR_array_len):
    # Creates a list to store max recurrence values for each STR
    STR_count_values = [0] * STR_array_len
    # Temp value to store current count of STR recurrence
    temp_value = 0
    # Iterates over each STR in STR_array
    for i in range(STR_array_len):
        STR_len = len(STR_array[i])
        # Iterates over each sequence element
        for j in range(seq_len):
            # Ensures it's still physically possible for STR to be present in sequence
            while (seq_len - j >= STR_len):
                # Gets sequence substring of length STR_len, starting from jth element
                sub = sequence[j:(j + (STR_len))]
                # Compares current substring to current STR
                if (sub == STR_array[i]):
                    temp_value += 1
                    j += STR_len
                else:
                    # Ensures current STR_count_value is highest
                    if (temp_value > STR_count_values[i]):
                        STR_count_values[i] = temp_value
                    # Resets temp_value to break count, and pushes j forward by 1
                    temp_value = 0
                    j += 1
        i += 1
    return STR_count_values
And the 'DNA_match' function:
# Searches database file for DNA matches
def DNA_match(STR_values, arg_database, STR_array_len):
    with open(arg_database, 'r') as csv_database:
        database = csv.reader(csv_database)
        name_array = [] * (STR_array_len + 1)
        next(database)
        # Iterates over one row of database at a time
        for row in database:
            name_array.clear()
            # Copies entire row into name_array list
            for column in row:
                name_array.append(column)
            # Converts name_array number strings to actual ints
            for i in range(STR_array_len):
                name_array[i + 1] = int(name_array[i + 1])
            # Checks if a row's STR values match the sequence's values, prints the row name if match is found
            match = 0
            for i in range(0, STR_array_len, + 1):
                if (name_array[i + 1] == STR_values[i]):
                    match += 1
            if (match == STR_array_len):
                print(name_array[0])
                exit()
        print("No match")
        exit()
However, I'm new to Python, and haven't really had to consider speed before, so I'm not sure how to improve upon this.
I'm not particularly looking for people to do my work for me, so I'm happy for any suggestions to be as vague as possible. And honestly, I'll value any feedback, including stylistic advice, as I can only imagine how disgusting this code looks to those more experienced.
Here's a link to the full program, if helpful.
Thanks :) x
Thanks for providing a link to the entire program. It seems needlessly complex, but I'd say it's just a lack of knowing what features are available to you. I think you've already identified the part of your code that's causing the slowness - I haven't profiled it or anything, but my first impulse would also be the three nested loops in STR_count.
Here's how I would write it, taking advantage of the Python standard library. Every entry in the database corresponds to one person, so that's what I'm calling them. people is a list of dictionaries, where each dictionary represents one line in the database. We get this for free by using csv.DictReader.
To find the matches in the sequence, for every short tandem repeat in the database, we create a regex pattern (the current short tandem repeat, repeated one or more times) and look at every run of it in the sequence. The number of repetitions in a run is equal to the length of the run divided by the length of the current tandem repeat, and we keep the longest run. For example, if AGATCAGATCAGATC is the longest run in the sequence, and the current tandem repeat is AGATC, then the number of repetitions will be len("AGATCAGATCAGATC") // len("AGATC") which is 15 // 5, which is 3. Note that re.search on its own returns only the first run, which is not necessarily the longest, so the code scans all candidate runs with re.finditer and a lookahead.
count is just a dictionary that maps short tandem repeats to their corresponding number of repetitions in the sequence. Finally, we search for a person whose short tandem repeat counts match those of count exactly, and print their name. If no such person exists, we print "No match".
def main():
    import argparse
    from csv import DictReader
    import re
    parser = argparse.ArgumentParser()
    parser.add_argument("database_filename")
    parser.add_argument("sequence_filename")
    args = parser.parse_args()
    with open(args.database_filename, "r") as file:
        reader = DictReader(file)
        short_tandem_repeats = reader.fieldnames[1:]
        people = list(reader)
    with open(args.sequence_filename, "r") as file:
        sequence = file.read().strip()
    count = dict(zip(short_tandem_repeats, [0] * len(short_tandem_repeats)))
    for short_tandem_repeat in short_tandem_repeats:
        # The lookahead makes finditer try every start position, so a long run
        # is never hidden behind an earlier, shorter one
        pattern = f"(?=((?:{short_tandem_repeat})+))"
        longest = max((len(match.group(1)) for match in re.finditer(pattern, sequence)), default=0)
        count[short_tandem_repeat] = longest // len(short_tandem_repeat)
    try:
        person = next(person for person in people if all(int(person[k]) == count[k] for k in short_tandem_repeats))
        print(person["name"])
    except StopIteration:
        print("No match")
    return 0
if __name__ == "__main__":
    import sys
    sys.exit(main())
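The counting step can be tried on its own, without the CSV handling. One caveat: re.search stops at the first run of a repeat, and the first run is not always the longest. The sketch below uses a hypothetical helper, longest_run (my name, not part of the CS50 materials), that examines every start position through a lookahead and keeps the longest run. The sample sequence is made up for illustration.

```python
import re


def longest_run(sequence, repeat):
    """Return the largest number of back-to-back copies of `repeat` in `sequence`."""
    # (?=...) matches at every start position without consuming characters,
    # so overlapping candidate runs are all examined.
    pattern = re.compile(f"(?=((?:{re.escape(repeat)})+))")
    longest = max((len(m.group(1)) for m in pattern.finditer(sequence)), default=0)
    return longest // len(repeat)


sequence = "AGATCTTAGATCAGATCAGATCGG"  # one lone AGATC, then a run of three
first = re.search("(?:AGATC)+", sequence)
print(len(first.group()) // len("AGATC"))  # 1: re.search only sees the first run
print(longest_run(sequence, "AGATC"))      # 3: the longest run
print(longest_run(sequence, "TATC"))       # 0: the repeat does not occur
```

The lookahead costs a little extra work per position, but it is still a single pass through the regex engine per repeat.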

Python list does not append correctly (IndexError: list index out of range)

I have tried different approaches for 3 hours now and I just don't get why this does not work.
current_stock_dict = db.execute("SELECT * FROM current_stocks WHERE c_user_id=:user_id ", user_id=session["user_id"])
# make a list for the mainpage
mainpage_list = [[],[]]
# save the length of the dict
lengh_dict = len(current_stock_dict)
price_sum = 0
share_sum = 0
# iterate over all rows in the dict
for i in range(0, (lengh_dict - 1)):
    # lookup the symbol in the current stocks
    c_symbol = current_stock_dict[i]["c_symbol"]
    lookup_symbol = lookup(c_symbol)
    # append the symbol to the list for the mainpage
    mainpage_list[i].append(c_symbol)
    # append the name of the share
    share_name = lookup_symbol["name"]
    mainpage_list[i].append(share_name)
    # append the count of shares for mainpage
    c_count = current_stock_dict[i]["c_count"]
    mainpage_list[i].append(c_count)
    # append the current price
    share_price = lookup_symbol["price"]
    mainpage_list[i].append("$" + str(share_price))
    # append the total price of all shares
    total_price = float(share_price) * int(c_count)
    mainpage_list[i].append("$" + str(total_price))
    # count up the price and shares
    price_sum += total_price
    share_sum += c_count
When I run my website via Flask I get an error message saying:
IndexError: list index out of range
in the line:
mainpage_list[i].append(c_symbol)
(and I guess if it did not already fail there I'd get it for the rest of the lines too).
As long as lengh_dict = len(current_stock_dict) is 3 or less (so the SQL db has 3 rows or less) the error message does not appear and the code works fine. I do not really understand lists (and multidimensional lists) in Python yet, so I would be happy if somebody could explain my mistake to me.
Normally I would print out a lot of things and just try to find where the mistake is, but I just began using Flask and I can't print out lists, dicts or anything if the code stops before reaching the bug.
Thanks already for your help!
Let's look at the relevant part of your code.
mainpage_list = [[],[]]
for i in range(0, (lengh_dict - 1)):
    mainpage_list[i].append(c_symbol)
mainpage_list is a list that contains two elements, both of which are empty lists. So, accessing mainpage_list[0] is the first list inside mainpage_list, and mainpage_list[1] is the second empty list. Any index above that will result in an IndexError.
It is not exactly clear what you are trying to achieve, but you could initialize mainpage_list with the correct number of empty lists inside if that is what you need, e.g. for the case where you want as many empty lists as the length of current_stock_dict, you could do
mainpage_list = [ [] for _ in range(lengh_dict) ]
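This also explains why the error only shows up with four or more rows: range(0, lengh_dict - 1) stops one short of the last row, so with three rows the loop only touches indexes 0 and 1. That in turn means the last row is silently skipped. The sketch below reproduces the IndexError with stand-in data (the symbols are made up, not taken from the asker's database) and shows the comprehension from above next to a common pitfall, [[]] * n, which repeats one and the same inner list.

```python
rows = ["AAPL", "MSFT", "NFLX", "TSLA"]  # stand-in for four database rows

# Two inner lists only: index 2 does not exist
two_lists = [[], []]
try:
    two_lists[2].append(rows[2])
except IndexError as error:
    print("IndexError:", error)

# One independent inner list per row
per_row = [[] for _ in rows]
for i, symbol in enumerate(rows):
    per_row[i].append(symbol)
print(per_row)

# Pitfall: the same inner list object repeated four times
shared = [[]] * len(rows)
shared[0].append("AAPL")
print(shared)  # every entry shows the appended value
```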
The issue here is that the list mainpage_list is a two element list, and you're trying to access the third element of it.
Generally, when processing lists of indeterminate size, I prefer to iterate and append rather than indexing into the list.
This gives you something like:
source = ["abc", "def", "ghi"]  # List of data to process
target = []  # The processed data
for row in source:  # For every row of data
    value = []  # Empty list to accumulate result in
    value.append(row[2])
    value.append(row[1])
    value.append(row[0])
    target.append(value)
print(target)
which will work for any size of source list.
Applying this to your code gives you:
# current_stock is a list of dictionaries.
current_stock = db.execute("SELECT * FROM current_stocks WHERE c_user_id=:user_id ", user_id=session["user_id"])
# make a list for the mainpage
mainpage_list = []
price_sum = 0
share_sum = 0
# iterate over all rows in current_stock
for row in current_stock:
    value = []
    # lookup the symbol in the current stocks
    c_symbol = row["c_symbol"]
    lookup_symbol = lookup(c_symbol)
    # append the symbol to the list for the mainpage
    value.append(c_symbol)
    # append the name of the share
    share_name = lookup_symbol["name"]
    value.append(share_name)
    # append the count of shares for mainpage
    c_count = row["c_count"]
    value.append(c_count)
    # deleted code (the price and total_price lines from the question, appending to value instead of mainpage_list[i])
    # count up the price and shares
    price_sum += total_price
    share_sum += c_count
    mainpage_list.append(value)

How can I convert some of my strings into integers, then do math with them in Python?

I keep getting errors whenever I try to convert the string "discount1" into the number "actualDiscount" in my program. Say you input "30|.15|0-Clothes" into the program:
product1 = raw_input("Enter the information for the first product>")
space1 = product1.find("|")
space2 = product1.rfind("|")
price1=product1[0:space1]
discount1=product1[space1+1:space2]
category1 = product1[space2+1:len(product1)]
actualPrice = int(price1)
actualDiscount = int(discount1)
printNow("Price is " + int(actualPrice))
printNow("Discount is " + int(actualDiscount) + " percent.")
Sometimes, depending on what I enter as input, it can manage to work all the way up to the printNows. Other times, it can't even get that far.
On top of it all, I have to do math with these integers-- subtract the price from the price multiplied by the discount percentage. I'm a little lost to say the least, and some help would be greatly appreciated.
First, it may be easier to parse your fields using split: data = product1.split( "|" )
Second, ".15" is not an int. Convert it using float().
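Putting both points together, a minimal sketch could look like the following. It is written for Python 3, so it uses print() rather than printNow, and it parses a fixed sample string instead of prompting with raw_input. The function name parse_product is my own, and converting the price with float() as well is an assumption that keeps the arithmetic in one numeric type.

```python
def parse_product(line):
    """Split 'price|discount|category' and convert the numeric fields."""
    price_text, discount_text, category = line.split("|")
    price = float(price_text)        # float() also accepts "30"
    discount = float(discount_text)  # ".15" is not a valid int, so use float()
    return price, discount, category


price, discount, category = parse_product("30|.15|0-Clothes")
final_price = price - price * discount  # price minus the discounted amount
# Numbers must be converted with str() before being joined to text
print("Price is " + str(price))
print("Discount is " + str(discount))
print("Final price is " + str(final_price))
```

If the input does not contain exactly two "|" characters, the unpacking raises ValueError, which is usually easier to diagnose than a wrong slice.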
I'm going to comment on your code and try to explain the errors and the modifications you can make.
product1 = raw_input("Enter the information for the first product>")
# Instead of find/rfind, you can split by '|'
Replace this:
space1 = product1.find("|")
space2 = product1.rfind("|")
price1=product1[0:space1]
discount1=product1[space1+1:space2]
category1 = product1[space2+1:len(product1)]
by this:
# You get a list: ['30', '.15', '0-Clothes']
data = product1.split("|")
After that, you have to apply data transformations:
# Price is an integer
actualPrice = int(data[0])
# Discount is a float
actualDiscount = float(data[1])
# Category
category = data[2]
And then, print the results:
print("Price is " + str(actualPrice))
print("Discount is " + str(actualDiscount) + " percent.")
Optimizing code
I think you want to generalize that code to process more than one item "a|b|c". You can use a list comprehension to simplify your code as follows:
# Data input
product1 = raw_input("Enter the information for the first product>")
# Data
data = [ [int(price),float(discount),category] for price, discount, category in [product1.split("|")]]
for actualPrice, actualDiscount, category in data:
    print("Price is " + str(actualPrice))
    print("Discount is " + str(actualDiscount) + " percent.")
    print("Final price is " + str(actualPrice - actualPrice * actualDiscount))
This is how a list comprehension works:
list_name = [ elements_you_want_to_get for list_element(s) in another_list ]
It creates a list by iterating over another_list and binding each element to the names given in list_element(s). Then, in elements_you_want_to_get, you can apply transformations to the data.
Explained element by element:
[product1.split("|")] : creates a list with the split data. We surround the split result with brackets to make a list of the form [[item1, item2, item3]], that is, a list whose elements each have 3 sub-elements. For the example, we now have:
[['30', '.15', '0-Clothes']]
Note that this list of lists, each with 3 sub-elements, is the another_list of the list comprehension I showed you.
for price, discount, category : for each element of another_list (in our case there is only ['30', '.15', '0-Clothes']), the loop names its sub-elements price, discount and category. At this point we have price='30', discount='.15' and category='0-Clothes'.
In the last step, the comprehension applies the transformations. In our case we want to convert price to an integer and discount to a float (to avoid incompatibilities, both could be converted to float).
After the comprehension, the result is:
[[price_1, discount_1, category_1], [price_2, discount_2, category_2], ...]
In our case, we only have one element, and the result would be:
data = [....]
[[30, 0.15, '0-Clothes']]
Note that the result list has two numeric elements and one string element (surrounded by quotes).

Add up the value of data[x] to data[x+1]

I have a long list of data that I am working with, containing 'timestamp' versus 'quantity' values. However, the timestamps in the list are not all in order (for example, timestamp[x] can be 140056 while timestamp[x+1] can be 560). I am not going to sort them. Instead, I want to add the value of timestamp[x] to timestamp[x+1] when this happens.
PS: The quantities need to stay in the same order as in the list when plotting.
I have been working on this using the following code, where timestamp is the name of the list that contains all the timestamp values:
for t in timestamp:
    previous = timestamp[t-1]
    increment = 0
    if previous > timestamp[t]:
        increment = previous
    t += increment
    delta = datetime.timedelta(0, (t - startTimeStamp) / 1000);
    timeAtT = fileStartDate + (delta + startTime)
    print("time at t=" + str(t) + " is: " + str(timeAtT));
    previous = t
However it comes out with TypeError: list indices must be integers, not tuples. May I know how to solve this, or any other ways of doing this task? Thanks!
The problem is that you're treating t as if it is an index of the list. In your case, t holds the actual values of the list, so constructions like timestamp[t] are not valid. You either want:
for t in range(len(timestamp)):
Or if you want both an index and the value:
for (t, value) in enumerate(timestamp):
When you write for t in timestamp, you are making t take on the value of each item in timestamp. But then you try to use t as an index to get previous. To do this, try:
for i, t in enumerate(timestamp):
    previous = timestamp[i - 1] if i > 0 else t
    current = t
Also when you get TypeErrors like this make sure you try printing out the intermediate steps, so you can see exactly what is going wrong.
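For the task itself, one possible reading is that the counter restarts from time to time and every later value should be shifted by the total reached so far. The sketch below follows that assumption, which the question does not confirm. The function name unwrap_timestamps and the sample values are hypothetical, and the datetime conversion from the question is left out.

```python
def unwrap_timestamps(timestamps):
    """Return a non-decreasing copy of `timestamps`, assuming the counter
    restarts whenever a raw value is smaller than the one before it."""
    offset = 0       # total carried over from earlier restarts
    previous = None  # last raw value seen
    result = []
    for raw in timestamps:
        if previous is not None and raw < previous:
            offset += previous  # carry the last value before the restart
        result.append(raw + offset)
        previous = raw
    return result


print(unwrap_timestamps([100, 140056, 560, 900, 20]))
# [100, 140056, 140616, 140956, 140976]
```

Because the function builds a new list in the original order, the matching quantity values can be plotted against it without rearranging anything.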

Grabbing values of an array in sets of 100

In the code below, ids is an array which contains the steam64 ids of all users in your friends list. According to the Steam Web API documentation, GetPlayerSummaries only takes a list of up to 100 comma-separated steam64 ids. Some users have more than 100 friends, and instead of running a for loop 200 times that calls the API each time, I want to take the array in sets of 100 steam ids. What would be the most efficient way to do this (in terms of speed)?
I know that I can do ids[0:100] to grab the first 100 elements of an array, but how do I accomplish this for a friends list of, say, 230 users?
def getDescriptions(ids):
    sids = ','.join(map(str, ids))
    r = requests.get('http://api.steampowered.com/ISteamUser/GetPlayerSummaries/v0002/?key='+API_KEY+'&steamids=' + sids)
    data = r.json()
    ...
Utilizing the code from this answer, you are able to break this into groups of 100 (or less for the last loop) of friends.
def chunkit(lst, n):
    newn = int(len(lst)/n)
    for i in xrange(0, n-1):
        yield lst[i*newn:i*newn+newn]
    yield lst[n*newn-newn:]
def getDescriptions(ids):
    friends = chunkit(ids, 3)
    while (True):
        try:
            fids = friends.next()
            sids = ','.join(map(str, fids))
            r = requests.get('http://api.steampowered.com/ISteamUser/GetPlayerSummaries/v0002/?key='+API_KEY+'&steamids=' + sids)
            data = r.json()
            # Do something with the data variable
        except StopIteration:
            break
This will create iterators broken into 3 (second parameter to chunkit) groups. I chose 3, because the base size of the friends list is 250. You can get more (rules from this post), but it is a safe place to start. You can fine tune that value as you need.
Utilizing this method, your data value will be overwritten each loop. Make sure you do something with it at the place indicated.
I have an easy alternative: just reduce your list size on each pass of the while loop until it is exhausted:
def getDescriptions(ids):
    sids = ','.join(map(str, ids))
    sids_queue = sids.split(',')
    data = []
    while len(sids_queue) != 0:
        r = requests.get('http://api.steampowered.com/ISteamUser/GetPlayerSummaries/v0002/?key=' +
                         API_KEY + '&steamids=' + ','.join(sids_queue[:100]))
        data.append(r.json())  # json() is a method in requests 1.0 and later
        # then drop the first 100 ids and reassign to sids_queue, you get the idea
        sids_queue = sids_queue[100:]
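Another option is to slice by a fixed batch size rather than into a fixed number of groups, so no batch can exceed 100 however long the friends list gets. The sketch below is for Python 3 and leaves the HTTP request out. The helper name chunks is my own, and the ids are placeholder integers rather than real steam64 ids.

```python
def chunks(items, size):
    """Yield consecutive slices of `items`, each holding at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


ids = list(range(230))  # stand-in for 230 steam64 ids
for batch in chunks(ids, 100):
    sids = ",".join(map(str, batch))
    # a real program would pass sids to the GetPlayerSummaries request here
    print(len(batch))  # prints 100, 100, 30
```

Slicing a list copies at most 100 references per batch, so the cost of chunking is negligible next to the network calls.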
