Python - .insert() method only replaces the first letter of word?

I've posted about this before and I've been able to whittle the program down to a single function for testing purposes. I'm not getting any errors, but I am getting a bug that is driving me up the wall. I'm able to replace the shortcut, but it only replaces the first letter of the shortcut.
Here's the code:
from tkinter import *
root = Tk()
text = Text(root)
text.pack(expand=1, fill=BOTH)
# This will be turned into a function to return the shortcut
# and the word, I'm only doing this for debugging purposes.
syntax = "shortcut = sc"
def replace_shortcut(event=None):
    tokens = syntax.split()
    word = tokens[:1]
    shortcut = tokens[2:3]
    index = '1.0'
    while 1:
        index = text.search(shortcut, index, stopindex="end")
        if not index: break
        last_idx = '%s + %dc' % (index, len(shortcut))
        text.delete(index, last_idx)
        text.insert(index, word)
        last_idx = '%s + %dc' % (index, len(word))
text.bind('<space>', replace_shortcut)
text.mainloop()
The shortcut given, in our case 'sc' will turn into 'shortcutc' after the space is typed. Any help is appreciated!

You have two problems.
The variable shortcut is the list ['sc'] rather than the string 'sc', because tokens[2:3] is a slice. So len(shortcut) is always 1 (the length of the list) rather than 2 (the length of the string), and you end up deleting just one character. You probably want len(shortcut[0]), or simply shortcut = tokens[2].
[You have the same problem with len(word): tokens[:1] is a list, so you always get 1.]
Also, the last line of your while loop should set index rather than last_idx, since index is the variable used in the next search.
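To make the two fixes concrete, here is a minimal sketch, not a drop-in replacement. The helper name end_index and the three-argument replace_shortcut signature are my own assumptions for illustration; text is assumed to be a tkinter Text widget like the one in the question.

```python
syntax = "shortcut = sc"
tokens = syntax.split()       # ['shortcut', '=', 'sc']

shortcut_list = tokens[2:3]   # slicing gives a list: ['sc']
word = tokens[0]              # indexing gives the string 'shortcut'
shortcut = tokens[2]          # indexing gives the string 'sc'


def end_index(index, s):
    """Build a Text index that lies len(s) characters after `index`."""
    return '%s + %dc' % (index, len(s))


def replace_shortcut(text, word, shortcut):
    """Replace every `shortcut` in a Text widget with `word` (hypothetical signature)."""
    index = '1.0'
    while True:
        index = text.search(shortcut, index, stopindex='end')
        if not index:
            break
        text.delete(index, end_index(index, shortcut))
        text.insert(index, word)
        # Continue searching after the text that was just inserted.
        index = end_index(index, word)


print(len(shortcut_list), len(shortcut))
```

In the original program you would call this from the <space> handler, for example with a lambda that passes the widget and the two strings.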

Related

CS50 'DNA': Ways to speed up my Week 6 'dna.py' program?

So for this problem I had to create a program that takes in two arguments. A CSV database like this:
name,AGATC,AATG,TATC
Alice,2,8,3
Bob,4,1,5
Charlie,3,2,5
And a DNA sequence like this:
TAAAAGGTGAGTTAAATAGAATAGGTTAAAATTAAAGGAGATCAGATCAGATCAGATCTATCTATCTATCTATCTATCAGAAAAGAGTAAATAGTTAAAGAGTAAGATATTGAATTAATGGAAAATATTGTTGGGGAAAGGAGGGATAGAAGG
My program works by first getting the "Short Tandem Repeat" (STR) headers from the database (AGATC, etc.), then counting the highest number of times each STR repeats consecutively within the sequence. Finally, it compares these counted values to the values of each row in the database, printing out a name if a match is found, or "No match" otherwise.
The program works for sure, but is ridiculously slow whenever it is run with the larger database provided, to the point where the terminal pauses for an entire minute before returning any output. Unfortunately, this causes the 'check50' marking system to time out and return a negative result when testing with this large database.
I'm presuming the slowdown is caused by the nested loops within the 'STR_count' function:
def STR_count(sequence, seq_len, STR_array, STR_array_len):
    # Creates a list to store max recurrence values for each STR
    STR_count_values = [0] * STR_array_len
    # Temp value to store current count of STR recurrence
    temp_value = 0
    # Iterates over each STR in STR_array
    for i in range(STR_array_len):
        STR_len = len(STR_array[i])
        # Iterates over each sequence element
        for j in range(seq_len):
            # Ensures it's still physically possible for STR to be present in sequence
            while (seq_len - j >= STR_len):
                # Gets sequence substring of length STR_len, starting from jth element
                sub = sequence[j:(j + (STR_len))]
                # Compares current substring to current STR
                if (sub == STR_array[i]):
                    temp_value += 1
                    j += STR_len
                else:
                    # Ensures current STR_count_value is highest
                    if (temp_value > STR_count_values[i]):
                        STR_count_values[i] = temp_value
                    # Resets temp_value to break count, and pushes j forward by 1
                    temp_value = 0
                    j += 1
        i += 1
    return STR_count_values
And the 'DNA_match' function:
# Searches database file for DNA matches
def DNA_match(STR_values, arg_database, STR_array_len):
    with open(arg_database, 'r') as csv_database:
        database = csv.reader(csv_database)
        name_array = [] * (STR_array_len + 1)
        next(database)
        # Iterates over one row of database at a time
        for row in database:
            name_array.clear()
            # Copies entire row into name_array list
            for column in row:
                name_array.append(column)
            # Converts name_array number strings to actual ints
            for i in range(STR_array_len):
                name_array[i + 1] = int(name_array[i + 1])
            # Checks if a row's STR values match the sequence's values, prints the row name if match is found
            match = 0
            for i in range(0, STR_array_len, + 1):
                if (name_array[i + 1] == STR_values[i]):
                    match += 1
            if (match == STR_array_len):
                print(name_array[0])
                exit()
        print("No match")
        exit()
However, I'm new to Python, and haven't really had to consider speed before, so I'm not sure how to improve upon this.
I'm not particularly looking for people to do my work for me, so I'm happy for any suggestions to be as vague as possible. And honestly, I'll value any feedback, including stylistic advice, as I can only imagine how disgusting this code looks to those more experienced.
Here's a link to the full program, if helpful.
Thanks :) x
Thanks for providing a link to the entire program. It seems needlessly complex, but I'd say it's just a lack of knowing what features are available to you. I think you've already identified the part of your code that's causing the slowness - I haven't profiled it or anything, but my first impulse would also be the three nested loops in STR_count.
Here's how I would write it, taking advantage of the Python standard library. Every entry in the database corresponds to one person, so that's what I'm calling them. people is a list of dictionaries, where each dictionary represents one line in the database. We get this for free by using csv.DictReader.
To find the matches in the sequence, for every short tandem repeat in the database, we create a regex pattern (the current short tandem repeat, repeated one or more times). If there is a match in the sequence, the total number of repetitions is equal to the length of the match divided by the length of the current tandem repeat. For example, if AGATCAGATCAGATC is present in the sequence, and the current tandem repeat is AGATC, then the number of repetitions will be len("AGATCAGATCAGATC") // len("AGATC") which is 15 // 5, which is 3.
count is just a dictionary that maps short tandem repeats to their corresponding number of repetitions in the sequence. Finally, we search for a person whose short tandem repeat counts match those of count exactly, and print their name. If no such person exists, we print "No match".
import argparse
import re
import sys
from csv import DictReader


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("database_filename")
    parser.add_argument("sequence_filename")
    args = parser.parse_args()

    with open(args.database_filename, "r") as file:
        reader = DictReader(file)
        # The first column is the person's name; the rest are STR headers.
        short_tandem_repeats = reader.fieldnames[1:]
        people = list(reader)

    with open(args.sequence_filename, "r") as file:
        sequence = file.read().strip()

    count = dict.fromkeys(short_tandem_repeats, 0)
    for short_tandem_repeat in short_tandem_repeats:
        # re.search would stop at the leftmost run, which is not always the
        # longest one. The lookahead tries every start position instead.
        pattern = f"(?=((?:{re.escape(short_tandem_repeat)})+))"
        runs = [len(m.group(1)) for m in re.finditer(pattern, sequence)]
        count[short_tandem_repeat] = max(runs, default=0) // len(short_tandem_repeat)

    try:
        person = next(
            person for person in people
            if all(int(person[k]) == count[k] for k in short_tandem_repeats)
        )
        print(person["name"])
    except StopIteration:
        print("No match")
    return 0


if __name__ == "__main__":
    sys.exit(main())
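One caveat with the regex idea is worth showing in isolation: re.search returns only the leftmost match, so if a short run appears before a longer one, the count comes out too low. The sketch below is my own illustration, not part of the original program; the helper name longest_run and the sample string are made up for the example. It scans every start position with a lookahead and keeps the longest run.

```python
import re


def longest_run(sequence, unit):
    """Return the largest number of back-to-back copies of `unit` in `sequence`."""
    # The lookahead is zero-width, so finditer tries every start position;
    # group 1 captures the greedy run of whole units beginning there.
    pattern = f"(?=((?:{re.escape(unit)})+))"
    longest = max((len(m.group(1)) for m in re.finditer(pattern, sequence)), default=0)
    return longest // len(unit)


# A single AGATC comes first, then a run of three.
sample = "AGATCTTAGATCAGATCAGATCG"
first_only = len(re.search("(AGATC)+", sample).group()) // len("AGATC")
print(first_only)                    # 1: only the leftmost run is seen
print(longest_run(sample, "AGATC"))  # 3: the longest run anywhere
```

The lookahead re-examines characters inside long runs, so it does a little more work than a single pass, but it still avoids the hand-written nested loops.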

Amateur Text Editor Malfunctioning

My brother and I are creating a simple text editor that changes entries to pig latin using Python. Code below:
our_word = ("cat")
vowels = ("a","e","i","o","u")
#remember I have to compare variables not strings
way = "way"
for i in range(len(our_word)):
    for j in range (len(vowels)):
        #checking if there is any vowel present
        if our_word[i] == vowels[j]:
            # if there were to be any vowels our_word[i] wil now be changed with way
            #.replace is our function the dot is what notates this in the python library
            our_word = our_word.replace(our_word[i], way)
print(our_word)
Right now we're testing the word 'cat' but the program when run returns the following:
/Users/x/PycharmProjects/pythonProject3/venv/bin/python /Users/x/PycharmProjects/pythonProject3/main.py
cwwayyt
Process finished with exit code 0
We're not sure why there is a double 'w' and a double 'y'. It seems the word 'cat' is edited once to 'cwayt' and then a second time to 'cwwayyt'.
Any suggestions are welcome!
The problem arises because, on the next iteration of the for loop after doing the substitution, you are looking at the next position, which is part of the way string that you just substituted in. Instead, you need to skip past it. (Note also that replace substitutes every occurrence of that character, not just the one at position i.) You would also hit another problem: the loop only runs up to the original length rather than the new, increased length. In this situation you are probably better off using a while loop with an index variable that you can manipulate to point to the correct place as needed. For example:
our_word = "cat"
vowels = "aeiou"
way = "way"
i = 0
while i < len(our_word):
    if our_word[i] in vowels:
        our_word = our_word[:i] + way + our_word[i + 1:]
        i += len(way)  # <=== if you made a substitution, skip over the bit
                       # that you just substituted in place
    else:
        i += 1  # <=== if you didn't make any substitution
                # just go to the next position next time
print(our_word)
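A different way to avoid rescanning the inserted text, if you don't need the index afterwards, is to build a new string in a single pass over the original word. This is only a sketch of that alternative; the function name replace_vowels and its parameters are my own, not from the question.

```python
def replace_vowels(word, replacement="way", vowels="aeiou"):
    """Return `word` with every vowel swapped for `replacement`."""
    # Each character of the original word is examined exactly once, so text
    # that was just inserted is never scanned again.
    return "".join(replacement if ch in vowels else ch for ch in word)


print(replace_vowels("cat"))  # cwayt
```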

How to do this recursion method to print out every 3 letters from a string on a separate line?

I'm making a method that takes a string, and it outputs parts of the strings on separate line according to a window.
For example:
I want to output every 3 letters of my string on separate line.
Input : "Advantage"
Output:
Adv
ant
age
Input2: "23141515"
Output:
231
141
515
My code:
def print_method(input):
    mywindow = 3
    start_index = input[0]
    if(start_index == input[len(input)-1]):
        exit()
    print(input[1:mywindow])
    printmethod(input[mywindow:])
However I get a runtime error.... Can someone help?
I think this is what you're trying to get. Here's what I changed:
Renamed input to input_str. input is a built-in function in Python, so using it as a variable name shadows the built-in.
Added the missing _ in the recursive call to print_method.
Print from 0:mywindow instead of 1:mywindow (which would skip the first character). When you start at 0, you can also just say :mywindow to get the same result.
Changed the exit statement (was that sys.exit?) to a return (probably what is wanted) and changed the if condition to return once an empty string is given as the input. The last string printed might be shorter than 3 characters; if you don't want that, you could instead use if len(input_str) < 3: return
def print_method(input_str):
    mywindow = 3
    if not input_str:  # or you could do if len(input_str) == 0
        return
    print(input_str[:mywindow])
    print_method(input_str[mywindow:])
Edit: sorry, I missed the title. If this is not a learning exercise for recursion, you shouldn't use recursion here, because it is less efficient and creates a new slice of the remaining string on every call.
def chunked_print(string, window=3):
    for i in range(0, len(string) // window + 1):
        print(string[i*window:(i+1)*window])
This will work if the window size doesn't divide the string length, but print an empty line if it does. You can modify that according to your needs
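If the empty trailing line is a problem, one possible variation is to step the range by the window size instead of counting windows. A small sketch follows; the name chunks is my own, and it returns the pieces rather than printing them so they are easy to check.

```python
def chunks(string, window=3):
    """Return consecutive slices of `string`, each at most `window` long."""
    # Stepping the range by `window` stops exactly at the end of the string,
    # so no empty trailing slice is produced.
    return [string[i:i + window] for i in range(0, len(string), window)]


for chunk in chunks("Advantage"):
    print(chunk)
```

When the length is not a multiple of the window, the last piece is simply shorter.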

Recursive function that prints the letters of the alphabet one per line, one per function call

Well, I am still learning python and trying to print letters of alphabet one per line, one per function call. This should be done using recursion.
Here I am struggling due to error. Just need another pair of eyes to see if there is something missing.
def recursive_print(cursor):
alphabet = 'abcdefghijklmnopqstuvwxyz'
index = len(alphabet) - cursor
if index > 0
recursive_print(index - 1)
letter = alphabet[index]
print letter
print recursive_print(0)
Error below:
NameError: name 'index' is not defined
sh-4.3$ python main.py
File "main.py", line 4
if index > 0:
^
Any Pointers will be very helpful.
To solve your immediate problem, you haven't indented properly. You're also missing a colon on the if statement you posted. Try this:
def recursive_print(curser):
    alphabet = 'abcdefghijklmnopqstuvwxyz'
    index = len(alphabet) - curser
    if index > 0:
        recursive_print(index - 1)
    letter = alphabet[index]
    print letter

print recursive_print(0)
Next, what you have to worry about is the infinite recursion from not handling your index properly. I believe that the problem is a trivial mental error: change the recursion to
recursive_print(curser + 1)
This still gives you an index out of range on the base case, but I expect you can fix that.
BTW, the word is spelled "cursor", in case you care.
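For reference, here is one way the finished function could look once the base case is handled. This is a sketch written for Python 3 (the code above uses Python 2 print statements), and the alphabet and emit parameters are my own additions so the output can be captured; they are not part of the original question.

```python
import string


def recursive_print(cursor=0, alphabet=string.ascii_lowercase, emit=print):
    """Emit one letter per call, starting at position `cursor`."""
    # Base case: the cursor has moved past the last letter.
    if cursor >= len(alphabet):
        return
    emit(alphabet[cursor])
    recursive_print(cursor + 1, alphabet, emit)


recursive_print()
```

Using the cursor directly as the index avoids the len(alphabet) - cursor arithmetic, which is where the out-of-range access came from.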

Variable updating one step behind in list searcher in Tkinter

I'm trying to build a search engine that will check a list and then remove all list items that do not meet the search parameters. I know there are several problems with my program: it will not add things back to the list when you backspace, and in my updating for loop I simply tack on a '*', thinking that it will match only strings beginning with the current parameters. I will cross those bridges later.
class StudentFinderWindow(Tkinter.Toplevel):
    def __init__(self):
        Tkinter.Toplevel.__init__(self) # Create Window
        searchResultList = ['student1', 'student2', 'student3'] # Test list.
        ##### window attributes
        self.title('Edit Students') #sets window title.
        ##### Puts stuff into the window.
        # text
        editStudentInfoLabel = Tkinter.Label(self,text='Select the student from the list below or search for one in the search box provided')
        editStudentInfoLabel.grid(row=0, column=0)
        # Entry box
        self.searchRepositoryEntry = Tkinter.Entry(self)
        self.searchRepositoryEntry.grid(row=1, column=0)
        # List box
        self.searchResults = Tkinter.Listbox(self)
        self.searchResults.grid(row=2, column=0)
This fills the Tkinter Listbox with the original list.
        # Search results initial updater.
        self.getStudentList()
        for student in self.studentList:
            self.searchResults.insert(Tkinter.END, student)
        ##### Event handler
Right here I bind to run the list updater after a key is entered into the search box
        self.searchRepositoryEntry.bind('<Key>', self.updateSearch)
This is supposed to run every time a key is pressed. It gets the string in the Entry, then starts a counter so I know which index each name is at. After that it runs a for loop over the current list, checking whether each item matches the parameters followed by any other letters; if an item does not match, it should be deleted. The problem is that the first time I hit a letter, the parameters string is empty, on the next letter the string is only the first letter, and so on. It is always one step behind.
    def updateSearch(self, event):
        parameters = self.searchRepositoryEntry.get()
        int = 0
        currentList = self.searchResults.get(0, Tkinter.END)
        for i in currentList:
            if not i == parameters + '*':
                self.searchResults.delete(int)
            int += 1
    def getStudentList(self):
        global fileDirectory # Gets the directory that all the files are in.
        fileList = listdir(fileDirectory)
        self.studentList = []
        for file in fileList:
            self.studentList.append(file[:-4])
I believe I have run into this same problem you describe before, when attempting to make an actively searching ctrl-F feature in one of my programs.
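What I found to work is to bind on KeyRelease instead of Key. The likely reason is that a widget-level <Key> binding runs before the Entry's own class binding inserts the character, so get() still returns the old text; by the time <KeyRelease> fires, the character is already in the widget.
Snippets: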
The binding
# self.FW.En is an entry widget.
self.FW.En.bind('<KeyRelease>', self.find)
Which would run
def find (self, event):
    self.Te.tag_remove('found', '1.0', 'end')
    pat = self.FW.En.get()
    if len(pat) > 1:
        index = '1.0'
        while True:
            index = self.Te.search(pat, index, nocase=1, stopindex='end')
            if not index:
                break
            lastidex = '%s+%dc' % (index, len(pat))
            self.Te.tag_add('found', index, lastidex)
            index = lastidex
        self.Te.tag_config('found', background='#80ff00')
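On the two issues the question postponed (the '*' comparison and items not coming back on backspace), one possible approach is to rebuild the listbox from the full list on every KeyRelease instead of deleting from the current contents. The sketch below is only an illustration under that assumption; filter_students and refresh are hypothetical names, and listbox is assumed to be a Tkinter Listbox like self.searchResults.

```python
def filter_students(all_students, query):
    """Return the names in `all_students` that start with `query`."""
    # An empty query matches everything, since every string starts with "".
    return [name for name in all_students if name.startswith(query)]


def refresh(listbox, all_students, query):
    """Repopulate a Listbox with the names that match `query` (hypothetical helper)."""
    listbox.delete(0, "end")
    for name in filter_students(all_students, query):
        listbox.insert("end", name)


print(filter_students(["student1", "student2", "alice"], "stu"))
```

Because the list is rebuilt from the full set of names each time, shortening the query with backspace brings the removed names back.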
