Iterating a string and replacing an element in python


I am attempting to search through two strings looking for matching elements. If the strings have two elements in common that are in different positions, I want to make that element in the 'guess' string a COW. If the strings have two elements in the same position, the element is a BULL.
Here is what I have:
if index(number,i) in guess and not index(guess,i) == index(guess,i):
    replace(index(guess,i),'COW')
if index(guess,i) == index(number,i):
    replace(index(guess,i),'BULL')
I'm not sure if I'm using index correctly.

First off, you need to be using index() and replace() as string methods, like Martijn said in a comment.
This would be like so: guess.index(i) to find the index of i in the string guess.
You might want to check out find() which will do the same as index() but won't raise an exception when the substring is not found.
Also note that you are seeing if the result of index() is in the string guess. That is an error, since an integer cannot be in a string! index() returns an integer!
Then consider that you are stating ... and not guess.index(i) == guess.index(i): (I fixed the index code) which makes no sense, since of course they are equal! They are the same thing!
Lastly, you are using replace incorrectly.
From the documentation, replace takes a string as the first argument - not an index! Try using it like so: guess = guess.replace(i, 'BULL'). That will change guess to have all occurrences of i replaced by the string 'BULL'.
I wasn't concerned with your actual algorithm here, just your basic errors.
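To tie these corrections together, here is a minimal sketch of the comparison itself. It assumes number and guess are strings of equal length, that a BULL is a matching character in the same position, and that a COW is a character of guess that occurs elsewhere in number. The function name score_guess is made up for illustration. Because strings are immutable, the sketch builds a list of labels rather than replacing characters in place.

```python
def score_guess(number, guess):
    """Label each character of guess as 'BULL', 'COW', or itself."""
    labels = []
    for n_char, g_char in zip(number, guess):
        if g_char == n_char:
            labels.append('BULL')      # same character, same position
        elif g_char in number:
            labels.append('COW')       # present, but in another position
        else:
            labels.append(g_char)      # no match: keep the character
    return labels

print(score_guess('1234', '1359'))
```

Pairing the characters with zip() avoids index() entirely, which also sidesteps the problem that index() only ever reports the first occurrence of a repeated character.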

I wouldn't use the index() method. Instead, I would turn the string's elements into a list, then say:
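listOne = ['hello', 'goodbye', 'adios', 'shalom']
listTwo = ['hello', 'adios', 'arrivederci']

def cowbull(L1, L2):
    for i in range(len(L1)):
        if L1[i] in L2:
            if i < len(L2) and L1[i] == L2[i]:
                # same element in the same position
                L1[i] = 'BULL'
                L2[i] = 'BULL'
            else:
                # same element in a different position
                j = L2.index(L1[i])  # find the match before overwriting L1[i]
                L1[i] = 'COW'
                L2[j] = 'COW'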
This is just how I would do it; your approach, with William's corrections, may work just as well. I am used to doing it this way, and although it may not be as efficient as his, it usually works well.

Related

Python list - string formatting as list indices

Depending on a condition I need to get a value from one or another function. I'm trying to put it inside a simple If ... Else statement. I tried to use %s string formatting but it won't work. Below code, so it will become more clear what I try to do:
if condition:
    item = my_list['%s']
else:
    item = my_other_list['%s']
# now I do something with this value:
print item % 3
This way I tried to print the 3rd value of one list or the other, depending on whether the condition was True or False. This returned an error about list indices being strings. So I tried to put it inside int(), which didn't help.
How should I do it? The problem is I get the value later than I declare what item is.
EDIT
I will add some more info here:
I have a for loop that goes through ~1000 elements and processes them. If the condition is True, it calls one function; if it is False, it calls another. I don't want to check the same condition 1000 times, because I know it won't change during that time. I would like to check it once and apply the chosen method to all of the elements.
More code:
if self.dlg.comboBox_3.currentIndex == 0:
    item = QCustomTableWidgetItem(str(round((sum(values['%s'])/len(values['%s'])),2)))
else:
    item = QCustomTableWidgetItem(str(round(sum(values['%s'],2))))
for row in range(len(groups)):
    group = QTableWidgetItem(str(groups[row]))
    qTable.setItem(row,0,group)
    qTable.setItem(row,1,item % row)
This is the actual code. Note the '%s' and '% row'. I used a simplified version before so as not to distract from the actual problem, but I think the full code is needed. I'm sorry if that wasn't a good decision.

You have a reasonably large misconception about how list indexing works. The indexing always happens at the point where you write it, so inside the if statement itself Python will try to index one of the lists with the literal string "%s", which can't possibly work.
There is no need to do this. You can just assign the list as the output from the if statement, and then index that directly:
if condition:
    list_to_slice = my_list
else:
    list_to_slice = my_other_list
# now I do something with this value:
print list_to_slice[3]

Short answer:
'%s' is a string by definition, while a list index should be an integer by definition.
Use int(string) if you are sure the string can be an integer (if not, it will raise a ValueError)
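As a small illustration of that point, the sketch below uses a made-up list and a made-up index_text value. It shows int() turning a numeric string into a usable index, and the ValueError raised for a string such as '%s' that is not a number.

```python
my_list = ['a', 'b', 'c', 'd']     # sample data for illustration

index_text = '3'                   # e.g. a value read from user input
print(my_list[int(index_text)])    # int('3') gives 3, a valid list index

try:
    int('%s')                      # not a number, so int() raises ValueError
except ValueError as err:
    print('cannot convert:', err)
```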

A list is made up of multiple data values that are referenced by an index.
So if I defined my list like so:
my_list = ['apples', 'orange', 'peaches']
If I want to reference something in the list I do it like this:
print(my_list[0])
The expected output for this line of code would be "apples".
To actually add something new to a list you need to use a built-in method of the list object, which looks something like this:
my_list.append("foo")
The new list would then look like this:
['apples', 'orange', 'peaches', 'foo']
I hope this helps.

I'd suggest wrapping it in a function like this:
def get_item(index, list1, list2):
    if condition:
        return list1[index]
    else:
        return list2[index]

print get_item(3, my_list, my_other_list)

Here is a compact way to do it:
source = my_list if condition else my_other_list
print(source[2])
This binds a variable source to either my_list or my_other_list depending on the condition. Then the 3rd element of the selected list is accessed using an integer index. This method has the advantage that source is still bound to the list should you need to access other elements in the list.
Another way, similar to yours, is to get the element directly:
index = 2
if condition:
    item = my_list[index]
else:
    item = my_other_list[index]
print(item)
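For the situation described in the question's edit (check the condition once, then process ~1000 rows), one option is to pick a function once and call it inside the loop. The sketch below assumes values holds one list of numbers per row. The names mean_of, total_of and use_mean are hypothetical stand-ins for the two calculations and the combo box check.

```python
def mean_of(numbers):
    return round(sum(numbers) / len(numbers), 2)

def total_of(numbers):
    return round(sum(numbers), 2)

values = [[1.0, 2.0, 4.0], [10.0, 20.0], [0.5]]   # sample data, one list per row
use_mean = True                                    # stands in for the real condition

# Decide once which function to use...
summarize = mean_of if use_mean else total_of

# ...then apply it to every row without re-checking the condition.
results = [summarize(row) for row in values]
print(results)
```

Functions are ordinary objects in Python, so binding one to a name defers the calculation until the row is known, which is what the '%s' placeholder was trying to achieve.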

How can I search for common elements in two integers with a while loop

In my code I am having a problem because I cannot compare two lists the way I want. What I am trying to do is compare the elements at the first index of both inputs and, if they are not the same, move on to the next index of the longer input (GUESS1). After finishing the comparisons for the first element, I want to compare the second one. What I mean is first checking (A-C)(A-A)(A-T), then (C-A)(C-T), and then (T-T).
I want the resulting list to be (A,T), because of the ATT part of GUESS1.
However, I am stuck because I always find ACT, not A and T.
Where am I going wrong? I would be very glad if you could enlighten me.
Edit:
What I am trying to do is look for the best similarity within the longer list GUESS1 and find the most similar sublist, ATT.
GUESS1="CATTCG"
GUESS2="ACT"
if len(str(GUESS1))>len(str(GUESS2)):
    DNA_input_list=list((GUESS1))
    DNA_input1_list=list((GUESS2))
common_elements=[]
i=0
while i<len(DNA_input1_list)-1:
    j=0
    while j<len(DNA_input_list)-len(DNA_input1_list):
        if DNA_input_list[i] == DNA_input1_list[j]:
            common_elements.append(DNA_input1_list[j])
            i+=1
        j+=1
        if j>len(DNA_input1_list)-1:
            break
print(common_elements)

As far as I understand, you want to find a shorter string inside a longer string and, if it is not found, remove an element from the shorter string and repeat the search.
You can use the string find method in Python for that, e.g. "CATTCG".find('ACT'). This call returns -1 because there is no substring ACT. What you can then do is remove an element from the shorter string using the slice operator [::] and repeat the search like this:
>>> for x in range(len('ACT')):
...     if "CATTCG".find('ACT'[x:]) > -1:
...         print("CATTCG".find('ACT'[x:]))
...         print("Match found for " + 'ACT'[x:])
In the code here, a range of offsets is generated first, i.e. [0, 1, 2]. This is the number of characters we are going to slice off from the beginning.
In second line we do the slicing with 'ACT'[x:] (for x==0, we get 'ACT', for x == 1, we get 'CT' and for x==2, we get 'T').
The last two lines print out the position and the string that matched.

If I have understood everything correctly, you want to return the longest substring of GUESS2 which is included in GUESS1.
I would use something like this.
common_elements = ''
for count in range(1, len(GUESS2) + 1):  # prefix lengths 1..len(GUESS2)
    if GUESS2[:count] in GUESS1:
        common_elements = GUESS2[:count]
print(common_elements)  # if a function, return common_elements
The loop runs once for each prefix length of the search string.
On each pass it checks whether that prefix is included in the other string.
If so, the prefix is saved to a variable, which is printed or returned after the loop has finished.
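Another reading of the question is that the shorter string should be slid along the longer one, and the window with the most position-by-position matches should win (ATT for these inputs). Here is a rough sketch of that idea. The function name best_window and its return shape are assumptions made for illustration, not something taken from the question.

```python
def best_window(longer, shorter):
    """Return (start, window, matches) for the best-aligned window."""
    best = (0, '', [])
    for start in range(len(longer) - len(shorter) + 1):
        window = longer[start:start + len(shorter)]
        # Characters that agree position by position
        matches = [w for w, s in zip(window, shorter) if w == s]
        if len(matches) > len(best[2]) or best[1] == '':
            best = (start, window, matches)
    return best

start, window, matches = best_window("CATTCG", "ACT")
print(start, window, matches)
```

For "CATTCG" and "ACT" the windows are CAT, ATT, TTC and TCG. ATT agrees with ACT in two positions (A and T), which is the (A,T) result the question asks for. Ties are resolved in favour of the earliest window.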

Python: User input as a list index

I am trying to use user input as an index for a list, but I keep getting the error "TypeError: list indices must be integers, not tuple." Here is what I have:
def sort(j, k):
    sublist = list[j, k]
    print sublist
    sorted = sublist.sort
    print sorted
operation = raw_input()
sort(operation[5], operation[7])
The user is supposed to input
SORT 3 5
and a subset of the original list will be sorted.

Your (immediate) problem is at this line:
sublist = list[j, k]
Presumably list is a list of items [1]. When you do somelist[a, b], Python sees something equivalent to somelist[(a, b)]. So, you can see, you're indexing somelist with a tuple (which doesn't work). Chances are that you want a slice. In that case, you'll do:
sublist = list[j:k]
Even after making this change, however, you'll still have problems. Notably, j and k in your code are of type str, and lists want to be indexed/sliced with integers (or None...) [2]. So, now we have:
sublist = list[int(j):int(k)]
At this point, you might stop seeing errors, but you won't see the results you want which brings us to the next problem.
sorted = sublist.sort
Here you're just assigning a bound method to a name. You're not actually sorting anything. If you want to sort the sublist (in place), you'd do:
sublist.sort()
print(sublist)
If you are ok with sorting it out of place, you can use the builtin sorted function (provided you haven't named something else sorted ;-)
print(sorted(sublist))
[1] Note, it is generally accepted that naming a variable the same thing as a builtin type can lead to hard to read and debug code :-).
[2] While we're at it, I might mention there is a better way to chunk up your string: you can .split it. E.g. operation.split() will give you ['SORT', '3', '5'] rather than needing to make assumptions about the input and indexing the input string.
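Putting the pieces together, a minimal Python 3 sketch might look like the following. It assumes the command has the form "SORT j k" and that j and k are meant as slice bounds. The names sort_range and data are hypothetical and the sample list is not from the question.

```python
def sort_range(items, command):
    """Return a sorted copy of items[j:k] for a command like 'SORT 3 5'."""
    parts = command.split()            # e.g. ['SORT', '3', '5']
    if len(parts) != 3 or parts[0] != 'SORT':
        raise ValueError('expected a command like "SORT 3 5"')
    j, k = int(parts[1]), int(parts[2])
    return sorted(items[j:k])          # sorted() returns a new list

data = [9, 4, 7, 1, 8, 3, 6]           # sample list for illustration
print(sort_range(data, 'SORT 1 5'))
```

Using split() instead of operation[5] and operation[7] means multi-digit indices and extra spaces no longer break the parsing.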

You have a few problems here:
Your function is called sort, which is also the name of a built-in list method.
You are not calling the method in this line: sorted = sublist.sort (it's missing the ()).
You are giving single characters from the input as arguments to your function.
This: list[j,k] is what is causing your problem, because j,k is a tuple.
sort is an in-place operation, so it will return None, which is what you will end up printing.
To fix these issues:
def my_sorter(j, k): # Changed method name
    sublist = my_list[int(j):int(k)] # You need j:k
    sublist.sort() # Note, no return value, because it's in-place
    print sublist
user_input = raw_input('Please enter the indices: ')
j,k = user_input.split()
my_sorter(j,k)

Python: List item is empty, code to detect if it is and then put in a place holder value?

Hey I'm writing a program that receives a broadcast from Scratch and then determines based on the broadcast, where to proceed. The code turns the broadcast(list item) into a string and then breaks that string into a list using .split(). The only problem is the broadcast may only be 1 word instead of 2. Is there a way to check if one of the list items from .split() is empty and then change it to a place holder value?
Where I am having trouble
scratchbroadcast = str(msg[1])
BroadcastList = scratchbroadcast.split()
#starts the switch statement that interprets the message and proceeds
#to the appropriate action
v = BroadcastList[0]
w = BroadcastList[1]
if BroadcastList[1] == '':
    w = "na"

If BroadcastList contains only one word then BroadcastList will be a single-element list, e.g.
>>> "foo".split()
['foo']
Obviously we can't check whether the second item in the list is an empty string ''; there isn't a second element. Instead, check the length of the list:
w = "na" if len(BroadcastList) == 1 else BroadcastList[1]
Alternatively, use try to catch the IndexError (it's easier to ask for forgiveness than permission):
try:
    w = BroadcastList[1]
except IndexError:
    w = "na"

Okay, first consider this: how about the third item? Or the fourth? Or the forty-second?
If the string doesn't contain a splitter character (e.g. a space), you wouldn't end up with a list of two items, one of which is blank -- you would end up with a list of only one item.
In Python, the length of something is generally obtained through the built-in len() function:
len([]) # == 0
len(["foo"]) # == 1
len(["foo", "bar"]) # == 2
Therefore, you would do:
if len(broadcast_list) == 1:
    broadcast_list += [""]
Other ways of doing the same thing include broadcast_list.append("") and broadcast_list.extend([""]). Which one to use is completely up to you; += and .extend are more or less equivalent while .append can only add a single element.
Looking at the rest of your code, your case calls won't work like you expect them to: in Python, strings are truthy, so 'string' or 'otherString' is basically the same as True or True. or is strictly a boolean operator and you can't use it for 'either this or that'.
Python is notorious for not having a switch statement. Your attempt at implementing one would actually be kind of cute had you gone through with it -- something like that can be a pretty good exercise in Python OOP and passing functions as first-class objects. (In my day-to-day use of Python I hardly ever need to do something like that, but it's great to have it in your conceptual toolkit.)
You might be happy to learn that Python strings have a lower method; with it, your code would end up looking something like this:
v = broadcast_list[0].lower()
if v == 'pilight':
    # ...
elif v == 'motor':
    # ...
elif v == 'camera':
    # ....
On a side note, you might want to have a look at PEP 8, which is the de facto standard for formatting Python code. If you want other people to be able to quickly figure out your code, you should conform at least to its most basic propositions, such as classes being CamelCased and variables in lowercase, rather than the other way around.
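The "functions as first-class objects" idea mentioned above can be sketched with a dictionary that maps command words to handler functions. The handler names, the HANDLERS table and the returned strings below are all invented for illustration. Only the 'pilight' and 'motor' command words and the "na" placeholder come from this thread.

```python
def handle_pilight(arg):
    return 'pilight:' + arg

def handle_motor(arg):
    return 'motor:' + arg

def handle_unknown(arg):
    return 'unknown command'

# Maps the first word of a broadcast to the function that handles it.
HANDLERS = {'pilight': handle_pilight, 'motor': handle_motor}

def dispatch(broadcast):
    words = broadcast.split()
    if not words:
        return handle_unknown('na')
    command = words[0].lower()
    arg = words[1] if len(words) > 1 else 'na'   # placeholder when only one word
    handler = HANDLERS.get(command, handle_unknown)
    return handler(arg)

print(dispatch('Motor left'))
print(dispatch('pilight'))
```

Adding a new command then only requires a new function and one more dictionary entry, rather than another elif branch.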

Check if string in strings

I have a huge list containing many strings like:
['xxxx','xx','xy','yy','x',......]
Now I am looking for an efficient way to remove all strings that are present within another string. For example, 'xx' and 'x' both occur in 'xxxx'.
As the dataset is huge, I was wondering if there is an efficient method for this besides
if a in b:
The complete code: With maybe some optimization parts:
for x in range(len(taxlistcomplete)):
    if delete == True:
        x = x - 1
        delete = False
    for y in range(len(taxlistcomplete)):
        if taxlistcomplete[x] in taxlistcomplete[y]:
            if x != y:
                print x,y
                print taxlistcomplete[x]
                del taxlistcomplete[x]
                delete = True
                break
    print x, len(taxlistcomplete)
An updated version of the code:
for x in enumerate(taxlistcomplete):
    if delete == True:
        #If element is removed, I need to step 1 back and continue looping.....
        delete = False
    for y in enumerate(taxlistcomplete):
        if x[1] in y[1]:
            if x[1] != y[1]:
                print x[1],y[1]
                print taxlistcomplete[x]
                del taxlistcomplete[x[0]]
                delete = True
                break
    print x, len(taxlistcomplete)
This is now implemented with enumerate, but I am wondering whether it is more efficient, and how to implement the delete step so that I have less to search through as well.
Just a short thought...
Basically, what I would like to see is this:
if an element does not match any other element in the list, write it to a file.
Thus if 'xxxxx' not in 'xx','xy','wfirfj',etc... print/save
A new, simple version, as I don't think I can optimize it much further anyway...
print 'comparison'
file = open('output.txt','a')
for x in enumerate(taxlistcomplete):
    delete = False
    for y in enumerate(taxlistcomplete):
        if x[1] in y[1]:
            if x[1] != y[1]:
                taxlistcomplete[x[0]] = ''
                delete = True
                break
    if delete == False:
        file.write(str(x))

x in <string> is fast, but checking each string against all other strings in the list will take O(n^2) time. Instead of shaving a few cycles by optimizing the comparison, you can achieve huge savings by using a different data structure so that you can check each string in just one lookup: For two thousand strings, that's two thousand checks instead of four million.
There's a data structure called a "prefix tree" (or trie) that allows you to very quickly check whether a string is a prefix of some string you've seen before. Google it. Since you're also interested in strings that occur in the middle of another string x, index all substrings of the form x, x[1:], x[2:], x[3:], etc. (So: only n substrings for a string of length n). That is, you index substrings that start in position 0, 1, 2, etc. and continue to the end of the string. That way you can just check if a new string is an initial part of something in your index.
You can then solve your problem in O(n) time like this:
Order your strings in order of decreasing length. This ensures that no string could be a substring of something you haven't seen yet. Since you only care about length, you can do a bucket sort in O(n) time.
Start with an empty prefix tree and loop over your ordered list of strings. For each string x, use your prefix tree to check whether it is a substring of a string you've seen before. If not, add its substrings x, x[1:], x[2:] etc. to the prefix tree.
Deleting in the middle of a long list is very expensive, so you'll get a further speedup if you collect the strings you want to keep into a new list (the actual string is not copied, just the reference). When you're done, delete the original list and the prefix tree.
If that's too complicated for you, at least don't compare everything with everything. Sort your strings by size (in decreasing order), and only check each string against the ones that have come before it. This will give you a 50% speedup with very little effort. And do make a new list (or write to a file immediately) instead of deleting in place.
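The prefix-tree approach described above can be sketched with nested dictionaries. This is an illustrative sketch rather than a tuned implementation. The function name remove_contained is made up, sorted() is used in place of the bucket sort, and exact duplicates are dropped along with true substrings.

```python
def remove_contained(strings):
    """Keep only strings that do not occur inside an already kept string."""
    trie = {}     # nested dicts: one level per character
    kept = []
    # Longest first, so a string can only be contained in one already seen.
    for s in sorted(strings, key=len, reverse=True):
        node = trie
        found = True
        for ch in s:
            if ch not in node:
                found = False
                break
            node = node[ch]
        if found:
            continue  # s starts some indexed suffix, so it is a substring
        kept.append(s)
        # Index every suffix s[0:], s[1:], s[2:], ...
        for start in range(len(s)):
            node = trie
            for ch in s[start:]:
                node = node.setdefault(ch, {})
    return kept

result = remove_contained(['xxxx', 'xx', 'xy', 'yy', 'x'])
print(result)
```

Each lookup walks at most len(s) levels of the tree, regardless of how many strings have been indexed. The cost is the memory needed to store every suffix of every kept string.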

Here is a simple approach, assuming you can identify a character (I will use '$' in my example) that is guaranteed not to be in any of the original strings:
result = ''
for substring in sorted(taxlistcomplete, key=len, reverse=True):  # longest first
    if substring not in result:
        result += '$' + substring
taxlistcomplete = result.split('$')[1:]  # drop the empty string before the first '$'
This leverages Python's internal optimizations for substring searching by just making one big string to substring-search :)

Here is my suggestion. First I sort the elements by length, because the shorter a string is, the more likely it is to be a substring of another string. Then I have two for loops, where I run through the list and remove every element of which el is a substring. Note that the first for loop only passes each element once.
By sorting the list first, we destroy the order of the elements in the list. So if the order is important, you can't use this solution.
Edit: I assume there are no identical elements in the list, so that when el == el2, it's because it is the same element.
a = ["xyy", "xx", "zy", "yy", "x"]
a.sort(key=len)
for el in a[:]:        # iterate over copies so removing from a is safe
    for el2 in a[:]:
        if el in el2 and el != el2:
            if el2 in a:
                a.remove(el2)

Using a list comprehension (note the in operator) is a fast and Pythonic way of selecting the elements that contain a given string:
[element for element in arr if 'xx' in element]
