Python find index of string with while loop - python

I'm new to Python.
I need to create a "while" loop that searches for the position of a certain string in a text file.
All the characters in the file have been set to integers (x = len(all)), so now I need a loop to search for the index/position of a certain string.
This is where I'm at right now:
string = 'found'
index = 0
while index < x:
    if ????
Then it should print out something like
String found between (startIndex) and (endIndex)

You can use the str.find() method:
string = "found"
x = "I found the index"
index = x.find(string)  # 2 here; find() returns -1 if the string is not present
end = index + len(string)
print(index, end)
2 7

Python strings have a built-in method called index that provides this functionality:
string = "found"
with open("FILE", "r") as f:
    for i, line in enumerate(f):
        if string in line:
            foo = line.index(string)
            print(f"String found at line {i+1} between ({foo}) and ({foo + len(string)})")
            break
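Since the question asks specifically for a while loop, here is a minimal sketch that combines one with str.find(). The function name find_all and the sample sentence are made up for illustration; the text is assumed to be already read into a single string.

```python
def find_all(text, target):
    """Return (start, end) pairs for every occurrence of target in text."""
    positions = []
    index = 0
    while index < len(text):
        # Search from the current position; find() returns -1 when nothing is left.
        index = text.find(target, index)
        if index == -1:
            break
        positions.append((index, index + len(target)))
        index += 1  # step forward by one so overlapping matches are also found
    return positions


for start, end in find_all("I found it, then found it again", "found"):
    print(f"String found between {start} and {end}")
```

Stepping by one after each hit is a design choice; stepping by len(target) instead would report only non-overlapping matches.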

Related

Replace sequence of the same letter with single one

I am trying to replace each run of the same letter with a single one, but either it is hard or I am totally blocked on how this should be done.
So example of input:
aaaabbcddefff
The output should be abcdef
Here is what I was able to do, but when I get to the last piece of the string I can't get it done. I tried different variants, but I am stuck. Can someone help me finish this code?
text = "aaaabbcddefff"
new_string = ""
count = 0
while text:
    for i in range(len(text)):
        l = text[i]
        for n in range(len(text)):
            if text[n] == l:
                count += 1
                continue
            new_string += l
            text = text.replace(l, "", count)
            break
        count = 0
        break
Using a regex, remove every character that is immediately followed by a copy of itself:
>>> import re
>>> text = "aaaabbcddefff"
>>> re.sub(r"(.)(?=\1+)", "", text)
'abcdef'
Side note: You should consider building your string up in a list and then joining the list, because it is expensive to append to a string, since strings are immutable.
One way to do this is to check if every letter you look at is equal to the previous letter, and only append it to the new string if it is not equal:
def remove_repeated_letters(s):
    if not s: return ""
    ret = [s[0]]
    for index, char in enumerate(s[1:], 1):
        if s[index-1] != char:
            ret.append(char)
    return "".join(ret)
Then, remove_repeated_letters("aaaabbcddefff") gives 'abcdef'.
remove_repeated_letters("aaaabbcddefffaaa") gives 'abcdefa'.
Alternatively, use itertools.groupby, which groups consecutive equal elements together, and join the keys of that operation:
import itertools
def remove_repeated_letters(s):
    return "".join(key for key, group in itertools.groupby(s))
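The attempt in the question uses a while loop and gets stuck on the last run of letters. As a rough sketch of how that loop-based idea could be finished, the version below walks the string by index and skips each run, including one that reaches the end of the string. The name collapse_runs is hypothetical.

```python
def collapse_runs(text):
    """Keep one character from each run of identical consecutive characters."""
    pieces = []
    i = 0
    while i < len(text):
        current = text[i]
        pieces.append(current)
        # Skip the rest of the run; the bounds check covers a run at the very end.
        while i < len(text) and text[i] == current:
            i += 1
    return "".join(pieces)


print(collapse_runs("aaaabbcddefff"))
```

Like the other answers, this collects the pieces in a list and joins them once instead of appending to a string repeatedly.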

How to check only the first index in each split string?

I am trying to define a function that:
Splits a string called text at every newline (ex: text = "1\n2\n3")
Checks ONLY the first character of each of the split lines to see if it is a digit 0-9.
Returns the index of any line that starts with 0-9; it can be more than one line
ex: count_digit_leading_lines("AAA\n1st") → 1 # 2nd line starts w/ digit 1
So far my code looks like this, but I can't figure out how to get it to check only the first character of each split string:
def count_digit_leading_lines(text):
    for line in range(len(text.split('\n'))):
        for index, line in enumerate(line):
            if 0<=num<=9:
                return index
It accepts the argument text and iterates over each individual line (the new split strings). I think it goes on to check only the first index, but this is where I get lost...
The code should be as simple as:
def count_digit_leading_lines(text):
    text = text.strip()  # strip leading/trailing whitespace, e.g. a final '\n'
    text = text.replace('\t', '')  # for cases with '\t' etc.
    s = text.split('\n')  # split into lines at each '\n'
    # s = [words.strip() for words in s]  # can also use this instead of replace('\t')
    for i, sentence in enumerate(s):
        # check only the first character; skip empty lines (two '\n' together)
        if sentence and sentence[0].isdigit():
            return i
UPDATE:
Just noticed OP's comment on another answer stating that you don't want to use enumerate in your code (though it's good practice to use it). So the version of the for loop without enumerate is:
for i in range(len(s)):
    # check only the first character; skip empty lines
    if s[i] and s[i][0].isdigit():
        return i
This should do it:
texts = ["1\n2\n3", 'ABC\n123\n456\n555']
def _get_index_if_matching(text):
    split_text = text.split('\n')
    if split_text:
        for line_index, line in enumerate(split_text):
            try:
                num = int(line[0])
                if 0 <= num <= 9:
                    return line_index
            except (ValueError, IndexError):  # non-digit first character or empty line
                pass
for text in texts:
    print(_get_index_if_matching(text))
It will print 0 and then 1
You could change out your return statement for a yield, making your function a generator. Then you could get the indexes one by one in a loop, or make them into a list. Here's a way you could do it:
def count_digit_leading_lines(text):
    for index, line in enumerate(text.split('\n')):
        try:
            int(line[0])
            yield index
        except (ValueError, IndexError):  # non-digit first character or empty line
            pass
# Usage:
for index in count_digit_leading_lines(text):
    print(index)
# Or to get a list
print(list(count_digit_leading_lines(text)))
Example:
In : list(count_digit_leading_lines('he\n1\nhto2\n9\ngaga'))
Out: [1, 3]
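The example in the question can be read either as "return the index of the matching line" or as "return how many lines match". A minimal sketch that serves both readings is to collect every matching index in a list and take its length when a count is wanted. The function name digit_leading_line_indexes is an assumption for illustration.

```python
def digit_leading_line_indexes(text):
    """Return the indexes of all lines whose first character is a digit."""
    return [
        index
        for index, line in enumerate(text.split("\n"))
        if line[:1].isdigit()  # line[:1] is "" for an empty line, so no IndexError
    ]


indexes = digit_leading_line_indexes("he\n1\nhto2\n9\ngaga")
print(indexes)       # the matching line indexes
print(len(indexes))  # how many lines start with a digit
```

Slicing with line[:1] instead of indexing with line[0] avoids the need for a try/except around empty lines.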

Find out word at specific index

I have a string with multiple words separated by underscores like this:
string = 'this_is_my_string'
And let's for example take string[n] which will return a letter.
Now for this index I want to get the whole word between the underscores.
So for string[12] I'd want to get back the word 'string' and for string[1] I'd get back 'this'
A very simple approach using string slicing is to:
slice the string in two parts based on position
split() each part on _
concatenate the last item from part 1 and the first item from part 2
Sample code:
>>> my_string = 'this_is_my_sample_string'
#                              ^ index 14
>>> pos = 14
>>> my_string[:pos].split('_')[-1] + my_string[pos:].split('_')[0]
'sample'
This should work:
string = 'this_is_my_string'
words = string.split('_')
idx = 0
indexes = {}
for word in words:
    for _ in range(len(word)):
        indexes[idx] = word  # map every character position to its word
        idx += 1
    idx += 1  # skip the underscore that follows the word
print(indexes[1])   # this
print(indexes[12])  # string
The following code works. You can change the index and string variables and adapt to new strings. You can also define a new function with the code to generalize it.
string = 'this_is_my_string'
sp = string.split('_')
index = 12
total_len = 0
for word in sp:
    total_len += (len(word) + 1)  # The '+1' accounts for the underscore
    if index < total_len:
        result = word
        break
print(result)
A little bit of regular expression magic does the job:
import re
def wordAtIndex(text, pos):
    p = re.compile(r'(_|$)')
    beg = 0
    for m in p.finditer(text):
        #(end, sym) = (m.start(), m.group())
        #print (end, sym)
        end = m.start()
        if pos < end: # 'pos' is within current split piece
            break
        beg = end+1 # advance to next split piece
    if pos == beg-1: # handle case where 'pos' is index of split character
        return ""
    else:
        return text[beg:end]
text = 'this_is_my_string'
for i in range(0, len(text)+1):
    print ("Text["+str(i)+"]: ", wordAtIndex(text, i))
It splits the input string at '_' characters or at end-of-string, and then iteratively compares the given position index with the actual split position.
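The same boundary search can also be sketched without a regex, using str.rfind() to locate the separator before the position and str.find() to locate the one after it. The function name word_at and its sep parameter are hypothetical, and the choice to return an empty string when the position is a separator mirrors the regex answer rather than anything required by the question.

```python
def word_at(text, pos, sep="_"):
    """Return the sep-delimited word that contains index pos of text."""
    if text[pos] == sep:
        return ""  # pos points at a separator, not at a word
    # rfind() returns -1 when there is no separator before pos, so start becomes 0.
    start = text.rfind(sep, 0, pos) + 1
    end = text.find(sep, pos)
    if end == -1:  # no separator after pos: the word runs to the end
        end = len(text)
    return text[start:end]


print(word_at("this_is_my_string", 12))
print(word_at("this_is_my_string", 1))
```

An out-of-range pos raises IndexError here, which may or may not be the behavior you want.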

How to replace characters in a string in python

How do I replace characters in a string at known, exact indexes in Python?
Ex: name = "ABCDEFGH"
I need to change the characters at all odd index positions into the '$' character.
name = "A$C$E$G$"
(Indexes are considered to begin from 0.)
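Also '$'.join(s[::2])
This just takes the letters at even indexes and interleaves '$' between them. Note that it drops the trailing '$' when the string has an even length ("ABCDEFGH" gives 'A$C$E$G').
''.join(['$' if i in idx else s[i] for i in range(len(s))])
works for any collection of indexes idx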
You can use enumerate to loop over the string and get the index in each iteration; then, based on your logic, you can keep the proper elements:
>>> ''.join([j if i%2==0 else '$' for i,j in enumerate(name)])
'A$C$E$G$'
name = "ABCDEFGH"
nameL = list(name)
for i in range(len(nameL)):
    if i%2==1:
        nameL[i] = '$'
name = ''.join(nameL)
print(name)
You can reference string elements by index and form a new string. Something like this should work:
startingstring = 'mylittlestring'
nstr = ''
for i in range(0,len(startingstring)):
    if i % 2 == 0:
        nstr += startingstring[i]
    else:
        nstr += '$'
Then do with nstr as you like.
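Another option, sketched here as an alternative to the loops above, is an extended-slice assignment on a list of characters. The function name mask_odd_indexes and its mask parameter are made up for illustration.

```python
def mask_odd_indexes(text, mask="$"):
    """Return text with the characters at odd indexes replaced by mask."""
    chars = list(text)  # strings are immutable, so work on a list
    # An extended-slice assignment needs exactly as many items as the slice covers.
    chars[1::2] = [mask] * len(chars[1::2])
    return "".join(chars)


print(mask_odd_indexes("ABCDEFGH"))
```

The slice 1::2 selects indexes 1, 3, 5, and so on, so no explicit index test is needed.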

python - ordinal value - list indices must be integers not str

What I want to do is take a string and, for each character, make the ordinal value one more than the value it has.
myinput=input("Message : ")
mylist =list(myinput) #convert to list in order to take each character
for character in mylist:
    mylist[character]+=ord(mylist[character])+1
    print(character)
The problem is with the "ord(mylist[character])+1"
Thank you!
You are probably looking for the following:
>>> m = raw_input('Message:')
Message:asdf
>>> ''.join(chr(ord(c) + 1) for c in m)
'bteg'
Notes:
use raw_input (Python 2) or input (Python 3) when you need to get string input from a user;
ord converts a character to an integer, chr does the reverse;
... for c in m is the syntax of a generator expression. The same syntax is used in list comprehensions.
Three problems here. First, you're mixing up list indices and list elements. Second, you didn't convert back to a character (I'm assuming you want characters, not numbers). Third, you're adding to the existing value.
One way:
for i in range(len(mylist)):
    mylist[i] = chr(ord(mylist[i])+1)
Another way:
for i, character in enumerate(mylist):
    mylist[i] = chr(ord(character)+1)
Instead of
for character in mylist:
    mylist[character]+=ord(mylist[character])+1
(where character is a list element, a string, and therefore not a valid list index), you probably want:
mylist = [ord(character) + 1 for character in mylist]
Or a Counter.
You can do it like this:
def ordsum(astring, tablesize):
    sum = 0
    for num in range(len(astring)):
        sum = sum + ord(astring[num])
    return sum
myinput = input() # use raw_input() in Python 2
myinput = map(lambda ch: chr(ord(ch) + 1), myinput)
# or, instead of map, use a list comprehension (pick one of the two)
myinput = [chr(ord(ch) + 1) for ch in myinput]
You can iterate directly over a string, you do not have to make it a list first. If your end goal is to have a new string, you can do this:
myinput=input("Message : ")
result = []
for character in myinput:
    result.append(chr(ord(character) + 1))
mynewstring = ''.join(result)
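The answers above can be wrapped into a small reusable function. This is only a sketch: the name shift_characters and the offset parameter are assumptions, and a character whose shifted code point falls outside the valid Unicode range would make chr() raise ValueError.

```python
def shift_characters(message, offset=1):
    """Return message with every character's code point moved by offset."""
    return "".join(chr(ord(character) + offset) for character in message)


shifted = shift_characters("asdf")
print(shifted)
# Shifting back by the same amount restores the original input.
print(shift_characters(shifted, -1))
```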
