I have a little "problem" that I would like to solve via programming a simple script. I don't have much programming experience and thought I'd ask here for help for what I should look for or know to do this.
Basically I want to take an email address such as placeholder.1234#fakemail.com and turn it into pl*************4#fakemail.com.
I need the script to take the letters after the first two, and before the last, and turn those letters into asterisks, and they have to match the amount of characters.
Just for clarification, I am not asking for someone to write this for me, I just need some guidance for how to go about this. I would also like to do this in Python, as I already have Python setup on my PC.
You can use Python string slicing:
email = "placeholder.1234#fakemail.com"
idx1 = 2
idx2 = email.index("#") - 1
print(email[:idx1] + "*" * (idx2 - idx1) + email[idx2:])
Output:
pl*************4#fakemail.com
Explanation:
Define the string that will contain the email address:
email = "placeholder.1234#fakemail.com"
Define the index of where the asterisks should begin, which is 2:
idx1 = 2
Define the index of where the asterisks should end, which is the index of where the # symbol is minus 1:
idx2 = email.index("#") - 1
Finally, using the indices defined, you can slice and concatenate the string defined accordingly:
print(email[:idx1] + "*" * (idx2 - idx1) + email[idx2:])
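If you want to reuse this for more than one address, the same slicing can be wrapped in a function. This is only a sketch: the name mask_local_part and the parameters sep, keep_start and keep_end are made up for illustration, and it assumes addresses that are too short, or that have no separator, should be returned unchanged.

```python
def mask_local_part(email, sep="#", keep_start=2, keep_end=1):
    """Replace the middle of the part before `sep` with asterisks."""
    sep_pos = email.find(sep)
    if sep_pos == -1:
        return email  # assumption: no separator means nothing to mask
    hidden = sep_pos - keep_start - keep_end  # number of characters to hide
    if hidden <= 0:
        return email  # assumption: too short to mask, leave as-is
    end = sep_pos - keep_end  # index of the first character kept after the stars
    return email[:keep_start] + "*" * hidden + email[end:]


print(mask_local_part("placeholder.1234#fakemail.com"))
# pl*************4#fakemail.com
```

Passing sep="@" would handle ordinary addresses the same way.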
So this email will be a string.
Try using a combination of string indexing and either string replacement or string concatenation.
First, let's think about what data type we would store this in. It should be a String since it contains letters and characters.
We know we want to replace a portion of the String from the THIRD character to the character 2 before "#".
Let's think about this in terms of indexing now. The third character, our start index for the replacement, is at index 2. To find the index of the character "#", we can use the index() method like so:
end = email.index('#')
However, we want to stop replacing 2 before that index, so we can just subtract 2 from it.
Now we have the indexes (start and endpoint) of what we want to replace. We can now use a substring that goes from the starting to ending indexes and use the .replace() function to replace this substring in our email with a bunch of *'s.
To determine how many *'s we need, we can find the difference in indexes and add 1 to get the total number. So we could do something like:
stars = "*" * (end - start + 1)
email = email.replace(email[start:end + 1], stars)
Notice how I used start:end + 1 for the substring. This is because we want to include the character at the ending index, and a slice on its own does not include its ending index.
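Putting those steps together might look like the sketch below. One change from the description above: it rebuilds the string with slicing instead of .replace(), because .replace() changes every occurrence of the substring, not just the one at start:end + 1. The variable names are the ones used in this answer; masked is added for illustration.

```python
email = "placeholder.1234#fakemail.com"

start = 2                     # first index to hide (the third character)
end = email.index("#") - 2    # last index to hide (two before the "#")

# One asterisk per hidden character, inclusive of both ends.
stars = "*" * (end - start + 1)

# Slicing rebuilds the string around the hidden span, so an identical
# substring elsewhere in the address is never touched.
masked = email[:start] + stars + email[end + 1:]
print(masked)
# pl*************4#fakemail.com
```

With an address like abab#ab.com the hidden span is the single character "a", and .replace() would star out every "a" in the string, which is why slicing is the safer choice here.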
I hope this helped answer your question! Please let me know if you need any further clarification or details:)
Related
I want to know how you can replace one letter of a string without replacing the same letter. For example, let the variable:
action = "play sports"
I could substitute "play" with "playing" by doing print(action.replace("play", "playing"))
But what if you have two of the same substring?
For example, what if you want to turn "honeyhoney" into "honeysweet" (replacing the last half of the string with "sweet")?
Sorry for the bad wording, I am new to coding and really unfamiliar with this. Thanks!
def replaceLast(str, old, new):
    return str[::-1].replace(old[::-1], new[::-1], 1)[::-1]
print(replaceLast("honeyhoney", "honey", "sweet"))
output
honeysweet
The idea is to reverse the string along with the old and new substrings, so the last occurrence becomes the first.
Do the replace, then reverse the returned string once more. The count argument 1 makes replace change only one match instead of both.
Another solution
def replaceLast(str, old, new):
    ind = str.rfind(old)
    if ind == -1:
        return str
    return str[:ind] + new + str[ind + len(old):]
print(replaceLast("honeyhoney", "honey", "sweet"))
output
honeysweet
Here we take the string from the beginning up to the index of the last occurrence, add the new substring, and then append the rest of the string from where the old substring ends. str.rfind returns -1 when no match is found, so we check against that to make sure the output is correct even if there is nothing to replace.
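For comparison, here is a third variant that is not from either answer above: str.rpartition splits a string around the last occurrence of a separator, so it can do the same job without index arithmetic. The name replace_last is just for illustration, and the sketch assumes old is a non-empty string.

```python
def replace_last(text, old, new):
    """Replace only the last occurrence of `old` in `text`."""
    # rpartition returns (head, separator, tail); when `old` is absent the
    # separator slot is an empty string and the whole input sits in tail.
    head, found, tail = text.rpartition(old)
    if not found:
        return text  # nothing to replace
    return head + new + tail


print(replace_last("honeyhoney", "honey", "sweet"))  # honeysweet
print(replace_last("honeyhoney", "xyz", "sweet"))    # honeyhoney
```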
I'm trying to find all instances of a specific substring(a!b2 as an example) and return them with the 4 characters that follow after the substring match. These 4 following characters are always dynamic and can be any letter/digit/symbol.
I've tried searching, but it seems like the similar questions that are asked are requesting help with certain characters that can easily split a substring, but since the characters I'm looking for are dynamic, I'm not sure how to write the regex.
When using regex, you can use "." to dynamically match any character. Use {number} to specify how many characters to match, and use parentheses as in (.{number}) to specify that the match should be captured for later use.
>>> import re
>>> s = "a!b2foobar a!b2bazqux a!b2spam and eggs"
>>> print(re.findall("a!b2(.{4})", s))
['foob', 'bazq', 'spam']
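import re
s = "a!b2foobar a!b2bazqux a!b2spam and eggs"
print(re.search(r'a!b2(.{4})', s).group(1))
.{4} matches any 4 characters except newlines (pass re.DOTALL if newlines should match too).
group(0) is the complete match; group(1) is the text captured by the first pair of parentheses. The re module documentation covers group numbering in more detail.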
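Since the question asks for the substring together with the 4 characters after it, here is a small sketch that returns both for every occurrence. The helper name matches_with_suffix and the width parameter are made up for illustration. It uses re.escape so the search text is treated literally, and re.DOTALL on the assumption that a newline should count as one of the 4 characters.

```python
import re


def matches_with_suffix(text, prefix, width=4):
    """Return (whole match, following characters) for each occurrence of prefix."""
    # re.escape keeps any regex metacharacters in the prefix literal.
    pattern = re.compile(re.escape(prefix) + "(.{%d})" % width, re.DOTALL)
    return [(m.group(0), m.group(1)) for m in pattern.finditer(text)]


s = "a!b2foobar a!b2bazqux a!b2spam and eggs"
for whole, following in matches_with_suffix(s, "a!b2"):
    print(whole, following)
# a!b2foob foob
# a!b2bazq bazq
# a!b2spam spam
```

Note that finditer only reports non-overlapping matches, and an occurrence with fewer than 4 characters after it is skipped.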
If you're only looking for how to grab the following 4 characters using regex, what you probably want is the curly-brace quantifier, which specifies how many characters to match: '{}'.
The linked post goes into more detail, but essentially you would write [a-zA-Z0-9]{X,Y} or (.{X,Y}), where X and Y are the minimum and maximum number of characters to match (in your case, you would only need {4}).
That said, a more Pythonic way to solve this problem would be to use string slicing together with the index method.
E.g. given an input_string, once you find the substring at index i using index, you can use input_string[i+len(sub_str):i+len(sub_str)+4] to grab the characters that follow.
As an example,
input_string = 'abcdefg'
sub_str = 'abcd'
found_index = input_string.index(sub_str)
start_index = found_index + len(sub_str)
symbol = input_string[start_index: start_index + 4]
print(symbol)
Output (showing that it also works when fewer than 4 characters follow): efg
index() also accepts optional start and end positions for the search, so you could call it in a loop to find every occurrence of the substring, starting each search at the previously found index + 1.
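A rough sketch of that loop is below. It uses find rather than index, because find returns -1 when nothing is left instead of raising ValueError; the function name following_chars and the width parameter are made up for illustration.

```python
def following_chars(text, sub, width=4):
    """Collect up to `width` characters after each occurrence of `sub`."""
    results = []
    start = 0
    while True:
        found = text.find(sub, start)  # search from `start` onwards
        if found == -1:
            break  # no more occurrences
        after = found + len(sub)
        results.append(text[after:after + width])
        start = found + 1  # resume one past the previous hit
    return results


print(following_chars("a!b2foobar a!b2bazqux a!b2spam and eggs", "a!b2"))
# ['foob', 'bazq', 'spam']
```

Because slicing never raises for out-of-range ends, an occurrence near the end of the string simply yields a shorter result.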
Alright, I'm working on a little project for school, a 6-frame translator. I won't go into too much detail, I'll just describe what I wanted to add.
The normal output would be something like:
TTCPTISPALGLAWS_DLGTLGFMSYSANTASGETLVSLYQLGLFEM_VVSYGRTKYYLICP_LFHLSVGFVPSD
The important part of this string are the M and the _ (the start and stop codons, biology stuff). What I wanted to do was highlight these like so:
TTCPTISPALGLAWS_DLGTLGF 'MSYSANTASGETLVSLYQLGLFEM_' VVSYGRTKYYLICP_LFHLSVGFVPSD
Now here is where (for me) it gets tricky, I got my output to look like this (adding a space and a ' to highlight the start and stop). But it only does this once, for the first start and stop it finds. If there are any other M....._ combinations it won't highlight them.
Here is my current code, attempting to make it highlight more than once:
def start_stop(translation):
    index_2 = 0
    while True:
        if 'M' in translation[index_2::1]:
            index_1 = translation[index_2::1].find('M')
            index_2 = translation[index_1::1].find('_') + index_1
            new_translation = translation[:index_1] + " '" + \
                              translation[index_1:index_2 + 1] + "' " +\
                              translation[index_2 + 1:]
        else:
            break
    return new_translation
I really thought this would do it, guess not. So now I find myself being stuck.
If any of you are willing to try and help, here is a randomly generated string with more than one M....._ set:
'TTCPTISPALGLAWS_DLGTLGFMSYSANTASGETLVSLYQLGLFEM_VVSYGRTKYYLICP_LFHLSVGFVPSDGRRLTLYMPPARRLATKSRFLTPVISSG_DKPRHNPVARSQFLNPLVRPNYSISASKSGLRLVLSYTRLSLGINSLPIERLQYSVPAPAQITP_IPEHGNARNFLPEWPRLLISEPAPSVNVPCSVFVVDPEHPKAHSKPDGIANRLTFRWRLIG_VFFHNAL_VITHGYSRVDILLPVSRALHVHLSKSLLLRSAWFTLRNTRVTGKPQTSKT_FDPKATRVHAIDACAE_QQH_PDSGLRFPAPGSCSEAIRQLMI'
Thank you to anyone willing to help :)
Regular expressions are pretty handy here:
import re
sequence = "TTCP...."
highlighted = re.sub(r"(M\w*?_)", r" '\1' ", sequence)
# Output:
"TTCPTISPALGLAWS_DLGTLGF 'MSYSANTASGETLVSLYQLGLFEM_' VVSYGRTKYYLICP_LFHLSVGFVPSDGRRLTLY 'MPPARRLATKSRFLTPVISSG_' DKPRHNPVARSQFLNPLVRPNYSISASKSGLRLVLSYTRLSLGINSLPIERLQYSVPAPAQITP_IPEHGNARNFLPEWPRLLISEPAPSVNVPCSVFVVDPEHPKAHSKPDGIANRLTFRWRLIG_VFFHNAL_VITHGYSRVDILLPVSRALHVHLSKSLLLRSAWFTLRNTRVTGKPQTSKT_FDPKATRVHAIDACAE_QQH_PDSGLRFPAPGSCSEAIRQLMI"
Regex explanation:
We look for an M followed by any number of "word characters" \w* then an _, using the ? to make it a non-greedy match (otherwise it would just make one group from the first M to the last _).
The replacement is the matched group (\1 indicates "first group", there's only one), but surrounded by spaces and quotes.
You only need a little string slicing here; no external module is required.
Python strings have a method called index; just use it.
string_1='TTCPTISPALGLAWS_DLGTLGFMSYSANTASGETLVSLYQLGLFEM_VVSYGRTKYYLICP_LFHLSVGFVPSD'
before=string_1.index('M')
after=string_1[before:].index('_')
print('{} {} {}'.format(string_1[:before],string_1[before:before+after+1],string_1[before+after+1:]))
output:
TTCPTISPALGLAWS_DLGTLGF MSYSANTASGETLVSLYQLGLFEM_ VVSYGRTKYYLICP_LFHLSVGFVPSD
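The index-based snippet above only handles the first M..._ pair. Here is a sketch of the same idea in a loop, so every pair gets highlighted. It uses find instead of index (find returns -1 rather than raising when nothing is found), and it assumes an M with no stop codon after it should be left alone. The name highlight_all is made up for illustration.

```python
def highlight_all(translation):
    """Wrap every M..._ stretch in spaces and single quotes."""
    pieces = []
    pos = 0  # everything before pos has already been copied
    while True:
        start = translation.find('M', pos)
        if start == -1:
            break  # no further start codon
        stop = translation.find('_', start)
        if stop == -1:
            break  # assumption: an M with no later _ stays unhighlighted
        pieces.append(translation[pos:start])
        pieces.append(" '" + translation[start:stop + 1] + "' ")
        pos = stop + 1  # continue after the stop codon
    pieces.append(translation[pos:])  # whatever is left over
    return "".join(pieces)


sample = 'TTCPTISPALGLAWS_DLGTLGFMSYSANTASGETLVSLYQLGLFEM_VVSYGRTKYYLICP_LFHLSVGFVPSD'
print(highlight_all(sample))
# TTCPTISPALGLAWS_DLGTLGF 'MSYSANTASGETLVSLYQLGLFEM_' VVSYGRTKYYLICP_LFHLSVGFVPSD
```

Collecting the pieces in a list and joining once at the end avoids the index drift you get when quotes are inserted into the string that is still being searched.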
My intention was to use the index method to search for either a colon (:) or an equal sign (=) in the string and print everything after that character, but I realized that's not possible the way it's written below with the or operator. So is there another way to write this piece of code? (I wasn't able to come up with a simple way to write this without getting into loops and if statements.)
l='Name = stack'
pos=l.index(':' or '=')
print (' '.join(l[pos+1:-1].split())) #this just gets rid of the whitespaces
Assuming your example as above, the long way (explanation of each piece below):
pos = max(l.find(':'), l.find('=')) + 1
print(l[pos:].strip())
Here's a way to shorten it to one line, with an explanation of each part in the order it's evaluated.
print(l[max(l.find(':'), l.find('=')) + 1:].strip())
#--------------- Breakdown
# l.find(), l.find() -> position of each character; find returns -1 if it isn't there.
# max -> the higher of the two positions
# + 1 -> start just after the separator; if ':'/'=' aren't in the string, -1 + 1 = 0, so the whole thing is printed.
# l[...:] -> slice from that position until the end (implied by the empty end index)
# .strip() -> remove surrounding whitespace
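One thing to watch with max(): if a line contains both separators, it picks the later one. If you'd rather cut at whichever comes first, something like the sketch below might do. The name after_first_separator and the separators parameter are made up for illustration, and it assumes a line with no separator should be returned whole.

```python
def after_first_separator(line, separators=":="):
    """Return the text after the earliest separator, stripped of whitespace."""
    positions = [line.find(ch) for ch in separators]
    found = [p for p in positions if p != -1]  # drop separators that are absent
    if not found:
        return line.strip()  # assumption: no separator means keep everything
    return line[min(found) + 1:].strip()


print(after_first_separator("Name = stack"))   # stack
print(after_first_separator("key: a=b"))       # a=b
print(after_first_separator("no separator"))   # no separator
```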
import re
l='Name = stack'
print(re.split(':|=', l)[-1])
Regular expression split on either character, then take the last result.
You didn't mention if there was guaranteed to be one or the other separator and not both, always a separator, not more than one separator... this might not do what you want, depending.
You should limit the number of splits to one, using maxsplit in re.split():
import re
s1 = 'name1 = x1 and noise:noise=noise'
s2 = 'name2: x2 and noise:noise=noise'
print(re.split(':|=', s1, maxsplit=1)[-1].strip())
print(re.split(':|=', s2, maxsplit=1)[-1].strip())
Output:
x1 and noise:noise=noise
x2 and noise:noise=noise
I have a string:
str = "alskdfj asldfj 1234_important_what_i_need_123 sdlfja faslkdjfsdkf 234234_important_what_i_need_12312 alsdfj asdfj"
I want to extract each occurrence of the "%important_what_i_need%" bit from the string, including 10 or so characters before and after the search term.
How do I do this with python? Do I need to import re?
Starting with "aaafoobbb" and looking for "foo" and the surrounding two characters on either side, you could do:
>>> start_string = "aaafoobbb"
>>> search_string = "foo"
>>> index = start_string.index(search_string)
>>> start_string[(index - 2) : (index + len(search_string) + 2)]
'aafoobb'
Should be easy enough to adapt to your needs, although you'll need to add some extra checks to make sure your slice indices are within range (e.g. make sure that index - 2 is not less than 0). You definitely want to become more familiar with slicing and strings in Python.
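To cover every occurrence and the range checks mentioned above, a sketch along these lines might work. It does use re, but only to locate each occurrence (re.escape keeps the search term literal). The name find_with_context and the margin parameter are made up for illustration.

```python
import re


def find_with_context(text, term, margin=10):
    """Return each occurrence of `term` with up to `margin` characters either side."""
    snippets = []
    for m in re.finditer(re.escape(term), text):
        lo = max(m.start() - margin, 0)  # clamp: a negative start would wrap around
        hi = m.end() + margin            # slicing past the end is harmless
        snippets.append(text[lo:hi])
    return snippets


print(find_with_context("aaafoobbb", "foo", margin=2))
# ['aafoobb']
```

For the string in the question you would call find_with_context(str, "important_what_i_need"), which uses the default margin of 10.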