Let's say I have a string like this:
string1 = 'bla/bla1/blabla/bla2/bla/bla/wowblawow1'
I need to take the text after the last '/' and delete everything else:
string2 = 'wowblawow1'
Is there any method I could use?
string1 = 'bla/bla1/blabla/bla2/bla/bla/wowblawow1'
string2 = string1.split(r'/')[-1] # Out[2]: 'wowblawow1'
See https://docs.python.org/2/library/stdtypes.html#str.split for how it works. But as @Emilien suggested, if you are looking to extract a basename, use os.path: https://docs.python.org/2/library/os.path.html
Or maybe you are even looking for this?
>>> import os
>>> os.path.basename("/var/log/syslog")
'syslog'
>>> os.path.dirname("/var/log/syslog")
'/var/log'
I generally use os.path.basename when dealing with forward slashes.
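For comparison, here is a minimal sketch of how a split-based approach and basename behave on the question's string and on a path with a trailing slash. posixpath is used only so that the result is the same on every platform; that choice, and the variable names, are assumptions of this sketch.

```python
import posixpath  # POSIX flavour of os.path, so '/' is always the separator

string1 = 'bla/bla1/blabla/bla2/bla/bla/wowblawow1'

# rsplit with maxsplit=1 cuts only once, at the last '/'
last_by_split = string1.rsplit('/', 1)[-1]
last_by_basename = posixpath.basename(string1)

# With a trailing slash, both approaches return an empty string
trailing = 'bla/bla1/'
trailing_split = trailing.rsplit('/', 1)[-1]
trailing_basename = posixpath.basename(trailing)

print(last_by_split, last_by_basename)
print(repr(trailing_split), repr(trailing_basename))
```

If trailing slashes can occur in your data, you may want to call rstrip('/') on the string first.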
I know this may not be the most practical way, but as a general way to locate the content after the last occurrence of something:
string1 = 'bla/bla1/blabla/bla2/bla/bla/wowblawow1'
index = (len(string1)-1) - string1[::-1].find('/')
string1 = string1[index+1:]
Details:
string1[::-1] # reverse the string
string1[::-1].find(my_string_to_search_for) # gets the index of the first occurrence of the argument in the reversed string
(len(string1)-1) # the maximum index value
(len(string1)-1) - string1[::-1].find(my_string_to_search_for) # the index as counted from the front of the string
string1 = string1[index+1:] # gives the substring of everything after the index of the last occurrence of your string
You could make the code more readable by doing something like:
def get_last_index_of(string, search_content):
    # index, counted from the front, of the last occurrence of search_content
    return (len(string)-1) - string[::-1].find(search_content)
string1 = 'bla/bla1/blabla/bla2/bla/bla/wowblawow1'
string1 = string1[get_last_index_of(string1, '/')+1:]
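The same idea can be sketched with str.rfind, which searches from the right and so avoids reversing the string. The helper name after_last is hypothetical. Unlike the reversal trick, this version also handles separators longer than one character, and it returns the whole string when the separator is missing (that fallback is a design assumption, not something the question specifies).

```python
def after_last(text, sep):
    """Return the part of text after the last occurrence of sep.

    Hypothetical helper; returns text unchanged if sep does not occur.
    """
    idx = text.rfind(sep)  # -1 when sep is not found
    if idx == -1:
        return text
    return text[idx + len(sep):]


string1 = 'bla/bla1/blabla/bla2/bla/bla/wowblawow1'
print(after_last(string1, '/'))
```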
Let's say I have a string defined like this:
string1 = '23h4b245hjrandomstring345jk3n45jkotherrandomstring'
The goal is to grab the 11 characters (these for example '345jk3n45jk') after a part of the string (this part for example 'randomstring') using a specified search term and the specified number of characters to grab after that search term.
I tried doing something like this:
string2 = substring(string1,'randomstring', 11)
I appreciate any help you guys have to offer!
string2 = string1[string1.find("randomstring")+len("randomstring"):string1.find("randomstring")+len("randomstring")+11]
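The one-liner above can be wrapped in a small function resembling the substring(...) call from the question. This is only a sketch: the name substring_after and the choice to return None when the search term is absent are assumptions.

```python
def substring_after(text, term, count):
    """Return up to count characters that follow the first occurrence of term.

    Hypothetical helper; returns None if term is not in text.
    """
    start = text.find(term)
    if start == -1:
        return None
    start += len(term)  # move past the search term itself
    return text[start:start + count]


string1 = '23h4b245hjrandomstring345jk3n45jkotherrandomstring'
string2 = substring_after(string1, 'randomstring', 11)
print(string2)
```

Calling find only once also avoids repeating the search three times, as the one-liner does.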
In one line, using split, and assuming the first occurrence of your search term is the one you care about (in the example, 'randomstring' also appears inside 'otherrandomstring', but index [1] is still the text right after the first occurrence):
string1 = '23h4b245hjrandomstring345jk3n45jkotherrandomstring'
randomstring = 'randomstring'
nb_char_to_take = 11
# split on randomstring, take the piece after the first occurrence (element 1 of the list), then its first 11 characters
result = string1.split(randomstring)[1][:nb_char_to_take]
You can use a simple regular expression like this
import re
s = "23h4b245hjrandomstring345jk3n45jkotherrandomstring"
result = re.findall("randomstring(.{11})", s)[0]
string1 = '23h4b245hjrandomstring345jk3n45jkotherrandomstring'
string2 = string1[10:22]
print(string2)
randomstring
You could use that. It's called string slicing. You count the positions of the characters: the first number, before the colon, is your starting point and the second is your ending point, and you get whatever lies between those positions (the end position itself is not included). An optional third number does something different (it sets the step). I suggest searching for tutorials on string slicing and on the string find method; those should help you get the idea behind these features.
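To make the slicing description concrete, here is a small sketch that computes the positions with find instead of counting them by hand. The variable names are only illustrative.

```python
string1 = '23h4b245hjrandomstring345jk3n45jkotherrandomstring'
term = 'randomstring'

start = string1.find(term)   # index where the search term begins
end = start + len(term)      # index just past the search term

found = string1[start:end]          # the term itself
following = string1[end:end + 11]   # the 11 characters after it
every_other = string1[:10:2]        # third number is the step

print(start, end, found, following, every_other)
```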
I have a list of strings and I would like to extract "000000_5.612230" from:
A = '/calibration/test_min000000_5.612230.jpeg'
As the length of the strings may vary, I tried tracking the position of the "n" in "min". I tried to get the right index with:
print sorted(A, key=len).index('n')
But I got "11", which corresponds to the "n" of "calibration". I would like to know how to get the maximum index of that character in the string.
It is difficult to answer since you don't specify which part of the filename remains constant and which is subject to change. Is it always a jpeg? Is the number always the last part? Is it always preceded by '_min'?
In any case, I would suggest using a regex instead:
import re
A = '/calibration/test_min000000_5.612230.jpeg'
p = re.compile(r'.*min([_\d\.]*)\.jpeg')  # raw string so the backslashes reach the regex engine unchanged
value = p.search(A).group(1)
print value
output :
000000_5.612230
Note that this snippet assumes that a match is always found. If the filename doesn't contain the pattern, p.search(...) will return None and calling .group(1) on it will raise an exception, so you should check for that case.
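A minimal sketch of that check, written for Python 3. The function name extract_value and the decision to return None on a failed match are assumptions. The pattern is a slightly tightened variant of the one above, anchored at the end of the string.

```python
import re

# 'min', then digits/underscores/dots, then a literal '.jpeg' at the end
PATTERN = re.compile(r'min([_\d.]*)\.jpeg$')


def extract_value(path):
    """Return the captured value, or None if the path does not match."""
    match = PATTERN.search(path)
    if match is None:
        return None
    return match.group(1)


A = '/calibration/test_min000000_5.612230.jpeg'
print(extract_value(A))
print(extract_value('/calibration/other.png'))
```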
You can use re module and the regex to do that, for example:
import re
A = '/calibration/test_min000000_5.612230.jpeg'
text = re.findall(r'\d.*\d', A)
Now text is a list. If you print it, the output will look like this: ['000000_5.612230']
So if you want to extract the value, index into the list (or use a for loop):
import re
A = '/calibration/test_min000000_5.612230.jpeg'
text = re.findall(r'\d.*\d', A)
print text[0]
String slicing seems like a good solution for this
>>> A = '/calibration/test_min000000_5.612230.jpeg'
>>> start = A.index('min') + len('min')
>>> end = A.index('.jpeg')
>>> A[start:end]
'000000_5.612230'
Avoids having to import re
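The question's original idea, finding the highest index of "n", can be sketched with str.rfind, which returns the index of the last occurrence. This assumes the last "n" in the string is the one in "min" and that the last "." starts the extension; both hold for the example, but they would break for a name ending in ".png", for instance.

```python
A = '/calibration/test_min000000_5.612230.jpeg'

first_n = A.index('n')   # the 'n' of 'calibration'
last_n = A.rfind('n')    # the 'n' of 'min'
last_dot = A.rfind('.')  # the dot before 'jpeg'

value = A[last_n + 1:last_dot]
print(first_n, last_n, value)
```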
Try this (if extension is always '.jpeg'):
A.split('test_min')[1][:-5]
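If the extension is not guaranteed to be five characters long, one possible variant is to let splitext remove it, whatever its length. This is a sketch: posixpath is used here so that '/' is treated as the separator on any platform, and the marker 'min' is assumed to occur exactly where the example shows it.

```python
import posixpath

A = '/calibration/test_min000000_5.612230.jpeg'

filename = posixpath.basename(A)            # 'test_min000000_5.612230.jpeg'
stem, ext = posixpath.splitext(filename)    # splits at the last dot
value = stem.partition('min')[2]            # text after the first 'min'

print(stem, ext, value)
```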
If your string is regular at the end, you can use negative indices to slice the string:
>>> a = '/calibration/test_min000000_5.612230.jpeg'
>>> a[-20:-5]
'000000_5.612230'
I was wondering if anyone has a simpler solution for extracting a few letters from the middle of a string. I want to retrieve the 3 letters (in this case, GMB), and all the entries follow the same pattern. I'm struggling to find a simpler way of doing this.
Here is an example of what I've been using.
entry = "entries-alphabetical.jsp?raceid13=GMB$20140313A"
symbol = entry.strip('entries-alphabetical.jsp?raceid13=')
symbol = symbol[0:3]
print symbol
thanks
First of all, the argument passed to str.strip is not a prefix or suffix; it is a set of characters that you want stripped from both ends of the string.
Since the string looks like a URL, you can use urlparse.parse_qsl:
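A short sketch of why this matters. The entry below is hypothetical: it is chosen so that the value begins with a character that also appears in the prefix. The helper strip_prefix is likewise an illustrative name, not a built-in.

```python
entry = 'raceid13=123ABC'  # hypothetical entry whose value starts with '1'

# strip() treats its argument as a set of characters, so the leading '1'
# of the value is removed along with the prefix
stripped = entry.strip('raceid13=')


def strip_prefix(text, prefix):
    """Remove prefix from the start of text if present (hypothetical helper)."""
    if text.startswith(prefix):
        return text[len(prefix):]
    return text


clean = strip_prefix(entry, 'raceid13=')
print(stripped, clean)
```

The question's code happens to work only because "GMB$20140313A" starts and ends with characters that are not in the stripped set.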
>>> import urlparse
>>> urlparse.parse_qsl(entry)
[('entries-alphabetical.jsp?raceid13', 'GMB$20140313A')]
>>> urlparse.parse_qsl(entry)[0][1][:3]
'GMB'
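In Python 3 the same functions live in urllib.parse. A possible sketch, assuming the parameter name raceid13 from the question, splits off the query string first so that the key is just the parameter name:

```python
from urllib.parse import urlsplit, parse_qs

entry = "entries-alphabetical.jsp?raceid13=GMB$20140313A"

query = urlsplit(entry).query        # 'raceid13=GMB$20140313A'
params = parse_qs(query)             # maps each name to a list of values
symbol = params['raceid13'][0][:3]   # first three characters of the value

print(query, symbol)
```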
This is what regular expressions are for. http://docs.python.org/2/library/re.html
import re
val = re.search(r'(GMB.*)', entry)
print val.group(1)
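The pattern above hard-codes GMB and captures everything after it. A sketch that instead captures whichever three letters follow the "=" sign might look like this; it assumes the symbol is always exactly three ASCII letters directly after the first "=".

```python
import re

entry = "entries-alphabetical.jsp?raceid13=GMB$20140313A"

# '=' followed by exactly three letters, which are captured
match = re.search(r'=([A-Za-z]{3})', entry)
symbol = match.group(1) if match else None

print(symbol)
```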
I need to get the value after the last colon, in this example 1234567:
client:user:username:type:1234567
I don't need anything else from the string just the last id value.
To split on the first occurrence instead, see Splitting on first occurrence.
result = mystring.rpartition(':')[2]
If your string does not contain any ':', the result will be the original string.
An alternative that is supposed to be a little bit slower is:
result = mystring.split(':')[-1]
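A brief sketch of what rpartition returns, including the case with no separator, alongside rsplit with a split limit of 1 as another option that cuts only once, from the right:

```python
mystring = 'client:user:username:type:1234567'

# rpartition always returns a 3-tuple: (before, separator, after)
head, sep, tail = mystring.rpartition(':')

# Without any ':' the first two items are empty and the last is the input
no_colon = 'nocolonhere'.rpartition(':')

# rsplit with maxsplit=1 also splits only at the last ':'
last_by_rsplit = mystring.rsplit(':', 1)[-1]

print(head, tail, no_colon, last_by_rsplit)
```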
foo = "client:user:username:type:1234567"
last = foo.split(':')[-1]
Use this:
"client:user:username:type:1234567".split(":")[-1]
You could also use pygrok.
from pygrok import Grok
text = "client:user:username:type:1234567"
pattern = """%{BASE10NUM:type}"""
grok = Grok(pattern)
print(grok.match(text))
returns
{'type': '1234567'}
Is it possible to replace a single character inside a string that occurs many times?
Input:
Sentence=("This is an Example. Thxs code is not what I'm having problems with.") #Example input
^
Sentence=("This is an Example. This code is not what I'm having problems with.") #Desired output
Replace the 'x' in "Thxs" with an i, without replacing the x in "Example".
You can do it by including some context:
s = s.replace("Thxs", "This")
Alternatively you can keep a list of words that you don't wish to replace:
import re
whitelist = ['example', 'explanation']
def replace_except_whitelist(m):
    s = m.group()  # the matched word
    if s in whitelist: return s
    else: return s.replace('x', 'i')
s = 'Thxs example'
result = re.sub(r"\w+", replace_except_whitelist, s)
print(result)
Output:
This example
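Note that the membership test above is case-sensitive, so a capitalised "Example", as in the question's sentence, would not be protected by a lowercase whitelist. A sketch of one way around that, comparing the lowercased word, is shown below; using a set for the whitelist is just a choice made for this example.

```python
import re

whitelist = {'example', 'explanation'}


def replace_except_whitelist(match):
    word = match.group()
    # Compare case-insensitively so 'Example' is protected too
    if word.lower() in whitelist:
        return word
    return word.replace('x', 'i')


s = "This is an Example. Thxs code is not what I'm having problems with."
result = re.sub(r'\w+', replace_except_whitelist, s)
print(result)
```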
Sure, but you essentially have to build up a new string out of the parts you want:
>>> s = "This is an Example. Thxs code is not what I'm having problems with."
>>> s[22]
'x'
>>> s[:22] + "i" + s[23:]
"This is an Example. This code is not what I'm having problems with."
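The slicing step can be wrapped in a small helper. The name replace_at is hypothetical, and the sketch assumes the index is within the string.

```python
def replace_at(text, index, new_char):
    """Return a copy of text with the character at index replaced.

    Hypothetical helper; strings are immutable, so a new string is built.
    """
    return text[:index] + new_char + text[index + 1:]


s = "This is an Example. Thxs code is not what I'm having problems with."
fixed = replace_at(s, 22, 'i')
print(fixed)
```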
For information about the notation used here, see good primer for python slice notation.
If you know whether you want to replace the first occurrence of x, or the second, or the third, or the last, you can combine str.find (or str.rfind if you wish to start from the end of the string) with slicing and str.replace. Feed the character you wish to replace to find as many times as needed to reach a position just before the occurrence you want to replace (for the specific sentence you suggest, just once), then slice the string in two and replace only one occurrence in the second slice.
An example is worth a thousand words, or so they say. In the following, I assume you want to substitute the (n+1)th occurrence of the character.
>>> s = "This is an Example. Thxs code is not what I'm having problems with."
>>> n = 1
>>> pos = 0
>>> for i in range(n):
...     pos = s.find('x', pos) + 1
...
>>> s[:pos] + s[pos:].replace('x', 'i', 1)
"This is an Example. This code is not what I'm having problems with."
Note that you need to add an offset to pos, otherwise you will replace the occurrence of x you have just found.
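The loop above can be packaged as a function. In this sketch the name replace_nth and the 1-based counting are assumptions, and the input is returned unchanged when there are fewer than n occurrences.

```python
def replace_nth(text, old, new, n):
    """Replace the n-th (1-based) occurrence of old in text with new.

    Hypothetical helper; returns text unchanged if there is no n-th occurrence.
    """
    pos = -1
    for _ in range(n):
        # Search strictly after the previous hit so it is not found again
        pos = text.find(old, pos + 1)
        if pos == -1:
            return text
    return text[:pos] + new + text[pos + len(old):]


s = "This is an Example. Thxs code is not what I'm having problems with."
print(replace_nth(s, 'x', 'i', 2))
```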