partition string in python and get value of last segment after colon

I need to get the value after the last colon; in this example, that is 1234567:
client:user:username:type:1234567
I don't need anything else from the string, just the last id value.
To split on the first occurrence instead, see Splitting on first occurrence.

result = mystring.rpartition(':')[2]
If your string does not have any :, the result will contain the original string.
An alternative that is supposed to be a little bit slower is:
result = mystring.split(':')[-1]
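A small sketch of what the two calls return, including the no-colon case mentioned above. The helper name last_segment is made up for illustration and is not part of the original answer.

```python
def last_segment(value: str, sep: str = ":") -> str:
    """Return the text after the last `sep`, or the whole string if `sep` is absent."""
    # rpartition always returns a 3-tuple: (head, separator, tail)
    return value.rpartition(sep)[2]


mystring = "client:user:username:type:1234567"

print(mystring.rpartition(":"))       # ('client:user:username:type', ':', '1234567')
print(last_segment(mystring))         # 1234567
print(mystring.split(":")[-1])        # 1234567
print(last_segment("no-colon-here"))  # no-colon-here
```

Both forms give the same result here; rpartition stops at the last separator, while split cuts at every colon before the last item is picked.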

foo = "client:user:username:type:1234567"
last = foo.split(':')[-1]

Use this:
"client:user:username:type:1234567".split(":")[-1]

You could also use pygrok.
from pygrok import Grok
text = "client:user:username:type:1234567"
pattern = """%{BASE10NUM:type}"""
grok = Grok(pattern)
print(grok.match(text))
returns
{'type': '1234567'}

Related

remove gibberish prefix from a string

a = "aajfkdfvf_valid_name0"
b = "gdhdhsdsdeeeeex_valid_name1"
How do I remove the gibberish from my string before valid so that I have something like this -
valid_name0
valid_name1
If your strings always contain the word valid, you can try something like this:
a = "aajfkdfvf_valid_name0"
b = "gdhdhsdsdeeeeex_valid_name1"
for s in (a, b):
    print(s[s.rfind('valid'):])
Because rfind locates the last occurrence, the output is correct even if the prefix contains _ or the substring valid. However, if the part you want to keep contains the word valid more than once, this will not work.
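One more edge case worth checking: rfind returns -1 when the marker is missing, and slicing from -1 keeps only the last character. The sketch below adds a guard for that; strip_prefix is a hypothetical helper name, and returning the string unchanged is just one possible choice.

```python
def strip_prefix(s: str, marker: str = "valid") -> str:
    """Drop everything before the last occurrence of `marker` (hypothetical helper)."""
    pos = s.rfind(marker)
    if pos == -1:
        # Without this guard, s[-1:] would silently return only the last character.
        return s
    return s[pos:]


print(strip_prefix("aajfkdfvf_valid_name0"))   # valid_name0
print(strip_prefix("valid_junk_valid_name1"))  # valid_name1
print(strip_prefix("xx_valid_name_valid"))     # valid  (marker repeated in the wanted part)
print(strip_prefix("no_marker_here"))          # no_marker_here
```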
We can try using re.sub here:
import re
a = "aajfkdfvf_valid_name0"
b = "gdhdhsdsdeeeeex_valid_name1"
inp = [a, b]
output = [re.sub(r'^[^_]+_', '', i) for i in inp]
print(output) # ['valid_name0', 'valid_name1']
You can use a split join approach for this.
Try this:
a = "aajfkdfvf_valid_name0"
valid_a = '_'.join(a.split('_')[1:])
# 'valid_name0'
# can use maxsplit to split only once at the first _ and then take the remaining part of the string
another_valid_a = a.split('_',1)[1]
# valid_name0
This splits the original string at each _, discards the first element, and joins the remaining parts back together with _.
The other approaches seem a bit too over-engineered for this task, at least in my opinion.
If you already know that the gibberish comes before the first underscore _ character, you can just do a single str.split and discard the first split result:
a = "aajfkdfvf_valid_name0"
b = "gdhdhsdsdeeeeex_valid_name1"
def clean_string(s: str) -> str:
    return s.split('_', 1)[1]
print(clean_string(a)) # valid_name0
print(clean_string(b)) # valid_name1
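The regex, split/join and maxsplit variants agree on the sample inputs but behave differently when there is no underscore at all. A quick side-by-side check, assuming Python 3 (the variable names are illustrative only):

```python
import re

samples = ["aajfkdfvf_valid_name0", "gdhdhsdsdeeeeex_valid_name1"]

for s in samples:
    by_regex = re.sub(r"^[^_]+_", "", s)
    by_join = "_".join(s.split("_")[1:])
    by_maxsplit = s.split("_", 1)[1]
    # All three give the same answer when an underscore is present
    assert by_regex == by_join == by_maxsplit
    print(by_maxsplit)

plain = "nounderscore"
regex_result = re.sub(r"^[^_]+_", "", plain)  # pattern does not match: string unchanged
join_result = "_".join(plain.split("_")[1:])  # nothing after the first element: empty string
print(repr(regex_result))  # 'nounderscore'
print(repr(join_result))   # ''
try:
    plain.split("_", 1)[1]
except IndexError:
    print("split('_', 1)[1] raises IndexError")
```

Which behaviour is preferable depends on whether a missing underscore should count as an error in your data.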
If you're sure that a single '_' is all you need to split on, a string split will do:
fixed_a = '_'.join(a.split('_')[1:])
In the worst case, this pattern is not the only one you're dealing with. Then you need to know exactly what your 'valid_name' looks like, so that you can write a regex for it.
Check for standards, patterns and the like.
I'm pretty sure that if there is a pattern, a regex can handle it.
I recommend this site to do so.

Python String Replacement Not working As expected

Please explain what is going on with this:
'aaaaaaaaaaa'.replace('aaa','')
Output:
'aa'
I expected only 3 'aaa' to be replaced in the original string. Please suggest an explanation or better approach.
'aaaaaaaaaaa'.replace('aaa','',1)
Apparently the function accepts the number of replacements as a third argument!
The official documentation of the replace method states:
Return a copy of the string with all occurrences of substring old replaced by new. If the optional argument count is given, only the first count occurrences are replaced.
If you want to replace only the first occurrence of "aaa", write:
'aaaaaaaaaaa'.replace('aaa', '', 1)
The .replace() function will replace every instance of your first parameter with your second parameter. Since your original String contained 11 characters, 3 sets of "aaa" were replaced by 3 sets of "", leaving only "aa" behind.
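The arithmetic can be checked directly. This sketch assumes nothing beyond the built-in str methods: matches are found left to right without overlapping, so eleven characters hold three copies of "aaa" plus two leftover characters.

```python
text = "a" * 11  # 'aaaaaaaaaaa'

# Non-overlapping matches, scanned left to right: aaa|aaa|aaa|aa
print(text.count("aaa"))           # 3
print(text.replace("aaa", ""))     # aa
print(text.replace("aaa", "-"))    # ---aa
print(text.replace("aaa", "", 1))  # aaaaaaaa
```

Replacing with a visible character such as "-" makes it easy to see where each match was taken.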
If you only want to replace one set of "aaa", we can use a different approach using indexing and slicing:
Using the .index() function we can find the first instance of "aaa". Now, we can simply remove this section from our string:
x = 'aaaaaaaaaaa'
index = x.index('aaa')
x = x[:index] + x[index + 3:]
print(x)
I hope this helped! Please let me know if you need any further details or clarification :)

how to get second last and last value in a string after separator in python

In Python, how do you get the last and second-last elements of a string?
string "client_user_username_type_1234567"
expected output : "type_1234567"
Try this :
>>> s = "client_user_username_type_1234567"
>>> '_'.join(s.split('_')[-2:])
'type_1234567'
You can also use re.findall:
import re
s = "client_user_username_type_1234567"
result = re.findall(r'[a-zA-Z]+_\d+$', s)[0]
Output:
'type_1234567'
There's no single built-in function that will do this for you; you have to combine what Python gives you, and for that I present:
split, slice and join
"_".join("one_two_three".split("_")[-2:])
In steps:
Split the string by the common separator, "_"
s.split("_")
Slice the list so that you get the last two elements by using a negative index
s.split("_")[-2:]
Now you have a list made up of the last two elements; join that list back together with the separator "_" so that it looks like the original string.
"_".join("one_two_three".split("_")[-2:])
That's pretty much it. Another way to investigate is through regex.
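A variation on the same split, slice and join idea: str.rsplit with maxsplit=2 cuts at most twice starting from the right, so the leading part is not broken up. The helper name last_two is hypothetical.

```python
def last_two(s: str, sep: str = "_") -> str:
    """Return the last two sep-separated fields, joined again by sep."""
    # rsplit with maxsplit=2 yields at most three pieces: [rest, second_last, last]
    return sep.join(s.rsplit(sep, 2)[-2:])


s = "client_user_username_type_1234567"
print(s.rsplit("_", 2))     # ['client_user_username', 'type', '1234567']
print(last_two(s))          # type_1234567
print(last_two("one_two"))  # one_two
print(last_two("single"))   # single
```

Strings with fewer than two separators come back unchanged, because the slice simply takes whatever pieces exist.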

How to get sub string from a string in python using split or regex

I have a str in python like below. I want extract a substring from it.
table='abc_test_01'
number=table.split("_")[1]
I am getting test as a result.
What I want is everything after the first _.
The result I want is test_01 how can I achieve that.
Here is the code already given in several answers:
table='abc_test_01'
number=table.split("_",1)[1]
But this fails when the separator is not in the string; you then get IndexError: list index out of range.
For example:
table='abctest01'
number=table.split("_",1)[1]
This raises IndexError, because the separator does not occur in the string.
So the more robust code for handling this is
table.split("_",1)[-1]
Using -1 does no harm here: because maxsplit is set to one, the list has at most two elements, so the last element is either the part after the first _ or, when there is no _, the whole string.
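To see why the [-1] index is safe, it helps to print the intermediate list. The function after_first below is a made-up name for this sketch; note that with this approach a missing separator is indistinguishable from a successful split, so test `sep in s` first if you need to tell the cases apart.

```python
def after_first(s: str, sep: str = "_") -> str:
    """Everything after the first sep, or the whole string if sep is absent."""
    return s.split(sep, 1)[-1]


for table in ("abc_test_01", "abctest01"):
    parts = table.split("_", 1)
    print(parts, "->", after_first(table))
# ['abc', 'test_01'] -> test_01
# ['abctest01'] -> abctest01
```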
Hope it helps :)
To get the substring (all characters after the first occurrence of underscore):
number = table[table.index('_')+1:]
# Output: test_01
You could do it like:
import re
string = "abc_test_01"
rx = re.compile(r'[^_]*_(.+)')
match = rx.match(string).group(1)
print(match)
Or with normal string functions:
string = "abc_test_01"
match = '_'.join(string.split('_')[1:])
print(match)
Nobody mentions that the split() function can take a maxsplit argument:
str.split(sep=None, maxsplit=-1)
Return a list of the words in the string, using sep as the delimiter string. If maxsplit is given, at most maxsplit splits are done (thus, the list will have at most maxsplit+1 elements).
So the solution is only:
table.split('_', 1)[1]
You can try this:
Edit: Thanks to #valtah's comment:
table = 'abc_test_01'
#final = "_".join(table.split("_")[1:])
final = table.split("_", 1)[1]
print(final)
Output:
test_01
Also the answer of #valtah in the comment is correct:
final = table.partition("_")[2]
print(final)
This outputs the same result.
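The partition and index variants differ from split when the underscore is missing, which may matter for real table names. A short illustration, assuming Python 3:

```python
table = "abc_test_01"
head, sep, tail = table.partition("_")
print((head, sep, tail))  # ('abc', '_', 'test_01')

missing = "abctest01"
# No separator: the whole string lands in the first slot, the other two are empty
print(missing.partition("_"))  # ('abctest01', '', '')

try:
    missing[missing.index("_") + 1:]
except ValueError:
    # str.index raises instead of returning -1 (str.find would return -1)
    print("index() raised ValueError")
```

So partition("_")[2] yields an empty string for input without an underscore, while the index-based slice raises.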

how to get the same required string with better and shorter way

s = 'myName.Country.myHeight'
required = s.split('.')[0]+'.'+s.split('.')[1]
print(required)
myName.Country
How can I get the same 'required' string with better and shorter way?
Use str.rpartition like this
s = 'myName.Country.myHeight'
print(s.rpartition(".")[0])
# myName.Country
rpartition returns a three-element tuple:
the first element is the string before the separator,
then the separator itself,
and then the string after the separator.
So, in our case,
s = 'myName.Country.myHeight'
print(s.rpartition("."))
# ('myName.Country', '.', 'myHeight')
And we have picked only the first element.
Note: If you want to do it from the left, instead of doing it from the right, we have a sister function called str.partition.
You have a few options.
1
print(s.rsplit('.', 1)[0])
2
print(s[:s.rfind('.')])
3
print(s.rpartition('.')[0])
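The three options agree on the example string but not on input without a dot, which is worth knowing before picking one. A sketch in Python 3 syntax (the second test string is made up):

```python
s = "myName.Country.myHeight"
print(s.rsplit(".", 1)[0])   # myName.Country
print(s[:s.rfind(".")])      # myName.Country
print(s.rpartition(".")[0])  # myName.Country

plain = "myName"  # no dot at all
print(plain.rsplit(".", 1)[0])         # myName
print(plain[:plain.rfind(".")])        # myNam  (rfind returns -1, so the last character is cut)
print(repr(plain.rpartition(".")[0]))  # ''
```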
Well, that seems just fine to me... But here are a few other ways I can think of:
required = ".".join(s.split(".")[0:2])  # only one split
# using regular expressions
import re
required = re.sub(r"\.[^.]+$", "", s)
The regex only works if there are no dots in the last part you want to split off.
