Delete a certain number of zeros from right of a string - python

I'm trying to delete a certain number of zeros from right. For example:
"10101000000"
I want to remove 4 zeros... And get:
"1010100"
I tried string.rstrip("0") and string.strip("0"), but these remove all of the zeros from the right. How can I remove only a given number of them?
The question is not a duplicate because I can't use imports.

You can use a regex
>>> import re
>>> mystr = "10101000000"
>>> numzeros = 4
>>> mystr = re.sub("0{{{}}}$".format(numzeros), "", mystr)
>>> mystr
'1010100'
This will leave the string as is if it doesn't end in four zeros.
You could also check and then slice:
if mystr.endswith("0" * numzeros):
    mystr = mystr[:-numzeros]

For a known number of zeros you can use slicing:
s = "10101000000"
zeros = 4
if s.endswith("0" * zeros):
    s = s[:-zeros]

rstrip deletes all trailing characters that are in the passed set of characters. You can delete exactly four trailing zeros like this:
s = s[:-4] if s[-4:] == "0"*4 else s
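The check-then-slice idea from the answers above can be wrapped in a small helper. This is only a sketch: the name strip_exact_suffix and its parameters are made up for illustration, and the guard for a count of zero is my own addition, since s[:-0] is the same as s[:0] and would return an empty string.

```python
def strip_exact_suffix(s, char="0", count=4):
    """Remove exactly `count` trailing `char`s, or return `s` unchanged."""
    if count <= 0:
        return s  # s[:-0] would be s[:0], i.e. an empty string
    suffix = char * count
    if s.endswith(suffix):
        return s[:-len(suffix)]
    return s


print(strip_exact_suffix("10101000000"))  # 1010100
print(strip_exact_suffix("1010100"))      # 1010100 (only two trailing zeros, so unchanged)
```

No imports are needed, which matches the constraint in the question.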

Here's my solution:
number = "10101000000"
def my_rstrip(number, char, count=4):
    # Remove up to `count` trailing occurrences of `char`, one at a time
    for x in range(count):
        if number.endswith(char):
            number = number[0:-1]
        else:
            break
    return number
print(my_rstrip(number, '0', 4))

>>> s[:-4]+s[-4:].replace('0000','')

Don't forget to convert to str
import re
a = 10101000000
re.sub("0000$","", str(a))

You can cut off the last 4 characters of the string (whatever they are) this way:
string[:-4]

Related

specific characters printing with Python

given a string as shown below,
"[xyx],[abc].[cfd],[abc].[dgr],[abc]"
how to print it like shown below ?
1.[xyx]
2.[cfd]
3.[dgr]
The original string will always maintain the above-mentioned format.
I did not realize you had periods and commas... that adds a bit of trickery. You have to split on the periods too
I would use something like this...
list_to_parse = "[xyx],[abc].[cfd],[abc].[dgr],[abc]"
count = 0
# Note: this numbers every bracketed item, including the [abc] entries
for i in list_to_parse.split('.'):
    for j in i.split(','):
        string = str(count + 1) + "." + j
        if string:
            count += 1
            print(string)
        string = None
Another option is to split on the left bracket and re-add it while numbering with enumerate, then strip commas and periods. This method is probably also a tiny bit faster, as it isn't a loop inside a loop:
list_to_parse = "[xyx],[abc].[cfd],[abc].[dgr],[abc]"
for index, i in enumerate(list_to_parse.split('[')):
    if i:
        print(str(index) + ".[" + i.rstrip(',.'))
Also, the argument to strip is really "which characters to remove", not a specific pattern. You can pass any characters you want removed from the right, and rstrip() will work backwards through the string until it hits a character it can't remove. There are also lstrip() and strip().
String manipulation can always get tricky, so pay attention: the split above produces an empty first element, so index zero isn't printed. Always practice and learn your needs :D
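To make the "set of characters" behaviour concrete, here is a small sketch. The sample strings are made up for illustration; only str.split and str.rstrip are used.

```python
# rstrip() treats its argument as a set of characters, not as a suffix.
mixed = "abc].,.,".rstrip(",.")
print(mixed)  # abc]

# It stops at the first character (from the right) that is not in the set.
stops_early = "a.b,.".rstrip(",.")
print(stops_early)  # a.b

# Splitting on '[' leaves an empty first element when the string starts with '['.
parts = "[xyx],[abc]".split("[")
print(parts)  # ['', 'xyx],', 'abc]']
```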
You can use split() function:
a = "[xyx],[abc].[cfd],[abc].[dgr],[abc]"
desired_strings = [i.split(',')[0] for i in a.split('.')]
for i, string in enumerate(desired_strings):
    print(f"{i+1}.{string}")
This is just a fun way to solve it:
lst = "[xyx],[abc].[cfd],[abc].[dgr],[abc]"
count = 1
var = 1
for char in range(0, len(lst), 6):
    if var % 2:
        print(f"{count}.{lst[char:char + 5]}")
        count += 1
    var += 1
output:
1.[xyx]
2.[cfd]
3.[dgr]
explanation : "[" appears in these indexes: 0, 6, 12, etc. var is for skipping the next pair. count is the counting variable.
Here we can squeeze the above code using list comprehension and slicing instead of those flag variables. It's now more Pythonic:
lst = "[xyx],[abc].[cfd],[abc].[dgr],[abc]"
lst = [lst[i:i+5] for i in range(0, len(lst), 6)][::2]
res = (f"{i}.{item}" for i, item in enumerate(lst, 1))
print("\n".join(res))
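Both slicing versions above rely on every bracketed item being exactly five characters wide with a one-character separator. A quick way to check that assumption on a given input, as a sketch (bracket_positions and first_of_each_pair are names I made up):

```python
lst = "[xyx],[abc].[cfd],[abc].[dgr],[abc]"

# Where does each '[' actually sit?
bracket_positions = [i for i, ch in enumerate(lst) if ch == "["]
print(bracket_positions)  # [0, 6, 12, 18, 24, 30]

# Take every other bracket position and slice five characters from it.
first_of_each_pair = [lst[i:i + 5] for i in bracket_positions[::2]]
print(first_of_each_pair)  # ['[xyx]', '[cfd]', '[dgr]']
```

If the items ever vary in length, the positions stop being multiples of 6 and the fixed-step slicing breaks, whereas the split() and regex answers keep working.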
You can use RegEx:
import re
pattern=r"(\[[a-zA-Z]*\])\,\[[a-zA-Z]*\]\.?"
results=re.findall(pattern, '[xyx],[abc].[cfd],[abc].[dgr],[abc]')
print(results)
Using re.findall:
import re
s = "[xyx],[abc].[cfd],[abc].[dgr],[abc]"
print('\n'.join(f'{i+1}.{x}' for i, x in
                enumerate(re.findall(r'(\[[^]]+\])(?=,)', s))))
Output:
1.[xyx]
2.[cfd]
3.[dgr]

How can I invert a slice?

My code right now
sentence = "Sentence!"
print(*sentence[::3], sep="--")
Output: S--t--c
How can I invert the slice so that the same input would result in -en-en-e!
I've tried -3 and other numbers in place of the 3 in ::3, but none of them work.
Like this:
sentence = 'Sentence!'
import re
tokens = re.findall(r'.(..)', sentence)
print('', '-'.join(tokens), sep='-') # prints: -en-en-e!
Edit: Addressing the question in the comments:
This works, although how can I get this to start on the 3rd letter?
You could try this:
tokens = re.findall(r'(..).?', sentence[2:])
print(*tokens, sep='-')
This will output: nt-nc
Is this what you wanted?
What you're trying to achieve isn't possible using a slice, because the indices you want to keep (1, 2, 4, 5, 7, 8) are not an arithmetic progression.
Since the goal is to replace the first character of every three with a - symbol, the simplest solution I can think of is using a regex:
>>> import re
>>> re.sub(".(.{0,2})", r"-\1", "Sentence!")
'-en-en-e!'
>>> re.sub(".(.{0,2})", r"-\1", "Hello, world!")
'-el-o,-wo-ld-'
The {0,2} means the pattern will match even if the last group doesn't have three letters.
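To see what the {0,2} buys you, compare it with a stricter pattern that insists on two trailing characters. This is just a sketch; the variable names strict and lenient are mine.

```python
import re

text = "Hello, world!"  # 13 characters, so the last group has only one

# Requires exactly two characters after the replaced one: the final "!" is left alone.
strict = re.sub(r".(..)", r"-\1", text)

# Allows zero to two characters after the replaced one: the final "!" is replaced too.
lenient = re.sub(r".(.{0,2})", r"-\1", text)

print(strict)   # -el-o,-wo-ld!
print(lenient)  # -el-o,-wo-ld-
```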
If you want to truly invert the range, then take the indices not in that range:
''.join(sentence[i] if i not in range(0, len(sentence), 3) else '-'
        for i in range(len(sentence)))
Output
'-en-en-e!'
Personally, I prefer the regex solutions.
Another attempt:
sentence = ("Sentence!")
print(''.join(ch if i % 3 else '-' for i, ch in enumerate(sentence)))
Prints:
-en-en-e!
If sentence='Hello, world!':
-el-o,-wo-ld-
You can use slice assignment:
def invert(string, step, sep):
    sentence = list(string)
    sentence[::step] = len(sentence[::step]) * [sep]
    return ''.join(sentence)
print(invert('Sentence!', 3, '*'))
# *en*en*e!
print(invert('Hallo World!', 4, '$'))
# $all$ Wo$ld!

Splitting a string before the nth occurrence of a character [duplicate]

Is there a Python-way to split a string after the nth occurrence of a given delimiter?
Given a string:
'20_231_myString_234'
It should be split into (with the delimiter being '_', after its second occurrence):
['20_231', 'myString_234']
Or is the only way to accomplish this to count, split and join?
>>> text = '20_231_myString_234'
>>> n = 2
>>> groups = text.split('_')
>>> '_'.join(groups[:n]), '_'.join(groups[n:])
('20_231', 'myString_234')
This seems to be the most readable way; the alternative is a regex.
Using re with a regex of the form ^((?:[^_]*_){n-1}[^_]*)_(.*) where n is a variable:
import re
n = 2
s = '20_231_myString_234'
m = re.match(r'^((?:[^_]*_){%d}[^_]*)_(.*)' % (n-1), s)
if m:
    print(m.groups())
or have a nice function:
import re
def nthofchar(s, c, n):
    # c is assumed to be a single character that is not special in a regex
    regex = r'^((?:[^%c]*%c){%d}[^%c]*)%c(.*)' % (c, c, n-1, c, c)
    l = ()
    m = re.match(regex, s)
    if m:
        l = m.groups()
    return l
s = '20_231_myString_234'
print(nthofchar(s, '_', 2))
Or without regexes, using iterative find:
def nth_split(s, delim, n):
    p, c = -1, 0
    while c < n:
        p = s.index(delim, p + 1)  # raises ValueError if there are fewer than n delimiters
        c += 1
    return s[:p], s[p + 1:]
s1, s2 = nth_split('20_231_myString_234', '_', 2)
print(s1, ":", s2)
I like this solution because it works without any actual regex and can easily be adapted to another "nth" or delimiter.
import re
string = "20_231_myString_234"
occur = 2 # on which occurrence you want to split
indices = [x.start() for x in re.finditer("_", string)]
part1 = string[0:indices[occur-1]]
part2 = string[indices[occur-1]+1:]
print (part1, ' ', part2)
I thought I would contribute my two cents. The second parameter to split() lets you limit the number of splits:
def split_at(s, delim, n):
    r = s.split(delim, n)[n]
    return s[:-len(r)-len(delim)], r
On my machine, the two good answers by @perreal, iterative find and regular expressions, actually measure 1.4 and 1.6 times slower (respectively) than this method.
It's worth noting that it can become even quicker if you don't need the initial bit. Then the code becomes:
def remove_head_parts(s, delim, n):
    return s.split(delim, n)[n]
Not so sure about the naming, I admit, but it does the job. Somewhat surprisingly, it is 2 times faster than iterative find and 3 times faster than regular expressions.
I put up my testing script online. You are welcome to review and comment.
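The original testing script is not reproduced here. As a rough idea of how such a comparison could be set up, here is a hypothetical harness using timeit; regex_split is my own condensed version of the regex answer, and the timings will differ from machine to machine, so no numbers are claimed.

```python
import re
import timeit


def split_at(s, delim, n):
    r = s.split(delim, n)[n]
    return s[:-len(r) - len(delim)], r


def nth_split(s, delim, n):
    p, c = -1, 0
    while c < n:
        p = s.index(delim, p + 1)
        c += 1
    return s[:p], s[p + 1:]


def regex_split(s, delim, n):
    # delim is assumed to be a single character that is not special in a regex
    pattern = r'^((?:[^%s]*%s){%d}[^%s]*)%s(.*)' % (delim, delim, n - 1, delim, delim)
    return re.match(pattern, s).groups()


sample = '20_231_myString_234'
for func in (split_at, nth_split, regex_split):
    seconds = timeit.timeit(lambda: func(sample, '_', 2), number=20_000)
    print(f"{func.__name__}: {seconds:.4f}s for 20,000 calls")
```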
>>> import re
>>> s = '20_231_myString_234'
>>> occurrence = [m.start() for m in re.finditer('_', s)]  # positions of every '_'
>>> occurrence
[2, 6, 15]
>>> result = [s[:occurrence[1]], s[occurrence[1]+1:]]  # [s[:6], s[7:]]
>>> result
['20_231', 'myString_234']
It depends on the pattern behind the split. If, for example, the first two elements are always numbers, you can build a regular expression and use the re module, which can split your string as well.
I had a larger string to split at every nth separator and ended up with the following code:
# Split at every 6th space
n = 6
sep = ' '
n_split_groups = []
groups = err_str.split(sep)  # err_str is the input string
while len(groups):
    n_split_groups.append(sep.join(groups[:n]))
    groups = groups[n:]
print(n_split_groups)
Thanks @perreal!
In function form of @AllBlackt's solution
def split_nth(s, sep, n):
    n_split_groups = []
    groups = s.split(sep)
    while len(groups):
        n_split_groups.append(sep.join(groups[:n]))
        groups = groups[n:]
    return n_split_groups
s = "aaaaa bbbbb ccccc ddddd eeeeeee ffffffff"
print (split_nth(s, " ", 2))
['aaaaa bbbbb', 'ccccc ddddd', 'eeeeeee ffffffff']
As @Yuval has noted in his answer, and @jamylak commented in his answer, the split and rsplit methods accept a second (optional) parameter maxsplit to avoid making splits beyond what is necessary. Thus, I find the better solution (both for readability and performance) is this:
text = '20_231_myString_234'
# rsplit counts from the right, so this relies on the string having exactly three '_'
first_part = text.rsplit('_', 2)[0] # Gives '20_231'
second_part = text.split('_', 2)[2] # Gives 'myString_234'
This is not only simple, but also avoids performance hits of regex solutions and other solutions using join to undo unnecessary splits.
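For strings with an arbitrary number of delimiters, the maxsplit idea can be combined with a single join. The following is a sketch under my own assumptions: the function name split_at_nth is made up, and raising ValueError when there are fewer than n delimiters is a design choice of mine, not something the answers above specify.

```python
def split_at_nth(text, delim, n):
    """Split `text` at the n-th occurrence of `delim` (1-based)."""
    parts = text.split(delim, n)  # at most n splits, so at most n + 1 pieces
    if len(parts) <= n:
        raise ValueError(f"{delim!r} occurs fewer than {n} times in {text!r}")
    # Re-join only the first n short pieces; the tail was never split further.
    return delim.join(parts[:n]), parts[n]


print(split_at_nth('20_231_myString_234', '_', 2))  # ('20_231', 'myString_234')
print(split_at_nth('a_b_c_d_e', '_', 3))            # ('a_b_c', 'd_e')
```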

Extract a number from string in python

I want to extract a number from a string like this in Python:
string1 = '154787xs.txt'
I want to get 154787 from there. I am using this:
searchPattern = re.compile(r'\d\d\d\d\d\d(?=xs)')
m = searchPattern.search(string1)
number = m.group()
but I do not get the correct value. Also the number of digits could change...
What am I doing wrong?
You could simply use the pattern below:
searchPattern = re.compile(r'\d+(?=xs)')
Explanation:
\d+ matches one or more digits.
(?=xs) is a lookahead asserting that the characters following the digits must be xs.
Code:
>>> import re
>>> searchPattern = re.compile(r'\d+(?=xs)')
>>> m = searchPattern.search(string1)
>>> m
<_sre.SRE_Match object at 0x7f6047f66370>
>>> number = m.group()
>>> number
'154787'
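Since the question says the number of digits can change, here is a short sketch of the same lookahead pattern applied to a few file names. The helper name number_before_xs and the sample names other than 154787xs.txt are invented for illustration, and returning None on no match is my own choice.

```python
import re

pattern = re.compile(r"\d+(?=xs)")


def number_before_xs(filename):
    """Return the digits directly before 'xs' as an int, or None if absent."""
    m = pattern.search(filename)
    return int(m.group()) if m else None


print(number_before_xs("154787xs.txt"))  # 154787
print(number_before_xs("42xs.txt"))      # 42
print(number_before_xs("notes.txt"))     # None
```

Checking for None before calling group() avoids an AttributeError when the pattern does not match.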
What do you mean when you say you do not get the right value?
Your code does successfully match the string '154787'.
Perhaps you want number to be an int? In that case use:
number = int(m.group())
By the way, the regex could be written as
searchPattern = re.compile(r'(\d+)xs')
m = searchPattern.search(string1)
if m:
    number = int(m.group(1))

String formatting by inserting ':'

I have a string '0000000000000201' in python
dpid_string = '0000000000000201'
Which is the best way to convert this to the following string
00:00:00:00:00:00:02:01
You'd partition the string into chunks of size 2, and join them with str.join():
':'.join([dpid_string[i:i + 2] for i in range(0, len(dpid_string), 2)])
Demo:
>>> dpid_string = '0000000000000201'
>>> ':'.join([dpid_string[i:i + 2] for i in range(0, len(dpid_string), 2)])
'00:00:00:00:00:00:02:01'
seq = '0000000000000201'
length = 2
":".join([seq[i:i+length] for i in range(0, len(seq), length)])
Although not very simple, you can do
dpid_string = '0000000000000201'
''.join([':' + char if not i % 2 else char for i, char in enumerate(dpid_string)])[1:]
To break it down from within the list comprehension:
[char for char in dpid_string] just loops over characters and returns them as a list.
We want it to return a string, so we join the full list using ''.join(list).
Now we want it to react on the location of the character, so we want to assess the index. Therefore we use i, value in enumerate(list)
If this index is even, add a colon before the char (i % 2 is 0, which is falsy, so not i % 2 is True).
Now this leaves us with a colon at index 0, we remove it by indexing [1:]
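The steps above can be followed one at a time by naming the intermediate values. This is only a sketch; pieces and joined are names I introduced.

```python
dpid_string = '0000000000000201'

# Prefix every even-indexed character with a colon.
pieces = [':' + char if not i % 2 else char for i, char in enumerate(dpid_string)]
print(pieces[:4])  # [':0', '0', ':0', '0']

# Join the list back into one string; note the unwanted leading colon.
joined = ''.join(pieces)
print(joined)      # :00:00:00:00:00:00:02:01

# Drop the leading colon.
print(joined[1:])  # 00:00:00:00:00:00:02:01
```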
An alternative using re.sub:
import re
dpid_string = '0000000000000201'
subbed = re.sub('(..)(?!$)', r'\1:', dpid_string)
# 00:00:00:00:00:00:02:01
Read as take every 2 characters that aren't at the end of the string, and replace it with those two characters followed by :.
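To see why the (?!$) lookahead matters, compare the substitution with and without it. Again a sketch, with variable names of my own choosing:

```python
import re

dpid_string = '0000000000000201'

# Without the lookahead, the last pair also gets a colon appended.
without_lookahead = re.sub('(..)', r'\1:', dpid_string)
print(without_lookahead)  # 00:00:00:00:00:00:02:01:

# With (?!$), a pair that sits at the very end of the string is skipped.
with_lookahead = re.sub('(..)(?!$)', r'\1:', dpid_string)
print(with_lookahead)     # 00:00:00:00:00:00:02:01

# An odd-length input leaves the final single character on its own.
odd_length = re.sub('(..)(?!$)', r'\1:', '12345')
print(odd_length)         # 12:34:5
```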
