clear and comprehensible way to calculate the string [12:3] - python

I'm new to Python.
I have the string "[12:3]" and I want to calculate the difference between the two numbers.
Ex: 12 - 3 = 9
Of course I can do something (not very clear) like this:
num1 = []
num2 = []
s = '[12:3]'
dot = 0
# find the ':' sign
for i in range(len(s)):
    if s[i] == ':':
        dot = i
# left side (skip the opening '[')
for i in range(1, dot):
    num1.append(s[i])
# right side (skip the closing ']')
for i in range(dot + 1, len(s) - 1):
    num2.append(s[i])
print(int("".join(num1)) - int("".join(num2)))
But I'm sure there is a clearer and more comprehensible way.
Thanks!

You could use regex to pick the numbers out of your string:
import re
s = '[12:3]'
numbers = [int(x) for x in re.findall(r'\d+',s)]
print(numbers[0] - numbers[1])

Or, without re
numbers = [int(x) for x in s.strip('[]').split(':')]
print(numbers[0] - numbers[1])
prints
9

You should use regular expressions.
>>> import re
>>> match = re.match(r'\[(\d+):(\d+)\]', '[12:3]')
>>> match.groups()
('12', '3')
>>> a = int(match.groups()[0])
>>> b = int(match.groups()[1])
>>> a - b
9
The regular expression says: "starting at the beginning of the string, find [, then one or more digits \d+ (and store them), then a :, then one or more digits \d+ (and store them), and finally ]". We then extract the stored digits using .groups() and do arithmetic on them.
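As a minimal sketch, the same pattern can be wrapped in a function that also copes with input that doesn't match. The name bracket_difference and the choice to raise ValueError are assumptions made for illustration; they are not part of the answer above.

```python
import re

# Hypothetical helper: parse "[a:b]" and return a - b.
_PAIR = re.compile(r'\[(\d+):(\d+)\]')

def bracket_difference(s):
    # fullmatch requires the whole string to fit the pattern.
    match = _PAIR.fullmatch(s)
    if match is None:
        # The re functions return None when the pattern does not match.
        raise ValueError("expected a string like '[12:3]', got %r" % s)
    a, b = map(int, match.groups())
    return a - b

print(bracket_difference('[12:3]'))  # 9
```

Checking for None first avoids the AttributeError you would otherwise get from calling .groups() on a failed match.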

Related

Extract substring from a python string

I want to extract the string before the 9 digit number below:
tmp = "place1_128017000_gw_cl_mask.tif"
The output should be place1
I could do this:
tmp.split('_')[0]
but I also want the solution to work for:
tmp = "place1_place2_128017000_gw_cl_mask.tif"
where the result would be:
place1_place2
You can assume that the number will always be 9 digits long.
Using a regular expression with a lookahead, this is a simple solution:
import re
tmp = "place1_place2_128017000_gw_cl_mask.tif"
m = re.search(r'.+(?=_\d{9}_)', tmp)
print(m.group())
Result:
place1_place2
Note that \d{9} matches exactly 9 digits. The part of the regex inside (?= ... ) is a lookahead: it is not included in the match itself, but the match only succeeds if that text follows it.
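To see the lookahead on both file names from the question, here is a small sketch. The function name prefix_before_number and returning None when nothing matches are assumptions for illustration.

```python
import re

def prefix_before_number(name):
    # (?=_\d{9}_) checks that "_<9 digits>_" follows, without consuming it,
    # so the underscore and the digits are not part of m.group().
    m = re.search(r'.+(?=_\d{9}_)', name)
    return m.group() if m else None

print(prefix_before_number("place1_128017000_gw_cl_mask.tif"))         # place1
print(prefix_before_number("place1_place2_128017000_gw_cl_mask.tif"))  # place1_place2
print(prefix_before_number("no_number_here.tif"))                      # None
```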
Assuming we can phrase your problem as wanting the substring up to, but not including, the underscore that is followed by a run of digits, we can try:
import re
tmp = "place1_place2_128017000_gw_cl_mask.tif"
m = re.search(r'^([^_]+(?:_[^_]+)*)_\d+_', tmp)
print(m.group(1)) # place1_place2
Use a regular expression:
import re
places = (
"place1_128017000_gw_cl_mask.tif",
"place1_place2_128017000_gw_cl_mask.tif",
)
pattern = re.compile(r"(place\d+(?:_place\d+)*)_\d{9}")
for p in places:
    matched = pattern.match(p)
    if matched:
        print(matched.group(1))
prints:
place1
place1_place2
The regex works like this (adjust as needed, e.g., for fewer than 9 digits or a variable number of digits):
( starts a capture
place\d+ matches "place" plus one or more digits
(?: starts a group, but does not capture it (no need to capture)
_place\d+ matches more "places"
) closes the group
* means zero or more repetitions of the previous group
) closes the capture
_\d{9} matches an underscore followed by 9 digits
The result is in the first (and only) capture group.
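The same breakdown can live inside the pattern itself by compiling it with re.VERBOSE, which ignores whitespace in the pattern and allows # comments. This is only a sketch of an alternative way to write the pattern above, not a different regex.

```python
import re

pattern = re.compile(r"""
    (                   # start the capture
      place\d+          # 'place' plus one or more digits
      (?:_place\d+)*    # zero or more further '_placeN' parts, not captured
    )                   # close the capture
    _\d{9}              # an underscore followed by exactly nine digits
""", re.VERBOSE)

m = pattern.match("place1_place2_128017000_gw_cl_mask.tif")
print(m.group(1))  # place1_place2
```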
Here's a possible solution without regex (unoptimized!):
def extract(s):
    result = ''
    for x in s.split('_'):
        # Stop at the first piece that is exactly nine digits.
        if x.isdigit() and len(x) == 9:
            return result[:-1]  # drop the trailing underscore
        result += x + '_'
    return None  # no nine-digit number found
tmp = 'place1_128017000_gw_cl_mask.tif'
tmp2 = 'place1_place2_128017000_gw_cl_mask.tif'
print(extract(tmp)) # place1
print(extract(tmp2)) # place1_place2

String incrementation

I've just started to learn Python and I'm doing some exercises on Codewars. The instructions are simple: if the string already ends with a number, the number should be incremented by 1.
If the string does not end with a number, the number 1 should be appended to the new string.
I wrote this:
if strng[-1].isdigit():
    return strng.replace(strng[-1],str(int(strng[-1])+1))
else:
    return strng + "1"
return(strng)
It works sometimes (for example 'foobar001' -> 'foobar002', 'foobar' -> 'foobar1'). But in other cases it adds 1 to each matching digit (for example 'foobar11' -> 'foobar22'). I would like code that adds 1 only to the ending number, for example 'foobar99' -> 'foobar100', so the number has to be considered as a whole. I would be grateful for advice for a beginner :)!
First, you have to make some assumptions.
Assume that the numerical value is always at the end of the string, and that the first character from the right that is not a digit marks the end of the non-numeric part, i.e.
>>> input = "foobar123456"
>>> output = 123456 + 1
Second, we need to assume that a number exists at the end of the string.
If we encounter a string without a number, we need to decide whether the code should throw an error instead of trying to add 1:
>>> input = "foobar"
Or we decide to append a 0 digit automatically, which would require something like
input = input if input[-1].isdigit() else input + "0"
Let's assume the latter for simplicity of explanation.
Next, we read the characters from the right until we get to a non-digit.
Let's use reversed() to flip the string and a for loop to read the characters until we reach a non-digit, i.e.
>>> s = "foobar123456"
>>> output = 123456
>>> for character in reversed(s):
...     if not character.isdigit():
...         break
...     else:
...         print(character)
...
6
5
4
3
2
1
Now, let's use a list to keep the digit characters:
>>> digits_in_reverse = []
>>> for character in reversed(s):
...     if not character.isdigit():
...         break
...     else:
...         digits_in_reverse.append(character)
...
>>> digits_in_reverse
['6', '5', '4', '3', '2', '1']
Then we reverse it:
>>> ''.join(reversed(digits_in_reverse))
'123456'
And convert it into an integer:
>>> int(''.join(reversed(digits_in_reverse)))
123456
Now the +1 increment would be easy!
How do we find the string preceding the number?
# The input string.
s = "foobar123456"
s = s if s[-1].isdigit() else s + "0"
# Keep a list of the digits in reverse.
digits_in_reverse = []
# Iterate through each character from the right.
for character in reversed(s):
    # If we meet a character that is not a digit, stop.
    if not character.isdigit():
        break
    # Otherwise, keep collecting the digits.
    else:
        digits_in_reverse.append(character)
# Reverse the reversed digits, then convert them into an integer.
number_str = "".join(reversed(digits_in_reverse))
number = int(number_str)
print(number)
# End of the string preceding the number.
end = s.rindex(number_str)
print(s[:end])
# Increment by 1.
print(s[:end] + str(number + 1))
[output]:
123456
foobar
foobar123457
Bonus: Can you do it with a one-liner?
Not exactly one line, but close:
import itertools
s = "foobar123456"
s = s if s[-1].isdigit() else s + "0"
number_str = "".join(itertools.takewhile(lambda ch: ch.isdigit(), reversed(s)))[::-1]
end = s.rindex(number_str)
print(s[:end] + str(int(number_str) + 1))
Bonus: But how about regex?
Yeah, with regex it's pretty magical. You still make the same assumptions as before, and to keep the regex as simple as possible you add another one: the characters preceding the number can only be a-z or A-Z.
Then you can do this:
import re
s = "foobar123456"
s = s if s[-1].isdigit() else s + "0"
alpha, numeric = re.match(r"([a-zA-Z]+)(\d+)", s).groups()
print(alpha + str(int(numeric) + 1))
But you have to understand the regex, which might be a steep learning curve; see https://regex101.com/r/9iiaCW/1
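For comparison, here is a sketch that finds the prefix without a manual loop or a regex, by stripping the trailing digits with str.rstrip. This is a different technique from the ones in this answer, and the helper name split_trailing_number is made up for illustration. Like the code above, it does not preserve leading zeros.

```python
import string

def split_trailing_number(s):
    # rstrip removes every trailing character found in string.digits ("0123456789").
    head = s.rstrip(string.digits)
    # Whatever was stripped is the numeric suffix (possibly empty).
    tail = s[len(head):]
    return head, tail

head, tail = split_trailing_number("foobar123456")
print(head, tail)                  # foobar 123456
print(head + str(int(tail) + 1))   # foobar123457
```

When the string has no trailing number, tail is the empty string, so the caller has to decide what to do (for example, treat it as 0) before calling int().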
One simple solution would be:
Keep two initially empty variables, head (the non-numeric prefix) and tail (the numeric suffix). Iterate over the string normally, from left to right. If the current character is a digit, add it to tail. Otherwise, append tail and the current character to head, and empty tail. Once complete, increment tail and return head + tail:
def foo(s):
    head = tail = ''
    for char in s:
        if char.isdigit():
            tail += char
        else:
            head += tail + char
            tail = ''
    tail = int(tail or '0')
    return head + str(tail + 1)
Leading zeroes (x001 -> x002), if needed, are left as an exercise ;)
First check whether the string is alphanumeric. If it is, check whether the last character is a digit.
If both conditions hold, find the index of the first digit of the number at the end of the string.
Once you have the index, separate the character part from the numeric part.
Then convert the numeric part to an integer and add 1. Finally, join the character part and the new number; that is your answer.
string = 'randomstring2345'
index = len(string) - 1
if string.isalnum() and string[-1].isdigit():
    # Walk left while the characters are digits.
    while index >= 0 and string[index].isdigit():
        index -= 1
    # Step back onto the first digit of the trailing number.
    index += 1
    char_part = string[:index]
    int_part = string[index:]
    modified_int = int(int_part) + 1
    new_string = char_part + str(modified_int)
    print(new_string)
output
randomstring2346
Regex can be a useful tool in Python. Here I make two groups: the first, (.*?), matches as little of anything as possible, while the second, (\d*$), matches as many digits at the end of the string as possible. For a more in-depth explanation see regexr.
import re
def increment(s):
    word, digits = re.match(r'(.*?)(\d*$)', s).groups()
    digits = str(int(digits) + 1).zfill(len(digits)) if digits else '1'
    return word + digits
print(increment('foobar001'))
print(increment('foobar009'))
print(increment('foobar19'))
print(increment('foobar20'))
print(increment('foobar99'))
print(increment('foobar'))
print(increment('1a2c1'))
print(increment(''))
print(increment('01'))
Output:
foobar002
foobar010
foobar20
foobar21
foobar100
foobar1
1a2c2
1
02
Source
def solve(data):
    result = None
    if len(data) == 0 or not data[-1].isdigit():
        result = data + str(1)  # appending 1
    else:
        lin = 0
        for index, ch in enumerate(data[::-1]):
            if ch.isdigit():
                lin = len(data) - index - 1
            else:
                break
        result = data[0:lin] + str(int(data[lin:]) + 1)  # incrementing result
    return result
print(solve("Hey123"))
print(solve("aaabbbzzz"))
output :
Hey124
aaabbbzzz1

Finding position of number in string

I would like to separate the letters from the numbers, like this:
inp = "AE123"
p = # position where the number starts, in this case 2
I've already tried str.find(), but it has a limit of 3.
Extracting the letters and the digits
If the goal is to extract both the letters and the digits, regular expressions can solve the problem directly without the need for indices or slices:
>>> import re
>>> re.match(r'([A-Za-z]+)(\d+)', inp).groups()
('AE', '123')
Finding the position of the number
If needed, regular expressions can also locate the indices for the match.
>>> import re
>>> inp = "AE123"
>>> mo = re.search(r'\d+', inp)
>>> mo.span()
(2, 5)
>>> inp[2 : 5]
'123'
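If only the starting position is needed, the match object's start() method gives it directly. A minimal sketch; the function name first_digit_index and the -1 return value for "no digits" (mirroring str.find) are assumptions for illustration.

```python
import re

def first_digit_index(text):
    # re.search returns None when the text contains no digit.
    mo = re.search(r'\d', text)
    return mo.start() if mo else -1

inp = "AE123"
p = first_digit_index(inp)
print(p)                 # 2
print(inp[:p], inp[p:])  # AE 123
```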
You can run a loop that checks for digits:
for p, c in enumerate(inp):
    if c.isdigit():
        break
print(p)
Find out more about str.isdigit
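One thing to watch with the loop above: if the string contains no digit, p is left at the last index, and if the string is empty, p is never assigned. A hedged sketch of a variant that makes the "not found" case explicit, using next() with a default (the function name first_digit_position is made up):

```python
def first_digit_position(inp):
    # next() returns the first index whose character is a digit,
    # or the default (None) when the generator is exhausted.
    return next((i for i, c in enumerate(inp) if c.isdigit()), None)

print(first_digit_position("AE123"))  # 2
print(first_digit_position("AE"))     # None
```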
This should work:
for i in range(len(inp)):
    if inp[i].isdigit():
        p = i
        break
# Assuming all letters come before the first numeral, as mentioned in the question
def findWhereNoStart(string):
    start_index = -1
    for char in string:
        start_index += 1
        if char.isdigit():
            return string[start_index:]
    return "NO NUMERALS IN THE GIVEN STRING"
#TEST
print(findWhereNoStart("ASDFG"))
print(findWhereNoStart("ASDFG13213"))
print(findWhereNoStart("ASDFG1"))
#OUTPUT
"""
NO NUMERALS IN THE GIVEN STRING
13213
1
"""

Extraction of Numbers from String

I am trying to extract numbers from a string, without any fancy imports like regex and without for or if statements.
Example
495 * 89
Output
495 89
Edit I have tried this:
num1 = int(''.join(filter(str.isdigit, num)))
It works, but it doesn't put spaces between the numbers.
Actually, regex is a very simple and viable option here:
import re
inp = "495 * 89"
nums = re.findall(r'\d+(?:\.\d+)?', inp)
print(nums) # ['495', '89']
Assuming you always expect integers and you want to avoid regex, you could use a string split approach with a list comprehension:
inp = "495 * 89"
parts = inp.split()
nums = [x for x in parts if x.isdigit()]
print(nums) # ['495', '89']
You can do this without much fancy stuff
s = "495 * 89"
#replace non-digits with spaces, goes into a list of characters
li = [c if c.isdigit() else " " for c in s ]
#join characters back into a string
s_digit_spaces = "".join(li)
#split will separate on space boundaries, multiple spaces count as one
nums = s_digit_spaces.split()
print(nums)
#one-liner:
print ("".join([c if c.isdigit() else " " for c in s ]).split())
output:
['495', '89']
['495', '89']
#and with signs and decimal points
s = "495.1 * -89"
print ("".join([c if (c.isdigit() or c in ('-',".")) else " " for c in s ]).split())
output:
['495.1', '-89']
Finally, this works too:
print ("".join([c if c in "0123456789+-." else " " for c in s ]).split())
You're close.
You don't want to call int() on a single value when there are multiple numbers in the string. The filter function is applied character by character, since iterating over a string yields its characters.
Instead, first split the string into its individual tokens, then keep only the tokens that are entirely digits, then convert each one:
s = "123 * 54"
digits = list(map(int, filter(str.isdigit, s.split())))
Keep in mind that this only handles non-negative integers.
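If negative numbers or decimals can appear, one possible sketch is to let int() and float() decide what counts as a number instead of str.isdigit. Note that this does use a loop, so it steps outside the constraints in the question, and the name extract_numbers is an assumption for illustration.

```python
def extract_numbers(text):
    numbers = []
    for token in text.split():
        try:
            numbers.append(int(token))        # handles "495" and "-89"
        except ValueError:
            try:
                numbers.append(float(token))  # handles "495.1"
            except ValueError:
                pass                          # not a number, e.g. "*"
    return numbers

print(extract_numbers("495 * 89"))     # [495, 89]
print(extract_numbers("495.1 * -89"))  # [495.1, -89]
```

Be aware that float() also accepts tokens such as "inf", "nan" and "1e3", so this is more permissive than a digits-only check.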

Python: Search a string for a number, decrement that number and replace in the string

If I have a string such as:
string = 'Output1[10].mystruct.MyArray[4].mybool'
what I want to do is search the string for the number in the array, decrement by 1 and then replace the found number with my decremented number.
What I have tried:
import string
import re
string = 'Output1[10].mystruct.MyArray[4].mybool'
pattern = r'\[(\d+)\]'
num = re.findall(pattern, string)
So I can get a list of the numbers and convert them to integers, but I don't know how to use re.sub to search the string and replace them. Note that there might be multiple arrays. If anyone is expert enough to do that, help would be much appreciated.
Cheers
One thing I don't understand: if there is more than one array, do you want to decrement the number in all of them, or just in one?
If you want to decrement the number in all arrays, you can do this:
import re
string = 'Output1[10].mystruct.MyArray[4].mybool'
pattern = r'\[(\d+)\]'
num = re.findall(pattern, string)
num = [int(elem) for elem in num]
num.sort()
for elem in num:
    aux = elem - 1
    string = string.replace(str(elem), str(aux))
If you want to decrement just the first array, you can do this:
import re
string = 'Output1[10].mystruct.MyArray[4].mybool'
pattern = r'\[(\d+)\]'
num = re.findall(pattern, string)
new_num = int(num[0]) - 1
string = string.replace(num[0], str(new_num), 1)
Thanks to João Castilho for his answer. Based on it, I changed the code slightly to work exactly how I want:
import re
string = 'Output1[2].mystruct.MyArray[2].mybool'
pattern = r'\[(\d+)\]'
num = re.findall(pattern, string)
num = [int(elem) for elem in set(num)]
num.sort()
for elem in num:
    aux = elem - 1
    string = string.replace('[%d]' % elem, '[%d]' % aux)
print(string)
This now replaces every number between brackets with the decremented value, wherever those numbers occur.
Cheers
ice.
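Since the question asked about re.sub specifically: re.sub accepts a function as the replacement, which is called once per match and returns the replacement text. A minimal sketch that decrements every bracketed number in a single pass; the names decrement_indices and repl are made up for illustration.

```python
import re

def decrement_indices(text):
    # The callback receives each match object and returns its replacement.
    def repl(match):
        return '[%d]' % (int(match.group(1)) - 1)
    return re.sub(r'\[(\d+)\]', repl, text)

print(decrement_indices('Output1[10].mystruct.MyArray[4].mybool'))
# Output1[9].mystruct.MyArray[3].mybool
print(decrement_indices('Output1[2].mystruct.MyArray[2].mybool'))
# Output1[1].mystruct.MyArray[1].mybool
```

Because each match is replaced exactly once, there is no need to sort or de-duplicate the numbers first, and digits outside brackets (like the 1 in Output1) are never touched.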
