I have a set of strings such as "024764108", "002231531", "005231329"; each has exactly 9 digits. I want to insert a - after each group of 3 digits. The result I want is:
"024-764-108", "002-231-531", "005-231-329".
How can I express this in Python?
Here is a pandas solution that handles strings of any length:
In [41]: df
Out[41]:
num
0 024764108
1 002231531
2 005231329
3 012345678901234
In [42]: df.num.str.extractall(r'(\d{3})').groupby(level=0)[0].apply('-'.join)
Out[42]:
0 024-764-108
1 002-231-531
2 005-231-329
3 012-345-678-901-234
Name: 0, dtype: object
If you are using Python 3.6 or later, consider f-strings, which let you evaluate expressions such as slices inside the string literal.
f'{string[:3]}-{string[3:6]}-{string[6:]}'
Another option would be to split your string into chunks of three characters and then join the resulting list.
split_string = [string[i: i + 3] for i in range(0, len(string), 3)]
formatted_number = '-'.join(split_string)
The first line creates a list of substrings of length 3; the second joins the elements of that list with a '-' character in between.
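The chunk-and-join idea above can be wrapped in a small helper. This is only a sketch: the name `group_digits` and its `size` and `sep` parameters are made up for illustration and are not part of the original answer.

```python
def group_digits(digits, size=3, sep="-"):
    """Insert `sep` between consecutive chunks of `size` characters."""
    # Slice the string at every multiple of `size`; the last chunk may be shorter.
    chunks = [digits[i:i + size] for i in range(0, len(digits), size)]
    return sep.join(chunks)


print(group_digits("024764108"))        # 024-764-108
print(group_digits("012345678901234"))  # 012-345-678-901-234
```

Because the last slice is allowed to be short, an input whose length is not a multiple of 3 still works; it simply ends in a shorter group.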
There is probably a better way to do this, but you can use slicing ([]) to split the string into sections of 3.
old_str = "024764108"
new_str = old_str[:3] + '-' + old_str[3:6] + '-' + old_str[6:]
Easy solution:
number = "024764108"
new_number = number[:3] + '-' + number[3:6]+ '-' + number[6:]
Consider this code, which uses string slicing. The expression that converts the string to your format is string[0:3] + "-" + string[3:6] + "-" + string[6:9]
Here is your updated method and some test cases. It only accepts inputs that are exactly 9 characters long and returns None otherwise.
def format_digitstring(string: str):
    if len(string) != 9:
        return None
    return string[0:3] + "-" + string[3:6] + "-" + string[6:9]
s1 = "024764108"
s2 = "002231531"
s3 = "005231329"
s4 = "00112341"
print(format_digitstring(s1))
print(format_digitstring(s2))
print(format_digitstring(s3))
print(format_digitstring(s4))
Output:
024-764-108
002-231-531
005-231-329
None
This also works:
import re
s='024764108'
print(('{}-'*2+'{}').format(*re.findall('(...)',s)))
Or, if you want to do it on every row, you can use pandas' apply function.
This uses a positive lookahead: (\d{3}) matches three digits, (?=\d) requires that another digit follows them, and the replacement '\1-' inserts a '-' after those three digits.
import re
number="024764108"
re.sub(r'(\d{3})(?=\d)',r'\1-',number)
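To see why the lookahead matters, here is a minimal sketch that wraps the substitution in a function (the name `hyphenate` is hypothetical) and runs it on a few inputs. The final group never gets a trailing '-', because no digit follows it.

```python
import re


def hyphenate(number):
    # (\d{3}) captures three digits; (?=\d) only allows the match when
    # another digit follows, so nothing is appended after the last group.
    return re.sub(r'(\d{3})(?=\d)', r'\1-', number)


print(hyphenate("024764108"))  # 024-764-108
print(hyphenate("0011234"))    # 001-123-4
print(hyphenate("123"))        # 123
```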
Related
OK, so let's say I have two strings.
The first string is a series of 44 o's with 6 random v's.
The second string is some text and is 44 characters in length.
random_string = "bzpsvawxqpvjmldhnmvdseftystvfjimcrwoftvchmqlvwugcm"
some_text = "LoremIpsumDolourSitAmettyConssecteturAdipisc"
I am looking for a way to split the some_text string based on the v split of random_string.
random_string = "oooovooooovooooooovoooooooovoooooooooovooooovooooo"
splitted_string = random_string.split('v')
print(splitted_string)
#['oooo', 'ooooo', 'ooooooo', 'oooooooo', 'oooooooooo', 'ooooo', 'ooooo']
But I would like to apply this split pattern to some_text to achieve:
['Lore', 'mIpsu', 'mDolour', 'SitAmett', 'yConssecte', 'turAd', 'ipisc']
You can use string slices, based on the lengths of the split random_string parts:
text_split = []
current_i = 0
for random_substr in random_string.split('v'):
    text_substr = some_text[current_i:current_i+len(random_substr)]
    text_split.append(text_substr)
    current_i += len(random_substr)
print(text_split)
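The same length-based slicing can also be written with running totals. The sketch below is one possible variant, not the answerer's code: the function name `split_like` and its `sep` parameter are invented here, and it assumes the text is at least as long as the pattern with its separators removed.

```python
from itertools import accumulate


def split_like(pattern, text, sep="v"):
    """Cut `text` into pieces with the same lengths as pattern.split(sep)."""
    lengths = [len(part) for part in pattern.split(sep)]
    ends = list(accumulate(lengths))   # running total = end offset of each piece
    starts = [0] + ends[:-1]           # each piece starts where the previous one ended
    return [text[a:b] for a, b in zip(starts, ends)]


print(split_like("oovooovo", "abcdef"))  # ['ab', 'cde', 'f']
```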
I found a solution; I hope it is general enough for you:
import re
random_string = "oooovooooovooooooovoooooooovoooooooooovooooovooooo"
some_text = "LoremIpsumDolourSitAmettyConssecteturAdipisc"
# position of each 'v', shifted left by the number of 'v's before it,
# so the offsets refer to some_text (which contains no separators)
b = [m.start() - k for k, m in enumerate(re.finditer('v', random_string))]
b = [0] + b + [len(some_text)]
c = [some_text[b[j]:b[j+1]] for j in range(len(b)-1)]
print(c)
Can anyone explain this code a little? I can't understand what n does here. We have already taken N = int(input()) as input, so why n = len(bin(N)) - 2? I couldn't figure it out.
N = int(input())
n = len(bin(N))-2
for i in range(1,N+1):
    print(str(i).rjust(n) + " " + format(i,'o').rjust(n) + " " + format(i,'X').rjust(n) + " " + format(i,'b').rjust(n))
n counts the number of bits in the number N. bin() produces the binary representation (zeros and ones) as a string with the 0b prefix:
>>> bin(42)
'0b101010'
so len(bin(N)) - 2 takes the length of that output string, minus 2 to account for the prefix.
See the bin() documentation:
Convert an integer number to a binary string prefixed with “0b”.
The length is used to set the width of the columns (via str.rjust(), which adds spaces to the front of a string to create an output n characters wide). Knowing how many characters the widest binary representation needs is helpful here.
However, the same information can be obtained directly from the number with the int.bit_length() method:
>>> N = 42
>>> N.bit_length()
6
>>> len(bin(N)) - 2
6
The other columns are also wider than they need to be, because every column uses the width of the binary one. You could instead calculate the maximum width for each column, and use str.format() or an f-string to do the formatting:
from math import log10
N = int(input())
decwidth = int(log10(N) + 1)
binwidth = N.bit_length()
hexwidth = (binwidth - 1) // 4 + 1
octwidth = (binwidth - 1) // 3 + 1
for i in range(1, N + 1):
    print(f'{i:>{decwidth}d} {i:>{octwidth}o} {i:>{hexwidth}X} {i:>{binwidth}b}')
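As a sanity check on the width formulas, the sketch below compares them with the lengths of the actual formatted strings. The helper name `column_widths` is hypothetical, and it uses len(str(...)) for the decimal width instead of log10; that substitution is mine, made to avoid any floating-point rounding.

```python
def column_widths(n_max):
    """Minimum width of each column for the numbers 1..n_max (n_max >= 1)."""
    binwidth = n_max.bit_length()
    return {
        "dec": len(str(n_max)),          # digit count without floating point
        "oct": (binwidth - 1) // 3 + 1,  # 3 bits per octal digit, rounded up
        "hex": (binwidth - 1) // 4 + 1,  # 4 bits per hex digit, rounded up
        "bin": binwidth,
    }


# Compare each formula with the length of the real formatted string.
for n_max in (1, 7, 8, 255, 256, 1000):
    widths = column_widths(n_max)
    assert widths["oct"] == len(format(n_max, "o"))
    assert widths["hex"] == len(format(n_max, "X"))
    assert widths["bin"] == len(format(n_max, "b"))

print(column_widths(17))  # {'dec': 2, 'oct': 2, 'hex': 2, 'bin': 5}
```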
For example, if I have:
"+----+----+---+---+--+"
is it possible to replace the second through fourth + with -?
If I have
"+----+----+---+---+--+"
and I want to have
"+-----------------+--+"
I have to replace the 2nd through 4th + with -. Is it possible to achieve this with a regex, and how?
If you can assume the first character is always a +:
string = '+' + re.sub(r'\+', r'-', string[1:], count=3)
Lop off the first character of your string and sub() the first three + characters, then add the initial + back on.
If you can't assume the first + is the first character of the string, find it first:
prefix = string.index('+') + 1
string = string[:prefix] + re.sub(r'\+', r'-', string[prefix:], count=3)
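If the positions need to be configurable, one alternative is to pass a function to re.sub() and count matches as they are seen. This is a sketch under my own assumptions, not the answerer's code: the name `replace_occurrences` and its 1-based, inclusive `first` and `last` parameters are made up for illustration.

```python
import re


def replace_occurrences(text, old, new, first, last):
    """Replace occurrences number `first`..`last` (1-based, inclusive) of `old`."""
    counter = 0

    def swap(match):
        nonlocal counter
        counter += 1
        # Only matches whose ordinal falls inside the window are replaced.
        return new if first <= counter <= last else match.group(0)

    # re.escape() makes characters such as '+' match literally.
    return re.sub(re.escape(old), swap, text)


print(replace_occurrences("+----+----+---+---+--+", "+", "-", 2, 4))
# +-----------------+--+
```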
I would rather iterate over the string, and then replace the pluses according to what I found.
secondIndex = 0
fourthIndex = 0
count = 0
for i, c in enumerate(string):
    if c == '+':
        count += 1
        if count == 2 and secondIndex == 0:
            secondIndex = i
        elif count == 4 and fourthIndex == 0:
            fourthIndex = i
string = string[:secondIndex] + '-'*(fourthIndex-secondIndex+1) + string[fourthIndex+1:]
Test:
+----+----+---+---+--+
+-----------------+--+
I split the string into a list of strings, using the character to replace as the separator.
Then I rejoin the list in sections, using the required separators.
example_str = "+----+----+---+---+--+"
swap_char = "+"
repl_char = "-"
ith_match = 2
jth_match = 4
list_of_strings = example_str.split(swap_char)
# pieces before the ith match keep swap_char; pieces from the ith through the
# jth match (inclusive) are joined with repl_char; the rest keep swap_char
new_string = (swap_char.join(list_of_strings[0:ith_match]) + repl_char +
              repl_char.join(list_of_strings[ith_match:jth_match + 1]) +
              swap_char + swap_char.join(list_of_strings[jth_match + 1:]))
print(example_str)
print(new_string)
Running it gives:
$ python ./python_example.py
+----+----+---+---+--+
+-----------------+--+
With regex? Yes, that's possible.
^(\+-+){1}((?:\+[^+]+){3})
Explanation:
^
(\+-+){1}            # read + and some -'s until the 2nd +
(                    # group 2 start
  (?:\+[^+]+){3}     # read +, followed by non-pluses, 3 times in total
)                    # group 2 end
Testing:
$ cat test.py
import re
pattern = r"^(\+-+){1}((?:\+[^+]+){3})"
tests = ["+----+----+---+---+--+"]
for test in tests:
    m = re.search(pattern, test)
    if m:
        print(test[0:m.start(2)] +
              "-" * (m.end(2) - m.start(2)) +
              test[m.end(2):])
Adjusting is simple:
^(\+-+){1}((?:\+[^+]+){3})
        ^              ^
The '1' indicates that you're reading up to the 2nd '+'.
The '3' indicates that you're reading up to the 4th '+'.
These are the only 2 changes you need to make; the group number stays the same.
Run it:
$ python test.py
+-----------------+--+
This is Pythonic.
import re
s = "+----+----+---+---+--+"
idx = [i.start() for i in re.finditer(r'\+', s)][1:-2]
''.join([j if i not in idx else '-' for i, j in enumerate(s)])
However, if your string is constant and you want to keep it simple:
print(s)
print('+' + re.sub(r'\+---', '----', s)[1:])
Output:
+----+----+---+---+--+
+-----------------+--+
Using only list comprehensions:
s1="+----+----+---+---+--+"
indexes = [i for i,x in enumerate(s1) if x=='+'][1:4]
s2 = ''.join([e if i not in indexes else '-' for i,e in enumerate(s1)])
print(s2)
+-----------------+--+
I saw you already found a solution, but I do not like regex so much, so maybe this will help someone else! :-)
I have an int 123. I need to convert it to the string "100 + 20 + 3".
How can I achieve this using Python?
I am trying to divide the number first (by 100) and then multiply the quotient by 100 again. This seems to be pretty inefficient. Is there another way I can use?
a = 123
quot = 123//100
a1 = quot*100
I am repeating the above process for all the digits.
Another option would be to do it by the index of the digit:
def int_str(i):
    digits = len(str(i))
    result = []
    for digit in range(digits):
        result.append(str(i)[digit] + '0' * (digits - digit - 1))
    print(' + '.join(result))
which gives:
>>> int_str(123)
100 + 20 + 3
This works by taking each digit and appending a number of zeroes equal to how many digits come after the current digit. (At index 0, with a length of 3, you have 3 - 0 - 1 remaining digits, so the first digit should have 2 zeroes after it.)
When the loop is done, I have a list ["100", "20", "3"], which I then pass to join to add the connecting " + "s.
(Ab)using list comprehension:
>>> num = 123
>>> ' + '.join([x + '0' * (len(str(num)) - i - 1) for i, x in enumerate(str(num))])
'100 + 20 + 3'
How it works:
iteration 0: digit at index 0 is '1'; '1' + '0' * (num_digits - 1 - iter_count) = '1' + '0' * 2 = '100'
iteration 1: digit at index 1 is '2'; '2' + '0' * 1 = '20'
iteration 2: digit at index 2 is '3'; '3' + '0' * 0 = '3'
Once you've created all the "numbers" and put them in the list, call join to combine them with the separator string ' + '.
Another way of achieving what you intended to do:
def pretty_print(a):
    aa = str(a)
    base = len(aa) - 1
    for v in aa:
        yield v + '0' * base
        base -= 1
>>> ' + '.join(pretty_print(123))
'100 + 20 + 3'
Here's my approach:
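numInput = 123
strNums = str(numInput)
numberList = []
for i in range(0, len(strNums)):
    # digit i places from the right, multiplied by its place value 10**i
    digit = (10**i) * int(strNums[-(i+1)])
    numberList.append(str(digit))
# the terms were collected right to left, so reverse them before joining
final = " + ".join(reversed(numberList))
print(final)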
This is the mathematical approach to what you want.
In the decimal number system, every digit contributes the digit times 10 raised to the power of its place (counting from zero, right to left).
So we take the number and convert it into a string. Then we loop over a range equal to the length of the number:
range: 0 to length of number
Raising 10 to each of those powers gives:
10^0, 10^1, 10^2...
We need to multiply each of these values by the digits taken from right to left, so we use a negative index. Then we append the string value of each product to an initially empty list, because we need the result in the form you described.
Hope it will be helpful to you.
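The place-value idea above can also be done with integer arithmetic only, peeling off one digit at a time with divmod(). This is a sketch, not the answerer's code: the function name `expand` is hypothetical, it assumes a non-negative integer, and skipping zero digits is a choice made here that the original answers do not make.

```python
def expand(number):
    """Place-value expansion of a non-negative int, e.g. 123 -> '100 + 20 + 3'."""
    terms = []
    place = 1
    while number > 0:
        number, digit = divmod(number, 10)  # peel off the rightmost digit
        if digit:                           # assumption: zero digits are skipped
            terms.append(str(digit * place))
        place *= 10
    # Terms were collected right to left; an empty list means the input was 0.
    return " + ".join(reversed(terms)) or "0"


print(expand(123))   # 100 + 20 + 3
print(expand(1005))  # 1000 + 5
```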
So I've got a string that looks like "012 + 2 - 01 + 24" for example. I want to be able to quickly (less code) evaluate that expression...
I could use eval() on the string, but I don't want 012 to be represented in octal form (10), I want it to be represented as an int (12).
My solution for this works, but it is not elegant. I am sort of assuming that there is a really good pythonic way to do this.
My solution:
# expression is some string that looks like "012 + 2 - 01 + 24"
atomlist = []
for atom in expression.split():
    if "+" not in atom and "-" not in atom:
        atomlist.append(int(atom))
    else:
        atomlist.append(atom)
# print atomlist
evalstring = ""
for atom in atomlist:
    evalstring += str(atom)
# print evalstring
num = eval(evalstring)
Basically, I tear apart the string, find the numbers in it, turn them into ints, and then rebuild the string with the ints (essentially removing leading 0's except where 0 is a number on its own).
How can this be done better?
I'd be tempted to use regular expressions to remove the leading zeroes:
>>> re.sub(r'\b0+(?!\b)', '', '012 + 2 + 0 - 01 + 204 - 0')
'12 + 2 + 0 - 1 + 204 - 0'
This removes zeroes at the start of every number, except when the number consists entirely of zeroes:
the first \b matches a word (token) boundary;
the 0+ matches one or more consecutive zeroes;
the (?!\b) (negative lookahead) inhibits matches where the sequence of zeroes is followed by a token boundary.
One advantage of this approach over split()-based alternatives is that it doesn't require spaces in order to work:
>>> re.sub(r'\b0+(?!\b)', '', '012+2+0-01+204-0')
'12+2+0-1+204-0'
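You can do this in one line using lstrip() to strip off any leading zeros (here s is the expression string "012 + 2 - 01 + 24"; note that a token consisting only of zeros, such as "0", would be stripped to an empty string):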
>>> eval("".join(token.lstrip('0') for token in s.split()))
37
I'd like to do it this way:
>>> s = '012 + 2 + 0 - 01 + 204 - 0'
>>> ' '.join(str(int(x)) if x.isdigit() else x for x in s.split())
'12 + 2 + 0 - 1 + 204 - 0'
Use float() if you want to handle floats too :)
int does not assume that a leading zero indicates an octal number:
In [26]: int('012')
Out[26]: 12
Accordingly, you can safely evaluate the expression with the following code:
from operator import add, sub
from collections import deque

def mapper(item, opmap={'+': add, '-': sub}):
    try:
        return int(item)
    except ValueError:
        pass
    return opmap[item]

stack = deque()
# the `if item` filter drops empty strings produced by runs of whitespace
for item in (mapper(item) for item in "012 + 2 - 01 + 24".split(' ') if item):
    if stack and callable(stack[-1]):
        f = stack.pop()
        stack.append(f(stack.pop(), item))
    else:
        stack.append(item)
print(stack.pop())
Not a one-liner, but it is safe, because you control all of the functions which can be executed.
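A more compact variant of the same idea tokenizes with a regular expression, so it also works when the expression has no spaces. This is a sketch under stated assumptions: the name `evaluate` is hypothetical, and it assumes the expression starts with a number and then strictly alternates between + or - and a number (no unary minus, parentheses, or other operators).

```python
import re
from operator import add, sub

OPS = {"+": add, "-": sub}


def evaluate(expression):
    # Tokens are runs of digits or a single + / -; whitespace is ignored.
    tokens = re.findall(r"\d+|[+-]", expression)
    total = int(tokens[0])                 # int() ignores leading zeros
    # Pair each operator with the number that follows it, left to right.
    for op, operand in zip(tokens[1::2], tokens[2::2]):
        total = OPS[op](total, int(operand))
    return total


print(evaluate("012 + 2 - 01 + 24"))  # 37
print(evaluate("012+2+0-01+204-0"))   # 217
```

As with the stack-based version, only the operators listed in OPS can ever run, which is what makes this safer than eval().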