I have more than 4000 numbers in a column that need to be reformatted.
They look like this:
040 413 560 89 or 0361 223240
How do I put them into the following format:
+49 (040) 41356089 or +49 (0361) 223240
They all need the same country dialling code, +49, followed by the respective area code in brackets. Some are already in the correct format.
We can split the string into groups:
>>> groups = '040 413 560 89'.split()
>>> groups
['040', '413', '560', '89']
We can slice the groups, and assign to variables, also join the later groups into one string:
>>> city, number = groups[0], ''.join(groups[1:])
>>> city, number
('040', '41356089')
We can format a new string:
>>> '+49 ({}) {}'.format(city, number)
'+49 (040) 41356089'
We can check if a number already starts with +:
>>> '+49 (040) 41356089'.startswith('+')
True
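Putting those steps together, a minimal sketch might look like the following. The function name `to_international` and the `country_code` parameter are illustrative, not from the question, and the sketch assumes the first whitespace-separated group is always the area code.

```python
def to_international(raw, country_code='+49'):
    """Reformat e.g. '040 413 560 89' as '+49 (040) 41356089'."""
    raw = raw.strip()
    # Numbers that already start with a country code are left untouched.
    if raw.startswith('+'):
        return raw
    groups = raw.split()
    if len(groups) < 2:
        # Assumption: without a separate area code group we cannot
        # reformat safely, so the input is returned unchanged.
        return raw
    city, number = groups[0], ''.join(groups[1:])
    return '{} ({}) {}'.format(country_code, city, number)


numbers = ['040 413 560 89', '0361 223240', '+49 (040) 41356089']
for n in numbers:
    print(to_international(n))
```

Applied to a whole column, this would be one call per cell, for example in a list comprehension.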
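Do so like this:
ls_alreadycorrected = ['(', ')', '+49']
str_in = '040 413 560 89'  # or apply to a list
str_out = str_in  # numbers that are already formatted stay unchanged
# Only reformat numbers that contain none of the "already corrected" markers.
if not any(flag in str_in for flag in ls_alreadycorrected):
    how_many_spaces = str_in.count(' ')
    str_in = str_in.replace(' ', '')
    if how_many_spaces > 2:
        # e.g. '040 413 560 89': 3-digit area code, 8-digit number
        str_out = '+49 (' + str_in[:3] + ') ' + str_in[-8:]
    else:
        # e.g. '0361 223240': 4-digit area code, 6-digit number
        str_out = '+49 (' + str_in[:4] + ') ' + str_in[-6:]
That only works given that you have two types of phone numbers. For a list of numbers, put this on top instead of the str_in assignment and indent the rest of the code under the loop:
for number in list_of_numbers:
    str_in = number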
Cheers
You can do this.
phone = "456789"
cod = 123
final = str(cod) + phone
Result is "123456789"
I have a range of string such as: "024764108", "002231531", "005231329", they have exactly 9 digits. And I want to add - to each group of 3 digits. The result I want is as below:
"024-764-108", "002-231-531", "005-231-329".
How can I express this in Python?
Here is a dynamic solution:
In [41]: df
Out[41]:
               num
0        024764108
1        002231531
2        005231329
3  012345678901234
In [42]: df.num.str.extractall(r'(\d{3})').groupby(level=0)[0].apply('-'.join)
Out[42]:
0            024-764-108
1            002-231-531
2            005-231-329
3    012-345-678-901-234
Name: 0, dtype: object
If you are using Python 3.6 or later, you could consider f-strings, which allow expressions such as slices inside the string literal.
f'{string[:3]}-{string[3:6]}-{string[6:]}'
Another option would be to split your string into three parts then do a join on the array.
split_string = [string[i: i + 3] for i in range(0, len(string), 3)]
formated_number = '-'.join(split_string)
The first line of this creates an array with sub strings of length 3, then it joins the elements of that array with a '-' character in between.
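The split-then-join idea generalizes to any group size and separator. Here is a small sketch; the function name `group_digits` and its parameters are made up for illustration. A generator expression is used instead of a list, which makes no practical difference at this size.

```python
def group_digits(digits, size=3, sep='-'):
    """Join consecutive chunks of `size` characters with `sep`."""
    # The last chunk may be shorter than `size` if the length is not a multiple.
    chunks = (digits[i:i + size] for i in range(0, len(digits), size))
    return sep.join(chunks)


for s in ("024764108", "002231531", "005231329"):
    print(group_digits(s))
```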
There is probably a better way to do this but you can use [] to split the string into sections of 3.
old_str = "024764108"
new_str = old_str[:3] + '-' + old_str[3:6] + '-' + old_str[6:]
Easy solution:
number = "024764108"
new_number = number[:3] + '-' + number[3:6]+ '-' + number[6:]
Consider this code, using string slicing: The segment of code that converts this str to your format is string[0:3] + "-" + string[3:6] + "-" + string[6:9]
Here is your updated method and some test cases. Note that it only accepts inputs that contain exactly 9 characters; anything else returns None.
def format_digitstring(string: str):
    if len(string) != 9:
        return None
    return string[0:3] + "-" + string[3:6] + "-" + string[6:9]
s1 = "024764108"
s2 = "002231531"
s3 = "005231329"
s4 = "00112341"
print(format_digitstring(s1))
print(format_digitstring(s2))
print(format_digitstring(s3))
print(format_digitstring(s4))
Output:
024-764-108
002-231-531
005-231-329
None
This also works:
import re
s='024764108'
print(('{}-'*2+'{}').format(*re.findall('(...)',s)))
If you want to do this for every row, you can use pandas' apply function.
This uses a positive lookahead: (\d{3})(?=\d) matches three digits that are followed by another digit, and the replacement '\1-' inserts a '-' after those three digits.
import re
number="024764108"
re.sub(r'(\d{3})(?=\d)',r'\1-',number)
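To see what the lookahead contributes, it may help to compare the substitution with and without it. This is only an illustrative sketch; the variable names are made up.

```python
import re

number = "024764108"

# Without the lookahead, every group of three digits gets a dash,
# including the last one, which leaves a trailing '-'.
without_lookahead = re.sub(r'(\d{3})', r'\1-', number)

# With (?=\d), a group only matches if another digit follows it,
# so the final group is left alone.
with_lookahead = re.sub(r'(\d{3})(?=\d)', r'\1-', number)

print(without_lookahead)  # 024-764-108-
print(with_lookahead)     # 024-764-108
```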
I want to create the following function
Left_padded(n, width)
That returns, for example:
Left_padded(6, 4)
'   6' # number 6 in a 4-character field
Left_padded(54, 5)
'   54' # number 54 in a 5-character field
You can use rjust:
>>> def Left_padded(n, width):
...     return str(n).rjust(width)
...
>>> Left_padded(54, 5)
'   54'
Assuming you want to put the number next to another string, you can also use % formatting to achieve the same result:
>>> w1 = "your number is:"
>>> num = 20
>>> line = '%s%10s' % (w1, num)
>>> print(line)
your number is:        20
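On Python 3.6 or later, the same right-alignment can also be written with the format mini-language inside an f-string. This is a sketch of an alternative, not a replacement for rjust; the lowercase name `left_padded` is used here only to avoid clashing with the function above.

```python
def left_padded(n, width):
    """Right-align n in a field of `width` characters, padding with spaces."""
    # '>' requests right alignment; the nested {width} supplies the field size.
    return f'{n:>{width}}'


print(repr(left_padded(6, 4)))
print(repr(left_padded(54, 5)))
```

As with rjust, a value that is already wider than the field is returned unchanged rather than truncated.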
I can reverse a string using the [::-1] syntax. Take note of the example below:
text_in = 'I am 25 years old'
rev_text = text_in[::-1]
print(rev_text)
Output:
dlo sraey 52 ma I
How can I reverse only the letters while keeping the numbers in order?
The desired result for the example is 'dlo sraey 25 ma I'.
Here's an approach with re:
>>> import re
>>> text_in = 'I am 25 years old'
>>> ''.join(s if s.isdigit() else s[::-1] for s in reversed(re.split(r'(\d+)', text_in)))
'dlo sraey 25 ma I'
>>>
>>> text_in = 'Iam25yearsold'
>>> ''.join(s if s.isdigit() else s[::-1] for s in reversed(re.split(r'(\d+)', text_in)))
'dlosraey25maI'
Using split() and join() along with str.isdigit() to identify numbers :
>>> s = 'I am 25 years old'
>>> s1 = s.split()
>>> ' '.join([ ele if ele.isdigit() else ele[::-1] for ele in s1[::-1] ])
=> 'dlo sraey 25 ma I'
NOTE: This only works when the numbers are separated from the words by spaces. For other inputs, check out timegeb's answer using regex.
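The difference between the two approaches shows up as soon as a number is not surrounded by spaces. The sketch below wraps both in functions (the names `reverse_by_words` and `reverse_by_regex` are made up for illustration) and runs them on the same inputs.

```python
import re


def reverse_by_words(text):
    # Splits on whitespace, so a number glued to letters is not recognized.
    words = text.split()
    return ' '.join(w if w.isdigit() else w[::-1] for w in words[::-1])


def reverse_by_regex(text):
    # The capturing group keeps the digit runs as separate items in the split.
    parts = re.split(r'(\d+)', text)
    return ''.join(p if p.isdigit() else p[::-1] for p in reversed(parts))


for sample in ('I am 25 years old', 'Iam25yearsold'):
    print(reverse_by_words(sample), '|', reverse_by_regex(sample))
```

For 'Iam25yearsold' the word-based version reverses the digits too, because the whole string is a single "word" that is not purely numeric.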
Here is a step by step approach:
text_in = 'I am 25 years old'
text_seq = list(text_in)  # make a list of characters
text_nums = [c for c in text_seq if c.isdigit()]  # extract the digits in their original order
num_ndx = 0
revers = []
for idx, c in enumerate(text_seq[::-1]):  # for each char in the reversed text
    if c.isdigit():  # if it is a digit
        c = text_nums[num_ndx]  # replace it with the next digit in original order
        num_ndx += 1
    revers.append(c)  # non-digits keep their reversed position
print(''.join(revers))  # output the final string
Output :
dlo sraey 25 ma I
You can do it in a straightforward, Pythonic way like this:
def rev_except_digit(text_in):
    rlist = text_in[::-1].split()  # reverse the whole string and split it into a list
    for i in range(len(rlist)):  # reverse the numbers back again
        if rlist[i].isdigit():
            rlist[i] = rlist[i][::-1]
    return ' '.join(rlist)
Test:
Original: I am 25 years 345 old 290
Reverse: 290 dlo 345 sraey 25 ma I
See the official Python documentation for split() and the other string methods, and for slicing ([::-1]).
text = "I am 25 years old"
new_text = ''
text_rev = text[::-1]
for i in text_rev.split():
    if not i.isdigit():
        new_text += i + " "
    else:
        new_text += i[::-1] + " "
print(new_text)
My prof wants me to create a function that returns the sum of the numbers in a string, but without using any lists or list methods.
The function should look like this when operating:
>>> sum_numbers('34 3 542 11')
590
Usually a function like this would be easy to create when using lists and list methods. But trying to do so without using them is a nightmare.
I tried the following code, but it doesn't work:
>>> def sum_numbers(s):
    for i in range(len(s)):
        int(i)
        total = s[i] + s[i]
        return total
>>> sum_numbers('1 2 3')
'11'
Instead of getting 1, 2, and 3 all converted into integers and added together, I instead get the string '11'. In other words, the numbers in the string still have not been converted to integers.
I also tried using the map() function, but I just got the same result:
>>> def sum_numbers(s):
    for i in range(len(s)):
        map(int, s[i])
        total = s[i] + s[i]
        return total
>>> sum_numbers('1 2 3')
'11'
Totally silly of course, but for fun:
s = '34 3 542 11'
n = ""; total = 0
for c in s:
    if c == " ":
        total = total + int(n)
        n = ""
    else:
        n = n + c
# add the last number
total = total + int(n)
print(total)
> 590
This assumes that all characters (apart from spaces) are digits, and that the numbers are separated by exactly one space.
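If the input might contain leading, trailing, or repeated whitespace, one possible adjustment is to convert the accumulated text only when it is non-empty. This is a sketch under that assumption, still without lists; the variable names are illustrative.

```python
def sum_numbers(s):
    total = 0
    current = ''  # digits of the number currently being read
    for c in s:
        if c.isspace():
            # Only convert when something was accumulated, so that
            # consecutive or leading whitespace does not call int('').
            if current:
                total += int(current)
                current = ''
        else:
            current += c
    # Add the last number, if the string did not end with whitespace.
    if current:
        total += int(current)
    return total


print(sum_numbers('34 3 542 11'))
```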
You've definitely put some effort in here, but one part of your approach definitely won't work as-is: you're iterating over the characters in the string, but you keep trying to treat each character as its own number. I've written a (very commented) method that accomplishes what you want without using any lists or list methods:
def sum_numbers(s):
    """
    Convert a string of numbers into a sum of those numbers.
    :param s: A string of numbers, e.g. '1 -2 3.3 4e10'.
    :return: The floating-point sum of the numbers in the string.
    """
    def convert_s_to_val(s):
        """
        Convert a string into a number. Will handle anything that
        Python could convert to a float.
        :param s: A number as a string, e.g. '123' or '8.3e-18'.
        :return: The float value of the string.
        """
        if s:
            return float(s)
        else:
            return 0
    # These will serve as placeholders. (`total` avoids shadowing the built-in sum.)
    total = 0
    current = ''
    # Iterate over the string character by character.
    for c in s:
        # If the character is a space, we convert the current `current`
        # into its numeric representation.
        if c.isspace():
            total += convert_s_to_val(current)
            current = ''
        # For anything else, we accumulate into `current`.
        else:
            current = current + c
    # Add `current`'s last value to the sum and return.
    total += convert_s_to_val(current)
    return total
Personally, I would use this one-liner, but it uses str.split():
def sum_numbers(s):
    return sum(map(float, s.split()))
No lists were used (nor harmed) in the production of this answer:
def sum_string(string):
    total = 0
    if len(string):
        # find() returns -1 when no space is left; the modulo turns that into the last index
        j = string.find(" ") % len(string) + 1
        total += int(string[:j]) + sum_string(string[j:])
    return total
If the string is noisier than the OP indicates, then this should be more robust:
import re
def sum_string(string):
    pattern = re.compile(r"[-+]?\d+")
    total = 0
    match = pattern.search(string)
    while match:
        total += int(match.group())
        match = pattern.search(string, match.end())
    return total
EXAMPLES
>>> sum_string('34 3 542 11')
590
>>> sum_string(' 34 4 ')
38
>>> sum_string('lksdjfa34adslkfja4adklfja')
38
>>> # and I threw in signs for fun
...
>>> sum_string('34 -2 45 -8 13')
82
>>>
If you want to be able to handle floats and negative numbers:
def sum_numbers(s):
    sm = i = 0
    while i < len(s):
        t = ""
        while i < len(s) and not s[i].isspace():
            t += s[i]
            i += 1
        if t:
            sm += float(t)
        else:
            i += 1
    return sm
This works for the following cases:
In [9]: sum_numbers('34 3 542 11')
Out[9]: 590.0
In [10]: sum_numbers('1.93 -1 23.12 11')
Out[10]: 35.05
In [11]: sum_numbers('')
Out[11]: 0
In [12]: sum_numbers('123456')
Out[12]: 123456.0
Or a variation taking slices:
def sum_numbers(s):
    prev = sm = i = 0
    while i < len(s):
        while i < len(s) and not s[i].isspace():
            i += 1
        if i > prev:
            sm += float(s[prev:i])
        i += 1
        prev = i  # start the next slice after the whitespace, so repeated spaces are skipped
    return sm
You could also use itertools.groupby which uses no lists, using a set of allowed chars to group by:
from itertools import groupby
def sum_numbers(s):
    allowed = set("0123456789-.")
    return sum(float("".join(v)) for k, v in groupby(s, key=allowed.__contains__) if k)
which gives you the same output:
In [14]: sum_numbers('34 3 542 11')
Out[14]: 590.0
In [15]: sum_numbers('1.93 -1 23.12 11')
Out[15]: 35.05
In [16]: sum_numbers('')
Out[16]: 0
In [17]: sum_numbers('123456')
Out[17]: 123456.0
If you only have to consider positive ints, you could just use str.isdigit as the key:
def sum_numbers(s):
    return sum(int("".join(v)) for k, v in groupby(s, key=str.isdigit) if k)
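To make the grouping step more concrete: groupby yields one (key, group) pair for each run of consecutive characters that share the same key value, so with str.isdigit as the key, the runs whose key is True are exactly the numbers. A small sketch, with the generator name `digit_runs` made up for illustration:

```python
from itertools import groupby


def digit_runs(s):
    """Yield each maximal run of consecutive digit characters in s."""
    for is_digit, chars in groupby(s, key=str.isdigit):
        if is_digit:
            # `chars` is an iterator over the characters in this run.
            yield ''.join(chars)


for run in digit_runs('34 3 542 11'):
    print(run)

print(sum(int(run) for run in digit_runs('34 3 542 11')))
```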
Try this:
def sum_numbers(s):
    sum = 0
    # This string will represent each number
    number_str = ''
    for i in s:
        if i == ' ':
            # if it is a whitespace it means
            # that we have a number, so we increase the sum
            sum += int(number_str)
            number_str = ''
            continue
        number_str += i
    else:
        # add the last number
        sum += int(number_str)
    return sum
You could write a generator:
def nums(s):
    idx = 0
    while idx < len(s):
        ns = ''
        while idx < len(s) and s[idx].isdigit():
            ns += s[idx]
            idx += 1
        if ns:  # skip the empty run when the string starts with a non-digit
            yield int(ns)
        while idx < len(s) and not s[idx].isdigit():
            idx += 1
>>> list(nums('34 3 542 11'))
[34, 3, 542, 11]
Then just sum that:
>>> sum(nums('34 3 542 11'))
590
Or you could use re.finditer with a regular expression and a generator expression:
>>> import re
>>> sum(int(m.group(1)) for m in re.finditer(r'(\d+)', '34 3 542 11'))
590
No lists used...
def sum_numbers(s):
    total = 0
    gt = 0  # grand total
    l = len(s)
    for i in range(l):
        if s[i] != ' ':  # build up each number digit by digit
            total = int(s[i]) + total * 10
        if s[i] == ' ' or i == l - 1:  # add to the grand total, including the last number
            gt += total
            total = 0
    return gt
print(sum_numbers('1 2 3'))
Here each substring is converted to a number and added to the grand total.
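The `total*10` step is what lets that loop build multi-digit numbers without slicing: each new digit shifts the value accumulated so far one decimal place to the left. Isolated into a helper (the name `digits_to_int` is made up for illustration), the idea looks roughly like this, assuming the input contains only the characters 0-9:

```python
def digits_to_int(digits):
    """Convert a string of decimal digits to an int, one digit at a time."""
    value = 0
    for ch in digits:
        # Shift what we have one decimal place left, then add the new digit.
        value = value * 10 + int(ch)
    return value


# '542' is built up as 5, then 54, then 542.
print(digits_to_int('542'))
```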
If we ignore the fact that eval is evil, we can solve the problem with it.
def sum_numbers(s):
    s = s.replace(' ', '+')
    return eval(s)
Yes, it's that simple. But I wouldn't put that thing in production.
And sure we need to test that:
from hypothesis import given
import hypothesis.strategies as st
@given(list_num=st.lists(st.integers(), min_size=1))
def test_that_thing(list_num):
    assert sum_numbers(' '.join(str(i) for i in list_num)) == sum(list_num)
test_that_thing()
And it would raise nothing.
So I've got a string that looks like "012 + 2 - 01 + 24" for example. I want to be able to quickly (less code) evaluate that expression...
I could use eval() on the string, but I don't want 012 to be represented in octal form (10), I want it to be represented as an int (12).
My solution for this works, but it is not elegant. I am sort of assuming that there is a really good pythonic way to do this.
My solution:
# expression is some string that looks like "012 + 2 - 01 + 24"
atomlist = []
for atom in expression.split():
    if "+" not in atom and "-" not in atom:
        atomlist.append(int(atom))
    else:
        atomlist.append(atom)
# print(atomlist)
evalstring = ""
for atom in atomlist:
    evalstring += str(atom)
# print(evalstring)
num = eval(evalstring)
Basically, I tear apart the string, find the numbers in it and turn them into ints, and then rebuild the string with the ints (essentially removing leading 0's, except where 0 is a number on its own).
How can this be done better?
I'd be tempted to use regular expressions to remove the leading zeroes:
>>> re.sub(r'\b0+(?!\b)', '', '012 + 2 + 0 - 01 + 204 - 0')
'12 + 2 + 0 - 1 + 204 - 0'
This removes zeroes at the start of every number, except when the number consists entirely of zeroes:
the first \b matches a word (token) boundary;
the 0+ matches one or more consecutive zeroes;
the (?!\b) (negative lookahead) inhibits matches where the sequence of zeroes is followed by a token boundary.
One advantage of this approach over split()-based alternatives is that it doesn't require spaces in order to work:
>>> re.sub(r'\b0+(?!\b)', '', '012+2+0-01+204-0')
'12+2+0-1+204-0'
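A quick way to convince yourself of how the three parts of that pattern interact is to run it on a few edge cases. The sketch below wraps the substitution in a helper; the names `LEADING_ZEROS` and `strip_leading_zeros` are made up for illustration.

```python
import re

# \b0+(?!\b): zeroes at the start of a token that are not the whole token.
LEADING_ZEROS = re.compile(r'\b0+(?!\b)')


def strip_leading_zeros(expression):
    return LEADING_ZEROS.sub('', expression)


for sample in ('012 + 2 - 01 + 24', '000', '100 + 0', '1.05'):
    print(repr(sample), '->', repr(strip_leading_zeros(sample)))
```

One caveat worth knowing: the decimal point also counts as a token boundary, so '1.05' becomes '1.5'. The pattern is therefore only suitable for integer expressions such as the one in the question.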
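You can do this in one line using lstrip() to strip off any leading zeros (the `or '0'` keeps a token that consists only of zeros from becoming empty):
>>> s = "012 + 2 - 01 + 24"
>>> eval("".join(token.lstrip('0') or '0' for token in s.split()))
37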
I'd like to do it this way:
>>> s = '012 + 2 + 0 - 01 + 204 - 0'
>>> ' '.join(str(int(x)) if x.isdigit() else x for x in s.split())
'12 + 2 + 0 - 1 + 204 - 0'
Use float() if you want to handle them too :)
int does not assume that a leading zero indicates an octal number:
In [26]: int('012')
Out[26]: 12
Accordingly, you can safely evaluate the expression with the following code:
from operator import add, sub
from collections import deque
def mapper(item, opmap={'+': add, '-': sub}):
    try: return int(item)
    except ValueError: pass
    return opmap[item]
stack = deque()
# `if item` filters out empty strings between whitespace sequences
for item in (mapper(item) for item in "012 + 2 - 01 + 24".split(' ') if item):
    if stack and callable(stack[-1]):
        f = stack.pop()
        stack.append(f(stack.pop(), item))
    else: stack.append(item)
print(stack.pop())
Not a one-liner, but it is safe, because you control all of the functions which can be executed.