Add a number to the beginning of a string in particular locations - python

I have this string:
abc,12345,abc,abc,abc,abc,12345,98765443,xyz,zyx,123
What can I use to add a 0 to the beginning of each number in this string? So how can I turn that string into something like:
abc,012345,abc,abc,abc,abc,012345,098765443,xyz,zyx,0123
I've tried playing around with Regex but I'm unsure how I can use that effectively to yield the result I want. I need it to match with a string of numbers rather than a positive integer, but with only numbers in the string, so not something like:
1234abc567 into 01234abc567 as it has letters in it. Each value is always separated by a comma.

Use re.sub,
re.sub(r'(^|,)(\d)', r'\g<1>0\2', s)
or
re.sub(r'(^|,)(?=\d)', r'\g<1>0', s)
or
re.sub(r'\b(\d)', r'0\1', s)

Try following
re.sub(r'(?<=\b)(\d+)(?=\b)', r'\g<1>0', str)

If the numbers are always seperated by commas in your string, you can use basic list methods to achieve the result you want.
Let's say your string is called x
y=x.split(',')
x=''
for i in y:
if i.isdigit():
i='0'+i
x=x+i+','
What this piece of code does is the following:
Splits your string into pieces depending on where you have commas and returns a list of the pieces.
Checks if the pieces are actually numbers, and if they are a 0 is added using string concatenation.
Finally your string is rebuilt by concatenating the pieces along with the commas.

Related

Remove characters after matching two conditions

I have the Python code below and I would like the output to be a string: "P-1888" discarding all numbers after the 2nd "-" and removing the leading 0's after the 1st "-".
So far all I have been able to do in the following code is to remove the trailing 0's:
import re
docket_no = "P-01888-000"
doc_no_rgx1 = re.compile(r"^([^\-]+)\-(0+(.+))\-0[\d]+$")
massaged_dn1 = doc_no_rgx1.sub(r"\1-\2", docket_no)
print(massaged_dn1)
You can use the split() method to split the string on the "-" character and then use the join() method to join the first and second elements of the resulting list with a "-" character. Additionally, you can use the lstrip() method to remove the leading 0's after the 1st "-". Try this.
docket_no = "P-01888-000"
docket_no_list = docket_no.split("-")
docket_no_list[1] = docket_no_list[1].lstrip("0")
massaged_dn1 = "-".join(docket_no_list[:2])
print(massaged_dn1)
First way is to use capturing groups. You have already defined three of them using brackets. In your example the first capturing group will get "P", and the third capturing group will get numbers without leading zeros. You can get captured data by using re.match:
match = doc_no_rgx1.match(docket_no)
print(f'{match.group(1)}-{match.group(3)}') # Outputs 'P-1888'
Second way is to not use regex for such a simple task. You could split your string and reassemble it like this:
parts = docket_no.split('-')
print(f'{parts[0]}-{parts[1].lstrip("0")}')
It seems like a sledgehammer/nut situation but of you do want to use re then you could use:
doc_no_rgx1 = ''.join(re.findall('([A-Z]-)0+(\d+)-', docket_no)[0])
I don't think I'd use a regular expression for this purpose. Your usecase can be handled by standard string manipulation so using a regular expression would be overkill. Instead, consider doing this:
docket_nos = "P-01888-000".split('-')[:-1]
docket_nos[1] = docket_nos[1].lstrip('0')
docket_no = '-'.join(docket_nos)
print(docket_no) # P-1888
This might seem a little bit verbose but it does exactly what you're looking for. The first line splits docket_no by '-' characters, producing substrings P, 01888 and 000; and then discards the last substring. The second line strips leading zeros from the second substring. And the third line joins all these back together using '-' characters, producing your desired result of P-1888.
Functionally this is no different than other answers suggesting that you split on '-' and lstrip the zero(s), but personally I find my code more readable when I use multiple assignment to clarify intent vs. using indexes:
def convert_docket_no(docket_no):
letter, number, *_ = docket_no.split('-')
return f'{letter}-{number.lstrip("0")}'
_ is used here for a "throwaway" variable, and the * makes it accept all elements of the split list past the first two.

Regular expression to retrieve string parts within parentheses separated by commas

I have a String from which I want to take the values within the parenthesis. Then, get the values that are separated from a comma.
Example: x(142,1,23ERWA31)
I would like to get:
142
1
23ERWA31
Is it possible to get everything with one regex?
I have found a method to do so, but it is ugly.
This is how I did it in python:
import re
string = "x(142,1,23ERWA31)"
firstResult = re.search("\((.*?)\)", string)
secondResult = re.search("(?<=\()(.*?)(?=\))", firstResult.group(0))
finalResult = [x.strip() for x in secondResult.group(0).split(',')]
for i in finalResult:
print(i)
142
1
23ERWA31
This works for your example string:
import re
string = "x(142,1,23ERWA31)"
l = re.findall (r'([^(,)]+)(?!.*\()', string)
print (l)
Result: a plain list
['142', '1', '23ERWA31']
The expression matches a sequence of characters not in (,,,) and – to prevent the first x being picked up – may not be followed by a ( anywhere further in the string. This makes it also work if your preamble x consists of more than a single character.
findall rather than search makes sure all items are found, and as a bonus it returns a plain list of the results.
You can make this a lot simpler. You are running your first Regex but then not taking the result. You want .group(1) (inside the brackets), not .group(0) (the whole match). Once you have that you can just split it on ,:
import re
string = "x(142,1,23ERWA31)"
firstResult = re.search("\((.*?)\)", string)
for e in firstResult.group(1).split(','):
print(e)
A little wonky looking, and also assuming there's always going to be a grouping of 3 values in the parenthesis - but try this regex
\((.*?),(.*?),(.*?)\)
To extract all the group matches to a single object - your code would then look like
import re
string = "x(142,1,23ERWA31)"
firstResult = re.search("\((.*?),(.*?),(.*?)\)", string).groups()
You can then call the firstResult object like a list
>> print(firstResult[2])
23ERWA31

How do I start reading from a certain character in a string?

I have a list of strings that look something like this:
"['id', 'thing: 1\nother: 2\n']"
"['notid', 'thing: 1\nother: 2\n']"
I would now like to read the value of 'other' out of each of them.
I did this by counting the number at a certain position but since the position of such varies I wondererd if I could read from a certain character like a comma and say: read x_position character from comma. How would I do that?
Assuming that "other: " is always present in your strings, you can use it as a separator and split by it:
s = 'thing: 1\nother: 2'
_,number = s.split('other: ')
number
#'2'
(Use int(number) to convert the number-like string to an actual number.) If you are not sure if "other: " is present, enclose the above code in try-except statement.

Find Certain String Indices

I have this string and I need to get a specific number out of it.
E.G. encrypted = "10134585588147, 3847183463814, 18517461398"
How would I pull out only the second integer out of the string?
You are looking for the "split" method. Turn a string into a list by specifying a smaller part of the string on which to split.
>>> encrypted = '10134585588147, 3847183463814, 18517461398'
>>> encrypted_list = encrypted.split(', ')
>>> encrypted_list
['10134585588147', '3847183463814', '18517461398']
>>> encrypted_list[1]
'3847183463814'
>>> encrypted_list[-1]
'18517461398'
Then you can just access the indices as normal. Note that lists can be indexed forwards or backwards. By providing a negative index, we count from the right rather than the left, selecting the last index (without any idea how big the list is). Note this will produce IndexError if the list is empty, though. If you use Jon's method (below), there will always be at least one index in the list unless the string you start with is itself empty.
Edited to add:
What Jon is pointing out in the comment is that if you are not sure if the string will be well-formatted (e.g., always separated by exactly one comma followed by exactly one space), then you can replace all the commas with spaces (encrypt.replace(',', ' ')), then call split without arguments, which will split on any number of whitespace characters. As usual, you can chain these together:
encrypted.replace(',', ' ').split()

How to attach a string to an integer in Python 3

I've got a two letter word that I'd like to attach to a double digit number. The word is an integer and the number is a string.
Say the name of the number is "number" and the name of the word is "word".
How would you make it print both of them together without spaces. When I try it right now it still has a space between them regardless of what I try.
Thanks !
'{}{}'.format(word, number)
For example,
In [19]: word='XY'
In [20]: number=123
In [21]: print('{}{}'.format(word, number))
XY123
The print function has a sep parameter that controls spacing between items:
print(number, word, sep="")
If you need a string, rather than printing, than unutbu's answer with string formatting is better, but this may get you to your desired results with fewer steps.
In python 3 the preferred way to construct strings is by using format
To print out a word and a number joined together you would use:
print("{}{}".format(word, number))

Categories

Resources