String searching - Python

I need to check whether a string contains a number of letters and integers and, if possible, split a string into three separate parts based on user input. Any ideas?
Any help is useful, thanks.

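I think the best way to do this is to use regex. You can filter out everything except the digits:
import re
str1 = "aas30dsa20"
str2 = re.sub(r"\D", "", str1)
# str2 is now '3020'
You can then search for the result in the original string. Note that index() only finds it when the digits form one contiguous run (for example "aas3020"); for "aas30dsa20" it raises ValueError, because "3020" is not a substring.
start = str1.index(str2)
end = len(str1)
Finally, slice the original string with
str1[start:end]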
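For the check-and-split part of the question, a minimal sketch using built-in string methods and re.findall might look like the following. The helper names has_letters_and_digits and split_runs are made up for illustration, and splitting into alternating digit and non-digit runs is only one guess at what "separate parts" means here.

```python
import re

def has_letters_and_digits(text):
    """Return True if text contains at least one letter and at least one digit."""
    return any(c.isalpha() for c in text) and any(c.isdigit() for c in text)

def split_runs(text):
    """Split text into alternating runs of digits and non-digits."""
    return re.findall(r"\d+|\D+", text)

print(has_letters_and_digits("aas30dsa20"))  # True
print(split_runs("aas30dsa20"))              # ['aas', '30', 'dsa', '20']
```

Unlike the index() approach, this does not depend on the digits being contiguous.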

Related

Replace string with quotes, brackets, braces, and slashes in python

I have a string where I am trying to replace ["{\" with [{" and all \" with ".
I am struggling to find the right syntax to do this. Does anyone have a solid understanding of how to do it?
I am working with JSON, and I am inserting a string into the JSON properties. This put single quotes around the data inserted from my variable, and I need those single quotes gone. I tried calling json.dumps() on the data and doing a string replace, but it does not work.
Any help is appreciated. Thank you.
You can use the replace method.
See documentation and examples here
I would recommend maybe posting more of your code below so we can suggest a better answer. Just based on the information you have provided, I would say that what you are looking for are escape characters. I may be able to help more once you provide us with more info!
Use the target/replacement strings as arguments to replace().
The general format is mystring = mystring.replace("old_text", "new_text")
Since your target strings have backslashes, you also probably want to use raw strings to prevent them from being interpreted as special characters.
mystring = "something"
mystring = mystring.replace(r'["{\"', '[{"')
mystring = mystring.replace(r'\"', '"')
If it's two characters you want to replace, you have to check for the first character and then the second (which must come immediately after the first, and so on). In the first case, shorten the whole array by 3 elements whenever the condition is satisfied; in the second case, delete the \ from the array.
You can also find a particular substring with a built-in function and then use replace() to insert the string you want in its place.
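To make the replace() suggestions concrete, here is a small sketch. The sample value in raw is invented for illustration, and the third replace (for the closing `}"]`) is an assumed counterpart that the question does not mention.

```python
import json

# Hypothetical input: a JSON array whose object was embedded as an escaped string.
raw = r'["{\"name\": \"x\"}"]'

fixed = raw.replace(r'["{\"', '[{"')  # opening: ["{\"  ->  [{"
fixed = fixed.replace(r'\"', '"')     # every remaining \"  ->  "
fixed = fixed.replace('}"]', '}]')    # assumed closing counterpart, not in the question

print(fixed)              # [{"name": "x"}]
print(json.loads(fixed))  # [{'name': 'x'}]
```

If the inner value is itself valid JSON, as in this made-up sample, json.loads(json.loads(raw)[0]) decodes it without any string surgery.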

Extract Number before a Character in a String Using Python

I'm trying to extract the number before character "M" in a series of strings. The strings may look like:
"107S33M15H"
"33M100S"
"12M100H33M"
So basically there would be sets of numbers separated by different characters, and "M" may show up more than once. For the example here, I would like my code to return:
33
33
12,33 # doesn't matter what delimiter is used here
One way I could think of is to split the string by "M", and find items that are pure numbers, but I suspect there are better ways to do it. Thanks a lot for the help.
You may use a simple (\d+)M regex (1+ digit(s) followed with M where the digits are captured into a capture group) with re.findall.
See IDEONE demo:
import re
s = "107S33M15H\n33M100S\n12M100H33M"
print(re.findall(r"(\d+)M", s))
And here is a regex demo
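If the strings arrive one at a time rather than joined by newlines, the same (\d+)M pattern can be applied per string. This is a sketch using the three sample strings from the question; the variable names are arbitrary.

```python
import re

strings = ["107S33M15H", "33M100S", "12M100H33M"]

for s in strings:
    # findall returns only the captured digits, one entry per "M" they precede
    numbers = re.findall(r"(\d+)M", s)
    print(",".join(numbers))
```

This prints 33, 33 and 12,33 on separate lines, matching the output asked for.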
You can use rpartition to get everything before the last "M", though you still have to pull the trailing digits out of that prefix.
s = '107S33M15H'
prefix = s.rpartition('M')[0]  # '107S33'

How to parse a string into two different strings based on first instance of an integer? (Python)

I'm trying to take a string like "PR405j" and separate it into two strings. In this instance, the two strings would be "PR" and "405j". There are a variety of strings I have to do this to. Examples:
"ACR498" would be "ACR" and "498", "FR707e" would be "FR" and "707e", "TY699l" would be "TY" and "699l" and so on and so forth.
The problem I'm having is separating the first part from the second part. The amount of characters on either side differs, and the second string (the one with the numbers) may or may not have alphabetic characters in there as well. The only commonality between all of these strings is that you can divide them based on the first instance of an integer.
I thought a for loop that goes through every character in the original string and builds two separate strings inside would work, but I could only think to base the separation on integers and alphabetic characters, which would make something like "PR405j" turn into "PRj" and "405".
I also thought the split string method would help, but there's no one character all these strings have in common.
Finally, I can't split the strings based on the numbers of alphabetic characters in the beginning of the string (say 2 for "PR405j") because there is variation between strings.
If anybody could help me with this, I'd greatly appreciate it. Thank you!
You can use regular expressions to do simple string matching such as this. The expression r'(\D+)(.+)' is saying 'Extract one or more non-digits as the first group, then extract one or more other characters as the second.'
import re
inputs = ['PR405j']
for text in inputs:
    match = re.match(r'(\D+)(.+)', text)
    start = match.group(1)  # leading non-digits
    end = match.group(2)    # everything from the first digit on
    print(text, start, end)
EDIT: I misunderstood the question, thought you wanted 3 groups, not two. Zack Bloom's answer is more correct, but I'll leave this here as a reference in case someone has a similar question.
You can use re.split:
>>> re.split(r'(\d+)', 'PR405j')
['PR', '405', 'j']
The trick here is using a capturing group (with parentheses) as the regular expression to split by; this will cause the output to contain the portions that caused the split as well as the portions to either side of it. If you have a string with multiple groups of digits separated by non-digits, this will fully split the string:
>>> re.split(r'(\d+)', 'PR405j123abc')
['PR', '405', 'j', '123', 'abc']
Use re.split, like the rest of the answers, but you have to munge the result to deal with the grouping:
import re
re.split(r'([a-zA-Z]+)', 'PR405j', maxsplit=1)[1:]  # ['PR', '405j']
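Since the question boils down to "divide at the first digit", another option is to locate that digit with re.search and slice. This is only a sketch; the function name split_at_first_digit is made up, and returning an empty second part when there is no digit is an assumption about the desired behavior.

```python
import re

def split_at_first_digit(text):
    """Split text into (prefix, rest) at the first digit; rest is '' if there is none."""
    match = re.search(r"\d", text)
    if match is None:
        return text, ""
    return text[:match.start()], text[match.start():]

for code in ["PR405j", "ACR498", "FR707e", "TY699l"]:
    print(split_at_first_digit(code))
```

For "PR405j" this gives ('PR', '405j'), and it keeps any trailing letters attached to the numeric part.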

How to find if a string contains all the certain characters?

For example:
Characters to match: 'czk'
string1: 'zack' Matches
string2: 'zak' Does not match
I tried (c)+(k)+(z) and [ckz], which are obviously wrong. I feel this is a simple task, but I am unable to find an answer.
The most natural way would probably be to use sets rather than regex, like so
set('czk').issubset(s)
Code is very often simpler and easier to maintain without using regex much.
Basically you have to sort the string first so you get "ackz" and then you can use a regex like /.*c.*k.*z.*/ to match against.
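Both suggestions can be sketched side by side using the example from the question. The function names are invented for illustration, and the regex variant sorts the required characters as well as the string so that the pattern order lines up.

```python
import re

def contains_all_set(required, text):
    """Set-based check: every character in required appears somewhere in text."""
    return set(required).issubset(text)

def contains_all_regex(required, text):
    """Sort-then-regex check: sort both sides, then match the characters in order."""
    pattern = ".*".join(re.escape(c) for c in sorted(set(required)))
    return re.search(pattern, "".join(sorted(text))) is not None

print(contains_all_set("czk", "zack"), contains_all_regex("czk", "zack"))  # True True
print(contains_all_set("czk", "zak"), contains_all_regex("czk", "zak"))    # False False
```

The set version is shorter and does no sorting, which is probably why it reads as the more natural choice.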

Pad an integer using a regular expression

I'm using regular expressions with a python framework to pad a specific number in a version number:
10.2.11
I want to transform the second element to be padded with a zero, so it looks like this:
10.02.11
My regular expression looks like this:
^(\d{2}\.)(\d{1})([\.].*)
If I just regurgitate back the matching groups, I use this string:
\1\2\3
When I use my favorite regular expression test harness (http://kodos.sourceforge.net/), I can't get it to pad the second group. I tried \1\20\3, but that interprets the second reference as 20, and not 2.
Because of the library I'm using this with, I need it to be a one liner. The library takes a regular expression string, and then a string for what should be used to replace it with.
I'm assuming I just need to escape the matching groups string, but I can't figure it out. Thanks in advance for any help.
How about a completely different approach?
nums = version_string.split('.')
print(".".join("%02d" % int(n) for n in nums))
What about removing the . from the regex?
^(\d{2})\.(\d{1})[\.](.*)
replace with:
\1.0\2.\3
Try this:
(^\d(?=\.)|(?<=\.)\d(?=\.)|(?<=\.)\d$)
And replace the match by 0\1. This will make any number at least two digits long.
Does your library support named groups? That might solve your problem.
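If the library hands the replacement string to Python's re module (an assumption, since the question does not name the library), the ambiguity in \1\20\3 goes away with the \g<N> form of a group reference, which lets a literal digit follow a group number. A minimal sketch with the original pattern:

```python
import re

version = "10.2.11"
pattern = r"^(\d{2}\.)(\d{1})([\.].*)"

# \g<1> ends the group reference explicitly, so the following 0 is a literal zero
padded = re.sub(pattern, r"\g<1>0\2\3", version)
print(padded)  # 10.02.11
```

Named groups work the same way in the replacement, via \g<name>.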
