Finding a string between two specific characters - python

I have a list of directories:
C:\level1\level2\level3\level4\level5\level6\level7
I need a way to extract "level4" and "level5".
In other words, I need to extract the string between the 4th and 5th backslash and the string between the 5th and 6th backslash.
How would I go about doing this?

Try splitting by backslash:
s = r"C:\level1\level2\level3\level4\level5\level6\level7"  # raw string, so the backslashes are taken literally
l = s.split('\\')
print(l[4], l[5])

Use str.split
>>> s = r'C:\level1\level2\level3\level4\level5\level6\level7'
>>> words = s.split('\\') # The backslash must be escaped; r'\' is not a valid literal
>>> words[4]
'level4'
>>> words[5]
'level5'
If you want to join those two words, then use str.join
>>> ' '.join((words[4],words[5]))
'level4 level5'
You can also join an arbitrary selection of levels:
>>> ' '.join(words[i] for i in [4,5,6])
'level4 level5 level6'

You can use str.split to split a string according to a specific delimiter.
path = r'C:\level1\level2\level3\level4\level5\level6\level7'
dirs = path.split('\\')
print(dirs[4], dirs[5])
# will print:
# level4 level5
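If the strings are real Windows paths, the standard library's pathlib can do the splitting as well. The following is only a minimal sketch, not part of the answers above. It assumes every path starts with a drive such as C:\, so that index 0 of parts is the drive root and level4 sits at index 4.

```python
from pathlib import PureWindowsPath

# PureWindowsPath parses Windows-style paths on any operating system.
path = PureWindowsPath(r'C:\level1\level2\level3\level4\level5\level6\level7')

# parts[0] is the drive plus root ('C:\\'), so the remaining indices
# line up with the ones used with str.split.
parts = path.parts
level4, level5 = parts[4], parts[5]
print(level4, level5)
```

Compared with str.split, this avoids writing the separator by hand, at the cost of an extra import.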

Related

Rstrip not removing correct backslashes or giving position

I have a string that looks like \uisfhb\dfjn
Its length will vary. I'm struggling to get my head around rsplit and the fact that backslash is an escape character. I only want "dfjn".
I currently have:
more = "\\\\uisfhb\dfjn"
more = more.replace(r'"\\\\', r"\\")
sharename = more.rsplit(r'\\', 2)
print(sharename)
and I'm getting back:
['', 'uisfhb\dfjn']

If you want to partition a string on a literal backslash, you need to escape the backslash with another backslash in the separator.
>>> more.split('\\')
['', '', 'uisfhb', 'dfjn']
>>> more.rsplit('\\', 1)
['\\\\uisfhb', 'dfjn']
>>> more.rpartition('\\')
('\\\\uisfhb', '\\', 'dfjn')
Once the string has been split, the last element can be accessed using the index -1:
>>> sharename = more.rsplit('\\', 1)[-1]
>>> sharename
'dfjn'
or using sequence-unpacking syntax (the * operator)
>>> *_, sharename = more.rpartition('\\')
>>> sharename
'dfjn'
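The confusion in the question largely comes down to how many characters each literal actually contains. A small sketch of that, with variable names made up for illustration and the question's string written with every backslash escaped:

```python
one = '\\'    # normal literal: a single backslash character
two = r'\\'   # raw literal: two backslash characters

# The example string from the question; its value starts with two backslashes.
more = '\\\\uisfhb\\dfjn'

# Splitting on a pair of backslashes only matches the leading pair.
by_pair = more.rsplit(two, 2)

# Splitting on one backslash, at most once from the right, isolates the tail.
sharename = more.rsplit(one, 1)[-1]

print(len(one), len(two))
print(by_pair)
print(sharename)
```

This is why rsplit(r'\\', 2) in the question leaves the single backslash before dfjn untouched.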

I think this is an issue with raw strings. Try this:
more = "\\\\uisfhb\dfjn"
more = more.replace("\\\\", "\\")
sharename = more.split("\\")[2] # using split and not rsplit
print(sharename)

If sharename is the last node in the tree, this will get it:
>>> more = "\\\\uisfhb\dfjn"
>>> sharename = more.split('\\')[-1]
>>> sharename
'dfjn'

Python: list strip overkill

I just want to remove the '.SI' suffix from each item in the list, but strip overdoes it and also removes any leading or trailing S or I.
ab = ['abc.SI','SIV.SI','ggS.SI']
[x.strip('.SI') for x in ab]
>> ['abc','V','gg']
The output I want is:
>> ['abc','SIV','ggS']
Is there an elegant way to do it? I'd prefer not to use a for loop, as my list is long.

Why strip? You can use .replace():
[x.replace('.SI', '') for x in ab]
Output:
['abc', 'SIV', 'ggS']
(This removes .SI anywhere in the string; have a look at the other answers if you want to remove it only at the end.)
The reason strip() doesn't work is explained in the docs:
The chars argument is not a prefix or suffix; rather, all combinations of its values are stripped
So it strips every leading and trailing character that appears in the argument, in any order, rather than removing the argument as a whole substring.
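A quick sketch of that behaviour, using one item from the question plus a made-up string with an x on both ends:

```python
chars = '.SI'

# strip() keeps removing characters from each end while they are in the set.
stripped = 'SIV.SI'.strip(chars)     # 'S', 'I' go from the left; 'I', 'S', '.' from the right
same_set = 'SIV.SI'.strip('IS.')     # same set of characters, different order
untouched = 'xSIV.SIx'.strip(chars)  # 'x' is not in the set, so nothing is removed

print(stripped, same_set, untouched)
```

The order of the characters in the argument makes no difference, which shows that it is treated as a set and not as a substring.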

If you want to remove the substring only from the end, the correct way to achieve this will be:
>>> ab = ['abc.SI','SIV.SI','ggS.SI']
>>> sub_string = '.SI'
# endswith() checks for the presence of the substring at the end
>>> [s[:-len(sub_string)] if s.endswith(sub_string) else s for s in ab]
['abc', 'SIV', 'ggS']
This is safer than str.replace() (as mentioned in TrakJohnson's answer), which removes the substring even when it appears in the middle of the string. For example:
>>> 'ab.SIrt'.replace('.SI', '')
'abrt'
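On Python 3.9 and later, str.removesuffix expresses the same end-only removal directly. A minimal sketch under that version assumption, with the 'ab.SIrt' string added to the question's list to show that the middle of a string is left alone:

```python
ab = ['abc.SI', 'SIV.SI', 'ggS.SI', 'ab.SIrt']

# removesuffix() (Python 3.9+) drops the suffix only if the string ends with it.
cleaned = [s.removesuffix('.SI') for s in ab]
print(cleaned)
```

On older Python versions the endswith() comprehension is the equivalent.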

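Use this: [x[:-3] for x in ab]. Note that it drops the last three characters unconditionally, so it only fits if every item ends with '.SI'.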

Use split instead of strip and get the first element:
[x.split('.SI')[0] for x in ab]

Pythonic way to replace every second comma of string with space

I have a string which looks like this:
coords = "86.2646484375,23.039297747769726,87.34130859375,22.59372606392931,88.13232421875,24.066528197726857"
What I want is to bring it to this format:
coords = "86.2646484375,23.039297747769726 87.34130859375,22.59372606392931 88.13232421875,24.066528197726857"
So every second comma should be replaced with a space. Is there a simple, pythonic way to do this?
Right now I am trying to use the split function to create a list and then loop through the list, but that does not seem very straightforward.

First let's import the regular expression module and define your coords variable:
>>> import re
>>> coords = "86.2646484375,23.039297747769726,87.34130859375,22.59372606392931,88.13232421875,24.066528197726857"
Now, let's replace every second comma with a space:
>>> re.sub('(,[^,]*),', r'\1 ', coords)
'86.2646484375,23.039297747769726 87.34130859375,22.59372606392931 88.13232421875,24.066528197726857'
The regular expression (,[^,]*), looks for pairs of commas. The replacement text, r'\1 ' keeps the first comma but replaces the second with a space.
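To see how the pattern behaves, here is a small sketch on made-up input, including a case with an odd number of fields. The function name and the sample strings are hypothetical.

```python
import re

# A comma, then a run of non-commas (both captured), then the comma to replace.
PAIR = re.compile(r'(,[^,]*),')

def every_second_comma_to_space(text):
    # \1 puts back the first comma and the field that follows it.
    return PAIR.sub(r'\1 ', text)

even = every_second_comma_to_space('1,2,3,4,5,6')
odd = every_second_comma_to_space('1,2,3,4,5')
print(even)
print(odd)
```

With an odd number of fields the last field simply ends up on its own after the final space.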

This sort of works:
>>> s = coords.split(',')
>>> s
['86.2646484375', '23.039297747769726', '87.34130859375', '22.59372606392931', '88.13232421875', '24.066528197726857']
>>> [','.join(i) for i in zip(s[::2], s[1::2])]
['86.2646484375,23.039297747769726', '87.34130859375,22.59372606392931', '88.13232421875,24.066528197726857']
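To get from that list of pairs to the single string asked for, the pairs can be joined once more with a space. A minimal sketch with made-up sample values:

```python
coords = '1.5,2.5,3.5,4.5,5.5,6.5'  # made-up sample values

fields = coords.split(',')
# Pair up even-indexed and odd-indexed fields, then join at both levels.
pairs = zip(fields[::2], fields[1::2])
result = ' '.join(','.join(pair) for pair in pairs)
print(result)
```

Note that zip stops at the shorter input, so with an odd number of fields the last one would be dropped silently.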

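The pythonic way is to split the string and join it again with alternating delimiters:
from itertools import chain, cycle
# zip pairs every field with the next delimiter from the cycle ',', ' ';
# the final [:-1] drops the delimiter that follows the last field.
coords = ''.join(chain.from_iterable(zip(coords.split(','), cycle(', '))))[:-1]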

Using parentheses as delimiter in re or str.split() python

I am trying to split a string such as add(ten)sub(one) into add(ten) sub(one).
I can't figure out how to match the closing parenthesis. I have used re.sub(r'\\)', '\\) ') and every variation of escaping the parentheses I can think of. It is hard to tell in this font, but I am trying to add a space between these commands so I can split the string into a list later.

There's no need to escape ) in the replacement string. ) has a special meaning only in the regex pattern, so it must be escaped there in order to match it literally; in a normal string it can be used as is.
>>> import re
>>> strs = "add(ten)sub(one)"
>>> re.sub(r'\)(?=\S)',r') ', strs)
'add(ten) sub(one)'
As @StevenRumbalski pointed out in the comments, the above operation can be done more simply using str.replace and str.strip:
>>> strs.replace(')',') ').strip()
'add(ten) sub(one)'
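The role of the (?=\S) lookahead is easiest to see next to the same substitution without it. A short sketch, with variable names chosen just for illustration:

```python
import re

strs = 'add(ten)sub(one)'

# Without the lookahead, the final ')' also gets a space appended.
plain = re.sub(r'\)', ') ', strs)
# (?=\S) only matches a ')' that is followed by a non-whitespace character.
guarded = re.sub(r'\)(?=\S)', ') ', strs)

# Either result can then be split on whitespace into a list of commands.
commands = guarded.split()
print(repr(plain), repr(guarded), commands)
```

Since str.split() with no argument ignores trailing whitespace, the lookahead only matters if the spaced-out string itself is needed.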

d = ')'
my_str = 'add(ten)sub(one)'
result = [t+d for t in my_str.split(d) if len(t) > 0]
# result is now ['add(ten)', 'sub(one)']

Create a list of all substrings:
import re
a = 'add(ten)sub(one)'
print(re.findall(r'(.+?\(.+?\))', a))
Output:
['add(ten)', 'sub(one)']

How to delete some characters from a string by matching certain character in python

I am trying to delete a certain portion of a string if a match is found in it, as below:
string = 'Newyork, NY'
I want to delete the comma and all the characters after it, if a comma is present in the string.
Can anyone let me know how to do this?

Use .split():
string = string.split(',', 1)[0]
We split the string on the comma only once, to save Python the work of splitting on any further commas.
Alternatively, you can use .partition():
string = string.partition(',')[0]
Demo:
>>> 'Newyork, NY'.split(',', 1)[0]
'Newyork'
>>> 'Newyork, NY'.partition(',')[0]
'Newyork'
.partition() is the faster method:
>>> import timeit
>>> timeit.timeit("'one, two'.split(',', 1)[0]")
0.52929401397705078
>>> timeit.timeit("'one, two'.partition(',')[0]")
0.26499605178833008

You can split the string with the delimiter ",":
string.split(",")[0]
Example:
'Newyork, NY'.split(",") # ['Newyork', ' NY']
'Newyork, NY'.split(",")[0] # 'Newyork'

Try this:
s = "this, is"
m = s.index(',')
l = s[:m]

A few options:
string[:string.index(",")]
Here, we find the position of the character with .index and then use slicing. This will raise a ValueError if "," cannot be found in the string.
string.split(",")[0]
The split function gives you a list of the substrings that were separated by "," and you just take the first element of that list. This works even if "," is not present in the string (there'd be nothing to split in that case, so string.split(...) == [string]).
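The difference between these options only shows up when the comma is missing. A minimal sketch of that case; before_comma is a hypothetical helper name, and partition is used here as one of the variants that never raises:

```python
def before_comma(text):
    """Return the part of text before the first comma (hypothetical helper)."""
    # partition() always returns a 3-tuple; if there is no comma,
    # the first item is the whole string.
    return text.partition(',')[0]

with_comma = before_comma('Newyork, NY')
without_comma = before_comma('Newyork')

# index() is the option that fails when the comma is absent.
try:
    'Newyork'.index(',')
    index_failed = False
except ValueError:
    index_failed = True

print(with_comma, without_comma, index_failed)
```

Which behaviour is preferable depends on whether a missing comma should be treated as an error or as "nothing to remove".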
