Rstrip not removing correct backslashes or giving position - python

I have a string that looks like \uisfhb\dfjn
Its length will vary. I'm struggling to get my head around rsplit and the fact that backslash is an escape character. I only want "dfjn".
I currently have:
more = "\\\\uisfhb\dfjn"
more = more.replace(r'"\\\\', r"\\")
sharename = more.rsplit(r'\\', 2)
print(sharename)
and I'm getting back:
['', 'uisfhb\dfjn']

If you want to partition a string on a literal backslash, you need to escape the backslash with another backslash in the separator.
>>> more.split('\\')
['', '', 'uisfhb', 'dfjn']
>>> more.rsplit('\\', 1)
['\\\\uisfhb', 'dfjn']
>>> more.rpartition('\\')
('\\\\uisfhb', '\\', 'dfjn')
Once the string has been split, the last element can be accessed using the index -1:
>>> sharename = more.rsplit('\\', 1)[-1]
>>> sharename
'dfjn'
or by using sequence-unpacking syntax (the * operator):
>>> *_, sharename = more.rpartition('\\')
>>> sharename
'dfjn'
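To see why the doubled backslashes behave this way, it can help to count what the literal actually contains. The following is only an illustrative sketch: the helper name last_component is made up for this example, and the literal also escapes the inner backslash (\\d) so that it runs without an invalid-escape warning on recent Python versions.

```python
def last_component(path, sep="\\"):
    """Return the text after the final separator (hypothetical helper)."""
    # rpartition returns (head, sep, tail); if sep is absent, tail is the whole string.
    return path.rpartition(sep)[2]

more = "\\\\uisfhb\\dfjn"    # the actual characters are \\uisfhb\dfjn
print(len("\\"))             # 1: the literal "\\" is a single backslash
print(more.count("\\"))      # 3 backslashes in total
print(last_component(more))  # dfjn
```

Because rpartition falls back to the whole string when there is no separator, the helper also works for names that contain no backslash.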

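I think this is an issue with the raw strings: r'"\\\\' contains a stray double quote followed by four literal backslashes, so replace() never matches, and r'\\' is two backslashes rather than one. Try this: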
more = "\\\\uisfhb\dfjn"
more = more.replace("\\\\", "\\")
sharename = more.split("\\")[2] # using split and not rsplit
print(sharename)

If sharename is the last component of the path, this will get it:
>>> more = "\\\\uisfhb\dfjn"
>>> sharename = more.split('\\')[-1]
>>> sharename
'dfjn'

Related

Strip removing more characters than expected

Can anyone explain what's going on here:
s = 'REFPROP-MIX:METHANOL&WATER'
s.lstrip('REFPROP-MIX') # this returns ':METHANOL&WATER' as expected
s.lstrip('REFPROP-MIX:') # returns 'THANOL&WATER'
What happened to that 'ME'? Is a colon a special character for lstrip? This is particularly confusing because this works as expected:
s = 'abc-def:ghi'
s.lstrip('abc-def') # returns ':ghi'
s.lstrip('abc-def:') # returns 'ghi'
str.lstrip removes all the characters in its argument from the string, starting at the left. Since all the characters in the left prefix "REFPROP-MIX:ME" are in the argument "REFPROP-MIX:", all those characters are removed. Likewise:
>>> s = 'abcadef'
>>> s.lstrip('abc')
'def'
>>> s.lstrip('cba')
'def'
>>> s.lstrip('bacabacabacabaca')
'def'
str.lstrip does not remove whole strings (of length greater than 1) from the left. If you want to do that, use a regular expression with an anchor ^ at the beginning:
>>> import re
>>> s = 'REFPROP-MIX:METHANOL&WATER'
>>> re.sub(r'^REFPROP-MIX:', '', s)
'METHANOL&WATER'
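If a regular expression feels heavy for this, a plain prefix check is another option. The sketch below uses a hypothetical helper, strip_prefix, built on str.startswith and slicing; on Python 3.9 and later the built-in str.removeprefix does the same job.

```python
def strip_prefix(text, prefix):
    """Remove `prefix` once from the start of `text`, if present (hypothetical helper)."""
    if text.startswith(prefix):
        return text[len(prefix):]
    return text

s = 'REFPROP-MIX:METHANOL&WATER'
print(strip_prefix(s, 'REFPROP-MIX:'))  # METHANOL&WATER
print(s.lstrip('REFPROP-MIX:'))         # THANOL&WATER (strips a set of characters)
```

Unlike lstrip, this treats the argument as one whole string, so the leading 'ME' of 'METHANOL' is left alone.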
The method mentioned by @PadraicCunningham is a good workaround for the particular problem as stated.
Just split by the separating character and select the last value:
s = 'REFPROP-MIX:METHANOL&WATER'
res = s.split(':', 1)[-1] # 'METHANOL&WATER'

Finding string between 2 specific characters

I have a list of directories:
C:\level1\level2\level3\level4\level5\level6\level7
I need a way to extract "level4" and "level5".
In other words, I need to extract the string between the 4th and 5th backslash and the string between the 5th and 6th backslash.
How would I go about doing this?
Try splitting by backslash:
s = r"C:\level1\level2\level3\level4\level5\level6\level7"  # raw string keeps the backslashes literal
parts = s.split('\\')
print(parts[4], parts[5])
Use str.split
>>> s = r'C:\level1\level2\level3\level4\level5\level6\level7'
>>> words = s.split('\\') # The separator backslash must be escaped; r'\' is not a valid literal
>>> words[4]
'level4'
>>> words[5]
'level5'
If you want to join those two words, then use str.join
>>> ' '.join((words[4],words[5]))
'level4 level5'
Also, if you want to join several levels:
>>> ' '.join(words[i] for i in [4,5,6])
'level4 level5 level6'
You can use str.split to split a string according to a specific delimiter.
path = r'C:\level1\level2\level3\level4\level5\level6\level7'
dirs = path.split('\\')
print(dirs[4], dirs[5])
# will print:
# level4 level5

Python, can't replace generator object

I need to replace a string's punctuation marks with spaces.
The problem is that I need to do it in one line.
For example, given the string 'H,+-=/e^##%ll-!!..o'
the result should be 'H-----e----ll-----o'
where '-' stands for ' ' (a space).
When I do
replace((c for c in string.punctuation),' ')
I get the error:
TypeError: Can't convert 'generator' object to str implicitly
I tried putting it in a list, in a set, even in a dict,
but this error keeps coming back.
How can I get around this?
str.replace() doesn't take a list or generator; it only takes a string, and even then it won't do what you want. The method replaces one whole sequence of characters with another, so even x.replace(string.punctuation, '-') would only replace whole occurrences of the string.punctuation string in x with one dash.
Use string.maketrans() and str.translate() instead (this is the Python 2 spelling; on Python 3 the function is str.maketrans()):
import string
translationmap = string.maketrans(string.punctuation, '-' * len(string.punctuation))
x = x.translate(translationmap)
Demo:
>>> import string
>>> x = 'H,+-=/e^##%ll-!!..o'
>>> translationmap = string.maketrans(string.punctuation, '-' * len(string.punctuation))
>>> x.translate(translationmap)
'H-----e----ll-----o'
str.translate() is hands-down the fastest method to map characters to other characters, or delete characters from a string.
On Python 3, str.translate() (or in Python 2, unicode.translate()) takes a mapping instead:
translationmap = {ord(c): '-' for c in string.punctuation}
x.translate(translationmap)
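For completeness, here is a small runnable sketch of the Python 3 form, assuming the same sample string as in the question. It uses the static method str.maketrans to build the table, and '-' as the replacement so the result is easy to read; swap in ' ' to get spaces.

```python
import string

# Map every punctuation character to '-' (both arguments must have equal length).
table = str.maketrans(string.punctuation, '-' * len(string.punctuation))

x = 'H,+-=/e^##%ll-!!..o'
result = x.translate(table)
print(result)  # H-----e----ll-----o
```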
Try the following:
import string
''.join(map(lambda x : '-' if x in string.punctuation else x,
'H,+-=/e^##%ll-!!..o'))
You could also use re.sub for this:
>>> from re import sub
>>> sub(r"\W", "-", "H,+-=/e^##%ll-!!..o")
'H-----e----ll-----o'
\W matches all non-word characters.
Note that the above code will keep underscores. If you don't want them, replace \W with [\W_].

Using parentheses as delimiter in re or str.split() python

I am trying to split a string such as: add(ten)sub(one) into add(ten) sub(one).
I can't figure out how to match the close parentheses. I have used re.sub(r'\\)', '\\) ') and every variation of escaping the parentheses I can think of. It is hard to tell in this font, but I am trying to add a space between these commands so I can split it into a list later.
There's no need to escape ) in the replacement string. ) has a special meaning only in the regex pattern, so it must be escaped there to match it literally; in a normal string it can be used as is.
>>> import re
>>> strs = "add(ten)sub(one)"
>>> re.sub(r'\)(?=\S)',r') ', strs)
'add(ten) sub(one)'
As @StevenRumbalski pointed out in the comments, the above operation can be done more simply using str.replace and str.strip:
>>> strs.replace(')',') ').strip()
'add(ten) sub(one)'
d = ')'
my_str = 'add(ten)sub(one)'
# Split on the delimiter, drop empty pieces, and re-append the delimiter.
result = [t + d for t in my_str.split(d) if t]
print(result)  # ['add(ten)', 'sub(one)']
Create a list of all substrings:
import re
a = 'add(ten)sub(one)'
print(re.findall(r'(.+?\(.+?\))', a))
Output:
['add(ten)', 'sub(one)']
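Another way to get the list directly, without first inserting spaces, is to split at the empty position right after each closing parenthesis. This is only a sketch: the helper name is hypothetical, and it relies on re.split accepting patterns that match the empty string, which requires Python 3.7 or later.

```python
import re

def split_after_close_paren(text):
    """Split `text` after every ')' that is not at the end (hypothetical helper)."""
    # (?<=\)) : the position is preceded by a literal ')'
    # (?!$)   : but is not the end of the string, to avoid a trailing empty piece
    return re.split(r'(?<=\))(?!$)', text)

print(split_after_close_paren('add(ten)sub(one)'))  # ['add(ten)', 'sub(one)']
```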

How to delete some characters from a string by matching certain character in python

I am trying to delete a certain portion of a string if a match is found, as below:
string = 'Newyork, NY'
I want to delete all the characters after the comma, including the comma itself, if a comma is present in the string.
Can anyone let me know how to do this?
Use .split():
string = string.split(',', 1)[0]
We split the string on the comma only once, to save Python the work of splitting on any further commas.
Alternatively, you can use .partition():
string = string.partition(',')[0]
Demo:
>>> 'Newyork, NY'.split(',', 1)[0]
'Newyork'
>>> 'Newyork, NY'.partition(',')[0]
'Newyork'
.partition() is the faster method:
>>> import timeit
>>> timeit.timeit("'one, two'.split(',', 1)[0]")
0.52929401397705078
>>> timeit.timeit("'one, two'.partition(',')[0]")
0.26499605178833008
You can split the string with the delimiter ",":
string.split(",")[0]
Example:
'Newyork, NY'.split(",") # ['Newyork', ' NY']
'Newyork, NY'.split(",")[0] # 'Newyork'
Try this:
s = "this, is"
m = s.index(',')
l = s[:m]
A few options:
string[:string.index(",")]
This will raise a ValueError if , cannot be found in the string. Here, we find the position of the character with .index then use slicing.
string.split(",")[0]
The split function will give you a list of the substrings that were separated by ,, and you just take the first element of the list. This will work even if , is not present in the string (as there'd be nothing to split in that case, we'd have string.split(...) == [string])
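The difference between these options only shows up when the comma is missing. The sketch below, using a hypothetical helper named before_comma, illustrates that .partition() quietly returns the whole string in that case, while .index() raises.

```python
def before_comma(text):
    """Return everything before the first comma, or all of `text` if there is none."""
    head, _sep, _tail = text.partition(',')
    return head

print(before_comma('Newyork, NY'))  # Newyork
print(before_comma('Newyork'))      # Newyork

try:
    'Newyork'.index(',')
except ValueError:
    print('no comma found')         # index() raises when the comma is absent
```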
