Remove carriage return (\r) from a Python string - python

When you run something through popen in Python, each line of the output read from the buffer ends with CR-LF, that is, with a carriage return (decimal value 13) before the line feed. How do you remove this from a Python string?

You can simply do
s = s.replace('\r\n', '\n')
to replace all occurrences of CRLF with just LF, which seems to be what you want.

buffer = "<text from your subprocess here>\r\n"
no_cr = buffer.replace("\r\n", "\n")

If they are at the end of the string(s), I suggest using:
buffer = "<text from your subprocess here>\r\n"
no_cr = buffer.rstrip("\r\n")
You can also use rstrip() without arguments, which removes other trailing whitespace as well.

replace('\r\n', '\n') should work, but sometimes it does not (one possible cause is a '\r' that is not directly followed by '\n'). Instead you can drop every '\r' like this:
lines = buffer.split('\r')
cleanbuffer = ''
for line in lines:
    cleanbuffer = cleanbuffer + line

Actually, you can simply do the following:
s = s.strip()
This will remove any extraneous whitespace, including CRs and LFs, from both ends of the string.
s = s.rstrip()
does the same, but only at the end of the string.
That is:
s = ' Now is the time for all good... \t\n\r '
s.strip()
returns 'Now is the time for all good...'
s.rstrip()
returns ' Now is the time for all good...'
See http://docs.python.org/library/stdtypes.html for more.

You can do s = s.replace('\r', '') too.
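The approaches above differ in what they touch. A minimal sketch, assuming a made-up sample string `raw` that stands in for text read from a subprocess:

```python
# Hypothetical sample standing in for subprocess output.
raw = "first line\r\nsecond line\r\n"

# Convert every CRLF pair to LF; the line structure is kept.
normalized = raw.replace("\r\n", "\n")

# Strip CR and LF characters from the end only; interior CRLF pairs stay.
trimmed = raw.rstrip("\r\n")

# Split into lines without terminators; splitlines() recognises \r\n, \r and \n.
lines = raw.splitlines()

print(repr(normalized))
print(repr(trimmed))
print(lines)
```

Which one fits depends on whether the text is processed as a whole (replace) or line by line (rstrip or splitlines).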

Related

How to derive a string for the newline characters in a platform-independent way and use it in a regular expression pattern?

I have a question about how to represent the newline characters as a string in Python. I thought I could use the built-in function repr to achieve this, so I tried to verify the feasibility of this method by running the following code:
import os
lineBreakAsStr = repr(os.linesep)
print(f'lineBreakAsStr = {lineBreakAsStr}') # line 4
print(lineBreakAsStr == '\\r\\n') # line 5
I expected the output of line 5 to be True if the function repr converts the value of os.linesep to a string successfully. But on my Windows 7 PC, the output of line 4 is lineBreakAsStr = '\r\n' and the output of line 5 is False.
Can anyone explain why?
And how should I get the string that stands for the newline characters from the value of os.linesep and put it in a regular expression pattern, instead of using a fixed string like '\\r\\n'?
Below is a code snippet to demonstrate what I want to do. (I prefer the code in line 13 to the code in line 14, but the code in line 13 does not work. It has to be modified in some way to find the substring I want to find.):
import os, re
def f(pattern, data):
    p = re.compile(pattern)
    m = p.search(data)
    if m is not None:
        print(m.group())
    else:
        print('Not match.')
dataSniffedInConsole = ('procd: - init -\\\\r\\\\nPlease press Enter '
                        'to activate this console.\\\\r\\\\n')
lineBreakAsStr = repr(os.linesep) # line 13
# lineBreakAsStr = '\\\\\\\\r\\\\\\\\n' # line 14
pattern = rf'Please press Enter to activate this console.{lineBreakAsStr}'
f(pattern, dataSniffedInConsole)
Using repr will put quotes around the string. The quotes are probably causing your issue (output shown for Windows, where os.linesep is '\r\n'):
>>> newline = repr(os.linesep)
>>> print(newline)
'\r\n'
>>> newline == "'\\r\\n'"
True
A quick fix to your problem is to remove the quotes:
>>> newline = repr(os.linesep).strip("'")
>>> print(newline)
\r\n
>>> newline == "'\\r\\n'"
False
>>> newline == "\\r\\n"
True
I recommend you find a way to read the raw data from the console rather than a representation of it. Using the raw data will be much easier to process.
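If the data does contain real CR and LF characters, re.escape may be a simpler way than repr to put a line separator into a pattern. A minimal sketch, assuming a fixed separator `linesep` and made-up `data` and `shown` strings (none of these names come from the question; `os.linesep` could be substituted for `linesep`):

```python
import re

# Assumption: the separator is fixed here so the example behaves the same
# on every platform; os.linesep could be used instead.
linesep = "\r\n"

# Hypothetical raw data containing real CR and LF characters.
data = "procd: - init -\r\nPlease press Enter to activate this console.\r\n"

# re.escape makes the text safe to embed in a pattern (the '.' is escaped too).
pattern = re.escape("Please press Enter to activate this console." + linesep)
match = re.search(pattern, data)
print(match is not None)

# If the data instead holds the textual form (backslash, 'r', backslash, 'n'),
# escape the repr with its surrounding quotes sliced off.
shown = "activate this console.\\r\\n"
literal = re.escape(repr(linesep)[1:-1])
print(re.search("console\\." + literal, shown) is not None)
```

Slicing with [1:-1] removes exactly one quote from each end, whereas strip("'") removes every quote character found at either end.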

Remove string tail from first occurrence of a symbol

I need to remove a comment (if it exists) from a string. Comments start with #.
A line may contain multiple #.
E.g., a "separator" line: ################################
Is there a better (one-liner) way to do it than this:
ipound = line.find('#')
if ipound >= 0:
    line = line[:ipound].rstrip()
(rstrip is optional, to remove white space before the comment)
PS: the if cannot simply be dropped, because find returns -1 when there is no #:
>>> line = "test"
>>> line = line[:line.find('#')]
>>> line
'tes'
This should remove the comment and return the line.
def remove_comment(line):
    return line.split('#')[0].rstrip()
line.index("#") returns the index of the first occurrence of "#".
You can then use string slicing to get the stuff before it: line = line[:line.index("#")]. If there is no instance of "#", this will raise a ValueError, so instead do line = line[:line.index("#")] if "#" in line else line
If you want to remove the end of the string, I would advise you to use the re library. You can do a lot of complicated stuff with one line of code.
Removing what comes after a # (if there is one) is equivalent to keeping everything before the first #, or the whole string if there is no #. The pattern has to exclude # itself, because a greedy '.*#?' would simply match the whole string.
import re
string = 'hi#there'
new_string = re.match(r'[^#]*', string).group()
# new_string == 'hi'
string2 = 'hithere'
new_string_2 = re.match(r'[^#]*', string2).group()
# new_string_2 == 'hithere'
You did not specify whether your string may contain several #. If it does, note that a greedy pattern such as '.*#' only considers the last #:
string3 = 'hi#there#how are you ?'
new_string_3 = re.findall('.*#', string3)[0]
# new_string_3 == 'hi#there#'
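A one-liner that needs neither an if nor a regular expression can be sketched with str.partition, which returns the whole string as its first element when the separator is absent. The helper name `strip_comment` is made up for this illustration:

```python
def strip_comment(line):
    """Return the part of line before the first '#', without trailing whitespace."""
    # partition() splits at the first '#' only; with no '#' the head is the whole line.
    return line.partition("#")[0].rstrip()

print(repr(strip_comment("x = 1  # set x")))
print(repr(strip_comment("test")))
print(repr(strip_comment("################")))
```

This treats every # as the start of a comment, so it is not suitable for lines where # can appear inside a quoted string.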

Stripping Hex code from a plain text file in Python [duplicate]

I have a string. How do I remove all text after a certain character? (In this case ...)
The text after the ... will change, which is why I want to remove all characters after a certain one.
Split on your separator at most once, and take the first piece:
sep = '...'
stripped = text.split(sep, 1)[0]
You didn't say what should happen if the separator isn't present. Both this and Alex's solution will return the entire string in that case.
Assuming your separator is '...', but it can be any string.
text = 'some string... this part will be removed.'
head, sep, tail = text.partition('...')
>>> print(head)
some string
If the separator is not found, head will contain all of the original string.
The partition function was added in Python 2.5.
S.partition(sep) -> (head, sep, tail)
Searches for the separator sep in S, and returns the part before it,
the separator itself, and the part after it. If the separator is not
found, returns S and two empty strings.
If you want to remove everything after the last occurrence of the separator in a string, I find this works well:
<separator>.join(string_to_split.split(<separator>)[:-1])
For example, if string_to_split is a path like root/location/child/too_far.exe and you only want the folder path, you can use "/".join(string_to_split.split("/")[:-1]) and you'll get
root/location/child
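One caveat of the join/split form is that it returns an empty string when the separator does not occur at all. A minimal sketch of an alternative based on str.rpartition, which keeps the original string in that case; the helper name `cut_after_last` is made up for this illustration:

```python
def cut_after_last(text, sep):
    """Drop the last occurrence of sep and everything after it."""
    # rpartition() returns ('', '', text) when sep is absent,
    # so the middle element tells us whether sep was found.
    head, found, _tail = text.rpartition(sep)
    return head if found else text

print(cut_after_last("root/location/child/too_far.exe", "/"))
print(cut_after_last("no_separator_here", "/"))
```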
Without a regular expression (which I assume is what you want):
def remafterellipsis(text):
    where_ellipsis = text.find('...')
    if where_ellipsis == -1:
        return text
    # the + 3 keeps the '...' itself and drops only what follows it
    return text[:where_ellipsis + 3]
or, with a regular expression:
import re
def remwithre(text, there=re.compile(re.escape('...')+'.*')):
    # this version removes the '...' as well
    return there.sub('', text)
import re
test = "This is a test...we should not be able to see this"
res = re.sub(r'\.\.\..*',"",test)
print(res)
Output: "This is a test"
The method index returns the position of a character in a string. Then, if you want to remove everything from that character on, do this:
mystring = "123⋯567"
mystring[0:mystring.index("⋯")]
>> '123'
If you want to keep the character, add 1 to the position.
From a file:
import re
sep = '...'
with open("requirements.txt") as file_in:
    for line in file_in:
        res = line.split(sep, 1)[0]
        print(res)
This works for me in Python 3.7.
In my case I need to remove everything after the dot in my string variable fees:
fees = "45.05"  # must be a string; use str(fees) if it is a number
split_string = fees.split(".", 1)
substring = split_string[0]
print(substring)
Yet another way to remove all characters after the last occurrence of a character in a string (assume that you want to remove all characters after the final '/'). The extra check on path stops the loop, leaving an empty string, if there is no '/' at all.
path = 'I/only/want/the/containing/directory/not/the/file.txt'
while path and path[-1] != '/':
    path = path[:-1]
Another easy way, using re:
import re
text = 'some string... this part will be removed.'
text = re.search(r'(\A.*)\.\.\..+', text, re.DOTALL|re.IGNORECASE).group(1)
# text == 'some string'

Python how to remove equal sign "=" in strings?

How do I remove = from a string in Python?
a = 'bbb=ccc'
a.rstrip('=')
# returns 'bbb=ccc'
a.rstrip('\=')
# also returns 'bbb=ccc'
How do I match the = ?
You can replace it with an empty string:
a.replace("=", "")
For reference:
https://docs.python.org/3/library/stdtypes.html#str.replace
You can use the replace method (easiest):
a = 'bbb=ccc'
a.replace('=', '')
or the translate method (probably faster on large amounts of data; in Python 3 the table is built with str.maketrans):
a = 'bbb=ccc'
a.translate(str.maketrans('', '', '='))
or the re.sub method (most powerful, i.e. can do much more):
import re
re.sub('=', '', 'aaa=bbb')
strip removes characters from the beginning and from the end of the string!
From the documentation:
str.strip([chars])
Return a copy of the string with leading and trailing characters removed.
Since your "=" is neither at the beginning nor at the end of your string, you can't use strip for your purpose. You need to use replace.
a.replace("=", "")
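The difference between the methods discussed above can be seen side by side. A minimal sketch with made-up sample strings:

```python
a = "=bbb=ccc="

# strip() only removes the characters at the two ends of the string.
stripped = a.strip("=")

# replace() removes every occurrence, wherever it is.
replaced = a.replace("=", "")

# translate() can delete several different characters in one pass;
# the third argument of str.maketrans lists the characters to delete.
table = str.maketrans("", "", "=;")
translated = "b=b;c".translate(table)

print(stripped, replaced, translated)
```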

String function to strip the last comma

Input
str = 'test1,test2,test3,'
Output
str = 'test1,test2,test3'
The requirement is to strip the last occurrence of ','.
Just use rstrip().
result = your_string.rstrip(',')
str = 'test1,test2,test3,'
str[:-1] # 'test1,test2,test3'
The question is very old, but here is an attempt at a better answer.
str = 'test1,test2,test3,'
This checks the last character: if it is a comma, it is removed; otherwise the original string is returned.
result = str[:-1] if str.endswith(',') else str
Although it is a bit of overkill for something like this, the following statement also works.
str = 'test1,test2,test3,'
result = ','.join([s for s in str.split(',') if s]) # 'test1,test2,test3'
If the comma to remove is also the last character of the string, you can use the method removesuffix() (available since Python 3.9).
Here is an example:
>>> 'hello,'.removesuffix(',')
'hello'
Actually, we have to consider the worst case as well.
The worst case is:
str = 'test1,test2,test3, ,,,, '
For such input, use the following code:
result = ','.join([s.strip() for s in str.split(',') if s.strip()!=''])
It also removes leading commas. For example,
str = ' , ,test1,test2,test3, ,,,, '
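The answers above do not all behave the same when the string ends in more than one comma. A minimal sketch of the difference, using a made-up helper name `drop_one_trailing_comma` and a sample string `s`:

```python
def drop_one_trailing_comma(text):
    """Remove at most one trailing comma; safe for the empty string."""
    return text[:-1] if text.endswith(",") else text

s = "test1,test2,test3,,"

# rstrip(',') removes every trailing comma.
print(s.rstrip(","))

# The helper removes only the last one.
print(drop_one_trailing_comma(s))
```

Using endswith rather than indexing with [-1] avoids an IndexError on an empty string.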
