Python - Removing \n when using default split()? - python

I'm working on strings where I'm taking input from the command line. For example, with this input:
format driveName "datahere"
when I call string.split(), it comes out as:
>>> input.split()
['format', 'driveName', '"datahere"']
which is what I want.
However, when I specify it to be string.split(" ", 2), I get:
>>> input.split(' ', 2)
['format\n', 'driveName\n', '"datahere"']
Does anyone know why this happens and how I can resolve it? I thought it could be because I'm creating the file on Windows and running on Unix, but the same problem occurs when I use nano on Unix.
The third argument (data) could contain newlines, so I'm cautious not to use a sweeping newline remover.

By default, split() splits on any run of whitespace, which includes newlines (\n) as well as spaces.
Here is what the docs on split say:
str.split([sep[, maxsplit]])
If sep is not specified or is None, a different splitting algorithm is
applied: runs of consecutive whitespace are regarded as a single
separator, and the result will contain no empty strings at the start
or end if the string has leading or trailing whitespace.
When you pass an explicit sep, only that exact separator is used to split the string, so any newlines are left in place.
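A minimal sketch of the difference described above. The sample string is made up for illustration; it is not the asker's actual input.

```python
# Hypothetical input: fields separated by a newline and by a double space.
text = "format\ndriveName  data"

# No separator: any run of whitespace (spaces, tabs, newlines) is one delimiter.
default_parts = text.split()

# Explicit ' ': only single spaces split, so the newline stays inside the
# first field and the double space produces an empty string.
space_parts = text.split(" ")

print(default_parts)  # ['format', 'driveName', 'data']
print(space_parts)    # ['format\ndriveName', '', 'data']
```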

Use None to get the default whitespace splitting behaviour with a limit:
input.split(None, 2)
This leaves any whitespace at the end of the input string (that is, at the end of the last item) untouched.
Or you could strip the values afterwards; this removes whitespace from the start and end, not the middle, of each resulting string, just like input.split() would:
[v.strip() for v in input.split(' ', 2)]
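As a rough illustration of the None-with-limit approach, here is a sketch with an invented command line whose third field contains a newline. The variable names are placeholders, not anything from the question.

```python
# Hypothetical command line: the third field (the data) contains a newline
# that must be preserved.
line = 'format \n driveName "data\nhere" \n'

# sep=None keeps the whitespace-run splitting; maxsplit=2 stops after two
# splits, so everything that remains is returned as the last item.
command, drive, data = line.split(None, 2)

print(repr(command))  # 'format'
print(repr(drive))    # 'driveName'
print(repr(data))     # '"data\nhere" \n'

# Optional: trim only the trailing whitespace of the data field.
data_clean = data.rstrip()
print(repr(data_clean))  # '"data\nhere"'
```

The newline inside the data survives because splitting stops once the limit is reached.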

The default str.split targets a number of "whitespace characters", including also tabs and others. If you do str.split(' '), you tell it to split only on ' ' (a space). You can get the default behavior by specifying None, as in str.split(None, 2).
There may be a better way of doing this, depending on what your actual use-case is (your example does not replicate the problem...). As your example output implies newlines as separators, you should consider splitting on them explicitly.
inp = """
format
driveName
datahere
datathere
"""
inp.strip().split('\n', 2)
# ['format', 'driveName', 'datahere\ndatathere']
This allows you to have spaces (and tabs etc) in the first and second item as well.

Related

Turn string into a list and remove carriage returns (Python)

I have a string like this:
['过\r\n啤酒\r\n小心\r\n照顾\r\n锻炼\r\n过去\r\n忘记\r\n哭\r\n包\r\n个子\r\n瘦\r\n选择\r\n奶奶\r\n突然\r\n节目\r\n']
How do I remove all of the "\r\n", and then turn the string into a list like so:
[过, 啤酒, 小心, 照顾, 过去, etc...]
str.split with no arguments splits on all whitespace; this includes \r and \n:
A = ['过\r\n啤酒\r\n小心\r\n照顾\r\n锻炼\r\n过去\r\n忘记\r\n哭\r\n包\r\n个子\r\n瘦\r\n选择\r\n奶奶\r\n突然\r\n节目\r\n']
res = A[0].split()
print(res)
['过', '啤酒', '小心', '照顾', '锻炼', '过去', '忘记', '哭', '包', '个子', '瘦', '选择', '奶奶', '突然', '节目']
As described in the str.split docs:
If sep is not specified or is None, a different splitting
algorithm is applied: runs of consecutive whitespace are regarded as a
single separator, and the result will contain no empty strings at the
start or end if the string has leading or trailing whitespace.
To split only on line boundaries such as \r\n (and not on spaces or tabs), you can use .splitlines():
>>> li=['过\r\n啤酒\r\n小心\r\n照顾\r\n锻炼\r\n过去\r\n忘记\r\n哭\r\n包\r\n个子\r\n瘦\r\n选择\r\n奶奶\r\n突然\r\n节目\r\n']
>>> li[0].splitlines()
['过', '啤酒', '小心', '照顾', '锻炼', '过去', '忘记', '哭', '包', '个子', '瘦', '选择', '奶奶', '突然', '节目']
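The two answers above give the same result here only because no line contains a space. A small sketch with made-up text shows where they differ:

```python
# Hypothetical text: Windows line endings, and one line that contains a space.
text = "first line\r\nsecond\r\nthird\r\n"

by_whitespace = text.split()        # splits on spaces as well as line endings
by_lines = text.splitlines()        # splits on line boundaries only
by_crlf = text.split("\r\n")        # explicit separator leaves a trailing ''

print(by_whitespace)  # ['first', 'line', 'second', 'third']
print(by_lines)       # ['first line', 'second', 'third']
print(by_crlf)        # ['first line', 'second', 'third', '']
```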
Try this:
s = "['过\r\n啤酒\r\n小心\r\n照顾\r\n锻炼\r\n过去\r\n忘记\r\n哭\r\n包\r\n个子\r\n瘦\r\n选择\r\n奶奶\r\n突然\r\n节目\r\n']"
s = s.replace('\r\n', ',').replace("'", '')
print(s)
Output:
[过,啤酒,小心,照顾,锻炼,过去,忘记,哭,包,个子,瘦,选择,奶奶,突然,节目,]
The first replace turns each \r\n into a comma, and the second removes the single quotes from the string, which gives the output you expected.

How to remove a set of characters when a string consists of "\" and special characters in Python

a = "\Virtual Disks\DG2_ASM04\ACTIVE"
From the above string I would like to get the part "DG2_ASM04" alone. I cannot split or strip as it has the special characters "\", "\D" and "\A" in it.
I have tried the following but can't get the desired output:
a.lstrip("\Virtual Disks\\").rstrip("\ACTIVE")
The output I get is 'G2_ASM04' instead of 'DG2_ASM04'.
Simply split on the backslash (escaped as "\\") and index into the result:
>>> a.split("\\")[-2]
'DG2_ASM04'
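In your case the D is removed as well because lstrip and rstrip treat their argument as a set of characters to strip, not as a prefix or suffix. D is in that set (it appears in "Disks"), so the leading D of DG2_ASM04 is stripped too. If you tweak your string, you can see what is happening: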
>>> a = "\Virtual Disks\XG2_ASM04\ACTIVE"
>>> a.lstrip('\\Virtual Disks\\').rstrip("\\ACTIVE")
'XG2_ASM04'
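A short sketch of the behaviour, using a raw string (my choice, so the backslashes are unambiguous) rather than the original literal:

```python
# Raw string so every backslash is taken literally.
a = r"\Virtual Disks\DG2_ASM04\ACTIVE"

# lstrip()/rstrip() treat the argument as a set of characters. 'D' is in the
# set because it appears in "Disks", so the leading 'D' of DG2_ASM04 goes too.
stripped = a.lstrip("\\Virtual Disks\\").rstrip("\\ACTIVE")
print(stripped)  # G2_ASM04

# Splitting on the backslash selects the component without any character set.
parts = a.split("\\")
print(parts)      # ['', 'Virtual Disks', 'DG2_ASM04', 'ACTIVE']
print(parts[-2])  # DG2_ASM04
```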

How does raw_input().strip().split() in Python work in this code?

Hopefully, the community can explain this better to me. Below is the objective; I am trying to make sense of this code given that objective.
Objective: Initialize your list and read in the value of n followed by n lines of commands where each command will be of the types listed above. Iterate through each command in order and perform the corresponding operation on your list.
Sample input:
12
insert 0 5
insert 1 10
etc.
Sample output:
[5, 10]
etc.
The first line contains an integer, n, denoting the number of commands.
Each of the n subsequent lines contains one of the commands described above.
Code:
n = int(raw_input().strip())
List = []
for number in range(n):
    args = raw_input().strip().split(" ")
    if args[0] == "append":
        List.append(int(args[1]))
    elif args[0] == "insert":
        List.insert(int(args[1]), int(args[2]))
So this is my interpretation of the variable "args": you take the raw input from the user, then remove the whitespace from it. Once that is removed, the split function puts the string into a list.
If my raw input was "insert 0 5", wouldn't strip() turn it into "insert05"?
In Python, calling split(delimiter) on a string returns a list of the pieces between occurrences of the delimiter you specify (with no argument, it splits on any run of whitespace). The strip() method removes whitespace only at the beginning and end of a string.
So step by step the operations are:
raw_input() #' insert 0 5 '
raw_input().strip() #'insert 0 5'
raw_input().strip().split() #['insert', '0', '5']
You can use split(';'), for example, if you want to split strings delimited by semicolons, such as 'insert;0;5'.
Let's take an example. Suppose the raw input is
string=' I am a coder '
First, strip() removes the spaces at the front and end, so string.strip() gives
string='I am a coder'
Then split() splits the stripped string into a list, i.e.
string=['I', 'am', 'a', 'coder']
Nope, that would be .replace(" ", ""); .strip() just gets rid of whitespace at the beginning and end of the string.
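To check the steps without typing input, here is a small sketch in which a string literal stands in for the value returned by raw_input(). The extra spaces are added on purpose to show the edge cases.

```python
# A literal stands in for raw_input() (Python 2) or input() (Python 3).
raw = "  insert  0 5  "

stripped = raw.strip()                   # only leading/trailing whitespace goes
by_space = stripped.split(" ")           # explicit ' ' keeps an empty string
by_whitespace = stripped.split()         # default collapses runs of whitespace
no_spaces = stripped.replace(" ", "")    # this is what removes inner spaces

print(repr(stripped))   # 'insert  0 5'
print(by_space)         # ['insert', '', '0', '5']
print(by_whitespace)    # ['insert', '0', '5']
print(no_spaces)        # insert05
```

With split(" "), a doubled space in the input would shift the arguments, so split() with no argument is the more forgiving choice for this kind of command parsing.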

Python Programming: Need to add spaces between character in a split string

I have to print out every third letter of a text with spaces between and none at the end. I can do everything but the spaces between each of the letters.
Here is what I have.
line = input('Message? ')
print(line[0]+line[3::3].strip())
To join things with spaces, use join(). Consider the following:
>>> line = '0123456789'
>>> ' '.join(line[::3])
'0 3 6 9'
Since you're using Python 3, you can use * to unpack the line and send each element as an argument to print(), which uses a space as the default separator between arguments. You don't need to separate line[0] from the rest - you can include it in the slice ([0::3]), or it'll be used by default ([::3]). Also, there's no need to use strip(), as the newline character that you send with Enter is not included in the string when you use input().
print(*input('Message? ')[::3])

Parsing a string in Python without knowing what is in the string

I'm trying to parse a random string between a set of quotation marks.
The data is always of the form:
klheafoiehaiofa"randomtextnamehere.ext"fiojeaiof;jidoa;jfioda
I know what .ext is, and that what I want is randomtextnamehere.ext, and that it is always separated by " ".
Currently I can only deal with certain cases, but I want to be able to handle any case. If I could just start grabbing at the first " and stop at the second ", that would be great, since there's a possibility of there being more than one set of " per line.
Thanks!
You can use the str.split method:
Docstring:
S.split([sep [,maxsplit]]) -> list of strings
Return a list of the words in the string S, using sep as the
delimiter string. If maxsplit is given, at most maxsplit
splits are done. If sep is not specified or is None, any
whitespace string is a separator and empty strings are removed
from the result.
In [1]: s = 'klheafoiehaiofa"randomtextnamehere.ext"fiojeaiof;jidoa;jfioda'
In [2]: s.split('"', 2)[1]
Out[2]: 'randomtextnamehere.ext'
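If a line can contain more than one quoted name, the same idea extends as sketched below. The sample line is invented, and both variants assume the quotes are balanced.

```python
import re

# Hypothetical line with two quoted names.
s = 'klhea"first.ext"fioje"second.ext"jfioda'

# Splitting on '"' alternates between text outside and inside the quotes,
# so the odd indexes hold the quoted parts.
quoted_by_split = s.split('"')[1::2]

# Equivalent regular expression: capture everything between a pair of quotes.
quoted_by_regex = re.findall(r'"([^"]*)"', s)

print(quoted_by_split)  # ['first.ext', 'second.ext']
print(quoted_by_regex)  # ['first.ext', 'second.ext']
```

With an unbalanced quote, the two variants can differ: the slice would include the trailing text after the last quote, while the regular expression would skip it.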
