How to remove certain characters from lists (Python 2.7)?

I've got a list where each element is:
['a ',' b ',' c ',' d\n ']
I want to manipulate it so that each element just becomes:
['a','b','c','d']
I don't think the spaces matter, but for some reason I can't seem to remove the \n from the end of the 4th element. I've tried converting to string and removing it using:
str.split('\n')
No error is returned, but it doesn't do anything to the list, it still has the \n at the end.
I've also tried:
d.replace('\n','')
But this just returns an error.
This is clearly a simple problem but I'm a complete beginner to Python so any help would be appreciated, thank you.
Edit:
It seems I have a list of arrays (I think) so am I right in thinking that list[0], list[1] etc are their own arrays? Does that mean I can use a for loop for i in list to strip \n from each one?

>>> my_array = ['a ',' b ',' c ',' d\n ']
>>> my_array = [c.strip() for c in my_array]
>>> my_array
['a', 'b', 'c', 'd']
If you have a list of arrays then you can do something along the lines of:
>>> list_of_arrays = [['a', 'b', 'c', 'd'], ['a ', ' b ', ' c ', ' d\n ']]
>>> new_list = [[c.strip() for c in array] for array in list_of_arrays]
>>> new_list
[['a', 'b', 'c', 'd'], ['a', 'b', 'c', 'd']]
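As for why the attempts in the question appeared to do nothing: Python strings are immutable, so strip, replace and split return a new value instead of changing the original. A minimal sketch of that behaviour follows; the variable names are only illustrative.

```python
items = ['a ', ' b ', ' c ', ' d\n ']

# String methods return a new string; the original is left untouched.
last = items[3]
last.strip()              # the result is discarded, so nothing changes
assert last == ' d\n '

# Rebuild the list from the stripped values instead.
cleaned = [s.strip() for s in items]
print(cleaned)            # ['a', 'b', 'c', 'd']

# If only the newline should go and the spaces should stay,
# replace works too, applied to each element.
no_newlines = [s.replace('\n', '') for s in items]
print(no_newlines)        # ['a ', ' b ', ' c ', ' d ']
```

Calling replace on the list itself (rather than on one of its strings) raises an AttributeError, which may be the error the question ran into.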

Try this -
arr = ['a ',' b ',' c ',' d\n ']
arr = [s.strip() for s in arr]

A very simple answer is to join your list, strip the newline character, and split to get a new list (this assumes no element contains whitespace in the middle):
Newlist = ''.join(myList).strip().split()
Your Newlist is now:
['a', 'b', 'c', 'd']
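One caveat with the join-and-split approach, sketched with made-up data: split() breaks on every run of whitespace, so elements that contain spaces themselves get torn apart, while stripping each element keeps them whole.

```python
my_list = ['New York ', ' Los Angeles\n ']   # hypothetical data with inner spaces

# join + split cuts at every whitespace run, including inside elements
joined = ''.join(my_list).strip().split()
print(joined)      # ['New', 'York', 'Los', 'Angeles']

# stripping each element separately keeps the elements intact
stripped = [s.strip() for s in my_list]
print(stripped)    # ['New York', 'Los Angeles']
```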

Related

String concatenation from a list of strings, using a particle in front and one at the end for each element

I have an array of strings:
data = ['a', 'b', 'c', 'd']
I want to obtain:
s.a, s.b, s.c, s.d
I tried:
"s., ".join(data)
Doesn't work because I need s. in front and , at the end

You should perform a mapping of the individual elements from 'x' to 's.x', we can do that by:
map('s.{}'.format, data)
then we can join these together by a comma:
', '.join(map('s.{}'.format, data))
this yields:
>>> ', '.join(map('s.{}'.format, data))
's.a, s.b, s.c, s.d'
Or, as @JonClements says, since Python 3.6 we can use literal string interpolation (f-strings) [PEP-498]:
', '.join(f's.{d}' for d in data)

You were very close. Use a list comprehension for the operation you want to perform on every string, and then join the list of strings together:
data = ['a', 'b', 'c', 'd']
', '.join(['s.'+x for x in data])
# 's.a, s.b, s.c, s.d'
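If the same prefix-then-join step is needed in several places, it could be wrapped in a small helper. This is only a sketch; prefix_join and its parameter names are made up for illustration.

```python
def prefix_join(items, prefix='s.', sep=', '):
    """Put prefix in front of every item, then join the results with sep."""
    return sep.join(prefix + str(item) for item in items)

result = prefix_join(['a', 'b', 'c', 'd'])
print(result)   # s.a, s.b, s.c, s.d
```

The separator only ever appears between elements, which is why the prefix has to be added to each element before joining.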

how to turn a string of letters embedded in squared brackets into embedded lists

I'm trying to find a simple way to convert a string like this:
a = '[[a b] [c d]]'
into the corresponding nested list structure, where the letters are turned into strings:
a = [['a', 'b'], ['c', 'd']]
I tried to use
import ast
l = ast.literal_eval('[[a b] [c d]]')
l = [i.strip() for i in l]
as found here
but it doesn't work because the characters a,b,c,d are not within quotes.
in particular I'm looking for something that turns:
'[[X v] -s]'
into:
[['X', 'v'], '-s']

You can use a regex to find all items between brackets and then split each result:
>>> import re
>>> [i.split() for i in re.findall(r'\[([^\[\]]+)\]',a)]
[['a', 'b'], ['c', 'd']]
The regex r'\[([^\[\]]+)\]' matches anything between square brackets other than square brackets themselves, which in this case is 'a b' and 'c d'; the list comprehension then splits each match on whitespace.
Note that this regex only works for cases like this one, where every item sits inside an innermost pair of brackets. For other cases you would have to write a different regex, and the regex trick won't work in all cases.
>>> a = '[[a b] [c d] [e g]]'
>>> [i.split() for i in re.findall(r'\[([^\[\]]+)\]',a)]
[['a', 'b'], ['c', 'd'], ['e', 'g']]

Use the isalpha method of strings to wrap each letter in double quotes:
a = '[[a b] [c d]]'
a = ''.join(map(lambda x: '"{}"'.format(x) if x.isalpha() else x, a))
Now a is:
'[["a" "b"] ["c" "d"]]'
And you can use json.loads (as @a_guest suggested):
json.loads(a.replace(' ', ','))

>>> import json
>>> a = '[[a b] [c d]]'
>>> a = ''.join(map(lambda x: '"{}"'.format(x) if x.isalpha() else x, a))
>>> a
'[["a" "b"] ["c" "d"]]'
>>> json.loads(a.replace(' ', ','))
[[u'a', u'b'], [u'c', u'd']]
This will work with any degree of nested lists following the above pattern, e.g.
>>> a = '[[[a b] [c d]] [[e f] [g h]]]'
>>> ...
>>> json.loads(a.replace(' ', ','))
[[[u'a', u'b'], [u'c', u'd']], [[u'e', u'f'], [u'g', u'h']]]
For the specific example of '[[X v] -s]':
>>> import json
>>> a = '[[X v] -s]'
>>> a = ''.join(map(lambda x: '"{}"'.format(x) if x.isalpha() or x=='-' else x, a))
>>> json.loads(a.replace('[ [', '[[').replace('] ]', ']]').replace(' ', ',').replace('][', '],[').replace('""',''))
[[u'X', u'v'], u'-s']
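The chain of replace calls above is tuned to one input. A more general alternative, which none of the answers here proposes, is to tokenize the string and build the lists with a stack. The sketch below assumes items are separated by whitespace and contain no brackets; parse_brackets is a made-up name.

```python
import re

def parse_brackets(text):
    """Turn a string like '[[X v] -s]' into nested lists of strings."""
    # A token is a single bracket or a run of non-space, non-bracket characters.
    tokens = re.findall(r'\[|\]|[^\s\[\]]+', text)
    stack = [[]]                       # stack[0] collects the top-level items
    for tok in tokens:
        if tok == '[':
            stack.append([])           # open a new nested list
        elif tok == ']':
            if len(stack) == 1:
                raise ValueError('unbalanced closing bracket')
            finished = stack.pop()     # close the current list
            stack[-1].append(finished)
        else:
            stack[-1].append(tok)      # plain item: keep it as a string
    if len(stack) != 1:
        raise ValueError('unbalanced opening bracket')
    # For input wrapped in one outer pair of brackets, return that list itself.
    return stack[0][0] if len(stack[0]) == 1 else stack[0]

print(parse_brackets('[[a b] [c d]]'))   # [['a', 'b'], ['c', 'd']]
print(parse_brackets('[[X v] -s]'))      # [['X', 'v'], '-s']
```

Because items are taken as whole tokens, multi-character names and symbols such as -s need no special casing.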

How to split a string into characters in python

I have a string 'ABCDEFG'
I want to be able to list each character sequentially followed by the next one.
Example
A B
B C
C D
D E
E F
F G
G
Can you tell me an efficient way of doing this? Thanks

In Python, a string is already seen as an enumerable list of characters, so you don't need to split it; it's already "split". You just need to build your list of substrings.
It's not clear what form you want the result in. If you just want substrings, this works:
s = 'ABCDEFG'
[s[i:i+2] for i in range(len(s))]
#=> ['AB', 'BC', 'CD', 'DE', 'EF', 'FG', 'G']
If you want the pairs to themselves be lists instead of strings, just call list on each one:
[list(s[i:i+2]) for i in range(len(s))]
#=> [['A', 'B'], ['B', 'C'], ['C', 'D'], ['D', 'E'], ['E', 'F'], ['F', 'G'], ['G']]
And if you want strings after all, but with something like a space between the letters, join them back together after the list call:
[' '.join(list(s[i:i+2])) for i in range(len(s))]
#=> ['A B', 'B C', 'C D', 'D E', 'E F', 'F G', 'G']

You need to keep the last character, so use izip_longest from itertools
>>> import itertools
>>> s = 'ABCDEFG'
>>> for c, cnext in itertools.izip_longest(s, s[1:], fillvalue=''):
...     print c, cnext
...
A B
B C
C D
D E
E F
F G
G
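izip_longest exists only in Python 2; in Python 3 the same function is named zip_longest. A sketch that should run on either version, collecting the lines in a list before printing them:

```python
try:
    from itertools import zip_longest                    # Python 3
except ImportError:
    from itertools import izip_longest as zip_longest    # Python 2

s = 'ABCDEFG'

# Pair every character with its successor; the last one is paired with ''.
# strip() removes the trailing space left behind by that empty partner.
pairs = [(c + ' ' + nxt).strip() for c, nxt in zip_longest(s, s[1:], fillvalue='')]

for line in pairs:
    print(line)
```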

def doit(input):
    for i in xrange(len(input)):
        print input[i] + (input[i + 1] if i != len(input) - 1 else '')
doit("ABCDEFG")
Which yields:
>>> doit("ABCDEFG")
AB
BC
CD
DE
EF
FG
G

There's an itertools pairwise recipe for this kind of use case (note that it stops at the last full pair, so the trailing G is not printed on its own):
import itertools
def pairwise(myStr):
    a, b = itertools.tee(myStr)
    next(b, None)
    for s1, s2 in zip(a, b):
        print(s1, s2)
Output:
In [121]: pairwise('ABCDEFG')
A B
B C
C D
D E
E F
F G
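The recipe is arguably easier to reuse when it returns the pairs instead of printing them. The following is a sketch of that variant, not the answer's exact code:

```python
import itertools

def pairwise(iterable):
    """Return overlapping pairs (s0, s1), (s1, s2), ... from any iterable."""
    a, b = itertools.tee(iterable)   # two independent iterators over the same data
    next(b, None)                    # advance the second one by a single step
    return zip(a, b)                 # zip stops when b runs out: no lone last item

pairs = list(pairwise('ABCDEFG'))
print(len(pairs))                    # 6
```

Since zip stops at the shorter input, the final G never gets a line of its own; if that line is needed, the zip_longest approach with a fill value covers it.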

Your problem is that you have a list of strings, not a string:
with open('ref.txt') as f:
    f1 = f.read().splitlines()
f.read() returns a string. You call splitlines() on it, getting a list of strings (one per line). If your input is actually 'ABCDEFG', this will of course be a list of one string, ['ABCDEFG'].
l = list(f1)
Since f1 is already a list, this just makes l a duplicate copy of that list.
print l, f1, len(l)
And this just prints the list of lines, and the copy of the list of lines, and the number of lines.
So, first, what happens if you drop the splitlines()? Then f1 will be the string 'ABCDEFG', instead of a list with that one string. That's a good start. And you can drop the l part entirely, because f1 is already an iterable of its characters; list(f1) will just be a different iterable of the same characters.
So, now you want to print each letter with the next letter. One way to do that is by zipping 'ABCDEFG' and 'BCDEFG '. But how do you get that 'BCDEFG '? Simple; it's just f1[1:] + ' '.
So:
with open('ref.txt') as f:
    f1 = f.read()
for left, right in zip(f1, f1[1:] + ' '):
    print left, right
Of course for something this simple, there are many other ways to do the same thing. You can iterate over range(len(f1)) and get 2-element slices, or you can use itertools.izip_longest (named zip_longest in Python 3), or you can write a general-purpose "overlapping adjacent groups of size N from any iterable" function out of itertools.tee and zip, etc.

As you want space between the characters you can use zip function and list comprehension :
>>> s="ABCDEFG"
>>> l=[' '.join(i) for i in zip(s,s[1:])]
>>> l
['A B', 'B C', 'C D', 'D E', 'E F', 'F G']
>>> for i in l:
... print i
...
A B
B C
C D
D E
E F
F G
If you don't want the space, just use a list comprehension:
>>> [s[i:i+2] for i in range(len(s))]
['AB', 'BC', 'CD', 'DE', 'EF', 'FG', 'G']

Split string based on a regular expression

I have the output of a command in tabular form. I'm parsing this output from a result file and storing it in a string. Each element in one row is separated by one or more whitespace characters, thus I'm using regular expressions to match 1 or more spaces and split it. However, a space is being inserted between every element:
>>> str1="a    b     c      d" # spaces are irregular
>>> str1
'a    b     c      d'
>>> str2=re.split("( )+", str1)
>>> str2
['a', ' ', 'b', ' ', 'c', ' ', 'd'] # 1 space element between!!!
Is there a better way to do this?
After each split str2 is appended to a list.

By using ( and ), you are creating a capturing group, and re.split keeps the captured text in its result. If you simply remove the parentheses you will not have this problem.
>>> str1 = "a    b     c      d"
>>> re.split(" +", str1)
['a', 'b', 'c', 'd']
However there is no need for regex, str.split without any delimiter specified will split this by whitespace for you. This would be the best way in this case.
>>> str1.split()
['a', 'b', 'c', 'd']
If you really want a regex you can use this (\s matches any whitespace character, which is clearer; the r prefix makes it a raw string so the backslash reaches the regex engine unchanged):
>>> re.split(r"\s+", str1)
['a', 'b', 'c', 'd']
or you can find all non-whitespace characters
>>> re.findall(r'\S+',str1)
['a', 'b', 'c', 'd']

The str.split method will automatically remove all white space between items:
>>> str1 = "a    b     c      d"
>>> str1.split()
['a', 'b', 'c', 'd']
Docs are here: http://docs.python.org/library/stdtypes.html#str.split

When you use re.split and the split pattern contains capturing groups, the groups are retained in the output. If you don't want this, use a non-capturing group instead.
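A short sketch of the difference, using a sample string with irregular spacing:

```python
import re

str1 = "a    b     c      d"

# A capturing group: the text it matched is kept in the result.
with_capture = re.split(r"( )+", str1)

# A non-capturing group (?:...) groups without keeping anything.
without_capture = re.split(r"(?: )+", str1)

print(with_capture)      # ['a', ' ', 'b', ' ', 'c', ' ', 'd']
print(without_capture)   # ['a', 'b', 'c', 'd']
```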

It's very simple actually. Try this:
str1="a    b     c      d"
splitStr1 = str1.split()
print splitStr1

How to convert comma-delimited string to list in Python?

Given a string that is a sequence of several values separated by a comma:
mStr = 'A,B,C,D,E'
How do I convert the string to a list?
mList = ['A', 'B', 'C', 'D', 'E']

You can use the str.split method.
>>> my_string = 'A,B,C,D,E'
>>> my_list = my_string.split(",")
>>> print my_list
['A', 'B', 'C', 'D', 'E']
If you want to convert it to a tuple, just
>>> print tuple(my_list)
('A', 'B', 'C', 'D', 'E')
If you are looking to append to a list, try this:
>>> my_list.append('F')
>>> print my_list
['A', 'B', 'C', 'D', 'E', 'F']

If the string contains integers and you want to avoid casting them to int individually, you can do:
mList = [int(e) if e.isdigit() else e for e in mStr.split(',')]
It is called list comprehension, and it is based on set builder notation.
Example:
>>> mStr = "1,A,B,3,4"
>>> mList = [int(e) if e.isdigit() else e for e in mStr.split(',')]
>>> mList
[1, 'A', 'B', 3, 4]
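One limitation of the isdigit test is that it rejects negative numbers and values with surrounding spaces, since '-2'.isdigit() and ' 3'.isdigit() are both False. A possible alternative is to attempt the conversion and fall back to the text; to_int_if_possible is a made-up helper name.

```python
def to_int_if_possible(text):
    """Return int(text) when text parses as an integer, else the stripped text."""
    try:
        return int(text)          # int() accepts a sign and surrounding whitespace
    except ValueError:
        return text.strip()

m_str = "1,A,-2, 3,B"
m_list = [to_int_if_possible(e) for e in m_str.split(',')]
print(m_list)   # [1, 'A', -2, 3, 'B']
```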

Consider the following in order to handle the case of an empty string:
>>> my_string = 'A,B,C,D,E'
>>> my_string.split(",") if my_string else []
['A', 'B', 'C', 'D', 'E']
>>> my_string = ""
>>> my_string.split(",") if my_string else []
[]

>>> some_string='A,B,C,D,E'
>>> new_tuple= tuple(some_string.split(','))
>>> new_tuple
('A', 'B', 'C', 'D', 'E')

You can split that string on , and directly get a list:
mStr = 'A,B,C,D,E'
list1 = mStr.split(',')
print(list1)
Output:
['A', 'B', 'C', 'D', 'E']
You can also convert it to an n-tuple:
print(tuple(list1))
Output:
('A', 'B', 'C', 'D', 'E')
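If the values can themselves contain commas inside quotes, a plain split(',') cuts them in the wrong places. The standard library's csv module handles that case; a small sketch with a made-up input line:

```python
import csv

line = 'A,"B,C",D'    # hypothetical input: the middle value contains a comma

naive = line.split(',')
parsed = next(csv.reader([line]))   # csv.reader takes an iterable of lines

print(naive)     # ['A', '"B', 'C"', 'D']
print(parsed)    # ['A', 'B,C', 'D']
```

For simple input such as 'A,B,C,D,E' both approaches give the same list, so split remains the shorter choice there.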

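You can use this function to convert a comma-delimited string of single-character values to a list. It takes every second character, so it only works when each value is exactly one character long:
def stringtolist(x):
    mylist=[]
    for i in range(0,len(x),2):
        mylist.append(x[i])
    return mylist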

# splits a string according to delimiters
'''
Let's make a function that can split a string
into a list according to the given delimiters.
example data: cat;dog:greff,snake/
example delimiters: ,;- /|:
'''
def string_to_splitted_array(data,delimeters):
    # result list
    res = []
    # we will add chars into sub_str until
    # we reach a delimiter
    sub_str = ''
    for c in data: # iterate over data char by char
        # if we reached a delimiter, we store the result
        if c in delimeters:
            # avoid empty strings
            if len(sub_str)>0:
                # looks like a valid string.
                res.append(sub_str)
                # reset sub_str to start over
                sub_str = ''
        else:
            # c is not a delimiter, so it is
            # part of the string.
            sub_str += c
    # there may not be a delimiter at the end of data.
    # if sub_str is not empty, we should add it to the list.
    if len(sub_str)>0:
        res.append(sub_str)
    # result is in res
    return res
# test the function.
delimeters = ',;- /|:'
# read the csv data from the console (Python 3 input).
csv_string = input('csv string:')
# let's check if it is working.
splitted_array = string_to_splitted_array(csv_string,delimeters)
print(splitted_array)
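The same result can probably be had more compactly with re.split and a character class built from the delimiter string. This is a sketch of that alternative rather than a rewrite of the function above; split_on_any is a made-up name.

```python
import re

def split_on_any(data, delimiters=',;- /|:'):
    """Split data on any run of the given delimiter characters, dropping empty pieces."""
    # re.escape makes characters such as '-' and '|' safe inside the [...] class.
    pattern = '[' + re.escape(delimiters) + ']+'
    return [piece for piece in re.split(pattern, data) if piece]

print(split_on_any('cat;dog:greff,snake/'))   # ['cat', 'dog', 'greff', 'snake']
```

The filter on empty pieces mirrors the "avoid empty strings" check in the loop version, which matters when the data starts or ends with a delimiter.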
