Splitting Strings within an Array

I am writing a program in Python that reads a text file and executes any Python commands in it. The commands may be out of order, but each command has a letter ID, such as {% (c) print x %}.
I've been able to sort all the commands in the document into an array, in the correct order. My question is: how do I remove the (c) so that I can run exec(statement) on the string?
Here is the full example array:
[' (a) import random ', ' (b) x = random.randint(1,6) ', ' (c) print x ', ' (d) print 2*x ']
Also, I am very new to Python; this is my first assignment with it.

You can remove the ID prefix by slicing it off:
for cmd in arr:
    exec(cmd[5:])

Take everything to the right of the first closing parenthesis and exec it:
for cmd in arr:
    exec(cmd.split(") ", 1)[1])  # maxsplit=1, so a later ") " inside the command is left alone

Stripping the command-id prefixes is a good job for a regular expression:
>>> import re
>>> commands = [' (a) import random ', ' (b) x = random.randint(1,6) ', ' (c) print x ', ' (d) print 2*x ']
>>> [re.search(r'.*?\)\s*(.*)', command).group(1) for command in commands]
['import random ', 'x = random.randint(1,6) ', 'print x ', 'print 2*x ']
The regex components mean:
.*?\) means "Get the shortest group of any characters that ends with a closing parenthesis."
\s* means "Zero or more whitespace characters."
(.*) means "Collect all the remaining characters into group(1)."
Hope this explanation makes it all clear :-)
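As a minimal Python 3 sketch of the same idea, the prefix can be stripped and the commands run one after another. The helper name strip_command_id and the shared namespace dict are illustrative assumptions, and the print commands are written as print(x) because the original list uses Python 2 syntax. Since exec runs arbitrary code, this only makes sense for input you trust.

```python
import re

# Optional leading whitespace, a parenthesised ID such as (c), and the spaces after it.
ID_PREFIX = re.compile(r"^\s*\(\w+\)\s*")

def strip_command_id(command):
    """Return the command text without its leading (id) prefix."""
    return ID_PREFIX.sub("", command).strip()

commands = [' (a) import random ', ' (b) x = random.randint(1,6) ',
            ' (c) print(x) ', ' (d) print(2*x) ']
cleaned = [strip_command_id(c) for c in commands]

namespace = {}  # shared, so names defined by one command are visible to the next
for statement in cleaned:
    exec(statement, namespace)
```

Passing one dict to every exec call is what lets the import in (a) and the assignment in (b) remain visible to (c) and (d).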

Since the pattern looks simple and consistent, you could use regex.
This also allows for both (a) and (abc123) as valid IDs.
import re
lines = [
    ' (a) import random ',
    ' (b) x = random.randint(1,6) ',
    ' (c) print x ',
    ' (d) print 2*x '
]
for line in lines:
    print(re.sub(r"^[ \t]+(\(\w+\))", "", line))
Which would output:
import random
x = random.randint(1,6)
print x
print 2*x
If you really only want to match a single letter, then replace \w+ with [a-zA-Z].

You may use a simple regex to drop the leading letter in parentheses:
import re
lst = [' (a) import random ', ' (b) x = random.randint(1,6) ', ' (c) print x ', ' (d) print 2*x ']
for ele in lst:
    print(re.sub(r"^ \([a-z]\)", "", ele))

Related

Merge 2 elements together if the elements contains

I have very messy data, and I am noticing a pattern: wherever an element ends with '\n', it needs to be merged with the single element before it.
Sample list:
ls = ['hello','world \n','my name','is john \n','How are you?','I am \n doing well']
What I tried:
print([s for s in ls if "\n" in s[-1]])
>>> ['world \n', 'is john \n'] # gives the elements that end with \n
How do I merge each element that ends with '\n' with the element before it? I am looking for output like this:
['hello world \n', 'my name is john \n', 'How are you?','I am \n doing well']

If you are reducing a list, one readable approach is to use the reduce function.
functools.reduce(func, iter, [initial_value]) cumulatively performs an operation on all the iterable's elements and, therefore, can't be applied to infinite iterables.
First of all, you need a structure to accumulate results. I use a tuple with two elements: a buffer holding the strings concatenated so far, until a "\n" is found, and the list of results. See initial struct (1).
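from functools import reduce  # reduce is not a builtin in Python 3
ls = ['hello','world \n','my name','is john \n','How are you?','I am \n doing well']
def combine(x, y):
    if y.endswith('\n'):
        return ("", x[1] + [x[0] + " " + y])  # <-- buffer to list
    else:
        return (x[0] + " " + y, x[1])  # <-- on buffer
t = reduce(combine, ls, ("", []))  # <-- see initial struct (1)
t[1] + [t[0]] if t[0] else t[1]  # <-- add buffer if not empty
Result (each string starts with the separator space, and the last two items stay together in the buffer because the final one does not end with \n):
[' hello world \n', ' my name is john \n', ' How are you? I am \n doing well']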
(1) Explained initial struct: you use a tuple to store buffer string until \n and a list of already cooked strings:
("",[])
Means:
("__ buffer string not yet added to list __", [ __result list ___ ] )

I wrote this out as a plain loop so it is simple to understand, instead of making it more complex as a list comprehension.
This works for any number of words until you hit a \n character, and it picks up the remainder of your input as well.
ls_out = []  # your outgoing ls
out = ''  # keeps your words to use
for i in range(0, len(ls)):
    if '\n' in ls[i]:  # a \n anywhere in the string ends the group: add it to output and reset
        out += ls[i]
        ls_out.append(out)
        out = ''
    else:  # otherwise add to your current word list
        out += ls[i]
if out:  # check for remaining words in out if total ls doesn't end with \n
    ls_out.append(out)
You may need to add spaces when you concatenate the strings, but I am guessing that is just an artifact of your example. If you do, make this edit:
out += ' ' + ls[i]
Edit:
If you want to grab only the one element before and not several, you could do this:
ls_out = []
for i in range(0, len(ls)):
    if ls[i].endswith('\n'):  # check ending only
        if i > 0 and not ls[i-1].endswith('\n'):  # check previous string
            out = ls[i-1] + ' ' + ls[i]  # concatenate together
        else:
            out = ls[i]  # previous one also ends with \n (or there is none), so keep this one alone
    elif i + 1 < len(ls) and ls[i+1].endswith('\n'):  # next one will grab this so skip
        continue
    else:
        out = ls[i]  # next one won't so add this one in
    ls_out.append(out)

You can solve it with regular expressions from the re module.
import re
ls = ['hello','world \n','my name','is john \n','How are you?','I am \n doing well']
new_ls = []
for i in range(len(ls)):
    concat_word = ''  # reset the concat word to ''
    if re.search(r"\n$", str(ls[i])):  # matching the \n at the end of the word
        try:
            concat_word = str(ls[i-1]) + ' ' + str(ls[i])  # appending to the previous word
        except:
            concat_word = str(ls[i])  # in case if the first word in the list has \n
        new_ls.append(concat_word)
    elif re.search(r'\n', str(ls[i])):  # matching the \n anywhere in the word
        concat_word = str(ls[i])
        new_ls.extend([str(ls[i-1]), concat_word])  # keeps the word before the "anywhere" match separate
print(new_ls)
This returns the output
['hello world \n', 'my name is john \n', 'How are you?', 'I am \n doing well']

Assuming the first element doesn't end with \n:
res = []
for el in ls:
    if el.endswith("\n"):  # "\n" is a single character, so compare the ending, not el[-2:]
        res[-1] = res[-1] + " " + el
    else:
        res.append(el)

Try this:
lst = []
for i in range(len(ls)):
    if "\n" in ls[i][-1]:
        lst.append(ls[i-1] + ' ' + ls[i])
        lst.remove(ls[i-1])
    else:
        lst.append(ls[i])
lst
Result:
['hello world \n', 'my name is john \n', 'How are you?', 'I am \n doing well']
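The answers above share one rule: an element that ends with '\n' is joined onto the single element before it. A compact sketch of that rule, assuming a single space as the separator and a hypothetical helper name merge_newline_endings, could look like this:

```python
def merge_newline_endings(items):
    """Join each item that ends with a newline onto the item before it."""
    merged = []
    for item in items:
        # Merge only if there is a previous item and it has not already been closed by a newline.
        if item.endswith("\n") and merged and not merged[-1].endswith("\n"):
            merged[-1] = merged[-1] + " " + item
        else:
            merged.append(item)
    return merged

ls = ['hello', 'world \n', 'my name', 'is john \n', 'How are you?', 'I am \n doing well']
print(merge_newline_endings(ls))
```

Working on the output list rather than on indexes into the input avoids the edge cases at the first and last positions.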

how to remove spaces in a list that has a specific character

How do I fix the spaces in the list m?
m = ['m, a \n', 'l, n \n', 'c, l\n']
for i in m:
    if ' ' in i:
        i.strip(' ')
I got:
'm, a \n'
'l, n \n'
'c, l\n'
and I want it to return:
['m, a\n', 'l, n\n', 'c, l\n']

strip(' ') only removes the given character from the two ends of the string, and it stops at the first character that is not in that set. In your case, strip starts at the end of your string, encounters a '\n' character, and stops.
It seems a little unclear what you are trying to do, but I will assume that you are looking to clear out any white space between the last non-whitespace character of your string and the newline. Correct me if I'm wrong.
There are many ways to do this, and this may not be the best, but here is what I came up with:
m = ['This, is a string. \n', 'another string! \n', 'final example\n ']
m = list(map(lambda x: x.rstrip() + '\n' if x[-1] == '\n' else x.rstrip(' '), m))
print(m)
['This, is a string.\n', 'another string!\n', 'final example\n']
Here I use the built-in map function to iterate over each list element, remove all whitespace from the end of the string (rstrip() rather than strip(), which trims both the start and the end), and add the newline back if there was one at the end of the original string.

Your code wouldn't be useful in a script; you are just seeing the REPL displaying the result of the expression i.strip(' '). In a script, that value would just be ignored.
To create a list, use a list comprehension:
result = [i.strip(' ') for i in m if ' ' in i]
Note, however, strip only removes the requested character from either end; in your data, the space precedes the newline. You'll need to do something like removing the newline as well, then put it back:
result = ["%s\n" % i.strip() for i in m if ' ' in i]

You can use regex:
import re
m = ['m, a \n', 'l, n \n', 'c, l\n']
final_m = [re.sub(r'(?<=[a-zA-Z])\s+(?=\n)', '', i) for i in m]
Output:
['m, a\n', 'l, n\n', 'c, l\n']

Quick and dirty:
m = [x.replace(' \n', '\n') for x in m]
This works if you know that only one space comes before the '\n'.
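When the number of spaces before the newline is not known, a small helper can trim them without touching the rest of the string. This is a sketch only; the name strip_spaces_before_newline is an assumption, and it treats tabs the same way as spaces.

```python
def strip_spaces_before_newline(line):
    """Remove spaces and tabs that sit directly before a trailing newline."""
    if line.endswith("\n"):
        return line[:-1].rstrip(" \t") + "\n"
    return line  # no trailing newline: leave the string untouched

m = ['m, a \n', 'l, n \n', 'c, l\n']
result = [strip_spaces_before_newline(s) for s in m]
print(result)
```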

How to reduce whitespace in Python? [duplicate]

How do I reduce whitespace in Python from
test = '   Good   ' to a single space on each side, test = ' Good '?
I tried defining this function, but when I call test = reducing_white(test) it doesn't work at all. Does it have to do with the function's return, or something else?
counter = []
def reducing_white(txt):
    counter = txt.count(' ')
    while counter > 2:
        txt = txt.replace(' ','',1)
        counter = txt.count(' ')
    return txt

Here is how I solved it:
def reduce_ws(txt):
    ntxt = txt.strip()
    return ' ' + ntxt + ' '
j = ' Hello World '
print(reduce_ws(j))
OUTPUT:
' Hello World '

You can use a regular expression:
>>> import re
>>> re.sub(r'\s+', ' ', test)
' Good '
>>> test = ' Good Sh ow '
>>> re.sub(r'\s+', ' ', test)
' Good Sh ow '
r'\s+' matches each run of one or more whitespace characters and replaces the entire run with ' ', i.e. a single space.
This works on any combination of spaces, tabs, and newlines.
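For comparison, here is a short sketch that puts the regex next to the split/join idiom. The function name collapse_whitespace and the sample string are illustrative assumptions.

```python
import re

def collapse_whitespace(text):
    """Replace every run of whitespace with a single space."""
    return re.sub(r"\s+", " ", text)

test = "   Good    Sh ow  "
collapsed = collapse_whitespace(test)  # ' Good Sh ow '
trimmed = " ".join(test.split())       # 'Good Sh ow'
print(repr(collapsed), repr(trimmed))
```

The regex keeps one space at each end when the input had leading or trailing whitespace, while split/join drops the outer whitespace entirely.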

Removing whitespace from the end of string while keeping space in the middle of each letter

The goal of this code is to take a bunch of letters and print the first letter and every third letter after that for the user. What's the easiest way to remove the whitespace at the end of the output here while keeping all the spaces in the middle?
msg = input('Message? ')
for i in range(0, len(msg), 3):
    print(msg[i], end = ' ')

str_object.rstrip() will return a copy of str_object without trailing whitespace. Just do
msg = input('Message? ').rstrip()
For what it's worth, you can replace your loop by string slicing:
print(*msg[::3], sep=' ')

>>> n = ' hello '
>>> n.rstrip()
' hello'
>>> n.lstrip()
'hello '
>>> n.strip()
'hello'

What about?
msg = input('Message? ')
output = ' '.join(msg[::3]).rstrip()
print(output)

You can use at least 2 methods:
1) Slicing method:
print(" ".join(msg[0::3]))
2) List comprehension (more readable/powerful):
print(" ".join([letter for i, letter in enumerate(msg) if i % 3 == 0]))
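A small sketch of the slicing approach wrapped in a function (the name every_third is an assumption). Because str.join only puts the separator between items, there is no trailing space to remove afterwards.

```python
def every_third(msg):
    """Return the first character and every third one after it, separated by spaces."""
    return " ".join(msg[::3])

print(every_third("abcdefghij"))  # a d g j
```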

How do I trim whitespace from a string?

How do I remove leading and trailing whitespace from a string in Python?
" Hello world " --> "Hello world"
" Hello world" --> "Hello world"
"Hello world " --> "Hello world"
"Hello world" --> "Hello world"

To remove all whitespace surrounding a string, use .strip(). Examples:
>>> ' Hello '.strip()
'Hello'
>>> ' Hello'.strip()
'Hello'
>>> 'Bob has a cat'.strip()
'Bob has a cat'
>>> ' Hello '.strip() # ALL consecutive spaces at both ends removed
'Hello'
Note that str.strip() removes all whitespace characters, including tabs and newlines. To remove only spaces, specify the specific character to remove as an argument to strip:
>>> " Hello\n ".strip(" ")
'Hello\n'
To remove only one space at most:
def strip_one_space(s):
    if s.endswith(" "): s = s[:-1]
    if s.startswith(" "): s = s[1:]
    return s
>>> strip_one_space("  Hello ")
' Hello'

As pointed out in the answers above,
my_string.strip()
removes all leading and trailing whitespace characters, such as \n, \r, \t, \f, and space.
For more flexibility, use the following:
Removes only leading whitespace chars: my_string.lstrip()
Removes only trailing whitespace chars: my_string.rstrip()
Removes specific whitespace chars: my_string.strip('\n') or my_string.lstrip('\n\r') or my_string.rstrip('\n\t') and so on.
More details are available in the docs.
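The variants listed above can be compared side by side on one sample string. This is an illustrative sketch; the sample value is an assumption.

```python
s = "\t  Hello world \n"

both = s.strip()              # no argument: all whitespace is trimmed from both ends
left = s.lstrip()             # leading whitespace only
right = s.rstrip()            # trailing whitespace only
spaces_only = s.strip(" ")    # unchanged: the outermost characters are \t and \n, not spaces
no_newline = s.rstrip("\n")   # only the trailing newline is removed

print(repr(both), repr(left), repr(right), repr(spaces_only), repr(no_newline))
```

The argument to strip is a set of characters, not a prefix or suffix, and trimming stops at the first character that is not in that set.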

strip is not limited to whitespace characters either:
# remove all leading/trailing commas, periods and hyphens
title = title.strip(',.-')

This will remove all leading and trailing whitespace in myString:
myString.strip()

You want strip():
myphrases = [" Hello ", " Hello", "Hello ", "Bob has a cat"]
for phrase in myphrases:
    print(phrase.strip())

This can also be done with a regular expression:
import re
text = " Hello "  # named text rather than input, to avoid shadowing the built-in
output = re.sub(r'^\s+|\s+$', '', text)
# output = 'Hello'

Seeing this thread as a beginner made my head spin, so I came up with a simple shortcut.
Though str.strip() removes leading and trailing spaces, it does nothing for spaces between characters.
words = input("Enter the word to test")
# If a user enters text with stray spaces inside it, that becomes a problem
# input = " he llo, ho w are y ou "
n = words.strip()
print(n)
# output "he llo, ho w are y ou" - only leading & trailing spaces are removed
Instead, use str.replace(), which is more direct and less error-prone.
The following code generalizes the use of str.replace():
def whitespace(words):
    r = words.replace(' ', '')  # removes all spaces
    n = r.replace(',', '|')  # other uses of replace
    return n
def run():
    words = input("Enter the word to test")  # take user input
    m = whitespace(words)  # wrap the call in run() so it can be reused by other functions
    o = m.count('f')  # for testing
    return m, o
print(run())
Output: ('hello|howareyou', 0)
This can be helpful when reusing the same helper in different functions.

Stray whitespace also causes plenty of indentation errors when you run finished code in Python. If Python keeps reporting an indentation error on line 1, 2, 3, 4, 5, and so on, fix the indentation of each reported line in turn.
If you still get errors related to typing mistakes, operators, and so on, make sure you read why Python is complaining:
The first thing to check is that you have your indentation right. If you do, then check to see if you have mixed tabs with spaces in your code. Remember: the code will look fine (to you), but the interpreter refuses to run it. If you suspect this, a quick fix is to bring your code into an IDLE edit window, then choose Edit > Select All from the menu system, before choosing Format > Untabify Region. If you've mixed tabs with spaces, this will convert all your tabs to spaces in one go (and fix any indentation issues).

I could not find a solution to what I was looking for, so I created some custom functions. You can try them out.
def cleansed(s: str):
    """:param s: String to be cleansed"""
    assert s not in (None, "")
    # return trimmed(s.replace('"', '').replace("'", ""))
    return trimmed(s)
def trimmed(s: str):
    """:param s: String to be trimmed, with inner runs of spaces collapsed"""
    assert s not in (None, "")
    ss = trim_start_and_end(s).replace('  ', ' ')
    while '  ' in ss:  # repeat until no double space is left
        ss = ss.replace('  ', ' ')
    return ss
def trim_start_and_end(s: str):
    """:param s: String to be trimmed at both ends"""
    assert s not in (None, "")
    return trim_start(trim_end(s))
def trim_start(s: str):
    """:param s: String whose leading spaces are removed"""
    assert s not in (None, "")
    chars = []
    for c in s:
        if c != ' ' or len(chars) > 0:  # skip spaces until the first non-space character
            chars.append(c)
    return "".join(chars).lower()  # note: the result is also lowercased
def trim_end(s: str):
    """:param s: String whose trailing spaces are removed"""
    assert s not in (None, "")
    chars = []
    for c in reversed(s):
        if c != ' ' or len(chars) > 0:  # skip spaces until the last non-space character
            chars.append(c)
    return "".join(reversed(chars)).lower()  # note: the result is also lowercased
s1 = ' b Beer '
s2 = 'Beer b '
s3 = ' Beer b '
s4 = ' bread butter Beer b '
cdd = trim_start(s1)
cddd = trim_end(s2)
clean1 = cleansed(s3)
clean2 = cleansed(s4)
print("\nStr: {0} Len: {1} Cleansed: {2} Len: {3}".format(s1, len(s1), cdd, len(cdd)))
print("\nStr: {0} Len: {1} Cleansed: {2} Len: {3}".format(s2, len(s2), cddd, len(cddd)))
print("\nStr: {0} Len: {1} Cleansed: {2} Len: {3}".format(s3, len(s3), clean1, len(clean1)))
print("\nStr: {0} Len: {1} Cleansed: {2} Len: {3}".format(s4, len(s4), clean2, len(clean2)))

If you want to trim at most a specified number of spaces from the left and right, you could do this:
def remove_outer_spaces(text, num_of_leading, num_of_trailing):
    text = list(text)
    for i in range(min(num_of_leading, len(text))):  # never index past the end of the text
        if text[i] == " ":
            text[i] = ""
        else:
            break
    for i in range(1, min(num_of_trailing, len(text)) + 1):
        if text[-i] == " ":
            text[-i] = ""
        else:
            break
    return ''.join(text)
txt1 = "   MY name is     "
print(remove_outer_spaces(txt1, 1, 1))  # result is: "  MY name is    "
print(remove_outer_spaces(txt1, 2, 3))  # result is: " MY name is  "
print(remove_outer_spaces(txt1, 6, 8))  # result is: "MY name is"

The solution below removes leading and trailing whitespace and also collapses runs of whitespace inside the string, which is useful if you need clean string values without repeated whitespace.
>>> str_1 = ' Hello World'
>>> print(' '.join(str_1.split()))
Hello World
>>> str_2 = ' Hello World'
>>> print(' '.join(str_2.split()))
Hello World
>>> str_3 = 'Hello World '
>>> print(' '.join(str_3.split()))
Hello World
>>> str_4 = 'Hello World '
>>> print(' '.join(str_4.split()))
Hello World
>>> str_5 = ' Hello World '
>>> print(' '.join(str_5.split()))
Hello World
>>> str_6 = ' Hello World '
>>> print(' '.join(str_6.split()))
Hello World
>>> str_7 = 'Hello World'
>>> print(' '.join(str_7.split()))
Hello World
As you can see, this collapses all repeated whitespace in the string (the output is Hello World in every case), wherever it occurs. But if you only need to remove leading and trailing whitespace, then strip() would be fine.

One way is to use the .strip() method (removing all surrounding whitespace):
s = " Hello World "
s = s.strip()
# result: s = "Hello World"
Note that .strip() returns a copy of the string and doesn't change the underlying object (since strings are immutable).
Should you wish to remove all spaces (not only trim the edges):
s = ' abcd efgh ijk '
s = s.replace(' ', '')
# result: s = 'abcdefghijk'

I wanted to remove the extra spaces in a string (also in the middle of the string, not only at the beginning or end). I made this because I didn't know how to do it otherwise:
string = "Name : David Account: 1234 Another thing: something "
ready = False
while not ready:
    pos = string.find("  ")  # look for two consecutive spaces
    if pos != -1:
        string = string.replace("  ", " ")
    else:
        ready = True
print(string)
This replaces double spaces with single spaces until there are no double spaces left.
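The same loop can be packaged as a function. This is a sketch under the assumption that only space characters should be collapsed, so tabs and newlines are left alone; the name collapse_spaces and the sample record are illustrative.

```python
def collapse_spaces(text):
    """Replace runs of two or more spaces with a single space."""
    while "  " in text:  # keep going until no double space is left
        text = text.replace("  ", " ")
    return text

record = "Name  :  David    Account:  1234"
print(collapse_spaces(record))  # Name : David Account: 1234
```

Each pass halves the longest run of spaces, so the loop finishes after a few iterations even for long runs.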
