How to remove parentheses from elements in a list (Python)

I'm trying to remove some parentheses from numbers in my list. For example, I have the following list:
[[' 103.92246(11)\n'],
[' 104.92394(11)\n'],
[' 105.92797(21)#\n'],
[' 106.93031(43)#\n'],
[' 107.93484(32)#\n'],
[' 108.93763(54)#\n'],
[' 109.94244(54)#\n'],
[' 110.94565(54)#\n'],
[' 111.95083(75)#\n'],
[' 112.95470(86)#\n'],
[' 82.94874(54)#\n'],
[' 83.94009(43)#\n'],
[' 84.93655(30)#\n'],
[' 85.93070(47)\n'],
[' 86.92733(24)\n'],
...]
For example, the first element in my list is 103.92246(11), and I want the parenthesized part stripped from it to give 103.92246. Some elements also have a # that I want removed too; basically, all I want is the float number. How would I go about doing this?
I've tried the code below, but it doesn't seem to be working for me.
tolist = []
for num in mylist:
    a = re.sub('()', '', num)
    tolist.append(a)

You can use str.translate, passing whatever chars you want to remove:
l =[[' 103.92246(11)\n'],
[' 104.92394(11)\n'],
[' 105.92797(21)#\n'],
[' 106.93031(43)#\n'],
[' 107.93484(32)#\n'],
[' 108.93763(54)#\n'],
[' 109.94244(54)#\n'],
[' 110.94565(54)#\n'],
[' 111.95083(75)#\n'],
[' 112.95470(86)#\n'],
[' 82.94874(54)#\n'],
[' 83.94009(43)#\n'],
[' 84.93655(30)#\n'],
[' 85.93070(47)\n'],
[' 86.92733(24)\n']]
for sub in l:
    # Python 3 form; in Python 2 this was s.translate(None, "()#")
    sub[:] = [s.translate(str.maketrans("", "", "()#")) for s in sub]
Output:
[[' 103.9224611\n'], [' 104.9239411\n'], [' 105.9279721\n'],
[' 106.9303143\n'], [' 107.9348432\n'], [' 108.9376354\n'],
[' 109.9424454\n'], [' 110.9456554\n'], [' 111.9508375\n'],
[' 112.9547086\n'], [' 82.9487454\n'], [' 83.9400943\n'],
[' 84.9365530\n'], [' 85.9307047\n'], [' 86.9273324\n']]
If you want them cast to floats:
    # Python 3 form; in Python 2 this was s.translate(None, "()#")
    sub[:] = map(float, (s.translate(str.maketrans("", "", "()#")) for s in sub))
which will give you:
[[103.9224611], [104.9239411], [105.9279721], [106.9303143],
[107.9348432], [108.9376354], [109.9424454], [110.9456554],
[111.9508375], [112.9547086], [82.9487454], [83.9400943], [84.936553],
[85.9307047], [86.9273324]]
If you want to remove the numbers in the parentheses as well, split on the ( and keep what comes before it:
for sub in l:
    sub[:] = map(float, (s.rsplit("(", 1)[0] for s in sub))
print(l)
Output:
[[103.92246], [104.92394], [105.92797], [106.93031], [107.93484],
[108.93763], [109.94244], [110.94565], [111.95083], [112.9547],
[82.94874], [83.94009], [84.93655], [85.9307], [86.92733]]
Or using str.rfind:
for sub in l:
    sub[:] = map(float, (s[:s.rfind("(")] for s in sub))
Output as above.

You can do this:
result = []
for num in mylist:
    a = num[0].index('(')  # find the position of (
    result.append(num[0][:a])
A one-liner version:
[x[0][:x[0].index('(')] for x in mylist]

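A small change to your regex: escape the parentheses so they match literally, and pass the string inside each sub-list (num[0]) rather than the sub-list itself. Any trailing # and newline are left in place.
import re
tolist = []
for num in mylist:
    a = re.sub(r'\(.*\)', '', num[0])
    tolist.append(a)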
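A minimal sketch of why the pattern in the question has no effect, using one string from the question's list. Unescaped, '()' is an empty capture group rather than a pair of literal parentheses; the variable names here are only for illustration.

```python
import re

sample = ' 105.92797(21)#\n'

# Unescaped, '()' is an empty capture group: it matches the empty string
# at every position, so substituting '' leaves the text unchanged.
unchanged = re.sub('()', '', sample)

# Escaped, the parentheses are literal characters, so "(21)" is removed.
without_parens = re.sub(r'\(.*\)', '', sample)

print(repr(unchanged))       # ' 105.92797(21)#\n'
print(repr(without_parens))  # ' 105.92797#\n'
```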

import re
my_list = [[' 103.92246(11)\n'],
[' 104.92394(11)\n'],
[' 105.92797(21)#\n'],
[' 106.93031(43)#\n'],
[' 107.93484(32)#\n'],
[' 108.93763(54)#\n'],
[' 109.94244(54)#\n'],
[' 110.94565(54)#\n'],
[' 111.95083(75)#\n'],
[' 112.95470(86)#\n'],
[' 82.94874(54)#\n'],
[' 83.94009(43)#\n'],
[' 84.93655(30)#\n'],
[' 85.93070(47)\n']]
result = [re.sub(r'([0-9\.])\(.*?\n', r'\1', x[0]) for x in my_list]
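Since the goal in the question is the float itself, another option is to capture the leading number instead of deleting what follows it. This is only a sketch: the helper name leading_float is hypothetical, and it assumes every inner list holds exactly one string that starts with a plain decimal number.

```python
import re

# Optional whitespace, then digits with an optional fractional part.
NUMBER = re.compile(r'\s*([0-9]+(?:\.[0-9]+)?)')

def leading_float(text):
    """Return the number at the start of text as a float (hypothetical helper)."""
    match = NUMBER.match(text)
    if match is None:
        raise ValueError('no leading number in %r' % text)
    return float(match.group(1))

my_list = [[' 103.92246(11)\n'], [' 105.92797(21)#\n'], [' 85.93070(47)\n']]
values = [leading_float(item[0]) for item in my_list]
print(values)  # [103.92246, 105.92797, 85.9307]
```

Raising on a non-matching string is a design choice for the sketch; returning None instead would let malformed entries be filtered out later.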

Related

Extract all elements of a list that match a string

I have a keyword list and an input list of lists. My task is to find those lists that contain the keyword (even partially). I am able to extract the lists that contain the keyword using the following code:
t_list = [['Subtotal: ', '1,292.80 '], ['VAT ', ' 64.64 '], ['RECEIPT TOTAL ', 'AED1,357.44 '],
['NOT_SELECTED, upto2,000 ', 'Sub total ', '60.58 '],
['NOT_SELECTED, upto500 ', 'amount 160.58 ', '', '3.03 '],
['Learn', 'Bectricity total ', '', '', '63.61 ']]
keyword = ['total ', 'amount ']
for string_list in t_list:
    # Drop empty strings from the sub-list in place
    string_list[:] = [item for item in string_list if item != '']
    for element in string_list:
        element = element.lower()
        if any(s in element for s in keyword):
            print(string_list)
The output is:
[['Subtotal: ', '1,292.80 '], ['RECEIPT TOTAL ', 'AED1,357.44 '], ['NOT_SELECTED, upto2,000 ', 'Sub total ', '60.58 '], ['NOT_SELECTED, upto500 ', 'amount 160.58 ', '3.03 '],
['Learn', 'Bectricity total ', '63.61 ']]
Required output is to have only the string that matched with the keyword and the number in the list.
Required output:
[['Subtotal: ', '1,292.80 '], ['RECEIPT TOTAL ', 'AED1,357.44 '], ['Sub total ', '60.58 '], ['amount 160.58 ', '3.03 '],['Bectricity total ', '63.61 ']]
If I can have the output as a dictionary with the string matched to the keyword as key and the number a value, it would be perfect.
Thanks a ton in advance!
Here is the answer from our chat, slightly modified with some comments as some explanation for the code. Feel free to ask me to clarify or change anything.
import re
t_list = [
['Subtotal: ', '1,292.80 '],
['VAT ', ' 64.64 '],
['RECEIPT TOTAL ', 'AED1,357.44 '],
['NOT_SELECTED, upto2,000 ', 'Sub total ', '60.58 '],
['NOT_SELECTED, upto500 ', 'amount 160.58 ', '', '3.03 '],
['Learn', 'Bectricity total ', '', '', '63.61 ']
]
keywords = ['total ', 'amount ']
output = {}
for sub_list in t_list:
    # Becomes the string that matched the keyword if one is found
    matched = None
    for item in sub_list:
        for keyword in keywords:
            if keyword in item.lower():
                matched = item
    # If a match was found, then we start looking at the list again
    # looking for the numbers
    if matched:
        for item in sub_list:
            # split the string so for example 'amount 160.58 ' becomes ['amount', '160.58']
            # This allows us to more easily extract just the number
            split_items = item.split()
            for split_item in split_items:
                # Simple use of regex to match any '.' with digits either side
                re_search = re.search(r'[0-9][.][0-9]', split_item)
                if re_search:
                    # Try block because we are making a list. If the list exists,
                    # then just append a value, otherwise create the list with the item
                    # in it
                    try:
                        output[matched.strip()].append(split_item)
                    except KeyError:
                        output[matched.strip()] = [split_item]
print(output)
You mentioned wanting to match a string such as 'AED 63.61'. My solution is using .split() to separate strings and make it easier to grab just the number. For example, for a string like 'amount 160.58' it becomes much easier to just grab the 160.58. I'm not sure how to go about matching a string like the one you want to keep but not matching the one I just mentioned (unless, of course, it is just 'AED' in which case we could just add some more logic to match anything with 'aed').
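The same idea can be written more compactly with dict.setdefault in place of the try/except. This is a sketch under the same assumptions as the answer above (the last item that contains a keyword becomes the key, and a number is any token with a '.' between two digits); it is a variation, not the original author's code.

```python
import re

t_list = [
    ['Subtotal: ', '1,292.80 '],
    ['VAT ', ' 64.64 '],
    ['RECEIPT TOTAL ', 'AED1,357.44 '],
    ['NOT_SELECTED, upto2,000 ', 'Sub total ', '60.58 '],
    ['NOT_SELECTED, upto500 ', 'amount 160.58 ', '', '3.03 '],
    ['Learn', 'Bectricity total ', '', '', '63.61 '],
]
keywords = ['total ', 'amount ']
number = re.compile(r'[0-9][.][0-9]')

output = {}
for sub_list in t_list:
    # Items that contain any keyword (case-insensitive)
    matched = [item for item in sub_list
               if any(k in item.lower() for k in keywords)]
    if not matched:
        continue
    key = matched[-1].strip()  # the last match wins, as in the loop above
    for item in sub_list:
        for token in item.split():
            if number.search(token):
                # setdefault creates the list on first use, then appends
                output.setdefault(key, []).append(token)

print(output)
```

Note that 'Subtotal: ' is not picked up, because the keyword 'total ' ends with a space and the text has a colon after "total"; dropping the trailing space from the keyword would change that.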

Mutating two dimensional list

I'm trying to make a more advanced Tic Tac Toe program with an 'infinite' number of rows and columns.
But when I try to mutate the list, it changes the whole column instead of just one spot.
size = 4
board = size * [size*[' ']]
board[0][1] = 'x'
#output:
#[[' ', 'x', ' ', ' '],
# [' ', 'x', ' ', ' '],
# [' ', 'x', ' ', ' '],
# [' ', 'x', ' ', ' ']]
How can I fix that?
This happens because size * [...] repeats a reference to the same inner list, so every row is the same object and a change to one row shows up in all of them.
You can change it to
board = [
[' ']*size
for _ in range(size)
]
Or use a double list comprehension
size = 4
board = [
[' ' for _ in range(size)]
for _ in range(size)
]
board[0][1] = 'x'
print(board)
which both produce
[[' ', 'x', ' ', ' '], [' ', ' ', ' ', ' '], [' ', ' ', ' ', ' '], [' ', ' ', ' ', ' ']]
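The difference between the two constructions can be checked directly with the is operator. A short sketch; the names shared and fresh are only for illustration.

```python
size = 4

# Repeating the outer list copies references to one inner list.
shared = size * [size * [' ']]

# The comprehension evaluates [' '] * size once per row, giving distinct lists.
fresh = [[' '] * size for _ in range(size)]

print(shared[0] is shared[1])  # True: both rows are the same object
print(fresh[0] is fresh[1])    # False: each row is its own list

shared[0][1] = 'x'
fresh[0][1] = 'x'
print([row[1] for row in shared])  # ['x', 'x', 'x', 'x']
print([row[1] for row in fresh])   # ['x', ' ', ' ', ' ']
```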
Kudos to @Pynchia for beating me to the answer. Here is my version of the code. I think the problem you were having was a result of your method of creating a list of lists.
size = 4
# simple way to create a list of lists
board = [size * [' '] for i in range(size)]
board[0][1] = 'x'
print(board)
Output will be as expected.

Loading text data in python and create a matrix

Say I have a text file with a format similar to this:
Q: hello what is your name?
A: Hi my name is John Smith
and I want to create a matrix from it, 2xn in this case:
[['hello','what','is','your','name','?', ' '],['hi','my','name','is','John','Smith']]
Note that the first row has an empty entry because it has 6 strings while the second row has 7 strings.
You can use re.split:
import re
file_data = open('filename.txt').read()
# Raw strings keep \s, \w and \W as regex escapes
results = filter(None, re.split(r'A:\s|Q:\s', file_data))
new_results = [re.findall(r'\w+|\W', i) for i in results]
Output:
[['hello', ' ', 'what', ' ', 'is', ' ', 'your', ' ', 'name', '?', ' '], ['Hi', ' ', 'my', ' ', 'name', ' ', 'is', ' ', 'John', ' ', 'Smith']]
Just split the strings using the split method:
with open('txt.txt') as my_file:
    lines = my_file.readlines()
#lines[0] = "Q: hello what is your name?"
#lines[1] = "A: Hi my name is John Smith"
Then just use
output = [lines[0].split(), lines[1].split()]
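Neither answer strips the "Q:" / "A:" prefix and pads the rows to equal length in one pass, so here is a minimal sketch of that. It makes several assumptions: the text is given as a string standing in for the file contents, punctuation counts as its own token, and short rows are padded with ' ' as in the question.

```python
import re

# Stand-in for the file contents; the two lines come from the question.
text = "Q: hello what is your name?\nA: Hi my name is John Smith\n"

rows = []
for line in text.splitlines():
    if not line.strip():
        continue  # skip blank lines
    body = re.sub(r'^[QA]:\s*', '', line)          # drop the "Q: " / "A: " prefix
    rows.append(re.findall(r'\w+|[^\w\s]', body))  # words, plus punctuation as its own token

# Pad every row with ' ' up to the length of the longest row.
width = max(len(row) for row in rows)
matrix = [row + [' '] * (width - len(row)) for row in rows]
print(matrix)
```

With this sample both rows already have six tokens, so no padding is added; the padding step only matters when the lines differ in length.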

Searching for an item in a nested list and then returning the index of an item

So I was trying to write a function called find_treasure which takes a 2D list as a parameter. The purpose of the function is to search through the 2D list given and to return the index of where the 'x' is located.
def find_treasure(my_list):
    str1 = 'x'
    if str1 in [j for i in (my_list) for j in i]:
        index = (j for i in my_list for j in i).index(str1)
        return(index)
treasure_map = [[' ', ' ', ' '], [' ', 'x', ' '], [' ', ' ', ' ']]
print(find_treasure(treasure_map))
However, I can't seem to get the function to return the index. I tried using the enumerate function too, but I must have been using it wrongly.
Using enumerate
def find_treasure(my_list):
    str1 = 'x'
    for i, n in enumerate(my_list):
        for j, m in enumerate(n):
            if m == str1:
                return (i, j)
treasure_map = [[' ', ' ', ' '], [' ', 'x', ' '], [' ', ' ', ' ']]
print(find_treasure(treasure_map))
Output:
(1, 1)
Using index function.
def find_treasure(my_list):
    str1 = 'x'
    for i, n in enumerate(my_list):
        try:
            return (i, n.index(str1))
        except ValueError:
            pass
treasure_map = [[' ', ' ', ' '], [' ', 'x', ' '], [' ', ' ', ' ']]
print(find_treasure(treasure_map))
Output
(1, 1)
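Both versions above return only the first match and fall through to None when there is no 'x'. If every position is wanted, a list comprehension over the two enumerate loops is one way to do it. This is a sketch; the function name find_all and its target parameter are hypothetical.

```python
def find_all(grid, target='x'):
    """Return (row, column) for every cell equal to target (hypothetical helper)."""
    return [(i, j)
            for i, row in enumerate(grid)
            for j, cell in enumerate(row)
            if cell == target]

treasure_map = [[' ', ' ', ' '], [' ', 'x', ' '], [' ', ' ', ' ']]
positions = find_all(treasure_map)
print(positions)  # [(1, 1)]
```

An empty list then signals that nothing was found, which avoids having to test for None.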

Python: extract items in sublist into string

I have a list like this:
[['zai4'], [' '], ['tui1'], ['jin4'], [' '], ['shi2'], ['pin3'], [' '], ['an1'], ['quan2'], [' '], ['xin4'], ['xi1'], [' ']]
How can I convert it into this:
zai4 tui1 jin4 shi2 pin3 an1 quan2 xin4 xi1
Thank you.
I think the unique part of my question is how to extract content from
sub-list into string.
Try this approach:
data=[['zai4'], [' '], ['tui1'], ['jin4'], [' '], ['shi2'], ['pin3'], [' '], ['an1'], ['quan2'], [' '], ['xin4'], ['xi1'], [' ']]
print([i[0] for i in data if i[0].isalnum()])
output:
['zai4', 'tui1', 'jin4', 'shi2', 'pin3', 'an1', 'quan2', 'xin4', 'xi1']
If you want a string rather than a list:
print(" ".join([i[0] for i in data if i[0].isalnum()]))
output:
zai4 tui1 jin4 shi2 pin3 an1 quan2 xin4 xi1
Using list comprehension and str.join
a = [['zai4'], [' '], ['tui1'], ['jin4'], [' '], ['shi2'], ['pin3'], [' '], ['an1'], ['quan2'], [' '], ['xin4'], ['xi1'], [' ']]
print(" ".join(" ".join(i[0] for i in a).split()))
Output:
zai4 tui1 jin4 shi2 pin3 an1 quan2 xin4 xi1
You can use unpacking:
s = [['zai4'], [' '], ['tui1'], ['jin4'], [' '], ['shi2'], ['pin3'], [' '], ['an1'], ['quan2'], [' '], ['xin4'], ['xi1'], [' ']]
new_s = ' '.join(a for [a] in s if a != ' ')
Output:
'zai4 tui1 jin4 shi2 pin3 an1 quan2 xin4 xi1'
