Python: extract items in sublist into string

I have a list like this:
[['zai4'], [' '], ['tui1'], ['jin4'], [' '], ['shi2'], ['pin3'], [' '], ['an1'], ['quan2'], [' '], ['xin4'], ['xi1'], [' ']]
How can I convert it into this?
zai4 tui1 jin4 shi2 pin3 an1 quan2 xin4 xi1
Thank you.
I think the unique part of my question is how to extract the contents of the
sub-lists into a single string.

Try this approach:
data=[['zai4'], [' '], ['tui1'], ['jin4'], [' '], ['shi2'], ['pin3'], [' '], ['an1'], ['quan2'], [' '], ['xin4'], ['xi1'], [' ']]
print([i[0] for i in data if i[0].isalnum()])
output:
['zai4', 'tui1', 'jin4', 'shi2', 'pin3', 'an1', 'quan2', 'xin4', 'xi1']
If you want a string instead of a list:
print(" ".join([i[0] for i in data if i[0].isalnum()]))
output:
zai4 tui1 jin4 shi2 pin3 an1 quan2 xin4 xi1

Using a generator expression and str.join, then collapsing the extra whitespace with split():
a = [['zai4'], [' '], ['tui1'], ['jin4'], [' '], ['shi2'], ['pin3'], [' '], ['an1'], ['quan2'], [' '], ['xin4'], ['xi1'], [' ']]
print(" ".join(" ".join(i[0] for i in a).split()))
Output:
zai4 tui1 jin4 shi2 pin3 an1 quan2 xin4 xi1

You can use unpacking:
s = [['zai4'], [' '], ['tui1'], ['jin4'], [' '], ['shi2'], ['pin3'], [' '], ['an1'], ['quan2'], [' '], ['xin4'], ['xi1'], [' ']]
new_s = ' '.join(a for [a] in s if a != ' ')
Output:
'zai4 tui1 jin4 shi2 pin3 an1 quan2 xin4 xi1'
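The answers above all assume each sublist holds exactly one string. If a sublist might hold several items, a minimal sketch along these lines could flatten first and filter afterwards; the names flat and result are only illustrative, and the filter assumes that "blank" means empty or whitespace-only:

```python
from itertools import chain

data = [['zai4'], [' '], ['tui1'], ['jin4'], [' '], ['shi2'], ['pin3'],
        [' '], ['an1'], ['quan2'], [' '], ['xin4'], ['xi1'], [' ']]

# Flatten one level of nesting, whatever the length of each sublist.
flat = chain.from_iterable(data)

# Keep only tokens that are not empty or whitespace-only, then join them.
result = " ".join(token for token in flat if token.strip())
print(result)
```

This prints the same space-separated string as the answers above.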

Related

How to write a dictionary into csv file, with keys having values of different length?

I have a dictionary which has multiple values for each key. The catch is that the value lists for some keys have different lengths.
{'TimeStamp': [' 5571300\r\n', ' 6074300\r\n', ' 6581300\r\n', ' 7084300\r\n'],
'PRESS': [' 1020.44\r\n', ' 1020.42\r\n', ' 1020.48\r\n', ' 1020.46\r\n'],
'GYR_X': [' -70', ' 70', ' 0', ' 0'],
'GYR_Y': [' 630', ' 630', ' 630', ' 630'],
'GYR_Z': [' 210\r\n', ' 210\r\n', ' 210\r\n', ' 210\r\n'],
'MAG_X': [' -456', ' -450', ' -471', ' -457'],
'MAG_Y': [' -300', ' -307', ' -294', ' -295'],
'MAG_Z': [' 84\r\n', ' 84\r\n', ' 75\r\n', ' 75\r\n'],
'TEMP': [' +24.30\r\n', ' +24.30\r\n'],
'HUM': [' 71.78\r\n', ' 71.73\r\n'],
....}
How can I write this into a CSV file, with each list of values under its respective key?
I tried using a pandas DataFrame, but it requires the values to be of equal length:
ValueError: All arrays must be of the same length
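One possible approach, sketched under the assumption that shorter columns should simply be padded with empty cells and that the stray whitespace and '\r\n' should be stripped: itertools.zip_longest turns the columns into rows and fills the gaps. The helper name write_columns and the shortened sample dictionary are made up for illustration.

```python
import csv
import io
from itertools import zip_longest

# Shortened sample in the shape of the dictionary from the question.
data = {
    'TimeStamp': [' 5571300\r\n', ' 6074300\r\n', ' 6581300\r\n', ' 7084300\r\n'],
    'GYR_X': [' -70', ' 70', ' 0', ' 0'],
    'TEMP': [' +24.30\r\n', ' +24.30\r\n'],
}

def write_columns(columns, handle):
    """Write a dict of unequal-length lists as CSV columns (hypothetical helper)."""
    writer = csv.writer(handle)
    writer.writerow(columns.keys())  # header row: one cell per key
    cleaned = [[value.strip() for value in values] for values in columns.values()]
    # zip_longest pads the shorter columns with '' so every row is complete.
    for row in zip_longest(*cleaned, fillvalue=''):
        writer.writerow(row)

buffer = io.StringIO()  # stands in for open('out.csv', 'w', newline='')
write_columns(data, buffer)
print(buffer.getvalue())
```

With pandas, building the frame from a dict of Series, pd.DataFrame({k: pd.Series(v) for k, v in data.items()}), pads the short columns with NaN instead of raising the ValueError.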

Extract all elements of a list that match a string

I have a keyword list and an input list of lists. My task is to find those lists that contain the keyword (even partially). I am able to extract the lists that contain the keyword using the following code:
t_list = [['Subtotal: ', '1,292.80 '], ['VAT ', ' 64.64 '], ['RECEIPT TOTAL ', 'AED1,357.44 '],
['NOT_SELECTED, upto2,000 ', 'Sub total ', '60.58 '],
['NOT_SELECTED, upto500 ', 'amount 160.58 ', '', '3.03 '],
['Learn', 'Bectricity total ', '', '', '63.61 ']]
keyword = ['total ', 'amount ']
for string_list in t_list:
    # drop empty strings in place
    string_list[:] = [item for item in string_list if item != '']
    for element in string_list:
        element = element.lower()
        if any(s in element for s in keyword):
            print(string_list)
The output is:
[['Subtotal: ', '1,292.80 '], ['RECEIPT TOTAL ', 'AED1,357.44 '], ['NOT_SELECTED, upto2,000 ', 'Sub total ', '60.58 '], ['NOT_SELECTED, upto500 ', 'amount 160.58 ', '3.03 '],
['Learn', 'Bectricity total ', '63.61 ']]
Required output is to have only the string that matched with the keyword and the number in the list.
Required output:
[['Subtotal: ', '1,292.80 '], ['RECEIPT TOTAL ', 'AED1,357.44 '], ['Sub total ', '60.58 '], ['amount 160.58 ', '3.03 '],['Bectricity total ', '63.61 ']]
If I can have the output as a dictionary with the string matched to the keyword as key and the number a value, it would be perfect.
Thanks a ton in advance!
Here is the answer from our chat, slightly modified with some comments as some explanation for the code. Feel free to ask me to clarify or change anything.
import re
t_list = [
['Subtotal: ', '1,292.80 '],
['VAT ', ' 64.64 '],
['RECEIPT TOTAL ', 'AED1,357.44 '],
['NOT_SELECTED, upto2,000 ', 'Sub total ', '60.58 '],
['NOT_SELECTED, upto500 ', 'amount 160.58 ', '', '3.03 '],
['Learn', 'Bectricity total ', '', '', '63.61 ']
]
keywords = ['total ', 'amount ']
output = {}
for sub_list in t_list:
    # Becomes the string that matched the keyword if one is found
    matched = None
    for item in sub_list:
        for keyword in keywords:
            if keyword in item.lower():
                matched = item
    # If a match was found, then we start looking at the list again
    # looking for the numbers
    if matched:
        for item in sub_list:
            # split the string so for example 'amount 160.58 ' becomes ['amount', '160.58']
            # This allows us to more easily extract just the number
            split_items = item.split()
            for split_item in split_items:
                # Simple use of regex to match any '.' with digits either side
                re_search = re.search(r'[0-9][.][0-9]', split_item)
                if re_search:
                    # Try block because we are making a list. If the list exists,
                    # then just append a value, otherwise create the list with the item
                    # in it
                    try:
                        output[matched.strip()].append(split_item)
                    except KeyError:
                        output[matched.strip()] = [split_item]
print(output)
You mentioned wanting to match a string such as 'AED 63.61'. My solution uses .split() to separate the strings, which makes it easier to grab just the number. For example, for a string like 'amount 160.58' it becomes much easier to grab just the 160.58. I'm not sure how to match a string like the one you want to keep without also matching the one I just mentioned (unless, of course, the prefix is always 'AED', in which case we could add some more logic to match anything with 'aed').
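As a variation on the code above, here is a rough sketch that pulls the numbers out with a single regex instead of splitting, so a prefix such as 'AED' is dropped and the number is removed from the key. It is not the original answer's method: the function name extract_totals, the NUMBER pattern, the keywords without trailing spaces, and the choice to take the first matching item as the label are all assumptions.

```python
import re

# Digits with optional thousands separators, followed by a decimal part.
NUMBER = re.compile(r'\d[\d,]*\.\d+')

def extract_totals(rows, keywords):
    """Map each keyword-matching label to the decimal numbers found in its row."""
    totals = {}
    for row in rows:
        # First item in the row that contains any keyword (case-insensitive).
        label = next(
            (item for item in row if any(k in item.lower() for k in keywords)),
            None,
        )
        if label is None:
            continue
        numbers = [n for item in row for n in NUMBER.findall(item)]
        key = NUMBER.sub('', label).strip()  # remove any number from the label
        totals.setdefault(key, []).extend(numbers)
    return totals

t_list = [
    ['Subtotal: ', '1,292.80 '],
    ['VAT ', ' 64.64 '],
    ['RECEIPT TOTAL ', 'AED1,357.44 '],
    ['NOT_SELECTED, upto2,000 ', 'Sub total ', '60.58 '],
    ['NOT_SELECTED, upto500 ', 'amount 160.58 ', '', '3.03 '],
    ['Learn', 'Bectricity total ', '', '', '63.61 '],
]
totals = extract_totals(t_list, ['total', 'amount'])
print(totals)
```

The values stay as strings (for example '1,292.80'); converting them to numbers would need the thousands separators removed first.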

Mutating two dimensional list

I'm trying to make a more advanced Tic Tac Toe program with an 'infinite' number of rows and columns.
But when I try to mutate the list, it changes the whole column instead of just one spot.
size = 4
board = size * [size*[' ']]
board[0][1] = 'x'
#output:
#[[' ', 'x', ' ', ' '],
# [' ', 'x', ' ', ' '],
# [' ', 'x', ' ', ' '],
# [' ', 'x', ' ', ' ']]
How can I fix that?
This happens because size * [...] repeats references to one inner list, so every row of the board is the same object and a change made through one row shows up in all of them.
You can change it to
board = [
    [' '] * size
    for _ in range(size)
]
Or use a double list comprehension:
size = 4
board = [
    [' ' for _ in range(size)]
    for _ in range(size)
]
board[0][1] = 'x'
print(board)
which both produce
[[' ', 'x', ' ', ' '], [' ', ' ', ' ', ' '], [' ', ' ', ' ', ' '], [' ', ' ', ' ', ' ']]
Kudos to @Pynchia for beating me to the answer. Here is my version of the code. I think the problem you were having was a result of how you created the list of lists.
size = 4
# simple way to create a list of independent lists
board = [size * [' '] for i in range(size)]
board[0][1] = 'x'
print(board)
Output will be as expected.
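To see the difference between the two constructions directly, a small check with the is operator can help. This is only an illustrative sketch; the names shared and independent are made up here.

```python
size = 4

# Outer multiplication copies references: every row is the same list object.
shared = size * [size * [' ']]
print(shared[0] is shared[1])            # True

# The comprehension evaluates [' '] * size once per row: distinct objects.
independent = [[' '] * size for _ in range(size)]
print(independent[0] is independent[1])  # False

shared[0][1] = 'x'
independent[0][1] = 'x'
print([row[1] for row in shared])        # ['x', 'x', 'x', 'x']
print([row[1] for row in independent])   # ['x', ' ', ' ', ' ']
```

The inner [' '] * size is safe because strings are immutable; the aliasing only matters when the repeated element is itself mutable, as a list is.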

How to remove parentheses from elements in a list (Python)

I'm trying to remove some parentheses from numbers in my list. For example, I have the following list:
[[' 103.92246(11)\n'],
[' 104.92394(11)\n'],
[' 105.92797(21)#\n'],
[' 106.93031(43)#\n'],
[' 107.93484(32)#\n'],
[' 108.93763(54)#\n'],
[' 109.94244(54)#\n'],
[' 110.94565(54)#\n'],
[' 111.95083(75)#\n'],
[' 112.95470(86)#\n'],
[' 82.94874(54)#\n'],
[' 83.94009(43)#\n'],
[' 84.93655(30)#\n'],
[' 85.93070(47)\n'],
[' 86.92733(24)\n'],
...]
For example, the first element in my list is 103.92246(11), where I want the parenthesized part stripped to give 103.92246. Some elements also have a # which I want removed too. Basically, all I want is the float number. How would I go about doing this?
I've tried the code below, but it doesn't seem to be working for me.
tolist = []
for num in mylist:
    a = re.sub('()', '', num)
    tolist.append(a)
You can use str.translate, passing whatever chars you want to remove:
l =[[' 103.92246(11)\n'],
[' 104.92394(11)\n'],
[' 105.92797(21)#\n'],
[' 106.93031(43)#\n'],
[' 107.93484(32)#\n'],
[' 108.93763(54)#\n'],
[' 109.94244(54)#\n'],
[' 110.94565(54)#\n'],
[' 111.95083(75)#\n'],
[' 112.95470(86)#\n'],
[' 82.94874(54)#\n'],
[' 83.94009(43)#\n'],
[' 84.93655(30)#\n'],
[' 85.93070(47)\n'],
[' 86.92733(24)\n']]
for sub in l:
    # Python 3 form; in Python 2 this was s.translate(None, "()#")
    sub[:] = [s.translate(str.maketrans("", "", "()#")) for s in sub]
Output:
[[' 103.9224611\n'], [' 104.9239411\n'], [' 105.9279721\n'],
[' 106.9303143\n'], [' 107.9348432\n'], [' 108.9376354\n'],
[' 109.9424454\n'], [' 110.9456554\n'], [' 111.9508375\n'],
[' 112.9547086\n'], [' 82.9487454\n'], [' 83.9400943\n'],
[' 84.9365530\n'], [' 85.9307047\n'], [' 86.9273324\n']]
If you want them cast to floats:
for sub in l:
    sub[:] = map(float, (s.translate(str.maketrans("", "", "()#")) for s in sub))
which will give you:
[[103.9224611], [104.9239411], [105.9279721], [106.9303143],
[107.9348432], [108.9376354], [109.9424454], [110.9456554],
[111.9508375], [112.9547086], [82.9487454], [83.9400943], [84.936553],
[85.9307047], [86.9273324]]
If you want to remove the numbers in the parentheses as well, split on the last "(":
for sub in l:
    sub[:] = map(float, (s.rsplit("(", 1)[0] for s in sub))
print(l)
Output:
[[103.92246], [104.92394], [105.92797], [106.93031], [107.93484],
[108.93763], [109.94244], [110.94565], [111.95083], [112.9547],
[82.94874], [83.94009], [84.93655], [85.9307], [86.92733]]
Or using str.rfind:
for sub in l:
    sub[:] = map(float, (s[:s.rfind("(")] for s in sub))
The output is the same as above.
You can do this:
result = []
for num in mylist:
    a = num[0].index('(')  # find the position of (
    result.append(num[0][:a])
A one-liner version:
[x[0][:x[0].index('(')] for x in mylist]
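A small change to your regex (note that each element is itself a one-item list, so the string to clean is num[0]):
tolist = []
for num in mylist:
    # escape the parentheses so they match literally; also drop an optional trailing #
    a = re.sub(r'\(.*\)#?', '', num[0]).strip()
    tolist.append(a)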
import re
my_list = [[' 103.92246(11)\n'],
[' 104.92394(11)\n'],
[' 105.92797(21)#\n'],
[' 106.93031(43)#\n'],
[' 107.93484(32)#\n'],
[' 108.93763(54)#\n'],
[' 109.94244(54)#\n'],
[' 110.94565(54)#\n'],
[' 111.95083(75)#\n'],
[' 112.95470(86)#\n'],
[' 82.94874(54)#\n'],
[' 83.94009(43)#\n'],
[' 84.93655(30)#\n'],
[' 85.93070(47)\n']]
result = [re.sub(r'([0-9\.])\(.*?\n', r'\1', x[0]) for x in my_list]
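If the goal is just the float, another option is to match the leading number rather than delete everything after it. The following is a minimal Python 3 sketch, assuming every string starts (after optional whitespace) with a plain decimal number; the helper name leading_float and the shortened sample list are made up for illustration.

```python
import re

rows = [[' 103.92246(11)\n'], [' 105.92797(21)#\n'], [' 85.93070(47)\n']]

def leading_float(text):
    """Return the decimal number at the start of text as a float."""
    match = re.match(r'\s*([0-9]+\.[0-9]+)', text)
    if match is None:
        raise ValueError('no leading number in %r' % text)
    return float(match.group(1))

values = [leading_float(row[0]) for row in rows]
print(values)  # [103.92246, 105.92797, 85.9307]
```

Raising on a non-matching string is a design choice for the sketch; returning None instead would let malformed entries be skipped.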

appending 2 lists at the same time

I have a function that draws rectangles:
def drawTbl(l, w):
    ln1 = ' '
    ln2 = '-'
    ln3 = '|'
    x = range(l)
    # print() calls, so this also runs on Python 3
    print('+', ln2*w, '+')
    for i in range(len(x)):
        print(ln3, ln1*w, ln3)
    print('+', ln2*w, '+')
It works fine, but I'm attempting to kind of graph this (this is like a pong clone) so that I can place a ball 'O' at the center and use X and Y for collision detection. When I use this function:
def tblData(l, w):
    table=[]
    for x in range(l):
        table.append([])
        for y in range(w):
            table.append([])
It does seem to append the blank lists, but when I try to use table[x][y], all I get is an error.
When I return table from tblData, I do get a list of empty lists.
But say (l, w) is (12, 56) and I'm trying to place the ball 'O' at the center of the grid, (6, 28): simply typing table[6][28] raises an error, so I don't know how I would put 'O' at table[6][28].
So my question is, how can I effectively access list[x][y]?
Instead of creating empty lists you will need to initialize the values in the inner lists to some reasonable value, like a space.
For example:
def tblData(l, w):
    table=[]
    for x in range(l):
        table.append([' '] * w)
    return table
Or more concisely:
def tblData(l, w):
    return [[' '] * w for x in range(l)]
Note that [' '] * 3 creates the list [' ', ' ', ' '], so [' '] * w is equivalent to
[' ' for x in range(w)].
For example:
>>> import pprint
>>> table = [[' '] * 4 for x in range(5)]
>>> pprint.pprint(table)
[[' ', ' ', ' ', ' '],
 [' ', ' ', ' ', ' '],
 [' ', ' ', ' ', ' '],
 [' ', ' ', ' ', ' '],
 [' ', ' ', ' ', ' ']]
>>> table[3][1] = 'O'
>>> pprint.pprint(table)
[[' ', ' ', ' ', ' '],
 [' ', ' ', ' ', ' '],
 [' ', ' ', ' ', ' '],
 [' ', 'O', ' ', ' '],
 [' ', ' ', ' ', ' ']]
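Putting this together for the (12, 56) grid from the question, a rough sketch might look like the following. The helper names make_table and render are hypothetical, and the border style is only an approximation of what drawTbl prints.

```python
def make_table(length, width, fill=' '):
    """Build a length x width grid whose rows are independent lists."""
    return [[fill] * width for _ in range(length)]

def render(table):
    """Return the grid as text with a simple border around it."""
    width = len(table[0])
    border = '+' + '-' * width + '+'
    lines = [border]
    for row in table:
        lines.append('|' + ''.join(row) + '|')
    lines.append(border)
    return '\n'.join(lines)

table = make_table(12, 56)
table[6][28] = 'O'  # row 6, column 28: the ball at the center
print(render(table))
```

Because the grid holds the ball's position, collision checks can compare the ball's row and column against 0 and the grid dimensions without touching the drawing code.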
