Splitting a list in python - python

I'm writing a parser in Python. I've converted an input string into a list of tokens, such as:
['(', '2', '.', 'x', '.', '(', '3', '-', '1', ')', '+', '4', ')', '/', '3', '.', 'x', '^', '2']
I want to be able to split the list into multiple lists, like the str.split('+') function. But there doesn't seem to be a way to do my_list.split('+'). Any ideas?
Thanks!

You can write your own split function for lists quite easily by using yield:
def split_list(l, sep):
    current = []
    for x in l:
        if x == sep:
            yield current
            current = []
        else:
            current.append(x)
    yield current
An alternative way is to use list.index and catch the exception:
def split_list(l, sep):
    i = 0
    try:
        while True:
            j = l.index(sep, i)
            yield l[i:j]
            i = j + 1
    except ValueError:
        yield l[i:]
Either way you can call it like this:
l = ['(', '2', '.', 'x', '.', '(', '3', '-', '1', ')', '+', '4', ')',
     '/', '3', '.', 'x', '^', '2']
for r in split_list(l, '+'):
    print(r)
Result:
['(', '2', '.', 'x', '.', '(', '3', '-', '1', ')']
['4', ')', '/', '3', '.', 'x', '^', '2']
For parsing in Python you might also want to look at something like pyparsing.
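If you do not care about empty pieces, a shorter variant can be sketched with itertools.groupby. This is a different technique from the two generators above, and the name split_list_groupby is made up for illustration. Unlike str.split, it silently drops the empty lists that leading, trailing, or adjacent separators would otherwise produce.

```python
from itertools import groupby

def split_list_groupby(tokens, sep):
    """Split tokens on sep, skipping empty runs (hypothetical helper)."""
    return [
        list(run)
        for is_sep, run in groupby(tokens, key=lambda tok: tok == sep)
        if not is_sep  # discard the runs that consist of separators
    ]

tokens = ['(', '2', '+', '4', ')', '+', '3']
print(split_list_groupby(tokens, '+'))
# [['(', '2'], ['4', ')'], ['3']]
```

If you need the exact str.split behaviour, including empty pieces, stick with one of the generators.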

A quick hack: use ''.join() to build a string out of your list, split that string at '+', and then call list() on each piece to break it back into individual tokens. This gives a list of lists. Note that it only works when every token is a single character, as in your example.
a = ['(', '2', '.', 'x', '.', '(', '3', '-', '1', ')', '+', '4', ')', '/', '3', '.', 'x', '^', '2']
b = ''.join(a).split('+')
c = []
for el in b:
    c.append(list(el))
print(c)
Result:
[['(', '2', '.', 'x', '.', '(', '3', '-', '1', ')'], ['4', ')', '/', '3', '.', 'x', '^', '2']]
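To see where the join-based hack stops working, here is a small sketch. The function name split_via_join is hypothetical; it just wraps the same join, split, and list steps so the two cases can be compared.

```python
def split_via_join(tokens, sep):
    """The join/split/list hack wrapped in a function (hypothetical name)."""
    return [list(piece) for piece in ''.join(tokens).split(sep)]

# Works when every token is exactly one character long.
print(split_via_join(['2', '.', 'x', '+', '4'], '+'))
# [['2', '.', 'x'], ['4']]

# Breaks for longer tokens: '12' is torn into '1' and '2'.
print(split_via_join(['12', '.', 'x', '+', '4'], '+'))
# [['1', '2', '.', 'x'], ['4']]
```

So if your tokenizer ever emits multi-digit numbers or names, a list-based split is the safer choice.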

Related

How to reverse multiple lists?

scores=open('scores.csv','r')
for score in scores.readlines():
    score = score.strip()
    rev=[]
    for s in reversed(score[0:]):
        rev.append(s)
    print(rev)
This is my code. What I want to do is print each line of scores.csv as a reversed list.
If I print scores at the beginning, the result is:
['0.74,0.63,0.58,0.89\n', '0.91,0.89,0.78,0.99\n', '0.43,0.35,0.34,0.45\n', '0.56,0.61,0.66,0.58\n', '0.50,0.49,0.76,0.72\n', '0.88,0.75,0.61,0.78\n']
It looks normal, and if I print score after I remove all \n in the list, the result is:
0.74,0.63,0.58,0.89
0.91,0.89,0.78,0.99
0.43,0.35,0.34,0.45
0.56,0.61,0.66,0.58
0.50,0.49,0.76,0.72
0.88,0.75,0.61,0.78
It still looks OK, but if I print at the end of the code, it shows:
['9', '8', '.', '0', ',', '8', '5', '.', '0', ',', '3', '6', '.', '0', ',', '4', '7', '.', '0']
['9', '9', '.', '0', ',', '8', '7', '.', '0', ',', '9', '8', '.', '0', ',', '1', '9', '.', '0']
['5', '4', '.', '0', ',', '4', '3', '.', '0', ',', '5', '3', '.', '0', ',', '3', '4', '.', '0']
['8', '5', '.', '0', ',', '6', '6', '.', '0', ',', '1', '6', '.', '0', ',', '6', '5', '.', '0']
['2', '7', '.', '0', ',', '6', '7', '.', '0', ',', '9', '4', '.', '0', ',', '0', '5', '.', '0']
['8', '7', '.', '0', ',', '1', '6', '.', '0', ',', '5', '7', '.', '0', ',', '8', '8', '.', '0']
It looks like Python converts my result from decimal to integer, but when I try to use float(s) to convert it back, it gives me an error. What's wrong with my code?
In your approach, score is a string, so it's doing exactly what you tell it to: reverse the entire line character by character. You can do two things:
1. Use the csv module to read your CSV file (recommended), to get a list of float values, then reverse that.
2. Split your line on commas, then reverse that list, and finally stitch it back together. An easy way to reverse a list in Python is mylist[::-1].
For number 2, it would be something like:
score = score.strip()
temp = score.split(',')
temp_reversed = temp[::-1]
score_reversed = ','.join(temp_reversed)
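As a rough sketch of the difference, the snippet below reverses one line from the question both ways. The helper name reverse_csv_line is made up for illustration, and it assumes plain comma-separated values with no quoting.

```python
def reverse_csv_line(line):
    """Reverse the comma-separated fields of one line (hypothetical helper)."""
    fields = line.strip().split(',')
    return ','.join(fields[::-1])

line = '0.74,0.63,0.58,0.89\n'

# Reversing the string itself walks it character by character.
print(list(reversed(line.strip()))[:4])
# ['9', '8', '.', '0']

# Reversing the list of fields keeps each number intact.
print(reverse_csv_line(line))
# 0.89,0.58,0.63,0.74
```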
Always use the csv module to read CSV files. It parses the data, splits fields on commas, and so on.
Your attempt just reverses the line character by character. I'd rewrite it completely using the csv module, which yields each row already split on commas (the default delimiter):
import csv
with open('scores.csv','r') as scores:
    cr = csv.reader(scores)
    rev = []
    for row in cr:
        rev.append(list(reversed(row)))
That doesn't convert the data to float. To do that as well, I'd replace the loop with a comprehension plus a float conversion:
rev = [[float(x) for x in reversed(row)] for row in cr]
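Here is a self-contained sketch of the comprehension version. To keep it runnable without a file, it assumes an in-memory io.StringIO buffer as a stand-in for open('scores.csv'); the two sample rows are copied from the question.

```python
import csv
import io

# Stand-in for open('scores.csv'); the rows are taken from the question.
sample = io.StringIO("0.74,0.63,0.58,0.89\n0.91,0.89,0.78,0.99\n")

# The reader is lazy, so consume it while the file (here: the buffer) is open.
rev = [[float(x) for x in reversed(row)] for row in csv.reader(sample)]
print(rev)
# [[0.89, 0.58, 0.63, 0.74], [0.99, 0.78, 0.89, 0.91]]
```

With a real file, the comprehension would go inside the with block, for the same reason.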

how to use min in nested dict?

if I have:
a = {
(1,1): {'prev': '.', 'cur': '.', 'possible': ['2', '7', '8', '9']},
(2,2): {'prev': '.', 'cur': '.', 'possible': ['1', '3', '8']},
(3,3): {'prev': '.', 'cur': '.', 'possible': ['2', '7', '8', '9', '8']}
}
And I want to get the key that has shortest length of 'possible'.
I wrote:
b = min(a, key=lambda x: len(a[x]['possible']))
It actually works.
Is there another way I can write this? I was trying to see if I can use the dict method get().
Thanks!
I mean, you could go:
b = min(a, key=lambda x: len(a.get(x).get('possible')))
But your solution is fine as it is.
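Two more spellings, as a sketch rather than a recommendation. The first iterates over (key, value) pairs so the lambda never has to index back into the dict. The second uses get() with a default, which only matters if some entries might lack the 'possible' key (an assumption, since all of yours have it).

```python
a = {
    (1, 1): {'prev': '.', 'cur': '.', 'possible': ['2', '7', '8', '9']},
    (2, 2): {'prev': '.', 'cur': '.', 'possible': ['1', '3', '8']},
    (3, 3): {'prev': '.', 'cur': '.', 'possible': ['2', '7', '8', '9', '8']},
}

# Variant 1: min over items(), then unpack the winning pair.
key, entry = min(a.items(), key=lambda kv: len(kv[1]['possible']))
print(key)  # (2, 2)

# Variant 2: get() with a default; an entry without 'possible'
# would count as length 0 and therefore win the min.
b = min(a, key=lambda k: len(a[k].get('possible', [])))
print(b)  # (2, 2)
```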

Outputting from a list

Just trying to make my code more efficient!
import socket
ip = ['1.1.1.1', '2.2.2.2', '3.3.3.3']
err = []
for address in ip:
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    result = sock.connect_ex((address, 9999))
    if result != 0:
        err.extend(address)
print(err)
this is the output I receive:
['1', '.', '1', '.', '1', '.', '1', '2', '.', '2', '.', '2', '.', '2', '3', '.', '3', '.', '3', '.', '3']
If I try to cast the values to float or int, errors are thrown. I just need each IP address inserted into the list so I can print them out looking like:
1.1.1.1
Use err.append rather than err.extend to add each string. extend iterates over its argument, and iterating over a string yields its individual characters.
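A minimal sketch of the difference, with the socket calls left out so it runs without a network. The addresses are the ones from the question.

```python
err = []
err.extend('1.1.1.1')   # a string is iterable, so each character is added
print(err)
# ['1', '.', '1', '.', '1', '.', '1']

err = []
err.append('1.1.1.1')   # the whole string is added as one element
err.extend(['2.2.2.2', '3.3.3.3'])  # extend is meant for adding several items
print('\n'.join(err))   # one address per line
```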

re.split on multiple characters (and maintaining the characters) produces a list containing also empty strings

I need to split a mathematical expression based on the delimiters. The delimiters are (, ), +, -, *, /, ^ and space. I came up with the following regular expression
"([\\s\\(\\)\\-\\+\\*/\\^])"
which also keeps the delimiters in the resulting list (which is what I want), but it also produces empty strings "" elements, which I don't want. I hardly ever use regular expression (unfortunately), so I am not sure if it is possible to avoid this.
Here's an example of the problem:
>>> import re
>>> e = "((12*x^3+4   * 3)*3)"
>>> re.split("([\\s\\(\\)\\-\\+\\*/\\^])", e)
['', '(', '', '(', '12', '*', 'x', '^', '3', '+', '4',
' ', '', ' ', '', ' ', '', '*', '', ' ', '3', ')', '', '*', '3', ')', '']
Is there a way to not produce those empty strings, maybe by modifying my regular expression? Of course I can remove them using for example filter, but the idea would be not to produce them at all.
Edit
I would also need to not include spaces. If you can help also in that matter, it would be great.
You could add \w+, remove the \s and do a findall:
import re
e = "((12*x^3+44 * 3)*3)"
print(re.findall(r"(\w+|[()\-+*/^])", e))
Output:
['(', '(', '12', '*', 'x', '^', '3', '+', '44', '*', '3', ')', '*', '3', ')']
Depending on what you want, you can change the regex:
e = "((12a*x^3+44 * 3)*3)"
print(re.findall(r"(\d+|[a-z()\-+*/^])", e))
print(re.findall(r"(\w+|[()\-+*/^])", e))
The first treats 12a as two tokens, the second as one:
['(', '(', '12', 'a', '*', 'x', '^', '3', '+', '44', '*', '3', ')', '*', '3', ')']
['(', '(', '12a', '*', 'x', '^', '3', '+', '44', '*', '3', ')', '*', '3', ')']
Just strip/filter them out in a comprehension.
result = [item for item in re.split("([\\s\\(\\)\\-\\+\\*/\\^])", e) if item.strip()]
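To compare the two approaches side by side, here is a short sketch using raw strings. The variable names are made up for illustration. findall describes what a token looks like, so spaces and empty strings never appear; split describes the delimiters, so the leftovers have to be filtered afterwards.

```python
import re

e = "((12*x^3+44 * 3)*3)"

# findall: match the tokens themselves; spaces match nothing and are skipped.
tokens_findall = re.findall(r"\w+|[()\-+*/^]", e)

# split + filter: split on delimiters, then drop empty and blank pieces.
tokens_split = [t for t in re.split(r"([\s()\-+*/^])", e) if t.strip()]

print(tokens_findall)
# ['(', '(', '12', '*', 'x', '^', '3', '+', '44', '*', '3', ')', '*', '3', ')']
print(tokens_findall == tokens_split)
# True
```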

Pythonic way to separate operators and operands in an expression

I am trying to separate the operators (including parentheses) and the operands in an expression. For example given an expression
expr = "(32+54)*342-(4*(3-9))"
I am trying to get
['(', '32', '+', '54', ')', '*', '342', '-', '(', '4', '*', '(', '3', '-', '9', ')', ')']
Here is the code that I wrote. Is there a better way of doing it in Python?
import string
l = list(expr)
n = ''
expr = []
try:
    for c in l:
        if c in string.digits:
            n += c
        else:
            if n != '':
                expr.append(n)
                n = ''
            expr.append(c)
finally:
    if n != '':
        expr.append(n)
We can do this with re.split():
>>> import re
>>> expr = "(32+54)*342-(4*(3-9))"
>>> re.split("([-()+*/])", expr)
['', '(', '32', '+', '54', ')', '', '*', '342', '-', '', '(', '4', '*', '', '(', '3', '-', '9', ')', '', ')', '']
This does insert some empty strings, but they are easy to strip out, e.g. with a list comprehension:
>>> [part for part in re.split("([-()+*/])", expr) if part]
['(', '32', '+', '54', ')', '*', '342', '-', '(', '4', '*', '(', '3', '-', '9', ')', ')']
If you are only trying to tokenize the stream, your approach is fine, but somewhat old-fashioned. You can use a regular expression to split the tokens more easily.
However, if you also want to do something with the tokens (such as evaluate them) then I suggest you look at a parsing module that can handle recursion (regular expressions cannot handle recursion), such as pyparsing.
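To illustrate why recursion matters once you go beyond tokenizing, here is a minimal recursive-descent sketch. It is not pyparsing and not a full parser: the names tokenize and evaluate are made up, and it assumes well-formed input with non-negative integers, the operators + - * /, parentheses, and no unary minus.

```python
import re

def tokenize(expr):
    """Split into integer literals and single-character operators."""
    return re.findall(r"\d+|[-+*/()]", expr)

def evaluate(tokens):
    """Evaluate a token list; assumes the input is well formed."""
    pos = 0

    def parse_expr():
        # expr := term (('+' | '-') term)*
        nonlocal pos
        value = parse_term()
        while pos < len(tokens) and tokens[pos] in ('+', '-'):
            op = tokens[pos]
            pos += 1
            rhs = parse_term()
            value = value + rhs if op == '+' else value - rhs
        return value

    def parse_term():
        # term := factor (('*' | '/') factor)*
        nonlocal pos
        value = parse_factor()
        while pos < len(tokens) and tokens[pos] in ('*', '/'):
            op = tokens[pos]
            pos += 1
            rhs = parse_factor()
            value = value * rhs if op == '*' else value / rhs
        return value

    def parse_factor():
        # factor := NUMBER | '(' expr ')'
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == '(':
            value = parse_expr()  # recursion handles the nesting
            pos += 1              # skip the closing ')'
            return value
        return int(tok)

    return parse_expr()

print(evaluate(tokenize("(32+54)*342-(4*(3-9))")))
# 29436
```

The nested call from parse_factor back into parse_expr is the part a single regular expression cannot express.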
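Python: Batteries Included. (The snippet is Python 2; in Python 3, StringIO lives in the io module.)
>>> import StringIO, tokenize
>>> [x[1] for x in tokenize.generate_tokens(StringIO.StringIO('(32+54)*342-(4*(3-9))').readline)]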
['(', '32', '+', '54', ')', '*', '342', '-', '(', '4', '*', '(', '3', '-', '9', ')', ')', '']
>>> if True:
        exp=[]
        expr = "(32+54)*342-(4*(3-9))"
        flag=False
        for i in expr:
            if i.isdigit() and flag:
                exp.append(str(exp.pop(len(exp)-1))+i)
            elif i.isdigit():
                flag=True
                exp.append(i)
            else:
                flag=False
                exp.append(i)
        print(exp)
['(', '32', '+', '54', ')', '*', '342', '-', '(', '4', '*', '(', '3', '-', '9', ')', ')']
>>>
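The flag-based loop above can also be sketched with itertools.groupby, which groups consecutive characters by whether they are digits. This is a different technique from the loop, and the function name tokenize_groupby is made up for illustration. It assumes the expression contains no spaces.

```python
from itertools import groupby

def tokenize_groupby(expr):
    """Group runs of digits into numbers; emit other characters one by one."""
    tokens = []
    for is_digit, run in groupby(expr, key=str.isdigit):
        if is_digit:
            tokens.append(''.join(run))  # '3', '2' -> '32'
        else:
            tokens.extend(run)           # ')*' -> ')', '*'
    return tokens

print(tokenize_groupby("(32+54)*342-(4*(3-9))"))
# ['(', '32', '+', '54', ')', '*', '342', '-', '(', '4', '*', '(', '3', '-', '9', ')', ')']
```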
