Select level above in iteratively created nested list with Python - python

I have trawled through so many questions on lists and I can't find what I'm looking for.
I have a string with braces wrapping some values and some other nested pairs of braces containing values. I don't know the deepest level the structure is nested, but it might look something like this:
{121{12}12{211}2}
I want to iterate over this string and transfer it to a nested list in a way similar to the following pseudocode:
for i in thestring
    if there is a leftbrace { start a new list inside the current list
    elseif there is a rightbrace } close the current list and select the list on the level above
    else add i to currently selected list
I have no idea how to go up a list level and close the current sublist.

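def listify(s):
    i = iter(s)  # iter() on an iterator returns the same iterator, so recursive calls share progress
    l = []
    for c in i:
        if c == '{':
            # The recursive call consumes characters up to the matching '}'
            l.append(listify(i))
        elif c == '}':
            return l
        else:
            l.append(c)
    # Top level: the whole string was one outer pair of braces
    return l[0]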
The trick is to use a recursive function so that the call stack keeps track of which list you're in. We use iter so that we can pass the string to recursive calls without the characters we have already processed.
>>> listify(s)
['1', '2', '1', ['1', '2'], '1', '2', ['2', '1', '1'], '2']
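The pseudocode in the question ("select the list on the level above") can also be followed literally with an explicit stack instead of recursion. This is only a sketch, not the answerer's method; the name listify_iterative and the error handling for unbalanced braces are assumptions added for illustration.

```python
def listify_iterative(s):
    """Build a nested list from a brace-delimited string using an explicit stack."""
    root = []
    stack = [root]  # stack[-1] is always the currently selected list
    for c in s:
        if c == '{':
            new = []
            stack[-1].append(new)  # start a new list inside the current list
            stack.append(new)      # and select it
        elif c == '}':
            if len(stack) == 1:
                raise ValueError("unbalanced '}'")
            stack.pop()            # close the current list, go up one level
        else:
            stack[-1].append(c)
    if len(stack) != 1:
        raise ValueError("unbalanced '{'")
    # Unwrap when the whole string was a single outer pair of braces
    if len(root) == 1 and isinstance(root[0], list):
        return root[0]
    return root


print(listify_iterative("{121{12}12{211}2}"))
```

Popping the stack is the "go up a level" step: the list that was current stays where it was appended, and the one below it becomes current again.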

Here is one way using a regex and ast.literal_eval. Note that the result is a set containing tuples, so the original order and any duplicate values are lost:
In [78]: s = "{121{12}12{211}2}"
In [79]: new = re.sub(r'(})(?!$)|(?<!^)({)',lambda x : {'{': ',(', '}': ',),'}[x.group(0)], s)
In [80]: ast.literal_eval(new)
Out[80]: {121, (211,), 2, 12, (12,)}

Related

Find index of an element in list using wildcards

I have a list like this:
a = [['1','2','3','a','b'],
['4','5','6','c','d'],
['7','8','9','e','f']]
I want to identify the index of an element in the list a based on first three sub-elements of the element. For example, the index of the element which contains ['4','5','6'] as its first three sub-elements is 1.
I have tried to do this using a list comprehension as follows:
ind = [i for i in range(0,len(a)) if (a[i][0] == '4' and a[i][1] == '5' and a[i][2] == '6')]
But this is computationally expensive, as I have to run this code several times in a for loop. So I am looking for a less computationally expensive method. So far I have tried 'list.index' as follows:
ind = a.index(['4','5','6','*','*'])
where '*' is used as a wildcard string. But it does not work, as it outputs:
['4', '5', '6', '.*', '.*'] is not in list
I think there is something wrong with the way I am using wildcard. Can you please tell me what is it? Or is there another fast way to identify the element of a list based on its sub-elements?
Solution 1: True wildcard
You can simply use a.index(...) if you use a true wildcard:
class Wildcard:
    def __eq__(self, anything):
        return True
a = [['1','2','3','a','b'],
['4','5','6','c','d'],
['7','8','9','e','f']]
wc = Wildcard()
print(a.index(['4', '5', '6', wc, wc]))
Outputs 1.
This might be fast because 1) it does the searching in C code and 2) it does minimal work, as it for example doesn't create a list slice for every row and might often rule out a row simply by looking at just the first value.
Solution 2: operator.indexOf
Or use operator.indexOf, which finds the index in C rather than in Python, as an enumerate solution would. Here are two versions; I suspect the one mapping a ready-to-go slice is faster:
from operator import indexOf, itemgetter
a = [['1','2','3','a','b'],
['4','5','6','c','d'],
['7','8','9','e','f']]
print(indexOf((r[:3] for r in a), ['4', '5', '6']))
print(indexOf(map(itemgetter(slice(3)), a), ['4', '5', '6']))
Well you could transpose and slice and transpose back and finally index, like this:
>>> list(zip(*list(zip(*a))[:3])).index(('4', '5', '6'))
1
But what's wrong with this?
>>> [x[:3] for x in a].index(['4', '5', '6'])
1
It may be more efficient to implement it as a generator. In this way, if e.g. you know that you can get at most one match you can stop once it is found:
a = [['1','2','3','a','b'],
['4','5','6','c','d'],
['7','8','9','e','f']]
pattern = ['4','5','6']
def find_index(data, pattern):
    # Lazily yield the index of every row whose first three items match
    for n, elt in enumerate(data):
        if elt[:3] == pattern:
            yield n
indices = find_index(a, pattern)
next(indices)
It gives:
1
Here's a fairly simple and straightforward way of accomplishing this:
def findListMatch(lists, match):
    # Loop over lists with indices
    for k, sublist in enumerate(lists):
        # Check if the beginning of the list matches the desired start pattern
        if sublist[0:len(match)] == match:
            # If it's a match, return the index
            return k
    # If none of the lists match, return a placeholder value of "None"
    return None
a = [['1','2','3','a','b'],
['4','5','6','c','d'],
['7','8','9','e','f']]
matchIndex = findListMatch(a, ['4', '5', '6'])
# Result:
# matchIndex = 1
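Since the question mentions repeating the lookup many times in a loop, another option (not proposed in the answers here, so treat it as a sketch) is to build a dictionary once and then look prefixes up in roughly constant time. The name index_by_prefix is hypothetical, and this assumes the rows do not change between lookups.

```python
a = [['1', '2', '3', 'a', 'b'],
     ['4', '5', '6', 'c', 'd'],
     ['7', '8', '9', 'e', 'f']]

# Map the first three items of each row to the row's index.
# Tuples are used as keys because lists are not hashable.
index_by_prefix = {}
for i, row in enumerate(a):
    # setdefault keeps the first index if several rows share a prefix
    index_by_prefix.setdefault(tuple(row[:3]), i)

print(index_by_prefix.get(('4', '5', '6')))  # 1
print(index_by_prefix.get(('0', '0', '0')))  # None (no such row)
```

Building the dictionary costs one pass over the list, so it only pays off when the same list is searched repeatedly.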
@U12-Forward's answer works but unnecessarily builds a temporary list of the same size as the input list before applying the index method, which can be quite an overhead if the list is long. A more efficient approach would be to use enumerate to generate the indices while comparing the first 3 items to the desired list:
next(i for i, l in enumerate(a) if l[:3] == ['4', '5', '6'])
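One caveat with the bare next(...) call: it raises StopIteration when no row matches. A minimal sketch of a variant that returns a default instead, using the two-argument form of next; the helper name first_match is an assumption for illustration.

```python
a = [['1', '2', '3', 'a', 'b'],
     ['4', '5', '6', 'c', 'd'],
     ['7', '8', '9', 'e', 'f']]


def first_match(rows, prefix, default=None):
    """Return the index of the first row starting with prefix, or default."""
    n = len(prefix)
    # The second argument to next() is returned when the generator is exhausted
    return next((i for i, row in enumerate(rows) if row[:n] == prefix), default)


print(first_match(a, ['4', '5', '6']))  # 1
print(first_match(a, ['0', '0', '0']))  # None
```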

Parsing,splicing and structuring nested strings to list in Python

I can't imagine I'm going to get much help from this due to my inability to explain it. But for instance I have a string like so:
s = "[1,[2,2,[3,4]],5]"
and I need to convert it into a nested list item as such
lst = ["1",["2","2",["3","4"]],"5"]
so that lst[1][2][0] would return '3'.
The way I have tried to do it was by creating a substring for every number within '[' and end of string characters and then slowly nest it back up
def ParseList(strlist):
    if '[' in strlist:
        print(strlist)
        # GetBetweenChar is the asker's own helper (not shown)
        return ParseList(GetBetweenChar(strlist,'[',None))
    else:
        return strlist
However, it prints the following (which may be a good start, but I don't know how to continue):
[1,[2,2,[3,4]],5]
1,[2,2,[3,4]],5
2,2,[3,4]],
3,4]]
I would think I should append each of these to a list, but I don't know how to.
You can use ast.literal_eval to safely convert the string to a nested list of integers. Then define a nested map function to convert all elements to strings, whilst maintaining the nesting structure.
from ast import literal_eval
s = "[1,[2,2,[3,4]],5]"
ls = literal_eval(s)
# yes I know there is something else called nmap
def nmap(fn, iterable):
    res = []
    for i in iterable:
        if isinstance(i, list): # could be tuple or something else?
            res.append(nmap(fn, i))
        else:
            res.append(fn(i))
    return res
result = nmap(str, ls)
print(result)
print(result[1][2][0])
result:
['1', ['2', '2', ['3', '4']], '5']
3
You can use eval(). Just be careful to make sure the string is safe, because eval will execute the string as Python code.
>>> eval("[1,[2,2,[3,4]],5]")[1][2][0]
3
Some more info: What does Python's eval() do?
If you didn't require every piece to be a string, but you could let numbers be numbers, then you can use the json library:
>>> s = "[1,[2,2,[3,4]],5]"
>>> import json
>>> json.loads(s)
[1, [2, 2, [3, 4]], 5]
Notice that if your original list contains numbers or booleans, they will stay as numbers or booleans. This is probably what you want, BUT if you really need everything to be strings, then you can recurse through the nested arrays and apply str to everything (look for "How to do flatmap in Python") or request further help in the comment section below.
You could proceed by first adding the quotes around the digits, then eval the list:
s = "[1,[2,2,[3,4]],5]"
res = ''
for c in s:
    if c.isdigit():
        res += '"' + c + '"'
    else:
        res += c
s = eval(res)
s
output:
['1', ['2', '2', ['3', '4']], '5']
This will work for single-digit numbers; a little more work would be needed for multiple digits or floats.
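One way to do that extra work is to quote whole tokens rather than single characters. The following is a sketch under the assumption that items never contain brackets, commas, or whitespace; it swaps eval for ast.literal_eval, which only accepts literals, and the sample string is made up for illustration.

```python
import re
from ast import literal_eval

s = "[10,[2,2.5,[3,-4]],5]"

# Wrap every run of characters that is not a bracket, comma, or whitespace
# in quotes; repr() produces a properly quoted Python string literal.
quoted = re.sub(r'[^\[\],\s]+', lambda m: repr(m.group(0)), s)
result = literal_eval(quoted)

print(result)           # ['10', ['2', '2.5', ['3', '-4']], '5']
print(result[1][2][0])  # 3
```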
Eval is not safe for user input.
For Python 2.6+, you can do something like this:
>>> import ast
>>> s = "[1,[2,2,[3,4]],5]"
>>> lst = ast.literal_eval(s)
>>> str(lst[1][2][0])
'3'

Iterating through and selecting nth element in multiple lists of lists - python 2

I have multiple lists of lists. I need to get the 2nd element of each inner list and make multiple new lists composed of these 2nd elements.
Sample data:
for item in list:
    x = fpp.table
    print x
[['hello', 'mum'],['goodbye', 'dad']]
[['3', '6', '9'], ['2', '4', '6']]
So with this data I want to turn it into the two following lists:
['mum','dad']
['6','4']
The accepted answer is correct. However, you should avoid naming your variable list, since that is a built-in type in Python, and the most Pythonic way (IMO) to do this is a list comprehension:
[elt[1] for elt in my_list]
If you want to get the second element of each list only when the list has at least two elements (otherwise, the previous code would crash), you can add a condition to the list comprehension:
[elt[1] for elt in my_list if len(elt) >= 2]
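Because the question has several lists of lists, the comprehension can be applied once per table. A small sketch, assuming the tables have already been collected into a list; the name tables is hypothetical and stands in for the values printed from fpp.table in the question.

```python
# Hypothetical container holding the two tables shown in the question
tables = [
    [['hello', 'mum'], ['goodbye', 'dad']],
    [['3', '6', '9'], ['2', '4', '6']],
]

# One inner comprehension per table, picking the second element of each row
second_items = [[row[1] for row in table] for table in tables]

for items in second_items:
    print(items)
# ['mum', 'dad']
# ['6', '4']
```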
Let's try making a function like this:
def secondElement(list):
    secondL = []
    for item in list:
        secondL.append(item[1])
    print (secondL)
This should get the 2nd element of every sub-list in the main list. Hope this is what you were looking for!

copy item in specific location in list elements to new list in Python [duplicate]

This question already has answers here:
Accessing a value in a tuple that is in a list
(6 answers)
Closed 5 years ago.
Well-experienced in C++, but new-ish to Python: I'd like to pull out the 2nd character in each element of the following list named input to a new list named output.
input = ['hail','2198','1721','1925']
output = ['a', '1', '7', '9']
Am I missing a simple operator that does this? Thanks
Welcome to programming in Python :) .
The syntax for getting a character out of the string s is s[i] where i starts with 0 and goes up to n-1 where n is the length of the string.
In Python you can build a list with a syntax that reads almost like a sentence. Here item[1] is the second character of each string taken from input, because a Python string is a sequence of characters that can be indexed like a list.
The right keywords to search the Internet for details will be "Python list comprehension" and "Python list slice".
output = [item[1] for item in input_] (see note in the other answer about 'input')
Compared to C++, Python makes coding a pleasure. You can often just write what you mean and it will work that way in Python; that is how I came from C++ to Python myself.
This is for the character after '2'.
input_ = ['hail','2198','1721','1925']
result_list = []
for element in input_:
    character = '2' # in this case
    index_of_character = element.find(character)
    if index_of_character != -1 and index_of_character != len(element) -1:
        # find() returns -1 when the character is not in the string, so skip that case
        # the character must not be the last one, otherwise the index would be out of range
        result_list.append(element[index_of_character + 1])
print (result_list)
PS: If the string contains several '2' characters, this method only gives the character after the first occurrence; you would have to tweak it to handle the others.
That's a list comprehension:
>>> input_ = ['hail','2198','1721','1925']
>>> [s[1] for s in input_]
['a', '1', '7', '9']
Note that input is the name of a built-in function in Python, so you should avoid using that name for a local variable.
You can solve it in one line using a list comprehension. input[i][1] selects the second character of the element at index i.
>>> input = ['hail','2198','1721','1925']
>>> [x[1] for x in input]
['a', '1', '7', '9']
[x[1] for x in input] will create a list of elements where each element will be x[1].

Python for loop isn't iterating '0' from a list

Python for loop isn't iterating '0' from a list!
I tried to write some code to separate an input into numbers and letters (or operators):
g='10+10+20x'
t=[]
for each_g in g:
    t.append(each_g)
lol=[]
a=[]
for each_t in t:
    if each_t.isdigit():
        lol.append(each_t)
        x = t.index(each_t)
        t.pop(x)
    else:
        lol = ''.join(lol)
        a.append(lol)
        a.append(each_t)
        lol=[]
print(a)
print(a)
The desired output would be:
['10', '+', '10', '+', '20', 'x']
but it prints
['1', '+', '1', '+', '2', 'x']
instead.
Is there a problem with the code, or a better solution to make it work as expected?
Thanks.
Don't modify a sequence that you're iterating over. Each pop is shifting a character down before you can process it.
In this case since you're not using t when you're done, there's no need for the pop at all - it's redundant.
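Following that advice, a sketch of the loop with the pop removed might look like this. The variable names tokens and digits are new, and the final flush after the loop is an addition so that an input ending in digits is not lost.

```python
g = '10+10+20x'

tokens = []   # finished numbers, operators and letters
digits = []   # digits of the number currently being read

for ch in g:
    if ch.isdigit():
        digits.append(ch)
    else:
        if digits:
            # A non-digit ends the current number
            tokens.append(''.join(digits))
            digits = []
        tokens.append(ch)

# Flush a trailing number, e.g. for input such as '10+20'
if digits:
    tokens.append(''.join(digits))

print(tokens)  # ['10', '+', '10', '+', '20', 'x']
```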
Here's an alternative approach (as your code already has been thoroughly discussed by others):
In [38]: import re
In [39]: g='10+10+20x'
In [40]: re.findall(r'(\d+|[a-zA-Z]+|\+)',g)
Out[40]: ['10', '+', '10', '+', '20', 'x']
Editing a list (or any other iterable) while you're iterating over it is a horrible idea.
Here's how iterating over a list works:
When you say for each_t in t, the list iterator keeps an internal index that starts at 0, then moves to 1 and so forth, until it reaches the end of the list. At each step it fetches the element at that index and assigns it to each_t.
When you pop from that list, you are eliminating an index. Therefore, what used to be at index 3 is now at index 2 (if you popped something at an index less than 3). The iterator then moves on to the next index. But since the elements have shifted down by one, when the iterator asks for the element at index i, it means "give me the element that used to be at index i". Instead, it gets the element currently at index i, which is in fact the element that used to be at index i+1.
This is exactly why you skip over elements when you delete from a list as you're iterating over it.
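The skipping is easy to reproduce on a small list. This is a minimal sketch with made-up data; the second loop shows one common workaround, iterating over a copy.

```python
items = ['a', 'b', 'c', 'd']
seen = []

for x in items:
    seen.append(x)
    items.remove(x)  # shifts the remaining elements down by one

print(seen)   # ['a', 'c']  ('b' and 'd' were never visited)
print(items)  # ['b', 'd']

# Iterating over a copy leaves the iteration order unaffected
items2 = ['a', 'b', 'c', 'd']
seen2 = []
for x in list(items2):
    seen2.append(x)
    items2.remove(x)

print(seen2)   # ['a', 'b', 'c', 'd']
print(items2)  # []
```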
