I'm parsing JSON objects and found this sample line of code which I kind of understand but would appreciate a more detailed explanation of:
for record in [x for x in records.split("\n") if x.strip() != '']:
I know it is splitting records into individual records on the newline character, but I was wondering why it looks so complicated. Is it the case that we can't write something like this:
for record in records.split("\n") if x.strip() != '']:
So what do the brackets [] do? And why do we have x twice in x for x in records.split...?
Thanks
The "brackets" in your example construct a new list from an old one; this is called a list comprehension.
The basic idea with [f(x) for x in xs if condition] is:
def list_comprehension(xs):
    result = []
    for x in xs:
        if condition:
            result.append(f(x))
    return result
The f(x) can be any expression, containing x or not.
That's a list comprehension, a neat way of creating lists with certain conditions on the fly.
You can think of it as a short form of this:
a = []
for record in records.split("\n"):
    if record.strip() != '':
        a.append(record)
for record in a:
    pass  # do something
The square brackets ( [] ) usually signal a list in Python.
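As a minimal sketch of what the original loop does, assume records is a string holding one JSON object per line. The sample data and the names non_blank and parsed are made up for illustration; they are not from the question.

```python
import json

# Hypothetical sample input: one JSON object per line, with blank lines mixed in.
records = '{"id": 1}\n\n{"id": 2}\n   \n{"id": 3}\n'

# The comprehension builds a new list that keeps only the non-blank lines.
non_blank = [x for x in records.split("\n") if x.strip() != '']

# Looping over that list is what the original "for record in [...]" does.
parsed = [json.loads(record) for record in non_blank]
print(parsed)
```

The first x is the expression that goes into the new list, and the second x is the loop variable, which is why it appears twice.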
I am still relatively new to coding. I've been doing this for less than a year and most of the basics I completely understand now. However, every now and then I come across a type of for loop that I can't get my head around.
It usually goes like this:
x for x in list if x in otherList
I completely understand for loops and if statements, but that particular line of code always confuses me. Would anyone be able to provide a detailed explanation of what is actually happening there, please?
It's called a list comprehension if it's in brackets []:
This:
new_list = [x for x in my_list if x in other_list]
Is equivalent to this:
new_list = []
for x in my_list:
    if x in other_list:
        new_list.append(x)
If it's in parentheses () it's called a generator expression:
This:
new_list = (x for x in my_list if x in other_list)
Is sort of equivalent to this:
def foo():
    for x in my_list:
        if x in other_list:
            yield x
new_list = foo()
You might want to read this question and answer to understand more about generators and yielding functions.
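To make the difference concrete, here is a small sketch with made-up sample lists. The list comprehension builds everything immediately, while the generator expression produces values only when something asks for them.

```python
# Sample data chosen purely for illustration.
my_list = [1, 2, 3, 4, 5]
other_list = [2, 4, 6]

as_list = [x for x in my_list if x in other_list]  # built right away
as_gen = (x for x in my_list if x in other_list)   # nothing computed yet

first = next(as_gen)  # pulls the first matching value on demand
rest = list(as_gen)   # consumes whatever is left in the generator

print(as_list, first, rest)
```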
This is used within a list comprehension and the if statement acts as a filter.
You may begin with a list of all numbers from 0-9:
mynums = range(10)
But then you might want only the even numbers in a new list:
myevennums = []
for i in mynums:
    if i % 2 == 0:
        myevennums.append(i)
That works, but it takes so many keystrokes.
So Python allows list comprehensions:
myevennums = [i for i in mynums if i%2==0]
The condition could be anything, including membership in another list:
evens_to_20 = list(range(0,21,2))
Then you could create a list with the elements of your list that are also in the other list:
myevennums = [i for i in mynums if i in evens_to_20]
In general, when you see a statement like that you can always expand it as:
Y_comprehension = [i for i in X if condition(i)]
# the same as
Y_loop = []
for i in X:
    if condition(i):
        Y_loop.append(i)
assert Y_comprehension == Y_loop
If the condition really is very simple, the list-comprehension is usually the better choice. If it starts to get complicated, you probably want the loop or an intermediate function def condition(item): ...
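As a rough illustration of the "intermediate function" option, here is a sketch in which the filter has grown past a one-liner. The function name is_wanted and its rule are hypothetical, chosen only to show the shape.

```python
def is_wanted(item):
    """Hypothetical condition: keep even numbers that are not multiples of 3."""
    if item % 2 != 0:
        return False
    if item % 3 == 0:
        return False
    return True

mynums = range(10)

# The comprehension stays short because the logic lives in the function.
filtered = [i for i in mynums if is_wanted(i)]
print(filtered)
```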
I've been trying to turn the output of my list comprehension into a variable. Quite silly, but no matter what I try, I seem to end up with an empty list (or NoneType variable).
I'm guessing it has something to do with the generator that it uses, but I'm not sure how to get around it as I need the generator to retrieve the desired results from my JSON file. (And I'm too much of a list comprehension and generator newbie to see how).
This is the working piece of code (originally posted as an answer to these questions (here and here)).
I'd like the output of the print() part to be written to a list.
def item_generator(json_Response_GT, identifier):
    if isinstance(json_Response_GT, dict):
        for k, v in json_Response_GT.items():
            if k == identifier:
                yield v
            else:
                yield from item_generator(v, identifier)
    elif isinstance(json_Response_GT, list):
        for item in json_Response_GT:
            yield from item_generator(item, identifier)

res = item_generator(json_Response_GT, "identifier")
print([x for x in res])
Any help would be greatly appreciated!
A generator keeps its state, so after you iterate through it once (in order to print), another iteration will start at the end and yield nothing.
print([x for x in res]) # res is used up here
a = [x for x in res] # nothing left in res
Instead, do this:
a = [x for x in res] # or a = list(res)
# now res is used up, but a is a fixed list - it can be read and accessed as many times as you want without changing its state
print(a)
res = [x for x in item_generator(json_Response_GT, "identifier")] should do the trick.
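Here is a self-contained sketch of the exhaustion behaviour, using the generator from the question (with a shorter parameter name) and a made-up nested structure standing in for the real JSON response.

```python
def item_generator(data, identifier):
    """Recursively yield every value stored under the key `identifier`."""
    if isinstance(data, dict):
        for k, v in data.items():
            if k == identifier:
                yield v
            else:
                yield from item_generator(v, identifier)
    elif isinstance(data, list):
        for item in data:
            yield from item_generator(item, identifier)

# Hypothetical sample data, not the asker's actual JSON.
sample = {
    "identifier": "a",
    "children": [{"identifier": "b"}, {"other": {"identifier": "c"}}],
}

res = item_generator(sample, "identifier")
first_pass = list(res)   # consumes the generator
second_pass = list(res)  # the generator is already exhausted

print(first_pass, second_pass)
```

Storing the result of the first pass in a variable is what keeps the values around for later use.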
I came across a list comprehension that is not quite the same as usual, so I am confused about the execution order of the list comprehension.
import re
folders = ['train_frames001', 'train_masks002',
           'val_frames003', 'val_masks004', 'test_frames005', 'test_masks006']
folders.sort(key=lambda var: [int(x) if x.isdigit() else x
                              for x in re.findall(r'[^0-9]|[0-9]+', var)])
print(folders)
# Does the list comprehension part mean the following?
# for x in re.findall(r'[^0-9]|[0-9]+', var):
#     if x.isdigit():
#         int(x)
#     else:
#         x
I didn't find any related examples or docs.
I think you are confused about the order of if-else.
a = [1,2,3,4,5,6,7,8]
If you want simply square of each number
b = [i**2 for i in a]
# [1,4,9,16,25,36,49,64]
If you want even numbers (if statement in list-comprehension)
c = [i for i in a if i%2==0]
# [2,4,6,8]
If you want to square only the even numbers and keep the others unchanged (if-else conditional expression)
c = [i**2 if i%2==0 else i for i in a]
# [1,4,3,16,5,36,7,64]
I ran the code and got ['test_frames005', 'test_masks006', 'train_frames001', 'train_masks002', 'val_frames003', 'val_masks004']. I think that result is right.
If you want a result like ['train_frames001', 'train_masks002', 'val_frames003', 'val_masks004', 'test_frames005', 'test_masks006'], sorted by the trailing number, you could change your code as below.
import re
folders = ['train_frames001', 'train_masks002',
           'val_frames003', 'val_masks004', 'test_frames005', 'test_masks006']
folders.sort(key=lambda var: [int(x)
                              for x in re.findall(r'[^0-9]|[0-9]+', var) if x.isdigit()])
print(folders)
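To see why the two versions sort differently, it may help to look at the keys they produce. This sketch pulls each lambda out into a named function (mixed_key and number_key are names invented here) and uses sorted so the original list is left untouched.

```python
import re

def mixed_key(name):
    # Conditional expression: every token is kept, digit runs become ints.
    return [int(x) if x.isdigit() else x
            for x in re.findall(r'[^0-9]|[0-9]+', name)]

def number_key(name):
    # Trailing if: a filter, so only the digit runs survive.
    return [int(x) for x in re.findall(r'[^0-9]|[0-9]+', name) if x.isdigit()]

folders = ['train_frames001', 'train_masks002',
           'val_frames003', 'val_masks004', 'test_frames005', 'test_masks006']

print(mixed_key('val_masks004'))
print(number_key('val_masks004'))

by_mixed = sorted(folders, key=mixed_key)    # letters dominate the ordering
by_number = sorted(folders, key=number_key)  # only the numbers are compared
print(by_mixed)
print(by_number)
```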
I have a list containing:
NewL = [(1.1,[01,02]),(1.2,[03,04]),(1.3,[05,06])]
I used enumerate to obtain the list above; the bracketed values [01,02], [03,04] and [05,06] come from another list, which I'll show just in case:
L = [[01,02],[03,04],[05,06]]
and initially the output list is just:
OutputList = [1.1,1.2,1.3]
I used enumerate on both of these lists to get the first list I've written above.
The problem I'm facing now is this: let's say I want to output only the value paired with [05,06], which is 1.3, from NewL. How would I do that? I was thinking of something like:
for val in NewL:
    if NewL[1] == [05,06]:
        print NewL[0]
but it's totally wrong, and the target won't always be [05,06]; I might need the value for [03,04] or [01,02] too. I'm pretty new to using enumerate, so I'd appreciate any help with this.
The for loop should look like this:
for val in NewL:
    if val[1] == [5,6]:
        print(val[0])
It will print 1.3.
I'm not sure I understand the question, so I will extrapolate what you need:
Given your 2 initial lists:
L = [[1, 2], [3, 4], [5, 6]]
OutputList = [1.1, 1.2, 1.3]
you can generate your transformed list using:
NewL = list(zip(OutputList, L))
then, given 1 item from L, if you want to retrieve the value from OutputList:
val = [x for x, y in NewL if y == [5, 6]][0]
But it would be a lot easier to just do:
val = OutputList[L.index([5, 6])]
Note that both of those expressions raise an exception if the searched item is not found: the first raises an IndexError, the second a ValueError.
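If a missing item should not raise at all, one possible variation is next() with a default over a generator expression. The helper name lookup and its default parameter are inventions for this sketch, not part of the question.

```python
L = [[1, 2], [3, 4], [5, 6]]
OutputList = [1.1, 1.2, 1.3]

def lookup(target, default=None):
    """Hypothetical helper: return the OutputList value paired with target."""
    # next() stops at the first match and falls back to default if there is none.
    return next((x for x, y in zip(OutputList, L) if y == target), default)

found = lookup([5, 6])
missing = lookup([7, 8])
print(found, missing)
```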
I'm new to Python and came across this segment of code. Can someone help me with the syntax here? Maybe provide some comments on each line explaining how it works? xs is a list that contains dates.
data = {}
for title, d in tmpdata.items():
    data[title] = [x in d and d[x][statid] or 0 for x in xs]
    data[title][-1] = maxs[statid]
If I had to guess, I'd say the most perplexing line to someone new to Python must be:
data[title] = [x in d and d[x][statid] or 0 for x in xs]
There is a lot going on here, and some of it uses a style that, although safe in this instance, is no longer recommended. Here is a more verbose form:
data[title] = []
for x in xs:
    if x in d and d[x][statid]:
        data[title].append(d[x][statid])
    else:
        data[title].append(0)
The construct condition and value-if-condition-is-true or value-if-condition-is-false is an old-style form of the C ternary form condition ? value-if-condition-is-true : value-if-condition-is-false. The given Python expression hides a latent bug that can crop up if the value-if-condition-is-true is evaluated by Python as a "false-y" value - 0, [], () are all values that would be considered as false if used in a conditional expression, so you might have a problem if your value-if-condition-is-true turned out to be one of these. As it happens in this case, if d[x][statid] is 0, then we would assume a False result and go on and add a 0 to the list, which would be the right thing to do anyway. If we could just edit the verbose form, the simplest would be to remove the and d[x][statid] as in:
data[title] = []
for x in xs:
    if x in d:
        data[title].append(d[x][statid])
    else:
        data[title].append(0)
Or use the new Python ternary form (which gives some people a rash, but I have grown accustomed to it - the ternary form, not the rash), which is written as:
value-if-condition-is-true if condition else value-if-condition-is-false
Or substituting into our verbose form:
data[title] = []
for x in xs:
    data[title].append(d[x][statid] if x in d else 0)
So lastly, the list comprehension part. Whenever you have this kind of loop:
listvar = []
for some-iteration-condition:
    listvar.append(some-iteration-dependent-value)
You can rewrite it as:
listvar = [some-iteration-dependent-value for some-iteration-condition]
and this form is called a list comprehension. It creates a list by following the iteration condition and evaluating the value for each iteration.
So you can now see how the original statement would be written. Because of the possible latent bug inherent in the old-style condition and true-value or false-value, the ternary form or an explicit if-then-else is the preferred style now. The code should be written today as:
data[title] = [d[x][statid] if x in d else 0 for x in xs]
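The latent bug only stays hidden because the fallback here is 0. As a small sketch with made-up data, changing the fallback to -1 shows the two forms disagreeing when the stored value is itself falsy.

```python
# Hypothetical data: the stored value is a legitimate 0.
d = {"2020-01-01": {"hits": 0}}
x, statid = "2020-01-01", "hits"

# Old and/or style: the falsy 0 falls through to the fallback.
old_style = x in d and d[x][statid] or -1

# Conditional expression: only the membership test decides the branch.
ternary = d[x][statid] if x in d else -1

print(old_style, ternary)
```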
An explanation of the code is:
1. Initialize data to the empty dictionary.
2. Loop through the key-value pairs in the dictionary tmpdata, calling the key title and the value d.
   a. Add a new key-value pair to the data dictionary whose key is title and whose value is a list built as follows: for each x in some (global?) list xs, the value d[x][statid] if x is in d and that value is truthy, otherwise 0.
   b. Overwrite the last cell of this new list with maxs[statid].
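The steps above can be tried end to end with some made-up inputs. Everything assigned before the loop here (xs, statid, maxs, tmpdata) is hypothetical sample data, and the comprehension uses the conditional-expression form.

```python
# Hypothetical inputs, invented for illustration only.
xs = ["2020-01-01", "2020-01-02", "2020-01-03"]
statid = "hits"
maxs = {"hits": 99}
tmpdata = {
    "page_a": {"2020-01-01": {"hits": 5}, "2020-01-03": {"hits": 7}},
    "page_b": {"2020-01-02": {"hits": 3}},
}

data = {}
for title, d in tmpdata.items():
    # One entry per date in xs: the stat if the date is present, else 0.
    data[title] = [d[x][statid] if x in d else 0 for x in xs]
    # The last entry is then replaced by the maximum for this stat.
    data[title][-1] = maxs[statid]

print(data)
```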
There are some interesting pythonic structures here - list comprehensions and the and/or form of the conditional expression.
# Data initialization
data = {}
# Loop over all the items of the dictionary tmpdata;
# title gets the key and d the value of the current item
for title, d in tmpdata.items():
    # data[title] = a list with one entry for each x in the list xs:
    # d[x][statid] if x is in d and that value is truthy, otherwise 0
    data[title] = [x in d and d[x][statid] or 0 for x in xs]
    # Assign the value maxs[statid] to the last cell of that list
    data[title][-1] = maxs[statid]