List comprehension if/else and for iteration order - python

I have a CSV file that contains book chapters either as single chapters or chapter ranges delimited by commas, e.g. 1,2,4-6,12.
Given this input '1,2,4-6,12', I want a list ['1','2','4','5','6','12'] as output
Something along the lines of:
chps=[str(x) for x in chp_range(entry)) if '-' in entry else entry for entry in chapters.split(',') ]
which doesn't work.
Function chp_range('4-6') returns a range(4,6) object.
I've tried a lot of variations, but still haven't been able to get the order of conditionals and iteration right. How can I get this code to work?

If it has to be a one-liner, this should work:
>>> [str(x) for c in chapters.split(",") for x in range(int(c.split("-")[0]), int(c.split("-")[-1])+1)]
['1', '2', '4', '5', '6', '12']
You can't conditionally nest your comprehensions, so your chp_range function is of little value when used in a comprehension.

Don't use a list comprehension for its own sake; use one only if it's actually easier or more readable:
lst = []
for x in s.split(','):
    if '-' in x:
        start, end = x.split('-')
        lst.extend([str(i) for i in range(int(start), int(end)+1)])
    else:
        lst.append(x)

What you were trying to do:
chps = [str(x)
for entry in chapters.split(',')
for x in (chp_range(entry) if '-' in entry else [entry])]
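For completeness, here is a minimal runnable sketch of the comprehension above. It assumes a hypothetical chp_range that returns an inclusive range of ints (the question only says it returns a range object), so the exact helper in the asker's code may differ.

```python
def chp_range(entry):
    """Hypothetical helper: expand 'a-b' into an inclusive range of ints."""
    start, end = map(int, entry.split('-'))
    return range(start, end + 1)

chapters = '1,2,4-6,12'

# Wrapping a single chapter in a one-element list lets the inner "for"
# iterate over both cases in the same way.
chps = [str(x)
        for entry in chapters.split(',')
        for x in (chp_range(entry) if '-' in entry else [entry])]
print(chps)
```

The conditional expression chooses what to iterate over, not whether to iterate, which is why it sits inside the parentheses of the second for clause.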

from itertools import chain
def chp_range(entry):
    x, y = map(int, entry.split('-'))
    return map(str, range(x, y+1))
# wrap single chapters in a list so chain() does not split '12' into '1' and '2'
chps = [
    chp_range(entry) if '-' in entry else [entry] for entry in chapters.split(',')]
list(chain(*chps))

You can use extend for a range as follows (note that this produces integers rather than strings):
string = '1,2,4-6,12'
string = string.split(',')
chapters = []
for i in string:
    if '-' in i:
        a,b = i.split('-')
        chapters.extend(range(int(a),int(b)+1))
    else:
        chapters.append(int(i))
print(chapters)

Not a one-liner, but here is a recursive solution if you are interested.
a = '1,2,4-6,12'
# split on '-' so that multi-digit bounds such as '10-12' also work
inp = [num if '-' not in num else [str(i) for i in range(int(num.split('-')[0]),int(num.split('-')[-1])+1)] for num in a.split(',') ]
ans = []
def flatten(inp,ans):
    while inp:
        tmp = inp.pop()
        if type(tmp) == str:
            ans.append(tmp)
        else:
            flatten(tmp,ans)
    return
flatten(inp,ans)
ans[::-1]
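As a variation on the recursive idea, a generator can flatten the nested list without popping from the input or reversing the result. This is only a sketch; the name flatten_chapters is made up for illustration.

```python
def flatten_chapters(items):
    """Yield strings from a list that may contain nested lists of strings."""
    for item in items:
        if isinstance(item, str):
            yield item
        else:
            # Recurse into the nested list and pass its items through in order.
            yield from flatten_chapters(item)

nested = ['1', '2', ['4', '5', '6'], '12']
flat = list(flatten_chapters(nested))
print(flat)
```

Because the generator walks the input front to back, the original order is preserved and the input list is left unchanged.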

Related

Python: Find all integers inside string and store in list as an integer

Looking for an efficient way to search for all integers in a string and append them to a list. E.g. '(12, 15)' should become [12, 15]. Integers that are greater than 9, should remain joined and not separated when appended to the list.
If there is a way to use built-in functions, lambda or list comprehension, could you share those specifically? Thanks.
What I have so far seems too bloated.
user_input = '(3, 10)' # or '3 10'
def sti(n):
    s = ''
    l = []
    for index, item in enumerate(n):
        if item.isdigit():
            s += item
        if not item.isdigit():
            l.append(s)
            s = ''
    l.append(s)
    a = list(filter(None, l)) # remove empty strings
    a = list(map(lambda x: int(x), a)) # convert to int
    return a
print(sti(user_input))
Use regular expressions:
import re
print(list(map(int, re.findall(r'\d+', user_input))))
If
new_string = "lol69on420for666"
then you can do something like,
digits = []
for letter in new_string:
    if letter in "0123456789":
        digits.append(letter)
or
if "6" in new_string:
    digits.append("6")
Assuming there are no negative numbers, you could use itertools.groupby together with str.isdecimal:
>>> from operator import itemgetter
>>> from itertools import groupby
>>> list(map(int, map(''.join, map(itemgetter(1), filter(itemgetter(0), groupby('(3, 10)', str.isdecimal))))))
[3, 10]
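If negative numbers can occur, one possible approach is a regular expression with an optional leading minus sign. The sketch below uses a made-up helper name, find_ints, and assumes that a minus sign directly in front of digits always means a negative number, so '5-3' would be read as 5 and -3.

```python
import re

def find_ints(text):
    """Return all integers found in text, keeping multi-digit numbers whole."""
    # -? allows an optional minus sign directly before the digits.
    return [int(match) for match in re.findall(r'-?\d+', text)]

print(find_ints('(3, -10)'))
print(find_ints('3 10'))
```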
This is simpler and needs no imports (v is the string), but note that it converts each digit separately, so '(3, 10)' gives [3, 1, 0] rather than [3, 10]:
new_list = [int(item) for item in v if item.isdigit()]

While loop within for loop for list of lists

I'm trying to create a big list that will contain lists of strings. I iterate over the input list of strings and create a temporary list.
Input:
['Mike','Angela','Bill','\n','Robert','Pam','\n',...]
My desired output:
[['Mike','Angela','Bill'],['Robert','Pam']...]
What I get:
[['Mike','Angela','Bill'],['Angela','Bill'],['Bill']...]
Code:
for i in range(0,len(temp)):
    temporary = []
    while(temp[i] != '\n' and i<len(temp)-1):
        temporary.append(temp[i])
        i+=1
    bigList.append(temporary)
Use itertools.groupby
from itertools import groupby
names = ['Mike','Angela','Bill','\n','Robert','Pam']
[list(g) for k,g in groupby(names, lambda x:x=='\n') if not k]
#[['Mike', 'Angela', 'Bill'], ['Robert', 'Pam']]
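To see why the "if not k" filter is needed, it can help to look at what groupby produces before filtering. The sketch below just materialises each (key, group) pair; the variable name pairs is chosen here for illustration.

```python
from itertools import groupby

names = ['Mike', 'Angela', 'Bill', '\n', 'Robert', 'Pam']

# groupby yields one (key, group) pair per run of consecutive items
# that share the same key; here the key is True for '\n' separators.
pairs = [(key, list(group)) for key, group in groupby(names, lambda x: x == '\n')]
for key, group in pairs:
    print(key, group)
```

Runs of names come out with key False and runs of separators with key True, so keeping only the groups where the key is false leaves the name lists.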
Fixing your code, I'd recommend iterating over each element directly, appending to a nested list -
r = [[]]
for i in temp:
    if i.strip():
        r[-1].append(i)
    else:
        r.append([])
Note that if temp ends with a newline, r will have a trailing empty [] list. You can get rid of that though:
if not r[-1]:
    del r[-1]
Another option would be itertools.groupby, which another answer has already mentioned, although your loop-based method should be more performant.
Your for loop was scanning over temp just fine, but the inner while loop was also advancing that index. On the next iteration, the for loop reset the index to its own next value, so the while loop walked over the same elements again. This caused the repetition.
temp = ['mike','angela','bill','\n','robert','pam','\n','liz','anya','\n']
# !make sure to include this '\n' at the end of temp!
bigList = []
temporary = []
for i in range(0,len(temp)):
    if(temp[i] != '\n'):
        temporary.append(temp[i])
        print(temporary)
    else:
        print(temporary)
        bigList.append(temporary)
        temporary = []
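If the input may or may not end with '\n', a variant that flushes the last group after the loop avoids relying on the trailing separator. This is a sketch with a made-up function name, split_on_newlines, and it assumes empty groups (for example from two separators in a row) should be dropped.

```python
def split_on_newlines(items):
    """Split a flat list into sublists, using '\n' items as separators."""
    groups = []
    current = []
    for item in items:
        if item == '\n':
            # A separator closes the current group, if it has any items.
            if current:
                groups.append(current)
            current = []
        else:
            current.append(item)
    # Flush the last group when the input does not end with '\n'.
    if current:
        groups.append(current)
    return groups

print(split_on_newlines(['Mike', 'Angela', 'Bill', '\n', 'Robert', 'Pam']))
```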
You could try:
a_list = ['Mike','Angela','Bill','\n','Robert','Pam','\n']
result = []
start = 0
end = 0
for indx, name in enumerate(a_list):
    if name == '\n':
        end = indx
        sublist = a_list[start:end]
        if sublist:
            result.append(sublist)
        start = indx + 1
>>> result
[['Mike', 'Angela', 'Bill'], ['Robert', 'Pam']]

how to add new value in front of each elements in a list?

I have a question about adding a new value to existing elements in a list. For example, if I have
myList = ["0","12","221","3344"]
I set a condition where, if the length of an element in the list is smaller than 4, the program will add the value "0" to the front of that element. It should look like this
newList = ["0000","0012","0221","3344"]
So far I have written some example code like the one below
x = ["0","1"]
if len(x) < 4:
    x.insert(0,"0")
print(x)
The output is like this
["0","0","1"]
I've tried to add/change some line like below
x = ["0","1"]
for i in x:
    if len(i) < 4:
        i.insert(0,"0")
print(x)
but I got an error saying
'str' object has no attribute 'insert'.
Did I miss something here, or is there another way to do this?
Thank you for your answer.
For the particular case of adding 0s at the start of strings, you could use zfill:
>>> myList = ["0","12","221","3344"]
>>> [x.zfill(4) for x in myList]
['0000', '0012', '0221', '3344']
You could use the built-in map function to apply a lambda function to each item in the list as follows (in Python 3, map returns an iterator, so wrap it in list() to print the values):
myList = ['0', '12', '221', '3344']
answer = list(map(lambda x: '0'*(4-len(x))+x if len(x) < 4 else x, myList))
print(answer)
Output
['0000', '0012', '0221', '3344']
With a list comprehension:
>>> myList = ["0","12","221","3344","11111111111"]
>>> ['0'*(4 - len(x)) + x for x in myList]
['0000', '0012', '0221', '3344', '11111111111']
Note that '0'*y is the empty string if y is zero or negative; I added the last value to myList to show this. That's why you don't need an if/else in the comprehension.
edit: str.rjust is another option:
>>> [x.rjust(4, '0') for x in myList]
['0000', '0012', '0221', '3344', '11111111111']
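One difference between these options shows up only with signed values, which the question does not mention, so treat the "-5" entry below as an assumption added for illustration. zfill keeps a leading sign in front of the padding, while rjust and a format spec with a "0" fill pad blindly on the left.

```python
values = ["0", "12", "221", "3344", "-5"]

# zfill is sign-aware: the zeros go after a leading '+' or '-'.
with_zfill = [v.zfill(4) for v in values]

# rjust and the "0>4" format spec treat the sign like any other character.
with_rjust = [v.rjust(4, "0") for v in values]
with_format = [format(v, "0>4") for v in values]

print(with_zfill)
print(with_rjust)
print(with_format)
```

For plain digit strings like the ones in the question, all three give the same result.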
Here's a solution:
def paditem(item, length):
    # prepend zeros until the string reaches the requested length
    return ('0' * (length - len(item))) + item if len(item) < length else item
def padlist(somelist, length):
    return map(lambda x: paditem(x, length), somelist)
# Test Code:
myList = ["0","12","221","3344"]
results = padlist(myList, 4)
for result in results:
    print(result)

Filtering out sublist from list based on contents of entire sublist?

So here is what I have:
lst = [["111","101","000"],["1001","1100","1111"],["00","11","00"]]
And I want to filter out the sublists that contain only strings of "0"*len(string) and "1"*len(string). The result should look like this:
[["111","101","000"],["1001","1100","1111"]]
Break up the task into smaller parts. Then combine to get the solution:
# check that a string is all 0 or all 1
def check_string(s):
    size = len(s)
    return s in ('0'*size, '1'*size)
# check that a list contains only strings that satisfy check_string
def check_list(l):
    return all(check_string(s) for s in l)
lst = [["111","101","000"],["1001","1100","1111"],["00","11","00"]]
result = [l for l in lst if not check_list(l)]
Then we have
>>> print(result)
[['111', '101', '000'], ['1001', '1100', '1111']]
Here's one way to do it with regular expressions:
import re
[[y for y in x if not (re.match('1+$', y) or re.match('0+$', y))] for x in lst]
And here is a cleverer way, inspired by another answer:
[[y for y in x if not (y == len(y) * y[0])] for x in lst]
With a generator expression inside all():
lst = [x for x in lst if not all(y == y[0]*len(y) for y in x)]
Note: This is better than @Tum's answer because it treats each sublist as a whole (e.g., ["111","101","000"]) rather than individually accepting or rejecting each value (e.g., accepting "101" but rejecting "111" and "000", leaving ["101"]).
You can do so using the filter function as follows:
import re
orig_list = [["111","101","000"], ["1001","1100","1111"], ["00","11","00"]]
def checker(item):
    # keep the sublist as soon as one string is not all 1s or all 0s
    for idx in item:
        if not (re.search(r'^1*$', idx) or re.search(r'^0*$', idx)):
            return True
    return False
new_list = list(filter(checker, orig_list))
print(new_list)
Output:
[['111', '101', '000'], ['1001', '1100', '1111']]
One more solution:
[lst[j] for j in set([k for k, i in enumerate(lst) for m in i if m[0]*len(m) != m])]
Be careful with m[0] here: it raises an IndexError for an empty string. Decide what an empty string should mean in your case; you can exclude it as well.
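One way to sidestep the empty-string problem is to test the set of characters instead of indexing. The sketch below uses a made-up helper, is_uniform, and assumes an empty string should count as "not uniform", so a sublist containing one is kept; the question does not say which behaviour is wanted.

```python
def is_uniform(s):
    """True for a non-empty string made only of '0's or only of '1's."""
    # len(set(s)) is 0 for '', so s[0] is never evaluated on an empty string.
    return len(set(s)) == 1 and s[0] in "01"

lst = [["111", "101", "000"], ["1001", "1100", "1111"], ["00", "11", "00"]]

# Drop only the sublists in which every string is uniform.
result = [sub for sub in lst if not all(is_uniform(s) for s in sub)]
print(result)
```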

What is the most pythonic way to filter list from number?

I have this:
['SPRD', '60', 'p25']
I want to generate this:
['SPRD', 'p']
What is the most pythonic way to do that?
Thanks.
digits_stripped = (s.translate(None, '0123456789') for s in input_list)
without_blanks = [s for s in digits_stripped if s]
(Note that if you happen to be using a Python older than 2.6, you'll need to use string.maketrans('', '') instead of None as the first argument to translate().)
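The two-argument form of translate() shown above is specific to Python 2 byte strings. In Python 3, a rough equivalent builds a deletion table with str.maketrans; the variable name remove_digits below is chosen for illustration.

```python
import string

input_list = ['SPRD', '60', 'p25']

# The third argument of str.maketrans lists characters to delete.
remove_digits = str.maketrans('', '', string.digits)

digits_stripped = (s.translate(remove_digits) for s in input_list)
without_blanks = [s for s in digits_stripped if s]
print(without_blanks)
```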
In [25]: l = ['SPRD', '60', 'p25']
In [26]: filter(None,(s.translate(None,'1234567890') for s in l))
Out[26]: ['SPRD', 'p']
In [37]: l = []
In [38]: p = "[a-zA-Z]*"
In [39]: p1 = re.compile(p)
In [40]: for i in ['SPRD', '60', 'p25']:
   ....:     if p1.match(i):
   ....:         l.append(p1.match(i).group())
   ....:
In [41]: [x for x in l if x]
Out[41]: ['SPRD', 'p']
import re
filter(None, [re.sub(r"\d+", "", f) for f in input_list])
I believe this is quite pythonic, although it does require regex and is not as efficient as other answers. First, all digits are removed from the words, then any empty strings are removed from the list. (In Python 3, filter returns an iterator, so wrap the call in list() to get a list.)
Use a generator expression, a regular expression, and a list comprehension:
import re
[s for s in (re.sub("[0-9]*", "", s) for s in l) if s]
If your lists only contain strings and you only want to filter out numbers, this, however clunky, may work too:
def filter_text(x):
    try:
        int(x)
        return ''
    except ValueError:
        return x
l = ['SPRD', '60', 'p25']
newlist = []
for x in l:
    newlist.append(''.join([filter_text(y) for y in x]))
newlist = [x for x in newlist if x != '']
According to a previous answer, casting should be faster and "prettier" than regex or string operations.
The best way to do this then should look something like this:
def is_number(s):
    try:
        float(s)
        return True
    except ValueError:
        return False
#with list comprehension
def filter_number(seq):
    return [item for item in seq if not is_number(item)]
#with a generator
def filter_number_generator(seq):
    for item in seq:
        if not is_number(item):
            yield item
If your goal is to have a copy of the list in memory, you should use the list comprehension method. Now if you want to iterate over each non-number item in the list efficiently, the generator allows you to do something like:
for non_number in filter_number_generator(l):
    print(non_number)
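Note that a casting-based check drops only items that are numbers as a whole; it does not strip digits from mixed items. The self-contained sketch below repeats the is_number idea and shows that 'p25' survives unchanged, which differs from the ['SPRD', 'p'] output asked for in the question.

```python
def is_number(s):
    """True if the whole string can be parsed as a float."""
    try:
        float(s)
        return True
    except ValueError:
        return False

def filter_number_generator(seq):
    """Yield the items of seq that are not numbers as a whole."""
    for item in seq:
        if not is_number(item):
            yield item

l = ['SPRD', '60', 'p25']
kept = list(filter_number_generator(l))
print(kept)
```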
