I have to take a large list of words in the form:
['this\n', 'is\n', 'a\n', 'list\n', 'of\n', 'words\n']
and then using the strip function, turn it into:
['this', 'is', 'a', 'list', 'of', 'words']
I thought that what I had written would work, but I keep getting an error saying:
"'list' object has no attribute 'strip'"
Here is the code that I tried:
strip_list = []
for lengths in range(1,20):
    strip_list.append(0) #longest word in the text file is 20 characters long
for a in lines:
    strip_list.append(lines[a].strip())
You can either use a list comprehension
my_list = ['this\n', 'is\n', 'a\n', 'list\n', 'of\n', 'words\n']
stripped = [s.strip() for s in my_list]
or alternatively use map():
stripped = list(map(str.strip, my_list))
In Python 2, map() directly returned a list, so you didn't need the call to list. In Python 3, the list comprehension is more concise and generally considered more idiomatic.
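A minimal sketch of why the list() call matters in Python 3: map() returns a lazy iterator that can only be consumed once. The sample list and the variable names are only for illustration.

```python
my_list = ['this\n', 'is\n', 'a\n']

lazy = map(str.strip, my_list)  # a map object (an iterator) in Python 3
first_pass = list(lazy)         # consumes the iterator
second_pass = list(lazy)        # the iterator is already exhausted

print(first_pass)   # ['this', 'is', 'a']
print(second_pass)  # []
```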
list comprehension?
[x.strip() for x in lst]
You can use a list comprehension:
strip_list = [item.strip() for item in lines]
Or the map function:
# with a lambda (list() is needed in Python 3, where map() returns an iterator)
strip_list = list(map(lambda it: it.strip(), lines))
# without a lambda
strip_list = list(map(str.strip, lines))
This can be done using list comprehensions as defined in PEP 202
[w.strip() for w in ['this\n', 'is\n', 'a\n', 'list\n', 'of\n', 'words\n']]
All other answers, and mainly about list comprehension, are great. But just to explain your error:
strip_list = []
for lengths in range(1,20):
    strip_list.append(0) #longest word in the text file is 20 characters long
for a in lines:
    strip_list.append(lines[a].strip())
a is a member of your list, not an index. What you could write is this:
[...]
for a in lines:
    strip_list.append(a.strip())
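To make the element-versus-index distinction concrete, here is a small sketch. The sample `lines` list is an assumption standing in for whatever was read from the file.

```python
# Sample data standing in for the lines read from the text file (assumed).
lines = ['this\n', 'is\n', 'a\n']

# Iterating directly over a list yields its elements, not positions.
by_element = [a.strip() for a in lines]

# If a position is really needed, enumerate() yields (index, element) pairs.
by_index = []
for i, a in enumerate(lines):
    by_index.append((i, lines[i].strip()))

print(by_element)  # ['this', 'is', 'a']
print(by_index)    # [(0, 'this'), (1, 'is'), (2, 'a')]
```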
Another important comment: you can create a list pre-filled with default values this way:
strip_list = [0] * 20
But this is not useful here, because .append adds items to the end of the list, after the 20 zeros. There is no need to create a list with default values, as you build it item by item when appending the stripped strings.
So your code should be like:
strip_list = []
for a in lines:
    strip_list.append(a.strip())
But, for sure, the best one is this one, as this is exactly the same thing:
stripped = [line.strip() for line in lines]
In case you have something more complicated than just a .strip, put this in a function, and do the same. That's the most readable way to work with lists.
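As a sketch of that idea, assume the cleanup also needs to lowercase each word. The helper name `clean_word` and the sample data are hypothetical.

```python
def clean_word(raw):
    """Hypothetical helper: strip surrounding whitespace, then lowercase."""
    return raw.strip().lower()

# Sample input (assumed); the comprehension stays as readable as before.
lines = ['This\n', '  IS\n', 'a\n']
cleaned = [clean_word(line) for line in lines]

print(cleaned)  # ['this', 'is', 'a']
```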
If you need to remove just trailing whitespace, you could use str.rstrip(), which should be slightly more efficient than str.strip():
>>> lst = ['this\n', 'is\n', 'a\n', 'list\n', 'of\n', 'words\n']
>>> [x.rstrip() for x in lst]
['this', 'is', 'a', 'list', 'of', 'words']
>>> list(map(str.rstrip, lst))
['this', 'is', 'a', 'list', 'of', 'words']
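The two methods only give the same result when there is no leading whitespace to worry about. A short sketch with made-up sample strings shows where they differ, and how passing '\n' restricts rstrip() to newlines only:

```python
lst = ['  this\n', 'is \n', '\ta\n']  # sample strings with extra whitespace

both_sides = [x.strip() for x in lst]         # removes leading and trailing whitespace
right_only = [x.rstrip() for x in lst]        # removes trailing whitespace only
newline_only = [x.rstrip('\n') for x in lst]  # removes trailing newlines only

print(both_sides)    # ['this', 'is', 'a']
print(right_only)    # ['  this', 'is', '\ta']
print(newline_only)  # ['  this', 'is ', '\ta']
```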
my_list = ['this\n', 'is\n', 'a\n', 'list\n', 'of\n', 'words\n']
print([l.strip() for l in my_list])
Output:
['this', 'is', 'a', 'list', 'of', 'words']
Related
I have a big text file like this (one word per line, with no blank lines in between):
this
is
my
text
and
it
should
be
awesome
.
And I have also a list like this:
index_list = [[1,2,3,4,5],[6,7,8],[9,10]]
Now I want to replace every element of each list with the corresponding index line of my text file, so the expected answer would be:
new_list = [[this, is, my, text, and],[it, should, be],[awesome, .]]
I tried a nasty workaround with two for loops with a range function that was way too complicated (so I thought). Then I tried it with linecache.getline, but that also has some issues:
import linecache
new_list = []
for l in index_list:
    for j in l:
        new_list.append(linecache.getline('text_list', j))
This produces only one big flat list, which I don't want. Also, every word ends with an unwanted \n, which I do not get when I open the file with b = open('text_list', 'r').read().splitlines(), but I don't know how to use that when building the new list, so that I don't get [['this\n' ,'is\n' , etc...
You are very close. Just use a temp list and then append that to the main list. You can also use str.strip to remove the newline character.
Ex:
import linecache
new_list = []
index_list = [[1,2,3,4,5],[6,7,8],[9,10]]
for l in index_list:
    temp = [] #Temp List
    for j in l:
        temp.append(linecache.getline('text_list', j).strip())
    new_list.append(temp) #Append to main list.
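If the file fits in memory, another option is to read it once with splitlines(), as mentioned in the question, and index into the result. This is only a sketch: the temporary file and its contents are assumptions made so the example is self-contained.

```python
import os
import tempfile

# Write a small sample file so the sketch runs on its own (assumed content).
words = ['this', 'is', 'my', 'text', 'and', 'it', 'should', 'be', 'awesome', '.']
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as handle:
    handle.write('\n'.join(words) + '\n')
    path = handle.name

index_list = [[1, 2, 3, 4, 5], [6, 7, 8], [9, 10]]

# Read the file once; splitlines() drops the trailing newlines.
with open(path) as source:
    file_lines = source.read().splitlines()
os.remove(path)

# Line numbers in index_list are 1-based, list indices are 0-based.
new_list = [[file_lines[j - 1] for j in group] for group in index_list]
print(new_list)
```

Compared with linecache, this keeps the whole file in a list, which is usually fine for a word list but may not suit very large files.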
You could use iter to do this, as long as your text_list has exactly as many elements as sum(map(len, index_list)):
text_list = ['this', 'is', 'my', 'text', 'and', 'it', 'should', 'be', 'awesome', '.']
index_list = [[1,2,3,4,5],[6,7,8],[9,10]]
text_list_iter = iter(text_list)
texts = [[next(text_list_iter) for _ in index] for index in index_list]
Output
[['this', 'is', 'my', 'text', 'and'], ['it', 'should', 'be'], ['awesome', '.']]
But I am not sure if this is what you wanted to do. Maybe I am assuming some sort of ordering of index_list. The other answer I can think of is this list comprehension
texts_ = [[text_list[i-1] for i in l] for l in index_list]
Output
[['this', 'is', 'my', 'text', 'and'], ['it', 'should', 'be'], ['awesome', '.']]
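The two approaches only agree when index_list counts straight through the file. A sketch with a hypothetical, non-sequential index_list shows the difference: the iter version ignores the index values and only uses the group sizes.

```python
text_list = ['this', 'is', 'my', 'text', 'and', 'it', 'should', 'be', 'awesome', '.']
index_list = [[9, 10], [1, 2, 3]]  # hypothetical, non-sequential line numbers

# iter-based: takes items in file order, one per index, whatever the index says.
text_iter = iter(text_list)
by_position = [[next(text_iter) for _ in group] for group in index_list]

# index-based: looks up each 1-based line number explicitly.
by_index = [[text_list[i - 1] for i in group] for group in index_list]

print(by_position)  # [['this', 'is'], ['my', 'text', 'and']]
print(by_index)     # [['awesome', '.'], ['this', 'is', 'my']]
```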
I have a list that consists of both words and digits. Let's say:
list1 = ['1','100', 'Stack', 'over','flow']
From this list I would like to filter all the digits and keep the words. I have imported re and found the re code for it, namely:
[^0-9]
However, I am not sure how to implement this so that I get a list like below.
result = ['Stack', 'over', 'flow']
No need for regex, use isdigit():
list1 = ['1','100', 'Stack', 'over','flow']
print([i for i in list1 if not i.isdigit()])
returns :
['Stack', 'over', 'flow']
use list-comprehension and string method isdigit:
[elem for elem in list1 if not elem.isdigit()]
You can do this quite nicely with list comprehension:
list1 = ['1','100', 'Stack', 'over','flow']
list2 = [i for i in list1 if not i.isdigit()]
If, for whatever reason, you did want to use regex to do this (maybe you have more complex filtering criteria), you could do it using something like this:
import re
list1 = ['1','100', 'Stack', 'over','flow']
list2 = [i for i in list1 if re.fullmatch('[^0-9]+', i)]
Using filter + lambda:
list(filter(lambda x: not x.isdigit(), list1))
# ['Stack', 'over', 'flow']
As other answers suggested, you don't really need regexes, but they can be more flexible if your requirements change in the future. For example:
from re import match
list1 = ['1','100', 'Stack', 'over','flow']
result = list(filter(lambda el: match(r'^[^0-9]*$', el), list1))
^: start of the string
[...]: character group
^: negates the character group
0-9: digits 0-9 (you could use \d as well)
*: zero or more times
$: end of the string
If you want all elements that don't start with a number, use ^[^0-9].* where . is any character.
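A quick sketch comparing the two patterns on a slightly extended sample list (the entries 'a1' and '' are added here for illustration). Note that because of the *, the first pattern also accepts the empty string.

```python
import re

list1 = ['1', '100', 'Stack', 'over', 'flow', 'a1', '']

# No digit anywhere in the string (the empty string also matches).
no_digits_anywhere = [s for s in list1 if re.match(r'^[^0-9]*$', s)]

# Only the first character must be a non-digit.
not_starting_with_digit = [s for s in list1 if re.match(r'^[^0-9].*', s)]

print(no_digits_anywhere)       # ['Stack', 'over', 'flow', '']
print(not_starting_with_digit)  # ['Stack', 'over', 'flow', 'a1']
```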
I don't know exact pattern of your list element but this code should work for given example
import re
pattern = re.compile("([A-Za-z])")
list1 = ['1','100', 'Stack', 'over','flow']
result = []
for x in list1:
    check = pattern.match(x)
    if check is not None:
        result.append(x)
print (result)
#python 3
olist = list(filter(lambda s: s.isalpha() , list1))
print(olist) # ['Stack', 'over', 'flow']
#python2
olist = filter(lambda s:s.isalpha(), list1)
print olist # ['Stack', 'over', 'flow']
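Keep in mind that isalpha() and "not isdigit()" are not the same filter. They agree on the example list, but differ on mixed entries. A sketch with two extra, made-up entries:

```python
list1 = ['1', '100', 'Stack', 'over', 'flow', 'abc123', 'two words']

# Keeps everything that is not made up purely of digits.
not_digits = [s for s in list1 if not s.isdigit()]

# Keeps only strings made up purely of letters (no digits, no spaces).
only_letters = [s for s in list1 if s.isalpha()]

print(not_digits)    # ['Stack', 'over', 'flow', 'abc123', 'two words']
print(only_letters)  # ['Stack', 'over', 'flow']
```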
I am trying to remove specific characters from items in a list, using another list as a reference. Currently I have:
forbiddenList = ["a", "i"]
tempList = ["this", "is", "a", "test"]
sentenceList = [s.replace(items.forbiddenList, '') for s in tempList]
print(sentenceList)
which I hoped would print:
["ths", "s", "test"]
Of course, the forbidden list is quite small and I could replace each item individually, but I would like to know how to do this "properly" for when I have an extensive list of "forbidden" items.
You could use a nested list comprehension.
>>> [''.join(j for j in i if j not in forbiddenList) for i in tempList]
['ths', 's', '', 'test']
It seems like you also want to remove elements if they become empty (that is, if all of their characters were in forbiddenList). If so, you can wrap the whole thing in yet another list comprehension (at the expense of readability):
>>> [s for s in [''.join(j for j in i if j not in forbiddenList) for i in tempList] if s]
['ths', 's', 'test']
>>> templist = ['this', 'is', 'a', 'test']
>>> forbiddenlist = ['a', 'i']
>>> trans = str.maketrans('', '', ''.join(forbiddenlist))
>>> [w for w in (w.translate(trans) for w in templist) if w]
['ths', 's', 'test']
This is a Python 3 solution using str.translate and str.maketrans. It should be fast.
You can also do this in Python 2, but the interface for str.translate is slightly different:
>>> templist = ['this', 'is', 'a', 'test']
>>> forbiddenlist = ['a', 'i']
>>> [w for w in (w.translate(None, ''.join(forbiddenlist))
... for w in templist) if w]
['ths', 's', 'test']
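Both translate versions work on single characters. If the forbidden items can be longer substrings, one possible approach (an assumption beyond the original question) is a regular expression built from the escaped items. The `forbidden` entries below are hypothetical.

```python
import re

temp_list = ['this', 'is', 'a', 'test']
forbidden = ['is', 'a']  # hypothetical multi-character entries

# Escape each entry so regex metacharacters are treated literally, and try
# longer entries first so they win over entries that are their prefixes.
ordered = sorted(forbidden, key=len, reverse=True)
pattern = re.compile('|'.join(re.escape(item) for item in ordered))

cleaned = [pattern.sub('', word) for word in temp_list]
non_empty = [word for word in cleaned if word]

print(cleaned)    # ['th', '', '', 'test']
print(non_empty)  # ['th', 'test']
```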
I'm new to programming in Python (and programming in general) and we were asked to develop a function to encrypt a string by rearranging the text. We were given this as a test:
encrypt('THE PRICE OF FREEDOM IS ETERNAL VIGILENCE', 5)
'SI MODEERF FO ECIRP EHT ECNELIGIV LANRETE'
We have to make sure it works for any string of any length though. I got as far as this before getting stuck:
##Define encrypt
def encrypt(text, encrypt_value):
    ##Split string into list
    text_list = text.split()
    ##group text_list according to encrypt_value
    split_list = [text_list[index:index+encrypt_value] for index in xrange\
    (0, len(text_list), encrypt_value)]
If I printed the result now, this would give me:
encrypt("I got a jar of dirt and you don't HA", 3)
[['I', 'got', 'a'], ['jar', 'of', 'dirt'], ['and', 'you', "don't"], ['HA']]
So I need to combine each of the lists in the list into a string (which I think is ' '.join(text)?), reverse it with [::-1], before joining the whole thing together into one string. But how in the world do I do that?
To combine your elements, you can try using reduce (on Python 3, import it first with from functools import reduce):
l = [['I', 'got', 'a'], ['jar', 'of', 'dirt'], ['and', 'you', "don't"], ['HA']]
joined = reduce(lambda prev,cur: prev+' '+reduce(lambda subprev,word: subprev+' '+word,cur, ''), l, '')
It will result in:
"  I got a  jar of dirt  and you don't  HA"
If you want to remove extra spaces:
joined.replace('  ',' ').strip()
This use of reduce can easily be modified to reverse each sublist right before combining its elements:
joined = reduce(lambda prev,cur: prev+' '+reduce(lambda subprev,word: subprev+' '+word,cur[::-1], ''), l, '')
Or to reverse the combined substrings just before joining them all together:
joined = reduce(lambda prev,cur: prev+' '+reduce(lambda subprev,word: subprev+' '+word,cur, '')[::-1], l, '')
You can do what you're looking for fairly simply with a few nested list comprehensions.
For example, you already have
split_list = [['I', 'got', 'a'], ['jar', 'of', 'dirt'], ['and', 'you', "don't"], ['HA']]
What you want now is to reverse each triplet of words with a list comprehension, e.g. like so:
reversed_sublists = [sublist[::-1] for sublist in split_list]
# [['a', 'got', 'I'], ['dirt', 'of', 'jar'], ["don't", 'you', 'and'], ['HA']]
Then reverse each string in each sublist
reversed_strings = [[substr[::-1] for substr in sublist] for sublist in reversed_sublists]
# [['a', 'tog', 'I'], ['trid', 'fo', 'raj'], ["t'nod", 'uoy', 'dna'], ['AH']]
And then join them all up, as you said, with ' '.join(), e.g.
' '.join([' '.join(sublist) for sublist in reversed_strings])
# "a tog I trid fo raj t'nod uoy dna AH"
But nothing says you can't just do all those things at the same time with some nesting:
' '.join([' '.join([substring[::-1] for substring in sublist[::-1]]) for sublist in split_list])
# "a tog I trid fo raj t'nod uoy dna AH"
I personally prefer the aesthetic of this (and the fact you don't need to go back to strip spaces), but I'm not sure whether it performs better than Pablo's solution.
b = [['I', 'got', 'a'], ['jar', 'of', 'dirt'], ['and', 'you', "don't"], ['HA']]
print "".join([j[::-1]+' ' for i in b for j in reversed(i)])
a tog I trid fo raj t'nod uoy dna AH
Is this what you wanted...
Is there any reason you are trying to do it in one list comprehension?
It's probably easier to conceptualize (and implement) by breaking it down into parts:
def encrypt(text, encrypt_value):
    reversed_words = [w[::-1] for w in text.split()]
    rearranged_words = reversed_words[encrypt_value:] + reversed_words[:encrypt_value]
    return ' '.join(rearranged_words[::-1])
Example output:
In [6]: encrypt('THE PRICE OF FREEDOM IS ETERNAL VIGILENCE', 5)
Out[6]: 'SI MODEERF FO ECIRP EHT ECNELIGIV LANRETE'
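For comparison, here is a sketch that follows the plan from the question more literally: group the words into chunks of encrypt_value, join each chunk, reverse it, and join the chunks. The name `encrypt_chunks` is hypothetical, and this assumes that chunk-wise reversal is what the assignment intends. It gives the same result on the test string above; on inputs with more than two chunks the two versions can differ.

```python
def encrypt_chunks(text, encrypt_value):
    """Reverse the text chunk by chunk, encrypt_value words at a time."""
    words = text.split()
    # Group the words into consecutive chunks of encrypt_value words.
    chunks = [words[i:i + encrypt_value]
              for i in range(0, len(words), encrypt_value)]
    # Join each chunk into one string, reverse that string, then join all chunks.
    return ' '.join(' '.join(chunk)[::-1] for chunk in chunks)

print(encrypt_chunks('THE PRICE OF FREEDOM IS ETERNAL VIGILENCE', 5))
# SI MODEERF FO ECIRP EHT ECNELIGIV LANRETE
print(encrypt_chunks("I got a jar of dirt and you don't HA", 3))
# a tog I trid fo raj t'nod uoy dna AH
```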
I have a list of curse words that I want to match against another list, removing any matches. I normally use list.remove('entry') for individual entries, but looping through one list of entries, checking it against another list, and then removing the matches has me stumped. Any ideas?
Using filter:
>>> words = ['there', 'was', 'a', 'ffff', 'time', 'ssss']
>>> curses = set(['ffff', 'ssss'])
>>> filter(lambda x: x not in curses, words)
['there', 'was', 'a', 'time']
>>>
It could also be done with list comprehension:
>>> [x for x in words if x not in curses]
Use sets.
a=set(["cat","dog","budgie"])
b=set(["horse","budgie","donkey"])
a-b
->set(['dog', 'cat'])
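Note that a set difference discards the original order and any duplicates, which may or may not matter for your word list. A sketch of the difference, using a sample list with one repeated word added for illustration:

```python
words = ['there', 'was', 'a', 'ffff', 'time', 'there', 'ssss']
curses = {'ffff', 'ssss'}

# List comprehension: keeps order and duplicates; set lookup keeps it fast.
kept_in_order = [w for w in words if w not in curses]

# Set difference: unordered, and each word appears at most once.
as_set = set(words) - curses

print(kept_in_order)  # ['there', 'was', 'a', 'time', 'there']
```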