This should be an easy one but I have simply not come to a solution.
This is the exercise:
Start with 4 words “comfortable”, “round”, “support”, “machinery”, return a list of all possible 2 word combinations.
Example: ["comfortable round", "comfortable support", "comfortable machinery", ...]
I have started coding a loop that would go through every element, starting with the element at index[0] :
words = ["comfortable, ", 'round, ', 'support, ', 'machinery, ']
index_zero= words[0]
for i in words:
    words = index_zero + i
    words_one = index_one + i
    print(words)
>>> Output=
comfortable, comfortable,
comfortable, round,
comfortable, support,
comfortable, machinery
The issue is when I want to start iterating from the 2nd element ('round'). I have tried operating the indexes (index[0] + 1) but of course, it won't return anything as the elements are strings.
I know a conversion from string to indexes needs to take place, but I'm not sure how.
I have also tried defining a function, but it will return None
word_list = ["comfortable, ", 'round, ', 'support, ', 'machinery, ']
index_change = word_list[0]+ 1
def word_variations(set_of_words):
    for i in set_of_words:
        set_of_words = set_of_words[0] + i
set_of_words = word_variations(word_list)
print(set_of_words)
I think this would do what you're looking for:
def word_variations(word_list):
    combinations = []
    for first_word in word_list:
        for second_word in word_list:
            if first_word != second_word:
                combinations.append(f'{first_word}, {second_word}')
    return combinations
word_list = ["comfortable", "round", "support", "machinery"]
print(word_variations(word_list))
Explanation:
You need to include a return statement at the end of the function to return a value. In my example function word_variations(), I first define an empty list called combinations. This will store each combination we compute. Then I iterate through all the words in the input word_list, create another inner loop to iterate through all words again, and if the first_word does not equal the second_word append the combination to my combinations list. Once all loops are complete, return the finished list from the function.
If I slightly change the code to print each of the results on a new line:
def word_variations(word_list):
    combinations = []
    for first_word in word_list:
        for second_word in word_list:
            if first_word != second_word:
                combinations.append(f'{first_word}, {second_word}')
    return combinations
word_list = ["comfortable", "round", "support", "machinery"]
for combo in word_variations(word_list):
    print(combo)
the output is:
comfortable, round
comfortable, support
comfortable, machinery
round, comfortable
round, support
round, machinery
support, comfortable
support, round
support, machinery
machinery, comfortable
machinery, round
machinery, support
If you want to work with indexes in a Python loop like that, you should use either enumerate or iterate over the length of the list. The following examples will start the loop at the second element.
Example getting both index and the word at once with enumerate:
for i, word in enumerate(set_of_words[1:], start=1):  # start=1 keeps i aligned with the original list
Example using only indexes:
for i in range(1, len(set_of_words)):
Note: set_of_words[1:] above is a slice that returns the list starting at the second element.
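As a rough sketch of the two loop styles above (the sample list is only an illustration): enumerate() counts from 0 by default, so start=1 is passed here to keep the index in step with the original list when looping over a slice.

```python
set_of_words = ["comfortable", "round", "support", "machinery"]

# enumerate over a slice; start=1 makes i match the position in set_of_words
via_enumerate = [(i, word) for i, word in enumerate(set_of_words[1:], start=1)]

# plain index loop beginning at the second element
via_range = [(i, set_of_words[i]) for i in range(1, len(set_of_words))]

print(via_enumerate)
print(via_range)
```

Both loops visit the same (index, word) pairs, so which one to use is mostly a matter of taste.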
You can also use itertools.permutations() like this
from itertools import permutations
lst = ['comfortable', 'round', 'support', 'machinery']
for i in permutations(lst, 2):
    print(i)
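A small sketch of how this could produce the strings from the exercise, assuming a space is the wanted separator. permutations() treats "round comfortable" and "comfortable round" as different results; if order should not matter, itertools.combinations() gives each pair once.

```python
from itertools import combinations, permutations

word_list = ["comfortable", "round", "support", "machinery"]

# ordered pairs: every word followed by every other word
ordered_pairs = [" ".join(pair) for pair in permutations(word_list, 2)]

# unordered pairs: each pair of words appears only once
unordered_pairs = [" ".join(pair) for pair in combinations(word_list, 2)]

print(ordered_pairs)
print(unordered_pairs)
```

With four words this gives 12 ordered pairs and 6 unordered ones.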
This is my code, but it doesn't work. It should read text from the console, split it into words and distribute them into 3 lists and use separators between them.
words = list(map(str, input().split(" ")))
lowercase_words = []
uppercase_words = []
mixedcase_words = []
def split_symbols(list):
    from operator import methodcaller
    list = words
    map(methodcaller(str,"split"," ",",",":",";",".","!","( )","","'","\\","/","[ ]","space"))
    return list
for word in words:
    if words[word] == word.lower():
        words[word] = lowercase_words
    elif words[word] == word.upper():
        words[word] = uppercase_words
    else:
        words[word] = mixedcase_words
print(f"Lower case: {split_symbols(lowercase_words)}")
print(f"Upper case: {split_symbols(uppercase_words)}")
print(f"Mixed case: {split_symbols(mixedcase_words)}")
There are several issues in your code.
1) words is a list and word is a string, so words[word] tries to index the list with a string, which raises a TypeError. List indexes must be integers. In this case, you don't even need indexes.
2) To check for lower or upper case you can compare word == word.lower() or word == word.upper(). Another approach is to use the islower() or isupper() string methods, which return a boolean (note that they return False for strings with no letters, such as "123").
3) You are trying to assign an empty list to that element of the list. What you want is to append the word to the matching list, e.g. lowercase_words.append(word), and likewise for uppercase and mixed case.
So, to fix these issues you can write the code like this -
for word in words:
    if word == word.lower(): # like word.islower() for words containing letters
        lowercase_words.append(word)
    elif word == word.upper(): # like word.isupper() for words containing letters
        uppercase_words.append(word)
    else:
        mixedcase_words.append(word)
My advice would be to refrain from naming variables after built-ins such as list. Also, in split_symbols() you are assigning words to list. I think you meant it the other way around.
Now I am not sure about the "use separators between them" part of the question. But the line map(methodcaller(str,"split"," ",",",":",";",".","!","( )","","'","\\","/","[ ]","space")) is definitely wrong. map() takes a function and an iterable. In your code the iterable part is absent, and I think this is where the input param list fits in. So, it may be something like -
map(methodcaller("split"," "), list)
But then again, I am not sure what you are trying to achieve with that many separators.
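If the intent was to split the input on any of those separator characters, one possible sketch uses re.split() with a character class. This is only a guess at the requirement; the function name classify_words and the exact separator set are assumptions, not something stated in the question.

```python
import re

def classify_words(text, separators=r"[ ,:;.!()\[\]'\\/]+"):
    # Split on any run of separator characters and drop empty pieces.
    words = [w for w in re.split(separators, text) if w]
    lowercase_words, uppercase_words, mixedcase_words = [], [], []
    for word in words:
        if word == word.lower():
            lowercase_words.append(word)
        elif word == word.upper():
            uppercase_words.append(word)
        else:
            mixedcase_words.append(word)
    return lowercase_words, uppercase_words, mixedcase_words

lower, upper, mixed = classify_words("hello, WORLD; Mixed.case")
print(f"Lower case: {lower}")
print(f"Upper case: {upper}")
print(f"Mixed case: {mixed}")
```

Splitting once up front keeps the classification loop identical to the corrected loop above.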
I'm sure I am missing something obvious here, but I have been staring at this code for a while and cannot find the root of the problem.
I want to search through many strings, find all the occurrences of certain keywords, and for each of these hits, to retrieve (and save) the two words immediately preceding and following the keywords.
So far the code I have finds those words, but when there is more than one occurrence of the keyword in a string, the code returns two different lists. How can I aggregate those lists at the observation/string level (so that I can match it back to string i)?
Here is a mock example of a sample and desired results. Keyword is "not":
review_list=['I like this book.', 'I do not like this novel, no, I do not.']
results= [[], ['I do not like this I do not']]
Current results (using code below) would be:
results = [[], ['I do not like this'], ['I do not']]
Here is the code (simplified version):
for i in review_list:
    if (" not " or " neither ") in i:
        z = i.split(' ')
        for x in [x for (x, y) in enumerate(z) if find_not in y]:
            neg_1=[(' '.join(z[max(x-numwords,0):x+numwords+1]))]
            neg1.append(neg_1)
    elif (" not " or " neither ") not in i:
        neg_1=[]
        neg1.append(neg_1)
Again, I am certain this is basic, but as a new Python user, any help will be greatly appreciated. Thanks!
To extract only words (removing punctuation), e.g. from a string such as
'I do not like this novel, no, I do not.'
I recommend regular expressions:
import re
words = re.findall(r'\w+', somestring)
To find all indices at which one word equals not:
indices = [i for i, w in enumerate(words) if w=='not']
To get the two previous and two following words as well, I recommend a set to remove duplicates:
allindx = set()
for i in indices:
    for j in range(max(0, i-2), min(i+3, len(words))):
        allindx.add(j)
and finally to get all the words in question into a space-joined string:
result = ' '.join(words[i] for i in sorted(allindx))
Now of course we can put all these tidbits together into a function...:
import re
def twoeachside(somestring, keyword):
    words = re.findall(r'\w+', somestring)
    # positions of every word equal to the keyword
    indices = [i for i, w in enumerate(words) if w == keyword]
    allindx = set()
    for i in indices:
        # two words on either side, clamped to the bounds of the list
        for j in range(max(0, i-2), min(i+3, len(words))):
            allindx.add(j)
    result = ' '.join(words[i] for i in sorted(allindx))
    return result
Of course, this function works on a single sentence. To make a list of results from a list of sentences:
review_list = ['I like this book.', 'I do not like this novel, no, I do not.']
results = [twoeachside(s, 'not') for s in review_list]
assert results == ['', 'I do not like this I do not']
the last assert of course just being a check of what the function returns: one string per sentence, empty when the keyword is absent :-)
EDIT: actually judging from the example you somewhat absurdly require the results' items to be lists with a single string item if non-empty but empty lists if the string in them would be empty. This absolutely weird spec can of course also be met...:
results = [twoeachside(s, 'not') for s in review_list]
results = [[s] if s else [] for s in results]
it just makes no sense whatsoever, but hey!, it's your spec!-)
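Putting the pieces together, a self-contained sketch might look like the following. The name context_words and the width parameter are made up for illustration; the logic is the same set-of-indices idea as above, with the list wrapping folded into the function.

```python
import re

def context_words(sentence, keyword, width=2):
    words = re.findall(r'\w+', sentence)
    keep = set()
    for i, w in enumerate(words):
        if w == keyword:
            # indices of the keyword and `width` words on either side
            keep.update(range(max(0, i - width), min(i + width + 1, len(words))))
    joined = ' '.join(words[i] for i in sorted(keep))
    # one-item list when something was found, empty list otherwise
    return [joined] if joined else []

review_list = ['I like this book.', 'I do not like this novel, no, I do not.']
results = [context_words(s, 'not') for s in review_list]
print(results)
```

Because the indices go through a set, overlapping windows around two nearby keywords are merged rather than repeated.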
I have the following list:
Words = ['This','is','a','list','and','NM,']
Note: Words[5] >>> NM, (with a comma(,))
New_List = []
for word in Words:
    if word[:2] =="NM":
        Words.insert((Words.index("NM")),input("Input a " + ac_to_word("NM") + ": "))
        Words.remove("NM")
Whenever I try to run this I get:
Words.insert((Words.index("NM")),input("Input a " + ac_to_word("NM") + ": "))
ValueError: 'NM' is not in list
Yet "NM" is at index 5. What's going on here? I am asking for word[:2], not the whole word.
I tried figuring out the problem, but no one was around to look at my code and give me feedback, so I thought maybe some people out there might be able to help. If you see a mistake, please show me where. Any help is appreciated!
Several problems:
You're calling Words.index("NM"), which looks for an item exactly equal to 'NM', but the list has no such item: it only contains 'NM,' (with the comma).
You're modifying the list as you iterate over it. Don't do this! It will have unexpected consequences.
An easier way here would probably be to iterate over the list indices instead of the items:
Words = ['This','is','a','list','and','NM,']
for i in range(len(Words)):
    if Words[i].startswith('NM'):
        Words[i] = input("Input a " + ac_to_word("NM") + ": ")
Notice that I'm simply replacing the NM... items with the result of input(). This is more efficient than inserting and removing elements.
The error is coming from here:
Words.index("NM")
'NM' is not in your list of strings.
Doing insert and remove operations on a sequence while you iterate over it is a bad, bad idea. It is a surefire way to skip an item, or to double-operate on an item. Also, you should not be doing linear searches with index since a) it is slow and b) what happens if you have duplicates?
Just use enumerate:
for i, word in enumerate(words):
    if word[:2] == 'NM':
        words[i] = input('replace NM with something: ')
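For trying this out without typing at a prompt, here is a minimal sketch of the same enumerate pattern with the replacement supplied by a function. replace_prefixed and make_replacement are hypothetical names standing in for the input() call.

```python
def replace_prefixed(words, prefix, make_replacement):
    # Assigning to words[i] does not change the list's length,
    # so it is safe to do while iterating with enumerate().
    for i, word in enumerate(words):
        if word.startswith(prefix):
            words[i] = make_replacement(word)
    return words

words = ['This', 'is', 'a', 'list', 'and', 'NM,']
replaced = replace_prefixed(words, 'NM', lambda word: 'noun')
print(replaced)
```

Passing `lambda word: input('replace NM with something: ')` instead would give the interactive behaviour from the answers above.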
def sucontain(A):
    C = A.split()
    def magic(x):
        B = [C[i]==C[i+1] for i in range(len(C)-1)]
        return any(B)
    N = [x for x in C if magic(x)]
    return N
Phrase = "So flee fleeting candy can and bandage"
print (sucontain(Phrase))
The goal of this function is to create a list of the words that are contained in the word that follows them. For example, the function would take the string "So flee fleeting candy can and bandage" as input and return ['flee', 'and'], because 'flee' is inside 'fleeting' (the next word) and 'and' is inside 'bandage'. If no such cases are found, an empty list [] should be returned. My code right now is returning [] instead of ['flee', 'and']. Can someone point out what I'm doing wrong? Thank you.
Just pair the consecutive words, then it becomes an easy list comprehension…
>>> s = "So flee fleeting candy can and bandage"
>>> words = s.split()
>>> [i for i, k in zip(words, words[1:]) if i in k]
['flee', 'and']
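To see why this works, it may help to look at the pairs themselves. The sketch below just materialises what zip(words, words[1:]) yields before filtering; the variable names are only for illustration.

```python
words = "So flee fleeting candy can and bandage".split()

# Each word paired with the word that follows it; zip stops at the shorter
# input, so the last word never appears as the first item of a pair.
pairs = list(zip(words, words[1:]))

# Keep the first word of a pair when it is a substring of the second.
contained = [first for first, second in pairs if first in second]

print(pairs)
print(contained)
```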
There is definitely something wrong with your magic function. It accepts x as an argument but doesn't use it anywhere.
Here is an alternate version that doesn't use an additional function:
def sucontain(A):
    C = A.split()
    return [w for i, w in enumerate(C[:-1]) if w in C[i+1]]
The enumerate() function allows us to loop over the indices and the values together, which makes it very straightforward to perform the test. C[i+1] is the next value and w is the current value, so w in C[i+1] checks to see if the current value is contained in the next value. We use C[:-1] to make sure that we stop one before the last item; otherwise C[i+1] would result in an IndexError.
Looking ahead can be problematic. Instead of testing whether the current word is in the next one, check to see whether the previous word is in the current one. This almost always makes things simpler.
Also, use descriptive variable names instead of C and A and x and B and N and magic.
def succotash(text): # okay, so that isn't very descriptive
    lastword = " " # space won't ever be in a word
    results = []
    for currentword in text.split():
        if lastword in currentword:
            results.append(lastword) # keep the contained (previous) word
        lastword = currentword
    return results
print(succotash("So flee fleeting candy can and bandage"))
i recently wrote a method to cycle through /usr/share/dict/words and return a list of palindromes using my ispalindrome(x) method
here's some of the code...what's wrong with it? it just stalls for 10 minutes and then returns a list of all the words in the file
def reverse(a):
    return a[::-1]
def ispalindrome(a):
    b = reverse(a)
    if b.lower() == a.lower():
        return True
    else:
        return False
wl = open('/usr/share/dict/words', 'r')
wordlist = wl.readlines()
wl.close()
for x in wordlist:
    if not ispalindrome(x):
        wordlist.remove(x)
print wordlist
wordlist = wl.readlines()
When you do this, there is a new line character at the end, so your list is like:
['eye\n','bye\n', 'cyc\n']
With the trailing '\n' included, even a real palindrome such as 'eye' no longer equals its reverse, so nothing is detected as a palindrome.
You need this:
['eye','bye', 'cyc']
So strip the newline character and it should be fine.
To do this in one line:
wordlist = [line.strip() for line in open('/usr/share/dict/words')]
EDIT: Iterating over a list while modifying it is causing problems. Use a list comprehension, as pointed out by Matthew.
Others have already pointed out better solutions. I want to show you why the list is not empty after running your code. Since your ispalindrome() function will never return True because of the "newlines problem" mentioned in the other answers, your code will call wordlist.remove(x) for every single item. So why is the list not empty at the end?
Because you're modifying the list as you're iterating over it. Consider the following:
>>> l = [1,2,3,4,5,6]
>>> for i in l:
...     l.remove(i)
...
>>> l
[2, 4, 6]
When you remove the 1, the remaining elements shift one position toward the front, so now l[0] is 2. The iteration counter has advanced, though, so the next iteration looks at l[1], which is now 3, and removes it, and so on.
So your code removes half of the entries. Moral: Never modify a list while you're iterating over it (unless you know exactly what you're doing :)).
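A small side-by-side sketch of the pitfall and the usual way around it, building a new list instead of removing from the one being iterated. The predicate `n > 6` is arbitrary and only chosen so that nothing passes, mirroring the situation in the question.

```python
numbers = [1, 2, 3, 4, 5, 6]

# Removing while iterating: every other element is skipped.
mutated = list(numbers)
for n in mutated:
    mutated.remove(n)

# Building a new list: the test runs once for every element.
filtered = [n for n in numbers if n > 6]

print(mutated)
print(filtered)
```

The comprehension never touches the list it is reading from, so no element can be skipped.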
I think there are two problems.
Firstly, what is the point of reading all of the words into a list? Why not process each word in turn and print it if it's a palindrome?
Secondly, watch out for whitespace. You have newlines at the end of each of your words!
Since you're not identifying any palindromes (due to the whitespace), you're going to attempt to remove every item from the list. While you're iterating over it!
This solution runs in well under a second and identifies lots of palindromes:
for word in open('/usr/share/dict/words', 'r'):
    word = word.strip()
    if ispalindrome(word):
        print(word)
Edit:
Perhaps more 'pythonic' is to use generator expressions:
def ispalindrome(a):
    return a[::-1].lower() == a.lower()
words = (word.strip() for word in open('/usr/share/dict/words', 'r'))
palindromes = (word for word in words if ispalindrome(word))
print('\n'.join(palindromes))
It doesn't return all the words. It returns half. This is because you're modifying the list while iterating over it, which is a mistake. A simpler and more effective solution is to use a list comprehension. You can modify sukhbir's to do the whole thing:
[word for word in (word.strip() for word in wl.readlines()) if ispalindrome(word)]
You can also break this up:
stripped = (word.strip() for word in wl.readlines())
wordlist = [word for word in stripped if ispalindrome(word)]
You're including the newline at the end of each word in /usr/share/dict/words. That means you never find any palindromes. You'll speed things up if you just log the palindromes as you find them, instead of deleting non-palindromes from the list, too.
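For a quick check without the dictionary file, the same idea can be sketched on a few in-memory lines. The sample lines below are made up and simply stand in for what readlines() would return.

```python
def ispalindrome(word):
    return word.lower() == word[::-1].lower()

# Stand-in for wl.readlines(): each entry still carries its newline.
lines = ['eye\n', 'bye\n', 'Level\n', 'round\n']

# With the newline left on, nothing matches.
unstripped = [line for line in lines if ispalindrome(line)]

# Strip first, skip blank lines, then test.
stripped = (line.strip() for line in lines)
palindromes = [word for word in stripped if word and ispalindrome(word)]

print(unstripped)
print(palindromes)
```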