This question already has answers here:
How can I check if two strings are anagrams of each other?
(27 answers)
Closed 11 months ago.
I want to write a function that finds anagrams. Anagrams are words that are written with the same characters. For example, "abba" and "baba".
I have gotten as far as writing a function that can recognize if a certain string has the same letters as another string. However, I can't account for the number of repeated letters in the string.
How should I do this?
This is the code I have written so far:
def anagrams(word, words):
    list1 = []
    for i in words:
        if set(word) == set(i):
            list1.append(i)
    return list1
The inputs look something like this:
('abba', ['aabb', 'abcd', 'bbaa', 'dada'])
I want to find an anagram for the first string, within the list.
You kind of point to the solution in your question. The problem is that a set ignores how many times each letter occurs in the input and target strings, so set("abba") and set("ab") compare equal. To fix this, compare the counts: build a mapping from each letter to its number of occurrences for each of the two strings, then compare those mappings.
For the purposes of learning, I would encourage you to solve the problem with a dict first. Once you know how to do that, the collections module has a built-in container called Counter that does the same for you.
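A minimal sketch of both suggestions, assuming the goal is to return every entry of words that has exactly the same letters as word, with the same number of repeats. The helper name letter_counts and the second function name anagrams_counter are made up for illustration; Counter is the standard-library class from collections.

```python
from collections import Counter


def letter_counts(text):
    """Map each character of text to how many times it occurs (plain dict)."""
    counts = {}
    for ch in text:
        counts[ch] = counts.get(ch, 0) + 1
    return counts


def anagrams(word, words):
    """Return the entries of words whose letter counts match those of word."""
    target = letter_counts(word)
    return [w for w in words if letter_counts(w) == target]


def anagrams_counter(word, words):
    """Same check, using collections.Counter instead of a hand-built dict."""
    target = Counter(word)
    return [w for w in words if Counter(w) == target]


print(anagrams('abba', ['aabb', 'abcd', 'bbaa', 'dada']))  # ['aabb', 'bbaa']
```

Comparing sorted(word) == sorted(w) would give the same result; the counting version is shown here because it follows the letter-to-count mapping idea directly.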
I'm new to Python, as in about an hour and a half into it. I've crawled my website using CeWL to get a bespoke wordlist for password audits, and I also want to randomly combine three of these words.
For example, the CeWL wordlist:
word1
word2
word3
word4
Using a Python script, I want to create a further wordlist by randomly joining three words together, for example:
word4word2word1
word1word3word4
word3word4word2
So far all I've come up with is:
import random
print(random.choice(open("test.txt").read().split()))
print(random.choice(open("test.txt").read().split()))
print(random.choice(open("test.txt").read().split()))
While this is clearly wrong, it does give me three random words from my list. I just want to join them without a delimiter. Any help for a complete novice would be massively appreciated.
The first thing to do is read the words only once, using a context manager so the file gets closed properly.
with open("test.txt") as f:
    lines = f.readlines()
Then use random.sample to pick three words.
words = random.sample(lines, 3)
Of course, you probably want to strip newlines and other extraneous whitespace for each word.
words = random.sample([x.strip() for x in lines], 3)
Now you just need to join those together.
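Putting the steps above together might look like the sketch below. The function name random_combinations and its parameters are made up for illustration, and the hard-coded lines list stands in for the contents of test.txt so the example runs without the file.

```python
import random


def random_combinations(lines, count, words_per_entry=3, rng=random):
    """Build count strings, each joining words_per_entry distinct random words."""
    # Strip surrounding whitespace and drop blank lines before sampling.
    words = [line.strip() for line in lines if line.strip()]
    return ["".join(rng.sample(words, words_per_entry)) for _ in range(count)]


# In the real script the lines would come from the file, e.g.:
#     with open("test.txt") as f:
#         lines = f.readlines()
lines = ["word1\n", "word2\n", "word3\n", "word4\n"]
for entry in random_combinations(lines, count=3):
    print(entry)
```

Note that random.sample picks without replacement, so a word never repeats within one entry, and it raises ValueError if the list holds fewer words than requested. The same entry can still be generated more than once across calls.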
Using your code/style:
import random
wordlist = open("test.txt").read().split()
randomword = ''.join([random.choice(wordlist), random.choice(wordlist), random.choice(wordlist)])
print(randomword)
join is a method of the string type and it will join the elements of a list using the string as a delimiter. In this case we use an empty string '' and join a list made up of random choices from your test.txt file.
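To illustrate how the delimiter works, here is a small sketch. The in-memory wordlist is a stand-in for the file contents, and random.choices (a standard-library function that draws with replacement) is swapped in for the three separate random.choice calls.

```python
import random

wordlist = ["word1", "word2", "word3", "word4"]  # stand-in for the file contents

# random.choices draws with replacement, like three separate random.choice calls,
# so the same word may appear more than once.
picked = random.choices(wordlist, k=3)

print("".join(picked))   # no delimiter between the words
print("-".join(picked))  # same words, separated by "-"
```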
I was wondering how would you get the number of duplicate letters in a string and store the information into a hashtable? As in this order: {(DUPLICATE LETTER HERE) : (NUMBER OF DUPLICATES FOR THAT LETTER HERE)}
I have been having a lot of trouble with getting an answer and am looking for code for this specific function.
Thanks!
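One possible reading of this request, sketched with collections.Counter: count every letter, then keep only the letters that occur more than once. The function name duplicate_letters is made up, and the choices to ignore non-letters, to treat upper and lower case as different, and to store the total number of occurrences are assumptions about what is wanted.

```python
from collections import Counter


def duplicate_letters(text):
    """Return {letter: occurrences} for letters that occur more than once."""
    counts = Counter(ch for ch in text if ch.isalpha())
    return {letter: n for letter, n in counts.items() if n > 1}


print(duplicate_letters("hello world"))  # {'l': 3, 'o': 2}
```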
I'm having trouble in a school project because I don't know how to join elements of a list in segments. Here's an example: Let's say I have the following list:
list = ["T","h","i","s","I","s","A","L","i","s","t",]
How could I join this list so that the program outputs the following?:
Output: ["This","Is","A","List"]
Assuming list is your input, and without giving you the answer outright since it's a school project you should do yourself, here are some hints.
You'll want to check if a character is uppercase to know when the start of a word is. With python, you can use isupper() (ex: 'C'.isupper() would return True).
Python strings are iterable.
You can add a character to the end of a string using += (ex: myWord += 'a')
You can add a string to a list using append (ex: myList.append(myWord))
Remember this is a learning experience and there's no real value to being given the answer outright, if that's what you were hoping for. Best of luck and welcome to StackOverflow.
You can use a regex for this:
import re
list = ["T","h","i","s","I","s","A","L","i","s","t",]
sep=[s for s in re.split("([A-Z][^A-Z]*)", ''.join(list)) if s]
print(sep)
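For comparison, a plain-loop version built only from isupper(), += and append might look like the sketch below. The function name split_on_capitals is made up, and the input is called letters rather than list to avoid shadowing the built-in.

```python
def split_on_capitals(chars):
    """Group single characters into words, starting a new word at each uppercase letter."""
    result = []
    current = ""
    for ch in chars:
        # An uppercase letter closes the word collected so far.
        if ch.isupper() and current:
            result.append(current)
            current = ""
        current += ch
    # Do not forget the last word, which has no uppercase letter after it.
    if current:
        result.append(current)
    return result


letters = ["T", "h", "i", "s", "I", "s", "A", "L", "i", "s", "t"]
print(split_on_capitals(letters))  # ['This', 'Is', 'A', 'List']
```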
I am trying to write code that loops through the elements in a list of strings and combines each element that starts with a lowercase letter with the previous element. For example, given this list:
test_list = ['Example','This is a sample','sentence','created to illustrate','the problem.','End of example']
I would like to end up with the following list:
test_list = ['Example','This is a sample sentence created to illustrate the problem.','End of example']
Here is the code I have tried (which doesn't work):
for i in range(len(test_list)):
    if test_list[i].islower():
        test_list[i-1:i] = [' '.join(test_list[i-1:i])]
I think there might be a problem with me trying to use this join recursively. Could someone recommend a way to solve this? As background, the reason I need this is because I have many PDF documents of varying sizes converted to text which I split into paragraphs to extract specific items using re.split('\n\s*\n',document) on each doc. It works for most docs but, for whatever reason, some of them have '\n\n' literally after every other word or just in random places that do not correspond to end of paragraph, so I am trying to combine these to achieve a more reasonable list of paragraphs. On the other hand, if anyone has a better idea of how to split raw extracted text into paragraphs, that would be awesome, too. Thanks in advance for the help!
You could use:
output = [test_list[0]]
for a, b in zip(test_list, test_list[1:]):
    if b[0].islower():
        output[-1] = f'{output[-1]} {b}'
    else:
        output.append(b)
output
output:
['Example',
'This is a sample sentence created to illustrate the problem.',
'End of example']
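The same idea can be wrapped in a function that builds a new list instead of modifying the one being iterated. This is only a sketch: the name merge_continuations is made up, and using chunk[:1] instead of chunk[0] is an added guard so that an empty list or an empty string does not raise IndexError.

```python
def merge_continuations(chunks):
    """Append each chunk that starts with a lowercase letter to the previous chunk."""
    merged = []
    for chunk in chunks:
        # chunk[:1] is "" for an empty string, and "".islower() is False.
        if merged and chunk[:1].islower():
            merged[-1] = f"{merged[-1]} {chunk}"
        else:
            merged.append(chunk)
    return merged


test_list = ['Example', 'This is a sample', 'sentence', 'created to illustrate',
             'the problem.', 'End of example']
print(merge_continuations(test_list))
```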
This is actually homework for a mark.
The user of the program must type in a sentence. Then the program checks the words and prints the wrong ones (if a wrong word appears more than once, the program must print it only once). Wrong words must be printed in the order they appear in the sentence.
Here is how I did it, but there is one problem: the wrong words do not appear in the same order as in the sentence, because of the built-in function sorted. Is there any other way to delete duplicates from a list?
The dictionary is read from dictionary.txt.
import re
sentence=input("Sentence:")
dictionary=open("dictionary.txt", encoding="latin2").read().lower().split()
words=re.findall(r"\w+",sentence.lower())
words=sorted(set(words))
sez=[]
for i in words:
    if i not in dictionary:
        sez.append(i)
print(sez)
words = [item for index, item in enumerate(words) if words.index(item) == index]
It'll filter out every duplicate and will maintain the order.
As Thomas pointed out, this is a rather heavy approach, since words.index rescans the list for every word. If you need to process a larger number of words, you could use this for loop:
dups = set()  # words seen so far
filtered_list = []
for word in words:
    if word not in dups:
        filtered_list.append(word)
        dups.add(word)
To delete duplicates from a list, add the items to a dictionary as keys. A dictionary can hold only one entry per key, so duplicates collapse.
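A short sketch of the dictionary approach, using dict.fromkeys. The sample words list is made up; the order guarantee relies on dicts preserving insertion order, which the language guarantees from Python 3.7 on.

```python
words = ["the", "cat", "saw", "the", "other", "cat"]

# dict keys are unique and (since Python 3.7) keep insertion order,
# so the first occurrence of each word survives in its original position.
unique_words = list(dict.fromkeys(words))

print(unique_words)  # ['the', 'cat', 'saw', 'other']
```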
You can use the OrderedSet recipe.
Edit: BTW, if the dictionary is big, it's better to convert the dictionary list into a set: checking whether an element is in a set takes constant time on average, instead of O(n) for a list.
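A sketch of how the set lookup could fit into the whole task. The function name unknown_words is made up, and the dictionary is passed in as a list here instead of being read from dictionary.txt, so the example runs on its own.

```python
import re


def unknown_words(sentence, dictionary_words):
    """Return words of sentence missing from dictionary_words, in order, without repeats."""
    known = set(dictionary_words)  # set membership is O(1) on average
    reported = set()
    result = []
    for word in re.findall(r"\w+", sentence.lower()):
        if word not in known and word not in reported:
            result.append(word)
            reported.add(word)
    return result


sample_dictionary = ["the", "saw", "other", "and", "cat"]
print(unknown_words("The zebra saw teh other zebra and teh cat", sample_dictionary))
# ['zebra', 'teh']
```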
You should check this answer:
https://stackoverflow.com/a/7961425/1225541
If you use his method and stop sorting the words array (remove the words=sorted(set(words)) line) it should do what you expect.