Say I have two lists:
list_A = [ 'Tar', 'Arc', 'Elbow', 'State', 'Cider', 'Dusty', 'Night', 'Inch', 'Brag', 'Cat', 'Bored', 'Save', 'Angel','bla', 'Stressed', 'Dormitory', 'School master','Awesoame', 'Conversation', 'Listen', 'Astronomer', 'The eyes', 'A gentleman', 'Funeral', 'The Morse Code', 'Eleven plus two', 'Slot machines', 'Fourth of July', 'Jim Morrison', 'Damon Albarn', 'George Bush', 'Clint Eastwood', 'Ronald Reagan', 'Elvis', 'Madonna Louise Ciccone', 'Bart', 'Paris', 'San Diego', 'Denver', 'Las Vegas', 'Statue of Liberty']
and
list_B = ['Cried', 'He bugs Gore', 'They see', 'Lives', 'Joyful Fourth', 'The classroom', 'Diagnose', 'Silent', 'Taste', 'Car', 'Act', 'Nerved', 'Thing', 'A darn long era', 'Brat', 'Twelve plus one', 'Elegant man', 'Below', 'Robed', 'Study', 'Voices rant on', 'Chin', 'Here come dots', 'Real fun', 'Pairs', 'Desserts', 'Moon starer', 'Dan Abnormal', 'Old West action', 'Built to stay free', 'One cool dance musician', 'Dirty room', 'Grab', 'Salvages', 'Cash lost in me', "Mr. Mojo Risin'", 'Glean', 'Rat', 'Vase']
What I am looking for is a way to find the anagram pairs between list_A and list_B, and to create a list of tuples of those anagrams.
For one list I can do the following and generate the list of tuples; for two lists, however, I need some assistance. Thanks in advance for the help!
What I have tried for one list:
from collections import defaultdict
anagrams = defaultdict(list)
for w in list_A:
    anagrams[tuple(sorted(w))].append(w)
You can use a nested for loop, outer for the first list, inner for the second (also, use str.lower to make it case-insensitive):
anagram_pairs = []  # (w_1 from list_A, w_2 from list_B)
for w_1 in list_A:
    for w_2 in list_B:
        if sorted(w_1.lower()) == sorted(w_2.lower()):
            anagram_pairs.append((w_1, w_2))
print(anagram_pairs)
Output:
[('Tar', 'Rat'), ('Arc', 'Car'), ('Elbow', 'Below'), ('State', 'Taste'), ('Cider', 'Cried'), ('Dusty', 'Study'), ('Night', 'Thing'), ('Inch', 'Chin'), ('Brag', 'Grab'), ('Cat', 'Act'), ('Bored', 'Robed'), ('Save', 'Vase'), ('Angel', 'Glean'), ('Stressed', 'Desserts'), ('School master', 'The classroom'), ('Listen', 'Silent'), ('The eyes', 'They see'), ('A gentleman', 'Elegant man'), ('The Morse Code', 'Here come dots'), ('Eleven plus two', 'Twelve plus one'), ('Damon Albarn', 'Dan Abnormal'), ('Elvis', 'Lives'), ('Bart', 'Brat'), ('Paris', 'Pairs'), ('Denver', 'Nerved')]
You are quite close with your current attempt. All you need to do is repeat the same process on list_B:
from collections import defaultdict
anagrams = defaultdict(list)
list_A = [ 'Tar', 'Arc', 'Elbow', 'State', 'Cider', 'Dusty', 'Night', 'Inch', 'Brag', 'Cat', 'Bored', 'Save', 'Angel','bla', 'Stressed', 'Dormitory', 'School master','Awesoame', 'Conversation', 'Listen', 'Astronomer', 'The eyes', 'A gentleman', 'Funeral', 'The Morse Code', 'Eleven plus two', 'Slot machines', 'Fourth of July', 'Jim Morrison', 'Damon Albarn', 'George Bush', 'Clint Eastwood', 'Ronald Reagan', 'Elvis', 'Madonna Louise Ciccone', 'Bart', 'Paris', 'San Diego', 'Denver', 'Las Vegas', 'Statue of Liberty']
list_B = ['Cried', 'He bugs Gore', 'They see', 'Lives', 'Joyful Fourth', 'The classroom', 'Diagnose', 'Silent', 'Taste', 'Car', 'Act', 'Nerved', 'Thing', 'A darn long era', 'Brat', 'Twelve plus one', 'Elegant man', 'Below', 'Robed', 'Study', 'Voices rant on', 'Chin', 'Here come dots', 'Real fun', 'Pairs', 'Desserts', 'Moon starer', 'Dan Abnormal', 'Old West action', 'Built to stay free', 'One cool dance musician', 'Dirty room', 'Grab', 'Salvages', 'Cash lost in me', "Mr. Mojo Risin'", 'Glean', 'Rat', 'Vase']
for w in list_A:
    anagrams[tuple(sorted(w))].append(w)
for w in list_B:
    anagrams[tuple(sorted(w))].append(w)
result = [b for b in anagrams.values() if len(b) > 1]
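Output (the keys here are case-sensitive, so only pairs whose capital letters line up are found; lower-casing each word before sorting, e.g. sorted(w.lower()), picks up the rest):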
[['Cider', 'Cried'], ['The eyes', 'They see'], ['Damon Albarn', 'Dan Abnormal'], ['Bart', 'Brat'], ['Paris', 'Pairs']]
Another solution using dictionary:
out = {}
for word in list_A:
    out.setdefault(tuple(sorted(word.lower())), []).append(word)
for word in list_B:
    word_s = tuple(sorted(word.lower()))
    if word_s in out:
        out[word_s].append(word)
print(list(tuple(v) for v in out.values() if len(v) > 1))
Prints:
[
("Tar", "Rat"),
("Arc", "Car"),
("Elbow", "Below"),
("State", "Taste"),
("Cider", "Cried"),
("Dusty", "Study"),
("Night", "Thing"),
("Inch", "Chin"),
("Brag", "Grab"),
("Cat", "Act"),
("Bored", "Robed"),
("Save", "Vase"),
("Angel", "Glean"),
("Stressed", "Desserts"),
("School master", "The classroom"),
("Listen", "Silent"),
("The eyes", "They see"),
("A gentleman", "Elegant man"),
("The Morse Code", "Here come dots"),
("Eleven plus two", "Twelve plus one"),
("Damon Albarn", "Dan Abnormal"),
("Elvis", "Lives"),
("Bart", "Brat"),
("Paris", "Pairs"),
("Denver", "Nerved"),
]
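A note on the pairs that are still missing: the approaches above sort every character, spaces included, so phrases such as 'Dormitory' and 'Dirty room' never match. The following is a minimal sketch, assuming you want to ignore spaces and punctuation as well as case. The helper names anagram_key and anagram_pairs and the short sample lists are made up for illustration and are not part of the original answers.

```python
from collections import defaultdict


def anagram_key(text):
    """Sorted letters only: lower-cased, with spaces and punctuation dropped."""
    return "".join(sorted(ch for ch in text.lower() if ch.isalpha()))


def anagram_pairs(first, second):
    """Return (a, b) tuples where b (from `second`) is an anagram of a (from `first`)."""
    by_key = defaultdict(list)
    for word in second:
        by_key[anagram_key(word)].append(word)
    # .get() does not create missing keys, so unmatched words are simply skipped
    return [(a, b) for a in first for b in by_key.get(anagram_key(a), [])]


# Small illustrative samples taken from the lists in the question
sample_a = ['Tar', 'Dormitory', 'Astronomer', 'bla']
sample_b = ['Moon starer', 'Rat', 'Dirty room', 'Vase']
pairs = anagram_pairs(sample_a, sample_b)
print(pairs)
```

Indexing the second list once keeps the work roughly proportional to the combined length of both lists, instead of comparing every word with every other word.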
I have a project in which I need the semantic relationship between two words. I want word-to-word relationships such as hypernyms, hyponyms, synonyms, holonyms, ...
I tried WordNet through NLTK, but for most pairs no relationship is found.
Here is sample code:
from nltk.corpus import wordnet as wn
from wordhoard import synonyms
Word1 = 'red'
Word2 = 'color'
LSTWord1 = []
for syn in wn.synsets(Word1):
    for lemma in syn.part_meronyms():
        LSTWord1.append(lemma)
for s in LSTWord1:
    if Word2 in s.name():
        print(Word1 + ' is meronyms of ' + Word2)
        break
LSTWord2 = []
for syn in wn.synsets(Word2):
    for lemma in syn.part_meronyms():
        LSTWord2.append(lemma)
for s in LSTWord2:
    if Word1 in s.name():
        print(Word2 + ' is meronyms of ' + Word1)
        break
Here are some example word pairs:
scheduled ,geometry
games,river
campaign,sea
adventure,place
session,road
long,town
campaign,road
session,railway
difficulty of session,place of interest
campaign,town
leader,historic place
have,town
player,town
skills,church
campaign,cultural interest
character name,town
player,monument
player,province
games,beach
expertise level,gas station
character,municipality
world,electrict line
social interaction,municipality
world,electric line
percentage,municipality
character,hospital
inhabitants,mine
active character,municipality
campaign,altitude
died,municipality
many time,mountain
adventurer,altitude
campaign,peak
gain,place of interest
new capabilities,cultural interest
player,cultural interest
achievement,national park
campaign,good
first action,railway station
player,province
Maybe WordNet is limited, or maybe there is simply no relation between these words. My question is: are there any alternatives to WordNet for handling semantic relationships between words, or is there a better way to get the semantic relation between two words?
Thanks
As I previously stated, I'm the author of the Python package wordhoard that you used in your question. Based on your question, I decided to add some additional modules to the package. These modules focus on:
homophones
hypernyms
hyponyms
I could not find an easy way to add the meronyms, but I'm still looking at the best way to do that.
The homophones module queries a hand-built list of the 60,000+ most frequently used English words for known homophones. I plan to expand this list in the future.
from wordhoard import Homophones
words = ['scheduled' ,'geometry', 'games', 'river', 'campaign', 'sea', 'adventure','place','session', 'road', 'long', 'town', 'campaign', 'road', 'session', 'railway']
for word in words:
    homophone = Homophones(word)
    results = homophone.find_homophones()
    print(results)
# output
no homophones for scheduled
no homophones for geometry
no homophones for games
no homophones for river
no homophones for campaign
['sea is a homophone of see', 'sea is a homophone of cee']
no homophones for adventure
['place is a homophone of plaice']
['session is a homophone of cession']
['road is a homophone of rowed', 'road is a homophone of rode']
truncated...
The hypernyms module queries various online repositories.
from wordhoard import Hypernyms
words = ['scheduled' ,'geometry', 'games', 'river', 'campaign','sea', 'adventure',
'place','session','road', 'long','town', 'campaign','road', 'session', 'railway']
for word in words:
    hypernym = Hypernyms(word)
    results = hypernym.find_hypernyms()
    print(results)
# output
['no hypernyms found']
['arrangement', 'branch of knowledge', 'branch of math', 'branch of mathematics', 'branch of maths', 'configuration', 'figure', 'form', 'math', 'mathematics', 'maths', 'pure mathematics', 'science', 'shape', 'study', 'study of numbers', 'study of quantities', 'study of shapes', 'system', 'type of geometry']
['lake', 'recreation']
['branch', 'dance', 'fresh water', 'geological feature', 'landform', 'natural ecosystem', 'natural environment', 'nature', 'physical feature', 'recreation', 'spring', 'stream', 'transportation', 'watercourse']
['action', 'actively seek election', 'activity', 'advertise', 'advertisement', 'battle', 'canvass', 'crusade', 'discuss', 'expedition', 'military operation', 'operation', 'political conflict', 'politics', 'promote', 'push', 'race', 'run', 'seek votes', 'wage war']
truncated...
The hyponyms module queries repositories.
from wordhoard import Hyponyms
words = ['scheduled' ,'geometry', 'games', 'river', 'campaign','sea', 'adventure',
'place','session','road', 'long','town', 'campaign','road', 'session', 'railway']
for word in words:
    hyponym = Hyponyms(word)
    results = hyponym.find_hyponyms()
    print(results)
# output
['no hyponyms found']
['absolute geometry', 'affine geometry', 'algebraic geometry', 'analytic geometry', 'combinatorial geometry', 'descriptive geometry', 'differential geometry', 'elliptic geometry', 'euclidean geometry', 'finite geometry', 'geometry of numbers', 'hyperbolic geometry', 'non-euclidean geometry', 'perspective', 'projective geometry', 'pythagorean theorem', 'riemannian geometry', 'spherical geometry', 'taxicab geometry', 'tropical geometry']
['jack in the box', 'postseason']
['affluent', 'arkansas river', 'arno river', 'avon', 'big sioux river', 'bighorn river', 'brazos river', 'caloosahatchee river', 'cam river', 'canadian river', 'cape fear river', 'changjiang', 'chari river', 'charles river', 'chattahoochee river', 'cimarron river', 'colorado river', 'orange', 'red', 'tunguska']
['ad blitz', 'ad campaign', 'advertising campaign', 'agitating', 'anti-war movement', 'campaigning', 'candidacy', 'candidature', 'charm campaign', 'come with', 'electioneering', 'feminism', 'feminist movement', 'fund-raising campaign', 'fund-raising drive', 'fund-raising effort', 'military campaign', 'military expedition', 'political campaign', 'senate campaign']
truncated...
Please let me know if you run into any issues when you use these new modules.
It looks like you are after arbitrary semantic relationships between a given pair of words drawn from a large vocabulary. A simple cosine similarity between the two word embeddings can probably help here. You can start with GloVe.
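To make the cosine-similarity idea concrete, here is a minimal sketch in plain Python. The three-dimensional vectors in toy_vectors are invented purely for illustration; they are not real GloVe values. In practice you would look up each word's vector in a pre-trained embedding file, and multi-word phrases would need some extra handling, such as averaging the word vectors.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings; real ones would come from a pre-trained model
toy_vectors = {
    "red": [0.9, 0.1, 0.3],
    "color": [0.8, 0.2, 0.4],
    "river": [0.1, 0.9, 0.2],
}

score_close = cosine_similarity(toy_vectors["red"], toy_vectors["color"])
score_far = cosine_similarity(toy_vectors["red"], toy_vectors["river"])
print(score_close, score_far)
```

A score near 1 means the vectors point in almost the same direction. Note that this gives a degree of relatedness, not a named relation such as hypernym or meronym.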
I have a problem implementing a recommendation system using Euclidean distance.
What I want to do is list the games closest to a search, based on game title and genre.
After I call the function, it throws the error shown below.
How can I fix it?
Here is the error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-31-bda255afa9da> in <module>
1 search_game=input('Please enter The name of the wiigame :')
2 number=int(input('Please enter the number of recommendations you want: '))
----> 3 recommendationSystembyEuclideanDistance(wii_df,search_game,number)
<ipython-input-27-c8bfb3378f18> in recommendationSystembyEuclideanDistance(data, game, number)
17 count=0
18 for i in df.values:
---> 19 p.append([distance.euclidean(x,i),count])
20 count+=1
21 p.sort()
~\Anaconda3\lib\site-packages\scipy\spatial\distance.py in euclidean(u, v, w)
618
619 """
--> 620 return minkowski(u, v, p=2, w=w)
621
622
~\Anaconda3\lib\site-packages\scipy\spatial\distance.py in minkowski(u, v, p, w)
510 if p < 1:
511 raise ValueError("p must be at least 1")
--> 512 u_v = u - v
513 if w is not None:
514 w = _validate_weights(w)
TypeError: unsupported operand type(s) for -: 'str' and 'str'
Here are my game titles:
array(['007 Legends', '1001 Spikes', '140', '153 Hand Video Poker',
'3Souls', '6-Hand Video Poker', '6180 the moon', '8Bit Hero',
'99Moves', '99Seconds', 'Absolutely Unstoppable MineRun', 'Abyss',
'Ace: Alien Cleanup Elite', 'Ace of Seafood',
'Act It Out! A Game of Charades',
'Adventure Party: Cats and Caverns',
"Adventure Time: Explore the Dungeon Because I Don't Know!",
'Adventure Time: Finn & Jake Investigations', 'Adventures of Pip',
'Aenigma Os', 'Affordable Space Adventures', 'Alice in Wonderland',
'Alphadia Genesis', 'The Amazing Spider-Man: Ultimate Edition',
'The Amazing Spider-Man 2', 'Angry Birds Star Wars',
'Angry Birds Trilogy', 'Angry Bunnies: Colossal Carrot Crusade',
'Angry Video Game Nerd Adventures',
'Animal Crossing: Amiibo Festival', 'Animal Gods', 'Annihilation',
'Another World: 20th Anniversary Edition', 'Aperion Cyberstorm',
'Aqua Moto Racing Utopia', 'Aqua TV', 'Arc Style: Baseball! SP',
'Archery by Thornbury Software', 'Ark Rush', 'Armikrog', 'Armillo',
'Armored Acorns: Action Squirrel Squad', 'Arrow Time U',
'Art of Balance', 'Ascent of Kings', 'Asdivine Hearts',
"Assassin's Creed III", "Assassin's Creed IV: Black Flag",
'Asteroid Quarry', 'Astral Breakers',
'Ava and Avior Save the Earth', 'Avoider', 'Axiom Verge',
'Azure Snake', 'B3 Game Expo for Bees', 'Back to Bed',
'Badland: Game of the Year Edition', 'Baila Latino',
'Ballpoint Universe: Infinite',
'Barbie and her Sisters: Puppy Rescue', 'Barbie: Dreamhouse Party',
'Batman: Arkham City – Armored Edition', 'Batman: Arkham Origins',
'Batman: Arkham Origins Blackgate – Deluxe Edition', 'Bayonetta',
'Bayonetta 2', 'Beatbuddy: Tale of the Guardians',
"The Beggar's Ride", 'Ben 10: Omniverse', 'Ben 10: Omniverse 2',
"Bigley's Revenge", 'The Binding of Isaac: Rebirth',
'Bird Mania Party', 'Bit Dungeon Plus',
'Bit.Trip Presents... Runner2: Future Legend of Rhythm Alien',
'Blackjack 21', 'Blasting Agent: Ultimate Edition', 'Blek', 'Bloc',
'Blockara', 'Block Zombies!', 'Blocky Bot', 'Blok Drop U',
'Blok Drop X: Twisted Fusion', 'Blue-Collar Astronaut',
'Bombing Bastards', 'The Book of Unwritten Tales 2', 'Booty Diver',
'Box Up', 'Brave Tank Hero', 'Breakout Defense',
'Breakout Defense 2', 'Breezeblox', 'BrickBlast U!',
'Brick Breaker', 'Brick Race', 'The Bridge',
'Bridge Constructor Playground', 'Brunswick Pro Bowling',
'Bubble Gum Popper', 'Buddy & Me: Dream Edition', 'Buta Medal',
"Cabela's Big Game Hunter: Pro Hunts",
"Cabela's Dangerous Hunts 2013",
'Cake Ninja 3: The Legend Continues', 'Call of Duty: Black Ops II',
'Call of Duty: Ghosts', 'Call of Nightmare', 'Candy Hoarder',
'Canvaleon', 'Captain Toad: Treasure Tracker',
'Cars 3: Driven to Win', 'CastleStorm', 'The Cave', 'Chariot',
'ChariSou Ultra DX: Sekai Tour', 'Chasing Aurora', 'Chasing Dead',
"Chests o' Booty", 'Child of Light', 'Chimpuzzle Pro',
'Chompy Chomp Chomp Party',
'Christmas Adventure of Rocket Penguin', 'Chroma Blast',
'Chronicles of Teddy: Harmony of Exidus', 'Chubbins',
'Citadale: Gate of Souls', 'Citadale: The Legends Trilogy',
'Citizens of Earth', 'Cloudberry Kingdom', 'Coaster Crazy Deluxe',
'Cocoto Magic Circus 2', 'Collateral Thinking', 'Color Bombs',
'Color Cubes', 'Color Symphony 2', 'Color Zen', 'Color Zen Kids',
'Coqui The Game', 'Cosmophony', 'Costume Quest 2',
'Crab Cakes Rescue', 'The Croods: Prehistoric Party!',
'Crush Insects', 'Crystorld', 'Cube Blitz',
'Cube Life: Island Survival', 'Cube Life: Pixel Action Heroes',
'Cubemen 2', 'Cubeshift', 'Cubit The Hardcore Platformer Robot HD',
'Cup Critters', 'Cutie Clash', 'Cutie Pets Go Fishing',
'Cutie Pets Jump Rope', 'Cutie Pets Pick Berries',
'Cycle of Eternity: Space Anomaly',
'D.M.L.C. Death Match Love Comedy', 'Daikon Set',
'Dare Up Adrenaline', 'Darksiders II',
'Darksiders: Warmastered Edition', 'Darts Up',
'A Day at the Carnival', 'The Deer God', 'Defend Your Crypt',
'Defense Dome', 'Demonic Karma Summoner',
"Deus Ex: Human Revolution – Director's Cut", "Devil's Third",
'Dinox', 'Discovery', 'Disney Epic Mickey 2: The Power of Two',
'Disney Infinity', 'Disney Infinity 3.0',
'Disney Infinity: Marvel Super Heroes', 'Disney Planes',
'Disney Planes: Fire & Rescue', "Disney's DuckTales: Remastered",
'Dodge Club Party', 'DokiDoki Tegami Relay', 'Dolphin Up',
"Don't Crash", "Don't Starve: Giant Edition",
"Don't Touch Anything Red", 'Donkey Kong Country: Tropical Freeze',
'Dot Arcade', 'Double Breakout', 'Double Breakout II', 'Dr. Luigi',
"Dracula's Legacy", 'Dragon Fantasy: The Black Tome of Ice',
'Dragon Fantasy: The Volumes of Westeria',
'Dragon Quest X: 5000-nen no Harukanaru Kokyou e Online',
'Dragon Quest X: Inishie no Ryuu no Denshou Online',
'Dragon Quest X: Mezameshi Itsutsu no Shuzoku Online',
'Dragon Quest X: Nemureru Yūsha to Michibiki no Meiyū Online',
'Dragon Skills', 'Draw 2 Survive', 'Draw a Stickman: Epic 2',
"A Drawing's Journey", 'Dream Pinball 3D II', 'Dreamals',
'Dreamals: Dream Quest', 'Dreii', 'Drop It: Block Paradise!',
'Dual Core', 'Dungeons & Dragons: Chronicles of Mystara',
'Dungeon Hearts DX', 'Dying is Dangerous',
'Earthlock: Festival of Magic', 'Eba & Egg: A Hatch Trip',
'Ectoplaza', 'Edge', 'Educational Pack of Kids Games',
'Electronic Super Joy', 'Electronic Super Joy: Groove City',
'Elliot Quest', 'El Silla: Arcade Edition',
'Emojikara: A Clever Emoji Match Game', 'Endless Golf',
'Epic Dumpster Bear', 'Escape from Flare Industries',
'ESPN Sports Connection', 'Evofish', "Exile's End", 'Explody Bomb',
'Extreme Exorcism', 'F1 Race Stars: Powered Up Edition',
'Factotum', 'Fake Colors', 'The Fall', 'Falling Skies: The Game',
'Family Party: 30 Great Games Obstacle Arcade', 'Family Tennis SP',
'Fast & Furious: Showdown', 'Fast Racing Neo', 'Fat City',
'Fat Dragons', 'Fatal Frame: Maiden of Black Water', 'FIFA 13',
'Fifteen', 'Finding Teddy II', "Fire: Ungh's Quest",
"Fist of the North Star: Ken's Rage 2", 'Fit Music for Wii U',
'Flapp & Zegeta', 'Flight of Light',
"Flowerworks HD: Follie's Adventure", 'Forced', 'Forest Escape',
'Forma.8', 'Frag doch mal...die Maus!',
'Frankenstein: Master of Death', 'Frederic: Resurrection of Music',
'Free Balling', 'Freedom Planet', 'FreezeME', 'Frenchy Bird',
'Fujiko F. Fujio Characters: Daishuugou! SF Dotabata Party!',
'FullBlast', 'Funk of Titans', "Funky Barn: It's Farming",
'Funky Physics', 'Futuridium EP Deluxe', 'Gaiabreaker',
'Galaxy Blaster', 'Game & Wario', 'Game Party Champions',
'Games for Toddlers', 'Gear Gauntlet', 'The Gem Collector',
'Gemology', 'Geom', 'GetClose: A game for Rivals',
'Ghost Blade HD', 'Giana Sisters: Twisted Dreams',
'Giana Sisters: Twisted Dreams – Owltimate Edition',
'Ginsei Shogi: Kyoutendotou Fuuraijin', 'The Girl and the Robot',
'Girls Like Robots',
'Gotouchi Tetsudou: Gotouchi Kyara to Nihon Zenkoku no Tabi',
'GravBlocks+', 'Gravity+', 'Gravity Badgers', 'The Great Race',
'Grumpy Reaper', "Guac' a Mole",
'Guacamelee!: Super Turbo Championship Edition',
'Guitar Hero Live', 'Gunman Clive HD Collection',
'Hello Kitty Kruisers', 'Heptrix', 'High Strangeness', 'Hive Jump',
'Hold Your Fire: A Game About Responsibility', 'Horror Stories',
'Hot Rod Racer', "Hot Wheels World's Best Driver",
'How to Survive', 'How to Train Your Dragon 2',
'Human Resource Machine', 'Humanitarian Helicopter',
"Hunter's Trophy 2 Europa", 'HurryUp! Bird Hunter',
'Hyrule Warriors', 'I C Redd', "I've Got to Run!",
'Ice Cream Surfer', 'Infinity Runner', 'Injustice: Gods Among Us',
'Insect Planet TD', 'Inside My Radio', 'Internal Invasion',
'Invanoid', 'IQ Test', 'Island Flight Simulator', 'Ittle Dew',
'Jackpot 777', 'Jeopardy!', 'Jett Tailfin', 'Jewel Quest',
'Jikan Satansa', 'Job the Leprechaun', "Joe's Diner",
'Jolt Family Robot Racer', 'Jones on Fire',
'Jotun: Valhalla Edition', 'Journey of a Special Average Balloon',
'Just Dance 4', 'Just Dance 2014', 'Just Dance 2015',
'Just Dance 2016', 'Just Dance 2017', 'Just Dance 2018',
'Just Dance 2019', 'Just Dance: Disney Party 2',
'Just Dance Kids 2014', 'Just Dance Wii U',
'Kamen Rider: Battride War II', 'Kamen Rider: SummonRide',
'Kemono Dash', 'Kick & Fennick', 'KickBeat: Special Edition',
'Kirby and the Rainbow Curse', 'Koi DX', 'Knytt Underground',
'Kung Fu Fight!', 'Kung Fu Panda: Showdown of Legendary Legends',
'Kung Fu Rabbit', 'Land It Rocket', 'Laser Blaster',
'Last Soldier', 'Legend of Kay Anniversary',
'The Legend of Zelda: Breath of the Wild',
'The Legend of Zelda: Twilight Princess HD',
'The Legend of Zelda: The Wind Waker HD',
'Lego Batman 2: DC Super Heroes', 'Lego Batman 3: Beyond Gotham',
'Lego City Undercover', 'Lego Dimensions', 'Lego Jurassic World',
'Lego Marvel Super Heroes', "Lego Marvel's Avengers",
'The Lego Movie Videogame', 'Lego Star Wars: The Force Awakens',
'Lego The Hobbit', 'The Letter', 'Letter Quest Remastered',
"Level 22, Gary's Misadventures", 'Life of Pixel',
'Little Inferno', "Lone Survivor: The Director's Cut",
'Lost Reavers', 'Lovely Planet', 'Lucadian Chronicles',
'Lucentek Beyond', 'Lucentek: Activate',
'Luv Me Buddies Wonderland', 'Madden NFL 13', 'Mahjong',
'Mahjong Deluxe 3', 'Manabi Getto!',
'Mario & Sonic at the Rio 2016 Olympic Games',
'Mario & Sonic at the Sochi 2014 Olympic Winter Games',
'Mario Kart 8', 'Mario Party 10', 'Mario Tennis: Ultra Smash',
'Mario vs. Donkey Kong: Tipping Stars',
'Marvel Avengers: Battle for Earth',
'Mass Effect 3: Special Edition', 'Masked Forces', 'Master Reboot',
'Maze', 'Maze Break', 'Mega Maze', 'Meine Ersten Mitsing-Lieder',
'Meme Run', 'Michiko Jump!', 'Midnight', 'Midnight 2',
'Midtown Crazy Race', 'Mighty No. 9',
'Mighty Switch Force!: Hyper Drive Edition',
'Mighty Switch Force! 2', 'Miko Mole', 'MikroGame: Rotator',
'Minecraft: Story Mode - The Complete Adventure',
'Minecraft: Wii U Edition',
'Mini Mario & Friends: Amiibo Challenge',
'Mini-Games Madness Volume #1: Hello World!',
'Minna de Uchū Tour: ChariSou DX 2',
'The Misshitsukara no Dasshutsu: Subete no Hajimari 16 no Nazo',
'The Misshitsukara no Dasshutsu 2: Kesareta 19 no Kioku',
'Molly Maggot', 'Momonga Pinball Adventures',
'Mon Premier Karaoké', 'Monkey Pirates', 'Monster High: 13 Wishes',
'Monster High: New Ghoul in School', 'Monster Hunter 3 Ultimate',
'Monster Hunter: Frontier G', 'Monster Hunter: Frontier G Genuine',
'Monster Hunter: Frontier G5', 'Monster Hunter: Frontier G6',
'Monster Hunter: Frontier G7', 'Monster Hunter: Frontier G8',
'Monster Hunter: Frontier G9', 'Monster Hunter: Frontier G10',
'Monster Hunter: Frontier Z', 'Mop: Operation Cleanup',
'Mortar Melon', 'Mountain Peak Battle Mess',
'Mr. Pumpkin Adventure', 'Mutant Alien Moles of the Dead',
'Mutant Mudds Deluxe', 'Mutant Mudds Super Challenge',
'My Arctic Farm', 'My Exotic Farm', 'My Farm', 'My First Songs',
'My Jurassic Farm', 'My Style Studio: Hair Salon',
'The Mysterious Cities of Gold: Secret Paths', 'Nano Assault Neo',
'NBA 2K13', 'Near Earth Objects', 'Need for Speed: Most Wanted U',
'Neon Battle', 'NES Remix', 'NES Remix 2', 'Never Alone',
'New Super Luigi U', 'New Super Mario Bros. U', 'Nihilumbra',
"Ninja Gaiden 3: Razor's Edge", 'Ninja Pizza Girl',
'Ninja Strike: Dangerous Dash',
'Nintendo Game Seminar 2013 Jukousei Sakuhin', 'Nintendo Land',
'Noitu Love: Devolution', 'Nova-111', 'Now I know my ABCs',
'Octocopter: Super Sub Squid Escape', 'Octodad: Dadliest Catch',
"Oddworld: Abe's Oddysee – New 'n' Tasty!",
"Ohayou! Beginner's Japanese", 'OlliOlli', 'Olympia Rising',
'One Piece: Unlimited World Red', 'Orbit', 'Othello',
'Outside The Realm', 'Overworld Defender Remix',
'Pac-Man and the Ghostly Adventures',
'Pac-Man and the Ghostly Adventures 2', 'Panda Love', 'Paparazzi',
'Paper Mario: Color Splash', 'Paper Monsters Recut',
'Paranautical Activity',
"The Peanuts Movie: Snoopy's Grand Adventure", 'Peg Solitaire',
'Penguins of Madagascar', 'Pentapuzzle', "Percy's Predicament",
'Perpetual Blast', 'The Perplexing Orb', 'Petite Zombies',
'Phineas and Ferb: Quest for Cool Stuff', 'Piano Teacher',
'Pic-a-Pix Color', 'PictoParty',
'Pier Solar and the Great Architects', 'Pikmin 3', 'Pinball',
'The Pinball Arcade', 'Pinball Breakout', 'Ping 1.5+',
'Pirate Pop Plus', 'Pixel Slime U', 'PixelJunk Monsters',
'PixlCross', 'Placards', 'Plantera', 'Plenty of Fishies',
'Pokémon Rumble U', 'Poker Dice Solitaire Future',
'Pokkén Tournament', 'Poncho',
'Preston Sterling and the Legend of Excalibur', 'Prism Pets',
'Psibo', 'Psyscrolr', 'Puddle', 'Pumped BMX', 'Pure Chess',
'Pushmo World', 'Puyo Puyo Tetris', 'Puzzle Monkeys',
'Quadcopter Pilot Challenge', "Q.U.B.E.: Director's Cut",
'Queens Garden', 'Quest of Dungeons', 'The Quiet Collection',
'Rabbids Land', 'Race the Sun', 'Radiantflux: Hyperfractal',
'Rainbow Snake', 'Rakoo & Friends', 'Rapala Pro Bass Fishing',
'Rayman Legends', 'Rayman Legends Challenges', 'Red Riding Hood',
'Regina & Mac', 'Replay: VHS is Not Dead', 'Reptilian Rebellion',
'Resident Evil: Revelations', 'Retro Road Rumble', 'Revenant Saga',
'Rise of the Guardians: The Video Game',
'The Rivers of Alice: Extended Version',
"Rock 'N Racing Grand Prix", "Rock 'N Racing Off Road",
"Rock 'N Racing Off Road DX", 'Rock Zombie',
'Rodea the Sky Soldier', 'Romance of the Three Kingdoms 12',
'Rorrim', 'Roving Rogue', 'RTO', 'RTO 2', "Rubik's Cube",
'Run Run and Die', 'Runbow', 'Rush',
"Rynn's Adventure: Trouble in the Enchanted Forest",
'Ryū ga Gotoku 1 & 2: HD Edition', 'Sanatory Hallways',
'Santa Factory', 'Schlag den Star: Das Spiel',
'Scoop! Around the World in 80 Spaces',
'Scram Kitty and His Buddy on Rails', 'Scribble',
'Scribblenauts Unlimited',
'Scribblenauts Unmasked: A DC Comics Adventure',
'Secret Files: Tunguska', 'Severed', 'Shadow Archer',
'Shadow Archery', 'Shadow Puppeteer', 'Shakedown: Hawaii',
"Shantae and the Pirate's Curse", 'Shantae: Half-Genie Hero',
"Shantae: Risky's Revenge - Director's Cut", 'Shapes of Gray',
'Shiftlings', 'Shiny the Firefly', 'SHMUP Collection',
'Shooting Range by Thornbury Software', 'Shoot the Ball',
'Shooty Space', 'Shovel Knight', 'Shut the Box', 'Shütshimi',
'Shuttle Rush', 'Sing Party', 'Sinister Assistant',
'Six Sides of the World', 'Skeasy', 'Sketch Wars', 'Skorb',
"Skunky B's Super Slots Saga #1", 'Sky Force Anniversary',
'Skylanders: Giants', 'Skylanders: Imaginators',
"Skylanders: Spyro's Adventure", 'Skylanders: SuperChargers',
'Skylanders: Swap Force', 'Skylanders: Trap Team',
'Slender: The Arrival', "Slots: Pharaoh's Riches",
'Smart Adventures Mission Math: Sabotage at the Space Station',
'The Smurfs 2', 'Snake Den', 'Sniper Elite V2', 'Snowball',
'Solitaire', 'Solitaire Dungeon Escape',
'Sonic & All-Stars Racing Transformed',
'Sonic Boom: Rise of Lyric', 'Sonic Lost World', 'Soon Shine',
'Soul Axiom', 'Space Hulk', 'Space Hunted',
'Space Hunted: The Lost Levels', 'Space Intervention',
'SpaceRoads', "Spellcaster's Assistant", 'Sphere Slice',
'SphereZor', 'Spheroids', 'Spikey Walls',
"Spin the Bottle: Bumpie's Party", 'Splashy Duck', 'Splatoon',
"SpongeBob SquarePants: Plankton's Robotic Revenge", 'Sportsball',
'Spot the Differences! Party', 'Spy Chameleon', 'Squids Odyssey',
'Star Fox Guard', 'Star Fox Zero', 'Star Ghost', 'Star Sky',
'Star Sky 2', 'Star Splash: Shattered Star', 'Star Wars Pinball',
'Starwhal', 'Stealth Inc. 2: A Game of Clones', 'SteamWorld Dig',
'SteamWorld Heist', 'Steel Lords', 'Steel Rivals',
'Stick It to the Man!', 'Stone Shire', 'The Stonecutter',
'Sudoku and Permudoku', 'Sudoku Party', 'Super Destronaut',
'Super Destronaut 2: Go Duck Yourself', 'Super Hero Math',
'Super Mario 3D World', 'Super Mario Maker', 'Super Meat Boy',
'Super Robo Mouse', 'Super Smash Bros. for Wii U',
'Super Toy Cars', 'Super Ultra Star Shooter',
"Surfin' Sam: Attack of the Aqualites",
'Suspension Railroad Simulator', 'Swap Blocks', 'Swap Fire',
'The Swapper', 'Sweetest Thing', 'The Swindle',
'Swords & Soldiers', 'Swords & Soldiers II', 'Tabletop Gallery',
'Tachyon Project', 'Tadpole Treble',
'Taiko no Tatsujin: Atsumete ☆ Tomodachi Daisakusen!',
'Taiko no Tatsujin: Tokumori!', 'Taiko no Tatsujin: Wii U Version',
'Tallowmere', 'Tank SP', 'Tank! Tank! Tank!', 'Tap Tap Arcade',
'Tap Tap Arcade 2', 'Tekken Tag Tournament 2: Wii U Edition',
'Temple of Yog', 'Tengami', 'Terraria', 'Teslagrad', 'Teslapunk',
'Test Your Mind', 'Tested with Robots!', 'Tetrobot and Co.',
'Tetraminos', 'The First Skunk Bundle', 'Thomas Was Alone',
'Tilelicious: Delicious Tiles', 'Tiny Galaxy!', 'Tiny Thief',
'Titans Tower', 'TNT Racers: Nitro Machines Edition',
'Toby: The Secret Mine', 'Togabito no Senritsu', 'Toki Tori',
'Toki Tori 2+', 'Tomeling in Trouble', 'Tokyo Mirage Sessions ♯FE',
"Tom Clancy's Splinter Cell: Blacklist", 'Toon Tanks', 'Toon War',
'TorqueL', 'Toss n Go', 'Totem Topple', 'Toto Temple Deluxe',
'Touch Battle Tank SP', 'Touch Selections',
'Transformers Prime: The Game',
'Transformers: Rise of the Dark Spark', 'Trine: Enchanted Edition',
"Trine 2: Director's Cut", 'Tri-Strip', 'Triple Breakout',
'Tumblestone', 'Turbo: Super Stunt Squad', 'Turtle Tale',
'Twin Robots', 'Twisted Fusion', 'Typoman', 'U Host', 'Ultratron',
'Unalive', 'Underground', 'Unepic', 'Use Your Words', 'uWordsmith',
'Vaccine', 'Vector Assault', 'Vektor Wars',
'The Voice: I Want You', 'Volcanic Field 2', 'Volgarr the Viking',
'VRog', 'The Walking Dead: Survival Instinct', 'Wall Ball',
'Warriors Orochi 3 Hyper', 'Watch Dogs', 'Wheel of Fortune',
'Whispering Willows', 'Wicked Monsters Blast! HD Plus',
'Wii Fit U', 'Wii Karaoke U', 'Wii Party U', 'Wii Sports Club',
'Wind-up Knight 2', 'Wings of Magloryx', 'WinKings', 'Wipeout 3',
'Wipeout: Create & Crash', 'Woah Dave!', 'The Wonderful 101',
"Wooden Sen'SeY", 'Word Party', 'Word Logic by Powgi',
'Word Puzzles by Powgi', 'Word Search by Powgi',
'WordsUp! Academy', 'A World of Keflings', 'Xavier',
'Xenoblade Chronicles X', 'Xeodrifter', 'XType Plus', 'Y.A.S.G',
'Yakuman Hō-ō', 'Year Walk',
'Yo-kai Watch Dance: Just Dance Special Version',
"Yoshi's Woolly World", 'Your Shape: Fitness Evolved 2013',
"ZaciSa's Last Stand", 'Zen Pinball 2', 'Ziggurat', 'Zombeer',
'Zombie Brigade: No Brain No Gain', 'Zombie Defense', 'ZombiU',
'Zumba Fitness: World Party'], dtype=object)
Here is the function that finds the closest title:
def find_word(word,words):
    t=[]
    count=0
    if word[-1]==' ':
        word=word[:-1]
    for i in words:
        if word.lower() in i.lower():
            t.append([len(word)/len(i),count])
        else:
            t.append([0,count])
        count+=1
    t.sort(reverse=True)
    return words[t[0][1]]
Here is my recommendation system function.
def recommendationSystembyEuclideanDistance(data,game,number):
    df=pd.DataFrame()
    data.drop_duplicates(inplace=True)
    games=data['Title'].values
    best=find_word(game,games)
    print('The game closest to your search is :',best)
    genre=data[data['Title']==best]['Genre(s)'].values[0]
    df=data[data['Genre(s)']==genre]
    x=df[df['Title']==best].drop(columns=['Genre(s)','Title']).values
    if len(x)>1:
        x=x[1]
    games_names=df['Title'].values
    df.drop(columns=['Genre(s)','Title'],inplace=True)
    df=df.fillna(df.mean())
    p=[]
    count=0
    for i in df.values:
        p.append([distance.euclidean(x,i),count])
        count+=1
    p.sort()
    for i in range(1,number+1):
        print(games_names[p[i][1]])
Here is the main part
search_game=input('Please enter The name of the wiigame :')
number=int(input('Please enter the number of recommendations you want: '))
recommendationSystembyEuclideanDistance(wii_df,search_game,number)
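The issue is that you are applying Euclidean distance to strings: as the traceback shows, scipy computes u - v, and subtraction is not defined for str values. To compare strings, consider Levenshtein (edit) distance or something similar, which is designed for them. NLTK provides this as nltk.edit_distance, or you can implement it on your own. If you would rather keep Euclidean distance, the rows passed to it have to contain numeric columns only.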
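As a rough illustration of the "implement it on your own" route, here is a minimal sketch of Levenshtein distance using the standard two-row dynamic programming approach. The helper closest_titles and the short titles list are hypothetical, added only to show how the distance could rank game titles; they do not come from the asker's project.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions or substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def closest_titles(query, titles, number):
    """Hypothetical helper: the `number` titles with the smallest edit distance."""
    ranked = sorted(titles, key=lambda t: levenshtein(query.lower(), t.lower()))
    return ranked[:number]


titles = ['Bayonetta', 'Bayonetta 2', 'Mario Kart 8', 'Splatoon']
matches = closest_titles('bayoneta', titles, 2)
print(matches)
```

Edit distance only measures how similar the spellings are, so it suits the title search step. It says nothing about genre or any other attribute of a game.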
Say I have a DataFrame of 2000 rows in which the brand column has 142 unique values. I want to count the frequency of every unique value, from 1 to 142, and the counts should update dynamically when the data changes.
brand=clothes_z.brand_name
brand.describe(include="all")
unique_brand=brand.unique()
brand.describe(include="all"),unique_brand
Output:
(count 2613
unique 142
top Mango
freq 54
Name: brand_name, dtype: object,
array(['Jack & Jones', 'TOM TAILOR DENIM', 'YOURTURN', 'Tommy Jeans',
'Alessandro Zavetti', 'adidas Originals', 'Volcom', 'Pier One',
'Superdry', 'G-Star', 'SIKSILK', 'Tommy Hilfiger', 'Karl Kani',
'Alpha Industries', 'Farah', 'Nike Sportswear',
'Calvin Klein Jeans', 'Champion', 'Hollister Co.', 'PULL&BEAR',
'Nike Performance', 'Even&Odd', 'Stradivarius', 'Mango',
'Champion Reverse Weave', 'Massimo Dutti', 'Selected Femme Petite',
'NAF NAF', 'YAS', 'New Look', 'Missguided', 'Miss Selfridge',
'Topshop', 'Miss Selfridge Petite', 'Guess', 'Esprit Collection',
'Vero Moda', 'ONLY Petite', 'Selected Femme', 'ONLY', 'Dr.Denim',
'Bershka', 'Vero Moda Petite', 'PULL & BEAR', 'New Look Petite',
'JDY', 'Even & Odd', 'Vila', 'Lacoste', 'PS Paul Smith',
'Redefined Rebel', 'Selected Homme', 'BOSS', 'Brave Soul', 'Mind',
'Scotch & Soda', 'Only & Sons', 'The North Face',
'Polo Ralph Lauren', 'Gym King', 'Selected Woman', 'Rich & Royal',
'Rooms', 'Glamorous', 'Club L London', 'Zalando Essentials',
'edc by Esprit', 'OYSHO', 'Oasis', 'Gina Tricot',
'Glamorous Petite', 'Cortefiel', 'Missguided Petite',
'Missguided Tall', 'River Island', 'INDICODE JEANS',
'Kings Will Dream', 'Topman', 'Esprit', 'Diesel', 'Key Largo',
'Mennace', 'Lee', "Levi's®", 'adidas Performance', 'jordan',
'Jack & Jones PREMIUM', 'They', 'Springfield', 'Benetton', 'Fila',
'Replay', 'Original Penguin', 'Kronstadt', 'Vans', 'Jordan',
'Apart', 'New look', 'River island', 'Freequent', 'Mads Nørgaard',
'4th & Reckless', 'Morgan', 'Honey punch', 'Anna Field Petite',
'Noisy may', 'Pepe Jeans', 'Mavi', 'mint & berry', 'KIOMI', 'mbyM',
'Escada Sport', 'Lost Ink', 'More & More', 'Coffee', 'GANT',
'TWINTIP', 'MAMALICIOUS', 'Noisy May', 'Pieces', 'Rest',
'Anna Field', 'Pinko', 'Forever New', 'ICHI', 'Seafolly', 'Object',
'Freya', 'Wrangler', 'Cream', 'LTB', 'G-star', 'Dorothy Perkins',
'Carhartt WIP', 'Betty & Co', 'GAP', 'ONLY Tall', 'Next', 'HUGO',
'Violet by Mango', 'WEEKEND MaxMara', 'French Connection'],
dtype=object))
describe() shows only the frequency of Mango (54) because it is the most frequent value. I want the frequency of every value, such as Jack & Jones, TOM TAILOR DENIM, YOURTURN and so on, and the values should change dynamically.
You could simply do:
clothes_z.brand_name.value_counts()
This lists the unique values and gives the frequency of each one in that pandas Series, sorted from most to least frequent.
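Alternatively, you can count the values with collections.Counter:
import pandas as pd
from collections import Counter
ll = clothes_z.brand_name.tolist()  # your list of brands
c = Counter(ll)
# you can do whatever you want with your counted values
df = pd.DataFrame.from_dict(c, orient='index', columns=['counted'])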
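To show what the counted values look like, here is a small sketch of the Counter approach on its own, without pandas. The brands list is a made-up sample standing in for the real column. Re-running the count after the data changes gives the updated frequencies.

```python
from collections import Counter

# Hypothetical sample standing in for the real brand_name column
brands = ['Mango', 'Jack & Jones', 'Mango', 'YOURTURN', 'Jack & Jones', 'Mango']

counts = Counter(brands)

# most_common() yields (value, frequency) pairs, highest frequency first
for brand, freq in counts.most_common():
    print(brand, freq)
```

A Counter returns 0 for a value it has never seen, so looking up a brand that is absent from the data does not raise an error.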