Basically, I have a long list of words (provided below) that I want to organize using Python. The problem is that the words aren't separated by commas; they are separated by line breaks, and there are about 200 of them. Backspacing twice and adding a comma after each word seems tedious, and I'm sure there's some way to automate this in Python. However, I'm a beginner and can't really think of a method.
If possible, I'm looking for someone to point me in the right direction to solving this, because I really want to figure it out myself (for the most part, lol).
I want it to look like so:
[Adventurous, Aggressive, Agreeable, Alert, Alive, Amused]
(and so on)
This is how the list of words comes out when I copy/paste it:
adorable
adventurous
aggressive
agreeable
alert
alive
amused
angry
annoyed
annoying
anxious
arrogant
ashamed
attractive
average
awful
bad
beautiful
better
bewildered
black
bloody
blue
blue-eyed
blushing
bored
brainy
brave
breakable
bright
busy
calm
careful
cautious
charming
cheerful
clean
clear
clever
cloudy
clumsy
colorful
combative
comfortable
concerned
condemned
confused
cooperative
courageous
crazy
creepy
crowded
cruel
curious
cute
dangerous
dark
dead
defeated
defiant
delightful
depressed
determined
different
difficult
disgusted
distinct
disturbed
dizzy
doubtful
drab
dull
You can use the tkinter module to get the copied text from the clipboard, split the text on the newline character \n, and finally filter out any item that is just an empty string.
import tkinter as tk
root = tk.Tk()
text = root.clipboard_get()
list(filter(lambda x: x != '', text.split('\n')))
OUTPUT:
['adventurous', 'aggressive', 'agreeable', 'alert', 'alive', 'amused', 'angry', 'annoyed', 'annoying', 'anxious', 'arrogant', 'ashamed', 'attractive', 'average', 'awful', 'bad', 'beautiful', 'better', 'bewildered', 'black', 'bloody', 'blue', 'blue-eyed', 'blushing', 'bored', 'brainy', 'brave', 'breakable', 'bright', 'busy', 'calm', 'careful', 'cautious', 'charming', 'cheerful', 'clean', 'clear', 'clever', 'cloudy', 'clumsy', 'colorful', 'combative', 'comfortable', 'concerned', 'condemned', 'confused', 'cooperative', 'courageous', 'crazy', 'creepy', 'crowded', 'cruel', 'curious', 'cute', 'dangerous', 'dark', 'dead', 'defeated', 'defiant', 'delightful', 'depressed', 'determined', 'different', 'difficult', 'disgusted', 'distinct', 'disturbed', 'dizzy', 'doubtful', 'drab', 'dull']
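If the words are already in a string or a file rather than on the clipboard, the same idea works without tkinter. This is only a minimal sketch: the variable name raw and the three sample words are placeholders for illustration, and it also capitalizes each word to match the desired output in the question.

```python
# Placeholder input: in practice this could be the pasted text or f.read() from a file
raw = """adorable
adventurous

aggressive
"""

# splitlines() breaks on line endings; the condition skips blank lines
words = [line.strip().capitalize() for line in raw.splitlines() if line.strip()]
print(words)
```

A list comprehension is used here instead of filter() with a lambda; both do the same job, so pick whichever reads better to you.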
I have a file I'd like to parse to json. First item looks as follows:
{'name': 'Ravikant P.', 'username': '#exp9993', 'url': 'https://www.freelancer.com/u/exp9993', 'title': 'ASP.NET/ Graphic Design/ Web Design/WordPress/PHP', 'city': 'Indore, India', 'stars_num_reviews': ['#exp9993', '5.0', '(178 reviews)'], 'rate': '$15 USD / hour', 'reputation': ['99%', 'Jobs Completed', '87%', 'On Budget', '94%', 'On Time', '19%', 'Repeat Hire Rate'], 'review_ratings': ['5.0', '5.0', '5.0', '5.0', '5.0'], 'review_clean_project_values': ['€50.00 EUR', '$600.00 USD', '$1,300.00 AUD', '$30.00 USD', '$50.00 USD'], 'review_project_titles': ['Wordpress - Woocommerce Small Function', 'PayDash Frontend', 'Improve current website', 'Project for Ravikant P.', 'Build a page 1:1 copy from Figma design file'], "review_project_descriptions": ['Wordpress - Woocommerce Small Function', 'Very fast and efficient! Made my request in less than 2-3 hours and exactly what i asked for.\nI recommend and will come back for sure! :)', 'PayDash Frontend', 'Delivered a high value project very quickly, always helping me fix any bugs there are in the code. Very great developer and team to work with!', 'Improve current website', 'Amazing job, and nothing was too much trouble. Highly recommend!!', 'Project for Ravikant P.', 'I requested the project from him, he completed it within 30 minutes. Super quick delivery, I will definitely be using him in the future for all of my frontend web development projects.', 'Build a page 1:1 copy from Figma design file', "He did the work very quickly and was super precise about all the little details. He's the perfect person I was looking for, someone who was able to create a page for me while paying attention to all… Read more"], 'experience_title': 'N/A', 'education_title': 'bachelor of engineering', 'education_time': '2011 - 2015', 'description': 'We can make anything you want, if you can describe exactly what you want, will get the exact service until full satisfaction. 
My interest lies in designing new material & do believe we are creative & can handle complications, I am always eager to learn new things & by listening carefully & asking the right questions can get to the core of the conversation quickly. We are a Team of 9+ experienced professionals who work closely with the clients, understand their requirements, offer suggestions, and implement ideas into reality. We always think beyond the boundaries and provide user-friendly as well as high quality IT services to our customers at a very reasonable price. Our team is always dedicative to innovate from high-end E-Commerce website development to the simplest logo design needs… Read more', 'profile_skills': {'Website Design': '138', 'HTML': '129', 'Graphic Design': '115', 'PHP': '99', 'WordPress': '62', 'User Experience Design': '53', 'CSS': '26', 'User Interface / IA': '25', 'Photoshop': '17', 'C# Programming': '13', 'ASP.NET': '12', '.NET': '11', 'eCommerce': '9', 'Microsoft SQL Server': '8', 'Logo Design': '8', 'PSD to HTML': '6', 'WooCommerce': '6'}}
In order to be valid json, all single quotes should be converted to double quotes. However, replacing the single quotes inside any [] truncates the json. I've done some extensive searching and have not seen a way to use regex to match a text NOT contained in quotes and then change that text. Any ideas on how I would be able to parse this to json? I've included items 2 and 3 below. Any help would be so greatly appreciated! I've been stuck on this for a few days and I'm not sure what else to try...
{'name': 'Artur', 'username': '#Appswebandroid', 'url': 'https://www.freelancer.com/u/Appswebandroid', 'title': '♛Google Certified Digital Marketing - Grow Sales♛', 'city': 'Mafra, Portugal', 'stars_num_reviews': ['#Appswebandroid', '5.0', '(71 reviews)'], 'rate': '$15 USD / hour', 'reputation': ['100%', 'Jobs Completed', '94%', 'On Budget', '97%', 'On Time', '17%', 'Repeat Hire Rate'], 'review_ratings': ['5.0', '5.0', '5.0', '5.0', '5.0'], 'review_clean_project_values': ['$10.00 USD', '$10.00 USD', '$200.00 USD', '$100.00 USD', '$196.00 USD'], 'review_project_titles': ['woodpress seo expert', 'semana adwords', 'Project for Artur', 'experto en adwords de preferencia en español', 'Google ADS expert for make a campaign today!'], "review_project_descriptions'": ['woodpress seo expert', 'Thank you', 'semana adwords', "Artur helped me setup my campaign, but he was also taking care of it during the month that we were working. I couldn't have done it myself. He helped me channel all my money to the right… Read more", 'Project for Artur', 'Great Freelancer, very pro- and reactive. Only recommend. Thank you, Artur!', 'experto en adwords de preferencia en español', 'Artur is a master of adwords and he is really helpfull. Would work with him again. Thanks :)', 'Google ADS expert for make a campaign today!', 'Good work, is an expert in google ads! Fully recommend!'], 'experience_title': 'Computer Technician - Network Installation and Management', 'experience_time': 'Feb 2018 - Jun 2020 (2 years, 4 months)', 'education_title': 'Work optimization Business', 'education_time': 'N/A', 'description': 'I have a degree in Computer Technician Installation and Network Management. I am available to help you publicize your business to get more customers and more sales and solve web problems. I am ready to exceed my limits to satisfy every customer. Google Certified Digital Marketing Expert Increase your organic traffic and the average position of your keywords with SEO method. 
Qualified traffic in Google Adwords, Facebook, Instagram, Linkedin campaigns. Increase your sales for your company, in ecommerce or physical companies We place your site on the first page of search engines. I am a multi-site webmaster, an SEO specialist with a passion for search engines, taking your site or business to the top of Google search, following best practices. ✅My skills: *✔️ Google Merchant Center*✔️Facebook Ads*✔️Google Ads*✔️Instagram Ads✔️Pinterest Ads*✔️SEO*✔️WordPress*✔️Website Optimization*✔️eCommerce*✔️ Facebook Marketing**✔️Social Media… Read more', 'profile_skills': {'Internet Marketing': '35', 'Google Adwords': '27', 'Facebook Marketing': '23', 'Marketing': '22', 'Advertising': '20', 'SEO': '14', 'Social Media Marketing': '12', 'Android': '10', 'PHP': '8', 'Mobile App Development': '7', 'Google Adsense': '6', 'Website Design': '6', 'Linux': '5', 'Graphic Design': '5', 'eCommerce': '5', 'Google Analytics': '4', 'Prestashop': '4'}}
{'name': 'Usman N.', 'username': '#futivetechnet', 'url': 'https://www.freelancer.com/u/futivetechnet', 'title': '3D/2D Design/Animation-SMM-Unity Game Development', 'city': 'Lahore, Pakistan', 'stars_num_reviews': ['#futivetechnet', '4.9', '(139 reviews)'], 'rate': '$25 USD / hour', 'reputation': ['96%', 'Jobs Completed', '96%', 'On Budget', '93%', 'On Time', '12%', 'Repeat Hire Rate'], 'review_ratings': ['5.0', '5.0', '5.0', '5.0', '5.0'], 'review_clean_project_values': ['$12,400.00 USD', '€200.00 EUR', '$150.00 USD', '€4,700.00 EUR', '•'], 'review_project_titles': ['Online Easter crepes hunt game (for kids 5-10)', 'Project for Usman N.', '12 Social media posts for Digital marketing and app development company', 'Mobile game similar to Flappy Bird (side-scroller)', 'Make character animation for 2D game'], "review_project_descriptions'": ['Online Easter crepes hunt game (for kids 5-10)', 'It was a pleasure working with Usman. They made an amazing job!', 'Project for Usman N.', "Great quality of work! It's my second project with Usman, and both were successful. Looking forward to hire him again.", '12 Social media posts for Digital marketing and app development company', 'Usman delivered the posts as per the requirements and he did multiple revisions as per our requirements', 'Mobile game similar to Flappy Bird (side-scroller)', 'Great communication, great project management and great work!', 'Make character animation for 2D game', 'Great work! Usman understands all requirements with minimum clarifications. Result is very good.'], 'experience_title': '3D/2D Design/Animation-SMM-Unity Game Development', 'experience_time': 'Mar 2010 - Present', 'education_title': 'N/A', 'education_time': 'N/A', 'description': 'We are an all in one IT services provider in this region. A team of 70+ skilled & certified professional developers, designers, video editors, 3D modellers & animators, project managers and quality assurance individuals. 
Our team is dedicated in providing the work to our clients, meeting the highest quality standards, client specifications & timely delivery of services. Mobile | PC Games Development: - Experienced in Unity3D, iOS Swift, Cocos2D, Buildbox, including team for UI/UX designs - Experienced in Action, Simulation, 2D platformers, multiplayer & arcade games 3D | 2D Works, CGI Graphics: - 3D Modelling, Rigging, Rendering & Animation - Expertise in Autodesk Maya, 3DS Max, Blender Studio, DAZ Studio, Adobe Flash & After Effects… Read more', 'profile_skills': {'Animation': '64', '3D Animation': '64', 'Mobile App Development': '53', 'Game Development': '52', 'Graphic Design': '51', 'Game Design': '47', 'Unity 3D': '26', 'Video Services': '26', 'After Effects': '26', '3D Modelling': '24', 'Android': '20', '3D Rendering': '20', 'Photoshop': '18', '3D Design': '15', 'PHP': '14', 'Video Editing': '13', 'Website Design': '13'}}
As @CharlesDuffy says, you can use ast.literal_eval().
You can read the content directly from your file:
import ast

# file.txt contains three lines, one for each of the OP's strings
with open('file.txt') as f:
    dlist = [ast.literal_eval(s) for s in f.read().splitlines()]
>>> [len(d) for d in dlist]
[17, 18, 18]
Note that the string you provided contains a literal backslash followed by n (r'\n'):
"""...and exactly what i asked for.\nI recommend..."""
So if you paste the string, surrounded by """, into an interpreter, the \n turns into a real line break in the middle of a quoted string, and ast.literal_eval() chokes on the input.
If you replace it, then all is well:
dct = ast.literal_eval(s.replace('\n', r'\n'))
>>> len(dct)
17
But this is unnecessary when you read from the file.
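Since the goal was JSON, the parsed dict can then be handed to json.dumps(). Here is a rough sketch, assuming a shortened made-up record in the same style as the question's data (the 'note' key is invented for illustration and is not in the original file):

```python
import ast
import json

# Shortened, hypothetical record mixing single- and double-quoted strings,
# similar in style to the lines in the question
line = "{'name': 'Ravikant P.', 'stars_num_reviews': ['#exp9993', '5.0'], \"note\": \"He's quick\"}"

record = ast.literal_eval(line)   # evaluates Python literals only, no arbitrary code
as_json = json.dumps(record)      # emits valid JSON with double-quoted strings

# Round-trip to confirm nothing was lost
round_trip = json.loads(as_json)
print(as_json)
```

This sidesteps the quote-replacing regex entirely, because the apostrophes inside the strings are handled by the Python parser rather than by text substitution.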
I'm trying to tokenize the text below, using the stopwords ('is', 'the', 'was') as delimiters.
The expected output is this:
['Walter',
'feeling anxious',
'He',
'diagnosed today',
'He probably',
'best person I know']
This is the code I'm using to try to produce the above output:
import nltk
stopwords = ['is', 'the', 'was']
sents = nltk.sent_tokenize("Walter was feeling anxious. He was diagnosed today. He probably is the best person I know.")
sents_rm_stopwords = []
for sent in sents:
    sents_rm_stopwords.append(' '.join(w for w in nltk.word_tokenize(sent) if w not in stopwords))
My code's output is this:
['Walter feeling anxious .',
'He diagnosed today .',
'He probably best person I know .']
How can I produce the expected output?
So the problem involves both stopwords and sentence delimiters. Assuming a sentence ends with the symbol ., you can split on all of these delimiters at once with re.split().
import re
s = "Walter was feeling anxious. He was diagnosed today. He probably is the best person I know."
result = re.split(r" was | is | the |\. |\.", s)
result
>>
['Walter',
'feeling anxious',
'He',
'diagnosed today',
'He probably',
'the best person I know',
'']
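Because the pattern matches both a single . and a . followed by whitespace, the split returns an additional '' at the end. Assuming this sentence structure is consistent, you can slice the result to drop it. Note that 'the' remains in the last item: the space before it was already consumed by the preceding ' is ' match, so ' the ' cannot match there.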
result[:-1]
>>
['Walter',
'feeling anxious',
'He',
'diagnosed today',
'He probably',
'the best person I know']
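To also drop a stopword that directly follows another one (the 'is the' case), one option is to treat a run of consecutive stopwords as a single delimiter. The following is just a sketch under that assumption; the names stop_alternation, pattern and chunks are made up for illustration.

```python
import re

stopwords = ['is', 'the', 'was']
s = "Walter was feeling anxious. He was diagnosed today. He probably is the best person I know."

# A run of one or more whole-word stopwords (with surrounding spaces),
# or a period (with surrounding spaces), counts as a single delimiter.
stop_alternation = '|'.join(re.escape(w) for w in stopwords)
pattern = re.compile(r'(?:\s*\b(?:%s)\b\s*)+|\s*\.\s*' % stop_alternation)

# Filter out the empty string left behind by the final period
chunks = [chunk for chunk in pattern.split(s) if chunk]
print(chunks)
# ['Walter', 'feeling anxious', 'He', 'diagnosed today', 'He probably', 'best person I know']
```

The \b word boundaries keep the pattern from splitting inside longer words such as 'this' or 'there', and filtering empty strings replaces the result[:-1] slice.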
Hello, I have a list as follows:
['2925729', 'Patrick did not shake our hands nor ask our names. He greeted us promptly and politely, but it seemed routine.'].
My goal is a result as follows:
['2925729','Patrick did not shake our hands nor ask our names'], ['2925729', 'He greeted us promptly and politely, but it seemed routine.']
Any pointers would be very much appreciated.
>>> t = ['2925729', 'Patrick did not shake our hands nor ask our names. He greeted us promptly and politely, but it seemed routine.']
>>> [ [t[0], a + '.'] for a in t[1].rstrip('.').split('.')]
[['2925729', 'Patrick did not shake our hands nor ask our names.'], ['2925729', ' He greeted us promptly and politely, but it seemed routine.']]
If you have a large dataset and want to conserve memory, you may want to create a generator instead of a list:
g = ( [t[0], a + '.'] for a in t[1].rstrip('.').split('.') )
for key, sentence in g:
    pass  # do processing here
Generators do not create lists all at once. They create each element as you access it. This is only helpful if you don't need the whole list at once.
ADDENDUM: You asked about making dictionaries if you have multiple keys:
>>> data = ['1', 'I think. I am.'], ['2', 'I came. I saw. I conquered.']
>>> dict([ [t[0], t[1].rstrip('.').split('.')] for t in data ])
{'1': ['I think', ' I am'], '2': ['I came', ' I saw', ' I conquered']}
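The outputs above keep the leading space on every sentence after the first (' He greeted...', ' I am'). If that matters, a small variation can strip it. This is only a sketch; the helper name split_sentences is invented here, and it assumes every sentence ends with a plain period.

```python
def split_sentences(record):
    """Turn [key, 'A. B.'] into [[key, 'A.'], [key, 'B.']]."""
    key, text = record
    # strip() removes the space left after each period; empty pieces are skipped
    parts = [part.strip() for part in text.split('.')]
    return [[key, part + '.'] for part in parts if part]


t = ['2925729', 'Patrick did not shake our hands nor ask our names. He greeted us promptly and politely, but it seemed routine.']
pairs = split_sentences(t)
for key, sentence in pairs:
    print(key, sentence)
```

Skipping empty pieces also makes the rstrip('.') call unnecessary, since the empty string after the final period is simply dropped.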
Noob here trying to learn python by doing a project as I don't learn well from books.
I am using a huge lump of code to perform what seems to me to be a small operation -
I want to extract 4 variables from the following string
'Miami 0, New England 28'
(variables being home_team, away_team, home_score, away_score)
My program is running pretty slowly and I think it might be this bit of code. I guess I am looking for the quickest/most efficient way of doing this.
Would regex be quicker? Thanks
It seems like your text could be split twice. First on , and next on whitespace:
info1,info2 = s.split(',')
home,home_score = info1.rsplit(None,1)
away,away_score = info2.rsplit(None,1)
e.g.:
>>> s = 'Miami 0, New England 28'
>>> info1,info2 = s.split(',')
>>> home,home_score = info1.rsplit(None,1)
>>> away,away_score = info2.rsplit(None,1)
>>> print([home, home_score, away, away_score])
['Miami', '0', ' New England', '28']
You could do this with regex without too much difficulty -- but you pay for it in terms of readability.
In case you do want a regex:
import re
s = 'Miami 0, New England 28'
l = re.findall(r'^([^\d]+)\s(\d+)\s*,\s*([^\d]+)\s(\d+)', s)
# the groups come back in the order: home team, home score, away team, away score
hm_team, hm_score, away_team, away_score = l[0]
print(l)
Prints [('Miami', '0', 'New England', '28')] and assigns those values to the variables.
import re
reg = re.compile(r'\s*(\D+?)\s*(\d+)'
                 r'[,;:.#=#\s]*'
                 r'(\D+?)\s*(\d+)'
                 r'\s*')
for s in ('Miami 0, New England 28',
          'Miami0,New England28 ',
          ' Miami 0 . New England28',
          'Miami 0 ; New England 28',
          'Miami0#New England28 ',
          ' Miami 0 # New England28'):
    print(reg.search(s).groups())
result
('Miami', '0', 'New England', '28')
('Miami', '0', 'New England', '28')
('Miami', '0', 'New England', '28')
('Miami', '0', 'New England', '28')
('Miami', '0', 'New England', '28')
('Miami', '0', 'New England', '28')
'\D' matches any character that is not a digit.
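As for whether regex would be quicker, the safest way to find out is to time both approaches on your own data. Below is a rough sketch using timeit; the function names, the simplified comma-only pattern and the repeat count are all assumptions made for illustration, and the numbers will vary by machine. The split version also strips each half first, so 'New England' comes back without a leading space.

```python
import re
import timeit

# Simplified pattern for the comma-separated form only, compiled once up front
SCORE_RE = re.compile(r'\s*(\D+?)\s*(\d+)\s*,\s*(\D+?)\s*(\d+)\s*')


def parse_with_split(line):
    home_part, away_part = line.split(',')
    home, home_score = home_part.strip().rsplit(None, 1)
    away, away_score = away_part.strip().rsplit(None, 1)
    return home, home_score, away, away_score


def parse_with_regex(line):
    return SCORE_RE.match(line).groups()


sample = 'Miami 0, New England 28'
split_seconds = timeit.timeit(lambda: parse_with_split(sample), number=10000)
regex_seconds = timeit.timeit(lambda: parse_with_regex(sample), number=10000)
print(f'split: {split_seconds:.4f}s  regex: {regex_seconds:.4f}s')
```

Either way, parsing one short string is cheap, so if the program as a whole is slow it is worth profiling the rest of it before blaming this step.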