I want to build a simple app that generates random words and their associated definitions from the Urban Dictionary API. I was thinking I could somehow scrape the website, or find a database or .csv file with most of the Urban Dictionary words, and then inject each one into the API as {word}.
I found their unofficial/official API online here: http://api.urbandictionary.com/v0
And more information about it here: https://pub.dev/documentation/urbandictionary/latest/urbandictionary/OfficialUrbanDictionaryClient-class.html
And here: https://pub.dev/documentation/urbandictionary/latest/urbandictionary/UrbanDictionary-class.html
Inside the second pub.dev link there appears to be a built-in function that generates a random list of words from the site. So rather than finding a database or scraping the words, this would be a much better way to create the app. The problem is I don't know how to call that function in my code.
I'm new to APIs and here is my code so far:
import requests
word = "all good in the hood"
response = requests.get(f"http://api.urbandictionary.com/v0/define?term={word}")
print(response.text)
This gives a long JSON/Dictionary in VSCODE. I think I'd be able to expand on this idea if it's possible to access that random function and just get a random word from the list.
Any help is appreciated.
Thanks
Scraping all the words in the Urban Dictionary would take a very long time. You can get a random word from the Urban Dictionary by calling https://api.urbandictionary.com/v0/random
Here's a function that gets a random word from the Urban Dictionary
import requests
def randomword():
    # fetch a batch of random entries and return the raw JSON text
    response = requests.get("https://api.urbandictionary.com/v0/random")
    return response.text
To parse the response, import the json module and call json.loads(response.text). The parsed result is basically a dictionary. Here is code that gets the definition, word, and author of the first definition:
import json
data = json.loads(randomword())  # gets random entries and parses the JSON text
firstdef = data["list"][0]  # gets first definition
author = firstdef["author"]  # author of definition
definition = firstdef["definition"]  # definition of word
word = firstdef["word"]  # word
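The lookups above assume the "list" key is present and non-empty. A minimal sketch of a guarded version follows; the helper name first_entry and the sample payload are made up for illustration, and the payload is assumed to be the dictionary you get after parsing the response:

```python
def first_entry(payload):
    """Return (word, definition, author) for the first entry, or None if there is none."""
    entries = payload.get("list", [])
    if not entries:
        return None
    first = entries[0]
    return first["word"], first["definition"], first["author"]


# Hypothetical sample shaped like the parsed response described above.
sample = {"list": [{"word": "example", "definition": "a made-up entry", "author": "someone"}]}
print(first_entry(sample))
```

Returning None for an empty list lets the caller decide whether to retry the request or skip the word.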
Building on the answer above, here is the approach you need.
import requests
response = requests.get("https://api.urbandictionary.com/v0/random")
entries = response.json()['list']  # parse the JSON once and reuse the list
# print every item in the list
for obj in entries:
    print(obj)
# print the item at index 0
print(entries[0])
# print the word of the item at index 0
print(entries[0]['word'])
The text is in JSON format, so just use the json module to convert it to a dictionary. I also had it give only the definition with the most thumbs up:
import json
import requests
word = "all good in the hood"
response = requests.get(f"http://api.urbandictionary.com/v0/define?term={word}")
dictionary = json.loads(response.text)['list']
most_thumbs = -1
best_definition = ""
for definition in dictionary:
    if definition['thumbs_up'] > most_thumbs:
        most_thumbs = definition['thumbs_up']
        best_definition = definition['definition']
print(f"{word}: {best_definition}")
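The same selection can be written with the built-in max() and a key function. This is only a sketch: best_definition is a hypothetical helper name, and the sample entries are invented to mirror the 'definition' and 'thumbs_up' fields used above:

```python
def best_definition(entries):
    """Return the definition text with the most thumbs up, or None for an empty list."""
    if not entries:
        return None
    top = max(entries, key=lambda entry: entry.get("thumbs_up", 0))
    return top["definition"]


# Invented sample data with the same fields as the entries in the response list.
sample_entries = [
    {"definition": "first meaning", "thumbs_up": 3},
    {"definition": "second meaning", "thumbs_up": 12},
    {"definition": "third meaning", "thumbs_up": 7},
]
print(best_definition(sample_entries))
```

Handling the empty list explicitly matters because max() raises a ValueError on an empty sequence, which can happen when a term has no definitions.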
Related
I have a list of peptide sequence, I want to map it to the correct protein names from any Open Database like Uniprot, i.e., peptides belonging to the proteins. Can someone guide how to find the protein names and map them, thanks in advance.
I'd say your best bet is to use the requests module and hook into the API that Uniprot has on their website. The API for peptide sequence searching is here, and the docs for it link from the same page.
With this, you should be able to form a dict that contains your search parameters and send a request to the API that will return the results you are looking for. The requests module allows you to retrieve the results as json format, which you can very easily parse back into lists/dicts, etc for use in whatever way you wish.
Edit: I have code!
Just for fun, I tried the first part: looking up the proteins using the peptides. This works! You can see how easy the requests module makes this sort of thing :)
There is another API for retrieving the database entries once you have the list of "accessions" from this first step. All of the API end points and docs can be accessed here. I think you want this one.
import requests
from time import sleep
url = 'https://research.bioinformatics.udel.edu/peptidematchws/asyncrest'
#peps can be a comma separated list for multiple peptide sequences
data={'peps':'MKTLLLTLVVVTIVCLDLGYT','lEQi':'off','spOnly':'off'}
headers = {'Content-Type': 'application/x-www-form-urlencoded'}
response = requests.post(url,params=data,headers=headers)
if response.status_code == 202:
    print(f"Search accepted. Results at {response.headers['Location']}")
    search_job = requests.get(response.headers['Location'])
    while search_job.status_code == 303:
        sleep(30)
        search_job = requests.get(response.headers['Location'])
    if search_job.status_code == 200:
        results = search_job.text.split(',')
        print('Results found:')
        print(results)
    else:
        print('No matches found')
else:
    print('Error Search not accepted')
    print(response.status_code, response.reason)
I am requesting a wikipedia page that returns all the text from that website like so:
def my_function(addr):
    response = requests.get(addr)
    print(response.text)
my_function("https://en.wikipedia.org/wiki/Web_scraping")
Right now what I'm trying to do is delete the unwanted parts, basically all text before the element with the id 'See_also'. Is there a right and easy way to do so? I can't just delete a fixed number of lines, since this code is meant to work for different wiki pages.
You can use a regex (hooray).
import requests
import re
def my_function(addr):
    response = requests.get(addr)
    print(re.findall("See_also[\\s\\S]*", response.text))
my_function("https://en.wikipedia.org/wiki/Web_scraping")
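Note that re.findall returns a list, so the result above is a one-element list rather than a string. As an alternative sketch without a regex, plain string search keeps everything from the marker onward; text_from_marker and the sample HTML are made up for illustration:

```python
def text_from_marker(page_text, marker="See_also"):
    """Return page_text from the first occurrence of marker onward, or '' if it is absent."""
    start = page_text.find(marker)
    return page_text[start:] if start != -1 else ""


# Invented snippet standing in for response.text.
sample_html = '<p>intro</p><h2 id="See_also">See also</h2><ul><li>link</li></ul>'
print(text_from_marker(sample_html))
```

For anything beyond cutting at a fixed marker, an HTML parser is usually more robust than string or regex matching.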
So I'm new to Python and am working on a simple program that will read a text file of protein names (PDB IDs) and create a URL to search a database (the PDB) for that protein and some associated data.
Unfortunately, as a newbie, I forgot to save my script, so I can't recall what I did to make my code work!
Below is my code so far:
import urllib
import urllib.parse
import urllib.request
import os
os.chdir("C:\\PythonProjects\\Samudrala Lab Projects")
protein_file = open("protein_list.txt","r")
protein_list = protein_file.read()
for item in protein_list:
    item = item[0:4]
    query_string =urlencode('customReportColumns','averageBFactor','resolution','experimentalTechnique','service=wsfile','format=csv')
    **final_URL = url + '?pdbid={}{}'.format(url, item, query_string)**
    print(final_URL)
The line of code I'm stuck on is starred.
The object "final_url" within the loop is missing some modification to indicate that I'd like the URL to search for the item as a pdbid. Can anyone give me a hint as to how I can tell the URL to plug in each item on the list as a PDBID?
I'm getting a type error indicating that it's not a valid non-string sequence or mapping object. Original post was edited to add this info.
Please let me know if this is an unclear question, or if you need any additional info.
Thanks!
How about something like this?
final_URL = "{}?pdbids={}{}".format(url, item, query_string)
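The type error most likely comes from urlencode, which expects a mapping or a sequence of two-element tuples rather than several separate strings. Also, protein_file.read() returns one string, so looping over it yields single characters rather than lines. Below is a hedged sketch of one way to put it together. The base URL is a placeholder, the helper names are hypothetical, and joining the report columns with commas under a single customReportColumns key is an assumption about what the service expects:

```python
from urllib.parse import urlencode

# Placeholder base URL; substitute the report endpoint you are actually using.
BASE_URL = "https://example.org/pdb/rest/customReport"


def build_report_url(pdb_id, base_url=BASE_URL):
    """Build a report URL for one PDB ID using the query fields from the question."""
    params = {
        "pdbids": pdb_id,
        # Assumption: the columns go into one comma-separated parameter.
        "customReportColumns": "averageBFactor,resolution,experimentalTechnique",
        "service": "wsfile",
        "format": "csv",
    }
    # safe="," keeps the commas literal instead of encoding them as %2C
    return f"{base_url}?{urlencode(params, safe=',')}"


def read_ids(text):
    """Take the first four characters of each non-empty line as a PDB ID."""
    return [line.strip()[:4] for line in text.splitlines() if line.strip()]


# Invented file contents standing in for protein_list.txt.
for pdb_id in read_ids("1ABC\n2XYZ chain A\n"):
    print(build_report_url(pdb_id))
```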
I am practicing my programming skills (in Python) and I realized that I don't know what to do when I need to find a value that is unknown but introduced by a key word. I am taking the information for this off a website where in the page source it says, '"size":"10","stockKeepingUnitId":"(random number)"'
How can I figure out what that number is?
This is what I have so far --
def stock():
    global session
    endpoint = '(website)'
    response = session.get(endpoint)
    soup = bs(response.text, "html.parser")
    sizes = soup.find('"size":"10","stockKeepingUnitId":')
Off the top of my head there are two ways to do this. Say you have the string mystr = 'some text...content:"67588978"'. The first way is just to search for "content:" in the string and use string slicing to take everything after it:
num = mystr[mystr.index('content:"') + len('content:"'):-1]
Alternatively, as probably a better solution, you could use regular expressions
import re
nums = re.findall(r'content:"(\d+)"', mystr)  # list of every number that follows content:
As you haven't provided an example of the dataset you're trying to analyze, there could also be a number of other solutions. If you're trying to parse a JSON or YAML file, there are simple libraries to turn them into python dicts (json is part of the standard library, and PyYaml handles YAML files easily).
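Applied to the snippet quoted in the question, the regex approach might look like the sketch below. It assumes the ID is all digits and that the text appears in the page source exactly as quoted; find_sku and the sample source line are invented for illustration:

```python
import re

# Captures the digits that follow the fixed "size":"10" prefix from the question.
SKU_PATTERN = re.compile(r'"size":"10","stockKeepingUnitId":"(\d+)"')


def find_sku(page_source):
    """Return the stock keeping unit ID as a string, or None if the pattern is absent."""
    match = SKU_PATTERN.search(page_source)
    return match.group(1) if match else None


# Invented line standing in for response.text.
sample_source = 'var data = {"size":"10","stockKeepingUnitId":"123456"};'
print(find_sku(sample_source))
```

Searching response.text directly is the point here: BeautifulSoup's find looks for tags, not for arbitrary substrings inside a script block.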
So I'm trying to learn Python here, and would appreciate any help you guys could give me. I've written a bit of code that asks one of my favorite websites for some information, and the api call returns an answer in a dictionary. In this dictionary is a list. In that list is a dictionary. This seems crazy to me, but hell, I'm a newbie.
I'm trying to assign the answers to variables, but always get various error messages depending on how I write my {},[], or (). Regardless, I can't get it to work. How do I read this return? Thanks in advance.
{
"answer":
[{"widgets":16,
"widgets_available":16,
"widgets_missing":7,
"widget_flatprice":"156",
"widget_averages":15,
"widget_cost":125,
"widget_profit":"31",
"widget":"90.59"}],
"result":true
}
Edited because I put in the wrong sample code.
You need to show your code, but the de-facto way of doing this is by using the requests module, like this:
import requests
url = 'http://www.example.com/api/v1/something'
r = requests.get(url)
data = r.json() # converts the returned json into a Python dictionary
for item in data['answer']:
    print(item['widgets'])
Assuming that you are not using the requests library (see Burhan's answer), you would use the json module like so:
import json
# triple quotes are needed because the JSON text spans several lines
data = '''{"answer":
[{"widgets":16,
"widgets_available":16,
"widgets_missing":7,
"widget_flatprice":"156",
"widget_averages":15,
"widget_cost":125,
"widget_profit":"31",
"widget":"90.59"}],
"result":true}'''
data = json.loads(data)
# Now you can use it as you wish
data['answer'] # and so on...
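To assign individual answers to variables, index the list first and then the inner dictionary. A short sketch using a trimmed copy of the sample from the question (the variable names are arbitrary); note that some values, such as widget_flatprice, arrive as strings and need converting:

```python
import json

# Trimmed copy of the sample response from the question.
raw = '{"answer": [{"widgets": 16, "widget_flatprice": "156", "widget_cost": 125}], "result": true}'
data = json.loads(raw)

first = data["answer"][0]  # "answer" is a list, so take its first element
widgets = first["widgets"]  # already an int
flat_price = int(first["widget_flatprice"])  # stored as a string in the JSON
cost = first["widget_cost"]
print(widgets, flat_price, cost)
```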
First, note that you access a dictionary value with ["key"], not {}; see the Python documentation on dictionary syntax.
Here is a step by step walkthrough on how to build and access a similar data structure:
First create the main dictionary:
t1 = {"a":0, "b":1}
you can access each element by:
t1["a"] # it'll return a 0
Now let's add the internal list:
t1["a"] = ["x",7,3.14]
and access it using:
t1["a"][2] # it'll return 3.14
Now creating the internal dictionary:
t1["a"][2] = {'w1':7,'w2':8,'w3':9}
And access:
t1["a"][2]['w3'] # it'll return 9
Hope it helped you.