I'm running a script to get pages related to a word using Python (pip3 install wikipedia). I enter a word to search for, let's say the word is "cat". I send that to the code below, but the wikipedia code changes it to "hat" and returns pages related to "hat". It does this with any word I search for (e.g. "bear" becomes "beard", "dog" becomes "do", etc.).
wikipedia_page_name = "cat"
print("Original: ", wikipedia_page_name)
myString = wikipedia.page(wikipedia_page_name)
print("Returned: ", myString)
Here is what I get back:
Original: cat
Returned: <WikipediaPage 'Hat'>
My steps to use this were to install wikipedia "pip3 install wikipedia" and then import it "import wikipedia". That's it! I've tried uninstalling and then reinstalling, but I get the same results.
Any help is appreciated!
If you want to work with the page <WikipediaPage 'Cat'>, try setting auto_suggest to False, as the suggestion feature can be pretty bad at finding the right page:
import wikipedia
wikipedia_page_name = "cat"
print("Original: ", wikipedia_page_name)
myString = wikipedia.page(wikipedia_page_name, pageid=None, auto_suggest=False)
print("Returned: ", myString)
Output:
Original: cat
Returned: <WikipediaPage 'Cat'>
If you want to find titles, use search instead:
import wikipedia
wikipedia_page_name = "cat"
searches = wikipedia.search(wikipedia_page_name)
print(searches)
Output:
['Cat', 'Cat (disambiguation)', 'Keyboard Cat', 'Calico cat', 'Pussy Cat Pussy Cat', 'Felidae', "Schrödinger's cat", 'Tabby cat', 'Bengal cat', 'Sphynx cat']
You can use both together to make sure you get the right page from a String, as such:
import wikipedia
wikipedia_page_name = "cat"
searches = wikipedia.search(wikipedia_page_name)
if searches:
    my_page = wikipedia.page(searches[0], pageid=None, auto_suggest=False)
    print(my_page)
else:
    print("No page found for the String", wikipedia_page_name)
Output:
<WikipediaPage 'Cat'>
Related
I want to use the DeepL translate API for my university project, but I can't parse it. I want to use it with PHP or with Python; I'll be passing the argument to a Python script anyway, so it doesn't matter to me which one handles the translation in the end. I tried it in PHP like this:
$original = $_GET['searchterm'];
$deeplTranslateURL='https://api-free.deepl.com/v2/translate?auth_key=MYKEY&text='.urlencode($original).'&target_lang=EN';
if (get_headers($deeplTranslateURL)[0]=='HTTP/1.1 200 OK') {
    $translated = str_replace(' ', '', json_decode(file_get_contents($deeplTranslateURL))["translations"][0]["text"]);
} else {
    echo("translate error");
}
$output = passthru("python search.py $original $translated");
and I also tried it in search.py, based on this answer:
#!/usr/bin/env python
import sys
import requests
r = requests.post(url='https://api.deepl.com/v2/translate',
data = {
'target_lang' : 'EN',
'auth_key' : 'MYKEY',
'text': str(sys.argv)[1]
})
print 'Argument:', sys.argv[1]
print 'Argument List:', str(sys.argv)
print 'translated to: ', str(r.json()["translations"][0]["text"])
But neither got me any answer. How can I do this correctly? I also know I can do it somehow with cURL, but I've never used that library.
DeepL now has a python library that makes translation with python much easier, and eliminates the need to use requests and parse a response.
Get started as such:
import deepl
translator = deepl.Translator(auth_key)
result = translator.translate_text(text_you_want_to_translate, target_lang="EN-US")
print(result)
Looking at your question, it looks like search.py might have a couple of problems, namely that sys.argv splits every word of the input into a separate list item, so you're only passing a single word to DeepL. This is a problem because DeepL is a contextual translator: it builds a translation based on the words in a sentence; it doesn't simply act as a dictionary for individual words. If you want to translate single words, the DeepL API probably isn't what you want to go with.
However, if you are actually trying to pass a sentence to DeepL, I have built out this new search.py that should work for you:
import sys
import deepl
auth_key="your_auth_key"
translator = deepl.Translator(auth_key)
"""
" ".join(sys.argv[1:]) converts all list items after item [0]
into a string separated by spaces
"""
result = translator.translate_text(" ".join(sys.argv[1:]), target_lang = "EN-US")
print('Argument:', sys.argv[1])
print('Argument List:', str(sys.argv))
print("String to translate: ", " ".join(sys.argv[1:]))
print("Translated String:", result)
I ran the program by entering this:
search.py Der Künstler wurde mit einem Preis ausgezeichnet.
and received this output:
Argument: Der
Argument List: ['search.py', 'Der', 'Künstler', 'wurde', 'mit', 'einem',
'Preis', 'ausgezeichnet.']
String to translate: Der Künstler wurde mit einem Preis ausgezeichnet.
Translated String: The artist was awarded a prize.
I hope this helps, and that it's not too far past the end of your University Project!
I'm trying to figure out how to find a substring with a regex in Python inside an input.
What I mean is that I'm getting an input string from the user, and I have a JSON file I load; inside every block in my JSON file I have 'alert_regex', and I want to check if the string in my input matches that regex.
This is what I have tried so far:
import json
from pprint import pprint
import re
# Load json file
json_data=open('alerts.json')
jdata = json.load(json_data)
json_data.close()
# Input for users
input = 'Check Liveness in dsadakjnflkds.server'
# Search in json function
def searchInJson(input, jdata):
    for i in jdata:
        # checks if the input is similar to the alert name in the json
        print(i["alert_regex"])
        regexCheck = re.search(i["alert_regex"], input)
        if(regexCheck):
            # saves and prints the confluence's related link
            alert = i["alert_confluence"]
            print(alert)
            return print('Alert successfully found in `alerts.json`.')
    print('Alert was not found!')
searchInJson(input,jdata)
What I want my regex to check is only whether the string contains 'Check flink liveness'.
There are two possible problems:
1. maybe my regex inside i["alert_regex"] is not correct (I've tried the same one with JavaScript and it worked)
2. my code is not correct.
An example of my JSON file:
[
    {
        "id": 0,
        "alert_regex": "check (.*) Liveness (.*)",
        "alert_confluence": "link goes here"
    }
]
You have two problems. All of your code can be reduced down to:
import re
re.search("check (.*) Liveness (.*)", 'Check Liveness in dsadakjnflkds.server')
This will not match because:
You need to set case insensitivity on the search, because check will not match Check otherwise.
check (.*) Liveness ends up requiring two spaces between check and Liveness if (.*) matches the empty string.
You need:
re.search("check (.*)Liveness (.*)", 'Check Liveness in dsadakjnflkds.server', flags=re.I)
import wikipedia
import os
while True:
    input = raw_input("Ques: ")
    #To get output in a particular language ,
    #This prints the results on spanish
    #wikipedia.set_lang("es")
    wiki = wikipedia.summary(input, sentences = 2).encode('utf-8').strip()
    os.system("say " + wiki)
    print wiki
On the output console, it asks:
Ques:
When I type Cristiano, it says "Cristiano is a Portuguese footballer".
But when I type anything other than Cristiano (say Chelsea FC), it says
sh: -c: line 0: unexpected EOF while looking for matching `''
sh: -c: line 1: syntax error: unexpected end of file
OR
sh: -c: line 0: syntax error near unexpected token `('
The return value of wikipedia.summary() may contain characters that the shell interprets with special meaning. You can escape such characters with shlex.quote():
import wikipedia
import os
import shlex
while True:
    input = raw_input("Ques: ")
    #To get output in a particular language ,
    #This prints the results on spanish
    #wikipedia.set_lang("es")
    wiki = wikipedia.summary(input, sentences = 2).encode('utf-8').strip()
    os.system("say " + shlex.quote(wiki))
    print wiki
I haven't worked with the wikipedia third-party library before, but when I tried your code I found that I just needed to delete .encode('utf-8'), and it worked for me:
wiki = wikipedia.summary(i, sentences=2).strip()
import wikipedia
import os
while True:
    i = input("Ques: ")
    #To get output in a particular language ,
    #This prints the results on spanish
    #wikipedia.set_lang("es")
    wiki = wikipedia.summary(i, sentences=2).strip()
    os.system("say "+ wiki)
    print(wiki)
Result: Chelsea Football Club is a professional football club in London, England, that competes in the Premier League. Founded in 1905, the club's home ground since then has been Stamford Bridge.Chelsea won the First Division title in 1955, ....
Or you can use another third-party library like pyttsx3: pip install pyttsx3.
And the code will be like this:
import wikipedia
import pyttsx3
engine = pyttsx3.init()
while True:
    i = input("Ques: ")
    wiki = wikipedia.summary(i, sentences=2).strip()
    # os.system("say "+ wiki)
    print(wiki)
    engine.say(wiki)
    engine.runAndWait()
I hope it can help.
I want to find the subject from a sentence using Spacy. The code below is working fine and giving a dependency tree.
import spacy
from nltk import Tree
en_nlp = spacy.load('en')
doc = en_nlp("The quick brown fox jumps over the lazy dog.")
def to_nltk_tree(node):
    if node.n_lefts + node.n_rights > 0:
        return Tree(node.orth_, [to_nltk_tree(child) for child in node.children])
    else:
        return node.orth_
[to_nltk_tree(sent.root).pretty_print() for sent in doc.sents]
From this dependency tree code, can I find the subject of this sentence?
I'm not sure whether you want to write code using the nltk parse tree (see How to identify the subject of a sentence?), but spaCy also gives you this via the 'nsubj' label of the word.dep_ property:
import spacy
from nltk import Tree
en_nlp = spacy.load('en')
doc = en_nlp("The quick brown fox jumps over the lazy dog.")
sentence = next(doc.sents)
for word in sentence:
    print "%s:%s" % (word, word.dep_)
The:det
quick:amod
brown:amod
fox:nsubj
jumps:ROOT
over:prep
the:det
lazy:amod
dog:pobj
Keep in mind that there can be more complicated situations where there is more than one subject.
doc2 = en_nlp(u'When we study hard, we usually do well.')
sentence2 = next(doc2.sents)
for word in sentence2:
    print "%s:%s" % (word, word.dep_)
When:advmod
we:nsubj
study:advcl
hard:advmod
,:punct
we:nsubj
usually:advmod
do:ROOT
well:advmod
.:punct
Like leavesof3, I prefer to use spaCy for this kind of purpose. It also has better visualization.
The subject will be the word or phrase (if you use noun chunking) with the dependency label "nsubj", i.e. the nominal subject.
You can access the displaCy (spaCy visualization) demo here.
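For example, a minimal sketch of that noun-chunking approach (assuming the en_core_web_sm model is installed) could look like this:
import spacy

nlp = spacy.load('en_core_web_sm')
doc = nlp("The quick brown fox jumps over the lazy dog.")
# keep the noun chunk whose root token carries the nsubj dependency
subjects = [chunk.text for chunk in doc.noun_chunks if chunk.root.dep_ == "nsubj"]
print(subjects)  # ['The quick brown fox']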
Try this:
import spacy
import en_core_web_sm
nlp = spacy.load('en_core_web_sm')
sent = "I need to be able to log into the Equitable siteI tried my username and password from the AXA Equitable site which worked fine yesterday but it won't allow me to log in and when I try to change my password it says my answer is incorrect for the secret question I just need to be able to log into the Equitable site"
nlp_doc=nlp(sent)
subject = [tok for tok in nlp_doc if (tok.dep_ == "nsubj") ]
print(subject)
This isn't my code, it is a module I found on the internet which performs (or is supposed to perform) the task I want.
print '{'
for page in range (1,4):
    rand = random.random()
    id = str(long( rand*1000000000000000000 ))
    query_params = { 'q':'a',
        'include_entities':'true', 'lang':'en',
        'show_user':'true',
        'rpp': '100', 'page': page,
        'result_type': 'mixed',
        'max_id':id}
    r = requests.get('http://search.twitter.com/search.json',
        params=query_params)
    tweets = json.loads(r.text)['results']
    for tweet in tweets:
        if tweet.get('text') :
            print tweet
print '}'
print
The Python shell seems to indicate that the error is on line 1. I know very little Python, so I have no idea why it isn't working.
This snippet is written for Python 2.x, but you are running it under Python 3.x (where print is now a proper function). Replace print SomeExpr with print(SomeExpr) to solve this.
Here's a detailed description of this difference (along with other changes in 3.x).
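For illustration, here is how the print statements from that snippet look once converted to Python 3 (the placeholder tweet dict is only there to make the lines runnable on their own):
# Python 3 versions of the Python 2 print statements used in the snippet
print('{')                   # was: print '{'
tweet = {'text': 'example'}  # hypothetical placeholder standing in for a real result
print(tweet)                 # was: print tweet
print('}')                   # was: print '}'
print()                      # was: print  (a bare print becomes an empty call)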