Python: Get pywebview Current URL - python

I am using the pywebview library to open a page that will redirect the user to another url. What I would like to do is get the URL the user is directed to.
My code so far:
import urllib.request
import urllib.parse
import webview
import threading
import time
def openwebview():
    time.sleep(1)
    page = webview.create_window("URL_that_redirects_user")
def geturl():
    # what goes here?
    pass
t = threading.Thread(target=openwebview)
t.start()
I am using Windows, thanks!

Author of pywebview here. There is currently no way to get the current URL through pywebview; you have to dig into the underlying webview to get it.
Thanks for the suggestion, I will look into introducing this feature.

Now you can do it:
def geturl():
    print(webview.get_current_url())
See here:
https://github.com/r0x0r/pywebview/blob/master/examples/get_current_url.py

Related

Redirect hostname/endpoint to api.hostname/endpoint in django

I have my api built with this pattern: api.hostname/endpoint.
However there is a plugin to my app that uses hostname/endpoint pattern.
I would like to solve it on the backend side by adding redirection to api.hostname/endpoint.
I tried to experiment with adding urls or paths to urlpatterns, but it didn't help me.
How can I achieve it? Any ideas?
Regards,
Maciej.
You can use urllib:
import urllib.parse
url = "https://hostname/endpoint"
split_url = urllib.parse.urlsplit(url)
# the split result has no "endpoint" attribute; use .path, which already starts with "/"
result = f"{split_url.scheme}://api.{split_url.hostname}{split_url.path}"
print(result)
# prints: https://api.hostname/endpoint

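The string formatting in the urllib answer drops any query string and port. Here is a minimal sketch of the same idea that keeps them, using only the standard library. The function name to_api_url and the "api." default prefix are made up for illustration. In Django the result would still have to be returned as a redirect response from a view or middleware, which is not shown here.

```python
from urllib.parse import urlsplit, urlunsplit


def to_api_url(url: str, prefix: str = "api.") -> str:
    """Return url with prefix prepended to its hostname (hypothetical helper)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Leave the URL alone if it has no host or already points at the API host.
    if not host or host.startswith(prefix):
        return url
    netloc = prefix + host
    # Keep an explicit port; user:password info, if any, is dropped in this sketch.
    if parts.port is not None:
        netloc = f"{netloc}:{parts.port}"
    # _replace swaps only the network location; path, query and fragment are kept.
    return urlunsplit(parts._replace(netloc=netloc))
```

For example, to_api_url("https://hostname/endpoint?x=1") keeps the "?x=1" part, which the f-string version above would lose.
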
Load URL without graphical interface

In Python3, I need to load a URL every set interval of time, but without a graphical interface / browser window. There is no JavaScript, all it needs to do is load the page, and then quit it. This needs to run as a console application.
Is there any way to do this?
You could use threading and create a Timer that calls your function after every specified interval of time.
import time, threading, urllib.request
def fetch_url():
    # schedule the next run first, so a slow or failing request does not break the cycle
    threading.Timer(10, fetch_url).start()
    req = urllib.request.Request('http://www.stackoverflow.com')
    with urllib.request.urlopen(req) as response:
        the_page = response.read()
fetch_url()
The requests library may have what you're looking for.
import requests, time
url = "url.you.need"
website_object = requests.get(url)
# Repeat as necessary
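Both answers leave out how to stop the repetition and what happens when a request fails. Below is a rough sketch of one way to handle that with the standard library only. The names fetch and poll and the fetcher and max_runs parameters are my own, not part of any library; the fetcher is injectable so the loop can be tried without network access.

```python
import threading
import urllib.request
from typing import Callable, Optional


def fetch(url: str, timeout: float = 10.0) -> bytes:
    """Load the page once and return its body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def poll(url: str, interval: float, stop: threading.Event,
         fetcher: Callable[[str], object] = fetch,
         max_runs: Optional[int] = None) -> int:
    """Call fetcher(url) every interval seconds until stop is set.

    Returns the number of attempts made. max_runs is an optional cap.
    """
    runs = 0
    while not stop.is_set():
        try:
            fetcher(url)
        except OSError as exc:  # urllib.error.URLError is a subclass of OSError
            print(f"fetch failed: {exc}")
        runs += 1
        if max_runs is not None and runs >= max_runs:
            break
        # Event.wait sleeps for the interval but returns early once stop is set.
        stop.wait(interval)
    return runs
```

In a console script you would create stop = threading.Event(), call poll(url, 10, stop) in the main thread, and set the event from a signal handler or another thread to quit.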

spynner doesn't load XHR data

I'm building a script to monitor a reporting service. Depending on how long it takes to process the report, the report either appears in the HTML or arrives via XMLHttpRequest.
To check the page I want to use spynner, which works perfectly for HTML, but I can't get it to work when the data comes via XHR.
The code for the test is the following:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
__docformat__ = 'restructuredtext en'
from time import sleep
from spynner import browser
import pyquery
from PyQt4.QtCore import QUrl
from PyQt4.QtNetwork import QNetworkRequest, QNetworkAccessManager
from PyQt4.QtCore import QByteArray
def load_page(br):
    ret = br.load_jquery(True)
    print ret
    return 'Japan' in br.html
br = browser.Browser(
    debug_level=4
)
br.load('https://foobar.eu/newton/cgi-bin/cognos.cgi')
br.create_webview()
br.show()
#br.load("https://foobar.eu/newton/cgi-bin/cognos.cgi?b_action=xts.run&m=portal/cc.xts&m_folder=iA37B5BBC0615469DA37767D2B6F1DCF1")
#br.browse()
res = br.load("https://foobar.eu:443/newton/cgi-bin/cognos.cgi?b_action=cognosViewer&ui.action=run&ui.object=/content/folder[#name='DMA Admin Zone']/folder[#name='02. Performance Benchmark Module']/folder[#name='1. Reports']/report[#name='CQM_Test_3_HTML_Heavy_Local_Processing_Final']&ui.name=CQM_Test_3_HTML_Heavy_Local_Processing_Final&run.outputFormat=&run.prompt=true", 1, wait_callback=load_page)
d = str(pyquery.PyQuery(br.html))
if d.find("Japan") > -1:
    print 'We discovered Japan!'
else:
    print 'Japan is nowhere to be seen!'
sleep(10)
The URL in the comments is a page which contains a link to the report. When I click the report by hand, the report loads (via XHR). However, I can't get it to work from the script.
br.load_jquery always returns None.
To help, I have added part of the spynner debug trace from when I click the link by hand: http://fpaste.org/97583/13987135/
In Firebug I can clearly see the XHR response containing the string 'Japan'.
What am I missing?
Apparently, replacing the load_page function with the following code makes it work:
def load_page(br):
    br.wait(5)
    return 'Japan' in br.html
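A fixed five-second wait works only if the report always arrives within that time. A more general pattern is to poll for the expected content until a deadline. The sketch below is plain Python and not part of spynner; wait_until and its parameters are hypothetical names. With spynner, the predicate would be something like lambda: 'Japan' in br.html, and the sleep argument would presumably need to be br.wait rather than time.sleep, so that the browser keeps processing the page while waiting.

```python
import time
from typing import Callable


def wait_until(predicate: Callable[[], bool], timeout: float = 10.0,
               interval: float = 0.25,
               sleep: Callable[[float], object] = time.sleep) -> bool:
    """Poll predicate until it returns True or timeout seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        # Pause between checks; swap in an event-loop-friendly wait if needed.
        sleep(interval)
```

The function returns False on timeout instead of raising, so the caller can decide whether a missing report is an error.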

Spynner crash python

I'm building a Django app and I'm using Spynner for web crawling. I have this problem and I hope someone can help me.
I have this function in the module "crawler.py":
import spynner
def crawling_js(url):
    br = spynner.Browser()
    br.load(url)
    text_page = br.html
    br.close (*)
    return text_page
(*) I tried with br.close() too
In another module (e.g. "import.py") I call the function this way:
from crawler import crawling_js
l_url = ["https://www.google.com/", "https://www.tripadvisor.com/", ...]
for url in l_url:
    mytextpage = crawling_js(url)
    .. parse mytextpage....
When I pass the first URL to the function everything works, but when I pass the second URL Python crashes. It crashes on this line: br.load(url). Can someone help me? Thanks a lot.
I have:
Django 1.3
Python 2.7
Spynner 1.1.0
PyQt4 4.9.1
Why do you need to instantiate br = spynner.Browser() and close it every time you call crawling_js()? In a loop this uses a lot of resources, which I think is the reason it crashes. Think of it this way: br is a browser instance, so you can make it browse any number of websites without closing and reopening it. Adjust your code this way:
import spynner
br = spynner.Browser()  # open it only once
def crawling_js(url):
    br.load(url)
    text_page = br._get_html()  # _get_html() to make sure you get the updated HTML
    return text_page
Then, if you insist on closing br later, you simply do:
from crawler import crawling_js, br
l_url = ["https://www.google.com/", "https://www.tripadvisor.com/", ...]
for url in l_url:
    mytextpage = crawling_js(url)
    .. parse mytextpage....
br.close()

mechanize can't login python

I'm writing an auto-login script using mechanize in Python.
I have used mechanize before with no problems, but I can't get it to work on www.gmarket.co.kr.
Whenever I try to log in, the login page is returned, even with the correct gmarket id and password. I also noticed this suspicious snippet in the response:
"<script language=javascript>top.location.reload();</script>"
I think this is related to my problem, but I don't know exactly how to handle it.
Here is sample id and pass for login test
id: tgi177 pass: tk1047
If anyone can help, I would much appreciate it. Thanks in advance.
Code:
# -*- coding: cp949 -*-
from lxml.html import parse, fromstring
import sys,os
import mechanize, urllib
import cookielib
import re
from BeautifulSoup import BeautifulSoup,BeautifulStoneSoup,Tag
try:
    params = urllib.urlencode({'command':'login',
                               'url':'http%3A%2F%2Fwww.gmarket.co.kr%2F',
                               'member_type':'mem',
                               'member_yn':'Y',
                               'login_id':'tgi177',
                               'image1.x':'31',
                               'image1.y':'26',
                               'passwd':'tk1047',
                               'buyer_nm':'',
                               'buyer_tel_no1':'',
                               'buyer_tel_no2':'',
                               'buyer_tel_no3':''
                               })
    # pass params as the request body so the form data is actually POSTed
    rq = mechanize.Request("http://www.gmarket.co.kr/challenge/login.asp", params)
    rs = mechanize.urlopen(rq)
    data = rs.read()
    logged_in = r'input_login_check_value' in data
    if logged_in:
        print ' login success !'
        rq = mechanize.Request("http://www.gmarket.co.kr")
        rs = mechanize.urlopen(rq)
        data = rs.read()
        print data
    else:
        print 'login failed!'
        pass
    quit()
except Exception as e:
    # report the error instead of silently swallowing it
    print 'error:', e
mechanize cannot interact with JavaScript. The spidermonkey module will probably help you (I have no experience with it, but its description is quite promising). You could also handle such a reload manually (e.g. Browser.reload() for this particular case) if this is the only site where you have the problem.
Update:
A quick look through your page shows that the form is submitted to a different URL (with the https: scheme). Look through the checkValid() JavaScript function. Posting to that URL gives a different result. Note that this looks like homework you should do yourself before asking.
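One more detail in the question's code may be worth checking, although I have not verified what this particular site expects: the 'url' form value is already percent-encoded, and urlencode() percent-encodes values itself, so the server would receive it encoded twice. A small sketch of the effect, written for Python 3 (urllib.parse.urlencode); the question's Python 2 urllib.urlencode treats this input the same way.

```python
from urllib.parse import parse_qs, urlencode

# Value copied from the question: already percent-encoded by hand.
pre_encoded = urlencode({"url": "http%3A%2F%2Fwww.gmarket.co.kr%2F"})
# The same field given as a plain URL; urlencode() encodes it exactly once.
raw = urlencode({"url": "http://www.gmarket.co.kr/"})

print(pre_encoded)
print(raw)

# What a server would see after decoding each form body once.
print(parse_qs(pre_encoded)["url"][0])
print(parse_qs(raw)["url"][0])
```

If the site expects the plain URL in that field, passing the raw value to urlencode() is the safer choice.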
