Python Requests_html: giving me Timeout Error

I'm trying to scrape headlines from medium.com using a library called requests_html.
The code works on other people's PCs but not on mine.
Here's the original code:
from requests_html import HTMLSession
session = HTMLSession()
r = session.get('https://medium.com/@daranept27')
r.html.render()
x = r.html.find('a.eg.bv')
for elem in x:
    print(elem.text)
It gives me pyppeteer.errors.TimeoutError: Navigation Timeout Exceeded: 8000 ms exceeded.
Here's the full error:
Traceback (most recent call last):
File "C:\Users\intel\Desktop\hackerrank.py", line 5, in <module>
r.html.render()
File "C:\Users\intel\AppData\Local\Programs\Python\Python38\lib\site-packages\requests_html.py", line 598, in render
content, result, page = self.session.loop.run_until_complete(self._async_render(url=self.url, script=script, sleep=sleep, wait=wait, content=self.html, reload=reload, scrolldown=scrolldown, timeout=timeout, keep_page=keep_page))
File "C:\Users\intel\AppData\Local\Programs\Python\Python38\lib\asyncio\base_events.py", line 616, in run_until_complete
return future.result()
File "C:\Users\intel\AppData\Local\Programs\Python\Python38\lib\site-packages\requests_html.py", line 512, in _async_render
await page.goto(url, options={'timeout': int(timeout * 1000)})
File "C:\Users\intel\AppData\Local\Programs\Python\Python38\lib\site-packages\pyppeteer\page.py", line 885, in goto
raise error
pyppeteer.errors.TimeoutError: Navigation Timeout Exceeded: 8000 ms exceeded.
[Finished in 13.0s with exit code 1]
[shell_cmd: python -u "C:\Users\intel\Desktop\hackerrank.py"]
[dir: C:\Users\intel\Desktop]
[path: C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Program Files\nodejs\;C:\Python38;C:\Users\intel\AppData\Local\Programs\Python\Python38\Scripts\;C:\Users\intel\AppData\Local\Programs\Python\Python38\;C:\MinGW\bin;C:\Users\intel\AppData\Local\Programs\Microsoft VS Code\bin]
A comment on one of my posts, and other people's answers, said to simply re-run the script and it would work. I don't understand why...

The error you are getting suggests that you are not getting a response from the server in a timely manner.
I ran your code on my machine (Ubuntu 18.04) successfully and got the following results:
Seven Days -Between Life And Death
Have you ever encountered a fake friend? If So, Try These Simple Tips To Overcome it.
Does Anybody Ever Wonder Why He’s My Everything?
Ladies, Why Should You Treat Your Face Like The Coloring Books?
Listen, Girl, Aren’t You Curious How The Last Line Could Be This Hurtful?
The girl name “Rich”
She Lost Her Beloved Mother, But Why She Asserted that Loss Was Not Just A Loss sometimes?
You Used To Try This Lonely. Have You Ever Imagine The flavor You Tried To Eat it with Your Lover?
If You Have Siblings, You Won’t Comprehend this. Have You Ever Wonder How A Child Feels Like? This Is How It Perceives.
Is It Okay To Help A Stranger?
The Nightmare Was Always Considered A Bad Omen, But It Turned Incredible Differently.
If You’re A Woman Or Girl Who Loves To Wear Lipstick, Read This Poetry.
She Wants To Spread This Poetry For Every Girl Or Woman That Was Born Just Like The Way She Was.
Check your internet connection.
Alternatively, I'd suggest running IDLE in administrator mode and re-running your code from there.
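As a rough illustration of the "re-run it" advice, the sketch below retries the render step and raises the timeout above the 8-second default that shows up as "8000 ms" in the traceback. The helper name retry and its parameters are hypothetical and not part of requests_html; render(timeout=...) is given in seconds, as the int(timeout * 1000) in the traceback suggests.

```python
import time


def retry(action, attempts=3, delay=2.0, exceptions=(Exception,)):
    """Call action() until it succeeds or the attempts run out (hypothetical helper)."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except exceptions:
            if attempt == attempts:
                raise  # out of attempts: let the last error propagate
            time.sleep(delay)  # brief pause before trying again


def fetch_headlines(url, selector, timeout=30):
    """Render the page with a longer timeout and return the matching link texts."""
    # Imported lazily so that retry() can be reused without requests_html installed.
    from requests_html import HTMLSession

    session = HTMLSession()
    r = session.get(url)
    # timeout is in seconds; 30 is an arbitrary choice, not a recommended value.
    retry(lambda: r.html.render(timeout=timeout))
    return [elem.text for elem in r.html.find(selector)]
```

Calling fetch_headlines with the profile URL and the 'a.eg.bv' selector from the question returns the headline texts if one of the render attempts succeeds. Catching Exception is deliberately broad here; narrowing it to pyppeteer.errors.TimeoutError would be stricter.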

Related

VI_ERROR_TMO when a computer does a query to a function generator

I am using a PeakTech 4046 (160 MHz function/arbitrary waveform generator). I am developing in Python with the pyvisa library.
The connection is well established and the generator applies the query. But it generates the following error and stops the program (it doesn't do anything after the error).
Here is the code :
import pyvisa
rm = pyvisa.ResourceManager()
inst = rm.open_resource('TCPIP0::130.79.192.123::5025::SOCKET')
print(inst.session)
print(inst.io_protocol)
inst.query("source1:function squ")
And here is what I have in my terminal :
2
IOProtocol.normal
Traceback (most recent call last):
File "c:\Users\Labo préclinique\Desktop\ProjetPython\importation de librairies\Forum.py", line 7, in <module>
inst.query("source1:function squ ")
File "C:\Users\Labo préclinique\AppData\Local\Programs\Python\Python39\lib\site-packages\pyvisa\resources\messagebased.py", line 644, in query
return self.read()
File "C:\Users\Labo préclinique\AppData\Local\Programs\Python\Python39\lib\site-packages\pyvisa\resources\messagebased.py", line 486, in read
message = self._read_raw().decode(enco)
File "C:\Users\Labo préclinique\AppData\Local\Programs\Python\Python39\lib\site-packages\pyvisa\resources\messagebased.py", line 442, in _read_raw
chunk, status = self.visalib.read(self.session, size)
File "C:\Users\Labo préclinique\AppData\Local\Programs\Python\Python39\lib\site-packages\pyvisa\ctwrapper\functions.py", line 2337, in read
ret = library.viRead(session, buffer, count, byref(return_count))
File "C:\Users\Labo préclinique\AppData\Local\Programs\Python\Python39\lib\site-packages\pyvisa\ctwrapper\highlevel.py", line 222, in _return_handler
return self.handle_return_value(session, ret_value) # type: ignore
File "C:\Users\Labo préclinique\AppData\Local\Programs\Python\Python39\lib\site-packages\pyvisa\highlevel.py", line 251, in handle_return_value
raise errors.VisaIOError(rv)
pyvisa.errors.VisaIOError: VI_ERROR_TMO (-1073807339): Timeout expired before operation completed.
I have tried (to no avail):
- changing SOCKET to INSTR
- using a much longer timeout (inst.timeout = 10000)
- adding a termination character (tried \n and \r) with: inst.read_termination = '\n'
So I don't know what to do anymore... I need to send more than one command, so the program must not stop this early. I suspect that my function generator is not sending anything back, but I don't know how to confirm this.
What I want to know is: why do I get a timeout error if the connection is established and the request is executed on the device? How do I make the request properly?
Thank you in advance!
PS: I know how to catch the error (with try/except), but I'd rather get an OK answer than a failed one.

Try getting the list of resources with
rm.list_resources()
and check that your resource TCPIP0::130.79.192.123::5025::SOCKET is in it.
Then try the standard request from the tutorial:
inst.query("*IDN?")
query is shorthand for a write operation that sends a message, followed by a read. So you can split it into two steps to find out which one fails (the write or the read):
inst.write("source1:function squ")
print(inst.read())
Please check the command name source1:function squ, because I don't see it in the documentation. Maybe you should use "source1:am:internal:function square" (p. 57 of the documentation), or change squ to square?
According to the documentation, you can set an infinite timeout for your request with
del inst.timeout
You can also pass the read_termination/write_termination options to specify where a read or write ends:
inst = rm.open_resource('TCPIP0::130.79.192.123::5025::SOCKET', read_termination='\r')
As a last resort, try changing the query_delay and send_end options.
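A hedged sketch of the write/read split described above. SCPI queries conventionally end in a question mark, so a set command such as the one in the question may simply produce no reply, in which case query() would time out while waiting to read. The helper send_scpi and its header check are illustrative assumptions, not part of pyvisa; inst only needs the write and query methods already used in this thread.

```python
def send_scpi(inst, command):
    """Send one SCPI command; read a reply only when the command looks like a query.

    Heuristic (an assumption, not a pyvisa feature): a command is treated as a
    query when its header, the first whitespace-separated token, ends with "?".
    """
    command = command.strip()
    if not command:
        raise ValueError("empty command")
    header = command.split()[0]
    if header.endswith("?"):
        return inst.query(command)  # write, then read the instrument's reply
    inst.write(command)  # set command: nothing to read back
    return None
```

With the instrument from the question, send_scpi(inst, "*IDN?") would return the identification string, while send_scpi(inst, "source1:function squ") would return None without waiting for a reply. Whether the generator accepts that exact command spelling is not confirmed here.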

Can't initialize ANT+ Node with Python OpenANT library

I'm totally new to Python and also to ANT+ technology. This may be a basic problem, but I've been struggling with it for a couple of days already, browsing through forums with no luck.
So I'm trying to use the Python OpenANT library (https://github.com/Tigge/openant) to access my ANT dongle, which is plugged into a USB port (Windows 10 Pro). My goal is to access my Garmin through it and get some data from it. However, I'm stuck at the very beginning, trying to initialize the ANT Node. My code is this:
from ant.easy.node import Node
node=Node()
To this I get the exception:
File "C:/Users/Edgars/Desktop/untitled-5.py", line 2, in <module>
pass
File "C:\Users\Edgars\AppData\Local\Programs\Python\Python38-32\Lib\site-packages\ant\easy\node.py", line 56, in __init__
self.ant = Ant()
File "C:\Users\Edgars\AppData\Local\Programs\Python\Python38-32\Lib\site-packages\ant\base\ant.py", line 68, in __init__
self._driver.open()
File "C:\Users\Edgars\AppData\Local\Programs\Python\Python38-32\Lib\site-packages\ant\base\driver.py", line 193, in open
cfg = dev.get_active_configuration()
File "C:\Users\Edgars\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pyusb-1.1.0-py3.8.egg\usb\core.py", line 909, in get_active_configuration
return self._ctx.get_active_configuration(self)
File "C:\Users\Edgars\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pyusb-1.1.0-py3.8.egg\usb\core.py", line 113, in wrapper
return f(self, *args, **kwargs)
File "C:\Users\Edgars\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pyusb-1.1.0-py3.8.egg\usb\core.py", line 250, in get_active_configuration
bConfigurationValue=self.backend.get_configuration(self.handle)
File "C:\Users\Edgars\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pyusb-1.1.0-py3.8.egg\usb\backend\libusb0.py", line 519, in get_configuration
ret = self.ctrl_transfer(
File "C:\Users\Edgars\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pyusb-1.1.0-py3.8.egg\usb\backend\libusb0.py", line 601, in ctrl_transfer
return _check(_lib.usb_control_msg(
File "C:\Users\Edgars\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pyusb-1.1.0-py3.8.egg\usb\backend\libusb0.py", line 447, in _check
raise USBError(errmsg, ret)
usb.core.USBError: [Errno None] b'libusb0-dll:err [control_msg] sending control message failed, win error: A device which does not exist was specified.\r\n\n'
I have closed the Garmin Agent, so no other programs are using my ANT dongle at the same time. When I run my code, the specific sound occurs every time - the one that we hear when we detach a USB device by selecting "Eject" from the drop-down menu (the sound happens simultaneously with the exception message), so I guess the USB gets accessed at some moment.
Before the exception I get such a printout:
Driver available: [<class 'ant.base.driver.SerialDriver'>, <class 'ant.base.driver.USB2Driver'>, <class 'ant.base.driver.USB3Driver'>]
- Using: <class 'ant.base.driver.USB3Driver'>
Could not check if kernel driver was active, not implemented in usb backend
I have seen other users' threads where the printout says Using ... USB1Driver or Using ... USB2Driver, and they don't get this message. I've installed various Python libraries trying to get even this far, and now I'm worried that maybe they get in each other's way. Can anybody help me with this? It's really frustrating that a program of only two lines of code can get so complicated. :D
!!!EDIT!!!
OK, I found the problem: in the "driver.py" file there's a line dev.reset() which disconnects my USB dongle before trying to access it. I have no idea why such a line should exist there. I commented this line out, and now I no longer get the error above. However, what happens now is that there are continuous timeouts.
So my code has evolved to this (although the same timeouts also happen with my initial two-line program):
from ant.easy.node import Node
from ant.easy.channel import Channel
from ant.base.message import Message
import threading
NETWORK_KEY=[0xb9,0xa5,0x21,0xfb,0xbd,0x72,0xc3,0x45]
def on_data(data):
    print("Data received")
    print(data)
def back_thread(node):
    node.set_network_key(0x00,NETWORK_KEY)
    channel=node.new_channel(Channel.Type.BIDIRECTIONAL_RECEIVE)
    channel.on_broadcast_data=on_data
    channel.on_burst_data=on_data
    channel.set_period(16070)
    channel.set_search_timeout(20)
    channel.set_rf_freq(57)
    channel.set_id(0,120,0)
    try:
        channel.open()
        node.start()
    finally:
        node.stop()
        print("ANT Node Shutdown Complete")
node=Node()
x=threading.Thread(target=back_thread,args=(node,))
x.start()
Now I get this error line printed out for ever:
<class 'usb.core.USBError'>, (None, b'libusb0-dll:err [_usb_reap_async] timeout error\n')
When my Garmin Agent is active, I get the error "ANT resource already in use" instead of the timeout, so I'm certain that my code is accessing the ANT dongle.. However, now (having closed the Garmin Agent) I have no idea about how to get rid of the timeout and how to establish a simple handshake with my Garmin device..

OK, now I've figured out that my Garmin Forerunner 310XT can't act as a data source and thus cannot be accessed using the ANT+ protocol. Instead, I should use the ANT-FS file-sharing protocol. Keeping my head down and trying it out...

I posted a PR with some changes that I made to get Tigge’s openant library to work. Basically, I put a pause after the reset line that you mentioned above and bypassed the use of udev_rules as it doesn’t apply in Windows. You can use libusb but installation is a bit different. I’ve added Windows installation instructions to the readme in the PR with details on what worked for me.
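The pause-after-reset idea can be sketched independently of openant. This is not the code from the PR; reset_and_reacquire, find_device, and settle_seconds are hypothetical names, the one-second default is a guess, and looking the device up again after the pause is an added assumption. The only device method it relies on is reset(), the same call as the dev.reset() line mentioned in the question.

```python
import time


def reset_and_reacquire(find_device, settle_seconds=1.0, sleep=time.sleep):
    """Reset a USB device, wait for it to re-enumerate, then look it up again.

    find_device is any zero-argument callable that returns a device object
    (with a reset() method) or None when the device is not present.
    """
    dev = find_device()
    if dev is None:
        raise RuntimeError("device not found")
    dev.reset()
    sleep(settle_seconds)  # give the OS time to bring the device back
    dev = find_device()
    if dev is None:
        raise RuntimeError("device did not come back after reset")
    return dev
```

With pyusb, find_device could be a lambda wrapping usb.core.find with the vendor and product IDs of the dongle in use.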

The whois.whois function always gets a timed out error

The whois.whois function always gets a timed out error.
At first I thought it was because my project is written in Python 2.7, but I also checked in 3.7 and got the same error.
I checked the address on a website that offers online whois lookups; it worked there without this error.
Does anyone know why this is happening?
import whois
w = whois.whois("https://stackoverflow.com")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Program Files\Python37\lib\site-packages\whois\__init__.py", line 43, in whois
text = nic_client.whois_lookup(None, domain.encode('idna'), flags)
File "C:\Program Files\Python37\lib\site-packages\whois\whois.py", line 264, in whois_lookup
result = self.whois(query_arg, nichost, flags)
File "C:\Program Files\Python37\lib\site-packages\whois\whois.py", line 142, in whois
s.connect((hostname, 43))
socket.timeout: timed out

Your code has at least two problems, and you may have a network problem also.
However, there is no reason for it not to work on Python2.
About the code
This works perfectly fine:
In [7]: import whois
In [8]: print whois.query('stackoverflow.com').expiration_date
2020-02-02 11:59:59
Note two things:
- whois is about domain names, not URLs, so you should pass a domain name. More generally, for new projects you should look at RDAP instead of whois, since it gives a far better experience.
- you need to use whois.query, not whois.whois. You do not say which version of the library you use, but its documentation page at https://pypi.org/project/whois/ clearly shows whois.query, so I do not know where your whois.whois comes from.
About the network
You show a network error. It is not 100% clear, but you may not have access to the whois servers you want to query.
An easy way to test: run the command-line whois from the same machine as your code (again with a domain name, not a URL, as the parameter) and you will see what is happening.
You can even telnet directly to port 43, since that is all whois does.
$ echo 'stackoverflow.com' | nc whois.verisign-grs.com 43 | grep 'Expiry'
Registry Expiry Date: 2020-02-02T11:59:59Z
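The same two checks can be sketched in Python 3 with only the standard library: reduce the URL to a host name, then talk to port 43 directly, as the nc command above does. The function names are hypothetical, and the default server is simply the one used in the nc example (it serves .com and .net); this illustrates the protocol and is not how either whois package is implemented.

```python
import socket
from urllib.parse import urlparse


def domain_from_url(value):
    """Return the host part of a URL, or the value itself if it is already a bare host."""
    # urlparse only fills in hostname when a scheme separator is present.
    parsed = urlparse(value if "://" in value else "//" + value)
    return parsed.hostname


def whois_raw(domain, server="whois.verisign-grs.com", timeout=10):
    """Send one whois query over TCP port 43 and return the raw text reply."""
    with socket.create_connection((server, 43), timeout=timeout) as conn:
        conn.sendall(domain.encode("idna") + b"\r\n")
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

If whois_raw(domain_from_url("https://stackoverflow.com")) also raises socket.timeout, the problem is most likely network access to port 43 rather than the library. Note that domain_from_url returns the full host, so a leading "www." would still need to be stripped by hand.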

Boto3/Jenkins client throwing an error while running the code

I am running a daily Glue script on one of our AWS machines, scheduled with Jenkins.
For the last 15 days I have been getting the following error. (This daily job had been running for almost 6 months, and all of a sudden this started happening 15 days ago.)
The Jenkins console output looks like this:
Started by timer
Building in workspace /var/lib/jenkins/workspace/build_name_xyz
[build_name_xyz] $ /bin/sh -xe /tmp/jenkins8188702635955396537.sh
+ /usr/bin/python3 /var/lib/jenkins/path_to_script/glue_crawler.py
Traceback (most recent call last):
File "/var/lib/jenkins/path_to_script/glue_crawler.py", line 10, in <module>
response = glue_client.update_crawler(Name = crawler_name,Targets = {'S3Targets': [{'Path':update_path}]})
File "/usr/local/lib/python3.5/dist-packages/botocore/client.py", line 357, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/usr/local/lib/python3.5/dist-packages/botocore/client.py", line 661, in _make_api_call
raise error_class(parsed_response, operation_name)
botocore.errorfactory.InvalidInputException: An error occurred (InvalidInputException) when calling the UpdateCrawler operation: Cannot update Crawler while running. Please stop crawl or wait until it completes to update.
Build step 'Execute shell' marked build as failure
Finished: FAILURE
So I looked at line 10 of this file:
/var/lib/jenkins/path_to_script/glue_crawler.py
It looks something like this:
import boto3
import datetime
glue_client = boto3.client('glue', region_name='region_name')
crawler_name = 'xyz_abc'
today = (datetime.datetime.now()).strftime("%Y_%m_%d")
update_path = 's3://path-to-respective-aws-s3-bucket/%s' % (today)
response = glue_client.update_crawler(Name = crawler_name,Targets = {'S3Targets': [{'Path':update_path}]})
response_crawler = glue_client.start_crawler(
Name=crawler_name
)
print(response_crawler)
The above throws an error at line 10. I don't understand what exactly is going wrong on line 10, which makes Jenkins mark the build as failed (red ball), so I'm asking for help here. I tried googling this but couldn't find anything.
Just FYI: if I run the same build from the Jenkins UI (by clicking 'Build Now') some time later, the job runs absolutely fine.
I'm not sure what exactly is wrong here; any help is highly appreciated.
Thanks in advance!!

The error is self-explanatory:
Cannot update Crawler while running. Please stop crawl or wait until it completes to update.
So the crawler was somehow started at approximately the same time, and Glue does not allow updating a crawler's properties while it is running. Check whether any other task also starts the crawler named xyz_abc. In addition, make sure in the AWS Console that the crawler is configured to run on demand rather than on a schedule.
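If the overlapping run cannot be removed, one defensive option is to wait until the crawler is idle before updating it. The sketch below is an assumption layered on the answer, not something it prescribes: wait_until_crawler_ready and its polling parameters are hypothetical, while get_crawler and the READY state come from the Glue API as exposed by boto3.

```python
import time


def wait_until_crawler_ready(glue_client, crawler_name, poll_seconds=30,
                             max_wait_seconds=1800, sleep=time.sleep):
    """Block until the crawler reports state READY; return the seconds waited.

    Raises TimeoutError if the crawler is still busy after max_wait_seconds.
    """
    waited = 0
    while True:
        state = glue_client.get_crawler(Name=crawler_name)["Crawler"]["State"]
        if state == "READY":
            return waited
        if waited >= max_wait_seconds:
            raise TimeoutError(
                "crawler {} still in state {} after {}s".format(
                    crawler_name, state, waited))
        sleep(poll_seconds)
        waited += poll_seconds
```

In the script from the question, this would be called as wait_until_crawler_ready(glue_client, crawler_name) just before update_crawler. It hides the symptom rather than explaining it, so it is still worth finding out what else starts the crawler.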

Python Selenium Firefox understand error message

I am running a test script in Python Selenium Firefox and seemingly at random it crashes with the following error...
Time Elapsed: 104.31666666666666
Traceback (most recent call last):
File "D:\sel_scripts\main.py", line 110, in <module>
source_rf_script(driver, time, randint)
File "D:\sel_scripts\data_sources\myscript.py", line 184, in source_rf_script
htmlText = driver.execute_script("return document.getElementsByTagName('html')[0].innerHTML")
File "C:\Users\user4\AppData\Local\Programs\Python\Python35\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 429, in execute_script
{'script': script, 'args':converted_args})['value']
File "C:\Users\user4\AppData\Local\Programs\Python\Python35\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 201, in execute
self.error_handler.check_response(response)
File "C:\Users\user4\AppData\Local\Programs\Python\Python35\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 194, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: waiting for doc.body failed
Stacktrace:
at injectAndExecuteScript/< (file:///C:/Users/user4/AppData/Local/Temp/tmpvbvr8pjg/webdriver-py-profilecopy/extensions/fxdriver@googlecode.com/components/driver-component.js:10678)
at fxdriver.Timer.prototype.runWhenTrue/g (file:///C:/Users/user4/AppData/Local/Temp/tmpvbvr8pjg/webdriver-py-profilecopy/extensions/fxdriver@googlecode.com/components/driver-component.js:629)
at fxdriver.Timer.prototype.setTimeout/<.notify (file:///C:/Users/user4/AppData/Local/Temp/tmpvbvr8pjg/webdriver-py-profilecopy/extensions/fxdriver@googlecode.com/components/driver-component.js:623)
I am trying to work out what is causing it. To me it reads like the driver.execute_script command is causing the failure. Could it be that the page elements sometimes fail to load correctly (which happens occasionally on the dev server) and the command is unable to find any?
Am I reading it right?

You can find that error in the JavaScript source.
The logic seems to be that whenever someone calls execute_script(), i.e. on the line:
htmlText = driver.execute_script("return document.getElementsByTagName('html')[0].innerHTML")
... a listener waits up to 10 seconds for a <body> element to become available in the current document. If none appears, you get the error.
I believe the motivation is to alert you that the execution of a very slow script (and grabbing the innerHTML of the entire document will be slow) may have blocked the loading of the document that invoked it.
So, two possibilities:
If, as you say, the DOM never loads, then that could misleadingly trigger the error (though it's reasonable to get some kind of error).
Or it could still be a race condition, i.e. DOM loading blocked by misbehaving script.
Solutions would be:
Avoid executing any scripts until you know the DOM exists and is ready.
Find a smarter, quicker way to get the data from the page.
(Obviously you have a right to expect that the page you're testing will be served correctly from the back-end.)
See also this thread: https://github.com/seleniumhq/selenium-google-code-issue-archive/issues/1157
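The first suggestion, holding off on scripts until the DOM exists, can be sketched as a small polling helper. wait_for_body and its parameters are hypothetical; the only Selenium call it uses is driver.find_elements with the "tag name" locator string. Whether find_elements itself avoids the doc.body wait in this particular driver version is untested.

```python
import time


def wait_for_body(driver, timeout=10.0, poll=0.5, sleep=time.sleep):
    """Poll until the current document has a <body> element.

    Returns True if a body appeared within timeout seconds, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        # "tag name" is the string value of selenium's By.TAG_NAME locator.
        if driver.find_elements("tag name", "body"):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll)
```

A caller could then run the script only when wait_for_body(driver) returns True, and log or retry the page load otherwise. For the second suggestion, driver.page_source may be a cheaper way to get the page HTML than returning innerHTML from execute_script.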
