The script below successfully pulls the right information from a single IP (url_ip). But after I wrapped it in a loop over a list of IPs, the requests call fails with connection errors (traceback below).
*NOTE - sloppy code, so be warned.
from lxml import html
import requests
import smtplib
# STATIC URL
#TODO PULL A LIST OF IP ADDRESSES AND BUILD THE URL FOR EACH SYSTEM
#IPs = ['192.168.3.152','192.168.3.194']
def crawler(url_ip):
    global eqid, counter, serial
    print "Starting Crawler Service for: " + url_ip
    url = "http://" + url_ip + "/cgi-bin/dynamic/printer/config/reports/deviceinfo.html"
    urleqid = "http://" + url_ip + "/cgi-bin/dynamic/topbar.html"
    response = requests.get(url)
    tree = html.fromstring(response.text)
    counter = tree.xpath('//td[contains(p,"Count")]/following-sibling::td/p/text()')
    serial = tree.xpath('//td[contains(p, "Serial")]/following-sibling::td/p/text()')
    counter = counter[0].split(' ')[3]
    serial = serial[0].split(' ')[3]
    responseeqid = requests.get(urleqid)
    treeequid = html.fromstring(responseeqid.text)
    eqid = treeequid.xpath('//descendant-or-self::node()/child::b[contains(., "Location")]/text()')[1].split(' ')[-1]
    print " -- equipment id found: " + eqid
    print " -- count found: " + counter
    print " -- serial found: " + serial
    print "Stopping Crawler Service for: " + url_ip
    return
def send_mail(eqid,counter,serial):
    GMAIL_USERNAME = "removed"
    GMAIL_PASSWORD = "removed"
    recipient = "removed"
    email_subject = "Test"
    body_of_email = "Equipment ID = " + eqid + "<br>Total Meter Count = " + counter + "<br>Serial Number = " + serial + "<br><br>"
    session = smtplib.SMTP('smtp.gmail.com', 587)
    session.ehlo()
    session.starttls()
    session.login(GMAIL_USERNAME, GMAIL_PASSWORD)
    headers = "\r\n".join(["from: " + GMAIL_USERNAME,
                           "subject: " + email_subject,
                           "to: " + recipient,
                           "mime-version: 1.0",
                           "content-type: text/html"])
    # body_of_email can be plain text or html!
    content = headers + "\r\n\r\n" + body_of_email
    session.sendmail(GMAIL_USERNAME, recipient, content)
    return
with open('iplist.txt') as fp:
    for line in fp:
        crawler(line);
        #send_mail(eqid,counter,serial);
ERROR LOG:
Starting Crawler Service for: 192.168.3.152
Traceback (most recent call last):
File "getmeters.py", line 63, in <module>
crawler(ipstring);
File "getmeters.py", line 17, in crawler
response = requests.get(url)
File "/Library/Python/2.7/site-packages/requests/api.py", line 68, in get
return request('get', url, **kwargs)
File "/Library/Python/2.7/site-packages/requests/api.py", line 50, in request
response = session.request(method=method, url=url, **kwargs)
File "/Library/Python/2.7/site-packages/requests/sessions.py", line 464, in request
resp = self.send(prep, **send_kwargs)
File "/Library/Python/2.7/site-packages/requests/sessions.py", line 576, in send
r = adapter.send(request, **kwargs)
File "/Library/Python/2.7/site-packages/requests/adapters.py", line 415, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', gaierror(8, 'nodename nor servname provided, or not known'))
I thought it was due to the value "line" being processed as a list object and not a string, so I converted to str(line) and that failed as well.
I suspect that each line read from the file still has its line ending (\n) attached, and you need to strip it off, for example with line.strip(). Otherwise your URL becomes something like
http://192.168.3.152
/cgi-bin/dynamic/printer/config/reports/deviceinfo.html
instead of the intended
http://192.168.3.152/cgi-bin/dynamic/printer/config/reports/deviceinfo.html
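A minimal sketch of that fix, assuming the file holds one address per line. The helper names read_hosts and build_url are made up for illustration, and the inline list stands in for the lines read from iplist.txt:

```python
def read_hosts(lines):
    """Return host strings with surrounding whitespace removed, skipping blanks."""
    return [line.strip() for line in lines if line.strip()]


def build_url(host, path="/cgi-bin/dynamic/printer/config/reports/deviceinfo.html"):
    """Join a bare host or IP with the report path."""
    return "http://" + host + path


# Stand-in for iterating over open('iplist.txt'): every line ends in "\n".
raw_lines = ["192.168.3.152\n", "192.168.3.194\n", "\n"]
hosts = read_hosts(raw_lines)
urls = [build_url(h) for h in hosts]
for u in urls:
    print(u)
```

In the original loop, the equivalent change would be to call crawler(line.strip()) instead of crawler(line).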
I am trying to troubleshoot a script for Mimecast's API. The script runs fine for the most part, but a few times, I have noticed that it stops pulling logs and generally appears to be a hung process. After restarting the script and manually pushing logs to the syslog server, it starts working again without issue. I am not able to reproduce this issue at will.
The script is supposed to do the following:
Authenticate against Mimecast's API
Sign responses
Download, extract, and save log files to the log directory
Use a tokenized header to determine which file was downloaded in the last request, and save the token ID to a file in the checkpoint directory
Push files to remote syslog server
Output any errors and info to console
Below is the sample code from Mimecast.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import logging.handlers
import json
import os
import requests
import base64
import uuid
import datetime
import hashlib
import shutil
import hmac
import time
from zipfile import ZipFile
import io
# Set up variables
APP_ID = "YOUR DEVELOPER APPLICATION ID"
APP_KEY = "YOUR DEVELOPER APPLICATION KEY"
URI = "/api/audit/get-siem-logs"
EMAIL_ADDRESS = 'EMAIL ADDRESS OF YOUR ADMINISTRATOR'
ACCESS_KEY = 'ACCESS KEY FOR YOUR ADMINISTRATOR'
SECRET_KEY = 'SECRET KEY FOR YOUR ADMINISTRATOR'
LOG_FILE_PATH = "FULLY QUALIFIED PATH TO FOLDER TO WRITE LOGS"
CHK_POINT_DIR = 'FULLY QUALIFIED PATH TO FOLDER TO WRITE PAGE TOKEN'
# Set True to output to syslog, false to only save to file
syslog_output = False
# Enter the IP address or hostname of your syslog server
syslog_server = 'localhost'
# Change this to override default port
syslog_port = 514
# delete files after fetching
delete_files = True
# Set threshold in number of files in log file directory
log_file_threshold = 10000
# Set up logging (in this case to terminal)
log = logging.getLogger(__name__)
log.root.setLevel(logging.DEBUG)
log_formatter = logging.Formatter('%(levelname)s %(message)s')
log_handler = logging.StreamHandler()
log_handler.setFormatter(log_formatter)
log.addHandler(log_handler)
# Set up syslog output
syslog_handler = logging.handlers.SysLogHandler(address=(syslog_server, syslog_port))
syslog_formatter = logging.Formatter('%(message)s')
syslog_handler.setFormatter(syslog_formatter)
syslogger = logging.getLogger(__name__)
syslogger = logging.getLogger('SysLogger')
syslogger.addHandler(syslog_handler)
# Supporting methods
def get_hdr_date():
    return datetime.datetime.utcnow().strftime("%a, %d %b %Y %H:%M:%S UTC")
def read_file(file_name):
    try:
        with open(file_name, 'r') as f:
            data = f.read()
        return data
    except Exception as e:
        log.error('Error reading file ' + file_name + '. Cannot continue. Exception: ' + str(e))
        quit()
def write_file(file_name, data_to_write):
    if '.zip' in file_name:
        try:
            byte_content = io.BytesIO(data_to_write)
            zip_file = ZipFile(byte_content)
            zip_file.extractall(LOG_FILE_PATH)
        except Exception as e:
            log.error('Error writing file ' + file_name + '. Cannot continue. Exception: ' + str(e))
            quit()
    else:
        try:
            with open(file_name, 'w') as f:
                f.write(data_to_write)
        except Exception as e:
            log.error('Error writing file ' + file_name + '. Cannot continue. Exception: ' + str(e))
            quit()
def get_base_url(email_address):
    # Create post body for request
    post_body = dict()
    post_body['data'] = [{}]
    post_body['data'][0]['emailAddress'] = email_address
    # Create variables required for request headers
    request_id = str(uuid.uuid4())
    request_date = get_hdr_date()
    headers = {'x-mc-app-id': APP_ID, 'x-mc-req-id': request_id, 'x-mc-date': request_date}
    # Send request to API
    log.debug('Sending request to https://api.mimecast.com/api/discover-authentication with request Id: ' +
              request_id)
    try:
        r = requests.post(url='https://api.mimecast.com/api/login/discover-authentication',
                          data=json.dumps(post_body), headers=headers)
        # Handle Rate Limiting
        if r.status_code == 429:
            log.warning('Rate limit hit. sleeping for ' + str(r.headers['X-RateLimit-Reset'] * 1000))
            time.sleep(r.headers['X-RateLimit-Reset'] * 1000)
    except Exception as e:
        log.error('Unexpected error getting base url. Cannot continue.' + str(e))
        quit()
    # Handle error from API
    if r.status_code != 200:
        log.error('Request returned with status code: ' + str(r.status_code) + ', response body: ' +
                  r.text + '. Cannot continue.')
        quit()
    # Load response body as JSON
    resp_data = json.loads(r.text)
    # Look for api key in region region object to get base url
    if 'region' in resp_data["data"][0]:
        base_url = resp_data["data"][0]["region"]["api"].split('//')
        base_url = base_url[1]
    else:
        # Handle no region found, likely the email address was entered incorrectly
        log.error(
            'No region information returned from API, please check the email address.'
            'Cannot continue')
        quit()
    return base_url
def post_request(base_url, uri, post_body, access_key, secret_key):
    # Create variables required for request headers
    request_id = str(uuid.uuid4())
    request_date = get_hdr_date()
    unsigned_auth_header = '{date}:{req_id}:{uri}:{app_key}'.format(
        date=request_date,
        req_id=request_id,
        uri=uri,
        app_key=APP_KEY
    )
    hmac_sha1 = hmac.new(
        base64.b64decode(secret_key),
        unsigned_auth_header.encode(),
        digestmod=hashlib.sha1).digest()
    sig = base64.encodebytes(hmac_sha1).rstrip()
    headers = {
        'Authorization': 'MC ' + access_key + ':' + sig.decode(),
        'x-mc-app-id': APP_ID,
        'x-mc-date': request_date,
        'x-mc-req-id': request_id,
        'Content-Type': 'application/json'
    }
    try:
        # Send request to API
        log.debug('Sending request to https://' + base_url + uri + ' with request Id: ' + request_id)
        r = requests.post(url='https://' + base_url + uri, data=json.dumps(post_body), headers=headers)
        # Handle Rate Limiting
        if r.status_code == 429:
            log.warning('Rate limit hit. sleeping for ' + str(r.headers['X-RateLimit-Reset'] * 1000))
            time.sleep(r.headers['X-RateLimit-Reset'] * 1000)
            r = requests.post(url='https://' + base_url + uri, data=json.dumps(post_body), headers=headers)
    # Handle errors
    except Exception as e:
        log.error('Unexpected error connecting to API. Exception: ' + str(e))
        return 'error'
    # Handle errors from API
    if r.status_code != 200:
        log.error('Request to ' + uri + ' with , request id: ' + request_id + ' returned with status code: ' +
                  str(r.status_code) + ', response body: ' + r.text)
        return 'error'
    # Return response body and response headers
    return r.content, r.headers
def get_mta_siem_logs(checkpoint_dir, base_url, access_key, secret_key):
    uri = "/api/audit/get-siem-logs"
    # Set checkpoint file name to store page token
    checkpoint_filename = os.path.join(checkpoint_dir, 'get_mta_siem_logs_checkpoint')
    # Build post body for request
    post_body = dict()
    post_body['data'] = [{}]
    post_body['data'][0]['type'] = 'MTA'
    post_body['data'][0]['compress'] = True
    if os.path.exists(checkpoint_filename):
        post_body['data'][0]['token'] = read_file(checkpoint_filename)
    # Send request to API
    resp = post_request(base_url, uri, post_body, access_key, secret_key)
    now = datetime.datetime.now().strftime("%a %b %d %H:%M:%S %Y")
    # Process response
    if resp != 'error':
        resp_body = resp[0]
        resp_headers = resp[1]
        content_type = resp_headers['Content-Type']
        # End if response is JSON as there is no log file to download
        if content_type == 'application/json':
            log.info('No more logs available')
            return False
        # Process log file
        elif content_type == 'application/octet-stream':
            file_name = resp_headers['Content-Disposition'].split('=\"')
            file_name = file_name[1][:-1]
            # Save files to LOG_FILE_PATH
            write_file(os.path.join(LOG_FILE_PATH, file_name), resp_body)
            # Save mc-siem-token page token to check point directory
            write_file(checkpoint_filename, resp_headers['mc-siem-token'])
            try:
                if syslog_output is True:
                    for filename in os.listdir(LOG_FILE_PATH):
                        file_creation_time = time.ctime(os.path.getctime(LOG_FILE_PATH + "/" + filename))
                        if now < file_creation_time or now == file_creation_time:
                            log.info('Loading file: ' + filename + ' to output to ' + syslog_server + ':' + str(syslog_port))
                            with open(file=os.path.join(LOG_FILE_PATH, filename), mode='r', encoding='utf-8') as log_file:
                                lines = log_file.read().splitlines()
                                for line in lines:
                                    syslogger.info(line)
                            log.info('Syslog output completed for file ' + filename)
            except Exception as e:
                log.error('Unexpected error writing to syslog. Exception: ' + str(e))
            # return true to continue loop
            return True
        else:
            # Handle errors
            log.error('Unexpected response')
            for header in resp_headers:
                log.error(header)
            return False
def run_script():
    # discover base URL
    try:
        base_url = get_base_url(email_address=EMAIL_ADDRESS)
    except Exception as e:
        log.error('Error discovering base url for ' + EMAIL_ADDRESS + ' . Exception: ' + str(e))
        quit()
    # Request log data in a loop until there are no more logs to collect
    try:
        log.info('Getting MTA log data')
        while get_mta_siem_logs(checkpoint_dir=CHK_POINT_DIR, base_url=base_url, access_key=ACCESS_KEY,
                                secret_key=SECRET_KEY) is True:
            log.info('Getting more MTA log files')
    except Exception as e:
        log.error('Unexpected error getting MTA logs ' + (str(e)))
    file_number = len([name for name in os.listdir(LOG_FILE_PATH) if os.path.isfile(name)])
    if delete_files or file_number >= log_file_threshold:
        for filename in os.listdir(LOG_FILE_PATH):
            file_path = os.path.join(LOG_FILE_PATH, filename)
            try:
                if os.path.isfile(file_path) or os.path.islink(file_path):
                    os.unlink(file_path)
                elif os.path.isdir(file_path):
                    shutil.rmtree(file_path)
            except Exception as e:
                print('Failed to delete %s. Reason: %s' % (file_path, e))
    quit()
# Run script
run_script()
It seems like it may potentially be a race condition but I am not sure how to confirm since I can't reproduce it. I notice that SumoLogic has a modified version of this script as well with a different methodology for managing the files/paths. If this script works better than the main sample script above, would anybody be able to explain WHY? I haven't had any issues with it yet.
https://github.com/SumoLogic/sumologic-content/blob/master/MimeCast/SumoLogic-Mimecast-Data-Collection/siem_collection.py
I am trying to consume services from an XML-RPC web service using Python.
The remote web server requires authentication and SSL verification. To do this, I implemented an XML-RPC client using xmlrpc.client as follows:
class HTTPSDigestAuthTransport:
    def request(self, host, handler, request_body, verbose=0):
        api_url = Setup.get_api_url()
        username = Setup.get_api_username()
        password = Setup.get_api_password()
        h = httplib2.Http()
        if verbose:
            h.debuglevel = 1
        h.add_credentials(username, password)
        h.disable_ssl_certificate_validation = True
        resp, content = h.request("https://" + api_url, "POST", body=request_body,
                                  headers={'content-type': 'text/xml'})
        if resp.status != 200:
            raise ProtocolError("https://" + api_url, resp.status, resp.reason, None)
        p, u = getparser(0)
        p.feed(content)
# transport factory instance
transport = HTTPSDigestAuthTransport()
# url composition
url = "https://" + Setup.get_api_username() + ":" + Setup.get_api_password() + "#" + Setup.get_api_url()
# create the proxy
proxy = xmlrpc.client.ServerProxy(url, transport)
res = proxy.do_some_work()
The problem is that the instruction res = proxy.do_some_work() generates this error:
File "/usr/lib/python3.6/xmlrpc/client.py", line 1112, in __call__
return self.__send(self.__name, args)
File "/usr/lib/python3.6/xmlrpc/client.py", line 1455, in __request
if len(response) == 1:
TypeError: object of type 'NoneType' has no len()
Is "object of type 'NoneType' has no len()" due to the response format? What could the solution be?
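One likely cause, sketched under the assumption that the HTTP exchange itself succeeds: the traceback shows ServerProxy calling len() on whatever the transport's request method returned, and the request method above feeds the parser but never returns anything, so the proxy receives None. The standard xmlrpc.client transports finish by closing the parser and returning the unmarshaller's result. The helper name parse_xmlrpc_response is hypothetical, and the sample body is generated locally rather than taken from the real service:

```python
import xmlrpc.client


def parse_xmlrpc_response(content):
    """Parse a raw XML-RPC response body and return the unmarshalled values."""
    parser, unmarshaller = xmlrpc.client.getparser()
    parser.feed(content)
    parser.close()
    # close() returns a tuple of the decoded values (or raises Fault).
    return unmarshaller.close()


# Locally generated stand-in for the bytes an HTTP client would return.
sample_body = xmlrpc.client.dumps((42,), methodresponse=True).encode("utf-8")
result = parse_xmlrpc_response(sample_body)
print(result)
```

In the custom transport, this would mean ending request with p.close() followed by return u.close().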
I am trying to create a job using the Gnip Historical PowerTrack API.
I am running into an issue with urllib2:
import urllib2
import base64
import json
UN = '' # YOUR GNIP ACCOUNT EMAIL ID
PWD = ''
account = '' # YOUR GNIP ACCOUNT USER NAME
def get_json(data):
    return json.loads(data.strip())
def post():
    url = 'https://historical.gnip.com/accounts/' + account + '/jobs.json'
    publisher = "twitter"
    streamType = "track"
    dataFormat = "activity-streams"
    fromDate = "201510140630"
    toDate = "201510140631"
    jobTitle = "job30"
    rules = '[{"value":"","tag":""}]'
    jobString = '{"publisher":"' + publisher + '","streamType":"' + streamType + '","dataFormat":"' + dataFormat + '","fromDate":"' + fromDate + '","toDate":"' + toDate + '","title":"' + jobTitle + '","rules":' + rules + '}'
    base64string = base64.encodestring('%s:%s' % (UN, PWD)).replace('\n', '')
    req = urllib2.Request(url=url, data=jobString)
    req.add_header('Content-type', 'application/json')
    req.add_header("Authorization", "Basic %s" % base64string)
    proxy = urllib2.ProxyHandler({'http': 'http://proxy:8080', 'https': 'https://proxy:8080'})
    opener = urllib2.build_opener(proxy)
    urllib2.install_opener(opener)
    try:
        response = urllib2.urlopen(req)
        the_page = response.read()
        the_page = get_json(the_page)
        print 'Job has been created.'
        print 'Job UUID : ' + the_page['jobURL'].split("/")[-1].split(".")[0]
    except urllib2.HTTPError as e:
        print e.read()
if __name__=='__main__':
    post()
This is the error I am getting:
Traceback (most recent call last):
File "gnip1.py", line 37, in <module>
post()
File "gnip1.py", line 28, in post
response = urllib2.urlopen(req)
File "/home/soundarya/anaconda-new-1/lib/python2.7/urllib2.py", line 154, in urlopen
return opener.open(url, data, timeout)
File "/home/soundarya/anaconda-new-1/lib/python2.7/urllib2.py", line 431, in open
response = self._open(req, data)
File "/home/soundarya/anaconda-new-1/lib/python2.7/urllib2.py", line 449, in _open
'_open', req)
File "/home/soundarya/anaconda-new-1/lib/python2.7/urllib2.py", line 409, in _call_chain
result = func(*args)
File "/home/soundarya/anaconda-new-1/lib/python2.7/urllib2.py", line 1240, in https_open
context=self._context)
File "/home/soundarya/anaconda-new-1/lib/python2.7/urllib2.py", line 1197, in do_open
raise URLError(err)
urllib2.URLError: <urlopen error [Errno -2] Name or service not known>
I also tried the curl command below.
When I run it in a terminal, I get the error "ServiceUsername is not valid":
curl -v -X POST -uname -d '{"title": "HPT_test_job","publisher": "Twitter","streamType":"track","dataFormat":"activity-streams","fromDate":"201401010000","toDate":"201401020000 ","rules":[{"value": "twitter_lang:en (Hillary Clinton OR Donald)","tag": "2014_01_01_snow"}]}' 'https://historical.gnip.com/accounts/account_name/jobs.json'
This is the exact output msg:
Error retrieving Job status: {u'serviceUsername': [u'is invalid']} -- Please verify your connection parameters and network connection *
Try this and see if it helps:
from urllib2 import urlopen  # Python 2; on Python 3 use: from urllib.request import urlopen
u = urlopen('http:// .........')
If you are using Python 3.5, you should use urllib.request, which replaces urllib2. Note, however, that this changes a few things in the code, including print (which now needs parentheses) and the need to convert some of the string values into bytes. Here you can look at all the required changes in code adapted to Python 3.5
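A hedged sketch of how the request-building part might look on Python 3, not a drop-in replacement for the script above. The function name build_job_request, the example.com URL, and the credentials are placeholders for illustration; the request is only built here, not sent:

```python
import base64
import json
import urllib.request


def build_job_request(url, username, password, job):
    """Build (but do not send) a JSON POST request with HTTP Basic auth."""
    # In Python 3 the request body must be bytes, not str.
    body = json.dumps(job).encode("utf-8")
    # b64encode takes bytes and returns bytes; decode it for the header value.
    credentials = "{}:{}".format(username, password).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    req = urllib.request.Request(url=url, data=body, method="POST")
    req.add_header("Content-Type", "application/json")
    req.add_header("Authorization", "Basic " + token)
    return req


# Placeholder values for illustration only.
job = {"publisher": "twitter", "streamType": "track", "title": "job30"}
req = build_job_request("https://example.com/accounts/ACCOUNT/jobs.json",
                        "user", "pass", job)
print(req.get_method(), req.full_url)
```

Building the body with json.dumps avoids the hand-concatenated JSON string. Sending the request would then go through urllib.request.urlopen(req), and the bytes returned by response.read() need a .decode("utf-8") before string handling.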
While testing, I discovered that this
url = ' http://wi312.rockdizfile.com/d/uclf2kr7fp4r2ge47pcuihdpky2chcsjur5nrds2hx53f26qgxnrktew/Kimbra%20-%20Love%20in%20High%20Places.mp3'
works in the browser and the file download begins, but if I try to fetch this file using
requests.get(url)
it gives a massive error.
Any clue why this is happening? Do I need to decode the URL to make it work?
Update
this is the error I keep getting:
Exception in thread Thread-5:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 810, in __bootstrap_inner
self.run()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 763, in run
self.__target(*self.__args, **self.__kwargs)
File "python/file_download.py", line 98, in _downloadChunk
stream=True)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/requests-2.1.0-py2.7.egg/requests/api.py", line 55, in get
return request('get', url, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/requests-2.1.0-py2.7.egg/requests/api.py", line 44, in request
return session.request(method=method, url=url, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/requests-2.1.0-py2.7.egg/requests/sessions.py", line 382, in request
resp = self.send(prep, **send_kwargs)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/requests-2.1.0-py2.7.egg/requests/sessions.py", line 485, in send
r = adapter.send(request, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/requests-2.1.0-py2.7.egg/requests/adapters.py", line 381, in send
raise Timeout(e)
Timeout: (<requests.packages.urllib3.connectionpool.HTTPConnectionPool object at 0x10258de90>, 'Connection to wi312.rockdizfile.com timed out. (connect timeout=0.001)')
There was no space in the URL when I posted; it just ended up on a new line because I posted it as an inline code embed.
Here is the code that makes requests:(also try new URL: http://archive.org/download/LucyIsabelleMarsh/LucyIsabelleMarsh-ItalianStreetSong.mp3)
import requests
import signal
import sys
import time
import threading
import utils as _fdUtils
from socket import error as SocketError, timeout as SocketTimeout
def _downloadChunk(url, idx, irange, fileName, sizeInBytes):
_log.debug("Downloading %s for first chunk %s " % (irange, idx+1))
pulledSize = irange[-1]
try:
resp = requests.get(url, allow_redirects=False, timeout=0.001,
headers={'Range': 'bytes=%s-%s' % (str(irange[0]), str(irange[-1]))},
stream=True)
except (SocketTimeout, requests.exceptions), e:
_log.error(e)
return
chunk_size = str(irange[-1])
for chunk in resp.iter_content(chunk_size):
status = r"%10d [%3.2f%%]" % (pulledSize, pulledSize * 100. / int(chunk_size))
status = status + chr(8)*(len(status)+1)
sys.stdout.write('%s\r' % status)
sys.stdout.flush()
pulledSize += len(chunk)
dataDict[idx] = chunk
time.sleep(.03)
if pulledSize == sizeInBytes:
_log.info("%s downloaded %3.0f%%", fileName, pulledSize * 100. / sizeInBytes)
class ThreadedFetch(threading.Thread):
""" docstring for ThreadedFetch
"""
def __init__(self, saveTo, queue):
super(ThreadedFetch, self).__init__()
self.queue = queue
self.__saveTo = saveTo
def run(self):
threadLimiter.acquire()
try:
items = self.queue.get()
url = items[0]
split = items[-1]
fileName = _fdUtils.getFileName(url)
# grab split chunks in separate thread.
if split > 1:
maxSplits.acquire()
try:
sizeInBytes = _fdUtils.getUrlSizeInBytes(url)
if sizeInBytes:
byteRanges = _fdUtils.getRangeSegements(sizeInBytes, split)
else:
byteRanges = ['0-']
filePath = os.path.join(self.__saveTo, fileName)
downloaders = [
threading.Thread(
target=_downloadChunk,
args=(url, idx, irange, fileName, sizeInBytes),
)
for idx, irange in enumerate(byteRanges)
]
# start threads, let run in parallel, wait for all to finish
for th in downloaders:
th.start()
# this makes the wait for all thread to finish
# which confirms the dataDict is up-to-date
for th in downloaders:
th.join()
downloadedSize = 0
with open(filePath, 'wb') as fh:
for _idx, chunk in sorted(dataDict.iteritems()):
downloadedSize += len(chunk)
status = r"%10d [%3.2f%%]" % (downloadedSize, downloadedSize * 100. / sizeInBytes)
status = status + chr(8)*(len(status)+1)
fh.write(chunk)
sys.stdout.write('%s\r' % status)
time.sleep(.04)
sys.stdout.flush()
if downloadedSize == sizeInBytes:
_log.info("%s, saved to %s", fileName, self.__saveTo)
self.queue.task_done()
finally:
maxSplits.release()
The traceback shows a Timeout exception, and your code does set a very short timeout. Either remove this limit or increase it:
requests.get(url, allow_redirects=False, timeout=0.001, # <-- this is very short
Even if you were accessing localhost (your own computer), such a timeout will result in a Timeout exception. From the documentation:
Note
timeout is not a time limit on the entire response download; rather,
an exception is raised if the server has not issued a response for
timeout seconds (more precisely, if no bytes have been received on the
underlying socket for timeout seconds).
So it's not doing what you might expect.
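A rough sketch of a more forgiving call, assuming a requests release that accepts a (connect, read) timeout tuple; older releases, possibly including the 2.1.0 shown in the traceback, only take a single number. The helper names and the 5 and 30 second values are illustrative choices, not values from the documentation:

```python
import requests


def range_header(start, end):
    """Build the Range header for an inclusive byte range."""
    return {"Range": "bytes=%d-%d" % (start, end)}


def fetch_range(url, start, end, timeout=(5, 30)):
    """Request one byte range; by default allow 5 s to connect and 30 s per read."""
    try:
        resp = requests.get(url, headers=range_header(start, end),
                            timeout=timeout, stream=True)
        resp.raise_for_status()
        return resp
    except requests.exceptions.RequestException as exc:
        # Timeout, ConnectionError and HTTPError all derive from RequestException.
        print("download failed: %s" % exc)
        return None
```

Catching requests.exceptions.RequestException also replaces the original except clause, which named the requests.exceptions module rather than an exception class.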
You have a space before the start of the URL, which causes a requests.exceptions.InvalidSchema error:
url = ' http://wi312.rockdizfile.com/d/uclf2kr7fp4r2ge47pcuihdpky2chcsjur5nrds2hx53f26qgxnrktew/Kimbra%20-%20Love%20in%20High%20Places.mp3'
Change to:
url = 'http://wi312.rockdizfile.com/d/uclf2kr7fp4r2ge47pcuihdpky2chcsjur5nrds2hx53f26qgxnrktew/Kimbra%20-%20Love%20in%20High%20Places.mp3'
So I have been trying to convert an Omegle bot, which was written in Python 2, to Python 3. This is the original code: https://gist.github.com/thefinn93/1543082
Now this is my code:
import requests
import sys
import json
import urllib
import random
import time
server = b"odo-bucket.omegle.com"
debug_log = False # Set to FALSE to disable excessive messages
config = {'verbose': open("/dev/null","w")}
headers = {}
headers['Referer'] = b'http://odo-bucket.omegle.com/'
headers['Connection'] = b'keep-alive'
headers['User-Agent'] = b'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.2 (KHTML, like Gecko) Ubuntu/11.10 Chromium/15.0.874.106 Chrome/15.0.874.106 Safari/535.2'
headers['Content-type'] = b'application/x-www-form-urlencoded; charset=UTF-8'
headers['Accept'] = b'application/json'
headers['Accept-Encoding'] = b'gzip,deflate,sdch'
headers['Accept-Language'] = b'en-US'
headers['Accept-Charset'] = b'ISO-8859-1,utf-8;q=0.7,*;q=0.3'
if debug_log:
    config['verbose'] = debug_log
def debug(msg):
    if debug_log:
        print("DEBUG: " + str(msg))
        debug_log.write(str(msg) + "\n")
def getcookies():
    r = requests.get(b"http://" + server + b"/")
    debug(r.cookies)
    return(r.cookies)
def start():
    r = requests.request(b"POST", b"http://" + server + b"/start?rcs=1&spid=", data=b"rcs=1&spid=", headers=headers)
    omegle_id = r.content.strip(b"\"")
    print("Got ID: " + str(omegle_id))
    cookies = getcookies()
    event(omegle_id, cookies)
def send(omegle_id, cookies, msg):
    r = requests.request(b"POST","http://" + server + "/send", data="msg=" + urllib.quote_plus(msg) + "&id=" + omegle_id, headers=headers, cookies=cookies)
    if r.content == "win":
        print("You: " + msg)
    else:
        print("Error sending message, check the log")
        debug(r.content)
def event(omegle_id, cookies):
    captcha = False
    next = False
    r = requests.request(b"POST",b"http://" + server + b"/events",data=b"id=" + omegle_id, cookies=cookies, headers=headers)
    try:
        parsed = json.loads(r.content)
        for e in parsed:
            if e[0] == "waiting":
                print("Waiting for a connection...")
            elif e[0] == "count":
                print("There are " + str(e[1]) + " people connected to Omegle")
            elif e[0] == "connected":
                print("Connection established!")
                send(omegle_id, cookies, "HI I just want to talk ;_;")
            elif e[0] == "typing":
                print("Stranger is typing...")
            elif e[0] == "stoppedTyping":
                print ("Stranger stopped typing")
            elif e[0] == "gotMessage":
                print("Stranger: " + e[1])
                try:
                    cat=""
                    time.sleep(random.randint(1,5))
                    i_r=random.randint(1,8)
                    if i_r==1:
                        cat="that's cute :3"
                    elif i_r==2:
                        cat="yeah, guess your right.."
                    elif i_r==3:
                        cat="yeah, tell me something about yourself!!"
                    elif i_r==4:
                        cat="what's up"
                    elif i_r==5:
                        cat="me too"
                    else:
                        time.sleep(random.randint(3,9))
                        send(omegle_id, cookies, "I really have to tell you something...")
                        time.sleep(random.randint(3,9))
                        cat="I love you."
                    send(omegle_id, cookies, cat)
                except:
                    debug("Send errors!")
            elif e[0] == "strangerDisconnected":
                print("Stranger Disconnected")
                next = True
            elif e[0] == "suggestSpyee":
                print ("Omegle thinks you should be a spy. Fuck omegle.")
            elif e[0] == "recaptchaRequired":
                print("Omegle think's you're a bot (now where would it get a silly idea like that?). Fuckin omegle. Recaptcha code: " + e[1])
                captcha = True
    except:
        print("Derka derka derka")
    if next:
        print("Reconnecting...")
        start()
    elif not captcha:
        event(omegle_id, cookies)
start()
The error I get is:
Traceback (most recent call last):
File "p3.py", line 124, in <module>
start()
File "p3.py", line 46, in start
r = requests.request(b"POST", b"http://" + server + b"/start?rcs=1&spid=", data=b"rcs=1&spid=", headers=headers)
File "/usr/lib/python3.4/site-packages/requests/api.py", line 44, in request
return session.request(method=method, url=url, **kwargs)
File "/usr/lib/python3.4/site-packages/requests/sessions.py", line 456, in request
resp = self.send(prep, **send_kwargs)
File "/usr/lib/python3.4/site-packages/requests/sessions.py", line 553, in send
adapter = self.get_adapter(url=request.url)
File "/usr/lib/python3.4/site-packages/requests/sessions.py", line 608, in get_adapter
raise InvalidSchema("No connection adapters were found for '%s'" % url)
requests.exceptions.InvalidSchema: No connection adapters were found for 'b'http://odo-bucket.omegle.com/start?rcs=1&spid=''
I didn't really understand what would fix this error, nor what the problem really is, even after looking it up.
UPDATE:
Now after removing all the b's I get the following error:
Traceback (most recent call last):
File "p3.py", line 124, in <module>
start()
File "p3.py", line 47, in start
omegle_id = r.content.strip("\"")
TypeError: Type str doesn't support the buffer API
UPDATE 2:
After putting the b back to r.content, I get the following error message:
Traceback (most recent call last):
File "p3.py", line 124, in <module>
start()
File "p3.py", line 50, in start
event(omegle_id, cookies)
File "p3.py", line 63, in event
r = requests.request("POST","http://" + server + "/events",data="id=" + omegle_id, cookies=cookies, headers=headers)
TypeError: Can't convert 'bytes' object to str implicitly
UPDATE 3:
Every time I try to start it, it hits the except branch and prints "Derka derka derka". What could be causing this? (It wasn't like that with Python 2.)
requests takes strings, not bytes values for the URL.
Because your URLs are bytes values, requests is converting them to strings with str(), and the resulting string contains the characters b' at the start. That's not a valid scheme like http:// or https://.
The majority of your bytestrings should really be regular strings instead; only the content.strip() call deals with actual bytes.
The headers will be encoded for you, for example. Don't even set the Content-Type header; requests will take care of that for you if you pass in a dictionary (using string keys and values) to the data keyword argument.
You shouldn't set the Connection header either; leave connection management to requests as well.
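A small sketch of the str/bytes handling described above, using only the standard library so it runs without contacting the server. The byte literals are stand-ins for r.content, and the variable names are illustrative:

```python
import json
from urllib.parse import quote_plus

server = "odo-bucket.omegle.com"

# str() on a bytes value keeps the b'...' wrapper, which is what breaks the URL.
broken_url = str(b"http://" + server.encode("ascii") + b"/start")
good_url = "http://" + server + "/start"

# Response bodies arrive as bytes; decode once, then work with str.
raw_id = b'"abc123"'                            # stand-in for r.content
omegle_id = raw_id.decode("utf-8").strip('"')

# Decoding before json.loads also works on Python versions older than 3.6,
# where json.loads does not accept bytes.
raw_events = b'[["waiting"], ["count", 1234]]'  # stand-in for r.content
events = json.loads(raw_events.decode("utf-8"))

# urllib.quote_plus from Python 2 lives at urllib.parse.quote_plus in Python 3.
form = {"id": omegle_id, "msg": "HI I just want to talk ;_;"}
encoded_msg = quote_plus(form["msg"])

print(broken_url)
print(good_url, omegle_id, events, encoded_msg)
```

With requests, a dictionary such as form can be passed directly as data=form, and the library does the URL-encoding. The bare except around the event loop hides errors of this kind, which may be why the script only prints "Derka derka derka"; printing the caught exception there should show the actual cause.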