Argument is URL or path - python

What is the standard practice in Python when a command-line application takes exactly one argument, which is either
a URL to a web page
or
a path to an HTML file somewhere on disk?
Is the following code sufficient?
if "http://" in sys.argv[1]:
    print "URL"
else:
    print "path to file"

import sys
import urlparse
def is_url(url):
    # A non-empty scheme ("http", "ftp", ...) means the argument parsed as a URL.
    return urlparse.urlparse(url).scheme != ""
is_url(sys.argv[1])
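For readers on Python 3, where urlparse lives in urllib.parse, the same check could be sketched as below. Accepting only http and https is an assumption added here, not part of the answer above; it keeps a Windows path such as C:\pages\index.html, whose drive letter can parse as a scheme, from being mistaken for a URL. WEB_SCHEMES is a hypothetical name.

```python
from urllib.parse import urlparse

# Assumption: only these schemes count as "a URL to a web page".
WEB_SCHEMES = {"http", "https"}


def is_url(arg):
    """Return True if arg looks like an http(s) URL, False otherwise."""
    return urlparse(arg).scheme in WEB_SCHEMES


for candidate in ("http://example.com/index.html", "pages/index.html"):
    print(candidate, "->", "URL" if is_url(candidate) else "path to file")
```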

Depends on what the program must do. If it just prints whether it got a URL, sys.argv[1].startswith('http://') might do. If you must actually use the URL for something useful, do
import sys
from urllib2 import urlopen
try:
    f = urlopen(sys.argv[1])
except ValueError:  # not a URL (no recognised scheme), so treat it as a path
    f = open(sys.argv[1])

Larsmans' answer might work, but it doesn't check whether the user actually supplied an argument.
import urllib
import sys
try:
    arg = sys.argv[1]
except IndexError:
    print "Usage: "+sys.argv[0]+" file/URL"
    sys.exit(1)
try:
    site = urllib.urlopen(arg)
except ValueError:
    file = open(arg)
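Putting the argument check and the URL-or-path decision together, a Python 3 version might look like the sketch below. open_source and main are hypothetical names, and deciding by scheme rather than by catching ValueError is a choice made for this example, not something the answers above prescribe.

```python
import sys
from urllib.parse import urlparse
from urllib.request import urlopen


def open_source(arg):
    """Open arg as a URL if it has an http(s) scheme, else as a local file."""
    if urlparse(arg).scheme in ("http", "https"):
        return urlopen(arg)   # network access happens here
    return open(arg, "rb")    # binary mode, to match what urlopen returns


def main(argv):
    """Read the single file/URL argument and report how many bytes it held."""
    if len(argv) != 2:
        print("Usage: {} file/URL".format(argv[0]), file=sys.stderr)
        return 1
    with open_source(argv[1]) as source:
        html = source.read()
    print("read {} bytes".format(len(html)))
    return 0
```

A script would call sys.exit(main(sys.argv)) at its entry point, so a missing argument produces the usage message and a non-zero exit status.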

Related

Why is my Maltego transform giving me an empty value error?

I'm currently trying to write a Maltego transform that searches the Breach Compilation data using query.sh.
Code:
#!/usr/bin/env python
# Maltego transform for grabbing Breach Compilation results locally.
from MaltegoTransform import *
import sys
import os
import subprocess
email = sys.argv[1]
mt = MaltegoTransform()
try:
    dataleak = subprocess.check_output("LOCATION OF QUERY.SH" + email, shell=True).splitlines()
    for info in dataleak:
        mt.addEntity('maltego.Phrase', info)
    else:
        mt.addUIMessage("")
except Exception as e:
    mt.addUIMessage(str(e))
mt.returnOutput()
Empty value for #org.simpleframework.xml.Text(data=false,
required=true, empty=) on method 'value' in class
com.paterva.maltego.transform.protocol.v2api.messaging.TransformResponse$Notification
at line 26
Not sure what the problem is.
Turns out
    else:
        mt.addUIMessage("")
except Exception as e:
    mt.addUIMessage(str(e))
that bit of code was the problem. Got it working just fine.
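One reading of the error, offered as an assumption rather than a confirmed diagnosis: the else clause of a for loop runs whenever the loop finishes without hitting break, so the empty UI message was added on every run, and the error complains about an empty value in a Notification. The sketch below shows that for/else behaviour in plain Python; collect and its messages list are hypothetical stand-ins, not part of the Maltego library.

```python
def collect(lines):
    """Mimic the transform's loop; messages stands in for UI notifications."""
    entities, messages = [], []
    for info in lines:
        entities.append(info)
    else:
        # Runs after the loop ends normally, even when lines is non-empty.
        messages.append("")
    return entities, messages


print(collect(["first result", "second result"]))
print(collect([]))
```

In both calls the messages list ends up holding one empty string, whether or not any results were found.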

Python and Urllib. How to handle the network exceptions?

I'm working on a simple script that downloads a file over HTTP using urllib and urllib.request. Everything works well, except that I would like to handle the network problems that could happen:
Checking whether the computer is online (connected to the internet), and proceeding only if it is.
Restarting the download if the connection is lost or becomes too poor partway through.
I would like, if possible, to use as few packages as possible.
Here is my current code:
import urllib
import urllib.request
url = "http://my.site.com/myFile"
urlSplited = url.split('/')[-1];
print ("Downloading : "+urlSplited)
urllib.request.urlretrieve(url, urlSplited)
To check whether a connection is established, I believe I can do something like
while connection() is True:
    Download()
But that would download the file many times.
I'm working on Linux.
I suggest combining try, while and the sleep function, like this:
import socket
import time
import urllib.request
url = "http://my.site.com/myFile"
urlSplited = url.split('/')[-1]
# urlretrieve() has no timeout parameter, so set a default socket timeout instead.
socket.setdefaulttimeout(100)
try_again = True
print("Downloading : " + urlSplited)
while try_again:
    try:
        urllib.request.urlretrieve(url, urlSplited)
        try_again = False
    except Exception as e:
        print(e)
        time.sleep(600)  # wait 10 minutes before retrying
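A loop that retries forever with a fixed wait can be tightened by capping the attempts and growing the delay. The sketch below is one way to do that, not the answer's method; retry, its parameters, and the injected sleep argument are hypothetical names chosen for illustration, and only OSError is retried, on the assumption that network failures surface as URLError or socket timeouts.

```python
import time


def retry(action, attempts=5, delay=1.0, backoff=2.0, sleep=time.sleep):
    """Call action() until it succeeds or the attempts run out.

    Re-raises the last OSError if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except OSError as exc:  # URLError and socket timeouts are OSError subclasses
            if attempt == attempts:
                raise
            print("attempt {} failed: {}".format(attempt, exc))
            sleep(delay)
            delay *= backoff  # wait longer before each new attempt
```

With the question's variables, the call would look like retry(lambda: urllib.request.urlretrieve(url, urlSplited)).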
Replace while with if; then everything under it runs only once.
if connection():
    Download()
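Also, the connection() function could be something like this:
import urllib.request
def connection():
    # url is the download URL defined in the script above.
    try:
        urllib.request.urlopen(url, timeout=5)
        return True
    except OSError:  # URLError and socket timeouts are OSError subclasses
        return False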

Python else issues making an FTP program

I am having an issue with the else statement of this program. I have checked my spacing and it seems to be correct, but I keep getting a syntax error on the else statement. The program creates a file, then attempts to upload it to an FTP server. If the upload fails, it should not say anything to the user and just continue; it will try again when the program loops. Any help you could provide would be greatly appreciated.
#IMPORTS
import ConfigParser
import os
import random
import ftplib
from ftplib import FTP
#LOOP PART 1
from time import sleep
while True:
    #READ THE CONFIG FILE SETUP.INI
    config = ConfigParser.ConfigParser()
    config.readfp(open(r'setup.ini'))
    path = config.get('config', 'path')
    name = config.get('config', 'name')
    #CREATE THE KEYFILE
    filepath = os.path.join((path), (name))
    if not os.path.exists((path)):
        os.makedirs((path))
    file = open(filepath,'w')
    file.write('text here')
    file.close()
    #Create Full Path
    fullpath = path + name
    #Random Sleep to Accommodate FTP Server
    sleeptimer = random.randrange(1,30+1)
    sleep((sleeptimer))
    #Upload File to FTP Server
    try:
        host = '0.0.0.0'
        port = 3700
        ftp = FTP()
        ftp.connect(host, port)
        ftp.login('user', 'pass')
        file = open(fullpath, "rb")
        ftp.cwd('/')
        ftp.storbinary('STOR ' + name, file)
        ftp.quit()
        file.close()
    else:
        print 'Something is Wrong'
    #LOOP PART 2
    sleep(180.00)
else is valid as part of a try block, but it only runs if no exception is raised, and there must be an except clause defined before it.
(edit) Most people skip the else clause and just write code after exiting (dedenting) from the try/except clauses.
The quick tutorial is:
import traceback
try:
    # some statements that are executed until an exception is raised
    ...
except SomeExceptionType as e:
    # if some type of exception is raised
    ...
except SomeOtherExceptionType as e:
    # if another type of exception is raised
    ...
except Exception as e:
    # if *any* exception is raised - but this is usually evil because it hides
    # programming errors as well as the errors you want to handle. You can get
    # a feel for what went wrong with:
    traceback.print_exc()
    ...
else:
    # if no exception is raised
    ...
finally:
    # run regardless of whether exception was raised
    ...
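To see the clause order in action, here is a small runnable sketch in Python 3 syntax. parse_port and its log list are hypothetical, made up for this illustration; the log simply records which clauses ran.

```python
def parse_port(text):
    """Return (port, log), where log records which clauses ran."""
    log = []
    port = None
    try:
        port = int(text)
    except ValueError:
        log.append("except")   # only when int() fails
    else:
        log.append("else")     # only when int() succeeds
    finally:
        log.append("finally")  # always
    return port, log


print(parse_port("3700"))
print(parse_port("not a number"))
```

A valid number passes through else and finally; an invalid one passes through except and finally. An else with no except before it, as in the question, is a syntax error.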

DeadLink exception from Python2 to Python3

I found this code, written for Python 2.7, that skips dead links while reading a list of URLs and retrieving their content:
for i in xrange(len(lines)):
    try:
        t = urllib2.urlopen(urllib2.Request(lines[i]))
        deadlinkfound = False
    except:
        deadlinkfound = True
    if not(deadlinkfound):
        urllib.urlretrieve(lines[i], "Images/imag" + "-%s" % i)
It worked fine in Python 2, but I can't find the equivalent in Python 3 because urllib2 was merged into the urllib package.
You can do the exact same thing with urllib.request here. Don't catch every conceivable exception, only catch what is reasonably going to be thrown:
from urllib import request, error
from http.client import HTTPException
for i, url in enumerate(lines):
    try:
        t = request.urlopen(request.Request(url, method='HEAD'))
    except (HTTPException, error.URLError):  # URLError also covers HTTPError
        continue
    request.urlretrieve(url, 'Images/imag-{}'.format(i))
This code does the same, but more efficiently: the HEAD request checks each link without downloading its body.
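If the check is needed in more than one place, it could be pulled into a helper, as in the sketch below. is_alive and live_links are hypothetical names, the 10 second timeout is an arbitrary choice, and catching OSError and ValueError is broader than the answer's except clause; it is an assumption made here so that unreachable hosts, timeouts, and malformed URLs all count as dead.

```python
from http.client import HTTPException
from urllib import request


def is_alive(url, timeout=10):
    """Return True if a HEAD request to url succeeds, False otherwise."""
    try:
        with request.urlopen(request.Request(url, method="HEAD"), timeout=timeout):
            return True
    except (HTTPException, OSError, ValueError):
        # OSError covers URLError, HTTPError and socket timeouts;
        # ValueError is raised for strings that are not URLs at all.
        return False


def live_links(urls):
    """Yield (index, url) for every url that responds."""
    for i, url in enumerate(urls):
        if is_alive(url):
            yield i, url
```

The download loop then becomes: for i, url in live_links(lines): request.urlretrieve(url, 'Images/imag-{}'.format(i)).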

Bypass exception to always execute command?

The following code works almost perfectly, thanks to the help received here:
import urllib.request
import zipfile
import subprocess
urls = ["http://url.com/archive1.zip", "http://url.com/archive2.zip", "http://url.com/archive3.zip"]
filename = "C:/test/test.zip"
destinationPath = "C:/test"
for url in urls:
    try:
        urllib.request.urlretrieve(url,filename)
        sourceZip = zipfile.ZipFile(filename, 'r')
        break
    except ValueError:
        pass
for name in sourceZip.namelist():
    sourceZip.extract(name, destinationPath)
sourceZip.close()
subprocess.call(r'C:\WINDOWS\system32\cmd.exe /C "C:\test\test.exe"')
Except that when none of the URLs download successfully, the final subprocess.call command never gets executed and I get the following error:
Traceback (most recent call last):
  File "test.py", line 29, in <module>
    for name in sourceZip.namelist():
NameError: name 'sourceZip' is not defined
I have been playing around a bit with try: and except:, but cannot come to a working solution. I believe it is because the try/except has to be inside the sourceZip loop, but since the sourceZip variable never gets assigned, it fails. How would I alter this code so that subprocess.call always gets executed at the end, regardless of whether the download is successful? (I'm very new to Python.)
Set sourceZip = None prior to the for url in urls line. Then test if sourceZip is None after the for loop to determine if you were able to successfully fetch the file. For example:
sourceZip = None
for url in urls:
    try:
        urllib.request.urlretrieve(url,filename)
        sourceZip = zipfile.ZipFile(filename, 'r')
        break
    except ValueError:
        pass
if sourceZip is not None:
    for name in sourceZip.namelist():
        sourceZip.extract(name, destinationPath)
    sourceZip.close()
subprocess.call(r'C:\WINDOWS\system32\cmd.exe /C "C:\test\test.exe"')
Initialize sourceZip to None. Then, if it is not None, do the extraction and closing.
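The same None-sentinel idea can be wrapped in a function so the caller only has to test the return value. The sketch below leaves the download step out and assumes the candidate archives are already on disk; open_first_valid_zip and extract_if_available are hypothetical names. It catches OSError and zipfile.BadZipFile, which is what a missing or corrupt archive raises, rather than the ValueError used above.

```python
import zipfile


def open_first_valid_zip(paths):
    """Return an open ZipFile for the first usable path, or None if none work."""
    for path in paths:
        try:
            return zipfile.ZipFile(path, "r")
        except (OSError, zipfile.BadZipFile):
            continue  # missing or corrupt archive: try the next candidate
    return None


def extract_if_available(paths, destination):
    """Extract the first usable archive; return True if one was extracted."""
    archive = open_first_valid_zip(paths)
    if archive is None:
        return False
    with archive:  # closes the archive even if extraction fails
        archive.extractall(destination)
    return True
```

Because the function returns False instead of raising, whatever follows the call, such as the subprocess.call line, runs either way.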
