I am trying to read URLs directly from a MySQL table, use tldextract to get the domain from each URL, and find the SPF (Sender Policy Framework) record for that domain.
When I try to write the SPF records of every domain I scan, my Ouput_SPF_Records.txt does not contain any of the records I write.
I am not sure what the issue is. Any suggestions, please?
import sys
import socket
import dns.resolver
import re
import MySQLdb
import tldextract
from django.utils.encoding import smart_str, smart_unicode
def getspf (domain):
    answers = dns.resolver.query(domain, 'TXT')
    for rdata in answers:
        for txt_string in rdata.strings:
            if txt_string.startswith('v=spf1'):
                return txt_string.replace('v=spf1','')
db=MySQLdb.connect("x.x.x.x","username","password","db_table")
cursor=db.cursor()
cursor.execute("SELECT application_id,url FROM app_info.app_urls")
data=cursor.fetchall()
x=0
while x<len(data):
    c=tldextract.extract(data[x][1])
    #print c
    app_id=data[x][0]
    #print app_id
    d=str(app_id)+','+c[1]+'.'+c[2]
    #with open('spfout.csv','a') as out:
    domain=smart_str(d)
    #print domain
    with open('Ouput_SPF_Records.txt','w') as g:
        full_spf=""
        spf_rec=""
        y=domain.split(',')
        #print "y===",y,y[0],y[1]
        app_id=y[0]
        domains=y[1]
        try:
            full_spf=getspf(domains.strip())+"\n"
            spf_rec=app_id+","+full_spf
            print spf_rec
        except Exception:
            pass
        g.write(spf_rec)
    x=x+1
g.close()
Try opening the file in append mode instead of w mode. w mode overwrites the file in each iteration. Example -
with open('Ouput_SPF_Records.txt','a') as g:
Most probably, the last time you open the file in write mode you do not write anything to it, because you are catching and ignoring all exceptions, which leaves the file empty.
Also, if you know which error you are expecting, you should use except <Error>: instead of except Exception:. Example -
try:
    full_spf=getspf(domains.strip())+"\n"
    spf_rec=app_id+","+full_spf
    print spf_rec
except <Error you want to catch>:
    pass
Your problem is that you open the file many times, once on each pass through the loop. You use w mode, which erases the contents and writes from the beginning.
Either open the file once before the loop, or open it in append mode ('a'), so you don't delete the previously written data.
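The "open once before the loop" option can be sketched roughly as below. This is only an illustration in Python 3 syntax: write_records and fake_lookup are hypothetical names, and fake_lookup stands in for getspf so the sketch runs without DNS or database access.

```python
import os
import tempfile

def write_records(rows, lookup, out_path):
    """Write one 'app_id,spf' line per row; return how many lines were written."""
    written = 0
    # Open once, before the loop, so earlier lines are never truncated.
    with open(out_path, 'w') as out:
        for app_id, domain in rows:
            spf = lookup(domain)
            if not spf:  # the lookup found nothing; skip this row
                continue
            out.write('%s,%s\n' % (app_id, spf))
            written += 1
    return written

# Hypothetical stand-in for getspf (assumption: returns None when no SPF record exists).
def fake_lookup(domain):
    return {'example.com': ' include:_spf.example.com ~all'}.get(domain)

demo_path = os.path.join(tempfile.mkdtemp(), 'Ouput_SPF_Records.txt')
count = write_records([(1, 'example.com'), (2, 'nospf.example')],
                      fake_lookup, demo_path)
```

Because the file is opened a single time, every row that has a record ends up in the output, and the with block closes the file when the loop finishes.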
You can use:
import pdb;pdb.set_trace()
to debug your code and try to figure out the problem.
Also note that:
1. You shouldn't just write 'pass' in the except block. Deal with the exception.
2. with open('Ouput_SPF_Records.txt','w') as g: closes the file automatically, so there is no need to call g.close() explicitly.
I think this is the result of getspf returning None by default.
The problem is that Python can't concatenate str and NoneType (the type of None), which throws an exception that you quickly discard.
You may try this instead:
def getspf (domain):
    answers = dns.resolver.query(domain, 'TXT')
    for rdata in answers:
        for txt_string in rdata.strings:
            if txt_string.startswith('v=spf1'):
                return txt_string.replace('v=spf1','')
    return ""#"Error"
You should probably check for the exception; my guess is that the statements inside the try block are not executed, and spf_rec is left as "".
As per the POSIX definition (http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_206), every line you write should end with "\n".
You might consider initialising spf_rec with "\n" rather than "".
Also, as "Anand S Kumar" said, without append mode the file is overwritten at every "while x<len(data)" iteration.
I think that if you open the Ouput_SPF_Records.txt file with "vi", you will see the last line written (unless an exception occurred on the last pass of the loop, leaving the file as just "").
In other words, the problem is that many programs may not read a line that doesn't respect the POSIX standard, and because your file probably consists of a single line that doesn't respect this standard, the file won't be read at all.
Related
I'm trying to get the SPF records of a list of domains that are read from a file. When I fetch the SPF contents and write them to a file, the code gives me only the result for the last domain in the input file.
Example `Input_Domains.txt`
blah.com
box.com
marketo.com
The output I get is only for marketo.com
#!/usr/bin/python
import sys
import socket
import dns.resolver
import re
def getspf (domain):
    answers = dns.resolver.query(domain, 'TXT')
    for rdata in answers:
        for txt_string in rdata.strings:
            if txt_string.startswith('v=spf1'):
                return txt_string.replace('v=spf1','')
with open('Input_Domains.txt','r') as f:
    for line in f:
        full_spf=getspf(line.strip())
my_file=open("out_spf.txt","w")
my_file.write(full_spf)
my_file.close()
How can I solve this and write the SPF contents of all the domains to the file? Any suggestions, please?
It is because you are overwriting full_spf on every iteration, so only the last value is stored
with open('Input_Domains.txt','r') as f:
    for line in f:
        full_spf=getspf(line.strip())
Modification:
with open('Input_Domains.txt','r') as f:
    full_spf=""
    for line in f:
        full_spf+=getspf(line.strip())+"\n"
Try using a generator expression inside your with block, instead of a regular for loop:
full_spf = '\n'.join(getspf(line.strip()) for line in f)
This will grab all the lines at once, do your custom getspf operations to them, and then join them with newlines between.
The advantage of doing it this way is that conceptually you're doing a single transformation over the data. There's nothing inherently "loopy" about taking a block of data and processing it line by line, since all lines are independent and could be processed in any order. By doing it with a generator expression you are expressing your algorithm as a single transformation-and-assignment operation.
Edit: Small oversight, since join needs a list of strings, you'll have to return at least an empty string in every case from your getspf function, rather than defaulting to None when you don't return anything.
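To make the join approach and the empty-string caveat concrete, here is a small sketch in Python 3 syntax. The names spf_or_empty, collect_spf and fake_lookup are hypothetical; fake_lookup replaces getspf and is assumed to return None for a domain without an SPF record.

```python
def spf_or_empty(lookup, domain):
    """Return the lookup result, or '' when the lookup returns None."""
    result = lookup(domain)
    return result if result is not None else ''

def collect_spf(lines, lookup):
    # One transformation over all lines, joined with newlines.
    return '\n'.join(spf_or_empty(lookup, line.strip()) for line in lines)

# Hypothetical stand-in for getspf: no entry for blah.com, so it returns None there.
def fake_lookup(domain):
    table = {'box.com': ' include:a.example ~all',
             'marketo.com': ' include:b.example ~all'}
    return table.get(domain)

demo_lines = ['blah.com\n', 'box.com\n', 'marketo.com\n']
full_spf = collect_spf(demo_lines, fake_lookup)
```

Without the None-to-'' conversion, join would raise a TypeError on the first domain that has no record.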
I need to write data to a file and overwrite it if the file exists. In case of an error, my code should catch the exception and restore the original file (if applicable).
How can I restore the file? Should I read the original file and keep its content, e.g. in a list, and then in case of an exception just write that list back to the file?
Are there any other options? Many thanks if anyone can provide some kind of example.
Cheers
The code below will make a copy of the original file and set a flag indicating whether it existed before or not. Do your work in the try block; if it makes it to the end, it will set the worked flag to True and you are good to go. Otherwise it will never get there: the exception will still be raised, but the finally block will clean up and restore the file if it existed in the first place.
import os
import shutil
if os.path.isfile(original_file):
    shutil.copy2(original_file, 'temp' + original_file)
    prev_existed = True
else:
    prev_existed = False
worked = False
try:
    with open(original_file, 'w') as f:
        ## your code
        worked = True
except:
    raise
finally:
    if not worked and prev_existed:
        shutil.copy2('temp' + original_file, original_file)
Any attempt to restore on error will be fragile; what if your program can't continue or encounters another error when attempting to restore the data?
A better solution would be to not replace the original file until you know that whatever you're doing succeeded. Write whatever you need to to a temporary file, then when you're ready, replace the original with the temporary.
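A minimal sketch of that idea, assuming Python 3.3 or later (for os.replace) and that the temporary file may live in the same directory as the target so the final rename stays on one filesystem. atomic_write is a hypothetical helper name, not something from the question.

```python
import os
import tempfile

def atomic_write(path, text):
    """Write text to a temp file, then swap it into place only on success."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, 'w') as tmp:
            tmp.write(text)
        # Only reached if the write succeeded; replaces the original in one step.
        os.replace(tmp_path, path)
    except BaseException:
        os.remove(tmp_path)  # discard the partial temp file, original untouched
        raise

demo_dir = tempfile.mkdtemp()
target = os.path.join(demo_dir, 'data.txt')
atomic_write(target, 'first')
atomic_write(target, 'second')
try:
    atomic_write(target, None)  # write() rejects None, so this attempt fails
except TypeError:
    pass
```

After the failed third call the target still holds the content of the second call, and no temporary file is left behind, so there is nothing to restore.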
One way to do it is to append the new data at the end of the existing file. If you catch an error, delete everything that you appended.
If there are no errors, delete everything that you had before appending, so the new file will contain only the appended data.
Create a copy of your file before starting capturing the data in a new file.
Please don't laugh. I'm trying to write a simple script that will replace the hostname and IP of a base VM. I have a working version of this, but I'm trying to make it more readable and concise. I'm getting a syntax error when trying the code below. I was trying to make these list comprehensions, but since they are file types, that won't work. Thanks in advance.
try:
    old_network = open("/etc/sysconfig/network", "r+")
    new_network = open("/tmp/network", "w+")
    replacement = "HOSTNAME=" + str(sys.argv[1]) + "\n"
    shutil.copyfile('/etc/sysconfig/network', '/etc/sysconfig/network.setup_bak')
    for line in old_network: new_network.write(line) if not re.match(("HOSTNAME"), line)
    for line in old_network: new_network.write(replacement) if re.match(("HOSTNAME"), line)
    os.rename("/tmp/network","/etc/sysconfig/network")
    print 'Hostname set to', str(sys.argv[1])
except IOError, e:
    print "Error %s" % e
    pass
You are using some odd syntax here:
for line in old_network: new_network.write(line) if not re.match(("HOSTNAME"), line)
for line in old_network: new_network.write(replacement) if re.match(("HOSTNAME"), line)
You need to reverse the if statement there, and put these on separate lines; you have to combine the loops, which lets you simplify the if statements too:
for line in old_network:
    if not re.match("HOSTNAME", line):
        new_network.write(line)
    else:
        new_network.write(replacement)
You cannot really loop over an input file twice (your second loop wouldn't do anything as the file has already been read in full).
Next, you want to use the open file objects as context managers (using with) to make sure they are closed properly, whatever happens. You can drop the + from the file modes, since you are not using the files in mixed mode, and the backup copy is probably best done first, before opening anything for reading or writing.
There is no need to use a regular expression here; you are testing for the presence of a straightforward simple string, 'HOSTNAME' in line will do, or perhaps line.strip().startswith('HOSTNAME') to make sure the line starts with HOSTNAME.
Use the tempfile module to create a temporary file with a name that won't conflict:
import os
import shutil
import sys
from tempfile import NamedTemporaryFile
shutil.copyfile('/etc/sysconfig/network', '/etc/sysconfig/network.setup_bak')
replacement = "HOSTNAME={}\n".format(sys.argv[1])
new_network = NamedTemporaryFile(mode='w', delete=False)
with open("/etc/sysconfig/network", "r") as old_network, new_network:
    for line in old_network:
        if line.lstrip().startswith('HOSTNAME'):
            line = replacement
        new_network.write(line)
os.rename(new_network.name, "/etc/sysconfig/network")
print 'Hostname set to {}'.format(sys.argv[1])
You can simplify this even further by using the fileinput module, which lets you replace a file contents by simply printing; it supports creating a backup file natively:
import fileinput
import sys
replacement = "HOSTNAME={}\n".format(sys.argv[1])
for line in fileinput.input('/etc/sysconfig/network', inplace=True, backup='.setup_bak'):
    if line.lstrip().startswith('HOSTNAME'):
        line = replacement
    sys.stdout.write(line)
print 'Hostname set to {}'.format(sys.argv[1])
That's 6 lines of code (not counting imports) versus your 12. We can squash this down to just 4 by using a conditional expression, but I am not sure if that makes things more readable:
for line in fileinput.input('/etc/sysconfig/network', inplace=True, backup='.setup_bak'):
    sys.stdout.write(replacement if line.lstrip().startswith('HOSTNAME') else line)
def FileCheck(fn):
    try:
        fn=open("TestFile.txt","U")
    except IOError:
        print "Error: File does not appear to exist."
    return 0
I'm trying to make a function that checks whether a file exists and, if it doesn't, prints the error message and returns 0. Why isn't this working?
You'll need to indent the return 0 if you want to return from within the except block.
Also, your argument isn't doing much of anything. Instead of assigning it the filehandle, I assume you want this function to be able to test any file? If not, you don't need any arguments.
def FileCheck(fn):
    try:
        open(fn, "r")
        return 1
    except IOError:
        print "Error: File does not appear to exist."
        return 0
result = FileCheck("testfile")
print result
I think os.path.isfile() is better if you just want to check whether a file exists, since you do not need to actually open the file. Also, it is considered best practice to close a file after opening it, and the examples above do not do this.
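Both points can be sketched as follows, in Python 3 syntax. file_check and can_read are hypothetical names; the first only asks the filesystem, while the second opens the file inside a with block so it is closed again straight away.

```python
import os
import tempfile

def file_check(path):
    """Return True if path names an existing regular file (nothing is opened)."""
    return os.path.isfile(path)

def can_read(path):
    """Return True if the file can be opened for reading; the with block closes it."""
    try:
        with open(path, 'r'):
            return True
    except IOError:
        return False

# Demo data in a throwaway directory (assumption: the temp location is writable).
demo_dir = tempfile.mkdtemp()
existing = os.path.join(demo_dir, 'TestFile.txt')
with open(existing, 'w') as fh:
    fh.write('hello\n')
missing = os.path.join(demo_dir, 'nope.txt')
```

Note that os.path.isfile returns False for a directory, whereas a bare existence check with os.path.exists would return True for it.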
This is likely because you want to open the file in read mode.
Replace the "U" with "r".
Of course, you can use os.path.isfile('filepath') too.
If you just want to check whether a file exists, the Python os library has solutions for that, such as os.path.isfile('TestFile.txt'). OregonTrail's answer wouldn't work as-is, since you would still need to close the file at the end with a finally block, and to do that you must store the file object in a variable outside the try and except block, which defeats the whole purpose of your solution.
I am FTPing a zip file from a remote FTP site using Python's ftplib. I then attempt to write it to disk. The file write works, however most attempts to open the zip using WinZip or WinRar fail; both apps claim the file is corrupted. Oddly however, when right clicking and attempting to extract the file using WinRar, the file will extract.
So to be clear, the file write will work, but will not open inside the popular zip apps, but will decompress using those same apps. Note that the Python zipfile module never fails to extract the zips.
Here is the code that I'm using to get the zip file from the FTP site (please ignore the bad tabbing, that's not the issue).
filedata = None
def appender(chunk):
    global filedata
    filedata += chunk
def getfile(filename):
    try:
        ftp = None
        try:
            ftp = FTP(address)
            ftp.login('user', 'password')
        except Exception, e:
            print e
        command = 'RETR ' + filename
        idx = filename.rfind('/')
        path = filename[0:idx]
        ftp.cwd(path)
        fileonly = filename[idx+1:len(filename)]
        ftp.retrbinary('RETR ' + filename, appender)
        global filedata
        data = filedata
        ftp.close()
        filedata = ''
        return data
    except Exception, e:
        print e
data = getfile('/archives/myfile.zip')
file = open(pathtoNTFileShare, 'wb')
file.write(data)
file.close()
Pass file.write directly inside the retrbinary function instead of passing appender. This will work and it will also not use that much RAM when you are downloading a big file.
If you'd like the data stored inside a variable though, you can also have a variable named:
blocks = []
Then pass to retrbinary instead of appender:
blocks.append
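Your current appender function is wrong. filedata starts out as None, so the first += raises a TypeError. Python strings themselves can hold NUL bytes, so the binary data is not the obstacle, but collecting the chunks in a list or writing them straight to the file avoids the problem entirely.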
As mentioned by @Lee B, you can also use urllib2 or Curl. But your current code is almost correct if you make the small modifications I mentioned above.
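The two suggestions above (passing the file's write method, or blocks.append, as the retrbinary callback) might look roughly like this in Python 3 syntax. download and download_to_memory are hypothetical helper names, and FakeFTP is a made-up stand-in for a logged-in ftplib.FTP object so the sketch runs without a server; with a real connection you would pass the FTP instance instead.

```python
import os
import tempfile

def download(ftp, remote_name, local_path):
    """Stream the remote file to disk; each chunk is written as it arrives."""
    with open(local_path, 'wb') as out:
        ftp.retrbinary('RETR ' + remote_name, out.write)

def download_to_memory(ftp, remote_name):
    """Collect the chunks in a list and join them once at the end."""
    blocks = []
    ftp.retrbinary('RETR ' + remote_name, blocks.append)
    return b''.join(blocks)

class FakeFTP(object):
    """Hypothetical stub: feeds a fixed payload to the callback in small chunks."""
    def __init__(self, payload, blocksize=4):
        self.payload = payload
        self.blocksize = blocksize

    def retrbinary(self, cmd, callback):
        for i in range(0, len(self.payload), self.blocksize):
            callback(self.payload[i:i + self.blocksize])

payload = b'PK\x03\x04\x00\x00data\x00end'  # includes NUL bytes on purpose
fake = FakeFTP(payload)
local_path = os.path.join(tempfile.mkdtemp(), 'myfile.zip')
download(fake, '/archives/myfile.zip', local_path)
in_memory = download_to_memory(fake, '/archives/myfile.zip')
```

Opening the local file in 'wb' mode matters on Windows, where text mode would translate line endings and corrupt the archive.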
I've never used that library, but urllib2 works fine, and is more straightforward. Curl is even better.
Looking at your code, I can see a couple of things wrong. Your exception catching only prints the exception, then continues. For fatal errors like not getting an FTP connection, they need to print the message and then exit. Also, your filedata starts off as None, then your appender uses += to add to that, so you're trying to append a string + None, which gives a TypeError when I try it here. I'm surprised it's working at all; I would have guessed that the appender would throw an exception, and so the FTP copy would abort.
While re-reading, I just noticed another answer about use of += on binary data. That could well be it; python tries to be smart sometimes, and could be "helping" when you join strings with whitespace or NULs in them, or something like that. Your best bet there is to have the file open (let's call it outfile), and use your appender to just outfile.write(chunk).