Mongodump specify the output and filename - python

Is there a way to specify the whole output path, including the filename, when running mongodump? I tried --out, but at the moment it saves the dump to my_given_path/database/collection_name.json.gz.
I have the following:
path = file_path + '/' + database + '/' + collection + '/'
query_input = "{\\\"metadata_id\\\": {\\\"\$oid\\\": \\\"" + metadata_id + "\\\"}}"
command = "mongodump --uri " + connection_string + database + " " \
"--collection=" + collection + " --query=\"" + query_input + "\" --gzip --out=" + path + " --quiet"
for which the file is saved at:
file_path/database/collection/database/collection.json.gz
Ideally I would like to save it into
file_path/database/collection/metadata_id.json.gz
Would this be possible?

You can use the --archive=<file> flag instead of --out. It writes the whole dump to the single file you name.
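A minimal sketch of how the command from the question might be built around --archive, assuming Python 3. The helper name build_mongodump_args and the .archive.gz suffix are illustrative choices, not anything mongodump requires. Note that --archive produces mongodump's own archive format rather than a JSON file, so the result is meant to be restored with mongorestore --archive --gzip.

```python
import json
import os


def build_mongodump_args(uri, collection, metadata_id, out_dir):
    """Build a mongodump argument list that writes one gzipped archive file.

    `uri` is the connection string with the database name appended, as in
    the question. The function name and file suffix are hypothetical.
    """
    # json.dumps takes care of the quoting that was escaped by hand before.
    query = json.dumps({"metadata_id": {"$oid": metadata_id}})
    archive_path = os.path.join(out_dir, metadata_id + ".archive.gz")
    return [
        "mongodump",
        "--uri=" + uri,
        "--collection=" + collection,
        "--query=" + query,
        "--gzip",
        "--archive=" + archive_path,
        "--quiet",
    ]
```

Passing the list to subprocess.run(args, check=True) avoids the shell entirely, so the query JSON needs no manual escaping.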

Related

tar: cowardly refusing to create an empty archive and I don't know why it would give me that error (files to be archived are not empty)

I looked on Stack Overflow and other online resources for a solution, but nothing I found applied to my situation.
My script works until the line that should create a tar archive of two files with os.system and store that archive in /home/dahmed26/backups/xmlfiles.
It gives the following error:
tar: Cowardly refusing to create an empty archive
Try 'tar --help' or 'tar --usage' for more information.
sh: line 1: .tar.gz: command not found
This is my code:
import os
currentuser = os.popen('whoami')
username = currentuser.read().strip()
if username != 'root':
    print("Must be root")
    exit()
else:
    vmChoice = input("Choose a VM")
    now = os.popen('date +%Y%m%d').read()
    name = (vmChoice + '-' + now)
    os.system('virsh dumpxml ' + vmChoice + ' > /home/dahmed26/backups/xmlfiles/' + vmChoice + '.xml')
    location = os.popen('cat /home/dahmed26/backups/xmlfiles/' + vmChoice + '.xml | grep "source file" | cut -d "\'" -f2').read()
    os.system('tar czf' + ' ' + '/home/dahmed26/backups/xmlfiles/' + name + '.tar.gz' + ' ' + '/home/dahmed26/backups/xmlfiles/' + vmChoice + '.xml' + location)
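The second error line (".tar.gz: command not found") suggests the shell saw .tar.gz at the start of a new line. That is consistent with the output of date, read through os.popen(...).read(), ending in a newline, which splits the tar command in two and leaves tar with no files to archive. The command also has no space between the .xml path and location. A minimal sketch of one way around both, assuming Python 3; the helper name build_tar_command and the disk path in the usage comment are hypothetical:

```python
from datetime import date


def build_tar_command(backup_dir, vm_name, disk_path, today=None):
    """Return the tar command as an argument list with no stray newlines."""
    today = today or date.today()
    # strftime replaces the `date +%Y%m%d` call and never appends a newline.
    stamp = today.strftime("%Y%m%d")
    archive = "{}/{}-{}.tar.gz".format(backup_dir, vm_name, stamp)
    xml_file = "{}/{}.xml".format(backup_dir, vm_name)
    # strip() removes the trailing newline left by reading command output.
    return ["tar", "czf", archive, xml_file, disk_path.strip()]


# Hypothetical usage:
# import subprocess
# subprocess.run(build_tar_command("/home/dahmed26/backups/xmlfiles",
#                                  vmChoice, location), check=True)
```

Because each path is its own list element, a missing space or an embedded newline can no longer change how the command is split.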

Python script to run FME workbench

I have more than 500 XML files, and each one should be processed individually in an FME workbench (one run of the workbench per XML file).
For that purpose I run a Python file (loop.py) that iterates the FME workbench over each XML file.
The whole process used to work on another PC without any problem. Now, when I run the module, I get the following error:
Traceback (most recent call last):E:\XML_Data
  File "E:\XML_Data\process\01_XML_Tile_1.py", line 28, in <module>
    if "Translation was SUCCESSFUL" in open(path_log + "\\" + data + ".log").read():
IOError: [Errno 2] No such file or directory: 'E:\XML_Data\data_out\log_01\re_3385-5275.xml.log'
The Python code (loop.py) is below.
Any help is greatly appreciated.
import os
import time
# Mainpath and Working Folder:
#path_main = r"E:\XML_Data"
path_main = r"E:\XML_Data"
teil = str("01")
# variables
path_in = path_main + r"\data_in\03_Places\teil_" + teil # "Source folder of XML files"
path_in_tile10 = path_main + r"\data_in\01_Tiling\10x10.shp" # "Source folder of Grid shapefile"
path_in_commu = path_main + r"\data_in\02_Communities\Communities.shp" # "Source folder of Communities shapefile"
path_out = path_main + r"\data_out\teil_" + teil # "Output folder of shapefiles that resulted from XML files (tile_01 folder)"
path_log = path_main + r"\data_out\log_" + teil # "Output folder of log files for each run(log_01 folder)"
path_fme = r"%FME_EXE_2015%" # "C:\Program Files\FME2015\fme.exe"
path_fme_workbench = path_main + r"\process\PY_FME2015.fmw" # "path of FME workbench"
datalists = os.listdir(path_in)
count = 0
# loop each file individually in FME
for data in datalists:
    if data.find(".xml") != -1:
        count +=1
        print ("Run-No." + str(count) + ": with data " + data)
        os.system (path_fme + " " + path_fme_workbench + " " + "--SourceDataset_XML"+ " " + path_in + "\\" + data + " " + "--SourceDataset_SHAPE" + " " + path_in_tile10 + " " + "--SourceDataset_SHAPE_COMU" + " " + path_in_commu + " " + "--DestDataset_SHAPE" +" " +path_out + " " +"LOG_FILENAME" + " " + path_log + "\\" + data + ".log" )
        print ("Data processed: " + data)
        shape = str(data[19:28]) + "_POPINT_CENTR_UTM32N.shp"
        print ("ResultsFileName: " + shape)
        if "Translation was SUCCESSFUL" in open(path_log + "\\" + data + ".log").read():
            # Translation was successful and SHP file exists:
            if os.path.isfile(path_out + "\\" + shape):
                write_log = open(path_out + "\\" + "result_xml.log", "a")
                write_log.write(time.asctime(time.localtime()) + " " + shape + "\n")
                write_log.close()
                print("Everything ok")
            #Translation was successful, but SHP file does not exist:
            else:
                write_log = open(path_out + "\\" + "error_xml.log", "a")
                write_log.write(time.asctime(time.localtime()) + " Data: " + shape + " unavailable.\n")
                write_log.close()
        # Translation was not successful:
        else:
            write_log = open(path_out + "\\" + "error_xml.log", "a")
            write_log.write(time.asctime(time.localtime()) + " Translation " + data + " not successful.\n")
            write_log.close()
print ("Number of calculated files: " + str(count))
Most likely, the script failed at the os.system line, so the log file was not created from the command. Since you mentioned a different computer, it could be caused by many reasons, such as a different version of FME (so the environment variable %FME_EXE_2015% would not exist).
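To tell a failed FME launch apart from a failed translation, the loop could check the exit code and confirm the log file exists before opening it. The following is only a sketch under assumptions: Python 3, and the helper names translation_succeeded and run_workspace are hypothetical. When the command is passed as a list without a shell, %FME_EXE_2015% is not expanded, so the executable path would have to come from something like os.environ.get("FME_EXE_2015").

```python
import os
import subprocess


def translation_succeeded(log_path):
    """Return True only if the log file exists and reports success."""
    if not os.path.isfile(log_path):
        return False
    with open(log_path) as log_file:
        return "Translation was SUCCESSFUL" in log_file.read()


def run_workspace(fme_exe, workspace, parameters, log_path):
    """Run FME once and return (exit_code, succeeded).

    exit_code is None when the executable could not be started at all.
    """
    command = [fme_exe, workspace] + list(parameters) + ["LOG_FILENAME", log_path]
    try:
        exit_code = subprocess.call(command)
    except OSError:
        # Raised when the executable is missing or not runnable.
        return None, False
    return exit_code, translation_succeeded(log_path)
```

With this shape, a missing log file leads to an entry in the error log instead of an IOError that stops the whole batch.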
Alternatively, use a WorkspaceRunner transformer inside FME to run the workspace once per file.
The FME version is outdated, so first check whether the version is causing the problem.
subprocess.call(["C:/Program Files/fme/FMEStarter/FMEStarter.exe", "C:/Program Files/fme/fme20238/fme.exe", "/fmefile.fmw", "LOG_FILENAME", "logfile"], stdin=None, stdout=None, stderr=None, shell=True, timeout=None)

How to avoid multiple try except blocks python

I wrote some code that downloads images from a website. As it currently works, it has to guess the file extension of each URL it downloads from. The block of code that does that looks like this:
for imageLink in imageLinks:
    try:
        urllib.request.urlretrieve(imageLink + ".png", str(threadName) + "/" + str(count) + ".png")
    except:
        try:
            urllib.request.urlretrieve(imageLink + ".jpg",str(threadName) + "/" + str(count) + ".jpg")
        except:
            try:
                urllib.request.urlretrieve(imageLink + ".gif",str(threadName) + "/" + str(count) + ".gif")
            except:
                urllib.request.urlretrieve(imageLink + ".webm",str(threadName) + "/" + str(count) + ".webm")
As it stands, the code relies on a failure in order to try something else.
I wanted to know if there is a way to keep this functionality but make it look better. These calls give identical errors if they fail, so I just want to go through them sequentially until one works.
for ext in ('.png', '.jpg', '.gif', '.webm'):
    try:
        urllib.request.urlretrieve(imageLink + ext, str(threadName) + "/" + str(count) + ext)
        break
    except:
        pass
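A slightly more defensive variant of the loop above, as a sketch assuming Python 3. The function name download_first_available and the fetch parameter are hypothetical; fetch exists only so the logic can be exercised without network access. Catching OSError instead of using a bare except still covers urllib.error.URLError and HTTPError (both are subclasses of OSError) but lets KeyboardInterrupt through.

```python
import urllib.error
import urllib.request

EXTENSIONS = (".png", ".jpg", ".gif", ".webm")


def download_first_available(base_url, dest_base, fetch=urllib.request.urlretrieve):
    """Try each extension in order and return the one that downloaded.

    Returns None if every attempt failed.
    """
    for ext in EXTENSIONS:
        try:
            fetch(base_url + ext, dest_base + ext)
        except OSError:
            # Download or file write failed; move on to the next extension.
            continue
        return ext
    return None
```

Returning None lets the caller decide whether a link with no matching extension should be skipped or logged, rather than silently ignored.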
You can put the try/except block inside a function and return None if control reaches the except clause. You can then adapt the for loop to your own needs. One example:
def get_url(link1, link2):
    try:
        requestData = urllib.request.urlretrieve(link1, link2)
    except:
        return None
    return requestData
for imageLink in imageLinks:
    # get_url returns None on failure, so fall through to the next extension
    data = get_url(imageLink + ".png", str(threadName) + "/" + str(count) + ".png")
    if data is None:
        data = get_url(imageLink + ".jpg", str(threadName) + "/" + str(count) + ".jpg")
    if data is None:
        data = get_url(imageLink + ".gif", str(threadName) + "/" + str(count) + ".gif")
    if data is None:
        urllib.request.urlretrieve(imageLink + ".webm", str(threadName) + "/" + str(count) + ".webm")

Error writing to folder in python os.makedirs()

I am trying to download a .zip file from an FTP site (the download itself works regardless of the problem). I create a folder whose name contains the current date, and I want the downloaded zip file to be placed in that newly created folder. My code is below.
import os
import urllib
import datetime
now = datetime.datetime.now()
situs = "ftp://pbcgis:sigcbp#ftp.co.palm-beach.fl.us/CWGIS/SITUS_PUB.zip"
path = os.path.join(r"Y:\JH_Data_Dump\SITUS\PBC_SITUS" + str(now.month) + "_" + str(now.day) + "_" + str(now.year))
path1 = os.path.join(path + "PBC_SITUS" + str(now.month) + "_" + str(now.day) + "_" + str(now.year) +".zip")
print "Creating new directory..."
os.makedirs(path)
print "beginning PBC SITUS Download..."
urllib.urlretrieve(situs, path1)
I get no errors and the file downloads successfully, but the .zip is not placed in my newly created folder. It ends up in the same directory as the folder, not inside it.
You are using os.path.join incorrectly. Path segments (the directories and the filename) must be passed as separate arguments. They are then joined with the path separator, either \ or /.
path = os.path.join('Y:\\', "PBC_SITUS123")
path1 = os.path.join(path, "PBC_SITUS123" + ".zip")
will result in Y:\PBC_SITUS123\PBC_SITUS123.zip
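Applied to the dated folder from the question, the same idea might look like the sketch below. It assumes Python 3 and uses ntpath, the Windows implementation of os.path, so the result is the same on any platform. The function name build_download_paths and the example date are hypothetical.

```python
import ntpath
from datetime import date


def build_download_paths(base_dir, today):
    """Return (folder, zip_path) with the zip file inside the dated folder."""
    name = "PBC_SITUS{}_{}_{}".format(today.month, today.day, today.year)
    # Each segment is a separate argument, so join inserts the separator.
    folder = ntpath.join(base_dir, name)
    zip_path = ntpath.join(folder, name + ".zip")
    return folder, zip_path


folder, zip_path = build_download_paths(r"Y:\JH_Data_Dump\SITUS", date(2015, 3, 7))
print(folder)    # Y:\JH_Data_Dump\SITUS\PBC_SITUS3_7_2015
print(zip_path)  # Y:\JH_Data_Dump\SITUS\PBC_SITUS3_7_2015\PBC_SITUS3_7_2015.zip
```

On Windows itself, os.path.join behaves the same way as ntpath.join, so the folder can be passed to os.makedirs and the zip path to the download call.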
I figured out why: I was missing a "\" in the path1 string.
It should read:
path1 = os.path.join(path + r"\PBC_SITUS" + str(now.month) + "_" + str(now.day) + "_" + str(now.year) +".zip")

Equivalent to grep, but not using open()

I have the following script (see below), which takes its input and writes it into some simple files.
# Import Modules for script
import os, sys, fileinput, platform, subprocess
# Global variables
hostsFile = "hosts.txt"
hostsLookFile = "hosts.csv"
# Determine platform
plat = platform.system()
if plat == "Windows":
    # Define Variables based on Windows and process
    currentDir = os.getcwd()
    hostsFileLoc = currentDir + "\\" + hostsFile
    hostsLookFileLoc = currentDir + "\\" + hostsLookFile
    ipAddress = sys.argv[1]
    hostName = sys.argv[2]
    hostPlatform = sys.argv[3]
    hostModel = sys.argv[4]
    # Add ipAddress to the hosts file for python to process
    with open(hostsFileLoc,"a") as hostsFilePython:
        for line in open(hostsFilePython, "r"):
            if ipAddress in line:
                print "ipAddress \(" + ipAddress + "\) already present in hosts file"
            else:
                print "Adding ipAddress: " + ipAddress + " to file"
                hostsFilePython.write(ipAddress + "\n")
    # Add all details to the lookup file for displaying on-screen and added value
    with open(hostsLookFileLoc,"a") as hostsLookFileCSV:
        for line in open(hostsLookFileCSV, "r"):
            if ipAddress in line:
                print "ipAddress \(" + ipAddress + "\) already present in lookup file"
            else:
                print "Adding details: " + ipAddress + "," + hostName + "," + hostPlatform + "," + hostModel + " to file"
                hostsLookFileCSV.write(ipAddress + "," + hostName + "," + hostPlatform + "," + hostModel + "\n")
if plat == "Linux":
    # Define Variables based on Linux and process
    currentDir = os.getcwd()
    hostsFileLoc = currentDir + "/" + hostsFile
    hostsLookFileLoc = currentDir + "/" + hostsLookFile
    ipAddress = sys.argv[1]
    hostName = sys.argv[2]
    hostPlatform = sys.argv[3]
    hostModel = sys.argv[4]
    # Add ipAddress to the hosts file for python to process
    with open(hostsFileLoc,"a") as hostsFilePython:
        print "Adding ipAddress: " + ipAddress + " to file"
        hostsFilePython.write(ipAddress + "\n")
    # Add all details to the lookup file for displaying on-screen and added value
    with open(hostsLookFileLoc,"a") as hostsLookFileCSV:
        print "Adding details: " + ipAddress + "," + hostName + "," + hostPlatform + "," + hostModel + " to file"
        hostsLookFileCSV.write(ipAddress + "," + hostName + "," + hostPlatform + "," + hostModel + "\n")
This code obviously does not work, because the for line in open(hostsFilePython, "r"): syntax is wrong: I cannot pass an existing file object to open(). However, this is what I want to achieve. How can I do this?
You want to open your file using the a+ mode so that you can both read and write, then simply use the existing file object (hostsFilePython).
However, this still won't work as you can only iterate over a file once before it is exhausted.
It's worth noting that this isn't very efficient. A better plan is to read the data into a set, update the set with your new values, then write the set back to the file. (As pointed out in the comments, sets preserve neither duplicates, which is good for your purposes, nor order, which may or may not work for you. If order matters, you might need to use a list, which will be less efficient.)
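A sketch of a variation on that plan, assuming Python 3 (the question's code is Python 2) and a hypothetical helper name add_host. Instead of rewriting the file from the set, it uses the set only for the membership test and appends new entries, so the existing order is kept. It also compares whole lines rather than substrings, so 10.0.0.1 is not mistaken for 10.0.0.10.

```python
import os


def add_host(hosts_path, ip_address):
    """Append ip_address to hosts_path unless it is already listed.

    Returns True if the address was added, False if it was already present.
    """
    known = set()
    if os.path.exists(hosts_path):
        # Read every existing entry once; the file is closed before writing.
        with open(hosts_path) as hosts_file:
            known = {line.strip() for line in hosts_file if line.strip()}
    if ip_address in known:
        return False
    with open(hosts_path, "a") as hosts_file:
        hosts_file.write(ip_address + "\n")
    return True
```

The same function could serve the lookup CSV if the comparison were made on the first comma-separated field instead of the whole line.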
# Read the existing entries first, then reopen the file only if an append is needed
with open(hostsFileLoc) as hostsFilePython:
    lines = hostsFilePython.readlines()
if any(ipAddress in line for line in lines):
    print "ipAddress \(" + ipAddress + "\) already present in hosts file"
else:
    print "Adding ipAddress: " + ipAddress + " to file"
    with open(hostsFileLoc, 'a') as hostsFilePython:
        hostsFilePython.write(ipAddress + "\n")
The default mode is read, so you don't need to pass in r explicitly.
