os.renames for ftp in python - python

I want to move a large number of files from a Windows system to a Unix FTP server using Python. I have a CSV which has the current full path and filename and the new base path to send it to (see here for an example dataset).
I have a script that uses os.renames to do the transfer and directory creation on Windows, but I can't figure out a way to do it easily via FTP.
import os, glob, arcpy, csv, sys, shutil, datetime, ftplib
top=os.getcwd()
RootOutput = top
startpath=top
FileList = csv.reader(open('FileList.csv'))
filecount=0
successcount=0
errorcount=0
# Copy/Move to FTP when required
ftp = ftplib.FTP('xxxxxx')
ftp.login('xxxx', 'xxxx')
directory = '/TransferredData'
ftp.cwd(directory)
##f = open(RootOutput+'\\Success_LOG.txt', 'a')
##f.write("Log of files Succesfully processed. RESULT of process run #:"+str(datetime.datetime.now())+"\n")
##f.close()
##
for File in FileList:
    infile=File[0]
    # local network version
    #outfile=RootOutput+File[4]
    #os.renames(infile, outfile)
    # ftp network version
    # outfile=RootOutput+File[4]
    # ftp.mkd(directory)
    print infile, outfile
I tried the process in http://forums.arcgis.com/threads/17047-Upload-file-to-FTP-using-Python-ftplib but that moves all the files in a directory. I have the old and new full file names and just need it to create the intermediate directories.
Thanks,

The following might work (untested):
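import ftplib
from ftplib import FTP
def mkpath(ftp, path):
    """Create the missing parent directories of path on the server."""
    path = path.rsplit('/', 1)[0] # parent directory
    if not path:
        return
    try:
        ftp.cwd(path)
    except ftplib.error_perm:
        # parent is missing: create its ancestors first, then the parent itself
        mkpath(ftp, path)
        ftp.mkd(path)
ftp = FTP(...)
directory = '/TransferredData/'
for File in FileList:
    infile = File[0]
    outfile = File[4].split('\\') # need forward slashes in FTP
    outfile = directory + '/'.join(outfile)
    mkpath(ftp, outfile)
    with open(infile, 'rb') as f: # close the local file after the upload
        ftp.storbinary('STOR '+outfile, f)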
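An iterative variant of the same idea is sketched below in Python 3. It assumes the remote path is absolute and uses forward slashes. The names remote_parent_dirs and ensure_remote_dirs are hypothetical helpers, and treating every error_perm from mkd as "directory already exists" is a simplifying assumption that may not hold on every server.

```python
import ftplib


def remote_parent_dirs(remote_path):
    """Return every ancestor directory of remote_path, shallowest first."""
    parts = [p for p in remote_path.split('/') if p]  # drops empty parts from '//'
    dirs = []
    current = ''
    for part in parts[:-1]:  # the last component is the file name
        current = current + '/' + part
        dirs.append(current)
    return dirs


def ensure_remote_dirs(ftp, remote_path):
    """Try to create each ancestor directory of remote_path on the server."""
    for directory in remote_parent_dirs(remote_path):
        try:
            ftp.mkd(directory)
        except ftplib.error_perm:
            # Assumption: the server refuses mkd because the directory exists.
            # A genuine permission problem would show up at upload time.
            pass
```

Calling ensure_remote_dirs(ftp, outfile) before ftp.storbinary would take the place of mkpath, and it does not change the working directory on the server.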

Related

Skip list of filenames from txt file with os.walk

I would like to upload files that users dump into a shared folder to an FTP site. Only certain files must be uploaded, based on a pattern in the filename, and that works. I would like to avoid uploading files that have been uploaded in the past. A simple solution would be to move the files to a subdirectory once uploaded, but users wish for the files to remain where they are.
I was thinking of writing each filename to a text file as the loop uploads it. Populating the text file works.
Excluding directories with os.walk is covered in many articles and I can get that to work fine, but excluding a list of filenames seems to be a bit more obscure.
This is what I have so far:
import ftplib
import os
import os.path
import fnmatch
## set local path variables
dir = 'c:/Temp'
hist_path = 'C:/Temp/hist.txt'
pattern = '*SomePattern*'
## make the ftp connection and set appropriate working directory
ftp = ftplib.FTP('ftp.someserver.com')
ftp.login('someuser', 'somepassword')
ftp.cwd('somedirectory')
## make a list of previously uploaded files
hist_list = open(hist_path, 'r')
hist_content = hist_list.read()
# print(hist_content)
## loop through the files and upload them to the FTP as above
for root, dirs, files in os.walk(dir):
    for fname in fnmatch.filter(files, pattern): # this filters for filenames that include the pattern
        ## upload each file to the ftp
        os.chdir(dir)
        full_fname = os.path.join(root, fname)
        ftp.storbinary('STOR ' + fname, open(full_fname, 'rb'))
        ## add an entry for each file into the historical uploads log
        f = open(hist_path, 'a')
        f.write(fname + '\n')
        f.close()
Any help would be appreciated
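One way to skip names already recorded in the history file is to load them into a set and test membership inside the walk. This is a minimal Python 3 sketch, not a tested uploader: load_history and find_new_files are hypothetical helper names, and it assumes the history file holds one bare filename per line, as the loop above writes it.

```python
import fnmatch
import os


def load_history(hist_path):
    """Return the set of file names listed in the history file (empty if missing)."""
    if not os.path.exists(hist_path):
        return set()
    with open(hist_path) as f:
        return {line.strip() for line in f if line.strip()}


def find_new_files(top, pattern, uploaded):
    """Yield (name, full_path) for matching files whose name is not in uploaded."""
    for root, dirs, files in os.walk(top):
        for fname in fnmatch.filter(files, pattern):
            if fname not in uploaded:
                yield fname, os.path.join(root, fname)
```

The upload loop would then iterate over find_new_files(dir, pattern, load_history(hist_path)) and append each name to the history file after a successful storbinary call.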

Walk directories and remove file extensions

I'm trying to remove all the Outlook .ost and .nst files from the users' folders on a network PC, and to write the names of the removed files to a CSV file.
I'm able to find all the files in the directory and write them to a CSV file, but when I try to remove the files with os.remove it doesn't seem to run, so I have commented it out for the time being.
I added the try and except to skip files that are in use.
import os
import sys
sys.stdout = open("output_file.csv", "w")
try:
    for rootDir, subdir, files in os.walk("//network_pc_name/c$/Users"):
        for filenames in files:
            if filenames.endswith((".nst",".ost")):
                foundfiles = os.path.join(rootDir, filenames)
                #os.remove(os.path.join(rootDir, filenames))
                print(foundfiles)
except:
    pass
sys.stdout.close()
I made some changes to the script as suggested and it appears to run a lot quicker; however, I can't figure out how to ignore files which are in use.
I switched the file extensions to .xlsx and .txt to simulate the problem: with the .xlsx file open, I wanted to see whether the script would hit the permissions error, continue to run, and still remove the .txt file.
I got the following error:
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: '//DESKTOP-HRLS19N/c$/globtest\Book1.xlsx
import glob
import os
files = [i for i in glob.glob("//DESKTOP-HRLS19N/c$/globtest/**", recursive = True) if i.endswith((".xlsx",".txt"))]
[os.remove(f) for f in files]
with open("output_file.csv", "w") as f:
    f.writelines("\n".join(files))
In my experience glob is much easier:
print([i for i in glob.glob("//network_pc_name/c$/Users/**", recursive=True) if i.endswith((".nst", ".ost"))])
Assuming that prints out the files you're expecting:
import glob
import os
files = [i for i in glob.glob("//network_pc_name/c$/Users/**", recursive=True) if i.endswith((".nst", ".ost"))]
removed_files = []
for file in files:
    try:
        size = os.path.getsize(file)
        os.remove(file)
        # size is an int, so convert it before concatenating
        removed_files.append(file + " Bytes: " + str(size))
    except Exception as e:
        print("Could not remove file: " + file + " (" + str(e) + ")")
with open("output_file.csv", "w") as f:
    f.writelines("\n".join(removed_files))
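If the log should be a real CSV with separate columns, the same loop can hand the rows to the csv module. The sketch below is one possible shape, with a hypothetical helper name remove_and_log; it catches OSError, of which PermissionError (the WinError 32 case) is a subclass, and returns the paths it had to skip.

```python
import csv
import os


def remove_and_log(paths, log_path):
    """Delete each path, skipping files that cannot be removed.

    Writes one CSV row (path, size in bytes) per removed file and
    returns the list of paths that were skipped.
    """
    skipped = []
    with open(log_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["path", "bytes"])
        for path in paths:
            try:
                size = os.path.getsize(path)
                os.remove(path)
            except OSError:
                # Covers PermissionError (file in use) and missing files alike
                skipped.append(path)
                continue
            writer.writerow([path, size])
    return skipped
```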

How do I save a csv file to an absolute path using the csv module in Python?

I currently have a program that collects data from .txt files in a folder and then saves that data to a csv file. Due to how I am planning on distributing this program, I need the Python file to live in the folder where these .txt files are located. However, I need the .csv files to be thrown to an absolute file path rather than being created in the same folder as the Python script and .txt documents. Here is what I have currently coded,
def write_to_csv(journal_list):
    #writes a list of journal dictionaries to a csv file.
    import csv
    username = "Christian"
    csv_name = username + ".csv"
    myFile = open(csv_name, 'w')
    with myFile:
        myFields = ["filename", "username", "project_name", "file_path",
                    "date", "start_time", "end_time", "length_of_revit_session",
                    "os_version", "os_build", "revit_build", "revit_branch",
                    "cpu_name", "cpu_clockspeed", "gpu_name", "ram_max", "ram_avg", "ram_peak",
                    "sync_count", "sync_time_total", "sync_time_peak", "sync_time_avg",
                    "commands_total", "commands_hotkey_percentage", "commands_unique",
                    "commands_dynamo", "commands_escape_key", "commands_most_used"]
        writer = csv.DictWriter(myFile, fieldnames=myFields)
        writer.writeheader()
        for item in journal_list:
            try:
                writer.writerow(item)
            except:
                print("error writing data to:", item)
I appreciate the help.
Using os.path.join() you can choose the path your file is written to. Here is an example:
import os
dest_dir = '/home/foo/'
file_path = os.path.join(dest_dir, csv_name)
with open(file_path, 'w'):
    ...
Consider asking for the path from the script, and setting a default if not passed in. This would make your script a lot more flexible than having the path coded in it.
You can use the click package which simplifies this a bit.
import os
import click
def write_to_csv(path, journal_list):
    # .. your normal code
    file_path = os.path.join(path, '{}.csv'.format(username))
    # .. rest of your code here
@click.command()
@click.option('--output_dir', default='/home/foo/bar/', help='Default path to save files')
def main(output_dir):
    # journal_list is built by the rest of your program
    write_to_csv(output_dir, journal_list)
if __name__ == '__main__':
    main()
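Putting the os.path.join suggestion together with the DictWriter from the question might look like the sketch below. The function name write_rows and its parameters are hypothetical, and creating the destination folder with os.makedirs is an added assumption about what you want when the folder does not exist yet.

```python
import csv
import os


def write_rows(dest_dir, csv_name, fieldnames, rows):
    """Write rows (a list of dicts) to dest_dir/csv_name and return the full path."""
    os.makedirs(dest_dir, exist_ok=True)  # create the target folder if needed
    file_path = os.path.join(dest_dir, csv_name)
    # The csv module documentation recommends newline='' for files given to a writer
    with open(file_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return file_path
```

Because dest_dir is absolute, the output location no longer depends on where the script and the .txt files live.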

Getting the latest files from FTP folder (filename having spaces) in Python

I need to pull the latest files from an FTP folder. The problem is that the filenames contain spaces and follow a specific pattern.
Below is the code I have implemented:
from __future__ import with_statement # __future__ imports must come first
import sys
from ftplib import FTP
import os
import socket
import time
import pandas as pd
import numpy as np
from glob import glob
import datetime as dt
ftp = FTP('')
ftp.login('','')
ftp.cwd('')
ftp.retrlines('LIST')
filematch='*Elig.xlsx'
downloaded = []
for filename in ftp.nlst(filematch):
    fhandle=open(filename, 'wb')
    print 'Getting ' + filename
    ftp.retrbinary('RETR '+ filename, fhandle.write)
    fhandle.close()
    downloaded.append(filename)
ftp.quit()
I understand that I can collect the output of ftp.dir() into a list, but since the filenames contain spaces, I am unable to split the lines correctly and pick the latest file of the type mentioned above.
Any help would be great.
You can get each file's modification time by sending the MDTM command, if the FTP server supports it, and sort the files accordingly.
def get_newest_files(ftp, limit=None):
    """Retrieves newest files from the FTP connection.
    :ftp: The FTP connection to use.
    :limit: Abort after yielding this amount of files.
    """
    files = []
    # Decorate files with mtime.
    for filename in ftp.nlst():
        response = ftp.sendcmd('MDTM {}'.format(filename))
        _, mtime = response.split()
        files.append((mtime, filename))
    # Sort files by mtime and break after limit is reached.
    for index, decorated_filename in enumerate(sorted(files, reverse=True)):
        if limit is not None and index >= limit:
            break
        _, filename = decorated_filename # Undecorate
        yield filename
downloaded = []
# Retrieves the newest file from the FTP server.
for filename in get_newest_files(ftp, limit=1):
    print 'Getting ' + filename
    with open(filename, 'wb') as file:
        ftp.retrbinary('RETR '+ filename, file.write)
    downloaded.append(filename)
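An MDTM reply is a status code followed by a timestamp in YYYYMMDDHHMMSS form, which is why sorting the raw strings puts the files in time order. If you need real dates, for example to compare against today, the reply can be converted as in this small Python 3 sketch; parse_mdtm is a hypothetical helper name, and the handling of fractional seconds is an assumption about what some servers append.

```python
from datetime import datetime


def parse_mdtm(response):
    """Convert an MDTM reply such as '213 20160204200000' to a datetime."""
    code, _, stamp = response.partition(" ")
    if code != "213":
        raise ValueError("unexpected MDTM reply: " + response)
    stamp = stamp.strip().split(".")[0]  # drop optional fractional seconds
    return datetime.strptime(stamp, "%Y%m%d%H%M%S")
```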
The issue is that the FTP "LIST" command returns text meant for humans, whose format depends on the FTP server implementation.
Using PyFilesystem (in place of the standard ftplib) gives you a listing API (search for "walk") that returns Pythonic structures for the files and directories hosted on the FTP server.
http://pyfilesystem2.readthedocs.io/en/latest/index.html
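If you would rather stay with ftplib, Python 3.3+ also offers FTP.mlsd(), which asks the server for a machine-readable listing and yields (name, facts) pairs, so names with spaces need no splitting. The sketch below assumes the server supports MLSD and reports a "modify" fact; newest_matching is a hypothetical helper name.

```python
import fnmatch


def newest_matching(entries, pattern):
    """Return the name with the latest 'modify' fact among names matching pattern.

    entries is an iterable of (name, facts) pairs, the shape yielded by
    ftplib.FTP.mlsd(). Returns None when nothing matches.
    """
    candidates = [
        (facts["modify"], name)
        for name, facts in entries
        if fnmatch.fnmatchcase(name, pattern) and "modify" in facts
    ]
    if not candidates:
        return None
    # 'modify' is a YYYYMMDDHHMMSS string, so string order is time order
    return max(candidates)[1]
```

With a live connection the call would be newest_matching(ftp.mlsd(), '*Elig.xlsx'), followed by the usual retrbinary download of the returned name.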

python FTP download file with certain name

I have an FTP server with a folder that contains these files:
pw201602042000.nc,
pw201602042010.nc,
pw201602042020.nc,
pw201602042030.nc,
pw201602042040.nc,
pw201602042050.nc,
pw201602042100.nc,
pw201602042110.nc,
pw201602042120.nc,
pw201602042130.nc,
pw201602042140.nc,
pw201602042150.nc,
pw201602042200.nc
How can I download only the files ending with 00?
from ftplib import FTP
server = FTP("ip/serveradress")
server.login("user", "password")
server.retrlines("LIST")
server.cwd("Folder")
server.sendcmd("TYPE i") # ready for file transfer
server.retrbinary("RETR %s"%("pw201602042300.nc"), open("pw", "wb").write)
Once you have the list of file names as list_of_files, use fnmatch to match the names against a wildcard:
import fnmatch
import os
# retrlines("LIST") only prints the listing; nlst() returns the names as a list
list_of_files = server.nlst()
dest_dir = "."
for name in list_of_files:
    if fnmatch.fnmatch(name,"*00.nc"):
        with open(os.path.join(dest_dir,name), "wb") as f:
            server.retrbinary("RETR {}".format(name), f.write)
(Note that you were writing every file to the same "pw" output file. I changed that by reusing the original name, added a destination directory variable, and wrapped the open in a with block so the file is closed when the block exits.)
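The wildcard can be checked offline against some of the file names from the question before touching the server. This is only an illustration of the pattern; names_matching is a hypothetical helper, and fnmatchcase is used so the match is case-sensitive on every platform.

```python
import fnmatch


def names_matching(names, pattern="*00.nc"):
    """Return the names that match the wildcard pattern, preserving order."""
    return [name for name in names if fnmatch.fnmatchcase(name, pattern)]


names = [
    "pw201602042000.nc", "pw201602042010.nc", "pw201602042020.nc",
    "pw201602042100.nc", "pw201602042150.nc", "pw201602042200.nc",
]
print(names_matching(names))
# ['pw201602042000.nc', 'pw201602042100.nc', 'pw201602042200.nc']
```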
