Recursively get meta data of FTP folder and all sub folders - python

I am trying to retrieve metadata from an FTP folder and all of its subfolders: the file name, file size, and last-modified date/time of each file. I found the sample code below online, entered my credentials, ran it, and received this error: No hostkey for host ftp.barra.com found.
Is there a quick fix for this?
from __future__ import print_function
import os
import time
import pysftp
ftp_username='xxx'
ftp_password='xxx'
ftp_host='xxx'
year = time.strftime("%Y")
month = time.strftime("%m")
day = time.strftime("%d")
ftp_dir = 'data/'+year+'/'+month
filename = time.strftime('ftp_file_lists.txt')
fout = open(filename, 'w')
wtcb = pysftp.WTCallbacks()
with pysftp.Connection(ftp_host, username=ftp_username, password=ftp_password) as sftp:
    sftp.walktree(ftp_dir, fcallback=wtcb.file_cb, dcallback=wtcb.dir_cb, ucallback=wtcb.unk_cb)
    print(len(wtcb.flist))
    for fpath in wtcb.flist:
        print(fpath, file=fout)
    sftp.close()
Code from here.
http://alvincjin.blogspot.com/2014/09/recursively-fetch-file-paths-from-ftp.html
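The error appears to come from pysftp checking the server's host key against known_hosts before connecting. As a rough sketch of one way to collect name, size, and modification time recursively: the helper below (walk_metadata is a hypothetical name, not part of pysftp) relies only on a listdir_attr(path) method, which both pysftp.Connection and paramiko's SFTPClient provide. The commented-out usage assumes the commonly cited workaround of setting cnopts.hostkeys = None, which disables host key verification and is insecure. Adding the server's key to known_hosts is the safer fix.

```python
import posixpath
import stat
from datetime import datetime, timezone


def walk_metadata(sftp, root):
    """Yield (path, size_in_bytes, modified_utc) for every file under root.

    `sftp` is any object with a paramiko-style listdir_attr(path) method.
    """
    for entry in sftp.listdir_attr(root):
        path = posixpath.join(root, entry.filename)
        if stat.S_ISDIR(entry.st_mode):
            # Recurse into subdirectories.
            for item in walk_metadata(sftp, path):
                yield item
        else:
            modified = datetime.fromtimestamp(entry.st_mtime, tz=timezone.utc)
            yield path, entry.st_size, modified


# Hypothetical usage; it needs pysftp and a reachable server, so it is commented out.
# import pysftp
# cnopts = pysftp.CnOpts()
# cnopts.hostkeys = None  # ASSUMPTION / INSECURE: skips host key verification
# with pysftp.Connection(ftp_host, username=ftp_username,
#                        password=ftp_password, cnopts=cnopts) as sftp:
#     for path, size, modified in walk_metadata(sftp, ftp_dir):
#         print(path, size, modified.isoformat())
```

Keeping the walk separate from the connection makes the traversal easy to try against a stub object before pointing it at a real server.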

Related

Generate filenames with uuid in python

with open(r"path/sample.txt") as file:
    some operations
print('exiting')
When I open the file, is it possible to give it a name like the one below?
sample2018-10-25-18-25-36669_devicename_uuid
How can I create filenames in Python from the UTC datetime, the hostname, and a GUID, in the format shown above?
I am already opening a file to perform some string operations and storing the result in the same file. It would be great if I could create the filename when I open the file. Alternatively, can I create the file, do all the operations, and then rename it to the format above? How should I proceed? I am very new to Python.
Sure, you can generate a filename dynamically in Python.
Here is a simple example that generates a filename like the one you describe.
import os
import socket
from datetime import datetime
from uuid import uuid4
# %S is seconds (00-59); lowercase %s is a non-portable extension.
dt = datetime.utcnow().strftime("%Y-%m-%d-%H-%M-%S")
path = 'path'
hostname = socket.gethostname()
filename = f"sample{dt}-{hostname}-{uuid4()}"
with open(os.path.join(path, filename), 'w') as f:
    f.write('some_content')
If you want to get a unique hardware ID with Python, please check this link.
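The question also asks whether the file can be written first and renamed afterwards. A minimal sketch of that variant follows, assuming the underscore-separated pattern from the question. build_filename and finish_file are hypothetical helper names, and the "sample" prefix is taken from the example above.

```python
import os
import socket
import tempfile
import uuid
from datetime import datetime, timezone


def build_filename(prefix, when, hostname, unique_id):
    """Return a name shaped like <prefix><timestamp>_<hostname>_<unique_id>."""
    stamp = when.strftime("%Y-%m-%d-%H-%M-%S")
    return f"{prefix}{stamp}_{hostname}_{unique_id}"


def finish_file(path):
    """Rename an already-written file to the timestamped pattern; return the new path."""
    name = build_filename("sample", datetime.now(timezone.utc),
                          socket.gethostname(), uuid.uuid4())
    new_path = os.path.join(os.path.dirname(path), name)
    os.replace(path, new_path)
    return new_path


if __name__ == "__main__":
    # Demo in a throwaway directory: write first, rename afterwards.
    with tempfile.TemporaryDirectory() as folder:
        original = os.path.join(folder, "sample.txt")
        with open(original, "w") as f:
            f.write("some_content")
        print(finish_file(original))
```

Passing the timestamp, hostname, and ID in as arguments keeps build_filename deterministic, which makes the format easy to check.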

Removing files using python from a server using FTP

I’m having a hard time with this simple script. It gives me a "file or directory not found" error, but the file is there. The script is below; I’ve masked the user, password, and FTP site.
Here is my script:
from ftplib import FTP
ftp = FTP('ftp.domain.ca')
pas = str('PASSWORD')
ftp.login(user = 'user', passwd=pas)
ftp.cwd('/public_html/')
filepaths = open('errorstest.csv', 'rb')
for j in filepaths:
    print(j)
    ftp.delete(str(j))
ftp.quit()
The funny thing, though, is that if I slightly change the script to call ftp.delete() with the file path directly, it finds the file and deletes it. The modified script looks like this:
from ftplib import FTP
ftp = FTP('ftp.domain.ca')
pas = str('PASSWORD')
ftp.login(user = 'user', passwd=pas)
ftp.cwd('/public_html/')
ftp.delete(<file path>)
ftp.quit()
I’m trying to read the file paths from a CSV file. What am I doing wrong?
What you have shown looks close, but each line read from the file keeps its trailing newline, so strip it before calling ftp.delete(). Could you try this?
from ftplib import FTP
ftp = FTP(host)
ftp.login(username, password)
ftp.cwd('/public_html/')
print(ftp.pwd())
print(ftp.nlst())
with open('errorstest.csv') as file:
    for line in file:
        if line.strip():
            ftp.delete(line.strip())
print(ftp.nlst())
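To illustrate why the original loop probably failed, here is a small sketch that needs no FTP server. It assumes Python 3, where iterating a file opened with 'rb' yields bytes and str() of a bytes object produces its repr rather than the decoded text. clean_paths is a hypothetical helper name and the paths are made up.

```python
def clean_paths(lines):
    """Return non-empty paths with surrounding whitespace (including newlines) removed."""
    return [line.strip() for line in lines if line.strip()]


# What iterating a file opened with 'rb' yields for one line (illustrative path).
raw = b"old/page.html\n"
# str() on bytes gives the repr, including the b'' wrapper and the escaped newline,
# so the server is asked to delete a name that does not exist.
print(repr(str(raw)))

# Reading in text mode and stripping gives usable paths and skips blank lines.
text_lines = ["old/page.html\n", "\n", "img/logo.png\r\n"]
print(clean_paths(text_lines))
```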

Python download files from FTP ignoring the missing ones

I have a list of numbers in a CSV file like this:
1
2
3
4
5
And an ftp server with files named like those numbers:
1.jpg
2.jpg
4.jpg
5.jpg
(3.jpg is missing)
I want to download every file from the FTP server whose name is in that CSV list.
My code downloads the files successfully, but when it tries to download a file that is missing from the FTP server, the program crashes with:
urllib2.URLError: <urlopen error ftp error: [Errno ftp error] 550 Can't change directory to 3.jpg: No such file or directory>
Python code:
#!/usr/bin/python
# -*- coding: utf-8 -*-
import urllib2, shutil
import pandas as pd
import numpy as np
from ftplib import FTP
FTP_server = 'ftp://user:pass@server.com/'
ftp = FTP_server+'the/path/to/files/'
class Test:
    def Get(self):
        data = pd.read_csv('test.csv',encoding='utf-8',delimiter=';')
        #data['REF'].replace('', np.nan, inplace=True)
        #data.dropna(subset=['REF'], inplace=True)
        data['REF'] = data['REF'].astype(int)
        new_data = data['REF']
        for ref in new_data:
            file = str(ref)+str('.jpg')
            ftpfile = urllib2.urlopen(ftp+file)
            localfile = open(file, 'wb')
            shutil.copyfileobj(ftpfile, localfile)
Try = Test()
Try.Get()
I'm trying to add an if inside the for loop, but I can't get it working. Can someone give me an idea or a tip, please?
Get acquainted with try-except blocks to handle this:
for ref in new_data:
    try:
        file = str(ref)+str('.jpg')
        ftpfile = urllib2.urlopen(ftp+file)
        localfile = open(file, 'wb')
        shutil.copyfileobj(ftpfile, localfile)
    except urllib2.URLError:
        print("-I- Skipping",file," - doesn't exist.")
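The code above targets Python 2 (urllib2). For Python 3, the same try/except idea might look roughly like the sketch below, which uses urllib.request and urllib.error instead. download_existing and its opener parameter are hypothetical, the latter added only so the function can be exercised without a real server. It assumes a missing FTP file surfaces as urllib.error.URLError, as in the traceback shown in the question.

```python
import os
import shutil
import urllib.error
import urllib.request


def download_existing(base_url, refs, dest_dir=".", opener=urllib.request.urlopen):
    """Download <ref>.jpg for each ref; return (downloaded, skipped) lists of names."""
    downloaded, skipped = [], []
    for ref in refs:
        name = f"{ref}.jpg"
        try:
            # Open the remote file first so a missing file creates nothing locally.
            with opener(base_url + name) as remote:
                with open(os.path.join(dest_dir, name), "wb") as local:
                    shutil.copyfileobj(remote, local)
        except urllib.error.URLError as exc:
            print(f"Skipping {name}: {exc.reason}")
            skipped.append(name)
        else:
            downloaded.append(name)
    return downloaded, skipped
```

With the question's data, a call might look like download_existing(ftp, new_data), where ftp is the base URL string and new_data the column of reference numbers.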

Error when converting Excel document to pdf using comtypes in Python

I am trying to convert an Excel spreadsheet to PDF using Python and the comtypes package using this code:
import os
import comtypes.client
FORMAT_PDF = 17
SOURCE_DIR = 'C:/Users/IEUser/Documents/jscript/test/resources/root3'
TARGET_DIR = 'C:/Users/IEUser/Documents/jscript'
app = comtypes.client.CreateObject('Excel.Application')
app.Visible = False
infile = os.path.join(os.path.abspath(SOURCE_DIR), 'spreadsheet1.xlsx')
outfile = os.path.join(os.path.abspath(TARGET_DIR), 'spreadsheet1.pdf')
doc = app.Workbooks.Open(infile)
doc.SaveAs(outfile, FileFormat=FORMAT_PDF)
doc.Close()
app.Quit()
This script above runs fine and the pdf file is created, but when I try to open it I get the error "The file cannot be opened - there is a problem with the file format" (but after closing this error dialog it is actually possible to preview the pdf file). I have tried a similar script to convert Word documents to pdfs and this worked just fine.
Any ideas on how I can resolve this problem with the file format error?
Found a solution - this seems to be working:
import os
import comtypes.client
SOURCE_DIR = 'C:/Users/IEUser/Documents/jscript/test/resources/root3'
TARGET_DIR = 'C:/Users/IEUser/Documents/jscript'
app = comtypes.client.CreateObject('Excel.Application')
app.Visible = False
infile = os.path.join(os.path.abspath(SOURCE_DIR), 'spreadsheet1.xlsx')
outfile = os.path.join(os.path.abspath(TARGET_DIR), 'spreadsheet1.pdf')
doc = app.Workbooks.Open(infile)
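# Positional arguments: Type=0 (xlTypePDF), Filename, Quality=1 (xlQualityMinimum), IncludeDocProperties=0
doc.ExportAsFixedFormat(0, outfile, 1, 0)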
doc.Close()
app.Quit()
This link may also be helpful as inspiration regarding the arguments to the ExportAsFixedFormat function: Document.ExportAsFixedFormat Method (although some of the argument values have to be modified a bit).
You need to call ExportAsFixedFormat(0, output_file) to save the workbook in PDF format. The solution from http://thequickblog.com/convert-an-excel-filexlsx-to-pdf-python/ works for me.
from win32com import client
import win32api
input_file = r'C:\Users\thequickblog\Desktop\Python session 2\tqb_sample.xlsx'
#give your file name with valid path
output_file = r'C:\Users\thequickblog\Desktop\Python session 2\tqb_sample_output.pdf'
#give valid output file name and path
app = client.DispatchEx("Excel.Application")
app.Interactive = False
app.Visible = False
Workbook = app.Workbooks.Open(input_file)
try:
    Workbook.ActiveSheet.ExportAsFixedFormat(0, output_file)
except Exception as e:
    print("Failed to convert to PDF format. Please confirm the environment meets all the requirements and try again.")
    print(str(e))
finally:
    Workbook.Close()
    # Excel's Application object is shut down with Quit(); it has no Exit() method.
    app.Quit()

how to retrieve the latest file from an ftp folder using Python? [duplicate]

This question already has answers here:
Python FTP get the most recent file by date
(5 answers)
Closed 4 years ago.
At my company we have a scale that checks the weight of boxes before they are loaded onto a truck. If a box contains more or less product than is acceptable, it is rejected and diverted to another conveyor belt. The electronic scale keeps a record of the operation's performance. The files are stored on the scale's disk and accessed over FTP from a nearby desktop computer. My boss wants the reports to be emailed to his account automatically so that he doesn't need to go to that facility just to check the previous day's rejections. I started writing a Python program to do that, but got stuck on the part that retrieves the file from the folder. Here is my code:
#This program tries to retrieve the latest report and send it by email.
import urllib
import shutil
import ftplib
import os
import sys
import glob
import time
import datetime
import smtplib
import email
#Define the server and folder where the reports are stored.
carpetaftp = "/reports/"
#This function looks for the last file in the folder.
def obtenerultimoarchivo(camino):
    for cur_path, dirnames, filenames in os.walk(camino):
        for filename in filenames:
            datos_archivo = os.stat(filename)
            tiempo_archivo = datos_archivo.st_mtime
#Connects to an ftp folder and downloads the last report.
def descargareporteftp(carpetaftp):
    ftp = ftplib.FTP("server.scale.com")
    ftp.login()
    ftp.cwd(carpetaftp)
    #Uses 'ultimoreporte.pdf' as a copy of the last report.
    archivo = open('C:\\Balanza\\Reportes\\ultimoreporte.pdf',"wb")
    ftp.retrbinary("RETR " + obtenerultimoarchivo(),archivo.write)
    archivo.close()
    return archivo
#The function enviaemail() sends an email with an attachment.
def enviaemail(destinatario, adjunto):
    remitente = "electronic_scale@email.com.uy"
    msg = email.MIMEMultipart()
    msg['From'] = remitente
    msg['To'] = destinatario
    msg['Subject'] = "Ultimo informe de la balanza."
    adjunto = open('C:\\Balanza\\Reportes\\ultimoreporte.pdf', 'rb')
    attach = email.MIMENonMultipart('application', 'pdf')
    payload = base64.b64encode(adjunto.read()).decode('ascii')
    attach.set_payload(payload)
    attach['Content-Transfer-Encoding'] = 'base64'
    adjunto.close()
    attach.add_header('Content-Disposition', 'attachment', filename = 'ultimoreporte.pdf')
    msg.attach(attach)
    server = smtplib.SMTP('smtp.email.com.uy')
    server.login('electronic_scale@email.com.uy', 'thermofischer')
    server.sendmail('electronic_scale@email.com.uy',destinatario, msg.as_string())
    server.quit()
#The main routine, to retrieve the last report and send it by email.
adjunto = descargareporteftp(carpetaftp)
print("Reporte descargado")
enviaemail('myemail@email.com.uy',reporte)
print("Reporte enviado")
Here is a dirty way I found by making a mistake:
If no file is specified in a urllib retrieve command, it returns a list of the files in the directory (as an ls command would).
Using this, the code does the following:
Retrieve a list of the files in the FTP folder and save it to a local file
Read that file and find the last entry in the list
Retrieve that last file from the FTP folder
import urllib
def retrieve(address,output):
    # Run the ftp retrieve command here
    urllib.urlretrieve(address, output)
    # Flush the urllib so that we can download a second file
    urllib.urlcleanup()
def main():
    #let's build the ftp address here:
    usr = "usrname"
    psswrd = "password"
    host = "host.com"
    path = "/foo/bar/"
    list_file = "list"
    address = 'ftp://%s:%s@%s%s' % (usr,psswrd,host,path)
    print "Getting file listing from: " + address
    # Retrieve the list
    retrieve(address,list_file)
    # read a text file as a list of lines
    # find the last line, change to a file you have
    list_file_reader = open ( 'list',"r" )
    lineList = list_file_reader.readlines()
    last_file = lineList[-1].split(' ')[-1:][0].rstrip()
    output = "wanted_file.dat"
    address = 'ftp://%s:%s@%s%s%s' % (usr,psswrd,host,path,last_file)
    # Retrieve the file you are actually looking for
    retrieve(address, output)
    print address
main()
This is certainly not the most efficient way, but it works in my scenario.
References:
Python: download a file over an FTP server
https://www.daniweb.com/programming/software-development/threads/24544/how-do-i-read-the-last-line-of-a-text-file
Downloading second file from ftp fails
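The answer above relies on the last line of the directory listing being the newest file, which depends on how the server orders its output. As a hedged alternative for Python 3, the sketch below compares modification timestamps instead, using ftplib's mlsd(). It assumes the server supports the MLSD command, which older embedded FTP servers may not. latest_file is a hypothetical helper name, and the commented usage reuses the host and paths from the question.

```python
def latest_file(ftp, folder=""):
    """Return the name of the most recently modified file in folder, or None."""
    newest_name, newest_stamp = None, ""
    for name, facts in ftp.mlsd(folder, facts=["type", "modify"]):
        if facts.get("type") != "file":
            continue  # skip directories and the current/parent directory entries
        stamp = facts.get("modify", "")
        # MLSD timestamps look like YYYYMMDDHHMMSS, so string order is time order.
        if stamp > newest_stamp:
            newest_name, newest_stamp = name, stamp
    return newest_name


# Hypothetical usage; it needs a reachable server, so it is commented out.
# import ftplib
# with ftplib.FTP("server.scale.com") as ftp:
#     ftp.login()
#     ftp.cwd("/reports/")
#     name = latest_file(ftp)
#     with open("ultimoreporte.pdf", "wb") as out:
#         ftp.retrbinary("RETR " + name, out.write)
```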
