FTP Python pandas dataframe result set - python

Instead of FTPing a file from a local server to a remote server, I am interested in sending the content of a pandas dataframe directly to the remote server.
Suppose I have the following in a dataframe
df.head()
Country City
A New-York
B France
C Londo
I want to be able to write the content of the pandas df directly to the FTP server, without having to write a file to disk and read it back before the upload.
Thanks
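import ftplib
import os
# Usual pattern for uploading a file that already exists on disk:
# ftp.storbinary("STOR " + file, open(file, "rb"))
ftp = ftplib.FTP('myserver.host.com')
ftp.login("", "")
File = open(" ", 'rb')
# storbinary expects a full STOR command, not just the file name
ftp.storbinary("STOR file.txt", File)
File.close()
ftp.quit()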
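One way to avoid the intermediate file is to serialize the dataframe to CSV in memory and pass that buffer to storbinary. The following is only a minimal sketch under stated assumptions: the helper name upload_dataframe and the remote file name are made up for illustration, df is assumed to be a pandas DataFrame, and ftp is assumed to be an ftplib.FTP object that is already connected and logged in.

```python
import io


def upload_dataframe(ftp, df, remote_name, encoding="utf-8"):
    """Upload df as a CSV file without touching the local disk.

    Hypothetical helper: `ftp` is assumed to be a connected, logged-in
    ftplib.FTP instance and `df` a pandas DataFrame.
    """
    # With no path argument, DataFrame.to_csv returns the CSV as a string.
    csv_text = df.to_csv(index=False)
    # storbinary reads bytes from a file-like object, so encode the text
    # and wrap it in an in-memory binary buffer.
    data = csv_text.encode(encoding)
    ftp.storbinary("STOR " + remote_name, io.BytesIO(data))
    # Return the number of bytes sent, which is handy for logging.
    return len(data)
```

It would be called as upload_dataframe(ftp, df, "file.txt") between ftp.login(...) and ftp.quit(). The whole CSV is held in memory, so this suits dataframes of moderate size.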

Related

How to put data into a tempfile and post as CSV on SFTP

Goal
Create a temporary CSV file filled with data and upload it to an SFTP server. The data is TheList, which is of class list.
What I am able to achieve
Create the connection to the SFTP server
Push a file to the SFTP server
What happens with the code below
A file is created on the SFTP server, but it is empty (0 bytes).
Question
How can I get a CSV file on the SFTP server that contains the content of TheList?
import paramiko
import tempfile
import csv
# code part to make and open sftp connection
TheList = [['name', 'address'], [ 'peter', 'london']]
csvfile = tempfile.NamedTemporaryFile(suffix='.csv', mode='w', delete=False)
filewriter = csv.writer(csvfile)
filewriter.writerows(TheList)
sftp.put(csvfile.name, SftpPath + "anewfile.csv")
# code part to close sftp connection
You do not need to create a temporary file. You can use csv.writer to write the rows directly to the SFTP server through a file-like object opened with SFTPClient.open:
with sftp.open(SftpPath + "anewfile.csv", mode='w', bufsize=32768) as csvfile:
    writer = csv.writer(csvfile, delimiter=',')
    writer.writerows(TheList)
See also pysftp putfo creates an empty file on SFTP server but not streaming the content from StringIO
To answer your literal question: I believe you need to flush the temporary file before trying to upload it (the flush has to be called on the file object, because csv.writer objects have no flush method):
csvfile.flush()
See How to use tempfile.NamedTemporaryFile() in Python
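The effect of the missing flush can be reproduced without any SFTP connection. The sketch below is an illustration only, assuming CPython's default buffering: it writes the same two rows to a NamedTemporaryFile and measures the size of the file on disk before and after flush(). The variable names are made up for the example.

```python
import csv
import os
import tempfile

rows = [["name", "address"], ["peter", "london"]]

csvfile = tempfile.NamedTemporaryFile(
    suffix=".csv", mode="w", newline="", delete=False
)
try:
    csv.writer(csvfile).writerows(rows)
    # The rows are still in Python's write buffer, so the file on disk is empty.
    size_before_flush = os.path.getsize(csvfile.name)
    csvfile.flush()
    # After flush() the data has been handed over to the operating system.
    size_after_flush = os.path.getsize(csvfile.name)
finally:
    csvfile.close()
    os.remove(csvfile.name)

print(size_before_flush, size_after_flush)
```

An upload by file name that happens between writerows and flush sees the empty file, which matches the 0 byte result described in the question.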
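A better option, though, would be to use Paramiko's SFTPClient.putfo to upload the NamedTemporaryFile object, rather than referring to the temporary file by its filename (which reportedly would not work on Windows anyway). For putfo to be able to read from it, the temporary file has to be opened in a read/write mode such as 'w+' instead of 'w':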
csvfile.seek(0)
sftp.putfo(csvfile, SftpPath + "anewfile.csv")

Reading CSV file downloaded from FTP in Python not reading all rows

I am trying to read a CSV file from a folder in FTP. The file has 3072 rows. However, when I am running the code, it is not reading all the rows. Certain rows from the bottom are getting missed out.
## FTP host name and credentials
ftp = ftplib.FTP('IP', 'username','password')
## Go to the required directory
ftp.cwd("Folder_Name")
names = ftp.nlst()
final_names= [line for line in names if '.csv' in line]
latest_time = None
latest_name = None
#os.chdir(filepath)
for name in final_names:
    time1 = ftp.sendcmd("MDTM " + name)
    if (latest_time is None) or (time1 > latest_time):
        latest_name = name
        latest_time = time1
file = open(latest_name, 'wb')
ftp.retrbinary('RETR '+ latest_name, file.write)
dat = pd.read_csv(latest_name)
The CSV file to be read from FTP is as given below-
The output from the code is as-
Make sure you close the file, before you try to read it, using file.close(), or even better using with:
with open(latest_name, 'wb') as file:
    ftp.retrbinary('RETR '+ latest_name, file.write)
dat = pd.read_csv(latest_name)
If you do not need to actually store the file to local file system, and the file is not too large, you can download it to memory only:
Reading files from FTP server to DataFrame in Python
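A minimal sketch of the in-memory variant, assuming ftp is a connected ftplib.FTP object as in the question; the helper name download_to_memory is hypothetical and not part of ftplib:

```python
import io


def download_to_memory(ftp, remote_name):
    """Return the content of remote_name as a rewound in-memory binary buffer.

    Hypothetical helper: `ftp` is assumed to be a connected ftplib.FTP instance.
    """
    buffer = io.BytesIO()
    # retrbinary calls buffer.write once for every block it receives.
    ftp.retrbinary("RETR " + remote_name, buffer.write)
    # Rewind so that a reader starts at the first byte.
    buffer.seek(0)
    return buffer
```

With this helper, the last two statements of the question could become dat = pd.read_csv(download_to_memory(ftp, latest_name)). No local file is opened, so there is no file left to close before reading.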

How do I read a CSV from Secure FTP Server

I have a script that gets a .csv file, applies some data corrections, and saves the result to my Django database. However, I cannot get the .csv file from the FTP server. I tried the following code, but I got a different error each time.
import pandas as pd
import pysftp as sftp
with sftp.connect(your_host, your_user, your_pw) as conn:
    with conn.open("path_and_file.csv", "r") as f:
        df = pd.read_csv(f)
Error: "AttributeError: module 'pysftp' has no attribute 'connect'"
ftp = FTP('your_host')
ftp.login('your_user', 'your_pw')
ftp.set_pasv(False)
I couldn't go further.
How can I read a .csv file from an FTP server using pandas?
I solved my problem as shown below:
I downloaded the files first and then opened them with pandas.
from ftplib import FTP
with FTP(host) as ftp:
    ftp.login(user=user, passwd=password)
    print(ftp.getwelcome())
    # Download each file in binary mode, 1024 bytes per block
    with open("proj.csv", "wb") as f:
        ftp.retrbinary("RETR " + "proj.csv", f.write, 1024)
    with open("pers.csv", "wb") as f:
        ftp.retrbinary("RETR " + "pers.csv", f.write, 1024)
    ftp.quit()
import pysftp
import pandas as pd
cnopts = pysftp.CnOpts()
# Setting hostkeys to None disables host key verification (insecure)
cnopts.hostkeys = None
# The first parameter of pysftp.Connection is named host
with pysftp.Connection(host='hostname', username='username', password='password', cnopts=cnopts) as conn:
    conn.get('filename')
with open('filename') as f:
    df = pd.read_csv(f)
This should give you the CSV as a DataFrame.

How to download a CSV file from the World Bank's dataset

I would like to automate the download of CSV files from the World Bank's dataset.
My problem is that the URL corresponding to a specific dataset does not lead directly to the desired CSV file but is instead a query to the World Bank's API. As an example, this is the URL to get the GDP per capita data: http://api.worldbank.org/v2/en/indicator/ny.gdp.pcap.cd?downloadformat=csv.
If you paste this URL in your browser, it will automatically start the download of the corresponding file. As a consequence, the code I usually use to collect and save CSV files in Python is not working in the present situation:
baseUrl = "http://api.worldbank.org/v2/en/indicator/ny.gdp.pcap.cd?downloadformat=csv"
remoteCSV = urllib2.urlopen("%s" %(baseUrl))
myData = csv.reader(remoteCSV)
How should I modify my code in order to download the file coming from the query to the API?
This will get the zip downloaded, open it and get you a csv object with whatever file you want.
import urllib2
import StringIO
from zipfile import ZipFile
import csv
baseUrl = "http://api.worldbank.org/v2/en/indicator/ny.gdp.pcap.cd?downloadformat=csv"
remoteCSV = urllib2.urlopen(baseUrl)
sio = StringIO.StringIO()
sio.write(remoteCSV.read())
# We create a StringIO object so that we can work on the results of the request (a string) as though it is a file.
z = ZipFile(sio, 'r')
# We now create a ZipFile object pointed to by 'z' and we can do a few things here:
print z.namelist()
# A list with the names of all the files in the zip you just downloaded
# We can use z.namelist()[1] to refer to 'ny.gdp.pcap.cd_Indicator_en_csv_v2.csv'
with z.open(z.namelist()[1]) as f:
    # Opens the 2nd file in the zip
    csvr = csv.reader(f)
    for row in csvr:
        print row
For more information see ZipFile Docs and StringIO Docs
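The code above targets Python 2 (urllib2, StringIO, print statement). As a hedged sketch of the same idea in Python 3, the function below takes the bytes of a downloaded zip and returns the rows of one CSV member. The function name read_csv_from_zip is made up for this example, and the utf-8-sig encoding is an assumption about the file, not something the source states.

```python
import csv
import io
import zipfile


def read_csv_from_zip(zip_bytes, member_name):
    """Return the rows of one CSV file stored in a zip archive held in memory."""
    # BytesIO lets ZipFile treat the downloaded bytes as though they were a file.
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        with archive.open(member_name) as raw:
            # ZipFile.open yields bytes; csv.reader needs text, so decode
            # on the fly (the encoding is an assumption, adjust if needed).
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            return list(csv.reader(text))
```

In Python 3 the bytes would come from urllib.request.urlopen(baseUrl).read(), and the member name can be picked from ZipFile.namelist() as in the answer above.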
import os
import urllib
import zipfile
from StringIO import StringIO
package = StringIO(urllib.urlopen("http://api.worldbank.org/v2/en/indicator/ny.gdp.pcap.cd?downloadformat=csv").read())
zip = zipfile.ZipFile(package, 'r')
pwd = os.path.abspath(os.curdir)
for filename in zip.namelist():
    csv = os.path.join(pwd, filename)
    with open(csv, 'w') as fp:
        fp.write(zip.read(filename))
    print filename, 'downloaded successfully'
From here you can use your approach to handle CSV files.
We have a script to automate access and data extraction for World Bank World Development Indicators like: https://data.worldbank.org/indicator/GC.DOD.TOTL.GD.ZS
The script does the following:
Downloading the metadata data
Extracting metadata and data
Converting to a Data Package
The script is python based and uses python 3.0. It has no dependencies outside of the standard library. Try it:
python scripts/get.py
python scripts/get.py https://data.worldbank.org/indicator/GC.DOD.TOTL.GD.ZS
You also can read our analysis about data from World Bank:
https://datahub.io/awesome/world-bank
This is a suggestion rather than a solution. You can use pd.read_csv to read a CSV file directly from a URL.
import pandas as pd
data = pd.read_csv('http://url_to_the_csv_file')

How to Retrieve a Zip Folder from FTP in Python

I'm trying to retrieve one or more zip files from an FTP site and save them to my local machine using Python (ideally I'd like to specify where they are saved on my C: drive).
The code below connects to the FTP site, and then something happens in the PyScripter window that looks like random characters for about 1000 lines, but nothing actually gets downloaded to my hard drive.
Any tips?
import ftplib
import sys
def gettext(ftp, filename, outfile=None):
    # fetch a text file
    if outfile is None:
        outfile = sys.stdout
    # use a lambda to add newlines to the lines read from the server
    ftp.retrlines("RETR " + filename, lambda s, w=outfile.write: w(s+"\n"))
def getbinary(ftp, filename, outfile=None):
    # fetch a binary file
    if outfile is None:
        outfile = sys.stdout
    ftp.retrbinary("RETR " + filename, outfile.write)
ftp = ftplib.FTP("FTP IP Address")
ftp.login("username", "password")
ftp.cwd("/MCPA")
#gettext(ftp, "subbdy.zip")
getbinary(ftp, "subbdy.zip")
Well, it seems that you simply forgot to open the file you want to write into.
Something like:
getbinary(ftp, "subbdy.zip", open(r'C:\Path\to\subbdy.zip', 'wb'))
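The one-liner above leaves the local file open. A variant that closes it reliably could look like the sketch below; download_binary is a hypothetical helper name, and ftp is assumed to be a connected, logged-in ftplib.FTP object as in the question.

```python
import os


def download_binary(ftp, remote_name, local_path):
    """Save remote_name to local_path and return the number of bytes written.

    Hypothetical helper: `ftp` is assumed to be a connected ftplib.FTP instance.
    """
    # "wb" keeps the zip's bytes untouched; the with block closes the file
    # when the transfer ends, so all buffered data reaches the disk.
    with open(local_path, "wb") as outfile:
        ftp.retrbinary("RETR " + remote_name, outfile.write)
    return os.path.getsize(local_path)
```

For the question's case the call would be download_binary(ftp, "subbdy.zip", r'C:\Path\to\subbdy.zip').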
