Trouble extracting zip in python over ftp - python

I'm trying to unzip a file from an FTP site. I've tried it using 7z in a subprocess as well as using 7z in the older os.system format. I get closest however when I'm using the zipfile module in python so I've decided to stick with that. No matter how I edit this I seem to get one of two errors so here are both of them so y'all can see where I'm banging my head against the wall:
z = zipfile.ZipFile(r"\\svr-dc\ftp site\%s\daily\data1.zip" % item)
z.extractall()
NotImplementedError: compression type 6 (implode)
(I think this one is totally wrong, but figured I'd include.)
I seem to get the closest with the following:
z = zipfile.ZipFile(r"\\svr-dc\ftp site\%s\daily\data1.zip" % item)
z.extractall(r"\\svr-dc\ftp site\%s\daily\data1.zip" % item)
IOError: [Errno 2] No such file or directory: '\\\\svr-dc...'
The catch with this is that it is actually giving me the first file name in the zip. I can see the file AJ07242013.PRN at the end of the error so I feel closer because it's at least getting to the point of reading the contents of the zip file.
Pretty much any iteration of this that I try gets me one of those two errors, or a syntax error but that's easily addressed and not my primary concern.
Sorry for being so long winded. I'd love to get this working, so let me know what you think I need to do.
EDIT:
So 7z has finally been added to the path and is running through without any errors with both the subprocess as well as os.system. However, I still can't seem to get anything to unpack. It looks to me, from all I've read in the python documentation that I should be using the subprocess.communicate() module to extract this file but it just won't unpack. When I use os.system it keeps telling me that it cannot find the archive.
import subprocess
cmd = ['7z', 'e']
sp = subprocess.Popen(cmd, stderr=subprocess.STDOUT, stdout=subprocess.PIPE)
sp.communicate('r"\C:\Users\boster\Desktop\Data1.zip"')
I don't think that sp.communicate is right but if I add anything else to it I have too many arguments.

Python's zipfile module doesn't support compression type 6 (implode), so it's simply not going to work. In the first case, that's obvious from the error. In the second case, things are worse. The parameter for extractall is an alternate directory to unzip into. Since you gave it the name of your zip file, a directory of the same name can't be found and zipfile gives up before getting to the not-supported problem.
Make sure you can do this with 7z on the command line, then try implementing subprocess again and ask for help on that technique if you need it.
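To confirm up front whether zipfile can handle an archive, you can read the compression method of each entry without extracting anything. This is only a rough sketch in Python 3 (the question uses Python 2); the helper name unsupported_members and the conservative SUPPORTED set are my own assumptions, not part of the zipfile API.

```python
import zipfile

# Methods every zipfile build can read. Newer Pythons may also handle
# bzip2/LZMA; this sketch deliberately stays conservative.
SUPPORTED = {zipfile.ZIP_STORED, zipfile.ZIP_DEFLATED}


def unsupported_members(zip_source):
    """Return (name, compress_type) pairs zipfile is unlikely to extract.

    zip_source may be a path or a file-like object. Only the archive's
    directory is read, so this works even for imploded (type 6) entries.
    """
    with zipfile.ZipFile(zip_source) as z:
        return [(info.filename, info.compress_type)
                for info in z.infolist()
                if info.compress_type not in SUPPORTED]
```

An empty list means zipfile should be able to extract everything; anything else (for example a type 6 entry) is a sign to hand the archive to 7z instead.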
Here's a script that will look for 7z in the usual places:
import os
import sys
import subprocess
from glob import glob
print 'python version:', sys.version
subprocess.call('ver', shell=True)
print
if os.path.exists(r'C:\Program Files\7-Zip'):
    print 'have standard 7z install'
    if '7-zip' in os.environ['PATH'].lower():
        print '...and its in the path'
    else:
        print '...but its not in the path'
print
print 'find in path...'
found = 0
for p in os.environ['PATH'].split(os.path.pathsep):
    candidate = os.path.join(p, '7z.*')
    for fn in glob(candidate):
        print ' found', fn
        found += 1
print
if found:
    print '7z located, attempt run'
    subprocess.call(['7z'])
else:
    print '7z not found'

According to the ZipFile documentation, you might be better off copying the zip to your working directory first. (http://docs.python.org/2/library/zipfile#zipfile.ZipFile.extract)
If you have problems copying, you might want to store the zip in a path with no spaces, or protect your code against spaces by building the path with os.path.
I made a small test in which I used os.path.abspath to make sure I had the proper path to my zip and it worked properly.
Also make sure that the path you pass to extractall is the directory where the zip contents should be extracted. (If the specified folder does not exist, it will be created automatically.) If no parameter is passed to extractall, your files will be extracted into your current working directory (CWD).
Cheers!
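For what it's worth, the copy-then-extract idea could look roughly like this in Python 3. The function name extract_locally and the "extracted" subfolder are placeholders I made up, and this only helps for archives whose compression method zipfile actually supports.

```python
import os
import shutil
import zipfile


def extract_locally(zip_path, work_dir):
    """Copy zip_path into work_dir, then extract it into work_dir/extracted."""
    # shutil.copy returns the path of the new copy (Python 3.3+).
    local_zip = shutil.copy(zip_path, work_dir)
    target = os.path.join(work_dir, "extracted")
    with zipfile.ZipFile(local_zip) as z:
        # The argument is the directory to extract into, not the zip itself.
        z.extractall(target)
    return target
```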

Managed to get this to work without using the PIPE functionality as subprocess.communicate wouldn't unpack the files. Here was the solution using subprocess.call. Hope this can help someone in the future.
def extract_data_one():
    for item in sites:
        os.chdir(r"\\svr-dc\ftp site\%s\Daily" % item)
        subprocess.call(['7z', 'e', 'data1.zip', '*.*'])
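A possible variation on the subprocess.call approach that avoids os.chdir: pass the archive path and an output directory to 7z explicitly. This is a sketch in Python 3 that assumes 7z is on the PATH; build_7z_command and extract_with_7z are hypothetical helper names. The -o and -y switches are standard 7z options (output directory, and answer yes to all prompts).

```python
import subprocess


def build_7z_command(archive, out_dir):
    """Build the argument list for extracting archive into out_dir."""
    # 'e' extracts without recreating folders inside the archive.
    # -o<dir> sets the output directory (no space after -o).
    # -y answers "yes" to prompts such as overwrite confirmations.
    return ["7z", "e", archive, "-o" + out_dir, "-y"]


def extract_with_7z(archive, out_dir):
    """Run 7z and return its exit code (0 means success)."""
    return subprocess.call(build_7z_command(archive, out_dir))
```

Because the arguments are passed as a list, spaces in paths such as "ftp site" do not need any extra quoting.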

Related

Loop an existing script

I'm using a script from a third party I can't modify or show (let's call it original.py) which takes a file and produces some calculations. At the end it outputs a result (using the print statement).
Since I have many files I decided to make a second script that gets all wanted files and runs them through the original.py
1st get list of all files to run
2nd run each file through the original.py
3rd obtain results from each file
I have the 1st and 2nd step. However, the end result only saves the calculations from the last file it read.
import sys
import original
import glob
import os
fn=str(sys.argv[1])
for filename in sys.argv[1:]:
    print(filename)
ficheiros = [f for f in glob.glob(fn)]
for ficheiro in ficheiros:
    original.file = bytes(ficheiro,'utf-8')
    original.function()
To summarize:
Knowing I can't change the original script (which is made with a print statement) how can I obtain the results for each loop? Is there a better way than using a for loop?.
The first script can be invoked with python original.py
It requires the file to be changed manually inside the script in the original.file line.
This script outputs the result in the console and I redirect it with: python original.py > result.txt
At the moment when I try to run my script, it reads all the correct files in the folder but only returns the results for the last file.
#
(I tried to reformulate the question hopefully it's easier to understand)
#
The problem is due to a mistake in the line ficheiros = [f for f in glob.glob(fn)]: it's only reading one file, hence only outputting one result.
Thanks for the time.sleep() trick in the comments.
Solved:
I changed the initial part to:
fn=str(sys.argv[1])
ficheiros= []
for filename in sys.argv[1:]:
    ficheiros.append(filename)
    #print(filename)
and now it correctly reads all the files and it outputs all the results
Depending on your operating system there are different ways to take what is printed to the console and append it to a file.
For example, on Linux you could run the script that calls original.py for every file as python yourfile.py >> outputfile.txt, which appends everything that is printed to outputfile.txt.
The syntax is similar for Windows.
I'm not quite sure what you're asking, but you could try one of these:
Either redirecting all output to a file for later use, by running the script like so: python secondscript.py > outfilename.txt
Or, and this might or might not work for you, redefining the print command to a function that outputs the result how you want, eg:
def print(x):
    # 'a' appends; with 'w' every call would overwrite the previous result
    with open('outfile.txt','a') as f:
        f.write('example: ' + str(x) + '\n')
If you choose the second option, I recommend saving the old print function (oldprint = print) so you can restore and use the regular print later.
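If you'd rather not replace print at all, another option (my suggestion, not something from the original answer) is contextlib.redirect_stdout from the standard library, which temporarily sends everything printed inside a with block to a buffer. A minimal Python 3 sketch; capture_output and noisy are made-up names, with noisy standing in for original.function():

```python
import contextlib
import io


def capture_output(func, *args, **kwargs):
    """Call func and return everything it printed to stdout as a string."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        func(*args, **kwargs)
    return buffer.getvalue()


def noisy(name):
    # Hypothetical stand-in for a third-party function that only prints.
    print("result for", name)
```

You could then collect one result per file, e.g. results = {f: capture_output(noisy, f) for f in files}, and print works normally again as soon as the with block ends.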
I don't know if I got exactly what you want. You have a first script named original.py which takes some arguments and returns things in the form of print statements and you would like to grab these prints statements in your scripts to do things?
If so, a solution could be the subprocess module:
Let's say that this is original.py:
print("Hi, I'm original.py")
print("print me!")
And this is main.py:
import subprocess
script_path = "original.py"
print("Executing ", script_path)
process = subprocess.Popen(["python3", script_path], stdout=subprocess.PIPE)
for line in process.stdout:
    print(line.decode("utf8"))
You can easily add more arguments in the Popen call like ["arg1", "arg2",] etc.
Output:
Executing original.py
Hi, I'm original.py
print me!
and you can grab the lines in the main.py to do what you want with them.
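On Python 3.7 or newer, the same idea can be written a little more compactly with subprocess.run. This is just a sketch: run_script is a name I made up, and it assumes the script can be started with the same interpreter that runs main.py (sys.executable).

```python
import subprocess
import sys


def run_script(script_path, *args):
    """Run a Python script in a child process and return what it printed."""
    completed = subprocess.run(
        [sys.executable, script_path, *args],
        capture_output=True,  # collect stdout and stderr instead of showing them
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout
```

Calling run_script("original.py") once per file inside a loop gives you one string of output per run, which you can store or write wherever you like.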

Using os.system to operate a .py file on many files

I hope that I can ask this in a clear way. I'm very much a beginner to Python and forums in general, so I apologise if I've got anything wrong from the start!
My issue is that I am currently trying to use os.system() to run a program on every file within a directory (this is a directory of ASCII tables which I am crossing with a series of other tables to find matches).
import os
for filename in os.listdir('.'):
    os.system('stilts tmatch2 ifmt1=ascii ifmt2=ascii in1=intern in2= %s matcher=2d values1='col1 col2' values2='col1 col2' params=5 out= %s-table.fits'%(filename,filename))
So what I'm hoping this would do is run the program, known as stilts, for every 'filename'. I'm guessing this gets interrupted/doesn't work because of the apostrophes ' in the line of code itself, which must disrupt the syntax? (Please correct me if I am wrong.)
I then replaced the ' in os.system() with "" instead. This, however, stops me using the %s notation to refer to filenames throughout the code (at least I am pretty sure anyway).
import os
for filename in os.listdir('.'):
    os.system("stilts tmatch2 ifmt1=ascii ifmt2=ascii in1=intern in2= %s matcher=2d values1='col1 col2' values2='col1 col2' params=5 out= %s-table.fits"%(filename,filename))
This now runs but obviously doesn't work, as it interferes with the %s input.
Any ideas how I can go about fixing this? are there any alternative ways to refer to all of the other files given by 'filename' without using %s?
Thanks in advance and again, sorry for my inexperience with both coding and using this forum!
I am not familiar with os.system(), but it may behave differently if you build the command string before passing it to that method.
In Python you can concatenate strings with +, so you can save the parts of your command in variables and add the filenames, as in:
os.system(commands+filename+othercommands+filename)
Another thing worth checking is what the loop gives you when using:
for file in os.listdir()
os.listdir() returns the entries as plain name strings, not file objects, so they can be used directly in the command; printing type(filename) will confirm this.
Sorry I can't test my answers for you, but the computer I am using is too slow for me to try downloading Python.
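One way to sidestep the quoting question entirely is to pass the command to subprocess as a list of arguments, so no shell has to parse the apostrophes. (As far as Python is concerned, %s formatting behaves the same in single- and double-quoted strings.) Here is a rough Python 3 sketch; the helper names are mine, it assumes stilts is on the PATH and accepts the same key=value arguments as in the question, and it drops the space after in2= and out=, which is a guess on my part about what was intended.

```python
import os
import subprocess


def build_stilts_command(filename):
    """Build the stilts tmatch2 argument list for one input table."""
    return [
        "stilts", "tmatch2",
        "ifmt1=ascii", "ifmt2=ascii",
        "in1=intern", "in2=" + filename,
        "matcher=2d",
        # Each list item is one argument, so the space needs no quotes.
        "values1=col1 col2", "values2=col1 col2",
        "params=5",
        "out=" + filename + "-table.fits",
    ]


def match_all(directory="."):
    """Run stilts once for every entry in directory."""
    for filename in os.listdir(directory):
        subprocess.call(build_stilts_command(filename))
```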

Turn off interactive mode in FTP

I am trying to automate the download of multiple files from an ftp source. These will span multiple years, dates, and from multiple sites that collected the data. Right now, I'm trying to make the basic download work. I can download a single file, but multiple files fail. I know when doing it manually, we would get to the directory, then
$>prompt
$>mget *.*
I have the following code as a first run at this...
import ftplib, subprocess
session = ftplib.FTP(host,user,password)
session.cwd(path)
subprocess.call("prompt")
files = session.nlst()
for f in files:
    print f
    session.retrbinary(("RETR" + f), open(f, 'wb').write)
session.quit()
Without the subprocess.call, the code pulls the first file, then errors out saying "command not understood." My assumption is that this is the box prompting, since it does that when downloading manually. That's why I'm assuming I need the subprocess.call("prompt") command in there, as I would if handling this manually. However, when I have the subprocess added, it gives me an error that "The system cannot find the file specified" so that doesn't work, either. This error comes out of the subprocess.py module.
I guess I should post this here. Thank you to Greg Hewgill in the comments for the answer. I just needed a space after "Retr" in the line
session.retrbinary(("RETR " + f), open(f, 'wb').write)
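For anyone landing here later, a slightly fuller version of the loop might look like the sketch below (Python 3). The function name download_all is made up, and it assumes session is an ftplib.FTP object that is already logged in and has had cwd() called on it. No "prompt" step is needed, because ftplib talks to the server directly instead of driving the interactive ftp client.

```python
import os


def download_all(session, dest_dir="."):
    """Download every entry listed by the server into dest_dir."""
    names = session.nlst()
    for name in names:
        local_path = os.path.join(dest_dir, os.path.basename(name))
        # The with block closes each local file, even if a transfer fails.
        with open(local_path, "wb") as fh:
            # Note the space after RETR: the command is "RETR <filename>".
            session.retrbinary("RETR " + name, fh.write)
    return names
```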

Python OSError: Too many open files

I'm using Python 2.7 on Windows XP.
My script relies on tempfile.mkstemp and tempfile.mkdtemp to create a lot of files and directories with the following pattern:
_,_tmp = mkstemp(prefix=section,dir=indir,text=True)
<do something with file>
os.close(_)
Running the script always incurs the following error (although the exact line number changes, etc.). The actual file that the script is attempting to open varies.
OSError: [Errno 24] Too many open files: 'path\\to\\most\\recent\\attempt\\to\\open\\file'
Any thoughts on how I might debug this? Also, let me know if you would like additional information. Thanks!
EDIT:
Here's an example of use:
out = os.fdopen(_,'w')
out.write("Something")
out.close()
with open(_) as p:
    p.read()
You probably don't have the same value stored in _ at the time you call os.close(_) as at the time you created the temp file. Try assigning to a named variable instead of _.
It would help you and us if you could provide a very small code snippet that demonstrates the error.
Why not use tempfile.NamedTemporaryFile with delete=False? This lets you work with Python file objects, which is one bonus. It can also be used as a context manager (which should take care of making sure the file is properly closed):
with tempfile.NamedTemporaryFile('w',prefix=section,dir=indir,delete=False) as f:
    pass #Do something with the file here.
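If you want to stay with mkstemp, a sketch along these lines keeps the descriptor in a named variable and closes it exactly once (Python 3; write_temp is a hypothetical helper name, and the prefix and directory arguments stand in for section and indir from the question):

```python
import os
import tempfile


def write_temp(text, prefix="tmp", directory=None):
    """Create a temp file, write text to it, and return its path."""
    # mkstemp returns (file descriptor, path); give both real names.
    fd, path = tempfile.mkstemp(prefix=prefix, dir=directory, text=True)
    # os.fdopen wraps the descriptor; leaving the with block closes it,
    # so no separate os.close(fd) is needed (or allowed) afterwards.
    with os.fdopen(fd, "w") as out:
        out.write(text)
    return path
```

Later reads should open the returned path, not the descriptor, which is already closed by then.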

How to move a file with a complicated filename in python

I am trying to do $ mv <file> .. in a python script using subprocess.call(). I am able to do this on 'normal' filenames, but on certain filenames it does not work. I do not have control of the filenames that are given to the script. Here is an example:
My filename is "ITunes ES Film Metadata_10_LaunchTitles(4th Batch)_08_20_2010.XLS"
When I try and do the command directly into the python prompt and drag the file into it, this is what I get:
>>> /Users/David/Desktop/itunes_finalize/TheInventionOfLying_CSP/
ITunes\ ES\ Film\ Metadata_10_LaunchTitles\(4th\ Batch\)_08_20_2010.XLS
No such file or directory
How would I go about moving this file in a python script?
Update:
Thanks for the answers, this is how I ended up doing it:
for file in glob.glob(os.path.join(dir, '*.[xX][lL][sS]')):
    shutil.move(file, os.path.join(os.path.dirname(file), os.path.pardir))
subprocess is not the best way to go here. For example, what if you're on an operating system that isn't POSIX compliant?
Check out the shutil module.
>>> import shutil
>>> shutil.move(src, dest)
If finding the actual string for the filename is hard you can use glob.glob to pattern match what you want. For example, if you're running the script/prompt from the directory with the .XLS file in question you could do the following.
>>> import glob
>>> glob.glob('*ITunes*.XLS')
You'll get a list back with all the file strings that fit that pattern.
Rather than using subprocess and spawning a new process, use shutil.move() to just do it in Python. That way, the names won't be reinterpreted and there will be little chance for error.
Spaces, parens, etc. are the shell's problem. They don't require escaping in Python provided you don't pass them to a shell.
open('*WOW!* Rock&Roll(uptempo).mp3')
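To illustrate, here is a small Python 3 sketch of the "mv file .." operation done with shutil.move. The name move_to_parent is made up; the point is that the filename is passed as an ordinary string, so spaces and parentheses need no escaping.

```python
import os
import shutil


def move_to_parent(path):
    """Move the file at path up one directory and return its new location."""
    parent = os.path.dirname(os.path.dirname(os.path.abspath(path)))
    # When the destination is an existing directory, shutil.move
    # places the file inside it and keeps the original name.
    return shutil.move(path, parent)
```

No shell is involved at any point, so a name like the .XLS file from the question goes through untouched.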
