Python to create multiple zip files for all in a folder - python

I have many files in a folder and want to zip them all, with every 10 files going into one zip file.
import os, glob
import numpy as np
import zipfile
file_folder = "C:\\ABC\\DEF\\"
all_files = glob.glob(file_folder + "/*.*")
several_lists = np.array_split(all_files, 10)
for num, file_names in enumerate(several_lists):
    ZipFile = zipfile.ZipFile(file_folder + str(num) + ".zip", "w")
    for f in file_names:
        ZipFile.write(f, compress_type=zipfile.ZIP_DEFLATED)
    ZipFile.close()
The generated zip files also contain the paths, i.e. every zip file has a folder DEF inside a folder ABC, and the files themselves are in DEF.
I changed the line to:
ZipFile.write(os.path.basename(f), compress_type=zipfile.ZIP_DEFLATED)
But then this error pops up:
WindowsError: [Error 2] The system cannot find the file specified:
How to correct it? Thank you.
Btw, is there a big difference between zip and rar files created by Python?

ZipFile.write has a parameter arcname which allows explicitly providing an in-archive filename (by default it's the same as the on-disk path).
So just use zip.write(f, arcname=os.path.basename(f)).
Also for simplicity you could set the compression mode on the zipfile.ZipFile.
edit: you can also use the ZipFile as a context manager for more reliability and fewer lines, and, assuming Python 3.6 or later, f-strings are nice:
with zipfile.ZipFile(f'{file_folder}{num}.zip', 'w', compression=zipfile.ZIP_DEFLATED) as zip:
    for f in file_names:
        zip.write(f, arcname=os.path.basename(f))
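Putting the pieces together, a minimal sketch of the whole task might look like the following. It assumes "every 10 files" means groups of at most 10 files each (np.array_split(all_files, 10) in the question instead produces 10 groups of roughly equal size). The function name zip_in_batches and the batch_size parameter are illustrative, not part of zipfile, and skipping existing .zip files is an assumption made so that reruns do not archive earlier output.

```python
import glob
import os
import zipfile


def zip_in_batches(folder, batch_size=10):
    """Zip the files in `folder` into 0.zip, 1.zip, ... with at most
    `batch_size` files each. Hypothetical helper, not part of zipfile."""
    # Collect regular files only; skip .zip archives left by earlier runs
    # (an assumption, adjust if your input files are themselves zips).
    files = sorted(
        p for p in glob.glob(os.path.join(folder, "*"))
        if os.path.isfile(p) and not p.lower().endswith(".zip")
    )
    created = []
    for num, start in enumerate(range(0, len(files), batch_size)):
        zip_path = os.path.join(folder, f"{num}.zip")
        with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
            for f in files[start:start + batch_size]:
                # arcname drops the directory part, so the archive
                # contains no ABC/DEF folders
                zf.write(f, arcname=os.path.basename(f))
        created.append(zip_path)
    return created
```

Calling zip_in_batches("C:\\ABC\\DEF\\") would then write the archives next to the source files and return their paths.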

Related

Delete file by non standard extension

I know how to delete files by extension, but what if my files look like this:
update_24-08-2022_14-54.zip.001
where the last 3 digits can be anywhere from 001 to 029?
Here is the code I'm using for standard zip files:
files_in_directory = os.listdir(directory)
filtered_files = [file for file in files_in_directory if file.endswith(".zip")]
for file in filtered_files:
    path_to_file = os.path.join(directory, file)
    os.remove(path_to_file)
Assuming the double extensions are of the form .zip.xyz, with xyz being triple digits, you can use globbing:
import glob
import os
for path in glob.glob('*.zip.[0-9][0-9][0-9]'):
    os.remove(path)
(As a usual precaution, check first, by replacing os.remove with print).
If you have a specific directory, its name stored in directory, you can use:
import glob
import os
for path in glob.glob(os.path.join(directory, '*.zip.[0-9][0-9][0-9]')):
    os.remove(path)
There is no need to join the directory and path inside the for loop (as is the case in the question): path itself will already contain the directory name.
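The glob pattern above matches any three digits (000 to 999). If only the parts 001 to 029 mentioned in the question should be removed, one possible sketch is to check the number explicitly. The helper names find_zip_parts and delete_zip_parts, and the dry_run flag, are made up for this illustration.

```python
import os
import re

# Matches names such as update_24-08-2022_14-54.zip.001
PART_PATTERN = re.compile(r".*\.zip\.(\d{3})")


def find_zip_parts(directory, low=1, high=29):
    """Return paths of files named *.zip.NNN with low <= NNN <= high."""
    matches = []
    for name in sorted(os.listdir(directory)):
        m = PART_PATTERN.fullmatch(name)
        if m and low <= int(m.group(1)) <= high:
            matches.append(os.path.join(directory, name))
    return matches


def delete_zip_parts(directory, dry_run=True):
    """Delete the matching parts; with dry_run=True only print them."""
    for path in find_zip_parts(directory):
        if dry_run:
            print("would remove", path)
        else:
            os.remove(path)
```

Running delete_zip_parts(directory) first with the default dry_run=True follows the same precaution as replacing os.remove with print.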

Rename the filename using python

I have a folder containing multiple files, and I want to rename some of them. For example: PB report December21 North.xlsb, PB report November21 North.xlsb and so on. They all start the same way: PB report. I would like to change their names and keep only PB report and the month, for example PB report December.
I have tried this code:
import os
path = r'C://Users//greencolor//Desktop//Autoreport//Load_attachments//'
for filename in os.listdir(path):
    if filename.startswith("PB report"):
        os.rename(filename, filename[:-8])
-8 indicates that I want to split the name from the end on the 8th character
I get this error:
FileNotFoundError: [WinError 2] The system cannot find the file specified
Any suggestion?
You need to include the path when renaming a file with os.rename.
Replace:
os.rename(filename, filename[:-8])
With:
filename_part, extension = os.path.splitext(filename)
os.rename(path+filename, path+filename_part[:-8]+extension)
The problem is likely that it cannot find the file because the directory is not specified. You need to add the path to the file name:
import os
path = r'C://Users//greencolor//Desktop//Autoreport//Load_attachments//'
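for filename in os.listdir(path):
    if filename.startswith("PB report"):
        # Slice the name without its extension so the extension is preserved
        stem, extension = os.path.splitext(filename)
        os.rename(os.path.join(path, filename), os.path.join(path, stem[:-8] + extension))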
This is a classic example of how working with os/os.path to manipulate paths is just not convenient. This is why pathlib exists. By treating paths as objects rather than strings, everything becomes more sensible. Using a combination of path.iterdir() and path.rename(), you can achieve what you want like this:
from pathlib import Path
path = Path(r'your path')
for file in path.iterdir():
    if file.name.startswith("PB report"):
        file.rename(file.with_stem(file.stem[:-8]))
Note that stem means the filename without the extension and that with_stem was added in Python 3.9. So for older versions you can still use with_name:
file.rename(file.with_name(file.stem[:-8] + file.suffix))
Where suffix is the extension of the file.
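Slicing off a fixed 8 characters only works while every name ends in something exactly as long as "21 North". As a hedged alternative, the sketch below extracts "PB report <Month>" with a regular expression. It assumes the month name is directly followed by a digit (the year), as in the examples from the question; the names NAME_PATTERN, shortened_path and rename_reports are illustrative.

```python
import re
from pathlib import Path

# Assumed layout: "PB report <Month><two-digit year> <region>"
NAME_PATTERN = re.compile(r"(PB report [A-Za-z]+)\d")


def shortened_path(file):
    """Return the target path, or None if the name does not match."""
    m = NAME_PATTERN.match(file.stem)
    if m is None:
        return None
    # with_name keeps the directory; suffix keeps the extension
    return file.with_name(m.group(1) + file.suffix)


def rename_reports(folder, dry_run=True):
    """Rename matching files; with dry_run=True only print the plan."""
    for file in sorted(Path(folder).iterdir()):
        target = shortened_path(file)
        if target is None:
            continue
        if dry_run:
            print(f"{file.name} -> {target.name}")
        else:
            file.rename(target)
```

Keeping the name computation in its own function makes it easy to check the result before anything on disk is touched.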

Writing zipfile in Python 3.6 without absolute path

I am trying to write a zip file using Python's zipfile module that starts at a certain subfolder but still maintains the tree structure from that subfolder. For example, if I pass "C:\Users\User1\OneDrive\Documents", the zip file will contain everything from Documents onward, with all of Documents' subfolders maintained within Documents. I have the following code:
import zipfile
import os
import datetime
def backup(src, dest):
    """Backup files from src to dest."""
    base = os.path.basename(src)
    now = datetime.datetime.now()
    newFile = f'{base}_{now.month}-{now.day}-{now.year}.zip'
    # Set the current working directory.
    os.chdir(dest)
    if os.path.exists(newFile):
        os.unlink(newFile)
        newFile = f'{base}_{now.month}-{now.day}-{now.year}_OVERWRITE.zip'
    # Write the zipfile and walk the source directory tree.
    with zipfile.ZipFile(newFile, 'w') as zip:
        for folder, _ , files in os.walk(src):
            print(f'Working in folder {os.path.basename(folder)}')
            for file in files:
                zip.write(os.path.join(folder, file),
                          arcname=os.path.join(
                              folder[len(os.path.dirname(folder)) + 1:], file),
                          compress_type=zipfile.ZIP_DEFLATED)
    print(f'\n---------- Backup of {base} to {dest} successful! ----------\n')
I know I have to use the arcname parameter for zipfile.write(), but I can't figure out how to get it to maintain the tree structure of the original directory. The code as it is now writes every subfolder to the first level of the zip file, if that makes sense. I've read several posts suggesting I use os.path.relpath() to chop off the root, but I can't seem to figure out how to do it properly. I am also aware that this post looks similar to others on Stack Overflow. I have read those other posts and cannot figure out how to solve this problem.
The arcname parameter sets the exact path within the zip file for the file you are adding. Your issue is that, when building the path for arcname, you are using the wrong value to get the length of the prefix to remove. Specifically:
arcname=os.path.join(folder[len(os.path.dirname(folder)) + 1:], file)
Should be changed to:
arcname=os.path.join(folder[len(src):], file)
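Since the question mentions os.path.relpath(), here is a minimal sketch of that route. It computes each arcname relative to the parent of src, so the top-level folder (Documents in the example) is kept inside the archive. The function name zip_tree is made up for this illustration, and the sketch assumes the zip file is written outside src; empty directories are not stored.

```python
import os
import zipfile


def zip_tree(src, zip_path):
    """Zip everything under src, storing paths relative to src's parent."""
    src = os.path.abspath(src)
    parent = os.path.dirname(src)
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for folder, _, files in os.walk(src):
            for name in files:
                full_path = os.path.join(folder, name)
                # e.g. <parent>/Documents/sub/a.txt -> Documents/sub/a.txt
                zf.write(full_path, arcname=os.path.relpath(full_path, parent))
```

To drop the top-level folder from the archive instead, os.path.relpath(full_path, src) could be used as the arcname.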

how to get name of a file in directory using python

There is an mkv file in a folder named "export". What I want to do is to make a python script which fetches the file name from that export folder.
Let's say the folder is at "C:\Users\UserName\Desktop\New_folder\export".
How do I fetch the name?
I tried using this os.path.basename and os.path.splitext .. well.. didn't work out like I expected.
os.path implements some useful functions on pathnames. But it doesn't have access to the contents of the path. For that purpose, you can use os.listdir.
The following command will give you a list of the contents of the given path:
os.listdir(r"C:\Users\UserName\Desktop\New_folder\export")
Now, if you just want the .mkv files, you can use the fnmatch module (which provides support for Unix shell-style wildcards) to get the file names you expect:
import fnmatch
import os
print([f for f in os.listdir(r"C:\Users\UserName\Desktop\New_folder\export") if fnmatch.fnmatch(f, '*.mkv')])
Also, as @Padraic Cunningham mentioned, a more Pythonic way of dealing with file names is the glob module:
map(path.basename, glob.iglob(pth+"*.mkv"))
You can use glob:
from glob import glob
pth ="C:/Users/UserName/Desktop/New_folder/export/"
print(glob(pth+"*.mkv"))
pth+"*.mkv" will match all the files ending with .mkv.
To just get the basenames you can use map or a list comp with iglob:
from glob import iglob
from os import path
print(list(map(path.basename, iglob(pth+"*.mkv"))))
print([path.basename(f) for f in iglob(pth+"*.mkv")])
iglob returns an iterator so you don't build a list for no reason.
I assume you're basically asking how to list files in a given directory. What you want is:
import os
print(os.listdir(r"C:\Users\UserName\Desktop\New_folder\export"))
If there are multiple files and you only want the ones that end in .mkv, you could do:
import os
files = os.listdir(r"C:\Users\UserName\Desktop\New_folder\export")
mkv_files = [name for name in files if name.endswith(".mkv")]
print(mkv_files)
If you need a recursive search through subfolders, os.walk will give you the file names, along with each file's directory path:
import os, fnmatch
for path, dirs, files in os.walk(os.path.abspath(r"C:/Users/UserName/Desktop/New_folder/export/")):
    for filename in fnmatch.filter(files, "*.mkv"):
        print(filename)
You can use glob:
import glob
import os
for file in glob.glob(r'C:\Users\UserName\Desktop\New_folder\export\*.mkv'):
    print(os.path.basename(file))
This will list all the files with the extension .mkv, as file.mkv, file2.mkv and so on.
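With os.walk you can collect the file names of every directory as a list of lists; the first entry holds the files in the top-level directory:
files = [file_names for _, _, file_names in os.walk(DIRECTORY_PATH)]
for file_name in files[0]:  # note that files is a list of lists
    print(file_name)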
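For completeness, the same lookup can be sketched with pathlib, which avoids both the backslash-escaping problems and the manual basename handling. The function name mkv_names and its recursive flag are illustrative only.

```python
from pathlib import Path


def mkv_names(folder, recursive=False):
    """Return the names (not full paths) of .mkv files in folder."""
    root = Path(folder)
    # rglob descends into subfolders, glob stays in the top level
    paths = root.rglob("*.mkv") if recursive else root.glob("*.mkv")
    return sorted(p.name for p in paths if p.is_file())
```

With the folder from the question, this would be called as mkv_names(r"C:\Users\UserName\Desktop\New_folder\export").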

Zipping only files in Python

So I create several files in a temp directory chosen by the NamedTemporaryFile function.
zf = zipfile.ZipFile(zipPath, mode='w')
for file in files:
    with NamedTemporaryFile(mode='w+b', buffering=-1, prefix='tmp') as tempFile:
        tempPath = tempFile.name
        with open(tempPath, 'w') as f:
            # write stuff to tempPath with contents of the variable 'file'
            f.write(file)  # assumes 'file' holds the text to write
        zf.write(tempPath)
zf.close()
When I use the path of these files to add to a zip file, the temp directories themselves get zipped up.
When I try to unzip, I get a series of temp folders, which eventually contain the files I want.
(i.e. I get the folder Users, which contains my user_id folder, which contains AppData...).
Is there a way to add the files directly, without the folders, so that when I unzip, I get the files directly? Thank you so much!
Try giving the arcname:
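import zipfile
from os import path
from tempfile import NamedTemporaryFile

zf = zipfile.ZipFile(zipPath, mode='w')
for file in files:
    with NamedTemporaryFile(mode='w+b', buffering=-1, prefix='tmp') as tempFile:
        tempPath = tempFile.name
        with open(tempPath, 'w') as f:
            # write stuff to tempPath with contents of the variable 'file'
            f.write(file)  # assumes 'file' holds the text to write
        # arcname stores only the file name, without the temp directories
        zf.write(tempPath, arcname=path.basename(tempPath))
zf.close()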
Using os.path.basename you can get the file's name from a path. According to zipfile documentation, the default value for arcname is filename without a drive letter and with leading path separators removed.
Try using the arcname parameter to zf.write:
zf.write(tempPath, arcname='Users/mrb0/Documents/things.txt')
Without knowing more about your program, you may find it easier to get your arcname from the file variable in your outermost loop rather than deriving a new name from tempPath.
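If the contents are already in memory, the temporary files may not be needed at all: ZipFile.writestr takes an in-archive name and the data directly. The sketch below assumes the data is available as a mapping from archive name to str or bytes; the function name zip_from_memory and that mapping are hypothetical, since the question does not show how files is built.

```python
import zipfile


def zip_from_memory(zip_path, contents):
    """Write each name -> data pair in `contents` as a file at the
    top level of the archive, without touching the file system."""
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in contents.items():
            # writestr takes the in-archive name and the data (str or bytes)
            zf.writestr(name, data)
```

Because the archive names are chosen explicitly, no temp folder structure can end up in the zip file.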
