Determine Filename of Unzipped File - python

Say you unzip a file called file123.zip with zipfile.ZipFile, which yields an unzipped file saved to a known path. However, this unzipped file has a completely random name. How do you determine this completely random filename? Or is there some way to control what the name of the unzipped file is?
I am trying to implement this in python.

By "random" I assume that you mean that the files are named arbitrarily.
You can use ZipFile.read() which unzips the file and returns its contents as a string of bytes. You can then write that string to a named file of your choice.
from zipfile import ZipFile
with ZipFile('file123.zip') as zf:
for i, name in enumerate(zf.namelist()):
with open('outfile_{}'.format(i), 'wb') as f:
f.write(zf.read(name))
This will write each file from the archive to a file named output_n in the current directory. The names of the files contained in the archive are obtained with ZipFile.namelist(). I've used enumerate() as a simple method of generating the file names, however, you could substitute that with whatever naming scheme you require.

If the filename is completely random you can first check for all filenames in a particular directory using os.listdir(). Now you know the filename and can do whatever you want with it :)
See this topic for more information.

Related

Python files pdf rename

I have a file .pdf in a folder and I have a .xls with two-column. In the first column I have the filename without extension .pdf and in the second column, I have a value.
I need to open file .xls, match the value in the first column with all filenames in the folder and rename each file .pdf with the value in the second column.
Is it possible?
Thank you for your support
Angelo
You'll want to use the pandas library within python. It has a function called pandas.read_excel that is very useful for reading excel files. This will return a dataframe, which will allow you to use iloc or other methods of accessing the values in the first and second columns. From there, I'd recommend using os.rename(old_name, new_name), where old_name and new_name are the paths to where your .pdf files are kept. A full example of the renaming part looks like this:
import os
# Absolute path of a file
old_name = r"E:\demos\files\reports\details.txt"
new_name = r"E:\demos\files\reports\new_details.txt"
# Renaming the file
os.rename(old_name, new_name)
I've purposely left out a full explanation because you simply asked if it is possible to achieve your task, so hopefully this points you in the right direction! I'd recommend asking questions with specific reproducible code in the future, in accordance with stackoverflow guidelines.
I would encourage you to do this with a .csv file instead of a xls, as is a much easier format (requires 0 formatting of borders, colors, etc.).
You can use the os.listdir() function to list all files and folders in a certain directory. Check os built-in library docs for that. Then grab the string name of each file, remove the .pdf, and read your .csv file with the names and values, and the rename the file.
All the utilities needed are built-in python. Most are the os lib, other are just from csv lib and normal opening of files:
with open(filename) as f:
#anything you have to do with the file here
#you may need to specify what permits are you opening the file with in the open function

zipping files in python without folder structure

I want to create a zip file of some files somewhere on the disk in python. I successfully got the path to the folder and each file name so I did:
with zp(os.path.join(self.savePath, self.selectedIndex + ".zip"), "w") as zip:
for file in filesToZip:
zip.write(self.folderPath + file)
Everything works fine, but the zipfile that is output contains the entire folder structure leading up to the files. Is there a way to only zip the files and not the folders with it?
From the documentation:
ZipFile.write(filename, arcname=None, compress_type=None,
compresslevel=None)
Write the file named filename to the archive, giving it the archive
name arcname (by default, this will be the same as filename, but
without a drive letter and with leading path separators removed).
So, just specify an explicit arcname:
with zp(os.path.join(self.savePath, self.selectedIndex + ".zip"), "w") as zip:
for file in filesToZip:
zip.write(self.folderPath + file, arcname=file)
Maybe I' misunderstand the question but could someone explain to me why the answer is not:
zip.write(file, arcname=os.path.basename(file))
It works for me but, again, I might be missing something in the question...

How to open files in a particular folder with randomly generated names?

How to open files in a particular folder with randomly generated names? I have a folder named 2018 and the files within that folder are named randomly. I want to iterate through all of the files and open them up.
I will post three names of the files as an example but note that there are over a thousand files in this folder so it has to work on a large scale without any hard coding.
0a2ec2da-628d-417d-9520-b0889886e2ac_1.xml
00a6b260-951d-46b5-ab27-b2e8729e664d_1.xml
00a6b260-951d-46b5-ab27-b2e8729e664d_2.xml
You're looking for os.walk().
In general, if you want to do something with files, it's worth glancing at the os, os.path, pathlib and other built-in modules. They're all documented.
You could also use glob expansion to expand "folder/*" into a list of all the filenames, but os.walk is probably better.
With os.listdir() or os.walk(), depending on whether you want to do it recursively or not.
You can go through the python doc
https://docs.python.org/3/library/os.html#os.walk
https://docs.python.org/3/library/os.html#os.listdir
One you have list of files you can read it simply -
for file in files:
with open(file, "r") as f:
# perform file operations

Search for file names that contain words from a list and have a certain file extension

Beginner at python. I'm trying to search users folders for illegal content saved in folders. I want to find all files that contain either one or a number of words from the below list and also the files also have an extension that's listed.
I can search the files using file.endswith but don't know how to add in the word condition.
I've looked through the site and how only come across how to search for a certain word and not a list of words.
Thank you in advance
import os
L = ['720p','aac','ac3','bdrip','brrip','demonoid','disc','hdtv','dvdrip',
'edition','sample','torrent','www','x264','xvid']
for root, dirs, files in os.walk("Y:\User Folders\"):
for file in files:
if file.endswith(('*.7z','.3gp','.alb','.ape','.avi','.cbr','.cbz','.cue','.divx','.epub','.flac',
'.flv','.idx','.iso','.m2ts','.m2v','.m3u','.m4a','.m4b','.m4p','.m4v','.md5',
'.mkv','.mobi','.mov','.mp3','.mp4','.mpeg','.mpg','.mta','.nfo','.ogg','.ogm',
'.pla','.rar','.rm','.rmvb','.sfap0','.sfk','.sfv','.sls','.smfmf','.srt,''.sub',
'.torrent','.vob','.wav','.wma','.wmv','.wpl','.zip')):
print(os.path.join(root, file))
Perhaps it might be better to do a reverse search, and display a warning about files that DON'T match the file types you want. For instance you could do this:
if file.endswith(".txt", ".py"):
print("File is ok!")
else:
print("File is not ok!")
Using py.path.local from py package
The py package (install by $ pip install py) offers a very nice interface for working with files.
from py.path import local
def isbadname(path):
bad_extensions = [".pyc", "txt"]
bad_names = ["code", "xml"]
return (path.ext in bad_extensions) or (path.purebasename in bad_names)
for path in local(".").visit(isbadname):
print(path.strpath)
Explained:
Import
from py.path import local
py.path.local function creates "objectified" file names. To keep my code short, I import
it this way to use only local for objectifying file name strings.
Create objectified path to local directory:
local(".")
Created object is not a string, but an object, which has many interesting properties and methods.
Listing all files within some directory:
local(".").visit("*.txt")
returns a generator, providing all paths to files having extension ".txt"..
Alternative method to detect files to generate is providing a function, which gets argument path
(objectified file name) and returns True if the file is to be used, False otherwise.
The function isbadname serves exactly this purpose.
If you want to google for more information, use py path local (the name py is not giving good hits).
For more see https://py.readthedocs.io/en/latest/path.html
Note, that if you use pytest package, the py is installed with it (for good
reason - it makes tests related to file names much more readable and shorter).

Adding file to existing zipfile

I'm using python's zipfile module.
Having a zip file located in a path of:
/home/user/a/b/c/test.zip
And having another file created under /home/user/a/b/c/1.txt
I want to add this file to existing zip, I did:
zip = zipfile.ZipFile('/home/user/a/b/c/test.zip','a')
zip.write('/home/user/a/b/c/1.txt')
zip.close()`
And got all the subfolders appears in path when unzipping the file, how do I just enter the zip file without path's subfolders?
I tried also :
zip.write(os.path.basename('/home/user/a/b/c/1.txt'))
And got an error that file doesn't exist, although it does.
You got very close:
zip.write(path_to_file, os.path.basename(path_to_file))
should do the trick for you.
Explanation: The zip.write function accepts a second argument (the arcname) which is the filename to be stored in the zip archive, see the documentation for zipfile more details.
os.path.basename() strips off the directories in the path for you, so that the file will be stored in the archive under just it's name.
Note that if you only zip.write(os.path.basename(path_to_file)) it will look for the file in the current directory where it (as the error says) does not exist.
import zipfile
# Open a zip file at the given filepath. If it doesn't exist, create one.
# If the directory does not exist, it fails with FileNotFoundError
filepath = '/home/user/a/b/c/test.zip'
with zipfile.ZipFile(filepath, 'a') as zipf:
# Add a file located at the source_path to the destination within the zip
# file. It will overwrite existing files if the names collide, but it
# will give a warning
source_path = '/home/user/a/b/c/1.txt'
destination = 'foobar.txt'
zipf.write(source_path, destination)

Categories

Resources