Create a package files python - python

I need to create a script that copies all .class and .xml files from multiple folders and bundles them into a package, something like a tar archive. The folder paths will be supplied when the script runs. Is this possible?
I'm using linux - Centos
Thanks

Python's standard library comes with multiple archiving modules, and more are available from PyPI and elsewhere.
I'm not sure how you want to fill in the paths to the things to include, but let's say you've already got that part done, and you have a list or iterator full of (appropriately relative) pathnames to files. Then, you can just do this:
import tarfile

# tarfile.open() understands compression modes such as 'w:gz'
with tarfile.open('package.tgz', 'w:gz') as tar:
    for pathname in pathnames:
        tar.add(pathname)
But you don't even have to gather all the files one by one, because tarfile can do that for you. Let's say your script just takes one or more directory names as command-line arguments, and you want it to recursively add all of the files whose names end in .xml or .class anywhere in any of those directories:
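import os
import sys
import tarfile

def package_filter(info):
    # Keep directories so recursion continues; keep only .xml and .class files
    if info.isdir() or os.path.splitext(info.name)[-1] in ('.xml', '.class'):
        return info
    else:
        return None

# filter is an argument of add(), not of the TarFile constructor
with tarfile.open('package.tgz', 'w:gz') as tar:
    for pathname in sys.argv[1:]:
        tar.add(pathname, filter=package_filter)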
See the examples for more. But mainly, read the docs for tarfile.open and the TarFile.add method.
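If you would rather walk the folders yourself and skip directories that contain nothing of interest, a minimal sketch could look like the following. The function name build_package and the constant WANTED_EXTENSIONS are made up for illustration; only os.walk and tarfile.open come from the standard library.

```python
import os
import tarfile

WANTED_EXTENSIONS = ('.xml', '.class')  # extensions taken from the question


def build_package(folders, archive_name):
    """Add every .xml/.class file found under `folders` to a gzipped tar.

    Hypothetical helper: returns the list of paths that were added.
    """
    added = []
    with tarfile.open(archive_name, 'w:gz') as tar:
        for folder in folders:
            # os.walk visits the folder and all of its subfolders
            for dirpath, _dirnames, filenames in os.walk(folder):
                for filename in filenames:
                    # str.endswith accepts a tuple of suffixes
                    if filename.endswith(WANTED_EXTENSIONS):
                        path = os.path.join(dirpath, filename)
                        tar.add(path)
                        added.append(path)
    return added
```

Calling build_package(sys.argv[1:], 'package.tgz') would then take the folders from the command line, as in the answer above.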

Related

How to open files in a particular folder with randomly generated names?

How to open files in a particular folder with randomly generated names? I have a folder named 2018 and the files within that folder are named randomly. I want to iterate through all of the files and open them up.
I will post three names of the files as an example but note that there are over a thousand files in this folder so it has to work on a large scale without any hard coding.
0a2ec2da-628d-417d-9520-b0889886e2ac_1.xml
00a6b260-951d-46b5-ab27-b2e8729e664d_1.xml
00a6b260-951d-46b5-ab27-b2e8729e664d_2.xml
You're looking for os.walk().
In general, if you want to do something with files, it's worth glancing at the os, os.path, pathlib and other built-in modules. They're all documented.
You could also use glob expansion to expand "folder/*" into a list of all the filenames, but os.walk is probably better.
With os.listdir() or os.walk(), depending on whether you want to do it recursively or not.
You can go through the python doc
https://docs.python.org/3/library/os.html#os.walk
https://docs.python.org/3/library/os.html#os.listdir
Once you have the list of files, you can read them simply:
for file in files:
    with open(file, "r") as f:
        pass  # perform file operations on f here
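Putting os.walk() and open() together, a rough sketch might look like this. The helper names iter_file_paths and read_all are invented for the example, and it assumes the files are UTF-8 text, which the question does not state.

```python
import os


def iter_file_paths(folder):
    """Yield the full path of every file under `folder`, recursively."""
    for dirpath, _dirnames, filenames in os.walk(folder):
        for filename in filenames:
            # os.walk gives bare names, so join them with their directory
            yield os.path.join(dirpath, filename)


def read_all(folder):
    """Return a dict mapping each file path to its text content.

    Assumes every file is UTF-8 text (an assumption, not from the question).
    """
    contents = {}
    for path in iter_file_paths(folder):
        with open(path, "r", encoding="utf-8") as f:
            contents[path] = f.read()
    return contents
```

Because nothing is hard-coded, read_all("2018") would handle any number of randomly named files.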

Check if there are .format files in a directory

I have been trying to figure out for a while how to check if there are .pkl files in a given directory. I checked the website and I could find ways to find if there are files in the directory and list them, but I just want to check if they are there.
In my directory are a total of 7 .pkl files, as soon as I create one, the others are created so to check if the seven of them exist, it will be enough to check if one exists. Therefore, I would like to check if there is any .pkl file.
This is working if I do:
os.path.exists('folder1/folder2/filename.pkl')
But for that I had to write out one of my file names. I would like to do it without naming a specific file. I also tried
os.path.exists('folder1/folder2/*.pkl'),
but that does not work either, since I don't have a file literally named *.pkl.
You can use the python module glob (https://docs.python.org/3/library/glob.html)
Specifically, glob.glob('folder1/folder2/*.pkl') will return a list of all .pkl files in folder2.
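Since an empty list is falsy, the result can be turned directly into a yes/no answer. A small sketch, where the function name has_pkl_files is made up for illustration:

```python
import glob
import os


def has_pkl_files(directory):
    """Return True if `directory` contains at least one .pkl file."""
    # glob.escape protects any glob metacharacters in the directory name
    pattern = os.path.join(glob.escape(directory), '*.pkl')
    # iglob is lazy, so any() can stop at the first match
    return any(glob.iglob(pattern))
```

With this, has_pkl_files('folder1/folder2') replaces the os.path.exists call from the question.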
You can use:
import os

for dir_path, dir_names, file_names in os.walk(search_dir):
    # Go over all files and folders
    for file_name in file_names:
        if file_name.endswith(".pkl"):
            # do something, e.g. stop after the first one you find
            print(os.path.join(dir_path, file_name))
Note: this searches the entire directory, including its subdirectories.
If you want to search only one directory, run the for loop over os.listdir(path) instead.
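Both variants can be wrapped so that they stop at the first match and return a boolean. This is only a sketch; the function names has_pkl_in_dir and has_pkl_recursive are hypothetical.

```python
import os


def has_pkl_in_dir(path):
    """Check a single directory (no subdirectories) for any .pkl file."""
    return any(name.endswith('.pkl') for name in os.listdir(path))


def has_pkl_recursive(search_dir):
    """Check a directory and all of its subdirectories for any .pkl file."""
    for _dir_path, _dir_names, file_names in os.walk(search_dir):
        if any(name.endswith('.pkl') for name in file_names):
            return True  # returning here ends the walk early
    return False
```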

Search for file names that contain words from a list and have a certain file extension

Beginner at Python. I'm trying to search users' folders for illegal content. I want to find all files whose names contain one or more words from the list below and that also have one of the listed extensions.
I can search the files using file.endswith but don't know how to add the word condition.
I've looked through the site and have only come across how to search for a single word, not a list of words.
Thank you in advance
import os
L = ['720p','aac','ac3','bdrip','brrip','demonoid','disc','hdtv','dvdrip',
     'edition','sample','torrent','www','x264','xvid']
# Raw string without a trailing backslash, so the path is a valid literal
for root, dirs, files in os.walk(r"Y:\User Folders"):
    for file in files:
        if file.endswith(('.7z','.3gp','.alb','.ape','.avi','.cbr','.cbz','.cue','.divx','.epub','.flac',
                          '.flv','.idx','.iso','.m2ts','.m2v','.m3u','.m4a','.m4b','.m4p','.m4v','.md5',
                          '.mkv','.mobi','.mov','.mp3','.mp4','.mpeg','.mpg','.mta','.nfo','.ogg','.ogm',
                          '.pla','.rar','.rm','.rmvb','.sfap0','.sfk','.sfv','.sls','.smfmf','.srt','.sub',
                          '.torrent','.vob','.wav','.wma','.wmv','.wpl','.zip')):
            print(os.path.join(root, file))
Perhaps it might be better to do a reverse search, and display a warning about files that DON'T match the file types you want. For instance you could do this:
# endswith needs a tuple when checking several suffixes
if file.endswith((".txt", ".py")):
    print("File is ok!")
else:
    print("File is not ok!")
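For the word condition the question asks about, any() can be combined with endswith. The sketch below uses shortened versions of the question's lists, and the names is_suspect and find_suspects are invented for the example. Note that it does plain substring matching, so a short word such as "aac" will also match inside longer words.

```python
import os

WORDS = ['720p', 'aac', 'x264']        # shortened from the question's list
EXTENSIONS = ('.avi', '.mkv', '.mp4')  # shortened from the question's tuple


def is_suspect(filename, words=WORDS, extensions=EXTENSIONS):
    """True if the name has a listed extension AND contains a listed word."""
    lowered = filename.lower()  # compare case-insensitively
    return lowered.endswith(extensions) and any(w in lowered for w in words)


def find_suspects(top):
    """Yield full paths of suspect files under `top`, recursively."""
    for root, _dirs, files in os.walk(top):
        for name in files:
            if is_suspect(name):
                yield os.path.join(root, name)
```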
Using py.path.local from py package
The py package (install by $ pip install py) offers a very nice interface for working with files.
from py.path import local

def isbadname(path):
    # path.ext includes the leading dot, so list the extensions with it
    bad_extensions = [".pyc", ".txt"]
    bad_names = ["code", "xml"]
    return (path.ext in bad_extensions) or (path.purebasename in bad_names)

for path in local(".").visit(isbadname):
    print(path.strpath)
Explained:
Import
from py.path import local
py.path.local function creates "objectified" file names. To keep my code short, I import
it this way to use only local for objectifying file name strings.
Create objectified path to local directory:
local(".")
Created object is not a string, but an object, which has many interesting properties and methods.
Listing all files within some directory:
local(".").visit("*.txt")
returns a generator, providing all paths to files having the extension ".txt".
An alternative way to select which files are yielded is to pass a function that takes a path argument
(an objectified file name) and returns True if the file is to be used, False otherwise.
The function isbadname serves exactly this purpose.
If you want to google for more information, use py path local (the name py is not giving good hits).
For more see https://py.readthedocs.io/en/latest/path.html
Note, that if you use pytest package, the py is installed with it (for good
reason - it makes tests related to file names much more readable and shorter).
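If you would rather not install anything, roughly the same filter can be written with pathlib from the standard library. This is a different technique from py.path, shown only as a sketch; is_bad_name and find_bad are hypothetical names.

```python
from pathlib import Path

BAD_EXTENSIONS = {".pyc", ".txt"}  # suffix includes the leading dot
BAD_NAMES = {"code", "xml"}        # compared against the name without extension


def is_bad_name(path):
    """Same test as isbadname above, using pathlib attributes."""
    return path.suffix in BAD_EXTENSIONS or path.stem in BAD_NAMES


def find_bad(top):
    """Return a sorted list of matching paths under `top`, recursively."""
    # rglob("*") walks everything below top, files and directories alike
    return sorted(p for p in Path(top).rglob("*") if is_bad_name(p))
```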

Python operating on files in a folder - 'for file in folder'

I know a folder's path, and for every file in the folder I would like to do some operations. So essentially what I'm looking for is a for file in folder type of code that gives me access to the files in variables.
What is the Python way of doing this?
Thanks
EDIT - example: my folder will contain a bunch of XML files, and I have a python routine already to parse them into variables I need.
This will allow you to access and print all the file names in your current directory:
import os
for filename in os.listdir('.'):
    print(filename)
The os module contains much more information about the various functions available. The os.listdir() function can also take any other paths you want to specify.
Does the glob library look helpful?
It will perform some pattern matching, and accepts both absolute and relative addresses.
>>> import glob
>>> for file in glob.glob("*.xml"): # only loops over XML documents
...     print(file)
For people coming to this with Python 3.5 or later, there is also os.scandir(), which can be considerably faster than os.listdir() when you also need file-type or stat information, because each entry it yields carries that information with it.
For more information about the improvements/benefits, check out https://benhoyt.com/writings/scandir/
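A small sketch of the same XML loop with os.scandir(); the function name list_xml_files is made up for illustration.

```python
import os


def list_xml_files(folder):
    """Return sorted full paths of the .xml files directly inside `folder`."""
    # Each DirEntry exposes .name, .path and is_file() without a separate
    # os.path call per file
    return sorted(
        entry.path
        for entry in os.scandir(folder)
        if entry.is_file() and entry.name.endswith('.xml')
    )
```

Each returned path can be handed straight to the existing XML parsing routine.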

Distinguishing Files From Directories

So I'm sure this is a stupid question, but I've looked through Python's documentation and attempted a couple of Google codes and none of them has worked.
It seems like the following should work, but it returns "False" for
In my directory /foo/bar I have 3 items: 1 Folder "[Folder]", 1 file "test" (no extension), and 1 file "test.py".
I'm looking to have a script that can distinguish folders from files for a bunch of functions, but I can't figure out anything that works.
#!/usr/bin/python
import os, re
for f in os.listdir('/foo/bar'):
    print(f, os.path.isdir(f))
Currently returns false for everything.
This is because listdir() returns only the names of the files in /foo/bar. When you later call os.path.isdir() on one of these, the OS interprets it relative to the current working directory, which is probably the directory your script is in, not /foo/bar, and which probably does not contain an entry of that name. A path that doesn't exist is not a directory, so isdir() returns False.
Use the complete pathname. Best way is to use os.path.join, e.g., os.path.isdir(os.path.join('/foo/bar', f)).
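As a sketch, the corrected loop could be wrapped like this; the function name classify_entries is hypothetical.

```python
import os


def classify_entries(directory):
    """Split the entries of `directory` into (dirs, files), both sorted."""
    dirs, files = [], []
    for name in os.listdir(directory):
        # Join with the parent so the check is not relative to the cwd
        full_path = os.path.join(directory, name)
        if os.path.isdir(full_path):
            dirs.append(name)
        else:
            files.append(name)
    return sorted(dirs), sorted(files)
```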
You might want to use os.walk instead: http://docs.python.org/library/os.html#os.walk
When it returns the contents of the directory, it returns files and directories in separate lists, negating the need for checking.
So you could do:
import os
root, dirs, files = next(os.walk('/foo/bar'))
print('directories:', dirs)
print('files:', files)
I suppose that os.path.isdir(os.path.join('/foo/bar', f)) should work.
