Ansible module to compare files in 2 folders and delete extras - python

I have two folders, one on my server and one on my Ansible server, and I want to make sure that there are no extra files on my server (such as .bk files or files that aren't used anymore), while being able to exclude files ending with .xyz. Is there a module or another way to do this in Ansible? I would rather not use cmd or shell tasks, but I wasn't able to find any module.
Folder 1 :
Bla.txt
Hello.jar
Folder 2:
Bla.txt
Hello.jar
info.log
bla.txt.bk
I would like bla.txt.bk to be deleted from folder 2.
Thank you in advance

One option is to use the synchronize module.
"To make sure that there are no extra files": use the delete parameter, or pass --delete through the rsync_opts parameter.
"To exclude files ending with .xyz": pass --exclude through the rsync_opts parameter.
For example (not tested):
- synchronize:
    src: folder_1
    dest: folder_2
    rsync_opts:
      - "--exclude=*.xyz"
      - "--delete"
To exclude a list of patterns, use --exclude-from=FILE, which reads the exclude patterns from FILE.
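Before running a deleting sync, it can help to preview which files count as "extra". The following is a minimal Python sketch of that comparison, not part of the Ansible answer: the function name extra_files and its parameters are hypothetical, and it only looks at the top level of the two folders. It mirrors rsync's default behaviour, where files matching an --exclude pattern are also protected from --delete on the receiving side unless --delete-excluded is given.

```python
import filecmp
import fnmatch


def extra_files(reference_dir, target_dir, exclude=("*.xyz",)):
    """Return names found only in target_dir, skipping excluded patterns.

    Hypothetical helper: compares the top level of both folders only.
    """
    # right_only lists entries present in target_dir but not in reference_dir
    only_in_target = filecmp.dircmp(reference_dir, target_dir).right_only
    return sorted(
        name
        for name in only_in_target
        if not any(fnmatch.fnmatch(name, pattern) for pattern in exclude)
    )
```

With the folders from the question, this would report both info.log and bla.txt.bk as extras; adding "*.log" to exclude would leave only bla.txt.bk.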

Related

A format to specify search pattern for folders and files within them

I am using a testing framework to test designs written in VHDL. In order for this to work, a Python script creates several "libraries" and then adds files in these libraries. Finally the simulator program is invoked, it starts up, compiles all the files into the specified libraries and then runs the tests.
I want to make changes in the way we specify what "libraries" to create and where to add the files for each library from. I think that it should be possible to write the description for these things in JSON and then let Python script process it. In this way, the same Python script can be used for all projects and I don't have to worry about someone not knowing Python.
The main issue is deciding how to express the information in JSON file. The JSON file shall create entries for library name and then location of source files. The fundamental problem is how to express these things using some type of pattern like glob or regular expression:
Pattern for name of folder to search
Pattern for name of subfolders to search
Express if all subfolders should be searched in a folder or not
What subfolders to exclude from search
This would express something like e.g "files in folder A but not its subfolders, folder B and its subfolders but not subfolder X in folder B"
Then we come to the pattern for the actual file names. The pattern of file names shall follow the pattern for the folder. If same file pattern applies to multiple folders, then after multiple lines of folder name patterns, the filename pattern applying to all of them shall occur once.
Pattern for name of file to add into library.
Pattern for name of file to exclude from library.
This would express something like e.g "all files ending with ".vhd" but no files that have "_bb_inst.vhd" in their name and do not add p.vhd and q.vhd"
Finally, the Python script parsing the files should be able to detect conflicts in the rules, e.g. a folder is specified for search and exclusion at the same time, or the same files are being added into multiple libraries. This will of course be done within the Python script.
Now my question is, does a well defined pre-existing method to define something like what I have described here already exist? The only reason to choose JSON to express this is that Python has packages to traverse JSON files.
Have you looked at the glob library?
For your more tricky use cases you could specify in/out lists using glob patterns.
For example
import glob
inlist_pattern = "/some/path/on_yoursystem/*.vhd"
outlist_pattern = "/some/path/on_yoursystem/*_bb_inst.vhd"
filtered_files = set(glob.glob(inlist_pattern )) - set(glob.glob(outlist_pattern))
And other set operations allow you to perform more interesting in/out operations.
To do recursive scans, try amending your patterns accordingly:
inlist_pattern = "/some/path/on_yoursystem/**/*.vhd"
outlist_pattern = "/some/path/on_yoursystem/**/*_bb_inst.vhd"
list_of_all_vhds_in_sub_dirs = glob.glob(inlist_pattern, recursive=True)
With the recursive=True keyword option, the ** notation matches zero or more directory levels at that point in the path, so the scan returns the files that match the overall pattern in that folder and in any of its subfolders.
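To tie the in/out lists back to the JSON idea from the question, here is a minimal sketch of one possible layout. The key names "include" and "exclude", the library names, the folder names and both helper functions are assumptions made for illustration, not an existing standard. Each library gets a list of glob patterns to add and a list to subtract, and a second pass reports files that land in more than one library.

```python
import glob
import json
import os

# Hypothetical config: one entry per library, glob patterns relative to a root.
EXAMPLE_CONFIG = """
{
  "lib_a": {
    "include": ["A/*.vhd", "B/**/*.vhd"],
    "exclude": ["A/*_bb_inst.vhd", "B/X/**/*.vhd"]
  },
  "lib_b": {
    "include": ["A/**/*.vhd"],
    "exclude": []
  }
}
"""


def resolve_libraries(config_text, root):
    """Expand each library's include patterns and subtract its exclude patterns."""
    libraries = {}
    for name, rules in json.loads(config_text).items():
        included, excluded = set(), set()
        for pattern in rules.get("include", []):
            included.update(glob.glob(os.path.join(root, pattern), recursive=True))
        for pattern in rules.get("exclude", []):
            excluded.update(glob.glob(os.path.join(root, pattern), recursive=True))
        libraries[name] = sorted(included - excluded)
    return libraries


def find_conflicts(libraries):
    """Return the files that were assigned to more than one library."""
    owners = {}
    for name, files in libraries.items():
        for path in files:
            owners.setdefault(path, []).append(name)
    return {path: names for path, names in owners.items() if len(names) > 1}
```

In this layout, "A/*.vhd" expresses "files in folder A but not its subfolders", while "B/**/*.vhd" combined with the exclude "B/X/**/*.vhd" expresses "folder B and its subfolders but not subfolder X".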

How to get higher directory folder in python project?

This is driving me crazy, so I decided to ask here. I need to save a generated txt file (created from the lower levels of the project) to a folder which is a sibling of the parent directory. I have tried a variety of code, along with all the possibilities offered by os.path.join etc., and still nothing works.
My structure:
--reports
--parent folder
    --another folder
        --another folder
            -file.py
My latest code (based on string):
abs_dir = str(Path(__file__))
i = abs_dir.index("master")
self.new_dir = os.path.join(abs_dir[:i]+f"reports//log({self.date}).txt")
Use double dots (..), which refer to the parent directory.
If you want to get to a folder in a parent folder
f1
- file.py
f2
- a.txt
you could just do ../f2/a.txt
In your case, you would have to go 3 folders up: ../../../reports/log...
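One caveat is that a relative path such as ../../.. is resolved against the current working directory, not against the location of the script. A minimal sketch that anchors the path to the script file instead, using pathlib, might look like this. The function name report_path and its parameters are hypothetical, and the default of three levels assumes the folder layout shown in the question.

```python
from pathlib import Path


def report_path(script_file, filename, levels_up=3, reports_dir="reports"):
    """Build <ancestor>/<reports_dir>/<filename>, climbing from the script's folder.

    levels_up=3 is an assumption matching the layout in the question.
    """
    base = Path(script_file).resolve().parent  # folder that contains the script
    for _ in range(levels_up):
        base = base.parent  # same effect as one ".." step
    return base / reports_dir / filename
```

Inside file.py this could be called as report_path(__file__, "log.txt"), which gives the same location regardless of the directory the script was started from.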

python delete contents of a folder except folders with a specific name

I'm just starting to learn Python... I've read up and it looks like I need to use glob - I just don't understand the filter process.
Imagine a directory structure like:
Main Directory
- Sub-Directory to delete
  - Sub-Sub-Directory Alpha
  - Sub-Sub-Directory Bravo Keep
      file a
      file b
  - Sub-Sub-Directory Charlie
  - Sub-Sub-Directory Oscar Keep
      file a
      file b
Using Python how can I delete all the folders and their contents under the folder named "Main Directory" except if the folder name contains a string - in this example "Keep" so that it ends up like this and keeps the original directory structure.
Main Directory
- Sub-Directory
  - Sub-Sub-Directory Bravo Keep
      file a
      file b
  - Sub-Sub-Directory Oscar Keep
      file a
      file b
You can use e.g. os.walk or os.listdir to find out which directories exist. Then decide which to delete.
As a general rule, you should go through the documentation and see what functions exist when you want to do something. For OS functions, see the documentation for os and os.path.
EDIT
shutil.rmtree deletes a folder with all of its contents, which is quite useful if that is what you need, but in this case you cannot simply call it on the top-level folder; you have to walk the tree yourself and decide directory by directory.
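As a rough illustration of that walk, here is a minimal sketch built on os.listdir. The function name prune and the marker parameter are hypothetical. It assumes that a directory whose name contains the marker is kept with everything inside it, that ancestors of kept directories are kept, and that loose files sitting directly in a kept ancestor are left alone. It does call shutil.rmtree, but only on subtrees that contain nothing to keep.

```python
import os
import shutil


def prune(directory, marker="Keep"):
    """Delete subdirectories of directory unless they, or a descendant, contain marker.

    Returns True if anything below directory was kept.
    """
    kept_something = False
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        # Only real directories are considered; files and symlinks are skipped
        if not os.path.isdir(path) or os.path.islink(path):
            continue
        if marker in name:
            kept_something = True  # keep this directory and its whole subtree
        elif prune(path, marker):
            kept_something = True  # ancestor of a kept directory
        else:
            shutil.rmtree(path)  # nothing to keep anywhere below
    return kept_something
```

Calling prune("Main Directory") on the example tree would remove the Alpha and Charlie directories and leave the two "Keep" directories with their files. Trying it on a copy of the data first is advisable, since the deletion cannot be undone.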

Create a package files python

I need to create a script that copies all .class and .xml files from multiple folders and generates a package, something like a tar archive. The different folder paths will be filled in when the script runs. Is this possible?
I'm using linux - Centos
Thanks
Python's standard library comes with multiple archiving modules, and more are available from PyPI and elsewhere.
I'm not sure how you want to fill in the paths to the things to include, but let's say you've already got that part done, and you have a list or iterator full of (appropriately relative) pathnames to files. Then, you can just do this:
import tarfile

with tarfile.open('package.tgz', 'w:gz') as tar:
    for pathname in pathnames:
        tar.add(pathname)
But you don't even have to gather all the files one by one, because tarfile can do that for you. Let's say your script just takes one or more directory names as command-line arguments, and you want it to recursively add all of the files whose names end in .xml or .class anywhere in any of those directories:
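import os
import sys
import tarfile

def package_filter(info):
    # Keep directories so the recursion continues, plus .xml and .class files
    if info.isdir() or os.path.splitext(info.name)[-1] in ('.xml', '.class'):
        return info
    return None

with tarfile.open('package.tgz', 'w:gz') as tar:
    for pathname in sys.argv[1:]:
        # filter is an argument of add(); it is called for every member added
        tar.add(pathname, filter=package_filter)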
See the examples for more. But mainly, read the docs for tarfile.open and for the filter parameter of TarFile.add.

Script to rename files in folder to match names of files in another folder

I need to do a batch rename given the following scenario:
I have a bunch of files in Folder A
A bunch of files in Folder B.
The files in Folder A are all ".doc",
the files in Folder B are all ".jpg".
The files in Folder A are named "A0001.doc"
The files in Folder B are named "A0001johnsmith.jpg"
I want to merge the folders, and rename the files in Folder A so that they append the name portion of the matching file in Folder B.
Example:
Before:
FOLDER A: Folder B:
A0001.doc A0001johnsmith.jpg
After:
Folder C:
A0001johnsmith.doc
A0001johnsmith.jpg
I have seen some batch renaming scripts, but the difference here is that I need to assign a variable to contain the name portion so I can append it to the end of the corresponding file in Folder A.
I figure that the best way to do it would be to do a simple python script that would do a recursive loop, working on each item in the folder as follows:
Parse filename of A0001.doc
Match string to filenames in Folder B
Take the portion following the string that matched but before the "." and assign variable
Take the original string A0001 and append the variable containing the name element and rename it
Copy both files to Folder C (non-destructive, in case of errors etc)
I was thinking of using Python for this, but I could use some help with syntax and such. I only know a little of the base Python library, and I am guessing I would be importing modules such as os and maybe sys. I have never used them before, so any help would be appreciated. I am also open to using a Windows batch script or even PowerShell. Any input is helpful.
This is PowerShell, since you said you would use that.
Please note that I HAVE NOT TESTED THIS. I don't have access to a Windows machine right now so I can't test it. I'm basing this off of memory, but I think it's mostly correct.
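foreach ($aFile in ls "/path/to/FolderA")
{
    # Base name of the .doc file, e.g. "A0001"
    $matchString = $aFile.Name.Split(".")[0]
    # First file in Folder B whose name matches that base name
    $bFile = $(ls "/path/to/FolderB" |? { $_.Name -Match $matchString })[0]
    # The name portion to append, e.g. "johnsmith"
    $addString = $bFile.Name.Split(".")[0].Replace($matchString, "")
    cp $aFile.FullName ("/path/to/FolderC/" + $matchString + $addString + ".doc")
    cp $bFile.FullName "/path/to/FolderC"
}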
This makes a lot of assumptions about the name structure. For example, I assumed the string to add doesn't appear in the common filename strings.
It is very simple with a plain batch script.
@echo off
for %%A in ("folderA\*.doc") do (
    for %%B in ("folderB\%%~nA*.jpg") do (
        copy "%%A" "folderC\%%~nB.doc"
        copy "%%B" "folderC"
    )
)
I haven't added any error checking.
You could have problems if you have a file like "A1.doc" matching multiple files like "A1file1.jpg" and "A10file2.jpg".
As long as the .doc files have fixed width names, and there exists a .jpg for every .doc, then I think the code should work.
Obviously more code could be added to handle various scenarios and error conditions.
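Since the question also mentions Python, the steps it lists can be sketched with pathlib and shutil. This is a minimal sketch under the same naming assumptions as the scripts above; the function name merge_and_rename and its parameters are hypothetical. To sidestep the "A1" versus "A10" ambiguity, it skips any .doc file that does not have exactly one matching .jpg.

```python
import shutil
from pathlib import Path


def merge_and_rename(folder_a, folder_b, folder_c):
    """Copy each .doc and its matching .jpg into folder_c under the .jpg's base name.

    Returns the base names that were copied. Originals are left untouched.
    """
    a, b, c = Path(folder_a), Path(folder_b), Path(folder_c)
    c.mkdir(parents=True, exist_ok=True)
    copied = []
    for doc in sorted(a.glob("*.doc")):
        # Candidates are .jpg files whose name starts with the .doc base name
        matches = [jpg for jpg in b.glob("*.jpg") if jpg.stem.startswith(doc.stem)]
        if len(matches) != 1:
            continue  # missing or ambiguous match: skip rather than guess
        jpg = matches[0]
        shutil.copy2(doc, c / (jpg.stem + ".doc"))
        shutil.copy2(jpg, c / jpg.name)
        copied.append(jpg.stem)
    return copied
```

For the example in the question, merge_and_rename("FolderA", "FolderB", "FolderC") would leave A0001johnsmith.doc and A0001johnsmith.jpg in Folder C.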
