I'm just starting to learn Python... I've read up and it looks like I need to use glob - I just don't understand the filter process.
Imagine a directory structure like:
Main Directory
- Sub-Directory to delete
  - Sub-Sub-Directory Alpha
  - Sub-Sub-Directory Bravo Keep
      file a
      file b
  - Sub-Sub-Directory Charlie
  - Sub-Sub-Directory Oscar Keep
      file a
      file b
Using Python, how can I delete all the folders and their contents under the folder named "Main Directory", except folders whose name contains a given string ("Keep" in this example), so that it ends up like this and keeps the original directory structure?
Main Directory
- Sub-Directory
  - Sub-Sub-Directory Bravo Keep
      file a
      file b
  - Sub-Sub-Directory Oscar Keep
      file a
      file b
You can use e.g. os.walk or os.listdir to find out which directories exist. Then decide which to delete.
As a general rule, you should go through the documentation and see what functions exist when you want to do something. For OS functions, see the documentation for os and os.path.
EDIT
shutil.rmtree is used for deleting a folder with all of its contents, which is quite useful if that is what you need, but in this case you need to use the lower-level API.
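As a rough illustration of the os.walk approach, here is a minimal sketch. It takes a slightly different route from the answer above: it still calls shutil.rmtree, but only on subfolders that have no "Keep" folder anywhere below them. The function names prune_except and _contains_kept are made up for this example, and it assumes loose files outside the deleted folders should be left alone.

```python
import os
import shutil


def _contains_kept(path, keep):
    """Return True if any directory below `path` has `keep` in its name."""
    return any(keep in d for _root, dirs, _files in os.walk(path) for d in dirs)


def prune_except(root, keep="Keep"):
    """Delete folders under `root`, sparing folders whose name contains `keep`.

    Folders that merely lead to a kept folder survive as well, so the
    original directory structure above the kept folders is preserved.
    """
    for dirpath, dirnames, _filenames in os.walk(root, topdown=True):
        for name in list(dirnames):  # iterate over a copy; dirnames is edited below
            full = os.path.join(dirpath, name)
            if keep in name:
                # Keep this folder and everything inside it; do not descend.
                dirnames.remove(name)
            elif not _contains_kept(full, keep):
                # Nothing worth keeping below: remove the whole subtree.
                shutil.rmtree(full)
                dirnames.remove(name)
            # Otherwise leave it in dirnames so os.walk descends into it.


# Example call (destructive, so try it on a copy first):
# prune_except("Main Directory", keep="Keep")
```

Editing dirnames in place is how os.walk (with topdown=True) is told which subfolders to skip.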
Related
I just downloaded a file called "N_PR_8705_004A_.doc" into my "Downloads" folder and I want to put it into my "Stage NLP" folder using os. I know how to do it without os, but I'd like this approach to work because it's faster, and it simply doesn't. First I tried to get the path of my file like this:
import os
os.path.dirname(os.path.abspath("N_PR_8705_004A_.doc"))
# or os.path.realpath it's the same
and the result i get is:
'C:\\Users\\f002722\\Stage NLP'
whereas when I list all the files in this folder with:
os.listdir("C:\\Users\\f002722\\Stage NLP")
you clearly see it is simply not there:
['.ipynb_checkpoints',
'ADR service study - D2 (1st part).pdf',
'basetal.py',
'Codes test',
'Cours NLP.ipynb',
'e Deorbit',
'edot CDF study.pdf',
'edot_v5.pdf',
'Entrainement.ipynb',
'ESA edot workshop May 6th 2014 - Summary.msg',
'ESA_edotWorkshop-_Envisat_attitude-Copy1',
'ESA_edotWorkshop-_Envisat_attitude.pdf',
'ESA_edotWorkshop_GNC_.pdf',
'ESA_INNOCENTI_Challenges.pdf',
'ESA_Robin_Biesbroek_edot.pdf',
'GMV_edot_Symposium.pdf',
'JOP_edotWorkshop.pdf',
'KT_HAARMANN_Edot.pdf',
'MDA_edot_Symposium_-_Robotic_Capture.pdf',
'MDA_eDot_Symposium_-_Robotic_Capture.pdf.kx2zd5w.partial',
'Note_Ariane_NLP.ipynb',
'Note_Ariane_NLP_2.ipynb',
'Note_Ariane_NLP_3.ipynb',
'OHB_eDotWorkshop_ADRM.pdf',
'OHB_Sweden_eDotWorkshop_PRISMA_and_IRIDES.pdf',
'SKA_Polska_eDotWorkshop_Net_Simulator.pdf',
'TAS_Carole_Billot_edot.pdf',
'Test.ipynb',
'Text_clustering_v3_2.py',
'Webinar_OOSandADR_7May2020.pdf',
'__pycache__']
So what is going on? I'm out of ideas here.
Thx in advance
I think I have a possible answer to your question. Neither realpath nor abspath requires its argument to name an existing file. In particular, the documentation for abspath() says: "On most platforms, this is equivalent to calling the function normpath() as follows: normpath(join(os.getcwd(), path))."
This means that if you have a Python script that has a line like,
foo = os.path.dirname(os.path.abspath("doesnotexist"))
then the value of foo will be the current working directory of the script. Since "doesnotexist" isn't the name of a file in this directory, it won't show up if you do os.listdir(foo).
I notice that you wrote that "N_PR_8705_004A_.doc" was in your "Downloads" directory, which is obviously not the same as 'C:\\Users\\f002722\\Stage NLP'. If 'C:\\Users\\f002722\\Stage NLP' is the working directory for your Python script, then running os.path.dirname(os.path.abspath("N_PR_8705_004A_.doc")) is just like writing os.path.dirname(os.path.abspath("doesnotexist")), for the reasons that I just gave.
Python can't automatically figure out the path of a file just by giving it a relative file name. For example, there could be many files named README.txt on a system, each in different directories, so there's no way for os.path.abspath('README.txt') to know which of those directories you want.
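The point about abspath can be checked directly. This is a small sketch; "doesnotexist" is a placeholder name assumed not to exist in the working directory.

```python
import os

name = "doesnotexist"  # placeholder; assumed to be absent from the working directory
resolved = os.path.abspath(name)

# abspath only joins the name onto the current working directory and
# normalises the result; it never looks at the disk.
same_as_join = resolved == os.path.normpath(os.path.join(os.getcwd(), name))

print(resolved)
print("same as normpath(join(getcwd(), name)):", same_as_join)
print("exists on disk:", os.path.exists(resolved))
```

Calling os.path.exists (or Path.is_file) is the way to find out whether the resulting path actually points at anything.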
To move the file "N_PR_8705_004A_.doc" from the "Downloads" directory to 'C:\\Users\\f002722\\Stage NLP', you'd probably need to do something like this:
import shutil
shutil.move('C:\\Users\\f002722\\Downloads\\N_PR_8705_004A_.doc',
            'C:\\Users\\f002722\\Stage NLP')
presuming, of course, that the "Downloads" directory was inside 'C:\\Users\\f002722'.
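A slightly more defensive version of the same move might look like the sketch below. The helper name move_into is invented for this example, and the commented call assumes that both "Downloads" and "Stage NLP" sit directly inside the user's home directory.

```python
import shutil
from pathlib import Path


def move_into(src, dest_dir):
    """Move the file `src` into the existing folder `dest_dir`; return the new path."""
    src = Path(src)
    dest_dir = Path(dest_dir)
    if not src.is_file():
        # Fail loudly instead of silently working with a path that is not there.
        raise FileNotFoundError(f"No such file: {src}")
    if not dest_dir.is_dir():
        raise NotADirectoryError(f"Not a folder: {dest_dir}")
    # When the destination is an existing folder, shutil.move puts the file inside it.
    return Path(shutil.move(str(src), str(dest_dir)))


# Example call (paths are assumptions, adjust them to your machine):
# move_into(Path.home() / "Downloads" / "N_PR_8705_004A_.doc",
#           Path.home() / "Stage NLP")
```

Passing the full path of the source file is what makes this work; a bare file name would again be resolved against the working directory.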
I am also having difficulty getting this code to run correctly. Per the book (Automate the Boring Stuff with Python, 2nd edition), this code is supposed to create a copy of the spam.txt file in a new folder, namely 'some_folder'. (If I understand correctly.)
Instead it only creates a file called some_folder in the home folder (when opened with Notepad, that file turns out to be the copy of the spam.txt file).
import shutil, os
from pathlib import Path
p = Path.home()
shutil.copy(p / 'spam.txt', p / 'some_folder')
I have seen a similar question asked but the issue they had was the spam.txt file had not been created. I have the spam.txt file, my issue is the output is not a copied file under a new 'some_folder' folder.
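Apologies, I misunderstood. The new folder is supposed to exist already; shutil.copy will not create it. When the destination is not an existing folder, it is treated as the name of the new file, which is why a file called some_folder appeared.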
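One way to get the intended result is to create the folder before copying. This is a minimal sketch; the helper name copy_into_folder is made up for the example.

```python
import shutil
from pathlib import Path


def copy_into_folder(src, folder):
    """Copy the file `src` into `folder`, creating the folder first if needed."""
    folder = Path(folder)
    # Make sure the destination is a real folder, so shutil.copy places the
    # file inside it instead of creating a file with the folder's name.
    folder.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy(src, folder))


# Example call, mirroring the snippet from the question:
# p = Path.home()
# copy_into_folder(p / 'spam.txt', p / 'some_folder')
```

If a plain file named some_folder is still left over from the earlier attempt, mkdir will raise FileExistsError until that file is removed.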
This is driving me crazy, so I decided to ask here. I need to save a generated txt file (from the lower levels of the project) to a folder which is a sibling of the parent directory. I have tried a variety of different code, along with all the possibilities offered by os.path.join etc., and still nothing works.
My structure:
--reports
--parent folder
    --another folder
        --another folder
            -file.py
My latest code (based on strings):
abs_dir = str(Path(__file__))
i = abs_dir.index("master")
self.new_dir = os.path.join(abs_dir[:i]+f"reports//log({self.date}).txt")
Use double dots (..), which refer to the parent directory.
If you want to get to a folder that sits next to yours in the parent folder, for example
f1
- file.py
f2
- a.txt
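you could just use ../f2/a.txt
In your case, you would have to go 3 folders up: ../../../reports/log...
Keep in mind that a relative path like this is resolved against the current working directory, not against the location of file.py, so it only works when the script is started from its own folder.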
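To make the result independent of where the script is started from, the path can be built from the script's own location instead. The sketch below assumes the layout from the question (file.py three folders below the directory that holds reports); the function name report_path and its parameters are invented for the example.

```python
from pathlib import Path


def report_path(script_file, date_text):
    """Return <root>/reports/log(<date_text>).txt for a script three levels below <root>."""
    script_dir = Path(script_file).resolve().parent
    # parents[0] is one folder up, so parents[2] is three folders up,
    # the same place "../../.." points to from the script's folder.
    project_root = script_dir.parents[2]
    return project_root / "reports" / f"log({date_text}).txt"


# Inside file.py this would be called as:
# new_path = report_path(__file__, "2020-01-01")
```

Because it starts from `__file__`, the result is the same whichever directory the program is launched from.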
I have two folders, one on my server and one on my Ansible server, and I want to make sure that there are no extra files on my server (such as .bk files or files that aren't used anymore), while being able to exclude files ending with .xyz. Is there a module or some other way to do this in Ansible? I don't want to use cmd or shell tasks, but I wasn't able to find any module.
Folder 1 :
Bla.txt
Hello.jar
Folder 2:
Bla.txt
Hello.jar
info.log
bla.txt.bk
I would like bla.txt.bk to be deleted from folder 2.
Thank you in advance
An option would be to use synchronize.
"To make sure that there is no extra files": use the delete parameter, or the rsync_opts parameter.
"To exclude files ending with .xyz": use the rsync_opts parameter.
For example (not tested)
- synchronize:
    src: folder_1
    dest: folder_2
    rsync_opts:
      - "--exclude=*.xyz"
      - "--delete"
To exclude a list of patterns, use --exclude-from=FILE, which reads the exclude patterns from FILE.
I need to do a batch rename given the following scenario:
I have a bunch of files in Folder A
A bunch of files in Folder B.
The files in Folder A are all ".doc",
the files in Folder B are all ".jpg".
The files in Folder A are named "A0001.doc"
The files in Folder B are named "A0001johnsmith.jpg"
I want to merge the folders, and rename the files in Folder A so that they append the name portion of the matching file in Folder B.
Example:
Before:
Folder A:        Folder B:
A0001.doc        A0001johnsmith.jpg
After:
Folder C:
A0001johnsmith.doc
A0001johnsmith.jpg
I have seen some batch renaming scripts, but the difference here is that I need to assign a variable to hold the name portion so I can append it to the end of the corresponding file in Folder A.
I figure that the best way to do it would be to do a simple python script that would do a recursive loop, working on each item in the folder as follows:
Parse filename of A0001.doc
Match string to filenames in Folder B
Take the portion following the string that matched but before the "." and assign variable
Take the original string A0001 and append the variable containing the name element and rename it
Copy both files to Folder C (non-destructive, in case of errors etc)
I was thinking of using python for this, but I could use some help with syntax and such. I only know a little bit using the base python library, and I am guessing I would be importing libraries such as "OS", and maybe "SYS". I have never used them before, any help would be appreciated. I am also open to using a windows batch script or even powershell. Any input is helpful.
This is Powershell since you said you would use that.
Please note that I HAVE NOT TESTED THIS. I don't have access to a Windows machine right now so I can't test it. I'm basing this off of memory, but I think it's mostly correct.
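foreach ($aFile in ls "/path/to/FolderA")
{
    # Base name without the extension, e.g. "A0001"
    $matchString = $aFile.Name.Split(".")[0]
    # First file in FolderB whose name matches that base name
    $bFile = @(ls "/path/to/FolderB" |? { $_.Name -Match $matchString })[0]
    # Whatever is left of the .jpg base name is the part to append, e.g. "johnsmith"
    $addString = $bFile.Name.Split(".")[0].Replace($matchString, "")
    cp $aFile.FullName ("/path/to/FolderC/" + $matchString + $addString + ".doc")
    cp $bFile.FullName "/path/to/FolderC"
}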
This makes a lot of assumptions about the name structure. For example, I assumed the string to add doesn't appear in the common filename strings.
It is very simple with a plain batch script.
@echo off
for %%A in ("folderA\*.doc") do (
    for %%B in ("folderB\%%~nA*.jpg") do (
        copy "%%A" "folderC\%%~nB.doc"
        copy "%%B" "folderC"
    )
)
I haven't added any error checking.
You could have problems if you have a file like "A1.doc" matching multiple files like "A1file1.jpg" and "A10file2.jpg".
As long as the .doc files have fixed width names, and there exists a .jpg for every .doc, then I think the code should work.
Obviously more code could be added to handle various scenarios and error conditions.
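Since the question also asked about Python, the steps listed there could be sketched roughly as follows. The function name merge_and_rename and the folder arguments are placeholders, and the sketch assumes that each .doc base name is the prefix of exactly one .jpg name; anything missing or ambiguous is skipped rather than guessed.

```python
import shutil
from pathlib import Path


def merge_and_rename(folder_a, folder_b, folder_c):
    """Copy matching .doc/.jpg pairs into folder_c, renaming the .doc after the .jpg."""
    folder_a, folder_b, folder_c = Path(folder_a), Path(folder_b), Path(folder_c)
    folder_c.mkdir(parents=True, exist_ok=True)
    copied = []
    for doc in sorted(folder_a.glob("*.doc")):
        # Candidate .jpg files whose base name starts with the .doc base name.
        matches = [jpg for jpg in folder_b.glob("*.jpg")
                   if jpg.stem.startswith(doc.stem)]
        if len(matches) != 1:
            # No match or several matches: skip instead of picking one.
            continue
        jpg = matches[0]
        # The part after the shared prefix, e.g. "johnsmith".
        name_part = jpg.stem[len(doc.stem):]
        new_doc = folder_c / f"{doc.stem}{name_part}.doc"
        # Copy rather than move, so the originals stay untouched.
        shutil.copy2(doc, new_doc)
        shutil.copy2(jpg, folder_c / jpg.name)
        copied.append(new_doc.name)
    return copied


# Example call (folder names are placeholders):
# merge_and_rename("Folder A", "Folder B", "Folder C")
```

Skipping ambiguous matches sidesteps the "A1.doc" versus "A10file2.jpg" problem mentioned above, at the cost of leaving those files for manual handling.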