Set directory for search program in Python

I am trying to develop a CNN for image processing. I have about 130 GB of data stored on a separate drive on my computer, and I'm having trouble getting a simple Python search program to search through that drive. I'm trying to have it find a bunch of XML files scattered across many levels of sub-directories. How do I specify, for just this one Python program, the directory it should search in, keeping the setting local to the program?
I've tried setting a variable Path = "B:\\MainFolder\SubFolder" and using os.walk, but it makes it through the first directory and then stops.

Can you try the following?
import os
import glob
base_dir = 'your/start/directory'
req_files = glob.glob(os.path.join(base_dir, '**/*.xml'), recursive=True)
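Since the question mentions os.walk stopping after the first directory, here is a minimal sketch of the same search written with os.walk. The function name find_xml_with_walk is made up for illustration. I can't tell from the question why the original loop stopped; one common cause is a return or break sitting inside the loop body, so the sketch only returns after the walk has finished.

```python
import os

def find_xml_with_walk(base_dir):
    """Return the full path of every .xml file under base_dir, at any depth."""
    matches = []
    # os.walk yields one (dirpath, dirnames, filenames) tuple per directory,
    # so this loop runs once for base_dir and once for each folder below it.
    for dirpath, dirnames, filenames in os.walk(base_dir):
        for filename in filenames:
            if filename.lower().endswith(".xml"):
                matches.append(os.path.join(dirpath, filename))
    # Return only after the whole tree has been visited.
    return matches
```

The directory is passed in as an argument, so it only affects this one call and nothing else in the program.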

Jeril and Eduardo, thank you for the help. I took a shot at pathlib and it worked. I don't know what was wrong with my glob code; it looked basically the same as yours, Jeril:
from pathlib import Path
filelist = []
for path in Path(r'B:\CTImageDataset\LIDC-IDRI').rglob('*.xml'):
    filelist.append(path.name)
print(filelist)
Worked great, thanks again.
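One thing to watch with the snippet above: path.name keeps only the file name, so two files with the same name in different sub-directories look identical in the list, and the files cannot be reopened from it later. A minimal sketch that keeps the full path instead (collect_xml_paths is a hypothetical helper name, not something from the thread):

```python
from pathlib import Path

def collect_xml_paths(root):
    """Return the sorted full paths (as strings) of all .xml files below root."""
    # rglob("*.xml") searches root and every sub-directory beneath it.
    return sorted(str(p) for p in Path(root).rglob("*.xml"))
```

Calling it with the dataset folder from the post, e.g. collect_xml_paths(r'B:\CTImageDataset\LIDC-IDRI'), should give paths that can be passed straight to an XML parser.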

Related

How are paths meant to be denoted for Biopython on Mac?

I am trying to run a basic Biopython script to rename sequences within a FASTA file. I have only ever run this on a server; I am now trying to do it on my MacBook, but I can't work out what the correct path to the file should be.
On the server it worked as follows:
original_file = r"/home/ggb_myname/Documents/Viromex/Viromex.contigs.fa"
I am trying to do the same thing on my Mac with
original_file = r"/Users/u2188165/Documents/Home/Post-qiime/dna-sequences.fasta"
and it returns the error
FileNotFoundError: [Errno 2] No such file or directory: '/Users/u2188165/Documents/Home/Post-qiime/dna-sequences.fasta'
I know this is probably basic, but I can't find the correct way to write the path, either on my own or online.
Try using libraries like pathlib and os. They make your code more modular and OS-independent.
from pathlib import Path
import os
dir_path = "/Users/u2188165/Documents/Home/Post-qiime"
file_name = "dna-sequences.fasta"
full_path = os.path.join(str(Path(dir_path)), file_name)
Or try a drill-down approach for more versatility. Note the leading "/" entry; without it the result would be a relative path.
from pathlib import Path
path_drill = ["/", "Users", "u2188165", "Documents", "Home", "Post-qiime"]
file_name = "dna-sequences.fasta"
full_path = str(Path(*path_drill, file_name))
How and where you store this is up to your imagination and requirements.
Happy coding!
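The path syntax in the question already looks valid for macOS, so the error more likely means that some folder or file name along the way does not exist exactly as spelled. A small sketch for finding out which one, assuming nothing beyond the standard library (first_missing_part is a made-up helper name):

```python
from pathlib import Path

def first_missing_part(path):
    """Return the shortest prefix of path that does not exist, or None if all of it exists."""
    p = Path(path)
    # p.parents runs from the immediate parent up to the root, so reverse it
    # to check from the root downwards, and check the full path last.
    for candidate in list(p.parents)[::-1] + [p]:
        if not candidate.exists():
            return candidate
    return None
```

Running it on the path from the question would show the first component that is missing, which is usually where the typo or the wrong folder name is.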

Is there a way to use a variable path with os?

The goal is to run through a path that is half fixed and half variable.
I am trying to run through a path (down to the lowest folder, which is called Archive) and fill a list with files that have a certain ending. This works quite well for a fixed path such as this:
fileInPath='\\server123456789\provider\COUNTRY\CATEGORY\Archive'
My code runs through the path recursively and lists all files that have a certain ending. This works well. For simplicity I will just print the file name in the following code.
import csv
import os
fileInPath = '\\\\server123456789\\provider\\COUNTRY\\CATEGORY\\Archive'
fileOutPath = 'some path'
csvSeparator = ';'
fileList = []
for subdir, dirs, files in os.walk(fileInPath):
    for file in files:
        if file[-3:].upper() == 'PAR':
            print(file)
The problem is that I can't manage to make country and category variable, e.g. by using *.
The standard library module pathlib provides a simple way to do this.
Your file list can be obtained with
from pathlib import Path
list(Path("//server123456789/provider/").glob("*/*/Archive/*.PAR"))
Note that I'm using / instead of \\; pathlib handles the conversion for you on Windows.
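The one-liner above only looks at files directly inside each Archive folder and, depending on the platform, may match the extension case-sensitively. If the search should also descend below Archive, as the os.walk version in the question does, something like this sketch might work (find_archived and its parameters are names I made up):

```python
from pathlib import Path

def find_archived(base, suffix=".par"):
    """List files under <base>/<country>/<category>/Archive that end with suffix."""
    hits = []
    # "*/*/Archive" matches exactly two variable levels above Archive.
    for archive_dir in Path(base).glob("*/*/Archive"):
        # rglob("*") walks Archive and everything below it.
        for p in archive_dir.rglob("*"):
            # Compare suffixes in lower case so .PAR and .par both match.
            if p.is_file() and p.suffix.lower() == suffix.lower():
                hits.append(p)
    return sorted(hits)
```

With the server from the question this would be called as find_archived("//server123456789/provider/").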

How to import using a path that is a variable in Python

I am trying to make a program that will go through and visit an array of directories and run a program and create a file inside.
I have everything working except that I need to figure out a way to import from a new path each time to get to a new directory.
For example:
L = ["directory1", "directory2", "directory3"]
for i in range(len(L)):
    # I know this is wrong, but just to give an idea
    myPath = "parent." + L[i]
    from myPath import file
    # make file... etc.
Obviously, when I use myPath as a variable for the path to import, I get an error. I have tried several different approaches found on Stack Overflow and in the os and sys documentation, but none of them worked.
You can use the 'imp' module to load the source code of Python scripts. Note that imp is deprecated and was removed in Python 3.12, where importlib is the replacement.
import imp
import os
root_dir = '/root/'
dirs = ["directory1", "directory2", "directory3"]
for _dir in dirs:
    module_path = os.path.join(root_dir, _dir, 'module.py')
    mod = imp.load_source("module_name", module_path)
    # now you can call functions in the regular way, like mod.some_func()
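A rough equivalent using importlib, which is the standard-library route on current Python versions. This is only a sketch: load_module_from_path is a helper name I made up, and 'module.py' and some_func are the placeholders from the answer above.

```python
import importlib.util

def load_module_from_path(module_name, file_path):
    """Load the Python file at file_path and return it as a module object."""
    spec = importlib.util.spec_from_file_location(module_name, file_path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {file_path!r}")
    module = importlib.util.module_from_spec(spec)
    # Executing the module runs its top-level code, just like a normal import.
    spec.loader.exec_module(module)
    return module
```

Inside the loop it would be used as mod = load_module_from_path("module_name", module_path), followed by mod.some_func().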
I want to create a text file inside each directory. To do this I must
cycle through my array and take each directory name so I can visit it.
import is for loading external modules, not for creating new files. If creating new files is what you want, use the built-in open function and open the not-yet-existing file in 'w' mode. Note: the directory must already exist.
from os.path import join
L = ["directory1", "directory2", "directory3"]
for d in L:  # loop through the directories
    with open(join(d, "filename.txt"), "w") as file:
        pass  # or do stuff with the newly created file
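If the directories might not exist yet, they can be created first. A minimal sketch, assuming the directories live under one parent folder (write_file_in_each and its parameters are hypothetical names):

```python
import os

def write_file_in_each(parent, dir_names, filename="filename.txt", text=""):
    """Create parent/<dir>/filename for every name in dir_names; return the paths written."""
    written = []
    for name in dir_names:
        target_dir = os.path.join(parent, name)
        # exist_ok=True makes this a no-op when the directory is already there.
        os.makedirs(target_dir, exist_ok=True)
        target = os.path.join(target_dir, filename)
        with open(target, "w") as fh:
            fh.write(text)
        written.append(target)
    return written
```

For the list in the question this would be write_file_in_each("parent", ["directory1", "directory2", "directory3"]).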

How do I make the current folder path to work for my command

I'm new to Python and really want this command to work. I have been looking around on Google but still can't find a solution. I'm trying to make a script that deletes a folder inside the folder my Blender game is in, so I have been trying these commands:
import shutil
from bge import logic
path = bge.logic.expandPath("//")
shutil.rmtree.path+("/killme") # remove dir and all contains
The folder I want to delete is called "killme". I know you can just do shutil.rmtree(path), but I want the path to start at the folder the game is in, not the full C:/programs/blabla/blabla/test/killme path.
I'd be happy if someone could explain.
I think you are using the shutil.rmtree command the wrong way. You can use the following:
shutil.rmtree(path+"/killme")
See the reference: https://docs.python.org/3/library/shutil.html#shutil.rmtree
Syntax: shutil.rmtree(path, ignore_errors=False, onerror=None)
Assuming that your current project directory is 'test', your code will look like the following:
import os
import shutil
from bge import logic
path = os.getcwd()  # C:/programs/blabla/blabla/test
shutil.rmtree(os.path.join(path, "killme"))  # remove the directory and all its contents
NOTE: It will fail if the files in the folder are read-only.
Hope it helps!
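Because rmtree deletes without asking, it may be worth wrapping it in a small guard. A sketch under the assumption that the folder to delete always sits inside a known base folder (remove_subfolder is a made-up name). In the Blender Game Engine the base would presumably be logic.expandPath("//") from the question; I have not tested it there.

```python
import shutil
from pathlib import Path

def remove_subfolder(base_dir, name):
    """Delete base_dir/name and its contents; return True if something was removed."""
    base = Path(base_dir).resolve()
    target = (base / name).resolve()
    # Refuse to touch anything that is not strictly inside base_dir.
    if base not in target.parents:
        raise ValueError(f"{target} is not inside {base}")
    if not target.is_dir():
        return False
    shutil.rmtree(target)
    return True
```

For the question this would be remove_subfolder(path, "killme"), with path being the game folder.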
What you could do is set a base path like
basePath = "/bla_bla/"
and then append the path and use something like:
shutil.rmtree(basePath+yourGamePath)
If you are running the Python file as a standalone script from inside the desired folder, you can do the following:
#!/usr/bin/env python
import os
import shutil
cwd = os.getcwd()
shutil.rmtree(cwd)  # careful: this removes the current working directory itself and everything in it
Hope my answer was helpful.
The best thing you could do is use the os library.
With os.walk you can list all the directories and file names below a starting folder, and then delete or modify the folders you need by name.
import os
for root, dirnames, files in os.walk("issues"):
    for name in dirnames:
        for filename in files:
            pass  # what you want

How to rapidly switch from one directory to another Python

I have a huge list of images in one directory and a corresponding list of annotations (.txt files) in another.
I need to perform an operation on each image according to its matching annotation and save the result into a third directory. Is there an elegant way to avoid calling chdir three times at each step?
Maybe using cPickle or some other library meant for fast file management?
import glob
from PIL import Image
os.chdir('path_images')
list_im=glob.glob('*.jpg')
list_im.sort()
list_im=path_images+list_im
os.chdir('path_txt')
list_annot=glob.glob('*.txt')
list_annot.sort()
list_annot=path_txt+list_im
for i in range(0,len(list_images)):
    # Joel pointed out that the os operations are not mandatory if you include the path in the name
    #os.chdir('path_images')
    im=Image.open(list_im[i])
    #os.chdir('path_text')
    action_on_image(im,list_annot[i])
    #os.chdir('path_to_save_image')
    im.save(path_to_save+nom_image)
I am a true beginner in Python but I am confident that my code is super inefficient and can be improved.
You don't have to chdir (and FWIW you really don't want to depend on the current working directory). Use absolute paths everywhere in your code and you'll be fine.
import os
import glob
from PIL import Image
abs_images_path = "<absolute path to your images directory here>"
abs_txt_path = "<absolute path to your txt directory here>"
abs_dest_path = "<absolute path to where you want to save your images>"
list_im = sorted(glob.glob(os.path.join(abs_images_path, '*.jpg')))
list_annot = sorted(glob.glob(os.path.join(abs_txt_path, '*.txt')))
for im_path, txt_path in zip(list_im, list_annot):
    im = Image.open(im_path)
    action_on_image(im, txt_path)
    # save under the original file name in the destination directory
    im.save(os.path.join(abs_dest_path, os.path.basename(im_path)))
Note that if your paths are relative to where your script is installed, you can get the script's directory path with os.path.dirname(os.path.abspath(__file__))
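One caveat with zipping two sorted lists: if a single annotation is missing, every pair after it is silently shifted. A sketch that matches by file name instead, under the assumption (not stated in the question) that an image and its annotation share the same base name, e.g. cat01.jpg and cat01.txt. pair_by_stem is a hypothetical helper.

```python
from pathlib import Path

def pair_by_stem(images_dir, annot_dir, image_ext=".jpg", annot_ext=".txt"):
    """Return (image_path, annotation_path) pairs that share a base file name."""
    # Index the annotations by name without extension, e.g. "cat01".
    annots = {p.stem: p for p in Path(annot_dir).glob(f"*{annot_ext}")}
    pairs = []
    for image in sorted(Path(images_dir).glob(f"*{image_ext}")):
        annot = annots.get(image.stem)
        if annot is not None:  # skip images that have no annotation file
            pairs.append((image, annot))
    return pairs
```

The loop in the answer above could then iterate over pair_by_stem(abs_images_path, abs_txt_path) instead of zip(list_im, list_annot).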
