Creating subfolders and storing specified files/images in them - Python

During one of my projects, I faced this challenge: there is a folder named Project, and inside it there are multiple images (say 100 images), each named sequentially: the first image is img_0, the second is img_1, and so on up to img_99.
Now, based on some condition, I need to separate out certain images, say img_5, img_10, img_30, img_61, and img_88. My question is: is there a way to filter out these images, create a folder inside Project named "the odd ones", and store the specified images there?
One more thing would help in my case. Suppose I have hundreds of such project folders named sequentially Projects_1, Projects_2, Projects_3, ..., Projects_99, each containing hundreds of pictures. Is it possible to separate out the specified photos and store them in a separate subfolder inside each Projects_n folder, assuming the set of photos to separate out is the same for every Projects_n folder?
Please help me with this. Thank you!

For the first problem you can start from the code below (you have to fill in the target-checking function yourself). For the second problem you should provide more details; a rough sketch is included after the code below.
from glob import glob
import itertools
import shutil
import os

# Function to check whether a filename is a target file that has to be moved;
# replace the ... with your actual condition, e.g. a membership test against a
# set of wanted names:
def is_target(filename):
    if ...:
        return True
    else:
        return False

dirname = "some/path/to/project"

# Collect all files in the directory that could be moved, based on extension:
types = ('*.png', '*.jpeg')
filepaths = list(itertools.chain(*[glob(os.path.join(dirname, t)) for t in types]))

# Find the files to move:
filepaths_to_move = []
for filepath in filepaths:
    if is_target(os.path.basename(filepath)):
        filepaths_to_move.append(filepath)

# Create the new subfolder:
new_folder_name = "odd_images"
new_dir = os.path.join(dirname, new_folder_name)
if not os.path.exists(new_dir):
    os.makedirs(new_dir)

# Move the files into the subfolder:
for filepath in filepaths_to_move:
    basename = os.path.basename(filepath)
    shutil.move(filepath, os.path.join(new_dir, basename))
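For the second part of the question (many Projects_n folders), a minimal sketch along these lines could work, assuming the project folders all sit under one parent directory and the same fixed set of image names has to be moved in each one; the parent path and the file names in targets below are placeholders, not taken from the question.
import os
import shutil

parent_dir = "some/path/to/all/projects"   # assumed parent of Projects_1, Projects_2, ...
# assumed fixed set of image names to pull out of every project folder:
targets = {"img_5.png", "img_10.png", "img_30.png", "img_61.png", "img_88.png"}

for entry in sorted(os.listdir(parent_dir)):
    project_dir = os.path.join(parent_dir, entry)
    if not (os.path.isdir(project_dir) and entry.startswith("Projects_")):
        continue
    new_dir = os.path.join(project_dir, "odd_images")
    os.makedirs(new_dir, exist_ok=True)
    for filename in os.listdir(project_dir):
        if filename in targets:
            shutil.move(os.path.join(project_dir, filename),
                        os.path.join(new_dir, filename))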

Here is the basic logic; make the necessary improvements for your use case:
import os
import shutil

project_dir = "project_dir"
move_to_dir = os.path.join(project_dir, "move_to_dir")
files = [os.path.join(project_dir, file) for file in os.listdir(project_dir)]
filenames_to_filter = {"test1.txt", "test2.txt"}

if not os.path.exists(move_to_dir):
    os.makedirs(move_to_dir)

for file in files:
    if os.path.basename(file) in filenames_to_filter:
        shutil.move(file, move_to_dir)

Related

Move files one by one to newly created directories for each file with Python 3

What I have is an initial directory with a file inside D:\BBS\file.x and multiple .txt files in the work directory D:\
What I am trying to do is copy the folder BBS with its content, incrementing its name by a number, and then copy/move each existing .txt file into its own newly created directory, producing \BBS1, \BBS2, ..., \BBSn (depending on the number of .txt files).
Visual example of the Before and After:
Initial view of the \WorkFolder
Desired view of the \WorkFolder
Right now I have only managed to create a new directory and move the .txt files into it all at once, not one by one as I would like. Here's my code:
from pathlib import Path
from shutil import copy
import shutil
import os

wkDir = Path.cwd()
src = wkDir.joinpath('BBS')
count = 0

for content in src.iterdir():
    addname = src.name.split('_')[0]
    out_folder = wkDir.joinpath(f'!{addname}')
    out_folder.mkdir(exist_ok=True)
    out_path = out_folder.joinpath(content.name)
    copy(content, out_path)

files = os.listdir(wkDir)
for f in files:
    if f.endswith(".txt"):
        shutil.move(f, out_folder)
I would appreciate assistance with incrementing the folder name and copying the files one by one into a newly created directory for each, as described.
I don't have much Python experience in general. Python 3, Windows.
Thanks in advance
Now, I understand what you want to accomplish. I think you can do it quite easily by iterating only over the text files and copying the BBS folder for each one, then moving the file you are currently at into the new copy. To get folder_num, you may be able to just access the file name's characters at particular indexes (e.g. f[4:6]) if the name always follows the pattern TextXX.txt. If the prefix "Text" may vary, it is more stable to use regular expressions, as in the following sample.
Also, the function shutil.copytree copies a directory with its children.
import os
import re
import shutil
from pathlib import Path

wkDir = Path.cwd()
src = wkDir.joinpath('BBS')

for f in os.listdir(wkDir):
    if f.endswith(".txt"):
        # extract the number from the file name, e.g. "Text12.txt" -> "12"
        folder_num = re.findall(r"\d+", f)[0]
        target = wkDir.joinpath(f"{src.name}{folder_num}")
        # copy BBS
        shutil.copytree(src, target)
        # move .txt file
        shutil.move(f, target)

Ignoring folders when organizing files

I am fairly new to python, and trying to write a program that organizes files based on their extensions
import os
import shutil

newpath1 = r'C:\Users\User1\Documents\Downloads\Images'
if not os.path.exists(newpath1):  # check to see if they already exist
    os.makedirs(newpath1)
newpath2 = r'C:\Users\User1\Documents\Downloads\Documents'
if not os.path.exists(newpath2):
    os.makedirs(newpath2)
newpath3 = r'C:\Users\User1\Documents\Downloads\Else'
if not os.path.exists(newpath3):
    os.makedirs(newpath3)

source_folder = r"C:\Users\User1\Documents\Downloads"  # the location of the files we want to move
files = os.listdir(source_folder)

for file in files:
    if file.endswith(('.JPG', '.png', '.jpg')):
        shutil.move(os.path.join(source_folder, file), os.path.join(newpath1, file))
    elif file.endswith(('.pdf', '.pptx')):
        shutil.move(os.path.join(source_folder, file), os.path.join(newpath2, file))
    # elif file is folder:
    #     do nothing
    else:
        shutil.move(os.path.join(source_folder, file), os.path.join(newpath3, file))
I want it to move files based on their extensions. However, I am trying to figure out how to stop the folders from moving. Any help would be greatly appreciated.
Also, for some reason, not every file is being moved, even though they have the same extension.
As with most path operations, I recommend using the pathlib module. Pathlib has been available since Python 3.4 and offers a portable (multi-platform), high-level API for file system operations.
I recommend using the following methods on Path objects, to determine their type:
Path.is_file()
Path.is_dir()
import shutil
from pathlib import Path

# Using a class for nicer grouping of the target directories.
# Note that pathlib.Path enables Unix-like path construction, even on Windows.
class TargetPaths:
    IMAGES = Path.home().joinpath("Documents/Downloads/Images")
    DOCUMENTS = Path.home().joinpath("Documents/Downloads/Documents")
    OTHER = Path.home().joinpath("Documents/Downloads/Else")
    __ALL__ = (IMAGES, DOCUMENTS, OTHER)

for target_dir in TargetPaths.__ALL__:
    if not target_dir.is_dir():
        target_dir.mkdir(exist_ok=True)

source_folder = Path.home().joinpath("Documents/Downloads")  # the location of the files we want to move

# Get absolute paths to the files in source_folder.
# files is a generator (only usable once).
files = (path.absolute() for path in source_folder.iterdir() if path.is_file())

def move(source_path, target_dir):
    shutil.move(str(source_path), str(target_dir.joinpath(source_path.name)))

for path in files:
    if path.suffix in ('.JPG', '.png', '.jpg'):
        move(path, TargetPaths.IMAGES)
    elif path.suffix in ('.pdf', '.pptx'):
        move(path, TargetPaths.DOCUMENTS)
    else:
        move(path, TargetPaths.OTHER)
See the os.walk function in the standard library.
In particular, os.walk walks the directory tree and yields a 3-tuple (dirpath, dirnames, filenames) for each directory it visits.
In your case, you can use [x[0] for x in os.walk(dirname)] to get the list of directory paths, so you know which entries to skip when moving files.
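For illustration, a small sketch of how os.walk could be combined with the move logic from the question, assuming the same Downloads layout; this is an illustration of the idea rather than the answerer's code, and only top-level files are touched because the filenames list from os.walk never contains directories.
import os
import shutil

source_folder = r"C:\Users\User1\Documents\Downloads"
images_dir = os.path.join(source_folder, "Images")
os.makedirs(images_dir, exist_ok=True)

# os.walk yields (dirpath, dirnames, filenames); filenames never contains
# directories, so only regular files get moved.
dirpath, dirnames, filenames = next(os.walk(source_folder))
for name in filenames:
    if name.lower().endswith((".jpg", ".png")):
        shutil.move(os.path.join(dirpath, name), os.path.join(images_dir, name))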

Looping through multiple CSV-files in different folders and producing multiple outputs and put these in the same folder based on input

Problem: I have around 100 folders in one main folder, each with a CSV file that has the exact same name and structure, for example input.csv. I have written a Python script for one of the files which takes the CSV file as input and gives two images as output in the same folder. I want to produce these images and put them in every folder, given the input per folder.
Is there a fast way to do this? Until now I have copied my script into each folder and executed it there. For 5 folders that was alright, but for 100 this gets tedious and takes a lot of time.
Can someone please help me out? I'm very new to coding with respect to directories, paths, files, etc. I have already tried to look for a solution, but no success so far.
You could try something like this:
import os
import pandas as pd

path = r'path\to\folders'
filename = 'input'

# get all directories within the top level directory
dirs = [os.path.join(path, val) for val in os.listdir(path) if os.path.isdir(os.path.join(path, val))]

# loop through each directory
for dd in dirs:
    file_ = [f for f in os.listdir(dd) if f.endswith('.csv') and filename in f]
    if file_:
        # this assumes only a single file matching 'input*.csv' exists in each directory
        file_ = file_[0]
    else:
        continue
    data = pd.read_csv(os.path.join(dd, file_))
    # apply your script here and save the images back to folder dd
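As a concrete stand-in for the "apply your script here" step, here is a hedged sketch assuming the per-folder script produces two matplotlib figures; the plot types and output file names below are placeholders, not taken from the question.
import os
import matplotlib.pyplot as plt
import pandas as pd

def make_images(data: pd.DataFrame, folder: str) -> None:
    # First figure: a line plot of all columns (placeholder for the real script).
    ax = data.plot()
    ax.figure.savefig(os.path.join(folder, "output_1.png"))
    plt.close(ax.figure)

    # Second figure: a box plot of all columns (placeholder for the real script).
    ax = data.plot(kind="box")
    ax.figure.savefig(os.path.join(folder, "output_2.png"))
    plt.close(ax.figure)

# inside the loop above, after data = pd.read_csv(...):
# make_images(data, dd)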

How to check if multiple files exist in different directories

I know how to use python to check to see if a file exists, but what I am after is trying to see if multiple files of the same name exist throughout my working directory. Take for instance:
gamedata/areas/
# i have 2 folders in this directory
# testarea and homeplace
1. gamedata/areas/testarea/
2. gamedata/areas/homeplace/
Each of the folders, homeplace and testarea, contains a file called 'example'.
Is there a pythonic way to use 'os' or similar to check whether the file 'example' can be found in both testarea and homeplace?
Also, is there a way to do this without manually and statically using
os.path.isfile()
because throughout the life of the program new directories will be made, and I don't want to constantly go back into the code to change it.
You can check every directory below gamedata/areas/:
This only goes down one level, you could extend it to go down as many levels as you want.
from os import listdir
from os.path import isdir, isfile, join

base_path = "gamedata/areas/"
files = listdir(base_path)
only_directories = [path for path in files if isdir(join(base_path, path))]

for directory_path in only_directories:
    dir_path = join(base_path, directory_path)
    for file_path in listdir(dir_path):
        full_file_path = join(dir_path, file_path)  # dir_path already includes base_path
        is_file = isfile(full_file_path)
        is_example = "example" in file_path
        if is_file and is_example:
            print("Found one!")
Hope it helps!
Maybe something like
places = ["testarea", "homeplace"]
if not all(os.path.isfile(os.path.join("gamedata/areas/", x, "example")) for x in places):
    print("Missing example")
If the check fails, this doesn't tell you which subdirectory is missing example, though. You can update places as necessary.
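If you also want to know which subdirectories lack the file, a small extension of the same idea could look like this; it discovers the subdirectories dynamically instead of hard-coding places, which goes slightly beyond the original snippet.
import os

base = "gamedata/areas"
# discover the subdirectories dynamically instead of hard-coding them
places = [d for d in os.listdir(base) if os.path.isdir(os.path.join(base, d))]

missing = [d for d in places if not os.path.isfile(os.path.join(base, d, "example"))]
if missing:
    print("Missing example in:", ", ".join(missing))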
As I mentioned in the comments, os.walk is your friend:
import os

ROOT = "gamedata/areas"
in_dirs = [path for (path, dirs, filenames) in os.walk(ROOT)
           if 'example' in filenames]
in_dirs will be a list of subdirectories where example is found

Zipped files have extra unwanted folders

I have been having an issue with the zipfile.ZipFile() function. It zips my files properly, but the output zip file contains extra folders that I do not want. It does put all my desired files in the .zip, but it seems to add the last few directories from the paths of the files being written by default. Is there any way to exclude these folders? Here is my code:
import arcpy, os
from os import path as p
import zipfile

arcpy.overwriteOutput = True

def ZipShapes(path, out_path):
    arcpy.env.workspace = path
    shapes = arcpy.ListFeatureClasses()

    # iterate through list of shapefiles
    for shape in shapes:
        name = p.splitext(shape)[0]
        print name
        zip_path = p.join(out_path, name + '.zip')
        zip = zipfile.ZipFile(zip_path, 'w')
        zip.write(p.join(path, shape))
        for f in arcpy.ListFiles('%s*' % name):
            if not f.endswith('.shp'):
                zip.write(p.join(path, f))
        print 'All files written to %s' % zip_path
        zip.close()

if __name__ == '__main__':
    path = r'C:\Shape_test\Census_CedarCo'
    out_path = r'C:\Shape_outputs'
    ZipShapes(path, out_path)
I tried to post some pictures but I do not have enough reputation points. Basically it is adding 2 extra folders (empty) inside the zip file. So instead of the files being inside the zip like this:
C:\Shape_outputs\Public_Buildings.zip\Public_Buildings.shp
They are showing up like this:
C:\Shape_outputs\Public_Buildings.zip\Shape_test\Census_CedarCo\Public_Buildings.shp
The "Shape_test" and "Census_CedarCo" folders are the directories that the shapefiles I am trying to copy come from, but if I am just writing these files why are the sub directories also being copied into the zip file? I suppose it is not a huge deal since I am getting the files zipped, but it is more of an annoyance than anything.
I assumed that when creating a zip file it would just write the files I specify themselves. Why does it add these extra directories inside the zip file? Is there a way around it? Am I missing something here? I appreciate any input! Thanks
The optional second parameter in ZipFile.write(filename[, arcname[, compress_type]]) is the name used in the archive file. You can strip the offending folders from the front of the path and use the remainder as the archive path name. I'm not sure exactly how arcpy gives you the paths, but something like zip.write(p.join(path, shape), shape) should do it.
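A minimal self-contained sketch of the arcname idea using only the standard zipfile module; the component file names below are placeholders, and in the real script the loop over arcpy.ListFiles stays the same with only the second argument to write changing.
import os
import zipfile

folder = r"C:\Shape_test\Census_CedarCo"             # source folder from the question
zip_path = r"C:\Shape_outputs\Public_Buildings.zip"  # output zip from the question

with zipfile.ZipFile(zip_path, "w") as zf:
    # Placeholder component files of one shapefile:
    for name in ("Public_Buildings.shp", "Public_Buildings.dbf", "Public_Buildings.shx"):
        full_path = os.path.join(folder, name)
        # arcname controls the path stored inside the archive; passing only the
        # base name drops the C:\Shape_test\Census_CedarCo\ prefix.
        zf.write(full_path, arcname=os.path.basename(full_path))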
