Batch modify file name and add ascending numbers - python

I have a bunch of TIFF image files ordered by date. I need to rename them using either python, or terminal commands. The file names are structured like this:
basename_unnecessary_x.tif
where:
basename = is part of the original filename I need to keep (16 characters long)
unnecessary = part of the original filename I want to discard (14 characters long)
x = ascending numbers I need to add. Starting at 0 and going up in steps of 250 for every subsequent file.
I know there are plenty of questions on batch renaming and adding ascending numbers to file names but I haven't found anything that keeps part of the original filename and deletes another portion and adds ascending numbers. Any help would be appreciated.
Thanks!

First, write the names of all the files you want to rename into a text file.
To list every .tif file in the directory, run the command below, which redirects the listing to cp1.txt:
dir /a:-D /b *.tif >cp1.txt
Then use the script below, which renames basename_unnecessary_x.tif to basename_0.tif, basename_250.tif, and so on.
@echo off
CD %CD%\<Folderpath in which .tif files should be renamed>
setlocal enabledelayedexpansion
set /a count=0
echo --------Script started -------------------------
echo.
for /f "tokens=*" %%a in (cp1.txt) do (
echo original file name %%a
echo ------------------------------------------
for /f "tokens=1 delims=_" %%b in ("%%a") do (
echo file will be renamed to %%b_!count!.tif
echo ------------------------------------------
rename "%%a" "%%b_!count!.tif"
set /a count+=250
)
)
echo.
echo --------Script Completed -------------------------
Changes to the script:
The dir command now lists only .tif files in cp1.txt.
You can run the script from any location, provided you update the path in the CD line.
The numbering now follows the sequence 0, 250, 500, and so on.
FYI, the first file was previously numbered 250 even though the counter was initialized to zero, because the counter was increased by 250 before it was used in the rename command.
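Since the question also allows Python, here is a minimal sketch of the same rename. It assumes that sorting the file names alphabetically reproduces the date order, and that the part to keep is always the first 16 characters (as stated in the question) rather than the text before the first underscore. The names plan_renames and rename_all are made up for this example.

```python
from pathlib import Path


def plan_renames(names, keep=16, step=250, ext=".tif"):
    """Pair each old name with a new one: first `keep` characters + counter."""
    plan = []
    # Assumption: alphabetical order of the names matches the date order.
    for index, old in enumerate(sorted(names)):
        new = f"{old[:keep]}_{index * step}{ext}"
        plan.append((old, new))
    return plan


def rename_all(folder, dry_run=True):
    """Rename every .tif in `folder`; with dry_run=True only print the plan."""
    folder = Path(folder)
    names = [p.name for p in folder.glob("*.tif")]
    for old, new in plan_renames(names):
        print(f"{old} -> {new}")
        if not dry_run:
            (folder / old).rename(folder / new)
```

Run rename_all("path/to/folder") first to check the printed plan, then again with dry_run=False to actually rename.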

Related

How do I check multiple folders and delete any files with unique file names?

I'm capturing images of widgets off of multiple cameras on an inspection system. If the inspection is unsuccessful, the image doesn't get saved. The images are named with the widget's serial number.
So my folder structure might look like
Camera1
1.tif
2.tif
4.tif
Camera2
2.tif
3.tif
4.tif
Camera3
1.tif
2.tif
3.tif
4.tif
I want to be able to delete images that don't have a match in all three folders. I don't mind running the solution twice, once between camera1 and camera2, and then again using camera2 and camera 3.
I'm hoping to only be left with the following folder structure.
Camera1
2.tif
4.tif
Camera2
2.tif
4.tif
Camera3
2.tif
4.tif
There are ~12,000 files in each folder for analysis and probably 2%-3% erroneous which need to be removed to continue analysis.
I don't mind prepackaged solutions requiring payment, python, command line, etc.
Thanks much!
As suggested in the comments, next time you ask something on SO, have a shot at it yourself first and ask about any problems you run into; you learn more that way.
Here's a start. As suggested, the code below builds three sets from the contents of the folders, determines the intersection of those sets, and then subtracts that intersection from each original set. The result tells you exactly which files you need to remove from each folder:
from pathlib import Path

def find_unmatched(dirs):
    # list the (file) contents of the folders
    contents = {}
    for d in dirs:
        contents[d] = set(str(n.name) for n in Path(d).glob('*') if n.is_file())
    # decide what the folders have in common
    all_files = list(contents.values())
    common = all_files[0]
    for d_contents in all_files[1:]:
        common = common.intersection(d_contents)
    # create a dictionary that tells you what to remove
    return {d: files - common for d, files in contents.items()}

to_remove = find_unmatched(['photos/Camera1', 'photos/Camera2', 'photos/Camera3'])
print(to_remove)
Result (given the folders in your example sit in a folder called photos):
{'photos/Camera1': {'1.tif'}, 'photos/Camera2': {'3.tif'}, 'photos/Camera3': {'1.tif', '3.tif'}}
Actually removing the files is some code you can probably figure out yourself.
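For completeness, the removal step could look roughly like the sketch below. It takes a dictionary shaped like the result above (folder path mapped to a set of file names); the function name remove_unmatched and the dry_run flag are made up for this example, and nothing is deleted unless dry_run is set to False.

```python
from pathlib import Path


def remove_unmatched(to_remove, dry_run=True):
    """Delete the listed files; return the paths that were (or would be) deleted."""
    deleted = []
    for folder, names in to_remove.items():
        for name in sorted(names):
            path = Path(folder) / name
            if path.is_file():
                if not dry_run:
                    path.unlink()  # permanent: the file does not go to a recycle bin
                deleted.append(path)
    return deleted
```

Calling it once with the default dry_run=True and inspecting the returned list is a cheap safety check before deleting anything out of 12,000 images.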
As said before, you should make your own effort to solve the problem and only ask for help when you get stuck. However, I have some spare time now, so I wrote a complete Batch solution:
@echo off
setlocal EnableDelayedExpansion
rem Process files in Camera1 folder and populate "F" array elements = 1
cd Camera1
for %%a in (*.tif) do set "F[%%~Na]=1"
rem Process files in Camera2 and *accumulate* files to "F" array
cd ..\Camera2
for %%a in (*.tif) do set /A "F[%%~Na]+=1"
rem Process files in Camera3 and accumulate files to "F" array
rem if counter == 3 then file is OK: remove "F" element
rem else: delete file
rem if counter == 1: remove "F" element
cd ..\Camera3
for %%a in (*.tif) do (
set /A "F[%%~Na]+=1"
if !F[%%~Na]! equ 3 (
set "F[%%~Na]="
) else (
del %%a
if !F[%%~Na]! equ 1 set "F[%%~Na]="
)
)
rem Remove files of "F" array in both Camera1 and Camera2 folders, ignoring error messages
cd ..
(for /F "tokens=2 delims=[]" %%a in ('set F[') do (
del Camera1\%%a.tif
del Camera2\%%a.tif
)) 2>nul
Please, report the result...
Perhaps not the fastest, but quite a simple method:
@echo off
rem // Change into root directory:
pushd "%~dp0." && (
rem // Outer loop through target directories:
for /D %%J in ("Camera?") do (
rem // Create temporary file with matching contents of current directory:
dir /B /A:-D-H-S "%%~J\*.tif" > "%TEMP%\%%~nxJ.log"
rem // Inner loop through target directories:
for /D %%I in ("Camera?") do (
rem // Avoid comparing current directory with itself:
if /I not "%%~I"=="%%~J" (
rem /* List these files inside of the directory of the inner loop where no
rem respective files inside of the directory of the outer loop are found: */
for /F "delims= eol=|" %%K in ('
dir /B /A:-D-H-S "%%~I\*.tif" ^| findstr /L /I /V /G:"%TEMP%\%%~nxJ.log"
') do (
rem // Actually delete current file:
ECHO del "%%~I\%%K"
)
)
)
rem // Delete temporary file:
del "%TEMP%\%%~nxJ.log"
)
rem // Return from root directory:
popd
)
exit /B
The key is two nested loops over the target directories in order to compare each one with each other, and the findstr command used to filter out files from one directory that do not exist in the other one.
After having tested for the correct output, remove the upper-case ECHO command!
This new method is based on text files, so it should run faster than the environment variables method. The heavy task of searching for missing names in 12,000-line files (6 times!) is performed by the findstr command.
This method is also simpler and allows matching more than 3 folders.
@echo off
setlocal EnableDelayedExpansion
rem Get a list of directories and create temp files with their contents
set "directories="
for /D %%d in (Camera?) do (
set "directories=!directories! %%d"
dir /B "%%d\*.tif" > %%d.txt
)
rem Process the directories "d"
for %%d in (%directories%) do (
rem Compare this directory "d" vs. the others "D"
for %%D in (!directories:%%d=!) do (
rem Remove files in this "d" that do not exist in the other "D"
(for /F %%f in ('findstr /V /G:%%D.txt %%d.txt') do del "%%d\%%f") 2>nul
)
)
rem Delete the temporary listing files
for %%d in (%directories%) do del %%d.txt

How to move Header and Trailer from files to another file?

I have around 100 text files in a folder, each with close to a thousand records. I want to copy the header and trailer of each file into a new file, along with the name of the respective file.
So the output I want looks like this:
File_Name,Header,Trailer
Is this possible using Unix or Python?
One way to do it is with the bash shell, run in the folder containing the files:
for file in *; do echo "$file,$(head -n 1 "$file"),$(tail -n 1 "$file")"; done
PowerShell-core one liner with aliases
gci *.txt |%{"{0},{1},{2}" -f $_.FullName,(gc $_ -Head 1),(gc $_ -Tail 1)}|set-content .\newfile.txt
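As the question also mentions Python, here is a rough equivalent. It assumes the files are plain text with a .txt extension and that the header and trailer are simply the first and last lines; the function names and the output name summary.csv are made up for this example.

```python
import csv
from pathlib import Path


def header_and_trailer(path):
    """Return (first line, last line) of a text file, without line endings."""
    first = last = ""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle):
            if number == 0:
                first = line.rstrip("\r\n")
            last = line.rstrip("\r\n")
    return first, last


def write_summary(folder, output="summary.csv", pattern="*.txt"):
    """Write File_Name,Header,Trailer rows for every matching file in `folder`."""
    folder = Path(folder)
    with open(folder / output, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(["File_Name", "Header", "Trailer"])
        for path in sorted(folder.glob(pattern)):
            writer.writerow([path.name, *header_and_trailer(path)])
```

Using csv.writer instead of joining with commas means a header or trailer that itself contains commas gets quoted rather than silently adding extra columns.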

Delete lines of text file if they reference a nonexistent file

I have a text file (images1.txt) with lists of .jpg names and I have a folder (Bones) with .jpg images. All image names are exactly 42 characters (including the file extension), and each is on a separate line containing the name and some information about the image. For example:
OO75768249870G_2018051_4A284DQ0-011628.jpg,1A4502432KJL459265,emergency
OO75768249870G_2018051_4A284DQ0-011629.jpg,1A451743245122,appointment
where everything after .jpg is my own personal notes about the photos. Bones contains many of the 4,000+ images named in images1 but not all. Using either the command prompt or python, how would I remove the lines from images1 which correspond to images not present in my Bones folder?
Thanks!
In python:
import os

LEN_OF_FILENAME = 42

with open('images1.txt', 'r') as image_file:
    with open('filtered_images1.txt', 'w') as filtered_image_file:
        for line in image_file:
            image_name = line[:LEN_OF_FILENAME]
            path_to_image = os.path.join('Bones', image_name)
            if os.path.exists(path_to_image):
                filtered_image_file.write(line)
Assuming images1.txt and Bones are in the same folder, running the above Python script in that folder produces filtered_images1.txt. It will contain only the lines that have a corresponding image in Bones.
This code reads the lines from images1.txt and creates image2.txt containing only the lines whose file exists in the Bones directory.
@ECHO OFF
IF EXIST image2.txt (DEL image2.txt)
FOR /F "tokens=1,* delims=," %%f IN ('TYPE "images1.txt"') DO (
IF EXIST "Bones\%%~f" (>>"image2.txt" ECHO %%f,%%g)
)
EXIT /B
I think the easiest way is to use the findstr command:
rem /* Search for lines in file `images1.txt` in a case-insensitive manner that literally begin
rem with a file name found in the directory `Bones` which in turn matches the naming pattern;
rem then write all matching lines into a temporary file: */
dir /B /A:-D "Bones\??????????????_???????_????????-??????.jpg" | findstr /LIBG:/ "images1.txt" > "images1.tmp"
rem // Overwrite original `images1.txt` file by the temporary file:
move /Y "images1.tmp" "images1.txt" > nul

script to split file name into part1 and part2 and create directory part1 and put file part2 into part1 directory

I have a bunch of files of the form st_hwk.txt. If you must know, this is how Moodle downloads assignments for grading: it takes the name of the hwk and prepends the user name.
This solution needs to work on Linux because that is what I am working on.
Ex:
I download j smith_hwk1a.txt, j smith_hwk1b.txt, m wong_hwk1a.txt, m wong_hwk1b.txt. (yes the file names have fname space lname)
It should read the file names and create the directories jsmith and mwong (no space).
Put into jsmith files hwk1a.txt and hwk1b.txt. (the hwk1 that came from jsmith)
Put into mwong files hwk1a.txt and hwk1b.txt. (the hwk1 that came from mwong).
You can use any tool on typical linux, bash, php, ...?
thank you
for f in *_hwk*.txt; do
n=$(echo "$f"|tr -d ' '|tr _ /); # delete spaces, convert _ to /
mkdir -p "$(dirname "$n")"; # make directory if needed
mv "$f" "$n"; # move the file
done
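The same idea as a Python sketch, in case that is easier to adapt. Unlike the loop above, it splits only at the first underscore, so an underscore inside the homework name is kept, and it strips spaces only from the user part. The function names are made up for this example.

```python
from pathlib import Path
import shutil


def split_name(filename):
    """'j smith_hwk1a.txt' -> ('jsmith', 'hwk1a.txt'); splits at the first underscore."""
    user, _, rest = filename.partition("_")
    return user.replace(" ", ""), rest


def sort_into_folders(folder):
    """Move each '<user>_<hwk>.txt' in `folder` into a subfolder named after the user."""
    folder = Path(folder)
    # sorted() takes a snapshot of the listing before any file is moved
    for path in sorted(folder.glob("*_*.txt")):
        user, rest = split_name(path.name)
        target_dir = folder / user
        target_dir.mkdir(exist_ok=True)
        shutil.move(str(path), str(target_dir / rest))
```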

Script to rename files in folder to match names of files in another folder

I need to do a batch rename given the following scenario:
I have a bunch of files in Folder A
A bunch of files in Folder B.
The files in Folder A are all ".doc",
the files in Folder B are all ".jpg".
The files in Folder A are named "A0001.doc"
The files in Folder B are named "A0001johnsmith.jpg"
I want to merge the folders, and rename the files in Folder A so that they append the name portion of the matching file in Folder B.
Example:
Before:
FOLDER A: Folder B:
A0001.doc A0001johnsmith.jpg
After:
Folder C:
A0001johnsmith.doc
A0001johnsmith.jpg
I have seen some batch renaming scripts, but the only difference is that i need to assign a variable to contain the name portion so I can append it to the end of the corresponding file in Folder A.
I figure that the best way to do it would be to do a simple python script that would do a recursive loop, working on each item in the folder as follows:
Parse filename of A0001.doc
Match string to filenames in Folder B
Take the portion following the string that matched but before the "." and assign variable
Take the original string A0001 and append the variable containing the name element and rename it
Copy both files to Folder C (non-destructive, in case of errors etc)
I was thinking of using python for this, but I could use some help with syntax and such. I only know a little bit using the base python library, and I am guessing I would be importing libraries such as "OS", and maybe "SYS". I have never used them before, any help would be appreciated. I am also open to using a windows batch script or even powershell. Any input is helpful.
This is Powershell since you said you would use that.
Please note that I HAVE NOT TESTED THIS. I don't have access to a Windows machine right now so I can't test it. I'm basing this off of memory, but I think it's mostly correct.
foreach($aFile in ls "/path/to/FolderA")
{
$matchString = $aFile.Name.Split(".")[0]
$bFile = $(ls "/path/to/FolderB" |? { $_.Name -Match $matchString })[0]
$addString = $bFile.Name.Split(".")[0].Replace($matchString, "")
cp $aFile.FullName ("/path/to/FolderC/" + $matchString + $addString + ".doc")
cp $bFile.FullName "/path/to/FolderC"
}
This makes a lot of assumptions about the name structure. For example, I assumed the string to add doesn't appear in the common filename strings.
It is very simple with a plain batch script.
@echo off
for %%A in ("folderA\*.doc") do (
for %%B in ("folderB\%%~nA*.jpg") do (
copy "%%A" "folderC\%%~nB.doc"
copy "%%B" "folderC"
)
)
I haven't added any error checking.
You could have problems if you have a file like "A1.doc" matching multiple files like "A1file1.jpg" and "A10file2.jpg".
As long as the .doc files have fixed width names, and there exists a .jpg for every .doc, then I think the code should work.
Obviously more code could be added to handle various scenarios and error conditions.
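Since the question leaned towards Python, here is a hedged sketch of the same merge that also covers the ambiguity mentioned above: a .doc is skipped unless exactly one .jpg name starts with its base name. The function names are made up for this example, and the folder paths are whatever you pass in.

```python
from pathlib import Path
import shutil


def plan_copies(doc_names, jpg_names):
    """Map each source file name to its name in the merged folder."""
    plan = {}
    for doc in doc_names:
        key = Path(doc).stem  # e.g. "A0001"
        matches = [j for j in jpg_names if Path(j).stem.startswith(key)]
        if len(matches) != 1:
            continue  # skip missing or ambiguous matches
        jpg = matches[0]
        plan[doc] = Path(jpg).stem + Path(doc).suffix
        plan[jpg] = jpg
    return plan


def merge(folder_a, folder_b, folder_c):
    """Copy matched .doc/.jpg pairs into folder_c; sources are left untouched."""
    a, b, c = Path(folder_a), Path(folder_b), Path(folder_c)
    c.mkdir(parents=True, exist_ok=True)
    docs = [p.name for p in a.glob("*.doc")]
    jpgs = [p.name for p in b.glob("*.jpg")]
    for source, target in plan_copies(docs, jpgs).items():
        origin = a if source in docs else b
        shutil.copy2(origin / source, c / target)
```

Because the files are copied rather than moved, Folder A and Folder B stay intact if something goes wrong, as the question asked.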
