I have a text file (images1.txt) with lists of .jpg names and I have a folder (Bones) with .jpg images. All image names are exactly 42 characters (including the file extension), and each is on a separate line containing the name and some information about the image. For example:
OO75768249870G_2018051_4A284DQ0-011628.jpg,1A4502432KJL459265,emergency
OO75768249870G_2018051_4A284DQ0-011629.jpg,1A451743245122,appointment
where everything after .jpg is my own personal notes about the photos. Bones contains many of the 4,000+ images named in images1 but not all. Using either the command prompt or python, how would I remove the lines from images1 which correspond to images not present in my Bones folder?
Thanks!
In python:
import os

LEN_OF_FILENAME = 42

with open('images1.txt', 'r') as image_file:
    with open('filtered_images1.txt', 'w') as filtered_image_file:
        for line in image_file:
            image_name = line[:LEN_OF_FILENAME]
            path_to_image = os.path.join('Bones', image_name)
            if os.path.exists(path_to_image):
                filtered_image_file.write(line)
Assuming images1.txt and Bones are in the same folder, running the above Python script in that folder produces filtered_images1.txt, which contains only the lines that have a corresponding image in Bones.
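A variant of the same idea, as a rough sketch: it assumes the image name is everything before the first comma (rather than a fixed 42 characters) and lists Bones once into a set, so each line costs a set lookup instead of a filesystem check. The function name filter_lines and its parameters are made up for illustration.

```python
import os


def filter_lines(list_path, image_dir, out_path):
    """Copy to out_path only the lines whose image exists in image_dir.

    Hypothetical helper: assumes the image name is the text before the
    first comma on each line. Returns the number of lines kept.
    """
    present = set(os.listdir(image_dir))  # names actually in the folder
    kept = 0
    with open(list_path, "r") as src, open(out_path, "w") as dst:
        for line in src:
            image_name = line.split(",", 1)[0].strip()
            if image_name in present:
                dst.write(line)
                kept += 1
    return kept
```

It would be called as filter_lines("images1.txt", "Bones", "filtered_images1.txt"). Note that the set lookup is case-sensitive, whereas os.path.exists on a typical Windows filesystem is not.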
This code reads the lines from images1.txt and creates image2.txt containing only the lines whose file exists in the Bones directory.
@ECHO OFF
IF EXIST image2.txt (DEL image2.txt)
FOR /F "tokens=1,* delims=," %%f IN ('TYPE "images1.txt"') DO (
    IF EXIST "Bones\%%~f" (>>"image2.txt" ECHO %%f,%%g)
)
EXIT /B
I think the easiest way is to use the findstr command:
rem /* Search for lines in file `images1.txt` in a case-insensitive manner that literally begin
rem with a file name found in the directory `Bones` which in turn matches the naming pattern;
rem then write all matching lines into a temporary file: */
dir /B /A:-D "Bones\??????????????_???????_????????-??????.jpg" | findstr /LIBG:/ "images1.txt" > "images1.tmp"
rem // Overwrite original `images1.txt` file by the temporary file:
move /Y "images1.tmp" "images1.txt" > nul
Related
I have around 100 text files in a folder, each with close to a thousand records. I want to copy the header and trailer of each file into a new file, together with the name of the file they came from.
So the output I want looks like this:
File_Name,Header,Trailer
Is this possible using Unix or Python?
One way to do it is with the bash shell, run in the folder containing the files:
for file in *; do echo "$file,$(head -n 1 "$file"),$(tail -n 1 "$file")"; done
PowerShell Core one-liner with aliases:
gci *.txt |%{"{0},{1},{2}" -f $_.FullName,(gc $_ -Head 1),(gc $_ -Tail 1)}|set-content .\newfile.txt
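Since the question also allows Python, here is a minimal sketch of the same idea. The function names, the "*.txt" pattern and the output file name are assumptions; the header is taken to be the first line and the trailer the last line of each file.

```python
import glob
import os


def first_and_last_line(path):
    """Return (first, last) line of a text file, without line endings."""
    first = last = ""
    with open(path, "r") as handle:
        for index, line in enumerate(handle):
            if index == 0:
                first = line.rstrip("\r\n")
            last = line.rstrip("\r\n")
    return first, last


def summarize(folder, out_path, pattern="*.txt"):
    """Write one 'File_Name,Header,Trailer' row per matching file."""
    out_abs = os.path.abspath(out_path)
    rows = []
    for path in sorted(glob.glob(os.path.join(folder, pattern))):
        if os.path.abspath(path) == out_abs:
            continue  # do not summarize the output file itself
        header, trailer = first_and_last_line(path)
        rows.append("{},{},{}".format(os.path.basename(path), header, trailer))
    with open(out_path, "w") as out:
        out.write("File_Name,Header,Trailer\n")
        for row in rows:
            out.write(row + "\n")
    return len(rows)
```

For example, summarize(".", "newfile.txt") would process every .txt file in the current folder. If a header or trailer itself contains commas, the csv module would be a safer way to write the rows.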
I have a bunch of files of the form st_hwk.txt. If you must know, this is how Moodle downloads assignments for grading: it takes the name of the homework and prepends the user name.
This solution needs to work on Linux because that is what I am working on.
Ex:
I download j smith_hwk1a.txt, j smith_hwk1b.txt, m wong_hwk1a.txt, m wong_hwk1b.txt. (Yes, the file names have fname space lname.)
It should read the file names and create the directories jsmith and mwong (no space).
Put into jsmith files hwk1a.txt and hwk1b.txt. (the hwk1 that came from jsmith)
Put into mwong files hwk1a.txt and hwk1b.txt. (the hwk1 that came from mwong).
You can use any tool found on a typical Linux system: bash, php, ...
thank you
for f in *_hwk*.txt; do
    n=$(echo "$f"|tr -d ' '|tr _ /); # delete spaces, convert _ to /
    mkdir -p "$(dirname "$n")"; # make directory if needed
    mv "$f" "$n"; # move the file
done
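The same approach could be sketched in Python, which is usually available on a typical Linux system. The function name sort_into_student_dirs is hypothetical, and the sketch assumes the student name is everything before the first underscore.

```python
import os
import shutil


def sort_into_student_dirs(folder):
    """Move 'first last_hwk1a.txt' to 'firstlast/hwk1a.txt' inside folder.

    Hypothetical helper: splits each name at the first underscore, so the
    homework part may itself contain underscores. Returns the new paths.
    """
    moved = []
    for name in sorted(os.listdir(folder)):
        source = os.path.join(folder, name)
        if not os.path.isfile(source) or "_" not in name:
            continue  # skip directories and files without a student prefix
        student, homework = name.split("_", 1)
        target_dir = os.path.join(folder, student.replace(" ", ""))
        os.makedirs(target_dir, exist_ok=True)
        target = os.path.join(target_dir, homework)
        shutil.move(source, target)
        moved.append(target)
    return moved
```

Calling sort_into_student_dirs(".") in the download folder would handle every file there. Unlike the tr version, only the first underscore is treated as the separator.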
I have many files in subdirectories eg. UCE-1…UCE-2000, which all contain the same two file types (a .cfg file and a .phylip file).
UCEs
    UCE-13
        partition_finder.cfg
        UCE-13.phylip
I need to modify the .cfg file in all of these UCE-1...UCE-2000 folders. Specifically, I need to copy the file name of the .phylip file UCE-13.phylip and place it in a specific section of text inside the .cfg file, for instance change
alignment = ;
to
alignment = UCE-13.phylip;
A second modification I need to make is to copy a piece of text that is always found at the end of the first line of the .phylip file, preceded by a space, and insert it at a specific location in the .cfg file.
That is, copy the last set of numbers in the first line of the .phylip file, between the space and the carriage return:
2 466\r
Then insert it in the .cfg file, changing
All = 1-;
to
All = 1-466;
The numbers vary in length.
Any help with either of these problems would be greatly appreciated.
Start in “All-UCEs”.
The information is all local to one subdirectory, so for each one:
go to that directory first
get the .phylip file name
get the last field of the first line of the .phylip file
put both into the .cfg file
(use double quotes so the shell variables are expanded inside the sed script)
change back out of the directory
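for dir in UCE-*; do
    cd "${dir}" || continue
    # expand the glob now, so the variable holds the real file name
    phylip=$(ls *.phylip)
    # last field of the first line, minus any trailing carriage return
    some_num=$(awk 'NR==1{sub(/\r$/,""); print $NF}' "${phylip}")
    sed -i -e "s/alignment = ;/alignment = ${phylip};/" \
           -e "s/All = 1-;/All = 1-${some_num};/" *.cfg
    cd ..
done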
(untested)
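The same steps could also be sketched in Python. This assumes exactly one .phylip file per folder and a .cfg file named partition_finder.cfg, as in the question. The function name fill_cfg is made up for illustration.

```python
import glob
import os


def fill_cfg(uce_dir, cfg_name="partition_finder.cfg"):
    """Fill 'alignment = ;' and 'All = 1-;' in one UCE folder's .cfg file.

    Assumes exactly one .phylip file per folder and that the wanted number
    is the last whitespace-separated field of its first line.
    """
    phylip_path = glob.glob(os.path.join(uce_dir, "*.phylip"))[0]
    phylip_name = os.path.basename(phylip_path)
    with open(phylip_path, "r") as handle:
        # split() also discards a trailing carriage return
        site_count = handle.readline().split()[-1]

    cfg_path = os.path.join(uce_dir, cfg_name)
    # newline="" keeps the file's existing line endings untouched
    with open(cfg_path, "r", newline="") as handle:
        text = handle.read()
    text = text.replace("alignment = ;", "alignment = {};".format(phylip_name))
    text = text.replace("All = 1-;", "All = 1-{};".format(site_count))
    with open(cfg_path, "w", newline="") as handle:
        handle.write(text)
    return phylip_name, site_count
```

To process every folder, one could loop over glob.glob("UCE-*") from the parent directory and call fill_cfg on each entry.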
I have a bunch of TIFF image files ordered by date. I need to rename them using either python, or terminal commands. The file names are structured like this:
basename_unnecessary_x.tif
where:
basename = is part of the original filename I need to keep (16 characters long)
unnecessary = part of the original filename I want to discard (14 characters long)
x = ascending numbers I need to add. Starting at 0 and going up in steps of 250 for every subsequent file.
I know there are plenty of questions on batch renaming and adding ascending numbers to file names but I haven't found anything that keeps part of the original filename and deletes another portion and adds ascending numbers. Any help would be appreciated.
Thanks!
First, get the names of all the files you want to rename into a text file.
If you want to rename all the files in the directory, simply run the command below, which redirects the listing to a text file. [Changed code: it now lists .tif files only.]
dir /a:-D /b *.tif >cp1.txt
Now use the code below, which renames the files from basename_unnecessary_x.tif to basename_0.tif, basename_250.tif, and so on.
@echo off
CD %CD%\<Folderpath in which .tif files should be renamed>
setlocal enabledelayedexpansion
set /a count=0
echo --------Script started -------------------------
echo.
for /f "tokens=*" %%a in (cp1.txt) do (
    echo original file name %%a
    echo ------------------------------------------
    for /f "tokens=1 delims=_" %%b in ("%%a") do (
        echo file will be renamed to %%b_!count!.tif
        echo ------------------------------------------
        rename "%%a" "%%b_!count!.tif"
        set /a count+=250
    )
)
echo.
echo --------Script Completed -------------------------
Changes to the script:
The dir command now lists only .tif files into cp1.txt.
You can execute the script from any location, provided you update the path in the CD line of the code.
The numbering now follows the sequence 0, 250, and so on.
FYI, the first file was getting 250 even though I had initialized the counter to zero because I was increasing it by 250 before using it in the rename command.
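Since the question also mentions Python, here is a minimal sketch of the same renaming. It assumes that sorting the file names alphabetically gives the wanted date order and that the part to keep is the first 16 characters. The function names are hypothetical.

```python
import os


def plan_renames(names, keep=16, step=250):
    """Map each original name to '<first keep chars>_<counter>.tif'.

    Assumes sorting the names alphabetically gives the wanted (date) order.
    """
    plan = []
    for index, name in enumerate(sorted(names)):
        plan.append((name, "{}_{}.tif".format(name[:keep], index * step)))
    return plan


def rename_tifs(folder, keep=16, step=250):
    """Apply plan_renames to every .tif file in folder."""
    names = [n for n in os.listdir(folder) if n.lower().endswith(".tif")]
    plan = plan_renames(names, keep, step)
    for old, new in plan:
        os.rename(os.path.join(folder, old), os.path.join(folder, new))
    return plan
```

Printing the result of plan_renames(os.listdir(folder)) first is a cheap way to check the mapping before anything is renamed.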
So, I'm writing a Python script which will open a tar file, and if there is a directory in it, my script will open that directory and check for files...
import tarfile
E = raw_input("Enter the tar file name: ")  # take the file name from the user (Python 2; input() in Python 3)
tf = tarfile.open(E)  # open the tar file
Now what's the best way to check whether 'tf' contains a directory or not? Rather than going to my terminal and running ls there, I want to do something in the same Python script that checks whether there is a directory after unpacking the tar.
In Python you can check whether a path exists by using the os.path.exists(f) function, where f is a string containing the file name and its path:
import os
path = 'path/filename'
if os.path.exists(path):
    print('it exists')
EDIT:
The tarfile object has a method "getnames()" which gives the paths of all the objects in the tar file.
paths = tf.getnames() #returns list of pathnames
f = paths[1] #say the 2nd element in the list is the file you want
tf.extractfile(f)
Say there's a file named "file1" in directory "S3". Then one of the elements of tf.getnames() will be 'S3/file1'. Then you can extract it.
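To check for directories without unpacking anything, one option is to look at the archive members themselves. Below is a minimal Python 3 sketch using getmembers() and the isdir() / isfile() methods of TarInfo. The two function names are made up for illustration, and the sketch assumes the archive stores explicit directory entries (some archives only store the files).

```python
import tarfile


def directories_in_tar(tar_path):
    """Return the names of directory entries stored in the archive."""
    with tarfile.open(tar_path) as archive:
        return [m.name for m in archive.getmembers() if m.isdir()]


def files_under(tar_path, directory):
    """Return names of regular files stored below the given directory."""
    prefix = directory.rstrip("/") + "/"
    with tarfile.open(tar_path) as archive:
        return [m.name for m in archive.getmembers()
                if m.isfile() and m.name.startswith(prefix)]
```

With the "S3" example above, directories_in_tar would list 'S3' and files_under(path, "S3") would list 'S3/file1', which can then be passed to extractfile.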