bash: remove all files except last version in file name - python

I have 10K+ files like the ones below. The file-system date (the export time) is the same for all of them.
YYY101R1.corp.company.org-RUNNINGCONFIG-2015-07-10-23-10-15.config
YYY101R1.corp.company.org-RUNNINGCONFIG-2015-07-11-22-11-10.config
YYY101R1.corp.company.org-RUNNINGCONFIG-2015-10-01-10-05-08.config
LLL101S1.corp.company.org-RUNNINGCONFIG-2015-08-10-23-10-15.config
LLL101S1.corp.company.org-RUNNINGCONFIG-2015-09-11-20-11-10.config
LLL101S1.corp.company.org-RUNNINGCONFIG-2015-10-02-19-05-07.config
How can I delete all files except the last version (latest date in the file name) of each file, and rename it to
YYY101R1.corp.company.org.config
LLL101S1.corp.company.org.config
Thank you.

The UNIX shell command
ls -t YYY101R1.corp.company.org*
will list files in order of age, newest first. Grab the first line as "latest" and make a symbolic ("soft") link to it:
ln -s $latest YYY101R1.corp.company.org.config
Repeat for each file group.
Does that get you going? If not, please post your code and explanation of the specific problem. See https://stackoverflow.com/help/mcve
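A minimal sketch of that per-group loop (my own, not the asker's code; it assumes names contain no whitespace). One caveat: the question says all files share one on-disk mtime, so `ls -t` cannot order them; sorting the names works instead, because the embedded date is `YYYY-MM-DD-HH-MM-SS` and sorts lexicographically:

```shell
#!/bin/bash
# keep_latest: for each host prefix in the current directory, symlink the
# newest config (newest by the date embedded in the name) to <prefix>.config.
keep_latest() {
    local prefix latest
    for prefix in $(ls *-RUNNINGCONFIG-*.config 2>/dev/null | sed 's/-RUNNINGCONFIG-.*//' | sort -u); do
        # Lexicographic sort == chronological sort for this date format.
        latest=$(ls "${prefix}"-RUNNINGCONFIG-*.config | sort | tail -n 1)
        ln -s "$latest" "${prefix}.config"
    done
}
```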

I got something similar: first get the list of all files sorted ascending by modification time, then count them, take all but the last two, and pass that list to the command that removes the files.
ls -tr | wc -l
ls -tr | head -number_of_files_minus_2 | xargs rm
Was it helpful?
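With GNU head, the "all but the last N" step needs no separate count: `head -n -1` prints every line except the last. A sketch of that variant (my own; assumes GNU head/xargs and file names without newlines), keeping only the newest file by mtime:

```shell
#!/bin/bash
# prune_old: delete every file in the current directory except the
# newest one by modification time.
prune_old() {
    # head -n -1 drops the last (newest) entry; xargs -r skips empty input.
    ls -tr | head -n -1 | xargs -r -d '\n' rm --
}
```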

FLast=$(ls -tr YY&#8203;Y101R1.corp.company.org* | tail -n 1)
mv "${FLast}" YYY101R1.corp.company.org.config
rm -f YYY101R1.corp.company.org-RUNNINGCONFIG-*
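Generalised over all 10K+ files, the same idea becomes one loop over the distinct host prefixes (a sketch of mine, not the answerer's exact code; it sorts by the date embedded in the name rather than mtime, since the question says all files share one export time, and it assumes no whitespace in names):

```shell
#!/bin/bash
# flatten_configs: for every host prefix, rename the newest dump to
# <prefix>.config and delete the older dumps. Sorting the names works
# because YYYY-MM-DD-HH-MM-SS sorts lexicographically in date order.
flatten_configs() {
    local prefix last
    for prefix in $(ls *-RUNNINGCONFIG-*.config 2>/dev/null | sed 's/-RUNNINGCONFIG-.*//' | sort -u); do
        last=$(ls "${prefix}"-RUNNINGCONFIG-*.config | sort | tail -n 1)
        mv "$last" "${prefix}.config"
        rm -f "${prefix}"-RUNNINGCONFIG-*.config
    done
}
```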


Bash iterate through every file but start from 2nd file and get names of 1st and 2nd files

I have files all named following the convention:
xxx_yyy_zzz_ooo_date_ppp.tif
I have a Python function that needs 3 inputs: the dates of two consecutive files in my folder, and an output name generated from those two dates.
I created a loop that:
goes through every file in the folder
grabs the date of the file and assigns it to a variable ("file2", 5th place in the file name)
runs a python function that takes as inputs: date file 1, date file 2, output name
How could I make my loop start at the 2nd file in my folder and grab the name of the previous file into a variable "file1"? (So far it only grabs the date of one file at a time.)
#!/bin/bash
output_path=path # Folder in which my output will be saved
for file2 in *; do
f1=$( "$file1" | awk -F'[_.]' '{print $5}' ) # File before the one over which the loop is running
f2=$( "$file2" | awk -F'[_.]' '{print $5}' ) # File 2 over which the loop is running
outfile=$output_path+$f1+$f2
function_python -$f1 -$f2 -$outfile
done
You could make it work like this:
#!/bin/bash
output_path="<path>"
readarray -t files < <(find . -maxdepth 1 -type f | sort) # replaces '*'
for ((i=1; i < ${#files[@]}; i++)); do
f1=$( echo "${files[i-1]}" | awk -F'[_.]' '{print $5}' ) # previous file
f2=$( echo "${files[i]}" | awk -F'[_.]' '{print $5}' ) # current file
outfile="${output_path}/${f1}${f2}"
function_python -"$f1" -"$f2" -"$outfile"
done
Not exactly sure about the call to function_python, though; I have never seen that tool before (and I can't ask, since I can't comment yet).
Read the files into an array and then iterate from index 1 instead of over the whole array.
#!/bin/bash
set -euo pipefail
declare -r output_path='/some/path/'
declare -a files fsegments
for file in *; do files+=("$file"); done
declare -ar files # optional
declare -r file1="${files[0]}"
IFS=_. read -ra fsegments <<< "$file1"
declare -r f1="${fsegments[4]}"
for file2 in "${files[@]:1}"; do # from 1
IFS=_. read -ra fsegments <<< "$file2"
f2="${fsegments[4]}"
outfile="${output_path}${f1}${f2}"
function_python -"$f1" -"$f2" -"$outfile" # weird format!
done

grouping and dividing files which contain numbers into separate folders

I wanted to move the files in group of 30 in sequence starting from image_1,image_2... from current folder to the new folder.
the file name pattern is like below
image_1.png
image_2.png
.
.
.
image_XXX.png
I want to move image_[1-30].png to folder fold30
and image[31-60].png to fold60 and so on
I have the following code to do this and it works; I wanted to know whether there is a shortcut, or a shorter way to write the same thing.
#!/bin/bash
counter=0
folvalue=30
totalFiles=$(ls -1 image_*.png | sort -V | wc -l)
foldernames=fold$folvalue
for file in $(ls -1 image_*.png | sort -V )
do
((counter++))
mkdir -p $foldernames
mv $file ./$foldernames/
if [[ "$counter" -eq "$folvalue" ]];
then
let folvalue=folvalue+30
foldernames="fold${folvalue}"
echo $foldernames
fi
done
the above code moves image_1, image_2, ... image_30 into folder
fold30
image_31,....image_60 to folder
fold60
I really recommend using sed all the time. It's hard on the eyes, but once you get used to it you can do all these jarring tasks in no time.
What it does is simple: running sed -e "s/regex/substitution/" <(cat file) goes through each line, replacing matches of regex with substitution.
With it you can just transform your input into commands and pipe them to bash.
If you want to know more, the GNU sed manual has good documentation (also not easy on the eyes, though).
Anyway here's the code:
while FILE_GROUP=$(find . -maxdepth 1 -name "image_*.png" | sort -V | head -30) && [ -n "$FILE_GROUP" ]
do
    FOLDER="${YOUR_PREFIX}$(sed -e "s/^.*image_//" -e "s/\.png//" <(echo "$FILE_GROUP" | tail -1))"
    mkdir -p "$FOLDER"
    sed -e "s/\.\///" -e "s|.*|mv & $FOLDER|" <(echo "$FILE_GROUP") | bash
done
And here's what it should do:
- the while loop grabs the first 30 files
- take the number out of the last of those files and use it to name the directory
- mkdir the folder
- go through each line and turn $FILE into mv $FILE $FOLDER, then execute those lines (pipe to bash)
note: replace $YOUR_PREFIX with your folder prefix
EDIT: surprisingly, the code did not work out of the box (who would have thought...), but I've done some fixing and testing and it should work now.
The simplest way to do that is with rename, a.k.a. Perl rename. It will:
let you run any amount of code of arbitrary complexity to figure out a new name,
let you do a dry run telling you what it would do without doing anything,
warn you if any files would be overwritten,
automatically create intermediate directory hierarchies.
So the command you want is:
rename -n -p -e '(my $num = $_) =~ s/\D//g; $_ = ($num+29)-(($num-1)%30) . "/" . $_' *png
Sample Output
'image_1.png' would be renamed to '30/image_1.png'
'image_10.png' would be renamed to '30/image_10.png'
'image_100.png' would be renamed to '120/image_100.png'
'image_101.png' would be renamed to '120/image_101.png'
'image_102.png' would be renamed to '120/image_102.png'
'image_103.png' would be renamed to '120/image_103.png'
'image_104.png' would be renamed to '120/image_104.png'
...
...
If that looks correct, you can run it again without the -n switch to do it for real.
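If Perl rename is not available, the same grouping can be done in plain bash arithmetic, computing the folder from the number in each name rather than counting files. A sketch of mine (assumes names are exactly image_&lt;number&gt;.png with no leading zeros, and uses the question's foldNN naming):

```shell
#!/bin/bash
# group_images: move image_N.png into fold30, fold60, ... in groups of 30.
group_images() {
    local f n dir
    for f in image_*.png; do
        [ -e "$f" ] || continue          # glob matched nothing: skip
        n=${f#image_}; n=${n%.png}       # extract the number
        # 1..30 -> fold30, 31..60 -> fold60, ...
        dir="fold$(( ((n - 1) / 30 + 1) * 30 ))"
        mkdir -p "$dir"
        mv "$f" "$dir/"
    done
}
```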

bash: copy files with the same pattern

I want to copy files scattered in separate directories into a single directory.
find . -name "*.off" > offFile
while read line; do
cp "${line}" offModels #offModels is the destination directory
done < offFile
The file offFile has 1831 lines, but after cd offModels, ls | wc -l gives 1827, so I think four files ending in ".off" were not copied.
At first I thought that, because I use double quotes in the shell script, files whose names contain a dollar sign, backtick or backslash might be missed. Then I found one file named $.... But how do I find the other three? After cd offModels and ls > ../File, I wrote a Python script like this:
fname1 = "offFile"  # records the scattered files
with open(fname1) as f1:
    contents1 = f1.readlines()
fname2 = "File"
with open(fname2) as f2:
    contents2 = f2.readlines()
visited = [0] * len(contents1)
for substr in contents2:
    substr = "/" + substr
    for i, string in enumerate(contents1):
        if string.find(substr) >= 0:
            visited[i] = 1
            break
for i, j in enumerate(visited):
    if j == 0:
        print contents1[i]
The output gives four lines, but they are wrong: I can find all four of those files in the destination directory.
Edit
As the comments and answers point out, there are four files whose names duplicate four others. One thing that interests me now is that, with the bash script I used, the file named $CROSS.off was copied. That really surprised me.
Looks like you have files with the same filenames, and cp just overwrites them.
You can use the --backup=numbered option for cp; here is a one-liner:
find -name '*.off' -exec cp --backup=numbered '{}' '/full/path/to/offModels' ';'
The -exec option allows you to execute a command on every file matched; you should use {} to get the file's name and end the command with ; (usually written as \; or ';', because bash treats semicolons as command separators).
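To see which basenames collide before copying (the four "missing" files in the question), you can compare basenames over the same find output. A hedged sketch (relies on GNU find's -printf, which prints just the basename with %f):

```shell
#!/bin/bash
# dup_basenames: print basenames that occur more than once under a tree --
# exactly the files a flat cp into one directory would overwrite.
dup_basenames() {
    find "${1:-.}" -name '*.off' -printf '%f\n' | sort | uniq -d
}
```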

bash linux file rename - how to rename multiple files in the linux console

I would like to rename circa 1000 files that are named like:
66-123123.jpg -> abc-123123-66.jpg. So in general file format is:
xx-yyyyyy.jpg -> abc-yyyyyy-xx.jpg, where xx and yyyyyy are numbers, abc is string.
Can someone help me with bash or py script?
Try doing this :
rename 's/(\d{2})-(\d{6})\.jpg/abc-$2-$1.jpg/' *.jpg
There are other tools with the same name which may or may not be able to do this, so be careful.
If you run the following command (linux)
$ file $(readlink -f $(type -p rename))
and you have a result like
.../rename: Perl script, ASCII text executable
then this seems to be the right tool =)
If not, to make it the default (usually already the case) on Debian and derivatives like Ubuntu:
$ sudo update-alternatives --set rename /path/to/rename
(replace /path/to/rename with the path of your Perl rename command).
If you don't have this command, check your package manager to install it, or install it manually.
Last but not least, this tool was originally written by Larry Wall, the father of Perl.
for file in ??-??????.jpg ; do
[[ $file =~ (..)-(......)\.jpg ]]
mv "$file" "abc-${BASH_REMATCH[2]}-${BASH_REMATCH[1]}.jpg" ;
done
This requires bash 3 or later for the regex support. For POSIXy shells, this will do
for f in ??-??????.jpg ; do
g=${f%.jpg} # remove the extension
a=${g%-*} # remove the trailing "-yyyyyy"
b=${g#*-} # remove the leading "xx-"
mv "$f" "abc-$b-$a.jpg" ;
done
You could use the rename command, which renames multiple files using regular expressions. In this case you would like to write
rename 's/(\d\d)-(\d\d\d\d\d\d)/abc-$2-$1/' *
where \d means a digit, and $1 and $2 refer to the values matched by the first and second parentheses.
Being able to do things like this easily is why I name my files the way I do. Using a + sign lets me cut them up into fields, and then I can just rearrange them with echo.
#!/usr/bin/env bash
set -x
find *.jpg -type f | while read -r files
do
newname=$(echo "${files}" | sed s'#-#+#'g | sed s'#\.jpg#+.jpg#'g)
field1=$(echo "${newname}" | cut -d'+' -f1)
field2=$(echo "${newname}" | cut -d'+' -f2)
field3=$(echo "${newname}" | cut -d'+' -f3)
finalname=$(echo "abc-${field2}-${field1}.${field3}")
mv "${files}" "${finalname}"
done

traversing daily dump directories

I have 6 months of data to go through, looking like this
0101
0102
.
.
0131
0201
0202
.
.
all the way to
0630
I want to go through each directory and execute an awk file on the contents, or do it in a weekly manner (each 7 directories make one week of data).
Is there an easy way to do this in awk or python?
many thanks
You can use find to walk your tree and xargs to apply your awk script:
find . -type f | xargs awk -f awkfile
EDIT: awk syntax corrected thanks to input from @nya. I Am Not An AWK Expert.
Why not use plain bash? You can try this:
find . -type f -exec awk -f 'your_awk_script.awk' {} \;
find traverses the directory tree, and the -exec option makes it execute the given command (in this case awk -f your_awk_script.awk) on each file ({} is the placeholder for the file name).
To run this tiny script every seven days, look into cron.
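For the weekly variant the question asks about, the four-digit MMDD directories can be taken seven at a time in sorted order. A sketch of mine (the awk call is a placeholder; "awkfile" is assumed, as in the answers above, and the glob is assumed to match only day directories):

```shell
#!/bin/bash
# weekly: process day directories (0101 .. 0630) seven at a time.
# Each chunk of 7 sorted directory names is one "week".
weekly() {
    local week=0 days
    set -- [0-9][0-9][0-9][0-9]      # globs expand in sorted order
    [ -d "$1" ] || return 0          # no day directories here
    while [ "$#" -gt 0 ]; do
        week=$((week + 1))
        days=""
        for _ in 1 2 3 4 5 6 7; do
            [ "$#" -gt 0 ] || break
            days="$days $1"
            shift
        done
        echo "week $week:$days"
        # e.g. run awk over that week's files:
        # find $days -type f -exec awk -f awkfile {} +
    done
}
```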
