I have a large list of images that have been misnamed by my artist. I was hoping to avoid giving him more work by using Automator, but I'm new to it. Right now they're named in order what001a and what002a, but that should be what001a and what001b. So basically the odd-numbered files are A and the even-numbered ones are B. I need a script that changes the even-numbered images to B and renumbers them all sequentially. How would I go about writing that script?
A small Ruby script embedded in an AppleScript provides a very comfortable solution, allowing you to select the files to rename right in Finder and displaying an informative success or error message.
The algorithm renames files as follows:
number = first 3 digits in filename # e.g. "006"
letter = the letter following those digits # e.g. "a"
if number is even, change letter to its successor # e.g. "b"
number = (number + 1)/2 # 5 or 6 => 3
replace number and letter in filename
And here it is:
-- ask for files
set filesToRename to choose file with prompt "Select the files to rename" with multiple selections allowed
-- prepare ruby command
set ruby_script to "ruby -e \"s=ARGV[0]; m=s.match(/(\\d{3})(\\w)/); n=m[1].to_i; a=m[2]; a.succ! if n.even?; r=sprintf('%03d',(n+1)/2)+a; puts s.sub(/\\d{3}\\w/,r);\" "
tell application "Finder"
-- process files, record errors
set counter to 0
set errors to {}
repeat with f in filesToRename
try
do shell script ruby_script & (f's name as text)
set f's name to result
set counter to counter + 1
on error
copy (f's name as text) to the end of errors
end try
end repeat
-- display report
set msg to (counter as text) & " files renamed successfully!\n"
if errors is not {} then
set AppleScript's text item delimiters to "\n"
set msg to msg & "The following files could NOT be renamed:\n" & (errors as text)
set AppleScript's text item delimiters to ""
end if
display dialog msg
end tell
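Note that it will fail when the filename contains spaces, because the name is passed to the shell unquoted. Passing quoted form of (f's name as text) instead should avoid this.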
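For readers who prefer to check the renaming rule outside AppleScript, here is a minimal Python 3 sketch of the same algorithm. The function name corrected_name is hypothetical, and it only computes the new name; it does not touch any files.

```python
import re

def corrected_name(filename):
    """Apply the rule above: 3-digit number + letter -> halved number, a or b."""
    match = re.search(r"(\d{3})(\w)", filename)
    if match is None:
        return filename  # no number/letter pair, leave the name alone
    number = int(match.group(1))
    letter = match.group(2)
    if number % 2 == 0:
        letter = chr(ord(letter) + 1)  # even number: "a" becomes "b"
    new_part = "%03d%s" % ((number + 1) // 2, letter)
    return filename[:match.start()] + new_part + filename[match.end():]

print(corrected_name("what005a.png"))  # what003a.png
print(corrected_name("what006a.png"))  # what003b.png
```

Because only a string is returned, the result can be inspected for a whole folder before anything is actually renamed.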
A friend of mine wrote a Python script to do what I needed. Figured I'd post it here as an answer for anyone stumbling upon a similar problem looking for help. It is in Python though so if anyone wants to convert it to AppleScript for those that may need it go for it.
import os
import re
import shutil

def toInt(text):
    # return 0 if the text is not a valid integer
    try:
        return int(text)
    except ValueError:
        return 0

filePath = "./"
extension = "png"
# process in sorted order so that, assuming zero-padded numbers,
# a renamed file never overwrites one that is still waiting to be renamed
dirList = sorted(os.listdir(filePath))
regx = re.compile("[0-9]+a")
for filename in dirList:
    ext = filename[-len(extension):]
    if ext != extension: continue
    rslts = regx.search(filename)
    if rslts is None: continue
    pieces = regx.split(filename)
    if len(pieces) < 2: pieces.append("")
    filenumber = toInt(rslts.group(0).rstrip("a"))
    # integer division: 1 and 2 -> 1, 3 and 4 -> 2, ...
    newFileNum = (filenumber + 1) // 2
    fileChar = "b"
    if filenumber % 2: fileChar = "a"
    newFileName = "%s%03d%s%s" % (pieces[0], newFileNum, fileChar, pieces[1])
    shutil.move("%s%s" % (filePath, filename), "%s%s" % (filePath, newFileName))
I have a text file (filename.txt) that contains file names with their extensions.
filename.txt
[AW] One Piece - 629 [1080P][Dub].mkv
EP.585.1080p.mp4
EP609.m4v
EP 610.m4v
One Piece 0696 A Tearful Reunion! Rebecca and Kyros!.mp4
One_Piece_0745_Sons'_Cups!.mp4
One Piece - 591 (1080P Funi Web-Dl -Ks-)-1.m4v
One Piece - 621 1080P.mkv
One_Piece_S10E577_Zs_Ambition_A_Great_and_Desperate_Escape_Plan.mp4
These are example file names with their extensions. I need to rename each file to its episode number (without changing its extension).
Example:
Input:
EP609.m4v
EP 610.m4v
EP.585.1080p.mp4
One Piece - 621 1080P.mkv
[AW] One Piece - 629 [1080P][Dub].mkv
One_Piece_0745_Sons'_Cups!.mp4
One Piece 0696 A Tearful Reunion! Rebecca and Kyros!.mp4
One Piece - 591 (1080P Funi Web-Dl -Ks-)-1.m4v
One_Piece_S10E577_Zs_Ambition_A_Great_and_Desperate_Escape_Plan.mp4
Expected Output:
609.m4v
610.m4v
585.mp4
621.mkv
629.mkv
745.mp4 (or) 0745.mp4
696.mp4 (or) 0696.mp4
591.m4v
577.mp4
Hope someone will help me parse and rename these filenames. Thanks in advance!!!
As you tagged python, I guess you are willing to use python.
(Edit: I've realized a loop in my original code is unnecessary.)
import re

with open('filename.txt', 'r') as f:
    files = f.read().splitlines() # read filenames

# assume: an episode number consists of 3 digits, possibly preceded by a 0
p = re.compile(r'0?(\d{3})')
for file in files:
    if m := p.search(file): # the := operator requires Python 3.8+
        print(m.group(1) + '.' + file.split('.')[-1])
    else:
        print(file)
This will output
609.m4v
610.m4v
585.mp4
621.mkv
629.mkv
745.mp4
696.mp4
591.m4v
577.mp4
Basically, it searches for the first 3-digit number, possibly preceded by 0.
I strongly advise you to check the output; in particular, you would want to run sort OUTPUTFILENAME | uniq -d to see whether there are duplicate target names.
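If you would rather do the duplicate check in Python than with sort and uniq, something along these lines should work. This is only a sketch: target_name and duplicate_targets are hypothetical helper names, and the regular expression is the same assumption as above (first 3-digit run, optionally preceded by a 0).

```python
import re
from collections import Counter

def target_name(filename):
    """Hypothetical helper: first 3-digit run (optional leading 0) plus the extension."""
    match = re.search(r"0?(\d{3})", filename)
    if match is None:
        return filename
    return match.group(1) + "." + filename.split(".")[-1]

def duplicate_targets(filenames):
    """Return the target names that more than one source file maps to."""
    counts = Counter(target_name(name) for name in filenames)
    return sorted(name for name, count in counts.items() if count > 1)

names = ["EP609.m4v", "EP 609 (1080p).m4v", "One Piece - 621 1080P.mkv"]
print(duplicate_targets(names))  # ['609.m4v']
```

An empty list means every file would get a unique new name.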
(Original answer:)
p = re.compile(r'\d{3,4}')
for file in files:
    for m in p.finditer(file):
        ep = m.group(0)
        if int(ep) < 1000:
            print(ep.lstrip('0') + '.' + file.split('.')[-1])
            break # go to next file if ep found (avoid the else clause)
    else: # if ep not found, just print the filename as is
        print(file)
A program that parses the episode number from each file name and renames the file accordingly.
Modules used:
re - To parse File Name
os - To rename File Name
full/path/to/folder - is the path to the folder where your file lives
import re
import os

for file in os.listdir(path="full/path/to/folder/"):
    # searches for the first 3 or 4 digit number less than 1000 in each file name
    for match_obj in re.finditer(r'\d{3,4}', file):
        episode = match_obj.group(0)
        if int(episode) < 1000:
            new_filename = episode.lstrip('0') + '.' + file.split('.')[-1]
            old_name = "full/path/to/folder/" + file
            new_name = "full/path/to/folder/" + new_filename
            os.rename(old_name, new_name)
            # go to next file if ep found (avoid the else clause)
            break
    else:
        # if episode not found, just leave the filename as it is
        pass
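Since two files can map to the same episode number, it may be worth refusing to overwrite an existing target. The following Python 3 sketch shows one way to do that; rename_to_episode is a hypothetical helper, and the demonstration runs in a temporary directory rather than your real folder.

```python
import os
import re
import tempfile

def rename_to_episode(folder, filename):
    """Rename one file to '<episode><ext>'; return the new name, or None if skipped."""
    for match in re.finditer(r"\d{3,4}", filename):
        if int(match.group(0)) < 1000:
            extension = os.path.splitext(filename)[1]
            new_name = match.group(0).lstrip("0") + extension
            target = os.path.join(folder, new_name)
            if os.path.exists(target):
                return None  # do not overwrite an existing file
            os.rename(os.path.join(folder, filename), target)
            return new_name
    return None  # no episode number found

# Demonstration in a throwaway directory
with tempfile.TemporaryDirectory() as folder:
    open(os.path.join(folder, "EP 610.m4v"), "w").close()
    print(rename_to_episode(folder, "EP 610.m4v"))  # 610.m4v
```

A return value of None tells you which files were left alone, so they can be handled by hand.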
Problem solved! The mistake was writing newfilename[0, 3] instead of newfilename[0:3].
I know this question has been asked before, and I have looked through the answers and the kinds of problems people have had with this error message, but I was unable to find anyone with the same problem.
I am showing the whole method just in case. So here is my problem:
I am trying to get a substring of "newfilename" using newfilename[int, int], and the compiler keeps thinking I don't have an integer there when I do, at least as far as I can tell.
What I'm doing with this code: I am cutting off the end of a filename such as 'foo.txt' to get 'foo', which is saved as newfilename. Then I am adding the number (converted to a string) to the end of it to get 'foo 1', and after that adding back the '.txt' to get the final result of 'foo 1.txt'. The problem occurs when I try to get the substring and delete the last four characters of the filename to get just 'foo'. After that, I do another check to see if there is a file with that name still in the folder, and if so I do another round of cutting and pasting to add 1 to the previous number. To be honest, I have not tested whether the while loop works; I just thought it should work in principle, but my code does not reach that far because of this error.
My error:
File "C:/Users/Reaper/IdeaProjects/Curch Rec Managment/Setup.py", line 243, in moveFiles
print(newfilename[0, 3])
TypeError: string indices must be integers
NOTE: this error is from when I tried to hard-code the numbers to see if it would work.
Here is the current error with the hard code commented out:
newfilename = newfilename[0, int(newfilename.__len__() - 4)] + " 1.m4a"
TypeError: string indices must be integers
What I have tried: I have tried hard-coding the numbers by literally typing newfilename[0, 7] and still got the same error. I have tried doing this in a separate Python file, and it seems to work fine there. Also, what is really confusing me is that it works just fine in another part of my program, as shown here:
nyear = str(input("Enter new Year: "))
if nyear[0:2] != "20" or nyear.__len__() > 4:
print("Sorry incorrect year. Please try again")
So I have been at it for a while now trying to figure out what in the world is going on and can't get there. Decided I would sleep on it but would post the question just in case. If someone could point out what may be wrong that would be awesome! Or tell me the compilers are just being stupid, well I guess that will do as well.
My function code
def moveFiles(pathList, source, filenameList):
    # moves files to new location
    # counter keeps track of file name position in list
    cnter = 0
    for x in pathList:
        filename = filenameList[cnter]
        #print(x + "\\" + filename)
        # new filename
        if filename.find("PR") == 0:
            newfilename = filename[3:filename.__len__()]
        else:
            newfilename = filename[2:filename.__len__()]
        # checking if file exists and adding numbers to the end if it does
        if os.path.isfile(x + "\\" + newfilename):
            print("File Name exists!!")
            # adding a 1 to the end
            print(newfilename)
            # PROBLEM ON NEXT TWO LINES, also prob. on any line with the following calls
            print(newfilename[0, 3])
            newfilename = newfilename[0, int(newfilename.__len__() - 4)] + " 1.m4a"
            print("Adding 1:", newfilename)
            # once again check if the file exists and adding 1 to the last number
            while os.path.isfile(x + "\\" + newfilename):
                # me testing if maybe i just can't have math operations within the substring call
                print("File exists again!!")
                num = newfilename.__len__() - 6
                num2 = newfilename.__len__() - 4
                num3 = int(newfilename[num, num2])
                num = newfilename.__len__() - 5
                newfilename = newfilename[0, num] + str(num3 + 1)
                print("Adding 1:", newfilename)
        # moving file and deleting prefix
        if not os.path.isdir(x):
            os.makedirs(x)
        os.rename(source + "\\" + filename, x + "\\" + newfilename)
        cnter += 1
I think you need this:
print(newfilename[0:3])
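To spell out why: inside square brackets a comma builds a tuple, so newfilename[0, 3] asks for the index (0, 3), which a string cannot accept, whereas a colon builds a slice. The small Python 3 sketch below shows the difference; the os.path.splitext part is only a suggestion for the "foo 1.txt" step and is not taken from the question's code.

```python
import os

name = "foo.txt"

try:
    name[0, 3]          # the comma builds the tuple (0, 3)
except TypeError as error:
    print("TypeError:", error)

print(name[0:3])        # foo
print(name[:-4])        # foo

# Splitting off the extension avoids counting characters by hand
stem, extension = os.path.splitext(name)
print(stem + " 1" + extension)  # foo 1.txt
```

The same applies to every newfilename[a, b] in the function; each one needs a colon.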
I'm new to Python and working on a little program that copies all files with a given extension from a folder and its subfolders to another directory. Recently I added a simple progress bar and a counter of remaining files.
The problem is that when I run it from cmd and the counter goes from, say, 1000 to 999, cmd shows a zero in place of the last digit instead of a space. Moreover, when the program is finished, the remaining-files counter should be replaced by the word "Done.", and that doesn't work properly either.
I tried replacing sys.stdout.write with print and tried not using f-strings; the result is the same.
def show_progress_bar(total, counter=0, length=80):
    percent = round(100 * (counter / total))
    filled_length = int(length * counter // total)
    bar = '=' * filled_length + '-' * (length - filled_length)
    if counter < total:
        suffix = f'Files left: {total - counter}'
    else:
        suffix = 'Done.'
    sys.stdout.write(f'\rProgress: |{bar}| {percent}% {suffix}')
    sys.stdout.flush()

def selective_copy(source, destination, extension):
    global counter
    show_progress_bar(total)
    for foldername, subfolders, filenames in os.walk(source):
        for filename in filenames:
            if filename.endswith(extension):
                if not os.path.exists(os.path.join(destination, filename)):
                    shutil.copy(os.path.join(foldername, filename), os.path.join(destination, filename))
                else:
                    new_filename = f'{os.path.basename(foldername)}_{filename}'
                    shutil.copy(os.path.join(foldername, filename), os.path.join(destination, new_filename))
                counter += 1
                show_progress_bar(total, counter)
I expected that the output in cmd will be the same as in the console, which is this:
Program running:
Progress: |=========-----------------------------------------------------------------------| 12% Files left: 976
Program finished:
Progress: |================================================================================| 100% Done.
But in the cmd I got this:
Program running:
Progress: |=========-----------------------------------------------------------------------| 12% Files left: 9760
Program finished:
Progress: |================================================================================| 100% Done. left: 100
Typically, printing "\r" will return the cursor to the beginning of the line, but it won't erase anything already written. So if you write "1000" followed by "\r" followed by "999", the last 0 of "1000" will still be visible.
(I'm not sure why this isn't happening in your Python console. Maybe it interprets "\r" in a different way. Hard to say without knowing exactly what software you're running.)
One solution is to print a couple of spaces after your output to ensure that slightly longer old messages get overwritten. You can probably get away with just one space for your "Files left:" suffix, since that only decreases by one character at most, but the "done" suffix will need more.
if counter < total:
    suffix = f'Files left: {total - counter} '
else:
    # pad with enough spaces to cover the longer "Files left: N" text
    suffix = 'Done.' + ' ' * 10
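Instead of guessing how many spaces are needed, you could remember the length of the previous message and pad to it. This is a minimal sketch of that idea, not a drop-in replacement for show_progress_bar; the helper name padded and the sample values are made up for illustration.

```python
import sys

def padded(text, previous_length):
    """Pad text with spaces so it fully covers a previous line of the given length."""
    return text.ljust(previous_length)

previous_length = 0
for left in (1000, 999, 0):
    text = "Files left: %d" % left if left else "Done."
    line = padded(text, previous_length)
    sys.stdout.write("\r" + line)
    sys.stdout.flush()
    previous_length = len(text)
sys.stdout.write("\n")
```

With this approach the padding adapts on its own if the message wording changes later.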
I am writing a Python 2 program to find a file. This program should print each directory it searches at each iteration of the search, but always to the same line in the terminal (i.e. by erasing the text that is already there and moving the cursor to the beginning of the line before printing again.)
This is the code I have so far:
import os
import sys

for root, dirs, files in os.walk("/"):
    print root +'\r',
    print '\x1b[2K\r',
My problem is that it starts each printout (whenever it changes directory) on a new line; in other words, it doesn't reuse the old line.
How can I ensure all printed output goes to a single line in the terminal?
You need to flush the stdout buffer (whether this is required depends on the terminal) and pad the line with whitespace. For example:
import os
import sys
import time

path = "/" # directory to search, as in the question
for root, dirs, files in os.walk(path):
    print "%-80s\r" % (root),
    sys.stdout.flush()
    time.sleep(1) # For testing
This assumes an arbitrary maximum path length of 80 characters.
EDIT:
This new solution uses curses, which is part of the standard library:
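import curses
import os
import time

path = "/" # directory to search, as in the question
win = curses.initscr()
try:
    for root, dirs, files in os.walk(path):
        win.clear()
        win.addstr(0, 0, root)
        win.refresh()
        time.sleep(1) # For testing purposes
finally:
    curses.endwin() # always restore the terminal, even after an error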
This should do it.
for root, dirs, files in os.walk(path):
    print '\r', root,
The \r tells the terminal to move the cursor back to the beginning of the current line, like the carriage return on old typewriters.
You might want to pad with spaces to erase the rest of the line, if the current path is shorter than the previous path.
If the text is longer than one line, it will still overflow to the next line.
You need to shorten your output to under the terminal limit.
You could just truncate and put ellipsis at the front:
limit = 30 # for example
message = 'ABCDEFGHIJKLMNOPQRSTUVWX' * 4
if len(message) > limit:
    message = '...' + message[-limit+3:]
print message # ...VWXABCDEFGHIJKLMNOPQRSTUVWX
If you want to replace the middle with ..., then you could do:
limit = 30 # for example
message = 'ABCDEFGHIJKLMNOPQRSTUVWX' * 4
length = len(message) # will be 96
if length > limit:
    message = list(message)
    cut_size = length - limit
    start_cut = (length - cut_size) // 2 # integer division, needed for slicing
    message[start_cut:start_cut + cut_size + 3] = '...'
    message = ''.join(message)
print message # ABCDEFGHIJKLMNO...MNOPQRSTUVWX
Inspired by several ideas from here and there, this works for me well:
import os
import sys
import time # only if you use sleep() function for debugging

top_folder = "/"
max_line_length = 80
for root, dirs, files in os.walk(top_folder):
    message = root
    # truncate if the path is longer than what you want it to be
    if len(message) > max_line_length:
        message = '[...]' + message[-max_line_length+5:]
    # prepare the output string of a length determined by a variable
    output_string = '{0: <' + str(max_line_length) + '}\r' # \r = carriage return
    # output
    print output_string.format(message), # the comma is crucial here
    # to see it in action in slow-motion
    time.sleep(.4)
The last 2 code lines before the sleep() function line could be combined into one line:
print '{msg: <{width}}\r'.format(msg = message, width = max_line_length),
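For anyone on Python 3, the same truncate-and-pad idea might look like the sketch below. The helper name status_line and the sample paths are hypothetical; shutil.get_terminal_size() is used so the width follows the actual terminal instead of a fixed 80, and the sketch assumes the terminal is wider than the 5-character "[...]" marker.

```python
import shutil

def status_line(path, width):
    """Truncate path from the left to fit width, then pad it to exactly width."""
    if len(path) > width:
        path = "[...]" + path[-(width - 5):]
    return path.ljust(width)

# Leave one column free so the cursor does not wrap on some terminals
width = shutil.get_terminal_size().columns - 1
for path in ("/usr/share/doc", "/usr", "/"):
    print("\r" + status_line(path, width), end="", flush=True)
print()
```

Here end="" plays the role of the trailing comma in the Python 2 print statement, and flush=True replaces the explicit sys.stdout.flush().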
I have a bunch of data in .txt file and I need it in a format that I can use in fusion tables/spreadsheet. I assume that that format would be a csv that I can write into another file that I can then import into a spreadsheet to work with.
The data is in this format with multiple entries separated by a blank line.
Start Time
8/18/14, 11:59 AM
Duration
15 min
Start Side
Left
Fed on Both Sides
No
Start Time
8/18/14, 8:59 AM
Duration
13 min
Start Side
Right
Fed on Both Sides
No
(etc.)
but I need it ultimately in this format (or whatever I can use to get it into a spreadsheet)
StartDate, StartTime, Duration, StartSide, FedOnBothSides
8/18/14, 11:59 AM, 15, Left, No
- , -, -, -, -
The problems I have come across are:
-I don't need all the info or every line, but I'm not sure how to automatically separate them. I don't even know if the way I am going about sorting each line is smart.
-I have been getting an error that says "argument 1 must be string or read-only character buffer, not list" when I use .read() or .readlines() sometimes (although it did work at first). Also, both of my arguments are .txt files.
-The dates and times are not in set formats with regular lengths (it has 8/4/14, 5:14 AM instead of 08/04/14, 05:14 AM), which I'm not sure how to deal with.
this is what I have tried so far
from sys import argv
from os.path import exists

def filework():
    script, from_file, to_file = argv
    print "copying from %s to %s" % (from_file, to_file)
    in_file = open(from_file)
    indata = in_file.readlines() #.read() .readline .readlines .read().splitline .xreadlines
    print "the input file is %d bytes long" % len(indata)
    print "does the output file exist? %r" % exists(to_file)
    print "ready, hit RETURN to continue, CTRL-C to abort."
    raw_input()
    #do stuff section----------------BEGIN
    for i in indata:
        if i == "Start Time":
            pass #do something
        elif i== '{date format}':
            pass #do something
        else:
            pass #do something
    #do stuff section----------------END
    out_file = open(to_file, 'w')
    out_file.write(indata)
    print "alright, all done."
    out_file.close()
    in_file.close()

filework()
So I'm relatively unversed in scripts like this that have multiple complex parts. Any help and suggestions would be greatly appreciated. Sorry if this is a jumble.
Thanks
This code should work, although it's not exactly optimal, but I'm sure you'll figure out how to make it better!
What this code basically does is:
Get all the lines from the input data
Loop through all the lines, and try to recognize different keys (the start time etc.)
If a key is recognized, get the line beneath it, and apply an appropriate function to it
If a blank line is found, add the current entry to a list, so that other entries can be read
Write the data to a file
In case you haven't seen string formatting being done this way before:
"{0:} {1:}".format(arg0, arg1): the {0:} is just a way of defining a placeholder for a variable (here: arg0), and the 0 just defines which argument to use.
Find out more here:
Python .format docs
Python OrderedDict docs
If you are using a version of Python < 2.7, you might have to install a backport of OrderedDict by using pip install ordereddict. If that doesn't work, just change data = OrderedDict() to data = {}, and it should work. The column order of the output may then differ between runs, but the data will still be correct.
from sys import argv
from os.path import exists

# since we want to have a somewhat standardized format
# and dicts are unordered by default
try:
    from collections import OrderedDict
except ImportError:
    # python 2.6 or earlier, use backport
    from ordereddict import OrderedDict

def get_time_and_date(time):
    date, time = time.split(",")
    time, time_indic = time.split()
    date = pad_time(date)
    time = "{0:} {1:}".format(pad_time(time), time_indic)
    return time, date

def pad_time(time):
    """
    Make all the time values look the same, e.g. turn 5:30 AM into 05:30 AM
    """
    # if it's a time
    if ":" in time:
        separator = ":"
    # if it's a date
    else:
        separator = "/"
    time = time.split(separator)
    for index, num in enumerate(time):
        if len(num) < 2:
            time[index] = "0" + time[index]
    return separator.join(time)

def filework():
    from_file, to_file = argv[1:]
    data = OrderedDict()
    print "copying from %s to %s" % (from_file, to_file)
    # by using "with open(...)" the file closes automatically
    with open(from_file, "r") as inputfile:
        indata = inputfile.readlines()
    entries = []
    print "the input file is %d lines long" % len(indata)
    print "does the output file exist? %r" % exists(to_file)
    print "ready, hit RETURN to continue, CTRL-C to abort."
    raw_input()
    for line_num in xrange(len(indata)):
        # make the entire string lowercase to be more flexible,
        # and then remove whitespace
        line_lowered = indata[line_num].lower().strip()
        if "start time" == line_lowered:
            time, date = get_time_and_date(indata[line_num+1].strip())
            data["StartTime"] = time
            data["StartDate"] = date
        elif "duration" == line_lowered:
            duration = indata[line_num+1].strip().split()
            # only keep the amount of minutes
            data["Duration"] = duration[0]
        elif "start side" == line_lowered:
            data["StartSide"] = indata[line_num+1].strip()
        elif "fed on both sides" == line_lowered:
            data["FedOnBothSides"] = indata[line_num+1].strip()
        elif line_lowered == "":
            # if a blank line is found, prepare for reading a new entry
            entries.append(data)
            data = OrderedDict()
    # store the last entry, which is not followed by a blank line
    entries.append(data)
    # create the outfile if it does not exist
    with open(to_file, "w+") as outfile:
        headers = entries[0].keys()
        outfile.write(", ".join(headers) + "\n")
        for entry in entries:
            outfile.write(", ".join(entry.values()) + "\n")

filework()
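As a side note, the "argument 1 must be string or read-only character buffer, not list" error in the question most likely comes from out_file.write(indata), because readlines() returns a list and write() expects a single string. For comparison, here is a rough Python 3 sketch of the same conversion using the standard csv module. The function name parse_entries, the FIELDS list and the inline sample are assumptions for illustration, and the sketch leaves dates and times unpadded.

```python
import csv
import io
import re

FIELDS = ["StartDate", "StartTime", "Duration", "StartSide", "FedOnBothSides"]

def parse_entries(text):
    """Turn blank-line-separated 'label / value' line pairs into a list of dicts."""
    entries = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        pairs = dict(zip(lines[0::2], lines[1::2]))  # label -> value
        if not pairs:
            continue
        date, _, time = pairs.get("Start Time", "").partition(", ")
        entries.append({
            "StartDate": date,
            "StartTime": time,
            "Duration": pairs.get("Duration", "").split(" ")[0],  # keep minutes only
            "StartSide": pairs.get("Start Side", ""),
            "FedOnBothSides": pairs.get("Fed on Both Sides", ""),
        })
    return entries

sample = """Start Time
8/18/14, 11:59 AM
Duration
15 min
Start Side
Left
Fed on Both Sides
No

Start Time
8/18/14, 8:59 AM
Duration
13 min
Start Side
Right
Fed on Both Sides
No
"""

output = io.StringIO()
writer = csv.DictWriter(output, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(parse_entries(sample))
print(output.getvalue())
```

To write a real file, the io.StringIO object could be replaced by open(to_file, "w", newline=""), which is the form the csv documentation recommends.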