Python Script to Change Folder Names - python

I'm on OS X and I'm fed up with the labeling system where I work. The labels are in mmddyy order and I think they should be yymmdd. Is there a way to write a script to do this? I understand a bit of Python, with lists and how to change the position of characters.
Any suggestions or tips?
What I have now:
083011-HalloweenBand
090311-ViolaClassRecital
090411-JazzBand
What I want:
110830-HalloweenBand
110903-ViolaClassRecital
110904-JazzBand
Thanks

Assuming the script is in the same directory as the files, and you already have the list of files you want to rename, you can do this:
import os

for filename in rename_list:
    # mmddyy-Name -> yymmdd-Name: move the two year digits to the front
    os.rename(filename, filename[4:6] + filename[:2] + filename[2:4] + filename[6:])

There is a Q&A with information on traversing directories with Python that you could modify to do this. The key function is os.walk(), but you'll need to add the appropriate calls to os.rename().
As a beginner, it is probably best to start by traversing the directories and printing the new directory names before attempting to change them. You should also make a backup and notify anyone who might care about this change before doing it.
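For example, a minimal dry-run sketch along those lines (path/to/data_directory is a placeholder, not a path from the question):

import os

root = "path/to/data_directory"  # placeholder; point this at your own tree

for dirpath, dirnames, filenames in os.walk(root):
    for name in dirnames:
        # Only touch names that start with a six-digit mmddyy date
        if len(name) >= 6 and name[:6].isdigit():
            new_name = name[4:6] + name[:2] + name[2:4] + name[6:]
            # Dry run: print the proposed rename instead of doing it
            print(os.path.join(dirpath, name) + " -> " + os.path.join(dirpath, new_name))

Once the printed pairs look right, swap the print for os.rename; walking with os.walk(root, topdown=False) renames the deepest directories first, so paths still being walked are not invalidated.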

I know you asked for Python, but I would do it from the shell. It's a simple one-liner.
ls | awk '{print "mv " $0 FS substr($1,5,2) substr($1,1,4) substr($1,7) }' | bash
I do not use OS X, but I think it uses a bash shell. You may need to substitute sh for bash, or gawk for awk.
What that line does is pipe the directory listing to awk, which prints "mv ", then $0 (the whole line), then a space (FS is the field separator, which defaults to a space), then three substrings that rearrange the name.
substr(s,c,n) returns the substring of string s starting from character position c, up to a maximum length of n characters. If n is not supplied, the rest of the string from c is returned.
Lastly, this is piped to the shell, allowing it to be executed. This works without problems on Ubuntu, and I use variations of this command quite a bit. A version of awk (awk, nawk, gawk) should be installed on OS X, which I believe uses bash.

Related

How can I concatenate multiple text or xml files but omit specific lines from each file?

I have a number of xml files (which can be considered as text files in this situation) that I wish to concatenate. Normally I think I could do something like this from a Linux command prompt or bash script:
cat somefile.xml someotherfile.xml adifferentfile.xml > out.txt
Except that in this case, I need to copy the first file in its entirety EXCEPT for the very last line, but in all subsequent files omit exactly the first four lines and the very last line (technically, I do need the last line from the last file but it is always the same, so I can easily add it with a separate statement).
In all these files the first four lines and the last line are always the same, but the contents in between varies. The names of the xml files can be hardcoded into the script or read from a separate data file, and the number of them may vary from time to time but always will number somewhere around 10-12.
I'm wondering what would be the easiest and most understandable way to do this. I think I would prefer either a bash script or maybe a python script, though I generally understand bash scripts a little better. What I can't get my head around is how to trim off just those first four lines (on all but the first file) and the last line of every file. My suspicion is there's some Linux command that can do this, but I have no idea what it would be. Any suggestions?
sed '$d' firstfile > out.txt
sed --separate '1,4d; $d' file2 file3 file4 >> out.txt
sed '1,4d' lastfile >> out.txt
It's important to use the --separate (or shorter -s) option so that the range statements 1,4 and $ apply to each file individually.
From the GNU sed manual:
-s, --separate
    By default, sed will consider the files specified on the command line as a single continuous long stream. This GNU sed extension allows the user to consider them as separate files.
Do it in two steps:
1. Use the head command to get the lines you want.
2. Use cat to combine them.
You could use temp files or bash trickery.
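Since the question says a Python script would also be acceptable, here is a minimal sketch of the same trimming logic (the file names are the ones from the question; adjust as needed):

# Concatenate the files, dropping the last line of every file and the
# first four lines of every file except the first.
files = ["somefile.xml", "someotherfile.xml", "adifferentfile.xml"]

with open("out.txt", "w") as out:
    for i, name in enumerate(files):
        with open(name) as f:
            lines = f.readlines()
        start = 0 if i == 0 else 4       # skip the 4-line header on all but the first
        out.writelines(lines[start:-1])  # always drop the last line
    # The closing line is identical in every file, so append it once at the end
    with open(files[-1]) as f:
        out.write(f.readlines()[-1])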

Bash/Python to gather several paths to one and replacing the file name by '*' character

I'm writing a bash script to generate a .spec file (for building an RPM) automatically. I read all the files in the directory (which I hope to turn into an RPM package) and write the paths of the files to be installed into the .spec file, and I realize that I need to shorten them. An example:
/tmp/a/1.jpg
/tmp/a/2.conf
/tmp/a/b/srf.cfg
/tmp/a/b/ssp.cfg
/tmp/a/conf_web_16.2/c/.htaccess
/tmp/a/conf_web_16.2/c/.htaccess.WebProv
/tmp/a/conf_web_16.2/c/.htprofiles
=> What I want to get:
/tmp/a/*.jpg
/tmp/a/*.conf
/tmp/a/b/*.cfg
/tmp/a/conf_web_16.2/c/*
/tmp/a/conf_web_16.2/c/*.WebProv
Please give me some advice about my problem. I'd welcome ideas in bash, Python, or C. Thank you in advance.
To convert any file name which contains a dot somewhere other than in the first character into a wildcard covering everything up to just before the dot, and any remaining file names to just a bare wildcard:
sed -e 's%/[^/][^/]*\(\.[^./]*\)$%/*\1%' -e t -e 's%/[^/]*$%/*%'
The behavior of sed is to read its input one line at a time, and execute the script of commands on each in turn. The s%foo%bar% substitution command replaces a regex match with a string, and the t command causes the script to skip further substitutions if one was already performed on the current line. (I'm simplifying somewhat.) The first regex matches file names which contain a dot in a position other than the first, and captures the match from the dot through the end in a back reference which is used in the substitution as well (that's the \1). The second is applied to any remaining file names, because of the t command in between.
The result will probably need to be piped to sort -u to remove any duplicates.
If you don't have a list of the file names, you can use find to pipe in a listing.
find . -type f | sed ... | sort -u
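If you'd rather do it in Python (the question allows it), the same two-step substitution can be sketched with the re module, reading paths on standard input, e.g. from find . -type f:

import re
import sys

seen = set()
for line in sys.stdin:
    path = line.rstrip("\n")
    # First try: keep the extension, wildcard the rest of the last component
    new, n = re.subn(r"/[^/]+(\.[^./]*)$", r"/*\1", path)
    if n == 0:
        # No dot past the first character: collapse the component to a bare *
        new = re.sub(r"/[^/]*$", "/*", path)
    if new not in seen:  # drop duplicates (the sed version pipes to sort -u)
        seen.add(new)
        print(new)

The subn counter plays the role of sed's t command here: the second substitution only runs if the first one did not match.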

Ghostscript destination name with blank space returns error [duplicate]

I have a main script which uses a properties file with variables pointing to paths (from the main script I source it).
The properties file looks like this:
TMP_PATH=/$COMPANY/someProject/tmp
OUTPUT_PATH=/$COMPANY/someProject/output
SOME_PATH=/$COMPANY/someProject/some path
The problem is SOME_PATH: I must use a path with spaces (I can't change it).
I tried escaping the whitespace and using quotes, but no solution so far.
I edited the paths; the problem with single quotes is that I'm using another variable, $COMPANY, in the path.
Use one of these three variants:
SOME_PATH="/mnt/someProject/some path"
SOME_PATH='/mnt/someProject/some path'
SOME_PATH=/mnt/someProject/some\ path
I see, Federico, you've found the solution by yourself.
The problem was in two places. Assignments need proper quoting; in your case
SOME_PATH="/$COMPANY/someProject/some path"
is one possible solution.
But the shell does not store those quotes in memory, so when you want to use this variable, you need to quote it again, for example:
NEW_VAR="$SOME_PATH"
because if you don't, the space is subject to word splitting at the command level, like this:
NEW_VAR=/YourCompany/someProject/some path
which is not what you want.
For more info you can check out my article about it http://www.cofoh.com/white-shell
You can escape the "space" char by putting a \ right before it.
SOME_PATH=/mnt/someProject/some\ path
should work
If the file contains only parameter assignments, you can use the following loop in place of sourcing it:
# Instead of source file.txt
while IFS="=" read -r name value; do
    declare "$name=$value"
done < file.txt
This saves you having to quote anything in the file, and is also more secure, as you don't risk executing arbitrary code from file.txt.
If the path in Ubuntu is "/home/ec2-user/Name of Directory", then do this:
1) Java's build.properties file:
build_path='/home/ec2-user/Name\\ of\\ Directory'
Where ~/ is equal to /home/ec2-user
2) Jenkinsfile:
build_path=buildprops['build_path']
echo "Build path= ${build_path}"
sh "cd ${build_path}"

Star in sys.argv in python

I am attempting to write a script that utilises sys.argv to wrap the scp command. The idea is that you will be able to run pyscp folder/* host, but if I run this script with those arguments:
import sys
for arg in sys.argv:
    print arg
I get a list of all the folders inside folder:
pyscp.py
folder/0
folder/1
folder/2
folder/3
folder/4
folder/5
folder/6
folder/7
folder/8
folder/9
host
Assuming a UNIXoid operating system: The shell is expanding the * into the matching files. Try to call your script like
pyscp "folder/*" host
The quotes keep the shell from interpreting the * character.
If you do not escape the asterisk, the shell performs filename expansion for you: the pattern including the asterisk is replaced with an alphabetically sorted list of matching file names before your Python program is executed. You can prevent the shell from performing filename expansion using e.g. single quotes, i.e.
pyscp 'folder/*' hostname
You can then do this yourself within Python using the glob module and control things the way you want it.
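A minimal sketch of that approach (printing the expansion rather than copying anything):

import sys
from glob import glob

pattern = sys.argv[1]          # e.g. 'folder/*', kept literal by the quotes
host = sys.argv[2]

files = sorted(glob(pattern))  # expand the pattern ourselves
print(files)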
The shell is expanding the file list for you. You can leverage this by allowing multiple parameters in the command.
import sys
files = sys.argv[1:-1]
host = sys.argv[-1]
Now you have a more flexible program that lets the caller jump through whatever hoops they want for the transfer, like maybe all text files in folder1 plus anything that's changed in the last day in folder2 (on a Linux machine):
pyscp folder1/*.txt `find -mtime -1` example.com
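Putting that together, a sketch of how the wrapper might hand the expanded list to scp (the bare host: destination, meaning the remote home directory, is an assumption):

import subprocess
import sys

files = sys.argv[1:-1]  # everything between the script name and the host
host = sys.argv[-1]

# The shell has already expanded any globs; pass the list straight to scp
subprocess.call(["scp"] + files + [host + ":"])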

Problems with running a python script over many files

I am on a Linux (Ubuntu 11.10) machine, using the Bourne-again shell (bash).
I have to process a directory full of files with a python script. My colleague wrote the python script and I have successfully used it before on one file at a time. It takes two arguments: a path to the file to be processed enclosed in quotes and a secondary argument called -min which requires an integer. Also, the script writes to standard out.
From my experience of shell scripting and following others on this forum, I used the following method to iterate over the directory of files:
for f in path/to/data_directory/*; do
path/to/pythonscript.py $f -min 1 > path/to/out_directory/$f;
done
I get the desired file names in the out_directory, and the content of each is something only the python script could have written. That is, the above for loop successfully passes the files to the script. However, the content of each file is completely wrong (the computation the script performed was wrong). When I run the python script on one of the files in the data_directory by itself, the output file has the correct content (the computation performed by the script is correct).
The thing that makes it more complex is that the same shell method (the for loop) works perfectly in the Mac OS X my colleague has.
Where is the issue? Am I missing something very fundamental about Linux shells? Maybe it's a syntax error?
Any help will be appreciated.
Update: I just ran the for loop again but instead of pointing it to the data_directory of files, I pointed it to a file within the data_directory. I had the same problem - the python script did not compute the correct result.
The only problem I see is that filenames may contain white-space - so you should quote filenames:
for f in path/to/data_directory/*; do
path/to/pythonscript.py "$f" -min 1 > "path/to/out_directory/$f"
done
Well, I don't know if this helps, but:
path/to/pythonscript.py $f -min 1 > path/to/out_directory/$f
substitutes out to
path/to/pythonscript.py path/to/data_directory/myfile -min 1 > path/to/out_directory/path/to/data_directory/myfile
The script should be:
cd path/to/data_directory
for f in *; do
    path/to/pythonscript.py "$f" -min 1 > path/to/out_directory/"$f"
done
What version of bash are you running?
What do you get if you run this script?
cd path/to/data_directory
for f in *; do
    echo "$f" > /tmp/"$f"
done
Of course, that should give you a bunch of files containing their own file names.
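If the shell loop keeps misbehaving, a small Python driver is another option; a sketch, assuming pythonscript.py is executable and takes the arguments described in the question:

import os
import subprocess

data_dir = "path/to/data_directory"
out_dir = "path/to/out_directory"

for name in sorted(os.listdir(data_dir)):
    src = os.path.join(data_dir, name)
    # Name the output file after the input file itself, not its full path
    with open(os.path.join(out_dir, name), "w") as out:
        subprocess.call(["path/to/pythonscript.py", src, "-min", "1"], stdout=out)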
