How can I rename every file that matches a regex? - python

I want to rename filenames of the form xyz.ogg.mp3 to xyz.mp3.
I have a regex that looks for .ogg in each file name and replaces it with an empty string, but I get the following error:
Traceback (most recent call last):
File ".\New Text Document.py", line 7, in <module>
os.rename(files, '')
TypeError: rename() argument 1 must be string, not _sre.SRE_Match
Here is what I tried:
for file in os.listdir("./"):
    if file.endswith(".mp3"):
        files = re.search('.ogg', file)
        os.rename(files, '')
How can I make this loop look for every .ogg in each file name and replace it with an empty string?
The file names look like this: audiofile.ogg.mp3

You can do something like this:
import os

for file in os.listdir("./"):
    # Only touch names such as xyz.ogg.mp3
    if file.endswith(".mp3") and '.ogg' in file:
        os.rename(file, file.replace('.ogg', ''))
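If you do want a regular expression, as in the question: re.search returns a match object rather than a string, which is why os.rename raised the TypeError. Here is a minimal sketch that uses re.sub instead. The helper names strip_ogg and rename_ogg_mp3 are made up for illustration, and the pattern assumes .ogg always sits directly before the final .mp3:

```python
import os
import re

# A literal ".ogg" that is immediately followed by ".mp3" at the end of the name.
OGG_BEFORE_MP3 = re.compile(r'\.ogg(?=\.mp3$)')


def strip_ogg(name):
    """Return name with the '.ogg' before a trailing '.mp3' removed (hypothetical helper)."""
    return OGG_BEFORE_MP3.sub('', name)


def rename_ogg_mp3(directory='.'):
    """Rename every xyz.ogg.mp3 in directory to xyz.mp3 (hypothetical helper)."""
    for name in os.listdir(directory):
        new_name = strip_ogg(name)
        if new_name != name:
            # os.listdir returns bare names, so join them with the directory.
            os.rename(os.path.join(directory, name),
                      os.path.join(directory, new_name))
```

Be aware that if xyz.mp3 already exists, os.rename replaces it silently on Unix and raises an error on Windows, so you may want to check for that first.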

It would be much quicker to do this from the command line:
rename 's/\.ogg//' *.ogg.mp3
(This uses Perl's rename.)

An example using Python 3's pathlib (but not regular expressions, as it's kind of overkill for the stated problem):
from pathlib import Path

for path in Path('.').glob('*.mp3'):
    if '.ogg' in path.stem:
        new_name = path.name.replace('.ogg', '')
        path.rename(path.with_name(new_name))
A few notes:
Path('.') gives you a Path object pointing to the current working directory
Path.glob() searches only the given directory with this pattern (use Path.rglob() or a '**/' prefix to search recursively), and the * there is a wildcard (so you get anything ending in .mp3)
Path.stem gives you the file name minus the final extension (so if your path were /foo/bar/baz.bang, the stem would be baz)
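To make the notes above concrete, here is a small illustration of the attributes involved, applied to a name like the one in the question. It uses PurePosixPath only so that it behaves the same on any platform; the path itself is made up:

```python
from pathlib import PurePosixPath

p = PurePosixPath('/foo/bar/audiofile.ogg.mp3')

print(p.name)      # audiofile.ogg.mp3
print(p.stem)      # audiofile.ogg  (only the last extension is dropped)
print(p.suffix)    # .mp3
print(p.suffixes)  # ['.ogg', '.mp3']
```

This is why checking '.ogg' in path.stem works: the stem still contains the inner extension.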


Python Error:CSV File Error, Read/Print CSV file

I am getting this error when trying to print the contents of a CSV file in Python.
Traceback (most recent call last):
File "/Users/cassandracampbell/Library/Preferences/PyCharmCE2018.2/scratches/Player.py", line 5, in
with open('player.csv') as csvfile:
FileNotFoundError: [Errno 2] No such file or directory: 'player.csv'
Get the exact file path of the CSV. If you are on Windows, use the entire folder path followed by the file name, and then do:
with open(r'C:\users\path\player.csv') as csvfile:
If you're on Windows and using the exact path, it is easiest to put r before the path as I did: that makes it a raw string literal, so the backslashes in the path are not interpreted as escape sequences and the path can be found.
Put player.csv in the same directory as your script Player.py. A relative name such as 'player.csv' is resolved against the current working directory, which is that directory when you run the script from there.
In your case, both files should be here: /Users/cassandracampbell/Library/Preferences/PyCharmCE2018.2/scratches/
Or you can give the full path to player.csv, for example:
with open("/Users/cassandracampbell/Library/Preferences/PyCharmCE2018.2/scratches/player.csv") as csvfile:
    ...
Check that the file in question is in the same directory as the .py file you're working on. If that doesn't work, you might want to use the full path to the file:
with open('/Users/cassandracampbell/Library/Preferences/PyCharmCE2018.2/scratches/player.csv') as csvfile:
Also check the name of the file itself, because it can be case sensitive: if the file is called Player.csv, opening player.csv in all lower case may not work.
Also, I don't know what you're trying to do with your CSV file. Do you just want to print the raw content? You might like using pandas.
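The advice in these answers can be combined into one small sketch: build the path relative to the script rather than the working directory, and fail with a message that shows where Python actually looked. The helper names beside_script and read_rows are hypothetical, not part of any library:

```python
import csv
from pathlib import Path


def beside_script(script_file, name):
    """Build the path of a file that sits next to the given script (hypothetical helper)."""
    return Path(script_file).resolve().parent / name


def read_rows(csv_path):
    """Return all rows of a CSV file as lists of strings (hypothetical helper)."""
    csv_path = Path(csv_path)
    if not csv_path.is_file():
        # Include the working directory, since relative paths are resolved against it.
        raise FileNotFoundError(
            f"{csv_path} not found (current working directory: {Path.cwd()})")
    with csv_path.open(newline='') as csvfile:
        return list(csv.reader(csvfile))
```

Inside Player.py you could then write for row in read_rows(beside_script(__file__, 'player.csv')): print(row), which works no matter which directory the script is started from.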

itertools ChainFromInterables Backslash Normalisation?

I posted a question a while back:
Writing List To Text File
I was able to write a list of files from a dictionary into a text file.
So my code looks like this:
simonDuplicates = chain.from_iterable([files for files in file_dict.values() if len(files) > 1])
text_file.write("Duplicates Files:%s" % '\n'.join(simonDuplicates))
This basically prints out directories and files, for example:
C:/Users/Simon/Desktop/myfile.jpg
The problem is the forward slashes. I want them to be backslashes (as used in Windows). I tried using os.path.normpath, but it doesn't work:
simonDuplicates = chain.from_iterable([files for files in file_dict.values() if len(files) > 1])
os.path.normpath(simonDuplicates)
text_file.write("Duplicates Files:%s" % '\n'.join(simonDuplicates))
I get the following error:
Traceback (most recent call last):
File "duff.py", line 125, in <module>
os.path.normpath(duplicates)
File "C:\Python27\lib\ntpath.py", line 402, in normpath
path = path.replace("/", "\\")
AttributeError: 'itertools.chain' object has no attribute 'replace'
I think it doesn't work because I should be using iSlice?
Martijn Pieters' answer to this question looks about right:
Troubleshooting 'itertools.chain' object has no attribute '__getitem__'
Any suggestions?
You are applying os.path.normpath to the result of chain.from_iterable, which is an iterator (an itertools.chain object), not a string. normpath works on one path string at a time, so you have to iterate over the chain and normalize each filename individually.
You can use a list comprehension like this:
simonDuplicates = [os.path.normpath(path) for path in chain.from_iterable(files for files in file_dict.values() if len(files) > 1)]
Or you can use the map function like this (on Python 2, which your traceback shows, map returns a list; on Python 3 it returns a lazy iterator, which '\n'.join accepts just as well):
simonDuplicates = map(os.path.normpath, chain.from_iterable(files for files in file_dict.values() if len(files) > 1))
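Here is a self-contained sketch of the list-comprehension approach. The contents of file_dict are invented sample data, and ntpath.normpath is used in place of os.path.normpath only so the backslash conversion can be seen on any platform (on Windows, os.path is ntpath):

```python
import ntpath
from itertools import chain

# Invented stand-in for file_dict: each key maps to a list of paths.
file_dict = {
    'hash1': ['C:/Users/Simon/Desktop/myfile.jpg',
              'C:/Users/Simon/Pictures/myfile.jpg'],
    'hash2': ['C:/Users/Simon/Desktop/unique.txt'],
}

# Normalize each path string individually while flattening the lists.
duplicates = [
    ntpath.normpath(path)
    for path in chain.from_iterable(
        files for files in file_dict.values() if len(files) > 1)
]

print("Duplicate files:\n%s" % '\n'.join(duplicates))
```

Only the two paths under 'hash1' are printed, each with backslashes, because 'hash2' has a single entry.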

Ignore bracket in csv file python

I wrote a python script for a friend that:
takes a CSV of photos she's been cataloging that has the name of the photos in an ordered list
finds the image files on the filesystem
matches the files in the csv with files on the system
copies the images on the filesystem to a folder with a figure name in the order the files appear in the CSV
So essentially, it does:
INPUT: myphoto1.tiff, mypainting.jpeg, myphoto9.jpg, orderedlist.csv
OUTPUT: fig001.jpg, fig002.tiff, fig003.jpeg
This code is going to run on a mac. This works fine except we ran into an issue where some of the files (all by the same photographer) have 1 bracket in them, e.g.
myphoto[fromitaly.jpg
This seems to break my regular expression search:
The relevant code:
orderedpaths = [path for item in target for path in filenames if re.search(item, path)]
Where filenames is a list of the photo files on the system and target is the list from the CSV. This code is supposed to match the CSV file name (and its position in the list) to the filename, to give an ordered list of the filenames on the system.
The error:
Traceback (most recent call last):
File "renameimages.py", line 43, in <module>
orderedpaths = [path for item in target for path in filenames if re.search(item, path)]
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 142, in search
return _compile(pattern, flags).search(string)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 244, in _compile
raise error, v # invalid expression
sre_constants.error: unexpected end of regular expression
I tried or considered:
Changing the filenames/csv, but this isn't scalable and ideally her department will be using this script more in the future
Investigating treating the files as "raw" -- but it didn't seem like that was possible for input from CSV
Deleting the [ character from the input, but the problem is that then the input won't match the actual files on the system.
I suppose I should mention I only suspect this was the issue: by printing out the progress of the code, it appears as if the code gets to the CSV item with the bracket and errors.
The relevant code is the part where you build a regular expression from user input without sanitizing it. You should not do that.
I believe you don't need to use regular expressions at all. You can find the matching string using if item in path or path.endswith(item) or something like that.
The best option is to use the standard library:
from os.path import basename
orderedpaths = [ ... if basename(path) == item]
If you insist on using REs, you should escape your input using re.escape():
orderedpaths = [path for item in target for path in filenames
                if re.search(re.escape(item), path)]
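The following sketch shows both suggestions side by side on made-up data. The contents of target and filenames are invented, using the bracketed name from the question:

```python
import re
from os.path import basename

# Invented sample data: names from the CSV (in order) and files found on disk.
target = ['myphoto[fromitaly.jpg', 'mypainting.jpeg']
filenames = ['/photos/mypainting.jpeg', '/photos/b/myphoto[fromitaly.jpg']

# Using the raw name as a pattern fails: "[" opens a character set that never closes.
try:
    re.search(target[0], filenames[1])
except re.error as exc:
    print('unescaped pattern fails:', exc)

# Option 1: escape the name so every character is matched literally.
by_regex = [path for item in target for path in filenames
            if re.search(re.escape(item), path)]

# Option 2: skip regular expressions and compare base names exactly.
by_name = [path for item in target for path in filenames
           if basename(path) == item]
```

Both lists come out in CSV order. The basename comparison is stricter, since it will not match a name that merely appears somewhere inside a longer path.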

Reading results from glob into a python function

I am trying to automate some plotting using python and fortran together.
I am very close to getting it to work, but I'm having problems getting the result from a glob search to feed into my python function.
I have a .py script that says
import glob
run=glob.glob('JUN*.aijE*.nc')
from plot_check import plot_check
plot_check(run)
But I am getting this error
plot_check(run)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "plot_check.py", line 7, in plot_check
ncfile=Dataset(run,'r')
File "netCDF4.pyx", line 1328, in netCDF4.Dataset.__init__ (netCDF4.c:6336)
RuntimeError: No such file or directory
I checked that the glob is doing its job and it is, but I think it's the format of my variable "run" that's screwing me up.
In python:
>>> run
['JUN3103.aijE01Ccek0kA.nc']
>>> type(run)
<type 'list'>
So my glob is finding the file name of the file I want to put into my function, but something isn't quite working when I try to input the variable "run" in to my function "plot_check".
I think it might be something to do with the format of my variable "run", but I'm not quite sure how to fix it.
Any help would be greatly appreciated!
glob.glob returns a list of all matching filenames. If you know there's always going to be exactly one file, you can just grab the first element:
filenames = glob.glob('JUN*.aijE*.nc')
plot_check(filenames[0])
Or, if it might match more than one file, then iterate over the results:
filenames = glob.glob('JUN*.aijE*.nc')
for filename in filenames:
    plot_check(filename)
Perhaps Dataset expects to be passed a single string filename, rather than a list with one element?
Try using run[0] instead (though you may want to check to make sure your glob actually matches a file before you do that).
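If you expect exactly one match, it can help to check that before indexing, so an empty or ambiguous glob gives a clear error instead of an IndexError or the wrong file. A minimal sketch; single_match is a hypothetical helper name:

```python
import glob


def single_match(pattern):
    """Return the one filename matching pattern, or raise if there are none or several."""
    matches = sorted(glob.glob(pattern))
    if len(matches) != 1:
        raise ValueError("expected exactly one match for %r, got %d: %r"
                         % (pattern, len(matches), matches))
    return matches[0]
```

With this in place, the call from the question becomes plot_check(single_match('JUN*.aijE*.nc')).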

extraction of file from filepath

I need to extract the file name without any extensions.
For example:
/home/si/text.txt
/home/si/text.vx.txt
In both cases the output should be just text. I don't know how many trailing extensions a file can have, but I only need the base name. I have tried splitext(filename)[0], but it gave me text.vx rather than text.
This should work for your needs:
from os.path import basename
print(basename("/home/si/text.vx.txt").split('.')[0])
# prints: text
I use the split function after getting the file name:
filename.split('.')[0]
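The same idea can be wrapped in a small function and checked against both paths from the question. The name base_without_extensions is made up, and PurePosixPath is used only so the example treats / as the separator on any platform:

```python
from pathlib import PurePosixPath


def base_without_extensions(path):
    """Return the file name up to its first dot (hypothetical helper)."""
    return PurePosixPath(path).name.split('.')[0]


print(base_without_extensions('/home/si/text.txt'))     # text
print(base_without_extensions('/home/si/text.vx.txt'))  # text
```

Note that this cuts at the first dot, so a name that starts with a dot (such as .bashrc) yields an empty string, and a name like v1.2-notes.txt yields v1.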
