Python: moving files and checking duplicate names

I'm trying to write code that moves files from one folder to another.
For instance, I have files named 0001.jpg, 0002.jpg ... and so on in the /test1/ folder, and I want to move those files to the /test3/ folder if the same file name doesn't exist in /test2/.
So, if there's a 0001.jpg in both /test1/ and /test2/, the file in /test1/ won't be moved to /test3/, but if there's a 0002.jpg in /test1/ and not in /test2/, it is moved to /test3/.
I've tried to write the code on my own, but it doesn't work.
Can you please help with this?
Thanks in advance!
import os
import shutil
def Move_files(root_path, refer_path, out_path):
    root_path_list = [file for file in os.listdir(root_path)]
    refer_path_list = [file for file in os.listdir(refer_path)]
    for file in root_path_list:
        if refer_path_list in root_path_list:
            shutil.move(os.path.join(os.listdir(root_path, file)), os.path.join(os.listdir(refer_path, file)))
if __name__ == '__main__':
    Move_files("D:\\Dataset\\test1", "D:\\Dataset\\test2", "D:\\Dataset\\test3")

You can check whether the file exists in the other directory with os.path.exists, and move it only if it does not already exist in /test2/:
if not os.path.exists(os.path.join(refer_path, file)):
    shutil.move(os.path.join(root_path, file), os.path.join(out_path, file))
Also, os.listdir accepts only one argument: the path of the directory whose files you want to list. That is why the shutil.move call from your code has to build its paths with os.path.join(root_path, file) and os.path.join(out_path, file), as above.

Try this:
import os
import shutil
def Move_files(root_path, refer_path, out_path):
    root_path_list = os.listdir(root_path)
    refer_path_list = os.listdir(refer_path)
    for file in root_path_list:
        # move the file only if no file with the same name exists in refer_path
        if file not in refer_path_list:
            shutil.move(os.path.join(root_path, file), os.path.join(out_path, file))
if __name__ == '__main__':
    Move_files("D:\\Dataset\\test1", "D:\\Dataset\\test2", "D:\\Dataset\\test3")

You can use a set to find the difference between the two file lists. I added an isfile check to ignore subdirectories, and since shutil.move accepts a target directory, there is no need to build the target file name.
import os
import shutil
def Move_files(root_path, refer_path, out_path):
    # os.listdir returns bare names, so join them with the folder before testing
    root_files = set(filename for filename in os.listdir(root_path)
                     if os.path.isfile(os.path.join(root_path, filename)))
    refer_files = set(filename for filename in os.listdir(refer_path)
                      if os.path.isfile(os.path.join(refer_path, filename)))
    # names that exist in root_path but not in refer_path
    move_files = root_files - refer_files
    for file in move_files:
        shutil.move(os.path.join(root_path, file), out_path)
if __name__ == '__main__':
    Move_files("D:\\Dataset\\test1", "D:\\Dataset\\test2", "D:\\Dataset\\test3")
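Putting the answers above together, here is a minimal sketch that returns the names it moved so the result can be checked. The function name move_unmatched_files, the return value, and the choice to create the output folder when it is missing are assumptions made for illustration; they are not part of the original posts.

```python
import os
import shutil


def move_unmatched_files(root_path, refer_path, out_path):
    """Move files from root_path to out_path unless refer_path has the same name.

    Returns the sorted list of file names that were moved.
    """
    # os.listdir returns bare names, so test each entry with its full path.
    root_files = {
        name for name in os.listdir(root_path)
        if os.path.isfile(os.path.join(root_path, name))
    }
    refer_files = {
        name for name in os.listdir(refer_path)
        if os.path.isfile(os.path.join(refer_path, name))
    }
    # Assumption: the output folder may not exist yet, so create it.
    os.makedirs(out_path, exist_ok=True)
    moved = sorted(root_files - refer_files)
    for name in moved:
        shutil.move(os.path.join(root_path, name), os.path.join(out_path, name))
    return moved
```

With the folders from the question, the call would be move_unmatched_files("D:\\Dataset\\test1", "D:\\Dataset\\test2", "D:\\Dataset\\test3").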

Related

Loop through each subdirectory in a main directory and run code against each file using OS

Essentially what I'm trying to do is loop through a directory that contains multiple sub-directories and within those run code against each file in a for loop.
So far I have only managed to list the directories. I have rarely used os, so I'm not sure whether I could combine os.chdir with a bit of f-string formatting to loop through each subdirectory.
The files I want to run code against are just txt files.
Here is my code so far:
import os
for folders in os.listdir('../main_directory'):
    for something in os.listdir(f'{folders}'):
        # run some function of sorts
        pass
Any help would be greatly appreciated.
I like using pure os:
import os
src = "path/to/your/dir"
for fname in os.listdir(src):
    # build the path to the folder
    folder_path = os.path.join(src, fname)
    if os.path.isdir(folder_path):
        # we are sure this is a folder; now let's iterate over it
        for file_name in os.listdir(folder_path):
            file_path = os.path.join(folder_path, file_name)
            # now you can apply any function, assuming it is a file,
            # or double-check with `os.path.isfile(file_path)` if needed
Note that this code only iterates over the folder given at src and one level below it:
src/foo.txt # this file is ignored
src/foo/a.txt # this file is processed
src/foo/foo_2/b.txt # this file is ignored; too deep.
src/foo/foo_2/foo_3/c.txt # this file is ignored; too deep.
In case you need to go as deep as possible, you can write a recursive function and apply it to every single file, as follows:
import os
def function_over_files(path):
    if os.path.isfile(path):
        # do whatever you need with the file at path
        pass
    else:
        # this is a dir: list everything in it and call recursively
        for fname in os.listdir(path):
            f_path = os.path.join(path, fname)
            # here is the trick: recursive call to the same function
            function_over_files(f_path)
src = "path/to/your/dir"
function_over_files(src)
This way the function is applied to every file under path, no matter how deep it is in the folder tree:
src/foo.txt # this file is processed, like every file under src
src/foo/a.txt # this file is processed
src/foo/foo_2/b.txt # this file is processed
src/foo/foo_2/foo_3/c.txt # this file is processed
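You could try something like this:
for subdir, dirs, files in os.walk(rootdir):
    print(subdir, dirs, files)
On each iteration you get one directory under your main folder (subdir), together with the names of the subdirectories (dirs) and files (files) it contains.
Hope it helps.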
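Since the question only cares about txt files, the os.walk approach can be narrowed down as in the sketch below. The generator name iter_txt_files is hypothetical, and the case-insensitive extension match is an assumption about what counts as a txt file.

```python
import os


def iter_txt_files(root_dir):
    """Yield the full path of every .txt file below root_dir, at any depth."""
    for dirpath, dirnames, filenames in os.walk(root_dir):
        for name in filenames:
            # Assumption: ".TXT" and ".txt" should both be treated as text files.
            if name.lower().endswith(".txt"):
                yield os.path.join(dirpath, name)
```

A loop such as "for path in iter_txt_files('../main_directory'):" then visits each text file once, and the function to run against the file goes in the loop body.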

How do I move file types not a specific file in PyCharm?

I am trying to make a program to organize my downloads folder every time I download something, but if I use this code:
import shutil
shutil.move("/Users/plapl/downloads/.zip", "/Users/plapl/Desktop/Shortcuts/winrar files")
shutil.move("/Users/plapl/downloads/.png", "/Users/plapl/Desktop/Shortcuts/images")
It searches for files literally named .zip and .png, but I want it to move all files of those types. Can anyone tell me how to do that?
You want to iterate over the files in the directory. Here is an example:
import shutil
import os
source_dir = "/Users/plapl/downloads/"
source = os.listdir(source_dir)
destination = "/Users/plapl/Desktop/Shortcuts/winrar files"
for files in source:
    if files.endswith(".zip"):
        # os.listdir returns bare names, so join them with the source folder
        shutil.move(os.path.join(source_dir, files), destination)
I made something based on that, but it says unresolved reference 'file':
import shutil
import os
source = os.listdir("/Users/plapl/Downloads/")
destination1 = "/Users/plapl/desktop/Shortcuts/images"
destination2 = "/Users/plapl/Shortcuts/winrar files"
destination3 = "/Users/plapl/torrents"
for files in source:
    if file.endswith(".png"):
        shutil.move(files, destination1)
    if file.endswith(".zip"):
        shutil.move(files, destination2)
    if file.endswith(".torrent"):
        shutil.move(files, destination3)
It is complaining about the name file, which is not defined.
Use files instead, since that is the name of your loop variable. Note also that os.listdir returns bare file names, so unless the script runs from inside the Downloads folder, shutil.move needs the full source path, for example os.path.join("/Users/plapl/Downloads/", files).
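The three if blocks can also be driven by a table of rules. The following is a sketch under stated assumptions: the function name sort_by_extension, the dictionary-based rules, and the decision to create missing target folders are illustrative choices, not something taken from the answers.

```python
import os
import shutil


def sort_by_extension(source_dir, destinations):
    """Move files in source_dir into folders chosen by file extension.

    destinations maps a lowercase extension such as ".png" to a target folder.
    Returns a dict of {file name: folder it was moved to}.
    """
    moved = {}
    for name in os.listdir(source_dir):
        src = os.path.join(source_dir, name)
        if not os.path.isfile(src):
            continue  # skip subdirectories
        ext = os.path.splitext(name)[1].lower()
        target_dir = destinations.get(ext)
        if target_dir is None:
            continue  # no rule for this extension
        # Assumption: create the target folder if it does not exist yet.
        os.makedirs(target_dir, exist_ok=True)
        shutil.move(src, os.path.join(target_dir, name))
        moved[name] = target_dir
    return moved
```

For the folders in the question, destinations would map ".png" to the images folder, ".zip" to the winrar files folder, and ".torrent" to the torrents folder.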

Renaming multiple files in a directory using Python

I'm trying to rename multiple files in a directory using this Python script:
import os
path = '/Users/myName/Desktop/directory'
files = os.listdir(path)
i = 1
for file in files:
    os.rename(file, str(i)+'.jpg')
    i = i+1
When I run this script, I get the following error:
Traceback (most recent call last):
  File "rename.py", line 7, in <module>
    os.rename(file, str(i)+'.jpg')
OSError: [Errno 2] No such file or directory
Why is that? How can I solve this issue?
Thanks.
You are not giving the whole path when renaming. Do it like this:
import os
path = '/Users/myName/Desktop/directory'
files = os.listdir(path)
for index, file in enumerate(files):
    os.rename(os.path.join(path, file), os.path.join(path, ''.join([str(index), '.jpg'])))
Edit: thanks to tavo for pointing out that the first version moved the files to the current directory; that is fixed now.
You have to make this path the current working directory first. The rest of the code has no errors.
To make it the current working directory:
os.chdir(path)
import os
from os import path
import shutil
# raw strings keep the backslashes in Windows paths from being read as escapes
Source_Path = r'E:\Binayak\deep_learning\Datasets\Class_2'
Destination = r'E:\Binayak\deep_learning\Datasets\Class_2_Dest'
#dst_folder = os.mkdir(Destination)
def main():
    for count, filename in enumerate(os.listdir(Source_Path)):
        dst = "Class_2_" + str(count) + ".jpg"
        # rename all the files
        os.rename(os.path.join(Source_Path, filename), os.path.join(Destination, dst))
# Driver Code
if __name__ == '__main__':
    main()
As per @daniel's comment, os.listdir() returns just the file names and not the full path of each file. Use os.path.join(path, file) to get the full path and rename that.
import os
path = 'C:\\Users\\Admin\\Desktop\\Jayesh'
files = os.listdir(path)
for file in files:
    os.rename(os.path.join(path, file), os.path.join(path, 'xyz_' + file + '.csv'))
Just playing with the accepted answer: define the path variable and the list of files:
path = "/Your/path/to/folder/"
files = os.listdir(path)
and then loop over that list:
for index, file in enumerate(files):
    #print (file)
    os.rename(path+file, path +'file_' + str(index)+ '.jpg')
or loop the same way in one line with a Python list comprehension:
[os.rename(path+file, path +'jog_' + str(index)+ '.jpg') for index, file in enumerate(files)]
I think the first is more readable; in the second, the loop header simply becomes the second part of the list comprehension.
If your files are being renamed in a seemingly random order, you have to sort the file names first. The code below sorts the names and then renames the files.
import os
import re
path = 'target_folder_directory'
files = os.listdir(path)
# natural sort: compare runs of digits as numbers, so that 2 comes before 10
files.sort(key=lambda var:[int(x) if x.isdigit() else x for x in re.findall(r'[^0-9]|[0-9]+', var)])
for i, file in enumerate(files):
    os.rename(os.path.join(path, file), os.path.join(path, "{}".format(i)+".jpg"))
I wrote a quick and flexible script for renaming files, if you want a working solution without reinventing the wheel.
It renames files in the current directory by applying replacement functions that you pass to it.
Each function specifies a change you want made to all the matching file names. The code determines the changes that would be made, displays the differences in color, and asks for confirmation before performing them.
You can find the source code here and place it in the folder whose files you want to rename: https://gist.github.com/aljgom/81e8e4ca9584b481523271b8725448b8
It works in PyCharm; I haven't tested it in other consoles.
After you define a few replacement functions, the script runs the first one, shows the differences for all matching files in the directory, and lets you confirm or decline the replacements.
This works for me, and by starting the index at 1 we can number the dataset.
import os
path = '/Users/myName/Desktop/directory'
files = os.listdir(path)
# start=1 makes the numbering begin at 1 instead of 0
for index, file in enumerate(files, start=1):
    os.rename(os.path.join(path, file), os.path.join(path, ''.join([str(index), '.jpg'])))
But this will not work if your current image names already start with a number, since a new name can then collide with a file that has not been renamed yet.
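The ordering and name-collision problems raised in these answers can be handled together. What follows is only a sketch: the names natural_key and rename_sequentially, the two-pass renaming through temporary names, and the use of uuid to build those temporary names are assumptions introduced for illustration, not something the answers above prescribe.

```python
import os
import re
import uuid


def natural_key(name):
    """Sort key that orders 'img2.jpg' before 'img10.jpg'."""
    # re.split with a capture group alternates text, digits, text, ...
    parts = re.split(r"([0-9]+)", name)
    # Tag each chunk so numbers and text are never compared with each other.
    return [(0, int(p), "") if i % 2 else (1, 0, p) for i, p in enumerate(parts)]


def rename_sequentially(directory, extension=".jpg", start=1):
    """Rename every file in directory to 1.jpg, 2.jpg, ... in natural order."""
    names = sorted(
        (n for n in os.listdir(directory)
         if os.path.isfile(os.path.join(directory, n))),
        key=natural_key,
    )
    # Pass 1: move everything to unique temporary names so that a new name
    # such as "1.jpg" cannot overwrite a file that has not been renamed yet.
    token = uuid.uuid4().hex
    temp_names = []
    for i, name in enumerate(names):
        temp = "{}_{}.tmp".format(token, i)
        os.rename(os.path.join(directory, name), os.path.join(directory, temp))
        temp_names.append(temp)
    # Pass 2: give the files their final numbered names.
    final_names = []
    for index, temp in enumerate(temp_names, start=start):
        final = "{}{}".format(index, extension)
        os.rename(os.path.join(directory, temp), os.path.join(directory, final))
        final_names.append(final)
    return final_names
```

The two passes cost one extra rename per file, but they keep the operation safe when the folder already contains names like 1.jpg or 2.jpg.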

Open a file without specifying the subdirectory python

Let's say my Python script is in a folder "/main". I have a bunch of text files inside subfolders of main. I want to be able to open a file just by specifying its name, not the subdirectory it's in.
So open_file('test1.csv') should open test1.csv even if its full path is /main/test/test1.csv.
I don't have duplicate file names, so that should not be a problem.
I am using Windows.
You could use os.walk to find your file name in a subfolder structure:
import os
def find_and_open(filename):
    for root_f, folders, files in os.walk('.'):
        if filename in files:
            # here you can either open the file
            # or just return the full path and process the file
            # somewhere else
            with open(os.path.join(root_f, filename)) as f:
                content = f.read()
                # do something with content
If you have a very deep folder structure, you might want to limit the depth of the search.
import os
def get_file_path(file):
    for (root, dirs, files) in os.walk('.'):
        if file in files:
            return os.path.join(root, file)
This should work. It returns the path, so you have to handle opening the file in your own code.
import os
def open_file(filename):
    f = open(os.path.join('/path/to/main/', filename))
    return f
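One answer above suggests limiting the search depth for deep folder structures. A minimal sketch of that idea follows; the function name find_file and the max_depth parameter are hypothetical. It relies on the documented os.walk behaviour that editing the dirnames list in place (in the default top-down mode) controls which subdirectories are visited.

```python
import os


def find_file(filename, root=".", max_depth=None):
    """Return the path of the first file named filename below root, or None.

    max_depth=0 searches root only, 1 also searches its direct subfolders,
    and so on; None means no limit.
    """
    root = os.path.abspath(root)
    base_depth = root.rstrip(os.sep).count(os.sep)
    for dirpath, dirnames, filenames in os.walk(root):
        if filename in filenames:
            return os.path.join(dirpath, filename)
        depth = dirpath.rstrip(os.sep).count(os.sep) - base_depth
        if max_depth is not None and depth >= max_depth:
            # Clearing dirnames in place stops os.walk from descending further.
            dirnames[:] = []
    return None
```

Returning None when nothing matches lets the caller decide how to report a missing file before calling open on the result.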

Python: Check if data file exists relative to source code file

I have a small text (XML) file that I want a Python function to load. The location of the text file is always in a fixed relative position to the Python function code.
For example, on my local computer, the files text.xml and mycode.py could reside in:
/a/b/text.xml
/a/c/mycode.py
Later at run time, the files could reside in:
/mnt/x/b/text.xml
/mnt/x/c/mycode.py
How do I ensure I can load in the file? Do I need the absolute path? I see that I can use os.path.isfile, but that presumes I have a path.
You can do the following in mycode.py:
import os
BASE_DIR = os.path.dirname(os.path.realpath(__file__))
This gives you the directory that contains mycode.py.
With the layout from your example, accessing the XML file is then as simple as:
xml_file = "{}/../b/text.xml".format(BASE_DIR)
fin = open(xml_file, 'r+')
If the two directories always share the same parent directory, this should work:
import os
path_to_script = os.path.realpath(__file__)
# go up twice: first to the script's folder, then to the shared parent folder
parent_directory = os.path.dirname(os.path.dirname(path_to_script))
for root, dirs, files in os.walk(parent_directory):
    for file in files:
        if file == 'text.xml':
            path_to_xml = os.path.join(root, file)
You can use the special variable __file__, which gives you the path of the current file (see http://docs.python.org/2/reference/datamodel.html).
So in your first example, you can reference text.xml this way in mycode.py:
xml_path = os.path.join(os.path.dirname(__file__), '..', 'b', 'text.xml')
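The same idea can be written with pathlib and combined with the existence check the question asks about. This is a sketch under stated assumptions: the helper names data_path_near and load_text_xml are hypothetical, the b/text.xml layout comes from the question's example, and UTF-8 is an assumed encoding for the XML file.

```python
from pathlib import Path


def data_path_near(script_file, *relative_parts):
    """Build a path relative to the folder that contains script_file."""
    script_dir = Path(script_file).resolve().parent
    return script_dir.joinpath(*relative_parts).resolve()


def load_text_xml(script_file):
    """Read ../b/text.xml relative to script_file, or raise FileNotFoundError."""
    xml_path = data_path_near(script_file, "..", "b", "text.xml")
    if not xml_path.is_file():
        raise FileNotFoundError("expected data file at {}".format(xml_path))
    # Assumption: the XML file is encoded as UTF-8.
    return xml_path.read_text(encoding="utf-8")
```

Inside mycode.py the call would be load_text_xml(__file__), which behaves the same whether the files live under /a/ or /mnt/x/.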
