Searching and moving files using Python

I have been trying to write some Python code that reads each line from a .txt file and searches for a file with that name in a folder and its subfolders. After that I want to move the file to a preset destination folder.
I tried the following code, which was posted on Stack Overflow, but it doesn't seem to work and I am unable to figure out the problem. Any help would be highly appreciated:
import os
import shutil

def main():
    destination = '/Users/jorjis/Desktop/new'
    with open('/Users/jorjis/Desktop/articles.txt', 'r') as lines:
        filenames_to_copy = set(line.rstrip() for line in lines)
    for root, _, filenames in os.walk('/Users/jorjis/Desktop/folder/'):
        for filename in filenames:
            if filename in filenames_to_copy:
                shutil.copy(os.path.join(root, filename), destination)

Without any debugging output (which you have now obtained) I can only guess at a common pitfall of os.walk: the names returned in filenames are just that, bare file names without any path. If your text file contains names with paths, they will never match. Use this instead:
if os.path.join(root, filename) in filenames_to_copy:
    shutil.copy(os.path.join(root, filename), destination)
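Note that the question asks to move the files while the snippet above copies them; a minimal variant (a sketch assuming the same paths as above and that the destination folder does not already contain files with the same names) swaps shutil.copy for shutil.move:

import os
import shutil

destination = '/Users/jorjis/Desktop/new'
with open('/Users/jorjis/Desktop/articles.txt', 'r') as lines:
    filenames_to_move = set(line.rstrip() for line in lines)

for root, _, filenames in os.walk('/Users/jorjis/Desktop/folder/'):
    for filename in filenames:
        if filename in filenames_to_move:
            # shutil.move removes the source file once it has been copied,
            # and raises an error if the destination file already exists
            shutil.move(os.path.join(root, filename), destination)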

Related

FileNotFoundError: [Errno 2] Stuck

Can anyone tell me why I get this error? The file is in the folder.
import os

with open("C:\\Users\\42077\\Desktop\\test\\vystup\\!output.txt", "a") as f:
    for root, dirs, files in os.walk("C:\\Users\\42077\\Desktop\\test\\"):
        for path in files:
            if path.endswith(".txt"):
                with open(path, 'r') as file:
                    data = file.readlines()
                    f.write("{0} {1}\n".format(data[2], path))
The files list that os.walk() returns is not a list of paths; it is a list of the names (as strings) of the files in the directory os.walk() is currently visiting (root).
for path in files:
    if path.endswith(".txt"):
        with open(path, 'r') as file:
So at the end, open() is given a bare file name like example.txt.
When open() is not given an absolute path, it resolves the name relative to the current working directory rather than the directory os.walk() is currently visiting, so it cannot find the file and promptly raises the error.
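A minimal fix, keeping the same paths as the question, is to join root with the bare name before opening it; the extra check that skips !output.txt is my addition, since the output file also ends in .txt and sits inside the walked tree:

import os

with open("C:\\Users\\42077\\Desktop\\test\\vystup\\!output.txt", "a") as f:
    for root, dirs, files in os.walk("C:\\Users\\42077\\Desktop\\test\\"):
        for path in files:
            if path.endswith(".txt") and path != "!output.txt":
                # root is the directory os.walk() is currently visiting,
                # so joining it with the bare name gives a path open() can find
                with open(os.path.join(root, path), 'r') as file:
                    data = file.readlines()
                    f.write("{0} {1}\n".format(data[2], path))

This still assumes every matching file has at least three lines, since data[2] is line 3 of the file.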

How to get complete path of files that are being searched within a folder in python?

My folder structure is as follows:
I want to get the paths of all files that contain the string xyz. The result should look like this:
folder/folderA/fileA2
folder/folderB/fileB1
folder/file1
I tried this:
for path, subdirs, files in os.walk(folderTestPath):
    for file in files:
        if "xyz" in open(folderTestPath + file, 'r'):
            print(os.path.abspath(file))
folderTestPath contains the path of the folder. This code only gives me the file names, followed by a file not found error. I know this is a simple thing, but for some reason I am unable to get it. Please help.
You can use the os.path.join method:
for path, subdirs, files in os.walk(folderTestPath):
    for file in files:
        filePath = os.path.join(path, file)
        if "xyz" in open(filePath, 'r').read():
            print("xyz")
            print(filePath)
As Eric mentioned, to close the file after reading it, use the snippet below:
import os

for path, subdirs, files in os.walk(folderTestPath):
    for file in files:
        filePath = os.path.join(path, file)
        with open(filePath, 'r') as data:
            if "xyz" in data.read():
                print("xyz")
                print(filePath)
        # the with block closes the file automatically, so no explicit close() is needed
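The expected output in the question lists paths relative to the parent of folder; if that exact form is needed, one option (my own sketch, assuming folderTestPath has no trailing separator and that every file can be read as text) is to print os.path.relpath instead of the joined path:

import os

# base is the parent of the searched folder, so relative paths start with its name
base = os.path.dirname(os.path.abspath(folderTestPath))
for path, subdirs, files in os.walk(folderTestPath):
    for file in files:
        filePath = os.path.join(path, file)
        with open(filePath, 'r') as data:
            if "xyz" in data.read():
                # prints e.g. folder/folderA/fileA2
                print(os.path.relpath(filePath, base))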

Python: linecache.getline not working as intended

I have a directory with numerous subdirectories.
At the bottom of the directories there are some .txt files I need to extract line 2 from.
import os
import os.path
import linecache

for dirpath, dirnames, filenames in os.walk("."):
    for filename in [f for f in filenames if f.endswith(".txt")]:
        #print os.path.join(dirpath, filename)
        #print filename
        print linecache.getline(filename, 2)
I am able to successfully walk all the directories and find every text file, but linecache.getline simply returns a newline where there should be data from line 2 of the file. Using
print linecache.getline(filename, 2).rstrip('\n')
does not solve this either.
I am able to correctly print out just the filenames in each directory, but passing these to linecache seems to be the issue. I can use linecache.getline(file, lineno) successfully if I just run the script on a single .txt file in the current directory.
linecache.getline resolves the filename relative to the current working directory.
The solution is thus:
import os
import os.path
import linecache

for dirpath, dirnames, filenames in os.walk("."):
    for filename in [f for f in filenames if f.endswith(".txt")]:
        direc = os.path.join(dirpath, filename)
        print linecache.getline(direc, 2)
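If each file is only read once, linecache's caching is not really needed; a plain-file sketch of my own (not part of the original answer) that pulls out line 2 with itertools.islice would look like this:

import os
from itertools import islice

for dirpath, dirnames, filenames in os.walk("."):
    for filename in (f for f in filenames if f.endswith(".txt")):
        path = os.path.join(dirpath, filename)
        with open(path) as f:
            # islice skips line 1 and yields line 2, or nothing if the file is shorter
            line2 = next(islice(f, 1, 2), '')
        print(line2.rstrip('\n'))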

Open a file without specifying the subdirectory python

Let's say my Python script is in a folder "/main". I have a bunch of text files inside subfolders of main. I want to be able to open a file just by specifying its name, not the subdirectory it's in.
So open_file('test1.csv') should open test1.csv even if its full path is /main/test/test1.csv.
I don't have duplicate file names, so it should not be a problem.
I am using Windows.
You could use os.walk to find your filename in a subfolder structure:
import os

def find_and_open(filename):
    for root_f, folders, files in os.walk('.'):
        if filename in files:
            # here you can either open the file
            # or just return the full path and process the file
            # somewhere else
            with open(root_f + '/' + filename) as f:
                f.read()
                # do something
If you have a very deep folder structure, you might want to limit the depth of the search.
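One way to do that (a sketch of my own, with an arbitrary cutoff of two directory levels) is to prune the dirs list that os.walk yields once the current directory is deep enough:

import os

def find_file_limited(filename, top='.', max_depth=2):
    top_depth = top.rstrip(os.sep).count(os.sep)
    for root, folders, files in os.walk(top):
        # clearing folders in place stops os.walk from descending any further
        if root.count(os.sep) - top_depth >= max_depth:
            folders[:] = []
        if filename in files:
            return os.path.join(root, filename)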
import os

def get_file_path(file):
    for (root, dirs, files) in os.walk('.'):
        if file in files:
            return os.path.join(root, file)
This should work. It returns the path, so you should handle opening the file in your own code.
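A hypothetical call site (test1.csv is just the example name from the question) might look like this:

path = get_file_path('test1.csv')
if path is not None:
    with open(path) as f:
        contents = f.read()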
import os

def open_file(filename):
    f = open(os.path.join('/path/to/main/', filename))
    return f
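Since the text files live in subfolders of main rather than directly inside it, a recursive glob is another option; this is my own sketch (assuming Python 3.5+ for recursive=True, and that the first match is the wanted one), not part of the answers above:

import glob
import os

def open_file(filename, base='/path/to/main/'):
    # '**' matches any number of nested subdirectories when recursive=True
    matches = glob.glob(os.path.join(base, '**', filename), recursive=True)
    if not matches:
        raise FileNotFoundError(filename)
    return open(matches[0])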

Python - Need to loop through directories looking for TXT files

I am a total Python Newb
I need to loop through a directory looking for .txt files, and then read and process them individually. I would like to set this up so that whatever directory the script is in is treated as the root of this action. For example if the script is in /bsepath/workDir, then it would loop over all of the files in workDir and its children.
What I have so far is:
#!/usr/bin/env python
import os

scrptPth = os.path.realpath(__file__)
for file in os.listdir(scrptPth)
    with open(file) as f:
        head, sub, auth = [f.readline().strip() for i in range(3)]
        data = f.read()
        #data.encode('utf-8')
        pth = os.getcwd()
        print head, sub, auth, data, pth
This code is giving me an invalid syntax error, and I suspect that is because os.listdir does not like file paths in standard string format. Also, I don't think I am doing the looped action right. How do I reference a specific file in the looped action? Is it packaged as a variable?
Any help is appreciated.
import os, fnmatch

def findFiles(path, filter):
    for root, dirs, files in os.walk(path):
        for file in fnmatch.filter(files, filter):
            yield os.path.join(root, file)
Use it like this, and it will find all text files somewhere within the given path (recursively):
for textFile in findFiles(r'C:\Users\poke\Documents', '*.txt'):
    print(textFile)
os.listdir expects a directory as input. So, to get the directory in which the script resides, use:
scrptPth = os.path.dirname(os.path.realpath(__file__))
Also, os.listdir returns just the filenames, not the full path.
So open(file) will not work unless the current working directory happens to be the directory where the script resides. To fix this, use os.path.join:
import os

scrptPth = os.path.dirname(os.path.realpath(__file__))
for file in os.listdir(scrptPth):
    with open(os.path.join(scrptPth, file)) as f:
        ...  # read head, sub, auth and data as before
Finally, if you want to recurse through subdirectories, use os.walk:
import os

scrptPth = os.path.dirname(os.path.realpath(__file__))
for root, dirs, files in os.walk(scrptPth):
    for filename in files:
        filename = os.path.join(root, filename)
        with open(filename, 'r') as f:
            head, sub, auth = [f.readline().strip() for i in range(3)]
            data = f.read()
            #data.encode('utf-8')
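Since the original goal was to process only .txt files, a filtered variant of that loop (a sketch reusing the same names, nothing else changed) would skip everything else:

import os

scrptPth = os.path.dirname(os.path.realpath(__file__))
for root, dirs, files in os.walk(scrptPth):
    for filename in files:
        if not filename.endswith('.txt'):
            continue  # ignore anything that is not a text file
        filename = os.path.join(root, filename)
        with open(filename, 'r') as f:
            head, sub, auth = [f.readline().strip() for i in range(3)]
            data = f.read()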
