I'm using a python script where I'm using a shell command to copy from local to hdfs.
import os
import logging
import subprocess
filePath = "/tmp"
keyword = "BC10^Dummy-Segment"
for root, dirs, files in os.walk(filePath):
    for file in files:
        if keyword in file:
            subprocess.call(["hadoop fs -copyFromLocal /tmp/BC10%5EDummy-Segment* /user/app"], shell=True)
            subprocess.call(["hadoop fs -rm /tmp/BC10%5EDummy-Segment*"], shell=True)
I'm seeing this error:
copyFromLocal: `/tmp/BC10^Dummy-Segment*': No such file or directory
rm: `/tmp/BC10^Dummy-Segment_2019': No such file or directory
Updated code:
import glob
import subprocess
import os
from urllib import urlencode, quote_plus
filePath = "/tmp"
keyword = "BC10^Dummy-Segment"
wildcard = os.path.join(filePath, '{0}*'.format(keyword))
print(wildcard)
files = [urlencode(x, quote_via=quote_plus) for x in glob.glob(wildcard)]
subprocess.check_call(["hadoop", "fs", "-copyFromLocal"] + files + ["/user/app"])
#subprocess.check_call(["hadoop", "fs", "-rm"] + files)
Seeing error when I run:
Traceback (most recent call last):
  File "ming.py", line 11, in <module>
    files = [urlencode(x, quote_via=quote_plus) for x in glob.glob(wildcard)]
TypeError: urlencode() got an unexpected keyword argument 'quote_via'
I'm guessing you are URL-encoding the path to pass it properly to Hadoop, but in doing so you basically hide it from the shell. There really are no files matching the wildcard /tmp/BC10%5EDummy-Segment* where % etc are literal characters.
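To see why the encoded pattern matches nothing, here is a minimal sketch that creates a file with a literal ^ in a temporary directory and globs for it both ways. It assumes Python 3, where quote_plus lives in urllib.parse (the question's code is Python 2); the file name and the helper match_counts are hypothetical, chosen only to mirror the error message.

```python
import glob
import os
import tempfile
from urllib.parse import quote_plus  # Python 3 location (an assumption; the question uses Python 2)


def match_counts(directory, keyword):
    """Return (literal_matches, encoded_matches) for 'keyword*' in directory."""
    base = glob.escape(directory)  # protect any glob characters in the directory name
    literal = glob.glob(os.path.join(base, keyword + "*"))
    encoded = glob.glob(os.path.join(base, quote_plus(keyword) + "*"))
    return len(literal), len(encoded)


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file name modelled on the one in the error message.
    open(os.path.join(tmp, "BC10^Dummy-Segment_2019"), "w").close()
    counts = match_counts(tmp, "BC10^Dummy-Segment")
    print(counts)
```

The literal pattern finds the file, while the pattern containing %5E finds nothing, because no file on disk has those characters in its name.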
Try handling the glob from Python instead. With that, you can also get rid of that pesky shell=True; and with that change, it is finally actually correct and useful to pass the commands as a list of strings (never a list of a single space-separated string; and with shell=True, don't pass a list at all). Notice also the switch to check_call so we trap errors and don't delete the source files if copying them failed. (See also https://stackoverflow.com/a/51950538/874188 for additional rationale.)
import glob
import subprocess
import os
from urllib import quote_plus
filePath = "/tmp"
keyword = "BC10^Dummy-Segment"
wildcard = os.path.join(filePath, '{0}*'.format(keyword))
files = [quote_plus(x) for x in glob.glob(wildcard)]
subprocess.check_call(["hadoop", "fs", "-copyFromLocal"] + files + ["/user/app"])
subprocess.check_call(["hadoop", "fs", "-rm"] + files)
This will not traverse subdirectories; but neither would your attempt with os.walk() do anything actually useful if it found files in subdirectories. If you actually want that to happen, please explain in more detail what the script should do.
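The point about check_call can be sketched without Hadoop. The following is only an illustration under stated assumptions: the helper copy_then_remove is hypothetical, and the two commands are stand-ins that run the Python interpreter itself, one exiting with status 1 and one exiting cleanly.

```python
import subprocess
import sys


def copy_then_remove(copy_cmd, remove_cmd):
    """Run copy_cmd; run remove_cmd only if the copy succeeded."""
    try:
        subprocess.check_call(copy_cmd)
    except subprocess.CalledProcessError as err:
        print("copy failed with exit status", err.returncode)
        return False
    subprocess.check_call(remove_cmd)
    return True


# Stand-in commands (assumptions for the demo): a "copy" that exits with 1
# and a "remove" that would exit with 0 if it were ever run.
failing_copy = [sys.executable, "-c", "import sys; sys.exit(1)"]
harmless_remove = [sys.executable, "-c", "pass"]
result = copy_then_remove(failing_copy, harmless_remove)
```

With subprocess.call the non-zero exit status would be returned and silently ignored unless you checked it yourself, so the removal would still run.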
Related
How can I delete a file or folder?
os.remove() removes a file.
os.rmdir() removes an empty directory.
shutil.rmtree() deletes a directory and all its contents.
Path objects from the Python 3.4+ pathlib module also expose these instance methods:
pathlib.Path.unlink() removes a file or symbolic link.
pathlib.Path.rmdir() removes an empty directory.
Python syntax to delete a file
import os
os.remove("/tmp/<file_name>.txt")
or
import os
os.unlink("/tmp/<file_name>.txt")
or
pathlib Library for Python version >= 3.4
file_to_rem = pathlib.Path("/tmp/<file_name>.txt")
file_to_rem.unlink()
Path.unlink(missing_ok=False)
The unlink method is used to remove a file or a symbolic link.
If missing_ok is false (the default), FileNotFoundError is raised if the path does not exist.
If missing_ok is true, FileNotFoundError exceptions will be ignored (same behavior as the POSIX rm -f command).
Changed in version 3.8: The missing_ok parameter was added.
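On Python versions before 3.8, where missing_ok is not available, a similar effect can be had by catching FileNotFoundError. A minimal sketch, assuming a hypothetical helper named unlink_quietly and a throwaway file in a temporary directory:

```python
import tempfile
from pathlib import Path


def unlink_quietly(path):
    """Remove a file; return True if it was removed, False if it was absent."""
    try:
        Path(path).unlink()
    except FileNotFoundError:
        return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "example.txt"
    target.touch()
    first = unlink_quietly(target)   # the file exists, so it is removed
    second = unlink_quietly(target)  # already gone, so nothing happens
```

On 3.8 and later, Path(path).unlink(missing_ok=True) does the same without the try block, though it does not tell you whether the file existed.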
Best practice
First, check if the file or folder exists and then delete it. You can achieve this in two ways:
os.path.isfile("/path/to/file")
Use exception handling.
EXAMPLE for os.path.isfile
#!/usr/bin/python
import os
myfile = "/tmp/foo.txt"
# If file exists, delete it.
if os.path.isfile(myfile):
    os.remove(myfile)
else:
    # If it fails, inform the user.
    print("Error: %s file not found" % myfile)
Exception Handling
#!/usr/bin/python
import os
# Get input.
myfile = raw_input("Enter file name to delete: ")
# Try to delete the file.
try:
    os.remove(myfile)
except OSError as e:
    # If it fails, inform the user.
    print("Error: %s - %s." % (e.filename, e.strerror))
Respective output
Enter file name to delete : demo.txt
Error: demo.txt - No such file or directory.
Enter file name to delete : rrr.txt
Error: rrr.txt - Operation not permitted.
Enter file name to delete : foo.txt
Python syntax to delete a folder
shutil.rmtree()
Example for shutil.rmtree()
#!/usr/bin/python
import os
import sys
import shutil
# Get directory name
mydir = raw_input("Enter directory name: ")
# Try to remove the tree; if it fails, throw an error using try...except.
try:
    shutil.rmtree(mydir)
except OSError as e:
    print("Error: %s - %s." % (e.filename, e.strerror))
Use
shutil.rmtree(path[, ignore_errors[, onerror]])
(See complete documentation on shutil) and/or
os.remove
and
os.rmdir
(Complete documentation on os.)
Here is a robust function that uses both os.remove and shutil.rmtree:
import os
import shutil

def remove(path):
    """ param <path> could either be relative or absolute. """
    if os.path.isfile(path) or os.path.islink(path):
        os.remove(path)  # remove the file
    elif os.path.isdir(path):
        shutil.rmtree(path)  # remove dir and all its contents
    else:
        raise ValueError("file {} is not a file or dir.".format(path))
You can use the built-in pathlib module (requires Python 3.4+, but there are backports for older versions on PyPI: pathlib, pathlib2).
To remove a file there is the unlink method:
import pathlib
path = pathlib.Path(name_of_file)
path.unlink()
Or the rmdir method to remove an empty folder:
import pathlib
path = pathlib.Path(name_of_folder)
path.rmdir()
Deleting a file or folder in Python
There are multiple ways to Delete a File in Python but the best ways are the following:
os.remove() removes a file.
os.unlink() removes a file. It is the Unix name for the remove() method.
shutil.rmtree() deletes a directory and all its contents.
pathlib.Path.unlink() deletes a single file. The pathlib module is available in Python 3.4 and above.
os.remove()
Example 1: Basic Example to Remove a File Using os.remove() Method.
import os
os.remove("test_file.txt")
print("File removed successfully")
Example 2: Checking if File Exists using os.path.isfile and Deleting it With os.remove
import os
#checking if file exist or not
if os.path.isfile("test.txt"):
    # os.remove() function to remove the file
    os.remove("test.txt")
    # Printing the confirmation message of deletion
    print("File Deleted successfully")
else:
    # Showing a message instead of throwing an error
    print("File does not exist")
Example 3: Python Program to Delete all files with a specific extension
import os
from os import listdir
# Raw string, so the backslashes are not treated as escape characters
my_path = r'C:\Python Pool\Test'
for file_name in listdir(my_path):
    if file_name.endswith('.txt'):
        os.remove(os.path.join(my_path, file_name))
Example 4: Python Program to Delete All Files Inside a Folder
To delete all files inside a particular directory, you simply have to use the * symbol as the pattern string.
#Importing os and glob modules
import os, glob
#Loop Through the folder projects all files and deleting them one by one
for file in glob.glob("pythonpool/*"):
    os.remove(file)
    print("Deleted " + str(file))
os.unlink()
os.unlink() is an alias of os.remove(); unlink is the traditional Unix name for removing a file.
Note: os.unlink() and os.remove() have the same functionality and syntax. Both delete the file at the given path.
Both are functions in the os module of Python's standard library.
shutil.rmtree()
Example 1: Python Program to Delete a Directory Using shutil.rmtree()
import shutil
import os
# location
location = "E:/Projects/PythonPool/"
# directory
dir = "Test"
# path
path = os.path.join(location, dir)
# removing directory
shutil.rmtree(path)
Example 2: Python Program to Delete a Directory Using shutil.rmtree()
import shutil
import os
location = "E:/Projects/PythonPool/"
dir = "Test"
path = os.path.join(location, dir)
shutil.rmtree(path)
pathlib.Path.rmdir() to remove Empty Directory
The pathlib module provides several ways to interact with your files. rmdir() is one of the Path methods, and it deletes an empty folder. First create a Path() for the directory, then call its rmdir() method. If the folder is empty, it is deleted; if it is not, rmdir() raises an OSError and nothing is removed.
This is a good way to delete empty folders without any fear of losing actual data.
from pathlib import Path
q = Path('foldername')
q.rmdir()
How do I delete a file or folder in Python?
For Python 3, to remove the file and directory individually, use the unlink and rmdir Path object methods respectively:
from pathlib import Path
dir_path = Path.home() / 'directory'
file_path = dir_path / 'file'
file_path.unlink() # remove file
dir_path.rmdir() # remove directory
Note that you can also use relative paths with Path objects, and you can check your current working directory with Path.cwd.
For removing individual files and directories in Python 2, see the section so labeled below.
To remove a directory with contents, use shutil.rmtree, and note that this is available in Python 2 and 3:
from shutil import rmtree
rmtree(dir_path)
Demonstration
New in Python 3.4 is the Path object.
Let's use one to create a directory and file to demonstrate usage. Note that we use the / to join the parts of the path, this works around issues between operating systems and issues from using backslashes on Windows (where you'd need to either double up your backslashes like \\ or use raw strings, like r"foo\bar"):
from pathlib import Path
# .home() is new in 3.5, otherwise use os.path.expanduser('~')
directory_path = Path.home() / 'directory'
directory_path.mkdir()
file_path = directory_path / 'file'
file_path.touch()
and now:
>>> file_path.is_file()
True
Now let's delete them. First the file:
>>> file_path.unlink() # remove file
>>> file_path.is_file()
False
>>> file_path.exists()
False
We can use globbing to remove multiple files - first let's create a few files for this:
>>> (directory_path / 'foo.my').touch()
>>> (directory_path / 'bar.my').touch()
Then just iterate over the glob pattern:
>>> for each_file_path in directory_path.glob('*.my'):
...     print(f'removing {each_file_path}')
...     each_file_path.unlink()
...
removing ~/directory/foo.my
removing ~/directory/bar.my
Now, demonstrating removing the directory:
>>> directory_path.rmdir() # remove directory
>>> directory_path.is_dir()
False
>>> directory_path.exists()
False
What if we want to remove a directory and everything in it?
For this use-case, use shutil.rmtree
Let's recreate our directory and file:
file_path.parent.mkdir()
file_path.touch()
and note that rmdir fails unless it's empty, which is why rmtree is so convenient:
>>> directory_path.rmdir()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "~/anaconda3/lib/python3.6/pathlib.py", line 1270, in rmdir
    self._accessor.rmdir(self)
  File "~/anaconda3/lib/python3.6/pathlib.py", line 387, in wrapped
    return strfunc(str(pathobj), *args)
OSError: [Errno 39] Directory not empty: '/home/username/directory'
Now, import rmtree and pass the directory to the function:
from shutil import rmtree
rmtree(directory_path) # remove everything
and we can see the whole thing has been removed:
>>> directory_path.exists()
False
Python 2
If you're on Python 2, there's a backport of the pathlib module called pathlib2, which can be installed with pip:
$ pip install pathlib2
And then you can alias the library to pathlib
import pathlib2 as pathlib
Or just directly import the Path object (as demonstrated here):
from pathlib2 import Path
If that's too much, you can remove files with os.remove or os.unlink
from os import unlink, remove
from os.path import join, expanduser
remove(join(expanduser('~'), 'directory/file'))
or
unlink(join(expanduser('~'), 'directory/file'))
and you can remove directories with os.rmdir:
from os import rmdir
rmdir(join(expanduser('~'), 'directory'))
Note that there is also an os.removedirs. It only removes empty directories recursively, but it may suit your use-case.
This is my function for deleting dirs. The "path" requires the full pathname.
import os
def rm_dir(path):
    cwd = os.getcwd()
    if not os.path.exists(os.path.join(cwd, path)):
        return False
    os.chdir(os.path.join(cwd, path))
    for file in os.listdir():
        print("file = " + file)
        os.remove(file)
    print(cwd)
    os.chdir(cwd)
    os.rmdir(os.path.join(cwd, path))
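shutil.rmtree is a synchronous function: it returns only once the tree has been removed, or raises an error. On some systems (Windows in particular) the path can still appear to exist for a moment afterwards,
so if you want to be sure the deletion has completed, you can poll with a while loop:
import os
import shutil
shutil.rmtree(path)
while os.path.exists(path):
    pass
print('done')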
import os
folder = '/Path/to/yourDir/'
fileList = os.listdir(folder)
for f in fileList:
    filePath = folder + '/'+f
    if os.path.isfile(filePath):
        os.remove(filePath)
    elif os.path.isdir(filePath):
        newFileList = os.listdir(filePath)
        for f1 in newFileList:
            insideFilePath = filePath + '/' + f1
            if os.path.isfile(insideFilePath):
                os.remove(insideFilePath)
For deleting files:
os.unlink(path, *, dir_fd=None)
or
os.remove(path, *, dir_fd=None)
Both functions are semantically identical: each removes (deletes) the file at path. If path is a directory, an exception is raised.
For deleting folders:
shutil.rmtree(path, ignore_errors=False, onerror=None)
or
os.rmdir(path, *, dir_fd=None)
In order to remove whole directory trees, shutil.rmtree() can be used. os.rmdir only works when the directory is empty and exists.
For deleting folders recursively towards parent:
os.removedirs(name)
It removes the leaf directory and then each empty parent directory in turn, stopping at the first parent that still has content.
For example, os.removedirs('abc/xyz/pqr') removes the directories in the order 'abc/xyz/pqr', 'abc/xyz' and 'abc', as long as each is empty.
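The stopping behaviour can be tried safely in a temporary directory. This is a small sketch: the directory names follow the example above, and keep.txt is a hypothetical file added only so that abc is not empty.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    top = os.path.join(base, "abc")
    os.makedirs(os.path.join(top, "xyz", "pqr"))
    # A file directly inside abc keeps that directory non-empty.
    open(os.path.join(top, "keep.txt"), "w").close()

    # Removes pqr, then the now-empty xyz, and stops at abc.
    os.removedirs(os.path.join(top, "xyz", "pqr"))
    remaining = sorted(os.listdir(top))
    print(remaining)
```

Only the leaf directory has to be removable; a non-empty parent simply ends the walk upwards without raising an error.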
For more info check official doc: os.unlink , os.remove, os.rmdir , shutil.rmtree, os.removedirs
To remove all files in folder
import os
import glob
files = glob.glob(os.path.join('path/to/folder/*'))
# Or, to match only the CSV files in the folder:
# files = glob.glob(os.path.join('path/to/folder/*.csv'))
for file in files:
    os.remove(file)
To remove all folders in a directory
from shutil import rmtree
import os
# The relative path below is resolved against the current working directory.
for dirct in os.listdir(os.path.join('path/to/folder')):
    rmtree(os.path.join('path/to/folder', dirct))
To avoid the TOCTOU issue highlighted by Éric Araujo's comment, you can catch an exception to call the correct method:
import os
import shutil

def remove_file_or_dir(path: str) -> None:
    """ Remove a file or directory """
    try:
        shutil.rmtree(path)
    except NotADirectoryError:
        os.remove(path)
This works because shutil.rmtree() only removes directories, while os.remove() and os.unlink() only remove files.
My personal preference is to work with pathlib objects. They offer a more pythonic and less error-prone way to interact with the filesystem, especially if you develop cross-platform code.
In that case, you might use pathlib3x, which offers a backport of the latest Python pathlib (Python 3.10.a0 at the time of writing this answer) for Python 3.6 or newer, plus a few additional functions like "copy", "copy2", "copytree", "rmtree" etc.
It also wraps shutil.rmtree:
$> python -m pip install pathlib3x
$> python
>>> import pathlib3x as pathlib
# delete a directory tree
>>> my_dir_to_delete=pathlib.Path('c:/temp/some_dir')
>>> my_dir_to_delete.rmtree(ignore_errors=True)
# delete a file
>>> my_file_to_delete=pathlib.Path('c:/temp/some_file.txt')
>>> my_file_to_delete.unlink(missing_ok=True)
You can find it on GitHub or PyPI.
Disclaimer: I'm the author of the pathlib3x library.
I recommend using subprocess if writing a beautiful and readable code is your cup of tea:
import subprocess
subprocess.Popen("rm -r my_dir", shell=True)
And if you are not a software engineer, then maybe consider using Jupyter; you can simply type bash commands:
!rm -r my_dir
Traditionally, you use shutil:
import shutil
shutil.rmtree(my_dir)
When I run my python script via the terminal by going into the directory the python script is held and running > python toolstation.py, the script runs successfully.
Then what I try to do is run the script via a .bat file. My .bat file is set as so:
"C:\Users\xxxx\AppData\Local\Programs\Python\Python39\python.exe" "C:\Users\xxxx\Downloads\axp_solutions\python_scripts\toolstation.py"
When I run this bat file, it gives me an exception which states it cannot find the directory to open the csv file, which is one directory above the python script.
Exception:
Traceback (most recent call last):
File "C:\Users\xxx\Downloads\axp_solutions\python_scripts\toolstation.py", line 12, in <module>
f = open('../input/toolstation.csv', 'r')
FileNotFoundError: [Errno 2] No such file or directory: '../input/toolstation.csv'
The code for this in the python script is set like so:
f = open('../input/toolstation.csv', 'r')
Now I can set this via a hardcoded path like so to get around it:
f = open('C:/Users/xxxx/Downloads/axp_solutions/input/toolstation.csv', 'r')
But as I am sending this script and bat file to a friend, they will have a different path set. So my question is, how should the dynamic path be set so that it is able to recognise the directory to go to?
Instead of constructing the path to the CSV file (based on this answer), I would suggest using Python's argparse library to add an optional argument which takes the path to that CSV file.
You could give it a reasonable default value (or even use the automatically determined relative path as the default), so that you don't have to specify anything if you are on your system, while at the same time adding a lot of flexibility to your script.
You and everyone using your script, can at any moment decide which CSV file to use, while the overhead is manageable.
Here's what I would start with:
import argparse
import os
DEFAULT_PATH = r'C:\path\that\makes\sense\here\toolstation.csv'
# Alternatively, automatically determine default path:
# DEFAULT_PATH = os.path.join(
#     os.path.dirname(os.path.abspath(__file__)),
#     r'..\input\toolstation.csv',
# )
parser = argparse.ArgumentParser(prog='toolstation')
parser.add_argument(
    '-i',
    '--input-file',
    default=DEFAULT_PATH,
    help='Path to input CSV file for this script (default: %(default)s).'
)
args = parser.parse_args()
try:
    in_file = open(args.input_file, 'r')
except FileNotFoundError:
    raise SystemExit(
        f"Input file '{args.input_file}' not found. "
        "Please provide a valid input file and retry. "
        "Exiting..."
    )
Your script gets a nice and extensible interface with out-of-the-box help (you can just run python toolstation.py --help).
People using your script will love you because you provided them with the option to choose their input file via:
python toolstation.py --input-file <some_path>
The input file path passed to the script can be absolute or relative to the directory the script is executed from.
Use os module to get the script path and then add the path to the src file.
Ex:
import os
# The CSV lives in the "input" folder one level above the script's folder.
path = os.path.join(os.path.dirname(os.path.abspath(__file__)), '..', 'input', 'toolstation.csv')
with open(path, 'r') as infile:
    ...
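The underlying issue is that a relative path passed to open() is resolved against the current working directory, which for a .bat file is wherever the batch file is started, not the script's folder. The same idea can be sketched with pathlib; the helper name input_csv_for is hypothetical, and the folder layout (an input folder next to python_scripts) is taken from the question.

```python
from pathlib import Path


def input_csv_for(script_path):
    """Return the path of ../input/toolstation.csv relative to the given script."""
    script_dir = Path(script_path).resolve().parent
    # One level up from the script's folder, then into "input".
    return script_dir.parent / "input" / "toolstation.csv"


# Hypothetical location, used only to show the result; in the real script
# you would call input_csv_for(__file__).
example = input_csv_for("/x/axp_solutions/python_scripts/toolstation.py")
print(example)
```

Because the result is anchored on the script's own location, it does not matter which directory the batch file or terminal is in when the script starts.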
You can pass the path to that CSV in as an argument to the python script.
This tutorial may be helpful.
The batch file:
set CSV_PATH="C:\...\toolstation.csv"
"C:\...\python.exe" "C:\...\toolstation.py" %CSV_PATH%
You'll have to fill in the "..."s with the appropriate paths; this isn't magic.
The Python script:
import sys
toolstation_csv = sys.argv[1]
...
f = open(toolstation_csv, 'r')
...
file = open(r"C:\Users\MyUsername\Desktop\PythonCode\configure.txt")
Right now this is what I'm using. However, if people download the program to their computer, the file won't be found because it's a specific path. How would I be able to open the file if it's in the same folder as the script?
You can use __file__. Technically not every module has this attribute, but if you're not loading your module from a file, loading a text file from the same folder becomes a moot point anyway.
from os.path import abspath, dirname, join
with open(join(dirname(abspath(__file__)), 'configure.txt')):
    ...
While this will do what you're asking for, it's not necessarily the best way to store configuration.
Use os module to get your current filepath.
import os
this_dir, this_filename = os.path.split(__file__)
myfile = os.path.join(this_dir, 'configure.txt')
file = open(myfile)
# import the OS library
import os
# create the absolute location with correct OS path separator to file
config_file = os.getcwd() + os.path.sep + "configure.txt"
# Open file
file = open(config_file)
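This uses the correct path separator for the operating system, but note that os.getcwd() returns the current working directory, which is the script's folder only if the script is started from there.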
I've been trying to figure this out for hours with no luck. I have a list of directories that have subdirectories and other files of their own. I'm trying to traverse through all of them and move all of their content to a specific location. I tried shutil and glob but I couldn't get it to work. I even tried to run shell commands using subprocess.call and that also did not work either. I understand that it didn't work because I couldn't apply it properly but I couldn't find any solution that moves all contents of a directory to another.
files = glob.glob('Food101-AB/*/')
dest = 'Food-101/'
if not os.path.exists(dest):
    os.makedirs(dest)
subprocess.call("mv Food101-AB/* Food-101/", shell=True)
# for child in files:
# shutil.move(child, dest)
I'm trying to move everything in Food101-AB to Food-101
The shutil module of the standard library is the way to go:
>>> import shutil
>>> shutil.move("Food101-AB", "Food-101")
If you don't want to move Food101-AB folder itself, try using this:
import shutil
import os
for i in os.listdir("Food101-AB"):
    shutil.move(os.path.join("Food101-AB", i), "Food-101")
For more information about move function:
https://docs.python.org/3/library/shutil.html#shutil.move
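One detail worth guarding against: if the destination folder does not exist yet, shutil.move renames the first moved entry to the destination name instead of moving it inside. The sketch below creates the destination first and passes an explicit target name for each entry. The helper move_contents and the sample file names are hypothetical; the folder names come from the question, recreated inside a temporary directory.

```python
import os
import shutil
import tempfile


def move_contents(src, dest):
    """Move every entry inside src into dest, creating dest if needed."""
    os.makedirs(dest, exist_ok=True)
    moved = []
    for name in os.listdir(src):
        shutil.move(os.path.join(src, name), os.path.join(dest, name))
        moved.append(name)
    return sorted(moved)


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "Food101-AB")
    dest = os.path.join(tmp, "Food-101")
    # Hypothetical sample content: one subdirectory with a file, one loose file.
    os.makedirs(os.path.join(src, "apple_pie"))
    open(os.path.join(src, "apple_pie", "1.jpg"), "w").close()
    open(os.path.join(src, "README.txt"), "w").close()

    moved = move_contents(src, dest)
    moved_file_exists = os.path.isfile(os.path.join(dest, "apple_pie", "1.jpg"))
    src_left = os.listdir(src)
```

Subdirectories are moved whole, so there is no need to walk into them; the source folder itself is left behind, empty.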
Try to change call function to run in order to retrieve the stdout, stderr and return code for your shell command:
from subprocess import run, CalledProcessError
source_dir = "full/path/to/src/folder"
dest_dir = "full/path/to/dest/folder"
try:
    res = run(["mv", source_dir, dest_dir], check=True, capture_output=True)
except CalledProcessError as ex:
    print(ex.stdout, ex.stderr, ex.returncode)
I want to convert multiple FASTA format files (DNA sequences) to the NEXUS format using the Bio.SeqIO module, but I get this error:
Traceback (most recent call last):
  File "fasta2nexus.py", line 28, in <module>
    print(process(fullpath))
  File "fasta2nexus.py", line 23, in process
    alphabet=IUPAC.ambiguous_dna)
  File "/Library/Python/2.7/site-packages/Bio/SeqIO/__init__.py", line 1003, in convert
    with as_handle(in_file, in_mode) as in_handle:
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/Library/Python/2.7/site-packages/Bio/File.py", line 88, in as_handle
    with open(handleish, mode, **kwargs) as fp:
IOError: [Errno 2] No such file or directory: 'c'
What am I missing?
Here is my code:
##!/usr/bin/env python
from __future__ import print_function # or just use Python 3!
import fileinput
import os
import re
import sys
from Bio import SeqIO, Nexus
from Bio.Alphabet import IUPAC
test = "/Users/teton/Desktop/test"
files = os.listdir(os.curdir)
def process(filename):
    # returns ("basename", "extension"), so [0] picks "basename"
    base = os.path.splitext(filename)[0]
    return SeqIO.convert(filename, "fasta",
                         base + ".nex", "nexus",
                         alphabet=IUPAC.ambiguous_dna)
for files in os.listdir(test):
    for file in files:
        fullpath = os.path.join(file)
        print(process(fullpath))
This code should solve the majority of problems I can see.
from __future__ import print_function # or just use Python 3!
import fileinput
import os
import re
import sys
from Bio import SeqIO, Nexus
from Bio.Alphabet import IUPAC
test = "/Users/teton/Desktop"
def process(filename):
    # returns ("basename", "extension"), so [0] picks "basename"
    base = os.path.splitext(filename)[0]
    return SeqIO.convert(filename, "fasta",
                         base + ".nex", "nexus",
                         alphabet=IUPAC.ambiguous_dna)
for root, dirs, files in os.walk(test):
    for file in files:
        fullpath = os.path.join(root, file)
        print(process(fullpath))
I changed a few things. First, I ordered your imports (personal thing) and made sure to import IUPAC from Bio.Alphabet so you can actually assign the correct alphabet to your sequences. Next, in your process() function, I added a line to split the extension off the filename, then used the full filename for the first argument, and just the base (without the extension) for naming the Nexus output file. Speaking of which, I assume you'll be using the Nexus module in later code? If not, you should remove it from the imports.
I wasn't sure what the point of the last snippet was, so I didn't include it. In it, though, you appear to be walking the file tree and process()ing each file again, then referencing some undefined variable named count. Instead, just run process() once, and do whatever count refers to within that loop.
You may want to consider adding some logic to your for loop to test that the file returned by os.path.join() actually is a FASTA file. Otherwise, if any other file type is in one of the directories you search and you process() it, all sorts of weird things could happen.
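One simple form of that check is to filter on the file extension before calling process(). This is only a sketch: the helper name looks_like_fasta and the set of extensions are assumptions, so adjust them to whatever your files actually use. It does not inspect the file contents.

```python
import os

# Assumed set of FASTA extensions; extend or trim it to match your data.
FASTA_EXTENSIONS = {".fa", ".fasta", ".fas", ".fna"}


def looks_like_fasta(filename):
    """Return True if the file name ends in one of the assumed FASTA extensions."""
    extension = os.path.splitext(filename)[1].lower()
    return extension in FASTA_EXTENSIONS


# Hypothetical directory listing, used only to show the filter.
names = ["seq1.fa", "notes.txt", "SEQ2.FASTA", "seq1.nex"]
fasta_names = [name for name in names if looks_like_fasta(name)]
print(fasta_names)
```

Inside the loop, the same test would go right before the call, for example: if looks_like_fasta(file): print(process(fullpath)). This also keeps the script from trying to convert the .nex files it has just written.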
EDIT
OK, based on your new code I have a few suggestions. First, the line
files = os.listdir(os.curdir)
is completely unnecessary, as below the definition of the process() function, you're redefining the files variable. The line itself is valid (os.curdir is just the string '.', not a function to call), but its result is never used.
The code at the bottom should simply be this:
for file in os.listdir(test):
    print(process(os.path.join(test, file)))
for file in files is redundant, and calling os.path.join() with a single argument does nothing.
NameError
You imported SeqIO but are calling seqIO.convert(). Python is case-sensitive. The line should read:
return SeqIO.convert(filename + '.fa', "fasta", filename + '.nex', "nexus", alphabet=IUPAC.ambiguous_dna)
IOError: for files in os.walk(test):
IOError is raised when a file cannot be opened. It often arises because the filename and/ or file path provided does not exist.
os.walk(test) iterates through all subdirectories in the path test. During each iteration, files will be a tuple of 3 elements. The first element is the path of the directory, the second element is a list of subdirectories in that path, and the third element is a list of files in that path. You should be passing a filename to process(), but you are passing that whole tuple in process(files).
You have implemented it correctly in this block for root, dirs, files in os.walk(test):. I suggest you implement it similarly in the for loop below.
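To make the shape of what os.walk yields concrete, here is a small sketch that unpacks the three elements and builds a full path for each file. The helper list_files and the sample file names are hypothetical, and the tree is built in a temporary directory.

```python
import os
import tempfile


def list_files(top):
    """Return the full path of every file below top, sorted."""
    paths = []
    # Each iteration yields (directory path, subdirectory names, file names).
    for root, dirs, files in os.walk(top):
        for name in files:
            paths.append(os.path.join(root, name))
    return sorted(paths)


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "sub"))
    open(os.path.join(tmp, "a.fa"), "w").close()
    open(os.path.join(tmp, "sub", "b.fa"), "w").close()
    relative = [os.path.relpath(p, tmp) for p in list_files(tmp)]
    print(relative)
```

Joining root with each name is what gives process() a path it can open regardless of the current working directory.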
You are adding .fa to your filename. Don't add .fa.