This question already has answers here:
How to split a dos path into its components in Python
(23 answers)
Closed 6 years ago.
This is what I'm doing. I'm taking a text from a folder, modifying that text, and writing it out to another folder with a modified file name. I'm trying to establish the file name as a variable. Unfortunately this happens:
import os
import glob
path = r'C://Users/Alexander/Desktop/test/*.txt'
for file in glob.glob(path):
    name = file.split(r'/')[5]
    name2 = name.split(".")[0]
    print(name2)
Output: test\indillama_Luisa_testfile
The file name is 'indillama_Luisa_testfile.txt', and it is saved in a folder on my desktop called 'test'.
Python is including the 'test\' in the file name. If I try to split name at [6] it says that index is out of range. I'm using regex and I'm assuming that it's reading '/*' as a single unit and not as a slash in the file directory.
How do I get the file name?
You can split by the OS path separator:
import os
import glob
path = r'C://Users/Alexander/Desktop/test/*.txt'
for file in glob.glob(path):
    name = file.split(os.path.sep)[-1]
    name2 = name.split(".")[0]
    print(name2)
import os
import glob
path = r'C://Users/Alexander/Desktop/test/*.txt'
for file in glob.glob(path):
    name = os.path.basename(file)
    # split the file name, not the full path, so 'path' is not overwritten
    (stem, ext) = os.path.splitext(name)
    print(stem)
os.path.basename() extracts the filename part of the path. os.path.splitext() returns a tuple containing everything before the extension and the split-off extension. Your example seemed to be printing the name without its extension, so that is what my suggested answer prints.
For portability, it's usually safer to use the built-in path manipulation routines rather than trying to do it yourself.
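As a minimal sketch of that point, the snippet below compares manual splitting with pathlib on a path shaped like the one in the question. The string `raw` is an assumption about what glob returns on Windows for that pattern (forward slashes from the pattern, a backslash before the file name); `PureWindowsPath` is used so the example behaves the same on any OS.

```python
from pathlib import PureWindowsPath

# Assumed shape of one glob result on Windows: forward slashes come from
# the pattern, the backslash before the file name comes from the OS.
raw = 'C://Users/Alexander/Desktop/test\\indillama_Luisa_testfile.txt'

# Splitting on '/' alone leaves the last folder attached to the file name.
print(raw.split('/')[5])

# PureWindowsPath treats both '/' and '\' as separators.
print(PureWindowsPath(raw).name)  # file name with extension
print(PureWindowsPath(raw).stem)  # file name without extension
```

On a real Windows machine, `Path(file).stem` inside the glob loop should give the same result without any manual splitting.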
You can use os.listdir(path) to list all the entries in a directory. Here path must be the directory itself, not a glob pattern.
Then iterate over the list to get the filename of each file.
for file in os.listdir(path):
    name2 = file.split(".")[0]
    print(name2)
I am trying to run a program which requires only pVCF files as inputs. Due to the size of the data, I am unable to create a separate directory containing the particular files that I need.
The directory contains multiple files with 'vcf.gz.tbi' and 'vcf.gz' endings. Using the following code:
file_url = "file:///mnt/projects/samples/vcf_format/*.vcf.gz"
I tried to create a file path that only grabs the '.vcf.gz' files while excluding the '.vcf.gz.tbi' files, but I have been unsuccessful.
The code you have, as written, is just assigning your file path to the variable file_url. For something like this, glob is popular but isn't the only option:
import glob, os
# plain local path: glob and os do not understand the file:// scheme
vcf_dir = "/mnt/projects/samples/vcf_format"
for file in glob.glob(os.path.join(vcf_dir, "*.vcf.gz")):
    print(file)
Note that the file path doesn't specify the kind of file you want (in this case, a gzipped VCF); the pattern passed to glob does that.
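To see why the pattern alone is enough to skip the index files, here is a small sketch using fnmatch, the module glob relies on for name matching. The file names in `names` are made up for illustration. A pattern has to match the whole name, so anything ending in `.tbi` is left out.

```python
import fnmatch

# Hypothetical directory listing, for illustration only.
names = ["sample1.vcf.gz", "sample1.vcf.gz.tbi", "sample2.vcf.gz", "notes.txt"]

# "*.vcf.gz" must match the entire name, so "sample1.vcf.gz.tbi" is excluded.
vcf_only = fnmatch.filter(names, "*.vcf.gz")
print(vcf_only)  # ['sample1.vcf.gz', 'sample2.vcf.gz']
```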
Check out this answer for more options.
It took some digging but it looks like you're trying to use the import_vcf function of Hail. To put the files in a list so that it can be passed as input:
import glob, os
import hail as hl
# plain local path: glob does not understand the file:// scheme
vcf_dir = "/mnt/projects/samples/vcf_format"
def get_vcf_list(path):
    # "*.vcf.gz" must match the whole name, so the ".vcf.gz.tbi" files are skipped
    local_paths = sorted(glob.glob(os.path.join(path, "*.vcf.gz")))
    # put the file:// scheme back in front of each path
    return ["file://" + p for p in local_paths]
# Now you pass 'get_vcf_list(vcf_dir)' as your input instead of 'file_url'
mt = hl.import_vcf(get_vcf_list(vcf_dir), force_bgz=True, reference_genome="GRCh38", array_elements_required=False)
This question already has answers here:
Find a file in python
(9 answers)
Closed 1 year ago.
file searching
How would I make a program that checks whether my PC has a file with a certain name, no matter what type the file is, and then prints yes or no?
For example, I would put in a file name and it would print whether I have it or not. I've tried similar things, but they didn't work for me.
You can use the os module for this.
import os
def find(name, path):
    for root, dirs, files in os.walk(path):
        if name in files:
            return os.path.join(root, name)  # or you could print 'found' or something
This will find the first match. If it exists, it will return the file path, otherwise it returns None. Note that it is case sensitive.
This was taken from this answer.
You can also use the pathlib module. Code written with pathlib works on any OS.
# pathlib works on any OS (Linux, Windows, macOS)
from pathlib import Path
# define the search path here; '.' means the current working directory. You can specify another path, like Path('/User/<user_name>')
search_path = Path('.')
# define the file name to search for here
file_name = 'test.txt'
# glob('**/*') searches recursively and returns a generator, so the loop can stop at the first match without listing everything first
for file in search_path.glob('**/*'):
    if file.name == file_name:
        print(f'file found = {file.absolute()}')  # print the file path if the file is found
        break
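The question asks for a match "no matter what type the file is" and a yes/no answer, which neither snippet above does directly. A rough sketch of that variant follows. The function name `has_file_named` is hypothetical, and comparing lowercased stems (the name without its last extension) is an assumption about what "same name" should mean.

```python
from pathlib import Path

def has_file_named(stem, search_root):
    """Return True if a file under search_root has this name, ignoring extension and case."""
    wanted = stem.lower()
    # rglob("*") walks search_root recursively and yields entries lazily.
    for candidate in Path(search_root).rglob("*"):
        if candidate.is_file() and candidate.stem.lower() == wanted:
            return True
    return False
```

Calling `print("yes" if has_file_named("test", ".") else "no")` would then report whether any file called test (test.txt, test.pdf, and so on) exists below the current directory. Searching a whole drive this way can be slow and may hit folders the user is not allowed to read.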
This question already has answers here:
Change the file extension for files in a folder?
(7 answers)
Closed 2 years ago.
I have a folder full of .txt files and would like to change them to .dat using a method. From what I have researched I have constructed the portion of code below. However, when I run it nothing is changed and they stay as .txt.
def ChangeFileExt(path, curr_ext, new_ext):
    with os.scandir(path) as itr:
        for entry in itr:
            if entry.name.endswith(curr_ext):
                name = entry.name.split('.')
                name = name + '.' + new_ext
                src = os.path.join(path,entry.name)
                dst = os.path.join(path,name)
                os.rename(src, dst)
Here is what I would do; it should be robust in many situations.
import glob
from pathlib import Path
from shutil import copyfile
# glob all the .txt file paths (absolute, because the pattern is absolute)
f_glob = "/[the absolute directory]/*.txt"
ls_f_dirs = glob.glob(f_glob)
# loop through the file paths for renaming
# (the renamed copies go into a new folder; the original files
# in the existing folder are not renamed directly)
for f_dir in ls_f_dirs:
    # get the file stem, excluding the extension
    f_stem = Path(f_dir).stem
    # copy the file under its new name into the new absolute directory
    copyfile(f_dir, '/[the new storing absolute directory]/{}.dat'.format(f_stem))
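If renaming in place is preferred over copying, something along these lines may work. It is only a sketch: `change_file_ext` is a hypothetical name, and it assumes both extensions are passed with their leading dot (".txt", ".dat"). Note that in the question's code, `entry.name.split('.')` produces a list, so the following string concatenation cannot build a file name; `Path.with_suffix` avoids that step.

```python
from pathlib import Path

def change_file_ext(folder, curr_ext, new_ext):
    """Rename files in folder whose suffix is curr_ext so that it becomes new_ext."""
    renamed = []
    # sorted() materializes the listing first, so renaming does not disturb iteration.
    for entry in sorted(Path(folder).iterdir()):
        if entry.is_file() and entry.suffix == curr_ext:
            target = entry.with_suffix(new_ext)  # same stem, new extension
            entry.rename(target)
            renamed.append(target.name)
    return renamed
```

A call such as `change_file_ext("/[the absolute directory]", ".txt", ".dat")` returns the new names. The function has to be called for anything to happen, and what happens when a target name already exists depends on the platform.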
I have multiple text files with names containing 6 groups of period-separated digits matching the pattern year.month.day.hour.minute.second.
I want to add a .txt suffix to these files to make them easier to open as text files.
I tried the following code, and I also tried os.rename, without success:
path = os.chdir('realpath')
for f in os.listdir():
    file_name = os.path.splitext(f)
    name = file_name +tuple(['.txt'])
    print(name)
Question
How can I add .txt to the end of these file names?
You have many problems in your script. You should read each method's documentation before using it. Here are some of your mistakes:
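os.chdir('realpath') - Do you really want to go to a directory named realpath? Also, os.chdir returns None, so path ends up as None.
os.listdir() - In Python 2 this is missing its argument; you need to pass a path to listdir. From Python 3.2 on, the path is optional and defaults to the current directory.
print(name) - This only prints the value (here a tuple, because os.path.splitext returns one); it does not rename the file.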
Here is a script that uses a regex to find files whose names are made of 6 groups of digits (corresponding to your pattern year.month.day.hour.minute.second) in the current directory, then adds the .txt suffix to those files with os.rename:
import os
import re
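regex = re.compile(r"[0-9]+[.][0-9]+[.][0-9]+[.][0-9]+[.][0-9]+[.][0-9]+")
for filename in os.listdir("."):
    # fullmatch requires the whole name to fit the pattern, so files that already end in .txt are skipped
    if regex.fullmatch(filename):
        os.rename(filename, filename + ".txt")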
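Before renaming anything, it can help to check which names such a pattern would actually pick up. The sketch below does that on a made-up list. The names in `sample` and the helper `names_to_rename` are hypothetical, and the pattern is a shorter spelling of the same idea (six dot-separated digit groups).

```python
import re

# Six dot-separated digit groups: year.month.day.hour.minute.second
TIMESTAMP_NAME = re.compile(r"\d+(?:\.\d+){5}")

def names_to_rename(names):
    """Return the names that consist only of six dot-separated digit groups."""
    return [name for name in names if TIMESTAMP_NAME.fullmatch(name)]

# Hypothetical file names, for illustration only.
sample = ["2021.03.14.15.09.26", "2021.03.14.15.09.26.txt", "notes", "2021.03.14"]
print(names_to_rename(sample))  # ['2021.03.14.15.09.26']
```

Because `fullmatch` has to consume the entire name, a file that already carries the .txt suffix is not selected again on a second run.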
This question already has answers here:
How do I list all files of a directory?
(21 answers)
Closed 9 years ago.
I have this code:
allFiles = os.listdir(myPath)
for module in allFiles:
    if 'Module' in module: #if the word module is in the filename
        dirToScreens = os.path.join(myPath, module)
        allSreens = os.listdir(dirToScreens)
Now, all works well, I just need to change the line
allSreens = os.listdir(dirToScreens)
to get a list of just files, not folders.
Therefore, when I use
allScreens = [ f for f in os.listdir(dirToScreens) if os.isfile(join(dirToScreens, f)) ]
it says
module object has no attribute isfile
NOTE: I am using Python 2.7
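You can use the os.path.isfile function. Join each name with the directory first, because os.listdir returns bare names:
import os
from os import path
files = [f for f in os.listdir(dirToScreens) if path.isfile(path.join(dirToScreens, f))]
Or if you feel functional :D
files = filter(lambda f: path.isfile(path.join(dirToScreens, f)), os.listdir(dirToScreens))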
"If you need a list of filenames that all have a certain extension, prefix, or any common string in the middle, use glob instead of writing code to scan the directory contents yourself"
import os
import glob
# glob.glob already returns each name joined with path, so no second join is needed
[name for name in glob.glob(os.path.join(path,'*.*')) if os.path.isfile(name)]
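Putting the pieces of this thread together, a small helper might look like the sketch below. The name `list_files_only` is hypothetical; the function uses only os.listdir and os.path.isfile, which are also available in Python 2.7.

```python
import os

def list_files_only(directory):
    """Return the names of regular files directly inside directory, skipping subfolders."""
    return sorted(
        name for name in os.listdir(directory)
        # isfile needs the full path; a bare name would be checked against the current directory
        if os.path.isfile(os.path.join(directory, name))
    )
```

In the question's loop, `allScreens = list_files_only(dirToScreens)` would then replace the plain os.listdir call.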