Glob files for a range of numbers [duplicate] - python

This question already has answers here:
How to use glob to read limited set of files with numeric names?
(2 answers)
Closed 3 years ago.
I have a directory of files named 10.txt, 11.txt, ..., 29.txt, 30.txt.
How can I select files 12 to 21?
I've tried:
glob.glob('path/[12-21].txt')

Glob is useful when you do not know the exact file names and only have a rough pattern. In your case you do know the names, so you might not need glob at all.
It would be simpler to generate your expected list of files and check which of them exist, e.g.:
from os.path import isfile
# Generate a list of expected file names
expected_files = ["path/{}.txt".format(i) for i in range(12, 22)]
# Filter the list to just the files that actually exist.
actual_files = [f for f in expected_files if isfile(f)]
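If globbing is still preferred, note that glob treats [12-21] as a character class that matches exactly one character (here '1' or '2'), so that pattern can only match 1.txt or 2.txt. A minimal sketch of an alternative, assuming the files are named <number>.txt: glob everything and filter on the numeric value. The helper name files_in_range is hypothetical, not part of the original answer.

```python
import glob
import os

def files_in_range(directory, low, high):
    """Return paths of <number>.txt files with low <= number <= high."""
    selected = []
    for path in glob.glob(os.path.join(directory, "*.txt")):
        stem = os.path.splitext(os.path.basename(path))[0]
        # Skip names that are not purely numeric, then compare as integers.
        if stem.isdigit() and low <= int(stem) <= high:
            selected.append((int(stem), path))
    # Sort by the numeric value so 12 comes before 13, and so on.
    return [path for _, path in sorted(selected)]
```

For the question above, the call would be files_in_range('path', 12, 21).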

Related

How to parse multiple fastq files from a directory? [duplicate]

This question already has an answer here:
Can't Open files from a directory in python [duplicate]
(1 answer)
Closed 2 years ago.
I am trying to write a loop that parses, one by one, 5 FASTA files that I have in the same directory. Each file holds the genome of one of 5 microorganisms. The idea is to obtain the IDs from each file and put them into a dictionary {Mo_Id1:0, Mo_Id2:0...,Mo_Id5:0}
I think my loop reads the first file, but then it gives me the following error: No such file or directory 'GCF_000006532.1_ASM696v3_genomic.fna' (this is the name of the second file in my folder).
Here is my code:
from Bio import SeqIO
import os
dicc_MO=[]
files = os.listdir("/home/alumno/Escritorio/Asig2Python/Semana4/Tarea/genomas/genomas")
for f in files:
    for record_seqMO in SeqIO.parse(f,"fasta"):
        if record_seqMO.id not in dicc_MO:
            dicc_MO[record_seqMO.id] = 0
print(dicc_MO)
With dicc_MO I was trying to check whether the loop worked; if it did, I should have a dictionary where the keys are the microorganism names and the values are 0.
os.listdir returns only the file names, without the directory path, so you need to prepend the path to each name in your list files.
from Bio import SeqIO
import os
dicc_MO = {}  # a dict, not a list, so IDs can be used as keys
genome_dir = "/home/alumno/Escritorio/Asig2Python/Semana4/Tarea/genomas/genomas"
files = os.listdir(genome_dir)
for f in files:
    # Prepend the directory so SeqIO.parse receives a full path.
    f = os.path.join(genome_dir, f)
    for record_seqMO in SeqIO.parse(f, "fasta"):
        ...
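An alternative that avoids joining paths by hand is glob.glob, which returns each match with the directory prefix already attached. The sketch below assumes the genome files all end in .fna, as the file name in the error message suggests; the helper name genome_paths is hypothetical.

```python
import glob
import os

def genome_paths(directory, pattern="*.fna"):
    """Return the full paths of files in directory that match pattern, sorted by name."""
    # glob.glob keeps the directory part, unlike os.listdir.
    return sorted(glob.glob(os.path.join(directory, pattern)))
```

Each returned path can then be handed directly to a parser, regardless of the current working directory.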

Using files in directory that are matching with a list [duplicate]

This question already has answers here:
Get a filtered list of files in a directory
(14 answers)
Closed 2 years ago.
I have a directory with a lot of audio files and a list containing the names of some of them. I want to process only the files in the directory whose names are in my list.
My attempt (which does not work) is this:
path_to_audio_files = 'D:/Data_small/output_conv/'
sample_list = ['s1.wav', 's2.wav', 's3.wav']
import fnmatch
for file in os.listdir(path_to_audio_files):
    if fnmatch.fnmatch(file, sample_list):
        fs, signal = wavfile.read(file)
Is this even the right way to do it or how can it be done?
Thanks!
Why use fnmatch (I have never heard of it) instead of Python's native capabilities (the in operator)? fnmatch.fnmatch expects a single pattern string as its second argument, not a list.
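import os
from scipy.io import wavfile  # assumed source of wavfile; the question does not show its import
path_to_audio_files = 'D:/Data_small/output_conv/'
sample_list = ['s1.wav', 's2.wav', 's3.wav']
for file in os.listdir(path_to_audio_files):
    if file in sample_list:
        # os.listdir returns bare names, so join with the directory before reading.
        fs, signal = wavfile.read(os.path.join(path_to_audio_files, file))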

How to find all objects of specific type in folders? [duplicate]

This question already has answers here:
Obtaining file basename with a prespecified extension in Python
(2 answers)
Closed 2 years ago.
Say I have a folder structure with files inside. In some files, objects of a specific type can be found. I want to initiate them dynamically (I know each instance of that type has a specific method), but I don't know how to implement it. I think the pseudocode could be something like this:
for root, dirs, files in os.walk(root_folder):
    for f in files:
        if os.path.splitext(f)[1] == '.py':
            get_fully_name_of_the_file # like app.modules.amazon
            from from_fully_name import that_instance
            that_instance.my_method()
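import glob
import os
all_types = []
# Collect the extension of every path found recursively under root_folder.
for f in glob.glob("root_folder/**/*", recursive=True):
    ext = os.path.splitext(f)[1]
    if ext:  # skip names without an extension
        all_types.append(ext)
print(all_types)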
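The pseudocode in the question can be approximated with importlib. This is only a sketch under stated assumptions: every .py file under root_folder is importable by its dotted path relative to root_folder, and the object of interest is a module-level attribute called that_instance with a my_method method (both names come from the question). The helpers module_names and call_my_method are hypothetical. Note that importing a module executes its top-level code.

```python
import importlib
import os
import sys

def module_names(root_folder):
    """Yield dotted module names (like app.modules.amazon) for .py files under root_folder."""
    for root, dirs, files in os.walk(root_folder):
        for f in files:
            stem, ext = os.path.splitext(f)
            if ext != ".py" or stem == "__init__":
                continue
            # Path relative to root_folder, with separators turned into dots.
            rel = os.path.relpath(os.path.join(root, stem), root_folder)
            yield rel.replace(os.sep, ".")

def call_my_method(root_folder, attr="that_instance"):
    """Import each module and call my_method() on its `attr` object, if present."""
    sys.path.insert(0, root_folder)  # make the dotted names importable
    importlib.invalidate_caches()
    results = []
    try:
        for name in module_names(root_folder):
            module = importlib.import_module(name)
            instance = getattr(module, attr, None)
            if instance is not None and hasattr(instance, "my_method"):
                results.append(instance.my_method())
    finally:
        sys.path.remove(root_folder)
    return results
```

Modules that do not define that_instance are imported but otherwise ignored.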

How can I move files with random names from one folder to another in Python? [duplicate]

This question already has answers here:
python copy files by wildcards
(3 answers)
Closed 4 years ago.
I have a large number of .txt files named "cb" + number (like cb10, cb13), and I need to pick some of them out of a source folder that contains all the files named "cb" + number, including the target files.
The numbers in the target file names are all random, so I have to list all the file names explicitly.
import fnmatch
import os
import shutil
os.chdir('/Users/college_board_selection')
os.getcwd()
source = '/Users/college_board_selection'
dest = '/Users/seperated_files'
files = os.listdir(source)
for f in os.listdir('.'):
    names = ['cb10.txt','cb11.txt']
    if names in f:
        shutil.move(f,dest)
if names in f: isn't going to work, because f is a file name, not a list. Maybe you want if f in names:
But you don't need to scan the whole directory for this; just loop over the files you're targeting and move them if they exist:
for f in ['cb10.txt','cb11.txt']:
    if os.path.exists(f):
        shutil.move(f,dest)
If you have a lot of cbxxx.txt files, an alternative is to compute the intersection of this list with the result of os.listdir using a set (faster lookup than a list, which is worthwhile when there are many elements):
for f in {'cb10.txt','cb11.txt'}.intersection(os.listdir(".")):
    shutil.move(f,dest)
On Linux, with a lot of "cb" files, this would be faster because listdir doesn't stat each file, whereas os.path.exists does.
EDIT: if the files have the same prefix/suffix, you can build the lookup set with a set comprehension to avoid tedious copy/paste:
s = {'cb{}.txt'.format(i) for i in ('10','11')}
for f in s.intersection(os.listdir(".")):
    shutil.move(f,dest)
Or, for the first alternative:
for p in ['10','11']:
    f = "cb{}.txt".format(p)
    if os.path.exists(f):
        shutil.move(f,dest)
EDIT: if all cb*.txt files must be moved, then you can use glob.glob("cb*.txt"). I won't elaborate, the linked "duplicate target" answer explains it better.
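For the case where every cb*.txt file must be moved, a minimal sketch with glob.glob and shutil.move might look like the following. The helper name move_matching is hypothetical, and the sketch assumes no file of the same name already exists in the destination (shutil.move raises an error in that case).

```python
import glob
import os
import shutil

def move_matching(source, dest, pattern="cb*.txt"):
    """Move every file in source that matches pattern into dest; return the moved names."""
    os.makedirs(dest, exist_ok=True)
    moved = []
    for path in glob.glob(os.path.join(source, pattern)):
        # When dest is an existing directory, shutil.move places the file inside it.
        shutil.move(path, dest)
        moved.append(os.path.basename(path))
    return sorted(moved)
```

Because the pattern is joined with the source directory, this does not depend on os.chdir.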

Read files sequentially in order [duplicate]

This question already has answers here:
Is there a built in function for string natural sort?
(23 answers)
Closed 9 years ago.
I have a number of files in a folder with names following the convention:
0.1.txt, 0.15.txt, 0.2.txt, 0.25.txt, 0.3.txt, ...
I need to read them one by one and manipulate the data inside them. Currently I open each file with the command:
import os
# This is the path where all the files are stored.
folder_path = '/home/user/some_folder/'
# Open one of the files,
for data_file in os.listdir(folder_path):
    ...
Unfortunately this reads the files in no particular order (I'm not sure how it picks them), and I need to read them starting with the smallest number in the file name, then the next larger one, and so on until the last.
A simple example using sorted(), which returns a new sorted list:
import os
# This is the path where all the files are stored.
folder_path = 'c:\\'
# Open one of the files,
for data_file in sorted(os.listdir(folder_path)):
    print(data_file)
You can read more in the Python documentation for sorted().
Edit for natural sorting:
If you are looking for natural sorting, see this great post by @unutbu.
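Plain sorted() compares the names as strings, which happens to work for 0.1.txt, 0.15.txt, 0.2.txt but would put 10.txt before 2.txt. A minimal sketch of a numeric sort, assuming every file name is a number followed by a single extension (the helper names numeric_key and sorted_numeric are hypothetical):

```python
import os

def numeric_key(name):
    """Convert a name like '0.15.txt' to the float 0.15 for sorting."""
    stem = os.path.splitext(name)[0]  # drops only the final extension
    return float(stem)

def sorted_numeric(names):
    """Return names ordered by the numeric value of their stem."""
    return sorted(names, key=numeric_key)
```

In the loop above, sorted(os.listdir(folder_path)) would become sorted_numeric(os.listdir(folder_path)); a name whose stem is not a number raises ValueError.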
