Generalize a program using iglob - python

I have a program that scans a folder or path for files of a given type and then analyzes them.
import glob
path = raw_input("ENTER PATH TO SEARCH: ")
list_of_file = glob.iglob('*.txt')
for filename in list_of_file:
    print filename
But this script only scans the directory in which the script itself is stored; the path variable is never used.
Now if I write:
list_of_file = glob.iglob(path + '*.txt')
this does not work either.
So please suggest a way to make the program follow whatever path I enter and search it for particular file types, no matter where I keep my script.

I assume English is not your first language, so please forgive me if I have misunderstood your question. It seems to me that you want to be able to enter a directory and then find all the ".txt" files in that directory?
If so, then the following code should suffice to give you an example:
"""Find all .py files in a given directory."""
import os
import glob
path = raw_input("Enter path: ")
fullglob = os.path.join(path, "*.py")
for fn in glob.iglob(fullglob):
    print os.path.split(fn)[1]
The last line strips off the directory you entered, which was only joined to the pattern so that iglob() searches the right directory. You can use the help() function to get documentation on os.path.join() and os.path.split().
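For Python 3, the same idea could be wrapped in a small function. This is only a minimal sketch, not the answerer's code; the name `list_matching` and its parameters are made up for illustration.

```python
import glob
import os


def list_matching(directory, pattern="*.txt"):
    """Return the base names of files in `directory` that match `pattern`.

    `list_matching` is a hypothetical helper, not part of the glob module.
    """
    # Join the directory and the pattern so the search does not depend
    # on where the script itself is stored.
    full_pattern = os.path.join(directory, pattern)
    # iglob yields full paths lazily; basename() drops the directory part.
    return sorted(os.path.basename(fn) for fn in glob.iglob(full_pattern))
```

It could then be called with user input, for example `list_matching(input("Enter path: "), "*.py")`.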

Related

How to get absolute file path of folder from user input in python? The input gets added at the end of path

import os
print("enter folder name")
FolderName = input()
flag = os.path.isabs(FolderName)
if flag == False:
    path = os.path.abspath(FolderName)
    print("The absolute path is: " ,path)
What am I doing wrong here? Let's say the Folder name input is Neon.
The code output gives C:\Users\Desktop\Codes\Neon\Neon
Instead what I want is: C:\Users\Desktop\Codes\Neon\
The os.path.abspath function joins the user's current working directory with the input argument and then normalizes the result.
So if your input is 'Neon' and your current working directory is C:\Users\Desktop\Codes\Neon, then the output is C:\Users\Desktop\Codes\Neon\Neon.
Likewise, if your input is fkdjfkjdsk, then the output would be C:\Users\Desktop\Codes\Neon\fkdjfkjdsk.
If you are looking for a way to get the absolute path of the current directory you can use:
os.getcwd()
For the official definition:
os.path.abspath(path)
Return a normalized absolutized version of the pathname path. On most platforms, this is equivalent to calling the function normpath() as follows: normpath(join(os.getcwd(), path)).
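The documented equivalence can be checked directly. The snippet below is a small sketch that uses the folder name from the question; which directory gets printed depends entirely on where the script is run from.

```python
import os

name = "Neon"  # example folder name taken from the question

# abspath() only manipulates strings relative to the current working
# directory; it never checks that the resulting path exists.
expected = os.path.normpath(os.path.join(os.getcwd(), name))
result = os.path.abspath(name)

print(result)
print(result == expected)
```

Running it from inside a folder that is itself called Neon reproduces the doubled Neon\Neon ending described in the question.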
You are probably running your code from the C:\Users\Desktop\Codes\Neon\ directory.
Therefore, when you run os.path.abspath("Neon"), the function is assuming you are trying to refer to a file in the current directory, and returns C:\Users\Desktop\Codes\Neon\Neon.
If you want to have the absolute path of the current directory, use:
os.path.abspath(".")
Most of the functions in the os.path module do not check whether a file or directory exists before operating on it. That is, if you enter the path of a filesystem object that doesn't exist, the function will most likely still return a result for it.
The current working directory of your Python process is not the one you expect.
Previous answers have covered the behavior of the abspath function. The following code would produce the desired output (only for your case).
import os
os.chdir(r"C:\Users\Desktop\Codes")
print("enter folder name")
FolderName = input()
flag = os.path.isabs(FolderName)
if flag == False:
    path = os.path.abspath(FolderName)
    print("The absolute path is: " ,path)
But if you want to be sure, first display the current working directory to confirm that the parent directory is the correct one. Also, add a directory existence check (such as isdir) to the code to confirm that the directory name provided as input is real.
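One way to combine both suggestions is sketched below. It resolves the name against an explicit parent directory instead of calling os.chdir(), and checks the result with isdir. The function name `resolve_folder` and the `base_dir` parameter are assumptions made for this example.

```python
import os


def resolve_folder(folder_name, base_dir=None):
    """Return the absolute path of `folder_name`, or None if it is not a directory.

    `resolve_folder` and `base_dir` are hypothetical names for this sketch.
    Relative names are resolved against `base_dir` (default: the current
    working directory) instead of relying on os.chdir().
    """
    if base_dir is None:
        base_dir = os.getcwd()
    # join() discards base_dir when folder_name is already absolute.
    candidate = os.path.normpath(os.path.join(base_dir, folder_name))
    return candidate if os.path.isdir(candidate) else None
```

A caller could print `os.getcwd()` first, then keep asking for input while `resolve_folder(...)` returns None.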

Make glob directory variable

I'm trying to write a Python script that searches a folder for all files with the .txt extension. In the manuals, I have only seen it hardcoded into glob.glob("hardcoded path").
How do I make the directory that glob searches for patterns a variable? Specifically: A user input.
This is what I tried:
import glob
input_directory = input("Please specify input folder: ")
txt_files = glob.glob(input_directory+"*.txt")
print(txt_files)
Despite giving the right directory with the .txt files, the script prints an empty list [ ].
If you are not sure whether a path contains a separator symbol at the end (usually '/' or '\'), you can concatenate using os.path.join. This is a much more portable method than appending your local OS's path separator manually, and much shorter than writing a conditional to determine if you need to every time:
import glob
import os
input_directory = input('Please specify input folder: ')
txt_files = glob.glob(os.path.join(input_directory, '*.txt'))
print(txt_files)
For Python 3.4+, you can use pathlib.Path.glob() for this:
import pathlib
input_directory = pathlib.Path(input('Please specify input folder: '))
if not input_directory.is_dir():
    # Input is invalid. Bail or ask for a new input.
    raise SystemExit('Not a directory: {}'.format(input_directory))
for file in input_directory.glob('*.txt'):
    # Do something with file.
    print(file)
There is a time-of-check to time-of-use race between the is_dir() call and the glob, which unfortunately cannot be easily avoided because glob() just returns an empty iterator in that case. On Windows, it may not even be possible to avoid because you cannot open directories to get a file descriptor. This is probably fine in most cases, but could be a problem if your application has a different set of privileges from the end user or from other applications with write access to the parent directory. This problem also applies to any solution using glob.glob(), which has the same behavior.
Finally, Path.glob() returns an iterator, and not a list. So you need to loop over it as shown, or pass it to list() to materialize it.
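As a rough illustration of the last point, the sketch below validates the directory and then materializes the iterator with sorted(). The helper name `txt_files_in` is hypothetical, and the race described above still applies to it.

```python
import pathlib


def txt_files_in(directory):
    """Return a sorted list of .txt paths directly inside `directory`.

    `txt_files_in` is a hypothetical helper name. It raises
    NotADirectoryError when the input is not an existing directory.
    """
    root = pathlib.Path(directory)
    if not root.is_dir():
        raise NotADirectoryError(str(root))
    # Path.glob() is lazy; sorted() consumes it and returns a list.
    return sorted(root.glob("*.txt"))
```

Because the result is a list, it can be indexed, counted with len(), or iterated more than once.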

Python - Getting file directory as user input

Having some trouble getting a list of files from a user defined directory. The following code works fine:
inputdirectory = r'C:/test/files'
inputfileextensions = 'txt'
files = glob.glob(inputdirectory+"*."+inputfileextensions)
But I want to allow the user to type in the location. I've tried the following code:
inputdirectory = input("Please type in the full path of the folder containing your files: ")
inputfileextensions = input("Please type in the file extension of your files: ")
files = glob.glob(inputdirectory+"*."+inputfileextensions)
But it doesn't work. No error message occurs, but files comes back empty. I've tried typing in the directory with quotes, with forward and backward slashes, but can't get it to work. I've also tried converting the input to a raw string using 'r', but maybe my syntax is wrong. Any ideas?
Not quite sure how the first version works for you. The way the variables are defined, the input to glob would be something like:
inputdirectory+"*."+inputfileextensions == "C:/test/files*.txt"
Looking at the above value, you can see that it is not what you are trying to achieve. Instead, you need to join the two parts with a path separator between them. Something like (on Windows):
os.path.join(inputdirectory, "*."+inputfileextensions) == r"C:/test/files\*.txt"
With this change, the code should work regardless of whether the input is taken from the user or predefined.
Try joining the path with os.path.join. It will handle the slash issue.
import os
...
files = glob.glob(os.path.join(inputdirectory, "*."+inputfileextensions))
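The difference between the two approaches is easy to see by printing both patterns. This is a small sketch; the folder name is a made-up relative path, not one from the question.

```python
import os

directory = os.path.join("data", "files")  # hypothetical folder, no trailing separator
extension = "txt"

# Plain concatenation glues the pattern onto the last path component,
# so glob would look for names starting with "files" in "data".
concatenated = directory + "*." + extension

# os.path.join inserts the separator, so glob looks inside "files".
joined = os.path.join(directory, "*." + extension)

print(concatenated)
print(joined)
```

The concatenated form only works when the user happens to type a trailing separator, which is why the hard-coded and user-input versions can behave differently.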
Working sample code, with recursive search:
#!/usr/bin/python3
import glob
import os
dirname = input("What is dir name to search files? ")
path = os.path.join(dirname,"**")
for x in glob.glob(path, recursive=True):
    print(x)
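The sample above lists everything under the directory. To restrict a recursive search to one extension, the pattern can be extended as in this sketch; `find_recursive` is a hypothetical helper name.

```python
import glob
import os


def find_recursive(dirname, extension="txt"):
    """Return every file under `dirname` with the given extension.

    `find_recursive` is a hypothetical helper name for this sketch.
    """
    # With recursive=True, "**" matches the directory itself and
    # all of its subdirectories.
    pattern = os.path.join(dirname, "**", "*." + extension)
    return sorted(glob.glob(pattern, recursive=True))
```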

How can I design a function which opens a folder based on user input, picks files with certain title format, and then reads specific lines from them?

I am having a moment of complete brain freeze right now and cannot seem to figure out how to design a code that will perform a series of tasks in one function. Basically, I want to design a code that asks the User for the input of a name of a folder in the working directory. The folders will have a 6 digit number for their name (for this example, let's assume the number of the folder is 111234). Once the folder is specified, certain files inside said folder will be opened and read. These files will be selected based on how their filenames are formatted, which is
(foldername)_(filenumber)_0_structure.in
wherein, for this example, the foldername is 111234 and the filenumber represents the order that the file appears in within the folder (can be the number zero or higher). The other terms in the filename (such as the zero after the filenumber), the word "structure", and the .in file extension are all constant. After all files that conform to this format are selected and opened, I want the 2nd and 3rd lines of these files to be read and copied into a dict whose keys are the file's filenumber and whose values contain a list of strings (i.e. the 2nd and 3rd lines).
So far, I have written the following in order to address these needs:
import os
from os import path
import re
def folder_validation(foldername):
    folder_contents= {}
    while True:
        try:
            foldername= str(raw_input(foldername))
            file_path= path.join(current_directory, foldername)
        except IOError:
            print("Please give the name of a folder that exists in the current working directory.")
            continue
        for filename in os.listdir(file_path):
            if re.search("{}_{}_0_detect1.in".format(foldername,[0*]), filename):
                file_contents= str(open(filename).readlines()[2:3])
                folder_contents[filenumber]= file_contents
        return folder_contents
folder_input= folder_validation("Please give the name of the relevant folder you wish to analyze:")
The most obvious problem with the above code is that I am not sure how to format the regular expression search to include the user's input and the placement of any integer number in the filenumber variable. Additionally, the raw_input does not seem to be working. Any assistance would be most appreciated.
There were two main problems in my code. The first was that I did not properly configure the while loop condition, so the code would get stuck. The second was that I did not set up the file path to the files in my folder correctly, and as a result my code could not open the files and read them. The regex was also improved so that any filename with a file number of 0 or above (in the specified format) is read. The corrected version of the code is posted below.
import os
from os import path
import re

def folder_validation(prompt):
    folder_contents = {}
    current_directory = os.getcwd()
    while True:
        foldername = raw_input(prompt)
        file_path = path.join(current_directory, foldername)
        # path.join() never raises, so test for the folder explicitly.
        if path.isdir(file_path):
            break
        print("Please give the name of a folder that exists in the current working directory.")
    # Match (foldername)_(one or more digits)_0_detect1.in
    pattern = re.compile(r"{}_[0-9]+_0_detect1\.in$".format(re.escape(foldername)))
    for filename in os.listdir(file_path):
        if pattern.match(filename):
            with open(path.join(file_path, filename)) as file_contents:
                # The 2nd and 3rd lines are at indices 1 and 2.
                file_lines = file_contents.readlines()[1:3]
            folder_contents[filename] = file_lines
    return folder_contents

folder_input = folder_validation("Please give the name of the relevant folder you wish to analyze:")
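The question asked for a dict keyed by file number rather than by filename. A capture group in the regex is one possible way to get that. The sketch below is written for Python 3, uses the `_0_structure.in` filename layout described in the question, and takes the folder path as an argument instead of prompting; the function name `read_structure_lines` is hypothetical.

```python
import os
import re


def read_structure_lines(folder_path):
    """Map each file number to lines 2 and 3 of the matching file.

    `read_structure_lines` is a hypothetical name; the filename layout
    (foldername)_(filenumber)_0_structure.in is taken from the question.
    """
    foldername = os.path.basename(os.path.normpath(folder_path))
    # re.escape() guards against special characters in the folder name;
    # the capture group pulls out the file number.
    pattern = re.compile(
        r"{}_(\d+)_0_structure\.in$".format(re.escape(foldername))
    )
    contents = {}
    for filename in os.listdir(folder_path):
        match = pattern.match(filename)
        if match:
            with open(os.path.join(folder_path, filename)) as handle:
                lines = handle.read().splitlines()
            # Index 1 and 2 are the 2nd and 3rd lines.
            contents[int(match.group(1))] = lines[1:3]
    return contents
```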

How to confirm only html files exist in a given folder and if not then how to prompt the user to specify a folder with only html files within

I would like to create a Python script that will do 3 things:
1) Take user input to navigate to a file directory
2) Confirm the file contents (a particular set of files needs to be in the folder for the script to proceed)
3) Do a find and replace
The Code as of now:
import os, time
from os.path import walk
mydictionary = {"</i>":"</em>"}
for (path, dirs, files) in os.walk(raw_input('Copy and Paste Course Directory Here: ')):
    for f in files:
        if f.endswith('.html'):
            filepath = os.path.join(path,f)
            s = open(filepath).read()
            for k, v in mydictionary.iteritems():  # terms from the dictionary to find and replace
                s = s.replace(k, v)
            f = open(filepath, 'w')
            f.write(s)
            f.close()
Now I have parts 1 and 3; I just need part 2.
For part 2, though, I need to confirm that only HTML files exist in the directory the user specified; otherwise the script should prompt the user to enter the correct folder (one that contains only HTML files).
Thanks
From what I understand, here's your pseudocode:
Ask user for directory
If all files in that directory are .html files:
    Do the search-and-replace stuff on the files
Else:
    Warn and repeat from start
I don't think you actually want a recursive walk here, so first I'll write that with a flat listing:
while True:
    dir = raw_input('Copy and Paste Course Directory Here: ')
    files = os.listdir(dir)
    if all(file.endswith('.html') for file in files):
        # do the search and replace stuff
        break
    else:
        print 'Sorry, there are non-HTML files here. Try again.'
Except for having to translate the "repeat from start" into a while True loop with a break, this is almost a word-for-word translation from the English pseudocode.
If you do need the recursive walk through subdirectories, you probably don't want to write the all as a one-liner. It's not that hard to write "all members of the third member of any member of the os.walk result end with '.html'", but it will be hard to read. But if you turn that English description into something more understandable, you should be able to see how to turn it directly into code.
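One readable way to spell out the recursive check is with explicit loops, as in this Python 3 sketch. The function name `only_html` is made up for illustration and is not from the answer.

```python
import os


def only_html(directory):
    """Return True if every file under `directory` ends with '.html'.

    `only_html` is a hypothetical helper name for this sketch.
    """
    # os.walk yields (path, dirs, files) tuples; files is the third member.
    for _path, _dirs, files in os.walk(directory):
        for name in files:
            if not name.endswith(".html"):
                return False
    return True
```

Note that a directory tree with no files at all also returns True, so a caller that requires at least one HTML file would need an extra check.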
