directory and file related doubts? - python

I have a directory with around 1000 files, and I want to run the same code on each of them.
My code requires the file name as input.
I have written code that copies the information from one file into another file in a different format.
Please suggest a way to process all 1000 files one by one without changing the file name by hand each time.
I also have a field, serial_num, which needs to be continuous: if the first file goes up to 30, the next file should continue from 30 rather than starting from 0 again.
Any help is appreciated.
Thanks.
from string import Template
from string import Formatter
import pickle
f=open("C:/begpython/wavnk/text0004.lab",'r')
p='C:/begpython/wavnk/text0004.wav'
f1=open("C:/begpython/text.txt",'a')
m=[]
i=0
k=f.readline()
# compare by value; "is not" tests object identity and is unreliable for strings
while k != '':
    k=f.readline()
    k=k.rstrip('\n')
    mi=k.split(' ')
    m=m+[mi]
    i=i+1
y=0
x=[]
j=1
t=(i-2)
while j<t:
    k=j-1
    l=j+1
    if j==120 or j==i:
        j=j+1
    else:
        x=[]
        x = x + [y, m[j][2], m[k][2], m[l][2], m[j][0], m[l][0], p]
        y=y+1
        #f1.writelines(str(x)+'\n')
        for item in x:
            f1.write(str(item)+' ')
        f1.write(str('\n'))
        j=j+1
f.close()
f1.close()
That is my code.
My files are named in a series, text0001.lab through text1500.lab, and I want to process them all in one run without changing the name each time.

Why not just use an iterator over the list of files in the directory? I would post some example code but I do get the feeling that you're getting everyone else here to do your whole job for you.

You could take a look at the glob module as well. It's this easy:
import glob
list_of_files = glob.glob('C:/begpython/wavnk/*.lab')
And yes, it works on windows as well.
However, it only finds the matching files, doesn't read them or anything.
By the looks of your code example, you may or may not be interested in the Python csv module as well.
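Putting the glob suggestion together with the running serial number from the question, a minimal sketch might look like the following. The function name convert_all, its parameters, and the simplified output line (serial, original line, .wav path) are assumptions for illustration, not the asker's exact conversion.

```python
import glob
import os


def convert_all(src_dir, out_path, pattern="*.lab"):
    """Append one output line per non-empty input line, numbered continuously.

    Hypothetical helper: the output format is a simplification of the
    conversion shown in the question.
    """
    serial = 0  # shared across all files, so numbering never restarts
    with open(out_path, "a") as out:
        # sorted() gives a stable order; glob itself does not guarantee one
        for lab_path in sorted(glob.glob(os.path.join(src_dir, pattern))):
            wav_path = os.path.splitext(lab_path)[0] + ".wav"
            with open(lab_path) as lab:
                for line in lab:
                    line = line.rstrip("\n")
                    if not line:
                        continue  # skip blank lines
                    out.write("%d %s %s\n" % (serial, line, wav_path))
                    serial += 1
    return serial
```

Because serial lives outside the loop over files, the second file simply continues where the first one stopped.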

You can list the contents of the directory with os.listdir.
You can then filter on extension with something like
allnames = listdir...
inputnames = [name for name in allnames
              if os.path.splitext(name)[1] == ".lab"]
You can also look at the filter() or map() built-in functions.
http://docs.python.org/library/os.path.html#os.path.splitext

Related

Re-Naming Files as they are being opened in Python For Loop

I'm trying to create a program that will read in multiple .txt files and rename them in one go. I'm able to read them in, but I'm falling flat when it comes to defining them all.
First I tried including an 'as' statement after the open call in my loop, but the files kept overwriting each other since it's only one name I'm defining. I was thinking I could read them in as 'file1', 'file2', 'file3'... etc
Any idea on how I can get this naming step to work in a for loop?
import os
os.chdir("\\My Directory")
#User Inputs:
num_files = 3
#Here, users' actual file names in their directory would be 'A.txt', 'B.txt', 'C.txt'
filenames = [A, B, C]
j = 1
for i in filenames:
    while j in range(1,num_files):
        open(i + ".txt", 'r').read().split() as file[j]
        j =+ 1
I was hoping that each time it read in the file, it would define each one as file#. Clearly, my syntax is wrong because of the way I'm indexing 'file'. I've tried using another for loop in the for loop, but that gave me a syntax error as well. I'm really new to python and programming logic in general. Any help would be much appreciated.
Thank you!
You should probably use the rename() function in the os module. An example could be:
import os
os.rename("stackoverflow.html", "xyz.html")
Here stackoverflow.html is the current name of the file and xyz.html is the new name (the destination). Hope this helps!
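For the reading part of the question, one common alternative to numbered variables such as file1, file2, file3 is a dictionary keyed by file name. The sketch below assumes the files are plain text named '<basename>.txt'; the function name read_all and its parameters are hypothetical.

```python
import os


def read_all(directory, basenames):
    """Return {basename: list of whitespace-separated words} for each '<basename>.txt'."""
    contents = {}
    for base in basenames:
        path = os.path.join(directory, base + ".txt")
        with open(path) as handle:
            contents[base] = handle.read().split()
    return contents
```

With this, read_all(some_dir, ["A", "B", "C"])["A"] holds the words of A.txt, and no variable names need to be generated inside the loop.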

Delete every 1 of 2 or 3 files on a folder with Python

What I'm trying to do is to write a code that will delete a single one of 2 [or 3] files on a folder. I have batch renamed that the file names are incrementing like 0.jpg, 1.jpg, 2.jpg... n.jpg and so on. What I had in mind for the every single of two files scenario was to use something like "if %2 == 0" but couldn't figure out how actually to remove the files from the list object and my folder obviously.
Below is the piece of NON-WORKING code. I guess, it is not working as the file_name is a str.
import os
os.chdir('path_to_my_folder')
for f in os.listdir():
    file_name, file_ext = os.path.splitext(f)
    print(file_name)
    if file_name%2 == 0:
        os.remove();
Yes, that's your problem: you're trying to use an integer operation on a string. Simply convert:
if int(file_name)%2 == 0:
... that should fix your current problem.
Your filename is a string, like '0.jpg', and you can’t % 2 a string (see note 1).
What you want to do is pull the number out of the filename, like this:
name, ext = os.path.splitext(filename)
number = int(name)
And now, you can use number % 2.
(Of course this only works if every file in the directory is named in the format N.jpg, where N is an integer; otherwise you’ll get a ValueError.)
1. Actually, you can do that, it just doesn’t do what you want. For strings, % means printf-style formatting, so filename % 2 means “find the %d or similar format spec in filename and replace it with a string version of 2.”
Thanks a lot for the answers! I have amended the code and now it looks like this;
import os
os.chdir('path_to_the_folder')
for f in os.listdir():
    name, ext = os.path.splitext(f)
    number = int(name)
    if number % 2 == 0:
        os.remove()
It doesn't give an error but it also doesn't remove/delete the files from the folder. What in the end I want to achieve is that every file name which is divisible by two will be removed so only 1.jpg, 3.jpg, 5.jpg and so on will remain.
Thanks so much for your time.
A non-Python method but sharing for future references;
cd path_to_your_folder
mkdir odd; mv *[13579].png odd
This also works on OSX. It reverses the file order, but that can easily be corrected. I would still like to manage this within Python though!
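In the amended loop above, os.remove() is called without an argument, so it is never told which file to delete. A minimal sketch that passes the path explicitly is shown below; the function name remove_even_numbered and the choice to skip non-numeric names are assumptions, not part of the original posts.

```python
import os


def remove_even_numbered(folder):
    """Delete files named N.<ext> where N is an even integer; return the names removed."""
    removed = []
    for entry in os.listdir(folder):
        full_path = os.path.join(folder, entry)
        name, ext = os.path.splitext(entry)
        if not os.path.isfile(full_path) or not name.isdigit():
            continue  # skip directories and anything not named like 0.jpg, 1.jpg, ...
        if int(name) % 2 == 0:
            os.remove(full_path)  # os.remove needs the path of the file to delete
            removed.append(entry)
    return sorted(removed)
```

Building the full path with os.path.join also removes the need for os.chdir.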

Renaming files (while counting the number)

I'm currently trying to write a python script to rename a bunch of files. The file is named like this: [Name][Number]-[Number]. To give a specific example: milk-00-00. The next file is milk-00-01, then 02, 03 until X. After that milk-01-00 starts with the same pattern.
What I need to do is to switch 'milk' into a number and replace the '-XX-XX' by '-01', '02', ...
I hope you guys get the idea. The current state of my code is pretty poor, it was hard enough to get it this far though. It looks like this and with this I'm at least able to replace something. I'll also manage to get rid of the 'milk' with the help of google. However, if there is an easier way, I'd really appreciate a push in the right direction!
import os
import sys
path = 'C:/Users/milk/Desktop/asd'
i=00
for filename in os.listdir(path):
    if filename.endswith('.tiff'):
        newname = filename.replace('00', 'i')
        os.rename(filename,newname)
        i=i+1
You can use the format function
temp = (' ').join(filename.split('.')[:-1])
os.rename(filename, '10{}-{}.tiff'.format(temp.split('-')[-2],temp.split('-')[-1]))
Since filename has the .tiff extension this program first creates a version of filename without the extension - temp - and then creates new names from that.
os.rename(filename, '1000-%02d.tiff' % i)
i += 1
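The counter-based suggestion can be sketched as a complete function. This assumes the new names should be '<prefix>-01.tiff', '<prefix>-02.tiff', and so on in sorted order of the old names, and that no file with a target name already exists; the function name renumber_tiffs and the default prefix are hypothetical.

```python
import os


def renumber_tiffs(path, prefix="1000"):
    """Rename every .tiff in `path` to '<prefix>-NN.tiff', counting from 01."""
    names = sorted(f for f in os.listdir(path) if f.endswith(".tiff"))
    renamed = []
    for i, old in enumerate(names, start=1):
        new = "%s-%02d.tiff" % (prefix, i)
        # join with the directory: os.listdir returns bare names, not full paths
        os.rename(os.path.join(path, old), os.path.join(path, new))
        renamed.append((old, new))
    return renamed
```

Note that the question's code passes bare file names to os.rename, which only works when the script runs from inside that directory.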

Breaking the loop properly in Python

Currently I am trying to upload a set of files via API call. The files have sequential names: part0.xml, part1.xml, etc. It loops through all the files and uploads them properly, but it seems it doesn't break the loop and after it uploads the last available file in the directory I am getting an error:
No such file or directory.
And I don't really understand how to make it stop as soon as the last file in the directory is uploaded. It's probably a very dumb question, but I am really lost. How do I stop it from looping through non-existent files?
The code:
part = 0
with open('part%d.xml' % part, 'rb') as xml:
    #here goes the API call code
    part +=1
I also tried something like this:
import glob
part = 0
for fname in glob.glob('*.xml'):
    with open('part%d.xml' % part, 'rb') as xml:
        #here goes the API call code
        part += 1
Edit: Thank you all for the answers, learned a lot. Still lots to learn. :)
You almost had it. This is your code with some stuff removed:
import glob
for fname in glob.glob('part*.xml'):
    with open(fname, 'rb') as xml:
        # here goes the API call code
It is possible to make the glob more specific, but as it is it solves the "foo.xml" problem. The key is to not use counters in Python; the idiomatic iteration is for x in y: and you don't need a counter.
Note that glob returns the filenames in arbitrary order, so wrap the call in sorted() if the order matters; also remember that ['part1', 'part10', 'part2'] sort in that order. There are a few ways to cope with that, but it would be a separate question.
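One way to cope with the 'part1', 'part10', 'part2' ordering is a "natural" sort key that compares the digit runs as integers. This is a minimal sketch of that technique, not something taken from the answer; the helper name natural_key is hypothetical.

```python
import re


def natural_key(name):
    """Split a name into text and integer chunks so 'part2' sorts before 'part10'."""
    # re.split with a capturing group keeps the digit runs at the odd indices
    chunks = re.split(r"(\d+)", name)
    return [int(chunk) if index % 2 else chunk
            for index, chunk in enumerate(chunks)]


names = ["part1.xml", "part10.xml", "part2.xml"]
ordered = sorted(names, key=natural_key)
# ordered == ['part1.xml', 'part2.xml', 'part10.xml']
```

The same key can be passed to sorted(glob.glob('part*.xml'), key=natural_key).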
Alternatively, you can simply use a regex.
import os, re
files = [f for f in os.listdir() if re.search(r'part[\d]+\.xml$', f)]
for f in files:
    #process..
This will be really useful in case you require advanced filtering.
Note: you can do similar filtering using list returned by glob.glob()
If you are not familiar with the list comprehension and regex, I would recommend you to refer to:
Regex - howto
List Comprehensions
Your for loop is saying "for every file that ends with .xml"; if you have any file that ends with .xml that isn't a sequential part%d.xml, you're going to get an error. Imagine you have part0.xml and foo.xml. The for loop is going to loop twice; on the second loop, it's going to try to open part1.xml, which doesn't exist.
Since you know the filenames already, you don't even need to use glob.glob(); just check if each file exists before opening it, until you find one that doesn't exist.
import os
from itertools import count
filenames = ('part%d.xml' % part_num for part_num in count())
for filename in filenames:
    if os.path.exists(filename):
        with open(filename, 'rb') as xmlfile:
            do_stuff(xmlfile)
            # here goes the API call code
    else:
        break
If for any reason you're worried about files disappearing between os.path.exists(filename) and open(filename, 'rb'), this code is more robust:
import os
from itertools import count
filenames = ('part%d.xml' % part_num for part_num in count())
for filename in filenames:
    try:
        xmlfile = open(filename, 'rb')
    except IOError:
        break
    else:
        with xmlfile:
            do_stuff(xmlfile)
            # here goes the API call code
Consider what happens if there are other files that match '*.xml'.
Suppose that you have 11 files, "part0.xml" through "part10.xml", but also a file called "foo.xml".
Then the for loop will iterate 12 times (since there are 12 matches for the glob). On the 12th iteration, you are trying to open "part11.xml", which doesn't exist.
One approach is to drop the glob and just handle the exception.
part = 0
while True:
    try:
        with open('part%d.xml' % part, 'rb') as xml:
            #here goes the API call code
            part += 1
    except IOError:
        break
When you use a counter, you need to test whether the file exists:
import os
from itertools import count
for part in count():
    filename = 'part%d.xml' % part
    if not os.path.exists(filename):
        break
    with open(filename) as inp:
        # do something
You are doing it wrong.
Suppose the folder has 3 files: part0.xml, part1.xml and foo.xml. The loop will iterate 3 times and fail on the third iteration, because it will try to open part2.xml, which is not present.
Don't loop through all files with the extension .xml.
Only loop through files which start with 'part', have a digit in the name before the extension, and have the extension .xml.
So your code will look like this:
import glob
for fname in glob.glob('part*[0-9].xml'):
    with open(fname, 'rb') as xml:
        #here goes the API call code
Read - glob – Filename pattern matching
If you want files to be uploaded in sequential order then read : String Natural Sort

open a file which name is contained in a variable

I have about a hundred files in a folder, named in the form i.ext where i is an integer (0 <= i). I wrote a script which takes 2 files as input, but I want to use the script with all the files in my folder.
Could I write a script in Python with a loop so that the file name is held in a variable, like this:
from difference import *
# I have a module called "difference"
for i in range (0,100):
    for j in range (0,100):
        leven(i+".ext",j+".ext") #script in module which take two files in entries
Obviously my code is wrong, but I don't know how can I do :(
You cannot add a number and a string in Python.
'%d.ext' % (i,)
but i wanted to use the script with all the files of my folder.Could i write a script in Python with a loop such a way that the name file is in a variable like this:
This is most certainly possible, but if you want to use all the files from a directory following a certain pattern, I suggest you glob them.
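import glob
import difference
# the function lives inside the module, so it is glob.glob(), not glob()
ifile_list = glob.glob('*.iext')
jfile_list = glob.glob('*.jext')
# one flat list of (ifile, jfile) pairs, so each item unpacks into i and j
for i, j in [(ifile, jfile) for ifile in ifile_list for jfile in jfile_list]:
    difference.leven(i, j)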
However, I strongly suggest that instead of hardcoding those file patterns you supply them through command line parameters.
use str(i) and str(j) to convert i and j from integer to str.
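The str() conversion can be combined with itertools.product to generate every pair of file names without nested loops. This is a minimal sketch; the generator name file_pairs and its parameters are hypothetical, and leven is the asker's own function from the difference module.

```python
from itertools import product


def file_pairs(count, ext=".ext"):
    """Yield every (first, second) pair of names '0.ext' .. '<count-1>.ext'."""
    for i, j in product(range(count), repeat=2):
        # str() turns the integer into text so it can be joined with the extension
        yield str(i) + ext, str(j) + ext
```

Each pair from file_pairs(100) could then be passed on as leven(first, second).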
