Removing/renaming files in python

I am trying to figure out how to remove certain words from a file name. For example, if my file name were lolipop-three-fun-sand, I would enter three and fun and they would be removed, renaming the file to lolipop--sand. Any ideas on how to start?

Use str.replace() to remove the words from the filename, then call os.rename() to perform the rename.
import os
newfilename = filename.replace('three', '').replace('fun', '')
os.rename(filename, newfilename)

import os
line = 'lolipop-three-fun-sand'
# Python 3; on Python 2 use raw_input() instead of input()
delete_list = input("Enter the words to remove, separated by commas: ").split(',')
for word in delete_list:
    line = line.replace(word.strip(), "")
print(line)

Related

Search for each word of one text file in a main text file and append it if not found, using Python

I need help with Python code for the following scenario.
I have two text files: a main file and a list file. The main file contains many words, and I need to update it whenever the list file contains a new word.
I need to search the main file for each word in the list file. If a word is not found in the main file, I need to append it to the main file.
I have code that updates the file if a single string is not found, but I need to search for each word from a text file.
Main_File = "file path"
list_file = "file path"
needle = "some word"  # placeholder for the single word being searched for
with open(Main_File, "r+") as file:
    for line in file:
        if needle in line:
            break
    else:  # not found, we are at the eof
        file.write(needle)  # append missing data
# This code appends one specific word if it is not found in the file, but I need to search for each word from another file.

You could load the mainFile with mmap and search for the words from list file as follows:
import mmap
mainFilePath= "mainFile.txt"
listFilePath= "listFile.txt"
newWords=[]
# open main file with mmap
with open(mainFilePath, 'r') as mainFile:
    mainFileMmap = mmap.mmap(mainFile.fileno(), 0, access=mmap.ACCESS_READ)
# open list file and search for words in main file with mmap.find()
with open(listFilePath, 'r') as listFile:
    for line in listFile:
        line = line.replace("\r", "").replace("\n", "")  # remove line-feeds (quick and dirty)
        if mainFileMmap.find(line.encode()) == -1:
            newWords.append(line)
# append new words to main file
with open(mainFilePath, 'a') as mainFile:
    for newWord in set(newWords):
        mainFile.write("\n{}".format(newWord))
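One caveat with mmap.find() is that it looks for a substring, so a list word such as cat counts as found when the main file only contains category. If whole-word matching matters, a regular expression with word boundaries is one possible way around it. The sketch below is only an illustration: contains_word is a hypothetical helper, and it assumes words are made of letters, digits and underscores, since that is what \b understands. The re module works on bytes, and it can search an mmap object the same way.

```python
import re


def contains_word(data, word):
    """Return True if word occurs in data (bytes or mmap) as a whole word."""
    pattern = rb"\b" + re.escape(word.encode()) + rb"\b"
    return re.search(pattern, data) is not None


main_data = b"category\ndog\n"
print(contains_word(main_data, "cat"))  # False
print(main_data.find(b"cat") != -1)     # True
```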

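If the words in your main file fit in memory, you can load them into a set and check each word from the list file against it, as in the sketch below (the file names are placeholders):
with open("mainFile.txt") as main_file:
    main_file_words = set(main_file.read().split())
with open("listFile.txt") as list_file:
    list_words = list_file.read().split()
with open("mainFile.txt", "a") as main_file:
    for word in list_words:
        if word not in main_file_words:
            main_file_words.add(word)  # avoid appending the same word twice
            main_file.write("\n" + word)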

How to rejoin split words in a file?

So far on python I have made a file using the code:
text_file = open("Sentences_Positions.txt", "w")
text_file.write (str(positions))
text_file.write (str(ssplit))
text_file.close()
This code creates the file and writes to it the individual words that I split earlier. I need a way to open the file, join the split words, and print the result. I have tried:
text_file = open("Sentences_Positions.txt", "r")
rejoin = ("Sentences_positions.txt").join('')
print (rejoin)
But all this does is print a blank line in the shell. How should I approach this, and what other code could I try?

Read the file content, split it, and join the pieces with '':
content = text_file.read().split(' ')
print(''.join(content))

Replace:
rejoin = ("Sentences_positions.txt").join('')
with:
rejoin = ''.join(text_file.read().split(' '))
Also, you should probably open the file with a context manager rather than a bare open() call:
with open("Sentences_Positions.txt") as text_file:
    rejoin = ''.join(text_file.read().split(' '))
print(rejoin)
Otherwise the file remains open. The context manager closes it when the block ends. (The same applies to the first part of your code.)
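Splitting on spaces and joining with '' only strips the spaces out of whatever text str() wrote to the file. If the goal is to get the original sentence back, one option, assuming you control the code that writes the file, is to store the lists as JSON and load them again. This is a sketch of a different technique from the answers above, not a fix to them. The sample values of ssplit and positions are invented for illustration, and the file is written to a temporary directory so the example can run anywhere.

```python
import json
import os
import tempfile

ssplit = ["the", "cat", "sat"]   # hypothetical split words
positions = [1, 2, 3]            # hypothetical positions

path = os.path.join(tempfile.mkdtemp(), "Sentences_Positions.txt")

# Write both lists as one JSON object instead of two str() dumps.
with open(path, "w") as text_file:
    json.dump({"positions": positions, "words": ssplit}, text_file)

# Read the lists back and rejoin the words with single spaces.
with open(path) as text_file:
    data = json.load(text_file)

rejoin = " ".join(data["words"])
print(rejoin)  # the cat sat
```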

Python edit line containing specific characters

I was wondering if there is a way to edit those lines of a file that contain certain characters.
Something like this:
file.readlines()
for line in file:
    if 'characters' in line:
        file[line] = 'edited line'
If it matters: I'm using python 3.5

I think what you want is something like:
lines = file.readlines()
for index, line in enumerate(lines):
    if 'characters' in line:
        lines[index] = 'edited line\n'
You can't edit the file in place this way, but you can write the modified lines over the original (or, safer, write them to a temporary file and rename it once you've validated it).
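To make the "write the modified lines over the original" step concrete, here is a minimal sketch. The function name edit_matching_lines and the demo file are made up for illustration. It reads the whole file into memory, so it assumes the file is reasonably small.

```python
import os
import tempfile


def edit_matching_lines(path, needle, replacement):
    """Replace every line containing needle with replacement; return the count."""
    with open(path) as f:
        lines = f.readlines()
    changed = 0
    for index, line in enumerate(lines):
        if needle in line:
            lines[index] = replacement + "\n"  # keep one line per entry
            changed += 1
    with open(path, "w") as f:
        f.writelines(lines)
    return changed


# Small demonstration on a throwaway file.
demo = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(demo, "w") as f:
    f.write("keep me\nsome characters here\n")
print(edit_matching_lines(demo, "characters", "edited line"))  # 1
```

If the program is interrupted while rewriting, the original content is lost, which is why the temporary-file approach is the safer of the two.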

You can use tempfile.NamedTemporaryFile to create a temporary file object, write your lines to it, and then use the shutil module to replace the original file with the temporary one.
from tempfile import NamedTemporaryFile
import shutil
# mode='w' opens the temporary file in text mode (the default is binary)
tempfile = NamedTemporaryFile(mode='w', delete=False)
with open(file_name) as infile, tempfile:
    for line in infile:
        if 'characters' in line:
            tempfile.write('edited line\n')
        else:
            tempfile.write(line)
shutil.move(tempfile.name, file_name)

How does one parse a .lua file with Python and pull out the require statements?

I am not very good at parsing files but have something I would like to accomplish. The following is a snippet of a .lua script that has some require statements. I would like to use Python to parse this .lua file and pull the 'require' statements out.
For example, here are the require statements:
require "common.acme_1"
require "common.acme_2"
require "acme_3"
require "common.core.acme_4"
From the example above I would then like to split the directory from the required file. In the example 'require "common.acme_1"' the directory would be common and the required file would be acme_1. I would then just add the .lua extension to acme_1. I need this information so I can check whether the file exists on the file system (which I know how to do) and then run it against luac (the compiler) to make sure it is a valid Lua file (which I also know how to do).
I simply need help pulling these require statements out using Python and splitting the directory name from the filename.

You can do this with built in string methods, but since the parsing is a little bit complicated (paths can be multi-part) the simplest solution might be to use regex. If you're using regex, you can do the parsing and splitting using groups:
import re
data = \
'''
require "common.acme_1"
require "common.acme_2"
require "acme_3"
require "common.core.acme_4"
'''
finds = re.findall(r'require\s+"(([^."]+\.)*)?([^."]+)"', data, re.MULTILINE)
print([dict(path=x[0].rstrip('.'), file=x[2]) for x in finds])
The first group is the path (including the trailing .), the second group is the inner group needed for matching repeated path parts (discarded), and the third group is the file name. If there is no path you get path=''.
Output:
[{'path': 'common', 'file': 'acme_1'}, {'path': 'common', 'file': 'acme_2'}, {'path': '', 'file': 'acme_3'}, {'path': 'common.core', 'file': 'acme_4'}]

Here ya go!
import sys
import os.path
if len(sys.argv) != 2:
    print("Usage:", sys.argv[0], "<inputfile.lua>")
    sys.exit(1)
with open(sys.argv[1], "r") as f:
    lines = f.readlines()
for line in lines:
    if line.startswith("require "):
        path = line.replace('require "', '').replace('"', '').replace("\n", '').replace(".", "/") + ".lua"
        fName = os.path.basename(path)
        # dirname avoids mangling a directory that has the same name as the file
        path = os.path.dirname(path)
        print("File: " + fName)
        print("Directory: " + path)
        # do what you want to each file & path here

Here's a crazy one-liner, not sure if this was exactly what you wanted and most certainly not the most optimal one...
In [270]: import re
In [271]: [[s[::-1] for s in rec[::-1].split(".", 1)][::-1] for rec in re.findall(r"require \"([^\"]*)", text)]
Out[271]:
[['common', 'acme_1'],
['common', 'acme_2'],
['acme_3'],
['common.core', 'acme_4']]
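The reverse, split, reverse dance in that one-liner is equivalent to splitting once from the right, which str.rsplit() does directly. A shorter sketch of the same idea follows; the text variable is assumed to hold the Lua source, as in the session above, and is filled with a few sample lines here.

```python
import re

text = '''
require "common.acme_1"
require "acme_3"
require "common.core.acme_4"
'''

# rsplit(".", 1) splits once, starting from the right-hand side.
result = [rec.rsplit(".", 1) for rec in re.findall(r'require "([^"]*)', text)]
print(result)
# [['common', 'acme_1'], ['acme_3'], ['common.core', 'acme_4']]
```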

This is straightforward.
One-liners are great, but they take too much effort to understand at first, and in my opinion this is not a job for regular expressions.
mylines = [line.split('require')[-1].strip().strip('"') for line in open('mylua.lua') if line.startswith('require')]
paths = []
for line in mylines:
    if 'common.' in line:
        paths.append(('common', line.split('common.')[-1]))
    else:
        paths.append(('', line))

You could use finditer:
lua='''
require "common.acme_1"
require "common.acme_2"
require "acme_3"
require 'common.core.acme_4'
'''
import re
print([m.group(2) for m in re.finditer(r'^require\s+(\'|")([^\'"]+)(\1)', lua, re.S | re.M)])
# ['common.acme_1', 'common.acme_2', 'acme_3', 'common.core.acme_4']
Then just split on the '.' to separate the path from the file name:
for e in [m.group(2) for m in re.finditer(r'^require\s+(\'|")([^\'"]+)(\1)', lua, re.S | re.M)]:
    parts = e.split('.')
    if parts[:-1]:
        print('/'.join(parts[:-1]), parts[-1])
    else:
        print(parts[0])
Prints:
common acme_1
common acme_2
acme_3
common/core acme_4

file = '/path/to/test.lua'
def parse():
    with open(file, 'r') as f:
        requires = [line.split()[1].strip('"') for line in f.readlines() if line.startswith('require ')]
    for r in requires:
        filename = r.replace('.', '/') + '.lua'
        print(filename)
The with statement opens the file in question. The next line builds a list from every line that starts with 'require ': it splits the line, ignores the 'require' keyword, keeps the last part, and strips off the double quotes. The loop then goes through the list, replaces the dots with slashes, and appends '.lua'. The print call shows the results.

Adding new line at the end of each line using Python

How can I preserve the file structure after substitution?
# -*- coding: cp1252 -*-
import os
import os.path
import sys
import fileinput
path = "C:\\Search_replace" # Insert the path to the directory of interest
#os.path.exists(path)
#raise SystemExit
Abspath = os.path.abspath(path)
print(Abspath)
dirList = os.listdir(path)
print ('searching in', os.path.abspath(path))
for fname in dirList:
    if fname.endswith('.txt') or fname.endswith('.srt'):
        #print fname
        full_path=Abspath + "\\" + fname
        print full_path
        for line in fileinput.FileInput(full_path, inplace=1):
            line = line.replace("þ", "t")
            line = line.replace("ª", "S")
            line = line.replace("º", "s")
            print line
print "done"

The question is not great in the clarity department, but if you want Python to print stuff to standard out without a newline at the end you can use sys.stdout.write() instead of print().
If you want to perform substitution and save it to a file, you can do what Senthil Kumaran suggested.

Instead of print line inside the fileinput loop, call sys.stdout.write(line) at the end of the loop body, and don't use print anywhere else inside the loop.
Instead of using fileinput for the word replacement, you can also use this simple method:
import shutil
with open("inputfile") as infile, open("outputfile", "w") as o:  # open an output file for writing
    for line in infile:
        line = line.replace("someword", "newword")
        o.write(line)  # line already ends with its own newline
shutil.move("outputfile", "inputfile")

When you iterate over the lines of the file with
for line in fileinput.FileInput(full_path, inplace=1):
each line contains the line data including the trailing linefeed character (unless it is the last line and the file does not end with one). So in this kind of pattern you will usually want either to strip the trailing whitespace with
line = line.rstrip()
or to write the lines out without appending your own linefeed (as print does) by using
sys.stdout.write(line)
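In Python 3 the same effect is available through print(line, end=""). The sketch below shows the in-place pattern under that assumption. replace_in_place is a hypothetical helper and the demo file uses plain ASCII text; for files containing characters such as þ you would probably also need to tell fileinput which encoding to use (fileinput.input() accepts an encoding argument from Python 3.10), which is not shown here.

```python
import fileinput
import os
import tempfile


def replace_in_place(path, replacements):
    """Apply each (old, new) pair to every line of the file, in place."""
    with fileinput.input(path, inplace=True) as f:
        for line in f:
            for old, new in replacements:
                line = line.replace(old, new)
            # While inplace=True, print() writes into the file; end=""
            # avoids adding a second newline after the one line already has.
            print(line, end="")


demo = os.path.join(tempfile.mkdtemp(), "demo.srt")
with open(demo, "w") as f:
    f.write("first line\nsecond line\n")
replace_in_place(demo, [("first", "1st"), ("second", "2nd")])
with open(demo) as f:
    print(repr(f.read()))  # '1st line\n2nd line\n'
```

Because standard output is redirected into the file while the loop runs, any progress messages have to be printed outside the loop or written to sys.stderr.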
