Change all text in file to lowercase - python

from itertools import chain
from glob import glob
file = open('FortInventory.txt','w')
lines = [line.lower() for line in lines]
with open('FortInventory.txt', 'w') as out:
    out.writelines(sorted(lines))
I am trying to convert all the text in a .txt file to lowercase. How would I go about doing this? Here is the code I have so far. I looked at some questions on Stack Overflow but couldn't quite figure it out; if anyone could link me to the right article or tell me what is wrong with my code, I would greatly appreciate it.

Two problems:
Open the file with 'r' for read.
Change lines to file in your list comprehension.
Here's the fixed code:
file = open('FortInventory.txt', 'r')
lines = [line.lower() for line in file]
file.close()
with open('FortInventory.txt', 'w') as out:
    out.writelines(sorted(lines))
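As a side note, sorted(lines) also reorders the lines; if the goal is only lowercasing, a minimal sketch that keeps the original line order (assuming the whole file fits in memory) would be:
with open('FortInventory.txt', 'r') as f:
    lines = [line.lower() for line in f]
with open('FortInventory.txt', 'w') as out:
    out.writelines(lines)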

Related

Python Removing Custom Stop-Words from CSV files

Hi, I am new to Python programming and I need help removing custom-made stop-words from multiple files in a directory. I have read almost all the relevant posts online!
I am using Python 2.7
Here are two sample lines of one of my files. I want to keep this format and just remove the stop-words from the rows:
"8806";"Demonstrators [in Chad] demand dissolution of Legis Assembly many hurt as police disperse crowd.";"19"
"44801";"Role that American oil companies played in Iraq's oil-for-food program is coming under greater scrutiny.";"19"
I have a list of stop-words in a dat file called Stopwords.
This is my code:
import io
import os
import os.path
import csv

os.chdir('/home/Documents/filesdirectory')
stopwords = open('/home/StopWords.dat', 'r').read().split('\n')
for i in os.listdir(os.getcwd()):
    name = os.path.splitext(i)[0]
    with open(i, "r") as fin:
        with open(name, "w") as fout:
            writer = csv.writer(fout)
            for w in csv.reader(fin):
                if w not in stopwords:
                    writer.writerow(w)
It does not give me any errors but creates empty files. Any help is very much appreciated.
import os
import os.path

os.chdir('/home/filesdirectory')
for i in os.listdir(os.getcwd()):
    filein = open(i, 'r').readlines()
    fileout = open(i, 'w')
    stopwords = open('/home/stopwords.dat', 'r').read().split()
    for line in filein:
        linewords = line.split()
        filteredtext1 = []
        filteredtext1 = [t for t in linewords if t not in stopwords]
        filteredtext = str(filteredtext1)
        fileout.write(filteredtext + '\n')
Well, I solved the problem.
This code removes the stopwords (or any list of words you give it) from each line, writes each line to a file with the same filename, and at the end replaces the old file with a new file without stopwords. Here are the steps:
declare the working directory
enter a loop to go over each file
open the file to read and read each line using readlines()
open a file to write
read the stopwords file and split its words
enter a for loop to deal with each line separately
split the line to words
create a list
write the words of the line as items of the list if they are not in the stopwords list
change the list to a string
write the string to a file
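For reference, a minimal variation of the same idea (assuming the same /home/filesdirectory and /home/stopwords.dat paths as above) that joins the kept words with spaces instead of writing the str() of the list, so each output line reads as normal text rather than as a printed Python list:
import os

os.chdir('/home/filesdirectory')
# read the stopword list once, up front
with open('/home/stopwords.dat', 'r') as f:
    stopwords = set(f.read().split())

for name in os.listdir(os.getcwd()):
    with open(name, 'r') as fin:
        lines = fin.readlines()
    with open(name, 'w') as fout:
        for line in lines:
            kept = [word for word in line.split() if word not in stopwords]
            fout.write(' '.join(kept) + '\n')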

importing external ".txt" file in python

I am trying to import a text file with a list of about 10 words.
import words.txt
That doesn't work...
Anyway, can I import the file without this showing up?
Traceback (most recent call last):
File "D:/python/p1.py", line 9, in <module>
import words.txt
ImportError: No module named 'words'
Any sort of help is appreciated.
You can import modules, but not text files. If you want to print the contents, do the following:
Open a text file for reading:
f = open('words.txt', 'r')
Store content in a variable:
content = f.read()
Print content of this file:
print(content)
After you're done, close the file:
f.close()
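As a small aside, the same thing is often written with a with block, which closes the file automatically even if an error occurs, so the f.close() step can't be forgotten:
with open('words.txt', 'r') as f:
    content = f.read()
print(content)  # the file is already closed at this point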
As you can't import a .txt file, I would suggest reading the words this way:
list_ = open("words.txt").read().split()
The "import" keyword is for attaching python definitions that are created external to the current python program. So in your case, where you just want to read a file with some text in it, use:
text = open("words.txt", "rb").read()
This answer is modified from infrared's answer at Splitting large text file by a delimiter in Python
with open('words.txt') as fp:
    contents = fp.read().split()
for entry in contents:
    # do something with each entry, e.g.:
    print(entry)
numpy's genfromtxt or loadtxt is what I use:
import numpy as np
...
wordset = np.genfromtxt(fname='words.txt', dtype=str)
This got me headed in the right direction and solved my problem.
Import gives you access to other modules in your program. You can't import a text file. If you want to read from a file that's in the same directory, see the answers above.

Reading file stops at halfway

I am writing a DirectX importer for Blender, to be able to open ASCII .x files. For this I am trying to write a good Python script. I am pretty new to it, actually I just started, but I have gotten good results, except for a strange problem: my .x file is pretty large, exactly 3 263 453 bytes long. I will not put my whole code here, just a small reproduction, so the problem is still visible in the console.
>>> teszt = open('d:\DRA_ACTORDEF_T0.x','rt')
>>> teszt
<_io.TextIOWrapper name='d:\\DRA_ACTORDEF_T0.x' mode='rt' encoding='cp1250'>
then I read the file:
>>> t2 = teszt.readlines()
>>> len(t2)
39768
but then again, when I verify:
>>> import os
>>> os.fstat(teszt.fileno()).st_size
3263453
Could someone lend me a hand and tell me what the problem is? Maybe I need to set a buffer size or something like that? I have no idea how this works in Python.
I open the file the same way as above, and I use .readline().
Thank you very much.
EDIT:
The code simplified. I need .readline().
fajlnev = 'd:\DRA_ACTORDEF_T0.x'
import bpy
import os
fajl = open(fajlnev, 'rt')
fajl_teljes_merete = os.fstat(fajl.fileno()).st_size
while (fajl.tell() < fajl_teljes_merete):
    print(fajl.tell(), fajl.readline())
readlines returns a list of lines, so when you do len(t2), it returns the number of lines in the file, not its size in bytes.
If you want the numbers to match, you should do:
with open('your_file', 'rb') as f:
    data = f.read()
print(len(data))
Also, if the file is encoded, 'rt' might incorrectly interpret the newlines. So it's much safer to do something like:
import io
with io.open('your_file', 'r', encoding='your_file_encoding') as f:
    lines = f.readlines()
And if you want a streaming, line-by-line read, then it's best to do:
import io
with io.open('d:\\DRA_ACTORDEF_T0.x', 'r', encoding='your_encoding') as f:
    for line in f:
        print(line)
This will take care of streaming and not reading the whole file into memory.
If you still want to use readline:
import io
import os

filename = 'd:\\DRA_ACTORDEF_T0.x'
size = os.stat(filename).st_size
with io.open(filename, 'r', encoding='your_encoding') as f:
    while f.tell() < size:
        # Do what you want
        line = f.readline()
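To see where the two numbers in the question come from, a small sketch (assuming the cp1250 encoding shown in the question's output) compares the line count with the total number of encoded bytes; the byte total lands near st_size, give or take newline translation:
with open('d:\\DRA_ACTORDEF_T0.x', 'r', encoding='cp1250') as f:
    lines = f.readlines()
print(len(lines))                                          # number of lines, e.g. 39768
print(sum(len(line.encode('cp1250')) for line in lines))   # total bytes, close to 3263453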

I can't make a search-and-replace script for Python

I've been trying to do this, but I'm pretty new at Python, and can't figure out how to make it work.
I have this:
import fileinput
for line in fileinput.input(['tooltips.txt'], inplace=True, backup="bak.txt"):
line.replace("oldString1", "newString1")
line.replace("oldString2", "newString2")
But it just deletes everything from the txt.
What am I doing wrong?
I have tried with print(line.replace("oldString1", "newString1"))
but it doesn't remove the existing words.
As I said, I'm pretty new at this.
Thanks!
line.replace() doesn't modify line; it returns the modified string.
import fileinput, sys

for line in fileinput.input(['tooltips.txt'], inplace=True, backup="bak.txt"):
    sys.stdout.write(line.replace("oldString1", "newString1").replace("oldString2", "newString2"))
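(With inplace=True, fileinput redirects standard output into the file being processed, which is why writing to sys.stdout, or calling print, is what replaces the file's contents line by line.)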
One simple way to do this is with the open function and the os module:
import os

with open(tmp_file, 'w') as tmp:    # tmp_file: path for a temporary file
    with open(my_file) as f:        # my_file: path of the file to edit
        for line in f.readlines():
            tmp.write(line.replace("oldString1", "newString1").replace("oldString2", "newString2"))
os.remove(my_file)
os.rename(tmp_file, my_file)
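For a small file like tooltips.txt, a minimal pathlib sketch (assuming the whole file fits in memory) does the same replacement without a temporary file:
from pathlib import Path

path = Path('tooltips.txt')
text = path.read_text()
text = text.replace("oldString1", "newString1").replace("oldString2", "newString2")
path.write_text(text)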

Problem with file concatenation in Python?

I have 3 files 1.txt, 2.txt, and 3.txt and I am trying to concatenate together the contents of these files into one output file in Python. Can anyone explain why the code below only writes the content of 1.txt and not 2.txt or 3.txt? I'm sure it's something really simple, but I can't seem to figure out the problem.
import glob
import shutil
for my_file in glob.iglob('/Users/me/Desktop/*.txt'):
    with open('concat_file.txt', "w") as concat_file:
        shutil.copyfileobj(open(my_file, "r"), concat_file)
Thanks for the help!
You constantly overwrite the same file: opening it with "w" truncates it on every iteration of the loop.
Either use:
with open('concat_file.txt', "a")
or
with open('concat_file.txt', "w") as concat_file:
    for my_file in glob.iglob('/Users/me/Desktop/*.txt'):
        shutil.copyfileobj(open(my_file, "r"), concat_file)
I believe that what's wrong with your code is that on every iteration of the loop you reopen concat_file.txt with "w", which truncates it, so each file's content overwrites the previous one (and if concat_file.txt ends up in the globbed directory, you can even copy it into itself).
If you manually unroll the loop you will see what I mean:
# my_file = '1.txt'
concat_file = open('concat_file.txt', 'w')  # truncates the output file
shutil.copyfileobj(open(my_file, 'r'), concat_file)
# ... the same open-and-truncate repeats for '2.txt' and '3.txt'
# ...
I'd suggest deciding beforehand which file you want all the files to be copied to, and opening it just once outside the loop, maybe like this:
import glob
import shutil

with open('output.txt', 'w') as output_file:
    for my_file in glob.iglob('/Users/me/Desktop/*.txt'):
        with open(my_file, 'r') as f:
            shutil.copyfileobj(f, output_file)
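As a usage note, glob returns results in arbitrary order; a sketch like the following (assuming the three files sit in the current directory) makes the 1.txt, 2.txt, 3.txt order explicit and skips the output file so it is never copied into itself:
import glob
import shutil

with open('output.txt', 'w') as out:
    for name in sorted(glob.glob('*.txt')):
        if name == 'output.txt':  # skip the output file itself
            continue
        with open(name, 'r') as f:
            shutil.copyfileobj(f, out)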
