Python overwriting file instead of appending [duplicate] - python

I'm creating a personal TV show and movie database and I use Python to get the information of the TV shows and movies. I have a file to get information for movies in a folder, which works fine.
I also have a Python file that gets the information of a TV show (which is a folder, e.g. Game of Thrones) and it also gets all the episode files from inside the folder and gets the information for those (it's formatted like this: e.g. Game of Thrones;3;9)
All this information is stored into 2 text files which MySQL can read: tvshows.txt and episodes.txt.
Python easily gets the information of the TV show in the first part of the program.
The second part of the program is to get each episode in the TV show folder and store the information in a file (episodes.txt):
def seTv(show):
    pat = '/home/ryan/python/tv/'
    pat = pat + show
    epList = os.listdir(pat)
    fileP = "/home/ryan/python/tvtext/episodes.txt"
    f = open(fileP, "w")
    print epList
    hdrs = ['Title', 'Plot', 'imdbRating', 'Season', 'Episode', 'seriesID']
    def searchTvSe(ep):
        ep = str(ep)
        print ep
        seq = ep.split(";")
        print seq
        tit = seq[0]
        seq[0] = seq[0].replace(" ", "+")
        url = "http://www.omdbapi.com/?t=%s&Season=%s&Episode=%s&plot=full&r=json" % (seq[0], seq[1], seq[2])
        respo = u.urlopen(url)
        respo = json.loads(str(respo.read()))
        if not os.path.exists("/var/www/html/images/"+tit):
            os.makedirs("/var/www/html/images/"+tit)
        imgNa = "/var/www/html/images/" + tit + "/" + respo["Title"] + ".jpg";
        for each in hdrs:
            #print respo[each] # ==== This checks to see if it is working, it is =====
            f.write(respo[each] + "\t")
        urllib.urlretrieve(respo["Poster"], imgNa)
    for co, tt in enumerate(epList):
        f.write("\N \t" + str(co) + "\t")
        searchTvSe(tt)
        f.write("\n")
    f.close()
fullTv()
The second part only works once. I have 3 folders inside the tv folder (Game of Thrones, Breaking Bad, The Walking Dead), and each folder contains one episode file from the series (Game of Thrones;3;4, Breaking Bad;1;1, The Walking Dead;3;4).
This was working fine before I added 'seriesID' and changed the files (before I had a text file for each folder, which was needed as I had a table for each TV Show).
In episodes.txt, the information for Game of Thrones is the only one that appears. I deleted the Game of Thrones folder and it appears that the final one to be searched is the only one that has been added. It seems to be overwriting it?
Thanks.

Change this line:
f = open(fileP, "w")
To this:
f = open(fileP, "a")

You need to open the file with 'a' instead of 'w':
with open('file.txt', 'a') as myfile:
    myfile.write("Hello world!")
You can find more details in the documentation at https://docs.python.org/2/library/functions.html#open.
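To make the difference between the two modes concrete, here is a small sketch that calls a per-show writer three times, first with "w" and then with "a". The helper name save_show and the temporary file are made up for the illustration; they stand in for seTv and episodes.txt.

```python
import os
import tempfile

# Hypothetical scratch file standing in for episodes.txt.
path = os.path.join(tempfile.mkdtemp(), "episodes.txt")
shows = ["Game of Thrones", "Breaking Bad", "The Walking Dead"]

def save_show(show, mode):
    # "w" truncates the file each time it is opened; "a" keeps existing content.
    with open(path, mode) as f:
        f.write(show + "\n")

for show in shows:
    save_show(show, "w")
with open(path) as f:
    after_w = f.read().splitlines()

os.remove(path)
for show in shows:
    save_show(show, "a")
with open(path) as f:
    after_a = f.read().splitlines()

print(after_w)  # only the last show written survives
print(after_a)  # all three shows are kept
```

One thing to keep in mind with "a": the file also keeps growing across separate runs of the program, so if each run should start from an empty file, another option is to open it once with "w" outside the per-show function and pass the handle in.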

Related

Python (on replit.com) only allowing one instance of one function to run at a time

Here is some code I'm working on requiring string and the opening and closing of files.
#Importing required Packages---------------------------------------------
import string
# Importing Datasets-----------------------------------------------------
allNames = open("allNames.csv", "r")
onlyNames = open("onlyNames.csv", "r")
#=========Tasks==========================================================
# [1] findName(name, outputFile)-----------------------------------------
# Works ####
def findName(name, outputFile):
    outfile = open(outputFile + ".csv", "w") # Output file
    outfile.write("Artist \tSong \tYear\n") # Initial title lines
    alreadyAdded = [] # List of lines already added to remove duplicates
    for aline in allNames: # Looping through allNames.csv
        fields = aline.split("\t") # Splitting elements of a line into a list
        if fields[-1] == name + "\n": # Selecting lines with only the specified name (last element)
            dataline = fields[0] + "\t" + fields[1] + "\t" + fields[3] # Each line in the .csv file
            if dataline not in alreadyAdded: # Removing Duplicates
                outfile.write(dataline + "\n") # Writing the file
                alreadyAdded.append(dataline) # Adding lines already added
    outfile.close()
# findName("Mary Anne", "mary anne")
# findName("Jack", "jack")
# findName("Mary", "mary")
# findName("Peter", "peter")
The code serves its intended purpose, as I get an exported file. However, it only works for one call at a time: for example, if I run both findName("Mary Anne", "mary anne") and findName("Jack", "jack") in the same run, the second call does not work. Moreover, all subsequent functions in the project file do not work unless I comment out this code.
Let me know what the issue is, thank you!
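One likely cause, offered as a guess since the data files are not shown: allNames is opened once at module level, and a file object can only be iterated to the end once. After the first call has looped over it, later loops over the same handle see no lines. The sketch below reproduces that with an in-memory stand-in for allNames.csv (the rows and the helper count_matches are invented).

```python
import io

# Stand-in for allNames.csv: tab-separated, name in the last column. Rows are made up.
all_names = io.StringIO(
    "Artist A\tSong A\t1\t1990\tJack\n"
    "Artist B\tSong B\t2\t1991\tMary\n"
)

def count_matches(handle, name):
    # Iterating consumes the handle and leaves it positioned at end of file.
    return sum(1 for line in handle if line.rstrip("\n").split("\t")[-1] == name)

first = count_matches(all_names, "Jack")   # reads the whole file
second = count_matches(all_names, "Mary")  # handle is already exhausted
all_names.seek(0)                          # rewind to the start
third = count_matches(all_names, "Mary")   # sees the rows again

print(first, second, third)
```

If this is the cause, opening allNames.csv inside findName (ideally in a with block), or calling allNames.seek(0) at the start of the function, gives every call a fresh pass over the data.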

Creating a search function in a list from a text file

Hi everyone. I have a Python assignment that requires me to do the following:
Download this CSV file of female Oscar winners (https://docs.google.com/document/d/1Bq2T4m7FhWVXEJlD_UGti0zrIaoRCxDfRBVPOZq89bI/edit?usp=sharing) and open it in a text editor on your computer
Add a text file to your sandbox project named OscarWinnersFemales.txt
Copy and paste several lines from the original file into your sandbox file. Make sure that you include the header.
Write a Python program that does the following:
Open the file and store the file object in a variable
Read the entire contents line by line into a list and strip away the newline character at the end of each line
Using list slicing, print lines 4 through 7 of your file
Write code that will ask the user for an actress name and then search the list to see if it is in there. If it is it will display the record and if it is not it will display Sorry not found.
Close the file
Below is the code I currently have. I've already completed the first three bullet points but I can't figure out how to implement a search function into the list. Could anyone help clarify it for me? Thanks.
f = open('OscarsWinnersFemales.txt')
f = ([x.strip("\n") for x in f.readlines()])
print(f[3:7])
Here's what I tried already but it just keeps returning failure:
def search_func():
    actress = input("Enter an actress name: ")
    for x in f:
        if actress in f:
            print("success")
        else:
            print("failure")
search_func()
I hate it when people use complicated commands like ([x.strip("\n") for x in f.readlines()]), so I'll just use multiple lines, but you can do what you like.
f = open("OscarWinnersFemales.txt")
lines = f.readlines()
f.close()
data = {} # maps each actress name to her formatted record
for i, d in enumerate(lines):
    d = d.strip("\n") # drop the trailing newline before splitting
    lines[i] = d
    try:
        index, year, age, name, movie = d.split(",")
    except ValueError:
        # a sixth field is treated as a second movie title
        index, year, age, name, movie, movie2 = d.split(",")
        movie += " and " + movie2
    data[name] = f"{index}-> {year}-{age} | {movie}"
print(lines[3:7])
def search_actr(name):
    if name in data:
        print(data[name])
    else:
        print("Actress does not exist in database. Remember to use capitals and their full name")
I apologize if there are any errors, I decided not to download the file but everything I wrote is based off my knowledge and testing.
I have figured it out
file = open("OscarWinnersFemales.txt","r")
OscarWinnersFemales_List = []
for line in file:
    stripped_line = line.strip()
    OscarWinnersFemales_List.append(stripped_line)
file.close()
print(OscarWinnersFemales_List[3:7])
print()
actress_line = 0
name = input("Enter An Actress's Name: ")
for line in OscarWinnersFemales_List:
    if name in line:
        actress_line = line
        break
if actress_line == 0:
    print("Sorry, not found.")
else:
    print()
    print(actress_line)
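For anyone wondering why the first attempt always printed failure: `actress in f` asks whether some element of the list equals the input exactly, and every element is a whole CSV line, so a bare name never matches. The check has to be a substring test against each line. Here is a compact sketch of that idea as a function; the name find_record, the case-insensitive matching, and the sample rows are my own additions, not part of the assignment.

```python
def find_record(records, name):
    """Return the first line that contains name (case-insensitive), or None."""
    needle = name.strip().lower()
    if not needle:
        return None
    for line in records:
        if needle in line.lower():  # substring test on one line, not on the list
            return line
    return None

# Made-up sample rows, not taken from the real file.
sample = [
    "Index,Year,Age,Name,Movie",
    "1,2001,30,Jane Doe,Example Film",
    "2,2002,41,Mary Major,Another Film",
]

exact_membership = "Jane Doe" in sample   # list membership: needs a whole-line match
found = find_record(sample, "jane doe")
missing = find_record(sample, "Nobody")

print(exact_membership)
print(found or "Sorry, not found.")
print(missing or "Sorry, not found.")
```

Returning None for "not found" avoids the 0 sentinel used above and lets the caller decide what message to print.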

Python script times out or will not finish running

I've been working on a python script that will scrape certain webpages.
The beginning of the script looks like this:
# -*- coding: UTF-8 -*-
import urllib2
import re
database = ''
contents = open('contents.html', 'r')
for line in contents:
    entry = ''
    f = re.search('(?<=a href=")(.+?)(?=\.htm)', line)
    if f:
        entry = f.group(0)
    page = urllib2.urlopen('https://indo-european.info/pokorny-etymological-dictionary/' + entry + '.htm').read()
    m = re.search('English meaning( )+\s+(.+?)</font>', page)
    if m:
        title = m.group(2)
    else:
        title = 'N/A'
This accesses each page and grabs a title from it. Then I have a number of blocks of code that test whether certain text is present in each page, here is an example of one:
abg = re.findall('\babg\b', page);
if len(abg) == 0:
    abg = 'N'
else:
    abg = 'Y'
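A side note on the pattern in this block, separate from the timeout: in an ordinary Python string literal, \b is the backspace character, so '\babg\b' searches for backspace + abg + backspace rather than for the whole word abg. Writing the pattern as a raw string passes the regex word-boundary \b through to re. A short sketch with made-up page text (shown with Python 3 syntax; the string-escape behaviour is the same in Python 2):

```python
import re

page = "the root abg appears here"  # made-up sample text

plain = re.findall('\babg\b', page)   # '\b' here is a backspace character
raw = re.findall(r'\babg\b', page)    # r'\b' is the regex word boundary

print(len('\b'), len(r'\b'))
print(plain)
print(raw)

abg = 'Y' if raw else 'N'
print(abg)
```

With the non-raw pattern, every page would end up flagged 'N' regardless of its content.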
Then, finally, still in the for loop, I add this information to the variable database:
database += '\n' + str('<F>') + str(entry) + '<TITLE="' + str(title) + '"><FQ="N"><SQ="N"><ABG="' + str(abg) + '"></F>'
Note that I have used str() for each variable because I was getting a "can't concatenate strings and lists" error for some reason.
Once the for loop is completed, I write the database variable to a file:
f = open('database.txt', 'wb')
f.write(database)
f.close()
When I run this in the command line, it times out or never completes running. Any ideas as to what might be causing the issue?
EDIT: I fixed it. It seems the program was getting slowed down by the fact that I was having the database variable store the result of each line's iteration through the loop. All I had to do to fix the issue was change the write function to happen during the for loop.
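The fix described in the EDIT can be sketched as follows: open the output file once and write each record as soon as it is built, instead of growing one large string. This is only an illustration in Python 3 syntax; write_records and fake_pages are invented names, and fake_pages stands in for the fetch-and-parse loop so the example runs without network access.

```python
import os
import tempfile

def write_records(records, path):
    """Write one <F> record per item as it arrives; return how many were written."""
    count = 0
    with open(path, "w") as out:
        for entry, title, abg in records:
            out.write('<F>%s<TITLE="%s"><FQ="N"><SQ="N"><ABG="%s"></F>\n'
                      % (entry, title, abg))
            count += 1
    return count

def fake_pages():
    # Stand-in for the scraping loop; these rows are invented.
    yield ("page1", "to shine", "Y")
    yield ("page2", "N/A", "N")

out_path = os.path.join(tempfile.mkdtemp(), "database.txt")
written = write_records(fake_pages(), out_path)
print(written)
```

A side benefit of writing inside the loop is that partial results are already on disk if the run is interrupted partway through.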

divided pdf in two

I need a program to divide each PDF page in two (left, right). So I wrote this code, but for some reason it doesn't catch the title image. It didn't work when I tried it with other books either.
import os
#Info that i collect to know the numbers of pages and the pdf file name
number = int(input("Number of pages: "))
file = input("Name of the file: " )
file = str(file) + ".pdf"
text = open("for_the_latex.txt","w")
#Putting the first part of the latex document
a = "\documentclass{article}" + "\n"
b = "\\usepackage{pdfpages}" + "\n"
c = "\\begin{document}"
text.write(a)
text.write(b)
text.write(c)
#This is the core of the program
#It basically write in a text document to include the pdf for each page
for i in range(1,number +1):
    a = "\includepdf[pages=" + str( i) + ",trim=0 0 400 0]{" + file + "}" + "\n"
    text.write(a)
#Writing the finish part
quatro = "\end{document}"
text.write(quatro)
text.close()
#renaming to .tex
os.rename("for_the_latex.txt", "divided.tex")
#activating the latex
os.system("pdflatex divided.tex")
Where is the error?
I want to divide each page of the PDF in two.
Consider the following minimal document (called example-document.pdf) that contains 6 pages, each exactly split in half by colour and number:
\documentclass{article}
\usepackage[paper=a4paper,landscape]{geometry}
\usepackage{pdfpages}
\begin{document}
% http://mirrors.ctan.org/macros/latex/contrib/mwe/example-image-a4-numbered.pdf
\includepdf[pages={1-2},nup=2]{example-image-a4-numbered.pdf}
\includepdf[pages={3-4},nup=2]{example-image-a4-numbered.pdf}
\includepdf[pages={5-6},nup=2]{example-image-a4-numbered.pdf}
\includepdf[pages={7-8},nup=2]{example-image-a4-numbered.pdf}
\includepdf[pages={9-10},nup=2]{example-image-a4-numbered.pdf}
\includepdf[pages={11-12},nup=2]{example-image-a4-numbered.pdf}
\end{document}
The idea is to split these back into a 12-page document. Here's the code for LaTeX:
\documentclass{article}
\usepackage[paper=a4paper]{geometry}
\usepackage{pdfpages,pgffor}
\newlength{\pagedim}% To store page dimensions, if necessary
\begin{document}
\foreach \docpage in {1,...,6} {
\settowidth{\pagedim}{\includegraphics[page=\docpage]{example-document.pdf}}% Establish page width
\includepdf[pages={\docpage},trim=0 0 .5\pagedim{} 0,clip]{example-document.pdf}% Left half
\includepdf[pages={\docpage},trim=.5\pagedim{} 0 0 0,clip]{example-document.pdf}% Right half
}
\end{document}
It's not necessary to read in every page and establish its width (stored in \pagedim), but I wasn't sure whether your pages may have differing sizes.
As mentioned in the comment, I'm not quite sure I understand your problem correctly. Since I can run your program and it includes only the left part of the original document, I modified the code a bit.
import os
#Info that i collect to know the numbers of pages and the pdf file name
number = int(input("Number of pages: "))
file = input("Name of the file: ")
file = str(file) + ".pdf"
text = open("divided.tex","w")
#Putting the first part of the latex document
header ='''
\\documentclass{article}
\\usepackage{pdfpages}
\\begin{document}
'''
#This is the core of the program
#It basically write in a text document to include the pdf for each page
middle=''
for i in range(1,number +1):
    middle += "\\includepdf[pages={},trim=0 0 400 0]{{{}}}\n".format(i, file)
    middle += "\\includepdf[pages={},trim=400 0 0 0]{{{}}}\n".format(i, file)
#Writing the finish part
quatro = "\\end{document}"
text.write(header)
text.write(middle)
text.write(quatro)
text.close()
#activating the latex
os.system("pdflatex divided.tex")
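The same generator can also be written as a function that returns the LaTeX source, which makes it easy to inspect before running pdflatex. This is only a sketch: build_split_tex is an invented name, raw strings are used so the backslashes need no doubling, and the clip option is added on the assumption that the trimmed half should actually be cut away, as in the LaTeX answer above. The trim amount is passed in unchanged, in whatever unit the 400 in the script was meant to be.

```python
def build_split_tex(pdf_name, pages, half_width):
    """Return LaTeX source that includes the left and then the right half of each page."""
    lines = [
        r"\documentclass{article}",
        r"\usepackage{pdfpages}",
        r"\begin{document}",
    ]
    for page in range(1, pages + 1):
        # trim order is left, bottom, right, top
        lines.append(r"\includepdf[pages=%d,trim=0 0 %s 0,clip]{%s}"
                     % (page, half_width, pdf_name))   # keep the left half
        lines.append(r"\includepdf[pages=%d,trim=%s 0 0 0,clip]{%s}"
                     % (page, half_width, pdf_name))   # keep the right half
    lines.append(r"\end{document}")
    return "\n".join(lines) + "\n"

tex = build_split_tex("book.pdf", 2, 400)  # "book.pdf" is a placeholder name
print(tex)
```

Writing the returned string to divided.tex and calling pdflatex on it would then work as in the script above.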

Format path within Text File for consumption in Python

I am writing a Python script for use by multiple non-Python users.
I have a text file containing the parameters my script needs to run.
One of the inputs is a path. I cannot get my script to run and was thinking it was because I had referenced my path incorrectly.
I have tried:
C:\temp\test
"C:\temp\test"
r"C:\temp\test"
C:/temp/test
"C:/temp/test"
C:\\temp\\test
"C:\\temp\\test"
I have added each one of these into a text file, which is called and read in my Python script.
I have other parameters and they are called correctly, my script seems to run when I hard code the path in. I say seems because I think there are a few bugs I need to check, but it runs with no errors.
When I use the text file I get this error - which varies depending on if I used one of the above examples:
WindowsError: [Error 123] The filename, directory name, or volume
label syntax is incorrect: 'c:\temp\match1\jpg\n/.'
My code is as follows:
print ("Linking new attachments to feature")
fp = open(r"C:\temp\Match1\Match_Table.txt","r") #reads my text file with inputs
lines=fp.readlines()
InFeat = lines[1]
print (InFeat)
AttFolder = lines[3] #reads the folder from the text file
print (AttFolder)
OutTable = lines[5]
if arcpy.Exists(OutTable):
    print("Table Exists")
    arcpy.Delete_management(OutTable)
OutTable = lines[5]
print (OutTable)
LinkF = lines[7]
print (LinkF)
fp.close()
#adding from https://community.esri.com/thread/90280
if arcpy.Exists("in_memory\\matchtable"):
    arcpy.Delete_management("in_memory\\matchtable")
print ("CK Done")
input = InFeat
inputField = "OBJECTID"
matchTable = arcpy.CreateTable_management("in_memory", "matchtable")
matchField = "MatchID"
pathField = "Filename"
print ("Table Created")
arcpy.AddField_management(matchTable, matchField, "TEXT")
arcpy.AddField_management(matchTable, pathField, "TEXT")
picFolder = r"C:\temp\match1\JPG" #hard coded in
print (picFolder)
print ("Fields added")
fields = ["MatchID", "Filename"]
cursor = arcpy.da.InsertCursor(matchTable, fields)
##go thru the picFolder of .png images to attach
for file in os.listdir(picFolder):
    if str(file).find(".jpg") > -1:
        pos = int(str(file).find("."))
        newfile = str(file)[0:pos]
        cursor.insertRow((newfile, file))
del cursor
arcpy.AddAttachments_management(input, inputField, matchTable, matchField, pathField, picFolder)
From your error "'c:\temp\match1\jpg\n/.'", I can see a "\n" character. \n is the newline symbol (what you get when you press Enter), and readlines() keeps it at the end of every line, so you should remove it from the end of your path. Did you try that? You can use .rstrip("\n"), .strip(), replace(), or a regular expression to remove it.
Try to open and read line by line of your input file like this:
read_lines = [line.rstrip('\n') for line in open(r"C:\temp\Match1\Match_Table.txt")]
print(read_lines)
print(read_lines[1])
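Here is a small self-contained sketch of the same idea, comparing what readlines() returns with the stripped value. The parameter file layout (a label line followed by its value) is an assumption based on the odd-numbered indexes used in the question, and read_params is an invented helper name.

```python
import os
import tempfile

def read_params(path):
    """Return the file's lines with surrounding whitespace and newlines removed."""
    with open(path) as fp:
        return [line.strip() for line in fp]

# Hypothetical parameter file: a label on each even line, its value on the next line.
param_path = os.path.join(tempfile.mkdtemp(), "Match_Table.txt")
with open(param_path, "w") as fp:
    fp.write("\n".join([
        "Input feature", r"C:\temp\match1\features",
        "Attachment folder", r"C:\temp\match1\JPG",
    ]) + "\n")

with open(param_path) as fp:
    raw_value = fp.readlines()[3]          # what the question's code reads
clean_value = read_params(param_path)[3]   # same line, newline removed

print(repr(raw_value))
print(repr(clean_value))
```

As for the format inside the text file: the plain, unquoted form (C:\temp\test) is the one to use. Text read from a file is not parsed as a Python literal, so quotes, an r prefix, or doubled backslashes would all become part of the path string.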
