Python + ObjectListView + updating the list

I have an ObjectListView. I remove a line from it and then want to update the list without the removed line. I fill the list with data from a database. I tried RepopulateList, but that seems to reuse the data already in the list.
I think I could solve it with ClearAll (clearing the list) and then AddObjects to re-add the database contents, but it seems it should be possible to just update the list. This is my code:
def deletemeas(self):
    MAid = self.objectma.id
    MAname = self.pagename
    objectsRemList = self.tempmeasurements.GetCheckedObjects()
    print 'objectremlist', objectsRemList
    for measurement in objectsRemList:
        print measurement
        Measname = measurement.filename
        Measid = database.Measurement.select(database.Measurement.q.filename == Measname)[0].id
        deleteMeas = []
        deleteMeas.append(MAid)
        deleteMeas.append(Measid)
        pub.sendMessage('DELETE_MEAS', Container(data=deleteMeas))  # to the microanalysis controller
    # here I fetch from the database the latest information on what should be shown
    # in the ObjectListView self.tempmeasurements
    MeasInListFromDB = list(database.Microanalysismeasurement.select(database.Microanalysismeasurement.q.microanalysisid == MAid))
    print 'lijstmetingen:', MeasInListFromDB
    # this doesn't work
    self.tempmeasurements.RefreshObjects(MeasInListFromDB)

OK, this was actually easier than I thought.
I added this line inside the loop:
    self.tempmeasurements.RemoveObject(measurement)
So I first remove the data from my database table, and then I simply remove the line from my ObjectListView.
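Put together, a condensed sketch of the working method (the same structure as above, with the RemoveObject call added per checked item):

def deletemeas(self):
    MAid = self.objectma.id
    for measurement in self.tempmeasurements.GetCheckedObjects():
        Measid = database.Measurement.select(
            database.Measurement.q.filename == measurement.filename)[0].id
        # delete from the database first ...
        pub.sendMessage('DELETE_MEAS', Container(data=[MAid, Measid]))
        # ... then drop the corresponding row from the ObjectListView
        self.tempmeasurements.RemoveObject(measurement)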


Writing to databases in Python

I have been trying to write to a database and am having trouble setting data using two different classes in one function.
Firstly, all values are being passed through a GUI, and in this case only the following entries were passed: 'Categories' = C, 'Usage' = Charter, 'DispTHR' = 5000.
Here you can see that I have two classes I want to access (airport and runway), where the function set_opsdatabase_details() goes to the appropriate class and writes to our database. This is all well and good when the airport and runway are handled separately; however, when integrating them in the same function I can't get the corresponding airportvalueDict = {} and runwayvalueDict = {} to hold the values I want. Could someone help me understand how to write the correct entry-box values into the corresponding valueDict dictionaries?
Thank you in advance! (A screenshot of the output from the print statements is attached.)
[Screenshot: the function in Python]
[Screenshot: output of the function with the print statements]
Example of code in text format:
# edit function for first part of operational window
def edit_details1(self):
    airport = self.manager.get_chosen_airport()
    runway = airport.get_chosen_runway()
    airportlabels = ['Category', 'Usage', 'Curfew',
                     'Surveyor', 'LatestSurvey',
                     'FuelAvail', 'FuelPrice', 'TankerPort', 'RFF']
    runwaylabels = ['DispTHR', 'VASI', 'TCH']
    airportvalueDict = {}
    runwayvalueDict = {}
    print(self.entryDict['Category']["state"])
    if self.entryDict['Category']["state"] == 'disabled':
        for entry in self.entryDict.values():
            entry.config(state=tk.NORMAL)
        self.editButton.config(text="Confirm")
    elif self.entryDict['Category']['state'] == 'normal':
        airport = self.manager.get_chosen_airport()
        runway = airport.get_chosen_runway()
        values = [x.get() for x in self.varDict.values()]
        for label, value in zip(airportlabels, values):
            airportvalueDict[label] = value
        airport.set_opsdatabase_details(airportvalueDict)
        print(f'airport: {airportvalueDict}')
        for label, value in zip(runwaylabels, values):
            runwayvalueDict[label] = value
        runway.set_opsdatabase_details(runwayvalueDict)
        print(f'runway: {runwayvalueDict}')
        for entry in self.entryDict.values():
            entry.config(state=tk.DISABLED)
        self.editButton.config(text="Edit...")
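For what it's worth, the likely culprit is that both zip(...) calls pair their labels with the same values list from index 0, so the runway labels pick up the first three airport values. A minimal sketch of one way around it, assuming self.varDict is keyed by the same label strings used in airportlabels and runwaylabels (an assumption; the code above only shows its values being used):

# Build each dict by looking each variable up by its own label,
# instead of zipping both label lists against one shared values list.
airportvalueDict = {label: self.varDict[label].get() for label in airportlabels}
runwayvalueDict = {label: self.varDict[label].get() for label in runwaylabels}
airport.set_opsdatabase_details(airportvalueDict)
runway.set_opsdatabase_details(runwayvalueDict)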

Whoosh index file overwritten when I open it to add new documents

I have a problem with Whoosh. I want to build the index in several passes, because the query that extracts the data is heavy. I have fixed almost all the problems, but I can't get past this one: every time I reopen the index to add new documents, the file is wiped instead of new documents simply being added. I tried update_document instead of add_document, and FileStorage.open_index instead of index.open_dir, but nothing changed: I always end up with an index file much smaller than expected.
if is_new_index_file:
    if os.path.isdir(<dirname>):
        rmtree(<dirname>)
        os.mkdir(<dirname>)
    else:
        os.mkdir(<dirname>)
    schema = TranslationSchema()
    index.create_in(<dirname>, <schema>, indexname=<indexname>)
    ix = index.open_dir(<dirname>, indexname=<indexname>, schema=<schema>)
else:
    # open an existing index object
    # ix = index.open_dir(<dirname>, indexname=<indexname>)
    # open file storage
    ix = FileStorage(<dirname>)
    ix.open_index(indexname=<indexname>)
...
list-of-fields = <query-to-the-database-to-extract-fields>
...
writer = ix.writer()
# writer.add_document(<list-of-fields>)
writer.update_document(<list-of-fields>)
writer.commit(merge=False, optimize=True)
ix.close()
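For comparison, a minimal sketch of the usual create-once/open-later pattern in Whoosh, using index.exists_in instead of recreating the directory by hand. Two things are worth noting: create_in itself wipes any existing index in that directory, so it must only run on the very first pass, and FileStorage.open_index returns the index object, which the code above discards. The directory, index name, and fields dict below are hypothetical placeholders; TranslationSchema is taken from the question.

import os
from whoosh import index

def get_index(dirname, schema, indexname):
    # Create the index only if it does not exist yet; otherwise reopen it.
    if not index.exists_in(dirname, indexname=indexname):
        os.makedirs(dirname, exist_ok=True)
        return index.create_in(dirname, schema, indexname=indexname)
    return index.open_dir(dirname, indexname=indexname)

ix = get_index('indexdir', TranslationSchema(), 'translations')
writer = ix.writer()
# update_document takes the fields as keyword arguments
writer.update_document(**fields)  # fields: a dict produced by the database query
writer.commit()
ix.close()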

Feed a class from a list

I am still new to Python, but I am using it for my linguistics research.
I am doing research into toponyms, and I got input data from a topographic institution as a tab-separated list with the following fields:
Official_Name, Dialect_Name, Administrative_district, Topographic_district, Y_coordinates, X_coordinates, Longitude, Latitude
So, I defined a class:
class MacroTop:
    def __init__(self, Official_Name, Dialect_Name, Adm_District, Topo_District, Y, X, Long, Lat):
        self.Official_Name = Official_Name
        self.Dialect_Name = Dialect_Name
        self.Adm_District = Adm_District
        self.Topo_District = Topo_District
        self.Y = Y
        self.X = X
        self.Long = Long
        self.Lat = Lat
With open(), I wanted to load my .txt file and read its data into the class in a loop, but it did not work.
What I want is to be able to access one feature of the class, say Dialect_Name, and look through all the entries of that feature. I can do that inside the loop, but I wanted to define a class so I could do more manipulation afterwards.
My loop:
with open("locLuxAll.txt", "r") as topo_list:
    lines = topo_list.readlines()
    for line in lines:
        line = line.split('\t')
        print(line)
        print(line[0])  # this accesses all the data characterized as Official_Name
I tried to make another loop:
for i in range(0-len(lines)):
    lines[i] = MacroTop(str(line[0]), str(line[1]), str(line[2]), str(line[3]),
                        str(line[4]), str(line[5]), str(line[6]), str(line[7]))
But that did not seem to work.
This line fails:
    for i in range(0-len(lines)):
range(0 - len(lines)) is a range over a negative number, so it is empty and the loop body never runs:
In [11]: [i for i in range(-200)]
Out[11]: []
EDIT:
Your code is hard to follow: in the loop for i in range(0-len(lines)) you iterate with i, but the body uses a line variable — where does that come from? Also, I would not write back into the lines list, since it comes from readlines(); create a new list for the results instead. And you don't need the index i at all: the lines are kept in order anyway.
class_lines = []
for line in lines:
    fields = line.rstrip('\n').split('\t')  # split each raw line first
    class_lines.append(MacroTop(fields[0], fields[1], fields[2], fields[3],
                                fields[4], fields[5], fields[6], fields[7]))
Or even with a list comprehension, using argument unpacking (this assumes each line has exactly the eight tab-separated fields):
class_lines = [MacroTop(*line.rstrip('\n').split('\t')) for line in lines]
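As a side note, a short sketch of the same idea using the csv module, which handles the tab splitting and avoids the manual readlines() step (file name taken from the question; again assuming exactly the eight fields listed above):

import csv

with open("locLuxAll.txt", "r", newline="") as topo_file:
    reader = csv.reader(topo_file, delimiter="\t")
    toponyms = [MacroTop(*row) for row in reader]

# Every feature is now reachable across all entries, e.g. all dialect names:
dialect_names = [t.Dialect_Name for t in toponyms]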

Using Python gdata to clear the rows in worksheet before adding data

I have a Google Spreadsheet which I'm populating with values using a Python script and the gdata library. If I run the script more than once, it appends new rows to the worksheet; I'd like the script to first clear all the data from the rows before populating them, so that I have a fresh set of data every time I run the script. I've tried using:
UpdateCell(row, col, value, spreadsheet_key, worksheet_id)
but short of running two for loops like this, is there a cleaner way? Also, this loop seems horrendously slow:
for x in range(2, 45):
    for i in range(1, 5):
        self.GetGDataClient().UpdateCell(x, i, '',
                                         self.spreadsheet_key,
                                         self.worksheet_id)
Not sure if you got this sorted out or not, but regarding speeding up the clearing of the current data, try using a batch request. For instance, to clear out every single cell in the sheet, you could do:
cells = client.GetCellsFeed(key, wks_id)
batch_request = gdata.spreadsheet.SpreadsheetsCellsFeed()
# Iterate through every cell in the CellsFeed, replacing each one with ''
# Note that this does not make any calls yet - it all happens locally
for i, entry in enumerate(cells.entry):
    entry.cell.inputValue = ''
    batch_request.AddUpdate(cells.entry[i])
# Now send the entire batch_request as a single HTTP request
updated = client.ExecuteBatch(batch_request, cells.GetBatchLink().href)
If you want to do things like save the column headers (assuming they are in the first row), you can use a CellQuery:
# Set up a query that starts at row 2
query = gdata.spreadsheet.service.CellQuery()
query.min_row = '2'
# Pull just those cells
no_headers = client.GetCellsFeed(key, wks_id, query=query)
batch_request = gdata.spreadsheet.SpreadsheetsCellsFeed()
# Iterate through every cell in the CellsFeed, replacing each one with ''
# Note that this does not make any calls yet - it all happens locally
for i, entry in enumerate(no_headers.entry):
    entry.cell.inputValue = ''
    batch_request.AddUpdate(no_headers.entry[i])
# Now send the entire batch_request as a single HTTP request
updated = client.ExecuteBatch(batch_request, no_headers.GetBatchLink().href)
Alternatively, you could use this approach to update your cells as well (perhaps more in line with what you want). The documentation provides a basic way to do that, which is (copied from the docs in case the link ever changes):
import gdata.spreadsheet
import gdata.spreadsheet.service
client = gdata.spreadsheet.service.SpreadsheetsService()
# Authenticate ...
cells = client.GetCellsFeed('your_spreadsheet_key', wksht_id='your_worksheet_id')
batchRequest = gdata.spreadsheet.SpreadsheetsCellsFeed()
cells.entry[0].cell.inputValue = 'x'
batchRequest.AddUpdate(cells.entry[0])
cells.entry[1].cell.inputValue = 'y'
batchRequest.AddUpdate(cells.entry[1])
cells.entry[2].cell.inputValue = 'z'
batchRequest.AddUpdate(cells.entry[2])
cells.entry[3].cell.inputValue = '=sum(3,5)'
batchRequest.AddUpdate(cells.entry[3])
updated = client.ExecuteBatch(batchRequest, cells.GetBatchLink().href)

Optimize Python file comparison script

I have written a script which works, but I'm guessing it isn't the most efficient. What I need to do is the following:
Compare two CSV files that contain user information. It's essentially a member list where one file is a more updated version of the other.
The files contain data such as ID, name, status, etc.
Write to a third CSV file ONLY the records in the new file that either don't exist in the older file or contain updated information. For each record there is a unique ID that lets me determine whether a record is new or previously existed.
Here is the code I have written so far:
import csv

fileAin = open('old.csv','rb')
fOld = csv.reader(fileAin)
fileBin = open('new.csv','rb')
fNew = csv.reader(fileBin)
fileCout = open('NewAndUpdated.csv','wb')
fNewUpdate = csv.writer(fileCout)
old = []
new = []
for row in fOld:
    old.append(row)
for row in fNew:
    new.append(row)
output = []
x = len(new)
i = 0
num = 0
while i < x:
    if new[num] not in old:
        fNewUpdate.writerow(new[num])
    num += 1
    i += 1
fileAin.close()
fileBin.close()
fileCout.close()
In terms of functionality, this script works. However, I'm trying to run it on files that contain hundreds of thousands of records, and it takes hours to complete. I'm guessing the problem lies with reading both files into lists and treating each entire row of data as a single string for comparison.
My question is: for what I am trying to do, is there a faster, more efficient way to process the two files and create the third file containing only new and updated records? I don't really have a target time; I mostly want to understand whether there are better ways in Python to process these files.
Thanks in advance for any help.
UPDATE to include sample row of data:
123456789,34,DOE,JOHN,1764756,1234 MAIN ST.,CITY,STATE,305,1,A
How about something like this? One of the biggest inefficiencies of your code is checking whether new[num] is in old every time: because old is a list, each membership test has to scan the entire list. Using a dictionary is much, much faster.
import csv

fileAin = open('old.csv','rb')
fOld = csv.reader(fileAin)
fileBin = open('new.csv','rb')
fNew = csv.reader(fileBin)
fileCout = open('NewAndUpdated.csv','wb')
fNewUpdate = csv.writer(fileCout)
# Key each row by its unique ID (first column) for O(1) lookups
old = {row[0]: row[1:] for row in fOld}
new = {row[0]: row[1:] for row in fNew}
fileAin.close()
fileBin.close()
output = {}
for row_id in new:
    if row_id not in old or not old[row_id] == new[row_id]:
        output[row_id] = new[row_id]
for row_id in output:
    fNewUpdate.writerow([row_id] + output[row_id])
fileCout.close()
difflib is quite efficient: http://docs.python.org/library/difflib.html
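For illustration, one way the difflib suggestion could be applied (a hedged sketch, not the answerer's code): diff the raw lines of the two files and keep only the lines that were added or changed in the new file. It treats each line as opaque text, so a changed record shows up as its new version.

import difflib

with open('old.csv') as f_old, open('new.csv') as f_new:
    # n=0 suppresses context lines, leaving only additions and removals
    diff = difflib.unified_diff(f_old.readlines(), f_new.readlines(), n=0)
    with open('NewAndUpdated.csv', 'w') as f_out:
        for line in diff:
            # '+' lines are new or updated rows; skip the '+++' file header
            if line.startswith('+') and not line.startswith('+++'):
                f_out.write(line[1:])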
Sort the data by your unique field(s), and then use a comparison process analogous to the merge step of merge sort:
http://en.wikipedia.org/wiki/Merge_sort
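A minimal sketch of that merge-style pass, in Python 3 style (unlike the Python 2 code above), assuming both CSV files are already sorted by the ID in the first column and that the IDs compare consistently as strings (true for fixed-width IDs like the sample row). It streams both files, so nothing is held in memory:

import csv

with open('old.csv', newline='') as f_old, \
     open('new.csv', newline='') as f_new, \
     open('NewAndUpdated.csv', 'w', newline='') as f_out:
    old_rows = csv.reader(f_old)
    new_rows = csv.reader(f_new)
    writer = csv.writer(f_out)
    old_row = next(old_rows, None)
    for new_row in new_rows:
        # Advance the old file until its ID catches up with the new row's ID
        while old_row is not None and old_row[0] < new_row[0]:
            old_row = next(old_rows, None)
        # Write the row if its ID is absent from the old file or its data changed
        if old_row is None or old_row != new_row:
            writer.writerow(new_row)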
