I need to move images from several directories into one and capture metadata on the files before and after the move.
In each directory:
Read the index of jpg images from indexfile.csv, including metadata on each image
Upload the corresponding image file to Google Drive, with metadata
Add an entry to uberindex.csv that includes the metadata from indexfile.csv and the file URL from Google Drive after upload
My plan was to create an instance of the class ybpic() (def below) for each row of indexfile.csv and use that instance to identify the actual file to be moved (its reference in the indexfile), hold the metadata from indexfile.csv, then update that ybpic instance with the results of the Google Drive upload (the other metadata) before finally writing all of the instances out to uberindex.csv.
I know I'm going to kick myself when the answer comes (real noob).
I can csv.reader the indexfile.csv into a ybpic instance, but I'm not able to refer to each instance individually to use or update it later.
I can just append the rows from indexfile.csv to indexlist[], and I can return the updated list to the caller, but I don't know a good way to later update the list row for the corresponding image file with the new metadata.
Here's the ybpic def
class ybpic():
    def __init__(self, FileID, PHOTO, Source, Vintage, Students, Folder, Log):
        self.GOBJ = " "
        self.PicID = " "
        self.FileID = FileID
        self.PHOTO = PHOTO
        self.Source = Source
        self.Students = Students
        self.Vintage = Vintage
        self.MultipleStudents = " "
        self.CurrentTeacher = " "
        self.Folder = Folder  ## This may be either the local folder or the drive folder attr
        self.Room = " "
        self.Log = Log  ## The source csvfile from which the FileID came
Here is the function populating the instance and list. The indexfile.csv is passed as photolog and cwd is just the working directory:
import csv

def ReadIndex(photolog, cwd, indexlist):
    """ Read the CSV log file into an instance of YBPic. """
    with open(photolog, 'r') as indexin:
        readout = csv.reader(indexin)
        for row in readout:
            indexrow = ybpic(row[0], row[1], row[2], row[3], row[4], cwd, photolog)
            indexlist.append(row)  ### THIS WORKS TO APPEND TO THE LIST
                                   ### THAT WAS PASSED TO ReadIndex
    return indexlist
Any and all help is greatly appreciated.
Instead of using a list, you could use a dictionary of objects with PhotoID as the key (assuming it's stored in row[0]).
def ReadIndex(photolog, cwd, indexlist):
    """ Read the CSV log file into an instance of YBPic. """
    ybpic_dict = {}
    with open(photolog, 'r') as indexin:
        readout = csv.reader(indexin)
        for row in readout:
            ybpic_dict[row[0]] = ybpic(row[0], row[1], row[2], row[3], row[4], cwd, photolog)
    return ybpic_dict
Then, when you need to update attributes later, look the instance up by its key and set them directly, e.g. ybpic_dict[PhotoID].GOBJ = ..., or give ybpic an update() method and call ybpic_dict[PhotoID].update(...).
Okay, since I found the answer myself, no kicks are in order....
Store the ybpic instance objects in a list.
The answer was: in the for loop that creates the ybpic instance from each row of the indexfile, rather than appending the row values to the list passed back to the caller, append the instance object itself. Back in the calling function, I then have access to the objects (instances).
I'm not sure this is the best answer, but it's the one that gets me moving to the next.
New code:
def ReadIndex(photolog, cwd, indexlist):
    """ Read the CSV log file into an instance of YBPic. """
    with open(photolog, 'r') as indexin:
        readout = csv.reader(indexin)
        for row in readout:
            indexrow = ybpic(row[0], row[1], row[2], row[3], row[4], cwd, photolog)
            indexlist.append(indexrow)  ## Store the ybpic indexrow instance
    return indexlist
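For completeness, here's a sketch of how the caller might then use those instances; the upload helper and the uberindex column order are hypothetical, just to illustrate updating an instance later and writing everything out:

import csv
import os

pics = ReadIndex('indexfile.csv', os.getcwd(), [])

for pic in pics:
    # hypothetical helper: upload the local file and return its Drive URL
    pic.GOBJ = upload_to_drive(pic.FileID, pic.Folder)

with open('uberindex.csv', 'a', newline='') as out:
    writer = csv.writer(out)
    for pic in pics:
        writer.writerow([pic.FileID, pic.PHOTO, pic.Source, pic.Vintage,
                         pic.Students, pic.Folder, pic.Log, pic.GOBJ])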
I'm working on a program that, in short, creates a CSV file and compares it line by line to a previously generated CSV file (created on a previous script execution), and logs the differences between the two.
I'm having a weird issue where csv.DictReader does not appear to be reading all rows of the NEW log - but it DOES read all rows of the OLD log. What makes this issue even more puzzling is that if I run the program again, it creates a new log, and will now read the previous log all the way to the end.
Here's what I mean if that made no sense:
Run program to generate LOG A
Run program again to generate LOG B
csv.DictReader does not read all the way through LOG B, but it DOES read all the way through LOG A
Run program again to generate LOG C
csv.DictReader does NOT read all the way through LOG C, but it DOES read all the way through LOG B (which it previously didn't, although no information in the CSV file has changed)
Here's the relevant function:
def compare_data(newLog, oldLog):
    # this function compares the new scan log to the old log to determine if, and how much,
    # memory space has changed within each directory
    # both arguments should be filenames of the logs
    # use the csv library to create dictionary objects out of the files and read through them
    newReader = csv.DictReader(newLog)
    oldReader = csv.DictReader(oldLog)
    oldDirs, newDirs = [], []
    # write data from new log into dictionary list
    for row in newReader:
        newLogData = {}
        newLogData['ScanDate'] = row['ScanDate']
        newLogData['Directory'] = row['Directory']
        newLogData['Size'] = int(row['Size'])
        newDirs.append(newLogData)
    # write data from old log into another dictionary list
    for row in oldReader:
        oldLogData = {}
        oldLogData['ScanDate'] = row['ScanDate']
        oldLogData['Directory'] = row['Directory']
        oldLogData['Size'] = int(row['Size'])
        oldDirs.append(oldLogData)
    # now compare data between the two dict lists to determine what's changed
    for newDir in newDirs:
        for oldDir in oldDirs:
            dirMatch = False
            if newDir['Directory'] == oldDir['Directory']:
                # we have linked dirs. now check to see if their size has changed
                dirMatch = True
                if newDir['Size'] != oldDir['Size']:
                    print(newDir['Directory'], 'has changed in size! It used to be', oldDir['Size'],
                          'bytes, now it\'s', newDir['Size'], 'bytes! Hype')
    # now that we have determined changes, we should check for unique dirs
    # unique meaning either a dir deleted since the last scan, or a new dir added since the last scan
    find_unique_dirs(oldDirs, newDirs)
Based on the fact that the old log gets read in its entirety, I don't think it would be an issue of open quotes in filenames or anything like that.
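For reference, csv.DictReader expects an open file object (or any iterable of lines) rather than a filename, so if newLog and oldLog are paths they would need to be opened first; a minimal sketch of that usage pattern (the log filename is hypothetical):

import csv

with open('log_a.csv', newline='') as fh:
    for row in csv.DictReader(fh):
        print(row['ScanDate'], row['Directory'], row['Size'])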
If I'm given a *.hdf file, how can I print out all the data it contains?
>>> import h5py
>>> f = h5py.File('my_file.hdf', 'r')
>>> # What's next?
All the questions here describe how to either create an hdf file or just read it without printing out the data it contains. So don't mark it as a duplicate.
This is not a proper answer to this question, but the one other answer is a bit unsatisfactory.
To have a look at what's inside an .hdf file, I usually use NASA's Panoply software. It can be downloaded here: http://www.giss.nasa.gov/tools/panoply/ and it lets you open, explore and plot data in all sorts of geo-referenced formats, including netCDF and hdf.
Then I can find out the name of the subdataset I'm interested in and open it in my python script.
Hope this will be a helpful tip for some people looking up this question!
You might want to use the visititems method.
Recursively visit all objects in this group and subgroups. Like Group.visit(), except your callable should have the signature:
callable(name, object) -> None or return value.
In this case object will be a Group or Dataset instance.
So the idea is to have a function that takes as arguments the name of the visited group (or dataset) and the group (or dataset) instance itself and logs it, and then to call the visititems method of the opened file with that log function as the argument.
Here is a simple example implementation:
import h5py

def log_hdf_file(hdf_file):
    """
    Print the groups, attributes and datasets contained in the given HDF file handler to stdout.

    :param h5py.File hdf_file: HDF file handler to log to stdout.
    """
    def _print_item(name, item):
        """Print to stdout the name and attributes or value of the visited item."""
        print(name)
        # Format item attributes, if any
        if item.attrs:
            print('\tattributes:')
            for key, value in item.attrs.items():
                print('\t\t{}: {}'.format(key, str(value).replace('\n', '\n\t\t')))
        # Format Dataset value (item[()] reads the whole dataset; .value is gone in h5py 3)
        if isinstance(item, h5py.Dataset):
            print('\tValue:')
            print('\t\t' + str(item[()]).replace('\n', '\n\t\t'))

    # Print the file attributes first, as they are not reachable from File.visititems()
    _print_item(hdf_file.filename, hdf_file)
    # Print the content of the file
    hdf_file.visititems(_print_item)

with h5py.File('my_file.h5', 'r') as hdf_file:
    log_hdf_file(hdf_file)
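If you only want the tree of group and dataset names, without attributes or values, File.visit with print as the callback is enough (Python 3):

with h5py.File('my_file.h5', 'r') as f:
    f.visit(print)   # prints the path of every group and dataset in the file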
I want to update the nodal values of an existing Abaqus odb file using a Python script. I already have the new nodal values, but I don't know how to put them into the odb file in place of the previous data.
I might be wrong about this, but there is no method you can call to replace the existing values in the odb. What you can do, though, is create a new step and frame (or just a frame in an existing step) and then create a new field output object with the new values.
If you can live with this approach, check the documentation for the FieldOutput object. You would probably do something like this:
odb = session.odbs['yourOdbName']
instance = odb.rootAssembly.instances['nameOfYourInstance']
field_output = odb.steps['stepName'].frames[frameId].FieldOutput(
    name='DefineTheName', description='WhatItRepresents',
    type=SCALAR  # or whatever other type you need; SCALAR and NODAL come from abaqusConstants
)
field_output.addData(
    position=NODAL, instance=instance, labels=your_node_labels,
    data=your_data
)
After you're done with this, or even better before, try calling the following:
odb = session.odbs['yourOdbName']
del odb.steps['stepWithResults'].frames[frameId].fieldOutputs['variableName']
This is a wild guess but it might work. If it does, you can simply delete the existing field output, create a new one and then save the odb.
Whatever you choose, make sure not to open the odb in read-only mode, and to save the odb and then reopen it, because otherwise the changes will probably not be visible in the current session.
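A minimal sketch of that save-and-reopen step using the standard odb calls (the path is hypothetical):

# persist the modified odb and reload it so the new field output is visible
odb.save()
odb.close()
odb = session.openOdb(name='path/to/yourOdb.odb')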
Right now, I have a Django application with an import feature which accepts a .zip file, reads out the CSV files, formats them to JSON and then inserts them into the database. The JSON file with all the data is put into temp_dir and is called data.json.
Unfortunately, the insertion is done like so:
Building.objects.all().delete()
call_command('loaddata', os.path.join(temp_dir, 'data.json'))
My problem is that all the data is deleted and then re-added. Instead, I need a way to update and add data without deleting the existing data.
I've been looking at other Django commands, but I can't seem to find one that would allow me to insert the data and update/add records. I'm hoping there is an easy way to do this without modifying a whole lot.
If you loop through your data you could use get_or_create(); this will return the object if it exists and create it if it doesn't:
obj, created = Person.objects.get_or_create(first_name='John', last_name='Lennon', defaults={'birthday': date(1940, 10, 9)})
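A rough sketch of how that could replace the delete-and-reload step, assuming data.json is in Django's usual fixture layout (a list of objects with "model", "pk" and "fields"); the Building field handling here is illustrative, not from the original code:

import json
import os

with open(os.path.join(temp_dir, 'data.json')) as fh:
    for record in json.load(fh):
        fields = record['fields']
        obj, created = Building.objects.get_or_create(pk=record['pk'], defaults=fields)
        if not created:
            # update the existing row instead of deleting and re-adding it
            for name, value in fields.items():
                setattr(obj, name, value)
            obj.save()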
Django and Python newbie here. Ok, so I want to make a webpage where the user can enter a number between 1 and 10. Then, I want to display an image corresponding to that number. Each number is associated with an image filename, and these 10 pairs are stored in a list in a .txt file.
One way to retrieve the appropriate filename is to create a NumToImage model, which has an integer field and a string field, and store all 10 NumToImage objects in the SQL database. I could then retrieve the filename for any query number. However, this does not seem like such a great solution for storing a simple .txt file which I know is not going to change.
So, what is the way to do this in Python, without using a database? I am used to C++, where I would create an array of strings, one for each of the numbers, and load these from the .txt file when the application starts. This vector would then lie within a static object such that I can access it from anywhere in my application.
How can a similar thing be done in Python? I don't know how to instantiate a Python object and then enable it to be accessible from other Python scripts. The only way I can think of doing this is to pass the object instance as an argument for every single function that I call, which is just silly.
What's the standard solution to this?
Thank you.
The Python way is quite similar: you run code at the module level and create objects in the module namespace that can be imported by other modules. The module body runs only once, on first import, so the file is read a single time no matter how many modules import it.
In your case it might look something like this:
myimage.py
imagemap = {}

# Read the (image_num, image_path) pairs from the .txt file once, at import time
# (assuming comma-separated "num,path" lines in a file called numtoimage.txt).
with open('numtoimage.txt') as f:
    for line in f:
        num, path = line.strip().split(',', 1)
        imagemap[int(num)] = path
views.py
from myimage import imagemap
def my_view(image_num):
    image_path = imagemap[image_num]
    # do something with image_path