Tracking file upload status?


#1

[For background, this is motivated by PAN006 filling its disk sometime in the last few days.]

The housekeeping stage of the PANOPTES software takes care of uploading images to Google Cloud Storage, but we don’t have a good way of tracking, file by file, whether the upload has actually happened. Windows has an Archive bit associated with each file; Linux has no equivalent. However, ext4 (the default file system on Ubuntu) supports extended attributes, which allow arbitrary key:value pairs to be attached to files. We could use these to record when and where each file was uploaded, which would make it fairly easy to determine when it is safe to delete local files. My preference is to keep the files on the local system for a while, in case there turns out to be a problem with the upload or (more likely) we want to investigate an issue with the operation of the scope.
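To make the idea concrete, here is a minimal sketch using Python's built-in `os.setxattr` and `os.getxattr` (Linux only). The attribute name `user.panoptes.upload`, the function names, and the JSON record layout are all assumptions for illustration; none of this exists in the PANOPTES code.

```python
import errno
import json
import os
import time

# Hypothetical attribute name; unprivileged attributes must live in the "user." namespace.
XATTR_NAME = "user.panoptes.upload"


def mark_uploaded(path, destination):
    """Attach a small JSON record to the file saying where and when it was uploaded."""
    record = {"destination": destination, "uploaded_at": time.time()}
    os.setxattr(path, XATTR_NAME, json.dumps(record).encode("utf-8"))


def read_upload_status(path):
    """Return the upload record, or None if the file has not been marked."""
    try:
        raw = os.getxattr(path, XATTR_NAME)
    except OSError as err:
        # ENODATA: attribute not set. ENOTSUP: file system has no xattr support.
        if err.errno in (errno.ENODATA, errno.ENOTSUP):
            return None
        raise
    return json.loads(raw.decode("utf-8"))
```

One caveat: the attributes live in the file system, so they are easily lost when a file is copied to another machine or uploaded unless the copying tool is told to preserve them.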

Does anyone on the team have experience using these extended attributes?


#2

My original idea was to just mark the observation as uploaded in the metadb (mongo) on the unit. Might be slightly more portable.

I’ve never used extended attributes, so I can’t say how well they would work.


#3

For even more portability and simplicity, we could just have a file in each observation directory that tracks the upload status, or any other metadata, of the image files in that directory. That way the metadata is easy to archive or send along with the images.
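A rough sketch of what I have in mind, using only the Python standard library. The file name `upload_status.json`, the function names, and the record layout are made up for illustration and are not part of the existing software.

```python
import json
import time
from pathlib import Path

# Hypothetical name for the per-directory status file.
STATUS_FILENAME = "upload_status.json"


def load_status(obs_dir):
    """Return the {filename: record} mapping for a directory, or {} if none exists yet."""
    status_path = Path(obs_dir) / STATUS_FILENAME
    if not status_path.exists():
        return {}
    return json.loads(status_path.read_text())


def record_upload(obs_dir, filename, destination):
    """Note that `filename` was uploaded to `destination` just now."""
    status = load_status(obs_dir)
    status[filename] = {"destination": destination, "uploaded_at": time.time()}
    status_path = Path(obs_dir) / STATUS_FILENAME
    # Write to a temporary file and rename it, so a crash never leaves a half-written file.
    tmp_path = status_path.with_suffix(".tmp")
    tmp_path.write_text(json.dumps(status, indent=2))
    tmp_path.replace(status_path)


def files_safe_to_delete(obs_dir, min_age_seconds):
    """List files whose upload was recorded at least `min_age_seconds` ago."""
    now = time.time()
    status = load_status(obs_dir)
    return sorted(
        name
        for name, record in status.items()
        if now - record["uploaded_at"] >= min_age_seconds
    )
```

The `min_age_seconds` check is there so files can stay on the unit for a while after upload, and the JSON file travels with the directory if it is archived or copied elsewhere.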


#4

Not a bad idea. On the image processing side of things I am doing something similar and storing some files alongside the sequence directory once they are processed. Essentially the same thing.