VARS-ML (Initial Release--November 2023)
Priority (originally assigned) | Task | Due (as prioritized) | Description | Assigned to | Reported by | This problem impacts | Needs to be done for which component | NOTES | STATUS
25 | i2map midwater localizations underway. We think benthic localizations are in good shape, but Lonny and Larissa can take another pass. | 12/31/2023 (in process) | Need more localizations of various MW concepts from i2MAP. | KRW, KLS, RS | KRW, KLS, RS | training | 1: Before training next model | On hold for item #2 above (vars-gridview queries are failing), which is now done. Also pending creation of h264 mp4 mezzanine files (1920x1080), as .mov (2K) files cannot currently be handled; Lonny is making good progress on these files now (as of 9/13/23). Brian will need to link up the i2map mezzanine directories so they're registered in VARS.
9/20/23: Video files have been transcoded. Midwater annotators to resume localization/verification work. Brian and Kevin to work out scaling so that bounding boxes appear in the correct location (may require some experimentation; Danelle suggested storing the bounding boxes in normalized form, e.g. 0-1.0, which should avoid any future issues; see the sketch after these notes).
10/11: multiple frame grab timing mismatch issues persist. Still ON HOLD.
10/18: as of now, localization/validation tasks for i2map data can resume.
10/25: localizations are back ON HOLD due to 11K images with no dimensions registered in VARS, which are therefore being skipped in the gridview load.
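The normalized-coordinate idea from the 9/20/23 note, as a minimal sketch (function names are hypothetical; this is not the implemented fix):

def normalize_box(x, y, w, h, frame_w, frame_h):
    """Store a pixel box as resolution-independent 0-1.0 fractions."""
    return (x / frame_w, y / frame_h, w / frame_w, h / frame_h)

def denormalize_box(nx, ny, nw, nh, frame_w, frame_h):
    """Recover pixel coordinates for whatever resolution is being displayed."""
    return (nx * frame_w, ny * frame_h, nw * frame_w, nh * frame_h)

# A box saved against a 1920x1080 mezzanine maps cleanly onto a 3840x2160 source:
norm = normalize_box(960, 540, 100, 50, 1920, 1080)
print(denormalize_box(*norm, 3840, 2160))  # (1920.0, 1080.0, 200.0, 100.0)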
28 | Full paper trail for model training (added 10/18). Includes M3ML (central archive for storing trained models and metadata). | 12/31/2023 | .tar files (and other metadata/parameters?) from AWS saved with the model in MBARI's archive on titan | BS, DC | | documentation | 1: Before training next model | 10/18: identified a potential disconnect between the .tar files generated during training on AWS and the permanent archive of the metadata/paper trail on smb://titan.shore.mbari.org/M3_ML (MBARI's model archive), e.g., the record of class ID names with their corresponding class #s used with each model.
10/26 DC updated njs: Task #28 is a little fuzzy, and I need to coordinate with Brian and Lonny on this task to estimate.
11/8: Brian suggests putting these on M3ML (or verifying they're already there). The archive exists; still need to discuss what goes there, organization, naming, etc.
11/8: Lonny thinks this paper trail might not yet include things like: YOLO version, flags set in training, yaml files (names, where data live, and # of classes), etc. A sketch of such a record follows.
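A hedged sketch of what one per-model record on M3_ML might capture, based on the items listed above (every field name here is hypothetical; actual contents are still to be decided):

import json

manifest = {
    "yolo_version": "YOLOv5 v7.0",                # framework release used to train
    "training_flags": {"imgsz": 1280, "epochs": 100, "batch": 16},
    "data_yaml": "mbari-training-data.yaml",      # names and where the data live
    "num_classes": 2,
    "class_ids": {"0": "Bathochordaeus", "1": "Mitrocoma cellularia"},  # class # -> concept
    "aws_artifact": "model.tar.gz",               # the .tar saved from the AWS training job
}
print(json.dumps(manifest, indent=2))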
23 | Upgrade to Ultralytics (supports yolov5, yolov8, etc.) | ON HOLD | Intern Sabrina noted improved model performance when using yolov8: a small percentage improvement, but it might be worth it in the long run. Likely replaces the yolov5 code, so it will help us stay current. | DC, LL, KB | LL | training, ml proposals | 1: Before training next model (ON HOLD) | May not be prudent to wait on implementing. | ON HOLD. This needs to move later unless there is a compelling need; a few percent improvement may not warrant the shift. Reinstrumenting the docker images for YOLOv8 and testing are not likely until later this year given Danelle's current workload. Danelle has run out of time on the project for 2023; she will revisit first thing in 2024.

Recent discussion: develop a protocol outlining the requirements/impacts of switching to YOLOv8 (or any new model) so we're best prepared when the time comes (Kevin suggested he could assist with this). WE NEED TO HAVE BENCHMARKS BEFORE switching to any new model so we can measure performance. Please add this as a task for the video lab.
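A minimal benchmarking sketch along those lines, assuming the ultralytics package and a hypothetical MBARI validation split "mbari-val.yaml" (illustrative only, not our pipeline):

from ultralytics import YOLO

for weights in ("yolov5xu.pt", "yolov8x.pt"):    # candidate models, old vs. new
    model = YOLO(weights)
    metrics = model.val(data="mbari-val.yaml")   # same held-out split for both
    print(weights, "mAP50-95:", metrics.box.map) # compare before committing to a switch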
1 | Frame grab time mismatch | done | For training data generation, when there's not an existing frame grab, on-the-fly frame capture (Beholder) for M3 download has to be frame accurate as well. Test: make some localizations without saving the frame grab and review in RectLabel (where they won't be registered in VARS). (Kevin knows of this now; KRW emailed him.) https://github.com/mbari-org/vars-feedback/issues/56 | BS, KB | KRW, VL | validation, training data generation | 1: Before training next model | ML runs on the nearest preceding keyframes. Beholder just looks for the next frame. Sharktopoda 2 looks for the nearest keyframe. NEED TO determine what happens when a frame grab is made in Sharktopoda 2. Hopefully, existing frame grabs will be retained after the fix is in place...tbd. https://github.com/mbari-org/vars-feedback/issues/56 From Brian's 8/9/2023 email:
Recommendations:

1. Existing ML localizations generated in VARS should be moved so they are correctly aligned in the video. In theory, this can be done in either gridview or VARS/Sharktopoda, but some testing to verify this should be done.
2. Fix the off-by-one error in the ML-to-VARS import to remove the largest source of error (see the sketch after this list).
3. Investigate changes in Sharktopoda to improve frame-accurate seek and frame capture.
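For context on recommendation 2, a hedged illustration of the failure mode (the exact cause is tracked in issue #56; the numbers here are illustrative):

fps = 29.97
frame = 1000
correct = frame / fps            # 0-based frame index -> elapsed seconds
off_by_one = (frame + 1) / fps   # a 1-based index lands one frame late
print((off_by_one - correct) * 1000)  # ~33 ms error per frame grab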

9/13/23: Brian has discussed this with Paul Rogers, and he is working on it.
9/27/23: Brian is meeting with Paul this Friday for a status update.
ONGOING: there are numerous mismatch and loading errors that engineers are still working on.
2 | vars-gridview queries are failing | done | MW and Benthic concepts will not open in vars-gridview for various concepts. Example: Mitrocoma cellularia and Pennatulacea; we can't open an imageset for these in vars-gridview. Here is a GitHub issue for this problem: https://github.com/mbari-org/vars-gridview/issues/23 | KB, BS | LL, KRW, LML | validation, training data generation | 1: Before training next model | Issue with timecodes in VARS that have dashes instead of colons. Also look into optimizing file recall time. | FIXED in database 6/28/2023 (by Brian)
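For the record, a hedged sketch of the dash-vs-colon mismatch (the actual fix was a database cleanup; the pattern below is illustrative):

import re

def normalize_timecode(tc: str) -> str:
    """Rewrite an HH-MM-SS-FF style timecode to the expected HH:MM:SS:FF."""
    return re.sub(r"^(\d{2})-(\d{2})-(\d{2})-(\d{2})$", r"\1:\2:\3:\4", tc)

print(normalize_timecode("01-02-03-15"))  # 01:02:03:15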
3 | Localizations offset from RectLabel import | done | Localizations generated with RectLabel on letterboxed images were cropped and then imported onto uncropped framegrabs in VARS, resulting in offset boxes. Search by observer: expert-observer in gridview and investigate which concepts this is happening for. Note: We thought these had been addressed, but MW folks are finding more of these (i.e., krill molt). Query in gridview is failing, similar to #2 above. | KB, VL | KRW, KLS | validation, training data generation | 1: Before training next model | Kevin found about 5K localizations with the part-ingest tag that have the offset boxes. That tag is in JSON, so it is not retrievable in gridview now, but Kevin will add it to the filter so VL can review the full impact. Determine resolution once w.

Kevin: The 60 bounding boxes outside of the image dimensions were actually in-bounds; I was associating them with the wrong source video in my query.
DONE (Kevin, VL). Kris thinks she has cleaned up all the "expert observers" in Gridview. Kevin found about 60 others where the box is outside the bounds of the image; these can be deleted. For the others, Kevin will send a list to VL to determine what to do. KRW looked at vars-part-ingest as the generator to see all 5000 localizations that were cropped and pulled in; KRW checked all of these, a few dozen boxes needed correcting, and this is done.
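For reference, a hedged sketch of why letterboxing offsets boxes (the cleanup above was done by hand and queries; this function is hypothetical and just illustrates the geometry):

def unletterbox(x, y, pad_x, pad_y, scale):
    """Map a coordinate from a letterboxed image back to the uncropped frame.
    pad_x/pad_y are the letterbox margins; scale is the resize factor."""
    return ((x - pad_x) / scale, (y - pad_y) / scale)

# Boxes imported without removing the padding are shifted by (pad_x, pad_y):
print(unletterbox(420, 140, 0, 140, 1.0))  # (420.0, 0.0)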
14 | Preserve model proposal confidence value | done | Preserve the confidence value that the model gives to the localization or track, so we can sort by confidence in Gridview (probably stored in a JSON blob); there are many reasons to store confidence. | BS, KB | VL | ml detection/tracking proposal processing in VARS | 1: Before training next model | Lives in the bounding box association JSON, keyed by "confidence". | Completed 2023-11-06 (by Brian)
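A minimal sketch of using that stored value (the notes confirm only the "confidence" key inside the bounding box association JSON; the surrounding record shape here is hypothetical):

import json

associations = [  # hypothetical association records
    {"link_name": "bounding box", "link_value": '{"x": 10, "y": 20, "width": 50, "height": 40, "confidence": 0.91}'},
    {"link_name": "bounding box", "link_value": '{"x": 5, "y": 8, "width": 30, "height": 25, "confidence": 0.42}'},
]
boxes = sorted((json.loads(a["link_value"]) for a in associations),
               key=lambda b: b["confidence"], reverse=True)  # highest first, as in Gridview
print([b["confidence"] for b in boxes])  # [0.91, 0.42]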
5 | Local ML Service | TBD | Brian will put together a spec for Danelle for building a local service for ML processing. Need to ID a GPU to use for this; Duane & Danelle will investigate. Doug in IS may be adding a GPU that we could use, although it may not be the best idea to "share". Purpose of this service: comparative analysis, special projects, testing; it removes the gatekeeper present for AWS, so anyone can run inference for smaller jobs using their own application, e.g., inferences for images (Pythia). Could also be used shipboard. | BS, DC, DE | BS | For ad hoc analysis of individual files, enabling the team to quickly evaluate options and results as we continue to refine our models, VARS-ML workflow, etc. | 1: Before training next model
$12-$13K for GPU

7/19/2023 update: Brian has sent a spec to Danelle; she thinks it is a good idea.
Need use cases.
One use case would be to help compare model vs. manual annotations; see #24.
In progress. Related to adding hyperparameters to the workflow (#10); actually needs to come before this task. https://github.com/mbari-org/plutus for the archiving portion of the service.

9/13/23 In progress. DC has added a SQL light server to track parameters (full paper trail). Still looking for a GPU for this (or work with IS). We think, but need to test, that the Lambda system does has the same network drop issues experienced on Macs. VL should have the budget for this ($5K for a Lamda Tensorbook; maybe $15K for tower instead), once we finalize the decision.

9/20/23: Brian and Danelle met to review and discuss testing/integration. Could be ready to put on the new GPU in a week or so. Note: again, this system is primarily for inferencing and will likely not be able to handle large training jobs (smaller ones are OK). May purchase 2 new GPUs, if affordable models are selected, for doing other tasks; e.g., one could be used for transcoding jobs.

10/6/23: API deployed to a 1-GPU machine at maximilian.shore.mbari.org:8000/docs. Testing loading to the M3_TRACKS database.

1/18/24: What are the remaining tasks? Integration, testing, training, data management?
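The :8000/docs URL suggests a FastAPI-style service; a hedged sketch of calling it (the /predict route and its parameters are hypothetical, check the /docs page for the real API):

import requests

with open("framegrab.png", "rb") as f:
    resp = requests.post(
        "http://maximilian.shore.mbari.org:8000/predict",  # hypothetical route
        files={"file": f},
        data={"model": "yolov5-mbari", "conf": 0.25},      # hypothetical parameters
    )
resp.raise_for_status()
print(resp.json())  # e.g., a list of {concept, confidence, box} detections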
7 | Command to upload files -- upload files from linux | done (a few tasks to wrap/test) | Uploading of files: because the M3 mount disconnects sporadically from Macs, it may be best to use a Linux machine that won't time out; need to remote into a Linux machine to kick the job off, especially if the job is going to be HUGE (multiple dives) or the files are large (4K). | DC, DE | LL | ml detection/tracking proposal processing in AWS | 1: Before training next model | Best if this is done from an office with high-speed network support; not all offices have this. Can use a dedicated Linux VM with a high-speed connection (62 MB/s max). | In progress. Batch script as a starting point for this is at https://bitbucket.org/mbari/m3-scripts/src/master/bin/aws_process_dive.sh
Implementing on the MBARI virtual machine (VM) DORIS.
10/18: about a week's worth of work remains (DC is installing the elastic container service).
10/27 DC updated njs: Duane and I will do another working session tomorrow to do the final stages of testing the ecsprocess command on Doris, and we will report back to the team. https://docs.mbari.org/deepsea-ai/commands/process/
11/01: Testing continues. All of dive V4464 has been processed using deepsea-ai 1.24.6 ecsprocess on DORIS, all but one file of V4474 has been processed, and V4487 is currently being processed (20 of 54 files as of 16:00 PDT, processing 6 or so per hour). We will continue testing, evaluating the paper trail, and pulling results back to MBARI.
8 | Command to upload files -- one job ID# per dive | done | Ensure each individual dive has a "job ID #" and that we initiate only one dive per job. | DC | LL, BS | ml detection/tracking proposal processing in AWS | 1: Before training next model | DONE. Bash script as a starting point for this is at https://bitbucket.org/mbari/m3-scripts/src/master/bin/aws_process_dive.sh
9 | Command to upload files -- exclude _trashme files | done | Exclude files with the *_trashme extension from uploading to AWS (these files will eventually be removed from M3 and the VARS registry). | DE | KRW | ml detection/tracking proposal processing in AWS | 1: Before training next model | DONE. Added to the default script.
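A one-line sketch of the exclusion, assuming a hypothetical dive directory layout (the real filter lives in the default upload script):

from pathlib import Path

dive_dir = Path("/mnt/M3/mezzanine/Ventana/V4487")  # hypothetical mount/layout
uploads = [p for p in dive_dir.glob("*.mp4") if not p.stem.endswith("_trashme")]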
10 | Command to upload files -- hyperparameters | done | When kicking off a job, be sure that all hyperparameters below are available and that they work. For example, ensure that we can turn on agnostic-nms and adjust conf-thres and iou-thres.
Here's the list of flags we might want to use/modify per run (YOLOv5's detect.py argument list, shown self-contained):
import argparse
from pathlib import Path

ROOT = Path(".")  # in YOLOv5's detect.py, ROOT is the repo root
parser = argparse.ArgumentParser()
parser.add_argument('--weights', nargs='+', type=str, default=ROOT / 'yolov5s.pt', help='model path or triton URL')
parser.add_argument('--imgsz', '--img', '--img-size', nargs='+', type=int, default=[640], help='inference size h,w')
parser.add_argument('--conf-thres', type=float, default=0.25, help='confidence threshold')
parser.add_argument('--iou-thres', type=float, default=0.45, help='NMS IoU threshold')
parser.add_argument('--max-det', type=int, default=1000, help='maximum detections per image')
parser.add_argument('--classes', nargs='+', type=int, help='filter by class: --classes 0, or --classes 0 2 3')
parser.add_argument('--agnostic-nms', action='store_true', help='class-agnostic NMS')
opt = parser.parse_args()
DC | LL | ml detection/tracking proposal processing in AWS | 1: Before training next model | DONE: https://github.com/mbari-org/deepsea-ai/issues/22
9/20/23: Essentially done; can be utilized the next time training is done.
9/27/23: This is complete.
16 | Parts annotations/proposals | done | What to do about parts versus whole-organism annotations/proposals if we need to edit. Cannot edit body part localizations in VARS and Gridview (we think). Lonny collapsed them in training the Paull run. Long-term goal: consolidate in manual tracking and eventually have ML figure out that these boxes are connected to the same animal. Idea: global replace of the incorrectly configured body parts. Is there a list of these incorrectly named "concepts", especially if they are no longer being used? | BS, KB | VL | validation, training data generation | 1: Before training next model | VL to discuss IF we always want to exclude the parts from training data. Sometimes only a part of an animal is in view, so they could be useful to keep.

Kevin: part annotations are indeed editable in GridView; use the box to the right of the concept box.
DONE:
9/20/23: VL discussed excluding parts (except keeping all Bathochordaeus) from the next inferencing run; a sketch of that filter follows. Today Kevin sent a list of localized parts to VL to evaluate; need to decide whether it is worth doing a global replace to match the VARS format. Future: still need to discuss use cases and whether we want to develop a process for training on associations like "part-of", "contains", "feeding-on", etc.
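A minimal sketch of that agreed filter, assuming hypothetical record fields (the real exclusion happens during training-set assembly):

KEEP_PARTS_FOR = {"Bathochordaeus"}  # the one concept whose parts we keep

def keep_for_training(rec: dict) -> bool:
    """Drop part annotations from training data, per the 9/20/23 decision."""
    return not rec.get("is_part", False) or rec["concept"] in KEEP_PARTS_FOR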
22 | Increase frame size when training | done | Intern Sabrina noted improved model performance when using a larger frame size for training; it is especially better at detecting smaller objects. | DC | LL | training, ml proposals | 1: Before training next model (may not be prudent to wait on implementing) | DONE. This can be done with our current workflow, which supports any of the YOLO 1280x1280 models, e.g. yolov5n6.
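For illustration, a hedged sketch of training at the larger size via the ultralytics package (our actual workflow uses the YOLOv5 repo's train.py; the dataset yaml here is hypothetical):

from ultralytics import YOLO

model = YOLO("yolov5n6u.pt")  # a 1280-native P6 model
model.train(data="mbari-training-data.yaml", imgsz=1280)  # larger frames help small objects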
12 | Gaps in coverage of ML proposals | TBD 2024 | Need to identify a parameter that indicates whether there are gaps in coverage of annotations per unit of time; SEs to help us decide how best to monitor this (e.g., how long it took to process in AWS and/or a post-AWS-processing report, possibly a visual report like the timeline bar in VARS, or more of a table, similar to the existing dive timecode report for IMES work). | BS, KB | KRW, LML | ml detection/tracking proposal processing in AWS | 2: Before next detection/tracking job | First check to see whether they did not get loaded or whether there really were no detections. Would a scene change cause no detections? Larissa to get the team a list of files where this occurred. We may need to rerun these to figure out what happened. A gap-monitor sketch follows.
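A minimal sketch of the gap parameter, assuming annotation timestamps in seconds (the 60 s threshold is an arbitrary placeholder):

def coverage_gaps(timestamps, video_len_s, max_gap_s=60.0):
    """Return (start, end) spans longer than max_gap_s with no annotations."""
    edges = [0.0] + sorted(timestamps) + [video_len_s]
    return [(a, b) for a, b in zip(edges, edges[1:]) if b - a > max_gap_s]

print(coverage_gaps([5.0, 30.0, 600.0], 900.0))  # [(30.0, 600.0), (600.0, 900.0)]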
30 | DeepSea-AI Dashboard | HOLD | Snapshot/summary views of dives run through detection/tracking (from AWS or Maximilian) | DC | | out of current VARS-ML scope | Demonstrated at the 11/8/2023 meeting | 11/8: Discussed the current and potential utility/features of the dashboard. We felt the tools/features needed to train, process ML output, preview, validate, etc. have been scoped and developed (or are underway) in our existing VARS-ML workflow tools, i.e., the M3 model and output repositories on titan, the loadpath directly to VARS, gridview, Sharktopoda, the VARS UI, Doris, Maximilian, AWS, etc. At this time, we need to focus all VARS resource allocations on those existing tools.
11 | Full "paper trail" for inferencing. Includes titan DeepSea-AI archive, development, production (central archive for storing ML inferencing results). | 11/30/2023 (a few tasks to wrap/test, and train VL users) | Need a more detailed accounting of the entire workflow, a full "paper trail". For video tracks that have already been filtered and before they get loaded into VARS, need a report (table), per dive, of what we intended to upload (files on M3), what uploaded to AWS, and what AWS produced, highlighting diffs (i.e., all files should be uploaded and processed by AWS; if any were missed, this should be clearly identified in the table as a problem: bold red text or ?? to indicate a mismatch between intention and what actually got processed). | DC, DE | LL | ml detection/tracking proposal processing in AWS | 2: Before next detection/tracking job | Need to figure out 1) what's in the paper trail (dive, # of files, model used, hyperparameters, count of tracks per file) and 2) the best method/place to archive/share this metadata. Ideally, from the user's perspective, we have one place to view all steps that happened for a dive.
10/26 DC reported to njs: Task #11 is done with the monitor command, which we will test too. The plan is to have Duane trained to run everything. https://docs.mbari.org/deepsea-ai/commands/monitor/
11/8: actually isn't quite done yet; still need to clean up/organize the file structure & names, confirm contents, add a README, elevate metadata, train VL, etc., on the titan DeepSea-AI repositories.
12/13/2023 DRE: see smb://titan.shore.mbari.org/DeepSea-AI/production/reports/500055-videolab, where "paper trail" reports for each dive are archived. 2024 task, to be specified: improve the display of the "paper trail" and add required features. Also on doris in /data/reports/500055-videolab; e.g., see Ventana_Dive_V4487_20231201.txt.
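A minimal sketch of the per-dive reconciliation the task calls for, with illustrative file names (the real reports live under the paths above):

intended = {"V4487_01.mp4", "V4487_02.mp4", "V4487_03.mp4"}  # files on M3 we meant to upload
processed = {"V4487_01.mp4", "V4487_03.mp4"}                 # outputs AWS actually produced

for f in sorted(intended | processed):
    if f not in processed:
        print(f, "?? MISSING -- intended but never processed")
    elif f not in intended:
        print(f, "?? UNEXPECTED -- processed but never intended")
    else:
        print(f, "ok")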
21 | Adjust the confidence higher | done | Lots of 'junk' on this last run. Perhaps a higher confidence will help filter junk, but at the expense of missing a lot more things. Discuss the trade-off of doing this. Related to #29 below. | DC, LL, DE | KRW | ml detection/tracking proposal processing in AWS | 2: Before next detection/tracking job | 10/26 DC reported to njs: Task #21 is done. I looked up the processing parameters from the last large jobs, which are stored in the M3_TRACKS database, and that confidence value was 0.5. Lonny can decide what to use.
26 | Better understand/compare hyperparameter effects | done | Before running the next model, can we come up with a matrix of all of the different possible hyperparameters/conditions that can be set (confidence, frame size, code that deals with "junk", etc.) and strive to better understand/compare the outcomes when adjustments are made? See the grid sketch after this entry.
Get definitions. Identify the ones that may impact our outcomes. Adjust based on our needs. Ask Eric O. and Kevin B. to give us an overview.
KB, EO | KRW | VL understanding of model output | 2: Before next detection/tracking job | Kevin shared slides at the 10/18 meeting.
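A minimal sketch of such a matrix: enumerate the combinations to run and score on the same benchmark dives (values are illustrative, not recommendations):

from itertools import product

grid = {
    "conf_thres": [0.25, 0.4, 0.5],
    "iou_thres": [0.45, 0.6],
    "imgsz": [640, 1280],
    "agnostic_nms": [False, True],
}
for combo in product(*grid.values()):
    print(dict(zip(grid, combo)))  # 24 configurations to compare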
4 | Additional vars-gridview issues | 11/30/2023 | https://github.com/mbari-org/vars-gridview/issues | KB, DC | LL, VL | validation | 3: Before importing next ml proposals | Kevin completed multiple sort and query by activity and observation group (available after Kevin pushes out a new version). Multiple other features/bugs in the GitHub list remain. | In progress (Kevin).
10/26 DC reported to njs, re: Danelle's contributions: That task is to add a similarity sort to Gridview. We can discuss it with the group if I'm misunderstanding the use case, though, and elevate that in priority if needed. There is also the possibility that Kevin B. can do this work with some guidance from me.
6 | Knowledgebase in gridview is static | 11/30/2023 | This is a problem because we lose taxonomic name updates (e.g., Anthomastus versus Heteropolypus). Possibly exclude former and common names in the text output from the KB (or come up with a different solution actually using the API to the KB): either develop an API for current KB concepts, update the list weekly, or ?? | KB | VL | validation | 3: Before importing next ml proposals | Kevin: GridView's list of concepts is not static; it pulls the list of concepts from the KB every time you launch the application. VL wants GridView to accept any alias name, automatically enter the primary concept name, and carry that forward into VARS. | Done as of v0.3.17
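A minimal sketch of that alias-to-primary behavior (the mapping shown is illustrative; the real names come from the VARS KB at launch):

ALIASES = {"Anthomastus": "Heteropolypus"}  # former name -> current primary concept

def primary_concept(name: str) -> str:
    """Accept any alias and return the primary concept name stored in VARS."""
    return ALIASES.get(name, name)

print(primary_concept("Anthomastus"))  # Heteropolypus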
17 | Pending verifications (group with #20, "Observer is different after adjusting localization") | TBD 2024 | Pending verifications are difficult to pick out; need to determine a long-term workflow for these pending verifications (do we start a new field in the VARS db?).
Temporary workflow: activities other than unspecified will be changed from "ROV pending verification" to "ROV" (automatically, by Brian).
Highlight pending verifications in VARS and gridview, and have a button in each app that allows a user to 'verify' the ML proposal. Changing the annotation is one way to accomplish this, but we need an easy way to 'validate' when simply accepting the ML proposal.
TEMPORARILY: exclude pending verifications from Query while we are still in the testing phase (duration TBD). EVENTUALLY include them in Query (as the default setting), but make sure they are clearly marked as pending verification. Note: Group is not currently available as a return data column in VARS DataViz (it may have been called something different, or it wasn't something we used often back then).
Always exclude from training data.
BS, KB | VL | validation | 3: Before importing next ml proposals | 10/25: Discussed the need for VL to meet with KB to review the workflow in the lab with VARS open.
19 | Associations in gridview | TBD 2024 | Need to have a minimum set of associations available, if not all. At the very least identity-reference, and maybe | KB | LL, VL | validation | 3: Before importing next ml proposals
20 | Observer is different after adjusting localization | 11/30/2023 | When a person adjusts the box in gridview, the observer changes, and it then appears as if the annotation may have been validated when in fact that was not the intention. Issue discussed at the early-June meeting with Kevin. Needs clarification.
KB | LML | validation | 3: Before importing next ml proposals
Kevin: I've checked this several times; it does not seem to be the case. Will need more details on how to reproduce it if it is indeed happening.
Discussed at the 7/12 meeting. Kevin will look into this further.
18 | Sharktopoda 2 integration with gridview | 11/30/2023 | Sharktopoda 2 integration is not fully working.
Video is not queued up at the location of the ROI (#1 priority).
What interaction will there be between S2 and gridview? Can we edit boxes and add new localizations with S2?
KB | LL | validation | 3: Before importing next ml proposals
Kevin: May need some fixes in S2's UDP comms. GridView is not set up to create new annotations (we avoided this on principle in the design); it will require some overhauling if we want to add annotations.
In progress (Kevin).
10/18: Brian and Kevin enumerated additional improvements related to launching/maintaining the Sharktopoda connection (to eliminate the need to restart everything when going between VARS & Gridview) and a command (fetch video?) that causes a system crash if run too soon after starting up the app.
13 | All ML proposals imported | 11/17/2023 (nearly done) | We are thinking (as the default for new dives) we'd like to have all ML proposals imported even if there are existing annotations, i.e., NO blocking of ML annotations on imports; we humans will review and manage any duplicates in VARS. | BS | VL | ml detection/tracking proposal processing in VARS | 3: Before importing next ml proposals
29 | Preview ML output in Sharktopoda (before loading in VARS) | done | Accommodates testing models and workflows, so VL can review output before muddling up the VARS database with potentially undesirable/inaccurate data. | BS | | validation | 3: Before importing next ml proposals | 10/27, instructions from Brian:
"1. Download it.
2. Unzip it somewhere; you'll need Java 17+ installed.
3. `cd /location/of/mlviz`
4. Grab a zip file from the ML pipeline, e.g. from titan:/DeepSea-AI/production. They will have names like D1379_20210817T150446Z_h264.tracks.tar.gz
5. Open Sharktopoda.
6. Run a command to view it: `./view-vars-in-sharktopoda /path/to/D1379_20210817T150446Z_h264.tracks.tar.gz -c 0 -e "http://m3.shore.mbari.org/vam/v1"`

There are docs for it at https://mbari-org.github.io/m3-support/tools/sharktopoda.html. To view as frame accurate, set Sharktopoda's time window setting to 17 milliseconds. For quick viewing, set it to 100 ms or so.

This is a first cut, so if we need a change in how it works, just let me know."
Download link: https://drive.google.com/file/d/16zPzB-AZZrfcpFCvDjdgAZSr603skACS/view?usp=sharing
Code/docs:
https://mbari-org.github.io/m3-support/tools/sharktopoda.html
31 | Gridview jumping to full frame unexpectedly | done | The Gridview preview window snaps to full frame if you click inside the ROI. This is not the action we would expect. | KB | KLS | | 3: Before importing next ml proposals
15 | Choose the best localization | done | Need to choose the best, or at least a better, localization from a track to become the imported annotation. Currently the view is not the best, and often the first observation is only a part of the animal, like feet, tails, etc. The entire animal needs to be visible for it to 1) be useful as training data and 2) be viewable by the annotator in gridview and VARS. This may change the duration of the annotation, but at this time the need for a quality view is more important than the need for the entire duration (duration may not be accurate at this stage in our development anyway). | BS, DC | LL, VL | validation, training data generation | 3: Before importing next ml proposals | 10/18: I think Brian indicated the method for selecting the best localization is "ready"; can we confirm this is 100% done?
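For discussion, a hedged sketch of one selection heuristic (Brian's implemented method may differ; the record fields are hypothetical): prefer confident, large boxes that don't touch a frame edge, since partial animals usually do.

def best_localization(track, frame_w, frame_h):
    """Pick the proposal from a track most likely to show the whole animal."""
    def score(box):
        margin = min(box["x"], box["y"],
                     frame_w - (box["x"] + box["w"]),
                     frame_h - (box["y"] + box["h"]))
        return (margin > 0, box["confidence"], box["w"] * box["h"])
    return max(track, key=score)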
24 | Metrics of model vs. human | 12/15/23 (for demo and requirements discussion only) | Need to make at least a cursory comparison/metrics of what can be done/captured with the ML pipeline vs. human annotation. VAA track analysis code is a good place to start. | BS, KB, DC, EO, LL, KRW | NJS, VL | ML integration in general and how we discuss/compare it to previous human annotations. | 4: By end of this iteration | 10/18: Need to identify/set up infrastructure and a workflow to support comparisons. Tools/methods are widely available. Warrants deeper discussion. A cursory scoring sketch follows.
10/26 DC reported to njs: Task #24 is to measure performance. Brian and I have notebooks that can measure the performance in two ways. I'd estimate a day to remember how to run that.
11/8: VL needs to be brought up to speed on what exists and then have further discussion about what is needed in the future (may bring in others throughout the institute with ecology data/population comparisons: Monique, John R.).
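A deliberately cursory sketch of one such comparison, treating human annotations as ground truth and matching by IoU (Brian and Danelle's notebooks are the real method; boxes here are (x1, y1, x2, y2) tuples, and lists are assumed non-empty):

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def precision_recall(proposals, truths, thresh=0.5):
    """Crude one-to-many matching; fine for a first-pass comparison."""
    hits = sum(any(iou(p, t) >= thresh for t in truths) for p in proposals)
    found = sum(any(iou(p, t) >= thresh for p in proposals) for t in truths)
    return hits / len(proposals), found / len(truths)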
27 | Name for the instant or on-the-fly ML proposal button | done | Kyra proposes to call the button 'the Oracle'. | KLS, BS | KLS | Easy way to identify the button? | 4: By end of this iteration | DONE: yes?