Training (reporters from each table please fill in)

	A	B	D	F	H
1		Question 1: Who should CFDE target for training and why?	Question 2: What types of training activities could CFDE offer?	Question 3: What materials should be delivered in the training? Should we provide basic training in Data Science? Specific training on how to use data from a CF program? Training on how to combine data from across CF programs?	Question 4: How can we better coordinate and integrate training activities among the DCCs and partnership projects over the next year?

2	Table 1	Data scientists, basic scientists -- beginner to expert	Create a "stackexchange" knowledgebase for CFDE. Most data science needs I have are met by a stackexchange search.	A "stackexchange" site for CFDE. Leverage free data science courses. Unique content on combining CFDE resources should be a focus area.	Use self-organizing best practices for producing valuable answers on how to solve problems -- incentivizing contribution. Again, look at stackexchange and other knowledgebase models.
3		Industry scientists and academic scientists -- also for graduate students with thesis-ready use cases	Curated set of general resources, leveraging open courses -- then CFDE- and DCC-specific courses.	Tutorials, YouTube tutorial videos, Jupyter notebooks, R and Python notebooks.	Put money behind it.
4		CFDE members - to get immediate feedback, be our own beta testers	best practices in informatics and engineering	Focus on what only CFDE can do, leverage external resources for everything else. Do NOT focus on basic data science training.	Identify together who our users are and who can use training across DCCs
5		Both CFDE investigators and staff.	Focus on what only CFDE can do, leverage external resources for everything else. Do NOT focus on basic data science training.		Have hands-on sessions dedicated to one DCC at a time -- with all of the DCCs participating.
6		Graduate students from wet labs to better identify experimental targets.	training program for future DCC leaders -- potentially K-award		Hackathons end up being de facto training activities
7		"Data parasites" -- knowledge discovery using other people's data			Focus on making use cases doable (reproducible) in a short period of time by non-experts
8					Sharing people between projects (DCCs)
9					share/expose existing CFDE resources at DCCs
10
11	Table 2	Our training should focus on what data we have, which is something that crosses all experience levels. So I don't think we should focus on a particular experience level of users. Junior investigators are the ones that typically use the data, but the PI level people need to understand the data in order to direct their research. We should target both wet lab and computational biologists.	We should provide 1) pre-recorded videos, 2) live webinars, and 3) workshops (both in-person and virtual).	We should deliver training videos, presentations, and runnable coding environments (e.g., Jupyter notebooks). No, we should not provide training in basic data science -- that is a very large subject and there are likely organizations better able to do that than us. We should provide training on how to use CF data. Combining data across CF programs is an unsolved research problem with significant pitfalls. We cannot provide a recipe for how to do that, but we should present pitfalls and considerations.
12		Researchers who are interested in CF datasets.	1) Datasets that are available 2) How to access those datasets 3) Examples on how to use the datasets and available pipeline and tools	How to use the data from the CF program and how to leverage other CF datasets to combine them for analysis	Keep it simple. Can we define one or two training events to which we can contribute?
13		Graduate students, postdocs, and MD/PhD trainees. Less frequently established PIs.	Presentations that are overviews of data. How to use the CFDE portal. Hands-on workshops for using CF data in cloud environments like Terra.	We may need to provide basic training in data science, but only in the context of CF datasets.	Central tracker of CFDE training activities
14		Pharma/industry researchers	Train on use of APIs versus web interface.	We should provide API documentation.
15		Medical trainees	Training with the use case collection. What is missing from the CFDE portal?
16		Other CFDE DCCs	Simple cloud workshops with one or two of the datasets.
17		Why: People who are going to make use of the datasets and need them	What happens behind the scenes? For example, how is the data normalized?
18		Train course instructors (e.g., professors).	Hackathons.
19
20	Table 3	All members of internal CFDE to be aware of what other groups are doing.	Academic course in such topics as a clinical modeler.	Training videos in a centralized location, perhaps on a YouTube channel. Ensure a logical order/grouped modules in which the videos should be watched.	Working group: develop a plan for moving forward
21		Other NIH programs that have cross-sectional research activities. So they can learn more about what groups are offering, can pull together data, etc.	Workshops	Tutorial documents	Training hackathon
22		External users for awareness, perspective, and feedback loop.	Web-based platform training	Jupyter notebooks	Research what communities can/should be served; gathering feedback
23		Ensure PMs from all domains are aware of the process.	Hackathons with targeted challenges that result in a prize.	Place content or tutorials on a site such as biostars.org
24		Data scientists at universities to get them interested. Develop courses around the data.	Create a forum where teams can present on what they're doing. Solicit researchers to speak.	Longer term: development of academic courses (e.g. SPARC - RFA-DK-20-020)
25		Bioinformatics, Clinical informatics students, Physician scientists	Series of courses/modules.	Coursera trainings that provide a certificate of completion.
26			Train the trainer (like Software and Data Carpentry philosophy) where trainers can take training back to their local institution	FAIR training material - ability to utilize material developed by the DCCs from, for example, a CFDE training portal
27
28
29	Table 4	Clinicians and practitioners / translational medicine	Online learning courses	Videos	Money
30		Patients (appears to be a demand)	Hackathons (solve a demo problem)	Scrolli maps	Coordination group
31		Junior researchers (build a demand)	Pathways to answer common questions / data types and how they connect to others to answer questions	MOOC (Coursera, Canvas) - Massive open online courses	Focus on partnership projects
32		PhD students (build a demand)	Workshops face-to-face (become MOOCs)	Jupyter notebooks	Having dedicated training courses for integrating multiple datasets
33		Medical students (build a demand)		R markdown documents
34		Project communities (focus on domains, problems)
35		Data scientists with biological interest
36		Bioinformatics
37		Database providers
38
39
40	Table 5	Thought: define a cohort to track over time (e.g., medical students)	"See one, do one, teach one" model similar to medical school.	Specific training is important to empower the trainee to perform the types of analyses he/she cares about.	Use metrics similar to those used by T32s
41		Teach the teacher: the impact of training bioinformaticians may multiply.	We're better off to be training people to teach in order to scale	Generic training for common platforms (RStudio, Python) is good if it can be connected to relevant data sets.	Focus on outcomes such as empowering a class of users to recapitulate results published in model papers.
42		Inserting CFDE data into existing training programs.	Summer schools - common in other STEM fields	Recapitulate papers with (at least partial) use of CF data - materials to support this. e.g. https://allisonhorst.github.io/palmerpenguins/ or https://datacarpentry.org/genomics-workshop/	Focus on the humans as the product of training -- i.e. coordinate amongst the trainees
43		Explicitly targeting students who are NOT affiliated with the DCCs	Master's or PhD or postbac students
44		Masters-level programs.	A T32 to train students to use CF data
45		Educational settings that themselves lack the local infrastructure for data generation/analysis	Recapitulating analyses published in model papers.
46		location-independent "data scholars" https://datascience.nih.gov/data-scholars-2022	"Project-based-learning" models
47		Leverage the CFDE to expand the data science workforce	Using Google Docs
48
49	Table 6	New DCCs, any interested user	Jamborees (collaboration fest!), workshops, webinars, video walk throughs	Yes	identify commonalities
50		Identify community of practice, to identify the users and chart their user journey	Matchmaking events	Training to onboard the new DCCs - Newest DCCs need to understand CFDE consortium structure	share training strategies
51		Analytical modelers	Online MOOCs	how they submit data to the central repository;	shared repository of materials
52		Data scientists, developers, implementers, clinicians, etc.		what is expected from them and what they should expect from the consortium	exchange communities of practice, visiting internships, grad student diaries for credible practices
53		Next generation: grad students, postdocs, ECR, ESI		Online tutorials, interactive learning environments, all of the above
54
55	Remote	Grad students (develop data driven hypotheses)	MOOCs	Q3 (SPARC): it would be key to offer some basic training in graph databases and leverage distillery to combine knowledge from across CFDE DCCs	training working group
56		High school students (learn about the science of tomorrow)	Training programs where undergraduate students spend 8-10 weeks in a DCC to complete a project	Q3 (SPARC) contd: I support the idea of Education Games mentioned in Q2: have goal-driven questions to trace complex but biomedically interesting cross-graph paths	design a joint online course
57		Faculty - PIs (learn about new methodologies and resources)	Office hours	There are already a lot of great materials about training in Data Science and Bioinformatics out there, so it might be better to just point to them.	build upon what Titus' team developed so far
58		Bioinformaticians and Data Scientists (learn about tools and datasets that they can leverage)	summer internships / programs for grad students	Crowdsourcing challenges (?)	Have a system where existing training materials are announced, perhaps in the weekly newsletter.
59		Postdocs - good opportunity to introduce Common Fund to labs/groups that haven't worked with CF data yet	workshops at relevant conferences	There are already quite a few basic data science trainings. If we are going to create another basic training, it needs to use CF data for examples, ideally multiple.	Q4 (SPARC): this might make a strong case for a training partnership to be supported by NIH
60		Bioinformaticians and Data Scientists (learn about tools and datasets that they can leverage)	Online course for grad students with lectures from representatives from each DCC	Training on the tools CFDE is creating and maybe examples of how they can be applied	Q4 (SPARC) contd: This would require considerable planning for such a partnership in the current year.
61		Medical Doctors (learn about the future of deep phenotyping)	Educational games (?)	User manuals and video tutorials about CFDE products	joint workshop presentations/trainings that show how tools are unique but complementary
62		The General Public (learn about how their taxes are spent to advance medicine)	Q2 (SPARC): Navigational training: given the very extensive breadth of CFDE data coverage, pre-planned searches that introduce the user to the rich detail of our coverage.
63		Computer Science Undergrads (apply programming skills to analyze real biomedical data)	Q2 (SPARC): prep scenarios that frame a "quest" for data answering a well-defined biomedical question (find tissue location of drug transporter involved in disease X)
64		Experienced data scientists have indicated they find short videos helpful to orient/ learn new tools
65		Trainees need to see value in training. If you can demonstrate a good use case for a tool/resource and publicize (publish?) you can attract interest and design training around it.
66		resources for training within an MD-PhD program