| A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | AA | AB | AC | AD | ||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | Question 1: Who should CFDE target for training and why? | Question 2: What types of training activities could CFDE offer? | Question 3: What materials should be delivered in the training? Should we provide basic training in Data Science? Specific training on how to use data from a CF program? Training on how to combine data from across CF programs? | Question 4: How can we better coordinate and integrate training activities among the DCCs and partnership projects over the next year? | |||||||||||||||||||||||||||
2 | Table 1 | Data scientists, basic scientists -- beginner to expert | Create a "stackexchange" knowledgebase for CFDE. Most data science needs I have are met by a stackexchange search. | A "stackexchange" site for CFDE. Leverage free data science courses. Unique content on combining CFDE resources should be a focus area. | Use self-organizing best practices for producing valuable answers on how to solve problems -- incentivizing contribution. Again, look at stackexchange and other knowledgebase models. | ||||||||||||||||||||||||||
3 | Industry scientists and academic scientists -- also for graduate students with thesis-ready use cases | Curated set of general resources, leveraging open courses -- then CFDE- and DCC-specific courses. | Tutorials, YouTube tutorial videos, Jupyter notebooks, R and Python notebooks. | Put money behind it. | |||||||||||||||||||||||||||
4 | CFDE members - to get immediate feedback, be our own beta testers | best practices in informatics and engineering | Focus on what only CFDE can do, leverage external resources for everything else. Do NOT focus on basic data science training. | Identify together who our users are and who can use training across DCCs | |||||||||||||||||||||||||||
5 | Both CFDE investigators and staff. | Focus on what only CFDE can do, leverage external resources for everything else. Do NOT focus on basic data science training. | Have hands-on sessions dedicated to one DCC at a time -- with all of the DCCs participating. | ||||||||||||||||||||||||||||
6 | Graduate students from wet labs to better identify experimental targets. | training program for future DCC leaders -- potentially K-award | Hackathons end up being de facto training activities | ||||||||||||||||||||||||||||
7 | "Data parasites" -- knowledge discovery using other people's data | Focus on making use cases doable (reproducible) in a short period of time by non-experts | |||||||||||||||||||||||||||||
8 | Sharing people between projects (DCCs) | ||||||||||||||||||||||||||||||
9 | share/expose existing CFDE resources at DCCs | ||||||||||||||||||||||||||||||
10 | |||||||||||||||||||||||||||||||
11 | Table 2 | Our training should focus on what data we have, which is something that crosses all experience levels. So I don't think we should focus on a particular experience level of users. Junior investigators are the ones that typically use the data, but the PI level people need to understand the data in order to direct their research. We should target both wet lab and computational biologists. | We should provide 1) pre-recorded videos, 2) live webinars, and 3) workshops (both in-person and virtual). | We should deliver training videos, presentations, and runnable coding environments (e.g., Jupyter notebooks). No, we should not provide training in basic data science -- that is a very large subject and there are likely organizations better able to do that than us. We should provide training on how to use CF data. Combining data across CF programs is an unsolved research problem with significant pitfalls. We cannot provide a recipe for how to do that, but we should present pitfalls and considerations. | |||||||||||||||||||||||||||
12 | Researchers who are interested in CF datasets. | 1) Datasets that are available 2) How to access those datasets 3) Examples of how to use the datasets and the available pipelines and tools | How to use the data from the CF program and how to leverage other CF datasets to combine them for analysis | Keep it simple. Can we define one or two training events to which we can contribute? | |||||||||||||||||||||||||||
13 | Graduate students, postdocs, and MD/PhD trainees. Less frequently established PIs. | Presentations that are overviews of data. How to use the CFDE portal. Hands-on workshops for using CF data in cloud environments like Terra. | We may need to provide basic training in data science, but only in the context of CF datasets. | Central tracker of CFDE training activities | |||||||||||||||||||||||||||
14 | Pharma/industry researchers | Train on use of APIs versus web interface. | We should provide API documentation. | ||||||||||||||||||||||||||||
15 | Medical trainees | Training with the use case collection. What is missing from the CFDE portal? | |||||||||||||||||||||||||||||
16 | Other CFDE DCCs | Simple cloud workshops with one or two of the datasets. | |||||||||||||||||||||||||||||
17 | Why: People who are going to make use of the datasets and need them | What happens behind the scenes? For example, how is the data normalized? | |||||||||||||||||||||||||||||
18 | Train course instructors (e.g., professors). | Hackathons. | |||||||||||||||||||||||||||||
19 | |||||||||||||||||||||||||||||||
20 | Table 3 | All members of internal CFDE to be aware of what other groups are doing. | Academic course in such topics as a clinical modeler. | Training videos in a centralized location, perhaps on a YouTube channel. Ensure a logical order/grouped modules in which the videos should be watched. | Working group: develop a plan for moving forward | ||||||||||||||||||||||||||
21 | Other NIH programs that have cross-sectional research activities. So they can learn more about what groups are offering, can pull together data, etc. | Workshops | Tutorial documents | Training hackathon | |||||||||||||||||||||||||||
22 | External users for awareness, perspective, and feedback loop. | Web-based platform training | Jupyter notebooks | Research what communities can/should be served; gathering feedback | |||||||||||||||||||||||||||
23 | Ensure PMs from all domains are aware of the process. | Hackathons with targeted challenges that result in a prize. | Place content or tutorials on a site such as biostars.org | ||||||||||||||||||||||||||||
24 | Data scientists at universities to get them interested. Develop courses around the data. | Create a forum where teams can present on what they're doing. Solicit researchers to speak. | Longer term: development of academic courses (e.g. SPARC - RFA-DK-20-020) | ||||||||||||||||||||||||||||
25 | Bioinformatics, Clinical informatics students, Physician scientists | Series of courses/modules. | Coursera training that provides a certificate of completion. | ||||||||||||||||||||||||||||
26 | Train the trainer (like software and data carpentry philosophy) where trainers can take training back to their local institution | FAIR training material - ability to utilize material developed by the DCCs from, for example, a CFDE training portal | |||||||||||||||||||||||||||||
27 | |||||||||||||||||||||||||||||||
28 | |||||||||||||||||||||||||||||||
29 | Table 4 | Clinicians and practitioners / translational medicine | Online learning courses | Videos | Money | ||||||||||||||||||||||||||
30 | Patients (appears to be a demand) | Hackathons (solve a demo problem) | Scrolli maps | Coordination group | |||||||||||||||||||||||||||
31 | Junior researchers (build a demand) | Pathways to answer common questions / data types and how they connect to others to answer questions | MOOC (Coursera, Canvas) - Massive open online courses | Focus on partnership projects | |||||||||||||||||||||||||||
32 | PhD students (build a demand) | Workshops face-to-face (become MOOCs) | Jupyter notebooks | Having dedicated training courses for integrating multiple datasets | |||||||||||||||||||||||||||
33 | Medical students (build a demand) | R markdown documents | |||||||||||||||||||||||||||||
34 | Project communities (focus on domains, problems) | ||||||||||||||||||||||||||||||
35 | Data scientists with biological interests | ||||||||||||||||||||||||||||||
36 | Bioinformatics | ||||||||||||||||||||||||||||||
37 | Database providers | ||||||||||||||||||||||||||||||
38 | |||||||||||||||||||||||||||||||
39 | |||||||||||||||||||||||||||||||
40 | Table 5 | Thought: define a cohort to track over time (e.g., medical students) | "See one, do one, teach one" model similar to medical school. | Specific training is important to empower the trainee to perform the types of analyses he/she cares about. | Use metrics similar to those used by T32s | ||||||||||||||||||||||||||
41 | Teach the teacher: the impact of training bioinformaticians may multiply. | We're better off to be training people to teach in order to scale | Generic training for common platforms (RStudio, Python) is good if it can be connected to relevant data sets. | Focus on outcomes such as empowering a class of users to recapitulate results published in model papers. | |||||||||||||||||||||||||||
42 | Inserting CFDE data into existing training programs. | Summer schools - common in other STEM | Recapitulate papers with (at least partial) use of CF data - materials to support this. e.g. https://allisonhorst.github.io/palmerpenguins/ or https://datacarpentry.org/genomics-workshop/ | Focus on the humans as the product of training -- i.e. coordinate amongst the trainees | |||||||||||||||||||||||||||
43 | Explicitly targeting students who are NOT affiliated with the DCCs | Master's or PhD or postbac students | |||||||||||||||||||||||||||||
44 | Masters-level programs. | A T32 to train students to use CF data | |||||||||||||||||||||||||||||
45 | Educational settings that themselves lack the local infrastructure for data generation/analysis | Recapitulating analyses published in model papers. | |||||||||||||||||||||||||||||
46 | location-independent "data scholars" https://datascience.nih.gov/data-scholars-2022 | "Project-based-learning" models | |||||||||||||||||||||||||||||
47 | Leverage the CFDE to expand the data science workforce | Using Google Docs | |||||||||||||||||||||||||||||
48 | |||||||||||||||||||||||||||||||
49 | Table 6 | New DCCs, any interested user | Jamborees (collaboration fest!), workshops, webinars, video walk throughs | Yes | identify commonalities | ||||||||||||||||||||||||||
50 | Identify community of practice, to identify the users and chart their user journey | Matchmaking events | Training to onboard the new DCCs - Newest DCCs need to understand CFDE consortium structure | share training strategies | |||||||||||||||||||||||||||
51 | Analytical modelers | Online MOOCs | how they submit data to the central repository; | shared repository of materials | |||||||||||||||||||||||||||
52 | Data scientists, developers, implementers, clinicians, etc... | what is expected from them and what they should expect from the consortium | exchange communities of practice, visiting internships, grad student diaries for credible practices | ||||||||||||||||||||||||||||
53 | Next generation: grad students, postdocs, ECR, ESI | Online tutorials, interactive learning environments, all of the above | |||||||||||||||||||||||||||||
54 | |||||||||||||||||||||||||||||||
55 | Remote | Grad students (develop data driven hypotheses) | MOOCs | Q3 (SPARC): it would be key to offer some basic training in graph databases and leverage distillery to combine knowledge from across CFDE DCCs | training working group | ||||||||||||||||||||||||||
56 | High school students (learn about the science of tomorrow) | Training programs where undergraduate students spend 8-10 weeks in a DCC to complete a project | Q3 (SPARC) contd: I support the idea of Education Games mentioned in Q2: have goal-driven questions to trace complex but biomedically interesting cross-graph paths | design a joint online course | |||||||||||||||||||||||||||
57 | Faculty - PIs (learn about new methodologies and resources) | Office hours | There are already a lot of great materials about training in Data Science and Bioinformatics out there so it might be better just to point to them. | build upon what Titus' team developed so far | |||||||||||||||||||||||||||
58 | Bioinformaticians and Data Scientists (learn about tools and datasets that they can leverage) | summer internships / programs for grad students | Crowdsourcing challenges (?) | Have a system where existing training materials are announced, perhaps in the weekly newsletter. | |||||||||||||||||||||||||||
59 | Post docs - good opportunity to introduce Common Fund to labs/ groups that haven't worked with CF data yet | workshops at relevant conferences | There are already quite a few basic data science trainings. If we are going to create another basic training, it needs to use CF data for examples, ideally multiple. | Q4 (SPARC): this might make a strong case for a training partnership to be supported by NIH | |||||||||||||||||||||||||||
60 | Bioinformaticians and Data Scientists (learn about tools and datasets that they can leverage) | Online course for grad students with lectures from representatives from each DCC | Training on the tools CFDE is creating and maybe examples of how they can be applied | Q4 (SPARC) contd: This would require considerable planning for such a partnership in the current year. | |||||||||||||||||||||||||||
61 | Medical Doctors (learn about the future of deep phenotyping) | Educational games (?) | User manuals and video tutorials about CFDE products | joint workshop presentations/trainings that show how tools are unique but complementary | |||||||||||||||||||||||||||
62 | The General Public (learn about how their taxes are spent to advance medicine) | Q2 (SPARC): Navigational training: given the very extensive breadth of CFDE data coverage, pre-planned searches that introduce the user to the rich detail of our coverage. | |||||||||||||||||||||||||||||
63 | Computer Science Undergrads (apply programming skills to analyze real biomedical data) | Q2 (SPARC): prep scenarios that frame a "quest" for data answering a well-defined biomedical question (find tissue location of drug transporter involved in disease X) | |||||||||||||||||||||||||||||
64 | Experienced data scientists have indicated they find short videos helpful to orient/ learn new tools | ||||||||||||||||||||||||||||||
65 | Trainees need to see value in training. If you can demonstrate a good use case for a tool/resource and publicize (publish?) you can attract interest and design training around it. | ||||||||||||||||||||||||||||||
66 | resources for training within an MD-PhD program | ||||||||||||||||||||||||||||||