Unless the NSF guidelines state otherwise, assume each section to be 3-5 pages. We’ll adjust for length after the first rough draft is done.
CIF21 IGERT: Creating The New Scientist - Training Graduate Students in Open Science and Informatics
(This table is copied right off the call for proposal page)
Previous IGERT Experience (Award #, Role)
University Libraries, University of New Mexico
Informatics, Open research
Physics and Astronomy, University of New Mexico
0504276, IGERT Fellow
University of New Mexico
University of New Mexico
Civil Engineering, University of New Mexico
GIS, hydrology, water resources
Freshwater Sciences IGERT (DGE 9972810), Faculty Advisor
Biology, University of New Mexico
University Libraries, University of New Mexico
Office of Experimental Program to Stimulate Competitive Research,
Biology, University of New Mexico
Current research practices in most fields are still rooted in practices developed in the mid-20th century, while technology has raced ahead. The NSF is progressively taking a positive stance toward open research practices. These practices are unfamiliar and often frightening to researchers, and the scientific community has failed to embrace the shift in culture. Although researchers in STEM fields increasingly collaborate across institutions and internationally, they are often hampered by their lack of skills with current cyber-technology such as social media, which is built around collaboration. To compound this, many researchers lack skills in data management, data curation, intellectual property management, online tools for collaboration, and marketing. Simply put, the culture of science needs to change, and education is the mechanism to drive this change.
In this proposal we describe a program to train new researchers to better participate in this innovative scientific culture. We are organizing the program around the data life cycle with an emphasis on open research and collaboration as our guiding models. Our goal is to train researchers who are able to fully take advantage of all aspects of today’s dynamic research environment and collaborate successfully in ways we cannot currently foresee. Students will also learn how to participate fully in an open data environment. They will learn the skills necessary to use existing data as fully as possible, and also gain the tools to maximize the reuse of their own data. They will be able to work in trans-disciplinary environments using the most up-to-date tools, and they will have the knowledge to choose the correct tool for the job.
Our program will consist of two core courses and two elective courses. The core classes, 1) Data Management and Curation and 2) Collaborative and Open Research in Practice, will give the students practical, hands-on experience in data management and collaboration with an emphasis on open research conduct and practices. The electives will serve to give students additional experience and education applicable to a diverse set of disciplines.
We believe this program is unique in that we are essentially ignoring disciplinary boundaries and focusing on training students to be good data stewards, naturally collaborative and trans-disciplinary, and willing participants in open research. In addition, they will be learning, and in many cases developing, cutting-edge tools for mining data across disciplines in ways that aren’t possible with current educational standards.
In accordance with the University structure, students will be awarded degrees from individual departments. Although the advisor will be in the same department as the degree, the PhD committee will include another faculty member engaged in this program. To ensure interdisciplinary work, three departments shall be represented on each PhD committee. For example, a PhD student working with Dr. Julie Coonrod will receive a PhD in Civil Engineering researching topics related to hydrology, hydraulics, or water resources engineering. If the dissertation considered potential climate change impacts to river restoration projects, the data would be immense. The student’s committee might include Dr. Rob Olendorf, who could provide expertise in the area of open science/data, a faculty member from Biology, and another from Earth & Planetary Sciences. The result would be students with the tradition
This program has the potential to serve as a model for changing the way graduate students are trained. The graduates of this program will be the new innovators, bridging disciplines and fields in new, unpredictable ways to create new and exciting discoveries we may not even be able to dream of right now. They will be trained not only to use and synthesize existing data, but also to contribute high-quality, reusable data themselves. They will also learn to become leaders in the emerging open research culture.
We are building our program around three themes which capture the cultural and environmental shifts confronting researchers in all fields: Data, Collaboration and Openness. By giving students training in these areas, we will prepare them to enter the new research world that is developing.
The data lifecycle is frequently used as a tool to develop policies, procedures and educational modules for data management, and we are using it here as the organizational paradigm for this program. Although there are probably hundreds of different manifestations of the data lifecycle, we are choosing the simple one depicted here (Figure 1). One important aspect of the data lifecycle is its cyclical nature. The collect phase can be either creating new data, or collecting previously created data and creating new data from that.
We assume that each of the four major activities shown here is composed of sub-activities which are often depicted in other lifecycles. Our goal is to give students training in each aspect of the life cycle, with special emphasis on giving experience in those aspects that are particular to today’s open and online environment and are likely not learned in their respective programs. Also, while the life cycle shows four distinct activities, in practice the activities can often overlap. For instance, it is possible to disseminate one’s data in real time as they are being collected, so that dissemination can occur at any point in the life cycle. Below we outline some of the essential new skills a researcher should have to fully participate in the new research landscape.
Traditionally, data collection has been synonymous with creating new data sets. With automated processes, remote sensing, high-throughput genomics and any number of other technologies, even a single researcher generating new data can rapidly find herself managing huge data sets composed of hundreds of thousands of data objects and requiring massive amounts of storage.
Although reusing data is not a new concept, changes in technology have created an environment where data reuse can be accomplished at even greater scales. For instance, in the field of genomics, researchers routinely mine the NCBI database for data to use. Also, the web and social networks provide a rich source of data for a variety of fields. There are more and more data repositories every day, enough that we can consider ourselves to be in a repository deluge as well as a data deluge.
The collection phase is critical to the life cycle, as problems here can often be irreversible and propagate through the other stages and further iterations of the life cycle. Researchers must have excellent data management skills, yet these skills are rarely taught. Additionally, they often require more extensive programming skills than in the past. If researchers are going to be using existing data for their work, experience with data mining, machine learning and other recently developed computational techniques can be very useful.
Also, when reusing data, issues of intellectual property rights can manifest themselves, requiring researchers to have familiarity with this field. Lastly, the methods and tools used for extracting the data, the formats and structure of the data and other issues of provenance need to be documented if the data is going to be reused in the future. This is often best accomplished by the creators of the data, but researchers are often unfamiliar with current metadata standards and inexperienced with workflows to enable them to document this information easily.
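As one illustration of how provenance might be captured at collection time, the following is a minimal sketch in Python rather than a prescribed workflow. The function name, the record fields and the example URL are all assumptions made for illustration; a real project would follow whichever metadata standard its repository requires.

```python
import hashlib
import json
from datetime import datetime, timezone


def build_provenance_record(data: bytes, source: str, tool: str, file_format: str) -> dict:
    """Describe where a dataset came from and how it was obtained.

    The field names are illustrative, not taken from any metadata standard.
    """
    return {
        "source": source,
        "extraction_tool": tool,
        "format": file_format,
        "size_bytes": len(data),
        # The checksum lets a later user confirm the file is unchanged.
        "sha256": hashlib.sha256(data).hexdigest(),
        "retrieved_utc": datetime.now(timezone.utc).isoformat(),
    }


# Made-up bytes standing in for a downloaded data file.
example_data = b"site,date,flow_cfs\nA,2012-01-01,410\n"

record = build_provenance_record(
    example_data,
    source="https://example.org/river-gauges.csv",  # placeholder URL
    tool="manual download",
    file_format="text/csv",
)
print(json.dumps(record, indent=2))
```

Storing a record like this next to the data file means the documentation is created by the person who collected the data, at the moment the details are still known.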
After the data has been collected it must be processed. At the bare minimum this involves careful quality control routines to verify the accuracy and integrity of the data. However, if data is to be meaningfully shared, then these processes must be carefully documented. If the dataset is composed of a synthesis of other data sets, then the accuracy and quality of the joins must also be established and documented. Again,
The techniques used to analyze data will often depend on the data and research being conducted. In the past, these procedures were often poorly documented, relying on brief descriptions of statistical procedures in published manuscripts. As a result, attempts to replicate these analyses often fail due to inadequate documentation. Therefore, the procedures and analyses should be documented step by step, ideally with well-documented scripts, workflows or other tools. Students will learn the most up-to-date analytical methods in their fields as well as tools and techniques to allow them to fully document their work to increase replicability and the level of transparency in their work.
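To make the idea of a step-by-step documented analysis concrete, here is a small hedged sketch in Python. The function `summarize`, its `trim_outliers_above` parameter and the measurement values are hypothetical; the point is only that the script returns the parameters it used alongside the results, so the two cannot drift apart.

```python
import json
import statistics


def summarize(values, trim_outliers_above=None):
    """Return summary statistics together with a record of how they were computed."""
    # Keep every value when no threshold is given; otherwise drop values above it.
    kept = [
        v for v in values
        if trim_outliers_above is None or v <= trim_outliers_above
    ]
    return {
        "parameters": {"trim_outliers_above": trim_outliers_above},
        "n_input": len(values),
        "n_used": len(kept),
        "mean": statistics.mean(kept),
        "median": statistics.median(kept),
    }


# Made-up measurements; 40.0 stands in for a likely data-entry error.
measurements = [4.1, 3.9, 4.3, 4.0, 40.0]

report = summarize(measurements, trim_outliers_above=10.0)
print(json.dumps(report, indent=2))
```

Because the report states how many values were supplied and how many were used, a reader can see that one observation was excluded and under which rule, which a one-line methods description often omits.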
The trend toward sharing data puts increasing demand on researchers to not only make their data available, but just as importantly to make their data understandable and usable. Traditionally, if researchers shared their data it was via request. This limited the value of the data to the lifetime of the researcher’s ability to store the data, the researcher’s memory and the researcher’s lifetime. However, as we move towards depositing data in various repositories, supplying sufficient information (metadata) with the data to ensure that it is understandable to others is crucial. This includes data capture during the previous stages of the life cycle, as well as adding additional information to help in finding, indexing and further describing the data.
As noted above, the amount of data is growing exponentially along with the number of repositories. This makes finding the desired data a bit like finding a needle in a haystack. Researchers who are skilled not only in creating, maintaining and annotating their data, but also in marketing their data, will see their careers benefit while at the same time furthering the progress of research.
Researchers typically have little training in this aspect of the data lifecycle. However, as data becomes more important, data reuse and citation will have similar impacts on a researcher’s career as publication and citation rates have now. Researchers who are skilled at disseminating their data will see their data used more often. Graduates from our program will learn skills to allow them to better share their data by providing higher quality metadata, finding the best outlets for their data and understanding issues surrounding intellectual property rights that go with sharing data.
While the data lifecycle is useful for mapping how we can interact with data, another, perhaps even more important, trend is the increasingly collaborative and trans-disciplinary nature of research. In many fields of science, it is not unusual for there to be 5-10 authors on a publication, and the average number of authors per PubMed citation has been growing steadily.
While much of this collaboration has historically been among close colleagues, often at the same institution, the tools and services now exist online to allow collaborations to occur anywhere. In fact, shared data can be thought of as a special case of asynchronous collaboration. Collaboration provides a number of advantages. Perhaps the greatest advantage is creating a group with a skill set greater than any single person can have. This allows the team to tackle questions that could otherwise not be addressed. Just from the discussion of the data lifecycle, it is clear that any substantial research project requires a large number of skills just to manage the data. If we add questions that are interdisciplinary by nature, the number of required skills increases accordingly.
Being able to collaborate online drastically increases a researcher’s pool of possible collaborators but also presents a number of problems. These include but are not limited to
Most researchers are unaware of the many tools, services and best practices developed for collaboration. For instance, in the software industry, there are a large number of tools and methods for allowing software programmers to work on the same project remotely. Some of the largest open source projects, such as Drupal, rely on these methods. The Agile development method, for instance, provides a framework for teams to rapidly develop robust software, but can be adapted for a variety of project types. Versioning services such as GitHub and Bitbucket allow software developers to share code and develop code collaboratively. Even tools such as Google Drive, SharePoint and Dropbox can be used effectively. Researchers in other fields are often poorly trained to use these tools and others, relying on traditional tools such as emailing files, phone calls, etc.
The biggest need for quality scientific research is reproducibility. By providing an open experimental record, scientists can guarantee reproducible results not just for future students in the home lab, but for researchers anywhere on the planet. A completely open approach also provides data transparency and provenance and gives researchers access to the entire experimental record. In the end this promotes scientific integrity and researcher accountability, which leads to fundamentally sound science and produces better researchers.
Currently there aren’t many adopters of “open science” (as proponents refer to it). There are a few reasons for this. First is the common fear among researchers that by releasing their data, they will lose publication opportunities when others publish using their data before them. While this is not an unreasonable fear, the culture of research already strongly supports crediting others when using their ideas, and this culture is almost certainly applicable to data as well. Second, and one of the focuses of this proposal, is that many researchers don’t have the skills to adequately attend to their data to make it usable by others. This is especially true when creating complex data sets. Lastly, they don’t see a benefit to spending the time and resources needed to properly manage, curate and disseminate data. Addressing this requires a cultural shift in which the publication of data is considered for career advancement just as traditional publications are today. Unfortunately these fears and limitations promote an environment where scientific creativity is stifled and collaboration is hindered. The fears are so entrenched that many scientists believe providing open information will actually hinder their scientific career. One of the goals of this program is to educate new researchers and give them the tools to at least partially offset these real and perceived costs and risks to open research.
Issues with patenting research may persist, but US patent laws currently allow for open research, and patent offices at UNM support the movement with the use of services such as provisional patents. Scientists also fear that by publishing openly, their data, techniques, and ideas are unsafe, but there are copyright protections currently in place that prevent data misconduct. In addition, Creative Commons has developed a series of licenses that scientists can attach to their scientific products to determine how those products may be used.
The only way to end the misconceptions about open research in science is through education and training of researchers to give them the tools to practice open research safely and effectively. This IGERT proposal would produce both of those outcomes. Students will be educated on open research practices and tools and proper data management techniques. They will also be trained to conduct ethical open research in their home labs and in a broader environment, the internet. Because of their training, students will be aware of the impact of open research and be more likely to adopt these principles in their own future labs. As such, the IGERT would be producing supporters of open research who would educate others and turn them into advocates for open science.
This IGERT will fund students while training them in the concepts outlined above. The courses we outline below will give students real experience in data management, data curation, open research and trans-disciplinary collaboration. We expect that this program will not only give them the skills to do this, but also give them the conceptual basis to further the fields of data management, data curation and open research. Additionally, they will learn that, with the right skills, taking the time to be good stewards of their data is a benefit to their career. Finally, they will have gained experience in working broadly across disciplines and created relationships with their cohort and others that will allow them to be the new open trans-disciplinary researchers of the future.
The UNM Libraries recently developed a new Informatics Program. The program is intended to train librarians and other information professionals in the most modern practices and techniques in data and information management. The faculty involved with this effort also make data management, data services and open science part of their research efforts. Lori Townsend, Robert Olendorf and Kathleen Keating are teaching in this program.
In the 2012 Spring semester, several faculty, including several from the University Libraries, taught a broadly interdisciplinary class, “Women Water and Work”, that received funding from NSF. The class included faculty from Earth and Planetary Sciences, American Studies, Civil Engineering and History, as well as several faculty from the library. The class explored issues of gender equality and water access and how they interact, from a variety of viewpoints as represented by the faculty. In addition, the students learned about data use from the Library faculty and applied all they learned in interdisciplinary team projects. Julie Coonrod and Robert Olendorf were both instructors for this class.
The UNM Libraries have merged with the Organization of Learning and Information Technology (OLIT), whose mission it is to research and teach methods for knowledge management and education, primarily to corporations and businesses. This partnership is unique among libraries and will provide a broad and interdisciplinary point of view to data intensive research, data management and collaboration.
Open Notebook Research is the practice of publishing one’s primary research record online in real-time or as near real time as possible. This may include providing open access to raw and/or analyzed data, methods, and any other associated experimental details. Typically any documentation that would go into a paper research notebook could be supplied via open access. Anthony Salvagno is a leading open notebook scientist at the University and internationally. His notebook is well maintained in real-time providing access to the entire research record from pre-planning experiments, methods, data, to conclusions and includes access to both failed and negative data which is generally not published by scientists. His research is available at http://research.iheartanthony.com. He also teaches best practices and available tools for open notebook science at conferences and online via his notebook and other science blogs. He taught Physics Electronics Junior Lab in the Spring of 2012, which brought the open notebook experience to the labs performed in the class. The class open notebooks can be found at http://2012juniorlab.iheartanthony.com. The educational experience from this class will be applied to the capstone course detailed in section 3.B. Students will be trained to be efficient open notebook scientists and encouraged to participate in whatever capacity works for their research.
In addition to open notebook research, Anthony Salvagno and Robert Olendorf are researching and practicing open manuscript preparation and open grant writing. In fact this proposal is being written openly and we actively seek input from the outside community. Additionally, we are currently exploring and comparing a variety of open platforms for scholarly communication and expect to publish a manuscript on it by the end of the year.
Open access data repositories like figshare.com are being looked at as viable platforms to support publications. Scientists can cite raw data and supplemental data from these repositories in their open notebooks and journal publications. Anthony Salvagno uses figshare to host his experimental data, both raw and processed, and works with figshare to ensure the usability of the software for the academic community. Figshare would provide internship opportunities to the IGERT should this proposal be funded. Rob Olendorf is currently helping to develop a repository system for UNM that would provide an open platform for data and research at the University. This repository will be maintained up to the standards taught in the IGERT curriculum and be managed according to the latest NSF data management guidelines. IGERT trainees would be expected to contribute their data to the repository. The source code for the repository would be available online for download so that it can be implemented at universities around the country.
Video protocols allow researchers to share their methods exactly as they are demonstrated by a research group. Websites like BenchFly.com provide free uploads and open access and are specifically targeted to the scientific community. The Journal of Visualized Experiments (JoVE) is a publishing platform that is quite similar to BenchFly, but has the additional benefit of providing a researcher with a traditional publication credit, as video articles are peer reviewed. Anthony Salvagno frequently works with BenchFly to expand the role of video in the scientific process by documenting experimental setups and providing tutorials. BenchFly would provide internship opportunities to the IGERT should this proposal be funded.
Science blogging is becoming a powerful form of outreach for scientists. Blogging in general has really taken off as a communication medium in the past 10 years. For science, blogging provides researchers a platform to convey their experiments to the general public and help bridge the gap between labs and the public. Many labs are publicly funded, yet a lot of the research undertaken around the country is a mystery. Often, people do not know about research until there is a press release or a magazine article written about the research, and this is usually reserved for potentially high impact research. Science blogging brings more mainstream research into the public eye and provides better access to publicly funded information and studies. Anthony Salvagno uses his open notebook as a science blog by sharing all of his research at a general audience level. He is also undertaking efforts to create a science blog community at UNM that would give graduate students an outlet to share their research both locally and globally. Anthony also teaches students effective communication techniques for blogging and would incorporate this into the IGERT program’s Special Seminar (see section 3.B). The IGERT program would maintain a blog (by the program coordinator and the trainees) to keep the community involved in the outreach efforts of the trainees and to provide a media outlet for the research of the trainees.
Crowdfunding has been developed recently and used primarily to fund entrepreneurial ventures as an alternative to venture capital and seed funding methods. A very new movement has begun that applies crowdfunding to scientific research as an alternative to the traditional grant process. Using the web as an outlet, researchers can reach a very diverse and broad audience to essentially let the public decide the merits of the research presented. Not only does this provide an effective medium for scientific publishing, but it gives the public a more intimate relationship with the research they are directly funding, further bridging the gap between scientists and the community. And while any scientist can participate in the variety of crowdfunding platforms, open scientists should embrace the medium. Anthony Salvagno successfully participated in the #Scifund Challenge, raising over $2000 to fund publicly accessible and open research. As part of the IGERT curriculum, students will be trained to understand the nature of crowdfunding and the role that outreach via the web, social media, and personal networks play in supporting this culture. Those who want to pursue this avenue of funding will also be encouraged and trained to participate in crowdfunding for science endeavors.
Data Observation Network for Earth (DataONE) is the foundation of new innovative environmental science through a distributed framework and sustainable cyberinfrastructure that meets the needs of science and society for open, persistent, robust, and secure access to well-described and easily discovered Earth observational data. DataONE is one of the leaders in facilitating greater openness in research in the environmental sciences.
DataONE provides a variety of data services geared towards the environmental sciences spanning a large number of disciplines. DataONE not only acts as a data repository, but also is actively engaged in developing tools for researchers to use in data management as well as providing education.
Bill Michener is the lead PI on the DataONE grant funded by NSF. Robert Olendorf has participated in several workshops to develop best practices and educational modules for DataONE.
The UNM Libraries are developing a suite of services aimed at helping researchers with data management, data curation and data archiving. This emerging suite of services is integrating the University Libraries with the research culture of UNM and starting to create an environment of trans-disciplinary research. The IGERT program will be integral to the development of this suite and will be a core feature of the program.
Graduate students are increasingly working in a data-intensive research environment, yet they are not receiving the conceptual and hands-on training that allows them to effectively design, manage, analyze, visualize, preserve, and reuse data and information. Additionally, the opportunities for collaboration online allow for levels of interdisciplinarity never before possible. However, traditional graduate education focuses almost entirely on creating discipline specialists and seldom gives students training in data practices or collaboration. Even multi-disciplinary programs typically still focus on narrow fields of research. Additionally, although data sharing and reuse is increasing in many fields, students do not receive training in how to fully use these resources or to adequately manage and document their own data for reuse, especially if data is to be useful across disciplines.
The required courses were developed with input from academia, research networks, and Federal agencies to address the information management needs of the 21st Century researcher. Students will receive extensive hands-on experience with all aspects of the data lifecycle: from managing data files and creating databases and web portals, through analysis and visualization, to managing and providing open access to data. Additionally, they will learn tools and practices to facilitate greater collaboration in their careers.
The framework for the courses will be open access and other academics can reuse, develop or enhance the curriculum. After completing the sequential course work, students will be at a significant competitive advantage as they pursue further academic and professional efforts.
Three core graduate-level courses will be required, with an optional fourth class for more in-depth student specialization. Students will also be required to take a 1-credit ethics course and a 1-credit seminar every semester they are enrolled in the IGERT program. Each course also includes lecture, lab and discussion time to provide the necessary context and direction for achieving the course’s student outcomes. Below are the three required course descriptions, objectives and student learning outcomes. Courses are listed in the order they will be taught.
Collaboration and Open Research: An interactive course in which students work on a team project and employ all aspects of open research. This includes open lab notebooks, open data and open access to research findings. Projects will be developed by teams of diverse individuals with the final product being a research proposal to be submitted for the competitive research challenge or other funding agencies.
The student’s final project will be a research proposal suitable for submission to either the internal competitive research award funded from this grant or potentially any national funding agency. It is hoped many of these proposals will become active research projects for the students.
Introductory hands-on course on information management and the data life cycle for multi-disciplinary arts and sciences and engineering, with emphasis on data acquisition technologies, metadata, QA/QC procedures, data preservation, database management, web portal development, and open access.
A hands-on course in quantitative and qualitative data analysis and visualization, data exploration, tool assessment, and creation of effective visual representations of analytical results. A special emphasis is placed on using tools to document the analyses and visualizations to facilitate replication of results.
INFO XX Ethical Issues of Online Collaborative Research (1)
The web is quickly becoming a very social network. Social media, blogs, tools, and other sites allow people across the world to interact in a rapidly evolving manner. Science is beginning to move toward this model as well, but there is a large usage gap amongst researchers, which may lead to a lack of education in using online scientific information. This 1 credit hour discussion course will be held every other year and will guide students and IGERT trainees on ethical uses for online science. Students will discuss topics with examples provided and be expected to apply their knowledge in the real world. Issues discussed will include but will not be limited to:
INFO XX Weekly Seminar Series (1): Students will participate in a weekly seminar every semester during their IGERT enrollment. The seminar will introduce special topics not covered in the core curriculum classes and feature speakers from around the university and country. The seminar will cover topics in informatics, open data and open research. Methods and design of systems for communication of research, methods of data analysis and developing research reports will also be covered. Students will develop the necessary graphical, oral, and written communication skills. Students will also learn valuable career development skills. Topics will vary each semester and include:
INFO 506 (3) Metadata: A hands-on course about describing and structuring information. Students will learn XML and XSLT, and will develop a thorough understanding of current metadata standards as well as crosswalking metadata schemas and data management applications.
INFO 520 (3) Introduction to Spatial Data Management: This course builds upon the foundations of information practice with an emphasis on spatial data. Students will survey essential methods for evaluating, accessing, organizing, storing and securing spatial data and information.
INFO 522 (3) Information Modelling: A practical course where students will learn how to model real world systems as object-oriented models, relational databases, XML schema and ontologies. Students will learn the fundamentals of data modelling.
CE 547 (3) GIS in Water Resources Engineering: Principles and operation of geographic information systems using ArcGIS; work with surface and subsurface digital representations of the environment considering hydrologic and transportation processes. Course project is required.
LEAD 503 (3) Data-Informed Instructional Leadership: Development of instructional programs, human resources, and organizational improvements should be grounded in data, both qualitative and quantitative. Explores conceptual and practical approaches to analyzing data to improve schooling.
BIOM 564 (3) Biomedical Informatics in Clinical and Translational Research: This course covers information technology tools and biomedical informatics strategies to optimize collection, storage, retrieval, and intra-/inter-institutional sharing of quantitative and qualitative data in support of clinical and translational research.
CS 515 (3) (Also offered as ECE 515) Scientific and Information Visualization: Introduction to scientific and data visualization techniques. Topics: data manipulation, feature extraction, visual display, peer critique of project design, data formats and sampling, geometric extraction, volume visualization, flow visualization, abstract data visualization, user interaction techniques.
LAW 589 (2-3) Information, Technology and Law (awaiting course description from William Jackson, School of Law Registrar)
NURS 727 (3) Health Care Innovations and Informatics: This course focuses on nursing informatics. Topics include: (1) introduction to nursing informatics; (2) health care informatics applications; (3) evidence-based decision support; (4) information systems design; and (5) new opportunities and emerging trends. Restriction: admission to D.N.P. program.
BIOL 518 Ecological Genomics (3): Emerging role of genomics in ecological sciences; genomic approaches to ecological research; application of ecological theory to genomics.
BIOL 544 Genomes and Genomic Analysis (3): Overview of genomic analyses from DNA sequence to gene expression and proteomics.
BIOL 592 Introductory Mathematical Biology (3): Application of mathematics to models of biological systems, from genes to communities. Emphasis placed on broadly-applicable concepts and qualitative solution techniques. Laboratory exercises introduce students to MATLAB programming.
CJ 507 Quantitative Data Analysis (3):
CRP 582 Graphic Communications (3): An introduction to hand drawing and graphic techniques. Students will become comfortable in expressing and communicating design thinking and ideas in graphic form.
CRP 583 Introduction to Geographic Information Systems (GIS): Overview of GIS capabilities in the context of community issues and local government. Includes direct manipulation of ArcView software, lectures, demonstrations and analysis of urban GIS applications.
CS 529 Introduction to Machine Learning (3): Introduction to principles and practice of systems that improve performance through experience. Topics include statistical learning framework, supervised and unsupervised learning, Bayesian analysis, time series analysis, reinforcement learning, performance evaluation and empirical methodology; design tradeoffs.
CS 564 Introduction to Database Management: Introduction to database management systems. Emphasis is on the relational data model. Topics covered include query languages, relational design theory, file structures and query optimization. Students will implement a database application using a nonprocedural query language interfaced with a host programming language.
STAT 527 Advanced Data Analysis I (3): Statistical tools for scientific research, including parametric and non-parametric methods for ANOVA and group comparisons, simple linear and multiple linear regression and basic ideas of experimental design and analysis. Emphasis placed on the use of statistical packages such as Minitab® and SAS®. Course cannot be counted in the hours needed for graduate degrees in Mathematics and Statistics.
STAT 528 / 428 Advanced Data Analysis II (3): A continuation of 527 that focuses on methods for analyzing multivariate data and categorical data. Topics include MANOVA, principal components, discriminant analysis, classification, factor analysis, analysis of contingency tables including log-linear models for multidimensional tables and logistic regression.
The key innovative feature of this IGERT proposal is the emphasis on open research. Education in this area is sorely lacking, and as a result real-world career opportunities exist but are not well known. This IGERT proposes to fund travel and lodging for fellows, while continuing to pay their stipends, at several locations around the world that host industry leaders in open access tools.
The Public Library of Science (PLOS) is a nonprofit publisher with open access among its core principles. It publishes several high quality journals, and PLOS One, specifically, provides researchers a platform for publication with an emphasis on high quality science without preconception of importance (high impact). PLOS has committed to providing IGERT fellows internship opportunities that would allow students to understand the publishing side of science. In addition, PLOS One is looking to incorporate new advances in web technology to increase the efficiency and reach of scientific publishing, and students would take part in this development process.
BenchFly is an open access video repository that specializes in video protocols for life scientists. With competitors such as YouTube, SciVee, and JoVE, BenchFly needs to maintain a marketing advantage, and working with scientists from around the globe has allowed it to be a leading provider of digital scientific protocols. BenchFly has committed to this IGERT proposal and is willing to host students for internships both electronically and on location. The electronic option allows students to remain at UNM while getting real-world experience in many facets of digital communication for online science. The on-location option will allow students to work with BenchFly in areas that would not be accessible electronically.
Each year starting in year 2, students will participate in the Open Research Challenge. The Challenge will provide students $50,000 to develop a cutting-edge tool, strategy, or other product that will enhance open research. Teams of students, led by the IGERT fellows, will submit proposals that are centered on innovation in open research. The proposals must include: impact on open research, a data management plan, a budget, and a project plan to ensure the success of the proposal. Students will be guided on the proposal process during the capstone course for the educational component of the program. Proposals will be judged by the IGERT Steering Committee along with leaders in open research efforts from around the country. The Challenge will be open to students from any university as long as at least one IGERT fellow is on the proposal team.
The organization and management plan is intended to provide a responsive and transparent mechanism for meeting the needs of the IGERT fellows, their faculty advisors and other participating faculty.
The administrative groups are designed to be inclusive and responsive groups that can work effectively to establish the program and continue to adjust it to meet the needs of the program fellows. The functional groups are the Program Director, a Management Group, an Internal Steering Committee and an External Advisory Committee.
The University Libraries are developing an Informatics Program and see themselves as a natural hub for training graduate students in informatics, open research and data sharing. The Dean of the Libraries sees this IGERT as a natural extension of the Informatics Program and is committed to its existence after IGERT support has ended. Along with the courses currently offered or in development, the Libraries will work to develop courses as needed. A number of faculty in the Libraries will be teaching the courses. Lori Townsend and Robert Olendorf, both PIs on this grant, are teaching courses that are part of this IGERT program. The University Libraries will also provide space as needed to house and maintain the computers to be purchased for this program (See Letters of Support). Additionally, the University Libraries will share space with the Scholarly Communications department to house computers and have space for meetings and classrooms to be used as necessary.
The Assistant Vice President of Research views this program as necessary for the development of truly interdisciplinary and open research. The research office also typically returns a portion of the Facilities and Administration (F&A) generated from IGERT grants back to the program and is committed to continuing the practice, thereby increasing the impact of the program (See Letters of Support). The returned funds will go toward, but will not be limited to, student travel for conferences and educational workshops (food, room rentals, etc.).
Many faculty at UNM are enthusiastic supporters of this program and are excited to have their graduate students participate (see Letters of Intent for examples). We believe acceptance by the faculty is crucial to our success as open research, data curation and other concepts are often not widely accepted by more traditional faculty. The success of this program will help change these attitudes.
Robert Olendorf was recently awarded a grant to fund collaboration between Los Alamos National Laboratory (LANL) and the University of New Mexico. The university is allowing some portion of this grant to be used to facilitate open science collaboration between LANL and UNM (See Letters of Support).
The Graduate and Professional Student Association and the Office of Graduate Studies provide additional funding opportunities for research and travel to all graduate students in addition to any funds we provide to students.
A necessary and important component of any training program is an assessment strategy for determining the effectiveness of the program. Haviland (2002) details assessment approaches that have proven useful in previously funded IGERT programs. Active involvement of Graduate Trainees in the assessment process is particularly important and we will include input from IGERT Trainees. We will also combine contributions from different sources to achieve a more communal model of learning and assessment. These sources will include the faculty in charge, faculty collaborators, external advisors, and the fellows.
In collaboration with Dr. Lisa Broidy, Director of the Institute of Social Research, we will devise an assessment tool that will involve quantitative measures of success for evidence-based outcomes. The evaluation will concentrate on continuous program assessment and improvement as well as the measurement of the impact of the program on the project participants. The formative assessment tools and the tests assessing the success of the program will be designed, administered and analyzed in collaboration with the Institute of Social Research. Surveys of the student fellows, their advisors, and participating faculty and staff will be based on those used by successful IGERT programs at the University of New Mexico and the University of Washington, and will be modified to reflect the specific needs of this IGERT program in collaboration with the Institute of Social Research, the IGERT faculty and key project staff.
The surveys will be conducted bi-annually. Any needed corrective action will be a responsibility of the project’s management team and steering committee. A critical role of assessment is to characterize and evaluate the effectiveness of the program in terms of its goals and outcomes, and to provide feedback to program administrators to ensure continual improvement (Berk, 1990).
The summative evaluation will measure the effectiveness of the program in terms of individual outcomes for the students as well as program level outcomes. The effectiveness in terms of individual outcomes will be measured by following the academic progress of the fellows, their professional development and career tracks. The outcomes of the fellows will be compared against those of comparable non-IGERT students in similar programs at the University of New Mexico. We expect that the inclusion of an informatics, interdisciplinary collaboration and open research program will expand students' vision of potential careers and also the number and types of opportunities open to them. Program level outcomes will include recruitment and retention statistics, also compared to similar non-IGERT students at the University of New Mexico.
How many individual trainees will be supported over the course of the 5-year
How many years of funding will trainees
2 years per student
How will students be supported when they are no longer receiving NSF IGERT funding?
Research Assistantships, Teaching Assistantships
From which departments or programs will NSF-funded IGERT trainees receive their
Physics, Biology, Chemistry, School of Engineering, Computer Sciences, Mathematics, Organization for Learning and Information Technology
List the institutions at which any NSF supported IGERT trainees will be enrolled.
University of New Mexico
Incoming and first year PhD students will be recruited from all departments of the university for the Informatics program, although only STEM students are eligible for IGERT funding. Students applying for IGERT funding will submit a written proposal about their research and how the IGERT program would enhance their PhD education. They would also submit a letter of intent stating how they will incorporate open access into their research, and their PI would provide a letter of support in this endeavor. The final step in the recruitment process would be an interview either in person or via the internet (video chat) to discuss the requirements of the IGERT program. The Co-PIs of this grant will serve on a steering committee for student recruitment and program admittance.
Successful applicants will be funded for two years, covering their first and second years of graduate study in a PhD program. Eligible departments (STEM programs) at the University of New Mexico will be contacted to raise awareness of the IGERT program and to increase interest in the program. Students will be placed under the advisement of faculty of diverse origins. Co-PI Anthony Salvagno is of Hispanic descent and Co-PI Lori Townsend is an American Indian, a member of the Shoshone-Paiute tribes, and we plan to recruit a diverse group of supporting faculty members.
Workshops will be held before and during the application process to give students a better understanding of what IGERT funding will require of them and how they can improve their chances of acceptance.
Graduate student organizations will also be included in outreach efforts to reach a broad and diverse audience specifically targeted toward STEM students. The organizations that this proposal will reach include (but are not limited to):
The IGERT will support students beginning their graduate career in new labs and provide faculty advisor aid for these new students. The program will provide services to students as they adjust to the rigors of PhD level coursework and research.
Anthony Salvagno (Co-PI) is a former Nanoscience and Microsystems (NSMS) IGERT student and will provide students guidance during the IGERT program. Faculty office hours will be made available several times a week, and impromptu meetings can be arranged as well. Communication between students and faculty will also be emphasized, both through traditional means and through modern ones, publicly and privately, e.g., email, an online IGERT group website, and social media.
A weekly seminar series will be conducted to provide students a forum to develop skills crucial to their career both in graduate school and after graduation (see section 3.B).
The funding proposed provides students more freedom in their education to pursue ideal research projects. It also relieves pressure on PIs because PI grant money is not being used to support the students. This allows a more organic development and places students in an environment where they are more likely to succeed.
Through the graduate student mentoring proposed above we hope to retain a larger proportion of students than would be retained without the aid of the IGERT funding. Because several of the advising faculty are themselves from underrepresented groups, students from those groups will be advised by faculty whose experiences they can relate to.
Many of the skills we will be teaching in this program will increase the students' productivity, making their graduate experience more positive and rewarding. For instance, the data management class will give students tools to manage and preserve their data more effectively, reducing the time they spend on data management. Teaching multi-disciplinary collaboration will expose students to new ideas that could be transformative to their work. Additionally, by increasing their pool of collaborators, students are likely to increase their publication rate as well as the quality of, and satisfaction gained from, their work, leading to increased retention.
Anthony Salvagno was a fellow of the Nanoscience and Microsystems (NSMS) IGERT program, award #0504276.
The NSMS IGERT funded students for 2 non-consecutive years and focused on nanotechnology research in an interdisciplinary setting. Students were required to have two mentors from different disciplines (for example Biology and Physics), and would perform a semester-long rotation in three different disciplines. This training would fulfill the interdisciplinary aspect of the IGERT.
The NSMS IGERT featured a curriculum of five courses based on principles of nanotechnology. One of the courses covered the ethics of nanotechnology. In addition to the class component, the IGERT held a monthly “Nano Cafe” seminar. The seminar featured two student speakers and one faculty presentation and provided students a forum to develop their presentation skills. The core courses for the IGERT became the core courses for an NSMS PhD degree program.
The NSMS IGERT provided many opportunities for outreach. Students were provided with funding for travel to conferences in each of their two years of funding. Students also worked in the local community mentoring junior high and high school students with science fair projects, performing science demonstrations at elementary schools, the local museum Explora!, and Sandia National Labs, and traveling to Native American reservations in NM educating students about the science of nanotechnology.
Julie Coonrod was a faculty advisor and participant in the Freshwater Sciences IGERT (DGE 9972810). Dr. Coonrod (Civil Engineering) co-advised IGERT funded student Dianne McDonnell with Dr. Cliff Dahm in Biology. Dr. McDonnell was able to complete her research with funding that Dr. Coonrod received from the United States Bureau of Reclamation. Dr. Coonrod had a cohort of the Freshwater IGERT students take her ‘GIS in Water Resources Engineering’ class and was active in student committees, seminars, and workshops.
This IGERT proposal hopes to incorporate many of the same basic principles and expand upon them while providing educational standards for data management and open research which are two necessary and emerging skills for modern research.
Data is interdisciplinary by nature. Every field of research acquires, analyzes, outputs, and stores data. As such, students from diverse disciplines should be trained to understand the varieties of data, how to manage the data, and how to interpret it from an array of perspectives. Students will be required to perform a rotation in one other lab to become familiar with various data types and to foster a collaborative course of study. In addition, students will be mentored by two faculty advisors from different disciplines.
This program differs from the NSMS program because students will be trained and encouraged to engage in collaborative research. The courses featured in this program will be tailored toward collaborative research practices both in the physical lab and online environments. The students will also participate in projects built around collaboration, including the Open Research Challenge, which will require teams of students from different disciplines to develop ideas that push the boundaries of open online research practices and tools.
Much like the NSMS IGERT, this IGERT proposal will feature courses that will integrate with a newly developed program that is maintained by the University Libraries. The curriculum features three required courses that specifically address techniques in data management and data analysis with an emphasis on open research. The capstone course features tools to conduct open research and provides an environment for team building and collaborative research. Students will develop proposals for submission to the Open Research Challenge, which will provide additional funds for students to carry out their research.
This IGERT will provide a new form of outreach. In addition to live student demonstrations at local schools, in underrepresented communities, and at community science events, trainees will engage a broad audience online. Trainees will work with the Project Coordinator to maintain a blog about the program. Educational materials from workshops and the class will be open access and published online. The data repository created for the University will be released as open source for implementation at other institutions. The Open Research Challenge will fund the development of tools or research that can enhance online science and open research practices, and the results will be open access.
Because one of the PIs is a former IGERT fellow, he will be better equipped to guide students through the program and can provide a much more tailored mentorship/advisor role.
By its very nature, this proposal aims to provide a framework for international collaboration. Online scientific information is accessible from anywhere on the planet, and by encouraging students to provide content online via the University repository and open notebooks, we hope to give students access to a broader form of collaboration than is possible through typical laboratory networks.
This program also features a study abroad program in collaboration with La Universidad Técnica Particular de Loja (UTPL) in Ecuador. The IGERT would pay for travel and lodging. We propose to offer students two options of study with this collaborator:
Another available international opportunity is provided by figshare. Students would travel to England for 3 months to participate in activities to expand the platform's current capabilities. They would explore the framework for a commercial data repository, collaborate with scientists globally, become familiar with alternative metrics for scientific publication, and understand critical aspects of online scientific information firsthand. The IGERT would pay for travel and lodging while continuing to pay their stipend. An added benefit to this opportunity is that the United Kingdom is a global leader in providing open access to publicly funded information, and students will get to be at the forefront of policy development for this innovative movement. They will have access to valuable knowledge that could vault the United States to the forefront of open access science policy.
The University of New Mexico has a strong record of attracting well qualified graduate students and minorities into graduate programs. Previous IGERT programs, such as the Nanoscience and Microsystems program (NSMS IGERT), the Integrating Nanotechnology with Cell Biology and Neuroscience program (INCBN IGERT) and the Freshwater Interdisciplinary Doctoral Program have been successful at recruiting and retaining a diverse body of qualified students.
Total graduate enrollment at UNM increased 14.88% from 2007 to 2011, including a 3.15% increase from 2010 to 2011. The greatest increases have been in Engineering (13.55% over five years) and Water Resources (17.02%). Arts & Sciences graduate programs have held steady (http://registrar.unm.edu/reports--statistics/fall2011oer.pdf).
The UNM graduate student body has traditionally been very diverse and promises to remain so. As seen in Table 9.1, UNM has a particularly strong Hispanic and Native American presence. The undergraduates especially represent a potentially rich source for recruitment that might be more apt to focus on STEM research in an interdisciplinary environment.
Two or More
$3.3M without international component
$3.5M with international component
$200,000 total ($50,000/year) innovation incentive
Funds to give teams of students incentives and opportunities to carry out research.
Funds may be used to prepare students to be successful in the international setting (pre-departure orientations, language or special training). Funds may be used for trainee international and in-country travel, living expenses, and limited support for research and education related costs abroad such as bench fees and/or field guides. Funds may also be used for short-term visits by IGERT faculty to foreign sites for supervising students, coordinating research and networking with foreign scientists.
a. one month per year of salary support for the Principal Investigator for management purposes;
b. up to 6 months total of faculty salary support for the development of IGERT curricula as part of a special allocation, which includes faculty salaries, benefits, and equipment costs, and applicable indirect costs on these items.
The total of all faculty salaries (excluding that of the PI), fringe benefits, and equipment costs cannot exceed $300,000. A special allocation table showing faculty salaries, fringe benefits, equipment costs, applicable indirect costs, and the total should be provided in the proposal.
Salary and benefits:
Support is included for 1.0 summer month for the PI, 12 months for the Post Doctoral Fellow who is a Co-PI, and 0.24 of a month for the other three Co-PIs and the two Key Persons.
Fringe benefits are charged at a rate of 21.1% for Faculty (summer salary only for 9 mo.), 31.6% for all other Faculty, 35.2% for Staff, and 28.3% for Postdoctoral Fellows, incrementing per a table at http://research.unm.edu/policiesprocedures/FringeBenefitRatesonProposalsFY13.pdf
Postdoctoral fellows have a higher fringe than 9-month faculty who are paid by the grant in the summer because the former have the insurance benefit that is paid by the grant year-round, while the total faculty insurance benefit is paid during the academic year, the period of the PI’s faculty contract.
We request travel support for the PI, one graduate student, and one administrator to attend the annual IGERT Project meeting, and in year one only, for the PI to attend a one-day orientation meeting in Washington DC.
One day orientation meeting for PI (year 1 only)
RT Airfare $ 600
Lodging (1 night) 250
Per Diem and transportation 200
Annual IGERT Project Meeting (PI, one graduate student, one administrator)
RT Airfare ($600 x 3 people) $ 1,800
Lodging (3 nights @ $250/night x 3 people) 2,250
Per Diem ($115/day x 3 days x 3 people) 1,035
Total Domestic Travel in Year 1 $6,135
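The Year 1 domestic travel total can be checked directly from the unit costs listed above. The short Python sketch below is offered only as an arithmetic check; the variable and function names are illustrative and are not part of the proposal or of any UNM budgeting tool.

```python
# Recompute the Year 1 domestic travel request from the rates quoted in the
# budget justification. All names here are hypothetical, for illustration only.

AIRFARE = 600             # round-trip airfare per person, USD
LODGING_PER_NIGHT = 250   # lodging per person per night, USD
PER_DIEM = 115            # per diem per person per day at the annual meeting, USD
ORIENTATION_PER_DIEM_AND_TRANSPORT = 200  # flat amount listed for the orientation trip


def orientation_trip():
    """One-day PI orientation meeting: airfare, one night of lodging, flat per diem/transport."""
    return AIRFARE + LODGING_PER_NIGHT * 1 + ORIENTATION_PER_DIEM_AND_TRANSPORT


def annual_meeting(people=3, nights=3, days=3):
    """Annual IGERT Project Meeting for the PI, one graduate student, one administrator."""
    airfare = AIRFARE * people
    lodging = LODGING_PER_NIGHT * nights * people
    per_diem = PER_DIEM * days * people
    return airfare + lodging + per_diem


year1_total = orientation_trip() + annual_meeting()
print(orientation_trip(), annual_meeting(), year1_total)  # 1050 5085 6135
```

The orientation trip applies to year 1 only, so under the same rates the annual meeting alone would be the recurring domestic travel cost in later years.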
We request funds in the amount of $10,000 per year in periods 2, 3, 4 & 5 for short-term visits by IGERT faculty to foreign sites for supervising students, coordinating research and networking with foreign scientists.
Participant Support Costs:
Trainee Stipends ($30,000 per student per year)
International Travel ($40,000 per year in periods 2, 3, 4, & 5)
Competitive Innovation Incentive Fund ($50,000 per year in periods 2, 3, 4, & 5)
Trainee Tuition and Health Insurance: Tuition is calculated at $3,295.76 per semester, with an estimated 6% inflation factor applied each year beginning with year 1. The health insurance is $1,951 in year 1, and that amount increases per a table at http://research.unm.edu/policiesprocedures/FringeBenefitRatesonProposalsFY13.pdf
Other Direct Costs:
Performance Assessment/Project Evaluation $ 20,000
10 computer workstations ($1200 each) and 1 server ($4000) 16,000
Computer Software 5,000
Tablets (10 @ $400 each) 4,000
Photo and Video Equipment 6,350
Total Other Direct Costs in Year 1 $45,000
Facilities and Administration Costs (F&A):
The Federally negotiated F&A rate for a grant to the University of New Mexico is 51%. This percentage is not applied to capital equipment (cost exceeding $5000) or to tuition.
The F&A rate agreement, which expires in the summer of 2013, can be read at http://research.unm.edu/policiesprocedures/FandARate0709.pdf
Provide a description of facilities and major instruments that are available to the project and require no additional support from NSF.
Letters of Support and stuff