Contact: Ben Winters (winters@epic.org)

| # | Use | Agency | Explanation |
|---|-----|--------|-------------|
| 3 | 4% Repair Dashboard | Agricultural Research Service | |
| 4 | ARS Project Mapping | Agricultural Research Service | |
| 5 | NAL Automated Indexing | Agricultural Research Service | |
| 6 | Forecasting Grasshopper Outbreaks in the Western United States using Machine Learning Tools | Animal and Plant Health Inspection Service | |
| 7 | Facial recognition | Agricultural Research Service | |
| 8 | Coleridge Initiative, Show US the Data | Economic Research Service | |
| 9 | Westat | Economic Research Service | |
| 10 | Land Change Analysis Tool (LCAT) | Farm Production and Conservation (FPAC) | |
| 11 | Retailer Receipt Analysis | Food and Nutrition Service | |
| 12 | Ecosystem Management Decision Support System (EMDS) | Forest Service | |
| 13 | Wildland Urban Interface - Mapping Wildfire Loss | Forest Service | |
| 14 | National Land Cover Database (NLCD) Tree Canopy Cover Mapping | Forest Service | |
| 15 | The BIGMAP Project | Forest Service | |
| 16 | DISTRIB-II: Habitat Suitability of Eastern United States Trees | Forest Service | |
| 17 | CLT Knowledge Database | Forest Service | |
| 18 | RMRS Raster Utility | Forest Service | |
| 19 | TreeMap 2016 | Forest Service | |
| 20 | Landscape Change Monitoring System (LCMS) | Forest Service | |
| 21 | Forest Health Detection Monitoring | Forest Service | |
| 22 | Land Cover Data Development | Forest Service | |
| 23 | Cropland Data Layer | National Agricultural Statistics Service | |
| 24 | List Frame Deadwood Identification | National Agricultural Statistics Service | |
| 25 | Climate Change Classification NLP | National Institute of Food and Agriculture | |
| 26 | Video Surveillance System | Office of Safety, Security and Protection | |
| 27 | Acquisition Approval Request Compliance Tool | Office of the Chief Information Officer | |
| 28 | Operational water supply forecasting for western US rivers | USDA Natural Resources Conservation Service (NRCS) Snow Survey and Water Supply Forecast (SSWSF) program | |
| 29 | B2B Matchmaking | International Trade Administration (ITA) | The system's algorithms and AI technology qualify data and make B2B matches between event participants according to their specific needs and available opportunities. The system's inputs are data related to event participants; the outputs are suggested B2B matches between participants and a match-strength scorecard. |
| 30 | Chatbot Pilot | International Trade Administration (ITA) | Chatbot embedded in trade.gov to assist ITA clients with FAQs, locating information and content, and suggesting events and services. ITA clients enter input into the chatbot in the form of questions or responses to prompts. The chatbot scans ITA content libraries and input from ITA staff and returns answers and suggestions based on client persona (exporter, foreign buyer, investor). |
| 31 | Consolidated Screening List | International Trade Administration (ITA) | The Consolidated Screening List (CSL) is a list of parties for which the United States Government maintains restrictions on certain exports, reexports, or transfers of items. It consolidates 13 export screening lists of the Departments of Commerce, State, and Treasury. The CSL search engine has "Fuzzy Name Search" capabilities, allowing a search without knowing the exact spelling of an entity's name. In Fuzzy Name mode, the CSL returns a "score" for results that exactly or nearly match the searched name. This is particularly helpful when searching the CSL for names that have been translated into English from non-Latin-alphabet languages. |
| 32 | AD/CVD Self-Initiation | International Trade Administration (ITA) | The AD/CVD program investigates allegations of dumping and/or subsidies warranting countervailing duties. Investigations are initiated when a harmed US entity files a petition identifying the alleged offense and the specific harm inflicted. Self-initiation will allow ITA to monitor trade patterns for this activity and preemptively initiate investigations by identifying harmed US entities, often before these entities are aware of the harm. |
| 33 | Market Diversification Toolkit | International Trade Administration (ITA) | The Market Diversification Tool identifies potential new export markets using current trade patterns. A user enters what products they make and the markets they currently export to. The tool applies an ML algorithm to identify and compare markets that should be considered. It brings together product-specific trade and tariff data and economy-level macroeconomic and governance data to provide a picture of which markets make sense for further market research. Users can limit the markets in the results to only the ones they want to consider, modify how each of the eleven indicators in the tool contributes to a country's overall score, and export all the data to a spreadsheet for further analysis. |
| 34 | Fisheries Electronic Monitoring Image Library | National Oceanic and Atmospheric Administration (NOAA) | The Fisheries Electronic Monitoring Library (FEML) will be the central repository for electronic monitoring (EM) data related to marine life. |
| 35 | Passive acoustic analysis using ML in Cook Inlet, AK | National Oceanic and Atmospheric Administration (NOAA) | Passive acoustic data are analyzed for detection of beluga whales and classification of the different signals emitted by this species. Detection and classification are done with an ensemble of four CNN models and weighted scoring developed in collaboration with Microsoft. Results are being used to inform seasonal distribution, habitat use, and impacts from anthropogenic disturbance within Cook Inlet beluga critical habitat. The project aims to expand to other cetacean species as well as anthropogenic noise. |
| 36 | AI-based automation of acoustic detection of marine mammals | National Oceanic and Atmospheric Administration (NOAA) | Timely processing of passive acoustic data is critical for adapting mitigation measures as climate change continues to impact Arctic marine mammals. Infrastructure for Noise and Soundscape Tolerant Investigation of Nonspecific Call Types (INSTINCT) is command-line software developed in-house for training, evaluation, and deployment of machine learning models for marine mammal detection in passive acoustic data. It also includes annotation workflows for labeling and validation. INSTINCT has been successfully deployed in several analyses, and further development of detectors within INSTINCT is desired for future novel studies and automation. Continued integration of AI methods into the existing processes of the CAEP acoustics group requires a skilled operator familiar with INSTINCT, machine learning, and the acoustic repertoire of Alaska-region marine mammals. |
| 37 | Developing automation to determine species and count using optical survey data in the Gulf of Mexico | National Oceanic and Atmospheric Administration (NOAA) | VIAME - This project focuses on optical survey data collected in the Gulf of Mexico to: 1) develop an image library of landed catch; 2) develop automated image processing (ML/DL) to identify and enumerate species from underwater imagery; and 3) develop automated algorithms to process imagery in near real time and download information to a central database. |
| 38 | Fast tracking the use of VIAME for automated identification of reef fish | National Oceanic and Atmospheric Administration (NOAA) | We've been compiling image libraries for use in creating automated detection and classification models to automate the annotation process for the SEAMAP Reef Fish Video survey of the Gulf of Mexico. This work is being conducted in VIAME, but we're looking at several other paths forward in the project to identify the best-performing models. Models are now performing well enough that we will incorporate automated analysis in video reads this spring as part of a supervised annotation QA/QC process. |
| 39 | A Hybrid Statistical-Dynamical System for the Seamless Prediction of Daily Extremes and Subseasonal to Seasonal Climate Variability | National Oceanic and Atmospheric Administration (NOAA) | Demonstrate the skill and suitability for operations of a statistical-dynamical prediction system that yields seamless probabilistic forecasts of daily extremes and subseasonal-to-seasonal temperature and precipitation. We recently demonstrated a Bayesian statistical method for post-processing seasonal forecasts of mean temperature and precipitation from the North American Multi-Model Ensemble (NMME). We now seek to test the utility of an updated hybrid statistical-dynamical prediction system that facilitates seamless subseasonal and seasonal forecasting. Importantly, this method allows for the representation of daily extremes consistent with climate conditions. This project explores the use of machine learning. |
| 40 | FathomNet | National Oceanic and Atmospheric Administration (NOAA) | FathomNet provides much-needed training data (e.g., annotated and localized imagery) for developing machine learning algorithms that will enable fast, sophisticated analysis of visual data. We've utilized interns and college class curricula to localize annotations on NOAA video data for inclusion in FathomNet and to begin training our own algorithms. |
| 41 | ANN to improve CFS T and P outlooks | National Oceanic and Atmospheric Administration (NOAA) | Fan, Y., Krasnopolsky, V., van den Dool, H., Wu, C., and Gottschalck, J. (2021). Using Artificial Neural Networks to Improve CFS Week 3-4 Precipitation and Temperature Forecasts. |
| 42 | Drought outlooks by using ML techniques | National Oceanic and Atmospheric Administration (NOAA) | Drought outlooks using ML techniques with NCEP models. Simple NN and deep learning techniques used with GEFSv12 to predict Week 1-5 precipitation and 2-m temperature over CONUS. |
| 43 | EcoCast: A dynamic ocean management tool to reduce bycatch and support sustainable fisheries | National Oceanic and Atmospheric Administration (NOAA) | Operational tool that uses boosted regression trees to model the distribution of swordfish and bycatch species in the California Current. |
| 44 | Coastal Change Analysis Program (C-CAP) | National Oceanic and Atmospheric Administration (NOAA) | Beginning in 2015, C-CAP embarked on an operational high-resolution land cover development effort that utilized geographic object-based image analysis and ML algorithms such as Random Forest to classify coastal land cover from 1 m multispectral imagery. More recently, C-CAP has been relying on a CNN approach for deriving the impervious surface component of its land cover products. The majority of the work is accomplished through external contracts. Prior to the high-resolution effort, C-CAP focused on developing Landsat-based moderate-resolution multi-date land cover for the coastal U.S. In 2002, C-CAP adopted a methodology that employed Classification and Regression Trees for land cover data development. |
| 45 | Deep learning algorithms to automate right whale photo id | National Oceanic and Atmospheric Administration (NOAA) | AI for right whale photo ID began with a Kaggle competition and has since expanded to include several algorithms to match right whales from different viewpoints (aerial, lateral) and body parts (head, fluke, peduncle). The system is now live and operational on the Flukebook platform for both North Atlantic and southern right whales. We have a paper in review at Mammalian Biology. |
| 46 | NN Radiation | National Oceanic and Atmospheric Administration (NOAA) | Developing fast and accurate NN LW and SW radiation schemes for GFS and GEFS. NN LW and SW radiations were successfully developed for a previous version of GFS (see doi: 10.1175/2009MWR3149.1), and the stability and robustness of the approach were demonstrated (see https://arxiv.org/ftp/arxiv/papers/2103/2103.07024.pdf). NN LW and SW radiations will be developed for the current versions of GFS and GEFS. |
| 47 | NN training software for the new generation of NCEP models | National Oceanic and Atmospheric Administration (NOAA) | Optimize the NCEP EMC Training and Validation System for efficient handling of the high-spatial-resolution model data produced by the new generation of NCEP's operational models. |
| 48 | Coral Reef Watch | National Oceanic and Atmospheric Administration (NOAA) | For more than 20 years, NOAA Coral Reef Watch (CRW) has been using remote sensing, modeled, and in situ data to operate a Decision Support System (DSS) to help resource managers (our target audience), researchers, decision makers, and other stakeholders around the world prepare for and respond to coral reef ecosystem stressors, predominantly resulting from climate change and warming of the Earth's oceans. Offering the world's only global early-warning system of coral reef ecosystem physical environmental changes, CRW remotely monitors conditions that can cause coral bleaching, disease, and death; delivers information and early warnings in near real time to our user community; and uses operational climate forecasts to provide outlooks of stressful environmental conditions at targeted reef locations worldwide. CRW products are primarily sea surface temperature (SST)-based but also incorporate light and ocean color, among other variables. |
| 49 | Robotic microscopes and machine learning algorithms remotely and autonomously track lower trophic levels for improved ecosystem monitoring and assessment | National Oceanic and Atmospheric Administration (NOAA) | Phytoplankton are the foundation of marine food webs supporting fisheries and coastal communities. They respond rapidly to physical and chemical oceanography, and changes in phytoplankton communities can impact the structure and functioning of food webs. We use a robotic microscope called an Imaging FlowCytobot (IFCB) to continuously collect images of phytoplankton from seawater. Automated taxonomic identification of imaged phytoplankton uses a supervised machine learning approach (random forest algorithm). We deploy the IFCB on fixed (docks) and roving (aboard survey ships) platforms to autonomously monitor phytoplankton communities in aquaculture areas in Puget Sound and in the California Current System. We map the distribution and abundance of phytoplankton functional groups and their relative food value to support fisheries and aquaculture, and describe their changes in relation to ocean and climate variability and change. |
| 50 | Edge AI survey payload development | National Oceanic and Atmospheric Administration (NOAA) | Continued support of a multispectral aerial imaging payload running detection-model pipelines in real time. This is a nine-camera (color, infrared, ultraviolet) payload controlled by dedicated on-board computers with GPUs. YOLO detection models run at a rate faster than image collection, allowing real-time processing of imagery as it comes off the cameras. The goals of the effort are to reduce the overall data burden (by terabytes) and shorten the data-processing timeline, expediting analysis and population assessment for Arctic mammals. |
| 51 | Ice seal detection and species classification in multispectral aerial imagery | National Oceanic and Atmospheric Administration (NOAA) | Refine and improve detection and classification pipelines with the goal of reducing false-positive rates (to < 50%) while maintaining > 90% accuracy and significantly reducing or eliminating the labor-intensive post-survey review process. |
| 52 | First Guess Excessive Rainfall Outlook | National Oceanic and Atmospheric Administration (NOAA) | Machine learning product that provides a first guess for the WPC Excessive Rainfall Outlook (ERO); it is trained on the ERO together with atmospheric variables. Covers the Day 4-7 products. |
| 53 | First Guess Excessive Rainfall Outlook | National Oceanic and Atmospheric Administration (NOAA) | Machine learning product that provides a first guess for the WPC Excessive Rainfall Outlook (ERO); it is trained on the ERO together with atmospheric variables. Covers the Day 1, 2, and 3 products. |
| 54 | CoralNet: Ongoing operational use, improvement, and development of machine-vision point classification | National Oceanic and Atmospheric Administration (NOAA) | CoralNet is our operational point-annotation software for benthic photo-quadrat annotation. Development of our classifiers has allowed us to significantly reduce human annotation, and we continue to co-develop (and co-fund) new developments in CoralNet. |
| 55 | Automated detection of hazardous low clouds in support of safe and efficient transportation | National Oceanic and Atmospheric Administration (NOAA) | This is a maintenance and sustainment project for the operational GOES-R fog/low stratus (FLS) products. The FLS products are derived from the combination of GOES-R satellite imagery and NWP data using machine learning. The FLS products, which are available in AWIPS, are routinely used by the NWS Aviation Weather Center and Weather Forecast Offices. |
| 56 | The Development of ProbSevere v3 - An improved nowcasting model in support of severe weather warning operations | National Oceanic and Atmospheric Administration (NOAA) | ProbSevere is an ML model that utilizes NWP, satellite, radar, and lightning data to nowcast severe wind, severe hail, and tornadoes. ProbSevere, which was transitioned to NWS operations in October 2020, is a proven tool that enhances operational severe weather warnings. This project aims to develop the next version, ProbSevere v3, which utilizes additional data sets and improved machine learning techniques to improve upon the operational version. ProbSevere v3 was successfully demonstrated in the 2021 Hazardous Weather Testbed, and a JTTI proposal was recently submitted to facilitate an operational update. The development is funded by GOES-R. |
| 57 | The VOLcanic Cloud Analysis Toolkit (VOLCAT): An application system for detecting, tracking, characterizing, and forecasting hazardous volcanic events | National Oceanic and Atmospheric Administration (NOAA) | Volcanic ash is a major aviation hazard. The VOLcanic Cloud Analysis Toolkit (VOLCAT) consists of several AI-powered satellite applications, including eruption detection, alerting, and volcanic cloud tracking. These applications are routinely utilized by Volcanic Ash Advisory Centers to issue volcanic ash advisories. Under this project, the VOLCAT products will be further developed, and subsequently transitioned to the NESDIS Common Cloud Framework, to help ensure adherence to new International Civil Aviation Organization requirements. |
| 58 | SUVI Thematic Maps | National Oceanic and Atmospheric Administration (NOAA) | The GOES-16 Solar Ultraviolet Imager (SUVI) is NOAA's operational solar extreme-ultraviolet imager. The SUVI Level 2 Thematic Map files are produced by NOAA's National Centers for Environmental Information in Boulder, Colorado, from Level 2 High Dynamic Range (HDR) composite SUVI images. The FITS file headers are populated with metadata to facilitate interpretation of these observations. These files are considered experimental and will be improved in future releases; users requiring assistance can contact the NCEI SUVI team at goesr.suvi@noaa.gov. The SUVI Thematic Maps product is a Level 2 data product that (presently) uses a machine learning classifier to generate a pixel-by-pixel map of important solar features digested from all six SUVI spectral channels. |
| 59 | BANTER, a machine learning acoustic event classifier | National Oceanic and Atmospheric Administration (NOAA) | A supervised machine learning acoustic event classifier using hierarchical random forests. |
| 60 | ProbSR (probability of subfreezing roads) | National Oceanic and Atmospheric Administration (NOAA) | A machine-learned algorithm that provides a 0-100% probability that roads are subfreezing. |
| 61 | VIAME: Video and Image Analysis for the Marine Environment Software Toolkit | National Oceanic and Atmospheric Administration (NOAA) | The Video and Image Analysis for the Marine Environment Software Toolkit, commonly known as VIAME, is an open-source, modular software toolkit that allows users to employ high-level, deep-learning algorithms for automated annotation of imagery using a low-code/no-code graphical user interface. VIAME is available free of charge to all NOAA users. The NOAA Fisheries Office of Science and Technology supports an annual maintenance contract covering technical and customer support by the developer, routine software updates, bug fixes, and development efforts that support broad, cross-center application needs. |
| 62 | ENSO Outlooks using observed/analyzed fields | National Oceanic and Atmospheric Administration (NOAA) | LSTM model that uses ocean and atmospheric predictors throughout the tropical Pacific to forecast ONI values up to one year in advance. An extension of this was submitted to the cloud portfolio with the intent of adding a CNN layer that uses reforecast data to improve the ONI forecasts. |
| 63 | Using community-sourced underwater photography and image recognition software to study green sea turtle distribution and ecology in southern California | National Oceanic and Atmospheric Administration (NOAA) | The goal of this project is to study green turtles in and around La Jolla Cove in the San Diego region, a highly populated site with ecotourism, by engaging with local photographers to collect green turtle underwater images. The project uses publicly available facial recognition software (HotSpotter) to identify individual turtles, from which we determine population size, residency patterns, and foraging ecology. |
| 64-65 | An Interactive Machine Learning Toolkit for Classifying Species Identity of Cetacean Echolocation Signals in Passive Acoustic Recordings | National Oceanic and Atmospheric Administration (NOAA) | Develop robust automated machine learning detection and classification tools for acoustic species identification of toothed whale and dolphin echolocation clicks for up to 20 species found in the Gulf of Mexico. Tool development was funded from June 2018 to May 2021. The tool will be used for automated analyses of long-term recordings from Gulf-wide passive acoustic moored instruments deployed from 2010-2025 to look at environmental processes driving trends in marine mammal density and distribution. |
| 66 | Steller sea lion automated count program | National Oceanic and Atmospheric Administration (NOAA) | NOAA Fisheries Alaska Fisheries Science Center's Marine Mammal Laboratory (MML) is mandated to monitor the endangered western Steller sea lion population in Alaska. MML conducts annual aerial surveys of known Steller sea lion sites across the southern Alaska coastline to capture visual imagery. Processing requires two full-time, independent counters to manually review overlapping imagery (to avoid double-counting sea lions in multiple frames) and to count and classify individuals by age and sex class. These counts are vital for population and ecosystem-based modeling to better understand the species and ecosystem and to inform sustainable fishery management decisions, and they are eagerly anticipated by stakeholders such as the NOAA Alaska Regional Office, industry, and environmental groups. MML worked with Kitware to develop detection and image-registration pipelines with VIAME (updates to the DIVE program to support updated interface needs). MML is now working to assess the algorithms' efficacy and develop a workflow to augment the traditional counting method (to RL 9). |
| 67 | Steller sea lion brand sighting | National Oceanic and Atmospheric Administration (NOAA) | Detection and identification of branded Steller sea lions from remote camera images in the western Aleutian Islands, AK. The goal is to help streamline photo processing to reduce the effort required to review images. |
| 68 | Replacing unstructured WW3 in the Great Lakes with a recurrent neural network and a boosted ensemble decision tree | National Oceanic and Atmospheric Administration (NOAA) | Investigated replacing unstructured WW3 in the Great Lakes with (i) a recurrent neural network (RNN, specifically an LSTM) developed by EMC and (ii) a boosted ensemble decision tree (XGBoost) developed by GLERL. These two AI models were trained on two decades of wave observations in Lake Erie and compared to the operational Great Lakes unstructured WW3. |
| 69 | Using k-means clustering to identify spatially and temporally consistent wave systems | National Oceanic and Atmospheric Administration (NOAA) | Postprocessing that uses k-means clustering to identify spatially and temporally consistent wave systems from the output of NWPS v1.3. It was successfully evaluated in the field by NWS marine forecasters nationwide and was implemented in operations on February 3, 2021. |
| 70 | Picky | National Oceanic and Atmospheric Administration (NOAA) | Uses a CNN to pick out objects of a particular size from side-scan imagery and presents users with a probability, allowing automation of contact picking in the field. Side-scan imagery is a simple one-channel intensity image, which lends itself well to basic CNN techniques. |
| 71 | Data Science: Clutter | National Telecommunications and Information Administration (NTIA) | NTIA's Institute for Telecommunication Sciences (ITS) is investigating the use of AI to automatically identify and classify clutter-obstructed radio-frequency propagation paths. Clutter is vegetation, buildings, and other structures that cause radio signal loss through dispersion, reflection, and diffraction; it does not include terrain effects. The classifier is a convolutional neural network (CNN) trained using lidar data coinciding with radio-frequency propagation measurements made by ITS. The trained CNN can be fed lidar data for a new radio path and predicts a clutter classification label. |
| 72 | WAWENETS | National Telecommunications and Information Administration (NTIA) | The algorithm produces estimates of telecommunications speech quality and speech intelligibility. The input is a recording of speech from a telecommunications system in digital file format. The output is a single number that indicates speech quality (typically on a 1-5 scale) or speech intelligibility (typically on a 0-1 scale). |
| 73 | Azure (Microsoft) | Minority Business Development Agency (MBDA) | An Azure chatbot is being leveraged to automate and streamline responses to common questions from MBDA users interacting with the external-facing MBDA website. The solution couples an AI-based chatbot with machine learning and natural language processing capabilities. |
| 74 | AI retrieval for patent search | United States Patent and Trademark Office (USPTO) | Augmentation for a next-generation patent search tool to help examiners identify relevant documents and additional areas to search. The system takes input from published or unpublished applications and provides recommendations on further prior-art areas to search, giving the user the ability to sort by similarity to concepts of their choosing. |
| 75 | AI use for CPC classification | United States Patent and Trademark Office (USPTO) | System that classifies incoming patent applications based on the Cooperative Patent Classification scheme for operational assignment of work and symbol recommendation for AI search. A back-office processing system that takes incoming patent applications as input and outputs the resulting classification symbols. |
| 76 | AI retrieval for TM design coding and image search | United States Patent and Trademark Office (USPTO) | Clarivate COTS solution to assist examiner identification of similar trademark images, to suggest the correct assignment of mark-image design codes, and to determine the potential acceptability of the identifications of goods and services. The system is anticipated to use both incoming trademark images and registered trademark images and to output design codes and/or other related images. |
| 77 | Enriched Citation | United States Patent and Trademark Office (USPTO) | Data dissemination system that identifies which references, or prior art, were cited in specific patent application office actions, including the bibliographic information of the reference, the claims that the prior art was cited against, and the relevant sections that the examiner relied upon. The system extracts information from unstructured office actions and provides it through a structured, public-facing API. |
| 78 | Inventor Search Assistant (iSAT) | United States Patent and Trademark Office (USPTO) | Service to help inventors "get started" identifying relevant documents, figures, and classification codes used to conduct a novelty search. The system takes a user-entered short description of an invention and provides a user-selectable set of recommended documents, figures, and classification areas. |
| 79 | Aidan Chat-bot | Federal Student Aid | FSA's virtual assistant uses natural language processing to answer common financial aid questions and help customers get information about their federal aid on StudentAid.gov. In just over two years, Aidan has interacted with over 2.6 million unique customers, resulting in more than 11 million user messages. |
80 | Advances in Nuclear Fuel Cycle Nonproliferation, Safeguards, and Security Using an Integrated Data Science Approach | U.S. Department of Energy | This research will develop a digital twin of a centrifugal contactor system that receives data from traditional and real time sensors, constructs a digital representation or simulation of the chemical separations component within the nuclear fuel cycle, and performs data analysis through machine learning to determine anomalies, failures, and trends. The research will include the identification and implementation of advanced artificial intelligence, machine learning, and data analysis techniques advised by a team of nuclear safeguards experts. | Idaho National Laboratory | No | |||||||||||||||||||
81 | Development of a multi-sensor data science system used for signature development on solvent extraction processes conducted within Beartooth facility | U.S. Department of Energy | This project will develop a system that utilizes non-traditional measurement sources such as vibration, acoustics, current, and light, and traditional sources such as flow and temperature, in conjunction with data-based machine learning techniques that allow for signal discovery. The goal is to characterize stages within a solvent extraction process, which can increase target metals recovery, indicate process faults, account for special nuclear material, and inform near real-time decision making. | Idaho National Laboratory | No | |||||||||||||||||||
82 | Scalable Framework of Hybrid Modeling with Anticipatory Control Strategy for Autonomous Operation of Modular and Microreactors | U.S. Department of Energy | The goal of this research is to develop and validate novel and scalable models to achieve faster-than-real-time prediction and decision-making capabilities. To achieve the project goal of autonomous operation of microreactors, a novel hybrid modeling approach combining both physics-based and artificial intelligence techniques will be developed at the component or sub-system level, integrated with anticipatory control techniques, and scaled. A novel distributed anticipatory control strategy will be developed as part of the scalability analysis to understand the risk of cascading failures when emerging reactors are deployed as part of a full feeder microgrid. | Idaho National Laboratory | No | |||||||||||||||||||
83 | Accelerating and Improving the Reliability of Low Failure Probability Computations to Support the Efficient Safety Evaluation and Deployment of Advanced Reactor Technologies | U.S. Department of Energy | This project will research artificial intelligence enabled Monte Carlo algorithms to significantly reduce the computational burden by reducing the number of finite element evaluations when estimating low failure probabilities. These will be implemented in the Multiphysics Object-Oriented Simulation Environment, which will help the nuclear engineering community to efficiently conduct probabilistic failure analyses and uncertainty quantification studies for the design and optimization of advanced reactor technologies. | Idaho National Laboratory | No | |||||||||||||||||||
84 | Accelerating deployment of nuclear fuels through reduced-order thermo-physical property models and machine learning | U.S. Department of Energy | This project will develop a novel physics-based tool that combines 1) reduced-order models, 2) machine learning algorithms, 3) fuel performance methods, and 4) state-of-the-art thermal property characterization equipment and irradiated nuclear fuel data sets to accelerate nuclear fuel discovery, development, and deployment. The models will describe thermal conductivity, specific heat, thermal expansion, and self-diffusion coefficients as a function of temperature and irradiation. | Idaho National Laboratory | No | |||||||||||||||||||
85 | Promoting Optimal Sparse Sensing and Sparse Learning for Nuclear Digital Twins | U.S. Department of Energy | This project will address the efficient use of limited experimental data available for nuclear digital twin (NDT) training and demonstration. This involves developing sparse data reconstruction methods and using NDT models to define sensor requirements (location, number, accuracy) for the design of demonstration experiments. NDTs should leverage 1) sparse sensing for identifying optimal locations and the minimal set of required sensors and 2) sparse learning and recovery of full maps of responses of interest for stronger prediction, diagnostics, and prognostics capabilities. | Idaho National Laboratory | No | |||||||||||||||||||
86 | Artificial Intelligence Enhanced Advanced Post Irradiation Examination | U.S. Department of Energy | This project uses post irradiation examination of uranium-10wt.% zirconium (UZr) metallic fuel as a case study to show how artificial intelligence (AI)-based technology can facilitate and accelerate nuclear fuel development. The approach will 1) revisit the microstructural image and local thermal conductivity data collected from UZr, 2) build a benchmark dataset for the microstructural patterns of irradiated UZr, and 3) train the machine learning and deep learning models to uncover the relationships between micro/nanoscale structure, zirconium phase redistribution, local thermal conductivity, and engineering scale fuel properties. | Idaho National Laboratory | No | |||||||||||||||||||
87 | Secure Millimeter Wave Spectrum Sharing with Autonomous Beam Scheduling | U.S. Department of Energy | This approach exploits the millimeter wave beam directionality and utilizes the beam sensing capabilities at end devices to prove that an autonomous radio frequency beam scheduler can support secure 5G spectrum sharing and guarantee optimality for base stations. Measurements and predictive analytics are used to develop the autonomous beam scheduling algorithms. These improvements will benefit mission critical communications and emergency response operations as well as enable secure communication for critical infrastructure without expensive and competitive licensed bands. | Idaho National Laboratory | No | |||||||||||||||||||
88 | Objective-Driven Data Reduction for Scientific Workflows | U.S. Department of Energy | This project aims to develop theories and algorithms for objective-driven reduction of scientific data in workflows that are composed of various models, including data-driven AI models. | Brookhaven National Laboratory | Problem Scoping | No | ||||||||||||||||||
89 | The Grid Resilience and Intelligence Platform (GRIP) | U.S. Department of Energy | AI within GRIP is used to develop metrics that quantify the impact of anticipated weather-related extreme events. The platform combines utility data with physical models and a distribution power flow solver to infer potential grid impacts given a major storm. | Office of Electricity | No | |||||||||||||||||||
90 | Open-Source High-Fidelity Aggregate Composite Load Models of Emerging Load Behaviors for Large-Scale Analysis (GMLC 0064) | U.S. Department of Energy | 1. Machine learning methods such as cross-correlation, random forest, regression trees, and transfer learning are used to estimate the load composition data and motor protection profiles for different climate regions in the Western US. 2. A deep learning algorithm is applied to calibrate the parameters of the WECC composite load model to match the responses of a detailed feeder model. | Office of Electricity | No | |||||||||||||||||||
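The first step above pairs measured feeder profiles with per-class load templates via correlation. A minimal sketch of that matching step, with invented 24-hour profiles (the actual GMLC 0064 models and data are not reproduced here):

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical 24-hour normalized load shapes for two end-use classes.
templates = {
    "residential_ac": [0.2] * 8 + [0.4] * 4 + [0.9] * 8 + [0.5] * 4,
    "industrial_motor": [0.8] * 10 + [0.9] * 8 + [0.6] * 6,
}
# A measured feeder profile (also invented) that peaks in the afternoon.
feeder = [0.25] * 8 + [0.45] * 4 + [0.85] * 8 + [0.55] * 4

# Attribute the feeder to the end-use class whose shape it tracks best.
best = max(templates, key=lambda k: pearson(feeder, templates[k]))
print(best)
```

The real framework regresses on many features (weather, motor protection settings, region) rather than matching a single correlation, but the attribution idea is the same.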
91 | Big Data Synchrophasor Monitoring and Analytics for Resiliency Tracking (BDSMART) | U.S. Department of Energy | Explore the use of big data, artificial intelligence (AI), and machine learning technology and tools on phasor measurement unit (PMU) data to identify and improve existing knowledge, and to discover new insights and tools for better grid operation and management. | Office of Electricity | No | |||||||||||||||||||
92 | Combinatorial Evaluation of Physical Feature Engineering and Deep Temporal Modeling for Synchrophasor Data at Scale | U.S. Department of Energy | Explore the use of big data, artificial intelligence (AI), and machine learning technology and tools on phasor measurement unit (PMU) data to identify and improve existing knowledge, and to discover new insights and tools for better grid operation and management. | Office of Electricity | No | |||||||||||||||||||
93 | MindSynchro | U.S. Department of Energy | Explore the use of big data, artificial intelligence (AI), and machine learning technology and tools on phasor measurement unit (PMU) data to identify and improve existing knowledge, and to discover new insights and tools for better grid operation and management. | Office of Electricity | No | |||||||||||||||||||
94 | PMU-Based Data Analytics Using Digital Twin Phasor Analytics Software | U.S. Department of Energy | Explore the use of big data, artificial intelligence (AI), and machine learning technology and tools on phasor measurement unit (PMU) data to identify and improve existing knowledge, and to discover new insights and tools for better grid operation and management. | Office of Electricity | No | |||||||||||||||||||
95 | A Robust Event Diagnostic Platform: Integrating Tensor Analytics and Machine Learning Into Real-time Grid Monitoring | U.S. Department of Energy | Explore the use of big data, artificial intelligence (AI), and machine learning technology and tools on phasor measurement unit (PMU) data to identify and improve existing knowledge, and to discover new insights and tools for better grid operation and management. | Office of Electricity | No | |||||||||||||||||||
96 | Discovery of Signatures, Anomalies, and Precursors in Synchrophasor Data with Matrix Profile and Deep Recurrent Neural Networks | U.S. Department of Energy | Explore the use of big data, artificial intelligence (AI), and machine learning technology and tools on phasor measurement unit (PMU) data to identify and improve existing knowledge, and to discover new insights and tools for better grid operation and management. | Office of Electricity | No | |||||||||||||||||||
97 | Machine Learning Guided Operational Intelligence | U.S. Department of Energy | Explore the use of big data, artificial intelligence (AI), and machine learning technology and tools on phasor measurement unit (PMU) data to identify and improve existing knowledge, and to discover new insights and tools for better grid operation and management. | Office of Electricity | No | |||||||||||||||||||
98 | Robust Learning of Dynamic Interactions for Enhancing Power System Resilience | U.S. Department of Energy | Explore the use of big data, artificial intelligence (AI), and machine learning technology and tools on phasor measurement unit (PMU) data to identify and improve existing knowledge, and to discover new insights and tools for better grid operation and management. | Office of Electricity | No | |||||||||||||||||||
99 | Artificial Intelligence Based Process Control and Optimization for Advanced Manufacturing | U.S. Department of Energy | This project will develop the capability to intelligently control and optimize advanced manufacturing processes, replacing the existing trial-and-error approach. To achieve this goal, artificial intelligence (AI) based control algorithms will be developed by employing deep reinforcement learning. To reduce the computational expense of advanced manufacturing models, physics-informed reduced order models (ROMs) will be developed. The AI-based control algorithms will employ the ROMs’ predictions to adaptively inform processing decisions in a simulation environment. | Idaho National Laboratory | No | |||||||||||||||||||
100 | Smart Contingency Analysis Neural Network for in-depth Power Grid Vulnerability Analyses | U.S. Department of Energy | Typical contingency analysis for a power utility is limited to n-1 due to computational complexity and cost. A machine learning framework and resilience-chaos plots are leveraged to reduce by 50% the computational expense required to discover n-2 contingencies with 90% accuracy. | Idaho National Laboratory | No | |||||||||||||||||||
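The screening idea behind the 50% reduction can be illustrated as follows: rather than simulating every n-2 pair of outages, a cheap learned score ranks the pairs and only the top half go to full simulation. Everything here (line names, loadings, the proxy score) is invented; the actual INL framework uses a trained neural network and resilience-chaos plots.

```python
from itertools import combinations

# Hypothetical transmission lines with one cheap feature each
# (pre-contingency loading fraction). In the real framework the
# score would come from a trained ML model, not a hand-set number.
loading = {"L1": 0.95, "L2": 0.40, "L3": 0.88, "L4": 0.30, "L5": 0.76}

def pair_score(a, b):
    # Proxy severity: heavily loaded pairs are more likely to be
    # critical n-2 contingencies.
    return loading[a] + loading[b]

pairs = list(combinations(loading, 2))       # all n-2 candidates: C(5,2) = 10
ranked = sorted(pairs, key=lambda p: pair_score(*p), reverse=True)
to_simulate = ranked[: len(ranked) // 2]     # screen out 50% of the work
print(len(pairs), len(to_simulate))          # 10 5
```

Only `to_simulate` would then be passed to the expensive power-flow solver; the accuracy claim in the row corresponds to how often true critical pairs survive this screen.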
101 | Resilient Attack Interceptor for Intelligent Devices | U.S. Department of Energy | The Resilient Attack Interceptor for Intelligent Devices approach focuses on developing external monitoring methods to protect industrial internet of things devices by correlating observable physical aspects that are produced naturally and involuntarily during the operational lifecycle with anomalous functionality. | Idaho National Laboratory | No | |||||||||||||||||||
102 | Infrastructure eXpression | U.S. Department of Energy | The project developed a framework and process to translate industrial control system features to a machine-readable format for use with automated cyber tools. This research also examined other current and evolving standards for usability with diverse grid architectures that represent a set of variable conditions, establishing the foundation for determining where future research should focus and supporting improvements to industry standards and architecture designs for machine-learning cyber defense solutions. This project’s success can serve as the foundation for prioritizing the next research steps to realize automated threat response, improve the timeliness and fidelity of cyber incident consequence models, and enrich national capabilities to share actionable threat intelligence at machine speed. | Idaho National Laboratory | No | |||||||||||||||||||
103 | Protocol Analytics to enable Forensics of Industrial Control Systems | U.S. Department of Energy | The goal of this research is to discover methods and technologies to bridge gaps between the various industrial control systems (ICS) communication protocols and standard Ethernet, enabling existing cybersecurity tools to defend ICS networks and empowering cybersecurity analysts to detect compromise before threat actors can disrupt infrastructure, damage property, and inflict harm. Research focuses on electronic signal analysis of captured communication to determine the protocol, using machine learning to identify unknown protocols. Findings will be incorporated into a prototype device. | Idaho National Laboratory | No | |||||||||||||||||||
104 | Automated Type and Data Structure Resolution | U.S. Department of Energy | This research identified and labeled type and structure data in an automated and scalable way such that the information can be used in other tools and other Reverse Engineering at Scale research areas such as symbolic execution. This was done initially by utilizing heuristic methods and then scaled by adopting a machine learning approach. | Idaho National Laboratory | No | |||||||||||||||||||
105 | Signal Decomposition for Intrusion Detection in Reliability Assessment in Cyber Resilience | U.S. Department of Energy | The objective of this project is to research, assess, and implement machine learning, artificial intelligence, and physics-based algorithms for signal decomposition, and to provide a straightforward framework wherein an anomaly detection algorithm can be trained on existing expected data and then used for false data injection detection. An advanced library for signal decomposition and analysis will be developed that allows combining machine learning and artificial intelligence algorithms with high-fidelity model comparisons for greatly improved false data injection detection. This library will facilitate online and a posteriori analysis of digital signals for the purpose of detecting potential malicious tampering in physical processes. | Idaho National Laboratory | No | |||||||||||||||||||
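A heavily simplified stand-in for the detection step described above: learn the residual statistics between measured and expected signals on known-good data, then flag samples whose residual leaves the learned band. The project's actual library combines ML/AI decomposition with high-fidelity model comparisons; the flat expected signal, data values, and threshold below are all invented.

```python
import math

def fit_threshold(clean, expected, k=3.0):
    """Learn an anomaly band (mean, k*sigma) from residuals on known-good data."""
    res = [s - e for s, e in zip(clean, expected)]
    mu = sum(res) / len(res)
    sigma = math.sqrt(sum((r - mu) ** 2 for r in res) / len(res))
    return mu, k * sigma

def flag_anomalies(signal, expected, mu, band):
    """Indices where the residual leaves the learned band."""
    return [i for i, (s, e) in enumerate(zip(signal, expected))
            if abs((s - e) - mu) > band]

# Hypothetical sensor trace: the expected process value is a flat 10.0.
expected = [10.0] * 12
clean = [10.0 + d for d in (0.1, -0.2, 0.0, 0.15, -0.1, 0.05,
                            -0.05, 0.1, -0.15, 0.2, 0.0, -0.1)]
mu, band = fit_threshold(clean, expected)

# Live trace with a false data injection at index 7.
live = list(clean)
live[7] = 13.0
print(flag_anomalies(live, expected, mu, band))  # [7]
```

Note the training step never sees an attack, matching the row's premise that the detector is trained only on expected data.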
106 | Advanced Machine Learning-based Fifth Generation Network Attack Detection System | U.S. Department of Energy | The project goal is to prove that integrating attack detection via innovative machine learning and artificial intelligence techniques into the fifth generation (5G) cellular network can help secure mission-critical applications, such as automated vehicles and drones, connected health, emergency response operations, and other mission-critical devices that either are or will be connected to the 5G cellular network. | Idaho National Laboratory | No | |||||||||||||||||||
107 | Red Teaming Artificial Intelligence | U.S. Department of Energy | This research will advance the state of the art for red team security assessment of machine learning and artificial intelligence systems by providing methods for reverse engineering, exploitation, risk assessment, and vulnerability remediation. The insights gained from this vulnerability assessment research will proactively address critical gaps in the cybersecurity community’s understanding of these systems and can be used to create appropriate risk evaluation metrics and provide best practices for inclusion into consequence-driven cyber-informed engineering. | Idaho National Laboratory | No | |||||||||||||||||||
108 | Unattended Operation through Digital Twin Innovations | U.S. Department of Energy | The team hypothesizes that artificial intelligence can predict events using the integrated data from test bed sensors and physics-based models. A second hypothesis is that integrating software and artificial intelligence with sensor data from a test bed will lead to a framework for future digital twins. The team will train artificial intelligence models to determine what attributes are most important for enabling intelligent autonomous control and will determine best practices for digital twin cybersecurity. | Idaho National Laboratory | No | |||||||||||||||||||
109 | Secure and Resilient Machine Learning System for Detecting Fifth Generation (5G) Attacks including Zero-Day Attacks | U.S. Department of Energy | This project will implement an advanced machine learning based 5G attack detection system that can achieve high classification speed (10k packets per second) with high accuracy (90% or greater) as well as address a vulnerability to zero-day attacks (90% accuracy against real zero-day attacks recorded by Amazon Web Services) using field programmable gate array based deep autoencoders. | Idaho National Laboratory | No | |||||||||||||||||||
110 | Automated Malware Analysis Via Dynamic Sandboxes | U.S. Department of Energy | The goal of this project is to develop an analysis framework enabled by dynamic sandboxes that allows for automated analysis, provides non-existing core capabilities to analyze industrial control system malware, and outputs to a format that is machine readable and an industry standard in sharing threat information. This will enable further analysis efforts via machine learning and provide a foundational platform that would allow for timely, automated analysis of malware samples. | Idaho National Laboratory | No | |||||||||||||||||||
111 | Interdependent Infrastructure Systems Resilience Analysis for Enhanced Microreactor Power Grid Penetration | U.S. Department of Energy | This project will develop machine learning enabled integrated resource planning methodologies to help quantify key resilience elements across integrated energy systems and their vulnerabilities to threats and hazards. This includes the ability to accurately analyze and visualize a region’s critical infrastructure systems’ ability to sustain impacts, maintain critical functionality, and recover from disruptive events. This advanced decision support capability can improve our understanding of these complex relationships and help predict the potential impacts that microreactors and distributed energy resources have on the reliability and resiliency of our energy systems. | Idaho National Laboratory | No | |||||||||||||||||||
112 | Adaptive Fingerprinting of Control System Devices through Generative Adversarial Networks | U.S. Department of Energy | This project focuses on the reduction of manual labor and operational cost required for training an electromagnetic (EM)-based anomaly detection system for legacy industrial control systems devices and Industrial Internet of Things. This research would enable EM-based intrusion detection systems to be deployed to protect legacy control systems. | Idaho National Laboratory | No | |||||||||||||||||||
113 | Support Vector Analysis for Computational Risk Assessment, Decision-Making, and Vulnerability Discovery in Complex Systems | U.S. Department of Energy | This project addressed limitations in current probabilistic risk assessment (PRA) by combining a support vector machine and PRA software to auto-detect system design vulnerabilities and find previously unseen issues, reduce human error, and reduce human costs. This method does not require training data that would only be available in the event of system or subsystem failures. | Idaho National Laboratory | No | |||||||||||||||||||
114 | Deep Reinforcement Learning and Decision Analytics for Integrated Energy Systems | U.S. Department of Energy | This project will develop a novel deep reinforcement learning approach that can manage distributed or tightly coupled multi-agent systems utilizing deep neural networks for automatic system representation, modeling, and end-to-end learning. This new control method will enable complex, nonlinear system optimization over timescales from milliseconds to months. | Idaho National Laboratory | No | |||||||||||||||||||
115 | Nuclear-Renewable-Storage Digital Twin: Enhancing Design, Dispatch, and Cyber Response of Integrated Energy Systems | U.S. Department of Energy | This project will develop a learning-based and digital twin enabled modeling and simulation framework for economic and resilient real-time decision-making of physics-informed integrated energy systems (IES) operation. High-fidelity physics models will be linked with large-scale grid monitoring data to provide real-time updates of IES states, predictive control systems, and optimized power dispatch solutions. Learning-based algorithms will make real-time decisions upon detection of component contingencies caused by climate-induced or man-made extreme events, such as cyber-attacks or extreme weather, thereby mitigating their impacts through appropriate countermeasures. | Idaho National Laboratory | No | |||||||||||||||||||
116 | Automated Infrastructure & Dependency Detection via Satellite Imagery and Dependency Profiles | U.S. Department of Energy | Computer vision, a broad set of techniques for training statistical models and neural networks to process images, has advanced substantially in recent years. Applying these capabilities to satellite imagery can improve critical infrastructure analysis and interdependency data build-outs. Combining advanced computer vision techniques, a functional taxonomic approach to critical infrastructure, and the unique geo-spatial and dependency datasets the research team developed can produce innovative and state-of-the-art image processing results that advance abilities to secure and defend national critical infrastructure. | Idaho National Laboratory | No | |||||||||||||||||||
117 | Accelerated Nuclear Materials and Fuel Qualification by Adopting a First to Failure Approach | U.S. Department of Energy | Physics-based multi-scale modeling was coupled with deep, recursive, and transfer learning approaches to accelerate nuclear materials research and qualification of high-entropy alloys. Applying AI to combinatorial-based materials research enables subsequent analysis to focus on a limited number of candidates predicted to have the necessary materials properties for the application. | Idaho National Laboratory | No | |||||||||||||||||||
118 | Evaluating thermal properties of advanced materials | U.S. Department of Energy | The standard laser flash thermal diffusivity measurement technique is enhanced by modifying the traditional experimental setup and analyzing results with a machine-learning-based tool that includes a finite element model, a least-squares fitting algorithm, and experimental data treatment algorithms. This tool helps elucidate the thermophysical properties of a material from a single laser flash measurement. | Idaho National Laboratory | No | |||||||||||||||||||
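The classical analysis this tool builds on is the Parker flash method, which recovers thermal diffusivity from the time for the rear face of the sample to reach half its maximum temperature rise. A minimal sketch of the ideal-case formula (the INL tool replaces it with a finite element model and least-squares fitting, not reproduced here; the sample dimensions below are invented):

```python
def flash_diffusivity(thickness_m, half_rise_time_s):
    """Parker flash method (ideal adiabatic, one-dimensional case):
    alpha = 0.1388 * L**2 / t_half."""
    return 0.1388 * thickness_m ** 2 / half_rise_time_s

# Hypothetical 2 mm sample whose rear face reaches half-max rise in 0.05 s.
alpha = flash_diffusivity(2e-3, 0.05)
print(f"{alpha:.2e} m^2/s")  # prints 1.11e-05 m^2/s
```

The ideal formula assumes instantaneous pulse heating and no heat loss, which is exactly why the project fits a finite element model to the full transient instead.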
119 | Spectral Observation Convolutional Neural Network | U.S. Department of Energy | This project developed a method to analyze collected radiation spectra using advanced, scalable deep learning by combining spectroscopic expertise with high performance computing. Sophisticated deep learning can overcome the weaknesses of existing spectroscopic techniques and enhance the value of difficult measurements. The method was trained, tested, and operated on the International Space Station’s Spaceborne Computer-2 supercomputer, returning zero errors over the course of 100 training hours. This demonstrated autonomous performance in far-edge, low-wattage computing situations and in hazardous radiological environments where interference can cause errors. | Idaho National Laboratory | No | |||||||||||||||||||
120 | Passive Strain Measurements for Experiments in Radiation Environments | U.S. Department of Energy | This project will develop passive instrumentation to determine permanent strains induced by irradiation and extract critical parameters using modeling and simulation as well as machine learning algorithms. An irradiation experiment will be conducted that will benefit from engineered anisotropic materials and characterize the directional deformation in response to neutron radiation. The results of the experiment will be incorporated into the model so that the material response can be predicted for future uses as a probe material. | Idaho National Laboratory | No | |||||||||||||||||||
121 | Machine Learning Interatomic Potentials for Radiation Damage and Physical Properties in Model Fluorite Systems | U.S. Department of Energy | This project will use machine learning interatomic potentials to study the influence of radiation damage on physical properties of calcium fluoride and uranium dioxide. Electron irradiation experiments and thermal conductivity measurements will be performed to validate the effectiveness of the developed potentials. The high throughput capability of this method will become an important combinatorial materials science tool for developing and qualifying new nuclear fuels. | Idaho National Laboratory | No | |||||||||||||||||||
122 | Data-driven failure diagnosis and prognosis of solid-state ceramic membrane reactor under harsh conditions using deep learning technology with internal voltage sensors | U.S. Department of Energy | This research will investigate in situ the effects of different components on the degradation behavior in a solid-state ceramic membrane reactor by embedding sensors that will collect current and impedance data during operation. Artificial intelligence will be used to understand the large amounts of data and predict reactor failure under harsh operating conditions. | Idaho National Laboratory | No | |||||||||||||||||||
123 | Tailoring the Properties of Multiphase Materials Through the Use of Correlative Microscopy and Machine Learning | U.S. Department of Energy | This research uses state-of-the-art machine learning (ML) techniques in a new and novel manner to identify and correlate the critical microstructural features in a multiphase alloy that exhibits high strength and fracture toughness. Experimental data will be used to train a convolutional neural network (CNN) in a semi-supervised environment to identify key microstructural features and correlate those features with the strength and toughness. The resulting machine learning tool can be trained for additional microstructural features, different alloys, and/or target mechanical properties. | Idaho National Laboratory | No | |||||||||||||||||||
124 | Microstructurally-driven Framework for Optimization of In-core Materials | U.S. Department of Energy | This research will develop a methodology that relies on mechanism-informed machine learning models, rapid ion irradiation and creep testing techniques, and advanced characterization coupled with automated image analysis to enable reactor developers to quickly understand the complex linkage between alloy composition, thermomechanical processing, the resulting microstructure, and swelling and creep behavior. This project will (1) develop and demonstrate a high-potential methodology for rapid development of future in-core materials and (2) provide critically important information on alloy design for optimized swelling and creep behavior to the advanced reactor development community. | Idaho National Laboratory | No | |||||||||||||||||||
125 | Use of random forest model to predict exposure pathways | Environmental Protection Agency | Prioritizing the potential risk posed to human health by chemicals requires tools that can estimate exposure from limited information. In this study, chemical structure and physicochemical properties were used to predict the probability that a chemical might be associated with any of four exposure pathways leading from sources (consumer/near-field, dietary, far-field industrial, and far-field pesticide) to the general population. The balanced accuracies of these source-based exposure pathway models range from 73 to 81%, with the error rate for identifying positive chemicals ranging from 17 to 36%. The exposure pathways were then used to organize predictions from 13 different exposure models as well as other predictors of human intake rates. A consensus meta-model was created using the Systematic Empirical Evaluation of Models (SEEM) framework, in which the predictors of exposure were combined by pathway and weighted according to predictive ability for chemical intake rates inferred from human biomonitoring data for 114 chemicals. The consensus model yields an R2 of ~0.8. Extrapolation predicts the relevant pathway(s), median intake rate, and credible interval for 479,926 chemicals, mostly with minimal exposure information. This approach identifies 1,880 chemicals for which the median population intake rate may exceed 0.1 mg/kg bodyweight/day, while there is 95% confidence that the median intake rate is below 1 μg/kg BW/day for 474,572 compounds. | |||||||||||||||||||||
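The consensus step described above can be sketched as a weighted combination of individual model predictions, with weights reflecting each model's predictive ability against biomonitoring-inferred intakes. All model names and numbers below are invented for illustration; the actual SEEM weights were calibrated on the 114-chemical dataset.

```python
# Hypothetical per-pathway predictions (log10 mg/kg-BW/day intake) from
# three exposure models, and invented weights standing in for each
# model's calibrated predictive ability.
predictions = {"modelA": -4.2, "modelB": -3.8, "modelC": -5.0}
weights = {"modelA": 0.5, "modelB": 0.3, "modelC": 0.2}

# Weighted consensus intake estimate for one chemical on one pathway.
consensus = (sum(predictions[m] * weights[m] for m in predictions)
             / sum(weights.values()))
print(round(consensus, 2))  # -4.24
```

In the real framework this weighting is done per pathway, and the calibration also yields credible intervals rather than a single point estimate.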
126 | Records Categorization | Environmental Protection Agency | The records management technology team is using machine learning to predict the retention schedule for records. The machine learning model will be incorporated into a records management application to help users apply retention schedules when they submit new records. | |||||||||||||||||||||
127 | Enforcement Targeting | Environmental Protection Agency | EPA’s Office of Compliance, in partnership with the University of Chicago, built a proof-of-concept to improve enforcement of environmental regulations through facility inspections by the EPA and state partners. The resulting predictive analytics showed a 47% improvement of identifying violations of the Resource Conservation and Recovery Act. | |||||||||||||||||||||
128 | Acquisition Analytics | General Services Administration | No further info - says in development | |||||||||||||||||||||
129 | Category Taxonomy Refinement Using NLP | General Services Administration | No further info - says in development | |||||||||||||||||||||
130 | Chatbot for Federal Acquisition Community (not in development yet) | General Services Administration | No further info - says in development | |||||||||||||||||||||
131 | City Pairs Program Ticket Forecast and Scenario Analysis Tool | General Services Administration | No further info - says in development | |||||||||||||||||||||
132 | Classifying Qualitative Data with Medallia | General Services Administration | No further info - says in development | |||||||||||||||||||||
133 | Contract Acquisition Lifecycle Intelligence (CALI) | General Services Administration | No further info - says in development | |||||||||||||||||||||
134 | Enterprise Brain | General Services Administration | No further info - says in development | |||||||||||||||||||||
135 | Key KPI Forecasts for GWCM | General Services Administration | No further info - says in development | |||||||||||||||||||||
136 | OAS Kudos Chatbot | General Services Administration | No further info - says in development | |||||||||||||||||||||
137 | ServiceNow Generic Ticket Classification | General Services Administration | No further info - says in development | |||||||||||||||||||||
139 | Survey Comment Ham / Spam tester | General Services Administration | No further info - says in development | |||||||||||||||||||||
140 | Insight | SSA -- Office of Analytics, Review, and Oversight; Office of Hearing Operations, Office of Disability Information Systems | Insight is decision support software used by hearings and appeals-level Disability Program adjudicators to help maximize the quality, speed, and consistency of their decision-making. Insight analyzes the free text of disability decisions and other case data to offer adjudicators real-time alerts on potential quality issues and case-specific reference information within a web application. It also offers adjudicators a series of interactive tools to help streamline their work. Adjudicators can leverage these features to speed their work and fix issues before the case moves forward (e.g., to another reviewing employee or to the claimant). Insight’s features are powered by several natural language processing and artificial intelligence packages and techniques. | |||||||||||||||||||||
141 | Intelligent Medical Language Analysis Generation (IMAGEN) | SSA -- Office of Disability Determinations, Office of Disability Information Systems | IMAGEN analyzes clinical text from disability applicants' health records and transforms it into data and other useful formats to enable disability adjudicators to more easily find and identify clinical content that is relevant to SSA’s disability determination process. IMAGEN leverages various Artificial Intelligence (AI) machine learning technologies, including Natural Language Processing (NLP), data modeling, and predictive analytics, to provide new tools and services that improve the organization and visualization of specific medical encounters, medical reports, and lab results, improving the efficiency and consistency of disability determinations and decisions. The IMAGEN analytics platform will also support other high-priority agency initiatives such as Continuing Disability Reviews (CDR). | |||||||||||||||||||||
142 | Duplicate Identification Process (DIP) | SSA -- Office of Disability Information Systems, Office of Hearing Operations, Office of Appellate Operations | DIP's objective is to help the user identify and flag duplicate pages and documents within the disability electronic folder more efficiently, reducing the amount of task time associated with preparing cases for SSA's ALJ Hearings. DIP uses artificial intelligence software in the form of image recognition technology to accurately identify document and page duplication that is consistent with SSA policy. | |||||||||||||||||||||
143 | Handwriting recognition from forms | SSA -- Office of Operations / Wilkes-Barre Direct Operation Center | Artificial Intelligence (AI) performs Optical Character Recognition (OCR) against handwritten entries on specific standard forms submitted by clients. This use case is in support of a Robotic Process Automation (RPA) effort as well as a standalone use. | |||||||||||||||||||||
144 | Artificial Intelligence physical therapy app | Veterans Affairs | This app is a physical therapy support tool. It is a data source agnostic tool which takes input from a variety of wearable sensors and then analyzes the data to give feedback to the physical therapist in an explainable format. | |||||||||||||||||||||
145 | Artificial intelligence coach in cardiac surgery | Veterans Affairs | The artificial intelligence coach in cardiac surgery infers misalignment in team members’ mental models during complex healthcare task execution. Of interest are safety-critical domains (e.g., aviation, healthcare), where lack of shared mental models can lead to preventable errors and harm. Identifying model misalignment provides a building block for enabling computer-assisted interventions to improve teamwork and augment human cognition in the operating room. | |||||||||||||||||||||
146 | AI Cure | Veterans Affairs | AICURE is a phone app that monitors adherence to orally prescribed medications during clinical or pharmaceutical sponsor drug studies. | |||||||||||||||||||||
147 | Acute kidney injury (AKI) | Veterans Affairs | This project, a collaboration with Google DeepMind, focuses on detecting acute kidney injury (AKI), ranging from minor loss of kidney function to complete kidney failure. The artificial intelligence can also detect AKI that may be the result of another illness. | |||||||||||||||||||||
148 | Assessing lung function in health and disease | Veterans Affairs | Health professionals can use this artificial intelligence to determine predictors of normal and abnormal lung function and sleep parameters. | |||||||||||||||||||||
149 | Automated eye movement analysis and diagnostic prediction of neurological disease | Veterans Affairs | Artificial intelligence recursively analyzes previously collected data to both improve the quality and accuracy of automated algorithms, as well as to screen for markers of neurological disease (e.g. traumatic brain injury, Parkinson's, stroke, etc). | |||||||||||||||||||||
150 | Automatic speech transcription engines to aid scoring neuropsychological tests. | Veterans Affairs | Automated speech transcription engines analyze the cognitive decline of older VA patients. Digitally recorded speech responses are transcribed using multiple artificial intelligence-based speech-to-text engines. The transcriptions are fused together to reduce or obviate the need for manual transcription of patient speech in order to score the neuropsychological tests. | |||||||||||||||||||||
151 | CuraPatient | Veterans Affairs | CuraPatient is a remote tool that allows patients to better manage their conditions without having to see a provider. Driven by artificial intelligence, it allows patients to create a profile to track their health, enroll in programs, manage insurance, and schedule appointments. | |||||||||||||||||||||
152 | Digital command center | Veterans Affairs | The Digital Command Center seeks to consolidate all data in a medical center and apply predictive and prescriptive analytics to allow leaders to better optimize hospital performance. | |||||||||||||||||||||
153 | Disentangling dementia patterns using artificial intelligence on brain imaging and electrophysiological data | Veterans Affairs | This collaborative effort focuses on developing a deep learning framework to predict the various patterns of dementia seen on MRI and EEG and explore the use of these imaging modalities as biomarkers for various dementias and epilepsy disorders. The VA is performing retrospective chart review to achieve this. | |||||||||||||||||||||
154 | Machine learning (ML) for enhanced diagnostic error detection and ML classification of protein electrophoresis text | Veterans Affairs | Researchers are performing chart review to collect true/false positive annotations and construct a vector embedding of patient records, followed by similarity-based retrieval of unlabeled records "near" the labeled ones (semi-supervised approach). The aim is to use machine learning as a filter, after the rules-based retrieval, to improve specificity. Embedding inputs will be selected high-value structured data pertinent to stroke risk and possibly selected prior text notes. | |||||||||||||||||||||
155 | Behavidence | Veterans Affairs | Behavidence is a mental health tracking app. Veterans download the app onto their phone and it compares their phone usage to that of a digital phenotype that represents people with confirmed diagnosis of mental health conditions. | |||||||||||||||||||||
156 | Machine learning tools to predict outcomes of hospitalized VA patients | Veterans Affairs | This is an IRB-approved study which aims to examine machine learning approaches to predict health outcomes of VA patients. It will focus on the prediction of Alzheimer's disease, rehospitalization, and Clostridioides difficile infection. | |||||||||||||||||||||
157 | Nediser reports QA | Veterans Affairs | Nediser is a continuously trained artificial intelligence “radiology resident” that assists radiologists in confirming the X-ray properties in their radiology reports. Nediser can select normal templates, detect hardware, evaluate patella alignment and leg length and angle discrepancy, and measure Cobb angles. | |||||||||||||||||||||
158 | Precision medicine PTSD and suicidality diagnostic and predictive tool | Veterans Affairs | This model interprets various real time inputs in a diagnostic and predictive capacity in order to forewarn episodes of PTSD and suicidality, support early and accurate diagnosis of the same, and gain a better understanding of the short and long term effects of stress, especially in extreme situations, as it relates to the onset of PTSD. | |||||||||||||||||||||
159 | Prediction of Veterans' Suicidal Ideation following Transition from Military Service | Veterans Affairs | Machine learning is used to identify predictors of veterans' suicidal ideation. The relevant data come from a web-based survey of veterans’ experiences within three months of separation and every six months after for the first three years after leaving military service. | |||||||||||||||||||||
160 | PredictMod | Veterans Affairs | PredictMod uses artificial intelligence to determine if predictions can be made about diabetes based on the gut microbiome. | |||||||||||||||||||||
161 | Predictor profiles of OUD and overdose | Veterans Affairs | Machine learning prediction models evaluate the interactions of known and novel risk factors for opioid use disorder (OUD) and overdose in Post-9/11 Veterans. Several machine learning classification-tree modeling approaches are used to develop predictor profiles of OUD and overdose. | |||||||||||||||||||||
162 | Provider directory data accuracy and system of record alignment | Veterans Affairs | AI is used to add value as a transactor for intelligent identity resolution and linking. AI also has a domain cache function that can be used for both Clinical Decision Support and for intelligent state reconstruction over time and real-time discrepancy detection. As a synchronizer, AI can perform intelligent propagation and semi-automated discrepancy resolution. AI adapters can be used for inference via OWL and logic programming. Lastly, AI has long term storage (“black box flight recorder”) for virtually limitless machine learning and BI applications. | |||||||||||||||||||||
163 | Seizure detection from EEG and video | Veterans Affairs | Machine learning algorithms use EEG and video data from a VHA epilepsy monitoring unit in order to automatically identify seizures without human intervention. | |||||||||||||||||||||
164 | SoKat Suicidal Ideation Detection Engine | Veterans Affairs | The SoKat Suicide Ideation Engine (SSIE) uses natural language processing (NLP) to improve identification of Veteran suicide ideation (SI) from survey data collected by the Office of Mental Health (OMH) Veteran Crisis Line (VCL) support team (VSignals). | |||||||||||||||||||||
165 | Using machine learning to predict perfusionists’ critical decision-making during cardiac surgery | Veterans Affairs | A machine learning approach is used to build predictive models of perfusionists’ decision-making during critical situations that occur in the cardiopulmonary bypass phase of cardiac surgery. Results may inform future development of computerized clinical decision support tools to be embedded into the operating room, improving patient safety and surgical outcomes. | |||||||||||||||||||||
166 | Gait signatures in patients with peripheral artery disease | Veterans Affairs | Machine learning is used to improve treatment of functional problems in patients with peripheral artery disease (PAD). Previously collected biomechanics data is used to 1) determine the representative gait signatures of patients with PAD and 2) assess the ability of limb acceleration measurements to identify and model the meaningful biomechanics measures from PAD data. | |||||||||||||||||||||
167 | Medication Safety (MedSafe) Clinical Decision Support (CDS) | Veterans Affairs | Using VA electronic clinical data, the Medication Safety (MedSafe) Clinical Decision Support (CDS) system analyzes current clinical management for diabetes, hypertension, and chronic kidney disease, and makes patient-specific, evidence-based recommendations to primary care providers. The system uses knowledge bases that encode clinical practice guideline recommendations and an automated execution engine to examine multiple comorbidities, laboratory test results, medications, and history of adverse drug events in evaluating patient clinical status and generating patient-specific recommendations. | |||||||||||||||||||||
168 | Prediction of health outcomes, including suicide death, opioid overdose, and decompensated outcomes of chronic diseases. | Veterans Affairs | Using electronic health records (EHR) (both structured and unstructured data) as inputs, this tool outputs deep phenotypes and predictions of health outcomes including suicide death, opioid overdose, and decompensated outcomes of chronic diseases. | |||||||||||||||||||||
169 | VA-DoE Suicide Exemplar Project | Veterans Affairs | The VA-DoE Suicide Exemplar project is currently utilizing artificial intelligence to improve VA's ability to identify Veterans at risk for suicide through three closely related projects that all involve collaborations with the Department of Energy. | |||||||||||||||||||||
170 | Machine learning models to predict disease progression among veterans with hepatitis C virus | Veterans Affairs | A machine learning model is used to predict disease progression among veterans with hepatitis C virus. | |||||||||||||||||||||
171 | Prediction of biologic response to thiopurines | Veterans Affairs | Using CPRS and CDW data, artificial intelligence is used to predict biologic response to thiopurines among Veterans with inflammatory bowel disease. | |||||||||||||||||||||
172 | Predicting hospitalization and corticosteroid use as a surrogate for IBD flares | Veterans Affairs | This work examines data from 20,368 Veterans Health Administration (VHA) patients with an inflammatory bowel disease (IBD) diagnosis between 2002 and 2009. Longitudinal labs and associated predictors were used in random forest models to predict hospitalizations and steroid usage as a surrogate for IBD flares. | |||||||||||||||||||||
173 | Predicting corticosteroid free endoscopic remission with Vedolizumab in ulcerative colitis | Veterans Affairs | This work uses random forest modeling on a cohort of 594 patients treated with Vedolizumab (VDZ) to predict the outcome of corticosteroid-free biologic remission at week 52 in a testing cohort. Models were constructed using baseline data or data through week 6 of VDZ therapy. | |||||||||||||||||||||
174 | Use of machine learning to predict surgery in Crohn’s disease | Veterans Affairs | Machine learning analyzes patient demographics, medication use, and longitudinal laboratory values collected between 2001 and 2015 from adult patients in the Veterans Integrated Service Networks (VISN) 10 cohort. The data was used for analysis in prediction of Crohn’s disease and to model future surgical outcomes within 1 year. | |||||||||||||||||||||
175 | Reinforcement learning evaluation of treatment policies for patients with hepatitis C virus | Veterans Affairs | A reinforcement learning model is used to evaluate treatment policies for patients with hepatitis C virus. | |||||||||||||||||||||
176 | Predicting hepatocellular carcinoma in patients with hepatitis C | Veterans Affairs | This prognostic study used data on patients with hepatitis C virus (HCV)-related cirrhosis in the national Veterans Health Administration who had at least 3 years of follow-up after the diagnosis of cirrhosis. The data was used to examine whether deep learning recurrent neural network (RNN) models that use raw longitudinal data extracted directly from electronic health records outperform conventional regression models in predicting the risk of developing hepatocellular carcinoma (HCC). | |||||||||||||||||||||
177 | Computer-aided detection and classification of colorectal polyps | Veterans Affairs | This study is investigating the use of artificial intelligence models for improving clinical management of colorectal polyps. The models receive video frames from colonoscopy video streams and analyze them in real time in order to (1) detect whether a polyp is in the frame and (2) predict the polyp's malignant potential. | |||||||||||||||||||||
178 | GI Genius (Medtronic) | Veterans Affairs | The Medtronic GI Genius aids in detection of colon polyps through artificial intelligence. | |||||||||||||||||||||
179 | Extraction of family medical history from patient records | Veterans Affairs | This pilot project uses TIU documentation on African American Veterans aged 45-50 to extract family medical history data and identify Veterans who are at risk of prostate cancer but have not undergone prostate cancer screening. | |||||||||||||||||||||
180 | VA/IRB approved research study for finding colon polyps | Veterans Affairs | This IRB-approved research study uses a randomized trial for finding colon polyps with artificial intelligence. | |||||||||||||||||||||
181 | Interpretation/triage of eye images | Veterans Affairs | Artificial intelligence supports triage of eye patients cared for through telehealth, interprets eye images, and assesses health risks based on retina photos. The goal is to improve diagnosis of a variety of conditions, including glaucoma, macular degeneration, and diabetic retinopathy. | |||||||||||||||||||||
182 | Screening for esophageal adenocarcinoma | Veterans Affairs | National VHA administrative data is used to adapt tools that use electronic health records to predict the risk for esophageal adenocarcinoma. | |||||||||||||||||||||
183 | Social determinants of health extractor | Veterans Affairs | AI is used with clinical notes to identify social determinants of health (SDOH) information. The extracted SDOH variables can be used during associated health related analysis to determine, among other factors, whether SDOH can be a contributor to disease risks or healthcare inequality. | |||||||||||||||||||||
184 | Inventory Item Replenishment MLR Modeling POC - Phase 1 | Treasury | Build and evaluate a multiple linear regression model to determine whether the replenishment of an inventory item was received before or after its need-by date, in order to predict the likelihood that an item will be received on time in the future. | Planned (not in production) | ||||||||||||||||||||
185 | Inventory Item Replenishment MLR Modeling POC - Phase 2 | Treasury | Build and evaluate a multiple linear regression model to determine whether the replenishment of an inventory item would be received by the standard Need By time of 128 days set for all inventory items. | In production: less than six months | ||||||||||||||||||||
186 | Collection Chat Bot | Treasury | The Natural Language Understanding (NLU) model will be located inside the eGain intent engine. The NLU will take customer typed-text input (utterances), map each utterance to a specific intent, and return the appropriate knowledge article. | In production: less than six months | ||||||||||||||||||||
187 | Collection Voice Bot | Treasury | The NLU model will be located inside the Automated Collections IVR (ACI) main menu. The NLU will take customer speech input (utterances), map each utterance to a specific intent, and direct the taxpayer down a certain call path. | In production: less than six months | ||||||||||||||||||||
188 | Evaluate Multilingual BERT for Software Translation Use Case Evaluations | Treasury | The project is evaluating the cost-effectiveness of training a multilingual BERT model on IRS corpora and using the model as a means to evaluate software translation output of IRS content. The framework leverages the COMET, ROUGE, and BLEU measures. The product will also be assessed for English-only and Spanish-only content classification. | |||||||||||||||||||||
189 | In production: more than one year | Treasury | In production: less than six months | |||||||||||||||||||||
190 | LB&I Text Analytics (including Appeals Case Management) | Treasury | Trained text extraction and tax domain-specific BERT models (called TaxBERT) using about 190,000 documents including Internal Revenue Code, Internal Revenue Manual, and PDFs from irs.gov, Revenue Rulings, Private Letter Rulings, Revenue Procedures, Treasury Decisions, and other legal tax-related documents. The extracted text was decomposed into 21 million sentences with 1 million unique tokens. Further filtering refinement resulted in 11 million unique sentences and 31 thousand | In production: less than six months | ||||||||||||||||||||
191 | Line Anomaly Recommender | Treasury | This use case seeks to identify a workload selection model that uses two recommender system models to measure overall compliance risk and identify anomalous tax returns and line-item values. The delivered pipeline capabilities can supplement the core case selection model processes by providing additional insight to IRS LB&I reviewers through the use of advanced deep learning techniques for anomaly detection. | In production: less than one year | ||||||||||||||||||||
192 | NRP Redesign | Treasury | Deploy innovative active learning methods to provide a lower opportunity cost method of estimating a compliance baseline to support tax gap estimation, improper payments reporting, development and validation of workload identification and selection models, and inform policy analysis. System inputs require existing NRP data which provide an acceptable level of precision and quality for an acceptable level of data quality output. | In production: less than one year | ||||||||||||||||||||
193 | Projected Contract Award Date Web App | Treasury | Projected contract award dates are generated with a machine learning model that statistically predicts when procurement requests will become signed contracts. Input data includes funding information, date / time of year, and individual Contract Specialist workload. The model outputs projected contract award timeframes for specific procurement requests. 'When will a contract be signed?' is a key question for the IRS and generally for the federal government. This tool gives insight about when each request is likely to turn into a contract. The tool provides a technique other federal agencies can implement, potentially affecting $600 billion in government contracts. Weblink: https://www.irs.gov/newsroom/irs-announces-use-of-projected-contract-award-date-web-app-that-predicts-when-contracts-will-be-signed | |||||||||||||||||||||
194 | SBSE Issue Recommender | Treasury | Developed an AI-based recommender for detecting potential non-compliance issues which makes training returns selection more efficient | In production: less than 6 months | ||||||||||||||||||||
196 | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | TRUE | REDACTED | ||||
197 | Automated Delay detection using voice processing | Department of Transportation | Federal Aviation Administration | ATO | In order to get a full accounting of delay, automated voice detection of ATC and aircraft interaction is required. Many delay events, such as vectoring, are not currently reported/detected/accounted for and voice detection would enable automated detection. | In production: less than 1 year | Initial development | Natural Language Processing; | Yes | Agency Generated | Yes | N/A | Yes | Yes | N/A | Yes | Yes | Wilbur | FALSE | |||||
198 | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | TRUE | REDACTED | ||||
199 | Power Platform Use | Department of Transportation | Federal Aviation Administration | ATO | We use ChatGPT for code-writing assistance. | In production: less than 1 year | NA | ChatGPT | No | Other | No | NA | No | NA | No | No | NA | FALSE | ||||||
200 | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | TRUE | REDACTED | ||||
201 | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | TRUE | REDACTED | ||||
202 | Surface Report Classifier (SCM/Auto-Class) | Department of Transportation | Federal Aviation Administration | ATO | SCM classifies surface incident reports by event type, such as Runway Incursion, Runway Excursion, and Taxiway Incursion/Excursion, and further categorizes runway incursions by severity type (Category A, B, C, D, E). | In production: more than 1 year | Refinements planned for future release | Support Vector Machines, Gradient boosting, neural networks, natural language processing | Yes | Agency Generated | Yes | NA | Yes | Yes | NA | Yes | Yes | AIT/EIM Platform | FALSE | |||||
203 | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | TRUE | REDACTED | ||||
204 | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | TRUE | REDACTED | ||||
205 | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | TRUE | REDACTED | ||||
206 | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | TRUE | REDACTED | ||||
207 | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | TRUE | REDACTED | ||||
208 | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | TRUE | REDACTED | ||||
209 | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | TRUE | REDACTED | ||||
210 | Regulatory Compliance Mapping Tool | Department of Transportation | Federal Aviation Administration | AVS | The AVS International office is required to identify means of compliance to ICAO Standards and Recommended Practices (SARPs). Both SARPs and means-of-compliance evidence are text paragraphs scattered across thousands of pages of documents. AOV identified a need to find each SARP, evaluate the text of many FAA Orders, and suggest evidence of compliance based upon the evaluation of the text. The base dataset used by RCMT is the documents’ texts deconstructed into paragraphs. RCMT runs all the documents’ paragraphs through Natural Language Processing (NLP) (this process has an AI aspect) to extract the meaning (semantics) of the text. RCMT then employs a recommender system (also using some AI technology) that takes the texts, augmented by the texts’ meaning, to establish candidate matches between the ICAO SARPs and FAA text that provides means of compliance. | Planned (not in production) | User Acceptance Testing to begin early spring '22 | ML (Recommender Algorithm), NLP | Yes | Agency Generated | No | FAA Orders and ICAO Annexes | No | No | NA | NA | NA | NA | FALSE | NA | ||||
211 | JASC Code classification in Service Difficulty Reports (SDR) | Department of Transportation | Federal Aviation Administration | AVS | AVS identified a need to derive the joint aircraft system codes (JASC) chapter codes from the narrative description within service difficulty reports (SDR), a form of safety event reporting from aircraft operators. A team of graduate students at George Mason University collaborated with AVS employees to apply Natural Language Processing (NLP) and Machine Learning to predict JASC codes. This method can be used to check SDR entries to ensure the correct codes were provided or to assign a code when one was not. | Planned (not in production) | NA | NLP, ML Classification | Yes | Agency Generated | Yes | NA | Yes | No | NA | Yes | Yes | Service Difficulty Reporting System (SDRS) | FALSE | NA | ||||
212 | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | TRUE | REDACTED | ||||
213 | Offshore Precipitation Capability (OPC) | Department of Transportation | Federal Aviation Administration | NextGen (ANG) | OPC leverages data from several sources such as weather radar, lightning networks, satellite and numerical models to produce a radar-like depiction of precipitation. The algorithm then applies machine learning techniques based on years of satellite and model data to improve the accuracy of the location and intensity of the precipitation areas. | In production: more than 1 year | OPC runs in a pseudo-operational capacity via a webpage maintained by the Massachusetts Institute of Technology - Lincoln Lab, as well as in a test and evaluation capacity in a research mode. | AI, ML via a Convolutional Neural Network | Yes | Agency Generated | No | Yes | No | FALSE | ||||||||||
214 | Course Deviation Identification for Multiple Airport Route Separation (MARS) | Department of Transportation | Federal Aviation Administration | Aviation Safety (AVS) | The Multiple Airport Route Separation (MARS) program is developing a safety case for reduced separation standards between Performance Based Navigation (PBN) routes in terminal airspace. These new standards may enable deconfliction of airports in high-demand metropolitan areas, including the Northeast Corridor (NEC), North Texas, and Southern California. To build necessary collision risk models for the safety case, several models are needed, including one that describes the behavior of aircraft that fail to navigate the procedure correctly. These events are very rare and difficult to identify with standard data sources. Prior work has used Machine Learning to filter incident data to identify similar events on departure procedures. | Planned (not in production) | NA | Python in Jupyter Labs, ML, NLP | Yes | Agency Generated | No | NA | Yes | No | NA | Yes | Yes | CEDAR MORs/EORs, Pilot Deviation Reports | FALSE | Sensitive communications. | ||||
215 | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | TRUE | REDACTED | ||||
216 | Determining Surface Winds with Machine Learning Software | Department of Transportation | Federal Aviation Administration | ANG | Successfully demonstrated use of an AI capability to analyze camera images of a wind sock to produce highly accurate surface wind speed and direction information in remote areas that don’t have a weather observing sensor. | Planned (not in production) | Successfully tested but not in production. | AI | Yes | Agency Generated | No | No | No | FALSE | ||||||||||
217 | Remote Oceanic Meteorological Information Operations (ROMIO) | Department of Transportation | Federal Aviation Administration | ANG | ROMIO is an operational demonstration to evaluate the feasibility of uplinking convective weather information to aircraft operating over the ocean and remote regions. The capability converts weather satellite data, lightning data, and weather prediction model data into areas of thunderstorm activity and cloud top heights. AI is used to improve the accuracy of the output based on previous activity compared to ground truth data. | Planned (not in production) | Technical transfer of capability to industry planned this summer. | AI, ML via a Convolutional Neural Network | Yes | Agency Generated | No | Yes | No | FALSE | ||||||||||
218 | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | TRUE | REDACTED | ||||
219 | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | TRUE | REDACTED | ||||
220 | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | TRUE | REDACTED | ||||
221 | Machine Learning for Occupant Safety Research | Department of Transportation | National Highway Traffic Safety Administration | NSR Human Injury Research Division | Description: Use deep learning models to predict head kinematics directly from crash videos. Deep learning enables the extraction of 3D kinematics from 2D views, offering a viable alternative for calculating head kinematics when sensors are absent, inadequate, or produce low-quality data. Input: Vehicle crash videos Output: Angular velocity - injury prediction | Planned (not in production) | Proof of Concept completed and published | Deep learning models - Convolutional Neural Networks, Long-Short Term Memory based Recurrent Neural Networks | Yes | Agency Generated | Yes | NA | Yes | No | NA | Yes | FALSE | |||||||
222 | Machine Learning for Occupant Safety Research | Department of Transportation | National Highway Traffic Safety Administration | NSR Human Injury Research Division | Description: Use deep learning to predict crash parameters, Delta-V (change in velocity) and PDOF (principal direction of force), directly from real-world crash images. Delta-V and PDOF are the two most important parameters affecting injury outcome. Deep learning models can predict both Delta-V and PDOF without the need to run WinSmash software for Delta-V computation and without requiring estimations by crash examiners. Moreover, deep learning models can produce Delta-V and PDOF within milliseconds, providing rapid results for improved efficiency. Input: Real-world crash images Output: Delta-V & PDOF | Planned (not in production) | Currently under development | Deep learning models - Convolutional Neural Networks | Yes | Agency Generated | Yes | Data is taken from NASS-CDS, CISS and CIREN databases, which are publicly available | Yes | No | NA | Yes | FALSE | |||||||
223 | PHMSA Rule Making | Department of Transportation | Pipeline and Hazardous Materials Safety Administration (PHMSA) | PHMSA Office of Chief Counsel (PHC) | Artificial Intelligence Support for Rulemaking - Using ChatGPT to support the rulemaking process, providing significant efficiencies, reduced effort, and the ability to scale efforts for unusual levels of public scrutiny or interest (e.g., comments on a rulemaking). ChatGPT will be used to provide: 1. Sentiment Analysis – whether the comment is positive, negative, or neutral toward the proposed rule. 2. Relevance Analysis – whether a particular posted comment is relevant to the proposed rule. 3. A synopsis of each posted comment. 4. Cataloging of comments. 5. Identification of duplicate comments. | Planned (not in production) | This is a pilot initiative | ChatGPT, NLP | Yes | Agency Generated | Yes | N/A | Yes | No | N/A | Yes | Yes | Office Automation for Administrative Systems Support | FALSE | |||||
224 | Crushed Aggregate Gradation Evaluation System | Department of Transportation | Federal Railroad Administration | Office of Research, Development and Technology | Description: Deep learning computer vision algorithms aimed at analyzing aggregate particle size grading. Input: Images of ballast cross sections Output: Ballast fouling index | In production: less than 1 year | NA | FALSE | ||||||||||||||||
225 | Automatic Track Change Detection Demonstration and Analysis | Department of Transportation | Federal Railroad Administration | Office of Research, Development and Technology | Description: DeepCNet-based neural network to identify and classify track-related features (e.g., track components, such as fasteners and ties) for "change detection" applications. Input: Line-scan images from rail-bound inspection systems Output: Notification of changes from status quo or between different inspections based on geolocation. | In production: more than 1 year | NA | FALSE | ||||||||||||||||
226 | Development of Predictive Analytics Using Autonomous Track Geometry Measurement System (ATGMS) Data | Department of Transportation | Federal Railroad Administration | Office of Research, Development and Technology | Description: Leveraging large volumes of these recursive track geometry measurements to develop and implement automated machine-learning-based processes for analyzing, predicting, and reporting track locations of concern, including those with significant rates of degradation. Input: Track geometry measurements and exceptions Output: Inspection report that includes the trending of track geometry measures and time to failure (i.e., maintenance and safety limits). | In production: more than 1 year | NA | FALSE | ||||||||||||||||
227 | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | REDACTED | TRUE | REDACTED | ||||
228 | Bureau of Global Public Affairs (GPA) | CLIPSLAB | GPA’s production media collection and analysis system that pulls data from half a dozen different open and commercial media clips services to give an up-to-date global picture of media coverage around the world. | DOL | ||||||||||||||||||||
229 | Bureau of Global Public Affairs (GPA) | MISSION PRESS DIGEST | A prototype system that collects and analyzes the daily media clips reports from about 70 different Embassy Public Affairs Sections. | |||||||||||||||||||||
230 | Bureau of Global Public Affairs (GPA) | DIGITAL COMMUNICATIONS DATABASE | GPA’s production system for collecting, analyzing, and summarizing the global digital content footprint of the Department. | |||||||||||||||||||||
231 | Bureau of Global Public Affairs (GPA) | FACEBOOK AD TEST OPTIMIZATION SYSTEM | GPA’s production system for testing potential messages at scale across segmented foreign sub-audiences to determine effective outreach to target audiences. | |||||||||||||||||||||
232 | Bureau of Global Public Affairs (GPA) | GLOBAL AUDIENCE SEGMENTATION FRAMEWORK | GPA’s prototype framework for predicting how content or messages designed for one audience some place in the world will resonate with other audiences outside the United States. | |||||||||||||||||||||
233 | Bureau of Global Public Affairs (GPA) | MACHINE-LEARNING ASSISTED MEASUREMENT AND EVALUATION OF PUBLIC OUTREACH | A high-performing classifier capable of measuring the level of six different emotions that a text evokes, including time-series analysis, to help evaluate historical messages and predict successful future public messaging. | |||||||||||||||||||||
234 | Bureau of Global Public Affairs (GPA) | GPA TOOLS AND GPAIX | AI-enabled analysis package for automating public outreach analysis. | |||||||||||||||||||||
235 | Bureau of Political-Military Affairs (PM) | NLP TO PULL KEY INFORMATION FROM UNSTRUCTURED TEXT | Use natural language processing to extract information from document text to help summarize and allow for analysis more efficiently than manual methods. | |||||||||||||||||||||
236 | Bureau of Political-Military Affairs (PM) | K-MEANS CLUSTERING INTO TIERS | Cluster countries into tiers based on data collected from open source and Bureau data using k-means clustering. | |||||||||||||||||||||
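The tiering approach in this row is a standard k-means workflow: standardize country indicators, then assign each country to the nearest of k cluster centers. As a generic illustration only (not the Bureau's actual code; the input features are hypothetical), a minimal Lloyd's-algorithm k-means might look like:

```python
import numpy as np

def kmeans_tiers(features, k=3, iters=100, seed=0):
    """Cluster rows of `features` (countries x indicators) into k tiers
    with a minimal Lloyd's-algorithm k-means. Returns per-row tier labels."""
    rng = np.random.default_rng(seed)
    X = np.asarray(features, dtype=float)
    # Standardize indicators so no single scale dominates the distance.
    X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-9)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each country to its nearest center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster empties.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels
```

In practice a library implementation (e.g., scikit-learn's `KMeans`) with multiple restarts would be preferred; the sketch only shows the mechanics of assigning entities to tiers.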
237 | Global Engagement Center (GEC) | DISINFORMATION TOPIC MODELING | Text clustering and topic modeling of documents and social media to determine possible disinformation subjects and topics. | |||||||||||||||||||||
238 | Global Engagement Center (GEC) | DEEPFAKE DETECTOR | Deep learning model that takes in an image containing a person’s face and classifies the image as either being real (contains a real person’s face) or fake (synthetically generated face, a deepfake often created using Generative Adversarial Networks) to predict disinformation activities. | |||||||||||||||||||||
239 | Global Engagement Center (GEC) | TEXT SIMILARITY DETECTION | Tool to identify different texts that are identical or nearly identical by calculating cosine similarity between each pair of texts. Texts are then grouped if they share high cosine similarity and then available for analysts to review further. | |||||||||||||||||||||
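The pairwise cosine-similarity grouping described in this row can be sketched with plain bag-of-words vectors; this is a generic illustration of the technique, not the GEC's tool, and the 0.9 threshold is an arbitrary stand-in:

```python
import numpy as np
from collections import Counter

def cosine_groups(texts, threshold=0.9):
    """Group near-identical texts: build bag-of-words vectors, compute
    pairwise cosine similarity, and merge pairs above `threshold`."""
    vocab = sorted({w for t in texts for w in t.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    M = np.zeros((len(texts), len(vocab)))
    for r, t in enumerate(texts):
        for w, c in Counter(t.lower().split()).items():
            M[r, index[w]] = c
    # Normalize rows so the dot product equals the cosine similarity.
    M /= np.linalg.norm(M, axis=1, keepdims=True) + 1e-12
    sim = M @ M.T
    # Simple union-find to merge texts connected by a high-similarity pair.
    parent = list(range(len(texts)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for i in range(len(texts)):
        for j in range(i + 1, len(texts)):
            if sim[i, j] >= threshold:
                parent[find(j)] = find(i)
    return [find(i) for i in range(len(texts))]
```

Texts sharing a group id would then be queued together for analyst review, matching the workflow the row describes.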
240 | Global Engagement Center (GEC) | IMAGE CLUSTERING FOR DISINFORMATION DETECTION | AI solutions and processes to identify similar images in order to analyze how images are used to spread and build traction with disinformation narratives. | |||||||||||||||||||||
241 | Global Engagement Center (GEC) | LOUVAIN COMMUNITY DETECTION | Uses social network analysis to cluster nodes together into “communities” to detect clusters of accounts possibly spreading disinformation. | |||||||||||||||||||||
242 | Office of U.S. Foreign Assistance Resources (F) | NLP FOR FOREIGN ASSISTANCE APPROPRIATIONS ANALYSIS | Natural language processing application to automate and streamline the extraction of earmarks and directives from the annual appropriations bill to facilitate the Department’s adherence to congressional direction. | |||||||||||||||||||||
243 | Office of Management Strategy and Solutions (M/SS) | DEPARTMENT CABLES ANALYTICS | Natural language processing analysis of Department cables reporting to inform multiple areas of Department policy and operations. | |||||||||||||||||||||
244 | CSO | Automated Burning Detection | The Village Monitoring System program uses AI and machine learning to conduct daily scans of moderate resolution commercial satellite imagery to identify anomalies using the near-infrared band. | |||||||||||||||||||||
245 | CSO | Automated Damage Assessments | The Conflict Observatory program uses AI and machine learning on moderate and high-resolution commercial satellite imagery to document a variety of war crimes and other abuses in Ukraine, including automated damage assessments of a variety of buildings, including critical infrastructure, hospitals, schools, crop storage facilities. | |||||||||||||||||||||
246 | Wildlife Underpass Camera Trap Image Classification, San Diego CA | Department of the Interior | USGS | Tracey, Jeff | jatracey@usgs.gov | Janice Gordon, janice_gordon@usgs.gov | This software system takes wildlife camera trap images as inputs and outputs the probability of the image belonging to user-specified taxonomic classes based on wildlife species present in each image. (Wildlife camera traps are motion-triggered, time lapse, and other camera systems placed in the field to capture images of wildlife at the location and times where and when the cameras are placed.) The process of humans reviewing, labeling, and QA/QCing labels is labor intensive, time consuming, and costly. Developing AI systems that can perform these tasks within an acceptable level of accuracy can reduce the costs in extracting tabular data from camera-based datasets and increase the volume of data for analysis. The system supports training experiments where model hyperparameters and training dataset characteristics can be varied to find those that are more optimal for training. Training, validation, and testing datasets have human-assigned labels and are used to train and evaluate the models. Once trained, the models can be used to predict classes on unlabeled images. We use a convolutional neural network (CNN) approach based on TensorFlow and training is run on the USGS Tallgrass supercomputer designed for AI/ML workflows. | Planned (not in production) | in development, testing and training experimentation | convolutional neural networks | Yes | Agency Generated | No | NA | Yes | No | NA | Yes | Yes | No | CSS/Ecosystems | |||
247 | Walrus Haulout Camera Trap Image Classification | Department of the Interior | FWS | USGS | Tracey, Jeff | jatracey@usgs.gov | Janice Gordon, janice_gordon@usgs.gov | This project extends the application of codes developed for wildlife underpass camera trap image classification. Similarly, the system takes walrus haulout camera trap images as inputs and outputs the probability of the image containing walruses and various human disturbances (boats, aircraft, etc.). We will use and further develop the previous system's capability of supporting training experiments. Training, validation, and testing datasets have human-assigned labels and are used to train and evaluate the models. Once trained, the models can be used to predict classes on unlabeled images from ongoing camera monitoring efforts. We use a convolutional neural network (CNN) approach based on TensorFlow and training is run on the USGS Tallgrass supercomputer designed for AI/ML workflows. | Planned (not in production) | in development | convolutional neural networks | Yes | Agency Generated | No | NA | CSS/Ecosystems | ||||||||
248 | ARMI Amphibian Species ID from Acoustic Data | Department of the Interior | USGS | Tracey, Jeff | jatracey@usgs.gov | Janice Gordon, janice_gordon@usgs.gov | The mission of the USGS Amphibian Research and Monitoring Initiative (ARMI) is to provide essential scientific information to managers to help arrest or reverse amphibian population declines. Acoustic monitoring of amphibian (anuran) vocalizations is a core technique used by ARMI researchers. Reviewing audio recordings and identifying species vocalizations captured therein is time consuming and labor intensive. For these reasons, many recordings remain unprocessed, preventing valuable data from being available for analysis. Our goal is to train convolutional neural networks (CNNs) that take audio clips that have been converted to sonograms (images) and classify the species generating the vocalizations in the recordings. Our initial prototype project will attempt to develop models that can identify audio clips containing bullfrog (*Lithobates catesbeianus*, which are native in some parts of the US and a destructive invasive species in others) vocalizations. The software will be built using the TensorFlow Python API and training will be performed on the USGS Tallgrass supercomputer. | Planned (not in production) | in development | convolutional neural networks | Yes | Agency Generated | No | NA | CSS/Ecosystems |||||||||
249 | Individual Mountain Lion ID from Camera Data | Department of the Interior | USGS | Tracey, Jeff | jatracey@usgs.gov | Janice Gordon, janice_gordon@usgs.gov | This system will take pairs of mountain lion (*Puma concolor*) facial images and output the probability that the images come from the same individual mountain lion. This will allow researchers to passively "mark" individuals and support population estimation analyses. We will use a "Siamese" convolutional neural network architecture that has been used in other facial recognition and motion tracking applications. | Planned (not in production) | in development | convolutional neural networks | Yes | Agency Generated | No | NA | CSS/Ecosystems |||||||||
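A Siamese architecture scores an image pair by running both images through the same embedding network (shared weights) and comparing the embeddings. The sketch below shows only that scoring step: `embed` is a random-projection stand-in for the trained convolutional branch, so the probabilities illustrate the mechanics, not a real model.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for the shared convolutional branch of a Siamese network:
# a fixed random projection from a flattened image to a 32-d embedding.
W = rng.standard_normal((32, 64 * 64))

def embed(image):
    """Map a 64x64 grayscale face image to an embedding (shared weights)."""
    x = np.asarray(image, dtype=float).ravel()
    return np.tanh(W @ x / np.sqrt(x.size))

def same_individual_probability(img_a, img_b):
    """Sigmoid of the negative L1 distance between the two embeddings:
    identical images score 0.5, dissimilar pairs score lower."""
    d = np.abs(embed(img_a) - embed(img_b)).sum()
    return 1.0 / (1.0 + np.exp(d))
```

In the real system both the embedding branch and the comparison layer would be trained end to end on labeled same/different pairs (e.g., with a contrastive or binary cross-entropy loss).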
250 | Walrus Object Detection in Drone/Satellite Imagery | Department of the Interior | USGS | Tracey, Jeff | jatracey@usgs.gov | Janice Gordon, janice_gordon@usgs.gov | This system, once developed, will input drone imagery and output bounding boxes for individual walruses. If successful, this will allow researchers with the Alaska Science Center to count the numbers of walruses in drone imagery to support population research. The system will use TensorFlow-based convolutional neural networks for object detection trained on the USGS Tallgrass supercomputer. | Planned (not in production) | convolutional neural networks | Yes | Agency Generated | No | NA | CSS/Ecosystems ||||||||||
251 | PRObability of Streamflow PERmanence | Department of the Interior | USGS | Roy Sando | tsando@usgs.gov | Kristin Jaeger, kjaeger@usgs.gov | The PROSPER modeling framework was developed to incorporate sparse streamflow observation data representing wet or dry stream conditions and gridded hydroclimatic explanatory data to predict the annual probability of streamflow permanence at 30-m (PROSPER Pacific Northwest) or 10-m (PROSPER Upper Missouri) resolution. The training data are point observations of wet or dry at locations in the Pacific Northwest or Upper Missouri River basin. The PROSPER models were primarily developed using the FCPGTools (Barnhart, Sando, et al., 2020), R, and USGS HPC resources (Yeti). | In production: more than 1 year | Models have been calibrated and outputs published for the Pacific Northwest. Models for other domains (Upper Missouri River basin) are currently in review. | Random Forest Classification | Yes | Other | No | https://doi.org/10.5066/P96F7RX8, https://doi.org/10.5066/F7BV7FSP | Yes | No | No | No | No | Water | ||||
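The PROSPER row describes random forest classification of wet/dry point observations against gridded hydroclimatic predictors, with the class probability reported as the probability of streamflow permanence. As a hedged illustration of that general pattern only (synthetic features and a toy labeling rule, not the actual PROSPER code or data):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)

# Synthetic stand-ins for gridded hydroclimatic explanatory variables;
# the real PROSPER predictors and training points come from USGS releases.
n = 400
precip = rng.uniform(200, 2000, n)    # annual precipitation, mm
temp = rng.uniform(-2, 18, n)         # mean annual temperature, deg C
drainage = rng.uniform(0.1, 500, n)   # upstream drainage area, km^2
X = np.column_stack([precip, temp, drainage])
# Toy labeling rule for illustration: wetter, larger basins stay wet.
y = (precip + 2 * drainage > 1200).astype(int)  # 1 = wet, 0 = dry

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
# Predicted probability of streamflow permanence for an unseen grid cell.
p_wet = model.predict_proba([[1500.0, 8.0, 100.0]])[0, 1]
```

Applied over every 30-m or 10-m grid cell, the per-cell `p_wet` values form the probability surface the row describes.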
252 | Water Mission Area Drought Prediction Project | Department of the Interior | USGS | Stacey Archfield | sarch@usgs.gov | Roy Sando, tsando@usgs.gov | The goal of this project is to develop a method for predicting daily hydrologic drought using machine learning models calibrated on streamflow data (response) and meteorological forcing data. Models will be built at individual gages across CONUS, then transferred to ungaged basins using a 'donor model' approach that identifies which gages are most similar to the ungaged basin and combines the models from those gages for the final prediction. Models will be developed and run on the USGS HPC systems. | Planned (not in production) | Random forest regression, random forest classification | Yes | Agency Generated | No | NA | Yes | Yes | Yes | Yes | No | Water | |||||
253 | Water Mission Area Regional Drought Early Warning System | Department of the Interior | USGS | John Hammond | jhammond@usgs.gov | Roy Sando, tsando@usgs.gov | The goal of this project is to build and test multiple ML models for predicting and forecasting daily hydrologic drought in the Colorado River Basin (CRB). Similar to the project listed in line 9, we use gridded meteorologic forcing data and daily streamflow data in the CRB to build random forest and neural networks (long-short term memory) to determine the best approach to predicting and forecasting hydrologic drought. The project is being developed on AWS and in cooperation with CHS. We are also using the USGS HPC systems. | Planned (not in production) | Random forest regression, random forest classification, random survival forests, neural networks, long-short term memory, recurrent neural networks | Yes | Agency Generated | No | NA | Yes | Yes | Yes | Yes | Water | ||||||
254 | AI system to recognize individual fish and disease | Department of the Interior | Collaboration with Dr. Sheng Li, Department of Computer Sciences, University of Georgia | USGS | Hitt, Nathaniel | nhitt@usgs.gov | Ben Letcher, bletcher@usgs.gov | This study focuses on the development of an AI system to recognize individual fish and their disease status from images. Success of this effort could complement or replace traditional mark-recapture methods used for estimating abundance, survival, and movement, and this could greatly reduce costs to fisheries managers. Likewise, disease detection from images could enable new approaches for assessing status and trends in fish health. | In production: less than 1 year | Research on-going to optimize model parameters and data pipeline | Convolutional Neural Networks | No | Agency Generated | Yes | Yes | Beta version: https://code.usgs.gov/marami/Fish-AI/-/releases/fish | Yes | Yes | No | Ecosystems | ||||
255 | River Image SEnsing | Department of the Interior | USGS | Lotspeich, Russ | rlotspei@usgs.gov | Frank Engel, fengel@usgs.gov | The River Image Sensing (RISE) project is charged with the development of a reliable camera system for integration into the operational streamgage monitoring network of the USGS Water Mission Area. In addition to capturing images and videos, the RISE system will be capable of producing time-series of surface water levels derived from still camera images using AI/ML modeling techniques. | In production: less than 1 year | Proof-of-concept completed in FY21. Now researching limitations, requirements, and operational implementation approaches. | Convolutional Neural Networks | Yes | Agency Generated | No | Yes | No | Yes | No | Water ||||||
256 | Estimating stream flow from images in headwaters | Department of the Interior | Collaboration with Dr Xiaowei Jia (University of Pittsburgh), Drs Tony Chang and Amrita Gupta (Conservation Science Partners), and Dr Jeff Walker (Walker Environmental Research) | USGS | Letcher, Ben | bletcher@usgs.gov | Jenn Fair, jfair@usgs.gov | The goals of this project are to 1) measure how much water flows in small, ungaged stream networks using timelapse images captured by inexpensive, off-the-shelf cameras and 2) provide a web-based platform that makes the images, associated climate and other related data, and the model itself easy to access and explore. Data for training come from user-uploaded imagery and flow data (when available). The database is available for uploading and image viewing here: https://www.usgs.gov/apps/ecosheds/fpe/ | In production: less than 1 year | Initial models have been trained. Working on estimating relative flow from sites with no gage data based on human-annotated pair-wise comparisons of images. Also working on understanding transferability of models to sites with less data. | Convolutional Neural Networks | Yes | Other | No | Yes | No | Yes | Yes | No | Ecosystems ||||
257 | Economic valuation of fisheries in the Delaware River | Department of the Interior | Collaboration with Dr Xiaowei Jia (University of Pittsburgh) | USGS | Letcher, Ben | bletcher@usgs.gov | Jenn Fair, jfair@usgs.gov | The goal is to link existing hydrological flow data (e.g., USGS stream gages) and models (e.g., USGS Process-Guided Deep Learning Models for flow and temperature) with trout population dynamic models, changes to fish catch, and the economic benefits of recreational fishing. These trout population dynamic models will be developed based on observational data, existing literature estimates, and existing models. | In production: less than 6 months | Model training will begin in the 2nd quarter of FY22 | Convolutional Neural Networks | Yes | Other | No | Yes | No | Yes | Yes | No | Ecosystems ||||
258 | Stream physical habitat characterization in the Chesapeake Bay Watershed | Department of the Interior | USGS | Cashman, Matthew | mcashman@usgs.gov | The project objective is to take a large dataset of rapid habitat assessment data collected by multiple jurisdictions in the Chesapeake Bay Watershed, train a predictive model using those data, and use that model to predict stream habitat conditions for all unmeasured stream reaches in the region. The model is able to generate predictions for multiple aspects of physical habitat condition. The model directly connects to EPA's database containing the training data, enabling it to be rapidly updated when new data updates occur. | In production: less than 1 year | End-to-end workflows have all been developed. Preliminary models have been trained and models continue to be iterated for improvements using additional feature datasets. Models will be finalized end of Q3 FY22. | Random Forest Regression, XGBoost | Yes | From Another Agency | No | Yes | No | Yes | Yes | No | Water ||||||
259 | Deep Learning for Automated Detection and Classification of Waterfowl, Seabirds, and other Wildlife from Digital Aerial Imagery | Department of the Interior | USFWS, BOEM, and the University of California- Berkeley | USGS | Landolt, Kyle | klandolt@usgs.gov | Jennifer Dieck, jdieck@usgs.gov | Our project includes two stages of AI, the first is a binary detector to automate the detection of wildlife in aerial imagery and the second is a robust classification algorithm to automate the taxonomic classification of wildlife from the binary detector. The input of the first and second stage are manually annotated polygons around targets of interest and their taxonomic classification values from family to species, respectively. We use Tallgrass to develop and train our algorithms, BlackPearl/Caldera to store our large image datasets, a hosted instance of a customized version of the Computer Vision Annotation Tool to gather manually annotated data, and a separate PostgreSQL database to store annotations and image metadata. | In production: less than 1 year | Model training for wildlife detection is complete, further training, tuning, and optimization for detection and classification set for FY22, large production-level tests set for FY22 | Convolutional neural networks | Yes | Other | No | NA | Yes | Yes | Yes | Yes | No | Ecosystems | |||
260 | Prediction of Regolith Thickness in the Delaware River Basin | Department of the Interior | USGS | Goodling, Phillip | pgoodling@usgs.gov | Fleming, Brandon, bjflemin@usgs.gov; Stackelberg, Paul, pestack@usgs.gov; Belitz, Ken, kbelitz@usgs.gov | This project uses observations of the depth to bedrock reported by private well drillers in the Delaware River Basin to train a Random Forest model to map the thickness of the regolith layer. This data product will support groundwater and hydrologic modeling efforts in the basin. | In production: less than 1 year | Preliminary models developed, optimization is underway. | Random Forest Classification and Regression | Yes | Other | Yes | Yes | No | Yes | Yes | No | Water |||||
261 | ML-Mondays course on applications of deep learning to image analysis | Department of the Interior | USGS | Buscombe, Daniel | dbuscombe@contractor.usgs.gov | jwarrick@usgs.gov | A course on the application of deep learning to image segmentation, image classification, and object-in-image detection. The course includes software written in Python using the Keras and TensorFlow ML libraries, software documentation, data, a website, and slides. See the course website https://dbuscombe-usgs.github.io/MLMONDAYS/ for more details | In production: more than 1 year | Course is completed and has been taken by more than 100 participants | Deep convolutional neural networks; ResNet, MobileNet, UNet, RetinaNet | Yes | Yes | Yes | No | Yes | Yes | No | Natural Hazards ||||||
262 | Coast Train | Department of the Interior | USGS | Buscombe, Daniel | dbuscombe@contractor.usgs.gov | pwernette@usgs.gov; jwarrick@usgs.gov | Coast Train is a multi-labeler ML-ready dataset of orthomosaic and satellite images of coastal, estuarine, and wetland environments and corresponding thematic label masks. The data include spatial and time-series imagery and contain 1.2 billion labeled pixels, representing over 3.6 million hectares. | In production: less than 6 months | Data release and journal article being prepared for publication | Doodler: https://github.com/dbuscombe-usgs/dash_doodler | No | Yes | in progress | Yes | Yes | Yes | Yes | No | Natural Hazards |||||
263 | Seabird and Marine Mammal Surveys Near Potential Renewable Energy Sites Offshore Central and Southern California | Department of the Interior | USGS | Adams, Josh | josh_adams@usgs.gov | cahorton@usgs.gov | The Seabird Studies Team at the Western Ecological Research Center (WERC), with support from the Bureau of Ocean Energy Management (BOEM), completed aerial photographic surveys of the ocean off central and southern California between 2018-2021. Over 800,000 high resolution images of the ocean were collected, with the goal of extracting and counting marine birds and mammals contained within. To process this volume of images, machine learning offered the best methodology, but publicly available training data did not exist for this specific purpose. Through a collaboration with Conservation Metrics, Inc., we created a labeled training dataset using Faster R-CNN models via active learning and transfer learning. We then evaluated a set of candidate models trained on different label aggregation schemes, selected a final model utilizing YOLOv5 architecture, and ran the final model on the complete image dataset. Images output from the final model classified targets into seven categories: bird, dark bird, dark bird flying, light bird, fish, marine mammal, and other. We are currently reviewing the final model output for false positives and negatives to evaluate performance. Next, we will reclassify model labels to the lowest taxonomic group possible. This manual review is occurring in the USGS cloud environment (Amazon Web Services) utilizing the open-source Computer Vision Annotation Tool (CVAT). Once low taxonomic reclassification is complete, we will generate maps of species distribution and abundance to inform BOEM’s planning in advance of potential offshore wind energy development along the California coast. | Review final model output | Full image review is being conducted to determine model performance on real-world data. Manual low taxonomic identification of targets identified to grouped classes by model in imagery. | Active learning, transfer learning, deep learning, convolutional neural networks (Faster R-CNN, YOLOv5) | No | Collaborator Generated | No | NA | No | No | NA | NA | Yes | No | Ecosystems |||
265 | Fouling Identification Neural Network (FINN) | Department of the Interior | USGS | Katoski, Michelle | mkatoski@usgs.gov | Cashman, Matthew mcashman@usgs.gov | Our product is an end-to-end system that is used to predict and detect sensor (sonde) fouling at USGS stream gages. The system is trained using supervised learning on multiple features derived from archived stream gage data labelled by expert field technicians. The system operates in real time on Amazon Web Services (AWS), providing predictions every 30 minutes based on raw data collected from the USGS AQUARIUS database. The system produces values estimating the likelihood that fouling is currently present and the likelihood of fouling occurring in the next 24 hours. These values are displayed on a Tableau dashboard that is connected to AWS using Amazon Athena. This dashboard also displays other stream gage network monitoring information from AQUARIUS, such as the time since a sonde was last visited by a technician. | In production: 1 year | Our models currently provide real time predictions in a production environment. Model tuning, training and testing continued for 6 months after deploying the first iteration in the production environment. | Long short term memory (LSTM) models | Yes | Agency Generated | Yes | In progress | Yes | Yes | In progress | Yes | Yes | No | Water |||
266 | Mapping river bathymetry from remotely sensed data | Department of the Interior | USGS | Legleiter, Carl | cjl@usgs.gov | We are using high frequency satellite images from the Planetscope constellation to estimate water depth in river channels. The short time lags between images allow us to average multiple scenes collected on the same day or within a couple of days to improve accuracy. In addition to established depth retrieval methods, we developed a neural network regression approach for this purpose. The training data consist of field measurements of water depth collected as part of other USGS projects on five different rivers. The neural network regression method is implemented in MATLAB using the Deep Learning Toolbox. | Planned (not in production) | Prototype code developed; manuscript describing this approach currently under review | Neural network regression | Yes | Agency Generated | No | USGS ScienceBase data releases: https://doi.org/10.5066/P9HZL7BZ; https://doi.org/10.5066/P9K54WDL; https://doi.org/10.5066/F7Q52NZ1; https://doi.org/10.5066/P92PNWE5; https://doi.org/10.5066/P9S4T8YM | Yes | No | Yes | Yes | Yes | Manuscript currently in review | Water ||||
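The row above names neural network regression implemented in MATLAB's Deep Learning Toolbox. As a rough Python analogue of the idea only (regressing depth on a log band-ratio predictor, with synthetic Beer-Lambert-style reflectances standing in for real imagery and field measurements):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)

# Synthetic "imagery": two spectral bands attenuating with depth at
# different rates (Beer-Lambert-style), plus a little sensor noise.
depth = rng.uniform(0.2, 5.0, 300)                       # field-measured depth, m
green = np.exp(-0.1 * depth) + rng.normal(0, 0.01, 300)
red = np.exp(-0.5 * depth) + rng.normal(0, 0.01, 300)
# Log band ratio: a common depth-retrieval predictor, ~linear in depth here.
X = np.log(np.clip(green, 1e-6, None) / np.clip(red, 1e-6, None)).reshape(-1, 1)

# Small feed-forward network regressing depth on the spectral predictor.
model = MLPRegressor(hidden_layer_sizes=(16,), max_iter=3000,
                     random_state=0).fit(X, depth)
r2 = model.score(X, depth)  # coefficient of determination on training data
```

In an operational pipeline the trained network would then be applied pixel-by-pixel to averaged Planetscope scenes to map bathymetry across a reach.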
266 | Mapping benthic algae along the Buffalo National River from remotely sensed data | Department of the Interior | USGS | Legleiter, Carl | cjl@usgs.gov | This study involves using orthophotos acquired from a manned, fixed-wing aircraft and multispectral images from two different satellites to map bottom-attached (benthic) algae along the Buffalo National River in northern Arkansas. The training data for this effort consist of field observations of water depth and percent cover of benthic algae along 8-10 cross sections from two distinct reaches of the Buffalo River. These field data are used to train a bagged trees (aka random forest) classification algorithm to distinguish among four ordinal levels of algal density: none, low, medium, and high. | Planned (not in production) | Prototype code developed; manuscript describing this approach currently under review | Bagged trees (aka random forest) classification | Yes | Agency Generated | No | Will be made available through USGS ScienceBase catalog | Yes | No | Yes | Yes | Yes | Manuscript currently in review | Water | ||||
267 | Characterization of Sub-surface drainage (tile drains) from satellite imagery | Department of the Interior | USGS | Williamson, Tanja | tnwillia@usgs.gov | Without knowing how tile-drain extent (sub-surface agricultural drainage) has changed with time, it is difficult to differentiate how streamflow and water quality have changed as a result of the spatial extent and characteristics of tile-drain networks. Our method delineates tile drains in satellite imagery, providing a way to look at historical imagery and to use satellite data to maintain an up-to-date geospatial layer of tile-drain extent in basins of interest. We use panchromatic imagery that is processed using a UNet model that was trained on a library of panchromatic images on which visible tile-drain networks had been traced. Our workflow uses a combination of Python scripting that is encapsulated in a Jupyter notebook; the entire process is open source. | We have the final version of the UNet model. We are working on a manuscript to document the model and workflow. | Code and publication are still in progress. Answers from question 7 onward describe the planned future state. | convolutional neural networks | Yes | Other | Yes | Yes | will be - but not yet there. | No | Water | ||||||||
268 | Waterfowl Life History and Behavior Classification | Department of the Interior | USGS | Casazza, Michael | mike_casazza@usgs.gov | Cory Overton, coverton@usgs.gov | The model we developed provides a highly accurate daily classification of waterfowl behavior into 8 life history states/movement patterns using hourly GPS relocations and, optionally, remotely sensed habitat data. This will provide waterfowl researchers and managers a tool for real-time capable rapid assessments and notification of important life history events to improve research and management outcomes and reduce project operational costs. | Planned (not in production) | extreme gradient boosted classification, stochastic gradient descent (LinearLearner®), multi-layer perceptron | Yes | Agency Generated | No | Upon manuscript acceptance, and IPDS review | Yes | Upon manuscript acceptance, and IPDS review | NA | Yes | Yes | No | Ecosystems | ||||
269 | Spot Elevation OCR from historical topo maps | Department of the Interior | USGS | Arundel, Samantha | sarundel@usgs.gov | The goal of this project is to create a database of summit spot elevations from the HTMC labeled for summits in CONUS. | In production: more than 1 year | convolutional neural networks | Yes | Agency Generated | No | na | Yes | Yes | Yes | Yes | No | CSS | ||||||
270 | Terrain feature detection and recognition | Department of the Interior | USGS | Arundel, Samantha | sarundel@usgs.gov | The objective of this project is to use deep learning (DL) tools to extract terrain features. | In production: more than 1 year | convolutional neural networks | Yes | Agency Generated | No | na | Yes | Yes | Yes | Yes | No | CSS | ||||||
271 | The National Land Cover Database | Department of the Interior | USGS | Dewitz, Jon | dewitz@usgs.gov | NLCD uses AI/ML to develop land cover across all 50 states. The system includes HPC processes, cloud services, and local resources to create thematic and continuous field classifications. These classifications serve as the base for users and federal agencies across the nation to provide wildlife habitat estimates, urban runoff estimates, population growth, etc. | In production | CNNs, NNs, and decision trees | No | www.mrlc.gov | Yes | Yes | CSS | |||||||||||
272 | Artificial Intelligence for Environment & Sustainability (ARIES) | Department of the Interior | USGS | Bagstad, Kenneth | kjbagstad@usgs.gov | ARIES is an international research project based at the Basque Centre for Climate Change (Bilbao, Spain), in which USGS has been a long-term collaborator. ARIES uses semantics and machine reasoning to enable AI-assisted multidisciplinary, integrated modeling of coupled human-natural systems. See https://aries.integratedmodelling.org/ and https://docs.integratedmodelling.org/technote/ for full details. ARIES is a full-stack solution for integrated modelling, supporting the production, curation, linking and deployment of scientific artifacts such as datasets, data services, modular model components and distributed computational services. Its purpose is to ensure — by design rather than just intent — that the pool of such artifacts constitutes a seamless knowledge commons, readily actionable (by humans and machines) through a full realization of the linked data paradigm augmented with semantics and powered by artificial intelligence. This design enables automation of a wide range of modeling tasks that would normally require human experts to perform. ARIES’ underlying software stack, called k.LAB, includes client and server components that support the creation, maintenance and use of a distributed semantic web platform where scientific information can be stored, published, curated and connected. The software is licensed through the Affero General Public License (AGPL) v.3.0. | In production: more than 1 year | Project originated in 2007, with increasing public functionality (IDE: 2013, Web Explorer: 2018, first Web Application: 2021) | Semantics, machine reasoning, various ML algorithms (e.g., KNN, Bayesian structural learning) | No | Yes | No | Yes | No | No | Ecosystems | ||||||||
273 | Global Inland Fisheries Risk Index | Department of the Interior | USGS | Lynch, Abigail | ajlynch@usgs.gov | Daniel Wieferich, dwieferich@usgs.gov | We applied coupled manual and machine learning methods to an expansive literature set for major global inland fisheries to explore opportunities for improving user efficiency for linking anthropogenic drivers of environmental change to direct impacts. This work informs the relative influence of threats in the development of a global inland fisheries assessment using boosted regression trees to derive a spatially-explicit risk index of stressors. | In production: more than 1 year | Manuscript in prep for publication | Natural language processing; boosted regression trees | No | No | Not yet | Yes | No | Not yet | No | No | No | Ecosystems / CSS | ||||
274 | Fish and Climate Change Database (FiCli) | Department of the Interior | USGS | Lynch, Abigail | ajlynch@usgs.gov | Craig Paukert, cpaukert@usgs.gov | The Fish and Climate Change Database (FiCli) is a comprehensive database of peer-reviewed literature compiled through an extensive, systematic primary literature review to identify English-language, peer-reviewed journal publications with projected and documented examples of climate change impacts on inland fishes globally. We are currently exploring options to automate certain portions of the review process to increase our efficiency in maintaining and updating the database. | In production: less than 1 year | natural language processing | No | Yes | Yes | No | No | Ecosystems | |||||||||
275 | Evaluating fish movement in restored coastal wetlands using imaging sonar and machine learning models | Department of the Interior | USGS | Kowalski, Kurt | kkowalski@usgs.gov | Alexandra Bozimowski, abozimowski@usgs.gov | Wetland managers are restoring coastal wetland habitats in the Great Lakes, and often seek more information on when and how fish access restored habitats. Terabytes of hydroacoustic data on fish movement need to be analyzed more efficiently, so a collaboration between USGS, USFWS, and the University of Michigan is developing a machine learning model (MLM) that identifies, tracks, and quantifies fish movement. The completed model will read proprietary sonar image files, convert them to a universal file format (i.e., .mp4), place bounding boxes around individual fish detected by the model, and track them across consecutive image frames to determine bi-directional movement. The model uses training data and TensorFlow-based convolutional neural networks for object detection. Post-processing uses sonar geometry to estimate length of individual fish, and output files include labeled videos showing bounding boxes (.avi format), model metrics (.txt format), an enumeration of bi-directional fish movement (i.e., left or right) in .csv format, and individual fish length estimates (.csv format). Model output will allow USGS researchers to estimate fish habitat use and associated community metrics in restored wetland habitats. This information will support USFWS and other management agencies restoring coastal wetland habitats. | In production: less than 1 year | Currently adding additional parameters and testing model accuracy on different training datasets | convolutional neural networks | No | Agency Generated | No | NA | Yes | No | NA | Yes | Yes | No | Ecosystems | |||
276 | Fluvial Fish Native Distributions for the Conterminous United States using the NHDPlusV2.1 and Boosted Regression Tree (BRT) Models | Department of the Interior | USGS | Wieferich, Daniel | dwieferich@usgs.gov | Wiltermuth, Mark, mwiltermuth@usgs.gov | Species distribution models are developed for 271 fluvial fish species in their native ranges of the conterminous United States. Boosted Regression Tree (BRT) models were used to develop presence/absence predictions for each of the National Hydrography Dataset Plus Version 2.1 stream segments within a species' native range. Landscape data that describe the natural variation (e.g., slope, precipitation) and anthropogenic impacts (e.g., stream fragmentation) were summarized to stream segments and used as predictor variables. Native species ranges were used to geographically constrain distribution modeling efforts. R Version 4.0.2 (or newer) with the ‘dismo’ and ‘labdsv’ packages is used for modeling. | In production: less than 6 months | in review for release; additional modeling techniques such as Bayesian Additive Regression Tree (BART) and Maxent approaches are also being explored | boosted regression tree (BRT) | Yes | Agency Generated | Yes | https://doi.org/10.5066/P9V390V2 (doi will be activated after review completed) | Yes | No | Yes | Yes | No | CSS | ||||
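The project above fits its BRT models in R with ‘dismo’; a rough Python analogue of boosted-tree presence/absence prediction can be sketched with scikit-learn. The per-segment predictors and the presence rule below are synthetic, for illustration only:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
# Hypothetical per-segment predictors: slope, precipitation, fragmentation
X = rng.uniform(0.0, 1.0, size=(600, 3))
# Synthetic presence/absence rule (not a real species response)
presence = (X[:, 1] > 0.4 * X[:, 0] + 0.2).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, presence, random_state=1)
brt = GradientBoostingClassifier(n_estimators=300, learning_rate=0.05,
                                 max_depth=3, random_state=1)
brt.fit(X_tr, y_tr)
acc = brt.score(X_te, y_te)           # held-out accuracy
prob = brt.predict_proba(X_te)[:, 1]  # probability of presence per segment
```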
277 | Prediction of Inland Salinity in the Delaware River Basin | Department of the Interior | USGS | Smith, Jared | jsmith@usgs.gov | Alison Appling, aappling@usgs.gov | Once developed, the system will input watershed characteristics (soils, land cover), land use (road salt application), and meteorological time series, and output predictions of specific conductance (SC) for inland stream reaches in the Delaware River Basin (DRB). The model will be trained using SC sample data from within the DRB. The resulting model will allow for predictions in ungaged locations and time periods, and allow for an evaluation of salinity exposure in these stream reaches. The model will be built using PyTorch on the USGS Tallgrass supercomputer. | Planned (not in production) | in development | random forests, recurrent graph convolutional neural networks | Yes | Agency Generated | No | Our processed training and testing data are not yet ready to be shared. The raw data are publicly available at the link below and on NWIS for the listed years. Harmonized Water Quality Portal data: https://www.sciencebase.gov/catalog/item/5e010424e4b0b207aa033d8c | Yes | No | In development here: https://github.com/USGS-R/drb-inland-salinity-ml | Yes | Yes | No | Water | |||
278 | Prediction of Salt Front Location in the Delaware River Estuary | Department of the Interior | USGS | Gorski, Galen | ggorski@usgs.gov | Alison Appling, aappling@usgs.gov | We are developing a machine learning model to make predictions of the 250 mg/L isochlor (salt front location) within the Delaware River Estuary. The model will be driven by river discharge into the estuary, tidal forcings, and meteorological data from several points throughout the estuary. Model predictions will be compared with a process-based, hydrodynamic model, COAWST. The machine learning model is currently in development, but it will consist of a recurrent neural network architecture built using tools from PyTorch. | Planned (not in production) | Have initial results, plan to have presentation-ready results by 6/2022 | recurrent neural networks | Yes | From Another Agency | No | Once data are processed and prepared they will be released in a publicly available data release | Yes | No | In development here: https://github.com/USGS-R/drb-estuary-salinity-ml | Yes | Yes | No | Water | |||
279 | Prediction of Water Temperature in the Delaware River Basin | Department of the Interior | USGS | Oliver, Samantha | soliver@usgs.gov | Alison Appling, aappling@usgs.gov | We published a machine learning model to make water temperature predictions at 456 reaches in the Delaware River Basin. The recurrent graph convolutional network (RGCN) was pre-trained with predictions from a coupled process-based model that predicts stream flow and temperature (the Precipitation Runoff Modeling System with the coupled Stream Network Temperature Model or PRMS-SNTemp). | In production: more than 1 year | Model published in Jia et al 2021 | recurrent graph convolutional neural network | Yes | No | No | Water | ||||||||||
280 | Forecasting Water Temperature in the Delaware River Basin | Department of the Interior | USGS | Zwart, Jacob | jzwart@usgs.gov | Alison Appling, aappling@usgs.gov | We developed a process-guided deep learning and data assimilation approach to operationally produce 7-day forecasts of daily maximum stream water temperature downstream of drinking water reservoirs in support of water management decisions. Our process-guided deep learning model was pretrained on output from an integrated stream-reservoir process-based model and used an autoregressive technique and data assimilation to ingest real-time observations of stream temperature to improve near-term forecasts. Our modeling system produced forecasts of daily maximum water temperature with an average root mean squared error (RMSE) from 1.1 to 1.4°C for 1-day-ahead and 1.4 to 1.9°C for 7-day-ahead forecasts across all sites. | In production: less than 1 year | Model published at https://doi.org/10.31223/X55K7G | long short-term memory network | Yes | Agency Generated | Yes? | Yes | Yes | Yes | Yes | No | Water | |||||
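The forecasting row above combines a deep learning model with data assimilation of real-time observations. The published system's exact update scheme is not reproduced here; the core idea of nudging a forecast toward an observation can be sketched as an inverse-variance weighted blend, with all numbers hypothetical:

```python
def assimilate(forecast, observation, var_f, var_o):
    """Blend a model forecast with an observation, weighting each by the
    inverse of its error variance (a scalar Kalman-style update)."""
    gain = var_f / (var_f + var_o)  # trust the observation more when the
                                    # forecast is uncertain (large var_f)
    return forecast + gain * (observation - forecast)

# Hypothetical 1-day-ahead daily maximum water temperature (degrees C)
updated = assimilate(forecast=21.0, observation=19.5, var_f=1.0, var_o=0.25)
# gain = 0.8, so updated = 21.0 + 0.8 * (19.5 - 21.0) = 19.8
```

The updated state then seeds the next forecast cycle, which is how the autoregressive technique described above propagates observed information forward.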
281 | Prediction of Flood Flow Metrics for Minimally Altered Catchments | Department of the Interior | Funded by the Federal Highway Administration (FHWA) | USGS | Smith, Jared | jsmith@usgs.gov | Once developed, the system will input watershed characteristics (soils, land cover) and long-term meteorological data, and output predictions of flood flow metrics (magnitude, duration, frequency, volume) for stream reaches. Two models will be trained using gage data from regions surrounding the Delaware River Basin and the Colorado River Basin. The resulting models will allow for estimating flood flow metrics in ungaged reaches, which can be used to inform infrastructure designs along those reaches (e.g., bridges). The current deliverable is predictions for minimally altered catchments, and future years may expand to predictions in altered catchments (e.g., those with dam regulation). The models will be built using various R packages on the USGS Tallgrass supercomputer. | Planned (not in production) | in development | random forest, boosted trees, artificial neural network | Yes | Agency Generated | No | Our processed training and testing data are not yet ready to be shared. The raw data are publicly available on NWIS. | Yes | No | In development here: https://github.com/USGS-R/regional-hydrologic-forcings-ml | Yes | Yes | No | Water | |||
282 | Process-Guided Deep Learning Predictions of Lake Water Temperature | Department of the Interior | USGS | Read, Jordan | jread@usgs.gov | Alison Appling, aappling@usgs.gov | This process-guided deep learning model predicts depth-specific lake temperatures while obeying physical laws using inputs of meteorological drivers. Training consists of two stages. In the first stage, the model is pre-trained using process-based modeling outputs. Then, lake temperature observations are used to fine-tune the model in a second training stage. The models are trained to simultaneously fit observations and honor conservation of energy. The models were developed using various R and Python packages on the USGS Tallgrass supercomputer. The General Lake Model (GLM version 2) software was used for process-based modeling. | In production: more than 1 year | Published at https://doi.org/10.1029/2019WR024922 | long short-term memory network | Yes | Agency Generated | Yes | Yes | No | Yes | Yes | No | Water | |||||
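The two-stage training above (pretrain on abundant process-model output, then fine-tune on sparse observations) can be sketched with scikit-learn's `warm_start`, standing in for the project's LSTM stack. The drivers, the synthetic "process-model" temperatures, and the 0.5-degree observation offset are all invented for illustration:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
# Stage 1: pretrain on abundant process-based (GLM-style) model output
X = rng.uniform(0.0, 1.0, size=(1000, 3))         # meteorological drivers
y_process = 10.0 + 8.0 * X[:, 0] - 4.0 * X[:, 1]  # synthetic simulated temps
net = MLPRegressor(hidden_layer_sizes=(32,), solver="lbfgs",
                   max_iter=2000, warm_start=True, random_state=0)
net.fit(X, y_process)

# Stage 2: fine-tune on sparse "observations" (simulated output plus a bias)
idx = rng.choice(1000, size=100, replace=False)
y_obs = y_process[idx] + 0.5
net.fit(X[idx], y_obs)  # warm_start=True continues from pretrained weights
r2 = net.score(X[idx], y_obs)
```

The process guidance in the published model goes further than this sketch: a physics term (conservation of energy) is added to the training loss, not just used for pretraining data.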
283 | Prediction of Lake Water Temperature using Lake Attributes | Department of the Interior | USGS | McAliley, Wallace (Andy) | wmcaliley@usgs.gov | Alison Appling, aappling@usgs.gov | Once developed, the system will input lake characteristics (surface area, elevation, and others to be determined) and output predictions of depth-specific lake temperatures. Training data consist of lake temperature observations, meteorological data, and lake characteristics. The models will be developed using various Python packages including PyTorch on the USGS Tallgrass supercomputer. | Planned (not in production) | in development | long short-term memory network | Yes | Agency Generated | No | Once data are processed and prepared they will be released in a publicly available data release | Yes | No | In development here: https://github.com/USGS-R/lake-temperature-lstm-static | Yes | Yes | No | Water | |||
284 | Process-Guided Deep Learning for Dissolved Oxygen Predictions on Stream Networks | Department of the Interior | USGS | Sadler, Jeffrey | jsadler@usgs.gov | Alison Appling, aappling@usgs.gov | 1) This model predicts daily minimum, mean, and maximum dissolved oxygen (DO) concentrations at several stream locations in the Delaware River Basin. The inputs used are meteorological inputs (e.g., precipitation, cloud cover) and static catchment attributes (e.g., basin area). 2) The training data are DO concentrations collected by the USGS and made available via the National Water Information System (NWIS). 3) This work is being done using Python and R. The deep learning models were written via TensorFlow, the data preparation is in R, and the modeling workflow was scripted via Snakemake. | Planned (not in production) | in development | long short-term memory network | Yes | Agency Generated | Once data are processed and prepared they will be released in a publicly available data release | Yes | No | https://github.com/USGS-R/drb-do-ml, https://github.com/USGS-R/river-dl | Yes | Yes | No | Water | ||||
285 | Multi-task deep learning for daily streamflow and water temperature | Department of the Interior | USGS | Sadler, Jeffrey | jsadler@usgs.gov | Alison Appling, aappling@usgs.gov | 1) This model predicts two interdependent variables, daily average streamflow and daily average stream water temperature, together using multi-task deep learning. A multi-task scaling factor controls the relative contribution of the auxiliary variable’s error to the overall loss during training. Input data include meteorological variables such as rainfall and humidity. 2) The training data were streamflow and water temperature observations. The stream temperature data were collected by the USGS and made available via NWIS. The streamflow observations were also collected by the USGS but collated along with input drivers in the CAMELS dataset. 3) This work was done using Python. The deep learning models were written via TensorFlow and the modeling workflow was scripted via Snakemake. | Planned (not in production) | this work is about to be published in a peer-reviewed journal article (Water Resources Research) | long short-term memory network | Yes | Agency Generated | Yes | No | Yes | Yes | No | Water | ||||||
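The multi-task scaling factor described in the row above can be illustrated with a small NumPy function; the nan-aware mean reflects that auxiliary observations are often missing on a given day. The variable names and the 0.5 factor are illustrative, not the paper's values:

```python
import numpy as np

def multitask_loss(pred_q, obs_q, pred_t, obs_t, scale=0.5):
    """Joint training loss: streamflow (q) error plus a scaled contribution
    from the auxiliary water-temperature (t) error."""
    mse_q = np.nanmean((pred_q - obs_q) ** 2)  # nan-aware: obs are sparse
    mse_t = np.nanmean((pred_t - obs_t) ** 2)
    return mse_q + scale * mse_t

loss = multitask_loss(np.array([1.0, 2.0]), np.array([1.0, 3.0]),
                      np.array([10.0, 12.0]), np.array([11.0, np.nan]))
# mse_q = 0.5, mse_t = 1.0 (the nan is ignored), so loss = 1.0
```

Setting `scale=0` recovers a single-task model, which is the natural baseline for measuring whether the auxiliary variable helps.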
286 | Predicting Water Temperature Dynamics of Unmonitored Lakes With Meta‐Transfer Learning | Department of the Interior | USGS | Read, Jordan | jread@usgs.gov | The approach compares the transfer of different model types from well-observed to unobserved lake systems. Process-based models, neural networks, and process-guided neural networks are trained on well-observed lakes (source lakes) and then used to make predictions in unobserved lakes (target lakes). The performance of each of those transfers is used to train a meta-model that uses lake characteristics (e.g., depth, area) to predict which source lakes will be good candidates for transfer to target lakes. The process-guided deep learning models were able to transfer better than process-based and pure machine learning approaches. | In production: more than 1 year | This work is published: https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2021WR029579 | long short-term memory network with a custom loss function to maintain physical realism; transfer model is a gradient boosting regression | Yes | No | Water | ||||||||||||
287 | Process-guided deep learning for predicting stream temperature in out-of-bound conditions | Department of the Interior | USGS | Topp, Simon | stopp@usgs.gov | Alison Appling, aappling@usgs.gov | 1) This work uses meteorological drivers to predict network-wide daily average stream temperature in the Delaware River Basin. 2) The training data are water temperature observations available through NWIS and collected by the USGS. 3) The work compares the performance of two deep learning architectures, both of which incorporate process guidance through pretraining on process-based modelling outputs. For each architecture, we're testing the ability of the model to generalize outside the bounds of its training data in order to better understand the limitations of each modelling approach for accurately predicting stream temperature under changing climate and precipitation regimes. | Planned (not in production) | Recurrent graph convolutional network and GraphWaveNet (a temporal convolution based graph neural network) | Agency Generated | Yes? | Yes | No | In development: https://github.com/USGS-R/river-dl | Yes | Yes | Water | |||||||
288 | Process guidance for learning groundwater influence on stream temperature predictions | Department of the Interior | USGS | Barclay, Janet | jbarclay@usgs.gov | Topp, Simon, stopp@usgs.gov | 1) This work uses meteorological drivers to predict network-wide daily average stream temperature in the Delaware River Basin. 2) The training data are water temperature observations available through NWIS and collected by the USGS. 3) The work focuses on developing a custom loss function that helps deep learning models learn to account for groundwater influence on stream temperature. Specifically, it uses the phase lag and amplitude dampening effect of groundwater to identify reaches likely influenced by shallow and deep groundwater inputs. | Planned (not in production) | Recurrent graph convolutional neural network with a custom loss function to account for groundwater. | Agency Generated | Yes? | Yes | No | In development: https://github.com/USGS-R/river-dl | Yes | Yes | Water | |||||||
289 | Explainable AI and interpretable machine learning | Department of the Interior | USGS | Topp, Simon | stopp@usgs.gov | Jeremy Diaz, jdiaz@usgs.gov | This work focuses on developing expertise and resources for Explainable AI (XAI) within WMA PUMP Projects. The inputs are various models developed for predicting stream temperature, discharge, dissolved oxygen, and other characteristics. The outputs are interpretable metrics to help understand why models are making the predictions they are and what physical processes are getting captured with the model architectures. | Planned (not in production) | Integrated gradients, SHAP values, layerwise relevance propagation, accumulated local effects. | Agency Generated | No | Yes | No | In planning, no active code repositories at the moment | Yes | Yes | Water | |||||||
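Integrated gradients and SHAP require dedicated libraries; a related model-agnostic diagnostic in the same family, permutation importance, ships with scikit-learn and can illustrate the kind of output such XAI work produces. The features here are hypothetical stand-ins for stream-temperature drivers:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
# Hypothetical drivers: air temperature, discharge, day of year
X = rng.uniform(0.0, 1.0, size=(300, 3))
y = 5.0 + 10.0 * X[:, 0] + rng.normal(0.0, 0.1, 300)  # only driver 0 matters

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
top_driver = int(np.argmax(result.importances_mean))  # index of key driver
```

Shuffling one feature at a time and measuring the score drop answers the row's core question, "why is the model making the predictions it is," without access to model internals.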
290 | AI applications to mapping surface water | Department of the Interior | USGS | Shavers, Ethan | eshavers@usgs.gov | Lawrence Stanislawski, lstan@usgs.gov | The research is investigating the use of hand-annotated hydrography from one region to train an artificial neural net (ANN) to identify where surface water is likely to be in other areas. The input training data include lidar, radar, and other remotely sensed data along with modeled surface flow to inform the model. The work is using open-source tools in a high-performance computing environment. | Planned (not in production) | This is research into potential applications for NHD update and validation | U-net CNN | Yes | Agency Generated | No | Yes | No | Yes | Yes | No | CSS | |||||
291 | Where’s the Rock: Using Neural Networks to Improve Land Cover Classification | Department of the Interior | USGS | Cerovski-Darriau, Corina | ccerovski-darriau@usgs.gov | Jonathan Stock, jstock@usgs.gov | While machine learning techniques have been increasingly applied to land cover classification problems, these techniques have not focused on separating exposed bare rock from soil-covered areas. Therefore, we are using a neural network to differentiate exposed bare rock (rock) from soil cover (other). We started with a training dataset by mapping exposed rock at 20 test sites across the Sierra Nevada Mountains (California, USA) using USDA’s 0.6 m National Agriculture Imagery Program (NAIP) orthoimagery. These initial sites were used to train and test the original CNN and now NASA's DELTA toolkit, which is being run on the USGS high-performance computing facilities. The goal is to generate a machine learning approach to classify bare rock in NAIP orthoimagery, starting with the Sierras, in order to provide a more accurate map of soil vs. rock-covered areas for use in landslide hazard mapping, quantifying soil carbon storage, calculating water fluxes, etc. | In production: more than 1 year | in development, pilot project published here: https://www.mdpi.com/2072-4292/11/19/2211 | convolutional neural networks | Yes | Agency Generated | No | Yes | No | In development. No longer using original code from pilot project. Currently testing NASA's DELTA toolkit (https://software.nasa.gov/software/ARC-18446-1) | Yes | Yes | CSS | |||||
292 | Data–driven prospectivity modelling of sediment–hosted Zn–Pb mineral systems and their critical raw materials | Department of the Interior | Part of the Critical Minerals Mapping Initiative in cooperation with Geoscience Australia and the Geological Survey of Canada | USGS | Coyan, Joshua | jcoyan@usgs.gov | Garth Graham, ggraham@usgs.gov | Regional data (magnetics and their derivatives, gravity and their derivatives, black shales, terrane boundaries, LAB depths, permissive geology, paleo-latitude, etc.) are loaded into Uber's H3 grid. Clastic Dominated (CD) and Mississippi Valley Type (MVT) deposits are used to train a Weights of Evidence model and two different Gradient-Boosting Machine models. After training occurred, the result was a prospectivity map for CD and MVT deposits in the three countries. | In production: less than 1 year | Gradient Boosting Machines and Weights of Evidence | No | No | Yes | No | Yes | Yes | No | Energy and Minerals, GMEG/G3 | ||||||
293 | Updating Real-time Earthquake Shaking, Ground Failure, and Impact products with remote sensing and ground truth observations | Department of the Interior | Collaboration with Stanford and Stony Brook Universities | USGS | David Wald | wald@usgs.gov | A breakthrough for rapid post-earthquake ground failure (GF) and loss modeling and reporting has been achieved with initial Bayesian updating of our global loss and GF models with ground-truth observations. Empirical models suffer from limited performance due to the complex, event-specific causal effects underlying the cascading processes of earthquake-triggered hazards and impacts. In contrast, satellite imagery-based impact assessments (e.g., NASA’s Damage Proxy Maps, or DPMs), while spatially accurate, lack the specificity as to what physical process caused those image changes. We present the first rapid seismic multi-hazard and damage updating framework based on variational Bayesian causal inference and remotely sensed DPMs. This machine learning framework enables accurate and high-resolution multi-hazard and damage estimates by jointly inferring shaking and secondary hazards and resulting building damage and quantifying their causal dependencies from imagery and prior loss and GF models. The underlying physical causal dependencies are modeled using a multi-layer causal Bayesian network. Initial results are impressive, showing that our framework significantly improves the GF prediction abilities. It also reveals the event-specific causal dependencies among ground shaking, GF, building damage, and other environmental factors. We expect improved PAGER products to more rapidly evolve to accurate and thus more actionable images, maps, and products. | Planned (not in production) | in peer review | Bayesian networks using a probabilistic graphical model employing mixture models with an unsupervised learning technique. | Yes | Agency Generated | Yes | Yes. Many public sources of data | Yes | No | will be - but not yet there. | Yes | Yes | No | Natural Hazards | |||
294 | Using Artificial Neural Networks to Improve Earthquake Ground-Motion Models | Department of the Interior | USGS | Aagaard, Brad | baagaard@usgs.gov | The ML model provides estimates of peak ground motion from earthquakes given the location, magnitude, and local geological structure at a site of interest. The training data are a compilation of about 12,000 peak ground motions recorded at seismic stations for moderate to large earthquakes. I constructed the ML model in Python using Keras with TensorFlow. | Planned (not in production) | ML model generated for research purposes, not as a product. Manuscript for peer-reviewed journal is "in preparation". | convolutional neural networks | Yes | Other | No | Yes | No | Yes | Yes | No | Natural Hazards | ||||||
295 | Leveraging Deep Learning to Improve Earthquake Monitoring | Department of the Interior | USGS | Yeck, William | wyeck@usgs.gov | jpatton@usgs.gov | The USGS National Earthquake Information Center monitors global earthquakes 24/7, rapidly detecting, characterizing, and publicly disseminating earthquake information. In order to improve the performance of their event characterization system, the NEIC has trained AI models to characterize earthquake source information using small portions of waveform data. These models improve automatic phase picking, classify phase types, and estimate source-station distances. The outcome of these models is improved automatic earthquake detections. The training dataset used in these models leverages the long-standing reviewed earthquake catalog produced by the NEIC combined with archived continuous waveform recordings, many from USGS-operated stations. These tools have been developed primarily leveraging Python, Keras, and TensorFlow. | In production: less than 1 year | Manuscript and software release are completed. Algorithms are in operation. Research is ongoing to improve these models and training datasets, and to create similar characterization tools. | Deep Learning, Convolutional neural networks | Yes | Agency Generated | No | Yes | Yes | Yes | Yes | No | Natural Hazards | |||||
296 | Using Gradient Boosting Method and Feature Selection to Reduce Aleatory Uncertainty of Earthquake Ground-Motion Models | Department of the Interior | Collaboration with Tufts | USGS | Cochran, Elizabeth | ecochran@usgs.gov | We develop ground-motion models for peak ground acceleration and peak ground velocity using a gradient boosting method (GBM). In total 128 GBM-based ground-motion models are developed for estimating PGA and PGV, respectively, using varying subsets of explanatory variables. We select eight GBM-based ground-motion models that have the lowest root mean squared error (rmse) for the cross-validation datasets among models with the same number of explanatory variables. The secondary variables, in order of importance, that contribute to the model accuracy are: VS30, Ztor, Ry, Rx, Rake, Zhyp, and Dip. By considering the tradeoff between the model accuracy and model complexity (number of explanatory variables), we find an optimal model to predict PGA and PGV uses four explanatory variables: M, Rjb, VS30, and Ztor. The variability decomposition results suggest that the reduction of total variability is mostly due to the reduction of inter-event variability, likely because more source parameters than site or path parameters are included as explanatory variables. | Planned (not in production) | In peer review | Gradient boosting model (GBM) | Yes | Other | No | No | No | No | No | Natural Hazards | ||||||
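The model-selection loop described above (fit GBMs on varying subsets of explanatory variables, keep the lowest cross-validated RMSE per subset size) can be sketched with scikit-learn. The data are synthetic; the column names merely echo the row's notation (M, Rjb, VS30, Ztor), and the toy response is not a real ground-motion relation:

```python
import numpy as np
from itertools import combinations
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Synthetic records: columns stand in for M, Rjb, VS30, Ztor
X = rng.uniform(0.0, 1.0, size=(300, 4))
# Toy log-PGA: grows with "magnitude", decays with "distance"
y = 1.5 * X[:, 0] - 2.0 * X[:, 1] + 0.3 * X[:, 2] + rng.normal(0.0, 0.05, 300)

def cv_rmse(cols):
    """5-fold cross-validated RMSE for a GBM using only columns `cols`."""
    gbm = GradientBoostingRegressor(random_state=0)
    scores = cross_val_score(gbm, X[:, list(cols)], y, cv=5,
                             scoring="neg_root_mean_squared_error")
    return -scores.mean()

# Pick the best two-variable model among all pairs of explanatory variables
best_pair = min(combinations(range(4), 2), key=cv_rmse)
```

Repeating the `min` over every subset size and plotting RMSE against the number of variables exposes the accuracy/complexity tradeoff the row describes.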
297 | Application of machine learning to ground motion-based earthquake early warning | Department of the Interior | USGS | Cochran, Elizabeth | ecochran@usgs.gov | Clements, Tim (tclements@usgs.gov) | We use initial observations of an earthquake on seismic stations close to the epicenter to predict the peak ground shaking across a region. The initial test dataset consists of waveforms from the USGS-collected, large-n seismic array in an area of induced seismicity in Oklahoma. Future datasets will include seismic data from the California Integrated Seismic Network (primarily supported by USGS) and possibly the Japan Meteorological Agency. Currently running Python-based codes on a desktop; plan to move to AWS or similar. | Planned (not in production) | Work started a couple of months ago and is ongoing | Graph neural networks | Yes | Agency Generated | No | Yes | No | Yes | Yes | No | Natural Hazards | |||||
298 | A machine learning approach to developing ground motion models from simulated ground motions | Department of the Interior | USGS | Withers, Kyle | kwithers@usgs.gov | We use a machine learning approach to build a ground motion model (GMM) from a synthetic database of ground motions extracted from the Southern California CyberShake study. An artificial neural network is used to find the optimal weights that best fit the target data (without overfitting), with input parameters chosen to match those of state-of-the-art GMMs. We validate our synthetic-based GMM against empirically based GMMs derived from the globally based Next Generation Attenuation West2 data set, finding near-zero median residuals and similar amplitude and trends (with period) of total variability. Additionally, we find that the artificial neural network GMM has similar bias and variability to empirical GMMs for records of the recent Mw 7.1 Ridgecrest event, which neither GMM has included in its formulation. As simulations continue to better model broadband ground motions, machine learning provides a way to utilize the vast amount of synthetically generated data and guide future parameterization of GMMs. | In production: more than 1 year | Initial work is completed. We are currently expanding the approach to handle more specific questions. | Neural networks | Yes | Other | Yes | No | Yes | Yes | No | Natural Hazards | |||||||
299 | Integrating machine learning phase pickers into the Southern California Seismic Network earthquake catalog | Department of the Interior | Collaboration with Caltech Seismo Lab | USGS | Yoon, Clara | cyoon@usgs.gov | We evaluate the readiness of machine-learning models for automatic earthquake detection and phase picking to enhance the Southern California Seismic Network earthquake catalog, with the end goal of using these models in routine seismic network operations. We first test a model called Generalized Phase Detection (GPD), trained on millions of manually picked P- and S-arrival times from Southern California earthquakes and examples of noisy time series data. Inputs are continuous seismic data time series, with 3 components (north, east, vertical), at hundreds of seismic stations located in southern California. Outputs are arrival times of P and S seismic waves, with associated probabilities between 0 and 1, with a threshold probability applied for detection; these arrival times are fed into existing software to estimate earthquake locations, origin times, and magnitudes. Custom software is written in Python with the model implemented in the PyTorch library. We are also developing a cloud-native software architecture that takes real-time seismic data as input (~15 seconds at a time) and applies the GPD model within Amazon Web Services. | Planned (not in production) | Training data and machine learning model have been developed outside USGS and are available to the public. USGS seeks to apply this model to improve earthquake monitoring operations; still in the development/testing stage, not yet in production. | Deep Learning, Convolutional neural networks | No | No | https://scedc.caltech.edu/data/deeplearning.html, https://github.com/seisbench/seisbench/blob/main/seisbench/data/scedc.py | Yes | No | Not yet publicly available; code resides on a private git repository | Yes | Yes | No | Natural Hazards | ||||
300 | Understanding the 2020-2021 Puerto Rico Earthquake sequence with deep learning approaches | Department of the Interior | Collaboration with Caltech Seismo Lab and Puerto Rico Seismic Network | USGS | Yoon, Clara | cyoon@usgs.gov | We enhance the earthquake catalog for the 2020-2021 southwestern Puerto Rico earthquake sequence with a variety of deep learning approaches to understand its complex fault system, triggering mechanisms, and the long-lived, vigorous nature of the aftershock sequence. We use an existing deep learning model for earthquake detection and phase picking called EQTransformer, which was trained on a global data set of earthquake waveforms called STEAD, using the TensorFlow library. We also apply deep learning methods for earthquake location (EikoNet and HypoSVI), trained on a known velocity model with a physics-informed neural network using the PyTorch library, which then allows grid-free rapid seismic wave travel time calculation between any 2 locations within a 3D volume. These machine learning methods for automatic earthquake detection, phase picking, and location, which are all available as open-source Python codes, help increase the number of small earthquake observations and improve earthquake depth estimates, thus offering more detailed information about active faults and physical processes in this earthquake sequence. | Planned (not in production) | Applying existing machine learning models and software developed in the seismology community, for research purposes. Currently working on the data analysis and will write up for publication soon. | Deep learning, physics-informed neural networks | No | No | Seismic data - IRIS data center via Puerto Rico Seismic Network. STEAD data set available at https://github.com/smousavi05/STEAD and https://github.com/seisbench/seisbench/blob/main/seisbench/data/stead.py | Yes | No | EQTransformer model and software available at https://github.com/smousavi05/EQTransformer and https://github.com/seisbench/seisbench/blob/main/seisbench/models/eqtransformer.py. EikoNet and HypoSVI software available at https://github.com/Ulvetanna | Yes | Yes | No | Natural Hazards | ||||
301 | Land Use Plan Document and Data Mining and Analysis R&D | Department of the Interior | BLM | German, Jesse | jgerman@blm.gov | Julie Recker, jrecker@blm.gov | Exploring the potential to identify patterns, rule alignment or conflicts, discovery, and mapping of geo history and/or rules. Inputs included unstructured planning documents. Outputs identify conflicts in resource management planning rules with proposed action locations requiring exclusion, restrictions, or stipulations as defined in the planning documents. | Planned (not in production) | This initial effort was conducted under contract as research and development and was never in full production or implementation. As of March 2022, the BLM has not committed to further pursuit of this potential. | Natural Language Processing and Geo Classification | Yes | Agency Generated | No | No | Yes | No | N/A | Not at this time | Yes | N/A | Yes | DRAFT - Exploratory effort which preceded EO 13960 and was never in production/implementation. Provided here only as an example use case. | ||
302 | Data Driven Sub-Seasonal Forecasting of Temperature and Precipitation | Department of the Interior | BOR | Nowak, Kenneth | knowak@usbr.gov | Reclamation has run two year-long prize competitions in which participants developed and deployed data-driven methods for sub-seasonal (2-6 weeks into the future) prediction of temperature and precipitation across the western US. Participants outperformed benchmark forecasts from NOAA. Reclamation is currently working with the Scripps Institution of Oceanography to further refine, evaluate, and pilot-implement the most promising methods from these two competitions. Improving sub-seasonal forecasts has significant potential to enhance water management outcomes. | Development (not in production) | Working with UCSD - SIO - CW3E to further evaluate and test methods | Range of data driven, AI/ML techniques (e.g. random forests) | no, but can "re-train" | range of geophysical/earth observation data (temperature, precipitation, elevation, oceans) and forecast data (e.g., NOAA, ECMWF) | some may be, uses a wide range of data from wide range of sources - see 6B. | yes | no | yes | yes | NA | no | ||||||
303 | Data Driven Streamflow Forecasting | Department of the Interior | BOR | Nowak, Kenneth | knowak@usbr.gov | Reclamation, along with partners from the CEATI hydropower industry group (e.g. TVA, DOE-PNNL, and others), ran a year-long evaluation of existing 10-day streamflow forecasting technologies and a companion prize competition open to the public, also focused on 10-day streamflow forecasts. Forecasts were issued every day for a year and verified against observed flows. Across locations and metrics, the top-performing forecast product came from a private AI/ML forecasting company, UpstreamTech. Several competitors from the prize competition also performed strongly, outperforming benchmark forecasts from NOAA. Reclamation is working to further evaluate the UpstreamTech forecast products and also the top performers from the prize competition. | Development (not in production) | Exploring partnership with industry to further evaluate methods. | Range of data driven, AI/ML techniques (e.g. LSTMs) | no, but can "re-train" | range of geophysical/earth observation data (temperature, precipitation, streamflow, soil moisture, snow, etc) and forecast data (e.g., NOAA, ECMWF) | some may be, uses a wide range of data from wide range of sources - see 6B. | yes | no | yes | yes | NA | no | ||||||
304 | Seasonal/Temporary Wetland/Floodplain Delineation using Remote Sensing and Deep Learning | Department of the Interior | BOR | King, Vanessa | vking@usbr.gov | Reclamation was interested in determining whether recent advancements in machine learning, specifically convolutional neural network architectures in deep learning, can provide improved seasonal/temporary wetland/floodplain delineation (mapping) when high-temporal- and high-spatial-resolution remote sensing data are available. If so, these new mappings could inform the management of protected species and provide critical information to decision-makers during scenario analysis for operations and planning. | Completed | S&T report has been completed; no further work planned at this time. | Image classification using Joint Unsupervised Learning (JULE) | Yes | Data obtained from Planet Labs under a limited-use license for government research | No | N/A | Yes | No | N/A | yes | PI has left Reclamation. The code is poorly documented and no other Reclamation staff are familiar with its use. | ||||||
305 | Improving UAS-derived photogrammetric data and analysis accuracy and confidence for high-resolution data sets using artificial intelligence and machine learning | Department of the Interior | BOR | Klein, Matthew | mklein@usbr.gov | UAS-derived photogrammetric products contain a large amount of potential information but can be less accurate than required for analysis and time-consuming to analyze manually. By formulating a standard reference protocol and applying machine learning/artificial intelligence, this information will be unlocked to provide detailed analysis of Reclamation's assets for better-informed decision making. | Proof-of-concept completed | Next step would be to develop full-scale system based on results from Photogrammetric Data Set Crack Mapping Technology Search | Yes | The data used in the AI is controlled. | ||||||||||||||
306 | Photogrammetric Data Set Crack Mapping Technology Search | Department of the Interior | BOR | Klein, Matthew | mklein@usbr.gov | This project is exploring a specific application of photogrammetric products to the analysis of crack mapping on Reclamation facilities. This analysis is time-consuming and has typically required rope access or other means to photograph and locate areas that can now be reached with drones or other devices. By formulating a standard reference protocol and applying machine learning/AI, this information will be used to provide detailed analysis of Reclamation assets for better decision making. | Proof-of-concept completed | Next step would be to develop full-scale system based on results from Photogrammetric Data Set Crack Mapping Technology Search | Yes | The data used in the AI will be controlled. | ||||||||||||||
307 | Improved Processing and Analysis of Test and Operating Data from Rotating Machines | Department of the Interior | BOR | Agee, Stephen | sagee@usbr.gov | This project is exploring a better method to analyze DC ramp test data from rotating machines. Previous DC ramp test analysis requires engineering expertise to recognize characteristic curves from DC ramp test plots. DC ramp tests produce a plot of voltage vs. current for a ramping voltage applied to a rotating machine. By using machine learning/AI tools, such as linear regression, the ramp test plots can be analyzed by computer software, rather than manual engineering analysis, to recognize characteristic curves. The anticipated result will be faster and more reliable analysis of field-performed DC ramp testing. | Investigating/Proof of concept | Results will be used to develop future AI tools that will help fully analyze DC ramp tests. | Yes | The data used in the AI will be controlled (power plant operation data). If released, data would need to be normalized and anonymized. | ||||||||||||||
308 | Sustained Casing Pressure Identification | Department of the Interior | BSEE | Boone, Adam | adam.boone@bsee.gov | Timothy Baudier (timothy.baudier@bsee.gov) | Well casing pressure requests are submitted to BSEE to determine whether a well platform is experiencing a sustained casing pressure (SCP) problem. SCP is usually caused by gas migration from a high-pressured subsurface formation through the leaking cement sheath in one of the well’s casing annuli, but SCP can also be caused by defects in tube connections, downhole accessories, or seals. Because SCP can lead to major safety issues, quickly identifying wells with SCP could greatly mitigate accidents on well platforms. | Planned (not in production) | POC being conducted via an IAA with the NASA Advanced Supercomputing division | Machine learning via a deep learning model, such as a residual neural network (ResNet), to classify Sustained Casing Pressure | Yes | Agency Generated | Yes | N/A | Yes | No | N/A | N/A | Yes | Yes | Business Proprietary data is involved | |||
309 | Level 1 Report Corrosion Level Classification | Department of the Interior | BSEE | Boone, Adam | adam.boone@bsee.gov | Timothy Baudier (timothy.baudier@bsee.gov) | Level 1 surveys obtained from BSEE report the condition of well platforms. The reports include images of well platform components, which can be used to estimate coating condition and structural condition, important factors in the overall condition of the facility. The reports are used to assess the well platforms for safety concerns. The reports are submitted to BSEE and are manually reviewed to determine whether a well platform needs additional audits. Because the manual review process is time-consuming, an automated screening system that can identify parts of the wells that exhibit excess corrosion may greatly reduce report processing time. | Planned (not in production) | POC being conducted via an IAA with the NASA Advanced Supercomputing division | Automate classifying the level of corrosion of the images within the Level 1 Survey Report image data using machine learning via a deep learning model | Yes | Agency Generated | Yes | N/A | Yes | No | N/A | N/A | Yes | Yes | Business Proprietary data is involved | |||
310 | Well Activity Report Classification | Department of the Interior | BSEE | Boone, Adam | adam.boone@bsee.gov | Timothy Baudier (timothy.baudier@bsee.gov) | Researching the use of self-supervised deep neural networks to identify classification systems for significant well events using data from Well Activity Reports | Planned (not in production) | POC being conducted via an IAA with the NASA Advanced Supercomputing division | Automate classification of significant well events within the Well Activity Report text data using machine learning via a deep learning model | Yes | Agency Generated | Yes | N/A | Yes | No | N/A | N/A | Yes | Yes | Business Proprietary data is involved | |||
311 | Sentiment Analysis and Topic Modeling (SenTop) | DHS | The initial purpose of the Sentiment Analysis and Topic Modeling (SenTop) project was to analyze survey responses for DHS’s Office of the Chief Procurement Officer related to contracting. However, it has evolved into a general-purpose text analytics solution that can be applied to any domain/area. It also has been tested/used for human resources topics. SenTop is a DHS-developed Python package for performing descriptive text analytics, specifically sentiment analysis and topic modeling on free-form, unstructured text. SenTop uses several methods for analyzing text, including combining sentiment analyses and topic modeling into a single capability, permitting identification of sentiments per topic and topics per sentiment. Other innovations include the use of polarity and emotion detection, fully automated topic modeling, and multi-model/multi-configuration analyses for automatic model/configuration selection. The code has been established, performs an analysis, and provides a report, but it is only accessed and run by one person per customer request. | |||||||||||||||||||||
312 | AIS Scoring & Feedback (AS&F) | Cybersecurity and Infrastructure Security Agency | ||||||||||||||||||||||
313 | Automated PII Detection | Cybersecurity and Infrastructure Security Agency | CISA's Automated Personally Identifiable Information (PII) Detection and Human Review Process incorporates descriptive, predictive, and prescriptive analytics. Automated PII Detection leverages natural language processing tasks, including named entity recognition, coupled with Privacy guidance thresholds to automatically detect potential PII within Automated Indicator Sharing submissions. If a submission is flagged for possible PII, it is queued for human review, where analysts are provided with the submission and artificial intelligence-assisted guidance on the specific PII concerns. Within human review, analysts can confirm/deny proper identification of PII and redact the information (if needed). Privacy experts are also able to review the actions of the system and analysts to ensure proper performance of the entire process, along with providing feedback to the system and analysts for process improvements (if needed). The system learns from feedback from the analysts and Privacy experts. | |||||||||||||||||||||
314 | CDC Airport Hotspot Throughput (Pagerank) | TSA | TSA launched the “Stay Healthy. Stay Secure.” campaign, which details the proactive and protective measures that have been implemented at security checkpoints to make the screening process safer for passengers and our workforce by reducing the potential of exposure to the coronavirus. The campaign includes guidance and resources to help passengers prepare for the security screening process in the COVID environment. A big part of that campaign was the development of the Centers for Disease Control and Prevention's Airport Hotspot Throughput. This capability determines the domestic airports that have the highest rank of connecting flights during the holiday travel season to help mitigate the spread of COVID-19. This capability is a DHS-developed artificial intelligence model written in Spark/Scala that takes historical non-PII travel data and computes the highest-ranking airports based on the PageRank algorithm. TSA does not make decisions about flight cancellations or airport closures. These decisions are made locally, on a case-by-case basis, by individual airlines, airports, and public health officials. TSA will continuously evaluate and adapt procedures and policies to keep the public and our workforce safe as we learn more about this devastating disease and how it spreads. | |||||||||||||||||||||
315 | Asylum Text Analytics (ATA) | USCIS | USCIS oversees lawful immigration to the United States. As set forth in Section 451(b) of the Homeland Security Act of 2002, Public Law 107-296, Congress charged USCIS with administering the asylum program. USCIS, through its Asylum Division within the Refugee, Asylum & International Operations Directorate (RAIO), administers the affirmative asylum program to provide protection to qualified individuals in the United States who have suffered past persecution or have a well-founded fear of future persecution in their country of origin, as outlined under Section 208 of the Immigration and Nationality Act (INA), 8 U.S.C. § 1158 and Title 8 of the Code of Federal Regulations (C.F.R.), Part 208. Generally, an individual not in removal proceedings may apply for asylum through the affirmative asylum process regardless of how the individual arrived in the United States or his or her current immigration status by filing Form I-589, Application for Asylum and for Withholding of Removal. The ATA capability employs machine learning and data graphing techniques to identify plagiarism-based fraud in applications for asylum status and for the withholding of removal by scanning the digitized narrative sections of the associated forms and looking for common language patterns. | |||||||||||||||||||||
316 | BET/FBI Fingerprint Success Maximization | USCIS | ||||||||||||||||||||||
317 | Biometrics Enrollment Tool (BET) Fingerprint Quality Score | USCIS | USCIS's Customer Profile Management Service (CPMS) serves as a person-centric repository of biometric and biographic information provided by applicants and petitioners (hereafter collectively referred to as “benefit requestors”) that have been issued a USCIS card evidencing the granting of an immigration-related benefit (i.e., permanent residency, work authorization, or travel documents). The Biometrics Enrollment Tool (BET) team has been working on enhancing their quality checks, with one of the new improvements being incorporation of the National Institute of Standards and Technology (NIST) Fingerprint Image Quality 2 (NFIQ 2) algorithm (a trained machine learning algorithm) for scoring of fingerprints (https://www.nist.gov/services-resources/software/nfiq-2) into the BET application. This algorithm takes a fingerprint image and assigns a score between 0 and 100, with 100 indicating the best quality fingerprint image that could be obtained. The higher the score, the more likely the fingerprint will match when captured again. This algorithm has been in place for several Program Increments. BET had been providing Biometric Capture Technicians with a poor-quality indicator and encountered objections from technicians for the larger-than-expected number of recaptures required, based on contractual complications. The BET team continues to capture this data in the background, but this does not require recapture currently. For more information, please visit: https://www.dhs.gov/publication/dhsuscispia-060-customer-profile-management-service-cpms | |||||||||||||||||||||
318 | Evidence Classifier | USCIS | ||||||||||||||||||||||
319 | FDNS-DS NexGen | USCIS | ||||||||||||||||||||||
320 | Sentiment Analysis | USCIS | The USCIS Service Center Operations Directorate (SCOPS) provides services for persons seeking immigration benefits while ensuring the integrity and security of our immigration system. As part of that mission, we issued a two-part survey asking users both quantitative and qualitative questions. USCIS performed a statistical analysis of the quantitative results and then used Natural Language Processing modeling software to assign "sentiments" to categories ranging from strongly positive to strongly negative. This model was eventually enhanced using a machine learning model to have better reusability and performance. This capability has been deployed to production for more than one year. | |||||||||||||||||||||
321 | Testing Performance of ML Model using H2O | USCIS | USCIS is the component within DHS that oversees lawful immigration to the United States. That means USCIS receives, processes, and maintains all applications for admission for lawful permanent residents (LPRs), or adjustments to LPR status. Also known as “green card” holders, LPRs are non-citizens who are lawfully authorized to live permanently within the United States and are required to fill out Form I-90, Application to Replace Permanent Resident Card (Green Card). Since there has been a considerable influx of green card applications, USCIS used a combination of exploratory data analysis to determine the most used categories for applicants submitting I-90s, and machine learning to create predictions of workloads. USCIS used the H2O machine learning platform to allow USCIS analysts to build and run several machine learning models on big data in an enterprise environment and identify the model that performs the best. It has already been successful in identifying the most accurate model for the I-90 Form Timeseries Analysis and Forecasting use case. This capability has been in production for more than one year. | |||||||||||||||||||||
322 | Timeseries Analysis and Forecasting | USCIS | USCIS is the component within DHS that oversees lawful immigration to the United States. That means USCIS receives, processes, and maintains all applications for admission for lawful permanent residents (LPRs), or adjustments to LPR status. Also known as “green card” holders, LPRs are non-citizens who are lawfully authorized to live permanently within the United States and are required to fill out Form I-90, Application to Replace Permanent Resident Card (Green Card). Since there has been a considerable influx of green card applications, USCIS used a combination of exploratory data analysis to determine the most used categories for applicants submitting I-90s and machine learning to create predictions of workloads. As a follow-on, USCIS used Autoregressive Integrated Moving Average (ARIMA) models on the I-90 form, which allowed the prediction of the total number of forms for a 2-year period. ARIMA is one of the simplest and most effective machine learning algorithms for time series forecasting. This capability has been deployed in production for more than a year. The model was eventually enhanced using a machine learning model for better reusability and performance. | |||||||||||||||||||||
323 | Silicon Valley Innovation Program (SVIP) Language Translator | USCG | ||||||||||||||||||||||
324 | Agent Portable Surveillance | CBP | ||||||||||||||||||||||
325 | Autonomous Surveillance Towers | CBP | ||||||||||||||||||||||
326 | I4 Viewer Matroid Image Analysis | CBP | ||||||||||||||||||||||
327 | Open-source News Aggregation | CBP | ||||||||||||||||||||||
328 | Data Tagging and Classification | ICE | ||||||||||||||||||||||
329 | Language Translator | ICE | ||||||||||||||||||||||
330 | RAVEn Compliance Automation Tool (CAT) | ICE | ||||||||||||||||||||||
331 | RAVEn Normalization Services | ICE | ||||||||||||||||||||||
332 | Form Recognizer for Benefits Forms | Labor | Custom machine learning model to extract data from complex forms to tag data entries to field headers. The input is a document or scanned image of the form and the output is a JSON response with key/value pairs extracted by running the form against the custom trained model. | Operation and Maintenance | ||||||||||||||||||||
333 | Language translation of published documents and website using natural language processing models. | Labor | Implementation | Cloud-based commercial-off-the-shelf pre-trained NLP models | ||||||||||||||||||||
334 | Audio Transcription | Labor | Transcription of speech to text for recordkeeping using natural language processing models. | Operation and Maintenance | ||||||||||||||||||||
335 | Text to Speech Conversion | Labor | Text to speech (neural) for more realistic human-sounding applications using natural language processing models. | Operation and Maintenance | ||||||||||||||||||||
336 | Claims Document Processing | Labor | To identify whether a physician’s note contains causal language by training custom natural language processing models. | Implementation | ||||||||||||||||||||
337 | Website Chatbot Assistant | Labor | The chatbot helps the end user with basic information about the program, information on who to contact, or seeking petition case status. | Implementation | ||||||||||||||||||||
338 | Data Ingestion of Payroll Forms | Labor | Custom machine learning model to extract data from complex forms to tag data entries to field headers. The input is a document or scanned image of the form and the output is a JSON response with key/value pairs extracted by running the form against the custom trained model. | Initiation | ||||||||||||||||||||
339 | HoloLens | Labor | AI used by inspectors to visually inspect high and unsafe areas from a safe location. | Operation and Maintenance | ||||||||||||||||||||
340 | DOL Intranet Website Chatbot Assistant | Labor | Conversational chatbot on DOL intranet websites to help answer common procurement questions, as well as specific contract questions. | Initiation | ||||||||||||||||||||
341 | Official Document Validation | Labor | AI detection of mismatched addresses and garbled text in official letters sent to benefits recipients. | Implementation | ||||||||||||||||||||
342 | Electronic Records Management | Labor | Meeting NARA metadata standards for (permanent) federal documents by using AI to identify data within the document, and also using NLP to classify and summarize documents. | Initiation | ||||||||||||||||||||
343 | Call Recording Analysis | Labor | Automatic analysis of recorded calls made to Benefits Advisors in the DOL Interactive Voice Response (IVR) center. | Initiation | ||||||||||||||||||||
344 | Automatic Document Processing | Labor | Automatic processing of continuation of benefits form to extract pre-defined selection boxes. | Implementation | ||||||||||||||||||||
345 | Automatic Data Processing Workflow with Form Recognizer | Labor | Automatic processing of a current complex workflow to extract required data. | Initiation | ||||||||||||||||||||
346 | Case Recording summarization | Labor | Using an open source large language model to summarize publicly available case recording documents, which are devoid of personally identifiable information (PII) or any other sensitive information. This is not hosted in the DOL technical environment and is reviewed by human note takers. | Development and Acquisition | ||||||||||||||||||||
347 | OEWS Occupation Autocoder | Labor | The input is state submitted response files that include occupation title and sometimes job description of the surveyed units. The autocoder reads the job title and assigns up to two 6-digit Standard Occupational Classification (SOC) codes along with their probabilities as recommendations for human coders. Codes above a certain threshold are appended to the submitted response file and sent back to states to assist them with their SOC code assignment. | Operation and Maintenance | ||||||||||||||||||||
348 | Scanner Data Product Classification | Labor | BLS receives bulk data from some corporations related to the cost of goods they sell and services they provide. Consumer Price Index (CPI) staff have hand-coded a segment of the items in these data into Entry Level Item (ELI) codes. To accept and make use of these bulk data transfers at scale, BLS has begun to use machine learning to label data with ELI codes. The machine learning model takes as input word frequency counts from item descriptions. Logistic regression is then used to estimate the probability of each item being classified in each ELI category based on the word frequency categorizations. The highest probability category is selected for inclusion in the data. Any selected classifications that do not meet a certain probability threshold are flagged for human review. | Operation and Maintenance | ||||||||||||||||||||
349 | Expenditure Classification Autocoder | Labor | Custom machine learning model to assign a reported expense description from Consumer Expenditure Diary Survey respondents to expense classification categories known as item codes. | Development and Acquisition | ||||||||||||||||||||
350 | drug_signature_program_algorithms | Department of Justice | DEA's Special Testing and Research Laboratory utilizes AI/ML techniques and has developed a robust statistical methodology, including multivariate statistical analysis tools, to automatically classify the geographical region of origin of samples selected for DEA's Heroin and Cocaine signature programs. The system provides for detection of anomalies and low-confidence results. | |||||||||||||||||||||
351 | complaint_lead_value_probability | Department of Justice | The Threat Intake Processing System (TIPS) database uses artificial intelligence (AI) algorithms to accurately identify, prioritize, and process actionable tips in a timely manner. The AI used in this case helps to triage immediate threats in order to help FBI field offices and law enforcement respond to the most serious threats first. Based on the algorithm score, the highest-priority tips are first in the queue for human review. | |||||||||||||||||||||
352 | intelligent_records_consolidation_tool | Department of Justice | The Office of Records Management Policy uses an AI and Natural Language Processing (NLP) tool to assess the similarity of records schedules across all Department records schedules. The tool provides clusters of similar items to significantly reduce the time that the Records Manager spends manually reviewing schedules for possible consolidation. An AI powered dashboard provides recommendations for schedule consolidation and review, while also providing the Records Manager with the ability to review by cluster or by individual record. The solution's technical approach has applicability with other domains that require text similarity analysis. | |||||||||||||||||||||
353 | privileged_material_identification | Department of Justice | The application scans documents and looks for attorney/client privileged information. It does this based on keywords input by the system operator. | |||||||||||||||||||||
354 | Information Gateway OneReach Application | ACF | ACF Children's Bureau | The Information Gateway hotline connects to a phone IVR managed by OneReach AI. OneReach maintains a database of state hotlines for reporting child abuse and neglect that it can connect a caller to based on their inbound phone area code. Additionally, OneReach offers a limited FAQ texting service that utilizes natural language processing to answer user queries. User queries are used for reinforcement training by a human AI trainer and to develop additional FAQs. | Operation and Maintenance | 3/1/20 | 4/1/20 | 6/23/20 | hhs.caio@hhs.gov | Commercial-off-the-shelf | Yes | |||||||||||||
355 | Artificial Intelligence-based Deduplication Algorithm for Classification of Duplicate Reports in the FDA Adverse Event Reports (FAERS) | FDA | CDER/Office of Surveillance and Epidemiology (OSE) | The deduplication algorithm is applied to nonpublic data in the FDA Adverse Event Reporting System (FAERS) to identify duplicate individual case safety reports (ICSRs). Unstructured data in free-text FAERS narratives is processed through a natural language processing system to extract relevant clinical features. Both structured and unstructured data are then used in a probabilistic record linkage approach to identify duplicates. Application of the deduplication algorithm is optimized for processing the entire FAERS database to support data mining. | Development and Acquisition | 9/1/19 | 9/1/19 | hhs.caio@hhs.gov | Contracted | Yes | ||||||||||||||
356 | Information Visualization Platform (InfoViP) to support analysis of individual case safety reports | FDA | CDER/Office of Surveillance and Epidemiology (OSE) | Developed the Information Visualization Platform (InfoViP) for postmarket safety surveillance, to improve the efficiency and scientific rigor of the individual case safety report (ICSR) review and evaluation process. InfoViP incorporates artificial intelligence and advanced visualizations to detect duplicate ICSRs, create temporal data visualizations, and classify ICSRs for usability. | Development and Acquisition | 9/1/19 | 9/1/19 | hhs.caio@hhs.gov | Contracted | Yes | ||||||||||||||
357 | Opioid Data Warehouse Term Identification and Novel Synthetic Opioid Detection and Evaluation Analytics | FDA | CDER/Office of Strategic Programs (OSP) | The Term Identification and Novel Synthetic Opioid Detection and Evaluation Analytics use publicly available social media and forensic chemistry data to identify novel referents to drug products in social media text. It uses the FastText library to create vector models of each known NSO-related term in a large social media corpus, and provides users with similarity scores and expected prevalence estimates for lists of terms that could be used to enhance future data gathering efforts. | Operation and Maintenance | hhs.caio@hhs.gov | Contracted | Yes | ||||||||||||||||
358 | Community Level Opioid Use Dynamics Modeling and Simulation | FDA | CDER/Office of Translational Sciences | The OUD project leverages artificial intelligence techniques, specifically Agent-Based Modeling (ABM), to design and carry out Community Level Opioid Use Dynamics Modeling and Simulation with a cohort of datasets, and to investigate the propagation mechanisms involving various factors, including geographical and social influences, and their impacts at a high level. The project also leveraged machine learning (ML), such as classification, to identify data entry types (e.g., whether a particular data entry was entered by a person in the target population, such as a woman of child-bearing age) as part of the training data generation task. | Initiation | 5/5/23 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||||
359 | Centers of Excellence in Regulatory Science and Innovation (CERSI) project - Leveraging AI for improving remote interactions. | FDA | CDER/Office of Pharmaceutical Quality (OPQ) | This project aims to improve four major areas identified by FDA: transcription, translation, document and evidence management, and co-working space. Automatic speech recognition has been widely used in many applications. Its cutting-edge technology is the transformer-based sequence-to-sequence (seq2seq) model, which is trained to generate transcripts autoregressively and is fine-tuned on specific datasets. Using pre-trained language models directly may not be suitable because they might not work properly with different accents and specialized regulatory and scientific terminologies; the models were trained on a specific type of data and may not handle data that differs significantly from their training data. To address this, researchers plan to manually transcribe a set of video/audio recordings to obtain ground-truth transcripts, on which they fine-tune the model to adapt it to this new domain. Machine translation converts a sequence of text from one language to another. Researchers usually use a seq2seq method, in which the original text is encoded into a representation a computer can process, which is then used to generate the translated version of the text, much like a translator who listens to speech in one language and repeats it in another. Similarly, it is not appropriate to directly apply existing pre-trained seq2seq models, because (a) some languages used in the FDA context might not exist in existing models, and (b) domain-specific terms used at FDA are very different from general human language. To tackle these challenges, researchers train models for some less common languages and fine-tune pre-trained models for major languages. 
For both situations, researchers prepare high-quality training sets labeled by experts. The University of Maryland CERSI (M-CERSI) plans to build a system to manage different documents and evidence by implementing three sub-systems: (a) a document classifier, (b) a video/audio classifier, and (c) an interactive middleware that connects the trained model at the backend with the input at the frontend. With this, all documents created during co-working can be shared and accessed by all participants. | Initiation | 3/1/23 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
360 | Clinical Study Data Auto-transcribing Platform (AI Analyst) for Generating Evidence to Support Drug Labelling | FDA | CDER/Office of Translational Sciences/Office of Clinical Pharmacology | The AI Analyst platform is trained to auto-author clinical study reports from the source data to assess the strength and robustness of analytical evidence for supporting drug labelling language. The platform directly transcribes SDTM (Study Data Tabulation Model) datasets of phase I/II studies into full-length clinical study reports autonomously with minimal human input. The underlying AI algorithm mimics the thinking process of subject matter experts (e.g., clinicians, statisticians, and data managers) to decipher the full details of study design and conduct, and to interpret the study results according to the study design. It consists of multiple layers of data pattern recognition. The algorithm addresses the challenging nature of assessing clinical study results, including the huge variety of study designs, unpredictable study conduct, variations in data reporting nomenclature/format, and the wide range of study-specific analysis methods. The platform has been trained and tested with hundreds of NDA/BLA submissions and over 1500 clinical trials. The compatible study types include most drug-label-supporting studies, such as drug interaction, renal/hepatic impairment, and bioequivalence. In 2022, the Office of Clinical Pharmacology (OCP/OTS/CDER) initiated the RealTime Analysis Depot (RAD) project, aiming to routinely apply the AI platform to support the review of NME, 505b2, and 351K submissions. | Implementation | 12/1/20 | 1/1/21 | 11/1/21 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||
361 | Application of Statistical Modeling and Natural Language Processing for Adverse Event Analysis | FDA | CDER/Office of New Drugs | Drug-induced adverse events (AEs) are difficult to predict for early signal detection, and there is a need to develop new tools and methods to monitor the safety of marketed drugs, including novel approaches for evidence generation. This project will utilize natural language processing (NLP) and data mining (DM) to extract information from approved drug labeling that can be used for statistical modeling to determine when the selected AEs are generally labeled (pre- or post-market) and identify patterns of detection, such as predictive factors, within the first 3 years of marketing of novel drugs. This project is intended to increase our understanding of timing/early detection of AEs, which can be applied to targeted monitoring of novel drugs. Funding will be used to support an ORISE fellow. | Initiation | 11/1/22 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||||
362 | Using Unsupervised Learning to Generate Code Mapping Algorithms to Harmonize Data Across Data Systems | FDA | CDER/Office of Surveillance and Epidemiology (OSE) | The goal of this project is to assess the potential of data-driven statistical methods for detecting and reducing coding differences between healthcare systems in Sentinel. Findings will inform development and deployment of methods and computational tools for transferring knowledge learned from one site to another and pave the way towards scalable and automated harmonization of electronic health records data. | Implementation | 6/1/21 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
363 | Augmenting date and cause of death ascertainment in observational data sources | FDA | CDER/Office of Surveillance and Epidemiology (OSE) | The objective of this project is to develop a set of algorithms to augment assessment of mortality through probabilistic linkage of alternative data sources with EHRs. Development of generalizable approaches to improve death ascertainment is critical to improve the validity of Sentinel investigations using mortality as an endpoint, and these algorithms may also be usable in supplementing death ascertainment in claims data. Specifically, we propose the following Aims. Specific Aim 1: We propose to leverage online publicly available data to detect date of death for patients seen at two healthcare systems. Specific Aim 2: We propose to augment cause of death data using healthcare system narrative text and administrative codes to develop probabilistic estimates for common causes of death. | Implementation | 1/1/22 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
364 | Scalable automated NLP-assisted chart abstraction and feature extraction tool | FDA | CDER/Office of Surveillance and Epidemiology (OSE) | The overall goal of this study is to demonstrate the usability and value of currently available data sources and techniques in electronic medical records by harnessing claims and EHR data, including structured, semi-structured, and unstructured data, in a pharmacoepidemiology study. This study will use real-world longitudinal data from the Cerner Enviza Electronic Health Records (CE EHR) linked to claims with NLP technology applied to physician notes. NLP methods will be used to identify and contextualize pre-exposure confounding variables, incorporate unstructured EHR data into confounding adjustment, and for outcome ascertainment. Use case study: This study will seek to understand the relationship between use of montelukast among patients with asthma and neuropsychiatric events. | Initiation | 9/15/22 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
365 | MASTER PLAN Y4 | FDA | CDER/Office of Surveillance and Epidemiology (OSE) | The overall mission of the Innovation Center is to integrate longitudinal patient-level EHR data into the Sentinel System to enable in-depth investigations of medication outcomes using richer clinical data than are generally available in insurance claims data. The Master Plan lays out a five-year roadmap for the Sentinel Innovation Center to achieve this vision through four key strategic areas: (1) data infrastructure; (2) feature engineering; (3) causal inference; and (4) detection analytics. The projects focus on utilizing emerging technologies including feature engineering, natural language processing, advanced analytics, and data interoperability to improve Sentinel's capabilities. | Initiation | 10/1/22 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
366 | Onboarding of EHR data partners | FDA | CDER/Office of Surveillance and Epidemiology (OSE) | In the currently proposed project (DI6), structured fields from EHRs and linked claims data from two identified commercial data partners will be converted to the Sentinel Common Data Model (SCDM). The SCDM is an organizing CDM that preserves the original information from a data source and has been successfully used in the Sentinel system for over a decade. While originally built for claims data, SCDM was expanded in 2015 to accommodate some information commonly found in EHRs in separate clinical data tables to capture laboratory test results of interest and vital signs. We selected the SCDM over other CDMs because data formatted in the SCDM enables analyses that can leverage the standardized active risk identification and analysis (ARIA) tools. Operationally, both Data Partners will share SCDM transformed patient-level linked EHR-claims data with the IC after quality assessments are passed. This is a substantial advantage in this early stage of understanding how to optimally analyze such data. It will allow Sentinel investigators to directly work with the data, adapt existing analytic programs, and test algorithms. In sum, transformation of structured data from the proposed sources to SCDM format will be a key first step for potential future incorporation of these Data Partners into Sentinel to provide access to EHR-claims linked data for >10 million patients, which will be critical to meet the need identified in the 5-year Sentinel System strategic plan of 2019. | Initiation | 10/1/22 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
367 | Creating a development network | FDA | CDER/Office of Surveillance and Epidemiology (OSE) | This project has the following specific Aims: Aim 1: To convert structured data from EHRs and linked claims into the Sentinel Common Data Model at each of the participating sites. Aim 2: To develop a standardized process for storage of free-text notes locally at each site and develop steps for routine metadata extraction from these notes to facilitate direct investigator access for timely execution of future Sentinel tasks. | Initiation | 10/1/22 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
368 | Empirical evaluation of EHR-based signal detection approaches | FDA | CDER/Office of Surveillance and Epidemiology (OSE) | This project will develop approaches for abstracting and combining structured and unstructured EHR data as well as expanding TBSS methods to also identify signals for outcomes identifiable only through EHR data (e.g. natural language processing, laboratory values). | Initiation | 9/30/22 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
369 | Automatic Recognition of Individuals by Pharmacokinetic Profiles to Identify Data Anomalies | FDA | CDER/Office of Translational Sciences/Office of Biostatistics | In efforts to detect data anomalies under ANDA, the Office of Biostatistics, Division of Biometrics VIII created an R Shiny application, DABERS (Data Anomalies in BioEquivalence R Shiny), to support OSIS and OGD. Despite its demonstrated effectiveness, a major drawback is that the pharmacokinetics and pharmacodynamics may be too complicated to describe with a single statistic. Indeed, the current practice offers no practical guidelines regarding how similar PK profiles from different subjects can be in order to be considered valid. This makes it difficult to assess the adequacy of data to be accepted for an ANDA and requires additional information requests to applicants. This project will address the current gap in identifying data anomalies and potential data manipulations by use of state-of-the-art statistical methods, specifically focusing on machine learning and data augmentation. The purpose of the project is twofold. First, from a regulatory perspective, our project will provide a data-driven method that can model complex patterns of PK data to identify potential data manipulations under an ANDA. Second, from a public health research and drug development point of view, the proposed study can potentially be used to understand and quantify the variability in drug response, to guide stratification and targeting of patient subgroups, and to provide insight into what the right drug and right range of doses are for those subgroups. | Development and Acquisition | hhs.caio@hhs.gov | In-house | Yes | ||||||||||||||||
370 | CluePoints CRADA | FDA | CDER/Office of Translational Sciences/Office of Biostatistics | This project uses unsupervised machine learning to detect and identify data anomalies in clinical trial data at the site, country, and subject levels. This project will consider multiple use cases with the goals of improving data quality and data integrity, assisting site selection for inspection, and assisting reviewers by identifying potentially problematic sites for sensitivity analyses. | Development and Acquisition | 10/6/16 | 11/6/16 | hhs.caio@hhs.gov | In-house | Yes | ||||||||||||||
371 | Label comparison tool to support identification of safety-related changes in drug labeling | FDA | CDER/Office of Surveillance and Epidemiology (OSE) | A tool with AI capabilities used to assist humans in their review and comparison of drug labeling in PDF format to identify safety-related changes occurring over time. The FDA uses postmarket data to update drug labeling, which can include a broad range of new safety-related issues; safety updates may be added to various sections of drug labeling. The tool's BERT-based natural language processing model was trained to identify potential text related to newly added safety issues between drug labeling versions. | Development and Acquisition | 11/1/22 | 11/22/23 | 2/23/23 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||
372 | Artificial Intelligence (AI) Supported Annotation of FAERS Reports | FDA | CDER/Office of Surveillance and Epidemiology (OSE) | Develop a prototype software application to support the human review of FAERS data by developing computational algorithms to semi-automatically categorize FAERS reports into meaningful medication error categories based on report free text. Leveraged existing annotated reports and worked with subject matter experts to annotate subsets of FAERS reports, to generate initial NLP algorithms that can classify any report as being medication related and with an identified type of medication error. An innovative active learning approach was then used to annotate reports and build more robust algorithms for more accurate categorization. | Development and Acquisition | 9/22/23 | 9/22/23 | hhs.caio@hhs.gov | Contracted | Yes | ||||||||||||||
373 | Development of Machine Learning Approaches to Population Pharmacokinetic Model Selection and Evaluation of Application to Model-Based Bioequivalence Analysis | FDA | CDER/Office of Generic Drugs | 1. Development of a deep learning/reinforcement learning approach to population pharmacokinetic model selections 2. Implementation of an established Genetic algorithm approach to population pharmacokinetic model selections in Python. | Development and Acquisition | 8/15/21 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
374 | Machine-Learning based Heterogeneous Treatment Effect Models for Prioritizing Product-Specific Guidance Development | FDA | CDER/Office of Generic Drugs | In this project, we propose to develop and implement a novel machine learning algorithm for estimating heterogeneous treatment effects to prioritize PSG development. Specifically, we propose three major tasks. First, we will address an important problem in treatment effect estimation from observational data, where the observed variables may contain confounders, i.e., variables that affect both the treatment and the outcome. We will build on recent advances in variational autoencoders to introduce a data-driven method to simultaneously estimate the hidden confounders and the treatment effect. Second, we will evaluate our model on both synthetic datasets and previous treatment effect estimation benchmarks. The ground truth data enable us to investigate model interpretability. Third, we will validate the model with the real-world PSG data and explain model output for a particular PSG by collaborating with the FDA team. The real-world datasets are crucial to validate our model, which may include the Orange Book, FDA's PSGs, the National Drug Code directory database, Risk Evaluation and Mitigation Strategies (REMS) data, and IQVIA National Sales Perspectives that are publicly available, as well as internal ANDA submission data. | Development and Acquisition | 9/10/22 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
375 | Developing Tools based on Text Analysis and Machine Learning to Enhance PSG Review Efficiency | FDA | CDER/Office of Generic Drugs | 1. Develop a novel neural summarization model in tandem with an information retrieval system, tailored for PSG review, with dual attention over both sentence-level and word-level outputs by taking advantage of both extractive and abstractive summarization. 2. Evaluate the new model with the PSG data and the large CNN/Daily Mail dataset. 3. Develop an open-source software package for the text summarization model and the information retrieval system. | Development and Acquisition | 10/15/19 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
376 | BEAM (Bioequivalence Assessment Mate) - a Data/Text Analytics Tool to Enhance Quality and Efficiency of Bioequivalence Assessment | FDA | CDER/Office of Generic Drugs | We aim to develop BEAM using verified data analytics packages, text mining, and artificial intelligence (AI) toolsets (including machine learning (ML)), to streamline the labor-intensive work during BE assessments to facilitate high-quality and efficient regulatory assessments. | Development and Acquisition | 8/10/18 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||||
377 | OSCAR | FDA | CTP/OS/DRSI | OSCAR (Office of Science Customer Assistance Response) is a chatbot with predefined intents for customers to get help from the Customer Service Center. It offers a 24/7 user interface allowing users to input questions and view previous responses, as well as a dashboard offering key metrics for admin users. | Operation and Maintenance | 6/1/21 | 1/1/22 | hhs.caio@hhs.gov | Contracted | Yes | ||||||||||||||
378 | SSTAT | FDA | CTP/OS/DRSI | Self-Service Text Analytics Tool (SSTAT) is used to explore the topics of a set of documents. Documents can be submitted to the tool in order to generate a set of topics and associated keywords. A visual listing of the documents and their associated topics is automatically produced to help quickly snapshot the submitted documents. | Operation and Maintenance | 10/1/20 | 6/1/22 | hhs.caio@hhs.gov | Contracted | Yes | ||||||||||||||
379 | ASSIST4TOBACCO | FDA | CTP/OS/DRSI | ASSIST4Tobacco is a semantic search system that helps CTP stakeholders find tobacco authorization applications more accurately and efficiently. | Implementation | 10/1/20 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
380 | Process Large Amount of Submitted Docket Comments | FDA | CBER/OBPV/DABRA | Provide an automated process to transfer, deduplicate, summarize, and cluster docket comments using AI/ML. | Implementation | 11/15/21 | 11/15/21 | 4/1/22 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||
381 | To develop novel approaches to expand and/or modify the vaccine AESI phenotypes in order to further improve adverse event detection | FDA | CBER/OBPV/DABRA | Developing a BERT-like ML model to improve detection of adverse events of special interest by applying clinically oriented language models pre-trained on clinical documents from UCSF | Implementation | 9/1/21 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
382 | BEST Platform improves post-market surveillance efforts through the semi-automated detection, validation and reporting of adverse events. | FDA | CBER/OBPV/DABRA | The BEST Platform employs a suite of applications and techniques to improve the detection, validation and reporting of biologics-related adverse events from electronic health records (EHRs). The Platform utilizes ML and NLP to detect potential adverse events, and extract the important features for clinicians to validate. | Implementation | 9/1/19 | 12/1/20 | hhs.caio@hhs.gov | Contracted | Yes | ||||||||||||||
383 | AI Engine for Knowledge discovery, Post-market Surveillance and Signal Detection | FDA | CFSAN/OFAS | The use of artificial intelligence in post-market surveillance and signal detection will enhance CFSAN's ability to detect potential problems associated with CFSAN commodities, including leveraging data to investigate potential issues with chronic, long-term exposure to food additives, color additives, food contact substances and contaminants, or long-term use of cosmetics. The OFAS Warp Intelligent Learning Engine (WILEE) project seeks to establish an intelligent knowledge discovery and analytic agent for the Office. WILEE (pronounced Wiley) provides a horizon-scanning solution, analyzing data from the WILEE knowledgebase, to enable the Office to maintain a proactive posture and the capacity to forecast industry trends, so that the Office can stay ahead of the development cycle, prepare to handle a large influx of submissions (operational risk - e.g., a change in USDA rules regarding antimicrobial residue levels in poultry processing), and prioritize actions based on risk or stakeholder-perceived risk regarding substances under OFAS purview (e.g., the yoga mat incident). WILEE will provide the Office with an advanced, data-driven, risk-based decision-making tool that leverages AI technologies to integrate and process a large variety of data sources, generating reports with quick insights that will significantly improve our time-to-results. | Implementation | 8/1/21 | hhs.caio@hhs.gov | Yes | ||||||||||||||||
384 | Data Infrastructure Backbone for AI applications | FDA | CFSAN /OFAS | OFAS is creating a data lake (the WILEE knowledgebase) that ingests and integrates data from a variety of sources to assist our use of advanced analytics in driving risk-based decision making. The sources of data include internal stakeholder submission data, data generated by OFAS staff, scientific information from PubMed, NIH, and other scientific publications, CFSAN-generated data such as the Total Diet Study, news articles and blog posts, publications from sister agencies, food ingredient and packaging data, food sales data, etc. The design of this data store allows for the automated ingestion of new data while allowing for manual curation where necessary. It is also designed to enable the identification, acquisition, and integration of new data sources as they become available. The design of the data lake centralizes information about CFSAN-regulated products, food additives, color additives, GRAS substances, and food contact substances, and integrates the different sources of information with stakeholder submission information contained in FARM and cheminformatics information in CERES, enabling greater insights and more efficient knowledge discovery during review of premarket submissions and post-market monitoring of the U.S. food supply. | Operation and Maintenance | 6/1/20 | 6/1/20 | 6/14/21 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||
385 | Emerging Chemical Hazard Intelligence Platform (ECHIP - completed) | FDA | CFSAN/OFAS | This is an AI solution designed to identify emerging, potential chemical hazards or emerging stakeholder concerns regarding potential hazards associated with substances of interest to CFSAN. Implementation of this solution will enable CFSAN to take proactive measures to protect and/or address concerns from our stakeholders. ECHIP uses data from the news and social media, and the scientific literature, to identify potential issues that may require CFSAN's attention. Real-world examples without the ECHIP AI solution have taken 2-4 weeks for signal identification and verification, depending on the number of scientists dedicated to reviewing the open literature, news, and social media. Results from pilot studies indicate that ECHIP could reduce the overall signal detection and validation process to about 2 hours. ECHIP accomplishes this reduction by automatically ingesting, reviewing, analyzing, and presenting data from multiple sources to scientists in such a way that signal detection and verification can be done in a very short time period. | Operation and Maintenance | 8/1/18 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||||
386 | Development of virtual animal models to simulate animal study results using Artificial Intelligence (AI) | FDA | NCTR | Testing data from animal models provides crucial evidence for the safety evaluation of chemicals. These data have been an essential component in regulating drug, food, and chemical safety by regulatory agencies worldwide including FDA. As a result, a wealth of animal data is available from the public domain and other sources. As the toxicology community and regulatory agencies move towards a reduction, refinement, and replacement (3Rs principle) of animal studies, we proposed an AI-based generative adversarial network (GAN) architecture to learn from existing animal studies so that it can generate animal data for new and untested chemicals without conducting further animal experiments. The FDA has developed guidelines and frameworks to modernize toxicity assessment with alternative methods, such as the FDA Predictive Toxicology Roadmap and the Innovative Science and Technology Approaches for New Drugs (ISTAND). These programs facilitate the development and evaluation of alternative methodologies to expand the FDA's toxicology predictive capabilities, to reduce the use of animal testing, and to facilitate drug development. A virtual animal model with capability of simulating animal studies could serve as an alternative to animal studies to support the FDA mission. | Initiation | hhs.caio@hhs.gov | In-house | Yes | ||||||||||||||||
387 | Assessing and mitigating bias in applying Artificial Intelligence (AI) based natural language processing (NLP) of drug labeling documents | FDA | NCTR | As use of AI in biomedical sciences increases, significant concerns have been raised regarding bias, stereotyping, or prejudice in some AI systems. An AI system trained on inappropriate or inadequate data may reinforce biased patterns and thus provide biased predictions. In particular, when an AI model is trained on datasets from different domains and then transferred to a new application domain, the system needs to be evaluated properly to avoid potential bias risks. Given the increasing number of transfer learning and AI applications in document analysis to support FDA review, this proposal is to conduct a comprehensive study to understand and assess bias in applying AI-based natural language processing to drug labeling documents and, by extension, to develop a strategy to mitigate such bias. | Initiation | hhs.caio@hhs.gov | In-house | Yes | ||||||||||||||||
388 | Identify sex disparities in opioid drug safety signals in the FDA Adverse Event Reporting System (FAERS) and social media Twitter to improve women's health | FDA | NCTR | This proposal aims to address OWH 2023 Priority Area: Use of real-world data and evidence to inform regulatory processes. We propose to analyze sex differences in adverse events for opioid drugs in social media (Twitter) and the FDA Adverse Event Reporting System (FAERS). We will compare sex disparities identified from FAERS and Twitter to assess whether Twitter data can be used as an early warning system to signal opioid-related issues specific to women. The identified sex disparities in adverse events for opioid drugs from this project could help improve women's health. | Initiation | hhs.caio@hhs.gov | In-house | Yes | ||||||||||||||||
389 | Prediction of adverse events from drug - endogenous ligand - target networks generated using 3D-similarity and machine learning methods. | FDA | NCTR | Excluding areas of the biochemical space near activity cliffs [1], molecular similarity [2] has long proven to be an outstanding tool in virtual screening [3], absorption, distribution, metabolism, and excretion (ADME) [4], drug design [5] and toxicology [6]. Among these, the toxicological response is the most challenging task due to its immense complexity involving multiple pathways and protein targets. Although many adverse drug reactions (ADRs) result from genetic polymorphisms and factors such as the patient's medical history and the treatment dosage and regimen, on a fundamental level all ADRs are initiated by the binding of a drug molecule to a target, whether intended (therapeutic target) or non-intended (off-target interactions with promiscuous proteins) [7]. While molecular similarity approaches designed to identify off-target interaction sites have been explored since the late 2000s [8, 9], most have been focused on drug design, repurposing and more generally, efficacy, whereas relatively few have been applied to toxicology [10, 11]. Since there are multiple approaches to molecular similarity (structural, functional, whole molecule, pharmacophore, etc. [12]), the performance of any of the above applications depends strongly on the metrics by which similarity is quantified. For the past 10 years, DSB has been working on creating a universal molecular modeling approach utilizing unique three-dimensional fingerprints encoding both the steric and electrostatic fields governing the interactions between ligands and receptors. 
It has been demonstrated that these fingerprints could quantify reliably both the structural and functional similarities between molecules [13, 14] and their application for prediction of adverse events from AI generated drug - endogenous ligand - target networks could provide new insights into yet unknown mechanisms of toxicity. | Initiation | hhs.caio@hhs.gov | In-house | Yes | ||||||||||||||||
390 | Predictive toxicology models of drug placental permeability using 3D-fingerprints and machine learning | FDA | NCTR | The human placenta plays a pivotal role in fetal growth, development, and fetal exposure to chemicals and therapeutics. The ability to predict placental permeability of chemicals during pregnancy is an important factor that can inform regulatory decisions related to fetal safety and clinical trials with women of child-bearing potential (WOCBP). The human placenta contains transport proteins, which facilitate the transfer of various endogenous substances and xenobiotics. Several mechanisms allow this transfer: i) passive diffusion, ii) active transport, iii) facilitated diffusion, iv) pinocytosis, and v) phagocytosis. Among these, passive and active transport are the two major routes. Small, non-ionized, highly lipophilic drugs cross the placenta via passive diffusion; however, relatively large molecules (MW > 500 Da) with low lipophilicity are carried by transporters. While prediction of the ability of drugs to cross the placenta via diffusion is straightforward, the complexity of molecular interactions between drugs and transporters has proven to be a challenging problem to solve. Virtually all QSARs (Quantitative Structure Activity Relationships) published to date model small datasets (usually not exceeding 100 drugs) and utilize weak validation strategies [1-5]. In this proposal, 3D-molecular similarities of endogenous placental transporter ligands to known drug substrates will be used to identify the most likely mode of drug transport (active/passive) and build predictive, quantitative and categorical 3D-SDAR models by linking their molecular characteristics to placental permeability. Permeability data will be collected via mining the literature, the CDER databases, and conducting empirical assessments using in vitro NAMs with confirmation using rodent models. 
Predictability will be validated using: i) blind test sets including known controls and ii) a small set of drugs with unknown permeabilities, which will be tested in in vitro and in vivo models. | Initiation | 3/6/23 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||||
391 | Opioid agonists/antagonists knowledgebase (OAK) to assist review and development of analgesic products for pain management and opioid use disorder treatment | FDA | NCTR | The number of deaths caused by opioid overdose in the United States has been increasing dramatically for the last decade, and opioid misuse and abuse continue at alarmingly high rates. Opioid use disorder (OUD) often starts with use of prescription opioid analgesics. Therefore, the development of abuse-deterrent analgesic products may significantly impact the trajectory of the opioid crisis. In addition, FDA is making new efforts to support novel product innovation for pain management and the treatment of OUD to combat this opioid crisis. Opioid agonists bind and activate opioid receptors to decrease calcium influx and cyclic adenosine monophosphate (cAMP), leading to hyperpolarization that inhibits pain transmission. Opioid antagonists bind and inhibit or block opioid receptors. Both opioid agonists and antagonists are used in drug products for pain management and treatment of opioid addiction. An opioid agonists/antagonists knowledgebase (OAK) would be useful for FDA reviewers to inform evaluation and to assist development of analgesics and of additional treatments for OUD. To create a comprehensive OAK, we propose to curate the experimental data on opioid agonist/antagonist activity from the public domain, experimentally test some 2800 drugs in functional opioid receptor assays using a quantitative high-throughput screening (qHTS) platform, and develop and validate in silico models to predict opioid agonist/antagonist activity. The created OAK knowledgebase could be used for retrieving experimental opioid agonist/antagonist activity data and the related experimental protocols. 
For chemicals without experimental data, read-across methods could be used to find similar chemicals in OAK to estimate the opioid agonist/antagonist activity, and the in silico models in OAK could be used to predict the opioid agonist/antagonist activity. The retrieved or predicted activity data can then be used to inform regulatory review or to assist in the development of analgesics. | Implementation | 2/25/20 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||||
392 | Development of a Comprehensive Open Access Molecules with Androgenic Activity Resource (MAAR) to Facilitate Assessment of Chemicals | FDA | NCTR | Androgen receptor (AR) is a ligand-dependent transcription factor and a member of the nuclear receptor superfamily, which is activated by androgens. AR is the target for many drugs, but it could also act as an off-target for drugs and other chemicals. Therefore, detecting androgenic activity of drugs and other FDA regulated chemicals is critical for evaluation of drug safety and assessment of chemical risk. There is a large amount of androgenic activity data in the public domain, which could be an asset for the scientific community and regulatory science. However, the data are distributed across different and diverse sources and stored in different formats, limiting the use of the data in research and regulation. Therefore, a comprehensive, reliable resource that provides open access to the data and enables modeling and prediction of androgenic activity for untested chemicals is urgently needed. This project will develop a high-quality open access Molecules with Androgenic Activity Resource (MAAR) including data and predictive models fully compliant with the FAIR (Findable, Accessible, Interoperable, and Reusable) principles. MAAR can be used to facilitate research on androgenic activity of chemicals and support regulatory decision making concerning efficacy and safety evaluation of drugs and chemicals in FDA regulated products. | Implementation | 11/6/20 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||||
393 | Artificial Intelligence (AI)-based Natural Language Processing (NLP) for FDA labeling documents | FDA | NCTR | FDA has historically generated and continues to generate a variety of documents during the product-review process, which are typically unstructured text and often do not follow standards. Therefore, analysis of semantic relationships plays a vital role in extracting useful information from FDA documents to facilitate regulatory science research and improve the FDA product review process. The rapid advancement in artificial intelligence (AI) for Natural Language Processing (NLP) offers an unprecedented opportunity to analyze semantic text data using language models trained on large biomedical corpora. This study assesses AI-based NLP for FDA documents, with a focus on labeling documents. Specifically, we will apply publicly available language models (e.g., BERT and BioBERT) to the FDA drug labeling documents available from the FDALabel tool, which manages over 120K labeling documents including over 40K Human Prescription Drug and Biological Products. We will investigate four areas of AI applications that are important to regulatory science research: (1) the interpretation and classification of drug properties (e.g., safety and efficacy) with AI reading, (2) text summarization to provide highlights of labeling sections, (3) automatic anomaly analysis (AAA) for signal identification, and (4) information retrieval with Amazon-style question answering. We will compare the AI-based NLP with a MedDRA-based approach whenever possible for drug safety and efficacy. The study will provide a benchmark for fit-for-purpose application of public language models to FDA documents and, moreover, the outcome of the study could provide a scientific basis to support the future development of the FDALabel tool, which is widely used in the CDER review process. 
| Implementation | 5/14/21 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||||
394 | Informing selection of drugs for COVID-19 treatment by big data analytics and artificial intelligence | FDA | NCTR | The COVID-19 pandemic is currently the biggest global health concern. As of July 11, 2020, more than 12 million people worldwide have tested positive for SARS-CoV-2 infection and more than half a million deaths have been caused by COVID-19. Currently, no vaccines and/or drugs have been proven effective for treating COVID-19. Therefore, many drug products on the market are being repurposed for the treatment of COVID-19. However, sufficient evidence is needed to determine that the repurposed drugs are safe and effective. Therefore, safety information on the drugs selected for repurposing is important. The proposed project aims to mine adverse drug events using artificial intelligence and big data analytics in the public domain, including the agency's database, public databases, and social media data, for the drugs to be repurposed for the treatment of COVID-19. The ultimate goal of this project is to provide detailed adverse event information that can be used to facilitate safety evaluation for drugs repurposed for the treatment of COVID-19. The detailed adverse event information will be used to develop recommendations for selecting the right drugs for repurposing efforts and to help select the appropriate COVID-19 patients, and thus better combat the pandemic. | Implementation | 3/27/21 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||||
395 | Towards Explainable AI: Advancing Predictive Modeling for Regulatory Use | FDA | NCTR | Artificial Intelligence (AI) is a broad discipline of training machines to think and accomplish complex intellectual tasks like humans. It learns from existing data/information to predict future outcomes, distill knowledge, offer advice, or plan action steps. The rise of AI has offered both opportunities and challenges to FDA in two aspects: (1) how to assess and evaluate marketed AI-centric products and (2) how to implement AI methods to improve the agency's operation. One of the key aspects of both regulatory applications is to understand the underlying features driving AI performance and, by extension, its interpretability in the context of application. Unlike statistical evaluation (e.g., accuracy, sensitivity, and specificity), model interpretability assessment lacks quantitative metrics. In most cases, the assessment tends to be subjective, where prior knowledge is often used as a ground-truth to explain the biological relevance of underlying features, e.g., whether the biomarkers featured by the model are in accordance with existing findings. In reality, there is a trade-off between statistical performance and interpretability among different AI algorithms, and understanding the difference will improve the context of use of AI technologies in regulatory science. To that end, we will investigate representative AI methods, in terms of their performance and interpretability, first through benchmark datasets that have been well-established in the research community, and then extend to clinical/pre-clinical datasets. This project will provide basic parameters and offer insightful guidance on developing explainable AI models to facilitate real-world decision making in regulatory settings. | Implementation | 3/27/21 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||||
396 | Identification of sex differences on prescription opioid use (POU)-related cardiovascular risks by big data analysis | FDA | NCTR | 1) Prescription opioid use (POU) varies among patient population subgroups, such as gender, age, and ethnicity. POU can potentially cause various adverse effects in the respiratory, gastrointestinal, musculoskeletal, cardiovascular, immune, endocrine, and central nervous systems. Important sex differences have been observed in POU-associated cardiac endpoints. Currently, systematic knowledge is lacking for risk factors associated with the increased cardiotoxicity of POU in women. 2) Currently, the FDA utilizes two methods of analysis for data mining, the Proportional Reporting Ratio (PRR) and the Empirical Bayesian Geometric Mean (EBGM) to identify significant statistical associations between products and adverse events (AEs). These methods are not applicable when two or more reporting measures (e.g. gender, age, race, etc.) must be considered and compared. In this study, a novel statistical model will be developed to detect the safety signals when gender is considered as the third variable. Safety signals will then be detected and compared from combined multiple-layered real-world evidence in the form of EHRs from diverse sources. Sex-dependent differences in risk factors for cardiotoxicity from POU will be identified and analyzed using big data methods and AI-related tools. 3) The proposed project addresses the first of four priority areas of FDA's 2018 Strategic Policy Roadmap: Reduce the burden of addiction crises that are threatening American families, and two priority areas of Women's Health Research Roadmap: Priority Area 1: Advance Safety and Efficacy, and Priority Area 5: Expand Data Sources and Analysis. 
The results may provide information and knowledge to help FDA drug reviewers and physicians be aware of sex differences in response to certain POU drugs and to combinations of POU with other prescription drugs, thereby preventing or reducing the risk of POU drug-induced CVD in women. | Development and Acquisition | 9/7/21 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||||
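The PRR mentioned in the row above is simple enough to show concretely. Below is a minimal sketch of a Proportional Reporting Ratio computed from a 2x2 drug/event contingency table and stratified by sex, as the project proposes; all counts are invented for illustration and are not FAERS data.

```python
# Proportional Reporting Ratio (PRR): the event's reporting rate for the drug
# of interest divided by its reporting rate for all other drugs.
def prr(a, b, c, d):
    """a: reports with drug and event; b: drug without the event;
    c: other drugs with the event; d: other drugs without the event."""
    if a + b == 0 or c == 0:
        raise ValueError("cannot compute PRR with empty margins")
    return (a / (a + b)) / (c / (c + d))

# Hypothetical counts, stratified by sex.
# A conventional screening threshold is PRR >= 2.
female = prr(30, 970, 120, 8880)  # 2.25 -> potential signal in women
male = prr(12, 988, 150, 8850)    # 0.72 -> no signal in men
```

Comparing the two stratified PRRs (here 2.25 vs. 0.72) is one simple way a third variable such as sex can be folded into disproportionality analysis, which the standard unstratified PRR and EBGM do not do.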
397 | NCTR/DBB-CDER/OCS collaboration on A SafetAI Initiative to Enhance IND Review Process | FDA | NCTR | The development of animal-free models has been actively investigated and successfully demonstrated as an alternative to animal-based approaches for toxicity assessments. Artificial Intelligence (AI) and Machine Learning (ML) have been the central engine in this paradigm shift to identify safety biomarkers from non-animal assays or to predict safety outcomes solely based on chemical structure data. AI refers to computer systems or algorithms that learn from existing data to predict future outcomes. ML, a subset of AI, has been specifically studied to make predictions for adverse drug reactions. Deep Learning (DL) is arguably the most advanced approach in ML, frequently outperforming conventional ML approaches in the study of drug safety and efficacy. DL usually consists of multiple layers of neural networks that mimic the cognitive behaviors of human learning and problem solving to tackle data-intensive problems. Among many studies using AI/ML, DL has become a default algorithm to consider due to its superior performance. This proposal will apply DL to flag safety concerns regarding drug-induced liver injury (DILI) and carcinogenicity during the IND review process. | Initiation | 6/3/22 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||||
398 | Using XGBoost Machine Learning Method to Predict Antimicrobial Resistance from WGS data | FDA | CVM | Genomic data and artificial intelligence/machine learning (AI/ML) are used to study antimicrobial resistance (AMR) in Salmonella, E. coli, Campylobacter, and Enterococcus isolated from retail meats, humans, and food producing animals. The eXtreme Gradient Boosting (XGBoost) model is implemented to improve upon categorical resistant-versus-susceptible predictions by predicting antimicrobial Minimum Inhibitory Concentrations (MICs) from WGS data. | Development and Acquisition | 1/1/19 | 1/1/20 | hhs.caio@hhs.gov | In-house | Yes | ||||||||||||||
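The row above pairs XGBoost with WGS-derived features to predict MICs. As a self-contained stand-in for the xgboost library, the sketch below implements the core gradient-boosting loop with regression stumps; the genotype matrix (gene presence/absence) and log2(MIC) targets are synthetic, invented purely for illustration.

```python
class Stump:
    """One-split regression tree: predicts a constant on each side of a threshold."""
    def __init__(self, feat, thresh, left, right):
        self.feat, self.thresh, self.left, self.right = feat, thresh, left, right

    def predict(self, x):
        return self.left if x[self.feat] <= self.thresh else self.right

def fit_stump(X, y):
    """Least-squares best single split over all features and thresholds."""
    best, best_err = None, float("inf")
    for f in range(len(X[0])):
        for t in sorted(set(row[f] for row in X)):
            lo = [yi for xi, yi in zip(X, y) if xi[f] <= t]
            hi = [yi for xi, yi in zip(X, y) if xi[f] > t]
            if not lo or not hi:
                continue
            ml, mh = sum(lo) / len(lo), sum(hi) / len(hi)
            err = sum((yi - ml) ** 2 for yi in lo) + sum((yi - mh) ** 2 for yi in hi)
            if err < best_err:
                best_err, best = err, Stump(f, t, ml, mh)
    return best

def boost(X, y, rounds=20, lr=0.3):
    """Gradient boosting for squared error: repeatedly fit stumps to residuals."""
    base = sum(y) / len(y)
    resid = [yi - base for yi in y]
    models = []
    for _ in range(rounds):
        stump = fit_stump(X, resid)
        if stump is None:
            break
        models.append(stump)
        resid = [r - lr * stump.predict(x) for r, x in zip(resid, X)]
    return lambda x: base + sum(lr * m.predict(x) for m in models)

# Synthetic genotypes: feature 0 = presence of a resistance gene; target = log2(MIC).
X = [[1, 0], [1, 1], [0, 0], [0, 1], [1, 0], [0, 0]]
y = [4.0, 4.0, 1.0, 1.0, 4.0, 1.0]
predict = boost(X, y)
```

In practice the real pipeline would hand a k-mer or gene-presence matrix to `xgboost.XGBRegressor`, but the residual-fitting loop above is the idea that library optimizes.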
399 | individual Functional Activity Composite Tool (inFACT) | NIH | National Institutes of Health (NIH) CC | inFACT is being developed for use in the Social Security Administration (SSA) disability determination process to assist adjudicators in identifying evidence on function from case records that might be hundreds or thousands of pages long. inFACT displays information on whole person function as extracted from an individual's free text medical records and aligned with key business elements. | Development and Acquisition | 9/30/24 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
400 | Assisted Referral Tool | NIH | National Institutes of Health (NIH) CSR | To provide assistance in assigning appropriate scientific areas for grant applications. | Operation and Maintenance | 5/15/18 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
401 | NanCI: Connecting Scientists | NIH | National Institutes of Health (NIH) NCI | Uses AI to match scientific content to users' interests. By collecting papers into a folder, a user can engage the tool to find similar articles in the scientific literature, and can refine the recommendations by upvoting or downvoting them. Users can also connect with others via their interests, and receive and make recommendations via this social network. | Development and Acquisition | 1/1/24 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
402 | Detection of Implementation Science focus within incoming grant applications | NIH | National Institutes of Health (NIH) NHLBI | This tool uses natural language processing and machine learning to calculate an Implementation Science (IS) score that is used to predict if a newly submitted grant application proposes to use science that can be categorized as "Implementation Science" (a relatively new area of delineation). NHLBI uses the "IS score" in its decision for assigning the application to a particular division for routine grants management oversight and administration. | Operation and Maintenance | 1/1/20 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||||
403 | Federal IT Acquisition Reform Act (FITARA) Tool | NIH | National Institutes of Health (NIH) NIAID | The tool automates the identification of NIAID contracts that are IT-related. | Operation and Maintenance | 7/1/17 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
404 | Division of Allergy, Immunology, and Transplantation (DAIT) AIDS-Related Research Solution | NIH | National Institutes of Health (NIH) NIAID | The tool uses natural language processing (NLP), text extraction, and classification algorithms to predict both high/medium/low priority and area of research for a grant application. The incoming grant applications are ranked based on these predictions and more highly-ranked applications are prioritized for review. | Operation and Maintenance | 3/1/19 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
405 | Scientific Research Data Management System Natural Language Processing Conflict of Interest Tool | NIH | National Institutes of Health (NIH) NIAID | A tool that identifies entities within a grant application to allow NIAID's Scientific Review Program team to more easily identify conflicts of interest (COI) between grant reviewers and applicants using NLP methods (e.g., OCR, text extraction). | Operation and Maintenance | 10/1/19 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
406 | Tuberculosis (TB) Case Browser Image Text Detection | NIH | National Institutes of Health (NIH) NIAID | A tool to detect text in images that could potentially be Personally Identifiable Information (PII)/Protected Health Information (PHI) in TB Portals. | Operation and Maintenance | 7/1/19 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
407 | Research Area Tracking Tool | NIH | National Institutes of Health (NIH) NIAID | A dashboard that incorporates machine learning to help identify projects within certain high-priority research areas. | Operation and Maintenance | 1/1/21 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||||
408 | NIDCR Digital Transformation Initiative (DTI) | NIH | National Institutes of Health (NIH) NIDCR | An initiative to create a natural language processing chatbot to improve efficiency, transparency, and consistency for NIDCR employees. | Development and Acquisition | 6/1/23 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
409 | NIDCR Data Bank | NIH | National Institutes of Health (NIH) NIDCR | The project will permit intramural research program investigators to move large sets of unstructured data into cloud archival storage, which will scale, provide cost-effective data tiering, capture robust metadata sufficient for management and governance, and create secondary or tertiary opportunities for analysis leveraging cognitive services AI/ML/NLP toolsets. | Development and Acquisition | 6/1/23 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
410 | Automated approaches for table extraction | NIH | National Institutes of Health (NIH) NIEHS | This project developed an automated, model-based process to reduce the time and level of effort for manual extraction of data from tables. Published data tables are a particularly data-rich and challenging presentation of critical information in published research. | Development and Acquisition | 1/1/20 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
411 | SWIFT Active Screener | NIH | National Institutes of Health (NIH) NIEHS | Applies statistical models designed to save screeners time and effort through active learning. It utilizes user feedback to automatically prioritize studies. Supports literature screening for Division of Translational Toxicology evidence evaluations. | Operation and Maintenance | 1/1/20 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||||
412 | Splunk IT System Monitoring Software | NIH | National Institutes of Health (NIH) NIEHS | Utilizes machine learning to aggregate system logs from on-premises IT infrastructure systems and endpoints for auditing and cybersecurity monitoring purposes. | Operation and Maintenance | 1/1/20 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
413 | Clinical Trial Predictor | NIH | National Institutes of Health (NIH) NIGMS | The Clinical Trial Predictor uses an ensemble of several natural language processing and machine learning algorithms to predict whether applications may involve clinical trials based on the text of their titles, abstracts, narratives, specific aims, and research strategies. | Implementation | 5/1/23 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||||
414 | Stem Cell Auto Coder | NIH | National Institutes of Health (NIH) NIGMS | The Stem Cell Auto Coder uses natural language processing and machine learning to predict the Stem Cell Research subcategories of an application: human embryonic, non-human embryonic, human induced pluripotent, non-human induced pluripotent, human non-embryonic, and non-human non-embryonic. | Implementation | 5/1/23 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||||
415 | JIT Automated Calculator (JAC) | NIH | National Institutes of Health (NIH) NIGMS | The JIT Automated Calculator (JAC) uses natural language processing to parse Just-In-Time (JIT) Other Support forms and determine how much outside support PIs are receiving from sources other than the pending application. | Implementation | 5/1/23 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||||
416 | Similarity-based Application and Investigator Matching (SAIM) | NIH | National Institutes of Health (NIH) NIGMS | The SAIM system uses natural language processing to identify non-NIH grants awarded to NIGMS Principal Investigators. The system aids in identifying whether a grant application has significant unnecessary overlap with one funded by another agency. | Development and Acquisition | 1/1/24 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||||
417 | Remediate Adobe .pdf documents to be more accessible | NIH | National Institutes of Health (NIH) NLM | Many .pdf documents could be made available for public release if they conformed to Section 508 accessibility standards. NLM has been investigating the use of AI developed to remediate Adobe .pdf files that do not currently meet Section 508 standards. The improved files are particularly more accessible to users, such as the blind, who rely on assistive technology to read them. | Development and Acquisition | 10/30/23 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||||
418 | CylanceProtect | NIH | National Institutes of Health (NIH) NLM | Protection of Windows and Mac endpoints from cyberthreats. | Operation and Maintenance | 9/6/19 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||||
419 | MEDIQA: Biomedical Question Answering | NIH | National Institutes of Health (NIH) NLM | Using and developing AI approaches to automate question answering for different users. This project leverages NLM knowledge sources and traditional and neural machine learning to address a wide range of biomedical information needs. The project aims to improve access by providing a single entry point to NLM resources. | Initiation | 3/1/23 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
420 | CLARIN: Detecting clinicians' attitudes through clinical notes | NIH | National Institutes of Health (NIH) NLM | Understanding clinical notes and detecting bias is essential in supporting equity and diversity, as well as quality of care and decision support. NLM is using and developing AI approaches to detect clinicians' emotions, biases and burnout. | Development and Acquisition | 12/30/23 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||||
421 | Best Match: New relevance search for PubMed | NIH | National Institutes of Health (NIH) NLM | PubMed is a free search engine for biomedical literature accessed by millions of users from around the world each day. With the rapid growth of biomedical literature, finding and retrieving the most relevant papers for a given query is increasingly challenging. NLM developed Best Match, a new relevance search algorithm for PubMed that leverages the intelligence of our users and cutting-edge machine-learning technology as an alternative to the traditional date sort order. | Operation and Maintenance | 1/1/23 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||||
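Relevance ranking of the kind Best Match introduced starts from term-weighted scoring of documents against a query. The sketch below implements Okapi BM25 as a generic illustration of sorting by relevance rather than date; this is not NLM's production algorithm, and the example documents are invented.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document for the query with Okapi BM25 term weighting."""
    toks = [d.lower().split() for d in docs]
    n = len(toks)
    avgdl = sum(len(d) for d in toks) / n  # average document length
    df = Counter()                          # document frequency per term
    for d in toks:
        df.update(set(d))
    scores = []
    for d in toks:
        tf = Counter(d)
        s = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            s += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

docs = [
    "aspirin in secondary prevention of myocardial infarction",
    "dietary fiber and colon health",
    "low dose aspirin and cardiovascular outcomes",
]
scores = bm25_scores("aspirin cardiovascular", docs)  # doc 2 ranks first
```

A production system like Best Match layers machine-learned re-ranking, trained on user click data, on top of a first-stage term-weighted retrieval such as this.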
422 | SingleCite: Improving single citation search in PubMed | NIH | National Institutes of Health (NIH) NLM | A search targeted at finding a specific document in a database is called a single citation search; such searches are particularly important for scholarly databases such as PubMed because they represent a typical information need of users. NLM developed SingleCite, an automated algorithm that establishes a query-document mapping by building a regression function to predict the probability of a retrieved document being the target based on three variables: the score of the highest-scoring retrieved document, the difference in score between the two top retrieved documents, and the fraction of the query matched by the candidate citation. SingleCite shows superior performance in benchmarking experiments and is applied to rescue queries that would otherwise fail. | Operation and Maintenance | 1/1/23 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||||
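The SingleCite row names its three predictor variables explicitly, so a regression of the described shape can be sketched. The weights and bias below are invented for illustration; NLM fit theirs to real query-document data.

```python
import math

def target_probability(top_score, score_gap, query_frac,
                       weights=(1.5, 2.0, 3.0), bias=-4.0):
    """Logistic regression on SingleCite's three stated features: the top
    document's retrieval score, the score gap to the runner-up, and the
    fraction of the query matched. Weights/bias are hypothetical."""
    z = bias + weights[0] * top_score + weights[1] * score_gap + weights[2] * query_frac
    return 1.0 / (1.0 + math.exp(-z))

# A confident match: high top score, clear gap to the runner-up, full query match.
p_hit = target_probability(top_score=2.0, score_gap=1.2, query_frac=1.0)
# An ambiguous retrieval: weak score, negligible gap, partial match.
p_miss = target_probability(top_score=0.5, score_gap=0.05, query_frac=0.4)
```

Thresholding this probability is what lets the system decide whether to present a single citation directly or fall back to a normal result list.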
423 | Computed Author: author name disambiguation for PubMed | NIH | National Institutes of Health (NIH) NLM | PubMed users frequently use author names in queries for retrieving scientific literature. However, author name ambiguity (different authors sharing the same name) may lead to irrelevant retrieval results. NLM developed a machine-learning method to score the features for disambiguating a pair of papers with ambiguous names. Subsequently, agglomerative clustering is employed to collect all papers belonging to the same author from those classified pairs. Disambiguation performance is evaluated with manual verification of random samples of pairs from the clustering results, achieving higher accuracy than other state-of-the-art methods. It has been integrated into PubMed to facilitate author name searches. | Operation and Maintenance | 1/1/23 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||||
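The Computed Author pipeline scores paper pairs and then clusters them. A simplified stand-in for the agglomerative step is transitive closure (single-linkage at a score threshold) over the pairs the model classifies as same-author, implemented here with union-find; the paper IDs and scores are invented.

```python
def cluster(papers, pair_scores, threshold=0.8):
    """Group papers whose pairwise same-author scores meet the threshold,
    taking the transitive closure via union-find."""
    parent = {p: p for p in papers}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path compression
            p = parent[p]
        return p

    for (a, b), score in pair_scores.items():
        if score >= threshold:
            parent[find(a)] = find(b)      # union the two groups

    groups = {}
    for p in papers:
        groups.setdefault(find(p), []).append(p)
    return sorted(sorted(g) for g in groups.values())

papers = ["pmid1", "pmid2", "pmid3", "pmid4"]
scores = {("pmid1", "pmid2"): 0.95,   # model says: same "J. Smith"
          ("pmid2", "pmid3"): 0.91,
          ("pmid1", "pmid4"): 0.10}   # model says: a different "J. Smith"
clusters = cluster(papers, scores)    # [["pmid1", "pmid2", "pmid3"], ["pmid4"]]
```

Full agglomerative clustering would also let the threshold vary per merge; the fixed-threshold closure above captures the "collect all papers belonging to the same author from classified pairs" step in its simplest form.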
424 | NLM-Gene: towards automatic gene indexing in PubMed articles | NIH | National Institutes of Health (NIH) NLM | Gene indexing is part of the NLM's MEDLINE citation indexing efforts for improving literature retrieval and information access. Currently, gene indexing is performed manually by expert indexers. To assist this time-consuming and resource-intensive process, NLM developed NLM-Gene, an automatic tool for finding gene names in the biomedical literature using advanced natural language processing and deep learning methods. Its performance has been assessed on gold-standard evaluation datasets and is to be integrated into the production MEDLINE indexing pipeline. | Initiation | 9/1/23 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||||
425 | NLM-Chem: towards automatic chemical indexing in PubMed articles | NIH | National Institutes of Health (NIH) NLM | Chemical indexing is part of the NLM's MEDLINE citation indexing efforts for improving literature retrieval and information access. Currently, chemical indexing is performed manually by expert indexers. To assist this time-consuming and resource-intensive process, NLM developed NLM-Chem, an automatic tool for finding chemical names in the biomedical literature using advanced natural language processing and deep learning methods. Its performance has been assessed on gold-standard evaluation datasets, and it is to be integrated into the production MEDLINE indexing pipeline. | Initiation | 12/1/23 | hhs.caio@hhs.gov | In-house | Yes |||||||||||||||
426 | Biomedical Citation Selector (BmCS) | NIH | National Institutes of Health (NIH) NLM | Automation of article selection allows NLM to more efficiently and effectively index and host relevant information for the public. Through automation, NLM is able to standardize article selection and reduce the amount of time it takes to process MEDLINE articles. | Implementation | 1/1/23 | hhs.caio@hhs.gov | Contracted | Yes |||||||||||||||
427 | MTIX | NIH | National Institutes of Health (NIH) NLM | Machine learning-based system for the automated indexing of MEDLINE articles with Medical Subject Headings (MeSH) terms. Automated indexing is achieved using a multi-stage neural text ranking approach. Automated indexing allows for cost-effective and timely indexing of MEDLINE articles. | Implementation | 5/1/23 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
428 | ClinicalTrials.gov Protocol Registration and Results System Review Assistant | NIH | National Institutes of Health (NIH) NLM | This research project aims to help ClinicalTrials.gov determine whether the addition of AI could make reviewing study records more efficient and effective. | Development and Acquisition | 1/1/25 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
429 | MetaMap | NIH | National Institutes of Health (NIH) NLM | MetaMap is a widely available program providing access from biomedical text to the concepts in the Unified Medical Language System (UMLS) Metathesaurus. MetaMap uses NLP to provide a link between the text of the biomedical literature and the knowledge, including synonymy relationships, embedded in the Metathesaurus. It offers a flexible architecture in which to explore mapping strategies and their applications. MTI uses MetaMap to generate potential indexing terms. | Operation and Maintenance | 6/19/05 | hhs.caio@hhs.gov | Contracted | Yes |||||||||||||||
430 | Pangolin lineage classification of SARS-CoV-2 genome sequences | NIH | National Institutes of Health (NIH) NLM | The PangoLEARN machine learning tool provides lineage classification of SARS-CoV-2 genome sequences. Classification of SARS-CoV-2 genome sequences into defined lineages supports user retrieval of sequences based on classification and tracking of specific lineages, including those lineages associated with mutations that may decrease the effectiveness of therapeutics or protection provided by vaccination. | Operation and Maintenance | 4/1/21 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||||
431 | HIV-related grant classifier tool | NIH | National Institutes of Health (NIH) OD/DPCPSI/OAR | A front-end application for scientific staff to input grant information which then runs an automated algorithm to classify HIV-related grants. Additional features and technologies used include an interactive data visualization, such as a heat map, using Plotly Python library to display the confidence level of predicted grants. | Implementation | 3/1/23 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||||
432 | Automated approaches to analyzing scientific topics | NIH | National Institutes of Health (NIH) OD/DPCPSI/OPA | Developed and implemented a validated approach that uses natural language processing and AI/ML to group semantically similar documents (including grants, publications, or patents) and extract AI labels that accurately reflect the scientific focus of each topic to aid in NIH research portfolio analysis. | Implementation | 2/27/23 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
433 | Identification of emerging areas | NIH | National Institutes of Health (NIH) OD/DPCPSI/OPA | Developed an AI/ML-based approach that computes the age and rate of progress of topics in NIH portfolios. This information can identify emerging areas of research at scale and help accelerate scientific progress. | Implementation | 2/27/23 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
434 | Person-level disambiguation for PubMed authors and NIH grant applicants | NIH | National Institutes of Health (NIH) OD/DPCPSI/OPA | Correct attribution of grants, articles, and other products to individual researchers is critical for high-quality person-level analysis. This improved method for disambiguation of authors on articles in PubMed and of NIH grant applicants can inform data-driven decision making. | Implementation | 2/27/23 | hhs.caio@hhs.gov | Contracted | Yes |||||||||||||||
435 | Prediction of transformative breakthroughs | NIH | National Institutes of Health (NIH) OD/DPCPSI/OPA | The ability to predict scientific breakthroughs at scale would accelerate the pace of discovery and improve the efficiency of research investments. The initiative has helped identify a common signature within co-citation networks that accurately predicts the occurrence of breakthroughs in biomedicine, on average more than 5 years in advance of the subsequent publication(s) that announced the discovery. There is a patent application filed for this approach: U.S. Patent Application No. 63/257,818 (filed October 20, 2021). | Implementation | 2/27/23 | hhs.caio@hhs.gov | Contracted | Yes |||||||||||||||
436 | Machine learning pipeline for mining citations from full-text scientific articles | NIH | National Institutes of Health (NIH) OD/DPCPSI/OPA | The NIH Office of Portfolio Analysis developed a machine learning pipeline to identify scientific articles that are freely available on the internet and do not require an institutional library subscription to access. The pipeline harvests full-text PDFs, converts them to XML, and uses a Long Short-Term Memory (LSTM) recurrent neural network model that discriminates between reference text and other text in the scientific article. The LSTM-identified references are then passed through the Citation Resolution Service. For more information, see the publication describing this pipeline: Hutchins et al. 2019 (https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3000385#sec003). | Operation and Maintenance | 6/3/19 | hhs.caio@hhs.gov | Contracted | Yes |||||||||||||||
437 | Machine learning system to predict translational progress in biomedical research | NIH | National Institutes of Health (NIH) OD/DPCPSI/OPA | A machine learning system that detects whether a research paper is likely to be cited by a future clinical trial or guideline. Translational progress in biomedicine can therefore be assessed and predicted in real time based on information conveyed by the scientific community's early reaction to a paper. For more information see the publication describing this system: Hutchins et al 2019 (https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3000416) | Operation and Maintenance | 11/29/18 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
438 | Research, Condition, and Disease Categorization (RCDC) AI Validation Tool | NIH | National Institutes of Health (NIH) OD/OER | The goal of the tool is to ensure RCDC categories are accurate and complete for public reporting of data. | Development and Acquisition | 2/24/23 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
439 | Internal Referral Module (IRM) | NIH | National Institutes of Health (NIH) OD/OER | The IRM initiative automates a manual process, using Artificial Intelligence and Natural Language Processing capabilities to help predict the referral of grant applications to NIH Institutes and Centers (ICs) so that Program Officers can make informed decisions. | Implementation | 2/27/23 | hhs.caio@hhs.gov | In-house | Yes |||||||||||||||
440 | NIH Grants Virtual Assistant | NIH | National Institutes of Health (NIH) OD/OER | Chatbot to assist users in finding grant-related information via OER resources | Operation and Maintenance | 1/13/21 | hhs.caio@hhs.gov | In-house | Yes |||||||||||||||
441 | Tool for Natural Gas Procurement Planning | NIH | National Institutes of Health (NIH) OD/ORF | With this tool, NIH can establish a natural gas procurement plan and set realistic price targets based on current long-term forecasts. | Implementation | 7/3/23 | hhs.caio@hhs.gov | In-house | Yes |||||||||||||||
442 | NIH Campus Cooling Load Forecaster | NIH | National Institutes of Health (NIH) OD/ORF | This project forecasts the NIH campus's chilled water demand for the next four days. With this information, the NIH Central Utilities Plant management can plan and optimize the chiller plant's operation and maintenance. | Operation and Maintenance | 10/17/22 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||||
443 | NIH Campus Steam Demand Forecaster | NIH | National Institutes of Health (NIH) OD/ORF | This project forecasts the NIH campus steam demand for the next four days. With this information, the stakeholders at the NIH Central Utilities Plant can plan and optimize the plant operation and maintenance in advance. | Operation and Maintenance | 10/24/22 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||||
444 | Chiller Plant Optimization | NIH | National Institutes of Health (NIH) OD/ORF | This project will help to reduce the energy usage for producing chilled water to cool the NIH campus. | Development and Acquisition | 6/2/23 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||||
445 | Natural Language Processing Tool for Open Text Analysis | NIH | National Institutes of Health (NIH) OD/ORF | This project will improve facility readiness and reduce downtime by allowing other software to analyze data that was locked away in open text. | Development and Acquisition | 6/1/23 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||||
446 | Contracts and Grants Analytics Portal | OIG | OIG | The Contracts and Grants Analytics Portal uses AI to enhance HHS OIG staff's ability to access grants-related data quickly and easily: navigating directly to the text of relevant findings across thousands of audits, discovering similar findings, analyzing trends, comparing data between OPDIVs, and viewing preliminary assessments of potential anomalies between grantees. | Operation and Maintenance | 11/1/17 | 6/1/18 | 12/1/18 | hhs.caio@hhs.gov | Contracted | Yes |||||||||||||
447 | Text Analytics Portal | OIG | OIG | The text analytics portal allows personnel without an analytics background to quickly examine text documents through a related set of search, topic modeling, and entity recognition technologies; the initial implementation focuses on HHS-OIG-specific use cases. | Implementation | 2/1/21 | 2/1/21 | 9/1/22 | hhs.caio@hhs.gov | Contracted | Yes |||||||||||||
448 | Amazon Lex and Amazon Polly for the Marketplace Appeals Call Center | Centers for Medicare & Medicaid Services (CMS) | Centers for Medicare & Medicaid Services (CMS) | CMS/OHI: Amazon Lex & Amazon Polly are used in conjunction with the Amazon Connect phone system (cloud based) for the Marketplace Appeals Call Center. Amazon Lex offers self-service capabilities with virtual contact center agents, interactive voice response (IVR), information response automation, and maximizing information by designing chatbots using existing call center transcripts. Amazon Polly turns text into speech, allowing the program to create applications that talk, and build entirely new categories of speech-enabled products. | Operation and Maintenance | hhs.caio@hhs.gov | Contracted | Yes | ||||||||||||||||
449 | Feedback Analysis Solution (FAS) | Centers for Medicare & Medicaid Services (CMS) | Centers for Medicare & Medicaid Services (CMS) | The Feedback Analysis Solution is a system that uses CMS or other publicly available data (such as Regulations.Gov) to review public comments and/or analyze other information from internal and external stakeholders. The FAS uses Natural Language Processing (NLP) tools to aggregate, sort and identify duplicates to create efficiencies in the comment review process. FAS also uses machine learning (ML) tools to identify topics, themes and sentiment outputs for the targeted dataset. | Operation and Maintenance | hhs.caio@hhs.gov | Yes | |||||||||||||||||
450 | Predictive Intelligence - Incident Assignment for Quality Service Center (QSC). | Centers for Medicare & Medicaid Services (CMS) | Centers for Medicare & Medicaid Services (CMS) | Predictive Intelligence (PI) is used for incident assignment within the Quality Service Center (QSC). The solution runs on incidents created from the ServiceNow Service Portal (https://cmsqualitysupport.servicenowservices.com/sp_ess). The solution analyzes the short description provided by the end user in order to find key words with previously submitted incidents and assigns the ticket to the appropriate assignment group. This solution is re-trained with the incident data in our production instance every 3-6 months based on need. | Operation and Maintenance | hhs.caio@hhs.gov | Yes | |||||||||||||||||
451 | Fraud Prevention System Alert Summary Report Priority Score | Centers for Medicare & Medicaid Services (CMS) | Centers for Medicare & Medicaid Services (CMS) | This model will use Medicare administrative, claims, and fraud alert and investigations data to predict the likelihood of an investigation leading to an administrative action (positive outcome), supporting CMS in prioritizing their use of investigations resources. This analysis is still in development and the final model type has not been determined yet. | Development and Acquisition | hhs.caio@hhs.gov | Yes | |||||||||||||||||
452 | Center for Program Integrity (CPI) Fraud Prevention System Models (e.g. DMEMBITheftML, HHAProviderML) | Centers for Medicare & Medicaid Services (CMS) | Centers for Medicare & Medicaid Services (CMS) | These models use Medicare administrative and claims data to identify potential cases of fraud, waste, and abuse for future investigation using random forest techniques. Outputs are used to alert investigators of the potential fraud scheme and associated providers. | Operation and Maintenance | hhs.caio@hhs.gov | Yes | |||||||||||||||||
453 | Priority Score Model - ranks providers within the Fraud Prevention System using logistic regression based on program integrity guidelines. | Centers for Medicare & Medicaid Services (CMS) | Centers for Medicare & Medicaid Services (CMS) | Inputs: Medicare claims data, Targeted Probe and Educate (TPE) data, and jurisdiction information. Output: a ranking of providers within the FPS system, produced using logistic regression based on program integrity guidelines. | Operation and Maintenance | hhs.caio@hhs.gov | Yes |||||||||||||||||
454 | Priority Score Timeliness - forecast the time needed to work on an alert produced by Fraud Prevention System (Random Forest, Decision Tree, Gradient Boost, Generalized Linear Regression) | Centers for Medicare & Medicaid Services (CMS) | Centers for Medicare & Medicaid Services (CMS) | Inputs: Medicare claims data, TPE data, and jurisdiction information. Output: a forecast of the time needed to work an alert produced by FPS (Random Forest, Decision Tree, Gradient Boost, Generalized Linear Regression). | Operation and Maintenance | hhs.caio@hhs.gov | Yes |||||||||||||||||
455 | CCIIO Enrollment Resolution and Reconciliation System (CERRS) | Centers for Medicare & Medicaid Services (CMS) | Centers for Medicare & Medicaid Services (CMS) | CERRS AI for Classification | Operation and Maintenance | 2/1/18 | 2/1/18 | 8/1/18 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||
456 | Central Data Abstraction Tool-Modernized (Modernized-CDAT)- Intake Process Automation (PA) Tool | Centers for Medicare & Medicaid Services (CMS) | Centers for Medicare & Medicaid Services (CMS) | Intake PA uses advanced capabilities (NLP, OCR, AI, ML) to automate, modernize, and reduce manual efforts related to medical record review functions within MA RADV audits | Operation and Maintenance | 8/19/17 | 8/19/17 | 4/1/19 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||
457 | CMS Connect (CCN) | Centers for Medicare & Medicaid Services (CMS) | Centers for Medicare & Medicaid Services (CMS) | CCN AI for Global Search | Operation and Maintenance | 2/27/18 | 2/27/18 | 7/23/18 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||
458 | CMS Enterprise Portal Services (CMS Enterprise Portal-Chatbot) | Centers for Medicare & Medicaid Services (CMS) | Centers for Medicare & Medicaid Services (CMS) | CMS Enterprise Portal AI for Process Efficiency Improvement| Knowledge Management | Operation and Maintenance | 7/19/19 | 2/28/20 | 10/31/20 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||
459 | Federally Facilitated Marketplaces (FFM) | Centers for Medicare & Medicaid Services (CMS) | Centers for Medicare & Medicaid Services (CMS) | FFM AI for Anomaly Detection and Correction| Classification| Forecasting and Predicting Time Series | Initiation | hhs.caio@hhs.gov | Yes | |||||||||||||||||
460 | Marketplace Learning Management System (MLMS) | Centers for Medicare & Medicaid Services (CMS) | Centers for Medicare & Medicaid Services (CMS) | MLMS AI for Language Interpretation and Translation | Operation and Maintenance | 9/30/19 | 9/30/19 | hhs.caio@hhs.gov | Contracted | Yes | ||||||||||||||
461 | Medicaid And CHIP Financial (MACFin) Anomaly Detection Model for DSH Audit | Centers for Medicare & Medicaid Services (CMS) | Centers for Medicare & Medicaid Services (CMS) | The MACFin AI team developed a machine learning model to predict anomalies within DSH audit data. The model flags the top outliers in submitted DSH hospital data, identifying extreme behavior based on payment amounts and other characteristics. For example, out of all DSH allocations, the model can identify the top 1-5% of outliers for further review and auditing. The model facilitates targeted investigation of gaps and barriers and can support the process by minimizing overpayment and/or underpayment and by redistributing amounts. | Initiation | 1/1/21 | hhs.caio@hhs.gov | Contracted | Yes |||||||||||||||
462 | Medicaid And CHIP Financial (MACFin) DSH Payment Forecasting model | Centers for Medicare & Medicaid Services (CMS) | Centers for Medicare & Medicaid Services (CMS) | Forecasting model to predict future DSH payments (the next year) based on historical data and trends (e.g., the last 1-3 years). Multiple models, both time-series (statistical) and machine learning based, were trained and compared for best performance in terms of average mean error on DSH payment amounts across all hospitals. Because the DSH data were highly disorganized, the team spent time cleaning and combining data from over 6 years for all states to conduct a full model implementation and meaningful analysis. Predicting future DSH payments facilitates early planning and recommendations around trends, redistributions, etc. Modified models can also be built to predict other DSH-related metrics such as the payment-to-uncompensated ratio, underpayment, or overpayment. | Initiation | 3/1/21 | hhs.caio@hhs.gov | Contracted | Yes |||||||||||||||
463 | Performance Metrics Database and Analytics (PMDA) | Centers for Medicare & Medicaid Services (CMS) | Centers for Medicare & Medicaid Services (CMS) | PMDA AI for Anomaly Detection and Correction| Language Interpretation and Translation| Knowledge Management | Initiation | hhs.caio@hhs.gov | Yes | |||||||||||||||||
464 | Relationships, Events, Contacts, and Outreach Network (RECON) | Centers for Medicare & Medicaid Services (CMS) | Centers for Medicare & Medicaid Services (CMS) | RECON AI for Recommender System| Sentiment Analysis | Operation and Maintenance | 4/1/19 | 1/2/19 | 8/1/22 | hhs.caio@hhs.gov | Yes | ||||||||||||||
465 | Risk Adjustment Payment Integrity Determination System (RAPIDS) | Centers for Medicare & Medicaid Services (CMS) | Centers for Medicare & Medicaid Services (CMS) | RAPIDS AI for Classification| Process Efficiency Improvement | Operation and Maintenance | 9/16/19 | 9/16/19 | 4/1/20 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||
466 | Drug Cost Increase Predictions | Centers for Medicare & Medicaid Services (CMS) | Centers for Medicare & Medicaid Services (CMS) | Use historical drug cost increases to predict future increases | Initiation | hhs.caio@hhs.gov | Yes |||||||||||||||||
467 | Brand vs Generic Market Share | Centers for Medicare & Medicaid Services (CMS) | Centers for Medicare & Medicaid Services (CMS) | Analyze generic drugs compared to brand drugs over time and forecast future market shares based on Part D claims volume | Initiation | hhs.caio@hhs.gov | Yes | |||||||||||||||||
468 | Drug cost anomaly detection | Centers for Medicare & Medicaid Services (CMS) | Centers for Medicare & Medicaid Services (CMS) | Identify anomalies in drug costs on Part D claims | Initiation | hhs.caio@hhs.gov | Yes | |||||||||||||||||
469 | Artificial Intelligence (AI) Explorers Program Pilot - Automated Technical Profile | Centers for Medicare & Medicaid Services (CMS) | Centers for Medicare & Medicaid Services (CMS) | This 90-day pilot conducts research and development to investigate the generation of a machine-readable Automated Technical Profile for CMS systems, with the goal of inferring the technology fingerprint of CMS projects from multiple data sources at different stages of their development lifecycle. | Development and Acquisition | 9/21/22 | 9/21/22 | hhs.caio@hhs.gov | Contracted | Yes ||||||||||||||
470 | Artificial Intelligence (AI) Explorers Program Pilot - Section 508 Accessibility Testing | Centers for Medicare & Medicaid Services (CMS) | Centers for Medicare & Medicaid Services (CMS) | This 90-day pilot aims to better inform CMS technical leads and Application Development Organizations (ADOs) by conducting a comprehensive analysis of the data from test result documents in support of the CMS Section 508 Program. | Development and Acquisition | 9/21/22 | 9/21/22 | hhs.caio@hhs.gov | Contracted | Yes ||||||||||||||
471 | ReDIRECT: Clarivate | ASPR | BARDA (CBRN & DRIVe) | AI to identify drug repurposing candidates | Operation and Maintenance | 11/8/21 | hhs.caio@hhs.gov | Contracted | Yes |||||||||||||||
472 | ReDIRECT: AriScience | ASPR | BARDA (CBRN & DRIVe) | AI to identify drug repurposing candidates | Development and Acquisition | 4/6/23 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
473 | Burn & Blast MCMs: Rivanna | ASPR | BARDA (CBRN) | AI Based algorithms on Accuro XV to detect and highlight fractures and soft tissue injuries | Development and Acquisition | 10/29/21 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
474 | Burn & Blast MCMs: Philips | ASPR | BARDA (CBRN) | AI-based algorithms on Lumify handheld ultrasound system to detect lung injury and infectious diseases | Development and Acquisition | 5/8/20 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
475 | Burn & Blast MCMs: Philips | ASPR | BARDA (CBRN) | AI-based algorithms on Lumify handheld ultrasound system to detect traumatic injuries | Development and Acquisition | 5/8/20 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
476 | Burn & Blast MCMs: SpectralMD | ASPR | BARDA (CBRN) | Determination of burn depth severity and burn size of injuries | Development and Acquisition | 7/1/19 | hhs.caio@hhs.gov | Contracted | Yes |||||||||||||||
477 | Digital MCM: Virufy | ASPR | BARDA (DRIVe) | Using forced cough vocalization (FCV) in a smartphone to detect the presence of COVID-19 using AI. | Operation and Maintenance | 7/6/22 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
478 | Current Health | ASPR | BARDA (DRIVe) | Continuous monitoring platform and AI algorithm for COVID severity | Operation and Maintenance | 9/30/20 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
479 | Digital MCM: Raisonance | ASPR | BARDA (DRIVe) | Using forced cough vocalization (FCV) in a smartphone to detect the presence of COVID-19 and Influenza using AI. | Development and Acquisition | 3/22/23 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
480 | Digital MCM: Visual Dx | ASPR | BARDA (DRIVe) | Using smartphone images with AI to detect the presence of mpox | Development and Acquisition | 9/29/22 | hhs.caio@hhs.gov | Contracted | Yes |||||||||||||||
481 | Host-Based Diagnostics: Patchd | ASPR | BARDA (DRIVe) | Wearable device and AI model to predict sepsis at home. | Development and Acquisition | 9/28/22 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
482 | MIT Lincoln Labs | ASPR | BARDA (DRIVe) | Predicting the onset of COVID and Influenza with wearable device data using AI. | Development and Acquisition | 9/20/22 | hhs.caio@hhs.gov | Yes | ||||||||||||||||
483 | Redirect: Clarivate | ASPR | BARDA (DRIVe) | AI to identify drug repurposing candidates | Operation and Maintenance | 9/20/21 | hhs.caio@hhs.gov | Yes | ||||||||||||||||
484 | Antiviral Screening: Janssen OTA - Influenza | ASPR | BARDA (IEIDD) | Atomwise (subcontractor) Coronavirus antiviral discovery | Operation and Maintenance | 7/1/20 | hhs.caio@hhs.gov | Yes | ||||||||||||||||
488 | Cyber Threat Detection/Predictive Analytics | ASPR | Office of Critical Infrastructure | Use AI and ML tools for processing extremely large volumes of threat data | Initiation | hhs.caio@hhs.gov | Yes |||||||||||||||||
489 | emPOWER | ASPR | Office of Information Management, Data and Analytics | Using AI capabilities to rapidly develop the emPOWER COVID-19 At-Risk Population data tools and program | Operation and Maintenance | hhs.caio@hhs.gov | Contracted | Yes ||||||||||||||||
490 | Community Access to Testing | ASPR | Office of Information Management, Data and Analytics/Division of Supply Chain Control Tower | Utilizing several ML models to forecast a surge in the pandemic | Operation and Maintenance | 12/2/22 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||||
491 | Ventilator Medication Model | ASPR | Office of Information Management, Data, and Analytics/Division of Supply Chain Control Tower | Leveraging a generalized additive model to project the ventilation rate of COVID inpatients | Operation and Maintenance | 12/2/22 | hhs.caio@hhs.gov | Contracted | Yes |||||||||||||||
492 | Modeling & Simulation | ASPR | Office of Information Management, Data, and Analytics/Division of Modeling and Simulation | Create modeling tools and perform analyses in advance of biothreat events and be able to refine them during emergent events | Initiation | hhs.caio@hhs.gov | Yes | |||||||||||||||||
493 | Data Modernization | ASPR | Chief Data Officer | Develop open data management architecture that enables optimized business intelligence (BI) and machine learning (ML) on all ASPR data. | Initiation | hhs.caio@hhs.gov | Yes | |||||||||||||||||
494 | Product redistribution optimization | ASPR | Office of Information Management, Data, and Analytics/ODA | Using AI and models, allow partners (jurisdictions, pharmacies, federal entities) to optimize redistribution of products based on various factors like distance, ordering/admins, equity, etc. | Development and Acquisition | 4/1/23 | 5/1/23 | hhs.caio@hhs.gov | Contracted | Yes | ||||||||||||||
495 | Highly Infectious Patient Movement optimization | ASPR | Office of Information Management, Data, and Analytics/ODA | Given a limited number of highly infectious patient transport containers, optimize US location based on various factors like distance, population, etc. Use as a planning tool for decision-making. | Initiation | hhs.caio@hhs.gov | Yes | |||||||||||||||||
496 | AHRQ Search | AHRQ | AHRQ | Organization-wide search that includes Relevancy Tailoring, Auto-generated Synonyms, Automated Suggestions, Suggested Related Content, Auto Tagging, and "Did you mean" features to allow visitors to find specific content | Operation and Maintenance | 9/15/19 | 9/15/19 | 10/14/20 | hhs.caio@hhs.gov | Contracted | Yes |||||||||||||
497 | Chatbot | AHRQ | AHRQ | Provide interface to allow user to conversationally ask questions about AHRQ content to replace public inquiry telephone line | Operation and Maintenance | 10/15/20 | 10/15/20 | 10/14/21 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||
498 | TowerScout: Automated cooling tower detection from aerial imagery for Legionnaires' Disease outbreak investigation | CDC | CSELS | TowerScout scans aerial imagery and uses object detection and image classification models to detect cooling towers, which can be sources of community outbreaks of Legionnaires' Disease. | Operation and Maintenance | 1/1/21 | 1/1/21 | hhs.caio@hhs.gov | Yes |||||||||||||||
499 | HaMLET: Harnessing Machine Learning to Eliminate Tuberculosis | CDC | CSELS | HaMLET uses computer vision models to detect TB from chest x-rays to improve the quality of overseas health screenings for immigrants and refugees seeking entry to the U.S. | Development and Acquisition | 1/1/21 | 1/1/21 | hhs.caio@hhs.gov | Yes | |||||||||||||||
500 | Zero-shot learning to identify menstrual irregularities reported after COVID-19 vaccination | CDC | CSELS | Zero-shot learning was used to identify and classify reports of menstrual irregularities after receiving COVID-19 vaccination | Operation and Maintenance | 1/1/22 | 1/1/22 | hhs.caio@hhs.gov | Yes | |||||||||||||||
501 | NCIRD SmartFind ChatBots - Public and Internal | CDC | NCIRD | Develop conversational ChatBots (Public Flu, Public COVID-19 Vaccination, Internal Knowledge-Bot) that analyze free-text questions entered by the public, healthcare providers, partners, and internal staff, and provide the agency-cleared answers that best match each question. Developed in collaboration with Microsoft staff during the COVID-19 pandemic using their Cognitive Services, Search, QnA Maker, Azure Healthcare Bot, Power Automate, SharePoint, and webapps. | Operation and Maintenance | 3/1/20 | 5/1/20 | 12/10/20 | hhs.caio@hhs.gov | In-house | Yes |||||||||||||
502 | Semi-Automated Nonresponse Detection for Surveys (SANDS) | CDC | NCHS | NCHS has developed and released an item nonresponse detection model to identify cases of item nonresponse (e.g., gibberish, uncertain/don't know, refusals, or high-risk responses) among open-text responses, helping improve survey data quality and question and questionnaire design. The system is a Natural Language Processing (NLP) model pre-trained using Contrastive Learning and fine-tuned on a custom dataset of survey responses. | Operation and Maintenance | 5/17/21 | 6/7/21 | 2/13/23 | hhs.caio@hhs.gov | In-house | Yes |||||||||||||
503 | Sequential Coverage Algorithm (SCA) and partial Expectation-Maximization (EM) estimation in Record Linkage | CDC | NCHS | CDC's National Center for Health Statistics (NCHS) Data Linkage Program has implemented both supervised and unsupervised machine learning (ML) techniques in their linkage algorithms. The Sequential Coverage Algorithm (SCA), a supervised ML algorithm, is used to develop joining methods (or blocking groups) when working with very large datasets. The unsupervised partial Expectation-Maximization (EM) estimation is used to estimate the proportion of pairs that are matches within each block. Both methods improve linkage accuracy and efficiency. | Operation and Maintenance | 2/15/17 | 9/15/17 | 2/28/18 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||
504 | Coding cause of death information on death certificates to ICD-10 | CDC | NCHS | MedCoder assigns ICD-10 cause-of-death codes to the literal text cause-of-death descriptions provided by the certifier on the death certificate, including codes for both the underlying and contributing causes of death. | Operation and Maintenance | 10/15/17 | 2/1/18 | 6/5/22 | hhs.caio@hhs.gov | Contracted | Yes | |||||||||||||
505 | Detecting Stimulant and Opioid Misuse and Illicit Use | CDC | NCHS | Analyze clinical notes to detect illicit use and misuse of stimulants and opioids. | Initiation | 3/15/23 | 3/15/23 | hhs.caio@hhs.gov | In-house | Yes | ||||||||||||||
506 | AI/ML Model Release Standards | CDC | NCHS | NCHS is creating a set of model release standards for AI/ML projects that should be adhered to throughout the Center, and could serve as a starting point for broader standards across the AI/ML development lifecycle to be created at NCHS and throughout CDC. | Development and Acquisition | 10/17/22 | 10/17/22 | hhs.caio@hhs.gov | In-house | Yes | ||||||||||||||
507 | PII detection using Private AI | CDC | NCHS | NCHS has been evaluating Private AI's NLP solution designed to identify, redact, and replace PII in text data. This suite of models is intended to be used to safely identify and remove PII from free text data sets across platforms within the CDC network. | Development and Acquisition | 5/9/22 | 4/12/23 | 5/2/23 | hhs.caio@hhs.gov | Commercial-off-the-shelf | Yes | |||||||||||||
508 | Transcribing Cognitive Interviews with Whisper | CDC | NCHS | Current transcription processes for cognitive interviews are limited: manual transcription is time-consuming, and the current automated solution is low quality. Recently released open-source AI models appear to perform substantially better than previous technologies at automated transcription of video/audio. Of note is OpenAI's Whisper model (publication, code, model card), which has been made available under a fully permissive license. Although Whisper is currently considered state-of-the-art among AI models on standard benchmarks, it has not been tested with cognitive interviews. We hypothesize Whisper will produce production-quality transcriptions for NCHS. We plan to compare it against both VideoBank and a manual transcription. If the results are encouraging, we plan to transcribe all videos from the CCQDER archive. | Development and Acquisition | 11/28/22 | 11/28/22 | hhs.caio@hhs.gov | In-house | Yes | ||||||||||||||
509 | Nowcasting Suicide Trends | CDC | NCIPC/DIP | An internal-facing, interactive dashboard incorporating multiple traditional and non-traditional datasets and a multi-stage machine learning pipeline to 'nowcast' suicide death trends nationally on a week-to-week basis. | Operation and Maintenance | 1/1/21 | 1/1/21 | 10/1/21 | hhs.caio@hhs.gov | In-house | Yes | |||||||||||||
510 | Automating extraction of sidewalk networks from street-level images | CDC | NCCDPHP/DNPAO | A team of scientists participating in CDC's Data Science Upskilling Program are building a computer vision model to extract information on the presence of sidewalks from street-level images from Mapillary. | Development and Acquisition | 1/1/21 | 1/1/21 | hhs.caio@hhs.gov | Yes | |||||||||||||||
511 | Use of Natural Language Processing for Topic Modeling to Automate Review of Public Comments to Notice of Proposed Rulemaking | CDC | NCEZID | Development of a Natural Language Processing Topic Modeling tool to improve efficiency for the process of clustering public comments to a 'notice of proposed rulemaking' | Development and Acquisition | 1/1/21 | 1/1/21 | hhs.caio@hhs.gov | Yes | |||||||||||||||
512 | Named Entity Recognition for Opioid Use in Free Text Clinical Notes from Electronic Health Records | CDC | NCHS | A team of scientists participating in CDC's Data Science Upskilling Program are developing an NLP Named Entity Recognition model to detect the assertion or negation of opioid use in electronic medical records from the National Hospital Care Survey | Development and Acquisition | 1/1/21 | 1/1/21 | hhs.caio@hhs.gov | Yes | |||||||||||||||
513 | Identify walking and bicycling trips in location-based data, including global-positioning system data from smartphone applications | CDC | NCCDPHP/DNPAO | The Division of Nutrition, Physical Activity, and Obesity at the National Center for Chronic Disease Prevention and Health Promotion is developing machine learning techniques to identify walking and bicycling trips in GPS-based data sources. Inputs would include commercially-available location-based data similar to those used to track community mobility during the COVID-19 pandemic. Outputs could include geocoded data tables, GIS layers, and maps. | Initiation | 10/1/22 | 10/1/22 | hhs.caio@hhs.gov | Contracted | Yes | ||||||||||||||
514 | Identify infrastructure supports for physical activity (e.g. sidewalks) in satellite and roadway images | CDC | NCCDPHP/DNPAO | The Division of Nutrition, Physical Activity, and Obesity at the National Center for Chronic Disease Prevention and Health Promotion is interested in developing and promoting machine learning techniques to identify sidewalks, bicycle lanes, and other infrastructure in images, both satellite and roadway images. The inputs would include image-based data. The outputs could be geocoded data tables, maps, GIS layers, or summary reports. | Initiation | 10/1/22 | 10/1/22 | hhs.caio@hhs.gov | Contracted | Yes | ||||||||||||||
515 | Identifying state and local policy provisions that promote or inhibit creating healthy built environments | CDC | NCCDPHP/DNPAO | The Division of Nutrition, Physical Activity, and Obesity at the National Center for Chronic Disease Prevention and Health Promotion is interested in developing and promoting natural language processing and machine learning techniques to improve the efficiency of policy surveillance. Inputs are the text of state and local policies, including law (e.g., statute, legislation, regulation, court opinion), procedure, administrative action, etc.; outputs are datasets that capture relevant aspects of each policy as quantifiable information. To date (Apr 2023), DNPAO has not performed this work in-house, but is working with a contractor on various experiments comparing machine learning with traditional methods and identifying CDC, academic, and other groups doing related work. | Initiation | 10/1/21 | 7/1/22 | hhs.caio@hhs.gov | Yes | |||||||||||||||
516 | Validation Study of Deep Learning Algorithms to Explore the Potential Use of Artificial Intelligence for Public Health Surveillance of Eye Diseases | CDC | NCCDPHP/DDT | Applying deep learning algorithms for detecting diabetic retinopathy to the NHANES retinal photos. The purpose of this project is to determine whether these algorithms could be used in the future to replace ophthalmologist grading and grade retinal photos collected for surveillance purposes through the National Health and Nutrition Examination Survey (NHANES). | Development and Acquisition | 1/15/20 | 6/1/20 | 6/1/20 | hhs.caio@hhs.gov | Yes | ||||||||||||||
517 | Skills matching on Open Opportunities | Office of Personnel Management | HRS/USAJOBS | The website uses Skills Engine, a third-party vendor product, to provide personalized recommendations to users based on user-input text and opportunity descriptions. | Operation and Maintenance | 1/1/22 | media@opm.gov | Commercial-off-the-shelf | Yes | Natural Language Processing | No | Yes | Yes | |||||||||||
518 | Similar Job Recommendations | Office of Personnel Management | HRS/USAJOBS | USAJOBS is planning to use natural language processing to provide better matches between posted job opportunities in order to help users identify opportunities of interest. | Development and Acquisition | 7/1/22 | media@opm.gov | In-house | Yes | Natural Language Processing | Agency Generated | Yes | Yes | Yes | ||||||||||
519 | Human Resource Apprentice (HRA) | Office of Personnel Management | HRS/FSC/ASMG & OCIO/FITBS | Evaluate the technical feasibility, validity, and affordability of providing AI-supported applicant review help to HR Specialists in USA Staffing. OPM will also evaluate prototype against fairness and bias standards to ensure it does not introduce adverse impact to the hiring process. The key metric that OPM is seeking is “can the AI solution deliver faster, more accurate evaluations of applicant qualifications when compared to experienced HR Specialists?” | Development and Acquisition | 7/1/22 | media@opm.gov | In-house | Yes | No | No | No | ||||||||||||
520 | Retirement Services (RS) Chat Bot | Office of Personnel Management | RS/RO | A chatbot is a computer program that uses artificial intelligence (AI) and natural language processing to understand customer questions and automate responses to them, simulating human conversation. Retirement Services uses the chatbot to answer user questions related to Survivor Benefits. The bot started with a set of 13 questions and continues to grow based on reviews of user interactions. | Operation and Maintenance | 11/1/22 | 11/15/22 | 3/8/23 | media@opm.gov | In-house | Yes | Natural Language Processing | Agency Generated | No | No | Yes | ||||||||
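The SANDS entry (use 502) describes detecting gibberish, don't-know, and refusal answers in open-text survey responses. The released NCHS model is a fine-tuned transformer; as a much cruder stand-in, the screening task itself can be illustrated with a rule-based sketch (the vocabulary, patterns, and function names below are invented for illustration, not part of SANDS):

```python
import re

# Tiny illustrative vocabulary; a real screen would use a full dictionary.
VOCAB = {"i", "have", "good", "bad", "health", "care", "insurance",
         "doctor", "yes", "no", "not", "sure", "answer"}

def flag_nonresponse(text, min_known=0.5):
    """Crude item-nonresponse screen for open-text survey answers:
    flag explicit don't-know/refusal phrasings, then flag gibberish
    when too few tokens are recognizable words."""
    t = text.lower()
    if re.search(r"\b(don'?t know|no idea|refused? to answer|n/?a)\b", t):
        return "nonresponse"
    tokens = re.findall(r"[a-z]+", t)
    if not tokens:
        return "nonresponse"
    known = sum(tok in VOCAB for tok in tokens)
    return "nonresponse" if known / len(tokens) < min_known else "valid"
```

A learned model replaces the hand-written vocabulary and patterns with representations fine-tuned on labeled responses, which is what makes SANDS generalize where rules do not.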
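The NCHS Data Linkage Program entry (use 503) describes using partial EM estimation to estimate the proportion of true matches among candidate pairs within each blocking group. The NCHS implementation is not public; the following is a minimal Fellegi-Sunter-style sketch of that EM step, assuming binary field-agreement indicators and conditional independence of fields given match status (all names are illustrative):

```python
import math

def em_match_proportion(pairs, n_fields, iters=100):
    """Estimate the proportion of true matches among candidate record
    pairs from their binary field-agreement patterns, via EM on a
    two-class (match / non-match) Bernoulli mixture.

    pairs: list of 0/1 tuples, one agreement indicator per field.
    Returns (match_proportion, m_probs, u_probs)."""
    clamp = lambda x: min(max(x, 1e-6), 1 - 1e-6)
    p = 0.1                    # initial guess for the match proportion
    m = [0.9] * n_fields       # P(field agrees | pair is a match)
    u = [0.1] * n_fields       # P(field agrees | pair is a non-match)
    for _ in range(iters):
        # E-step: posterior probability that each pair is a match
        g = []
        for gamma in pairs:
            pm = p * math.prod(m[j] if gamma[j] else 1 - m[j] for j in range(n_fields))
            pu = (1 - p) * math.prod(u[j] if gamma[j] else 1 - u[j] for j in range(n_fields))
            g.append(pm / (pm + pu))
        # M-step: re-estimate mixture weight and agreement probabilities
        total = sum(g)
        p = clamp(total / len(pairs))
        for j in range(n_fields):
            agree_m = sum(gi for gi, gamma in zip(g, pairs) if gamma[j])
            m[j] = clamp(agree_m / total)
            u[j] = clamp((sum(gamma[j] for gamma in pairs) - agree_m) / (len(pairs) - total))
    return p, m, u
```

In the actual workflow the supervised Sequential Coverage Algorithm first builds the blocking groups, so this estimation runs only over the comparatively small set of candidate pairs inside each block rather than the full cross product.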
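MedCoder (use 504) is a contracted system whose internals are not described in the inventory. As a hedged illustration of the general task — mapping certifier free text to ICD-10 cause-of-death codes — here is a toy longest-phrase dictionary lookup (the ICD-10 codes shown are real categories, but the dictionary and function names are invented for this sketch and are a tiny stand-in for MedCoder's reference tables):

```python
import re

# Toy literal-text -> ICD-10 lookup; real coding dictionaries are far larger.
CAUSE_CODES = {
    "acute myocardial infarction": "I21.9",
    "myocardial infarction": "I21.9",
    "lung cancer": "C34.9",
    "pneumonia": "J18.9",
}

def normalize(text):
    """Lowercase, strip punctuation, collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9 ]", " ", text.lower()).split())

def code_cause_text(literal):
    """Assign an ICD-10 code from the longest dictionary phrase found in
    the certifier's free text; None routes the record to manual coding."""
    text = normalize(literal)
    for phrase, code in sorted(CAUSE_CODES.items(), key=lambda kv: -len(kv[0])):
        if phrase in text:
            return code
    return None
```

Unmatched literals falling back to manual coding mirrors how automated mortality coding systems typically handle text outside their dictionaries.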
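The Retirement Services chatbot entry (use 520) describes matching user questions to a growing set of agency-cleared answers. The bot's actual implementation is not described; the underlying retrieval idea can be sketched as bag-of-words cosine similarity over a hypothetical FAQ table (all strings and names below are illustrative):

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase bag-of-words token counts, punctuation stripped."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def answer(question, faq, threshold=0.2):
    """Return the cleared answer whose stored question is most similar
    to the user's question, or a human-handoff fallback below threshold."""
    q = tokenize(question)
    best_score, best_answer = 0.0, None
    for stored_q, stored_a in faq.items():
        score = cosine(q, tokenize(stored_q))
        if score > best_score:
            best_score, best_answer = score, stored_a
    if best_score >= threshold:
        return best_answer
    return "I can connect you with a retirement specialist."

# Illustrative FAQ entries, not actual OPM content.
faq = {
    "how do i apply for survivor benefits": "Answer about applying for survivor benefits.",
    "when will my first annuity payment arrive": "Answer about interim annuity payments.",
}
```

Because every reply is a stored, agency-cleared string, this retrieval pattern never generates new text, which is one reason it suits a benefits context; growing the bot means adding question/answer rows, as the entry describes.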