Bridging AI international policy and practice
AIDV WG
Francis Crawley, Natalie Meyers, Rodrigo Roa, Seonyoung Kim, Patricia Buendia, Madhava Jay
Oct 15, 25th RDA Plenary Meeting, Brisbane, 2025
https://www.rd-alliance.org/groups/artificial-intelligence-and-data-visitation-aidv-wg/activity/
Acknowledgement of Country
We acknowledge and celebrate the First Australians on whose traditional lands we meet, and we pay our respect to their elders past and present.
Welcome to new RDA members!
OPENNESS
COMMUNITY-DRIVEN
CONSENSUS
NON-PROFIT AND TECHNOLOGY-NEUTRAL
HARMONISATION
INCLUSIVITY
6 Guiding Principles are at the heart of the RDA community.
JOIN THE RDA: www.rd-alliance.org/register/
All RDA members are expected to adhere to the RDA Code of Conduct to foster a welcoming and inclusive environment.
EOSC-Future/RDA AIDV Working Group and its core outputs
Co-Chairs: Natalie Meyers & Francis Crawley
Shaping Responsible AI
Four Recommendations from the AIDV Working Group:
Outputs
Supporting/Other Outputs
11:35
The Shifting Paradigm: From Data Transfer to Data Visitation
11:40
DV4RDA TIGER Project: Putting RDA Principles into Practice
Natalie Meyers
11:45
Bridging Data Silos and Privacy
DV4RDA: Embodiment of RDA Core Principles
Goals of the DV4RDA project
Participants' Recruitment
DV4RDA Participants
Actionable RDA Outputs
4 Short Talks
Presenters: Rodrigo Roa, Seonyoung Kim, Patricia Buendia, Madhava Jay
Policy into Practice & the Data Observatory Experience
Rodrigo Roa, Executive Director
Data Observatory
Santiago, Chile
11:50
Who are we?
Data Observatory (DO) is a non-profit public–private–academic institution created by the Government of Chile, Amazon Web Services (AWS), and Adolfo Ibáñez University.
Its mission is to acquire, process, and make available large volumes of data with scientific, technological, and social impact.
Research & Work Areas
- Astronomy
- Earth Observation
- Natural Resources
- Society
Open data platforms and infrastructures with a FAIR approach (Findable, Accessible, Interoperable, Reusable), developed in collaboration and partnership with public and private institutions.
FAIR Strategy and SURDATA Alliance
The Data Observatory (DO) leads Chile’s FAIR Data Policy Implementation Strategy in coordination with the Research Data Alliance (RDA) and CODATA, and serves as Chile’s DataCite consortium for DOI provision.
This work evolved into SURDATA, a regional alliance that promotes collaboration across science, government, industry, and civil society to strengthen data interoperability, research, and innovation, supporting the sustainable development of Chile and Latin America.
ROADMAP
- FAIR Strategy Launch: March 2025
- Framework Agreement and Stakeholder Mapping: Apr - May 2025
- Stakeholder invitation: May 2025
- 1st General Assembly: Oct 2025
- Launch of thematic working groups: Nov 2025 - March 2026
- 2nd General Assembly: April 2026
LatamGPT
ROADMAP
- DO participation begins: December 2024
- Public Launch by Science Minister: February 2025
- Cloud and Data Infrastructure: Mar - Jun 2025
- Model Training: Jun - Dec 2025
- New versions - LatamGPT: 2026
Data Sources
2.6 M+ documents
21 countries
DO-AWS
Infrastructure, engineering
Experts from GenAI and Data Processing
Participants
30 Institutions from Latin America and over 60 experts involved
Key Aspects
Data Storage and Integration
- Classification of trained data
Data Collection and Cleaning
- Sources in Spanish, English & Portuguese
Analysis, Processing and Modeling
- Cloud-based training of LatamGPT
Training, Capacity Building and Support
- Workshops and expert consulting
Release v1.0 LatamGPT: Dec 2025
The Pulse of AI in Latin America (ILIA 2025)
The region is at a turning point.
According to ILIA 2025, Latin America and the Caribbean are moving forward with strong interest, but also with deep asymmetries. Brazil and Chile lead the way, while countries such as Costa Rica, Ecuador, and the Dominican Republic are rapidly emerging.
The challenge: to move from enthusiasm to real investment, and from plans to implementation.
Generative AI and open-source development are consolidating as the region’s most powerful drivers of democratization.
Source: ILIA 2025
From Data to Action: How the DO Strengthens Data & AI in Latin America
Conclusion of the Latin American AI Index (ILIA): we have a lot of data, but limited availability.
Without openness and standardization, data cannot generate real value.
This is where institutions like the Data Observatory become strategic: they curate, process, and make available reliable, interoperable, and FAIR data to support research, innovation, and public policy.
The Data Observatory helps close this gap through open platforms that bring open science and digital sovereignty to life. In parallel, through SURDATA, we foster regional collaboration and data interoperability to strengthen Latin America’s research and innovation ecosystem.
Acquisition
Cleansing
Storage
Processing
Analysis
Visualization
Interpretation
contacto@dataobservatory.net
Thank you - Gracias!
Questions for Rodrigo Roa about DO, SURDATA, ILIA, or LatamGPT?
https://www.rd-alliance.org/groups/artificial-intelligence-and-data-visitation-aidv-wg/outputs/
12:00
Use Case: Policy and Compliance for Data Visitation
Seonyoung Kim, PhD
Bernard Becker Medical Library
Washington University in St. Louis
12:10
What is Data Visitation and Why Adopt it?
Data Visitation in the IRB Protocol
Key Placement Areas:
Goal: Frame DV as a risk-mitigating strategy
Guidance for Informed Consent in AI & Data Visitation
“A reconsideration of the classic form of informed consent is necessary in light of AI. We need to support autonomy through practical, flexible consent mechanisms.”
Dr. Kristy Hackett, Institute on Ethics & Policy for Innovation, McMaster University
Evolving Consent Models in AI and Data Visitation
Data Visitation in the Informed Consent Form (ICF)
Plain language explanation to participants:
Place it in “Confidentiality” section
Per-Subject Informed Consent Verification During QC
Implementation and Call to Action
Questions for Seonyoung Kim about Informed Consent in DV?
12:20
Use Case: FAIRlyz Implementation of AIDV Policies
Patricia Buendia
12:25
FAIRlyz: An Infrastructure for Secure Data Visitation
*FAIR data is Findable, Accessible, Interoperable, Reusable
*FAIRLYZ adds anaLYZable as a 5th principle to the FAIR principles
*Data visitation refers to moving the analysis to the data
Manage Data
simply and securely
+
Validate Data
with semi-automated QC through data visitation
+
Share Data
based on FAIR principles
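As a rough illustration of "semi-automated QC through data visitation", the sketch below runs a completeness check where the data lives and lets only a summary report leave. It is not the FAIRlyz API: the function name `qc_report`, the field names, and the sample records are all hypothetical and chosen only for illustration.

```python
def qc_report(records, required_fields):
    """Summarise completeness per field (hypothetical helper, not FAIRlyz code).

    The report contains counts only, so no raw values leave the data owner.
    """
    missing = {
        field: sum(1 for r in records if r.get(field) in (None, ""))
        for field in required_fields
    }
    return {
        "n_records": len(records),
        "missing_per_field": missing,
        "passed": all(count == 0 for count in missing.values()),
    }


# Made-up study records held by the data contributor.
records = [
    {"sample_id": "S1", "tissue": "liver", "age": 54},
    {"sample_id": "S2", "tissue": "", "age": 61},
    {"sample_id": "S3", "tissue": "lung"},
]

report = qc_report(records, ["sample_id", "tissue", "age"])
print(report)
```

A data consumer would see only the report, which is enough to judge data quality before committing to a full analysis.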
FAIRLYZ: Rethinking the Data Workflow
Data Consumer
via Data Visitation
QC Reports
Study Data Registry
Ontology and Omics models
Data Contributor
Value Proposition: QC Before Commitment
Future Focus
AI-Powered QC
New UI will integrate an AI agent chatbot to guide researchers through complex QC tasks intuitively.
Enhanced Security
Transitioning to a locally trained AI model to eliminate reliance on external APIs, ensuring data remains isolated and secure.
Ecosystem Growth
Release of open-source plugins and integration with federated learning networks to encourage community contribution and maximum extensibility.
Call to Action
Seeking Institutional Data Owners Worldwide:
Seeking Developers:
Seeking Researchers:
Questions for Patricia Buendia about the FAIRlyz implementation of AIDV policies?
12:35
SyftBox: a General Purpose Solution for Data Visitation and Equitable Data Sharing
Lightning Talk
Madhava Jay
12:40
Madhava Jay
🧬 Rare Disease Patient
Software Engineer @ OpenMined
🚀 Help solve data access with open source
🌏 Brisbane, Australia
📧 madhava@openmined.org
Mission
Building the public network for non-public information
This lightning talk
1. Problems with data sharing
2. A general purpose solution
3. A use-case for equitable genomics
The Motivating Problem
Data’s true power comes from collaboration.
But many data owners are forced to choose between giving up data ownership through copying and centralization, or simply not participating.
Due to legal and ethical constraints, copying data across borders is often unacceptable, so no collaboration happens at all.
We need a new way to collaborate fairly and securely.
Data Visitation
Remotely study data on a computer at another organisation
Data Scientist
- Can answer a “specific” question
- …and only that question
Datasite
- Retains governance over the information they steward
- …and never shares a copy of the data
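The pattern described above can be sketched in a few lines of Python. This is a minimal toy model, assuming a hypothetical `Datasite` class with `approve` and `visit` methods; it is not the SyftBox API. The data owner approves one named question, the data scientist receives only that question's answer, and the records stay where they are.

```python
class Datasite:
    """Hypothetical data steward that keeps its records local."""

    def __init__(self, records):
        self._records = list(records)   # never returned to callers
        self._approved = {}             # question name -> function

    def approve(self, name, question):
        """The data owner reviews a question and allows it to run."""
        self._approved[name] = question

    def visit(self, name):
        """Run an approved question locally and return only its result."""
        if name not in self._approved:
            raise PermissionError(f"question {name!r} has not been approved")
        return self._approved[name](self._records)


def mean_age(records):
    """The one specific question the data scientist wants answered."""
    return sum(r["age"] for r in records) / len(records)


# Made-up records for illustration.
site = Datasite([{"age": 30}, {"age": 40}, {"age": 50}])
site.approve("mean_age", mean_age)

result = site.visit("mean_age")
print(result)
```

Any question that has not been approved, such as a request to dump all records, raises `PermissionError`, which is how the sketch models "only that question".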
Federated Learning
[Diagram: one Data Scientist connected to several Datasites, each running an FL Project]
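To make the federated learning idea concrete, here is a minimal sketch of federated averaging in plain Python, assuming a one-parameter model y = w * x. Each datasite trains on its own private pairs and sends back only its weight and sample count. The function names, data, and hyperparameters are invented for illustration and do not reflect how any particular FL framework is implemented.

```python
def local_update(w, data, lr=0.1, epochs=20):
    """Gradient descent on one datasite's private (x, y) pairs for y = w * x."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w, len(data)


def federated_average(updates):
    """Combine local weights, weighted by how many samples each site holds."""
    total = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total


# Three datasites, each holding private data that roughly follows y = 2x.
datasites = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(1.0, 2.1), (3.0, 5.9)],
    [(2.0, 4.2)],
]

global_w = 0.0
for _ in range(10):
    # Only (weight, sample count) pairs travel back to the coordinator.
    updates = [local_update(global_w, data) for data in datasites]
    global_w = federated_average(updates)

print(round(global_w, 3))
```

The coordinator never sees the (x, y) pairs themselves, only the per-site weights, which is the property the diagram is meant to convey.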
SyftBox.net
An open-source, privacy-first, decentralized network for secure data collaboration
The SyftBox Platform
https://github.com/OpenMined/syftbox
Try it out!
BioVault.net
A free, open-source, permissionless network for collaborative genomics
Built on SyftBox
Problems with equitable genomics and data sharing
Our solution - data visitation for genomics
We allow data owners to make their data available for remote analysis without uploading or exposing the raw data
Because we’re built on SyftBox and Nextflow, researchers can easily run arbitrary analysis and complex data pipelines
Pilot Programmes
Dr Carika Weldon (Bermuda), Founder of CariGenetics
Dr Rana Dajani (Jordan), Professor at Hashemite University
SyftBox is looking for partners and pilots
Any Questions for Madhava about SyftBox?
Or for the Panel?
12:50