Topics: AI in Research, Artificial Intelligence, Bioinformatics, Gen AI, Public Health

Unlocking Public Health Intelligence: How Federated Networks and AI are Bridging the Bedside to the Biosphere

Annual i2b2 symposium highlights how informatics tools are answering our most pressing research questions.

In the spring of 2020, as the COVID-19 pandemic ground the world to a sudden halt, the fragmentation of global healthcare infrastructure became dangerously apparent. Public health agencies scrambled to track infections using lagging indicators like hospital admission rates, while massive silos of potentially life-saving clinical data remained trapped inside competing hospital networks.

Six years later, the landscape of clinical informatics looks fundamentally different. At the 2026 i2b2 symposium, held June 9-10 at Simmons University in Boston, hundreds of medical researchers, software engineers, data scientists, and public health innovators gathered at the conference center, joined virtually by participants from across the U.S. and from 19 countries around the world, to discuss a quiet revolution in medicine: the dawn of an AI-equipped, real-time public health intelligence system operating at a national and global scale.

The annual conference was hosted by the i2b2 Foundation and co-sponsored by Harvard Catalyst, where Griffin Weber, MD, PhD, serves as the academic lead of informatics, with Shawn Murphy, MD, PhD, as lead of informatics training programs. Dell Technologies, AI With Care, Chartis, and the Massachusetts Consortium on Pathogen Readiness (MassCPR) served as event sponsors. Together, federated data networks (platforms that allow institutions to query vast datasets without compromising patient privacy) and advanced artificial intelligence are helping to solve what once seemed like insurmountable hurdles in modern medicine.

Moving past a “pipe dream” to a national standard

Opening the symposium, Diane Keogh, executive director of the i2b2 Foundation and senior director of the Informatics program at Harvard Catalyst, and Marc Ciriello, associate director of Informatics (who served as host throughout the conference), introduced George Daley, MD, dean of Harvard Medical School, who reflected on the early, uncertain days of biomedical informatics.

“I actually remember talking with Zak [Isaac] Kohane and David Goland (former dean of graduate education at Harvard Medical School) a few years ago about what at that time felt a bit like a pipe dream—the idea that we might be able to connect siloed patient data in a way that was really generally beneficial,” Daley told the audience. “It took years of building trust, years of governance crafting to make it happen. But we really finally have an AI-equipped, real-time public health intelligence that enables cost-effective, accurate surveillance on a national scale.”

Following the opening remarks, the morning sessions transitioned into a series of panel discussions analyzing the governance models and collaborative missions required to sustain modern data sharing. Leading informatics experts mapped out the real-world deployment of massive data enclaves, highlighting successful cross-institutional networks such as MassCPR, the Artificial Intelligence/Machine Learning Consortium to Advance Health Equity and Researcher Diversity (AIM-AHEAD), and the ENACT network.

A recurring theme throughout these panels was that data sharing has evolved past being a purely technical problem; rather, it requires a unified governance mission and deeply coordinated institutional trust to successfully orchestrate collective medical intelligence across distinct health systems.

This evolution from a local software tool to a global network infrastructure is a triumph that i2b2 founder Zak Kohane, MD, PhD, who is chair of the department of biomedical informatics at Harvard Medical School, views as both a milestone and a mandate.

“When we first built i2b2, the goal was to prove that medicine didn’t have to remain locked in institutional silos,” Kohane remarked. “We wanted to create a common language where data could safely move from the bedside to the researcher’s bench without compromising privacy. Seeing these panels discuss global data enclaves and AI integration validates that original architecture of trust, but it also reminds us that our primary responsibility is to keep these scaling systems neutral, secure, and fiercely protective of patient privacy.”

The technological backbone of this intelligence system relies on two critical tools: i2b2 (Informatics for Integrating Biology and the Bedside) and its companion platform, SHRINE (Shared Health Research Information Network). Originally developed more than 20 years ago at Partners Healthcare with National Institutes of Health (NIH) funding, i2b2 pulls diverse clinical datasets together for discovery research within single institutions.

Today, i2b2 is used by more than 200 institutions globally. To scale this tool across multiple health systems, SHRINE—an open-source platform devised by Griffin Weber, MD, PhD, associate professor of biomedical informatics at Harvard Medical School, alongside Shawn Murphy, MD, PhD, and colleagues—connects these individual i2b2 instances. SHRINE also serves the ENACT network, which allows clinical and translational science centers in the CTSA consortium to implement and utilize informatics tools for electronic health record (EHR) research, and the MassCPR network, which supports access to high-quality clinical data by enabling secure, local analysis of EHRs.

Rather than centralizing sensitive patient records into a single repository, SHRINE allows researchers to deploy secure cross-site queries and receive anonymized, aggregate results. Because local governance and institutional autonomy are preserved, this approach removes the data privacy concerns that historically blocked large-scale institutional collaboration.

Federated data networks serve as decentralized data-sharing systems that allow institutions (hospitals, medical centers, etc.) to securely collaborate without exposing patient information. The records are not kept in one massive database; rather, queries and algorithms are sent to local sites, while the records themselves remain behind each hospital’s firewall to ensure security.
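
The federated pattern described above can be illustrated with a short sketch. This is not SHRINE's actual interface: the `Site` class, the `federated_count` function, and the `MIN_CELL_SIZE` threshold are hypothetical names chosen for illustration, and withholding small counts is an assumed safeguard rather than something described at the symposium. The point is only that the query travels to each site and nothing but an aggregate count travels back.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Assumption for illustration: counts below this threshold are withheld
# so that very small patient groups cannot be singled out.
MIN_CELL_SIZE = 10

Record = dict
Predicate = Callable[[Record], bool]


@dataclass
class Site:
    """A hypothetical participating institution; its records never leave it."""
    name: str
    records: list

    def run_count_query(self, predicate: Predicate) -> Optional[int]:
        """Run the query locally and return only an aggregate count."""
        count = sum(1 for record in self.records if predicate(record))
        # Withhold the count (None) when the group is too small to report.
        return count if count >= MIN_CELL_SIZE else None


def federated_count(sites: list, predicate: Predicate) -> dict:
    """Send the same query to every site and combine the aggregate answers."""
    per_site = {site.name: site.run_count_query(predicate) for site in sites}
    # Only counts that sites chose to report contribute to the total.
    total = sum(c for c in per_site.values() if c is not None)
    return {"per_site": per_site, "total_reported": total}
```

A caller would build a list of `Site` objects and pass a predicate such as `lambda r: r["dx"] == "E11"`; the coordinating code sees per-site counts, or `None` where a count was withheld, and never sees individual records.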

“Federated data networks are crucial because they make it possible to do large-scale, multi-institutional research without moving or centralizing sensitive patient data,” Daley emphasized, noting that MassCPR’s data network now amasses ten years of data from over seven million patients across Massachusetts.

The integration of these platforms with predictive analytics represents a massive paradigm shift in how global health studies operate. As Weber noted during his presentation on network capabilities:

“Federated learning changes the rules of engagement for multi-center research,” Weber explained. “By utilizing computed phenotypes across secure enclaves, we can orchestrate massive population health studies across different continents while maintaining absolute data privacy. We are finally building a collective medical intelligence.”

When these platforms successfully integrate structured EHR data with clinical notes, imaging, genomics, and public health signals, they become powerful engines capable of pinpointing emerging health threats in real time.

Precision public health: From the bedside to the cloud

The clinical imperative for unified data was driven home during the keynote address delivered by Monica Bharel, MD, MPH, clinical lead for global public health at Google and former commissioner of the Massachusetts Department of Public Health. Drawing on more than 20 years of experience in internal medicine—including her tenure as chief medical officer of Boston Healthcare for the Homeless—Bharel bridged the gap between raw data and the human experience.

Early in her career, Bharel began asking her patients a fundamental question: What will it take to make you healthy? The overwhelming response pointed directly outside the traditional clinical exam room. Patients spoke of stable housing, community re-engagement, purposeful work, and a safe place to store medications.

“Our data structure is not set up for me to be able to understand how to help them get to those outcomes,” Bharel noted, pointing to the structural inefficiencies of siloed care. She recalled a patient with uncontrolled diabetes and hypertension who, when asked if he was coordinating with a case manager to find housing, responded that he had six different case managers from separate organizations, none of whom had ever spoken to each other about his housing status.

To turn data into precise public health action, Bharel championed the creation of the Commonwealth’s first integrated public health data warehouse during her time as commissioner. By securely bringing together clinical medical claims, public safety records, and housing data, her team was able to identify striking systemic gaps.

For example, their data revealed that individuals interacting with the criminal justice system were 120 times more likely to experience a fatal opioid overdose upon release, with the risk spiking drastically in the immediate weeks following re-entry. This data-driven insight allowed the state to coordinate with county sheriffs to embed medication-assisted treatment and warm handoffs into clinical health systems before individuals left correctional facilities.

Now leading public health initiatives at Google, Bharel highlighted how modern generative AI and cloud computing are amplifying these analytic capabilities.

“If what we did with the public health data warehouse was like using a compass, this current age of AI is like the invention of GPS,” she said. “It’s a whole new world where things can be done more securely, more safely, and more rapidly.”

Bharel outlined an array of AI-powered public health tools currently in development or active deployment, demonstrating how predictive modeling can transform global health intervention. These initiatives include analyzing subtle variations in human cough sounds to screen for tuberculosis in low-resource settings; using traffic prediction models to map accurate travel times to emergency obstetric facilities and so optimize infrastructure investments; and merging weather patterns, air quality signals, and local mapping data to create unique “location fingerprints” that forecast regional disease prevalence and identify under-vaccinated neighborhoods.

However, she issued a warning regarding the widening “AI fluency gap” between well-funded state systems and small, local boards of health. She urged public health practitioners to actively adopt accessible, closed-system AI tools to build comfort, manage workloads, and translate dense scientific reports into actionable community briefs.

“GenAI reached 50 million users in just five weeks,” Bharel said. “But using it to improve health outcomes and address inequities won’t happen naturally. It will take those of us in this room to grasp these tools, learn how to put appropriate guardrails on them, and design systems for the common good.”

Reading the biosphere: Wastewater as an evolutionary forecast

If day one of the symposium illustrated how clinical data can move from individual hospital bedsides into larger public policy frameworks, Mark Johnson, PhD, professor of molecular biology and immunology at the University of Missouri, presented a demonstration showing how public health intelligence can be extracted directly from the environment itself.

Johnson spent 25 years as a specialized virologist researching HIV replication. At the start of the 2020 lockdowns, a desperate email from his state health department asking for anyone with PCR expertise prompted him to pivot to wastewater-based epidemiology.

What began as an unappealing temporary project quickly grew into a massive statewide surveillance infrastructure. By tracking viral loads and utilizing caffeine metabolites as a steady baseline for human biological contributions, Johnson’s lab managed to map the precise, city-by-city movement of viral variants across Missouri weeks before clinical testing caught up.
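
The role of a population baseline can be sketched in a few lines. This is an assumed, simplified form of normalization, not the Johnson lab's actual method: the function name, the units, and the choice to divide directly by the marker concentration are all illustrative. The idea is that dividing the viral signal by a human-derived marker, such as a caffeine metabolite, corrects for how diluted a sample is and how many people contributed to it.

```python
def normalize_viral_signal(viral_copies_per_liter: float,
                           marker_ng_per_liter: float) -> float:
    """Return viral copies per nanogram of human marker (hypothetical units).

    Assumes the marker concentration scales with the amount of human
    waste in the sample, so the ratio is comparable across samples.
    """
    if marker_ng_per_liter <= 0:
        raise ValueError("marker concentration must be positive")
    return viral_copies_per_liter / marker_ng_per_liter


# Two made-up samples: the second is the first diluted by half (for
# example, by rainwater). Raw viral concentrations differ, but the
# normalized signals agree.
undiluted = normalize_viral_signal(1000.0, 50.0)
diluted = normalize_viral_signal(500.0, 25.0)
```

Under this assumption, a rise in the normalized value between samples points to more virus per contributing person rather than to a change in flow or dilution.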

“Wastewater surveillance was a tool we didn’t know we really had,” he stated, explaining how his lab generated a real-time lineage dashboard just as the Alpha and Delta waves hit the United States. When the Delta variant first emerged at a music festival in Branson, Missouri, wastewater tracking showed it overtaking 50% of the local sewer shed within days. By week three, it had swept across the state. While the CDC’s official clinical tracking recorded only five isolated cases of Delta in Missouri at the time, wastewater sequencing had already revealed that the variant was entirely dominant.

The most surprising turn in Johnson’s research came with the discovery of what are now known as “cryptic lineages”: highly mutated, divergent strains of SARS-CoV-2 that are detectable in municipal wastewater but completely absent from clinical nasal swabs. Initially suspecting that the mutations were spreading through hidden urban rat or animal populations, Johnson and his collaborators embarked on a painstaking, manhole-by-manhole tracking initiative across the state of Wisconsin.

Using automated samplers to trace the viral signal up through main sewer lines, pump stations, and municipal branches, they eventually narrowed the source of a hyper-divergent viral strain down to a single corporate building’s plumbing system—and ultimately, to a specific set of toilets used exclusively by employees.

The viral load collected directly downstream from that single pipeline was 100 times higher than the total viral output of entire prisons experiencing active outbreaks. Johnson’s lab had proven that a single individual could shed enough virus to be detected after being diluted millions of fold into a city’s wastewater infrastructure.

“Cryptic lineages are coming from rare individuals where COVID has established a persistent, long-term gastrointestinal infection,” Johnson explained, noting one tracked individual in Nebraska who has continuously shed hyper-mutated virus for over five years.

Crucially, these isolated gastrointestinal infections act as a natural laboratory for the virus, allowing it to adapt to persistent selective pressures outside the broader population. When Johnson’s team engineered a “Frankenstein spike” protein mimicking a highly divergent lineage found in a Newark, New Jersey sewer shed, they discovered it possessed the tightest ACE2 receptor binding affinity ever recorded.

By tracking these mutations in wastewater before they enter the human-to-human transmission cycle, informatics teams can access an evolutionary forecast of where a virus is likely to go next. This insight proves invaluable when evaluating vaccine updates and therapeutic protocols within global networks like the SAVE consortium.

Day 2: Deepening the technical foundation

Following the broad institutional and epidemiological insights of the first day, the second day of the symposium shifted into a highly technical, hands-on working session dedicated to software deployment, platform architecture, and standardization.

Data engineers and clinical informatics specialists spent June 10 examining the underlying computational mechanics of the i2b2 platform. Technical workshops focused heavily on standardizing data mapping across both the i2b2 and the Observational Medical Outcomes Partnership (OMOP) common data models, optimizing cloud architecture for multi-site federated queries, and refining automated ETL (extract, transform, load) pipelines.

Participants also reviewed advanced cryptographic protocols designed to ensure that even as large language models (LLMs) and predictive analytics plug directly into hospital networks, patient-level data remains completely obscured and locally controlled.

Architecture for the common good

The overarching theme of the 2026 symposium was clear: the future of medicine relies on building reliable, neutral avenues for data exchange. As health systems evolve, the challenges ahead are rarely purely technological; rather, they center on governance, regulatory alignment, and fostering cross-institutional trust.

It is an evolution Kohane views as both a triumph and a mandate for the next generation of informatics.

“The technology will always advance, but the core mission of i2b2 remains unchanged: democratizing medical intelligence for the common good,” Kohane stated. “As we plug advanced AI into these networks, our primary responsibility is to ensure these systems remain neutral, secure, and fiercely protective of patient privacy. That is how we turn fragmented data into true public health resilience.”

By combining the privacy-preserving structural safeguards of federated querying systems with the rapid analytical capabilities of AI and environmental monitoring, the medical informatics community has established a new blueprint for global health. The long-sought goal of an open, interoperable data landscape is no longer a pipe dream—it is an active infrastructure protecting public health from the local neighborhood level to the global stage.

All slide presentations and talks from the symposium can be viewed on the i2b2 website.

AI tools were used in organizing content for this article.
