Universal and transferable attacks expose vulnerabilities in pathology foundation models

LOS ANGELES - Californer -- Integrating AI into digital pathology through general-purpose foundation models promises to significantly improve tasks such as cancer detection and subtyping. However, these powerful AI systems also carry severe vulnerabilities that leave them susceptible to adversarial attacks. Researchers at the University of California, Los Angeles (UCLA) have introduced Universal and Transferable Adversarial Perturbations (UTAP) to investigate these threats and to shed light on defense mechanisms against them.

UTAP uses an adaptive optimization process to iteratively craft a subtle, microscopic noise pattern. When this fixed noise pattern is added to a pathology image (for example, a microscopic image of a biopsied tissue section), it systematically disrupts the feature representation capabilities of pathology foundation models: the pattern is optimized to minimize the similarity between the feature representations of the original and perturbed images. This adversarial methodology fundamentally hampers the representational power of AI models.
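
The optimization described above can be illustrated with a minimal sketch. It is not the researchers' implementation: the linear map `W` is a hypothetical stand-in for a foundation model's encoder, and the use of cosine similarity, the bound `eps`, and signed gradient steps are all assumptions chosen for illustration. What it does share with the description is a single fixed pattern `delta`, optimized over several images at once to push perturbed features away from clean ones.

```python
import math
import random


def matvec(W, x):
    """Apply the toy linear encoder W (a list of rows) to a flat image x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def cosine_and_grad(a, b):
    """Return cos(a, b) and its gradient with respect to b."""
    na = math.sqrt(sum(v * v for v in a)) + 1e-12
    nb = math.sqrt(sum(v * v for v in b)) + 1e-12
    cos = sum(p * q for p, q in zip(a, b)) / (na * nb)
    grad = [p / (na * nb) - cos * q / (nb * nb) for p, q in zip(a, b)]
    return cos, grad


def mean_similarity(W, images, delta):
    """Average feature similarity between clean and perturbed images."""
    total = 0.0
    for x in images:
        clean = matvec(W, x)
        perturbed = matvec(W, [xi + di for xi, di in zip(x, delta)])
        total += cosine_and_grad(clean, perturbed)[0]
    return total / len(images)


def craft_universal_perturbation(W, images, eps=0.1, step=0.01, iters=100, seed=0):
    """Optimize one shared pattern `delta` with |delta_j| <= eps (assumed bound)."""
    rng = random.Random(seed)
    dim = len(images[0])
    # Random start: at delta = 0 the similarity is maximal and its gradient is zero.
    delta = [rng.uniform(-eps, eps) for _ in range(dim)]
    for _ in range(iters):
        grad = [0.0] * dim
        for x in images:
            clean = matvec(W, x)
            perturbed = matvec(W, [xi + di for xi, di in zip(x, delta)])
            _, g_feat = cosine_and_grad(clean, perturbed)
            # Chain rule through the linear encoder: accumulate W^T g_feat.
            for row, g in zip(W, g_feat):
                for j, w in enumerate(row):
                    grad[j] += w * g
        # Signed descent step on the similarity, then clip back into the eps-box.
        delta = [max(-eps, min(eps, d - step * math.copysign(1.0, g)))
                 for d, g in zip(delta, grad)]
    return delta
```

Because `delta` is shared across every image in the batch, the same pattern can be reused on images that took no part in the optimization, which is the sense in which such a perturbation is called universal.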

The UCLA researchers demonstrated two key capabilities of UTAP: universality and transferability. The optimized microscopic perturbation can be applied across diverse sets of tissue images, independent of the training dataset, confirming its universality. Furthermore, the perturbation degrades the performance of various external pathology foundation models that it was never exposed to during optimization, demonstrating extensive transferability to unseen AI models. Quantitative evaluations revealed that applying the UTAP microscopic perturbation to various tissue images resulted in significant reductions in accuracy across seven state-of-the-art pathology foundation models.

Standard defense mechanisms, such as applying spatial low-pass filters to neutralize high-frequency adversarial noise, proved insufficient. The UCLA researchers demonstrated that an adaptive adversary could systematically bypass these filtering defenses by incorporating similar filters into the forward pass during perturbation training.
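
A small sketch can show why a low-pass filter alone is a weak defense. The 3x3 box blur here is an assumed stand-in, since the specific filters used in the study are not described above, and the two perturbation patterns are hypothetical. The blur strips a high-frequency checkerboard almost entirely but passes a smooth pattern of the same amplitude nearly unchanged. An attacker who optimizes through the filter would presumably be steered toward patterns of the second kind.

```python
import math


def box_blur(img):
    """3x3 mean filter with edge replication (an assumed low-pass defense)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii = min(max(i + di, 0), h - 1)  # clamp row index
                    jj = min(max(j + dj, 0), w - 1)  # clamp column index
                    total += img[ii][jj]
            out[i][j] = total / 9.0
    return out


def energy(img):
    """Sum of squared pixel values."""
    return sum(v * v for row in img for v in row)


def surviving_fraction(pattern):
    """Fraction of a perturbation's energy that remains after the blur."""
    return energy(box_blur(pattern)) / energy(pattern)


n, eps = 16, 0.1  # hypothetical image size and perturbation amplitude

# High-frequency pattern: the sign flips at every pixel.
checkerboard = [[eps * (1 if (i + j) % 2 == 0 else -1) for j in range(n)]
                for i in range(n)]

# Low-frequency pattern: one sine period across the image width.
smooth = [[eps * math.sin(2 * math.pi * j / n) for j in range(n)]
          for i in range(n)]

print("checkerboard:", surviving_fraction(checkerboard))
print("smooth:", surviving_fraction(smooth))
```

In this toy setting the checkerboard keeps 1/81 of its energy after filtering, while the smooth pattern keeps most of it.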

To secure the clinical utility of pathology foundation models against these sophisticated threats, the research team proposes a closed-loop methodology comprising detection, source identification, and reconfirmation. This framework uses a dedicated attack-detection network as a first line of defense, followed by protocolized rescanning of physical tissue slides to identify and isolate the source of the attack. Ultimately, the framework relies on a human expert in the loop to assess the morphological data and reject hallucinated diagnostic outputs, ensuring patient safety.

The rapid creation of a universal and transferable perturbation pattern, in less than 15 minutes of training time, carries significant implications for the clinical deployment and safety of AI in digital pathology and optical microscopy systems. Consequently, to support the safer development and deployment of pathology foundation models, it is essential to comprehensively study these threats and develop robust defenses.

The study was supervised by Prof. Aydogan Ozcan of UCLA. The other authors of this work include Yuntian Wang, Xilin Yang, Che-Yung Shen, Shuhang Dong, and Nir Pillar.

Article: https://doi.org/10.1038/s41377-026-02347-w

Source: ucla ita
