Amplifying Membership Signal Through Chained Regeneration

Wojciech Łapacz, Stanisław Pawlak

The tendency of large generative models to memorize training data makes sample verification critical for privacy auditing and copyright enforcement. Current membership (MIA) and dataset inference (DI) attacks often rely on one-shot generations, which yield weak signals and limited sensitivity across modalities. Inspired by Model Autophagy Disorder (MAD), we introduce MADreMIA, a model-agnostic framework that enhances white-, gray-, and black-box MIA and DI. Rather than relying on shadow model training -- often infeasible for large generative models -- our framework facilitates scalable inference by leveraging inherent signals through iterative trajectories. This process utilizes chained generations across diverse modalities, where each output serves as the subsequent input, to improve membership evidence at low FPR. We demonstrate that memorized training samples exhibit significantly higher coherence and slower degradation during iterative regeneration than non-member generations. Our results show that MADreMIA provides richer signals across diverse model families and modalities; we present comprehensive evaluations for IARs, diffusion, and language models, alongside preliminary results demonstrating its potential for audio models.

Open Source

Research Brief

MADreMIA is a novel model-agnostic framework that enhances the detection of memorized training data in generative AI by leveraging iterative, chained generations across modalities, providing stronger evidence for membership at low false positive rates.

Large generative AI models often inadvertently memorize parts of their training data, which poses significant risks for privacy and intellectual property. Current methods for verifying whether a specific data point was part of a model's training set (Membership Inference Attacks, or MIA) or whether a whole dataset was used (Dataset Inference, or DI) often rely on a single generation, which yields a weak signal and limited sensitivity across modalities. This paper introduces MADreMIA, a new approach inspired by 'Model Autophagy Disorder.' Instead of training 'shadow models' (which is often infeasible for very large models), MADreMIA works by repeatedly generating content. It takes an input, generates an output, then uses that output as the next input in a 'chained regeneration' process. The core idea is that data points the model has memorized degrade more slowly and remain more 'coherent' through these iterative generations than unseen, non-member data. The technique is 'model-agnostic': it is evaluated on image autoregressive, diffusion, and language models, with preliminary results for audio models, which makes it a scalable way to audit models for memorized data.
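The chained regeneration loop described above can be illustrated with a small Python sketch. This is not the authors' implementation: the toy regenerator (which pulls inputs toward the nearest stored training vector), the Euclidean distance as a stand-in for 'coherence', and the names `make_toy_regenerator`, `degradation_trajectory`, and `membership_score` are all assumptions made for illustration.

```python
import math
import random
from typing import Callable, List, Sequence

Vector = List[float]


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance, used here as a stand-in degradation measure."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def make_toy_regenerator(train_set: Sequence[Vector], pull: float = 0.5,
                         noise: float = 0.05, seed: int = 0) -> Callable[[Vector], Vector]:
    """Hypothetical stand-in for a generative model that has memorized train_set.

    Each call blends the input with its nearest training vector and adds
    Gaussian noise, so memorized inputs stay close to where they started.
    """
    rng = random.Random(seed)

    def regenerate(x: Vector) -> Vector:
        nearest = min(train_set, key=lambda t: distance(t, x))
        return [pull * t + (1 - pull) * xi + rng.gauss(0.0, noise)
                for t, xi in zip(nearest, x)]

    return regenerate


def degradation_trajectory(sample: Vector, regenerate: Callable[[Vector], Vector],
                           steps: int = 5) -> List[float]:
    """Chain generations (each output becomes the next input) and record
    the distance to the original sample after every step."""
    trajectory = []
    current = list(sample)
    for _ in range(steps):
        current = regenerate(current)
        trajectory.append(distance(sample, current))
    return trajectory


def membership_score(sample: Vector, regenerate: Callable[[Vector], Vector],
                     steps: int = 5) -> float:
    """Higher score = slower degradation = more member-like (assumed convention)."""
    trajectory = degradation_trajectory(sample, regenerate, steps)
    return -sum(trajectory) / len(trajectory)


if __name__ == "__main__":
    train = [[0.0, 0.0], [1.0, 1.0]]
    regen = make_toy_regenerator(train)
    print("member:    ", membership_score([1.0, 1.0], regen))
    print("non-member:", membership_score([3.0, -1.0], regen))
```

In this toy setting the member sample stays near its starting point across the chain, while the non-member drifts toward the nearest training vector, so its average distance from the original is larger. A real audit would replace the regenerator with calls to the model under test and the distance with a modality-appropriate similarity measure.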

Potential Applications

**Privacy Auditing:** Verifying if sensitive personal data (e.g., medical records, private conversations) was inadvertently memorized by a generative model, enabling data removal requests or privacy compliance checks.
**Copyright Enforcement:** Detecting if copyrighted material (e.g., specific artworks, unique text passages) was directly copied or memorized by an AI model, aiding in intellectual property protection for creators.
**Data Governance & Model Security:** Helping developers and organizations understand and control what their models have learned, preventing the accidental leakage of proprietary or restricted training data.
**Bias Detection:** Although the abstract does not state this, detecting memorized examples could indirectly help identify cases where a model relies too heavily on particular training examples that might introduce or perpetuate bias.

Paper Trustworthiness Index

25/100

High Skepticism / Self-Published

This document should be treated with critical skepticism. It contains unverified scientific claims or was self-published.

Verified AI Assessment: This credibility analysis was generated by Gemini 2.5 Flash analyzing the full paper text, references, and metadata.

Core Pillars Breakdown

Author & Institutional Track Record

0 / 25

The abstract itself provides no information about the authors' institutional affiliations; the listing names the authors (Wojciech Łapacz, Stanisław Pawlak) but gives nothing further. Their track record, prestige, and funding therefore cannot be assessed without more context from the full paper or publication details.

Technical Rigor & Methodology

25 / 30

The abstract proposes a 'model-agnostic framework' and explicitly states 'comprehensive evaluations for IARs, diffusion, and language models, alongside preliminary results demonstrating its potential for audio models.' This breadth across diverse model families and modalities, coupled with claims of 'low FPR' and 'significantly higher coherence and slower degradation,' suggests a robust methodological approach and extensive testing, indicating high technical rigor for the proposed method.

Reproducibility & Openness

0 / 25

The abstract does not contain any information regarding the availability of code, datasets, or model weights. There are no links to repositories (e.g., GitHub) or mentions of open-sourcing efforts, making it impossible to assess reproducibility based solely on this abstract.

Community Vetting & Peer Review

0 / 20

The abstract does not specify if the paper has been peer-reviewed, accepted in a major conference (e.g., NeurIPS, ICASSP), or published in a journal. Without this information, its current standing within the scientific community cannot be assessed.

Detailed Evidence Assessment

Verified Evidence & Citations

Current membership inference (MIA) and dataset inference (DI) attacks often rely on one-shot generations, which yield weak signals and limited sensitivity.

“Abstract: 'Current membership (MIA) and dataset inference (DI) attacks often rely on one-shot generations, which yield weak signals and limited sensitivity across modalities.'”

MADreMIA is a model-agnostic framework.

“Abstract: 'we introduce MADreMIA, a model-agnostic framework that enhances white-, gray-, and black-box MIA and DI.'”

MADreMIA facilitates scalable inference without relying on shadow model training.

“Abstract: 'Rather than relying on shadow model training -- often infeasible for large generative models -- our framework facilitates scalable inference by leveraging inherent signals through iterative trajectories.'”

MADreMIA uses chained generations where each output serves as the subsequent input.

“Abstract: 'This process utilizes chained generations across diverse modalities, where each output serves as the subsequent input, to improve membership evidence at low FPR.'”

Memorized training samples exhibit higher coherence and slower degradation during iterative regeneration.

“Abstract: 'We demonstrate that memorized training samples exhibit significantly higher coherence and slower degradation during iterative regeneration than non-member generations.'”

MADreMIA provides richer signals across diverse model families and modalities.

“Abstract: 'Our results show that MADreMIA provides richer signals across diverse model families and modalities; we present comprehensive evaluations for IARs, diffusion, and language models, alongside preliminary results demonstrating its potential for audio models.'”

Uncertainties & Omissions

• Omission: Author affiliations are missing; only the author names are listed.

• Omission: The specific methodologies and experimental setups are not detailed.

• Omission: Quantitative metrics for 'significantly higher coherence' and 'slower degradation' are not provided in the abstract.

• Omission: Details on the datasets used for evaluation are missing.

• Omission: Specific benchmarks or baselines for comparison beyond 'current MIA/DI attacks' are not explicitly mentioned.

• Omission: No codebase, data, or model weights repository links are provided.

• Omission: The publication venue or peer-review status is not mentioned.

• Uncertainty: The exact definition and measurement of 'coherence' and 'degradation' are not specified.

• Uncertainty: The specific 'low FPR' achieved is not quantified.

• Uncertainty: The extent of 'preliminary results' for audio models compared to 'comprehensive evaluations' for other modalities is unclear.

• Uncertainty: The computational cost and practical overhead of 'iterative trajectories' for extremely large models are not detailed in the abstract.

• Uncertainty: The robustness of the framework against adversarial attempts to obscure memorization is not discussed.
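
For context on the 'low FPR' operating point noted above: membership inference results are commonly reported as the true positive rate (TPR) at a small, fixed false positive rate (FPR). The Python sketch below shows one way to compute that figure from per-sample scores. The function name `tpr_at_fpr`, the example scores, and the convention that a higher score means 'more member-like' are assumptions for illustration, not details taken from the paper.

```python
from typing import Sequence


def tpr_at_fpr(member_scores: Sequence[float],
               non_member_scores: Sequence[float],
               target_fpr: float = 0.01) -> float:
    """TPR at a threshold whose FPR does not exceed target_fpr.

    A sample is predicted 'member' when its score is strictly above the
    threshold. Higher score = more member-like (assumed convention).
    """
    if not member_scores or not non_member_scores:
        raise ValueError("both score lists must be non-empty")
    if not 0.0 <= target_fpr <= 1.0:
        raise ValueError("target_fpr must be in [0, 1]")

    negatives = sorted(non_member_scores, reverse=True)
    # Largest number of non-members we may wrongly flag (rounded down,
    # so the achieved FPR is never above the target).
    allowed_fp = int(target_fpr * len(negatives))

    if allowed_fp >= len(negatives):
        threshold = float("-inf")  # every sample may be flagged
    else:
        # At most `allowed_fp` non-members score strictly above this value.
        threshold = negatives[allowed_fp]

    true_positives = sum(1 for s in member_scores if s > threshold)
    return true_positives / len(member_scores)


if __name__ == "__main__":
    # Made-up scores, purely to exercise the function.
    members = [0.9, 0.8, 0.4, 0.3]
    non_members = [0.5, 0.2, 0.1, 0.0]
    print(tpr_at_fpr(members, non_members, target_fpr=0.25))
```

Reporting a single number like this only makes sense alongside the chosen FPR target and the number of non-member samples, since very small FPR targets need a large non-member set to be estimated reliably.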