Rewriting History: A Recipe for Interventional Analyses to Study Data Effects on Model Behavior

Update: 2025-10-22
Description

This paper introduces an experimental recipe for interventional analyses that study how training data affects the behavior of language models (LMs). The methodology, termed "Rewriting History," is a three-stage process: select target evaluation items, match relevant pretraining documents to those items, then modify the matched documents and retrain the model to measure the effect. The authors demonstrate the approach through case studies on factual knowledge acquisition in LMs, examining how both term cooccurrence and information retrieval (IR) methods relate to a model's ability to learn and report facts. The overall aim is to give researchers a standardized, flexible method for testing fine-grained hypotheses about the relationship between pretraining data and specific model behaviors, moving beyond purely observational studies.
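The three stages described above can be illustrated with a toy sketch. This is not the authors' implementation: the `Fact` type, the function names, the substring-based cooccurrence matcher, and the choice to intervene by deleting matched documents are all assumptions made for illustration, and the retraining step is left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """Hypothetical evaluation item: a subject paired with the expected answer."""
    subject: str
    obj: str


def select_items(facts, limit):
    """Stage 1: choose target evaluation items (here, simply the first `limit`)."""
    return facts[:limit]


def match_documents(corpus, fact):
    """Stage 2: return indices of documents where subject and object co-occur."""
    s, o = fact.subject.lower(), fact.obj.lower()
    return [i for i, doc in enumerate(corpus)
            if s in doc.lower() and o in doc.lower()]


def rewrite_corpus(corpus, matched):
    """Stage 3: intervene by dropping matched documents (one possible edit).

    A model would then be retrained on the returned corpus and compared
    against the original model on the target items.
    """
    drop = set(matched)
    return [doc for i, doc in enumerate(corpus) if i not in drop]


# Toy corpus, assumed for illustration only.
corpus = [
    "Paris is the capital of France.",
    "France is known for its cuisine.",
    "The Eiffel Tower stands in Paris, France.",
    "Berlin is the capital of Germany.",
]
facts = [Fact("France", "Paris"), Fact("Germany", "Berlin")]

targets = select_items(facts, limit=1)
matched = sorted({i for f in targets for i in match_documents(corpus, f)})
modified = rewrite_corpus(corpus, matched)
```

In this sketch only the documents that mention both "France" and "Paris" are removed, so any change in the retrained model's answer for that item can be attributed to the intervention rather than to unrelated differences in the data. An IR-based matcher could be swapped in for `match_documents` without changing the other two stages.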

Enoch H. Kang