This September 2025 paper examines how statistical methods can make Generative AI (GenAI) more reliable and useful, given that GenAI models come with no inherent guarantees of correctness or safety. It surveys several statistical applications. The first is improving and altering model behavior through techniques such as output trimming and abstention based on risk scores, often using conformal prediction to obtain provable guarantees. The second is diagnostics and uncertainty quantification (UQ), which distinguishes epistemic from aleatoric uncertainty and addresses challenges such as semantic multiplicity and the need for calibrated GenAI outputs. The paper also highlights the role of statistical inference in evaluating GenAI models, particularly when data are limited, and examines interventions and experiment design, including "steering vectors" and "causal mediation analysis", as tools for understanding and mitigating biases within these complex systems.
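To make the idea of abstention based on risk scores more concrete, here is a minimal Python sketch of a split-conformal style calibration. It is an illustration under stated assumptions, not the paper's procedure: the function names `abstention_threshold` and `respond_or_abstain`, the toy risk scores, and the choice to calibrate only on known-wrong outputs are all hypothetical. The assumption is that a new wrong output is exchangeable with the wrong outputs in the calibration set; under that assumption, a wrong output is answered (rather than abstained on) with probability at most `alpha`.

```python
import math


def abstention_threshold(wrong_scores, alpha):
    """Pick a threshold tau from risk scores of known-wrong calibration outputs.

    Assumption (hypothetical setup): higher risk score = more likely wrong,
    and a future wrong output is exchangeable with the calibration ones.
    Then P(risk score of a wrong output < tau) <= alpha.
    """
    n = len(wrong_scores)
    k = math.floor(alpha * (n + 1))  # rank of the order statistic to use
    if k == 0:
        # Too few calibration points for this alpha: abstain on everything.
        return -math.inf
    return sorted(wrong_scores)[k - 1]  # k-th smallest calibration score


def respond_or_abstain(risk_score, tau):
    """Answer only when the risk score is strictly below the threshold."""
    return "answer" if risk_score < tau else "abstain"


# Toy calibration data (made up for illustration): risk scores of outputs
# that were judged wrong on a held-out calibration set.
wrong_scores = [0.9, 0.4, 0.7, 0.8, 0.6, 0.95, 0.5]
tau = abstention_threshold(wrong_scores, alpha=0.25)

print(tau)
print(respond_or_abstain(0.45, tau))
print(respond_or_abstain(0.80, tau))
```

The strict inequality in `respond_or_abstain` matters: answering only when the score is strictly below the `k`-th smallest calibration score is what keeps the bound valid even when scores are tied. The guarantee is marginal over the calibration draw and says nothing about how often correct outputs are abstained on; that trade-off depends on how well the risk score separates right from wrong outputs.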
Source:
https://arxiv.org/pdf/2509.07054