This research explores the discrepancy between transformer in-context learning and Bayesian inference, arguing that these models are Bayesian in expectation rather than in every individual realization. While previous studies used martingale diagnostics to question the Bayesian nature of such models, this paper identifies positional encodings as the primary factor breaking the required exchangeability. By accounting for how the architecture encodes sequence order, the authors prove that transformers still achieve near-optimal compression and information-theoretic efficiency when performance is averaged across orderings. Empirical tests on black-box LLMs and controlled ablations show that order-induced variance exists but decays predictably as context length increases. The study concludes by proposing permutation averaging as a practical method for reducing order-induced uncertainty and improving the reliability of model outputs on tasks with exchangeable data.
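The permutation-averaging idea can be illustrated with a minimal sketch. The paper works with real transformers; here `toy_predict` is a hypothetical stand-in whose output depends on context order (mimicking positional-encoding sensitivity), and `permutation_average` recovers an order-free estimate by averaging over random shuffles of the in-context examples. All names and the weighting scheme are illustrative assumptions, not the authors' implementation.

```python
import random
from statistics import mean, pstdev

def toy_predict(context, query):
    """Hypothetical order-sensitive predictor (stand-in for a transformer).

    Later context examples are weighted more heavily, simulating the way
    positional encodings break exchangeability over the context.
    """
    total_weight = sum(range(1, len(context) + 1))
    score = 0.0
    for position, (x, y) in enumerate(context):
        match = 1.0 if x == query else 0.0
        score += (position + 1) * match * y  # position-dependent contribution
    return score / total_weight

def permutation_average(predict, context, query, n_perms=64, seed=0):
    """Average an order-sensitive prediction over random context orderings.

    Returns the mean prediction and its spread across orderings; the spread
    quantifies the order-induced variance discussed in the paper.
    """
    rng = random.Random(seed)
    preds = []
    for _ in range(n_perms):
        perm = context[:]          # copy so the caller's list is untouched
        rng.shuffle(perm)
        preds.append(predict(perm, query))
    return mean(preds), pstdev(preds)

# Exchangeable toy context: (example, label) pairs whose order is arbitrary.
context = [("a", 1.0), ("b", 0.0), ("a", 1.0), ("c", 0.0)]
avg, spread = permutation_average(toy_predict, context, "a")
```

With this weighting, any single ordering yields a value between 0.3 and 0.7 depending on where the matching examples land, while the permutation average concentrates near the order-free value of 0.5, illustrating why averaging reduces order-induced uncertainty.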