Description

This July 2025 paper introduces AdLlama, a large language model (LLM) for generating Facebook ad text, trained with Reinforcement Learning with Performance Feedback (RLPF). Earlier models were trained by supervised fine-tuning to imitate curated ads. AdLlama instead uses historical ad performance data, specifically click-through rate (CTR), as the reward signal for optimizing its text generation. In a large-scale A/B test on Facebook involving nearly 35,000 advertisers, AdLlama improved advertiser-level CTR by 6.7% compared with the supervised imitation model, a statistically significant gain, and increased the number of ad variations advertisers created by 18.5%. The authors present RLPF as a promising, generalizable approach to metric-driven LLM post-training: rather than optimizing for subjective human preferences, it optimizes for measured business outcomes in real-world settings such as online advertising.
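
To make the idea of using CTR as a reward signal concrete, here is a minimal Python sketch of one common way such a signal can be built: turning historical ad performance into pairwise preferences and scoring them with a Bradley-Terry style loss. This is an illustration only, not the paper's pipeline. The function names, the `(text, clicks, impressions)` record layout, and the `min_impressions` threshold are all assumptions made for this example.

```python
import math
from itertools import combinations


def ctr(clicks, impressions):
    """Click-through rate; 0.0 when the ad was never shown."""
    return clicks / impressions if impressions else 0.0


def preference_pairs(ads, min_impressions=1000):
    """Build (winner_text, loser_text) pairs from historical performance.

    `ads` is a list of (text, clicks, impressions) tuples (hypothetical layout).
    Ads below `min_impressions` are skipped because their CTR is too noisy.
    """
    eligible = [(text, ctr(c, i)) for text, c, i in ads if i >= min_impressions]
    pairs = []
    for (text_a, ctr_a), (text_b, ctr_b) in combinations(eligible, 2):
        if ctr_a == ctr_b:
            continue  # a tie carries no preference information
        pairs.append((text_a, text_b) if ctr_a > ctr_b else (text_b, text_a))
    return pairs


def bradley_terry_loss(score_winner, score_loser):
    """Pairwise loss -log(sigmoid(score_winner - score_loser)), computed stably."""
    diff = score_winner - score_loser
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


if __name__ == "__main__":
    history = [
        ("Shop the summer sale today", 30, 1000),
        ("Summer sale", 10, 1000),
        ("Sale!", 5, 100),  # too few impressions, ignored
    ]
    for winner, loser in preference_pairs(history):
        print(f"prefer {winner!r} over {loser!r}")
```

A reward model trained to minimize this loss over many such pairs would assign higher scores to text that historically earned more clicks, and a reinforcement learning step could then optimize the generator against those scores. How AdLlama actually constructs its reward is described in the paper itself.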

Source:

https://arxiv.org/pdf/2507.21983