
In this episode, Katherine Forrest and Scott Caravello break down three generative AI architectures—transformers, JEPA, and diffusion models—exploring what sets each apart and how they overlap. They also discuss Manifold-Constrained Hyper-Connections, a recent innovation aimed at improving how transformer layers communicate during training.

For the sources referenced in this episode, please see the links below:

DeepSeek AI: mHC: Manifold-Constrained Hyper-Connections


## Learn More About Paul, Weiss’s Artificial Intelligence Practice
https://www.paulweiss.com/industries/artificial-intelligence