Why does PyTorch use code generation as part of its build process? Why doesn't it use C++ templates? What is code generation used for? What are the pros and cons of using code generation? What are some other ways to do the same things we currently do with code generation?
Further reading.
Outline:
- High level: reduce the amount of code in PyTorch, easier to develop
- Strongly typed Python
- Stuff we're using codegen for
- Meta point: stuff C++ metaprogramming can't do
- C++ APIs (functions, methods on classes)
- Especially for forwarding (C++ has no overloadable operator dot)
- Prototypes for C++ to implement
- YAML files used by external frameworks for binding (accidental)
- Python arg parsing
- .pyi type stub generation
- Autograd classes that store the data saved for backward
- Otherwise complicated constexpr computation (e.g., parsing JIT schema)
- Pros
- Better surface syntax (native_functions.yaml, jit schema, derivatives.yaml)
- Better error messages (template messages famously bad)
- Easier to organize complicated code; esp nontrivial input data structure
- Easier to debug by looking at generated code
- Cons
- Not as portable (templates can be used by anyone)
- Weaker modeling of C++ type-based metaprogramming (we've replicated a crappy version of the C++ type system in our codegen)
- Counterpoints in the design space
- C++ templates: just as efficient
- Boxed fallback: simpler, less efficient
- Open question: can you have best of both worlds, e.g., with partially evaluated interpreters?
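
To make the "C++ APIs ... especially for forwarding" and "Prototypes for C++ to implement" items concrete, here is a minimal sketch of the idea in Python. It is not PyTorch's actual codegen: the schema strings are a simplified, hypothetical subset loosely modeled on native_functions.yaml entries, and the names `parse_schema`, `function_decl`, `method_def`, and the type mapping tables are invented for illustration. It shows two of the claimed pros: the input is a small readable surface syntax, and a malformed entry yields a plain error message instead of a template instantiation trace.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical, simplified schema strings. Real entries carry much more
# information (overloads, defaults, aliasing annotations, dispatch keys).
SCHEMAS = [
    "add(Tensor self, Tensor other) -> Tensor",
    "relu(Tensor self) -> Tensor",
]

# Assumed mapping from schema types to C++ types, for illustration only.
CPP_ARG_TYPES = {"Tensor": "const Tensor&", "int": "int64_t"}
CPP_RETURN_TYPES = {"Tensor": "Tensor", "int": "int64_t"}


@dataclass
class Arg:
    type: str
    name: str


@dataclass
class Op:
    name: str
    args: List[Arg]
    ret: str


def parse_schema(schema: str) -> Op:
    """Parse 'name(Type arg, ...) -> Type' into a structured Op."""
    m = re.fullmatch(r"(\w+)\((.*)\) -> (\w+)", schema)
    if m is None:
        raise ValueError(f"cannot parse schema: {schema!r}")
    name, arg_str, ret = m.groups()
    args = []
    for piece in arg_str.split(","):
        piece = piece.strip()
        if not piece:
            continue
        ty, arg_name = piece.split()
        args.append(Arg(ty, arg_name))
    return Op(name, args, ret)


def function_decl(op: Op) -> str:
    """Emit the C++ prototype that a kernel author would implement."""
    params = ", ".join(f"{CPP_ARG_TYPES[a.type]} {a.name}" for a in op.args)
    return f"{CPP_RETURN_TYPES[op.ret]} {op.name}({params});"


def method_def(op: Op) -> str:
    """Emit a Tensor method that forwards to the free function."""
    if not op.args or op.args[0].name != "self":
        raise ValueError(f"{op.name}: a method needs a leading 'self' argument")
    rest = op.args[1:]
    params = ", ".join(f"{CPP_ARG_TYPES[a.type]} {a.name}" for a in rest)
    call_args = ", ".join(["*this"] + [a.name for a in rest])
    return (
        f"inline {CPP_RETURN_TYPES[op.ret]} Tensor::{op.name}({params}) const "
        f"{{ return at::{op.name}({call_args}); }}"
    )


if __name__ == "__main__":
    for schema in SCHEMAS:
        op = parse_schema(schema)
        print(function_decl(op))
        print(method_def(op))
```

The generated text is ordinary C++ that can be read and stepped through in a debugger, which is the "easier to debug by looking at generated code" point. The type mapping tables also illustrate the con noted above: even this toy version has to re-encode a small piece of the C++ type system by hand.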