
Description

John Berryman moved from aerospace engineering into search, and from there into ML and LLMs. His path: Eventbrite search → GitHub code search → data science → GitHub Copilot. Throughout, he gravitated toward work with more math and machine learning.

RAG Explained

"RAG is not a thing. RAG is two things." It breaks into:

  1. Search - finding relevant information
  2. Prompt engineering - presenting that information to the model

These should be treated as separate problems to optimize.
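
A minimal sketch of that separation, with a toy keyword retriever and a hypothetical call_llm stand-in for whatever model API is in use (both are illustrative assumptions, not from the episode). The point is that the search step and the prompt-assembly step can be measured and tuned independently.

```python
# Toy illustration of RAG as two separable problems: search, then prompt engineering.

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in: replace with a real model API call.
    return f"[model response to a {len(prompt)}-character prompt]"

def retrieve(query: str, documents: list[str], k: int = 3) -> list[str]:
    """Problem 1, search: rank documents by naive keyword overlap with the query."""
    query_terms = set(query.lower().split())
    scored = [(len(query_terms & set(doc.lower().split())), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]

def build_prompt(query: str, context_docs: list[str]) -> str:
    """Problem 2, prompt engineering: present the retrieved context to the model."""
    context = "\n\n".join(f"- {doc}" for doc in context_docs)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )

def answer(query: str, documents: list[str]) -> str:
    docs = retrieve(query, documents)   # optimize search on its own (recall, ranking)
    prompt = build_prompt(query, docs)  # optimize presentation on its own (format, ordering)
    return call_llm(prompt)

print(answer("when was the invoice paid",
             ["Invoice 42 was paid on March 3.", "The office is closed on Fridays."]))
```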

The Little Red Riding Hood Principle

When prompting LLMs, stay on the path of what models have seen in training. Use formats, structures, and patterns they recognize from their training data.

Models respond better to familiar structures.
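
As a small illustration (the field names and layout here are assumptions, not from the episode), the same task can be laid out as ordinary Markdown headings and bullets, a structure models have seen constantly in training, rather than an invented delimiter scheme:

```python
# Present the task in a familiar Markdown structure rather than an ad-hoc format.
bug_report = {
    "title": "Login button unresponsive on mobile",
    "steps": ["Open the app on iOS", "Tap 'Log in'", "Nothing happens"],
    "expected": "The login form appears",
}

prompt = (
    f"# Bug report: {bug_report['title']}\n\n"
    "## Steps to reproduce\n"
    + "".join(f"- {step}\n" for step in bug_report["steps"])
    + f"\n## Expected behavior\n{bug_report['expected']}\n\n"
    "## Task\nSuggest the most likely cause and a fix."
)
print(prompt)
```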

Testing Prompts

Testing strategies:

Managing Token Limits

When designing prompts, divide content into priority tiers:

  1. Must-have information
  2. Nice-to-have information
  3. Optional content, included only if space allows

Even with larger context windows, efficiency remains important for cost and latency.
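
A minimal sketch of that prioritization, assuming a rough 4-characters-per-token estimate (a real tokenizer such as tiktoken would be more accurate): must-have content always goes in, and lower-priority content is added only while it still fits the budget.

```python
# Fill the prompt in priority order: must-have first, optional dropped when over budget.

def rough_token_count(text: str) -> int:
    # Crude estimate (~4 characters per token); swap in a real tokenizer for production.
    return max(1, len(text) // 4)

def assemble_prompt(sections: list[tuple[int, str]], budget: int) -> str:
    """sections: (priority, text) pairs where 1 = must-have, 2 = nice-to-have, 3 = optional."""
    chosen, used = [], 0
    for priority, text in sorted(sections, key=lambda s: s[0]):
        cost = rough_token_count(text)
        if priority == 1 or used + cost <= budget:
            chosen.append(text)
            used += cost
    return "\n\n".join(chosen)

prompt = assemble_prompt(
    [
        (1, "System instructions: answer as a support agent."),
        (2, "Recent conversation history ..."),
        (3, "Full product changelog ..."),
    ],
    budget=20,
)
print(prompt)  # the optional changelog is dropped; the must-have instructions are kept
```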

Completion vs. Chat Models

Chat models are winning out over raw completion models, despite initial concerns about the constraints of the chat format.
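
For contrast, a sketch of the two interfaces using OpenAI's Python client (the library choice and the model names are assumptions, not from the episode; adapt to whatever provider you use):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Completion-style: one free-form string, and the model simply continues the text.
legacy = client.completions.create(
    model="gpt-3.5-turbo-instruct",  # assumed completion-capable model
    prompt="Translate to French: 'Where is the station?'\nFrench:",
    max_tokens=30,
)
print(legacy.choices[0].text)

# Chat-style: the same task expressed as role-tagged messages.
chat = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed chat model
    messages=[
        {"role": "system", "content": "You translate English into French."},
        {"role": "user", "content": "Where is the station?"},
    ],
)
print(chat.choices[0].message.content)
```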

Applications: Workflows vs. Assistants

Two main LLM application patterns:

  1. Workflows - predefined chains of steps, with the model invoked at fixed points
  2. Assistants - open-ended, conversational interactions driven by the user

Breaking Down Complex Problems

Two approaches:

  1. Horizontal decomposition - splitting one task into a sequence of smaller steps
  2. Vertical decomposition - splitting the problem space into distinct variants, each handled separately

Example: for SOX compliance, break the work down horizontally (understand the control, find evidence, extract data, compile the report) and vertically (handle different audit types separately).
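
A sketch of the two cuts on the SOX example, with a hypothetical call_llm stand-in and illustrative step prompts: the horizontal cut turns one audit into a chain of small calls, while the vertical cut gives each audit type its own pipeline that can be prompted and tuned separately.

```python
# Horizontal vs. vertical decomposition, sketched on the SOX-compliance example.

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real model call.
    return f"[model output for: {prompt[:40]}...]"

# Horizontal: one audit handled as a chain of smaller, checkable steps.
def run_audit(control_description: str, evidence: str) -> str:
    understanding = call_llm(f"Summarize this control and what it requires:\n{control_description}")
    relevant = call_llm(f"Control summary:\n{understanding}\n\nList the items in this evidence that matter:\n{evidence}")
    extracted = call_llm(f"Extract the key data points from:\n{relevant}")
    return call_llm(f"Compile an audit finding from these data points:\n{extracted}")

# Vertical: different audit types get their own, separately tuned pipelines.
audit_pipelines = {
    "access_review": run_audit,       # in practice, each entry would use its own prompts
    "change_management": run_audit,
}

report = audit_pipelines["access_review"](
    "Quarterly access reviews must be documented.",
    "access_log.csv contents ...",
)
print(report)
```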

On Agents

Agents exist on a spectrum from assistants to workflows.
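
At the agent end of that spectrum, the model itself chooses the next step in a loop. A minimal sketch, with hypothetical tool names and a stubbed call_llm (real systems would typically use a provider's tool-calling API rather than hand-parsed JSON):

```python
import json

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in; a real model would decide which tool to call next.
    return json.dumps({"answer": "stub answer"})

TOOLS = {
    "search_code": lambda query: f"(search results for {query!r})",
    "read_file": lambda path: f"(contents of {path})",
}

def run_agent(task: str, max_steps: int = 5) -> str:
    transcript = f"Task: {task}\n"
    for _ in range(max_steps):
        decision = json.loads(call_llm(
            transcript + '\nReply as JSON: {"tool": ..., "argument": ...} or {"answer": ...}'
        ))
        if "answer" in decision:            # the model decided it is done
            return decision["answer"]
        result = TOOLS[decision["tool"]](decision["argument"])
        transcript += f"\n{decision['tool']}({decision['argument']!r}) -> {result}"
    return "Stopped after max_steps without a final answer."

print(run_agent("Find where the retry logic lives"))
```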

Best Practices

For building with LLMs:

  1. Start simple: API key + Jupyter notebook
  2. Build prototypes and iterate quickly
  3. Add evaluation as you scale (see the sketch after this list)
  4. Keep users in the loop until models prove reliability
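
For step 3, a minimal sketch of what "evaluation" can mean at first: a handful of hand-written cases checked for expected substrings. The call_llm stub and the test cases are illustrative assumptions; swap in a real model call to get meaningful scores.

```python
# Tiny evaluation harness: run a prompt template over fixed cases and score it.

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in; replace with a real model call.
    return "[model answer]"

TEST_CASES = [
    {"input": "What is 2 + 2?", "must_contain": "4"},
    {"input": "What is the capital of France?", "must_contain": "Paris"},
]

def evaluate(prompt_template: str) -> float:
    passed = 0
    for case in TEST_CASES:
        output = call_llm(prompt_template.format(question=case["input"]))
        passed += case["must_contain"].lower() in output.lower()
    return passed / len(TEST_CASES)

print(evaluate("Answer briefly: {question}"))
```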

Guest: John Berryman

Host: Nicolay Gerold