Week of 2024-12-29

Description

Alex: Hello and welcome to The Generative AI Group Digest for the week of 29 Dec 2024!

Maya: We're Alex and Maya.

Alex: First up, we’re talking about AI and the future of software engineering, sparked by Aravind Putrevu’s insightful take on code generation and the role of CS grads.

Maya: Aravind said, “The talk that it is over for CS Grads a joke. We have a long long way to go.” That’s pretty bold!

Alex: Absolutely! He’s seeing a future where humans might not write code—but that future still needs breakthroughs in things like long context understanding and cost efficiency.

Maya: What’s "long context," by the way?

Alex: It’s basically how much information the AI can refer to at once. For software, having long context helps by understanding complex codebases, especially when libraries have multiple versions.

Maya: Yatharth Pipsies added that most bugs come from library version mismatches, right?

Alex: Yes, and he mentioned “Supermaven,” a non-transformer model, as an interesting approach to tackling long context challenges.

Maya: Aravind also stressed the need for more engineers who innovate on architecture, memory, and hardware, not just API users.

Alex: That’s right. Paras Chopra agreed, saying software engineering teaches both problem-solving and creation, skills still crucial despite AI’s rise.

Maya: Shan Shah made an intriguing point—if AI helps software engineering become as reliable as traditional engineering, with zero surprises at launch, that would be huge.

Alex: Definitely. Today, software is notorious for bugs at launch; AI could help change that standard.

Maya: And Aravind wrapped this up by saying all the doom and gloom talk isn’t helpful to industry, encouraging a more optimistic and proactive outlook.

Alex: Next, let’s move on to the future of CS students and careers with AI.

Maya: Aravind mentioned shortening timelines for learning tech, like maybe BTech shouldn't have to be four years anymore?

Alex: Exactly. At AI companies, candidates solve AI-solvable problems using prompting, which speeds up learning and hiring.

Maya: Yatharth wondered if teams might get smaller with more reasoning models and better context handling.

Alex: Aravind thinks so. He predicts future developers might focus more on reviewing generated code, with testing becoming key.

Maya: So new grads might get hired more as code reviewers rather than pure coders.

Alex: That’s the idea. It signals a shift in skills needed in the industry.

Maya: And also interesting — Aravind shared a cool AI side project, an AI-generated greeting card app he coded over a weekend.

Alex: That’s a great example of how AI lowers the barrier for indie developers to create and market products quickly.

Maya: Alright, moving on, let’s talk about web search APIs and AI grounding.

Alex: Vikram asked about using Perplexity API for web search but found the docs sparse.

Maya: Rachitt Shah recommended Tavily and Exa but with caveats—the performance varies by use case.

Alex: Others found Exa erratic, with outdated results sometimes surfacing for fresh queries.

Maya: Aman pointed out Exa’s financial report searches often hit stale results, implying coverage gaps.

Alex: There was also discussion about Google’s Gemini API offering new grounding features for AI with some early access credits for startups.

Maya: Jina AI provides a similar service — both offer AI-powered fact-checking and search.

Alex: So for anyone building AI-powered search or retrieval tools, it’s crucial to test these APIs for freshness, relevance, and coverage.

Maya: Next, let’s talk startup funding and innovation culture in India.

Alex: xAI, Elon Musk’s AI company, raised $6 billion at a rumored $75 billion valuation. That’s huge!

Maya: Aravind highlighted the lack of policy-driven R&D spend and IP commercialization in India, saying a culture shift is needed beyond just VC funding.

Alex: Paras Chopra and others pointed out systemic challenges—apathy toward breakthrough talent, risk aversion, and cultural crab mentality.

Maya: Rajiv was optimistic though, urging the new generation to dream bigger in this transformative time.

Alex: Pratik Bhavsar emphasized the power of focused, low-cost fine-tuning ($800 total compute), showing you don’t always need giant clusters to innovate.

Maya: Great insights. Now, a quick word on advancements in model training and cost efficiency.

Alex: Deepseek V3, a Chinese model, reportedly used only 2.8 million GPU hours, far less than the 30M+ GPU hours for Llama 3, thanks to innovations like FP8 training and optimized parallelism.

Maya: But people still wonder about its coding performance versus models like Claude which have strong post-training.

Alex: Post-training means fine-tuning a model’s behavior based on real human feedback to improve quality, which is computationally expensive but pays off.

Maya: Nirant K shared how ranking multiple outputs with a reward model plus rejection sampling leads to better results—a clever training loop.

Alex: And Paras Chopra noted that most competitive edges come from post-training because the base pre-training data is usually public.

Maya: Switching gears, a fun update — YouTube now has dubbed audio tracks even in Indic languages like Tamil and Hindi, improving accessibility.

Alex: That’s thanks to voice actors and production houses investing in manual dubbing, costing surprisingly little compared to overall production.

Maya: On to TTS tech — Kewal’s working on improving pronunciation for accented words using a combination of language-specific TTS and speech-to-speech APIs.

Alex: AI4Bharat has good native Indic TTS models but struggles mixing with English, so batch pre-generation strategies might help.

Maya: Then there was talk about context windows—how models handle really long texts like 1 million tokens.

Alex: Adarsh explained models aren’t directly trained on 1M tokens; they expand context windows later with special math tricks like RoPE and ALiBi.

Maya: Managing VRAM and serving these models efficiently is still a big challenge.

Alex: Finally, a listener tip. Maya, what do you have for us?

Maya: Here’s a pro tip you can try today: If you hit chat limits in AI code generation, try summarizing your conversation to extract key context, then feed that summary into a new chat. It reduces token usage while keeping continuity. Alex, how would you use that?

Alex: Great tip! I’d use this for complex coding tasks—split a big problem into chat chunks, summarize intermediate results, and keep the AI focused without losing progress.

Maya: Perfect. Let’s wrap up with key takeaways.

Alex: Remember, AI is transforming software engineering, but human creativity and innovation remain essential—especially in code review and testing.

Maya: Don’t forget, infrastructure and talent gaps are real, but focused, resource-efficient R&D can still lead to breakthroughs from anywhere.

Maya: That’s all for this week’s digest.

Alex: See you next time!

Listen

Description

Want to check another podcast?