## Sources

1. [Generative AI in the Real World: Aishwarya Naresh Reganti on Making AI Work in Production](https://www.oreilly.com/radar/podcast/generative-ai-in-the-real-world-aishwarya-naresh-reganti-on-making-ai-work-in-production/)
2. [Meet the Scope Creep Kraken](https://www.oreilly.com/radar/meet-the-scope-creep-kraken/)

---

### Generative AI in the Real World: Aishwarya Naresh Reganti on Making AI Work in Production
**Authors:** Ben Lorica and Aishwarya Naresh Reganti

*   **The 80-20 Flip in AI Development:** Traditional software development typically dedicates 80% of the time to building and 20% to post-launch maintenance [1]. Generative AI flips this ratio: developers spend about 20% of their time building and **80% of their time on "calibration,"** which means continuously monitoring how users interact with the product in natural language and adjusting it to match [2].
*   **The Importance of Data and Workflows:** A common mistake non-machine-learning developers make is neglecting to look closely at their data distribution [3] (the first sketch after this list shows a two-line distribution check). **Taking the time to manually establish workflows, curate data, and set up agents** is an underrated but foundational step toward maximizing AI performance [3, 4].
*   **Traditional Software Skills are Still Vital:** Traditional developers bring crucial design thinking to AI projects, such as building secure, scalable architectures around the model and treating the model as a nondeterministic API [5] (see the retry-and-validate sketch after this list).
*   **Evals are a Process, Not a Buzzword:** The term "evaluations" or "evals" has become overhyped and poorly defined [6-8]. Evals are a long, continuous process of calibrating a product, building a feedback flywheel, and running online A/B tests, not a one-off run against a static dataset of metrics [7, 8].
*   **Balancing Trade-offs in AI Products:** When building AI systems, teams must balance **performance, effort, cost, and latency** [9]. A recommended strategy is to start with a low-effort approach to see what is possible, focus on hitting performance targets for your dataset, and only optimize for cost and latency (for example, via caching or smaller models) once a functional prototype exists [9]; a minimal caching sketch follows this list.
*   **When to Use an LLM Judge:** You should only replace human judgment with an LLM judge when you can **codify an evaluation framework and write it out as a rubric in natural language** [10] (illustrated by the rubric sketch below). For ambiguous tasks that require a specific brand voice or subjective reasoning, subject-matter experts (like marketers) are essential cross-functional collaborators [11, 12].
*   **Enterprise Adoption Prioritizes Internal Ops:** Between 70% and 80% of enterprise engagements focus on internal productivity and ops rather than customer-facing applications, owing to the severe PR risks of generative AI errors [13]. When AI does face customers, companies rely heavily on triaging systems so that humans oversee complex or high-risk interactions [14, 15] (sketched in the triage example below).
*   **Model Neutrality vs. Personality:** Enterprises tend to stick with major vendor partnerships rather than swapping models, as baseline capabilities are converging rapidly [16, 17]. However, consumer applications rely heavily on a model's distinct "personality." Swapping models for consumer apps requires immense prompt reengineering, making model neutrality difficult to achieve [17-19].
*   **Career Advice for the AI Era:** Early-career professionals should strive to be **"agent native"**—meaning they instinctively understand how to delegate and augment workflows using AI tools [20, 21]. Building independent projects that solve personal pain points and sharing them publicly is a highly effective way to gain visibility and bypass traditional job application queues [22, 23].
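
The data-inspection advice in the second bullet is easy to make concrete. The sketch below is our own illustration, not anything from the episode; the tiny example dataset and the 25-word length buckets are arbitrary assumptions.

```python
from collections import Counter

# Hypothetical labeled examples; in practice this would be your real dataset.
examples = [
    ("the checkout page crashes on submit", "bug"),
    ("love the new dashboard", "praise"),
    ("how do I export my data", "question"),
    ("app crashes when I upload a file", "bug"),
]

# Label distribution: reveals class imbalance before any model work starts.
print(Counter(label for _, label in examples))

# Input-length distribution in 25-word buckets: reveals outliers and truncation risk.
print(Counter(len(text.split()) // 25 for text, _ in examples))
```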
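
The "nondeterministic API" framing in the third bullet maps onto a common wrapper pattern: validate every response and retry on failure. This is a minimal sketch under our own assumptions; `call_model`, the JSON schema, and the retry count are hypothetical, not from the episode.

```python
import json
import random

def call_model(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM API; simulates nondeterminism
    by occasionally returning malformed output."""
    if random.random() < 0.3:
        return "Sure! Here's the JSON: {\"sentiment\": ..."
    return '{"sentiment": "positive", "confidence": 0.92}'

def classify_with_retries(prompt: str, max_attempts: int = 3) -> dict:
    """Treat the model as a nondeterministic API: never trust the first
    response, validate it against a schema, and retry on failure."""
    for _ in range(max_attempts):
        raw = call_model(prompt)
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed output: try again
        if {"sentiment", "confidence"} <= parsed.keys():
            return parsed  # passed schema validation
    raise RuntimeError(f"no valid response after {max_attempts} attempts")

print(classify_with_retries("Classify the sentiment of: 'Great product!'"))
```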
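
The cost/latency advice in the fifth bullet is likewise sketchable: once performance targets are hit, a cache in front of the model cuts both. The normalization rule and cache size below are assumptions, and a production system would likely use a persistent or semantic cache instead.

```python
from functools import lru_cache

def call_model(prompt: str) -> str:
    """Hypothetical stand-in for a slow, metered LLM API call."""
    return f"(model output for: {prompt})"

def normalize(prompt: str) -> str:
    """Collapse trivial variations so near-identical prompts share a cache entry."""
    return " ".join(prompt.lower().split())

@lru_cache(maxsize=4096)
def _cached(prompt_key: str) -> str:
    return call_model(prompt_key)

def answer(prompt: str) -> str:
    return _cached(normalize(prompt))

print(answer("What is our refund policy?"))
print(answer("what is our   refund policy?"))  # cache hit: no second model call
```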
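
The LLM-judge bullet's precondition, an evaluation framework written out as a natural-language rubric, looks roughly like this in practice. The rubric text, JSON schema, and `call_model` stub are invented for illustration; only the rubric-first principle comes from the episode.

```python
import json

RUBRIC = """Score the draft from 1 to 5 on each criterion:
1. Matches the brand voice: plainspoken, no jargon.
2. States the discount exactly once.
3. Ends with a single, clear call to action.
Return JSON: {"scores": [int, int, int], "justification": "<one sentence>"}"""

def call_model(prompt: str) -> str:
    """Hypothetical stand-in for the judge model's API."""
    return '{"scores": [5, 4, 5], "justification": "On voice, but discount is vague."}'

def judge(draft: str) -> dict:
    """Codify the evaluation framework as a natural-language rubric,
    then ask the judge model to apply it and return structured scores."""
    return json.loads(call_model(f"{RUBRIC}\n\nDraft to evaluate:\n{draft}"))

print(judge("Save 20% this week. Tap below to claim your discount."))
```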
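
Finally, the triaging systems mentioned in the enterprise bullet reduce to a routing decision: the model answers only low-risk, high-confidence cases. The keyword list and confidence threshold below are placeholder assumptions; a real system would use trained risk classifiers.

```python
HIGH_RISK_TERMS = {"refund", "legal", "cancel", "complaint"}  # assumed keyword list

def triage(message: str, model_confidence: float) -> str:
    """Route a customer message: auto-reply only when the topic is low-risk
    AND the model is confident; otherwise escalate to a human."""
    risky = any(term in message.lower() for term in HIGH_RISK_TERMS)
    return "human_queue" if risky or model_confidence < 0.8 else "auto_reply"

print(triage("Where is my order?", model_confidence=0.93))   # -> auto_reply
print(triage("I want a refund now", model_confidence=0.95))  # -> human_queue
```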

### Meet the Scope Creep Kraken
**Author:** Tim O'Brien

*   **AI Removes the Friction of Scope Creep:** Scope creep existed long before AI, but AI accelerates its growth [24]. Previously, staffing constraints and the time required to build and test features naturally filtered out unnecessary additions. AI removes that barrier: a model can generate complex code, such as an entire Swift application, in minutes [24, 25].
*   **The Danger of "Tool-Driven Momentum":** Reckless project expansion usually starts with legitimate excitement rather than incompetence [25, 26]. Teams fall into **"confident improvisation,"** adding features just because the model can generate them quickly, subtly replacing deliberate design decisions with rapid demonstrations [26].
*   **The Illusion of Productivity vs. Integration Cost:** While AI increases output and makes teams feel highly productive, it masks the massive downstream integration costs [27]. Every new feature (or "tentacle" of the Kraken) brings a new maintenance obligation, requiring extensive testing and documentation, which pulls the project away from its original goal [27].
*   **Demonstrations Are Not Decisions:** The author emphasizes that a feature should not be considered complete just because an AI produced a convincing draft [28]. Teams easily confuse a successful rapid demonstration with a strategic decision [28].
*   **Restoring Project Discipline:** To fight the Scope Creep Kraken, teams need to put traditional project management friction back in place [28]. This requires keeping a written scope, actively identifying when a new "tentacle" is introduced, and rigorously asking how each new addition affects testing, support, and the future maintainability of the system [28, 29].