## Sources

1. [Anthropic Safety Overseer Gets Board Majority at Last](https://awesomeagents.ai/news/anthropic-ltbt-board-majority-narasimhan/)
2. [9 of 428 LLM Routers Were Secretly Hijacking Agent Calls](https://awesomeagents.ai/news/llm-router-agent-supply-chain-attack/)
3. [MoE Myths, Context Compression, and Steering Proofs](https://awesomeagents.ai/science/moe-myths-context-compression-steering-proofs/)
4. [NVIDIA Ising: Open AI for Quantum Error Correction](https://awesomeagents.ai/news/nvidia-ising-quantum-ai-models/)
5. [Claude Mythos Preview - Anthropic's Restricted Frontier](https://awesomeagents.ai/models/claude-mythos-preview/)
6. [How to Build AI Presentations - A Beginner's Guide](https://awesomeagents.ai/guides/how-to-use-ai-for-presentations/)
7. [Linux Kernel Finally Sets Rules for AI-Assisted Code](https://awesomeagents.ai/news/linux-kernel-ai-code-policy-7-0/)
8. [Novo Nordisk Bets Its Drug Pipeline on OpenAI](https://awesomeagents.ai/news/novo-nordisk-openai-drug-discovery-deal/)
9. [Overall LLM Rankings: April 2026](https://awesomeagents.ai/leaderboards/overall-llm-rankings-apr-2026/)
10. [Leaked Screenshots Show Anthropic Building a Lovable Killer](https://awesomeagents.ai/news/anthropic-app-builder-leak-lovable-rival/)

---

### 9 of 428 LLM Routers Were Secretly Hijacking Agent Calls by Elena Marchetti
*   **The Threat:** Researchers discovered that **9 out of 428 third-party LLM routers are actively injecting malicious tool calls**, enabling them to steal AWS credentials and drain crypto wallets from AI agent sessions [1]. LLM API routers serve as proxies with full plaintext access to requests and responses, allowing them to rewrite tool call arguments [2, 3].
*   **The "YOLO Mode" Vulnerability:** A major underlying issue is that **401 out of 440 observed production sessions were operating in "YOLO mode"**, which approves tool calls automatically without human confirmation [4, 5]. In such a session, a tool call injected by a malicious router runs with no human check, which amounts to arbitrary code execution [5].
*   **Defense Strategies:** To mitigate these attacks, researchers introduced a defense proxy called "Mine" that combines a high-risk tool policy gate to block unauthorized domains, response anomaly screening, and append-only transparency logging for forensic tracking [6-8]. Teams are advised to audit their routers, mandate human confirmation for sensitive commands, rotate exposed credentials, and log all requests [9].
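
The two defenses described above, a policy gate for high-risk tools and an append-only log, can be sketched roughly as follows. This is not the "Mine" proxy itself: the tool names, the allowlisted domains, and the hash-chained log format are all assumptions made for illustration.

```python
import hashlib
import json
import re
from urllib.parse import urlparse

# Hypothetical allowlist and tool names, chosen for illustration only.
ALLOWED_DOMAINS = {"api.github.com", "pypi.org"}
HIGH_RISK_TOOLS = {"shell", "http_request"}
URL_RE = re.compile(r"https?://[^\s\"']+")


def extract_domains(arguments):
    """Collect hostnames from URLs found in any string argument value."""
    domains = set()
    for value in arguments.values():
        if isinstance(value, str):
            for url in URL_RE.findall(value):
                host = urlparse(url).hostname
                if host:
                    domains.add(host)
    return domains


def gate_tool_call(tool_call):
    """Return (allowed, reason) for a dict with 'name' and 'arguments' keys."""
    if tool_call["name"] not in HIGH_RISK_TOOLS:
        return True, "low-risk tool"
    blocked = extract_domains(tool_call["arguments"]) - ALLOWED_DOMAINS
    if blocked:
        return False, "unapproved domains: " + ", ".join(sorted(blocked))
    return True, "all domains approved"


class AppendOnlyLog:
    """Hash-chained log: each entry commits to the previous entry's hash."""

    GENESIS = "0" * 64

    def __init__(self):
        self.entries = []

    def _digest(self, prev_hash, payload):
        return hashlib.sha256((prev_hash + payload).encode()).hexdigest()

    def append(self, record):
        prev_hash = self.entries[-1]["hash"] if self.entries else self.GENESIS
        payload = json.dumps(record, sort_keys=True)
        self.entries.append(
            {"prev": prev_hash, "payload": payload, "hash": self._digest(prev_hash, payload)}
        )

    def verify(self):
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev_hash = self.GENESIS
        for entry in self.entries:
            if entry["prev"] != prev_hash:
                return False
            if entry["hash"] != self._digest(prev_hash, entry["payload"]):
                return False
            prev_hash = entry["hash"]
        return True
```

A proxy built along these lines would call `gate_tool_call` on every tool call found in a router response and append both the call and the decision to the log, so that later tampering with a recorded entry shows up when `verify()` is run.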

### Anthropic Safety Overseer Gets Board Majority at Last by Elena Marchetti
*   **Board Restructuring:** Anthropic's Long-Term Benefit Trust (LTBT) has officially secured a majority on the company's Board of Directors by appointing Vas Narasimhan, the CEO of Novartis [10, 11]. 
*   **Governance Innovation:** The LTBT is an independent group with no financial stake in the company's equity, designed to ensure that the company's commitment to safety is prioritized over investor profits and rapid growth [12-14]. Narasimhan was chosen due to his decades of experience managing breakthrough technologies and safety thresholds in highly regulated healthcare environments [15, 16].
*   **Ongoing Concerns:** Despite the milestone, critics note that the Trust Agreement has never been fully published, raising questions about whether major investors like Google and Amazon maintain supermajority rights to override the Trust [17-19]. The durability of this safety-first governance structure will be tested as Anthropic prepares for an IPO later in the year at an estimated valuation of $400-500 billion [19, 20].

### Claude Mythos Preview - Anthropic's Restricted Frontier by James Kowalski
*   **Restricted Access:** Anthropic released Claude Mythos Preview, its most capable model to date, but has **strictly limited access to just 52 organizations** under a cybersecurity initiative called Project Glasswing [21, 22]. The model is not publicly available due to its unprecedented ability to autonomously discover and exploit zero-day vulnerabilities in software [23, 24].
*   **Unmatched Performance:** The model boasts a **93.9% score on SWE-bench Verified**, surpassing the next best public model by over 13 points, and features a 1 million token context window [22, 25]. It also excels at reasoning tasks, achieving 94.6% on GPQA Diamond and 97.6% on USAMO 2026 [25, 26].
*   **Pricing:** For the select organizations that have access to it, Mythos is priced at **$25 per million input tokens and $125 per million output tokens**, making it five times more expensive than Claude Opus 4.6 [22, 27, 28]. 
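
To make the quoted rates concrete, the following sketch computes the cost of a single request at the stated Mythos prices. The token counts in the usage note are hypothetical, and the function ignores any discounts or caching tiers, which the summary does not describe.

```python
# Rates quoted in the summary above, in USD per million tokens.
MYTHOS_INPUT_PER_M = 25.0
MYTHOS_OUTPUT_PER_M = 125.0


def request_cost(input_tokens, output_tokens,
                 input_per_m=MYTHOS_INPUT_PER_M,
                 output_per_m=MYTHOS_OUTPUT_PER_M):
    """Cost in USD of one request at per-million-token rates."""
    input_cost = input_tokens * input_per_m / 1_000_000
    output_cost = output_tokens * output_per_m / 1_000_000
    return input_cost + output_cost
```

For a hypothetical request with 200,000 input tokens and 10,000 output tokens, `request_cost(200_000, 10_000)` gives $5.00 for input plus $1.25 for output, or $6.25 in total.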

### How to Build AI Presentations - A Beginner's Guide by Priya Raghavan
*   **Fast Content Creation:** AI presentation tools turn a single text prompt into a full, visually consistent slide deck, including an outline, text, and images, in just 30 to 60 seconds [29, 30].
*   **Top Tools:** **Gamma** is recommended as the best free tool for beginners because it requires no design skills and offers 400 free AI credits [29, 31]. For those with existing subscriptions, **Copilot in PowerPoint** (Microsoft 365) and **Gemini in Google Slides** (Google Workspace) offer powerful, built-in presentation generators at no extra cost [32-34].
*   **Best Practices:** To get the best results, users should specify the audience, the desired length of the presentation, and the structural format in their initial prompt [35]. Crucially, users must **always review and edit the output**, as AI tends to use generic language and can generate inaccurate statistics [30, 36, 37].
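
The prompt advice above (state the audience, the length, and the structure) can be captured in a small template helper. This is only a sketch: the function name, parameters, and wording are assumptions, not a format required by Gamma, Copilot, or Gemini.

```python
def build_deck_prompt(topic, audience, slide_count, structure):
    """Assemble a presentation prompt that names audience, length, and structure."""
    sections = ", ".join(structure)
    return (
        f"Create a {slide_count}-slide presentation about {topic}. "
        f"The audience is {audience}. "
        f"Use this structure: {sections}. "
        "Mark any statistic you include so it can be verified before presenting."
    )
```

A call such as `build_deck_prompt("quarterly sales results", "the regional sales team", 8, ["summary", "results by region", "next steps"])` yields a prompt that covers all three recommended details; the generated deck still needs a manual review.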

### Leaked Screenshots Show Anthropic Building a Lovable Killer by Sophie Zhang
*   **Direct Threat to Competitors:** Leaked screenshots reveal that Anthropic is developing a native full-stack application builder integrated into the Claude interface, positioning it to compete directly with billion-dollar "vibe-coding" startups like Lovable [38-40].
*   **Feature Set:** The leaked UI shows a comprehensive platform that goes beyond prompt-to-prototype generation, featuring a template gallery, live browser previews, one-click publishing, and a built-in infrastructure panel handling databases, user management, and storage [40, 41].
*   **Strategic Advantage:** Anthropic holds a massive structural advantage over its competitors because it faces **zero model licensing costs** and can seamlessly integrate this app-building environment with existing Claude features, while startups must pay retail prices for Anthropic's intelligence [42]. 

### Linux Kernel Finally Sets Rules for AI-Assisted Code by Sophie Zhang
*   **Official AI Policy:** The Linux 7.0 release introduces the kernel community's first formal policy on AI-generated code submissions, aiming to maintain accountability without banning AI tools [43, 44].
*   **Human Accountability:** The policy enforces that **only humans can legally certify the Developer Certificate of Origin (DCO) using a "Signed-off-by" tag**, making the human submitter fully liable for verifying the code and addressing any bugs [44-46]. 
*   **Disclosure and Quality:** Developers are encouraged to disclose their use of AI tools via an **"Assisted-by" tag** [44, 47]. The Linux kernel community has made it explicitly clear that low-quality, unreviewed AI patches, often referred to as "AI slop", are entirely unwelcome [44, 46].
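
As an illustration of the two tags discussed above, the sketch below scans the trailer block of a commit message for a `Signed-off-by` line and an `Assisted-by` line. It is not a kernel tool: the parsing rules are simplified, the helper names are made up, and the sample commit message and the value after `Assisted-by:` are hypothetical.

```python
import re

# A trailer line looks like "Key: value"; keys contain letters and hyphens only.
TRAILER_RE = re.compile(r"^([A-Za-z][A-Za-z-]*):\s*(.+)$")


def parse_trailers(message):
    """Return (key, value) pairs from the last paragraph of a commit message."""
    paragraphs = message.strip().split("\n\n")
    if len(paragraphs) < 2:
        return []  # subject line only, so there is no trailer block
    trailers = []
    for line in paragraphs[-1].splitlines():
        match = TRAILER_RE.match(line.strip())
        if match:
            trailers.append((match.group(1), match.group(2)))
    return trailers


def check_commit(message):
    """Flag a missing human sign-off and report whether AI use is disclosed."""
    keys = [key.lower() for key, _ in parse_trailers(message)]
    problems = []
    if "signed-off-by" not in keys:
        problems.append("missing Signed-off-by from a human submitter")
    return {"problems": problems, "discloses_ai": "assisted-by" in keys}


# Hypothetical commit message used only to exercise the checker.
SAMPLE = """example: fix off-by-one in helper

Longer description of the change.

Assisted-by: ExampleAssistant
Signed-off-by: Jane Developer <jane@example.org>
"""
```

Running `check_commit(SAMPLE)` reports no problems and `discloses_ai` as true. A check like this only confirms that the tags are present; responsibility for reviewing the code stays with the person who signed off.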

### MoE Myths, Context Compression, and Steering Proofs by Elena Marchetti
*   **The Myth of MoE Specialization:** A recent paper demonstrates that in Mixture of Experts (MoE) models, expert routing is driven by representation geometry rather than semantic specialization; experts do not cleanly specialize in categories like "math" or "code" as previously assumed [48, 49].
*   **MEMENTO Context Management:** A novel training method called MEMENTO enables LLMs to compress their own reasoning traces into dense summaries [50, 51]. This technique can cut peak KV cache usage by a factor of 2.5 and nearly double inference throughput while maintaining accuracy [51, 52].
*   **Activation Steering Reality:** Research proves that activation steering, which injects vectors into a model's activations to alter its behavior, pushes the model into states that cannot be reached by any textual prompt [52, 53]. This means that white-box steering and black-box prompting are formally distinct and cannot be treated interchangeably by researchers [48, 54].
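
The distinction between steering and prompting can be illustrated with a deliberately tiny toy. This is not the paper's construction or proof: the two-dimensional embeddings, the "model" that averages token embeddings, and the steering vector are all assumptions chosen so that the reachable set is easy to see.

```python
from itertools import product

# Hypothetical 2-D token embeddings for a three-token vocabulary.
EMBEDDINGS = {"a": (1.0, 0.0), "b": (0.0, 1.0), "c": (-1.0, 0.0)}


def hidden_state(prompt):
    """Toy 'model': the hidden state is the mean of the prompt's token embeddings."""
    n = len(prompt)
    x = sum(EMBEDDINGS[token][0] for token in prompt) / n
    y = sum(EMBEDDINGS[token][1] for token in prompt) / n
    return (x, y)


def steer(state, vector, alpha):
    """Activation steering: add a scaled vector directly to the hidden state."""
    return (state[0] + alpha * vector[0], state[1] + alpha * vector[1])


def max_prompt_y(max_len):
    """Largest second coordinate reachable by any prompt of up to max_len tokens."""
    best = float("-inf")
    for length in range(1, max_len + 1):
        for prompt in product(EMBEDDINGS, repeat=length):
            best = max(best, hidden_state(prompt)[1])
    return best


steered = steer(hidden_state(("a", "b")), (0.0, 1.0), alpha=2.0)
```

In this toy, every prompt yields an average of the embeddings, so its second coordinate can never exceed 1.0, while the steered state has a second coordinate of 2.5. The steered state therefore lies outside anything a prompt can produce, which is the kind of gap between white-box and black-box control that the bullet above describes.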

### NVIDIA Ising: Open AI for Quantum Error Correction by Sophie Zhang
*   **Automating Quantum Hardware:** NVIDIA released "Ising," a suite of open-source AI models designed to manage the extreme noise and instability of quantum processors [55, 56]. 
*   **Calibration and Decoding:** The suite features a 35-billion-parameter vision-language model capable of cutting hardware calibration time from days to just hours [55, 57]. It also includes 3D CNN decoder models that can handle real-time quantum error correction up to 2.5 times faster or 3 times more accurately than PyMatching, the current industry standard [55, 58, 59].
*   **Industry Integration:** The models seamlessly integrate with NVIDIA's CUDA-Q platform and its NVQLink hardware interconnect [60]. They are already being adopted by major academic institutions and commercial startups, establishing AI as the operational control plane for quantum machines [61, 62].

### Novo Nordisk Bets Its Drug Pipeline on OpenAI by Elena Marchetti
*   **A Sweeping Partnership:** Novo Nordisk has entered a comprehensive partnership with OpenAI to integrate frontier models across its drug discovery, manufacturing, supply chain, and corporate operations [63, 64].
*   **AI in R&D:** OpenAI's technology will be used to simulate physical tests and analyze massive genomic and biological datasets to predict the efficacy of potential drug candidates before clinical trials begin [64, 65]. 
*   **Governance Concerns:** The partnership lacks specific details regarding data governance, sparking concerns about how OpenAI's models will process highly regulated patient and clinical data [64, 66, 67]. Furthermore, the deal comes at a time when Novo Nordisk is cutting 9,000 jobs in an effort to save $1.3 billion annually [64, 68].

### Overall LLM Rankings: April 2026 by James Kowalski
*   **The New #1:** **GPT-5.4 has taken the overall top spot** due to its unmatched balance of reasoning (92.8% GPQA Diamond), coding (77.2% SWE-Bench), and affordability ($2.50/$15.00 per million tokens) [69-71].
*   **Category Leaders:** **Gemini 3.1 Pro** offers the best reasoning per dollar, holding the highest GPQA Diamond score (94.3%) among public models [69, 71, 72]. **Claude Opus 4.6** continues to lead in coding benchmarks (80.8% SWE-Bench) and human preference voting on Chatbot Arena [69, 72, 73].
*   **The Open-Weight Surge:** Open-weight models now occupy five of the top twelve spots on the leaderboard [74]. Most notably, Google's free **Gemma 4 31B** outperforms several proprietary mid-tier models on human preference and benchmarks [69, 75, 76], while **DeepSeek V3.2** remains the undisputed value king by offering near-frontier performance for a fraction of the cost [72, 75].