CrowdStrike acquires Israel-based browser security startup Seraphic, a source says for around $400M; Seraphic has raised around $37M in total (Meir Orbach/CTech)

Meir Orbach / CTech: CrowdStrike acquires Israel-based browser security startup Seraphic, a source says for around $400M; Seraphic has raised around $37M in total — Founded in 2020, the Israeli company focuses on embedding security directly into the browser. — CrowdStrike is acquiring Israeli startup Seraphic Security.

How to quickly compress image files in Windows 11 before sharing

Thanks to the latest Windows 11 update, you can now reduce the size of image files before sharing them with one of the applications offered in the Share dialog. To do this, right-click the image file and go to “Share with -> Using more options”. In the window that opens, you will find a small menu below the name of the image file, which shows “Original” by default. Click the small arrow to access the compression levels “Low”, “Medium”, and “High”. When you select one, Explorer displays the new, compressed file size at which it passes the file on to the desired application.
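If you prefer to script this step instead of using the Share dialog, a rough equivalent is possible in Python with the Pillow library. This is a minimal sketch, not Windows' implementation; the quality presets and the file name are assumptions chosen for illustration.

```python
from pathlib import Path
from PIL import Image

# Illustrative quality presets; Windows' Low/Medium/High compression levels are not
# documented here, so these numbers are assumptions (more compression = lower quality).
PRESETS = {"low": 85, "medium": 60, "high": 30}

def compress_copies(src: str) -> None:
    path = Path(src)
    img = Image.open(path).convert("RGB")
    for level, quality in PRESETS.items():
        out = path.with_name(f"{path.stem}_{level}.jpg")
        img.save(out, "JPEG", quality=quality, optimize=True)
        print(f"{level}: {out.name} is {out.stat().st_size / 1024:.0f} KB")

if __name__ == "__main__":
    compress_copies("photo.jpg")   # hypothetical file name
```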

DeepSeek’s conditional memory fixes silent LLM waste: GPU cycles lost to static lookups

When an enterprise LLM retrieves a product name, technical specification, or standard contract clause, it's using expensive GPU computation designed for complex reasoning — just to access static information. This happens millions of times per day. Each lookup wastes cycles and inflates infrastructure costs.

DeepSeek's newly released research on "conditional memory" addresses this architectural limitation directly. The work introduces Engram, a module that separates static pattern retrieval from dynamic reasoning, and it delivers results that challenge assumptions about what memory is actually for in neural networks. The paper was co-authored by DeepSeek founder Liang Wenfeng.

Through systematic experiments, DeepSeek found the optimal balance between computation and memory, with 75% of sparse model capacity allocated to dynamic reasoning and 25% to static lookups. This memory system improved reasoning more than knowledge retrieval: complex reasoning benchmarks jumped from 70% to 74% accuracy, while knowledge-focused tests improved from 57% to 61%. These improvements came from tests including Big-Bench Hard, ARC-Challenge, and MMLU.

The research arrives as enterprises face mounting pressure to deploy more capable AI systems while navigating GPU memory constraints and infrastructure costs. DeepSeek's approach offers a potential path forward by fundamentally rethinking how models should be structured.

How conditional memory solves a different issue than agentic memory and RAG

Agentic memory systems, sometimes referred to as contextual memory — like Hindsight, MemOS, or Memp — focus on episodic memory. They store records of past conversations, user preferences, and interaction history. These systems help agents maintain context across sessions and learn from experience. But they're external to the model's forward pass and don't optimize how the model internally processes static linguistic patterns.

For Chris Latimer, founder and CEO of Vectorize, which developed Hindsight, the conditional memory approach used in Engram solves a different problem than agentic AI memory. "It's not solving the problem of connecting agents to external memory like conversation histories and knowledge stores," Latimer told VentureBeat. "It's more geared towards squeezing performance out of smaller models and getting more mileage out of scarce GPU resources."

Conditional memory tackles a fundamental issue: Transformers lack a native knowledge lookup primitive. When processing text, they must simulate retrieval of static patterns — named entities, technical terminology, common phrases — through expensive neural computation across multiple layers.

The DeepSeek paper illustrates this with a concrete example. Recognizing "Diana, Princess of Wales" consumes multiple layers of attention and feed-forward networks to progressively compose features. The model essentially uses deep, dynamic logic circuits to perform what should be a simple hash table lookup. It's like using a calculator to work out your phone number rather than just looking it up.

"The problem is that Transformer lacks a 'native knowledge lookup' ability," the researchers write. "Many tasks that should be solved in O(1) time like retrieval have to be 'simulated for retrieval' through a large amount of computation, which is very inefficient."

How conditional memory works

Engram introduces "conditional memory" to work alongside MoE's conditional computation. The mechanism is straightforward. The module takes sequences of two to three tokens and uses hash functions to look them up in a massive embedding table. Retrieval happens in constant time, regardless of table size.

But retrieved patterns need filtering. A hash lookup for "Apple" might collide with unrelated content, or the word might mean the fruit rather than the company. Engram solves this with a gating mechanism: the model's current understanding of context (accumulated through earlier attention layers) acts as a filter. If retrieved memory contradicts the current context, the gate suppresses it; if it fits, the gate lets it through. The module isn't applied at every layer — strategic placement balances performance gains against system latency.
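To make the mechanism concrete, here is a minimal sketch of the idea in PyTorch. It is not DeepSeek's Engram implementation; the table size, hash function, and gating form are assumptions. It only shows the two pieces the paper describes: a constant-time hashed lookup over short token n-grams and a context-conditioned gate that suppresses mismatched retrievals.

```python
import torch
import torch.nn as nn

class ConditionalMemorySketch(nn.Module):
    """Toy hashed n-gram memory with context gating (illustrative, not Engram itself)."""

    def __init__(self, d_model: int = 512, table_size: int = 65_536):
        super().__init__()
        # Static pattern store: looked up in O(1) time, independent of table size.
        self.table = nn.Embedding(table_size, d_model)
        # Context-conditioned gate deciding how much of the retrieved vector to admit.
        self.gate = nn.Linear(2 * d_model, d_model)

    def hash_bigrams(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq). Mix each token with its predecessor using a simple
        # multiplicative hash; collisions are expected and left to the gate to filter.
        prev = torch.roll(token_ids, shifts=1, dims=1)
        return (token_ids * 1_000_003 + prev) % self.table.num_embeddings

    def forward(self, token_ids: torch.Tensor, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq, d_model), the model's current contextual understanding.
        retrieved = self.table(self.hash_bigrams(token_ids))         # constant-time lookup
        gate = torch.sigmoid(self.gate(torch.cat([hidden, retrieved], dim=-1)))
        return hidden + gate * retrieved                             # suppress or pass through

# Usage: inject the gated memory into the residual stream at a few selected layers.
mem = ConditionalMemorySketch()
tokens = torch.randint(0, 50_000, (2, 16))
hidden = torch.randn(2, 16, 512)
out = mem(tokens, hidden)   # shape (2, 16, 512)
```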
This dual-system design raises a critical question: How much capacity should each get? DeepSeek's key finding is that the optimal split is 75-80% for computation and 20-25% for memory. Testing found that pure MoE (100% computation) was suboptimal: too much computation wastes depth reconstructing static patterns, while too much memory loses reasoning capacity.

Infrastructure efficiency: the GPU memory bypass

Perhaps Engram's most pragmatic contribution is its infrastructure-aware design. Unlike MoE's dynamic routing, which depends on runtime hidden states, Engram's retrieval indices depend solely on input token sequences. This deterministic nature enables a prefetch-and-overlap strategy.

"The challenge is that GPU memory is limited and expensive, so using bigger models gets costly and harder to deploy," Latimer said. "The clever idea behind Engram is to keep the main model on the GPU, but offload a big chunk of the model's stored information into a separate memory on regular RAM, which the model can use on a just-in-time basis."

During inference, the system can asynchronously retrieve embeddings from host CPU memory via PCIe while the GPU computes the preceding transformer blocks. Strategic layer placement uses the computation of early layers as a buffer to mask communication latency. The researchers demonstrated this with a 100B-parameter embedding table offloaded entirely to host DRAM, achieving throughput penalties below 3%. This decoupling of storage from compute addresses a critical enterprise constraint, as GPU high-bandwidth memory remains expensive and scarce.
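The prefetch-and-overlap pattern can also be sketched in a few lines of PyTorch. This illustrates the pattern rather than the paper's system: the embedding table lives in pinned host RAM, the rows needed later are copied to the GPU on a side CUDA stream (possible because the indices depend only on the input tokens), and the copy overlaps with the early transformer layers. The table size and layer count are arbitrary, and the code assumes a CUDA-capable GPU.

```python
import torch

d_model, table_size, seq = 512, 100_000, 16    # toy sizes, chosen arbitrarily

# Embedding table kept in pinned host RAM instead of scarce GPU HBM.
host_table = torch.randn(table_size, d_model, pin_memory=True)
copy_stream = torch.cuda.Stream()

def prefetch(indices: torch.Tensor) -> torch.Tensor:
    """Kick off an async host-to-GPU copy of the rows the model will need later."""
    rows = host_table.index_select(0, indices.cpu()).pin_memory()
    with torch.cuda.stream(copy_stream):
        return rows.to("cuda", non_blocking=True)

def run_layer(x: torch.Tensor) -> torch.Tensor:
    # Stand-in for a transformer block that runs while the copy is in flight.
    return torch.relu(x @ torch.randn(d_model, d_model, device="cuda"))

tokens = torch.randint(0, table_size, (seq,), device="cuda")
memory_rows = prefetch(tokens)                 # indices are known from the tokens alone
x = torch.randn(seq, d_model, device="cuda")
for _ in range(4):                             # early layers act as a latency buffer
    x = run_layer(x)
torch.cuda.current_stream().wait_stream(copy_stream)   # make sure the rows have arrived
x = x + memory_rows                            # inject the retrieved memory
```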
What this means for enterprise AI deployment

For enterprises evaluating AI infrastructure strategies, DeepSeek's findings suggest several actionable insights:

1. Hybrid architectures outperform pure approaches. The 75/25 allocation law indicates that optimal models should split sparse capacity between computation and memory.

2. Infrastructure costs may shift from GPU to memory. If Engram-style architectures prove viable in production, infrastructure investment patterns could change. The ability to store 100B+ parameters in CPU memory with minimal overhead suggests that memory-rich, compute-moderate configurations may offer better performance-per-dollar than pure GPU scaling.

3. Reasoning improvements exceed knowledge gains. The surprising finding that reasoning benefits more than knowledge retrieval suggests that memory's value extends beyond obvious use cases.

For enterprises leading AI adoption, Engram demonstrates that the next frontier may not be simply bigger models, but smarter architectural choices that respect the fundamental distinction between static knowledge and dynamic reasoning. The research suggests that optimal AI systems will increasingly resemble hybrid architectures.

Organizations waiting to adopt AI later in the cycle should monitor whether major model providers incorporate conditional memory principles into their architectures. If the 75/25 allocation law holds across scales and domains, the next generation of foundation models may deliver substantially better reasoning performance at lower infrastructure costs.

Pages, Numbers, and Keynote are still (mostly) free, but we don't like the trend

Apple's upcoming Creator Studio subscription adds paid features to Pages, Numbers, and Keynote, and we can see a bigger shift coming in how Apple treats its long-time free apps. On Tuesday, Apple announced the launch of Apple Creator Studio, a new monthly subscription that gives users access to a host of creativity and production apps. Priced at $12.99 a month, or $129 a year, the bundle gives users access to Logic Pro, Final Cut Pro, and Pixelmator Pro. But those aren't the only apps included in the suite. Apple is also bundling "premium content" for Keynote, Pages, Numbers, and Freeform, apps collectively referred to as iWork.

Buy in chat: Google adds ‘Checkout’ to Gemini and Search’s AI Mode

Google is launching a “Checkout” feature in its Gemini AI chatbot as well as in Google Search’s AI Mode, according to a recent blog post. The feature allows users to purchase products without leaving the chat or search interface. Purchases can be completed with Google Pay or PayPal. At the same time, Google is also unveiling its Universal Commerce Protocol (UCP), an open standard that enables different AI agents, payment systems, and shops to work together seamlessly. It’s also compatible with existing protocols, including Agent2Agent (A2A), Agent Payments Protocol (AP2), and Model Context Protocol (MCP). The protocol was developed in collaboration with retailers such as Shopify, Etsy, Walmart, and Target, and it’s supported by over 20 other companies, including Mastercard and Visa. Checkout in Gemini and AI Mode is currently available to US users.