The generative AI era began for most people with the launch of OpenAI's ChatGPT in late 2022, but the underlying technology dates back to Google's seminal 2017 paper "Attention Is All You Need." That paper introduced the "Transformer" neural network architecture, which allows AI models to weigh the importance of different words in a sentence (or pixels in an image) differently and to train on information in parallel.

Yet while Transformers deliver unparalleled model quality and underpin most of the major generative AI models in use today, they are computationally gluttonous. They carry quadratic compute and linear memory demands that make large-scale inference an expensive, often prohibitive, endeavor. That cost led some researchers to develop a new architecture, Mamba, in 2023, which has since been incorporated into hybrid Mamba-Transformer models such as Nvidia's Nemotron 3 Super.

Now the researchers behind the original Mamba architecture, including project leads Albert Gu of Carnegie Mellon and Tri Dao of Princeton, have released the latest version, Mamba-3, as a language model under a permissive Apache 2.0 open source license, making it immediately available to developers, including enterprises, for commercial purposes. A technical paper has also been published on arXiv.org.

The model signals a paradigm shift from training efficiency to an "inference-first" design. As Gu noted in the official announcement, while Mamba-2 focused on breaking pretraining bottlenecks, Mamba-3 aims to solve the "cold GPU" problem: the reality that during decoding, modern hardware often sits idle, waiting for memory movement rather than performing computation.

Perplexity (no, not the company) and the newfound efficiency of Mamba-3

Mamba, including Mamba-3, is a type of State Space Model (SSM). SSMs are effectively a high-speed "summary machine" for AI.
While many popular models (like the ones behind ChatGPT) have to re-examine every single word they've already seen to understand what comes next, which gets slower and more expensive the longer the conversation lasts, an SSM maintains a compact, ever-changing internal state. This state is essentially a digital "mental snapshot" of the entire history of the data. As new information flows in, the model simply updates this snapshot instead of re-reading everything from the beginning. This allows the AI to process massive amounts of information, such as entire libraries of books or long strands of DNA, with incredible speed and much lower memory requirements.

To appreciate the leap Mamba-3 represents, one must first understand perplexity, the primary metric used in the research to measure model quality. In language modeling, perplexity measures how "surprised" a model is by new data. Think of a model as a professional gambler: if it has high perplexity, it is unsure where to place its bets, seeing many possible next words as equally likely. A lower perplexity score indicates that the model is more "certain," with a better grasp of the underlying patterns of human language. For AI builders, perplexity serves as a high-fidelity proxy for intelligence.

The breakthrough reported in the Mamba-3 research is that it achieves perplexity comparable to its predecessor, Mamba-2, while using only half the state size. This means a model can be just as smart while being twice as efficient to run.

A new philosophy

The philosophy guiding Mamba-3 is a fundamental shift in how we think about AI "intelligence" relative to the speed of the hardware it runs on. While the previous generation, Mamba-2, was designed to be trained at record-breaking speeds, Mamba-3 is an "inference-first" architecture. Inference refers to the way AI models are served to end users, through websites like ChatGPT or Google Gemini, or through application programming interfaces (APIs).
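To make the perplexity metric described above concrete: it is simply the exponential of the average negative log-probability the model assigned to each token that actually came next. A minimal sketch, with made-up probabilities rather than anything measured from Mamba-3:

```python
import math

def perplexity(token_probs):
    """Perplexity = exp of the mean negative log-probability that the
    model assigned to each token that actually occurred."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

# A confident model concentrates probability on the right tokens,
# yielding low perplexity; an unsure model spreads its bets thin.
confident = perplexity([0.8, 0.9, 0.7, 0.85])
unsure = perplexity([0.1, 0.2, 0.05, 0.15])
assert confident < unsure
```

A model that always assigned probability 0.5 to the true next token would score a perplexity of exactly 2: on average it is as "surprised" as a fair coin flip.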
Mamba-3's primary goal is to maximize every second the computer chip (GPU) is active, ensuring that the model is thinking as hard as possible without making the user wait for an answer.

In the world of language models, every point of accuracy is hard-won. At the 1.5-billion-parameter scale, the most advanced "MIMO" variant of Mamba-3 achieved a 57.6% average accuracy across benchmarks, a 2.2-percentage-point leap over the industry-standard Transformer. While a two-point jump might sound modest, it represents a nearly 4% relative improvement over the Transformer baseline. Even more impressively, as noted above, Mamba-3 can match the predictive quality of its predecessor while using only half the internal "state size," effectively delivering the same level of intelligence with significantly less memory overhead.

For years, efficient alternatives to Transformers suffered from a "logic gap": they often failed at simple reasoning tasks, such as keeping track of patterns or solving basic arithmetic, because their internal math was too rigid. Mamba-3 addresses this by introducing complex-valued states. This mathematical upgrade acts like an internal compass, allowing the model to represent "rotational" logic. Using this "rotary" approach, Mamba-3 can near-perfectly solve logic puzzles and state-tracking tasks that its predecessors could only guess at, finally bringing the reasoning power of linear models on par with the most advanced systems.

The final piece of the puzzle is how Mamba-3 interacts with physical hardware. Most AI models today are "memory-bound," meaning the computer chip spends most of its time idle, waiting for data to move from memory to the processor. Mamba-3 introduces a Multi-Input, Multi-Output (MIMO) formulation that fundamentally changes this dynamic. By performing up to four times more mathematical operations in parallel during each step, Mamba-3 puts that previously "idle" power to work.
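The fixed-size recurrent state that all of these upgrades build on can be sketched as a simple linear recurrence: the model keeps one state vector and updates it in place for every token, so memory stays constant no matter how long the sequence grows. The names and dimensions below are illustrative only, not Mamba-3's actual (data-dependent) parameterization:

```python
import numpy as np

def ssm_scan(A, B, C, inputs):
    """Run a minimal, time-invariant linear state-space recurrence.

    h_t = A @ h_{t-1} + B @ x_t   (update the fixed-size "snapshot")
    y_t = C @ h_t                 (read a prediction out of it)

    Memory is O(state_size), independent of sequence length; a
    Transformer's KV cache would instead grow with len(inputs).
    """
    h = np.zeros(A.shape[0])
    outputs = []
    for x in inputs:              # one constant-cost update per token
        h = A @ h + B @ x
        outputs.append(C @ h)
    return np.stack(outputs), h

rng = np.random.default_rng(0)
d_state, d_in, seq_len = 8, 4, 100
A = 0.9 * np.eye(d_state)                 # decaying "memory" of the past
B = rng.normal(size=(d_state, d_in))
C = rng.normal(size=(2, d_state))
ys, h = ssm_scan(A, B, C, rng.normal(size=(seq_len, d_in)))
assert ys.shape == (seq_len, 2) and h.shape == (d_state,)
```

Doubling `seq_len` doubles compute but leaves the state, and hence the memory footprint, untouched; that is the efficiency appeal, and also the compression constraint, discussed next.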
This allows the model to do significantly more "thinking" for every word it generates without increasing the time a user spends waiting for a response. More on each of these below.

Three new technological leaps

The appeal of linear models has always been their constant memory requirements and linear compute scaling. However, as the Mamba-3 authors point out, there is "no free lunch": by fixing the state size to ensure efficiency, these models are forced to compress all historical context into a single representation, the exact opposite of a Transformer's ever-growing KV cache. Mamba-3 pulls three specific levers to make that fixed state do more work.

1. Exponential-Trapezoidal Discretization

State Space Models are fundamentally continuous-time systems that must be "discretized" to handle the discrete sequences of digital data. Previous iterations relied on "exponential-Euler" discretization, a heuristic that provided only a first-order approximation of the system. Mamba-3 introduces a generalized trapezoidal rule, providing a second-order accurate approximation. This is not just a mathematical refinement; it induces an "implicit convolution" within the core recurrence. By combining this with explicit B and C bias terms, the researchers were able to remove the short causal convolution that has been a staple of recurrent architectures for years.

2. Complex-Valued SSMs and the "RoPE Trick"

One of the most persistent criticisms of linear models has been their inability to solve simple state-tracking tasks, such as determining the parity of a bit sequence. This failure stems from restricting the transition matrix to real numbers, which prevents the model from representing "rotational" dynamics. Mamba-3 overcomes this by viewing the underlying SSM as complex-valued.
Using what the team calls the "RoPE trick," they demonstrate that a complex-valued state update is mathematically equivalent to a data-dependent rotary embedding (RoPE) applied to the input and output projections. This allows Mamba-3 to solve synthetic reasoning tasks that were impossible for Mamba-2.

3. MIMO: Boosting Arithmetic Intensity

The most significant leap in inference efficiency comes from the transition from Single-Input, Single-Output (SISO) to Multi-Input, Multi-Output (MIMO) SSMs. In a standard SSM, the state update is an outer-product operation that is heavily memory-bound. By switching to a matrix-multiplication-based state update, Mamba-3 increases the "arithmetic intensity" of the model: the ratio of FLOPs to memory traffic. This allows the model to perform more computation during the memory-bound decoding phase. Essentially, Mamba-3 uses the "idle" compute cores of the GPU to increase model power for "free," maintaining the same decoding speed as its simpler predecessors.

What Mamba-3 means for enterprises and AI builders

For enterprises, Mamba-3 represents a strategic shift in the total cost of ownership (TCO) for AI deployments.

Cost vs. performance: At matched parameter counts, Mamba-3 (MIMO) matches the perplexity of Mamba-2 while using half the state size. For enterprise deployment, this effectively doubles inference throughput for the same hardware footprint.

Agentic workflows: As organizations move toward parallel, agentic workflows (such as automated coding or real-time customer service agents), the demand for low-latency generation rises steeply. Mamba-3 is designed specifically to prevent GPU hardware from sitting "cold" during these tasks.

The hybrid advantage: The researchers predict that the future of enterprise AI lies in hybrid models. By interleaving Mamba-3 layers with self-attention, organizations can combine the efficient "memory" of SSMs with the precise "database" storage of Transformers.
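Why rotational dynamics matter for state tracking can be shown with a toy example (a deliberately simplified illustration, not Mamba-3's actual kernel). A unit-modulus complex state can track the parity of a bit stream exactly: each 1-bit rotates the state by pi radians, i.e. multiplies it by e^{i*pi} = -1. A real-valued state with a positive decay factor can only shrink, never flip sign, so it cannot represent this:

```python
import cmath

def parity_rotational(bits):
    """Track running parity with a unit-modulus complex state.

    Toy illustration of "rotational" dynamics: each 1-bit rotates the
    state by pi (multiplies by e^{i*pi} = -1), so the state ends near +1
    for even parity and near -1 for odd parity.
    """
    state = 1 + 0j
    for b in bits:
        if b == 1:
            state *= cmath.exp(1j * cmath.pi)  # rotate by pi radians
    return 0 if state.real > 0 else 1

bits = [1, 0, 1, 1, 0, 1]
assert parity_rotational(bits) == sum(bits) % 2  # four 1s -> even -> 0
```

With the transition restricted to positive real numbers, the state's sign is fixed from the start, which is exactly the rigidity the complex-valued formulation removes.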
Availability, licensing, and usage

Mamba-3 is not merely a theoretical research paper; it is a fully realized, open-source release available for immediate use, with model code published on GitHub. The project is released under the Apache 2.0 license, a permissive, business-friendly license that allows free usage, modification, and commercial distribution without requiring the disclosure of proprietary source code. The release is well suited to developers building long-context applications or real-time reasoning agents, and to those seeking to reduce GPU costs in high-volume production environments.

Leading the State Space Model (SSM) revolution

The release was met with enthusiasm on social media, particularly regarding the "student-led" nature of the project. Gu, whose X/Twitter bio describes him as "leading the ssm revolution," gave full credit to the student leads, including Aakash Lahoti and Kevin Y. Li. Gu's thread highlighted the team's satisfaction with the design: "We're quite happy with the final model design! The three core methodological changes are inspired by (imo) some elegant math and methods."

As agentic workflows push inference demand "through the roof," the arrival of Mamba-3 suggests that the future of AI may not just be about having the biggest model, but about having the most efficient one. Mamba-3 has realigned the SSM with the realities of modern hardware, proving that even in the age of the Transformer, the principles of classical control theory still have a vital role to play.