How Talks Between Anthropic and the Defense Dept. Fell Apart
The Pentagon and Anthropic were close to agreeing on the use of artificial intelligence. But strong personalities, mutual dislike and a rival company unraveled a deal.
The Pentagon and Anthropic were close to agreeing on the use of artificial intelligence. But strong personalities, mutual dislike and a rival company unraveled a deal.
Six newly-created accounts made a profit of $1 million by correctly betting that the U.S. would strike Iran by February 28.
With many users feeling uneasy about Discord's new age verification requirement, here are some alternatives that could be worth exploring.
Resident Evil Requiem's ending packs a lot of information. It also bears with it sweeping changes to the series' lore, from Ozwell Spencer to the real reason Raccoon City was bombed. But don't worry, this video will make sense of all that, including its many retcons to the franchise.
Homescreen Heroes: Purpose is an AI mentor that provides personalized guidance
AI is evolving beyond a helpful tool to an autonomous agent, creating new risks for cybersecurity systems. Alignment faking is a new threat where AI essentially “lies” to developers during the training process. Traditional cybersecurity measures are unprepared to address this new development. However, understanding the reasons behind this behavior and implementing new methods of training and detection can help developers work to mitigate risks . Understanding AI alignment faking AI alignment occurs when AI performs its intended function, such as reading and summarizing documents, and nothing more. Alignment faking is when AI systems give the impression they are working as intended, while doing something else behind the scenes. Alignment faking usually happens when earlier training conflicts with new training adjustments. AI is typically “rewarded” when it performs tasks accurately. If the training changes, it may believe it will be “punished” if it does not comply with the original training. Therefore, it tricks developers into thinking it is performing the task in the required new way, but it will not actually do so during deployment. Any large language model (LLM) is capable of alignment faking. A study using Anthropic’s AI model Claude 3 Opus revealed a common example of alignment faking. The system was trained using one protocol, then asked to switch to a new method. In training, it produced the new, desired result. However, when developers deployed the system, it produced results based on the old method. Essentially, it resisted departing from its original protocol , so it faked compliance to continue performing the old task. Since researchers were specifically studying AI alignment faking, it was easy to spot. The real danger is when AI fakes alignment without developers’ knowledge. This leads to many risks, especially when people use models for sensitive tasks or in critical industries. The risks of alignment faking Alignment faking is a new and significant cybersecurity risk, posing numerous dangers if undetected. Given that only 42% of global business leaders feel confident in their ability to use AI effectively to begin with, the chances of a lack of detection are high. Affected models can exfiltrate sensitive data, create backdoors and sabotage systems — all while appearing functional. AI systems can also evade security and monitoring tools when they believe people are monitoring them and perform the incorrect tasks anyway. Models programmed to perform malicious actions can be challenging to detect because the protocol is only activated under specific conditions. If the AI lies about the conditions, it is hard to verify its validity. AI models can perform dangerous tasks after successfully convincing cybersecurity professionals that they work. For instance, AI in health care can misdiagnose patients. Others can present bias in credit scoring when utilized in financial sectors. Vehicles that use AI can prioritize efficiency over passengers’ safety. Alignment faking presents significant issues if undetected. Why current security protocols miss the mark Current AI cybersecurity protocols are unprepared to handle alignment faking. They are often used to detect malicious intent , which these AI models lack. They are simply following their old protocol. Alignment faking also prevents behavior-based anomaly protection by performing seemingly harmless deviations that professionals overlook. Cybersecurity professionals must upgrade their protocols to address this new challenge. Incident response plans exist to address issues related to AI. However, alignment faking can circumvent this process, as it provides little indication that there is even a problem. Currently, there are no established detection protocols for alignment faking because AI actively deceives the system. As cybersecurity professionals develop methods to identify deception, they should also update their response plans. How to detect alignment faking The key to detecting alignment faking is to test and train AI models to recognize this discrepancy and prevent alignment faking on their own. Essentially, they need to understand the reasoning behind the protocol changes and comprehend the ethics involved. AI’s functionality depends on its training data , so the initial data must be adequate. Another way to combat alignment faking is by creating special teams that uncover hidden capabilities. This requires properly identifying issues and conducting tests to trick AI into showing its true intentions. Cybersecurity professionals must also perform continuous behavioral analyses of deployed AI models to ensure they perform the correct task without questionable reasoning. Cybersecurity professionals may need to develop new AI security tools to actively identify alignment faking. They must design the tools to provide a deeper layer of scrutiny than the current protocols. Some methods are deliberative alignment and constitutional AI. Deliberative alignment teaches AI to “think” about safety protocols, and constitutional AI gives systems rules to follow during training. The most effective way to prevent alignment faking would be to stop it from the beginning. Developers are continuously working to improve AI models and equip them with enhanced cybersecurity tools. From preventing attacks to verifying intent Alignment faking presents a significant impact that will only grow as AI models become more autonomous. To move forward, the industry must prioritize transparency and develop robust verification methods that go beyond surface-level testing. This includes creating advanced monitoring systems and fostering a culture of vigilant, continuous analysis of AI behavior post-deployment. The trustworthiness of future autonomous systems depends on addressing this challenge head-on. Zac Amos is the Features Editor at ReHack .
Most people don’t appreciate the profound threat that AI will soon pose to human agency . A common refrain is that “AI is just a tool,” and like any tool, its benefits and dangers depend on how people use it. This is old-school thinking. AI is transitioning from tools we use to prosthetics we wear. This will create significant new threats we’re just not prepared for. No, I’m not talking about creepy brain implants. These AI-powered prosthetics will be mainstream products we buy from Amazon or the Apple Store and marketed with friendly names like “assistants,” “coaches,” “co-pilots” and “tutors.” They will provide real value in our lives — so much so that we will feel disadvantaged if others are wearing them and we are not. This will create rapid pressure for mass adoption. The prosthetic devices I’m referring to are “ AI-powered wearables ” like smart glasses, pendants, pins and earbuds. Your wearable AI will see what you see and hear what you hear, all while tracking where you are, what you’re doing, who you’re with and what you are trying to achieve. Then, without you needing to say a word, these mental aids will whisper advice into your ears or flash guidance before your eyes. The difference between a tool and a prosthetic may seem subtle, but the implications for human agency are profound. This is best understood through a simple analysis of input and output. A tool takes in human input and generates amplified output. A tool can make us stronger, faster or allow us to fly. A mental prosthetic, on the other hand, forms a feedback loop around the human, accepting input from the user (by tracking their actions and engaging them in conversation) and generating output that can immediately influence the user’s thinking. This feedback loop changes everything. That’s because body-worn AI devices will be able to monitor our behaviors and emotions and could use this data to talk us into believing things that are untrue, buying things we don’t need or adopting views we’d otherwise realize are not in our best interest. This is called the AI Manipulation Problem , and we are not ready for the risks. This is an urgent issue because big tech is racing to bring these products to market. Why are feedback loops so dangerous? In today’s world, all computing devices are used to deploy targeted influence on behalf of paying sponsors. Wearable AI products will likely continue this trend. The problem is, these devices could easily be given an “ influence objective ” and be tasked with optimizing their impact on the user, adapting their conversational tactics to overcome any resistance they detect. This transforms the concept of targeted influence from social media buckshot into heat-seeking missiles that skillfully navigate past your defenses. And yet, policymakers don’t appreciate this risk. Unfortunately, most regulators still view the danger of AI in terms of its ability to rapidly generate traditional forms of influence (deepfakes, fake news, propaganda). Of course, these are significant threats, but they’re not nearly as dangerous as the interactive and adaptive influence that could soon be widely deployed through conversational agents, especially when those AI agents travel with us through our lives inside wearable devices. This is coming soon Meta, Google and Apple are racing to launch wearable AI products as quickly as they can. To protect the public, policymakers need to abandon their “tool-use” framing when regulating AI devices. This is difficult because the tool-use metaphor goes back 35 years to when Steve Jobs colorfully described the PC as a “ bicycle of the mind .” A bicycle is a powerful tool that keeps the rider firmly in control. Wearable AI will flip this metaphor on its head, making us wonder who is steering the bicycle — the human, the AI agents whispering in the human’s ears, or the corporations that deployed the agents? I believe it will be a dangerous mix of all three. In addition, users will likely trust the AI-voices in their heads more than they should. That’s because these AI agents will provide us with useful advice and information throughout our daily life — educating us, reminding us, coaching us, informing us. The problem is, we may not be able to distinguish when the AI agent has shifted its objective from assisting us to influencing us. To appreciate the difference, you might watch the award-winning short film Privacy Lost (2023) about the dangers of AI-powered wearable devices. This is especially true when devices include invasive features such as facial recognition (which Meta is reportedly adding to their glasses). What can we do to protect the public? First and foremost, policymakers need to realize that conversational AI enables an entirely new form of media that is interactive, adaptive, individualized and increasingly context-aware. This new form of media will function as “active influence,” because it can adjust its tactics in real time to overcome user resistance. When deployed in wearable devices, these AI systems could be designed to manipulate our actions, sway our opinions and influence our beliefs — and do it all through seemingly casual dialog . Worse, these agents will learn over time what conversational tactics work best on each of us on a personal level. The fact is, conversational agents should not be allowed to form control loops around users. If this is not regulated, AI will be able to influence us with superhuman persuasiveness. In addition, AI agents should be required to inform users whenever they transition to expressing promotional content on behalf of a third party. Without such protections, AI agents will likely become so persuasive that they will make today’s targeted influence techniques look quaint. Louis Rosenberg is a pioneer of augmented reality and a longtime AI researcher. He earned his PhD from Stanford, was a professor at California State University, and authored several books on the dangers of AI, including Arrival Mind and Our Next Reality.
Marathon's server slam is nearly over, and it's time to take stock of whether it went well or poorly. It's complicated.
'Skate' developer Full Circle announces layoffs ahead of new game release
The Port of Jacksonville supports over 228,000 jobs and contributes $44 billion to the local economy. With better transportation visibility, they can do even better.
The Motorola MA2 is coming soon, with a new design and meaningful upgrades for broader vehicle compatibility.
Kohei Fujimura / Nikkei Asia : Chinese matchmaking apps like Wanmei Qinjia, which has 50M users and lets parents look for spouses for their children, surge as marriage rates continue to fall — DALIAN, China — Apps that enable parents to search for spouses for their unmarried children have become increasingly popular in China …
A viral claim about the “end of Xbox” is spreading, but Microsoft’s strategy suggests the brand is evolving, not disappearing. The post Is the Xbox ending? Don’t count on it yet appeared first on Digital Trends .
Netflix's latest docuseries is the best new show I've watched recently – here's what else I've been streaming.
The Razer Kitsune is a top-shelf leverless controller from a brand familiar with luxury; just expect to be paying a luxurious price as a result.
RAD puts vehicles in places and situations they'll probably never see in most customers' hands.