We partnered with Anthropic to make our Claude Code course free for everyone. No subscription, no trial. Just dive in. It’s taught by Lydia Hallie, who’s been an instructor with us for years and now works on the Claude Code team at Anthropic. When she taught Claude Code live, it broke every platform record we have with over 10,000 people tuning in. Lydia has a knack for visualizing how tools work under the hood, which is exactly the mental model you need to stop guessing with AI and start directing it.
Palo Alto Networks has confirmed active exploitation of CVE-2026-0257, an authentication bypass vulnerability affecting the GlobalProtect portal and gateway components in PAN-OS. In practical terms, the flaw can allow a remote attacker to bypass authentication controls and establish an unauthorized VPN connection into an enterprise environment. Palo Alto rates the issue as High severity, CVSS 7.8, with “Highest” suggested urgency, while Rapid7 has urged organizations to treat it as critical because it affects internet-facing VPN infrastructure.
The important nuance: this does not affect every Palo Alto deployment. Palo Alto says exposure depends on GlobalProtect being configured with authentication override cookies and a specific certificate configuration. Panorama and Cloud NGFW are not impacted, while affected PAN-OS branches include 10.2, 11.1, 11.2, 12.1 and some Prisma Access deployments before fixed versions.
Rapid7 says its MDR team observed successful exploitation across multiple customers, with the earliest observed activity on May 17, 2026. The activity involved suspicious cookie-based authentication to a local admin account, and Rapid7 said it did not observe successful lateral movement in those cases.
The vulnerability has also been added to CISA’s Known Exploited Vulnerabilities catalog, which turns it from “patch soon” into “prioritize now.” The bigger story is that attackers are increasingly targeting the systems that decide who gets inside the network. VPN gateways, firewalls, identity bridges, SASE connectors, and remote access appliances sit at the edge of the enterprise. If those systems fail, attackers do not need to phish a user, steal a password, or defeat MFA in the normal way. They may be able to enter through the infrastructure that was supposed to enforce trust. That is why this flaw is more dangerous than its technical score alone suggests. A vulnerability in an internal app may expose one workload. A vulnerability in an enterprise VPN gateway can create a path into the corporate network, cloud resources, internal applications, admin panels, and identity-connected services. For years, enterprises treated VPN access as a controlled, authenticated doorway. But this incident shows a harsh reality: when the gateway itself has an authentication bypass, identity policies downstream may never get the chance to work. This is especially relevant in hybrid environments where VPNs still connect employees, contractors, developers, support teams, OT networks, and cloud workloads. Even companies that have adopted Zero Trust often still keep legacy VPN infrastructure for special cases, privileged access, regional offices, or fallback connectivity. The lesson for CISOs is clear: remote access infrastructure must be monitored like a Tier-0 identity system, not just like a network box. Palo Alto’s challenge is not only patching the bug. It also has to preserve customer trust in GlobalProtect as a secure remote access layer. The company has issued fixed versions and recommends mitigations such as disabling authentication override or using a dedicated certificate for authentication override cookies. 
But this incident will likely increase scrutiny around how security vendors handle cookie validation, certificate reuse, backward compatibility, and “convenience” features that reduce login friction. Features designed to make access smoother can become dangerous when their trust assumptions break. This should trigger an immediate review of VPN exposure, not just a patch ticket. Security teams should identify all GlobalProtect portals and gateways, confirm whether authentication override cookies are enabled, verify certificate configuration, apply fixed PAN-OS versions, and hunt for suspicious GlobalProtect logins. The most important operational takeaway: do not stop at patching. Review logs for successful VPN sessions from unusual infrastructure, generic hostnames, odd MAC addresses, unexpected Linux clients, local admin logins, and cookie-based authentication events. Unit 42 has also published indicators of activity for defenders to hunt against. This is a strong use case for managed detection and response. Rapid7’s detection came from suspicious VPN authentication behavior, not simply a vulnerability scan.
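The log review described above can be sketched as a simple triage filter. This is a minimal illustration, not a detection rule from Palo Alto, Rapid7, or Unit 42: the `VpnLogin` record, its field names, the `GENERIC_HOSTNAMES` set, and the two-signal threshold are all hypothetical, and real GlobalProtect logs would have to be mapped onto them first.

```python
from dataclasses import dataclass


@dataclass
class VpnLogin:
    """One successful VPN login. Field names are hypothetical, not a PAN-OS schema."""
    user: str
    auth_method: str      # e.g. "cookie", "saml", "password" (illustrative labels)
    client_os: str
    hostname: str
    is_local_admin: bool


# Illustrative list of generic hostnames; tune this for your own environment.
GENERIC_HOSTNAMES = {"localhost", "kali", "ubuntu", "debian", "desktop"}


def suspicion_reasons(event, expected_os=("Windows", "macOS")):
    """Return the list of hunting signals from the article that this login matches."""
    reasons = []
    if event.auth_method == "cookie":
        reasons.append("cookie-based authentication")
    if event.is_local_admin:
        reasons.append("local admin login")
    if event.client_os not in expected_os:
        reasons.append(f"unexpected client OS: {event.client_os}")
    if event.hostname.lower() in GENERIC_HOSTNAMES:
        reasons.append(f"generic hostname: {event.hostname}")
    return reasons


def hunt(events, min_reasons=2):
    """Keep logins that match at least `min_reasons` signals, with their reasons."""
    hits = []
    for event in events:
        reasons = suspicion_reasons(event)
        if len(reasons) >= min_reasons:
            hits.append((event, reasons))
    return hits


SAMPLE_EVENTS = [
    VpnLogin("jsmith", "saml", "Windows", "LT-JSMITH-042", False),
    VpnLogin("admin", "cookie", "Linux", "kali", True),
    VpnLogin("mlee", "cookie", "macOS", "MLEE-MBP", False),
]

for event, reasons in hunt(SAMPLE_EVENTS):
    print(event.user, "->", "; ".join(reasons))
```

Requiring two or more signals keeps ordinary cookie-based logins from flooding the queue, while a cookie-authenticated local admin session from a generic Linux host still surfaces.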
Everyone Building the Smartest AI Is Losing to the Safest One
I spent years watching founders worship speed. Ship faster. Scale faster. Raise faster. The conventional startup religion. Anthropic looked at that religion and quietly walked away from it. In 2021, Dario Amodei and a group of former OpenAI researchers founded a company with one uncomfortable thesis: the most powerful AI company would not be the smartest or the fastest. It would be the most trusted. That sounded naive in 2021. It sounds prophetic now.
As of June 2026, Anthropic has filed confidentially for an IPO at a $965 billion valuation, surpassing OpenAI’s $852 billion. Its annualized revenue has grown from $87 million in January 2024 to more than $44 billion by May 2026. That is not a growth curve. That is a vertical line. And almost all of it came from enterprise customers who chose trust over novelty. This issue breaks down the exact strategic moves that made it happen, the real numbers behind the story, and the warnings most people will not tell you before the IPO.
Here is the uncomfortable truth nobody in AI wants to say out loud: Most AI companies treat safety as a PR department. Not a product strategy. They create safety teams. They publish responsible AI principles. They give speeches at Davos. And then they quietly ship whatever moves the benchmark. Anthropic did something structurally different. It built Constitutional AI, a framework where models learn from a set of principles during training itself. Not a filter applied afterward. Not a content moderation layer. The principles are baked into how the model thinks.
What that actually means for enterprise customers:
Bad: An AI that can be jailbroken with creative prompting, creating legal liability for the company using it.
Good: An AI with self-correcting behavior that refuses harmful requests predictably, which is exactly what banks, hospitals, and law firms need for production deployment.
For regulated industries, predictable refusal is worth more than creative permissiveness. A compliance officer does not want an AI that can write creative fiction. They want an AI that will not accidentally generate advice that violates securities law. Anthropic understood this before almost anyone else in the market. By 2026, Constitutional AI had become a genuine competitive moat, not just a mission statement. Let me give you the actual numbers, because the growth rate is the story here, not just the destination. Salesforce took 20 years to reach $30 billion in annual revenue. Anthropic did it in under three years from a standing start. The CEO called the growth rate ‘crazy.’ That is CEO-speak for ‘we did not believe our own forecasts.’ The breakdown matters. Approximately 80% of Anthropic’s revenue comes from enterprise customers. Over 1,000 businesses now spend more than $1 million per year on Claude. That number was 500 in February 2026. It doubled in under two months. Enterprise revenue behaves differently from consumer revenue. It is stickier. Customers integrate Claude into their internal systems, their workflows, their compliance processes. Switching costs go up with every integration. That is not a coincidence. That was the strategy. This is the part of the Anthropic story that most people misread as luck. It was not luck. It was leverage. Anthropic secured billions from the two companies that benefit most from enterprise AI adoption: Amazon and Google. Amazon committed up to $33 billion total, including a $5 billion investment in April 2026. Google has invested over $10 billion at a $350 billion valuation, with up to $30 billion more tied to performance milestones. What did Anthropic give in return? Distribution. Amazon gets Claude embedded into AWS Bedrock, reaching Amazon’s 300,000-plus enterprise customers. Google gets Claude in Google Cloud Vertex AI. 
Anthropic gets cloud infrastructure worth billions of dollars plus instant access to the largest enterprise customer bases in the world. Bad: Raising only venture capital, then spending years trying to build enterprise distribution from zero. Good: Trading equity and infrastructure commitment for instant access to the world’s largest cloud enterprise pipelines. The numbers confirm the strategy worked. Amazon’s $8 billion investment is now worth more than $70 billion on paper. In Q1 2026 alone, Amazon booked $16.8 billion in pre-tax gains from its Anthropic position. Google held roughly 14% of Anthropic before its latest commitment. These are not passive investors. They have skin in the game, which means they actively help Anthropic win. That is a different kind of fundraising. Most founders raise money. Anthropic raised distribution. Here is what the strategic split looks like in practice. 1. Pick your customer before you pick your product. Anthropic decided early that enterprise customers in regulated industries were the target.
I’ve been spending summers in northern Michigan my entire life. Lake Michigan is the size of Belgium, and it sure feels like it when you’re staring out at the water from Sleeping Bear Dunes National Lakeshore. Over Memorial Day weekend, my husband Alex ran a marathon along Grand Traverse Bay, so we decided to stay a few extra days and head up the Leelanau Peninsula. There, we settled into a space that felt tucked away from the rest of the world.
Samantha Weiss-Hills, managing editor
When Domino covered the opening of Parcel, a pair of Scandi-esque rental cabins just off of Little Traverse Lake, last summer, I knew we had to find an excuse to visit. They are just as charming in person as they are depicted in the story, kitted out with great finishes and furniture (many of which we pinpoint). But the little details are just as dreamy. Here are all the things that made our getaway that much sweeter.
I liked how deep this fire pit from Ash & Ember was; the cast iron had also aged just right.
I always see this Hay outdoor set in olive, but I really liked Parcel’s iron red version.
An Outpost Grill from Breeo was staked to the side of the fire pit, and we used it to cook burgers and asparagus.
It was rainy and cool on the first day, so we took the opportunity to get some flames going in this handsome wood stove.
This Japanese teapot was a perfect vessel to brew Flowerhead’s Earl Grey in every morning; we’d pour it into sizable Danish mugs.
As if the owners knew me, there was a little tin of Jacobsen sea salt waiting for us in the kitchen.
There are Le Labo products everywhere, including shampoo and conditioner next to the hand-built sauna and shower.
These Michigan-made table linens were so lovely; they’re crafted from upcycled textiles in Kent County.
How cute are these little egg cups from Hem? I also used them as a condiment dish during our stay.
A pleasing update to the modest power strip.
Scandinavia, and really the wider Nordic region, is my spiritual home (and one where I have familial roots); I feel most at ease when I’m in that part of the world. This week, I’m back in Copenhagen for the annual design festival 3 Days of Design. Per usual, the fashion in Denmark’s capital city has been just as wonderful to witness as the home decor. Below, a few pieces from Danish labels I’ll be donning myself while I’m here.
Clockwise from top left:
James Blazer, Skall Studio: They had me first at the color name: Green Curry. But the wide lapels, double button detail, and boxy cut make this tailored menswear-inspired shape something I’ll wear all the time.
CBDarmen Dress, Cecilie Bahnsen: Cecilie Bahnsen drapes fabrics so ethereally you might as well be a walking cloud in this pink number.
Derris Shirt, By Malene Birger: A tried-and-true classic; I have it in a tiny blue check, too.
Merino Knit Tee, AF Agger: It’s a little cold here, so I quickly hunted down this versatile transition piece that plays well with the rest of the wardrobe I packed.
Speaking of Copenhagen, these are places I always return to when I’m in town, plus a few new spots I’m excited to peek into this time.
Eat: Admiralgade 26, Seks, Auren’s Deli
Drink: Ved Stranden 10, Bar Vitrine, Josephine
Visit: Louisiana Museum, Louise Roe (above), Tableau
On My List: Das Vandrette, Martini House
The episode opens with a virtual tour of The Shortcut’s brand‑new New York City office. The hosts point out the open‑plan layout, the floor‑to‑ceiling windows that frame Manhattan’s skyline, and the collaborative zones designed for brainstorming sessions. They highlight how the space reflects the company’s shift toward a more hybrid work model, with dedicated quiet pods for deep work and a lounge area stocked with snacks and games to foster informal networking among staff.
Next, the conversation turns to Apple’s WWDC keynote. The biggest takeaway is the introduction of Apple Intelligence, a new layer of machine‑learning that powers an extensive overhaul of Siri. The hosts explain that Siri will now be able to understand context across apps, anticipate user needs, and suggest actions before you ask. They also note the integration of on‑device processing for privacy, the new “Ask Anything” prompt that can pull data from multiple sources, and the expanded support for third‑party developers to embed AI capabilities into their apps.
The episode then shifts to gaming news, celebrating the Xbox’s 25th anniversary. Microsoft is releasing a limited‑edition console that bundles classic titles, a retro‑styled controller, and a commemorative art book. The hosts discuss the nostalgic appeal for long‑time fans and speculate on how the anniversary might boost the upcoming Game Pass lineup, especially with new cross‑platform features hinted at in the announcement.
Finally, the hosts dive into a quirky side topic: why ChatGPT sometimes spits out random creepypasta‑style stories. They break down how the model’s training data includes a lot of horror fiction from forums and fan sites, and how the lack of a strong content filter in certain prompts can lead to eerie, unscripted narratives. They suggest that tweaking temperature settings and adding clearer instruction prompts can reduce the likelihood of those spooky outputs, while also acknowledging that the occasional creepypasta can be an entertaining Easter egg for users.
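The temperature suggestion is easier to picture with a small sketch of how temperature reshapes a next-token distribution. This is a generic illustration of temperature-scaled softmax sampling, not a description of ChatGPT's internals; the token labels and logit values are made up for the example.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(tokens, logits, temperature=1.0, rng=None):
    """Draw one token according to the temperature-scaled probabilities."""
    rng = rng or random.Random()
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]


# Hypothetical candidates: one on-topic continuation and two unlikely tangents.
TOKENS = ["on-topic", "tangent", "creepypasta"]
LOGITS = [2.0, 1.0, 0.1]

for t in (1.5, 1.0, 0.5):
    probs = softmax_with_temperature(LOGITS, t)
    print(t, [round(p, 3) for p in probs])
```

As the temperature drops, probability mass concentrates on the highest-scoring candidate, which is why a lower setting makes low-probability detours less frequent without ruling them out.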
One of the consistent themes of this newsletter is how the stock market is increasingly disconnected from underlying economic activity. In the Number Go Up Rule, I traced how we increasingly run everything to ensure that market capitalization continues to increase. From the dot com boom to subprime housing to crypto to GameStop to sports gambling, there’s an increasing mania in how we encourage speculation instead of morally valuable activity. The flip side of this disconnect is that governance happens in crisis. Our collective understanding of finance and politics is shaped by crashes - the dot com bust, the Great Financial Crisis of 2008 and Covid, all episodes in which a crisis in one part of the system led to seemingly uncorrelated shocks elsewhere, and then political action to reorder the economy. Today I want to ask what the popping of the AI bubble would look like, and whether it would precipitate a broader financial crash. And if it does so, what shape will it take? The right way to start is by analogy, as there are lessons from previous crashes and the governance that came out of them that we can learn from. For reasons I’ll get into, while 2008 and Covid could be useful to look at, the best analogy is the dot com era. First, it’s important to scope out what I’m not going to talk about, which is the governance of AI as a technology. It certainly matters whether we are creating a God-like system, a useful general purpose technology, or a moderately useful toy. There are many fascinating questions around copyright, liability, monopolization, and so forth, but it’s hard to offer persuasive tech policy arguments in the midst of a bubble. In many ways, the key question facing us today is the financing of AI, and the fact that we have placed an economy-sized bet on the enterprises claiming to focus on this technology.
Just seven stocks - all linked to AI - comprise a third of the stock market, and AI capital investment is likely to be between $750 billion and a trillion dollars this year, which is big enough that it affects macro-economic growth numbers. The stock mania we’re seeing as a result is based on the narrative that AI will be some sort of insanely profitable transformative technology. But AI is actually costly to operate, taking up a lot of electricity and expensive computing hardware. So the speculative discourse only works as long as investors subsidize the use of the technology. When that subsidy stops, these AI firms have to actually deliver value, or customers won’t buy it. Over the past few months, that subsidy has been eroding. In May, I wrote about what happens to AI investments as the big firms raise prices to corporate America. Now it’s time for these tools to show measurable returns, either lowering costs or raising revenue. If they don’t, well, the revenues for these companies won’t deliver on the multi-trillion dollar promises they made. That may not matter in the short term; the financial market has overvalued Tesla, crypto, GameStop, and so forth. But at some point, there will be a shock. How significant could such a shock be? Dean Baker, who called the 2008 housing crash, has an “AI bubble monitor” where he lays out the scale of what’s happening. The value of the stock market today, close to $80 trillion, is roughly twice what it was at the peak of the tech bubble. That’s 2.5 times the size of the U.S. economy. A fall back to the long-term average would cut, according to Baker, $300,000 per household of paper wealth from balance sheets. Others, like AI Now’s Sarah Myers West, are drawing similar conclusions. Now, just because the stock market is very high doesn’t mean it’s a bubble; the labor share of income is much lower than it was in 2000, meaning what used to go to workers is instead going to capital.
While that’s not good for society, it actually is a good non-bubble explanation of why stocks are in nosebleed territory. Several economists recently wrote a paper to that effect, showing that stock market values are relatively constant if you account for the fact that corporations are investing less and paying their workers less, remitting what they would have put into equipment and labor to dividends and buybacks. In terms of free cash flow, which is cash to investors, they argue, the market is valued the same as it was in previous periods. Or at least it was - now data centers are eating up all that cash flow. But whether or not cash flow justifies valuations, we shouldn’t overthink this dynamic. A bubble popping is not some odd event. Stock market drawdowns of 50% or more are historically common, though not understood as such today in our heavily financialized economy. For instance, the April swoon that caused Trump to reverse his tariff policy was a decline of just 25%.
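To get a feel for the scale of these figures, here is a back-of-the-envelope check. It is only a rough consistency sketch, not Baker's methodology: the market value and market-to-GDP ratio come from the text above, while the household count of roughly 132 million is my own assumption.

```python
MARKET_VALUE_USD = 80e12   # "close to $80 trillion" (from the text)
MARKET_TO_GDP = 2.5        # "2.5 times the size of the U.S. economy" (from the text)
US_HOUSEHOLDS = 132e6      # ASSUMPTION: about 132 million households, not from the text


def implied_gdp(market_value, market_to_gdp):
    """Size of the economy implied by the market value and the quoted ratio."""
    return market_value / market_to_gdp


def loss_per_household(market_value, drawdown, households):
    """Paper wealth lost per household for a given fractional drawdown."""
    return market_value * drawdown / households


print(f"Implied GDP: ${implied_gdp(MARKET_VALUE_USD, MARKET_TO_GDP) / 1e12:.0f} trillion")
for drawdown in (0.25, 0.50):
    loss = loss_per_household(MARKET_VALUE_USD, drawdown, US_HOUSEHOLDS)
    print(f"{drawdown:.0%} drawdown: about ${loss:,.0f} per household")
```

Under that household assumption, a 50% drawdown works out to roughly $300,000 per household, the same order of magnitude as the figure attributed to Baker, while a 25% decline would be about half that.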
ZTE has won three prestigious awards at Selular Award 2026, held on June 8, 2026, at Menara Peninsula Hotel, Jakarta. The awards recognize ZTE's contributions and innovations in advancing artificial intelligence (AI)-powered network technologies amid the acceleration of digital transformation and 5G development in Indonesia. ZTE's contributions to advancing AI-powered network innovation have been recognized by Selular Media Network (SMN), a leading telecommunications and technology media organization in Indonesia, through three awards at Selular Award 2026. ZTE received honors in the categories of Best AI Technology Fixed Wireless Access, Best AI Network Ecosystem, and Best Native AI Baseband. These awards reflect ZTE's capabilities across network access, ecosystem development, and core infrastructure, further strengthening its position as a technology partner supporting digital transformation and the evolution of AI-driven networks in Indonesia. The Selular Award is an annual appreciation program organized by Selular Media Network (SMN) to recognize outstanding achievements and contributions across Indonesia’s ICT and digital technology industry. As the first and most consistent telecommunications industry award since 2003, the Selular Award serves as a benchmark for excellence, honoring companies and brands that demonstrate innovation, strong performance, and meaningful contributions to Indonesia’s digital transformation. Through this award, the public and business community can identify industry leaders that continue to create value and drive progress in the digital ecosystem. This year's Selular Award carries the theme "Leading The Future: Building Exponential Value in 5G-Advanced and AI Economy", highlighting the convergence of AI and 5G-Advanced as key drivers of digital economic growth. 
Kevin Fang, Marketing Director of ZTE Indonesia, said: "Digital transformation today is no longer driven solely by connectivity, but also by the ability of networks to operate more intelligently, efficiently, and adaptively. Through the AI-powered innovations we have developed—from broadband access to core infrastructure—ZTE is committed to delivering network solutions that are ready to meet connectivity demands in the AI and 5G-Advanced era. These awards motivate us to continue delivering meaningful innovations that create value for the industry, our customers, businesses, and society." Indonesia's telecommunications industry is currently entering a critical phase in its digital transformation journey. According to the e-Conomy SEA 2025 report by Google, Temasek, and Bain & Company, revenue from AI-powered applications in Indonesia grew by 127% year-on-year, the highest growth rate in Southeast Asia, with 80% of users interacting with AI applications daily. This momentum reflects the growing demand for network infrastructure that is not only fast and reliable but also capable of supporting AI workloads. On the infrastructure side, GSMA Intelligence projects that 5G investment in Indonesia could contribute up to USD 41 billion to the national GDP between 2024 and 2030. This projection highlights the strategic role of 5G as a connectivity foundation that supports digital transformation and the growth of the digital economy. At the same time, the increasing adoption of AI and data-driven services is driving demand for networks that are faster, more reliable, and capable of handling greater capacity. As part of its commitment to supporting these developments, ZTE continues to deliver innovations across the entire network technology value chain, from broadband access to core infrastructure. On the access side, ZTE provides AI-powered Fixed Wireless Access (FWA) solutions designed to expand high-speed connectivity more efficiently and flexibly. 
The solution serves as a strategic approach to supporting broadband inclusion while addressing the growing demand for connectivity across different regions. In addition, ZTE is building an open ecosystem that integrates AI, connectivity, cloud computing, and various digital technologies within a collaborative framework involving operators and enterprises. At the core infrastructure level, ZTE embeds AI capabilities natively into the baseband, the key component responsible for network signal processing. By integrating AI directly into the baseband from the design stage, networks can analyze, optimize, and adapt operations more intelligently and in real time. This approach enables more autonomous and efficient network operations while preparing networks for the demands of the 5G-Advanced era. Moving forward, ZTE will continue to deepen collaboration with operators, enterprises, and industry partners in Indonesia while strengthening its technology portfolio, ranging from wireless access solutions and optical transport to data center infrastructure and telecommunications energy solutions.
Amazon, along with the rest of the industry, has gotten so used to framing everything that happens through the context of AI that it has lost the plot on their Graviton chip lineup, and along with it their own credibility. Which is a shame, because it's actually a triumph of a chip.
First, the Wall Street Journal breathlessly reported that Snowflake's $6 billion AWS commitment was "for agentic computing chips." Then AWS's own press release heralded the release of their latest chips "for the Agentic AI era." In both cases, they were referring to their Graviton line. You could be forgiven for thinking this was some kind of GPU. No, that's Trainium. (Technically, Trainium isn't a GPU, nor is it a CPU, but rather a systolic array. Don't worry; most AI engineering software doesn't know what the hell that is, either.) Graviton is AWS's general purpose Arm CPU, which can be used for AI in much the same way as Excel can be used as a database. But that's far from its only, or even primary, purpose. Let's dive into what Graviton actually is.
Price / Performance / Reality
For the longest time, Amazon refused to issue benchmarks, competitively positioning its then-nascent Arm line against Intel. Many of us thought this meant that the results would underwhelm, so you can imagine my surprise when real-world workload tests showed 35 percent to 40 percent better performance in a wide variety of situations. It was as if Amazon had built something amazing, but was somehow embarrassed to admit it. Those days are long behind us; they trumpet in the subhead of their announcement that Graviton 5 means "apps run 35% faster, ML inference is 35% faster, and databases are 30% faster." To their credit, I was expecting those numbers to be against something ancient, but in a refreshing bout of honesty, they're comparing them to Graviton 4, itself no slouch. They are also 9 percent more expensive.
Once upon a time, new generations of AWS instances were notably less expensive than their predecessors. Going from a c4.large to a c5.large meant you'd get better performance, and the instance itself was a whopping 15 percent cheaper. Upgrading was a no-brainer! That started changing, and now upgrading means the instance becomes more expensive. AWS's position is that this is an incomplete analysis, since the improved performance means you'd pay less for a given workload. In some cases, this is correct, but in others, it's akin to saying that a Ferrari offers better price performance than my Honda CR-V because I can drive it to work three times faster. Logic, as well as traffic lights, disagree. Amazon's contention is correct for customers who have large fleets of nodes that they run at high degrees of CPU utilization. Switching those fleets to the new hotness will absolutely result in a price performance improvement, provided the workload and the stars both align. However, for customers who need a fixed number of nodes (think database companies, who offer each customer of theirs a set number of replicas, or workloads of the form "each environment gets three nodes, one in each AZ"), this represents a pure 9 percent price hike going from old generations to new ones. That puts many customers in a pickle: upgrade to new instance families, or stay on the old ones and watch availability become constrained in the coming years as AWS stops racking old chips. (Hi, Amazon PR! If you're about to pop into my inbox to tell me that won't happen, I have a customer I'd love for you to have a chat with!) But this price hike isn't happening in a vacuum. It's happening against a backdrop of "an 8GB Raspberry Pi is now $175, over twice its launch price of $85." Components have become fiendishly expensive across the board as giant companies compete for capacity, and AWS has to be feeling that pressure. 
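The two customer situations described above can be put into numbers with a short sketch. It uses the 9 percent price increase and the 35 percent application speedup quoted earlier, and it assumes, optimistically, that the workload is CPU-bound and scales perfectly with per-node performance; the fleet sizes are hypothetical.

```python
import math

PRICE_INCREASE = 0.09   # new generation costs 9 percent more per node (from the text)
SPEEDUP = 0.35          # "apps run 35% faster" (from the text)


def cost_per_unit_of_work(price_increase=PRICE_INCREASE, speedup=SPEEDUP):
    """Relative cost of the same amount of work on the new generation."""
    return (1 + price_increase) / (1 + speedup)


def fleet_cost_ratio(old_nodes, price_increase=PRICE_INCREASE, speedup=SPEEDUP):
    """New fleet cost divided by old fleet cost when the fleet can shrink.

    Assumes perfect scaling: fewer, faster nodes absorb the same load.
    """
    new_nodes = math.ceil(old_nodes / (1 + speedup))
    return new_nodes * (1 + price_increase) / old_nodes


def fixed_fleet_cost_ratio(price_increase=PRICE_INCREASE):
    """New cost divided by old cost when the node count cannot change."""
    return 1 + price_increase


print(f"Per unit of work: {cost_per_unit_of_work():.3f}x")
print(f"100-node busy fleet: {fleet_cost_ratio(100):.4f}x")
print(f"3-node environment: {fleet_cost_ratio(3):.2f}x")
print(f"Fixed replica count: {fixed_fleet_cost_ratio():.2f}x")
```

Under these assumptions a busy 100-node fleet shrinks to 75 nodes and the bill drops by about 18 percent, while a three-node environment cannot shrink at all and simply pays 9 percent more.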
Two companies each asked to buy all of AWS's Graviton capacity for the year; AWS clearly has room to kick their prices into the stratosphere! Somehow, they're not only resisting the siren song of "please gouge me, business daddy," but also managing to keep availability strong for customers of all stripes; I upgraded my developer node in my tiny unremarkable AWS account yesterday, and it Just Worked.
And so...
Despite the nonsense marketing, I don't want to detract from just how amazing Annapurna Labs (Amazon's chip division) has been at churning out wildly performant silicon year over year. Their chips are legitimately great, and the Graviton 5 numbers are a triumph. Lost against the backdrop of "Agentic AI," the stuff underpinning all of it continues to work, improve, and largely pass by unremarked. Keep going. ®
Datacenters got you down? Worried that even the most innocuous questions will spin up AI models running in water-guzzling, energy-sucking, planet-destroying hyperscalers? You need CrankGPT. No, we’re not talking about surrendering to AI psychosis: we’re talking about a literal hand-cranked machine loaded with a voice agent that can respond to questions and even translate speech into other languages, provided someone keeps the power flowing. There’s an onboard custom-built capacitor board to store some juice, mind you, but it only provides around 20 seconds of crank-free runtime before you’ve gotta keep crankin’ to keep it alive. That, and it takes a bit of time to get it running - according to the documentation website, it’s a 30-second process “from the moment you start cranking to the moment you’re having a conversation with CrankGPT.” According to the AI expert duo behind the device, computer scientist Katrin Tomanek and former Google Advanced Technology and Projects Group technical project lead Alex Kauffmann, CrankGPT still delivers impressive results despite the need to perform some hard physical labor for your tokens (though we’d argue some exercise for your AI might not be a bad thing). “Asking Claude to add two numbers for you is like swatting a fly with a wrecking ball,” Kauffmann told The Register in an email. This tongue-in-cheek demonstration, Kauffmann said, may be a bit of light fun, but it’s an exercise in demonstrating what his and Tomanek’s AI company, Squeez, is all about: small, private specialized AI models that, in a pinch, might not even need very much energy or a connection to the web to operate. 
“Squeez produces customized, efficient, and private models that can run on small, inexpensive hardware to solve specific problems,” Kauffmann explained, citing tasks like voice recognition for someone with a strong accent or speech impediment, or specially-trained, local AIs that are subject matter experts in topics like gardening or auto repair, but won’t touch subjects outside their wheelhouse. Contrary to the flashy dot-com for CrankGPT the pair have set up, Kauffmann told me, Squeez has no plans to pursue spin cycle class-powered AI stacks for dev teams, though he said if anyone wants to foot the bill, he'd be happy to give it a shot. "Off-the-shelf bike generators are shockingly expensive and they're fussy to build," Kauffmann said. Still, "a good biker can maintain a steady 120W output, so a class of twenty could power a Blackwell." Speaking of wheelhouses, what’s inside that box? If there’s a tiny computer in a 3D-printed box with a crank attached, there’s a good possibility it’s going to be a Raspberry Pi, and that’s the case here. CrankGPT’s brain is built on a stock RPi 5 with 8 GB of RAM and a cooling fan HAT, and audio input and output are handled by a dedicated I/O HAT designed for voice assistants running RPis. Power comes from the aforementioned crank, which is actually an off-the-shelf 20W switchable voltage hand crank unit built for emergency USB device charging, and is stored in the custom capacitor unit the duo built. “The neatest part of the whole thing is that you can actually feel the inference,” Kauffmann told us. 
“The amount of resistance the crank presents varies depending on the amount of work the board is doing, so when it’s really working (generating words for instance), the crank becomes much harder to turn than when it’s idling waiting for you to say something.” As for software, the device is running the most stripped-down, bare-bones instance of DietPi the pair could compile, which is able to boot into a functional userspace in about three seconds. The voice agent is the truly original piece of work done for the project, as detailed in the documentation page, and was built entirely from scratch. “We wanted to understand the system end to end and have as few dependencies as possible,” the documentation page notes. It’s available on GitHub for those interested in trying it out. Speech recognition is handled by the Moonshine automatic speech recognition engine, chosen for its speed, while text-to-speech synthesis is handled by Piper, likewise chosen for its low-resource edge inference capabilities. As for the models doing the thinking itself, there are a few behind CrankGPT, with Liquid LFM2 1.2B providing a general-purpose voice agent, and Gemma 3 1B being used for translation. CrankGPT can switch between translation and various prompts (e.g., general question answering and games like two truths and a lie) via a knob on the side of the enclosure. “It’s entirely configurable,” Kauffmann told us. “We added a couple of physical inputs (the knob, a button, a switch) to make experimentation easier.” Kauffmann added that he and Tomanek were surprised by how well the translation function worked. “We did no fine tuning, it’s just a two-line prompt and it works really well for high-coverage languages,” he explained.
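Kauffmann’s spin-class quip is easy to sanity-check. What follows is a rough sketch, not anything published by Squeez: the 120W per rider, the class of twenty, and the 20 seconds of crank-free runtime come from the story, while the 1,000W accelerator draw and the 8W board draw are placeholder assumptions picked purely for illustration.

```python
import math

RIDER_WATTS = 120         # per-rider output, from Kauffmann's quote
CLASS_SIZE = 20           # riders in the class, from Kauffmann's quote
CRANK_FREE_SECONDS = 20   # approximate capacitor runtime, from the story
ASSUMED_GPU_WATTS = 1000  # ASSUMPTION: illustrative accelerator power draw
ASSUMED_BOARD_WATTS = 8   # ASSUMPTION: illustrative draw of the Pi under load


def class_output_watts(riders: int, watts_per_rider: float) -> float:
    """Total electrical output of a class, ignoring conversion losses."""
    return riders * watts_per_rider


def riders_needed(load_watts: float, watts_per_rider: float) -> int:
    """Smallest whole number of riders whose combined output covers the load."""
    return math.ceil(load_watts / watts_per_rider)


def buffer_energy_joules(load_watts: float, seconds: float) -> float:
    """Energy a buffer must deliver to carry a constant load: E = P * t."""
    return load_watts * seconds


if __name__ == "__main__":
    print("Class output (W):", class_output_watts(CLASS_SIZE, RIDER_WATTS))
    print("Riders for assumed GPU:", riders_needed(ASSUMED_GPU_WATTS, RIDER_WATTS))
    print("Buffer energy (J):",
          buffer_energy_joules(ASSUMED_BOARD_WATTS, CRANK_FREE_SECONDS))
```

Under those assumptions the class produces more power than a single accelerator would draw, which is consistent with the quote. Real generators and capacitor banks lose energy in conversion, so treat the figures as upper bounds.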
Nightmare Eclipse, the prolific zero-day vulnerability hunter with an axe to grind against Microsoft, released yet another exploit late Wednesday that the researcher claims will spawn a command prompt that provides total access to the BitLocker volume. This bug, called GreatXML, was “an accidental discovery,” according to the researcher, who said it only took four hours to find. They claim this exploit (published on GitHub and Git-based code-hosting platforms) can bypass BitLocker on any system that has ever run a Microsoft Defender Offline scan at any point in the past. GreatXML comes just a day after Nightmare released exploit code for RoguePlanet, which allows local privilege escalation and leads to SYSTEM-level control over an affected machine. This brings the researcher’s zero-day count to eight. The earlier six - RedSun, UnDefend, BlueHammer, YellowKey, GreenPlasma, and MiniPlasma - all have patches as of this week’s Patch Tuesday event. Redmond on Wednesday told The Register that it is aware of RoguePlanet, and “actively investigating the validity and potential applicability of these claims.” The Windows giant didn’t immediately respond to our inquiries about GreatXML, including when it planned to issue a patch. Microsoft has said none of the vulnerabilities were reported via its official channels prior to being made public. The company also banned Nightmare’s earlier GitHub account, and seemingly threatened legal action before dialing back its rhetoric after steep backlash from the security community. Nightmare Eclipse, who some researchers suggest is an ex-Microsoft employee, harbors a very personal grudge against the Windows giant and its communications with bug hunters. They have promised to keep the zero-days coming, but waffle on the timing. 
Last month, the researcher pledged a big July 14 drop: “I will make sure your bones are shattered that day,” and then added, “nothing will be released this June (or maybe I will release smtg, depending on circumstances).” On Tuesday, they changed course. “I will be unable to mass disclose zerodays in July 14th, RoguePlanet took way more time than expected and truly drained me. I might take a break but I can't say for sure what I will be doing for next month, maybe it's nothing, maybe it's smtg.” A day later, Nightmare released the “accidental” GreatXML BitLocker bypass. According to the researcher, the BitLocker bypass first requires copying “unattend.xml” and the “Recovery” directory to the root of the recovery partition. The next step is rebooting into WinRE by Shift-clicking Restart. “If everything was done correctly, a shell with unrestricted access to the bitlocker volume will spawn,” Nightmare wrote. Also, if the scan hasn’t even been initiated on the Windows system, first you’d need to either log in and initiate it, or “figure out a way to boot into WinRE in offline scan state.” Security sleuth Will Dormann followed Nightmare’s steps to reproduce GreatXML, and said the writeup seems “flawed.” In his testing, Dormann said the command prompt appeared the next time a Defender Offline scan ran. “And in order to trigger a Microsoft Defender Offline scan, you both need to be logged in to Windows, and also have admin credentials,” he wrote on social media. “And if you've already got that level of access, you can just turn off bitlocker.” “The writeup for GreatXML suggests that the prerequisite is that Windows Defender Offline has been executed at some point in the past,” Dormann added. “And that after planting two files in WinRE, all you need to do is [Shift]-reboot into WinRE, and Windows will automatically go into Microsoft Defender Offline scan mode. But this is not the case in any of the 3 lineages of Win11 that I have handy.” ®
The boffins on Google’s DeepMind team unveiled an experimental new language model this week that uses techniques originally developed for AI image generators to boost text output performance by as much as 4x when running on resource-constrained consumer hardware. It's free to download and you can run it with just 18 GB of DRAM or VRAM. The model, codenamed DiffusionGemma, is the latest addition to Google’s open weights model family. But unlike Gemma 4, which launched this spring, the 26 billion-parameter mixture of experts (MoE) model isn’t a large language model in a conventional sense. Instead, it’s actually closer to image models like Stable Diffusion or Flux. Rather than generating tokens one after another in an autoregressive fashion, DiffusionGemma generates entire paragraphs' worth of tokens at the same time. The process looks a lot like how a diffusion model turns what’s essentially static into an image through a series of denoising steps. As Google explains it, DiffusionGemma works by laying out a canvas of random tokens, and then refining them until the final output is reached. Compared to conventional LLMs, which are memory-bandwidth bound and require a lot of VRAM, diffusion models are a predominantly compute-bound workload, which is why the Chocolate Factory is positioning these models for local deployment. LLMs are autoregressive. During token generation, the model’s active parameters need to be streamed from memory for every token generated, making memory bandwidth a major bottleneck. In the cloud, inference providers balance compute and memory bandwidth by processing hundreds or thousands of requests in parallel. As you might have guessed, this isn’t something the average user running a local model on their notebook can do. However, many consumer products, like high-end graphics cards, have plenty of excess horsepower, which DiffusionGemma can take advantage of to boost output performance. Diffusion language models aren’t perfect. 
Google isn’t the first to explore this tech. Previous models, like DREAM or Mercury 2, demonstrated major speedups over conventional LLMs, but generally underperformed them in benchmarks for their size. DiffusionGemma doesn’t appear to be any different. According to Google, the 26 billion-parameter model falls just behind Gemma 4 12B in the GPQA-Diamond benchmark, with its main advantage being output speed, and even then it’s not as impressive as Google has made it out to be. Google’s chart shows a roughly 2.25x speedup for DiffusionGemma over the 12B parameter LLM with speculative decode enabled. Compared to Gemma 4 26B-A4B, the speedup is nearly 4x when running on a single Nvidia H100. DiffusionGemma is being released as an experimental model rather than an enterprise-focused one, like we saw with Gemma 4. The model is available for download on popular model repos like Hugging Face under a highly permissive Apache 2.0 license. Support has already been merged into popular inference engines like vLLM, MLX, and HF Transformers, and support for Llama.cpp is coming soon. While local inference has largely been the domain of AI enthusiasts, companies like Google are increasingly leaning on the tech to cut cloud costs associated with their AI services. As you may recall, back in May, Google quietly began shipping a small LLM with its Chrome web browser. ®
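The bandwidth-versus-compute argument can be put into rough numbers with a simple roofline-style estimate. This is a back-of-the-envelope sketch under loudly stated assumptions, not Google’s methodology: every hardware figure, the block size, and the step count below are hypothetical, and the model ignores KV-cache traffic, attention cost, and the fact that a block of tokens in an MoE model may touch more experts than a single token does.

```python
def autoregressive_tokens_per_second(
    active_params_billion: float,
    bytes_per_param: float,
    mem_bandwidth_gb_s: float,
) -> float:
    """Upper bound for single-stream decoding when every generated token
    requires streaming the active weights from memory once."""
    bytes_per_token = active_params_billion * 1e9 * bytes_per_param
    return mem_bandwidth_gb_s * 1e9 / bytes_per_token


def diffusion_tokens_per_second(
    active_params_billion: float,
    bytes_per_param: float,
    mem_bandwidth_gb_s: float,
    peak_tflops: float,
    block_tokens: int,
    denoise_steps: int,
) -> float:
    """Upper bound when each denoising step streams the weights once but
    refines a whole block of tokens in parallel."""
    params = active_params_billion * 1e9
    # Time to read the weights once per step (memory-bound term).
    step_time_memory = params * bytes_per_param / (mem_bandwidth_gb_s * 1e9)
    # Roughly 2 FLOPs per parameter per token (compute-bound term).
    step_time_compute = 2 * params * block_tokens / (peak_tflops * 1e12)
    # A step can go no faster than the slower of the two limits.
    step_time = max(step_time_memory, step_time_compute)
    return block_tokens / (denoise_steps * step_time)


if __name__ == "__main__":
    # HYPOTHETICAL figures: 4B active parameters at 2 bytes each,
    # 1,000 GB/s of memory bandwidth, 100 TFLOPS of usable compute,
    # 256-token blocks refined over 32 denoising steps.
    ar = autoregressive_tokens_per_second(4, 2, 1000)
    diff = diffusion_tokens_per_second(4, 2, 1000, 100, 256, 32)
    print(f"autoregressive bound: {ar:.1f} tok/s")
    print(f"diffusion bound:      {diff:.1f} tok/s")
```

With these made-up inputs the autoregressive path is limited by memory bandwidth while the diffusion path hits the compute limit first, which is the trade the story describes. Any actual speedup depends heavily on how many denoising steps a given output needs.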