Memory stocks such as Micron Technology Inc. (NASDAQ:MU) and Sandisk Corp. (NASDAQ:SNDK) were the consensus AI trade of 2026 — the most direct hardware bet on the inference buildout powering every major hyperscaler.
For the past six sessions, that trade has been unraveling in a way the demand models did not anticipate.
The proximate trigger was Alphabet Inc. (NASDAQ:GOOGL)’s announcement of TurboQuant on Tuesday, March 24 — an AI memory compression algorithm that rattled the entire memory sector.
Shares of Micron Technology have fallen in each of the past six sessions, tumbling more than 20%. That is the stock’s worst multi-session performance since the tariff-shock selloff of April 2025.
Peers Sandisk and Western Digital Corp. (NASDAQ:WDC) have followed suit, with SNDK tumbling 11% on Thursday alone. In Seoul on Thursday, Samsung Electronics fell nearly 5% and SK Hynix dropped approximately 6%, dragging the KOSPI index lower. Kioxia, Japan’s flash memory specialist, lost about 6% in the same session.
The debate now is whether this is the dip to buy or a bullish thesis that just permanently broke.

What TurboQuant Actually Does (And Doesn’t Do)
The algorithm is called TurboQuant. Developed by Amir Zandieh, a research scientist at Google, and Vahab Mirrokni, a VP and Google Fellow, it compresses the key-value cache — the high-speed memory store that allows an AI model to retrieve past calculations without reprocessing them — to just 3 bits per value, down from the standard 16.
According to Google’s benchmarks, that cuts KV cache memory requirements by at least six times with no measurable loss in model accuracy, and delivers up to an 8x performance boost on Nvidia Corp. (NASDAQ:NVDA)’s H100 GPUs.
An open-source release is expected in the second quarter of 2026.
In plain terms: every AI inference workload runs against a KV cache that grows with context length.
TurboQuant compresses that cache, which means the same amount of high-bandwidth memory can serve more simultaneous users, handle longer contexts, or run a larger model than was previously possible.
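TurboQuant's exact method has not been published, so the sketch below uses generic per-row uniform quantization as a stand-in, purely to illustrate the memory arithmetic the article describes. The model dimensions are hypothetical, not Google's figures. Note that the raw 16-to-3-bit ratio is about 5.3x; Google's "at least six times" benchmark presumably reflects optimizations beyond bit width alone, which this naive sketch does not capture.

```python
import numpy as np

# --- Back-of-envelope KV cache sizing (hypothetical model dimensions) ---
n_layers, n_heads, head_dim = 32, 32, 128
seq_len = 8192  # context length in tokens

# The KV cache stores one key and one value vector per token, per layer, per head
values_per_token = 2 * n_layers * n_heads * head_dim

fp16_bytes = values_per_token * seq_len * 2        # 16 bits = 2 bytes per value
q3_bytes = values_per_token * seq_len * 3 / 8      # 3 bits per value

print(f"FP16 cache:  {fp16_bytes / 2**30:.2f} GiB")   # 4.00 GiB
print(f"3-bit cache: {q3_bytes / 2**30:.2f} GiB")     # 0.75 GiB
print(f"Reduction:   {fp16_bytes / q3_bytes:.1f}x")   # 5.3x (16/3)

# --- Naive 3-bit uniform quantizer (illustrative only, NOT TurboQuant) ---
def quantize_3bit(x):
    """Map each row of x onto 8 evenly spaced levels (3 bits)."""
    lo = x.min(axis=-1, keepdims=True)
    hi = x.max(axis=-1, keepdims=True)
    scale = (hi - lo) / 7.0                        # 2**3 - 1 intervals
    q = np.round((x - lo) / scale).astype(np.uint8)  # codes in [0, 7]
    return q, scale, lo

def dequantize_3bit(q, scale, lo):
    return q * scale + lo

# Quantize a toy slice of "cache" and measure the worst-case rounding error
kv = np.random.randn(4, head_dim).astype(np.float32)
q, scale, lo = quantize_3bit(kv)
err = np.abs(dequantize_3bit(q, scale, lo) - kv).max()
```

Per-row min/max scaling is the simplest possible scheme; production quantizers add outlier handling and finer-grained scaling to keep accuracy loss near zero, which is the hard part of what Google is claiming.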
However, there are also notable technical boundaries here.
TurboQuant does not touch the memory requirements of AI model training, which remains the largest single driver of high-bandwidth memory procurement at hyperscalers including Google, Microsoft Corp. (NYSE:MSFT) and Amazon.com Inc. (NASDAQ:AMZN).
The algorithm is also, as of today, a laboratory result. It will be formally presented at the International Conference on Learning Representations (ICLR) 2026 in late April. It has not been deployed at production scale across any major AI infrastructure stack.
Analysts Say This Does Not Alter The Memory Trade
Wells Fargo TMT analyst Andrew Rocha acknowledged the demand threat directly.
“TurboQuant is directly attacking the cost curve here,” Rocha said in an investor note Wednesday, adding that lower memory specifications per AI workload quickly raise the question of how much total capacity the industry actually needs.
Yet, Rocha stopped short of a bearish conclusion — noting that the demand destruction scenario requires broad adoption, which has not yet occurred.
“It does not alter the industry’s long-term demand picture,” he said.
Morgan Stanley pushed back on the selling theme.
The bank’s semiconductor analyst Shawn Kim called the stock reaction excessive and argued that TurboQuant could ultimately benefit memory makers over the longer term — lower inference costs reduce the per-token cost of running AI services, which historically drives wider adoption rather than demand compression.
The bank invoked Jevons Paradox: efficiency gains in a resource-constrained market tend to increase total consumption, not reduce it.
The historical precedent is stark. JPEG compression didn’t reduce demand for camera storage. Video codecs didn’t reduce hard drive demand — they enabled 4K streaming that drove it higher.
DeepSeek R1’s efficiency breakthrough in January 2025 triggered a similar selloff in Nvidia and memory stocks; within two quarters, AI capex commitments from hyperscalers hit record highs.
The selloff proved to be an entry point, not a cycle turning point.
Vivek Arya, semiconductor analyst at BofA Securities, made the most direct case against the demand destruction thesis.
In a note published Thursday, he said that similar compression techniques have been in circulation since 2024-25 — Nvidia alone has published four distinct KV cache efficiency methods over the past twelve months — without altering hardware procurement at scale.
The more telling evidence, he said, sits in Google’s own spending plans. Despite publishing TurboQuant, Google raised its CY26 capital expenditure outlook to approximately $180 billion, up 100% year-over-year, well above the prior consensus of roughly $127 billion.
“The 6x improvement in memory efficiency,” Arya said in the note, is likely to produce a “6x increase in accuracy and/or context length, rather than 6x decrease in memory.”
He maintained a $500 price target on Micron, noting the stock is now trading at the low end of its historical 5–10x forward price-to-earnings range.
Ben Barringer, technology research lead at Quilter Cheviot, offered the same framing. TurboQuant added to the pressure on memory stocks, Barringer said, but the technology is evolutionary rather than revolutionary and does not alter the sector’s long-term demand outlook.
Andrew Jackson, technology analyst at Ortus Advisors, noted in a research note that the TurboQuant development may make “little difference to demand given the extreme supply constraints” that currently characterize the AI memory market.
What This Means For Investors
The near-term beneficiaries of broader TurboQuant adoption are hyperscalers — cheaper inference costs improve return on infrastructure investment — and AI startups that can run larger models on smaller hardware budgets.
Notably, Nvidia is not a loser under this scenario: GPUs do not become less necessary when memory efficiency improves; they become more cost-effective per dollar of inference output, potentially accelerating adoption in markets previously constrained by cost.
For Micron and Sandisk, the calculus is more complicated.
Both stocks entered 2026 pricing in the assumption that AI memory demand would scale linearly with model size and context length — an assumption TurboQuant now complicates, even if the full adoption path remains years away.
The market appears to be pricing in not the near-term adoption scenario, but the existence of a credible software pathway to lower memory intensity.
That is a different and harder assumption to dismiss.
Micron’s six-session, 20%-plus slide is already severe by the stock’s own history.
Whether it represents a structural repricing or an overreaction to a lab result not yet in production is a question the next earnings cycle — and ICLR 2026 — will begin to answer.