Personal Science Week - 260521 Hypnosis
Hypnosis apps need more testing
We first discussed hypnosis back in PSWeek220519 as an intriguing flow-state hack. I haven’t done enough serious self-testing to have an updated opinion, but the published research has moved on since then: there is now a large outcome study of a hypnosis app and a trial suggesting that hypnotizability can be changed.
This week we update our old post to see how personal scientists can evaluate hypnosis for themselves.
Hypnosis is another way of getting into the Flow
Hypnosis has always seemed a little bizarre, maybe even dangerous. We’ve all seen demonstrations involving seemingly normal people who, under the hypnotist’s spell, appear to willingly do crazy things. Is this just fakery, or is something more significant happening?
David Spiegel, a Stanford professor of psychiatry, has been studying the science of hypnosis for decades. He says we shouldn’t think of it as a trance, a spooky loss of control to a professional manipulator. Instead, hypnosis is just an example of a “flow state”. We all know this feeling, sometimes described as “being in the zone”, where time seems to stop and you feel thoroughly immersed in an activity. Hypnosis, like meditation, is just another normal, desirable psychological state that we enter whenever we focus intently.
Spiegel makes an app, Reveri, which attempts to package his hypnosis sessions into a series of recorded steps. For $25/month, the app claims to help you achieve your goals of healthier eating, reduced anxiety, better sleep, and more.
Spiegel says about two-thirds of people are hypnotizable, and he has a simple test to see if you’re one of them. The “eye roll” test checks whether the whites of your eyes are visible when you look straight up without moving your head. Watch it demonstrated in an Andrew Huberman podcast episode with David Spiegel (see Podcast Notes and also notes from Juan Pablo Aranovich). Time magazine has a short overview of the subject, including additional details from Spiegel.
Testing the app
In December 2025, npj Digital Medicine published the first large outcome study of Reveri itself: almost 84k users across 300k sessions over four years. Within-session stress, self-rated on a 10-point Likert scale, dropped at every one of the first ten sessions, with Cohen’s d between −0.71 and −0.78. By psychology’s loose standards, that’s a large effect. Standard, interactive sessions worked roughly 1.8× better than brief, non-interactive ones. Higher-hypnotizability users improved more. Only ten of the 84k users reported worsening symptoms, a strong safety signal.
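For readers who want to see what a Cohen’s d of about −0.7 looks like in their own before-and-after ratings, here is a minimal sketch. It assumes one common convention for paired data (mean change divided by the standard deviation of the changes); the paper may have used a different denominator, and the ratings below are made up purely for illustration.

```python
from statistics import mean, stdev


def cohens_d_paired(pre, post):
    """Effect size for paired ratings: mean change / SD of the changes.

    This is the "d_z" convention, an assumption on our part; other
    variants divide by the pooled or baseline SD instead.
    """
    if len(pre) != len(post):
        raise ValueError("pre and post must have the same length")
    changes = [after - before for before, after in zip(pre, post)]
    return mean(changes) / stdev(changes)


# Hypothetical 1-10 stress ratings for eight sessions (not real data).
pre = [7, 6, 8, 5, 7, 6, 9, 4]
post = [5, 6, 6, 5, 4, 6, 7, 5]

d = cohens_d_paired(pre, post)
print(round(d, 2))  # prints -0.71
```

A negative d here simply means the "after" ratings were lower on average than the "before" ratings.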
The paper is the first attempt to measure what self-hypnosis actually does at scale, and it’s by far the most data anyone has on the question.
Press release headline: Stanford-developed app reduces stress in 84,000 users. To the normie science news reader, that sounds pretty good.
The skeptic reads the methods section
It isn’t. The study has no control group. It’s a retrospective look at people who chose to open a hypnosis app, rated their stress on a single quiz item before a 10-minute session, then rated it again after. How could a design like that not show improvement?
Regression to the mean. People open the app when they’re stressed. Their next rating will tend to be lower regardless of what happened in between.
Demand characteristics. You just spent ten minutes on something marketed as a stress reducer. Reporting “still stressed” feels like grading yourself.
The pause itself. Ten minutes sitting still with eyes closed reduces self-reported stress whether the audio is hypnosis, an audiobook, or silence.
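The first of these, regression to the mean, is easy to demonstrate with a toy simulation. The sketch below assumes each person has a typical stress level plus random moment-to-moment fluctuation, and only opens the app when momentary stress is at or above a threshold. Every number in it (means, spreads, threshold) is an illustrative assumption, not an estimate from the Reveri data, and ratings are left unclamped for simplicity. The "after" rating is just a fresh draw with no treatment applied at all.

```python
import random
from statistics import mean


def simulate_no_effect_sessions(n_sessions=20_000, threshold=6.0, seed=0):
    """Average before-to-after change when the 'session' does nothing."""
    rng = random.Random(seed)
    changes = []
    while len(changes) < n_sessions:
        baseline = rng.gauss(5.0, 1.5)           # person's typical stress
        before = baseline + rng.gauss(0.0, 1.5)  # stress at this moment
        if before < threshold:
            continue  # not stressed enough to open the app
        # Fresh draw around the same baseline: no intervention effect.
        after = baseline + rng.gauss(0.0, 1.5)
        changes.append(after - before)
    return mean(changes)


print(f"mean change with zero true effect: {simulate_no_effect_sessions():+.2f}")
```

Under these assumptions the average change comes out clearly negative even though nothing was done, purely because sessions were selected at high-stress moments.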
The authors concede in their own Limitation #4 that they used a single-item Likert stress rating rather than a validated scale like PSS or PANAS, chosen for engagement rather than rigor. I don’t necessarily mind this sort of pragmatic decision, made to get as large a sample as possible. But all of these caveats make the conclusions far shakier than the headline suggests.
And that’s not to mention the obvious conflict of interest, and the risk of motivated reasoning, when the study’s author is the same person who sells the product. Spiegel has spent 50 years on this work and his clinical record is real, and I appreciate any academic who tries to bring his research into mainstream use as a real product. But put it all together and a more honest framing is: the study shows something happens after a Reveri session. It cannot show whether that something is hypnosis.
Large N doesn’t fix lack of control
This is the personal-science lesson worth pinning to the wall. A sample of 84k doesn’t compensate for the absence of an active comparator. Big N shrinks random noise; it does nothing for systematic bias. If every one of those users had also rated their stress before and after ten minutes of anything else (a podcast, silence, scrolling Instagram), the comparison would be informative. Without it, the number is rhetorical, not epistemic.
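A small simulation makes the point concrete. In this sketch the true treatment effect is set to zero and a fixed systematic bias of −1 point is added to every observed change, standing in for things like regression to the mean and demand characteristics. Both numbers, and the noise level, are assumptions chosen for illustration only.

```python
import random
from math import sqrt
from statistics import mean, stdev

TRUE_EFFECT = 0.0   # assumed: the intervention itself does nothing
BIAS = -1.0         # assumed: systematic shift shared by every session
NOISE_SD = 2.0      # assumed: random session-to-session scatter


def observed_mean_and_se(n, seed=0):
    """Mean observed change and its standard error for n uncontrolled sessions."""
    rng = random.Random(seed)
    changes = [TRUE_EFFECT + BIAS + rng.gauss(0.0, NOISE_SD) for _ in range(n)]
    return mean(changes), stdev(changes) / sqrt(n)


for n in (100, 10_000, 100_000):
    estimate, se = observed_mean_and_se(n)
    print(f"N={n:>7}: mean change {estimate:+.2f} ± {se:.3f}")
```

As N grows, the standard error shrinks toward zero, but the estimate settles on the biased value of about −1 rather than the true effect of 0. More data makes the wrong answer more precise, not more correct.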
This is exactly the trap personal scientists fall into when reading observational health data. We notice a sample size, we notice a p value, we move on. The question that decides whether the result means anything is almost always: compared to what?
The trait may not be a trait
The other genuinely new result, which got less attention: David Spiegel spent decades arguing that hypnotizability is a stable, lifelong trait, measurable with the eye-roll test, with ~2/3 of adults somewhat susceptible and ~15% highly so. In January 2024, Spiegel, Afik Faerman, and Stanford colleagues published a preregistered, double-blind RCT in Nature Mental Health showing that a minute of transcranial magnetic stimulation (TMS) temporarily raises hypnotizability in low-scoring patients. The trial was sham-controlled and neuroimaging-guided, with n = 80, and the effect lasted about an hour. In other words, hypnotizability can be changed.
Two caveats keep this from being world-shaking: the trial measured the hypnotizability score itself, not clinical outcomes, and the effect was transient. But the implication for personal science is real: hypnotizability looks more like trainable VO2 max than like eye color. Predominantly stable, modifiable at the margins. If you take the eye-roll test and score low, that’s the current reading — not a verdict.
It also opens the testable question of whether hypnotizability, “absorption,” NSDR responsiveness (see PSWeek230314), and placebo response are facets of one underlying trait. They might be.
The n-of-1 anyone can run
Reveri still costs ~$99/year, with a free 7-day trial. The eye-roll test is free on YouTube. So is sitting quietly in your bed.
Here’s a one-week protocol any reader can run:
Days 1, 3, 5: rate stress 1–10, do a 10-minute Reveri session, rate stress again.
Days 2, 4, 6: rate stress 1–10, sit quietly for 10 minutes with eyes closed (no audio, no app), rate stress again.
Day 7: look at your numbers.
If Reveri beats silence, that’s evidence of something specific to the intervention. If they tie, the active ingredient was the ten minutes, not the hypnosis. Either result is useful. Neither requires a subscription.
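For Day 7, here is one way to look at the numbers, offered as a sketch rather than a prescribed analysis. The log entries are placeholders to be replaced with your own ratings, and the function names are ours. Beyond comparing average drops, it runs a small exact permutation check: how often would a random relabeling of the six sessions give Reveri an advantage at least as large as the one observed?

```python
from itertools import combinations
from statistics import mean

# Placeholder log: (day, condition, stress_before, stress_after).
# Replace these made-up numbers with your own ratings.
log = [
    (1, "reveri", 7, 4),
    (2, "silence", 6, 5),
    (3, "reveri", 8, 5),
    (4, "silence", 7, 5),
    (5, "reveri", 6, 4),
    (6, "silence", 6, 5),
]


def drops(entries, condition):
    """Stress reduction (before minus after) for each session of one condition."""
    return [before - after for _, cond, before, after in entries if cond == condition]


def permutation_p(a, b):
    """Share of relabelings where group a beats group b by at least the observed gap."""
    pooled = a + b
    observed = mean(a) - mean(b)
    hits = total = 0
    for idx in combinations(range(len(pooled)), len(a)):
        group_a = [pooled[i] for i in idx]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in idx]
        total += 1
        if mean(group_a) - mean(group_b) >= observed - 1e-12:
            hits += 1
    return hits / total


reveri, silence = drops(log, "reveri"), drops(log, "silence")
print(f"mean drop, Reveri:  {mean(reveri):.2f}")
print(f"mean drop, silence: {mean(silence):.2f}")
print(f"permutation p: {permutation_p(reveri, silence):.2f}")  # 0.10 for the placeholder data
```

With only three sessions per arm there are just 20 possible relabelings, so the smallest p this check can ever produce is 1/20 = 0.05. Treat one week as a hint, and extend the alternation for a few more weeks if the result matters to you.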
This is the personal-scientist’s answer to a study that can’t answer its own question. An academic paper tells you the app is safe and that users feel better after using it. Whether you should pay for it depends on a comparison only you can run.
Personal Science Weekly Readings
Speaking of hypnosis, a 2024 systematic review of 679 hypnosis apps found only four had ever been tested in a clinical efficacy trial — Reveri wasn’t among them. So the new 2025 paper mentioned above is also Reveri’s first appearance in the literature, which makes the design choices it made all the more consequential.
At PSWeek, we’ve had a lot to say about plastics over the years (e.g. PSWeek250102). We should revisit some of our assumptions now that a new University of Michigan study points out that researchers’ gloves can contaminate the lab tests behind claims of high microplastic levels.
About Personal Science
Personal scientists are open-minded but skeptical — about apps marketed with big numbers, about founder-authored outcome papers, and about our own confident first reactions. Nullius in verba — take no one’s word for it. Especially when the word comes wrapped in n = 84,395.
This newsletter is delivered each Thursday. If you have topics you’d like to see covered — or experiments you’re running that deserve attention — let us know.



