Personal Science Week - 250612 Futurehouse
A scientific discovery platform meets personal science
A well-funded non-profit aims to build “AI scientists”, autonomous AI systems that can conduct full scientific research cycles.
How well does it work for personal science?
About FutureHouse
FutureHouse is a philanthropically-funded nonprofit research organization founded in 2023 with significant backing from Eric Schmidt. Its mission is to build AI agents—described as "AI Scientists"—that can automate and scale scientific research, especially in biology and complex sciences.
The organization positions itself as addressing a fundamental bottleneck in modern science: "no individual scientist has time to design tens of thousands of individual hypotheses, or to read the thousands of biology papers that are published" daily. Rather than just throwing more computing power at the problem, FutureHouse treats human effort as the real constraint, so it is developing tools for several key bottlenecks: reading and summarizing academic papers, of course, but also asking informed followup questions and running agents that plan (and eventually carry out) experiments.
For personal scientists, one of the best features is that it’s free. Just set up an account and access their paper search and followup capabilities right away. The site apparently has rate limits, though I never came close to reaching them in my tests. If I had, there is a “Request Limit Increase” button where supposedly they’ll grant you additional access if they determine you’re serious.
My Tests - Evaluating Sleep Changes
The world of AI tools changes quickly, so something that was state-of-the-art a few months ago is no longer a big deal, and in fact the top LLM Chatbots have already incorporated many of the Futurehouse features in their “Deep Research” models. (See my experience with some of them in PSWeek250227).
To try it out, I picked a personal sleep-related issue that’s been giving me trouble lately. For the past few years, I’ve noticed an annoying trend where I wake up in the middle of the night (around 3am) and have trouble falling back asleep. Can FutureHouse propose a few hypotheses and interventions that help me understand the root cause and maybe find a solution to my sleep troubles?
I started in Claude, giving it as much background as I could think of about my sleep habits. Although I have quantitative data (in Excel) for my sleep results going back many years, I found it easier to load that into ChatGPT and ask questions there. I quickly found, to my surprise, that my sleep has been pretty stable: for several years (2017-2020) I went to bed around 10:30 and got up around 6, for a long-term average of 7.5 hours of sleep each night. That average, I found, is roughly what I get now, except that since 2020 I’ve seen a gradual increase in the nights when I have at least one wakeup; now it’s up to 5-7 nights a week, which is highly annoying.
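For readers who would rather run this kind of summary themselves than hand a spreadsheet to a chatbot, here is a minimal Python sketch. It is not the analysis I actually ran: the row layout (night, bedtime, wake time, number of wakeups) and the sample values are assumptions made up for illustration, and it treats time in bed as a stand-in for hours of sleep.

```python
from collections import defaultdict
from datetime import date, datetime, timedelta

# Hypothetical sample rows: (night, bedtime, wake time, number of wakeups).
# A real log exported from Excel would need to be mapped into this shape.
SAMPLE_LOG = [
    (date(2024, 1, 1), "22:30", "06:00", 0),
    (date(2024, 1, 2), "22:30", "06:00", 1),
    (date(2024, 1, 3), "23:00", "06:30", 2),
    (date(2024, 1, 8), "22:00", "06:00", 0),
]


def hours_asleep(bedtime: str, wake_time: str) -> float:
    """Hours between bedtime and wake time, allowing for crossing midnight."""
    bed = datetime.strptime(bedtime, "%H:%M")
    wake = datetime.strptime(wake_time, "%H:%M")
    if wake <= bed:
        # Wake time is on the following calendar day.
        wake += timedelta(days=1)
    return (wake - bed).total_seconds() / 3600


def summarize(log):
    """Return (mean hours per night, {(ISO year, ISO week): nights with a wakeup})."""
    durations = [hours_asleep(bed, wake) for _, bed, wake, _ in log]
    mean_hours = sum(durations) / len(durations)

    nights_with_wakeup = defaultdict(int)
    for night, _, _, wakeups in log:
        iso = night.isocalendar()
        week = (iso[0], iso[1])
        # Count the night once if there was at least one wakeup.
        nights_with_wakeup[week] += 1 if wakeups > 0 else 0
    return mean_hours, dict(nights_with_wakeup)


if __name__ == "__main__":
    mean_hours, weekly = summarize(SAMPLE_LOG)
    print(f"Average hours per night: {mean_hours:.2f}")
    for week, nights in sorted(weekly.items()):
        print(f"ISO week {week}: {nights} night(s) with at least one wakeup")
```

Grouping by ISO week keeps the "nights per week" figure comparable from year to year, which is the trend I cared about here.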
With some back-and-forth, I used Claude to compose a detailed prompt describing my situation and asking Deep Researcher to summarize what’s known and then suggest some treatments and personal science experiments.
As expected, each LLM came back with many pages of results: between 3,000 words (Perplexity) and 15,000 words (Futurehouse). That’s too many to paste here, but here’s a chart I made to summarize the recommendations:
One nice Futurehouse feature is their “Owl” precedent search. Instead of asking it to search for something in particular, you can just ask if such-and-such intervention has been previously tried. For example, I asked it if anyone has ever studied potato starch as a sleep aid and it gave me a ton of directly-applicable citations:
Overall Assessment
So all-up, does FutureHouse stand out? Not particularly. While it provides solid academic rigor, it doesn't meaningfully differentiate itself from the other platforms in terms of: Depth of analysis (ChatGPT was more comprehensive), Practical utility (Gemini provided better implementation guidance), or Research synthesis (all platforms accessed similar literature effectively).
My takeaways:
No platform clearly dominates: Each has distinct strengths, suggesting the AI research landscape is becoming more competitive rather than having a single standout tool.
Citation count ≠ utility: Gemini had the most citations (57) yet remained highly practical, while FutureHouse's academic rigor (54 citations) resulted in less actionable guidance.
Practical implementation varies significantly: Despite similar evidence bases, the platforms differed substantially in how actionable their recommendations were. Interestingly, Gemini Pro was the best, giving specific week-by-week experimental suggestions.
Methodology matters more than platform: My structured query approach yielded good results across all systems, and it mattered more than the choice of platform.
Personal Science Weekly Readings
Speaking of sleep, have you ever noticed your body jolt awake right before falling asleep? A PopSci article summarizes the research and points out that something like 70% of people experience these “Hypnagogic jerks” yet researchers don’t know the cause.
Watch out for REM Sleep Behavior Disorder (RBD), which can be a strong predictor of dementia. If you (or your spouse) punch, kick, or shout while sleeping, you may want to look into synucleinopathies.
Knowable does a rundown of the genetics of sleep, mentioning mutations in NPSR1 and ADRB1 among others that seem associated with people who sleep very little with no obvious ill effects.
Can a mixture of milk and honey improve my sleep? A study of cardiac patients in Iran says yes. See this summary of studies that link dairy intake with better sleep.
About Personal Science
Personal scientists approach health and wellness questions with the same systematic curiosity that drives professional researchers, but we focus on problems that matter to us personally. Rather than waiting for large-scale studies to provide definitive answers, we design our own n-of-1 experiments using the tools of science: careful observation, controlled variables, and objective measurement.
As AI research tools become more sophisticated and accessible, they're democratizing access to scientific knowledge that was once available only to academic researchers. But remember: the most elegant research synthesis is useless without the discipline to test hypotheses methodically and adjust based on what you learn about your own unique physiology. No AI can tell you whether glycine or magnesium will work better for your 3 AM awakenings—that requires personal experimentation with careful tracking of results.
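As one way to make "careful tracking of results" concrete, here is a small Python sketch that compares nightly wakeup counts from two intervention periods with a permutation test. Everything in it is an assumption for illustration: the wakeup counts are invented, the function name is hypothetical, and a permutation test is just one reasonable choice for a small n-of-1 dataset, not a method prescribed by any of the platforms above.

```python
import random
from statistics import mean


def permutation_p_value(a, b, n_shuffles=10_000, seed=0):
    """Two-sided permutation test on the difference in means of a and b."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_shuffles):
        # Reassign nights to the two groups at random.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        # Small tolerance guards against floating-point noise.
        if diff >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_shuffles + 1)


# Hypothetical nightly wakeup counts, one week per intervention.
glycine_nights = [1, 0, 1, 2, 0, 1, 1]
magnesium_nights = [2, 1, 2, 1, 3, 2, 1]

if __name__ == "__main__":
    p = permutation_p_value(glycine_nights, magnesium_nights)
    print(f"Mean wakeups on glycine:   {mean(glycine_nights):.2f}")
    print(f"Mean wakeups on magnesium: {mean(magnesium_nights):.2f}")
    print(f"Permutation p-value:       {p:.3f}")
```

With only a week of data per condition the result will be noisy, so treat a sketch like this as a prompt to keep collecting data rather than as a verdict.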
We publish Personal Science every Thursday for anyone who believes that the scientific method shouldn't be limited to professionals. Whether you're tracking sleep, experimenting with nutrition, or investigating any other aspect of your health, the key is approaching your questions with the same rigor that characterizes good science: skeptical curiosity, controlled testing, and openness to being proven wrong.
If you have other topics you'd like us to explore, please let us know.
Hey Richard, saw your recent post. If you just went to YouTube you would have found my video on exactly what to do here: https://www.youtube.com/watch?v=zpdRKmLePaQ. I hope you find it helpful. I went to your talk with Curtis Estes.
Have you tried elicit.com? Also, were there any suggestions that did not involve food?