2 Comments
Alex C.

I use Claude, but I don't know as much about it as you do. I'm confused about how much information Claude can hold in its "working memory" (not sure if that's the right term). You wrote: "When I ask Claude a health-related question, for example, it has access to every single medical record, including images, PDFs, spreadsheets; it takes all of that into account when it gives the answer." But later you wrote: "to save tokens, the best thing you can do is regularly clear the memory and start over on a new thread as often as possible." Surely a folder full of PDFs and other documents would consume a whole lot of tokens? Can Claude retain all that information in working memory while also engaging in a conversation with you about some topic related to your health?

Richard Sprague

It can hold 1M tokens in its "context" (roughly its "working memory"). With normal prompting, the context grows linearly with every turn, because each new question resends the entire thread of prompts and answers. But all those medical record files are treated differently: first, they're cached, so they only go in once; second, Claude is smart enough to focus only on the relevant files. TL;DR: Claude uses optimization tricks based on the difference between static (Project Knowledge) and dynamic (your prompts) content.
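
To make the arithmetic concrete, here is a rough back-of-the-envelope sketch in Python. The function names and all the numbers are made up for illustration, and it assumes cached files cost nothing after the first turn, which is a simplification rather than how real pricing necessarily works.

```python
def context_per_turn(static_tokens, tokens_per_turn, turns):
    """Context size at each turn: the static files plus the whole thread so far.

    Assumes every turn (prompt + answer) adds the same number of tokens,
    which is a hypothetical simplification.
    """
    return [static_tokens + tokens_per_turn * k for k in range(1, turns + 1)]


def total_tokens_processed(static_tokens, tokens_per_turn, turns, cached):
    """Total tokens processed at full cost over the whole conversation.

    Assumption: with caching, the static files count only on the first turn.
    The thread itself is always counted in full, since it is resent every turn.
    """
    total = 0
    for k in range(1, turns + 1):
        thread = tokens_per_turn * k  # the thread grows linearly with each turn
        static = static_tokens if (k == 1 or not cached) else 0
        total += static + thread
    return total


# Hypothetical numbers: 100,000 tokens of files, 1,000 tokens per turn, 10 turns.
print(context_per_turn(100_000, 1_000, 3))                        # [101000, 102000, 103000]
print(total_tokens_processed(100_000, 1_000, 10, cached=False))   # 1055000
print(total_tokens_processed(100_000, 1_000, 10, cached=True))    # 155000
```

Under these assumptions the files dominate the cost only when they are not cached. Once they are, what keeps growing is the thread itself (1 + 2 + ... + n turns' worth of tokens), which is why starting a new thread still saves tokens.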