    How To Manage Long Chats On SillyTavern

    By Wayfarer | November 6, 2025 | 12 Mins Read

    It’s frustrating when you’re enjoying a long, fleshed-out roleplay and the LLM starts forgetting important details or acting out of character. It breaks your immersion and makes continuing a roleplay you’re invested in feel like a chore.

    Simply increasing your Context Size and dumping more information into the prompt isn’t a good solution. You need to optimize your context cache to maintain an immersive and engaging experience. SillyTavern provides multiple tools for managing long chats, making it easy to enjoy long roleplays where the LLM doesn’t forget important details and stays coherent.

    Table of Contents
    1. Use The Summarize Extension
      1. Configuring The Summarize Extension
        1. Summarize With
        2. Prompt Builder
        3. Summary Prompt
        4. Update Frequency
        5. Injection Template & Position
      2. Summarized Messages And When To Manually Summarize
      3. How To Summarize Long Chats On SillyTavern
        1. Manual Summaries Using A Free AI Assistant
      4. Optimize Context Cache Post Summarization
    2. Use World Info (Lorebooks) For Long Chats On SillyTavern
      1. World Info Strategies
    3. Extensions To Manage Long Chats On SillyTavern
      1. Qvink’s Message Summarize
      2. Aiko Hanasaki’s Memory Books
    4. How To Manage Long Chats On SillyTavern

    Use The Summarize Extension

    Using the built-in Summarize extension is the easiest way to optimize your context cache for long chats on SillyTavern. It lets you generate summaries using an LLM or manually enter them.

    While it is the easiest tool to use, LLM-generated summaries can sometimes miss important details or include unnecessary information. Smaller models aren’t as good at generating summaries as larger models like DeepSeek, but even a large model can still omit details that matter for roleplay continuity.

    SillyTavern Summarize Extension

    You can access the Summarize extension in SillyTavern’s Extensions panel (stacked cubes icon). Clicking “Summarize now” generates a summary (when a chat is open), and clicking “Summary Settings” lets you configure options and enable automatic summary updates.

    Configuring The Summarize Extension

    SillyTavern’s documentation provides detailed explanations of what each configuration setting does.

    Summarize With

    Select Main API. The extension will use your current backend, model, and settings to generate summaries. For example, if you’re using an 8B model with KoboldCpp, that same model will generate your summaries.

    Prompt Builder

    Controls how SillyTavern builds the summary prompt. The default “Classic, blocking” should work in most cases. You can learn more about how the three different methods work in SillyTavern’s documentation.

    Summary Prompt

    You can customize the prompt that SillyTavern uses for generating the summary. SillyTavern’s default prompt works well in most cases, but you can modify it to provide further instructions to the LLM.

    • Target summary length: You can set the desired length of your summary (in words) and use the {{words}} macro in the summary prompt. The option isn’t a hard and fast rule, and the LLM may generate a summary that is shorter or longer than your specified length.
    • API response length: You can override the global response length (in tokens) set in your AI Response Configuration settings. For example, if your global response length is 150 tokens to keep chat replies short, you may not want the summary restricted to those same 150 tokens.
    • Max messages per request: Only works with “Raw, blocking” and “Raw, non-blocking” prompt builder options. You can define the maximum number of messages the LLM should summarize during each generation.
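
    To illustrate, a customized summary prompt that uses the {{words}} macro could look something like the example below. This is only an illustration (not SillyTavern’s default prompt), so adjust the instructions to suit your roleplay.

        Pause your roleplay. Summarize the most important events, decisions, and relationship
        developments in the chat so far. Write in the third person, keep the summary under
        {{words}} words, and leave out minor descriptive details that won't affect the story
        going forward.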

    Update Frequency

    These settings control how often summaries are automatically generated and updated. You can define how many messages or words SillyTavern processes before generating a new summary. Ideally, the first summary should be automatically generated before the initial messages drop out of the context window.

    SillyTavern Prompt Itemization

    You can click on the Prompt icon on the most recent LLM response to see how many total tokens your chat has consumed. The Prompt Itemization gives you an idea of approximately how many messages it would take before the initial messages start dropping out of the context window.

    You can click on the magic wand icon next to the frequency options to let SillyTavern set the optimal frequency based on your chat. For example, in our chat with 64 messages, where the chat history consumed 14,636 tokens (16,384 Context Size), SillyTavern recommended updating the summary every 60 messages.
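
    For perspective, those numbers work out to roughly 229 tokens per message (14,636 ÷ 64). At that rate, the chat history alone would fill the 16,384-token Context Size after roughly 70 messages, and sooner once the character card, persona, and other permanent tokens are counted. A summary generated every 60 messages therefore gets refreshed just before the earliest messages start dropping out of the context window.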

    Setting both update frequency options to 0 will disable automatic updates. If you decide to enable automatic summary updates, make sure you review the new summaries to ensure they include all important details and avoid unnecessary information.

    Injection Template & Position

    You can define how SillyTavern structures your summary and where it inserts it in the prompt. Depending on your chat or text completion preset and API, you may have to test if your chosen injection position works correctly.

    For example, while using CherryBox 1.4 DeepSeek Preset and DeepSeek’s Official API, we noticed the summary prompt wasn’t injected when set to “After Main Prompt / Story String.” You can use the Prompt Itemization information to verify that SillyTavern successfully injects the summary into the prompt.

    Summarized Messages And When To Manually Summarize

    The number of messages the model can summarize depends on your Context Size. For example, if your Context Size is set to 8,192, the model can summarize about 15 to 20 messages.

    The number may vary depending on the length of your messages, permanent tokens used by the character card, your persona, etc. Ideally, you should manually generate your first summary before the initial messages drop out of the context window.

    SillyTavern Prompt Itemization

    As covered earlier, you can click on the Prompt icon on the most recent LLM response to see how many total tokens your chat has consumed. The Prompt Itemization gives you an idea of when to manually generate your first summary.

    SillyTavern Prompt Itemization Visualization

    How To Summarize Long Chats On SillyTavern

    If you’ve exceeded your Context Size without generating a summary, or you simply want more accurate summaries for long chats, you can use SillyTavern’s /hide and /unhide commands to generate targeted summaries.

    These commands hide and unhide one or more messages from the prompt, letting you control which messages remain in the context cache. Messages hidden from the prompt have a ghost icon next to your name or the character’s name.

    SillyTavern Hide And Unhide Messages

    For example, in our chat with 64 messages, we first used the /hide 30-64 command. The LLM could only access messages 1 to 29, and we manually generated a summary and saved it in a text editor.

    Next, we used /unhide 30-64 followed by /hide 1-29, so the LLM could only access messages 30 to 64. We removed the previous summary before manually generating a new one and saved the result in a text editor.

    Since SillyTavern’s default summary prompt instructs the LLM to build upon any existing summary, it is important to remove the previous summary before generating a new one. Alternatively, you can modify the summary prompt.

    Now we have two detailed summaries that we can merge and refine into a single one. We recommend breaking down your roleplay into chapters and using the /hide and /unhide commands to summarize messages in organized batches.
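
    Put together, the batch workflow for our 64-message chat looks like this (adjust the ranges to match your own chapters):

        /hide 30-64     (summarize messages 1-29 and save the summary externally, then clear it)
        /unhide 30-64
        /hide 1-29      (summarize messages 30-64 and save that summary as well)
        /unhide 1-29    (restore the full chat, then merge and refine the two summaries)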

    Manually generating summaries for long chats on SillyTavern in this manner gives you greater control over them and makes it easy to carry forward important details while removing unnecessary information.

    Manual Summaries Using A Free AI Assistant

    If generating detailed summaries in SillyTavern is tedious or your model isn’t capable of generating decent summaries, you can use a free AI assistant like DeepSeek, Gemini, or ChatGPT to help generate one.

    Note: We do not recommend using AI assistants to generate summaries if your chat contains NSFW content. NSFW content isn’t limited to sexual material; it also includes gore, drug-related topics, extreme violence, and more.

    SillyTavern Manage Chat Files

    Click on the Chat Options button at the bottom left of the chat page, then select Manage chat files.

    SillyTavern Export Chat Files

    Export your chat as a text file. You can then upload this text file and ask DeepSeek, Gemini, or ChatGPT to generate a summary of your ongoing roleplay.
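
    A request along these lines usually works well; it is only an example prompt, so tweak it to fit your story.

        The attached text file is the transcript of an ongoing roleplay. Summarize the key
        events, character developments, agreements, and unresolved plot threads in about
        300 words, written in the third person, so I can use it as a memory summary for
        continuing the roleplay.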

    Optimize Context Cache Post Summarization

    Once you’ve summarized your chat messages, you can optimize your context cache by removing them from the prompt. This helps you in several ways.

    • When you reach your Context Size limit (for example: 16,384 Tokens), SillyTavern begins removing older chat messages from your prompt to make space for new ones. Your input prompt constantly changes, and you might not benefit from the input tokens cache mechanism.
    • Roleplay messages include both narration and dialogue. Once summarized, narration and dialogue become unnecessary input tokens, wasting money and processing time. You don’t need previous messages to exist in your context cache word for word to maintain roleplay continuity.
    • An optimized context cache with a solid summary is better than an information dump of 30-35 messages filled with narration that doesn’t affect the story’s future (for example: “he looks at her,” “he speaks with a steady voice,” etc.).

    LLMs don’t pay equal attention to all information within the context window. They tend to have a bias towards information at the beginning and end of the prompt.

    After summarizing, retain the most recent 5 to 8 messages and hide the rest using the /hide xx-xx command to optimize your context cache during long chats on SillyTavern. Make sure your summary includes all the necessary information.
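
    For example, in our 64-message chat, /hide 1-56 keeps only the most recent eight messages (57 to 64) in the prompt alongside the summary.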

    Think of the summary as your notes too, which you can refer to for past details as the story progresses. Maintaining summaries that help you will also help the LLM.

    Use World Info (Lorebooks) For Long Chats On SillyTavern

    You can use World Info (Lorebooks) in SillyTavern for more than just lore and world-building. Create a lorebook and bind it to a chat using the book icon (Chat Lore button) on the character management tab. If you already have an existing lorebook bound to a chat, you can simply add entries to it.

    SillyTavern Bind Lorebook To Chat

    You can create lorebook entries that include details about important events from your long chats on SillyTavern. While the Summarize extension helps you maintain a summary of your entire conversation, Lorebooks allow you to inject more detailed information about specific past events whenever required.

    Let’s use our chat with Astrid as an example. We can create a lorebook entry for the blood contract between Astrid and Ragnar (messages #7 to #12) and insert it at the bottom of the prompt whenever triggered (at depth 0).

    While the Summarize extension briefly mentions the contract and its terms, our Lorebook entry can be more detailed and include all the terms agreed upon between Astrid and Ragnar, as well as the serious nature of forming a blood contract with a war god.

    Whenever we trigger the lorebook with its keywords, the LLM will receive detailed information about the contract, even during a long chat on SillyTavern with over 600 messages.
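
    As a rough illustration, the lorebook entry for the blood contract could be set up along these lines (the field names are approximate, and the keywords and content are placeholders for your own details):

        Keywords: blood contract, oath, Ragnar
        Content: A detailed account of the blood contract formed between Astrid and Ragnar in
                 messages #7 to #12, covering every term they agreed upon and the weight of
                 swearing such an oath to a war god.
        Position: at depth
        Depth: 0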

    World Info Strategies

    You can create lorebook entries containing detailed information about any past event you consider important.

    • First meeting with notable NPCs in the world, such as royalty, other factions, kingdoms, enemies, etc.
    • First visit to a new location, such as a keep, stronghold, kingdom, planet, etc.
    • Relationship developments (allies, contracts and agreements, moments between you and the character like a first kiss, etc.)
    • Wars, battles, or other significant moments.
    • Conclusions to story arcs/chapters.

    Inserting your lorebook entries at depth 0 works well with the input tokens cache mechanism. It doesn’t disrupt the entire prompt by causing a change higher up in the structure.

    You can either create the lorebook entries manually, or use the /hide xx-xx command to hide all unrelated messages and ask the LLM to generate a detailed summary of the remaining ones. Remember to delete all OOC (out-of-character) messages and responses once done. Alternatively, you can also use free AI assistants for this task.
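
    For the in-chat approach, an out-of-character request along these lines can generate the entry’s content (the phrasing is only a suggestion):

        [OOC: Pause the roleplay. Write a detailed summary of the currently visible messages,
        covering every event, agreement, and detail that matters for the story going forward.
        Reply out of character, without narration or dialogue.]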

    SillyTavern Manage Chat Files

    If you go the AI assistant route, the steps are the same as before. Click on the Chat Options button at the bottom left of the chat page, then select Manage chat files.

    SillyTavern Export Chat Files

    Export your chat as a text file. You can then upload it and ask DeepSeek, Gemini, or ChatGPT to generate a detailed summary of the relevant parts of your roleplay.

    Note: The same caution applies here; we do not recommend using AI assistants to generate summaries if your chat contains NSFW content.

    Think of lorebook entries for past events as sticky notes. You can create as many of them as you want, and these notes will help your LLM look up information related to a past event only when needed. It helps optimize your context cache and maintain memory and coherence in long chats on SillyTavern.

    Extensions To Manage Long Chats On SillyTavern

    If you don’t want to deal with the hassle of manually managing the Summarize extension or Lorebooks, you can use community-developed extensions to automate optimizing your context cache.

    Qvink’s Message Summarize

    Qvink’s SillyTavern Message Summarize extension is an alternative to the built-in Summarize extension.

    • The extension summarizes messages individually, not all at once, which helps reduce inaccuracies and missing details.
    • It maintains short-term memory that updates automatically over time, as well as long-term memories of messages you manually mark.
    • The extension supports using a separate completion preset or connection profile for generating summaries. For example, you can use a small local model for roleplay and a larger model through OpenRouter for summaries.
    • It automatically optimizes your context cache by hiding messages that have been summarized.

    You can learn more about the extension and its features through its documentation.

    Aiko Hanasaki’s Memory Books

    Aiko Hanasaki’s SillyTavern Memory Books extension helps you automatically create “memories.”

    • You mark the starting and ending messages (for example: 5-12), and the extension generates a summary and creates a lorebook entry.
    • The extension provides multiple presets that let you generate detailed or brief summaries.
    • The extension supports using different APIs and LLMs to generate memories through its profile management options.
    • You can automate memory creation at set intervals.
    • It automatically optimizes your context cache by hiding messages that have been summarized.

    You can learn more about the extension and its features through its documentation.

    How To Manage Long Chats On SillyTavern

    To enjoy long chats on SillyTavern, you need to optimize your context cache to keep the LLM coherent and your experience immersive and engaging. The frontend provides several tools to help you optimize your context cache.

    You can use the built-in Summarize extension to generate summaries using an LLM or manually enter them. To further optimize your context cache, you can hide summarized messages from your prompt to remove unnecessary tokens.

    For a more immersive experience during long chats, you can create lorebook entries that detail important story developments and past events. These entries can include in-depth information, rather than just a brief summary that the Summarize extension provides.

    To automate optimizing your context cache, you can use community-developed extensions like Qvink’s Message Summarize and Aiko Hanasaki’s Memory Books.


