Roleplay With AI

    Model Feature Articles And What Matters

    In our model feature articles, we observe how LLMs perform in AI roleplay across five different character cards and scenarios. Our conversations with the LLMs range from 12 to over 60 messages, depending on the scenario.

    AI roleplay is a diverse hobby, and everyone has their own preferences and expectations of LLMs. We share our observations and conclusions, along with chat logs, so that readers can draw their own conclusions about a model’s performance.

    Not An “Objective” Review

    We don’t “objectively” review models. Our conclusions are based on our experience, understanding of the character and scenario, and personal preferences. It’s subjective and isn’t meant to be a standardized benchmark.

    Personal preference (and “vibes”) plays a huge role in how much someone enjoys a specific model. We roleplay with the character cards using a model, and our “hands-on” experience shapes our observations and conclusions.

    We don’t use another LLM to judge or score the roleplay. An LLM judge can’t account for how many times we had to re-roll for the perfect response, the edits we made to the model’s replies, or the other factors that influence the overall experience.

    What Matters

    • Character Adherence: How well does a model depict a character? Does it stay true to the character’s core traits? How often does it act out of character?
    • Character Portrayal: Do the dialogue and narration help the character feel unique and not generic?
    • Character Consistency: Does the model make decisions that align with the character’s core traits? Does it abandon character traits to push the story forward?
    • Character Depth: How effectively does the model use information from the character card? Does it naturally incorporate backstories and other details?
    • Theme Adherence: How well do the model’s responses align with the theme of the roleplay (e.g., medieval, fantasy, or comedy)?
    • User Effort: How much effort did the user need to put in to receive good responses?
    • Instruction Following: How well does the model follow instructions?
    • Vibes: The overall “fun” factor during the conversation.

    What Doesn’t Play A (Major) Role

    • Longform or Creative Writing: Judging a model’s writing quality adds another layer of subjectivity. Unless any glaring issues negatively affect the experience, the quality of the writing does not play a major role. We are collaborating with a co-author in a turn-based roleplay, not writing novels.
    • Slop: Repetition and slop affect the overall “fun” factor during the conversation and play some role in our conclusions. However, they are not major factors, and we often give smaller models more leeway.

    Model Size

    We don’t hold an 8B model to the same standard as a 15B. Nor do we hold a 24B model to the same standard as a 70B. Smaller models are given more leeway, and we are more forgiving of their mistakes and flaws. The larger the model, the higher our expectations.

    Conclusion “Grading”

    Conclusions are based on our experience and observations, and we keep the “grading” simple.

    • Decent: The model didn’t consistently meet the What Matters criteria. 
    • Good: The model met the What Matters criteria but could have performed better in some areas.
    • Great: The model met the What Matters criteria and performed in line with our expectations.

    Conversation Logs

    All of our model feature articles include complete conversation logs between us and the character using the model we are writing about.

    Additionally, we enhance user input with an AI assistant to maintain a consistent input style. As humans, we have days when we lack creativity, and factors such as how tired we are and how well we can concentrate can drastically affect our input. The AI assistant helps keep our input consistent.

    The AI assistant only enhances our input. We are in complete control of the narrative. For transparency, we also publish our conversation logs with the AI assistant.

    © 2026 RPWithAI. All rights reserved.
