Where Applied Intelligence, Research, and Humanity Merge

Rethinking AI Personas: The Grimoire Approach to Natural Dialogue

Creating better AI interactions isn't about mimicking human behavior. Grimoire, a lean 1500-character prompt architecture, shows how thoughtfully designed patterns can create more meaningful dialogue across any AI model.

Cultivating Deep Conversations with AI: The Grimoire Experiment

What happens when AI becomes a partner in exploration rather than just a tool for answers? Our experiments with Grimoire reveal how thoughtfully designed AI interactions can transform digital exchanges into genuine intellectual discourse.

Ongoing Field Trials

Word of Lore AI Field Trials are tournament-style, one-on-one competitions between various AI tools solving real-world problems.

AI Judgement—Evaluating an AI Arbitrator

In this AI Trial, we assess how well instruction-tuned generative models perform as judges. Human evaluators examine each model's capacity for rational, impartial, and ethical decision-making. The study aims to identify the most effective models and tools for AI-driven arbitration in complex scenarios.

Summarizing Articles with AI-Powered Content Condensation

Discover how AI is revolutionizing content consumption through advanced summarization tools. Explore ongoing trials across diverse domains as researchers assess these time-saving technologies, potentially transforming how we absorb information in our fast-paced digital world.

Email Generation for Work

An assessment of AI tools for work-related email composition. This analysis evaluates solutions across six key criteria to identify which tools best enhance professional communications.

AI-Generated Text Detection

This Field Trial examines tools and techniques for detecting AI-authored content, focusing on accuracy, robustness, and explainability—essential factors in maintaining content quality in the digital age.

Recent Face-Offs

Email Generation for Work: Llama 3.3 70B > Claude 3 Opus

The trial results indicate that both models performed strongly, with Llama 3.3 70B holding a slight overall advantage. Its superior formatting, more complete inclusion of detail, and more natural tone across contexts gave it the edge. Both models maintained high accuracy and consistency, but Llama 3.3 70B's stronger customization and authenticity made it more effective for professional email generation. The results suggest Llama 3.3 70B may be particularly suitable for complex business communications that require detailed information and a natural tone.

Leaderboard

The outcomes of AI trials are compiled into AI ratings, which each AI accumulates over time as it takes part in more trials. Technical details available.

Top 3 Models

Rank  Name               Rating  RD   W  L  D  Face-offs
1     Claude 3.5 Sonnet  1961    111  6  0  6  12
2     GPT-4o             1860    91   7  6  9  22
3     Mistral NeMo       1842    195  4  1  0  5

Top 3 Chats

Rank  Name             Rating  RD   W   L  D   Face-offs
1     Claude AI        2069    87   10  3  11  24
2     Mistral Le Chat  1795    113  8   3  5   16
3     Cohere Chat      1760    133  4   1  5   10

AI Publication with Purpose

Field Trials

The tech industry often suffers from excessive marketing and hype. More importantly, there is little clarity on how to apply AI in business or personal life. Our approach addresses both issues. We design field trials around practical, real-world use cases. These trials involve hand-crafted evaluation datasets and meticulous execution for reliable results. Our team combines editorial expertise with data science knowledge to provide valuable insights.

Face-Offs

Our trial execution process involves careful scrutiny of workflow pairings and execution methods. We manage everything from pairing nominees and scheduling face-offs to ensuring consistent execution and publishing results. This comprehensive effort culminates in a trial card that clearly demonstrates how different AI tools compare in direct competition.

Ratings

Ratings form the foundation of any competition, representing a competitor's performance. This principle applies equally to our AI trials. We have carefully selected a rating system designed to converge toward an accurate representation of each AI tool's true capabilities.
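
For readers curious how a rating of this kind can converge, here is a minimal sketch. The leaderboard reports a Rating and an RD column, and RD commonly stands for "rating deviation" in Glicko-style systems. This page does not name the rating system, so treating it as Glicko-1 is an assumption made purely for illustration. The function names below are hypothetical. The sketch covers only the per-period update and omits the step that lets RD grow during inactivity.

```python
import math

# Assumption: a Glicko-1 style update, chosen only because the leaderboard
# shows "Rating" and "RD" columns. The site may use a different system.
Q = math.log(10) / 400  # Glicko-1 scaling constant


def g(rd):
    """Dampening factor: the less certain the opponent's rating, the less a game counts."""
    return 1 / math.sqrt(1 + 3 * Q**2 * rd**2 / math.pi**2)


def expected_score(rating, opp_rating, opp_rd):
    """Expected score (between 0 and 1) against a single opponent."""
    return 1 / (1 + 10 ** (-g(opp_rd) * (rating - opp_rating) / 400))


def glicko1_update(rating, rd, results):
    """Return (new_rating, new_rd) after one rating period.

    results is a list of (opp_rating, opp_rd, score) tuples,
    where score is 1.0 for a win, 0.5 for a draw, 0.0 for a loss.
    """
    if not results:
        return rating, rd
    inv_d2 = 0.0   # accumulates 1 / d^2 (information gained from the games)
    surprise = 0.0  # accumulates g * (actual score - expected score)
    for opp_rating, opp_rd, score in results:
        g_j = g(opp_rd)
        e_j = expected_score(rating, opp_rating, opp_rd)
        inv_d2 += Q**2 * g_j**2 * e_j * (1 - e_j)
        surprise += g_j * (score - e_j)
    denom = 1 / rd**2 + inv_d2
    new_rating = rating + (Q / denom) * surprise
    new_rd = math.sqrt(1 / denom)
    return new_rating, new_rd


if __name__ == "__main__":
    # Hypothetical tool rated 1500 (RD 200): one win, then two losses.
    games = [(1400, 30, 1.0), (1550, 100, 0.0), (1700, 300, 0.0)]
    print(glicko1_update(1500, 200, games))
```

In a scheme like this, every face-off shrinks RD, so a tool with few face-offs carries more uncertainty than one with many, and its rating moves further per result.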

Leaderboard

Top-performing AI tools earn their place at the summit of our leaderboard. This ranking system allows our community to identify and build confidence in the AI solutions that excel in areas of greatest importance.

Insights

Insights will feature top-performing workflows, practitioners' stories, and practical news from the AI industry. We curate content based on its usefulness and applicability, ensuring you receive valuable information without unnecessary noise.

Newsletters

Subscribers gain priority access to all the benefits mentioned above, with some content available exclusively to our newsletter audience.

Empowering Your AI Journey

Join our community of AI enthusiasts and unlock AI's potential in your work, without the confusion.

Free Membership

Word of Lore members enjoy several complimentary benefits. These include newsletters featuring new trial releases and AI insights, participation in community discussions, and the ability to nominate field trials. Sign up for free to become a member and access these perks.
