Email Generation for Work: Claude 3.5 Sonnet vs Llama 3.1 70B

by ¶.ai Research Team

September 09, 2024

Email Quality: Claude 3.5 Sonnet > Llama 3.1 70B

Both AIs consistently produced grammatically correct, clear, and coherent emails
Claude 3.5 Sonnet often provided significantly more comprehensive and better-structured responses, particularly in complex scenarios like internal policy announcements and sales pitches
Llama 3.1 70B occasionally maintained a more personal tone in certain scenarios, especially in customer service contexts
Both AIs showed equal proficiency in crafting job interview follow-up emails

Accuracy and Information Integrity: Claude 3.5 Sonnet ≳ Llama 3.1 70B

Both AIs adhered well to the provided information without fabrication
Claude 3.5 Sonnet consistently provided more detailed and specific information, notably in welcome emails to new employees and event cancellation notices

Relevance and Customization: Claude 3.5 Sonnet ≳ Llama 3.1 70B

Both AIs demonstrated good ability to understand and address specific instructions
Claude 3.5 Sonnet showed superior tailoring in complex situations, such as collaboration requests between departments
Llama 3.1 70B excelled in maintaining a more personal and empathetic tone in customer-focused scenarios
Both AIs showed willingness to offer substantial compensation in customer complaint scenarios

Consistency: Claude 3.5 Sonnet ≳ Llama 3.1 70B

Both AIs maintained a uniform voice across multiple email types
Claude 3.5 Sonnet demonstrated more consistent quality across varying complexity levels, particularly in handling apology emails for missed deadlines
Llama 3.1 70B showed strong adaptability in matching formal tones when required

User Experience: Claude 3.5 Sonnet > Llama 3.1 70B

Claude 3.5 Sonnet accepts attached documents, guides, and examples in many formats
Llama 3.1 70B has no native support for document attachments, which makes prompts harder to customize
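When attachments are not available, one common workaround is to paste the reference material directly into the prompt text. The sketch below illustrates that idea only. It is not the procedure used in this trial, and the function name `build_email_prompt`, the delimiter style, and the `max_chars` limit are all hypothetical choices made for the example.

```python
def build_email_prompt(task: str, reference_docs: dict[str, str], max_chars: int = 4000) -> str:
    """Inline reference documents into a plain-text prompt.

    Hypothetical helper: the delimiter format and the max_chars limit
    are assumptions for illustration, not requirements of any model.
    """
    sections = []
    for name, text in reference_docs.items():
        # Truncate each document so the prompt stays a manageable size.
        snippet = text.strip()[:max_chars]
        sections.append(f"--- {name} ---\n{snippet}")
    context = "\n\n".join(sections)
    return (
        "Use only the reference material below when writing the email.\n\n"
        f"{context}\n\n"
        f"Task: {task}\n"
    )


prompt = build_email_prompt(
    "Write a welcome email for a new employee starting Monday.",
    {"onboarding_guide.txt": "Laptop pickup is at the front desk. Orientation starts at 9:00."},
)
print(prompt)
```

The resulting string can be sent to any text-only model. The trade-off is that long documents consume context space and must be trimmed or summarized by hand, which is one reason prompts are harder to customize without attachment support.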

Authenticity: Claude 3.5 Sonnet ≈ Llama 3.1 70B

Both AIs generally matched appropriate tones and styles for different email contexts
Llama 3.1 70B outperformed Claude 3.5 Sonnet in some customer service scenarios by using a more natural, conversational tone and I-statements

Conclusion: Claude 3.5 Sonnet > Llama 3.1 70B

Both Claude 3.5 Sonnet and Llama 3.1 70B performed well in the work email generation trial, with Claude 3.5 Sonnet showing a clear edge in comprehensiveness, structure, and handling of complex scenarios. Claude 3.5 Sonnet particularly excelled in formal communications and detailed explanations, and its stronger formatting made its emails easier to read. Llama 3.1 70B's main strength was maintaining a personal tone, especially in customer service contexts, where it often achieved a more natural, conversational style. While Claude 3.5 Sonnet generally provided more detailed and structured responses, Llama 3.1 70B showed surprising adaptability in matching formal tones when required.