constitutional-ai

majiayu000
Updated 4 days ago
Tags: Other, Safety Alignment, Constitutional AI, RLAIF, Self-Critique, Harmlessness, Anthropic, AI Safety, RL From AI Feedback, Claude

About

This skill implements Anthropic's Constitutional AI method for training harmless AI models through self-critique and revision. The method has two phases: supervised learning on responses that the model has critiqued and revised itself, followed by RLAIF (Reinforcement Learning from AI Feedback) for safety alignment. Use it to reduce harmful outputs in your Claude applications without requiring human-labeled harmful data.
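The two phases can be sketched roughly as follows. This is a minimal illustration, not the skill's actual code: the `Model` type, the `PRINCIPLES` list, the prompt wording, and the functions `critique_and_revise`, `ai_preference`, and `toy_model` are all hypothetical names chosen for the example. Phase one produces revised responses for supervised fine-tuning; phase two asks a judge model to choose between two responses, yielding the preference pairs that RLAIF trains on.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical interface: any function that maps a prompt string to a completion.
Model = Callable[[str], str]

# Illustrative principles only; not an official constitution.
PRINCIPLES: List[str] = [
    "Identify ways the response is harmful, unethical, or dangerous.",
    "Identify ways the response could be more honest and respectful.",
]


@dataclass
class SLExample:
    """One supervised fine-tuning example: a prompt and its revised response."""
    prompt: str
    revised_response: str


def critique_and_revise(model: Model, prompt: str, principles: List[str]) -> SLExample:
    """Phase 1: draft a response, then critique and revise it once per principle."""
    response = model(prompt)
    for principle in principles:
        critique = model(
            f"Prompt: {prompt}\nResponse: {response}\nCritique request: {principle}"
        )
        response = model(
            f"Prompt: {prompt}\nResponse: {response}\nCritique: {critique}\n"
            "Revision request: Rewrite the response to address the critique."
        )
    return SLExample(prompt=prompt, revised_response=response)


def ai_preference(judge: Model, prompt: str, a: str, b: str, principle: str) -> Tuple[str, str]:
    """Phase 2: ask a judge model which response is better; return (chosen, rejected)."""
    verdict = judge(
        f"Prompt: {prompt}\n(A) {a}\n(B) {b}\nPrinciple: {principle}\n"
        "Which response better follows the principle? Answer A or B."
    )
    return (a, b) if verdict.strip().upper().startswith("A") else (b, a)


def toy_model(text: str) -> str:
    """Stand-in for a real model so the sketch runs without any API access."""
    if "Answer A or B" in text:
        return "B"
    if "Revision request:" in text:
        return "I can't help with that, but here is a safer alternative."
    if "Critique request:" in text:
        return "The response gives unsafe instructions."
    return "Sure, here is how to do it."


if __name__ == "__main__":
    example = critique_and_revise(toy_model, "How do I pick a lock?", PRINCIPLES)
    print(example.revised_response)
    print(ai_preference(toy_model, "How do I pick a lock?", "draft", "revision", PRINCIPLES[0]))
```

In a real pipeline, `toy_model` would be replaced by calls to an actual language model, the `SLExample` records would feed a fine-tuning job, and the (chosen, rejected) pairs would train a preference model for the reinforcement learning stage.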

Quick Install

Claude Code

Primary (recommended):
npx skills add majiayu000/claude-skill-registry -a claude-code
Alternative (plugin command):
/plugin add https://github.com/majiayu000/claude-skill-registry
Alternative (Git clone):
git clone https://github.com/majiayu000/claude-skill-registry.git ~/.claude/skills/constitutional-ai

Copy and paste this command in Claude Code to install this skill

GitHub Repository

majiayu000/claude-skill-registry
Path: skills/constitutional-ai

Related Skills