continuous-skill-polishing
About
This Claude Skill converts repeated developer tasks into version-controlled skills using Git, enabling iterative refinement based on performance gaps and feedback. It's used to identify recurring patterns, improve consistency, and track skill evolution, achieving a 20% consistency gain. Trigger it with phrases like "improve skill" or "refine pattern" when addressing repeated tasks or failure cases.
Quick Install
Claude Code
Recommended (plugin install):
/plugin add https://github.com/majiayu000/claude-skill-registry

Or clone manually:
git clone https://github.com/majiayu000/claude-skill-registry.git ~/.claude/skills/continuous-skill-polishing

Copy and paste one of these commands in Claude Code to install this skill.
Documentation
Continuous Skill Polishing
Purpose
Convert repeated tasks into skills, version control them in Git, and polish iteratively based on usage feedback to achieve 20% consistency gain.
When to Use
- Identifying repeated task patterns
- Improving skill consistency
- Addressing skill failure cases
- Tracking skill evolution over time
- Refining based on user feedback
- Building team-specific expertise
Core Instructions
Step 1: Identify Repeated Task
def identify_repeated_tasks(task_history):
    """
    Analyze task history to find repeated patterns
    """
    # Cluster similar tasks
    clusters = cluster_tasks_by_similarity(task_history)

    # Find clusters with multiple instances
    repeated = [
        cluster for cluster in clusters
        if len(cluster) >= 3  # Repeated 3+ times
    ]
    return repeated

def should_create_skill(task_cluster):
    """
    Determine if a cluster warrants a skill
    """
    criteria = {
        'frequency': len(task_cluster) >= 3,                  # Repeated often
        'consistency': has_consistent_pattern(task_cluster),  # Similar approach
        'complexity': is_sufficiently_complex(task_cluster),  # Not trivial
        'value': provides_reusable_value(task_cluster)        # Worth capturing
    }
    return all(criteria.values())
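The clustering and scoring helpers above are left abstract. As a minimal sketch (an assumption, not part of the skill), cluster_tasks_by_similarity could be approximated with difflib when tasks are plain-text descriptions:

from difflib import SequenceMatcher

def cluster_tasks_by_similarity(task_history, threshold=0.6):
    """Greedy string-similarity clustering: each task joins the first
    cluster whose representative is similar enough, else starts a new
    one. The 0.6 threshold is an illustrative assumption."""
    clusters = []
    for task in task_history:
        for cluster in clusters:
            if SequenceMatcher(None, task, cluster[0]).ratio() >= threshold:
                cluster.append(task)
                break
        else:  # no existing cluster matched
            clusters.append([task])
    return clusters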
Step 2: Create Initial Skill
# Identify repeated task
task="Transform CSV to JSON with validation"
# Create skill directory
mkdir -p .claude/skills/csv-to-json-transformer
# Generate initial SKILL.md
cat > .claude/skills/csv-to-json-transformer/SKILL.md << 'EOF'
---
name: csv-to-json-transformer
description: Transform CSV files to JSON with schema validation...
---
# CSV to JSON Transformer
[Initial implementation based on repeated pattern]
EOF
# Version control
git add .claude/skills/csv-to-json-transformer
git commit -m "feat: add csv-to-json-transformer skill (v1.0.0)"
Step 3: Track Usage and Failures
from datetime import datetime

class SkillTracker:
    """Track skill usage and performance"""

    def __init__(self, skill_name):
        self.skill_name = skill_name
        self.usage_log = []

    def log_usage(self, task, result):
        """Log each skill usage"""
        self.usage_log.append({
            'timestamp': datetime.now(),
            'task': task,
            'success': result.success,
            'issues': result.issues,
            'feedback': result.user_feedback
        })

    def analyze_failures(self):
        """Identify patterns in failures"""
        failures = [
            log for log in self.usage_log
            if not log['success']
        ]

        # Group by issue type
        issues_by_type = {}
        for failure in failures:
            for issue in failure['issues']:
                issue_type = classify_issue(issue)
                if issue_type not in issues_by_type:
                    issues_by_type[issue_type] = []
                issues_by_type[issue_type].append(failure)
        return issues_by_type

    def calculate_consistency(self):
        """Calculate skill consistency score"""
        if not self.usage_log:
            return 0.0
        successes = sum(1 for log in self.usage_log if log['success'])
        return successes / len(self.usage_log)
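A minimal usage sketch; the result object and the classify_issue helper are stand-in assumptions for whatever your skill runs actually produce:

from types import SimpleNamespace

# Hypothetical stand-in, not defined by the skill itself:
# classify an issue by its leading exception name.
def classify_issue(issue):
    return issue.split(' ')[0]  # e.g. 'UnicodeDecodeError'

tracker = SkillTracker('csv-to-json-transformer')
result = SimpleNamespace(success=False,
                         issues=['UnicodeDecodeError reading orders.csv'],
                         user_feedback='failed on a Latin-1 file')
tracker.log_usage(task='Transform orders.csv to JSON', result=result)

print(f'Consistency: {tracker.calculate_consistency():.0%}')
for issue_type, failures in tracker.analyze_failures().items():
    print(f'{issue_type}: {len(failures)} occurrence(s)')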
Step 4: Polish Based on Feedback
# Analyze failure patterns
python analyze_skill.py csv-to-json-transformer
# Output:
# Issue type: encoding_errors (3 occurrences)
# Issue type: missing_fields (5 occurrences)
# Issue type: malformed_csv (2 occurrences)
# Update skill to address issues
cat >> .claude/skills/csv-to-json-transformer/SKILL.md << 'EOF'

## Error Handling

### Encoding Issues

Handle various encodings (UTF-8, Latin-1, etc.):

```python
encodings = ['utf-8', 'latin-1', 'cp1252']
df = None
for encoding in encodings:
    try:
        df = pd.read_csv(file, encoding=encoding)
        break
    except UnicodeDecodeError:
        continue
if df is None:
    raise ValueError(f"could not decode {file} with any supported encoding")
```

### Missing Fields

Validate against schema and provide defaults:

```python
required_fields = ['id', 'name', 'email']
for field in required_fields:
    if field not in df.columns:
        df[field] = None  # or a sensible default value
```
EOF

# Version control the improvement
git add .claude/skills/csv-to-json-transformer/SKILL.md
git commit -m "polish: improve error handling for encoding and missing fields"
git tag csv-to-json-v1.1.0
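The analyze_skill.py script invoked at the top of this step is not included with the skill. A minimal sketch, assuming usage logs are stored as JSON lines with success and issues fields:

#!/usr/bin/env python3
"""Sketch of analyze_skill.py: summarize failure types from a usage log.
The log path and record layout are assumptions, not part of the skill."""
import json
import sys
from collections import Counter
from pathlib import Path

skill_name = sys.argv[1]
log_path = Path(f'.claude/skills/{skill_name}/usage.jsonl')

counts = Counter()
for line in log_path.read_text().splitlines():
    record = json.loads(line)
    if not record['success']:
        counts.update(record['issues'])

for issue_type, n in counts.most_common():
    print(f'Issue type: {issue_type} ({n} occurrences)')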
Step 5: Measure Improvement
def measure_polish_impact(skill_name, before_version, after_version):
    """
    Compare skill performance before and after polishing
    """
    before_logs = load_usage_logs(skill_name, before_version)
    after_logs = load_usage_logs(skill_name, after_version)

    before_consistency = calculate_consistency(before_logs)
    after_consistency = calculate_consistency(after_logs)

    improvement = ((after_consistency - before_consistency) /
                   before_consistency * 100)

    return {
        'before': before_consistency,
        'after': after_consistency,
        'improvement_percent': improvement
    }
# Example output:
# {
#     'before': 0.75,              # 75% success rate
#     'after': 0.90,               # 90% success rate
#     'improvement_percent': 20.0  # 20% relative improvement
# }                                # (i.e. +15 percentage points)
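measure_polish_impact relies on two helpers that are not defined above. A minimal sketch, assuming per-version JSON-lines logs:

import json
from pathlib import Path

def load_usage_logs(skill_name, version):
    """Load the usage log for one skill version (path layout is an assumption)."""
    path = Path(f'.claude/skills/{skill_name}/logs/{version}.jsonl')
    return [json.loads(line) for line in path.read_text().splitlines()]

def calculate_consistency(logs):
    """Free-function variant of SkillTracker.calculate_consistency."""
    if not logs:
        return 0.0
    return sum(1 for log in logs if log['success']) / len(logs)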
Polish Workflow
1. CREATE
└─> Initial skill from repeated pattern
└─> Git commit: "feat: add skill (v1.0.0)"
2. USE
└─> Apply skill to tasks
└─> Track successes and failures
3. ANALYZE
└─> Identify failure patterns
└─> Group by issue type
└─> Prioritize most common issues
4. POLISH
└─> Update skill to address issues
└─> Add error handling
└─> Improve instructions
└─> Add examples
└─> Git commit: "polish: [specific improvement]"
5. MEASURE
└─> Compare before/after consistency
└─> If improved: keep changes
└─> If worse: revert and try a different approach (see the sketch after this workflow)
6. REPEAT
└─> Continue polishing based on new feedback
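The keep-or-revert decision in step 5 can be automated. A sketch building on measure_polish_impact; the tag and path arguments are assumptions:

import subprocess

def keep_or_revert(skill_name, before_tag, after_tag, skill_path):
    """Keep the polish if consistency improved, otherwise restore the
    skill file from the previous tagged version."""
    impact = measure_polish_impact(skill_name, before_tag, after_tag)
    if impact['improvement_percent'] < 0:
        subprocess.run(['git', 'checkout', before_tag, '--', skill_path],
                       check=True)
        return 'reverted'
    return 'kept'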
Versioning Strategy
# Semantic versioning for skills
# v[MAJOR].[MINOR].[PATCH]
# MAJOR: Breaking changes to skill interface
git commit -m "feat!: change required fields (BREAKING CHANGE)"
git tag csv-to-json-v2.0.0
# MINOR: New features, backwards compatible
git commit -m "feat: add Excel support"
git tag csv-to-json-v1.2.0
# PATCH: Bug fixes, polishing
git commit -m "fix: handle empty CSV files"
git tag csv-to-json-v1.1.1
Example: Skill Evolution
v1.0.0 - Initial
---
name: csv-to-json-transformer
description: Transform CSV to JSON
---
# Basic CSV to JSON transformation
Consistency: 75%
v1.1.0 - Error Handling Polish
---
name: csv-to-json-transformer
description: Transform CSV to JSON with encoding detection and validation
---
# CSV to JSON with error handling
- Handle multiple encodings
- Validate required fields
Consistency: 85% (+10%)
v1.2.0 - Feature Addition
---
name: csv-to-json-transformer
description: Transform CSV/Excel to JSON with schema validation and data cleaning
---
# CSV/Excel to JSON
- Multiple format support
- Schema validation
- Data cleaning and normalization
Consistency: 90% (+5%)
Total improvement: v1.0.0 → v1.2.0 = 75% → 90% consistency, a 20% relative gain (+15 percentage points)
Best Practices
When to Create a Skill
- Task repeated 3+ times
- Consistent approach across instances
- Sufficiently complex (not trivial)
- Provides reusable value
When to Polish
- Failure rate > 10% (see the sketch after this list)
- User feedback indicates confusion
- New edge cases discovered
- A better competing approach emerges
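The failure-rate trigger can be checked mechanically with the SkillTracker from Step 3; a minimal sketch (the other triggers need human judgment):

def should_polish(tracker, failure_threshold=0.10):
    """Flag a skill for polishing once its observed failure rate
    exceeds the 10% trigger above."""
    if not tracker.usage_log:
        return False
    return (1 - tracker.calculate_consistency()) > failure_threshold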
Polishing Priorities
- Fix high-frequency failures first
- Improve unclear instructions
- Add missing error handling
- Provide better examples
- Optimize performance
Git Workflow
- Use feature branches for major changes
- Tag stable versions
- Keep CHANGELOG.md updated
- Document breaking changes
Performance Characteristics
- 20% consistency improvement through iterative polishing
- Git version history provides complete evolution trail
- Measurable progress via before/after metrics
- Team collaboration through shared skill repository
Version
v1.0.0 (2025-10-23) - Based on skill evolution patterns
Related Skills
sglang
SGLang is a high-performance LLM serving framework that specializes in fast, structured generation for JSON, regex, and agentic workflows using its RadixAttention prefix caching. It delivers significantly faster inference, especially for tasks with repeated prefixes, making it ideal for complex, structured outputs and multi-turn conversations. Choose SGLang over alternatives like vLLM when you need constrained decoding or are building applications with extensive prefix sharing.
evaluating-llms-harness
This Claude Skill runs the lm-evaluation-harness to benchmark LLMs across 60+ standardized academic tasks like MMLU and GSM8K. It's designed for developers to compare model quality, track training progress, or report academic results. The tool supports various backends including HuggingFace and vLLM models.
langchain
LangChain is a framework for building LLM applications using agents, chains, and RAG pipelines. It supports multiple LLM providers, offers 500+ integrations, and includes features like tool calling and memory management. Use it for rapid prototyping and deploying production systems like chatbots, autonomous agents, and question-answering services.
llamaguard
LlamaGuard is Meta's 7-8B parameter model for moderating LLM inputs and outputs across six safety categories like violence and hate speech. It offers 94-95% accuracy and can be deployed using vLLM, Hugging Face, or Amazon SageMaker. Use this skill to easily integrate content filtering and safety guardrails into your AI applications.
