gemma_telemetry_retention_detector
About
This skill provides fast binary classification of YouTube telemetry records to determine retention strategy. It uses pattern matching to scan heartbeat data as the first phase in a cleanup workflow. Developers should use it for quick initial classification before passing records to downstream agents for retention execution.
Documentation
Gemma Telemetry Retention Detector
Purpose: Fast pattern matching to classify YouTube DAE heartbeat records for retention strategy
Architecture: Phase 1 of Gemma→Qwen→0102 cleanup wardrobe pattern
WSP Compliance
- WSP 77: Agent Coordination (Gemma fast classification → Qwen strategy)
- WSP 91: DAEMON Observability (telemetry lifecycle management)
- WSP 96: WRE Skills Wardrobe (autonomous cleanup execution)
Task Description
Scan the youtube_heartbeats table in data/foundups.db and classify each record into one of four retention categories using fast age-based pattern matching.
Input Contract
{
"database_path": "data/foundups.db",
"table": "youtube_heartbeats",
"scan_limit": 1000,
"current_timestamp": "2025-10-27T20:00:00Z"
}
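A minimal sketch of how a caller might parse and validate this input before scanning. The `ScanRequest` class and `parse_scan_request` function are hypothetical names introduced for illustration; only the four field names come from the contract above.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ScanRequest:
    """Hypothetical typed view of the input contract."""
    database_path: str
    table: str
    scan_limit: int
    current_timestamp: datetime


def parse_scan_request(raw: str) -> ScanRequest:
    data = json.loads(raw)
    scan_limit = int(data["scan_limit"])
    if scan_limit <= 0:
        raise ValueError("scan_limit must be a positive integer")
    # Older Python versions do not accept a trailing 'Z', so normalize it.
    now = datetime.fromisoformat(data["current_timestamp"].replace("Z", "+00:00"))
    if now.tzinfo is None:
        raise ValueError("current_timestamp must carry a UTC offset")
    return ScanRequest(data["database_path"], data["table"], scan_limit, now)


example = """{
  "database_path": "data/foundups.db",
  "table": "youtube_heartbeats",
  "scan_limit": 1000,
  "current_timestamp": "2025-10-27T20:00:00Z"
}"""
request = parse_scan_request(example)
```

Passing `current_timestamp` explicitly, rather than reading the clock inside the scan, keeps a run reproducible.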
Classification Rules (Fast Binary Decisions)
Rule 1: Recent Activity (KEEP)
- Age: < 30 days
- Pattern: High training value, operational visibility
- Binary decision:
category = "keep_recent"
Rule 2: Training Data (KEEP)
- Age: 30-90 days
- Pattern: Historical patterns for Gemma learning
- Binary decision:
category = "keep_training"
Rule 3: Archivable (ARCHIVE)
- Age: 91-365 days
- Pattern: Historical value but low operational need
- Binary decision:
category = "archive_candidate"
Rule 4: Purgeable (PURGE)
- Age: > 365 days
- Pattern: Minimal value, disk space reclamation
- Binary decision:
category = "purge_candidate"
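The thresholds above depend on how a record's age is reduced to whole days. The sketch below assumes the same approach as the Execution Logic section, where `timedelta.days` floors the age, and checks the values right at the 30-day boundary. The `age_in_days` helper is an illustrative name, not part of the skill.

```python
from datetime import datetime, timedelta, timezone

now = datetime(2025, 10, 27, 20, 0, tzinfo=timezone.utc)


def age_in_days(ts: datetime, now: datetime) -> int:
    # timedelta.days floors, so 29 days 23:59:59 counts as 29 days.
    return (now - ts).days


just_under_30 = now - timedelta(days=30) + timedelta(seconds=1)
exactly_30 = now - timedelta(days=30)
one_hour_ahead = now + timedelta(hours=1)  # e.g. clock skew on the writer

ages = {
    "just_under_30": age_in_days(just_under_30, now),
    "exactly_30": age_in_days(exactly_30, now),
    "one_hour_ahead": age_in_days(one_hour_ahead, now),
}
```

Under this assumption a record becomes `keep_training` the moment it is exactly 30 days old, and a timestamp slightly in the future gets a negative age, which still falls under Rule 1.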
Output Contract
{
"scan_timestamp": "2025-10-27T20:00:00Z",
"total_records_scanned": 3719,
"categories": {
"keep_recent": {
"count": 1200,
"age_range_days": "0-30",
"disk_mb": 45
},
"keep_training": {
"count": 1500,
"age_range_days": "30-90",
"disk_mb": 95
},
"archive_candidate": {
"count": 800,
"age_range_days": "91-365",
"disk_mb": 70
},
"purge_candidate": {
"count": 219,
"age_range_days": ">365",
"disk_mb": 19
}
},
"recommendation": "archive_and_vacuum",
"estimated_reclaim_mb": 89,
"confidence": 0.95
}
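The example output is internally consistent: the category counts sum to `total_records_scanned`, and `estimated_reclaim_mb` equals the disk usage of the archive and purge categories combined. A downstream consumer might check these invariants before acting on a report. This is a sketch only; `check_report` is a hypothetical helper.

```python
def check_report(report: dict) -> list:
    """Return a list of human-readable problems; empty means consistent."""
    problems = []
    cats = report["categories"]

    total = sum(c["count"] for c in cats.values())
    if total != report["total_records_scanned"]:
        problems.append(f"counts sum to {total}, not {report['total_records_scanned']}")

    reclaim = cats["archive_candidate"]["disk_mb"] + cats["purge_candidate"]["disk_mb"]
    if abs(reclaim - report["estimated_reclaim_mb"]) > 1e-6:
        problems.append(f"reclaim should be {reclaim} MB")

    if not 0.0 <= report["confidence"] <= 1.0:
        problems.append("confidence must be within [0, 1]")
    return problems


example_report = {
    "total_records_scanned": 3719,
    "categories": {
        "keep_recent": {"count": 1200, "disk_mb": 45},
        "keep_training": {"count": 1500, "disk_mb": 95},
        "archive_candidate": {"count": 800, "disk_mb": 70},
        "purge_candidate": {"count": 219, "disk_mb": 19},
    },
    "estimated_reclaim_mb": 89,
    "confidence": 0.95,
}
problems = check_report(example_report)
```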
Execution Logic (Gemma Implementation)
from contextlib import closing
from datetime import datetime, timezone
import sqlite3

# Rough per-record disk estimate in MB (heuristic, not measured)
MB_PER_RECORD = 0.06


def classify_heartbeat_age(timestamp_iso: str, now: datetime) -> str:
    """Fast binary classification by age"""
    ts = datetime.fromisoformat(timestamp_iso.replace('Z', '+00:00'))
    if ts.tzinfo is None:
        # Assumption: timestamps stored without an offset are UTC.
        # Without this, subtracting from an aware `now` raises TypeError.
        ts = ts.replace(tzinfo=timezone.utc)
    age_days = (now - ts).days
    if age_days < 30:
        return "keep_recent"
    elif age_days < 91:
        return "keep_training"
    elif age_days < 366:
        return "archive_candidate"
    else:
        return "purge_candidate"


def _get_age_range(category: str) -> str:
    ranges = {
        "keep_recent": "0-30",
        "keep_training": "30-90",
        "archive_candidate": "91-365",
        "purge_candidate": ">365"
    }
    return ranges[category]


def scan_telemetry_retention(db_path: str) -> dict:
    """Gemma fast scan for retention categories"""
    # closing() releases the connection even if the query fails
    with closing(sqlite3.connect(db_path)) as conn:
        # Get all heartbeat timestamps
        rows = conn.execute(
            "SELECT timestamp FROM youtube_heartbeats ORDER BY timestamp DESC"
        ).fetchall()
    now = datetime.now(timezone.utc)
    categories = {
        "keep_recent": [],
        "keep_training": [],
        "archive_candidate": [],
        "purge_candidate": []
    }
    # Fast classification loop
    for (ts,) in rows:
        category = classify_heartbeat_age(ts, now)
        categories[category].append(ts)
    reclaimable = len(categories["archive_candidate"]) + len(categories["purge_candidate"])
    # Generate output
    return {
        "scan_timestamp": now.isoformat(),
        "total_records_scanned": len(rows),
        "categories": {
            cat: {
                "count": len(records),
                "age_range_days": _get_age_range(cat),
                "disk_mb": round(len(records) * MB_PER_RECORD, 2)  # Rough estimate
            }
            for cat, records in categories.items()
        },
        "recommendation": "archive_and_vacuum" if len(categories["archive_candidate"]) > 500 else "no_action",
        "estimated_reclaim_mb": round(reclaimable * MB_PER_RECORD, 2),
        "confidence": 0.95
    }
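The implementation above reads every row and takes the current time from the system clock, whereas the input contract also carries `scan_limit` and `current_timestamp`. One possible variant, shown here as a sketch rather than the skill's actual behavior, pushes the age bucketing into SQLite and honors both fields. The function names and the in-memory demo table are assumptions for illustration.

```python
import sqlite3
from contextlib import closing

# Whole-day age is computed from Unix seconds with integer division,
# then bucketed with the same thresholds as the classification rules.
COUNT_SQL = """
WITH aged AS (
    SELECT (CAST(strftime('%s', :now) AS INTEGER)
            - CAST(strftime('%s', timestamp) AS INTEGER)) / 86400 AS age_days
    FROM (SELECT timestamp FROM youtube_heartbeats
          ORDER BY timestamp DESC LIMIT :scan_limit)
)
SELECT CASE
           WHEN age_days < 30  THEN 'keep_recent'
           WHEN age_days < 91  THEN 'keep_training'
           WHEN age_days < 366 THEN 'archive_candidate'
           ELSE 'purge_candidate'
       END AS category,
       COUNT(*) AS n
FROM aged
GROUP BY category
"""


def count_by_category(conn, now_iso: str, scan_limit: int) -> dict:
    counts = {"keep_recent": 0, "keep_training": 0,
              "archive_candidate": 0, "purge_candidate": 0}
    params = {"now": now_iso, "scan_limit": scan_limit}
    for category, n in conn.execute(COUNT_SQL, params):
        counts[category] = n
    return counts


def build_demo_db():
    """In-memory table with one record per category (ages 1, 45, 200, 400 days)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE youtube_heartbeats (timestamp TEXT)")
    conn.executemany(
        "INSERT INTO youtube_heartbeats VALUES (?)",
        [("2025-10-26T20:00:00Z",), ("2025-09-12T20:00:00Z",),
         ("2025-04-10T20:00:00Z",), ("2024-09-22T20:00:00Z",)],
    )
    return conn


with closing(build_demo_db()) as conn:
    demo_counts = count_by_category(conn, "2025-10-27T20:00:00Z", 1000)
```

Counting in SQL avoids materializing every timestamp in Python, at the cost of no longer having the per-record lists that the original loop builds.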
Performance Metrics
- Scan speed: at least 10,000 records/second (pure SQLite query)
- Classification: <1ms per record (simple age comparison)
- Total execution: <50ms for 3,719 records
- Token cost: 50-100 tokens (output generation only, no LLM inference)
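These figures will vary with hardware and database size. A rough way to measure the classification step on synthetic data is sketched below; it times only the in-memory age comparison, not the SQLite query, and the helper names are illustrative.

```python
import time
from datetime import datetime, timedelta, timezone


def classify_age(age_days: int) -> str:
    if age_days < 30:
        return "keep_recent"
    if age_days < 91:
        return "keep_training"
    if age_days < 366:
        return "archive_candidate"
    return "purge_candidate"


def time_classification(n_records: int):
    """Classify n synthetic timestamps spaced 3 hours apart; return (ms, counts)."""
    now = datetime(2025, 10, 27, 20, 0, tzinfo=timezone.utc)
    timestamps = [now - timedelta(hours=3 * i) for i in range(n_records)]

    start = time.perf_counter()
    counts = {}
    for ts in timestamps:
        category = classify_age((now - ts).days)
        counts[category] = counts.get(category, 0) + 1
    elapsed_ms = (time.perf_counter() - start) * 1000
    return elapsed_ms, counts


elapsed_ms, counts = time_classification(3719)
print(f"classified {sum(counts.values())} records in {elapsed_ms:.2f} ms")
```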
Pattern Memory Integration
Store execution results in wre_core/recursive_improvement/metrics/telemetry_cleanup_metrics.jsonl:
{
"skill": "gemma_telemetry_retention_detector",
"timestamp": "2025-10-27T20:00:00Z",
"execution_time_ms": 47,
"records_scanned": 3719,
"recommendation": "archive_and_vacuum",
"estimated_reclaim_mb": 89,
"pattern_fidelity": 0.95
}
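Since the metrics file is JSONL, each execution can be appended as one JSON object per line. A minimal sketch follows; `append_metric` and `read_metrics` are hypothetical helpers, and the demo writes to a temporary directory rather than the real metrics path.

```python
import json
import tempfile
from pathlib import Path


def append_metric(path, record: dict) -> None:
    """Append one record as a single JSON line, creating parent dirs if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")


def read_metrics(path) -> list:
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [json.loads(line) for line in lines if line.strip()]


record = {
    "skill": "gemma_telemetry_retention_detector",
    "timestamp": "2025-10-27T20:00:00Z",
    "execution_time_ms": 47,
    "records_scanned": 3719,
    "recommendation": "archive_and_vacuum",
    "estimated_reclaim_mb": 89,
    "pattern_fidelity": 0.95,
}

with tempfile.TemporaryDirectory() as tmp:
    metrics_path = Path(tmp) / "metrics" / "telemetry_cleanup_metrics.jsonl"
    append_metric(metrics_path, record)
    append_metric(metrics_path, dict(record, execution_time_ms=52))
    history = read_metrics(metrics_path)
```

Appending rather than rewriting keeps earlier runs intact, so the file doubles as an execution history.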
Next Phase
When recommendation == "archive_and_vacuum", trigger:
- Phase 2: qwen_telemetry_cleanup_strategist (strategic cleanup plan)
- Phase 3: 0102 validation and execution
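The hand-off condition can be expressed as a small gate on the Phase 1 report. This is a sketch only: `plan_next_phases` is a hypothetical function, and the Phase 3 label is a placeholder string, since the source does not name a skill for that phase.

```python
def plan_next_phases(report: dict) -> list:
    """Return the follow-up phases to trigger for a Phase 1 report."""
    if report.get("recommendation") == "archive_and_vacuum":
        return [
            "qwen_telemetry_cleanup_strategist",  # Phase 2, named in the source
            "0102_validation_and_execution",      # Phase 3, placeholder label
        ]
    return []  # "no_action": nothing downstream runs


triggered = plan_next_phases({"recommendation": "archive_and_vacuum"})
skipped = plan_next_phases({"recommendation": "no_action"})
```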
Training Value
Gemma learns:
- Fast age-based classification patterns
- Binary decision thresholds (30/90/365 days)
- Disk usage estimation heuristics
Pattern reuse:
- Same logic applies to other telemetry tables
- Reusable for foundups_selenium telemetry
- Generic retention classifier for any time-series data
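To reuse the logic for other tables, the thresholds can be treated as data rather than hard-coded branches. The sketch below shows one way to do that with a sorted list of bounds and `bisect`. `RetentionPolicy` is a hypothetical class, and the thresholds in `SELENIUM_POLICY` are invented purely for illustration.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class RetentionPolicy:
    # Exclusive upper bounds in days, ascending; one more category than bounds.
    upper_bounds_days: Tuple[int, ...]
    categories: Tuple[str, ...]

    def __post_init__(self):
        if len(self.categories) != len(self.upper_bounds_days) + 1:
            raise ValueError("need exactly one more category than bounds")
        if list(self.upper_bounds_days) != sorted(self.upper_bounds_days):
            raise ValueError("bounds must be ascending")

    def classify(self, age_days: int) -> str:
        # bisect_right counts the bounds that are <= age_days,
        # which is the index of the matching category.
        return self.categories[bisect_right(self.upper_bounds_days, age_days)]


# Same thresholds as the heartbeat rules: <30, <91, <366, else purge.
HEARTBEAT_POLICY = RetentionPolicy(
    (30, 91, 366),
    ("keep_recent", "keep_training", "archive_candidate", "purge_candidate"),
)

# Hypothetical policy for a different telemetry table.
SELENIUM_POLICY = RetentionPolicy(
    (7, 30),
    ("keep_recent", "archive_candidate", "purge_candidate"),
)
```

For example, `HEARTBEAT_POLICY.classify(90)` falls in `keep_training`, matching Rule 2, while the same age under the illustrative Selenium policy would be a purge candidate.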
Quick Install
/plugin add https://github.com/Foundup/Foundups-Agent/tree/main/gemma_telemetry_retention_detector
Copy and paste this command into Claude Code to install this skill.
