when-developing-ml-models-use-ml-expert
About
This Claude Skill provides a specialized workflow for developing, training, and deploying machine learning models. It supports various architectures like CNNs and RNNs, handles the full lifecycle from training to production, and outputs trained models, metrics, and deployment packages. Use it when you need to build a new ML model, require model training, or are preparing for a production deployment.
Documentation
ML Expert - Machine Learning Model Development
Overview
Specialized workflow for ML model development, training, and deployment. Supports various architectures (CNNs, RNNs, Transformers) with distributed training capabilities.
When to Use
- Developing new ML models
- Training neural networks
- Model optimization
- Production deployment
- Transfer learning
- Fine-tuning existing models
Phase 1: Data Preparation (10 min)
Objective
Clean, preprocess, and prepare training data
Agent: ML-Developer
Step 1.1: Load and Analyze Data
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
# Load data
data = pd.read_csv('dataset.csv')
# Analyze
analysis = {
'shape': data.shape,
'columns': data.columns.tolist(),
'dtypes': data.dtypes.to_dict(),
'missing': data.isnull().sum().to_dict(),
'stats': data.describe().to_dict()
}
# Store analysis
await memory.store('ml-expert/data-analysis', analysis)
Step 1.2: Data Cleaning
# Handle missing values
data = data.fillna(data.mean())
# Remove duplicates
data = data.drop_duplicates()
# Handle outliers
from scipy import stats
z_scores = np.abs(stats.zscore(data.select_dtypes(include=[np.number])))
data = data[(z_scores < 3).all(axis=1)]
# Encode categorical variables
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
for col in data.select_dtypes(include=['object']).columns:
data[col] = le.fit_transform(data[col])
Step 1.3: Split Data
# Split into train/val/test
X = data.drop('target', axis=1)
y = data['target']
X_train, X_temp, y_train, y_temp = train_test_split(X, y, test_size=0.3, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_temp, y_temp, test_size=0.5, random_state=42)
# Normalize
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_val = scaler.transform(X_val)
X_test = scaler.transform(X_test)
# Save preprocessed data
np.save('data/X_train.npy', X_train)
np.save('data/X_val.npy', X_val)
np.save('data/X_test.npy', X_test)
Validation Criteria
- Data loaded successfully
- Missing values handled
- Train/val/test split created
- Normalization applied
Phase 2: Model Selection (10 min)
Objective
Choose and configure model architecture
Agent: Researcher
Step 2.1: Analyze Task Type
task_analysis = {
'type': 'classification|regression|clustering',
'complexity': 'low|medium|high',
'dataSize': len(data),
'features': X.shape[1],
'classes': len(np.unique(y)) if classification else None
}
# Recommend architecture
if task_analysis['type'] == 'classification' and task_analysis['features'] > 100:
recommended_architecture = 'deep_neural_network'
elif task_analysis['type'] == 'regression' and task_analysis['dataSize'] < 10000:
recommended_architecture = 'random_forest'
# ... more logic
Step 2.2: Define Architecture
import tensorflow as tf
def create_model(architecture='dnn', input_shape, num_classes):
if architecture == 'dnn':
model = tf.keras.Sequential([
tf.keras.layers.Dense(256, activation='relu', input_shape=input_shape),
tf.keras.layers.Dropout(0.3),
tf.keras.layers.Dense(128, activation='relu'),
tf.keras.layers.Dropout(0.3),
tf.keras.layers.Dense(64, activation='relu'),
tf.keras.layers.Dense(num_classes, activation='softmax')
])
elif architecture == 'cnn':
model = tf.keras.Sequential([
tf.keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=input_shape),
tf.keras.layers.MaxPooling2D((2,2)),
tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
tf.keras.layers.MaxPooling2D((2,2)),
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(128, activation='relu'),
tf.keras.layers.Dense(num_classes, activation='softmax')
])
# ... more architectures
return model
model = create_model('dnn', input_shape=(X_train.shape[1],), num_classes=len(np.unique(y)))
model.summary()
Step 2.3: Configure Training
training_config = {
'optimizer': tf.keras.optimizers.Adam(learning_rate=0.001),
'loss': 'sparse_categorical_crossentropy',
'metrics': ['accuracy'],
'batch_size': 32,
'epochs': 100,
'callbacks': [
tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True),
tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5),
tf.keras.callbacks.ModelCheckpoint('best_model.h5', save_best_only=True)
]
}
model.compile(
optimizer=training_config['optimizer'],
loss=training_config['loss'],
metrics=training_config['metrics']
)
Validation Criteria
- Task analyzed
- Architecture selected
- Model configured
- Training parameters set
Phase 3: Train Model (20 min)
Objective
Execute training with monitoring
Agent: ML-Developer
Step 3.1: Start Training
# Train model
history = model.fit(
X_train, y_train,
validation_data=(X_val, y_val),
batch_size=training_config['batch_size'],
epochs=training_config['epochs'],
callbacks=training_config['callbacks'],
verbose=1
)
# Save training history
import json
with open('training_history.json', 'w') as f:
json.dump({
'loss': history.history['loss'],
'val_loss': history.history['val_loss'],
'accuracy': history.history['accuracy'],
'val_accuracy': history.history['val_accuracy']
}, f)
Step 3.2: Monitor Training
import matplotlib.pyplot as plt
# Plot training curves
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
# Loss
ax1.plot(history.history['loss'], label='Train Loss')
ax1.plot(history.history['val_loss'], label='Val Loss')
ax1.set_xlabel('Epoch')
ax1.set_ylabel('Loss')
ax1.legend()
ax1.grid(True)
# Accuracy
ax2.plot(history.history['accuracy'], label='Train Accuracy')
ax2.plot(history.history['val_accuracy'], label='Val Accuracy')
ax2.set_xlabel('Epoch')
ax2.set_ylabel('Accuracy')
ax2.legend()
ax2.grid(True)
plt.savefig('training_curves.png')
Step 3.3: Distributed Training (Optional)
# Using Flow-Nexus for distributed training
from flow_nexus import DistributedTrainer
trainer = DistributedTrainer({
'cluster_id': 'ml-training-cluster',
'num_nodes': 4,
'strategy': 'data_parallel'
})
# Train across multiple nodes
trainer.fit(
model=model,
train_data=(X_train, y_train),
val_data=(X_val, y_val),
config=training_config
)
Validation Criteria
- Training completed
- No NaN losses
- Validation metrics improving
- Model checkpoints saved
Phase 4: Validate Performance (10 min)
Objective
Evaluate model on test set
Agent: Tester
Step 4.1: Evaluate on Test Set
# Load best model
best_model = tf.keras.models.load_model('best_model.h5')
# Evaluate
test_loss, test_accuracy = best_model.evaluate(X_test, y_test, verbose=0)
# Predictions
y_pred = best_model.predict(X_test)
y_pred_classes = np.argmax(y_pred, axis=1)
# Detailed metrics
from sklearn.metrics import classification_report, confusion_matrix
metrics = {
'test_loss': float(test_loss),
'test_accuracy': float(test_accuracy),
'classification_report': classification_report(y_test, y_pred_classes, output_dict=True),
'confusion_matrix': confusion_matrix(y_test, y_pred_classes).tolist()
}
await memory.store('ml-expert/metrics', metrics)
Step 4.2: Generate Evaluation Report
report = f"""
# Model Evaluation Report
## Performance Metrics
- Test Loss: {test_loss:.4f}
- Test Accuracy: {test_accuracy:.4f}
## Classification Report
{classification_report(y_test, y_pred_classes)}
## Model Summary
- Architecture: {recommended_architecture}
- Parameters: {model.count_params()}
- Training Time: {training_time} seconds
## Training History
- Best Val Loss: {min(history.history['val_loss']):.4f}
- Best Val Accuracy: {max(history.history['val_accuracy']):.4f}
- Epochs Trained: {len(history.history['loss'])}
"""
with open('evaluation_report.md', 'w') as f:
f.write(report)
Validation Criteria
- Test accuracy > target threshold
- No overfitting detected
- Metrics documented
- Report generated
Phase 5: Deploy to Production (15 min)
Objective
Package model for deployment
Agent: ML-Developer
Step 5.1: Export Model
# Save in multiple formats
model.save('model.h5') # Keras format
model.save('model_savedmodel') # TensorFlow SavedModel
# Convert to TFLite for mobile
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()
with open('model.tflite', 'wb') as f:
f.write(tflite_model)
# Save preprocessing pipeline
import joblib
joblib.dump(scaler, 'scaler.pkl')
Step 5.2: Create Deployment Package
import shutil
import os
# Create deployment directory
os.makedirs('deployment', exist_ok=True)
# Copy necessary files
shutil.copy('model.h5', 'deployment/')
shutil.copy('scaler.pkl', 'deployment/')
shutil.copy('evaluation_report.md', 'deployment/')
# Create inference script
inference_script = '''
import tensorflow as tf
import joblib
import numpy as np
class ModelInference:
def __init__(self, model_path, scaler_path):
self.model = tf.keras.models.load_model(model_path)
self.scaler = joblib.load(scaler_path)
def predict(self, input_data):
# Preprocess
scaled = self.scaler.transform(input_data)
# Predict
predictions = self.model.predict(scaled)
return np.argmax(predictions, axis=1)
# Usage
inference = ModelInference('model.h5', 'scaler.pkl')
result = inference.predict(new_data)
'''
with open('deployment/inference.py', 'w') as f:
f.write(inference_script)
Step 5.3: Generate Documentation
# Model Deployment Guide
## Files
- `model.h5`: Trained Keras model
- `scaler.pkl`: Preprocessing scaler
- `inference.py`: Inference script
## Usage
\`\`\`python
from inference import ModelInference
model = ModelInference('model.h5', 'scaler.pkl')
predictions = model.predict(new_data)
\`\`\`
## Performance
- Latency: < 50ms per prediction
- Accuracy: ${test_accuracy}
- Model Size: ${model_size}MB
## Requirements
- tensorflow>=2.0
- scikit-learn
- numpy
Validation Criteria
- Model exported successfully
- Deployment package created
- Inference script tested
- Documentation complete
Success Metrics
- Test accuracy meets target
- Training converged
- No overfitting
- Production-ready deployment
Skill Completion
Outputs:
- model.h5: Trained model file
- evaluation_report.md: Performance metrics
- deployment/: Production package
- training_history.json: Training logs
Complete when model deployed and validated.
Quick Install
/plugin add https://github.com/DNYoussef/ai-chrome-extension/tree/main/when-developing-ml-models-use-ml-expertCopy and paste this command in Claude Code to install this skill
GitHub 仓库
Related Skills
production-readiness
MetaThis Claude Skill performs comprehensive pre-deployment validation to ensure code is production-ready. It runs a complete audit pipeline including security scans, performance benchmarks, and documentation checks. Use it as a final deployment gate to generate a deployment checklist and verify all production requirements are met.
flow-nexus-neural
OtherFlow Nexus Neural enables developers to train and deploy neural networks using distributed E2B sandboxes. It supports multiple architectures including feedforward, LSTM, GAN, and transformer models, with options for custom training or pre-built templates. Use this skill when you need scalable, sandboxed environments for machine learning development and deployment.
flow-nexus-neural
OtherFlow Nexus Neural enables developers to train and deploy neural networks using distributed E2B sandboxes. It supports custom architectures like feedforward, LSTM, GAN, and transformer networks, plus pre-built templates. Use this skill when you need to manage scalable, distributed machine learning training workflows directly from your Claude environment.
github-release-management
OtherThis Claude Skill automates comprehensive GitHub release orchestration using AI swarm coordination for versioning, testing, deployment, and rollback management. It's ideal for developers needing to streamline their CI/CD pipeline with intelligent automation from changelog generation to multi-platform deployment. Use it when you want to coordinate complex release workflows across repositories with built-in rollback capabilities.
