pytorch-fsdp

zechenzhangAGI

Updated 8 days ago

Category: Design

Tags: pytorch, fsdp, distributed-training, data-parallel, sharding, mixed-precision, cpu-offloading, fsdp2, large-scale-training

About

This Claude Skill provides expert guidance for PyTorch Fully Sharded Data Parallel (FSDP) training, helping developers implement distributed training solutions. It covers key features like parameter sharding, mixed precision, CPU offloading, and FSDP2 for large-scale model training. Use this skill when working with FSDP APIs, debugging distributed training code, or learning best practices for sharded data parallelism.
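To make the idea of parameter sharding concrete, here is a rough back-of-the-envelope sketch in plain Python. It is not part of the skill itself. It assumes full-precision training with an Adam-style optimizer (4 bytes each for the parameter, its gradient, and two optimizer moments, so 16 bytes per parameter) and an even split across ranks, and it ignores activations, buffers, and the parameters that are temporarily gathered during forward and backward. The function name `per_rank_state_bytes` and the 7-billion-parameter figure are hypothetical, chosen only for illustration.

```python
GIB = 1024 ** 3


def per_rank_state_bytes(num_params: int, world_size: int,
                         bytes_per_param: int = 16) -> float:
    """Approximate model-state bytes held by each rank under full sharding.

    Assumption: parameters, gradients, and optimizer state are split
    evenly across `world_size` ranks. The default of 16 bytes per
    parameter assumes fp32 parameters and gradients plus two fp32
    Adam moments.
    """
    if world_size < 1:
        raise ValueError("world_size must be at least 1")
    return num_params * bytes_per_param / world_size


if __name__ == "__main__":
    num_params = 7_000_000_000  # hypothetical model size
    for world_size in (1, 8, 64):
        gib = per_rank_state_bytes(num_params, world_size) / GIB
        print(f"{world_size:>3} ranks: ~{gib:.1f} GiB of model state per rank")
```

Under these assumptions the per-rank footprint shrinks in proportion to the number of ranks, which is the main reason sharding makes models trainable that would not fit on a single device. Mixed precision and CPU offloading change the bytes-per-parameter figure and where the state lives, so treat the numbers as an order-of-magnitude estimate only.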

Quick Install

Claude Code (Recommended)

Primary

npx skills add zechenzhangAGI/AI-research-SKILLs -a claude-code

Plugin Command (Alternative)

/plugin add https://github.com/zechenzhangAGI/AI-research-SKILLs

Git Clone (Alternative)

git clone https://github.com/zechenzhangAGI/AI-research-SKILLs.git ~/.claude/skills/pytorch-fsdp

Copy and paste this command in Claude Code to install this skill

GitHub Repository

zechenzhangAGI/AI-research-SKILLs

Path: 08-distributed-training/pytorch-fsdp

Topics: ai, ai-research, claude, claude-code, claude-skills, codex

Related Skills

deepspeed

Design

This skill provides expert guidance for distributed training using Microsoft's DeepSpeed library. It helps developers implement optimization techniques like ZeRO stages, pipeline parallelism, and mixed-precision training. Use this skill when working with DeepSpeed features, debugging code, or learning best practices for large-scale model training.

flow-nexus-neural

Other

Flow Nexus Neural enables developers to train and deploy neural networks in distributed E2B sandbox environments. It supports multiple architectures like feedforward, LSTM, GAN, and transformer networks, with options for custom models or pre-built templates. Use this skill when you need to manage scalable machine learning workflows through Claude with distributed training capabilities.

flow-nexus-neural

Other

Flow Nexus Neural enables developers to train and deploy neural networks (feedforward, LSTM, GAN, transformer) within distributed E2B sandbox environments. It provides both custom model training and pre-built marketplace templates for machine learning workflows. Use this skill when you need to manage scalable, sandboxed neural network training directly through Claude.

flow-nexus-neural

Other

Flow Nexus Neural enables developers to train and deploy neural networks in distributed E2B sandbox environments. It supports multiple architectures like feedforward, LSTM, GAN, and transformer networks, with options for custom models or pre-built templates. Use this skill when you need scalable, sandboxed machine learning workflows integrated directly into your Claude development environment.
