pdf-text-extractor-readability-classification

vamseeachanta

Updated 17 days ago


Other · pdf

About

This skill classifies PDFs by readability to determine the optimal text extraction strategy. It analyzes whether a PDF has a machine-readable text layer, requires OCR, or has mixed content, preventing failed extractions on scanned documents. Developers should use it before bulk PDF processing to route files to the correct extraction tool (pdfplumber, Tesseract, or a hybrid approach).
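The routing idea described above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the function names `classify_readability` and `page_char_counts`, the three label strings, and the 50-character threshold are all assumptions made for this example. The only third-party calls used are `pdfplumber.open` and `Page.extract_text`.

```python
from typing import Iterable, List

# Hypothetical labels; the skill itself may name its categories differently.
MACHINE_READABLE = "machine_readable"
OCR_REQUIRED = "ocr_required"
MIXED = "mixed"

# Assumed mapping from label to the extraction tools named in the description.
ROUTES = {
    MACHINE_READABLE: "pdfplumber",
    OCR_REQUIRED: "Tesseract",
    MIXED: "hybrid",
}


def classify_readability(
    char_counts: Iterable[int], min_chars_per_page: int = 50
) -> str:
    """Classify a PDF from the number of extractable characters on each page.

    A page counts as having a text layer when it yields at least
    `min_chars_per_page` characters (an arbitrary threshold chosen here).
    """
    counts = list(char_counts)
    if not counts:
        raise ValueError("PDF has no pages to classify")
    text_pages = sum(1 for n in counts if n >= min_chars_per_page)
    if text_pages == len(counts):
        return MACHINE_READABLE
    if text_pages == 0:
        return OCR_REQUIRED
    return MIXED


def page_char_counts(pdf_path: str) -> List[int]:
    """Count extractable characters per page using pdfplumber."""
    import pdfplumber  # third-party; imported lazily so the classifier runs without it

    with pdfplumber.open(pdf_path) as pdf:
        # extract_text() may return None or "" for pages without a text layer.
        return [len((page.extract_text() or "").strip()) for page in pdf.pages]
```

In a bulk job, one might call `ROUTES[classify_readability(page_char_counts(path))]` for each file and dispatch on the result. Keeping the classifier separate from the PDF reading step makes the threshold easy to tune on known samples.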

Quick Install

Claude Code (Recommended)

Primary

npx skills add vamseeachanta/workspace-hub -a claude-code

Plugin Command (Alternative)

/plugin add https://github.com/vamseeachanta/workspace-hub

Git Clone (Alternative)

git clone https://github.com/vamseeachanta/workspace-hub.git ~/.claude/skills/pdf-text-extractor-readability-classification

Copy and paste this command in Claude Code to install this skill

GitHub Repository

vamseeachanta/workspace-hub

Path: .claude/skills/data/documents/pdf-text-extractor/readability-classification

Related Skills

data-mesh-expert

Other

This Claude Skill provides expert guidance on implementing data mesh architecture for scalable, decentralized data systems. It helps developers design domain-oriented data ownership, create data products, and establish federated governance with self-serve platforms. Use this skill when planning or refactoring large-scale data infrastructure to align with organizational domains.


airflow-expert

Other

This Claude Skill provides expert-level Apache Airflow orchestration for designing and managing complex data pipelines. It offers deep knowledge of DAGs, operators, sensors, XComs, task dependencies, and scheduling for building reliable workflows. Use it when developing, troubleshooting, or optimizing production Airflow deployments.

