Attackers can inject indirect prompts in normal-looking repositories to trick Claude Code into spawning a reverse shell.
Reasoner Text, vision Text World understanding, grounding, physical reasoning, task planning, action forecasting, embodied agent reasoning, and autonomous system decision making Generator Text, vision ...
Skill Eval Harness is a Python CLI for testing whether an Agent Skill changes observable output. It reads evals/shared-benchmark.json, emits answer-key-safe task rows, grades files under eval-runs/, ...