Quick Start

OBLITERATUS Skill Quick Start

Scenario

An ML system needs an engineered, end-to-end implementation, from experiment to production, for the task "Remove refusal behaviors from open-weight LLMs using OBLITER".

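Full Conversation

Acting as the OBLITERATUS Skill, please help me with the following task: I need to build an ML model training and deployment pipeline that covers the full workflow from experiment to production.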

Remove refusal behaviors (guardrails) from open-weight LLMs without retraining or fine-tuning. Uses mechanistic interpretability techniques — including diff-in-means, SVD, whitened SVD, LEACE concept erasure, SAE decomposition, Bayesian kernel projection, and more — to identify and surgically excise refusal directions from model weights while preserving reasoning capabilities. **License warning:** OBLITERATUS is AGPL-3.0. NEVER import it as a Python library. Always invoke via CLI (`obliteratus` command) or subprocess. This keeps Hermes Agent's MIT license clean.
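
The license note above asks that the tool be reached only through its `obliteratus` command or a subprocess, never through a Python import. A minimal sketch of that isolation pattern follows. The helper names `build_command` and `run_cli` are hypothetical, and no OBLITERATUS subcommands or flags are assumed: the caller supplies every argument, taken from the tool's own documentation.

```python
import shutil
import subprocess
from typing import Callable, List, Sequence

CLI_NAME = "obliteratus"  # command name stated in the license note


def build_command(args: Sequence[str]) -> List[str]:
    """Prefix caller-supplied arguments with the CLI name.

    No flags are invented here; `args` is passed through unchanged.
    """
    return [CLI_NAME, *args]


def run_cli(args: Sequence[str], runner: Callable = subprocess.run):
    """Run the CLI in a separate process instead of importing the package.

    `runner` is injectable so the wrapper can be exercised without the
    tool installed (hypothetical design choice for this sketch).
    """
    if runner is subprocess.run and shutil.which(CLI_NAME) is None:
        raise FileNotFoundError(f"'{CLI_NAME}' was not found on PATH")
    # An argument list (no shell=True) avoids shell-injection issues.
    return runner(build_command(args), capture_output=True, text=True, check=False)
```

Because the wrapper only spawns a process, the calling code never links against the AGPL-3.0 package. The returned object (a `subprocess.CompletedProcess` when the default runner is used) exposes `returncode`, `stdout`, and `stderr` for the caller to inspect.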

Key Deliverables

  • Professional analysis and recommendations

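Usage Tips

  • 💡 Copy the Pro version of the prompt to get the full professional capabilities
  • 💡 Providing specific background information gives better results
  • 💡 You can ask for the output step by step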