BitGarden

RLHF

Mar 31, 2023, 1 min read

ai

Resources

Illustrating Reinforcement Learning from Human Feedback (RLHF)
ChatGPT/InstructGPT Explained (ChatGPT/InstructGPT详解) - Zhihu

Backlinks

2023-03-31
A Plain-Language Introduction to How ChatGPT Is Trained
