NHacker Next

Batched reward model inference and Best-of-N sampling (raw.sh)

32 points by rawsh 3 days ago | 0 comments