Hacker Next
Batched reward model inference and Best-of-N sampling
raw.sh | 32 points by rawsh 3 days ago | 0 comments