Towards Reliable Use of Large Language Models: Better Detection, Consistency, and Instruction-Tuning

Christopher D. Manning (Stanford University)
Large Language Models and Transformers
While large pre-trained language models (LLMs) have enabled impressive results on a wide variety of tasks, even the largest existing models will answer inconsistently or head off in weird directions. For companies to gain the benefits of these models in production use, it is now necessary to build an extensive tool ecosystem around the LLM engine, just as cars have seat belts, dash warning lights, and anti-lock brakes. In this talk, I will show recent work considering three such tools. (1) ConCORD: a lightweight method for improving LLM consistency through the use of off-the-shelf Natural Language Inference models. (2) DetectGPT: a method to better detect LLM-generated text by looking at the curvature of the model's probability function. (3) Direct Preference Optimization: a new way of learning to steer LLMs from human preference data without needing to learn a reward model. Joint work with Eric Mitchell, Chelsea Finn, and many other Stanford coauthors.

Comments: 7

@smnt 4 months ago
love that scott aaronson is in the crowd asking questions.
@stuffzoom 3 months ago
Still need to learn more about the current state of the field, but hearing Chris Manning talk is just impressive: everything he says seems so obvious and makes me think folks were just hacking around without really thinking. What a brilliant guy (and brilliant team)! But then again... It's one community and everyone starts off of what folks before found out...
@stuffzoom 3 months ago
Still need to learn more about the current state of the field, but hearing Chris Manning talk is just impressive: everything he says seems so obvious and makes me think folks were just hacking around without really thinking. What a brilliant guy (and brilliant team)!
@AM-qx3bq 8 months ago
Great talk, even though the content is surprising given the title.
@alexmatt4012 7 months ago
My favorite nerd. :)
@SantoshGupta-jn1wn 7 months ago
I wonder what was wrong with the hf trl ppo implementation
@gedankenthesis 3 months ago
Did anyone manage to find out?
