NEW TextGrad by Stanford: Better than DSPy
Science & Technology
In this TEXTGRAD framework, each AI system is transformed into a computation graph, where variables are inputs and outputs of complex (not necessarily differentiable) function calls. The feedback to the variables (dubbed ‘textual gradients’) are provided in the form of informative and interpretable natural language criticism to the variables; describing how a variable should be changed to improve the system. The gradients are propagated through arbitrary functions, such as LLM API calls, simulators, or external numerical solvers. (Stanford Univ)
Stanford University's latest research, TextGrad, comes after DSPy, an existing framework for self-improving LLM pipelines, and takes a different route: automatic "differentiation" via text. Traditional automatic differentiation, such as PyTorch's autograd, has access to the tensors inside neural network layers; TextGrad operates without such access. Instead, it follows PyTorch's syntax and abstractions and works with proprietary large language models (LLMs) such as GPT-4o that are reachable only through an API, focusing on prompt engineering to optimize specific LLM tasks. By chaining API calls between LLMs, TextGrad automates the search for better prompts, with reported reasoning performance that compares favorably with DSPy.
TextGrad's functionality mirrors autograd but adapts it to text. In traditional neural networks, autograd records operations during a forward pass and computes gradients during a backward pass, using the chain rule of calculus, so that an optimizer can update the parameters. TextGrad applies a similar approach to text, using a feedback loop in which a more capable LLM critiques and improves the prompts used by a less capable LLM. TextGrad is released as an open-source Python library whose API mirrors PyTorch, which makes it accessible to anyone familiar with that framework. The implementation includes several Jupyter notebooks that illustrate how to apply this methodology to various tasks, and the reported results compare favorably with DSPy.
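To make the forward/backward analogy concrete, here is a minimal pure-Python sketch of a chain-shaped text computation graph. It is not the TextGrad API: `TextVariable`, `call`, `backward`, `stub_model`, and `stub_critic` are hypothetical names invented for this illustration, and the two stub functions stand in for real LLM API calls so the example runs offline.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class TextVariable:
    """A graph node: a piece of text plus the feedback collected for it."""
    value: str
    role: str
    requires_grad: bool = True
    parents: List["TextVariable"] = field(default_factory=list)
    feedback: List[str] = field(default_factory=list)  # the "textual gradients"


def call(fn: Callable[[str], str], inp: TextVariable, role: str) -> TextVariable:
    """Forward pass: run a black-box function and record which variable fed it."""
    return TextVariable(value=fn(inp.value), role=role, parents=[inp])


def backward(output: TextVariable,
             critic: Callable[[str, str, str], str],
             loss_feedback: str) -> None:
    """Backward pass for a chain: turn feedback on a node into feedback on
    its parent, the textual analogue of applying the chain rule."""
    output.feedback.append(loss_feedback)
    stack = [output]
    while stack:
        node = stack.pop()
        for parent in node.parents:
            if parent.requires_grad:
                downstream = " ".join(node.feedback)
                parent.feedback.append(
                    critic(parent.value, parent.role, downstream))
            stack.append(parent)


def stub_model(prompt: str) -> str:
    # Hypothetical stand-in for the weaker LLM being optimized.
    return f"ANSWER<{prompt}>"


def stub_critic(value: str, role: str, downstream: str) -> str:
    # Hypothetical stand-in for the stronger LLM that writes the critique.
    return f"To address '{downstream}', revise the {role}."


prompt = TextVariable("Count the objects.", role="system prompt")
answer = call(stub_model, prompt, role="model answer")
backward(answer, stub_critic, "The answer skips its reasoning.")
print(prompt.feedback[0])
```

In a real setup the critic would be an API call to a stronger model, and an optimizer step would then rewrite `prompt.value` using the collected feedback.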
The practical implications of TextGrad are concrete. For instance, in prompt optimization, an initial prompt achieving 77% accuracy can be refined using TextGrad to reach 92% accuracy. The system is versatile, applicable not just to prompt optimization but also to other domains such as code optimization (code LLMs) and molecular design. By integrating the self-evaluation and self-improvement capabilities of LLMs, TextGrad enhances performance on complex tasks, although the gap in capability between the interacting LLMs has to be managed carefully to avoid failures. TextGrad represents a significant step forward in AI research, promising more efficient and effective optimization of multi-agent AI systems.
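The prompt-optimization loop described above can be sketched as follows. This is a toy under stated assumptions, not the paper's procedure or the TextGrad API: `toy_model`, `toy_critic`, `toy_rewrite`, and `optimize_prompt` are hypothetical, the dataset is made up, and the accept-only-if-accuracy-improves rule is an assumption of this sketch. The toy numbers are not the 77% and 92% quoted above.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (question, expected answer)


def toy_model(prompt: str, question: str) -> str:
    # Hypothetical weak model: it drops every term after the second
    # unless the prompt tells it to reason step by step.
    terms = question.split("+")
    if len(terms) > 2 and "step by step" not in prompt:
        return str(int(terms[0]) + int(terms[1]))
    return str(sum(int(t) for t in terms))


def toy_critic(prompt: str, failures: List[Example]) -> str:
    # Hypothetical stronger model: returns a natural-language critique.
    return f"{len(failures)} answers were wrong; ask for step by step reasoning."


def toy_rewrite(prompt: str, feedback: str) -> str:
    # Hypothetical optimizer step: applies the critique to the prompt.
    if "step by step" in feedback and "step by step" not in prompt:
        return prompt + " Think step by step."
    return prompt


def accuracy(prompt: str, model: Callable[[str, str], str],
             data: List[Example]) -> float:
    return sum(model(prompt, q) == a for q, a in data) / len(data)


def optimize_prompt(prompt: str, data: List[Example], steps: int = 3):
    # For brevity the same data is used to collect failures and to score
    # candidates; a real run would score on a held-out validation set.
    best, best_acc = prompt, accuracy(prompt, toy_model, data)
    for _ in range(steps):
        failures = [(q, a) for q, a in data if toy_model(best, q) != a]
        if not failures:
            break
        candidate = toy_rewrite(best, toy_critic(best, failures))
        cand_acc = accuracy(candidate, toy_model, data)
        if cand_acc > best_acc:  # keep a rewrite only if it helps
            best, best_acc = candidate, cand_acc
    return best, best_acc


data = [("2+2", "4"), ("3+5", "8"), ("10+20+30", "60"), ("1+2+3+4", "10")]
start = "Add the numbers."
start_acc = accuracy(start, toy_model, data)
best_prompt, best_acc = optimize_prompt(start, data)
print(start_acc, "->", best_acc, "|", best_prompt)
```

Each iteration costs extra model calls for the critique, the rewrite, and the re-evaluation, which is the cost trade-off raised in the comments below.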
All rights w/ authors:
arxiv.org/pdf/2406.07496
TextGrad: Automatic “Differentiation” via Text
Recommend:
4 Colab notebooks for TextGrad by Stanford (Python, PyTorch):
github.com/zou-group/textgrad
#airesearch
#promptengineering
#newtechnology
Comments: 14
This is a concept I'd been considering myself, but I never thought of it as autodifferentiated text. Fantastic that research is being done in this direction. I knew it'd be a good idea.
@Caellyan
12 days ago
I had criticized that this has to be done manually, but never thought of chaining 2 LLMs to achieve it. Though, it does make getting slightly better answers 3x more expensive. I guess it's useful for unsupervised learning though.
Thanks for the video, it’s very insightful! I have one thought: TextGrad and DSPy can be combined, as DSPy is mostly based on ICL and this framework focuses more on signature optimization. Additionally, the researchers at Stanford mentioned that the combined prompt on one occasion improved the result by 1%, and that this should be studied further.
Thanks Stanford, though I would have called it backpromptigation. ;)
Thanks for the video. I missed the boat with DSPy but it's good to know you can just go ahead with TextGrad.
Solid. I knew if the guy behind DSPy could build that, there was a better version imminent
Very informative. Thanks
You said that you used it on your tasks. Can you release part of that code in the wild? It would be really great to see a live example. That was the thing I found very challenging with DSPy. Only with the STORM project did I start understanding how it should work ;-)
@code4AI
12 days ago
Start with the four Jupyter notebooks that I provided and you will immediately have multiple new ideas for your specific tasks. I plan a new video on my insights from my testing, and maybe I have an idea for how to optimize the TextGrad method further ....
Thanks for the links to colabs…
Sounds like we need a middleware complexity assessor that can sit in the middle and auto-reject if it doesn't meet that balance.
Seems like one can prompt optimize for the same level system and never lack coherence.
26:51 what does 0 demonstrations mean? No examples of good output, only original prompt?
Amazing video! But pseudo as in pseudo-code is pronounced like sudo (syuudo) Not smart enough to correct anything else in this video lmao, keep up the good work! Love the channel