I benchmarked 6 prompt-optimization frameworks on the same task. Here is what each one actually optimizes.
23h ago · 4 min read

TL;DR: I ran six prompt-optimization frameworks against the same task and the same eval metric over a few weeks. They are not interchangeable: some are full programming models, some are single search