Structured Outputs Big Time
OpenAI introduce native structured output support in the API. You probably don't want to use it yet.

Aug 06, 2024
UPDATE:

  • Folks from OpenAI have confirmed that the significant time-to-first-token with structured outputs only happens on the first call with a new schema; the compiled schema is then cached for subsequent calls. In most production cases that’s fine, and you can issue one initial call to warm the cache.

  • Also, I made a mistake in my initial testing, using the wrong model (‘gpt-4o’ instead of ‘gpt-4o-2024-08-06’), which led me to believe that the new response_format parameter wasn’t working and that the feature could only be used with tool calls. (A minimal call sketch follows this list.)

  • Considering everything, I still feel that the full power of Instructor gives me enough of an advantage to stick with it, but the new feature from OpenAI is definitely usable. Hopefully we can use both, and get the power of controlled generation together with the rich modelling and validation offered by Pydantic and Instructor.
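
Here’s a minimal sketch of a call using the new response_format parameter with the dated model; the event-extraction schema is just an illustration. Since the compiled grammar is cached per schema, a throwaway call like this at startup can also serve to warm the cache:

```python
from openai import OpenAI

client = OpenAI()

# Strict mode requires every property to appear in "required"
# and "additionalProperties": false on every object.
event_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "date": {"type": "string"},
    },
    "required": ["name", "date"],
    "additionalProperties": False,
}

completion = client.chat.completions.create(
    model="gpt-4o-2024-08-06",  # the dated model, not plain "gpt-4o"
    messages=[
        {"role": "user", "content": "Extract the event: PyCon, 15 May 2024."},
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "event",
            "strict": True,
            "schema": event_schema,
        },
    },
)

print(completion.choices[0].message.content)  # JSON guaranteed to match the schema
```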

SEE ALSO: https://simonwillison.net/2024/Aug/6/openai-structured-outputs/, a thorough review of the new capabilities with some additional details.


If there’s anything people really hate about me, other than my exceptionally good looks, it’s that I can’t seem to stop talking (preaching?) about how using LLMs with structured outputs is the best thing since sliced bread, and Pydantic is all you need, and AI is Software, and yada yada yada … so naturally I felt vindicated to learn that OpenAI agree, and have now introduced structured outputs in their APIs and SDKs.

This is great news:

  1. It will bring structured outputs to a much wider audience that had not yet discovered them as the best technique for working with LLMs, because doing so depended on community libraries they didn’t even know about.

  2. By implementing this natively, OpenAI are able to tightly control the LLM during generation and thus guarantee 100% compliance with the specified schema. That was previously only possible with open models that allow you to intervene in the generation. (Note: Cohere also introduced this recently in their API - kudos to them!)
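
The same strict enforcement is available for tool calls by setting "strict": true on a function definition. A minimal sketch, with get_weather as a made-up illustrative function:

```python
from openai import OpenAI

client = OpenAI()

completion = client.chat.completions.create(
    model="gpt-4o-2024-08-06",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather for a city.",
                "strict": True,  # opt in to guaranteed schema compliance
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                    "additionalProperties": False,
                },
            },
        }
    ],
)

# The arguments string is guaranteed to parse and match the schema.
print(completion.choices[0].message.tool_calls[0].function.arguments)
```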

The (admittedly beta) implementation, as it stands right now, is probably not something you’ll want to use in most cases, though:

  1. Time to first token is loooooooooong. Because OpenAI need to compile the schema into a grammar for use in generation, there is an initial overhead that makes every call take a very long time. I’m pretty sure they’ll eventually overcome this with faster compilation and caching of reused schemas, but the current implementation is not usable for many scenarios.

  2. The JSON schema accepted by the API is limited. OpenAI claim that they focused on core use cases and left out a “long tail” of additional features that are not necessary. Perhaps, but when I tried to migrate my existing code to this new format, I discovered that many of my schemas are not accepted. At the very least, we’ll need to get used to a subset of JSON Schema with this feature (see the sketch after this list).

  3. The Python SDK released today doesn’t actually include all the changes advertised in the documentation. In particular, support for passing Pydantic BaseModel subclasses as the schema definition isn’t there yet. I’m sure that will be improved in future releases, but it is a good reminder that this is beta software.
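
To make the schema restrictions concrete, here is a sketch of the kind of adjustment migration involves, based on the documented subset: every property must be listed in required, additionalProperties must be false, and optionality is expressed with nullable types instead.

```python
# A common pattern that strict mode rejects: an optional property
# expressed by leaving it out of "required".
loose_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "nickname": {"type": "string"},
    },
    "required": ["name"],  # "nickname" is optional -> rejected in strict mode
}

# The strict-mode equivalent: every property is required, optionality
# becomes a nullable type, and additional properties are forbidden.
strict_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "nickname": {"type": ["string", "null"]},
    },
    "required": ["name", "nickname"],
    "additionalProperties": False,
}
```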

What should we use instead? Instructor + Pydantic still offers the easiest way to do structured output with OpenAI as well as with many other LLMs. It does not guarantee compliance in generation (that’s not possible without controlling the LLM itself), but it does validate the results against the response model’s definition, and even retries, with hints from the error message, when it encounters validation errors.
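
For comparison, a minimal sketch of the Instructor + Pydantic flow; the Event model is just an illustration:

```python
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field

class Event(BaseModel):
    name: str
    date: str = Field(description="ISO 8601 date")

# Patch the OpenAI client so it accepts a response_model.
client = instructor.from_openai(OpenAI())

event = client.chat.completions.create(
    model="gpt-4o",
    response_model=Event,  # results are validated against the Pydantic model
    max_retries=2,  # on validation errors, retry with the error as a hint
    messages=[{"role": "user", "content": "Extract the event: PyCon, 15 May 2024."}],
)

print(event.model_dump())
```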

I’m super excited about OpenAI recognising the power of structured outputs and including it in the API. I am certain that in time this will become the main way software developers integrate LLMs into their code. But it may take just a bit longer.

