March 20, 2025

ikayaniaamirshahzad@gmail.com

Writing system prompts for JSON outputs – do you include the schema as a guide?



Hi everyone,

I'm checking out the new OpenAI Assistants SDK and I want to use a JSON output in a workflow/automation.

I've always wondered what the best practices are for writing system prompts for assistants that are configured to output JSON. From what I understand, because JSON output is set in the assistant's configuration, you don't need to explicitly instruct it to respond with JSON.
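For reference, here is a minimal sketch of what that configuration might look like in code. It assumes the `response_format` shape used by OpenAI's structured outputs (`type: "json_schema"`), which is worth checking against the current API docs. The helper `build_response_format` and the schema name `book_cover_analysis` are hypothetical, and the snippet only builds the payload without calling any API.

```python
import json


def build_response_format(name: str, schema: dict) -> dict:
    """Build a structured-output payload (assumed shape; verify against the API docs)."""
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "strict": True, "schema": schema},
    }


# Hypothetical schema covering the two outputs described in this post.
BOOK_SCHEMA = {
    "type": "object",
    "properties": {
        "published_after_2000": {"type": "string", "enum": ["yes", "no"]},
        "cover_text": {"type": "string"},
    },
    "required": ["published_after_2000", "cover_text"],
    "additionalProperties": False,
}

response_format = build_response_format("book_cover_analysis", BOOK_SCHEMA)
print(json.dumps(response_format, indent=2))
```

With a payload like this, the field names and allowed values live in the configuration, so the open question is only how much of that to repeat in the system prompt.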

However, I've never been sure whether it's advisable to include the actual schema in the system prompt as well.

To explain what I mean, I asked OpenAI to generate a hypothetical system prompt similar to the one I'm trying to configure, in which the first output is a yes/no value and the second is a text string.

Is it best to write something open-ended like: "Respond with whether the book was published before or after 2000, and then provide a text string with the OCR'd information"?

Or do you need to provide the schema itself, with the precise field names and a guide to using them, as the LLM did when generating the example below?
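To make the two options concrete, here is a rough sketch of both prompt styles as plain Python strings. The wording of `TASK`, the `FIELDS` guide, and both function names are made up for illustration; only the two field names come from the example in this post.

```python
import json

TASK = (
    "You are an AI assistant that analyzes book cover images. "
    "Decide whether the book was published after 2000 and extract the cover text."
)

# Hypothetical field guide; the field names mirror the example in this post.
FIELDS = {
    "published_after_2000": "Either 'yes' or 'no'.",
    "cover_text": "All text visible on the cover, as a single string.",
}


def open_ended_prompt() -> str:
    """Style 1: describe the outputs in prose and leave field names to the model."""
    return TASK + " Respond with the yes/no answer and then the OCR'd text."


def schema_guided_prompt() -> str:
    """Style 2: spell out the exact field names and what each should contain."""
    guide = json.dumps(FIELDS, indent=2)
    return TASK + "\n\nRespond with a JSON object containing these fields:\n" + guide


print(open_ended_prompt())
print(schema_guided_prompt())
```

The second style repeats information that the output configuration may already enforce, but it gives the model a place to read per-field guidance in natural language.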

Many thanks!

Hypothetical system prompt

You are an AI assistant specializing in analyzing book cover images. Your task is to examine a provided image, determine if the book was published after the year 2000, and extract the text from the cover using Optical Character Recognition (OCR).

You must respond with a JSON object conforming to the following schema:

{
  "published_after_2000": {
    "type": "string",
    "enum": ["yes", "no"],
    "description": "Indicates whether the book was published after the year 2000. If the publication year is not explicitly stated on the cover, use OCR to find the publication date inside the book and assume the copyright date is the publication date. Only enter 'yes' or 'no'."
  },
  "cover_text": {
    "type": "string",
    "description": "The complete text extracted from the book cover using OCR. Include all visible text, even if it appears to be noise or irrelevant. Preserve line breaks and any formatting that is discernible from the image."
  }
}
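Whichever prompt style is used, the downstream automation can still check the reply before acting on it. Below is a minimal, standard-library sketch of such a check for the two fields above. Note that the generated structure reads more like a field guide than a complete JSON Schema, which would normally wrap the fields in an object with `properties`. The function name `validate_reply` and the sample reply are hypothetical, not real model output.

```python
import json


def validate_reply(raw: str) -> dict:
    """Parse a model reply and check it against the two fields described above."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    if set(data) != {"published_after_2000", "cover_text"}:
        raise ValueError(f"unexpected fields: {sorted(data)}")
    if data["published_after_2000"] not in ("yes", "no"):
        raise ValueError("published_after_2000 must be 'yes' or 'no'")
    if not isinstance(data["cover_text"], str):
        raise ValueError("cover_text must be a string")
    return data


# Hypothetical reply, made up for illustration.
sample = '{"published_after_2000": "no", "cover_text": "THE HOBBIT\\nJ.R.R. Tolkien"}'
print(validate_reply(sample))
```

A check like this catches a missing field or a value such as "maybe" early, regardless of whether the schema was given in the prompt, in the configuration, or in both.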

submitted by /u/danielrosehill