model_settings
Fine-tune how models respond to your prompts. These settings let you control creativity, output length, format, and more.
Find model settings in the right panel of the console, below the model selector. Settings apply to all selected models for the current run.
creativity (temperature)
Controls the randomness of model outputs. Lower values make responses more deterministic and focused, while higher values produce more varied and creative results.
Low (0.1 - 0.3)
Best for factual questions, coding, and tasks requiring precision
Medium (0.4 - 0.7)
Balanced for general-purpose writing and explanations
High (0.8 - 1.0)
Best for brainstorming, creative writing, and generating diverse ideas
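To make the effect of creativity concrete, here is a minimal sketch of how temperature is commonly applied when sampling: raw token scores are divided by the temperature before being turned into probabilities. The function name and the example scores are illustrative assumptions, not part of the console or its API.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw token scores into probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up scores for three candidate tokens (assumption for illustration).
logits = [2.0, 1.0, 0.5]
low = softmax_with_temperature(logits, 0.2)   # low creativity
high = softmax_with_temperature(logits, 1.0)  # high creativity

print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```

At the low setting nearly all probability lands on the top-scoring token, while at the high setting the alternatives keep a meaningful share, which is why higher values read as more varied.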
response_length
Sets the maximum number of tokens in the model's response. Tokens are word fragments, so a token is typically somewhat shorter than an English word.
Tip: Longer responses cost more API credits. Start with medium and increase only if outputs are being cut off.
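As a rough aid for choosing a length, the sketch below estimates a token count from a word count. The ratio of 0.75 words per token is an assumed rule of thumb for English text, not a figure from this console, and real counts depend on the model's tokenizer.

```python
import math

# Assumed rule of thumb (not a console value): about 0.75 English words per token.
WORDS_PER_TOKEN = 0.75

def estimate_tokens(text, words_per_token=WORDS_PER_TOKEN):
    """Estimate how many tokens a piece of text would use."""
    words = len(text.split())
    return math.ceil(words / words_per_token)

def fits_in_limit(text, max_tokens):
    """Return True if the estimated token count is within the limit."""
    return estimate_tokens(text) <= max_tokens

sample = "Start with medium and increase only if outputs are being cut off."
print(estimate_tokens(sample), fits_in_limit(sample, 50))
```

If typical responses come close to the limit by this estimate, that is a hint the outputs may be getting cut off.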
output_format
Instructs the model to structure its response in a specific format. This adds formatting instructions to your system prompt.
free_text
No format constraints. The model responds naturally. (default)
bullets
Responses formatted as bullet point lists. Great for summarizing information.
table
Responses formatted as markdown tables. Ideal for comparisons and structured data.
json
Instructs the model to respond in valid JSON. Well suited to structured data extraction and integration.
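Since the output format works by adding instructions to the system prompt, the mechanism can be pictured roughly as follows. The instruction wording and the function name are hypothetical; the console's actual text is not documented here.

```python
# Hypothetical instruction text for each format (assumption for illustration).
FORMAT_INSTRUCTIONS = {
    "free_text": "",
    "bullets": "Format your response as a bulleted list.",
    "table": "Format your response as a Markdown table.",
    "json": "Respond with valid JSON only, with no surrounding text.",
}

def apply_output_format(system_prompt, output_format="free_text"):
    """Append the formatting instruction for the chosen format, if any."""
    if output_format not in FORMAT_INSTRUCTIONS:
        raise ValueError(f"unknown output format: {output_format}")
    instruction = FORMAT_INSTRUCTIONS[output_format]
    if not instruction:
        return system_prompt  # free_text leaves the prompt untouched
    return f"{system_prompt}\n\n{instruction}".strip()

print(apply_output_format("You are a helpful assistant.", "json"))
```

Because the format is requested through the prompt rather than enforced, it is sensible to check the response before relying on its structure.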
repetition_penalty
Reduces repetitive phrasing by penalizing tokens that have already appeared in the text so far. Higher values discourage repetition more strongly.
Low (0 - 0.3)
Allows natural repetition. Good for technical content where terms must repeat.
High (0.5 - 1.0)
Strongly discourages repetition. Useful for creative writing variety.
Note: This setting maps to both presence_penalty and frequency_penalty in the underlying API.
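The sketch below shows one common way presence and frequency penalties are applied to token scores: a flat deduction for any token that has appeared at all, plus a deduction that grows with each occurrence. Using the same value for both penalties is an assumption based on the note above; the function name and example scores are illustrative.

```python
from collections import Counter

def apply_repetition_penalty(logits, generated_tokens, penalty):
    """Lower the scores of tokens that already appear in the generated text.

    Assumes a single penalty value feeds both the presence and the
    frequency penalty.
    """
    counts = Counter(generated_tokens)
    adjusted = {}
    for token, logit in logits.items():
        count = counts[token]
        presence = penalty if count > 0 else 0.0  # flat, once per seen token
        frequency = penalty * count               # grows with each repeat
        adjusted[token] = logit - presence - frequency
    return adjusted

# Made-up scores and history (assumption for illustration).
logits = {"the": 2.0, "cat": 1.0, "dog": 1.0}
history = ["the", "the", "cat"]
print(apply_repetition_penalty(logits, history, 0.5))
# → {'the': 0.5, 'cat': 0.0, 'dog': 1.0}
```

Tokens that have not been used yet keep their original score, so a higher penalty shifts the model toward fresh wording.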
sampling_sharpness (top_p)
Controls nucleus sampling: the model samples only from the smallest set of most likely tokens whose combined probability reaches the top_p value. Lower values restrict the choice to the most likely tokens; higher values let the model consider more options.
Tip: Generally, adjust either creativity (temperature) OR sampling sharpness (top_p), not both. They have similar effects and adjusting both can produce unpredictable results.
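For readers who want to see what nucleus sampling does, here is a minimal sketch of the filtering step: keep the most likely tokens until their combined probability reaches top_p, then renormalize. The function name and the example probabilities are assumptions for illustration.

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of top tokens whose probabilities sum to top_p."""
    if not 0 < top_p <= 1:
        raise ValueError("top_p must be in the range (0, 1]")
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept = []
    cumulative = 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break  # the nucleus is complete
    total = sum(p for _, p in kept)
    # Renormalize so the kept probabilities sum to 1 again.
    return {token: p / total for token, p in kept}

# Made-up distribution over four candidate tokens (assumption for illustration).
probs = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
print(sorted(top_p_filter(probs, 0.5)))
print(sorted(top_p_filter(probs, 0.9)))
```

With a low value only the single most likely token survives, while a higher value admits less likely candidates, which overlaps with what creativity (temperature) does.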
tips
For coding tasks
Use low creativity (0.2-0.3), medium length, and free_text format. Code requires precision over creativity.
For creative writing
Use high creativity (0.8-0.9), long response length, and higher repetition penalty. This produces more varied and interesting prose.
For data extraction
Use low creativity (0.1-0.2) and json output format. This makes responses more consistent and easier to parse.
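The tips above could be captured as reusable presets, sketched here as plain dictionaries. The key names mirror the setting labels on this page, but the structure, the exact values chosen within each range, and the helper function are assumptions, not a documented configuration format.

```python
import json

# Hypothetical presets; values are picked from the ranges suggested above.
PRESETS = {
    "coding": {
        "creativity": 0.2,
        "response_length": "medium",
        "output_format": "free_text",
    },
    "creative_writing": {
        "creativity": 0.8,
        "response_length": "long",
        "repetition_penalty": 0.7,  # "higher" is not quantified; 0.7 is a guess
    },
    "data_extraction": {
        "creativity": 0.1,
        "output_format": "json",
    },
}

def parse_json_response(text):
    """Return the parsed JSON, or None if the reply is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None

print(PRESETS["data_extraction"])
print(parse_json_response('{"name": "Ada", "year": 1843}'))
print(parse_json_response("Sure! Here is the data you asked for."))
```

Checking the reply with a helper like this is a cheap safeguard for data extraction, since a prompt-level format instruction can occasionally be ignored.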