model_settings

Fine-tune how models respond to your prompts. These settings let you control creativity, output length, format, and more.

Find model settings in the right panel of the console, below the model selector. Settings apply to all selected models for the current run.

creativity (temperature)

Range: 0.1 - 1.0
Default: 0.7

Controls the randomness of model outputs. Lower values make responses more deterministic and focused, while higher values produce more varied and creative results.

Low (0.1 - 0.3)

Best for factual questions, coding, and tasks requiring precision

Medium (0.4 - 0.7)

Balanced for general-purpose writing and explanations

High (0.8 - 1.0)

Best for brainstorming, creative writing, and generating diverse ideas
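Under the hood, temperature works by scaling the model's logits before sampling. A minimal sketch of the idea (not the console's actual implementation):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Divide logits by temperature before softmax: lower temperature
    sharpens the distribution toward the top token, higher temperature
    flattens it so less likely tokens get picked more often."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
low = softmax_with_temperature(logits, 0.2)   # nearly all mass on the top token
high = softmax_with_temperature(logits, 1.0)  # probability spread more evenly
```

This is why low settings feel deterministic: the top token dominates the distribution, so repeated runs pick the same continuation.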

response_length

Sets the maximum number of tokens in the model's response. A token is roughly three-quarters of an English word, so 256 tokens is on the order of 200 words.

short: ~256 tokens
medium: ~512 tokens (default)
long: ~1024 tokens
very_long: ~4000 tokens

Tip: Longer responses cost more API credits. Start with medium and increase only if outputs are being cut off.
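The presets above translate to a token limit on the request. A sketch of the mapping, assuming the underlying API takes a `max_tokens`-style parameter (the exact parameter name and values are approximate, as noted above):

```python
# Hypothetical mapping from console length presets to a token cap.
LENGTH_PRESETS = {
    "short": 256,
    "medium": 512,   # default
    "long": 1024,
    "very_long": 4000,
}

def max_tokens_for(preset: str) -> int:
    """Look up the approximate token cap for a named length preset."""
    return LENGTH_PRESETS[preset]
```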

output_format

Instructs the model to structure its response in a specific format. This adds formatting instructions to your system prompt.

free_text

No format constraints. The model responds naturally. (default)

bullets

Responses formatted as bullet point lists. Great for summarizing information.

table

Responses formatted as markdown tables. Ideal for comparisons and structured data.

json

Responses in valid JSON format. Perfect for structured data extraction and integration.
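Because the format is enforced through prompt instructions rather than a hard constraint, it is worth validating the response before integrating it. A minimal sketch using Python's standard library (`parse_json_response` is an illustrative helper, not part of the console):

```python
import json

def parse_json_response(raw: str):
    """Validate that a model response in `json` format is parseable.
    Returns the parsed value, or None instead of raising, so callers
    can log the failure and retry with a stricter prompt."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return None

ok = parse_json_response('{"name": "Ada", "age": 36}')  # parsed dict
bad = parse_json_response("Sure! Here is the data...")  # None
```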

repetition_penalty

Range: 0 - 1.0
Default: 0.3

Reduces repetitive phrasing by penalizing tokens that have already appeared in the output. Higher values more strongly discourage repetition.

Low (0 - 0.3)

Allows natural repetition. Good for technical content where terms must repeat.

High (0.5 - 1.0)

Strongly discourages repetition. Useful for creative writing variety.

Note: This setting maps to both presence_penalty and frequency_penalty in the underlying API.
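One plausible shape for that mapping, assuming the underlying API uses the common 0-2 range for both penalties (the console's actual scaling is not documented here, so the factors below are assumptions):

```python
def expand_repetition_penalty(value: float) -> dict:
    """Expand the single 0-1.0 repetition_penalty slider into the two
    underlying API parameters. The 2x scaling is a hypothetical example,
    chosen only to map the slider onto an assumed 0-2 API range."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("repetition_penalty must be between 0 and 1.0")
    return {
        "presence_penalty": value * 2.0,
        "frequency_penalty": value * 2.0,
    }
```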

sampling_sharpness (top_p)

Range: 0.5 - 1.0
Default: 0.9

Controls nucleus sampling: at each step the model samples only from the smallest set of tokens whose cumulative probability reaches this value. Lower values restrict sampling to the most likely tokens; higher values let the model consider more options.

Tip: Generally, adjust either creativity (temperature) OR sampling sharpness (top_p), not both. They have similar effects and adjusting both can produce unpredictable results.
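To make the "smallest set of tokens" idea concrete, here is a sketch of how a nucleus is selected from a token probability distribution (illustrative only, not the console's implementation):

```python
def nucleus(probs, top_p):
    """Return the indices of the smallest set of tokens, taken in
    descending probability order, whose cumulative probability
    reaches top_p. Sampling then happens only within this set."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, total = [], 0.0
    for i in order:
        kept.append(i)
        total += probs[i]
        if total >= top_p:
            break
    return kept

probs = [0.5, 0.3, 0.15, 0.05]
nucleus(probs, 0.9)  # keeps the top three tokens (0.5 + 0.3 + 0.15 >= 0.9)
```

Note that lowering top_p and lowering temperature both concentrate sampling on likely tokens, which is why adjusting both at once compounds in hard-to-predict ways.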

tips

For coding tasks

Use low creativity (0.2-0.3), medium length, and free_text format. Code requires precision over creativity.

For creative writing

Use high creativity (0.8-0.9), long response length, and higher repetition penalty. This produces more varied and interesting prose.

For data extraction

Use low creativity (0.1-0.2) and json output format. This ensures consistent, parseable responses.
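The three recommendations above can be captured as reusable presets. The key names below mirror this page's settings, not any specific API; the exact values within each recommended range are a judgment call:

```python
# Hypothetical presets matching the task recommendations above.
PRESETS = {
    "coding": {
        "creativity": 0.2,
        "response_length": "medium",
        "output_format": "free_text",
    },
    "creative_writing": {
        "creativity": 0.85,
        "response_length": "long",
        "repetition_penalty": 0.7,
    },
    "data_extraction": {
        "creativity": 0.1,
        "output_format": "json",
    },
}
```

Starting from a preset and adjusting one setting at a time makes it much easier to tell which change actually improved the output.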