This guide will walk you through integrating a chat model into your application using Nscale’s API. With our serverless architecture, you can focus on building your application without worrying about infrastructure management.
meta-llama/Llama-3.1-8B-Instruct
"Authorization": "Bearer <API-KEY>"
"Content-Type": "application/json"
"model"
: "<model id e.g., meta-llama/Llama-3.1-8B-Instruct>"
"messages"
: "<array of messages to send to the model>"
choices
: An array of message objects containing the model’s output.usage
: An object containing the input (prompt_tokens), output (completion_tokens), and total number of tokens used.Status | Description | Response Format |
---|---|---|
200 | Success (synchronous) | application/json response with completion |
201 | Success (streaming) | text/event-stream with delta updates |
401 | Invalid API key or unauthorized | Error object |
404 | Model not found or unavailable | Error object |
429 | Insufficient credit | Error object |
500 | Internal server error | Error object |
503 | Service temporarily unavailable | Error object |