July 2023 Release Notes
OctoAI product updates and release notes for July 2023
July 26, 2023
OctoAI added more graceful concurrency handling, an updated Python SDK, and diarization for the Whisper model template.
- Added more graceful concurrency handling: when users send more than N concurrent requests to an endpoint with N replicas actively running, we now queue the extra requests instead of failing them. This queuing behavior has been activated for selected customers and will be gradually rolled out over this week and next week. You will temporarily see a new replica spin up while the rollout is occurring on your endpoint.
- Updated our Python SDK from 0.1.2 to 0.2.0. It now supports both streaming and async inference requests.
- Added diarization to our Whisper template endpoint and corrected the list of supported languages. Diarization enables use cases where you’d like to identify the speaker of each segment in a speech recording. You can view the full API specs in the Whisper demo template.
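As a rough illustration of the diarization item above, the sketch below builds a request body and formats speaker-labelled segments using only the Python standard library. The field names (`audio`, `task`, `diarize`) and the segment shape (`speaker`, `text`) are assumptions made for this example, not taken from these notes; the Whisper demo template's API specs remain the authoritative reference.

```python
import base64


def build_whisper_payload(audio_bytes, diarize=True):
    """Build a JSON-serializable request body for a Whisper-style endpoint.

    The keys "audio", "task" and "diarize" are hypothetical; confirm the
    real names against the template's API specs before sending anything.
    """
    return {
        # Audio is sent as base64 text so it can travel inside JSON.
        "audio": base64.b64encode(audio_bytes).decode("ascii"),
        "task": "transcribe",
        "diarize": diarize,
    }


def format_by_speaker(segments):
    """Turn diarized segments into 'SPEAKER: text' lines.

    Assumes each segment is a dict with "speaker" and "text" keys, which
    is an illustrative response shape rather than a documented one.
    """
    return [f'{seg["speaker"]}: {seg["text"].strip()}' for seg in segments]


if __name__ == "__main__":
    payload = build_whisper_payload(b"abc")
    print(sorted(payload))
    sample = [
        {"speaker": "SPEAKER_00", "text": " Hello there."},
        {"speaker": "SPEAKER_01", "text": " Hi, how are you?"},
    ]
    print("\n".join(format_by_speaker(sample)))
```

Keeping payload construction and response formatting in separate functions makes it easy to adjust either one once the actual field names are confirmed.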
July 20, 2023
Added an OctoAI template for Llama2-7B Chat.
- Added an OctoAI template for Llama2-7B Chat, an instruction-tuned model for chatbots. Users can now work with this newly released LLM directly in the web UI, with a limited token response, or programmatically with additional options. A similar template for Llama2-70B is coming soon!
July 18, 2023
Changed the HTTP status code to 201 for the REST API calls for create secret and create registry credentials. Previously, we returned 200 for these calls.
- Changed the HTTP status code to 201 for the REST API calls for create secret and create registry credentials. Previously, we returned 200 for these calls. The behavior of the SDK and web frontend is not affected.
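The change above mainly matters to clients that call the REST API directly and compare the response status against 200. A minimal sketch of a tolerant check follows; the helper name `is_create_success` is hypothetical and is not part of any OctoAI SDK.

```python
from http import HTTPStatus


def is_create_success(status_code):
    """Return True if a create call succeeded.

    Accepts 201 (current behavior) and 200 (previous behavior) so the
    same client code works before and after the change.
    """
    return status_code in (HTTPStatus.CREATED, HTTPStatus.OK)


if __name__ == "__main__":
    for code in (200, 201, 400):
        print(code, is_create_success(code))
```

Accepting both codes is a transitional choice; a client that only targets the current API could check for 201 alone.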