OpenVINO™ toolkit: An open source AI toolkit that makes it easier to write once, deploy anywhere.
What's New in Version 2024.1
The OpenVINO™ toolkit version 2024.1 release enhances generative AI accessibility with improved large language model (LLM) performance and expanded model coverage. It also boosts portability and performance for deployment anywhere: at the edge, in the cloud, or locally.
Latest Features
Easier Model Access and Conversion
| Product | Details |
| --- | --- |
| New Model Support | Support for Falcon-7b-Instruct, a ready-to-use GenAI LLM chat/instruct model with superior performance metrics. |
Generative AI and LLM Enhancements
Expanded model support and accelerated inference.
| Feature | Details |
| --- | --- |
| Model Coverage | New Jupyter* Notebooks added: YOLOv9*, YOLOv8* Oriented Bounding Boxes Detection (OBB), Stable Diffusion* in Keras, MobileCLIP, RMBG-v1.4 Background Removal, Magika, TripoSR, AnimateAnyone, LLaVA-Next, and a retrieval augmented generation (RAG) system with the OpenVINO toolkit and LangChain. |
| Performance Improvements for LLMs | LLM compilation time reduced through additional optimizations with compressed embeddings. Improved first-token performance of LLMs on 4th and 5th generation Intel® Xeon® platforms with Intel® Advanced Matrix Extensions (Intel® AMX). Better LLM compression and improved performance with Intel® oneAPI Deep Neural Network Library (oneDNN); int4 and int8 support for Intel® Arc™ GPUs. |
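To give a feel for what int4 weight compression does, here is a minimal NumPy sketch of symmetric, per-group int4 quantization: weights are split into small groups, each group gets one scale, and values are stored as 4-bit integers (-8..7). This is an illustrative toy under simplified assumptions, not the toolkit's actual implementation, which lives in the NNCF library.

```python
# Toy sketch of symmetric int4 weight quantization with per-group scales.
# Illustrative only -- OpenVINO's real compression is implemented in NNCF.
import numpy as np

def quantize_int4_symmetric(weights: np.ndarray, group_size: int = 8):
    """Quantize a 1-D weight vector to int4 (-8..7) with one scale per group."""
    w = weights.reshape(-1, group_size)
    # One scale per group: map the largest magnitude onto the int4 limit 7.
    scales = np.abs(w).max(axis=1, keepdims=True) / 7.0
    scales[scales == 0] = 1.0  # avoid division by zero for all-zero groups
    q = np.clip(np.round(w / scales), -8, 7).astype(np.int8)
    return q, scales

def dequantize(q: np.ndarray, scales: np.ndarray) -> np.ndarray:
    """Reconstruct approximate float weights from int4 codes and scales."""
    return (q.astype(np.float32) * scales).reshape(-1)

rng = np.random.default_rng(0)
w = rng.normal(size=64).astype(np.float32)
q, s = quantize_int4_symmetric(w)
w_hat = dequantize(q, s)
print("max abs reconstruction error:", np.abs(w - w_hat).max())
```

Storing 4-bit codes plus a scale per group cuts weight memory roughly 4x versus FP16, at the cost of a bounded per-weight rounding error (at most half a scale step per value), which is why int4 is attractive for large LLM weight matrices.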
More Portability and Performance
Develop once, deploy anywhere. OpenVINO toolkit enables developers to run AI at the edge, in the cloud, or locally.
| Product | Details |
| --- | --- |
| Arm* Processor Support Updates | FP16 inference is now enabled by default for convolutional neural networks (CNNs) on Arm processors. |
| Intel Hardware Support | Mixtral and URLNet models optimized for improved performance on Intel® Xeon® processors. Stable Diffusion* 1.5, ChatGLM3-6b, and Qwen-7B models optimized for improved inference speed on Intel® Core™ Ultra processors with an integrated GPU. The preview neural processing unit (NPU) plug-in for Intel Core Ultra processors is now available in the OpenVINO toolkit open source GitHub* repository, in addition to the main OpenVINO toolkit package on PyPI. Significant memory reduction for select smaller generative AI (GenAI) models on Intel Core Ultra processors with an integrated GPU. |
| JavaScript* API | The JavaScript API is now available through the npm repository, giving JavaScript* developers seamless access to the OpenVINO toolkit API. |
Sign Up for Exclusive News, Tips & Releases
Be among the first to learn about everything new with the Intel® Distribution of OpenVINO™ toolkit. By signing up, you get early access to product updates and releases, exclusive invitations to webinars and events, training and tutorial resources, contest announcements, and other breaking news.
Resources
Community and Support
Explore ways to get involved and stay up-to-date with the latest announcements.
Get Started
Optimize, fine-tune, and run comprehensive AI inference using the included model optimizer, runtime, and development tools.
The productive smart path to freedom from the economic and technical burdens of proprietary alternatives for accelerated computing.