Microsoft AI updates 2023 - Q4

In this blog post, I want to give an overview of all the features that reached General Availability, Technical Preview or End of Support from Microsoft in Q4 2023. This information can be found on the Microsoft Azure AI Blog.

Features now supported by Microsoft (GA):

  • [General Availability] Azure OpenAI Service Launches GPT-3.5-Turbo-1106 Model
    At Microsoft Ignite 2023, Satya Nadella announced the imminent launch of the most advanced OpenAI generative AI models, GPT-4 Turbo and GPT-3.5 Turbo 1106, on Azure. Today, Microsoft is thrilled to announce the global availability of GPT-4 Turbo and GPT-3.5 Turbo 1106 on Azure OpenAI Service, unlocking leading cost performance and generative AI capabilities for businesses to revolutionize their workflows. GPT-3.5 Turbo 1106 brings the same new advanced capabilities as GPT-4 Turbo, such as improved function calling and JSON mode, in the wildly popular GPT-3.5 Turbo format. GPT-3.5 Turbo 1106 will become the new default GPT-3.5 Turbo model in the coming weeks, featuring a 16K context window at an attractive price. GPT-3.5 Turbo 1106 is generally available to all Azure OpenAI customers immediately. GPT-3.5 Turbo 1106 pricing is 3x more cost effective for input tokens and 2x more cost effective for output tokens compared to GPT-3.5 Turbo 16K. To deploy GPT-3.5 Turbo 1106 from the Studio UI, select "gpt-35-turbo" and then select version "1106" from the dropdown. Version 1106 has separate quota from the existing versions of GPT-3.5 Turbo, enabling customers to start experimenting with it immediately without impacting existing GPT-3.5 deployments.
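    As a minimal sketch of what calling such a deployment could look like (assuming the openai Python package v1.x, an Azure OpenAI resource, and a deployment you created named "gpt-35-turbo-1106"; the deployment name and API version below are placeholders, not values from the announcement):

    ```python
    import os
    from openai import AzureOpenAI  # openai>=1.0

    # Placeholder endpoint, key and deployment name; adjust to your own resource.
    client = AzureOpenAI(
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_API_KEY"],
        api_version="2023-12-01-preview",  # assumed preview API version with JSON mode support
    )

    # JSON mode: the model is constrained to return valid JSON.
    response = client.chat.completions.create(
        model="gpt-35-turbo-1106",  # your deployment name
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": "You are a helpful assistant that answers in JSON."},
            {"role": "user", "content": "List three Azure AI services as a JSON array under the key 'services'."},
        ],
    )
    print(response.choices[0].message.content)
    ```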
  • [General Availability] New task-optimized summarization capabilities powered by fine-tuned large language models
    Today Microsoft is thrilled to announce new capabilities designed to accelerate customers’ AI workflows, allowing them to build summarization use cases faster than ever before. Microsoft is expanding its use of LLMs to GPT-3.5 Turbo, along with its proprietary z-Code++ models, to offer task-optimized summarization that balances output accuracy and cost. Click here to learn more.
  • [General Availability] Updates in Azure AI Speech
    • Bilingual Models (These models allow users to seamlessly switch between language pairs in real-time interactions. On November 30th, Microsoft starts with support for English & Spanish and English & French as the language pairs, by updating the es-US and fr-CA models to bilingual models by default). Click here to learn more.
    • Embedded Speech (Embedded speech is designed for on-device speech to text and text to speech scenarios where cloud connectivity is intermittent or unavailable. It provides an additional way for you to access Azure AI Speech beyond the Azure cloud and connected/disconnected containers. You will be able to access the same technology that powers many of Windows 11’s experiences like Live Captions, Voice Access, and Narrator). Click here to learn more.
    • Pronunciation Assessment Language Support & Enhancements (Pronunciation assessment now supports 14+ locales, including English (United States), English (United Kingdom), English (Australia), French, German, Japanese, Korean, Portuguese, Spanish, Chinese, and more. In addition to the General Availability of these locales, Microsoft is also releasing prosody, grammar, vocabulary, and topic support as new features in Public Preview for English. These features will provide a comprehensive language learning experience for chatbots and conversation-based evaluations). Click here to learn more.
  • [General Availability] Embedded Speech is now generally available
    Embedded speech is designed for on-device speech to text and text to speech scenarios where cloud connectivity is intermittent or unavailable. It provides an additional way for customers to access Azure AI Speech beyond the Azure cloud and connected/disconnected containers. Click here to learn more.
  • [General Availability] Azure AI Vision Image Analysis 4.0 API
    Microsoft is thrilled to announce the general availability of Azure AI Image Analysis 4.0. This cutting-edge solution offers a single API endpoint, empowering users to extract comprehensive insights from images. Key features of Azure AI Image Analysis 4.0 include:
    • Optical Character Recognition;
    • Multimodal image embeddings;
    • Image Captioning;
    • Dense Captions;
    • Tagging;
    • People Detection (without identification of individuals);
    • Smart Crops;
    • Object Detection.
      Click here to learn more.
  • [General Availability] Vector search and semantic ranker in Azure AI Search
    Vector search in Azure AI Search offers a comprehensive vector database solution to store, index, query, filter and retrieve your AI data in a secure, enterprise-grade environment. Vector search provides swift, precise searches on extensive datasets, offering flexibility to cater to your specific use cases. To accelerate customer adoption, Microsoft has streamlined the developer experience by offering client libraries for Python, JavaScript, .NET, and Java, and has introduced new features such as an improved semantic ranker, multi-vector queries, Exhaustive K-Nearest Neighbors (KNN) and pre-filtering. Click here to learn more.
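    A minimal sketch of a vector query with the Python client library (assuming azure-search-documents 11.4+, an existing index with a vector field named "contentVector", and an embedding you have already computed; endpoint, key, index and field names are placeholders):

    ```python
    import os
    from azure.core.credentials import AzureKeyCredential
    from azure.search.documents import SearchClient
    from azure.search.documents.models import VectorizedQuery

    # Placeholder endpoint, key and index name.
    search_client = SearchClient(
        endpoint=os.environ["AZURE_SEARCH_ENDPOINT"],
        index_name="my-docs-index",
        credential=AzureKeyCredential(os.environ["AZURE_SEARCH_KEY"]),
    )

    # query_embedding would normally come from an embedding model (e.g. Azure OpenAI).
    query_embedding = [0.01] * 1536  # placeholder vector

    results = search_client.search(
        search_text=None,  # pure vector query; pass text here as well for hybrid search
        vector_queries=[
            VectorizedQuery(vector=query_embedding, k_nearest_neighbors=3, fields="contentVector")
        ],
        select=["title", "content"],
    )
    for doc in results:
        print(doc["title"])
    ```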
  • [General Availability] Azure AI Content Safety
    Microsoft is excited to announce the general availability of Azure AI Content Safety, a new service that helps you detect and filter harmful user-generated and AI-generated content in your applications and services. Content Safety includes text and image detection to find content that is offensive, risky, or undesirable, such as profanity, adult content, gore, violence, hate speech, and more. You can also use our interactive Azure AI Content Safety Studio to view, explore, and try out sample code for detecting harmful content across different modalities. Click here to learn more.
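    A minimal sketch of analyzing a piece of text with the Python SDK (assuming the azure-ai-contentsafety package and a Content Safety resource; the exact result attribute names can differ between SDK versions):

    ```python
    import os
    from azure.core.credentials import AzureKeyCredential
    from azure.ai.contentsafety import ContentSafetyClient
    from azure.ai.contentsafety.models import AnalyzeTextOptions

    # Placeholder endpoint and key for a Content Safety resource.
    client = ContentSafetyClient(
        endpoint=os.environ["CONTENT_SAFETY_ENDPOINT"],
        credential=AzureKeyCredential(os.environ["CONTENT_SAFETY_KEY"]),
    )

    # Analyze user-generated text across the harm categories.
    result = client.analyze_text(AnalyzeTextOptions(text="Some user-generated text to screen."))

    # Each entry reports a category (hate, sexual, violence, self-harm) and a severity level.
    for item in result.categories_analysis:
        print(item.category, item.severity)
    ```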

Features not yet fully supported by Microsoft (Public Preview):

  • [Public Preview] Microsoft Copilot for Azure
    Microsoft Copilot for Azure is now in preview. With Copilot for Azure, users can gain new insights into their workloads, unlock untapped Azure functionality and orchestrate tasks across both cloud and edge. Copilot leverages Large Language Models (LLMs), the Azure control plane and insights about a user’s Azure and Arc-enabled assets. All of this is carried out within the framework of Azure’s steadfast commitment to safeguarding the customer’s data security and privacy. Click here to learn more.
  • [Public Preview] Azure OpenAI Service Launches GPT-4 Turbo Model
    At Microsoft Ignite 2023, Satya Nadella announced the imminent launch of the most advanced OpenAI generative AI models, GPT-4 Turbo and GPT-3.5 Turbo 1106, on Azure. Today, Microsoft is thrilled to announce the global availability of GPT-4 Turbo and GPT-3.5 Turbo 1106 on Azure OpenAI Service, unlocking leading cost performance and generative AI capabilities for businesses to revolutionize their workflows. Prepare for a transformative leap with the public preview of GPT-4 Turbo. This model offers lower pricing, extended prompt length, tool use, and structured JSON formatting, delivering improved efficiency and control. GPT-4 Turbo is more capable and has knowledge of world events up to April 2023. It has a 128K context window, so your applications benefit from a lot more custom data tailored to your use case using techniques like RAG (Retrieval Augmented Generation). GPT-4 Turbo is available to all Azure OpenAI customers immediately. GPT-4 Turbo pricing is 3x more cost effective for input tokens and 2x more cost effective for output tokens compared to GPT-4, while offering more than 15x the context window. To deploy GPT-4 Turbo from the Studio UI, select "gpt-4" and then select version "1106-preview" in the version dropdown. Version 1106-preview has separate quota from the existing versions of GPT-4, enabling customers to start experimenting with it immediately without impacting existing GPT-4 deployments.
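    A minimal RAG-style sketch against a GPT-4 Turbo deployment (assuming the openai Python package v1.x and a deployment you named "gpt-4-1106-preview"; retrieval itself, for example from Azure AI Search, is stubbed out, and the deployment name and API version are placeholders):

    ```python
    import os
    from openai import AzureOpenAI  # openai>=1.0

    client = AzureOpenAI(
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_API_KEY"],
        api_version="2023-12-01-preview",  # assumed preview API version
    )

    # In a real RAG pipeline these chunks would come from a retriever such as Azure AI Search.
    retrieved_chunks = [
        "Chunk 1: internal policy text ...",
        "Chunk 2: product documentation ...",
    ]

    # The 128K context window leaves room to ground the model in many retrieved chunks.
    response = client.chat.completions.create(
        model="gpt-4-1106-preview",  # your GPT-4 Turbo deployment name
        messages=[
            {"role": "system", "content": "Answer using only the provided context:\n" + "\n".join(retrieved_chunks)},
            {"role": "user", "content": "What does the policy say about data retention?"},
        ],
    )
    print(response.choices[0].message.content)
    ```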
  • [Public Preview] Announcing Azure AI Document Intelligence preview (formerly Form Recognizer)
    Azure AI Document Intelligence, formerly known as Form Recognizer, is an AI service for all your document understanding needs. With the latest update Azure AI Document Intelligence is previewing new features such as markdown output for semantic chunking in the RAG pattern with large language models, language expansion, field expansion and new prebuilt models. Microsoft is also happy to announce structure analysis updates, quality improvements to tables, reading order and section headings. Document Intelligence preview adds query fields, new prebuilt models and other improvements:
    • Output format styles for the Layout model are now available in Text and Markdown;
    • In the analyze options, it is now possible to also detect "key-value pairs";
    • Extend model schema with Query Fields;
    • Prebuilt models for documents like the 1099 tax form, invoices and health insurance cards;
    • Language Expansion;
    • Custom classification models are deep-learning-model types that combine layout and language features to accurately detect and identify documents you process within your application.
      Try out the Document Intelligence Studio to experience all the new and updated capabilities.
  • [Public Preview] Azure AI Video Indexer
    Microsoft is excited to share with you some of the latest features that Microsoft has added to Azure AI Video Indexer (VI), the video analysis solution that helps you to extract insights from your video and audio files. Here are some of the highlights:
    • Edge version of Azure VI, enabled by Arc: You can now use Azure VI on your own edge devices, without sending your video data to the cloud. This allows you to adhere to local regulations and to cost, efficiency and security constraints, while still enjoying the same functionalities that Azure VI offers in the cloud;
    • Bring-your-own AI feature: You can now use models from Hugging Face, Azure Florence, or any other AI model and connect them to VI insights. This will allow you to customize and enhance your video analysis with your own models, and to benefit from a single joint VI experience.
    • Enrich your video metadata and search experience: You can now add custom tags and free text as video metadata for any specific video in your account. This will help you to annotate your videos with any information that is relevant to your business. You can also search for videos based on the custom tags and free text that you have added to them;
    • Features improving the efficiency of the customized people model (subject to the limited access features program).
  • [Public Preview] Native Document Support for PII Detection
    Today, Microsoft is excited to announce the public preview of native document support for PII detection. This capability can now identify, categorize, and redact sensitive information in unstructured text directly from complex documents, allowing users to ensure data privacy compliance within a streamlined workflow. It effortlessly detects and safeguards crucial information, adhering to the highest standards of data privacy and security. The formats currently supported are .pdf, .docx and .txt. Click here to learn more.
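    The document-level API itself is still in preview; as a related illustration of the underlying capability, here is a minimal sketch of PII detection and redaction on plain text with the azure-ai-textanalytics package (the native document preview extends this kind of detection to .pdf and .docx inputs; endpoint and key names are placeholders):

    ```python
    import os
    from azure.core.credentials import AzureKeyCredential
    from azure.ai.textanalytics import TextAnalyticsClient

    # Placeholder endpoint and key for an Azure AI Language resource.
    client = TextAnalyticsClient(
        endpoint=os.environ["LANGUAGE_ENDPOINT"],
        credential=AzureKeyCredential(os.environ["LANGUAGE_KEY"]),
    )

    documents = ["Call Maria at 555-123-4567 or email maria@contoso.com about invoice 4711."]
    result = client.recognize_pii_entities(documents)

    for doc in result:
        if not doc.is_error:
            print(doc.redacted_text)  # text with PII masked
            for entity in doc.entities:
                print(entity.category, entity.text)
    ```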
  • [Public Preview] Updates in Azure AI Speech
    Today at Microsoft Ignite, Microsoft is super excited to announce a number of new capabilities for Azure AI Speech! This article provides a summary of all the new and recent releases:
    • Speech in Chat Playground (This enhances the chat interaction experience, enabling powerful multi-modal input and output).
    • TTS (Text to Speech) Avatar (This allows developers to use simple text input to generate a 2D photorealistic avatar that speaks using neural text to speech for its voice. To create the visualization of the avatar, a model is trained with human video recordings). Click here to learn more.
    • Personal Voice (new text to speech feature that Microsoft is releasing in Public Preview, which allows you to build applications where your users can easily create and use their own AI voice. Users can easily replicate their voice by providing a 1-minute speech sample as the audio prompt, and then use it to generate speech in any of the 100 supported locales). Click here to learn more.
    • Speech Analytics Try-out (Azure AI Studio now includes a new try-out experience for Speech Analytics. Speech analytics is an upcoming capability that integrates Azure AI Speech with Azure OpenAI to transcribe audio and video recordings and generate enhanced outputs like summaries, and to extract valuable information such as key topics, Personally Identifiable Information (PII), sentiment, and more). Click here to learn more.
    • Customization of OpenAI's Whisper model in Azure AI Speech (You can now customize OpenAI’s Whisper models using audio with human-labeled transcripts! This allows you to fine-tune Whisper models to the domain-specific vocabulary and acoustic conditions of your use cases. Customized models can then be used through Azure AI Speech’s batch transcription API);
    • Real-time Speaker Diarization (It is an enhanced add-on feature which answers the question of who said what and when. It differentiates speakers in the input audio based on their voice characteristics to produce real-time transcription with results attributed to the different speakers as Guest 1, Guest 2, Guest 3, etc.). Click here to learn more.
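    A minimal sketch of real-time diarization with the Speech SDK for Python (assuming azure-cognitiveservices-speech 1.31+ exposes ConversationTranscriber for this scenario; the audio file name is a placeholder and the exact property names should be treated as assumptions to verify against the SDK docs):

    ```python
    import os
    import time
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(
        subscription=os.environ["SPEECH_KEY"], region=os.environ["SPEECH_REGION"]
    )
    audio_config = speechsdk.audio.AudioConfig(filename="meeting.wav")  # or omit to use the default microphone

    transcriber = speechsdk.transcription.ConversationTranscriber(
        speech_config=speech_config, audio_config=audio_config
    )

    def on_transcribed(evt):
        # Each final result is attributed to a speaker (e.g. Guest-1, Guest-2).
        print(evt.result.speaker_id, evt.result.text)

    transcriber.transcribed.connect(on_transcribed)
    transcriber.start_transcribing_async().get()
    time.sleep(30)  # let the audio stream for a while
    transcriber.stop_transcribing_async().get()
    ```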
  • [Public Preview] AI Safety & Responsible AI features in Azure OpenAI Service
    Microsoft Azure OpenAI Service team is excited to announce new AI safety and responsible AI features at Ignite 2023:
    • Jailbreak risk detection is a feature in Azure OpenAI Service that focuses on detecting jailbreak attacks, which pose significant risks to Large Language Model (LLM) deployments. A Jailbreak Attack, also known as a User Prompt Injection Attack (UPIA), is an intentional attempt by a user to exploit the vulnerabilities of an LLM-powered system, bypass its safety mechanisms, and provoke restricted behaviors;
    • Protected material detection is a feature in Azure OpenAI Service that helps detect and protect against outputting known natural language content and code. It checks for matches with an index of third-party text content and public source code in GitHub repositories. The feature was designed to help flag certain known third-party content for customers when integrating and using generative AI;
    • Expanded customer control: customers can now configure all content filtering severity levels and create custom policies, including less restrictive filters than the default, based on their use case needs. For each category, the service detects one of four severity levels (safe, low, medium, high) and takes action on potentially harmful content in both user prompts and model completions. Asynchronous Modified Content Filter is a new feature in Azure OpenAI Service that allows content filters to run asynchronously with significant latency improvements for streaming scenarios. When content filters are run asynchronously, completion content is returned immediately without being buffered first, resulting in a smooth and fast token-by-token streaming experience.
  • [Public Preview] Azure AI Speech launches Personal Voice in preview
    Microsoft is taking customization one step further with its new 'Personal Voice' feature. This innovation is specifically designed to enable customers to build apps that allow their users to easily create their own AI voice, resulting in a fully personalized voice experience. The value of this new feature is immense:
    • Extremely small samples: Preparing training samples for creating an AI voice can be difficult or costly. With personal voice, users can create a voice that sounds just like them from a voice sample as short as 60 seconds.
    • Express: Personal voice greatly reduces waiting time, allowing users to create a voice in seconds.
    • Multilingual and global reach: With an audio prompt in one language, the voice created can be used in 100 languages and variants, reaching a global audience.
      Starting December 1st, 2023, you can start to try the personal voice feature in Speech Studio and through the API with your own data. This feature is currently only available in West Europe, East US, and Southeast Asia. Create a Speech or Cognitive Services standard resource in any of these three regions in order to access the feature.
  • [Public Preview] Video Retrieval: GPT-4 Turbo with Vision Integrates with Azure to Redefine Video
    Microsoft is thrilled to unveil the Azure AI Vision Video Retrieval preview. This innovative feature revolutionizes video search, enabling the exploration of thousands of hours of video content through advanced multi-modal vector indexing of vision and speech. Further enhancing the Azure OpenAI GPT-4 Turbo with Vision, Video Retrieval seamlessly integrates, providing customers with the capability to craft solutions that can both perceive and interpret video content. This opens novel possibilities and use cases. It simplifies the process for developers, allowing them to effortlessly incorporate video input into their applications, skipping complex video processing and indexing code. This is the power of Azure OpenAI Service and Azure AI Services working together. Click here to learn more.
  • [Public Preview] GPT-4 Turbo with Vision on Azure OpenAI Service
    Microsoft is thrilled to announce that GPT-4 Turbo with Vision on Azure OpenAI Service is coming soon to public preview. GPT-4 Turbo with Vision is a large multimodal model (LMM) developed by OpenAI that can analyze images and provide textual responses to questions about them. It incorporates both natural language processing and visual understanding. This integration allows Azure users to benefit from Azure's reliable cloud infrastructure and OpenAI's advanced AI research. Click here to learn more.
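    A minimal sketch of asking a question about an image against a GPT-4 Turbo with Vision deployment (assuming the openai Python package v1.x and a vision-capable deployment; the deployment name, API version and image URL are placeholders):

    ```python
    import os
    from openai import AzureOpenAI  # openai>=1.0

    client = AzureOpenAI(
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_API_KEY"],
        api_version="2023-12-01-preview",  # assumed preview API version
    )

    response = client.chat.completions.create(
        model="gpt-4-vision",  # your GPT-4 Turbo with Vision deployment name (placeholder)
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "Describe what is shown in this image."},
                    {"type": "image_url", "image_url": {"url": "https://example.com/sample.jpg"}},
                ],
            }
        ],
        max_tokens=300,
    )
    print(response.choices[0].message.content)
    ```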
  • [Public Preview] New Models and Multimodal Advancements at Microsoft Ignite 2023
    Today, Microsoft is excited to announce a significant expansion of the service, ushering in a new era of possibilities for multimodal use cases, customization and creative expression.
    • GPT-4 Turbo with Vision - A Visionary Preview;
    • DALL·E 3 - Expanding Creativity;
    • GPT-4 Turbo - A Leap Forward in Generative AI;
    • Azure AI Studio;
    • GPT-3.5 Turbo 16k – 1106 - General Availability on the Horizon;
    • Assistants API - Shaping Intelligent Experiences.
      Click here to learn more.
  • [Public Preview] New realistic AI voices optimized for conversations in 7 languages for public preview
    Now, in human-bot conversational interactions, AI can produce more natural, fluent, and high-quality responses than ever before, thanks to the power of Large Language Models (LLMs) such as Azure OpenAI GPT. Consequently, when engaging in verbal conversations, the demand for naturalness and expressiveness in Text-to-Speech (TTS) voices is higher than ever. Microsoft is introducing these new voices specifically designed for conversational scenarios. Whether you are creating a speech-based chatbot, a voice assistant, or a conversational agent, these new voices will ensure your interactions are more realistic, lifelike, and engaging. The new realistic conversational voices are perfect matches for any application necessitating lifelike speech interactions, including chatbots, voice assistants, gaming, e-learning, entertainment, and more. Microsoft is introducing seven new voices for more locales in the East US, Southeast Asia and West Europe regions: French (Canada), French (France), German (Germany), Italian (Italy), Korean (Korea), Portuguese (Brazil), Spanish (Spain). Click here to learn more.
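    A minimal sketch of synthesizing speech with one of these conversational voices via the Speech SDK for Python (the voice name below is a placeholder invented for illustration; substitute one of the actual new voice names from the documentation):

    ```python
    import os
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(
        subscription=os.environ["SPEECH_KEY"], region=os.environ["SPEECH_REGION"]
    )
    # Placeholder voice name; pick one of the new conversational voices for your locale.
    speech_config.speech_synthesis_voice_name = "fr-FR-ExampleConversationalNeural"

    # Without an explicit audio config, output goes to the default speaker.
    synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config)
    result = synthesizer.speak_text_async("Bonjour, comment puis-je vous aider aujourd'hui ?").get()

    if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
        print("Speech synthesized to the default speaker.")
    ```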