Microsoft AI updates 2024 - Q3

Within this blog, I want to give an overview of all the feature in Q3 2024 that becomes available in General Availability, Technical Preview or End of Support by Microsoft. This information can be found at Microsoft Azure AI Blog and What's new in Azure OpenAI Service.


Features that are now supported by Microsoft (GA):

  • [General available] GPT-4o mini regional availability
    For standard and global standard deployment in the East US and Sweden Central regions.
    For global batch deployment in East US, Sweden Central, and West US regions.

  • [General available] Global batch deployments are now available
    The Azure OpenAI Batch API is designed to handle large-scale and high-volume processing tasks efficiently. Process asynchronous groups of requests with separate quota, with 24-hour target turnaround, at 50% less cost than global standard. With batch processing, rather than send one request at a time you send a large number of requests in a single file. Global batch requests have a separate enqueued token quota avoiding any disruption of your online workloads. Key use cases include:

    • Large-Scale Data Processing: Quickly analyze extensive datasets in parallel;
    • Content Generation: Create large volumes of text, such as product descriptions or articles;
    • Document Review and Summarization: Automate the review and summarization of lengthy documents;
    • Customer Support Automation: Handle numerous queries simultaneously for faster responses;
    • Data Extraction and Analysis: Extract and analyze information from vast amounts of unstructured data;
    • Natural Language Processing (NLP) Tasks: Perform tasks like sentiment analysis or translation on large datasets;
    • Marketing and Personalization: Generate personalized content and recommendations at scale.
  • [General available] New Responsible AI default content filtering policy
    Azure OpenAI Service includes a content filtering system that works alongside core models, including DALL-E image generation models. This system works by running both the prompt and completion through an ensemble of classification models aimed at detecting and preventing the output of harmful content. The content filtering system detects and takes action on specific categories of potentially harmful content in both input prompts and output completions. Variations in API configurations and application design might affect completions and thus filtering behavior. The new default content filtering policy DefaultV2 delivers the latest safety and security mitigations for the GPT model series (text), including:

    • Prompt Shields for jailbreak attacks on user prompts (filter);
    • Protected material detection for text (filter) on model completion;
    • Protected material detection for code (annotate) on model completions.

    While there are no changes to content filters for existing resources and deployments (default or custom content filtering configurations remain unchanged), new resources and GPT deployments will automatically inherit the new content filtering policy DefaultV2. Customers have the option to switch between safety defaults and create custom content filtering configurations. Refer to our Default safety policy documentation for more information. Click here to learn more about the "Default content safety policy".

  • [General available] GPT-4o mini model available for deployment
    GPT-4o mini is the latest Azure OpenAI model first announced on July 18, 2024. "GPT-4o mini allows customers to deliver stunning applications at a lower cost with blazing speed. GPT-4o mini is significantly smarter than GPT-3.5 Turbo—scoring 82% on Measuring Massive Multitask Language Understanding (MMLU) compared to 70%—and is more than 60% cheaper.1 The model delivers an expanded 128K context window and integrates the improved multilingual capabilities of GPT-4o, bringing greater quality to languages from around the world." The model is currently available for both standard and global standard deployment in the East US region. For information on model quota, consult the quota and limits page and for the latest info on model availability refer to the models page.

  • [General available] Expansion of regions available for global standard deployments of gpt-4o
    GPT-4o is now available for global standard deployments in:

    • australiaeast;
    • brazilsouth;
    • canadaeast;
    • eastus;
    • eastus2;
    • francecentral git;
    • germanywestcentral;
    • japaneast;
    • koreacentral;
    • northcentralus;
    • norwayeast;
    • polandcentral;
    • southafricanorth;
    • southcentralus;
    • southindia;
    • swedencentral;
    • switzerlandnorth;
    • uksouth;
    • westeurope;
    • westus;
    • westus3.

Features are not yet supported by Microsoft (GA)

  • [Public Preview] Latest GPT-4o model available in the early access playground (preview)
    On August 6, 2024, OpenAI announced the latest version of their flagship GPT-4o model version 2024-08-06. It has all the capabilities of the previous version as well as an "enhanced ability to support complex structured outputs" and "Max output tokens have been increased from 4,096 to 16,384".
  • [Public Preview] GPT-4o mini is now available for fine-tuning
    GPT-4o mini fine-tuning is now available in public preview in Sweden Central and in North Central US. Click here to learn more.
  • [Public Preview] Assistants File Search tool is now billed
    The file search tool for Assistants now has additional charges for usage. File Search augments the Assistant with knowledge from outside its model, such as proprietary product information or documents provided by your users. OpenAI automatically parses and chunks your documents, creates and stores the embeddings, and use both vector and keyword search to retrieve relevant content to answer user queries. Click here to learn more. See the pricing page for more information.