Microsoft Azure AI: GPT-4o Mini available

Effective July 18, 2024, OpenAI’s fastest model, GPT-4o mini, is available on Microsoft Azure OpenAI Service. This model offers significant improvements in speed, cost, and multilingual capabilities. It supports text processing with excellent speed, and image, audio, and video capabilities will be added later. Customers can try it for free in the Azure OpenAI Studio Playground.

 

What is GPT-4o Mini?

GPT-4o mini is a highly efficient AI model designed for fast and cost-effective application delivery. It is significantly smarter than GPT-3.5 Turbo, scoring 82% on the Measuring Massive Multitask Language Understanding (MMLU) benchmark compared to 70% for GPT-3.5 Turbo. It also offers a 128K context window and improved multilingual capabilities. Fine-tuning for GPT-4o mini is available, allowing customers to customize the model for specific use cases and scenarios.

 

Key Features

  • Speed and Cost: More than 60% less expensive than GPT-3.5 Turbo.
  • Performance: Scores 82% on MMLU vs. GPT-3.5 Turbo scoring only 70%.
  • Context Window: Expanded to 128K. The context window refers to the amount of text (measured in tokens) that the model can consider at once when generating responses. Essentially, it is the “memory” of the model during a single interaction. For example, if a model has a context window of 16K tokens, it can take into account up to 16,000 tokens of text from the conversation history or input data to generate its response.
  • Multilingual Capabilities: Enhanced support for multiple languages.
  • Safety Features: Includes prompt shields that prevent the model from generating harmful or inappropriate content. It also has a protected material detection by default that ensures that the model does not share confidential content.
  • Data Residency: Available in 27 regions, including 9 regions in Europe. Find an up-to-date list here: https://go.microsoft.com/fwlink/?linkid=2274842&clcid=0x409.
  • Global Pay-As-You-Go: Flexible payment options with a high throughput limit of 15M tokens per minute (TPM).

 

Licensing

GPT-4o mini is available under Azure AI’s global pay-as-you-go deployment at 0.15$ per million input tokens* and 0.60$ per million output tokens*. This model is, like GPT-3.5 Turbo and GPT-4o, also available on Azure AI’s Batch service, offering high throughput jobs at a discounted rate. Batch delivers high throughput jobs within 24 hours of submission at a 50% discount rate by using off-peak capacity. Off-peak capacity refers to times when the demand for computational resources is lower. By utilizing these periods, the service can offer a discount because the resources are less in demand and therefore cheaper to use.

 

Comparing models

Feature GPT-4o Mini GPT-4o GPT-3.5 Turbo
Quality Index 85 100 59
MMLU Score 82% 88.7% 70%
Context Window 128K tokens 128K tokens 16K tokens
Speed (Output Tokens per Second) 108 tokens/sec 83 tokens/sec 79 tokens/sec
Latency (Seconds to First Tokens Chunk Received; Lower is better) 0.53 0.44 0.37
Modalities Supported Text, (future: Image, Audio, Video) Text, Vision, Audio, Video Text
Availability 27 regions 27 regions 27 regions
Standard Pricing (Input Tokens) $0.15 / 1M tokens* $5.00 / 1M tokens* $0.50 / 1M tokens*
Standard Pricing (Output Tokens) $0.60 / 1M tokens* $15.00 / 1M tokens* $1.50 / 1M tokens*

*Prices may change.

 

Performance Comparison

 

More information

Find the announcement here: https://azure.microsoft.com/en-us/blog/openais-fastest-model-gpt-4o-mini-is-now-available-on-azure-ai/.

Find more pricing info here – OpenAI’s pricing is the same as the pricing in Azure OpenAI Studio: https://openai.com/api/pricing/.

Find a globe of Microsoft Datacenters here: https://datacenters.microsoft.com/globe/explore/.

Find an interactive model comparison here: https://artificialanalysis.ai/models/gpt-4o-mini/providers.

For more on Microsoft licensing, visit our Microsoft vendor page at: https://www.schneider.im/software/microsoft/.

Please contact us for expert services on your specific Microsoft software and Online Services requirements and to request a quote today.

Share article