AI Model

Meta: Llama 3.2 90B Vision Instruct

Meta: Llama 3.2 90B Vision Instruct logoMeta
Text Generation
Vision
About Llama 3.2 90B Vision Instruct

The Llama 90B Vision model is a top-tier, 90-billion-parameter multimodal model designed for the most challenging visual reasoning and language tasks. It offers unparalleled accuracy in image captioning, visual question answering, and advanced image-text comprehension. Pre-trained on vast multimodal datasets and fine-tuned with human feedback, the Llama 90B Vision is engineered to handle the most demanding image-based AI tasks.

This model is perfect for industries requiring cutting-edge multimodal AI capabilities, particularly those dealing with complex, real-time visual and textual analysis.

Click here for the [original model card](https://github.com/meta-llama/llama-models/blob/main/models/llama3_2/MODEL_CARD_VISION.md).

Usage of this model is subject to [Meta's Acceptable Use Policy](https://www.llama.com/llama3/use-policy/).

Specifications
Provider
Meta
Context Length
32,768 tokens
Input Types
text, image
Output Types
text
Category
Llama3
Added
9/25/2024

Frequently Asked Questions

Common questions about Llama 3.2 90B Vision Instruct

Use Llama 3.2 90B Vision Instruct and 200+ more models

Access all the best AI models in one platform. No API keys, no switching between apps.