Dec 12, 2023
How to Get Started with the New Mixtral-8x7B Mixture of Experts Model
French startup Mistral AI recently released their new sparse mixture of experts model called Mixtral-8x7B. This model has shown superior performance compared to the Llama 2 70B model in various benchmarks and offers six times faster inference. The best part is that Mixtral-8x7B is an open-weights model released under the Apache 2.0 license, making it accessible to anyone for their own projects.
If you're an experienced researcher or developer, you can directly download Mixtral-8x7B using the provided magnet link from Mistral. The download includes both Mixtral 8x7B and Mixtral 8x7B Instruct.
Alternatively, you can sign up for beta access to Mistral's AI platform, which allows easy access to the models through an API. If you have a business use-case, you can reach out to Mistral and work with their team to accelerate access. The Mixtral-8x7B model is available behind the mistral-small endpoint, and you'll also gain access to their most powerful model, mistral-medium.
Another option is to use Perplexity Labs, which provides a chat interface for using the instruction-tuned version of Mixtral-8x7B. Simply select it from the model selection dropdown in the bottom right corner.
Hugging Face is another platform that hosts both the base model and the instruction fine-tuned model of Mixtral-8x7B for chat-based inference. You can download ready-to-use checkpoints from the HuggingFace Hub or convert the raw checkpoints to the HuggingFace format. Detailed instructions on loading and running the model using Flash Attention 2 are available.
Together.AI offers one of the fastest inference stacks via API and specializes in FlashAttention. They have optimized their Together Inference Engine for Mixtral, and you can access the instruction fine-tuned model through their chat interface.
Fireworks is a platform that aims to bring fast, affordable, and customizable open LLMs (Language Learning Models) to developers. You can search for the mixtral-8x7b-instruct model on Fireworks once you log in.
If you prefer running models offline on your own computer, you can use LM Studio. This software supports running models locally on Mac, Windows, or Linux computers. LM Studio v0.2.9 is compatible with Mixtral 8x7B, and you can use models through the in-app Chat UI or an OpenAI compatible local server.
These are some of the best ways to get started with Mistral AI's Mixtral-8x7B Mixture of Experts Model. Choose the option that suits your needs and start exploring the capabilities of this powerful model.