Mixture of Experts, yet another new type of architecture

Understanding Mixture of Experts and Mistral in Machine Learning

Introduction

Are you intrigued by the advancements in machine learning but don’t have an extensive academic background in the field? Don’t worry! In this blog post, we’ll explore two fascinating concepts in machine learning: Mixture of Experts (MoE) and Mistral. We’ll break down these ideas in a way that’s accessible and engaging, without compromising on the exciting details.

Read More

Mamba, the novel architecture that threatens Transformers

Mamba is a new state space model architecture that has shown promising performance on information-dense tasks such as language modeling, where previous subquadratic models fall short of Transformers. In this blog post, we will explore the key features of Mamba and its potential applications.
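To give a flavour before you click through, here is a bare-bones sketch of the linear state-space recurrence that this family of models is built on. It is illustrative only: the function name, dimensions, and fixed A, B, C matrices are assumptions for the example, and Mamba itself adds input-dependent (selective) parameters plus a hardware-aware parallel scan that are omitted here. The point it shows is why such models are subquadratic: the hidden state is updated once per token, so cost grows linearly with sequence length rather than quadratically as in attention.

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """Linear state-space recurrence: h_t = A @ h_{t-1} + B * x_t, y_t = C @ h_t."""
    h = np.zeros(A.shape[0])
    ys = []
    for x_t in x:                # one constant-cost state update per token -> O(seq_len)
        h = A @ h + B * x_t
        ys.append(C @ h)
    return np.array(ys)

# Usage: a random 3-dimensional state filtering a length-16 input signal.
rng = np.random.default_rng(0)
A = 0.9 * np.eye(3)              # decaying state transition (illustrative values)
B = rng.normal(size=3)           # input projection
C = rng.normal(size=3)           # output projection
y = ssm_scan(rng.normal(size=16), A, B, C)
print(y.shape)                   # (16,)
```

The full post goes into how Mamba makes these parameters depend on the input, which is what lets it compete with Transformers on language.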

Read More