04 Parameters

Introduction

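  • Parameters, also called weights, are the numbers a model learns during training. They determine which outputs the model generates, because the model uses them to predict the next token. Training on large amounts of data, followed by fine-tuning, adjusts the parameters so that those predictions improve.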
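To make this concrete, here is a minimal sketch of how weights steer next-token prediction. Everything in it is a made-up illustration: the four-word vocabulary, the 3-number context vector, and both weight matrices are hypothetical, and a real language model has many layers and billions of weights rather than 12.

```python
import math

# Hypothetical toy "model": one weight matrix that maps a 3-number context
# vector to scores over a 4-token vocabulary. All values are made up.
VOCAB = ["the", "cat", "sat", "mat"]

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def next_token_probs(weights, context):
    """Score each vocabulary token as a weighted sum of the context features."""
    scores = [sum(w * x for w, x in zip(row, context)) for row in weights]
    return softmax(scores)

def count_parameters(weights):
    """Every number in the weight matrix counts as one parameter."""
    return sum(len(row) for row in weights)

context = [1.0, 0.5, -1.0]
weights_a = [[0.2, 0.1, 0.0], [1.5, 0.3, -0.2], [0.0, 0.4, 0.9], [-0.5, 0.2, 0.1]]
weights_b = [[0.2, 0.1, 0.0], [0.0, 0.3, -0.2], [0.0, 0.4, 0.9], [2.0, 0.2, 0.1]]

for name, weights in (("weights_a", weights_a), ("weights_b", weights_b)):
    probs = next_token_probs(weights, context)
    best = VOCAB[probs.index(max(probs))]
    print(f"{name}: {count_parameters(weights)} parameters, predicts '{best}'")
```

Both matrices have the same number of parameters and see the same context, yet they favor different next tokens. Training is the process of nudging those numbers until the favored tokens match the data.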

Number of Parameters

  • GPT-1 - 117 million
  • GPT-2 - 1.5 billion
  • Gemma - 2 billion
  • Mixtral 8x22B - about 141 billion in total (a mixture-of-experts model, so only part of them is active for each token)
  • GPT-3 - 175 billion
  • Llama 3.1 - 8 billion, 70 billion and 405 billion
  • GPT-4 - reportedly about 1.76 trillion (an unofficial estimate; OpenAI has not disclosed the figure)
  • Latest models - Not disclosed
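The counts for the GPT models in this list can be roughly reproduced from their architecture. The sketch below uses a common rule of thumb for decoder-only transformers, about 12 × layers × width² weights plus the token embedding table. The function name `estimate_parameters` is hypothetical, and the layer counts, widths, and vocabulary sizes are the values I understand the published configurations to use, so treat them as assumptions. The formula ignores biases, layer norms, and position embeddings, so it is an approximation rather than an exact count.

```python
def estimate_parameters(n_layers, d_model, vocab_size=0):
    """Rough parameter count for a decoder-only transformer.

    Each layer holds about 4 * d_model^2 attention weights and
    8 * d_model^2 feed-forward weights (assuming a 4x hidden expansion),
    giving 12 * d_model^2 per layer. The token embedding table adds
    vocab_size * d_model. Biases, layer norms, and position embeddings
    are ignored.
    """
    return 12 * n_layers * d_model ** 2 + vocab_size * d_model

# (layers, width, vocabulary size): assumed values, see the note above
configs = {
    "GPT-1": (12, 768, 40478),
    "GPT-2": (48, 1600, 50257),
    "GPT-3": (96, 12288, 50257),
}

for name, (layers, width, vocab) in configs.items():
    estimate = estimate_parameters(layers, width, vocab)
    print(f"{name}: roughly {estimate / 1e9:.2f} billion parameters")
```

Under these assumptions the estimates land within a few percent of the 117 million, 1.5 billion, and 175 billion figures listed above. The rule does not carry over directly to mixture-of-experts models such as Mixtral, where the feed-forward weights are replicated per expert.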