
Let's walk through the important definitions in AI

Artificial intelligence (AI) - A broad discipline with the goal of creating intelligent machines, as opposed to the natural intelligence of humans and animals. While artificial general intelligence and artificial superintelligence (AGI and ASI) are terms that don’t have agreed-upon definitions, we use them to describe machines that could match (AGI) and then exceed (ASI) the full range of human cognitive ability across all economically valuable tasks.


AI Agent - An AI-powered system that can take actions in an environment. For example, an LLM that has access to a suite of tools and has to decide which one to use in order to accomplish a task that it has been prompted to do.
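A minimal sketch of this decision loop, with hypothetical toy tools and a keyword rule standing in for the LLM's tool choice:

```python
# Minimal sketch of an agent choosing a tool. `calculator` and
# `word_count` are hypothetical toy tools; the keyword routing below
# stands in for an LLM deciding which tool fits the task.
def calculator(expression: str) -> str:
    return str(eval(expression))  # toy arithmetic tool; never eval untrusted input

def word_count(text: str) -> str:
    return str(len(text.split()))

TOOLS = {"calculator": calculator, "word_count": word_count}

def agent_step(task: str) -> str:
    # Stand-in for the LLM's decision: route by whether the task has digits.
    tool_name = "calculator" if any(c.isdigit() for c in task) else "word_count"
    return TOOLS[tool_name](task)
```

Calling agent_step("2+3") routes to the calculator; a real agent would let the model emit the tool choice and arguments itself.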


AI Safety - A field that studies and attempts to mitigate the risks (minor to catastrophic) which future AI could pose to humanity.


Context window - The number of input tokens that an LLM can attend to while answering a user’s prompt.
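In practice, applications often truncate conversation history to fit the window, keeping the most recent tokens. A sketch (window size is illustrative; real models use thousands to millions of tokens):

```python
# Sketch: keep only the most recent tokens that fit a fixed context
# window, as chat applications commonly do. The window size is a toy
# value for illustration.
CONTEXT_WINDOW = 8  # tokens

def fit_to_window(tokens: list[str], window: int = CONTEXT_WINDOW) -> list[str]:
    # Drop the oldest tokens; the model can only attend to the last `window`.
    return tokens[-window:]
```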


Diffusion - An algorithm that iteratively denoises an artificially corrupted signal in order to generate new, high-quality outputs. In recent years it has been at the forefront of image generation and protein design.
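The structure of the reverse (denoising) loop can be sketched as below. This is not a trained diffusion model: the "noise prediction" cheats by knowing the clean target, purely to show the iterative-refinement shape of a DDPM-style sampler.

```python
import numpy as np

# Structural sketch of a diffusion sampler's reverse loop. The noise
# "prediction" below is a hypothetical stand-in for a trained network;
# it cheats by knowing the clean target, to illustrate the iteration.
def sample(target: np.ndarray, steps: int = 50, rng=np.random.default_rng(0)):
    x = rng.normal(size=target.shape)          # start from pure noise
    for t in range(steps):
        predicted_noise = x - target           # stand-in for the model's output
        x = x - predicted_noise / (steps - t)  # remove a fraction of noise per step
    return x                                   # gradually refined toward clean data
```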


Environment - The world an AI agent acts in. It receives the agent’s actions and returns the next observation and often a reward (i.e. a signal of the action being good or bad). In this context, trajectories are the time-ordered record of an agent’s experience in an environment, typically tuples like (observation/state, action, reward, next observation) from start to finish. These trajectories are used for RL.
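A toy agent-environment loop that collects such a trajectory (the number-line environment and the fixed "always step right" policy are illustrative):

```python
# Sketch of an agent-environment loop collecting a trajectory of
# (observation, action, reward, next observation) tuples. The toy
# environment is a number line; reaching state 3 ends the episode.
class LineEnv:
    def __init__(self):
        self.state = 0

    def step(self, action: int):
        self.state += action                     # action is -1 or +1
        reward = 1.0 if self.state == 3 else 0.0
        done = self.state == 3
        return self.state, reward, done

env = LineEnv()
trajectory = []
obs, done = env.state, False
while not done:
    action = 1                                   # a fixed "always right" policy
    next_obs, reward, done = env.step(action)
    trajectory.append((obs, action, reward, next_obs))
    obs = next_obs
```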


Function calling / tool use - Structured calls that let models invoke APIs, search, code, or calculators with typed arguments and schemas.
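A sketch of what such a structured call looks like, loosely following the JSON-Schema style several LLM APIs use (field names vary by provider; these are illustrative):

```python
import json

# Illustrative tool schema in the JSON-Schema style used by several
# LLM APIs (exact field names vary by provider).
weather_tool = {
    "name": "get_weather",
    "description": "Look up current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

# Instead of free-form text, the model emits a structured call with
# typed arguments that the application can parse and execute:
model_output = json.dumps({"tool": "get_weather", "arguments": {"city": "Paris"}})
call = json.loads(model_output)
```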


Generative AI - A family of AI systems that are capable of generating new content (e.g. text, images, audio, or 3D assets) based on 'prompts'.


Graphics Processing Unit (GPU) - The workhorse AI semiconductor that enables a large number of calculations to be computed in parallel.


(Large) Language model (LM, LLM) - A model trained on vast amounts of (often) textual data to predict the next word in a self-supervised manner.
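The next-word objective can be illustrated with a tiny bigram model: count which word follows which in a corpus, then predict the most frequent follower. Real LLMs learn this with neural networks over tokens rather than counts, but the prediction target is the same.

```python
from collections import Counter, defaultdict

# Toy illustration of next-word prediction via bigram counts. The
# corpus is illustrative; real LLMs learn the same objective with
# neural networks over far larger data.
corpus = "the cat sat on the mat the cat ran".split()

follower_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follower_counts[prev][nxt] += 1   # count each observed (word, next word) pair

def predict_next(word: str) -> str:
    # Predict the most frequent follower seen in training.
    return follower_counts[word].most_common(1)[0][0]
```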


Mixture-of-Experts (MoE) - A model type where only a few expert blocks activate per token, giving high capacity at lower compute per step.
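The routing idea can be sketched with toy shapes: a router scores every expert per token, but only the top-k experts actually run, so per-token compute stays low while total parameter count is high.

```python
import numpy as np

# Sketch of MoE routing with toy shapes. A router scores all experts,
# but only the top-k experts (here simple linear maps) run per token.
rng = np.random.default_rng(0)
n_experts, d = 4, 8
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]
router = rng.normal(size=(d, n_experts))

def moe_forward(x: np.ndarray, k: int = 2) -> np.ndarray:
    scores = x @ router                           # one score per expert
    top = np.argsort(scores)[-k:]                 # indices of the top-k experts
    weights = np.exp(scores[top])
    weights /= weights.sum()                      # softmax over the chosen experts
    # Only the selected experts compute; the rest are skipped entirely.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))
```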


Prompt - A user input often written in natural language that is used to instruct an LLM to generate something or take action.


Reasoning model - A model that plans and verifies its thinking as it generates output tokens, often via test-time compute and post-hoc checking. The model’s explicit step-by-step reasoning trace (intermediate tokens that lay out calculations, sub-goals, and logical steps en route to an answer) is called a Chain of Thought (CoT).


Reinforcement learning (RL) - An area of ML in which software agents learn goal-oriented behavior (called a “policy”) by trial and error in an environment that provides rewards or penalties in response to their actions.
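A minimal trial-and-error sketch: an epsilon-greedy agent on a two-armed bandit. The arm reward probabilities are illustrative; the "policy" here is the simple rule "usually pull the arm with the best running estimate, occasionally explore".

```python
import random

# Minimal trial-and-error learning sketch: epsilon-greedy on a
# two-armed bandit with illustrative reward probabilities.
random.seed(0)
arm_probs = [0.2, 0.8]                 # true (hidden) reward probabilities
estimates = [0.0, 0.0]                 # the agent's running value estimates
counts = [0, 0]

for step in range(2000):
    if random.random() < 0.1:          # explore occasionally
        arm = random.randrange(2)
    else:                              # otherwise exploit the best estimate
        arm = max(range(2), key=lambda a: estimates[a])
    reward = 1.0 if random.random() < arm_probs[arm] else 0.0
    counts[arm] += 1
    # Incremental average: nudge the estimate toward the observed reward.
    estimates[arm] += (reward - estimates[arm]) / counts[arm]
```

After enough trials the agent's estimate for the better arm dominates, and exploitation concentrates pulls on it.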


Test-time compute (or inference-time compute) - Spending more inference budget (longer chains, multiple samples, self-consistency) to raise accuracy without changing weights.
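Self-consistency can be sketched as sampling many noisy answers and majority-voting. The `noisy_solver` below is a hypothetical stand-in for sampling an LLM; each individual sample is right only 60% of the time, but the vote is far more reliable.

```python
from collections import Counter
import random

# Sketch of test-time compute via self-consistency: sample several
# noisy "answers" and majority-vote. `noisy_solver` is a hypothetical
# stand-in for sampling an LLM at nonzero temperature.
random.seed(0)

def noisy_solver() -> int:
    # The correct answer is 42; each sample is right only 60% of the time.
    return 42 if random.random() < 0.6 else random.randrange(100)

def self_consistency(n_samples: int) -> int:
    # More samples = more inference compute = higher vote reliability,
    # all without changing any model weights.
    votes = Counter(noisy_solver() for _ in range(n_samples))
    return votes.most_common(1)[0][0]
```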


Transformer - A model architecture at the core of most state-of-the-art (SOTA) ML research. It is composed of multiple “attention” layers which learn which parts of the input data are the most important for a given task. Transformers started in NLP (specifically machine translation) and subsequently were expanded into computer vision, audio, and other modalities.
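The core attention operation of a single head can be written in a few lines: scores = softmax(QKᵀ/√d), output = scores·V. Toy shapes; real transformers stack many heads and layers.

```python
import numpy as np

# Single attention head: each query attends over all keys, and the
# output is the attention-weighted mix of the values. Toy shapes only.
def attention(Q: np.ndarray, K: np.ndarray, V: np.ndarray) -> np.ndarray:
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # weighted mix of values
```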


Vision-Language-Action Model (VLAM) - A model that jointly learns from visual inputs, natural language, and embodied interactions to not only interpret and describe the world but also to plan and execute actions within it. Without the action component, this model becomes a vision-language model (VLM).


World model - A model that predicts next states conditioned on actions, enabling real-time, interactive control.
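The action-conditioned prediction idea can be sketched as a transition table learned from experience: given (state, action), the model predicts the next state, letting an agent "imagine" rollouts without touching the real environment. The 1-D gridworld is a toy stand-in for a learned neural model.

```python
# Sketch of a world model as a learned transition lookup: predict the
# next state from (state, action). The 1-D gridworld transitions are a
# toy stand-in for a learned neural predictor.
transitions = {}                                  # (state, action) -> next state

# "Training": record the observed dynamics of a 5-cell line.
for s in range(5):
    transitions[(s, "right")] = min(s + 1, 4)
    transitions[(s, "left")] = max(s - 1, 0)

def imagine(state: int, actions: list[str]) -> list[int]:
    # Roll the model forward without the real environment.
    states = [state]
    for a in actions:
        states.append(transitions[(states[-1], a)])
    return states
```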



