AI models operate through mathematical logic. Any data that an AI model operates on, including unstructured data such as text, audio or images, must be expressed numerically. Vector embedding is a way to convert an unstructured data point into an array of numbers that still expresses that data’s original meaning.
Take an example set of four sentences. The first two, both about artwork, and the last two, which share the keyword "dogs", are embedded nearer to one another than the first and third sentences, which share no common words or meanings.
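To make "nearer" concrete, here is a minimal sketch. It assumes cosine similarity as the measure of closeness (the text above does not name one), and the three-dimensional vectors are invented purely for illustration; they do not come from any real embedding model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings: made-up numbers standing in for four sentences.
toy_embeddings = {
    "artwork_1": [0.9, 0.1, 0.0],
    "artwork_2": [0.8, 0.2, 0.1],
    "dogs_1": [0.1, 0.9, 0.2],
    "dogs_2": [0.0, 0.8, 0.3],
}

sim_artwork = cosine_similarity(toy_embeddings["artwork_1"], toy_embeddings["artwork_2"])
sim_dogs = cosine_similarity(toy_embeddings["dogs_1"], toy_embeddings["dogs_2"])
sim_unrelated = cosine_similarity(toy_embeddings["artwork_1"], toy_embeddings["dogs_1"])

print(f"artwork pair:   {sim_artwork:.3f}")
print(f"dogs pair:      {sim_dogs:.3f}")
print(f"unrelated pair: {sim_unrelated:.3f}")
```

With these toy values, both related pairs score much higher than the unrelated pair. A real model would produce vectors with far more dimensions, but the comparison works the same way.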
Embeddings are a way of turning human language — like words, sentences, or even full documents — into numbers that a machine can understand.
But not just any numbers. The numbers are designed so that similar meanings result in similar vectors. For example, 'king' and 'queen' might be close together in this space. Relationships can show up as directions too: in many embedding models, the offset from 'man' to 'woman' points roughly the same way as the offset from 'king' to 'queen'.
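The "relationship as a direction" idea can be sketched with simple vector arithmetic. The two-dimensional vectors below are hypothetical, chosen by hand so that one axis loosely tracks "royalty" and the other "gender". Real embeddings are learned rather than hand-written, and their axes rarely have such clean interpretations.

```python
import math

# Hypothetical 2-d embeddings, invented for illustration only.
toy_vocab = {
    "king": [0.95, 0.90],
    "queen": [0.90, 0.10],
    "man": [0.10, 0.95],
    "woman": [0.15, 0.10],
    "apple": [0.50, 0.50],
}

def nearest_word(target, vocab, exclude):
    """Return the word whose vector has the smallest Euclidean distance to target."""
    candidates = {w: v for w, v in vocab.items() if w not in exclude}
    return min(candidates, key=lambda w: math.dist(target, candidates[w]))

# Start at 'king', remove the 'man' component, and add the 'woman' component.
analogy_vector = [
    k - m + w
    for k, m, w in zip(toy_vocab["king"], toy_vocab["man"], toy_vocab["woman"])
]

result = nearest_word(analogy_vector, toy_vocab, exclude={"king", "man", "woman"})
print(result)
```

The input words are excluded from the search. Otherwise the nearest neighbour of the result is often one of the words you started with.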
These vectors live in what's called a high-dimensional space, often with hundreds or thousands of dimensions. You can't visualize it directly, but conceptually it's like mapping language into a giant galaxy where meaning determines position, only with far more than three dimensions.
You can think of an embedding as the model's way of “feeling out” what something means, not just what it says.