Most legal AI tools use the wrong data structure
Almost all tools run on the same underlying models (the models behind ChatGPT or Claude). The real difference lies in how they structure and search legal sources.
The problem with vector search
The most commonly used search method is called vector search. This technique cuts legal sources into text fragments, converts them into vectors, and then searches for fragments that are linguistically similar. The system does not, for example, automatically “understand” whether “liability” is meant in a civil or criminal law context. Two texts may be very similar in wording but have different legal implications.
Vector search brings you close to potentially relevant sources, but that "proximity" is defined by linguistic patterns rather than legal logic. That works fine for many applications, but in legal work this imprecision is problematic.
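To make the limitation concrete, here is a minimal sketch of vector search. A toy bag-of-words "embedding" stands in for a real embedding model, and the fragments and query are invented examples; the mechanics (embed, then rank by cosine similarity) are the same either way:

```python
from math import sqrt

def embed(text: str) -> dict[str, float]:
    # Toy embedding: word counts. Real systems use dense vectors
    # produced by a neural model, but the ranking logic is identical.
    vec: dict[str, float] = {}
    for word in text.lower().split():
        vec[word] = vec.get(word, 0.0) + 1.0
    return vec

def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    dot = sum(a[w] * b.get(w, 0.0) for w in a)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

fragments = [
    "liability of the employer for damage caused by an employee",   # civil
    "criminal liability of the employer for workplace offences",    # criminal
]
query = embed("employer liability for damage")

# Both fragments score as "similar" -- the metric sees shared words,
# not the civil/criminal distinction that matters legally.
ranked = sorted(fragments, key=lambda f: cosine(query, embed(f)), reverse=True)
```

Note that the criminal-law fragment still ranks as a close match: the similarity score reflects vocabulary overlap, which is exactly the imprecision described above.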
Why knowledge graphs work differently
A knowledge graph is essentially a network in which each legal source is connected
to other sources by explicit relationships. For example, it records which judgments
refer to which articles of law, which opinions of the Advocate General belong to which judgments, which lower court rulings are linked to a judgment or ruling, and whether, and if so when, articles have been amended.
The system follows the same legal route that a lawyer would follow. You no longer search on the basis of linguistic similarity, but on the basis of explicit (legal) relationships.
Where vector search looks for “Which texts are similar to this question?”, a knowledge graph searches for “Which sources have a legally relevant relationship?”.
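A sketch of this idea, with hypothetical source identifiers and illustrative edge types (`cites`, `advises_on`, `appealed_in`, `amends`) standing in for the relations described above:

```python
from collections import defaultdict

# Hypothetical sources and relations, mirroring the examples above:
# a judgment citing a statute article, an AG opinion attached to the
# judgment, a lower ruling appealed in it, and an amended article.
edges = [
    ("judgment:HR-2021-123", "cites", "article:BW-6:162"),
    ("opinion:AG-2021-123", "advises_on", "judgment:HR-2021-123"),
    ("ruling:Hof-2019-45", "appealed_in", "judgment:HR-2021-123"),
    ("article:BW-6:162-v2", "amends", "article:BW-6:162"),
]

graph: dict[str, list[tuple[str, str]]] = defaultdict(list)
for src, relation, dst in edges:
    graph[src].append((relation, dst))
    graph[dst].append((f"inverse:{relation}", src))  # traverse both directions

def related(node: str, depth: int = 2) -> set[str]:
    # Collect every source reachable within `depth` explicit relations:
    # a traversal over modelled links, not a text-similarity lookup.
    seen, frontier = {node}, [node]
    for _ in range(depth):
        frontier = [dst for n in frontier for _, dst in graph[n] if dst not in seen]
        seen.update(frontier)
    return seen - {node}

# Starting from the statute article, traversal surfaces the judgment,
# the AG opinion, the lower ruling, and the amending article.
related("article:BW-6:162")
```

The design choice is that every hop is an explicit, typed relationship: the system returns the AG opinion because it is modelled as belonging to a judgment that cites the article, not because its wording happens to resemble the query.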
Conclusion
For good legal AI, you need more than semantic similarity. You need a legal data
structure that does not infer the relationships and hierarchy between sources from language patterns, but explicitly models the normative framework.