BFS (Breadth-First Search) pathfinding on Wikipedia treats the encyclopedia as a directed graph where each of its 60 million articles is a node and each internal hyperlink is an edge. Starting from a source article, the algorithm explores every linked article at distance 1, then distance 2, then distance 3, expanding level by level until it reaches the target. Because BFS explores all nodes at a given depth before moving deeper, it guarantees finding the shortest path. On Wikipedia, most article pairs are connected within 3 to 6 hops, consistent with small-world network theory.
BFS is one of the most fundamental algorithms in computer science. When applied to Wikipedia's link graph, it becomes a tool for discovering how human knowledge is connected.
Wikipedia as a Graph
Wikipedia is not just an encyclopedia — it is a directed graph. Each article is a node. Each internal hyperlink (a link from one article to another) is a directed edge. The English Wikipedia alone contains over 60 million articles connected by hundreds of millions of these edges.
These links are not random. Every internal hyperlink was placed by a Wikipedia editor who judged that the two concepts are related. This makes the graph semantically meaningful — a link from "Physics" to "Wave" encodes a factual relationship: wave mechanics is a branch of physics.
How BFS Works
Breadth-First Search is a queue-based graph traversal algorithm:
- Start: Add the source node to a queue. Mark it as visited.
- Process: Remove the first node from the queue. Check all its neighbors (linked articles).
- Expand: For each unvisited neighbor, mark it as visited and add it to the queue.
- Repeat: Continue until the target node is found or the queue is empty.
Time complexity: O(V+E) — where V is the number of vertices (articles) visited and E is the number of edges (links) traversed. In practice, BFS on Wikipedia rarely needs to visit more than a few thousand articles because paths are typically short.
Space complexity: O(V) — the queue and visited set both grow proportionally to the number of nodes explored.
Guarantee: BFS always finds the shortest path in an unweighted graph. Since all Wikipedia links have equal weight (a link is a link), BFS is the optimal algorithm for this problem.
What a BFS Path Means
Each hop in a BFS path is a verified Wikipedia hyperlink. The intermediate articles are the conceptual bridges between two ideas. They are not inferred or guessed — they exist in the editorial structure of Wikipedia.
Quantum Mechanics → Mathematics → Logic → Philosophy (3 hops)
Biology → Population dynamics → Mathematical model → Economics (3 hops)
Chess → Strategy → Military science → Sun Tzu (3 hops)
Each path tells a story about how knowledge flows from one domain to another through shared concepts.
Small-World Networks
Wikipedia exhibits small-world properties — a concept from network science formalized by Duncan Watts and Steven Strogatz in 1998. Small-world networks have two key characteristics:
- High clustering — articles within a topic area are densely interconnected
- Short average path lengths — any two articles can be reached in a small number of hops
This mirrors Stanley Milgram's 1967 "six degrees of separation" experiment, which found that any two people in the United States were connected by an average of about six social links. On Wikipedia, the average path length between articles is typically 3 to 4 hops in the largest connected component.
BFS vs Other Graph Algorithms
Why not Dijkstra's algorithm?
Dijkstra's algorithm finds shortest paths in weighted graphs — where edges have different costs. Wikipedia links are unweighted (every link has the same cost), so Dijkstra adds unnecessary computational overhead without any benefit.
Why not DFS (Depth-First Search)?
DFS explores one branch as deep as possible before backtracking. It does not guarantee finding the shortest path. It might find a 20-hop path when a 3-hop path exists.
Why not A*?
A* uses a heuristic function to guide the search toward the target. For geographic routing (where distance is a natural heuristic), A* is excellent. For semantic connections on Wikipedia, defining a meaningful heuristic is difficult — how do you estimate the "distance" between "Physics" and "Music" without already knowing the answer?
Limitations of BFS Alone
BFS finds editorial connections — paths through Wikipedia's link structure. But it misses:
- Statistical similarity — two concepts with no direct links but very similar content
- Ontological relationships — shared ancestors in Wikidata's classification hierarchy
- Formal logical connections — shared properties that create verifiable propositions
This is why MapOfLogic combines BFS with three additional methods: SPARQL reasoning, TF-IDF cosine similarity, and formal logic. Each method reveals a different type of connection that BFS alone cannot detect.
Find the hidden connection between any two ideas
LAUNCH MAPOFLOGIC →