Self-Supervised Learning for Structural Inference in Knowledge Graphs: Beyond Manual Annotation
Abstract
Traditional knowledge graph construction relies on extensive human annotation and hand-crafted extraction patterns, incurring operational costs that prevent scaling to enterprise-grade deployments. This paper presents a comprehensive framework for self-supervised learning (SSL) that eliminates the annotation dependency by discovering semantic relationships algorithmically from raw textual corpora. Our approach integrates masked entity prediction, contrastive learning objectives, and graph topology exploitation to infer entity relationships without manual supervision. We develop hybrid neural architectures that combine transformer-based language understanding with graph neural network structural reasoning, yielding systems that jointly process semantic meaning and topological patterns. Extensive evaluation on benchmark datasets (DocRED, FB15K-237) and three industrial case studies demonstrates that SSL approaches achieve an 85-92% cost reduction over supervised methods while maintaining competitive performance (F1 scores of 0.73 for relations and 0.88 for entities). Our empirical analysis reveals that SSL methods excel at discovering rare relationship patterns and transfer well across domains, though challenges remain in handling negation, domain-specific jargon, and extremely rare entities. These findings suggest that SSL represents a paradigm shift toward democratized knowledge graph construction, enabling organizations to build comprehensive semantic infrastructures without prohibitive annotation costs.
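To make the two SSL objectives named above concrete, the sketch below shows a masked entity prediction loss and an InfoNCE-style contrastive loss pairing a transformer-derived entity view with a GNN-derived structural view. This is a minimal illustration under our own assumptions, not the paper's released implementation; the tensor shapes, the 0.07 temperature, the 15% mask rate, and all function names are illustrative.

```python
# Minimal sketch of the two self-supervised objectives summarized in the
# abstract. Illustrative only: shapes, temperature, and mask rate are
# assumptions, not values reported by the paper.
import torch
import torch.nn.functional as F

def info_nce_loss(anchor: torch.Tensor, positive: torch.Tensor,
                  temperature: float = 0.07) -> torch.Tensor:
    """Contrastive objective: row i of `positive` is the positive for row i
    of `anchor`; every other row in the batch serves as a negative."""
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    logits = anchor @ positive.t() / temperature   # (B, B) cosine similarities
    targets = torch.arange(anchor.size(0))         # positives lie on the diagonal
    return F.cross_entropy(logits, targets)

def masked_entity_loss(token_logits: torch.Tensor, entity_ids: torch.Tensor,
                       mask: torch.Tensor) -> torch.Tensor:
    """Masked entity prediction: cross-entropy over the entity vocabulary,
    computed only at masked positions."""
    vocab = token_logits.size(-1)
    return F.cross_entropy(token_logits[mask].view(-1, vocab),
                           entity_ids[mask].view(-1))

if __name__ == "__main__":
    # Toy usage: random tensors stand in for encoder outputs.
    B, L, D, V = 8, 32, 256, 10_000
    text_emb = torch.randn(B, D)    # transformer view of each entity mention
    graph_emb = torch.randn(B, D)   # GNN structural view of the same entities
    logits = torch.randn(B, L, V)   # per-position logits over entity vocabulary
    ids = torch.randint(0, V, (B, L))
    mask = torch.rand(B, L) < 0.15  # mask roughly 15% of positions
    loss = info_nce_loss(text_emb, graph_emb) + masked_entity_loss(logits, ids, mask)
    print(f"combined SSL loss: {loss.item():.3f}")
```

In this setup, no labels are required: the masking targets come from the text itself, and the contrastive pairing comes from aligning the textual and topological views of the same entity, which is the sense in which the framework avoids manual supervision.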