Distantly supervised wrapper induction for semi-structured documents
Topics: Entity-based search, Knowledge Graph, Semantic Search
The patent describes systems and methods for distantly supervised wrapper induction on semi-structured documents. It covers generating and annotating training documents for the wrapper, then training the wrapper on those documents in two phases. To build the training set, the method identifies semi-structured web pages whose subject entity exists in a knowledge base, identifies target objects for that entity, connects the subject entity to those target objects, and annotates the pages for use in training.
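
The annotation step can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the toy `KNOWLEDGE_BASE` dictionary, the representation of a page as a mapping from node paths to text, the `Annotation` record, and the use of exact string matching after whitespace and case normalization. It shows only how known facts about a subject entity could be used to label page nodes without manual annotation; it does not cover the two-phase training of the wrapper.

```python
from dataclasses import dataclass

# Hypothetical toy knowledge base: subject entity -> list of (predicate, object).
KNOWLEDGE_BASE = {
    "Inception": [
        ("director", "Christopher Nolan"),
        ("release_year", "2010"),
    ],
}


@dataclass(frozen=True)
class Annotation:
    node_path: str   # where on the page the value was found
    predicate: str   # knowledge-base relation linking subject and object
    value: str       # the object as stored in the knowledge base


def normalize(text):
    """Collapse whitespace and ignore case so minor formatting differences still match."""
    return " ".join(text.split()).casefold()


def annotate_page(subject, nodes, kb=KNOWLEDGE_BASE):
    """Label page nodes whose text equals a known object of `subject`.

    `nodes` maps a node path to that node's text content (an assumed page model).
    """
    facts = kb.get(subject)
    if not facts:
        return []  # subject not in the knowledge base: page yields no labels

    # Index the subject's objects by normalized text for direct lookup.
    by_value = {}
    for predicate, obj in facts:
        by_value.setdefault(normalize(obj), []).append((predicate, obj))

    annotations = []
    for path, text in nodes.items():
        for predicate, obj in by_value.get(normalize(text), []):
            annotations.append(Annotation(path, predicate, obj))
    return annotations


page_nodes = {
    "/html/body/h1": "Inception",
    "/html/body/div[1]/span": " Christopher  Nolan ",
    "/html/body/div[2]/span": "2010",
    "/html/body/div[3]/span": "148 min",
}
labels = annotate_page("Inception", page_nodes)
```

In this sketch the runtime node (`148 min`) stays unlabeled because the toy knowledge base holds no matching fact, which is the usual trade-off of distant supervision: labels are cheap to produce but limited to what the knowledge base already knows.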