Inventors
IV Ramakrishnan, Saikat Mukherjee, Guizhen Yang, Hasan Davulcu
Publication date
2005/3/10
Patent office
US
Application number
10658312
Description
(57) ABSTRACT A method for extracting an attribute occurrence from a template-generated semi-structured document comprising multi-attribute data records comprises identifying a first set of attribute occurrences in the template-generated semi-structured document using an ontology. The method further comprises determining a boundary of each multi-attribute data record in the template-generated semi-structured document, learning a pattern for an attribute corresponding to an identified attribute occurrence of the first set in the template-generated semi-structured document, and applying the pattern within the boundary of each multi-attribute data record in the template-generated semi-structured document to extract a second set of attribute occurrences.
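The four steps named in the abstract (ontology-based identification, record boundary detection, pattern learning, pattern application) can be illustrated with a minimal sketch. This is not the patented method: the abstract does not specify how boundaries or patterns are computed, so the sketch makes loud simplifying assumptions. The page is assumed to be already flattened into (tag_path, text) leaf nodes, a record is assumed to start wherever the first node's tag path recurs, and a "pattern" is assumed to be simply the most common tag path among the ontology hits. All names (ONTOLOGY, identify, record_boundaries, learn_patterns, apply_patterns) and the sample data are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy ontology: attribute name -> known keywords.
ONTOLOGY = {"model": {"canon", "nikon"}}

# Hypothetical template-generated page, flattened to (tag_path, text) leaves.
NODES = [
    ("tr/td.name", "Canon A520"), ("tr/td.price", "$199"),
    ("tr/td.name", "Nikon L1"), ("tr/td.price", "$249"),
    ("tr/td.name", "Olympus FE"), ("tr/td.price", "$179"),
]

def identify(nodes, ontology):
    """Step 1: first set of occurrences, found by ontology keyword match."""
    hits = []
    for i, (_path, text) in enumerate(nodes):
        words = set(text.lower().split())
        for attr, keywords in ontology.items():
            if words & keywords:
                hits.append((i, attr))
    return hits

def record_boundaries(nodes):
    """Step 2 (assumption): a record starts where the first tag path recurs."""
    if not nodes:
        return []
    start_path = nodes[0][0]
    starts = [i for i, (path, _text) in enumerate(nodes) if path == start_path]
    ends = starts[1:] + [len(nodes)]
    return list(zip(starts, ends))

def learn_patterns(nodes, hits):
    """Step 3 (assumption): pattern = most common tag path among the hits."""
    paths = defaultdict(Counter)
    for i, attr in hits:
        paths[attr][nodes[i][0]] += 1
    return {attr: c.most_common(1)[0][0] for attr, c in paths.items()}

def apply_patterns(nodes, boundaries, patterns):
    """Step 4: second set, extracted by pattern inside each record boundary."""
    found = []
    for rec, (lo, hi) in enumerate(boundaries):
        for path, text in nodes[lo:hi]:
            for attr, pattern in patterns.items():
                if path == pattern:
                    found.append((rec, attr, text))
    return found

hits = identify(NODES, ONTOLOGY)
bounds = record_boundaries(NODES)
patterns = learn_patterns(NODES, hits)
second_set = apply_patterns(NODES, bounds, patterns)
```

In this toy run the ontology only knows two of the three brands, but the learned tag-path pattern also picks up the third record's value, which is the point of extracting a second set within each record boundary.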
Total citations
Citations per year chart, 2006 to 2024