Abstract
This paper presents a table structure understanding algorithm designed using optimization methods. The algorithm is probability based: its probabilities are estimated from geometric measurements made on the various entities in a large training set. The methodology includes a global parameter optimization scheme, a novel automatic table ground truth generation system, and a table structure understanding performance evaluation protocol. On a document data set containing 518 table entities and 10,934 cell entities, the algorithm achieved an accuracy of 96.76% at the cell level and 98.32% at the table level.
Original language | English (US) |
---|---|
Pages (from-to) | 1479-1497 |
Number of pages | 19 |
Journal | Pattern Recognition |
Volume | 37 |
Issue number | 7 |
State | Published - Jul 2004 |
Externally published | Yes |
Keywords
- Document image analysis
- Document layout analysis
- Non-parametric statistical modeling
- Optimization
- Pattern recognition
- Performance evaluation
- Table structure understanding
ASJC Scopus subject areas
- Software
- Signal Processing
- Computer Vision and Pattern Recognition
- Artificial Intelligence