25.2.2.2.1 Page Segmentation, General Evaluations

Page Segmentation. Document Analysis. Application, Document Layout. Covers the segmentation step more than the analysis of document structure.

Nadler, M.[Morton],
Document Segmentation and Coding Techniques,
CVGIP(28), No. 2, November 1984, pp. 240-262.
Elsevier DOI Survey, Page Segmentation. BibRef 8411

Pavlidis, T.[Theo], Zhou, J.Y.[Jiang-Ying],
Page Segmentation and Classification,
GMIP(54), No. 6, November 1992, pp. 484-496. Survey, Page Segmentation. BibRef 9211

Pavlidis, T.[Theo],
Page Segmentation by White Streams,
ICDAR91(945-953). BibRef 9100

Zlatopolsky, A.A.,
Automated Document Segmentation,
PRL(15), No. 7, July 1994, pp. 699-704. BibRef 9407

Leng, G.W., Mital, D.P., Yong, T.S., Kang, T.K.,
A Differential-Processing Extraction Approach to Text and Image Segmentation,
EngAAI(7), No. 6, December 1994, pp. 639-651. BibRef 9412

Jain, A.K.[Anil K.], Zhong, Y.[Yu],
Page Segmentation Using Texture Analysis,
PR(29), No. 5, May 1996, pp. 743-770.
Elsevier DOI 9605
BibRef
Earlier:
Page segmentation using texture discrimination masks,
ICIP95(III: 308-311).
IEEE DOI 9510
BibRef

Jain, A.K., Bhattacharjee, S.,
Text Segmentation Using Gabor Filters for Automatic Document Processing,
MVA(5), 1992, pp. 169-184. BibRef 9200

Jain, A.K., Bhattacharjee, S.K., Chen, Y.,
On texture in document images,
CVPR92(677-680).
IEEE DOI 0403
BibRef

Venkateswarlu, N.B., Boyle, R.D.,
New segmentation techniques for document image analysis,
IVC(13), No. 7, September 1995, pp. 573-583.
Elsevier DOI 0401
BibRef

Shih, F.Y., Chen, S.S.,
Adaptive Document Block Segmentation and Classification,
SMC-B(26), No. 5, October 1996, pp. 797-802.
IEEE Top Reference. Segments the page using run-length smoothing, then applies a rule-based classification into text, graphics, and picture. BibRef 9610
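As a rough illustration of the run-length smoothing step mentioned in the note above, here is a minimal sketch of the classic run-length smoothing idea on a binary image (1 = ink, 0 = background). It is not taken from the cited paper, whose adaptive variant may differ. The function names, the thresholds, and the choice to fill only gaps bounded by ink on both sides are assumptions made for illustration.

```python
def smooth_row(row, threshold):
    """Fill background runs (0s) of length <= threshold that lie
    between two foreground pixels (1s). Assumption: runs touching
    the row border are left unchanged."""
    out = list(row)
    last_fg = None  # index of the most recent foreground pixel
    for i, pixel in enumerate(row):
        if pixel:
            if last_fg is not None:
                gap = i - last_fg - 1
                if 0 < gap <= threshold:
                    for j in range(last_fg + 1, i):
                        out[j] = 1
            last_fg = i
    return out


def rlsa(image, h_threshold, v_threshold):
    """Smooth rows and columns separately, then combine the two
    results with a pixel-wise logical AND."""
    horizontal = [smooth_row(r, h_threshold) for r in image]
    columns = [list(c) for c in zip(*image)]
    vertical_t = [smooth_row(c, v_threshold) for c in columns]
    vertical = [list(r) for r in zip(*vertical_t)]
    return [[h & v for h, v in zip(hr, vr)]
            for hr, vr in zip(horizontal, vertical)]


# A short gap (2 pixels) is filled, a long gap (4 pixels) is kept.
print(smooth_row([1, 0, 0, 1, 0, 0, 0, 0, 1, 0], 2))
```

Connected components of the smoothed image would then serve as candidate blocks for the rule-based classification step.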

Patel, D.,
Page Segmentation for Document Image-Analysis Using a Neural-Network,
OptEng(35), No. 7, July 1996, pp. 1854-1861. 9608
BibRef

Patel, D., Stonham, T.J.,
Texture image classification and segmentation using RANK-order clustering,
ICPR92(III:92-95).
IEEE DOI 9208
BibRef

Payne, J.S., Stonham, T.J., Patel, D.,
Document segmentation using texture analysis,
ICPR94(B:380-382).
IEEE DOI 9410
BibRef

Etemad, K., Doermann, D.S., Chellappa, R.,
Multiscale Segmentation of Unstructured Document Pages Using Soft Decision Integration,
PAMI(19), No. 1, January 1997, pp. 92-96.
IEEE DOI 9702
BibRef
And:
Multiscale Document Page Segmentation Using Soft Decision Integration,
UMDTR3444, 1995.
WWW Link. BibRef
Earlier:
Page Segmentation Using Decision Integration and Wavelet Packets,
ICPR94(B:345-349).
IEEE DOI Classify regions of the page image into text or images. BibRef

Etemad, K.[Kamran],
Multi-Scale Discriminant Analysis and Recognition of Signals and Images,
Ph.D.Thesis, April 1996. BibRef 9604 UMDTR3629. The goal is to find efficient multi-scale representations that yield maximum between-class separations and minimum within-class scatters.
WWW Link. Also for Faces. BibRef

Chen, J.L.,
A Simplified Approach to the HMM Based Texture Analysis and Its Application to Document Segmentation,
PRL(18), No. 10, October 1997, pp. 993-1007. 9802
Markov model texture analysis. BibRef

Kise, K.[Koichi], Sato, A.[Akinori], Iwata, M.[Motoi],
Segmentation of Page Images Using the Area Voronoi Diagram,
CVIU(70), No. 3, June 1998, pp. 370-382.
DOI Link For evaluation:
See also Empirical Performance Evaluation Methodology and Its Application to Page Segmentation Algorithms. BibRef 9806

Hobby, J.D.[John D.],
Matching Document Images with Ground Truth,
IJDAR(1), No. 1, Spring 1998, pp. xx-yy. BibRef 9800
Earlier: ICDAR97(Tu-2B) 9708
In program, not in proceedings. BibRef

Cinque, L., Lombardi, L., Manzini, G.,
A Multiresolution Approach for Page Segmentation,
PRL(19), No. 2, February 1998, pp. 217-225. 9808

See also Shape-Description and Recognition by a Multiresolution Approach. BibRef

Cantoni, V., Cinque, L., Lombardi, L., Manzini, G.,
Page Segmentation Using a Pyramidal Architecture,
CAMP97(Session 6). BibRef 9700

Cinque, L., Levialdi, S., Lombardi, L., Tanimoto, S.,
Segmentation of page images having artifacts of photocopying and scanning,
PR(35), No. 5, May 2002, pp. 1167-1177.
Elsevier DOI 0202
BibRef

Cinque, L., Forino, L., Levialdi, S., Lombardi, L., Tanimoto, S.,
Understanding the page logical structure,
CIAP99(1003-1008).
IEEE DOI 9909
BibRef

Cinque, L., Levialdi, S., Malizia, A., de Rosa, F.,
DAN: An Automatic Segmentation and Classification Engine for Paper Documents,
DAS02(491 ff.).
Springer DOI 0303
BibRef

Cinque, L., Levialdi, S., Malizia, A.,
A system for the automatic layout segmentation and classification of digital documents,
CIAP03(201-206).
IEEE DOI 0310
BibRef

Liu, J.M., Tang, Y.Y.,
Distributed Autonomous Agents For Chinese Document Image Segmentation,
PRAI(12), No. 1, February 1998, pp. 97-118. 9806

See also Adaptive Image Segmentation With Distributed Behavior-Based Agents. BibRef

de Queiroz, R.L.,
Processing JPEG Compressed Images and Documents,
IP(7), No. 12, December 1998, pp. 1661-1672.
IEEE DOI 9812
BibRef

de Queiroz, R.L.,
Processing JPEG-Compressed Images,
ICIP97(II: 334-337).
IEEE DOI BibRef 9700

de Queiroz, R.L., Eschbach, R.,
Fast Segmentation of the JPEG Compressed Documents,
JEI(7), No. 2, April 1998, pp. 367-377. 9807
BibRef

de Queiroz, R.L., Eschbach, R.,
Segmentation of Compressed Documents,
ICIP97(III: 70-73).
IEEE DOI BibRef 9700

de Queiroz, R.L.[Ricardo L.],
Compression of Compound Documents,
ICIP99(I:209-213).
IEEE DOI BibRef 9900

Antonacopoulos, A.[Apostolos],
Page Segmentation Using the Description of the Background,
CVIU(70), No. 3, June 1998, pp. 350-369.
DOI Link BibRef 9806

Jain, A.K., Yu, B.,
Document Representation and Its Application to Page Decomposition,
PAMI(20), No. 3, March 1998, pp. 294-308.
IEEE DOI 9805
Generates a structured version of the document for editing, storage, retrieval, and analysis. Performs skew correction, segmentation, and labeling (text, table, image, drawing, and ruler). Some review of approaches. BibRef

Jain, A.K., Yu, B.,
Page segmentation using document model,
ICDAR97(34-38).
IEEE DOI 9708
BibRef

Yang, J.C.Y.[James Ching-Yu], Tsai, W.H.[Wen-Hsiang],
Document image segmentation and quality improvement by Moiré pattern analysis,
SP:IC(15), No. 9, July 2000, pp. 781-797.
Elsevier DOI 0008
BibRef

Mao, S.[Song], Kanungo, T.[Tapas],
Empirical Performance Evaluation Methodology and Its Application to Page Segmentation Algorithms,
PAMI(23), No. 3, March 2001, pp. 242-256.
IEEE DOI 0103
Survey, Page Segmentation. Evaluation, Page Segmentation. Creates separate test and training data and a computable performance metric, finds optimal parameters for each algorithm, then evaluates. Compares Voronoi (Kise; See also Segmentation of Page Images Using the Area Voronoi Diagram), Docstrum (O'Gorman; See also Document Spectrum for Page Layout Analysis, The), and Caere (commercial system; See also Caere). These three have about the same performance and are better than ScanSoft (commercial system; See also ScanSoft), which is better than the older X-Y cut (See also Prototype Document Image Analysis System for Technical Journals, A). A later analysis reaches a similar conclusion:
See also Performance Evaluation and Benchmarking of Six-Page Segmentation Algorithms. BibRef

Mao, S.[Song], Kanungo, T.[Tapas],
Software Architecture of PSET: A Page Segmentation Evaluation Toolkit,
IJDAR(4), No. 3, 2002, pp. 205-217.
Springer DOI 0205
BibRef
Earlier: UMD--TR4190, September 2000.
WWW Link. Evaluation, Page Segmentation. BibRef

Mao, S.[Song], Kanungo, T.[Tapas],
A Methodology for Empirical Performance Evaluation of Page Segmentation Algorithms,
UMD--TR4093, December 1999.
WWW Link. BibRef 9912

Mao, S., Kanungo, T.,
Automatic Training of Page Segmentation Algorithms: An Optimization Approach,
ICPR00(Vol IV: 531-534).
IEEE DOI 0009
BibRef

Kanungo, T., Mao, S.[Song],
Stochastic language models for style-directed layout analysis of document images,
IP(12), No. 5, May 2003, pp. 583-596.
IEEE DOI 0307
BibRef

Amin, A.[Adnan], Shiu, R.[Ricky],
Page Segmentation And Classification Utilizing Bottom-up Approach,
IJIG(1), No. 2, April 2001, pp. 345-361. 0104
BibRef

Deng, S.[Shulan], Latifi, S.[Shahram], Regentova, E.E.[Emma E.],
Document segmentation using polynomial spline wavelets,
PR(34), No. 12, December 2001, pp. 2533-2545.
Elsevier DOI 0110
BibRef

Regentova, E.E., Latifi, S., Chen, D., Taghva, K., Yao, D.,
Document analysis by processing JBIG-encoded images,
IJDAR(7), No. 4, September 2005, pp. 260-272.
Springer DOI 0512
BibRef

Diligenti, M.[Michelangelo], Frasconi, P.[Paolo], Gori, M.[Marco],
Hidden Tree Markov Models for Document Image Classification,
PAMI(25), No. 4, April 2003, pp. 520-524.
IEEE Abstract. 0304
Learning. Learn the concept of a set of documents of similar structure. BibRef

Diligenti, M., Gori, M., Maggini, M., Scarselli, F.,
Classification of HTML documents by Hidden Tree-Markov Models,
ICDAR01(849-853).
IEEE DOI 0109
BibRef

Haji, M.M., Katebi, S.D.,
An Efficient Text Segmentation Technique Based on Naive Bayes Classifier,
GVIP(05), No. V7, 2005, pp. 21-30.
HTML Version. BibRef 0500

Wang, Y.L.[Ya-Lin], Phillips, I.T.[Ihsin T.], Haralick, R.M.[Robert M.],
Document zone content classification and its performance evaluation,
PR(39), No. 1, January 2006, pp. 57-73.
Elsevier DOI 0512
Evaluation, Page Segmentation. BibRef
Earlier:
A Study on the Document Zone Content Classification Problem,
DAS02(212 ff.).
Springer DOI 0303
BibRef
And:
A method for document zone content classification,
ICPR02(III: 196-199).
IEEE DOI 0211
BibRef
Earlier: A1, A3, A2:
Zone content classification and its performance evaluation,
ICDAR01(540-544).
IEEE DOI 0109

See also Table structure understanding and its performance evaluation. BibRef

Leydier, Y.[Yann], Le Bourgeois, F.[Frank], Emptoz, H.[Hubert],
Text search for medieval manuscript images,
PR(40), No. 12, December 2007, pp. 3552-3567.
Elsevier DOI 0709
BibRef
Earlier:
Omnilingual Segmentation-Free Word Spotting for Ancient Manuscripts Indexation,
ICDAR05(I: 533-537).
IEEE DOI 0508
BibRef
Earlier:
Serialized unsupervised classifier for adaptative color image segmentation: application to digitized ancient manuscripts,
ICPR04(I: 494-497).
IEEE DOI 0409
Word-spotting; Medieval manuscripts BibRef

Le Bourgeois, F.[Frank], Kaileh, H.[Hala],
Automatic Metadata Retrieval from Ancient Manuscripts,
DAS04(75-89).
Springer DOI 0505
BibRef

Allier, B., Emptoz, H.,
Segmentation and typography extraction in document images using geodesic active regions,
ICPR04(I: 409-412).
IEEE DOI 0409
BibRef

Leydier, Y.[Yann], Ouji, A.[Asma], Le Bourgeois, F.[Frank], Emptoz, H.[Hubert],
Towards an omnilingual word retrieval system for ancient manuscripts,
PR(42), No. 9, September 2009, pp. 2089-2105.
Elsevier DOI 0905
Document indexing; Word-spotting; Word retrieval; Ancient documents; Segmentation-free; Omnilingual BibRef

Ouji, A.[Asma], Leydier, Y.[Yann], Le Bourgeois, F.[Frank],
Chromatic / Achromatic Separation in Noisy Document Images,
ICDAR11(167-171).
IEEE DOI 1111
BibRef

Shafait, F.[Faisal], Keysers, D.[Daniel], Breuel, T.M.[Thomas M.],
Performance Evaluation and Benchmarking of Six-Page Segmentation Algorithms,
PAMI(30), No. 6, June 2008, pp. 941-954.
IEEE DOI 0804
Survey, Page Segmentation. Evaluation, Page Segmentation. BibRef
Earlier:
Performance Comparison of Six Algorithms for Page Segmentation,
DAS06(368-379).
Springer DOI 0602
BibRef
And:
Pixel-Accurate Representation and Evaluation of Page Segmentation in Document Images,
ICPR06(I: 872-875).
IEEE DOI 0609
Also uses a dummy algorithm (no segmentation) as a minimum baseline. Compares X-Y Cut (See also Prototype Document Image Analysis System for Technical Journals, A), Run Length Smearing (See also Document Analysis System), Whitespace Analysis (See also Two Geometric Algorithms for Layout Analysis), Constrained textline detection, Docstrum (See also Document Spectrum for Page Layout Analysis, The), and Voronoi (See also Segmentation of Page Images Using the Area Voronoi Diagram). The last two, Docstrum and Voronoi, are generally the best choice. For a similar analysis:
See also Empirical Performance Evaluation Methodology and Its Application to Page Segmentation Algorithms. BibRef

Nagy, G.[George], Seth, S.C.[Sharad C.], Viswanathan, M.[Mahesh],
Comment: Projection Methods Require Black Border Removal,
PAMI(31), No. 4, April 2009, pp. 762-762.
IEEE DOI 0903
Flaw in page segmentation evaluation.
See also Performance Evaluation and Benchmarking of Six-Page Segmentation Algorithms. Relative to evaluation of:
See also Prototype Document Image Analysis System for Technical Journals, A. BibRef

Shafait, F.[Faisal], Keysers, D.[Daniel], Breuel, T.M.[Thomas M.],
Response to 'Projection Methods Require Black Border Removal',
PAMI(31), No. 4, April 2009, pp. 763-764.
IEEE DOI 0903

See also Performance Evaluation and Benchmarking of Six-Page Segmentation Algorithms. BibRef

Shafait, F.[Faisal], Breuel, T.M.[Thomas M.],
The Effect of Border Noise on the Performance of Projection-Based Page Segmentation Methods,
PAMI(33), No. 4, April 2011, pp. 846-851.
IEEE DOI 1103
Page segmentation is usually sensitive to border noise. BibRef
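To make the sensitivity concrete, the following is a small sketch (not from the cited papers) of why projection-based methods depend on border removal. Such methods look for empty valleys in a projection profile to place cuts, and a dark scanning border can fill those valleys. The toy images and the helper names `column_profile` and `whitespace_gaps` are hypothetical.

```python
def column_profile(image):
    """Count foreground pixels (1s) in each column of a binary image."""
    return [sum(col) for col in zip(*image)]


def whitespace_gaps(profile):
    """Return (start, end) index pairs of maximal runs of empty columns."""
    gaps, start = [], None
    for i, count in enumerate(profile):
        if count == 0 and start is None:
            start = i
        elif count != 0 and start is not None:
            gaps.append((start, i - 1))
            start = None
    if start is not None:
        gaps.append((start, len(profile) - 1))
    return gaps


# Two text columns separated by a two-pixel whitespace gutter.
clean = [
    [1, 1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1, 1],
]
# The same page with a hypothetical black border along the top edge.
noisy = [[1] * 6] + clean

print(whitespace_gaps(column_profile(clean)))  # gutter found
print(whitespace_gaps(column_profile(noisy)))  # gutter hidden by the border
```

With the border present, no column has a zero count, so a cut criterion based on strictly empty columns finds nothing to split on.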

Stamatopoulos, N.[Nikolaos], Gatos, B.[Basilis], Perantonis, S.J.[Stavros J.],
A method for combining complementary techniques for document image segmentation,
PR(42), No. 12, December 2009, pp. 3158-3168.
Elsevier DOI 0909
Document image segmentation; Combination method; Document image analysis; Segmentation BibRef


Liu, M.Y.[Meng-Yang], Li, C.S.[Chong-Shou], Zhu, W.B.[Wen-Bin], Lim, A.[Andrew],
A Morphology-Based Border Noise Removal Method for Camera-Captured Label Images,
CBDAR13(126-138).
Springer DOI 1404
BibRef

Deryagin, D.,
Unified Performance Evaluation for OCR Zoning: Calculating Page Segmentation's Score, That Includes Text Zones, Tables and Non-text Objects,
ICDAR13(953-957)
IEEE DOI 1312
image segmentation BibRef

Lebourgeois, F., Drira, F., Gaceb, D., Duong, J.,
Fast Integral MeanShift: Application to Color Segmentation of Document Images,
ICDAR13(52-56)
IEEE DOI 1312
computational complexity BibRef

Antonacopoulos, A.[Apostolos], Pletschacher, S.[Stefan], Bridson, D.[David], Papadopoulos, C.[Christos],
ICDAR 2009 Page Segmentation Competition,
ICDAR09(1370-1374).
IEEE DOI 0907
BibRef

Antonacopoulos, A., Gatos, B., Bridson, D.,
Page Segmentation Competition,
ICDAR07(1279-1283).
IEEE DOI 0709
BibRef
Earlier:
ICDAR2005 page segmentation competition,
ICDAR05(I: 75-79).
IEEE DOI 0508
BibRef
Earlier:
ICDAR 2003 page segmentation competition,
ICDAR03(688-692).
IEEE DOI 0311
BibRef

Peng, L.R.[Liang-Rui], Chen, M.[Ming], Liu, C.S.[Chang-Song], Ding, X.Q.[Xiao-Qing], Zheng, J.R.[Ji-Rong],
An automatic performance evaluation method for document page segmentation,
ICDAR01(134-137).
IEEE DOI 0109
BibRef

Fumera, G., Pillai, I., Roli, F.,
Classification with reject option in text categorisation systems,
CIAP03(582-587).
IEEE DOI 0310
BibRef

Ma, H.F.[Huan-Feng], Doermann, D.S.,
Gabor filter based multi-class classifier for scanned document images,
ICDAR03(968-972).
IEEE DOI 0311
BibRef

Allier, B.[Bénédicte], Emptoz, H.[Hubert],
Type extraction and character prototyping using gabor filters,
ICDAR03(799-803).
IEEE DOI 0311
BibRef
And:
Character prototyping in document images using Gabor filters,
ICIP03(I: 537-540).
IEEE Abstract. 0312
BibRef
And: SCIA03(28-35).
Springer DOI 0310
BibRef

Laurence, D.[Duffy], Le Bourgeois, F.[Frank], Emptoz, H.[Hubert],
Logical structure analysis by typographic characteristics extraction,
CIAP97(II: 639-646).
Springer DOI 9709
BibRef

Allier, B., Duong, J., Gagneux, A., Mallet, P., Emptoz, H.,
Texture feature characterization for logical pre-labeling,
ICDAR03(567-571).
IEEE DOI 0311
BibRef

Liu, L.J.[Li-Jie], Dong, Y.[Yan], Song, X.M.[Xiao-Mu], Fan, G.L.[Guo-Liang],
An entropy-based segmentation algorithm for computer-generated document images,
ICIP03(I: 541-544).
IEEE Abstract. 0312
BibRef

Leedham, G., Yan, C.[Chen], Takru, K., Tan, J.H.N.[Joie Hadi Nata], Mian, L.[Li],
Comparison of some thresholding algorithms for text/background segmentation in difficult document images,
ICDAR03(859-864).
IEEE DOI 0311
BibRef

Leedham, G., Varma, S., Patankar, A., Govindaraju, V.,
Separating text and background in degraded document images: A comparison of global thresholding techniques for multi-stage thresholding,
FHR02(244-249).
IEEE Top Reference. 0209
BibRef

Kise, K., Miki, Y., Matsumoto, K.,
Stippling data on backgrounds of pages-toward seamless integration of paper and electronic documents,
ICDAR03(1213-1217).
IEEE DOI 0311
BibRef

Kise, K., Yanagida, O., Takamatsu, S.,
Page Segmentation Based on Thinning of Background,
ICPR96(III: 788-792).
IEEE DOI 9608
(Osaka Prefecture Univ., J) BibRef

Kise, K., Yamaoka, M., Babaguchi, N., Tezuka, Y.,
Model based system for analyzing document images,
ICPR92(II:647-650).
IEEE DOI 9208
BibRef

Suvichakorn, A.[Aimamorn], Watcharabusaracum, S.[Sarin], Sinthupinyo, W.[Wasin],
Simple Layout Segmentation of Gray-Scale Document Images,
DAS02(245 ff.).
Springer DOI 0303
BibRef

Caillault, E., Viard-Gaudin, C., Ahmad, A.R.,
MS-TDNN with global discriminant trainings,
ICDAR05(II: 856-860).
IEEE DOI 0508
NN HMM. BibRef

Golenzer, J., Viard-Gaudin, C., Lallican, P.M.,
Finding regions of interest in document images by planar HMM,
ICPR02(III: 415-418).
IEEE DOI 0211
BibRef

Sivaramakrishnan, R., Phillips, I.T., Ha, J., Subramanium, S., Haralick, R.M.,
Zone Classification in a Document Using the Method of Feature Vector Generation,
ICDAR95(541-544). Pixel based, multiple classes. BibRef 9500

Cheng, H.[Hui], Fan, Z.G.[Zhi-Gang],
Background identification based segmentation and multilayer tree representation of document images,
ICIP02(III: 1005-1008).
IEEE DOI 0210
BibRef

Blumenstein, M., Verma, B.,
Analysis of segmentation performance on the CEDAR benchmark database,
ICDAR01(1142-1146).
IEEE DOI 0109
BibRef

Yang, Y.D.[Yu-Dong], Zhang, H.J.[Hong-Jiang],
HTML page analysis based on visual cues,
ICDAR01(859-864).
IEEE DOI 0109
BibRef

Mukherjee, D.P.[Dipti Prasad], Acton, S.T.[Scott T.],
Document Page Segmentation using Multiscale Clustering,
ICIP99(I:234-238).
IEEE DOI BibRef 9900

He, S., Abe, N.,
A Clustering-Based Approach to the Separation of Text Strings from Mixed Text/Graphics Documents,
ICPR96(III: 706-710).
IEEE DOI 9608
(National Univers. of Singapore, SGP) BibRef

Randen, T.[Trygve], Husøy, J.H.[John Håkon],
Segmentation of text/image documents using texture approaches,
Proc. NOBIM-konferansen-94, Asker (Norway), June 1994, pp. 60-67.
HTML Version. BibRef 9406

Fischer, S., Amin, A., Drivas, D.,
Segmentation of the Yellow Pages,
ICDAR95(605-609). BibRef 9500

Randriamasy, S., Vincent, L.,
Benchmarking Page Segmentation Algorithms,
CVPR94(411-416).
IEEE DOI BibRef 9400

Higashino, J., Fujisawa, H., Nakano, Y., Ejiri, M.,
A Knowledge-Based Segmentation Method for Document Understanding,
ICPR86(745-748). Top-down layout analysis using FDL. BibRef 8600

Makino, H.,
Representation and Segmentation of Document Images,
CVPR83(291-295). BibRef 8300



Last update: Mar 16, 2024 at 20:36:19