
Volume 14, Issue 3, Pages 449-470 (June 2010)



Automatic detection of informative frames from wireless capsule endoscopy images

M.K. Bashar (a, corresponding author), T. Kitasaka (a, d), Y. Suenaga (a, d), Y. Mekada (a, c), K. Mori (b, c)

Received 25 December 2007; received in revised form 12 September 2009; accepted 2 December 2009; published online 4 January 2010.

Abstract 

Wireless capsule endoscopy (WCE) is a new clinical technology permitting visualization of the small bowel, the most difficult segment of the digestive tract. The major drawback of this technology is the excessive amount of time required for video diagnosis. We therefore propose a method for generating shorter videos by detecting informative frames in the original WCE videos. The method isolates useless frames that are highly contaminated by turbid fluids, faecal materials and/or residual foods. These materials and fluids appear in a wide range of colors, from brown to yellow, and/or have bubble-like texture patterns. The detection scheme therefore consists of two steps: isolating (Step-1) highly contaminated non-bubbled (HCN) frames and (Step-2) significantly bubbled (SB) frames. Two color representations, viz., local color moments in Ohta space and the HSV color histogram, are evaluated for characterizing HCN frames, which are isolated by a support vector machine (SVM) classifier in Step-1. The remaining frames go to Step-2, where a Gauss-Laguerre transform (GLT) based multiresolution texture feature is used to characterize the bubble structures in WCE frames. GLT uses Laguerre-Gauss circular harmonic functions (LG-CHFs) to decompose WCE images into multiresolution components. An automatic segmentation method is designed to extract bubbled regions from grayscale versions of the color images based on the local absolute energies of their CHF responses. The final informative frames are detected by applying a threshold to the segmented regions. An automatic procedure for selecting features by analyzing the consistency of the energy-contrast map is also proposed. Three experiments were conducted to validate the proposed method: two use 14,841 and 37,100 frames from three videos, and the third uses 66,582 frames from six videos.
In the final experiment, the two combinations of the proposed color and texture features achieved average detection accuracies of 86.42% and 84.45%, compared with 78.18% and 76.29% for the same color features followed by conventional Gabor-based texture features and 65.43% and 63.83% for the same color features followed by discrete wavelet-based texture features. Although intra-video training–testing cases are typical choices for supervised classification in Step-1, combining a suitable number of training sets using a subset of the input videos was shown to be possible. This mixing not only reduced computation costs but also produced better detection accuracies by minimizing visual-selection errors, especially when processing large numbers of WCE videos.
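
As an illustration of the Step-1 color description, the following is a minimal sketch of local color moments in Ohta space. The Ohta transform (I1 = (R+G+B)/3, I2 = (R-B)/2, I3 = (2G-R-B)/4) is the standard definition. Everything else is an assumption made for illustration and is not taken from the abstract: the function names, the 2x2 block grid, the choice of three moments per channel, and the ordering of the feature vector.

```python
import numpy as np


def rgb_to_ohta(rgb):
    """Convert an (H, W, 3) RGB image to the Ohta channels I1, I2, I3."""
    rgb = rgb.astype(np.float64)  # avoid uint8 overflow in the sums below
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    i1 = (r + g + b) / 3.0
    i2 = (r - b) / 2.0
    i3 = (2.0 * g - r - b) / 4.0
    return np.stack([i1, i2, i3], axis=-1)


def local_color_moments(rgb, grid=(2, 2)):
    """Concatenate per-block, per-channel color moments in Ohta space.

    For each block of the (assumed) grid, the mean, the standard deviation,
    and the cube root of the third central moment are computed per channel.
    The result has length rows * cols * 9.
    """
    ohta = rgb_to_ohta(rgb)
    h, w, _ = ohta.shape
    rows, cols = grid
    feats = []
    for by in range(rows):
        for bx in range(cols):
            block = ohta[by * h // rows:(by + 1) * h // rows,
                         bx * w // cols:(bx + 1) * w // cols]
            pixels = block.reshape(-1, 3)
            mean = pixels.mean(axis=0)
            std = pixels.std(axis=0)
            skew = np.cbrt(((pixels - mean) ** 3).mean(axis=0))
            feats.extend([mean, std, skew])
    return np.concatenate(feats)
```

A feature vector of this kind could then be passed to an SVM classifier to separate HCN frames from the rest; the classifier settings and the actual block layout used by the authors are not given in the abstract.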

a MEXT Innovation Center for Preventive Medical Engineering, Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8603, Japan

b Graduate School of Information Science, Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8603, Japan

c School of Information Science and Technology, Chukyo University, Japan

d Faculty of Information Science, Aichi Institute of Technology, Yakusa-cho, Toyota 470-0392, Japan

Corresponding author. Tel.: +81 52 789 5688; fax: +81 52 789 3815.

PII: S1361-8415(09)00145-5

doi:10.1016/j.media.2009.12.001

