
Nagamatsu User Calibration Free Gaze Tracking With Estimation Of The Horizontal Angles Between The Visual And The Optical Axes Of Both Eyes

This paper presents a user-calibration-free method for estimating the point of gaze (POG) on a display accurately with estimation of the horizontal angles between the visual and the optical axes of both eyes. By using one pair of cameras and two light sources, the optical axis of the eye can be estimated. This estimation is carried out by using a spherical model of the cornea. The point of intersection of the optical axis of the eye with the display is termed POA. By detecting the POAs of both the eyes, the POG is approximately estimated as the midpoint of the line joining the POAs of both the eyes on the basis of the binocular eye model; therefore, we can estimate the horizontal angles between the visual and the optical axes of both the eyes without requiring user calibration. We have developed a prototype system based on this method using a 19-inch display with two pairs of stereo cameras. We evaluated the system experimentally with 20 subjects who were at a distance of 600 mm from the display. The result shows that the average of the root-mean-square error (RMSE) of measurement of POG in the display screen coordinate system is 16.55 mm (equivalent to less than 1.58°).
Published on: Mar 3, 2016
Source: www.slideshare.net


Transcripts - Nagamatsu User Calibration Free Gaze Tracking With Estimation Of The Horizontal Angles Between The Visual And The Optical Axes Of Both Eyes

User-calibration-free Gaze Tracking with Estimation of the Horizontal Angles between the Visual and the Optical Axes of Both Eyes

Takashi Nagamatsu∗, Ryuichi Sugano†, Yukina Iwamoto‡, Junzo Kamahara§, Naoki Tanaka¶
Kobe University

∗ e-mail: nagamatu@kobe-u.ac.jp
† e-mail: 0667093w@stu.kobe-u.ac.jp
‡ e-mail: 0667286w@stu.kobe-u.ac.jp
§ e-mail: kamahara@maritime.kobe-u.ac.jp
¶ e-mail: ntanaka@maritime.kobe-u.ac.jp

Copyright © 2010 by the Association for Computing Machinery, Inc. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Permissions Dept, ACM Inc., fax +1 (212) 869-0481 or e-mail permissions@acm.org.
ETRA 2010, Austin, TX, March 22 – 24, 2010. © 2010 ACM 978-1-60558-994-7/10/0003 $10.00

CR Categories: H.5.2 [Information Interfaces and Presentation]: User Interfaces - Ergonomics; I.4.9 [Image Processing and Computer Vision]: Applications

Keywords: Gaze tracking, calibration-free, eye movement, eye model

1 Introduction

Gaze tracking technology can be used in the development of a human-machine interface [Duchowski 2007; Jacob 1991; Nagamatsu et al. 2007]. However, the user-calibration phase of current systems still demands some effort from the user, and it would therefore be desirable to eliminate it. Because they require user calibration, current gaze tracking systems find limited application; for example, it is difficult to build systems meant for the general public.

Several studies have attempted to reduce the user-calibration effort using a model-based approach. Figure 1 shows a typical eye model that is used in a model-based approach [Nagamatsu et al. 2008a; Shih and Liu 2004; Guestrin and Eizenman 2006]. In this model, an eye has two axes: one is the optical axis, which is the geometric center line of the eye, and the other is the visual axis, which is the line of sight connecting the fovea and the point of gaze (POG). In this model, we make the approximation that the two axes of the eye intersect at the center of the corneal curvature. The horizontal and vertical angles between the visual and the optical axes of the eye range from 3.5° to 7.5° (average of 5.5°) and from 0.25° to 3.0° (average of 1.0°), respectively [Osaka 1993].

Figure 1: Eye model. (Labels: rotation center, center of corneal curvature, cornea, pupil, pupil center, fovea, optical axis, visual axis, POG; points A, B, E, F.)

Shih and Liu [2004] and Guestrin and Eizenman [2007] presented methods for reconstructing the optical axis of the eyeball using stereo cameras without actually knowing the characteristics of an individual's eye, such as the shape and refractive index of the surface of the cornea. These methods also estimate the visual axis of the eye through a one-point user-calibration that requires a user to gaze at a single point. We improved these one-point user-calibration methods to precisely calculate the rotation of the eyeball on the basis of Listing's law [Nagamatsu et al. 2008a; Nagamatsu et al. 2008b].

Several previous systems claim to be user-calibration-free [Shih et al. 2000; Kohlbecher et al. 2008; Yamazoe et al. 2008]; however, these systems measure the geometric center line of the eye, i.e., the optical axis of the eye. Since no user calibration is carried out, the optical axis of the eye is used as an approximation of the visual axis of the eye; therefore, the accuracy of estimation of the visual axis of the eye is expected to be approximately 5°.

Recently, we proposed a novel method that estimates the horizontal angles between the visual and the optical axes of both eyes by using a binocular eye model [Nagamatsu et al. 2009]. This method can realize user-calibration-free gaze tracking for a computer display with a high degree of accuracy. While most previous studies measured the optical axis of only one eye, we measure the optical axes of both eyes. From the relation between the optical axes of both the eyes and the display that the user gazes at, we estimate the POG and the horizontal angles between the visual and the optical axes of both the eyes without requiring the user to intentionally gaze at a fiducial point before using the system. The previous study reported results obtained with only one subject.

In this paper, we describe our novel user-calibration-free gaze tracking method in detail, and present the experimental results of 20 adult subjects.
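To make the roughly 5° figure quoted in the introduction concrete, the following minimal sketch converts an angular offset between the visual and the optical axes into a displacement on the display. It assumes a flat display viewed head-on; the function name `offset_on_display_mm` and the 600 mm default distance are illustrative choices introduced here, not something taken from the paper.

```python
import math


def offset_on_display_mm(angle_deg: float, distance_mm: float = 600.0) -> float:
    """Displacement on a display, viewed head-on, caused by an angular offset.

    Assumption: simple pinhole-style geometry, displacement = distance * tan(angle).
    """
    return distance_mm * math.tan(math.radians(angle_deg))


if __name__ == "__main__":
    # Average horizontal and vertical angles reported in the introduction.
    for label, angle in (("horizontal", 5.5), ("vertical", 1.0)):
        print(f"{label}: {angle} deg -> {offset_on_display_mm(angle):.1f} mm")
```

Under these assumptions, the average horizontal angle of 5.5° corresponds to several tens of millimetres at 600 mm, which illustrates why using the optical axis alone as the gaze direction is coarse.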
2 User-calibration-free gaze tracking method

This section describes a mathematical model for user-calibration-free gaze tracking. This method can determine the offsets between the visual and the optical axes of both the eyes by using binocular optics.

2.1 Estimation of optical axis of eye

The optical axis of a single eye can be estimated by the methods described in [Nagamatsu et al. 2008b; Shih and Liu 2004; Guestrin and Eizenman 2007]. These methods require a minimum of two cameras and two point light sources.

2.2 Binocular eye model

It is known that the visual axis of the eye inclines toward the nose, away from the optical axis of the eye. Figure 2 shows a 3D model of the visual and the optical axes of both the eyes when a user gazes at a certain point on the display. The visual axes of both the eyes intersect on the display at the POG, whose position vector is X_POG. The point of intersection of the optical axis of the eye with the display is termed POA. X_POAL and X_POAR are the position vectors of the POAs of the left and the right eyes, respectively.

Figure 2: Binocular 3D eye model. (Labels: display, left eye, right eye, optical axis, visual axis, POA of left eye at X_POAL, POG at X_POG, POA of right eye at X_POAR.)

2.3 Estimation of point of gaze (POG)

As described in the introduction, the average horizontal and vertical angles between the visual and the optical axes of the eye are 5.5° and 1.0°, respectively. In this study, we focus on the estimation of the horizontal angle between the visual and the optical axes of the eye. We ignore the vertical angle because it is smaller than the horizontal angle and its average is only 1.0°.

Figure 3 shows a detailed model of the binocular eyes. We can estimate X_POAL and X_POAR by calculating the intersections between the optical axes of both the eyes and the display. We assume the human body to be symmetric about the sagittal plane, and therefore the horizontal angles between the visual and the optical axes of both the eyes to be similar. Therefore, the midpoint of the POAs can serve as a good approximation of the POG, since the horizontal offsets of the POAs from the POG cancel each other. X_POG is calculated as follows:

\[ X_{POG} = \frac{1}{2}\left(X_{POAL} + X_{POAR}\right). \qquad (1) \]

Figure 3: Binocular 3D eye model in detail. (Labels: display, X_POAL, X_POG, X_POAR, optical axis and visual axis of each eye; points A_L, A_R, B_L, B_R, E_L, E_R, F_L, F_R.)

Even though this concept is very simple, it is the most accurate theoretical solution that is currently available for realization of user-calibration-free gaze tracking.

Determining the POG such that the angles between the visual and the optical axes of both the eyes are equal may seem more accurate than estimating it by Equation 1; however, we expect the estimate given by Equation 1 to be stable and sufficiently accurate in practical use.

By using the calculated POG, the offsets between the visual and the optical axes of both the eyes can be calculated using the one-point calibration gaze estimation method [Nagamatsu et al. 2008b]. After these offsets have been estimated, the POG can be calculated from the data of one eye when the system is unable to detect the other eye, for example because of image processing problems. Thus, robust operation of the system is achieved.

3 Implementation

3.1 Development of prototype system

A prototype system was implemented, as shown in Figure 4. This system consists of four synchronized monochrome IEEE-1394 digital cameras (Firefly MV, Point Grey Research Inc.), three infrared light sources (LEDs), a 19-inch LCD, and a Windows-based PC (Windows XP, Intel Core 2 Quad). The software was developed using OpenCV 1.0 [Intel]. Each camera uses a 1/3-inch CMOS image sensor whose resolution is 752 × 480 pixels. A 50-mm lens and an IR filter are attached to each camera. In order to capture high-resolution images of the eyes, we used lenses with a narrow field of view. Thus, one pair of cameras was used for capturing the left eye, and another pair was used for the right eye. These cameras were positioned under the display. The intrinsic parameters of the cameras were determined before setting up the system. The reason for using three LEDs is to reduce the estimation error of the optical axis of the
eye caused by asphericity of the cornea, as described in Nagamatsu et al. [2008b].

Figure 4: Prototype system. (Labels: two LEDs, camera for left eye, camera for right eye.)

3.2 Calibration of extrinsic camera parameters

Once the cameras and lenses are selected, the intrinsic camera parameters do not change. However, the extrinsic camera parameters may change if the position of a camera changes.

Figure 5 shows the arrangement of the components when the extrinsic camera parameters are calibrated. A display, a camera-calibration board, and the cameras are set up on the table. The display and the camera-calibration board are parallel to each other and perpendicular to the table. The origin of the world coordinate system is O, which is on the table in the plane including the display plane. The x-axis is horizontal toward the left, and the y-axis is vertical downward. The z-axis is the direction of the cross product of x and y.

Figure 5: Arrangement of components. (Labels: 19-inch display, camera-calibration board with patterns L and R, LED0, LED1, LED2, cameras 0 to 3, origin O with axes x, y, z; dimensions shown: 205.5 mm, 442 mm, 100 mm, 353 mm, 123 mm, 600 mm.)

The extrinsic camera parameters of the four cameras are calibrated by using a camera-calibration board. The board has two checker patterns (13 × 9 grid points). Cameras 0 and 1 are for the right eye, and therefore they are directed toward the right-hand side checker pattern, indicated by R in Figure 5. Cameras 2 and 3 are for the left eye, and therefore they are directed toward the left-hand side checker pattern, indicated by L in Figure 5.

4 Experimental evaluation

4.1 Experiment and results

We evaluated the prototype system in a laboratory with 20 adult subjects (15 men and 5 women) who did not wear glasses or contact lenses. The ages of the subjects ranged from 21 to 40.

The proposed gaze estimation method allows the user to move; however, the current implementation cannot capture the user's eyes over a wide area. Therefore, in order to measure the performance of the gaze estimation method without error caused by head movement, the head was supported by a chin rest in this experiment to keep it within the focus range and field of view of the cameras. The eyes were approximately 600 mm from the display.

The subjects were asked to fixate on 25 points that appeared one by one on the display. Data were recorded when the optical axes of both eyes were detected. We recorded more than 10 data points while the subject gazed at each point.

Figure 6 shows an example of the experimental results. POGs and POAs of the left and right eyes are plotted in the display coordinate system. The black diamond-shaped points on the grid represent fiducial points that were intentionally gazed at by the subject. The triangle- and square-shaped points indicate the POAs of the left and right eyes, respectively. The black circular points indicate the estimated POGs. Each plotted POA represents the recorded data for one fiducial point; its position is the average of the recorded data. In cases in which the POA of either the left eye or the right eye deviated from the median value by more than 5° in terms of view angle, the data were removed as outliers. In such cases, it is assumed that the subject gazed at another target or that the image processing failed to detect the center of the pupil or the first Purkinje images.

Figure 6: Example of evaluation results for one subject. (Axis ticks: horizontal 0, 640, 1280; vertical 0, 512, 1024.)

The root-mean-square errors (RMSE) of the system evaluation are listed in Table 1; here, RMSE_x and RMSE_y denote the horizontal and vertical RMSE of measurement in the display screen coordinate system, respectively. RMSE is given by

\[ \mathrm{RMSE} = \sqrt{\mathrm{RMSE}_x^2 + \mathrm{RMSE}_y^2}. \]

The values in the table are the averages of these values over the 25 points for each subject. The bottom row gives the average over all the subjects for each column.

4.2 Discussion

As shown in Table 1, the averages of RMSE_x and RMSE_y for the 20 subjects are 8.40 mm and 12.92 mm, respectively. These are equivalent to less than 0.80° and 1.23° in terms of the view angle for a
distance of 600 mm from the eye, respectively.

Therefore, the horizontal error is improved as compared to previous user-calibration-free methods that approximate the visual axis of the eye by the optical axis of the eye. On the other hand, the vertical error is as expected, and is similar to the average value for humans reported in previous literature [Osaka 1993].

For the system as a whole, the average of the RMSE for the 20 subjects is 16.55 mm (∼1.58°). Therefore, the system is one of the most accurate user-calibration-free remote gaze tracking systems at the moment.

Table 1: Root-mean-square error (RMSE) of the gaze tracking error for the 20 subjects.

Subject   Ave. of RMSE_x        Ave. of RMSE_y        Ave. of RMSE
          for 25 points         for 25 points         for 25 points
1          5.43 mm (0.52°)       5.72 mm (0.55°)       8.42 mm (0.80°)
2         12.72 mm (1.21°)      12.09 mm (1.15°)      18.19 mm (1.74°)
3          7.96 mm (0.76°)       7.10 mm (0.68°)      11.11 mm (1.06°)
4          8.13 mm (0.78°)      12.37 mm (1.18°)      15.33 mm (1.46°)
5         12.49 mm (1.19°)      16.09 mm (1.54°)      20.81 mm (1.99°)
6         17.95 mm (1.71°)       8.89 mm (0.85°)      21.58 mm (2.06°)
7          5.24 mm (0.50°)       6.58 mm (0.63°)       8.71 mm (0.83°)
8          4.05 mm (0.39°)       7.61 mm (0.73°)       8.97 mm (0.86°)
9          6.65 mm (0.64°)      18.16 mm (1.73°)      19.84 mm (1.89°)
10         7.79 mm (0.74°)      14.47 mm (1.38°)      17.31 mm (1.65°)
11         5.68 mm (0.54°)       3.81 mm (0.36°)       7.00 mm (0.67°)
12         3.71 mm (0.35°)       5.38 mm (0.51°)       6.68 mm (0.64°)
13        19.39 mm (1.85°)      12.83 mm (1.22°)      23.86 mm (2.28°)
14        12.97 mm (1.24°)       8.78 mm (0.84°)      16.39 mm (1.56°)
15         8.32 mm (0.79°)      32.11 mm (3.06°)      33.67 mm (3.21°)
16         5.52 mm (0.53°)      26.19 mm (2.50°)      26.86 mm (2.56°)
17         5.61 mm (0.54°)      15.01 mm (1.43°)      16.27 mm (1.55°)
18         5.62 mm (0.54°)       6.92 mm (0.66°)       9.28 mm (0.89°)
19         5.01 mm (0.48°)      20.14 mm (1.92°)      20.86 mm (1.99°)
20         7.66 mm (0.73°)      18.09 mm (1.73°)      19.88 mm (1.90°)
Ave.       8.40 mm (0.80°)      12.92 mm (1.23°)      16.55 mm (1.58°)

5 Conclusion

User-calibration-free gaze tracking using a binocular eye model was described, which is the most accurate theoretical solution that is currently available for realization of user-calibration-free gaze tracking. The proposed system uses two pairs of stereo cameras. One pair is used to estimate the optical axis of the left eye, and the other pair that of the right eye. The POG is estimated as the midpoint of the line joining the POAs of both the eyes. From the POG, we can calculate the offsets between the visual and the optical axes of both the eyes.

We developed a prototype system and evaluated it experimentally with 20 subjects. The results show that the average RMSE_x, RMSE_y, and RMSE for the 20 subjects are 8.40 mm (∼0.80°), 12.92 mm (∼1.23°), and 16.55 mm (∼1.58°), respectively. The horizontal error is improved by using the binocular eye model as compared to previous user-calibration-free methods that approximate the visual axis of the eye by the optical axis of the eye.

Future work includes estimating the vertical angle between the visual and the optical axes of the eye, enlarging the area in which the head can move during gaze estimation, and applying the system in real-life situations.

Acknowledgements

This research was partially supported by the Japan Science and Technology Agency, Research for Promoting Technological Seeds, 2008.

References

DUCHOWSKI, A. T. 2007. Eye Tracking Methodology: Theory and Practice, 2nd ed. Springer-Verlag.

GUESTRIN, E. D., AND EIZENMAN, M. 2006. General theory of remote gaze estimation using the pupil center and corneal reflections. IEEE Transactions on Biomedical Engineering 53, 6, 1124–1133.

GUESTRIN, E. D., AND EIZENMAN, M. 2007. Remote point-of-gaze estimation with free head movements requiring a single-point calibration. In Proceedings of the 29th Annual International Conference of the IEEE EMBS, 4556–4560.

INTEL. Open source computer vision library. http://sourceforge.net/projects/opencvlibrary/.

JACOB, R. J. K. 1991. The use of eye movements in human-computer interaction techniques: what you look at is what you get. ACM Transactions on Information Systems 9, 2, 152–169.

KOHLBECHER, S., BARDINST, S., BARTL, K., SCHNEIDER, E., POITSCHKE, T., AND ABLASSMEIER, M. 2008. Calibration-free eye tracking by reconstruction of the pupil ellipse in 3D space. In Proceedings of the 2008 Symposium on Eye Tracking Research & Applications, 135–138.

NAGAMATSU, T., KAIEDA, Y., KAMAHARA, J., AND SHIMADA, H. 2007. Development of a skill acquisition support system using expert's eye movement. In Proceedings of HCI International 2007, vol. 9, 430–439.

NAGAMATSU, T., KAMAHARA, J., IKO, T., AND TANAKA, N. 2008. One-point calibration gaze tracking based on eyeball kinematics using stereo cameras. In Proceedings of the 2008 Symposium on Eye Tracking Research & Applications, 95–98.

NAGAMATSU, T., KAMAHARA, J., AND TANAKA, N. 2008. 3D gaze tracking with easy calibration using stereo cameras for robot and human communication. In Proceedings of IEEE RO-MAN 2008, 59–64.

NAGAMATSU, T., KAMAHARA, J., AND TANAKA, N. 2009. Calibration-free gaze tracking using a binocular 3D eye model. In Proceedings of the 27th International Conference Extended Abstracts on Human Factors in Computing Systems, 3613–3618.

OSAKA, R. 1993. Experimental Psychology of Eye Movements (in Japanese). The University of Nagoya Press, Nagoya, Japan.

SHIH, S.-W., AND LIU, J. 2004. A novel approach to 3-D gaze tracking using stereo cameras. IEEE Transactions on Systems, Man, and Cybernetics, Part B 34, 1, 234–245.

SHIH, S.-W., WU, Y.-T., AND LIU, J. 2000. A calibration-free gaze tracking technique. In Proceedings of International Conference on Pattern Recognition, vol. 4, 201–204.

YAMAZOE, H., UTSUMI, A., YONEZAWA, T., AND ABE, S. 2008. Remote gaze estimation with a single camera based on facial-feature tracking without special calibration actions. In Proceedings of the 2008 Symposium on Eye Tracking Research & Applications, 245–250.
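The pipeline described in Sections 2.2, 2.3 and 4.1 can be sketched as follows: intersect each optical axis with the display plane to obtain a POA, take the midpoint of the two POAs as the POG (Equation 1), and score the estimates with RMSE. This is a hedged illustration rather than the authors' implementation. It assumes NumPy, and the function names, the display plane at z = 0, the example eye positions and the arctangent-based millimetre-to-degree conversion are all assumptions introduced here.

```python
import math

import numpy as np


def intersect_display(origin, direction, plane_point, plane_normal):
    """Point where a ray (an optical axis) meets the display plane (a POA)."""
    origin = np.asarray(origin, dtype=float)
    direction = np.asarray(direction, dtype=float)
    plane_point = np.asarray(plane_point, dtype=float)
    plane_normal = np.asarray(plane_normal, dtype=float)
    denom = float(np.dot(plane_normal, direction))
    if abs(denom) < 1e-12:
        raise ValueError("optical axis is parallel to the display plane")
    t = float(np.dot(plane_normal, plane_point - origin)) / denom
    return origin + t * direction


def estimate_pog(poa_left, poa_right):
    """Equation 1: approximate the POG by the midpoint of the two POAs."""
    left = np.asarray(poa_left, dtype=float)
    right = np.asarray(poa_right, dtype=float)
    return 0.5 * (left + right)


def rmse_components(estimates_xy, target_xy):
    """Return (RMSE_x, RMSE_y, RMSE) of 2D estimates around one fiducial point."""
    err = np.asarray(estimates_xy, dtype=float) - np.asarray(target_xy, dtype=float)
    rmse_x = float(np.sqrt(np.mean(err[:, 0] ** 2)))
    rmse_y = float(np.sqrt(np.mean(err[:, 1] ** 2)))
    return rmse_x, rmse_y, math.hypot(rmse_x, rmse_y)


def mm_to_deg(error_mm, distance_mm=600.0):
    """Assumed conversion of an on-screen error to a view angle."""
    return math.degrees(math.atan2(error_mm, distance_mm))


if __name__ == "__main__":
    # Hypothetical eyes 600 mm in front of a display lying in the plane z = 0.
    normal, on_plane = (0.0, 0.0, 1.0), (0.0, 0.0, 0.0)
    poa_l = intersect_display((-32.0, 0.0, 600.0), (-0.04, 0.0, -1.0), on_plane, normal)
    poa_r = intersect_display((32.0, 0.0, 600.0), (0.04, 0.0, -1.0), on_plane, normal)
    print("POG estimate:", estimate_pog(poa_l, poa_r))
    print("16.55 mm at 600 mm is about", round(mm_to_deg(16.55), 2), "deg")
```

With the 600 mm viewing distance, `mm_to_deg(16.55)` rounds to 1.58, in line with the figure quoted in the paper, although the paper does not state which conversion it used. Table 1 averages per-point values over 25 points, so combining the averaged RMSE_x and RMSE_y columns with `math.hypot` does not reproduce the averaged RMSE column exactly.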
