### 443

```MVA2000
IAPR Workshop on Machine Vision Applications, Nov. 28-30, 2000, The University of Tokyo, Japan
12-1
Self-Calibration from Optical Flow and Its Reliability Evaluation
Kenichi Kanatani*
Department of Computer Science, Gunma University
Abstract
An algorithm is presented for 3-D reconstruction from
optical flow observed by an uncalibrated camera. We
show that by incorporating a statistical model of image noise, we can not only compute a statistically optimal shape but also evaluate its reliability in quantitative
terms. We show real-image experiments and discuss the
effect of the "gauge" on the uncertainty description.
1. Introduction
3-D reconstruction from optical flow has been
studied by many researchers [4, 5, 13], but most have
assumed that the camera is calibrated. Recently, the
self-calibration approach using an uncalibrated camera was formulated by Viéville et al. [16] and Brooks
et al. [2]. The self-calibration procedure consists of
the following steps:
1. We detect optical flow from an image sequence.
2. We compute the flow fundamental matrices
from the detected flow.
3. We decompose the computed flow fundamental matrices into the motion parameters.
4. We compute the 3-D shape of the scene.
In this paper, we show that by incorporating a statistical model of image noise, we can not only compute
a statistically optimal shape but also evaluate its reliability in quantitative terms. We show real-image
experiments and discuss the effect of the gauge on
the uncertainty description.
2. Optical Flow Detection
The conventional method for optical flow detection is based on what is known as the gradient constraint [11, 12]. However, the resulting flow does
not have sufficient accuracy for 3-D reconstruction.
Here, we assume that a limited number of salient
feature points are traced by template matching and
other means with high accuracy.
3. Fundamental Matrices
Let $\{(x_\alpha, y_\alpha)\}$ and $\{(x'_\alpha, y'_\alpha)\}$, $\alpha = 1, ..., N$, be image coordinates of two sets of points on two different images. We define the "flow" $\dot{\mathbf{x}}_\alpha$ and the "midpoint" $\mathbf{x}_\alpha$ of the $\alpha$th point as
$\dot{\mathbf{x}}_\alpha = \left( (x'_\alpha - x_\alpha)/f_0,\ (y'_\alpha - y_\alpha)/f_0,\ 0 \right)^\top$, $\mathbf{x}_\alpha = \left( (x_\alpha + x'_\alpha)/(2f_0),\ (y_\alpha + y'_\alpha)/(2f_0),\ 1 \right)^\top$,   (1)
where $f_0$ is an appropriate scale factor (e.g., the image size). If noise does not exist, the following epipolar equation is satisfied [2, 5, 6, 13, 16] (throughout this paper, the inner product of the vectors $\mathbf{a}$ and $\mathbf{b}$ is denoted by $(\mathbf{a}, \mathbf{b})$):
$(\dot{\mathbf{x}}_\alpha, W\mathbf{x}_\alpha) + (\mathbf{x}_\alpha, C\mathbf{x}_\alpha) = 0$.   (2)
Here, $W$ is an antisymmetric matrix, and $C$ is a symmetric matrix. They play the same role as the fundamental matrix for finite motion images, so we call them the flow fundamental matrices.
The matrices $W$ and $C$ are not independent of each other. The following relationship holds [2]:
$\mathbf{w} = (W_{32}, W_{13}, W_{21})^\top$, $(\mathbf{w}, C\mathbf{w}) = 0$.   (3)
We call this the decomposability condition (see footnote 1).
From $\{\dot{\mathbf{x}}_\alpha, \mathbf{x}_\alpha\}$, $\alpha = 1, ..., N$, the flow fundamental matrices $W$ and $C$ are computed by a technique called renormalization [6, 9]. The program is implemented in C++ and is publicly available (see footnote 2). It outputs the estimates $\hat{W}$ and $\hat{C}$ of the flow fundamental matrices along with their standard deviations $W^{(+)}$, $W^{(-)}$, $C^{(+)}$, and $C^{(-)}$. If, say, $W^{(+)}$ and $W^{(-)}$ coincide up to three significant digits, the estimate $\hat{W}$ is likely to have accuracy up to approximately three significant digits.
* E-mail: kanatani@cs.gunma-u.ac.jp
1 This corresponds to the constraint that the fundamental matrix for finite motion should have rank 2.
2 http://www.ail.cs.gunma-u.ac.jp/~kanatani/e
4. Motion Parameters
We assume that the camera is freely moving and freely changing its focal length. Other camera parameters such as the principal point, the aspect ratio, and the skew angle, which usually do not change in the course of camera motion, are assumed to be calibrated beforehand. Hence, the unknown parameters are the translation velocity $\mathbf{v}$, the rotation velocity $\boldsymbol{\omega}$, the focal length $f$, and its change rate $\dot{f}$.
Brooks et al. [2] showed that these parameters can be computed analytically from $W = (W_{ij})$ and $C = (C_{ij})$, but their computation involves rather complicated algebraic manipulations. Here, we present an elegant procedure derived by expressing quantities in terms of irreducible representations [...].
Let $\mathbf{w} = (w_i)$ be the vector defined in eqs. (3), and define
$A = C_{11} + C_{22}$, $\tilde{B} = (C_{11} - C_{22}) + 2iC_{12}$, $\tilde{C} = 2(C_{13} + iC_{23})$, [...]   (4)
Here, $i$ is the imaginary unit. The quantities with tildes are complex numbers; $\Re[\cdot]$ and $\Im[\cdot]$ denote the real and imaginary parts, respectively. We define the "inner product" of complex numbers $z = x + $ [...]. The operation $N[\cdot]$ designates normalization into a unit vector: $N[\mathbf{a}] = \mathbf{a}/\|\mathbf{a}\|$.
[The expressions for $\boldsymbol{\omega}$, $f$, $\dot{f}$, and $\mathbf{v}$ that follow eq. (4) are not recoverable from the source.]
In the above procedure, $\omega_3$ is computed in two ways: by eq. [...] and by the first of eqs. (10). The decomposability condition (3) requires that the two values [...].
The above computation fails when the camera optical axis moves within the plane spanned by it and the translation velocity $\mathbf{v}$, e.g., when the camera undergoes a pure translation or the camera optical axis passes through a fixed point in the scene.
5. Correction of Flow
Let $V_0[\dot{\mathbf{x}}]$ and $V_0[\mathbf{x}]$ be the covariance matrices of $\dot{\mathbf{x}}$ and $\mathbf{x}$ [...]. They can be determined from the Hessian of the residual surface of template matching of gray levels [14, 15]. If no prior information is available, we may use the default values $V_0[\dot{\mathbf{x}}] = 2\,\mathrm{diag}(1, 1, 0)$ and [...], where $\mathrm{diag}(\cdots)$ denotes the diagonal matrix with the listed diagonal elements in that order.
In the presence of noise, the data $\dot{\mathbf{x}}$ and $\mathbf{x}$ may not necessarily satisfy eq. (2) for the computed flow fundamental matrices $\hat{C}$ and $\hat{W}$. So, we optimally correct $\dot{\mathbf{x}}$ and $\mathbf{x}$ to enforce eq. (2). This is done as follows:
[equation not recoverable from the source]
The covariance matrices of the resulting values $\hat{\dot{\mathbf{x}}}$ and $\hat{\mathbf{x}}$ are given as follows [6]:
[equation not recoverable from the source]
We assume that errors in $\dot{\mathbf{x}}$ and $\mathbf{x}$ are statistically independent, but the corrected values $\hat{\dot{\mathbf{x}}}$ and $\hat{\mathbf{x}}$ have the following correlation:
[equation not recoverable from the source]
After the focal length $f$ and its change rate $\dot{f}$ have been computed, we transform $\hat{\dot{\mathbf{x}}}$ and $\hat{\mathbf{x}}$ [...] their covariance matrices as follows (we define $P_{\mathbf{k}} = \mathrm{diag}(1, 1, 0)$):
[equations up to (18) not recoverable from the source]
Then, we can view the imaging geometry as if using a perspective camera with unit focal length.
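The depth $Z$ of the point $\mathbf{x}$ is given as follows [6]:
$Z = -\dfrac{(\mathbf{v}, S_{\mathbf{x}}\mathbf{v})}{(\mathbf{v}, S_{\mathbf{x}}(\dot{\mathbf{x}} + \boldsymbol{\omega} \times \mathbf{x}))}$.   (19)
Here, $S_{\mathbf{x}}$ is defined by eq. (20) [equation not recoverable from the source] and $\mathbf{k} = (0, 0, 1)^\top$. The 3-D position of this point is given by
$\mathbf{r} = Z\mathbf{x}$.   (21)
At this point, we need to check the sign of the depth. This is because the signs of $W$ and $C$ are indeterminate as implied by eq. (2). Let $Z_\alpha$ be the depth associated with $\mathbf{x}_\alpha$. We reverse the sign of each $Z_\alpha$ if $\sum_{\alpha=1}^{N} \mathrm{sgn}[Z_\alpha] < 0$, where $\mathrm{sgn}[\cdot]$ is the sign function that takes 1, 0, and -1 for $x > 0$, $x = 0$, and $x < 0$, respectively.
Figure 1: Real images of an indoor scene.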
8. Reliability Evaluation
From eq. (21), the covariance matrix of the reconstructed position $\hat{\mathbf{r}}$ is given up to scale as follows:
[Eq. (22) not recoverable from the source]
The matrix $V_0[\mathbf{x}]$ is given in eqs. (18). From eq. (19), the matrices $V_0[Z]$ and $V_0[Z, \mathbf{x}]$ are given as follows:
[Eqs. (23) not recoverable from the source]
Here, tr denotes trace, and we define
[Eq. (24) not recoverable from the source]
However, this analysis is based on the computed flow fundamental matrices $\hat{C}$ and $\hat{W}$. They are computed from the data $\{\dot{\mathbf{x}}_\alpha, \mathbf{x}_\alpha\}$, $\alpha = 1, ..., N$, and hence are not exact. It follows that the values $f$, $\dot{f}$, $\mathbf{v}$, and $\boldsymbol{\omega}$ are not exact. However, it is difficult to analyze the error propagation precisely. Here, we adopt the following approximation. We reconstruct two 3-D positions $\mathbf{r}^{(\pm)}$ for $\mathbf{x}$ from the standard deviations $C^{(\pm)}$ and $W^{(\pm)}$ and regard $(\mathbf{r}^{(+)} - \hat{\mathbf{r}})(\mathbf{r}^{(+)} - \hat{\mathbf{r}})^\top$ as the covariance matrix of $\hat{\mathbf{r}}$ due to the errors in $\hat{C}$ and $\hat{W}$. The total covariance matrix of $\hat{\mathbf{r}}$ is given by
[Eq. (25) not recoverable from the source]
where $\epsilon^2$ is the absolute noise magnitude, which can be estimated in the process of computing $\hat{C}$ and $\hat{W}$ [6, 9].
Figure 2: 3-D reconstruction and uncertainty ellipsoids (stereogram).
9. Real Image Experiment
We reconstructed the 3-D shape from the two images shown in Fig. 1, using the feature points marked
in the images. Fig. 2 is a side view of the reconstructed points (stereogram); wireframes are shown
for some points. On each reconstructed point is centered the uncertainty ellipsoid defined by the covariance matrix given by eq. (25). All ellipsoids look like
thin needles, indicating that the uncertainty is large
along the depth orientation.
This description is deceptive, however. This uncertainty description is based on a particular gauge,
i.e., a choice of normalization: the world coordinate system is identified with the camera frame and
the translation velocity is normalized to unit length
[8, 10]. This gauge hides the fact that the uncertainty is mostly due to that of the translation velocity. In fact, what is uncertain is the depth of the
object as a whole, not the object shape.
For example, if we take the centroid of the polyhedral object as the coordinate origin and normalize
the root-mean-square distance to the vertices from
the centroid to unit length, we obtain the description
shown in Fig. 3(a). By construction, the uncertainty
is almost symmetric with respect to the centroid,
and the object shape has very little uncertainty.
Fig. 3(b) is the uncertainty description for yet another gauge: one of the object vertices is taken to be
the coordinate origin, another is taken to be (1,1,0),
and a third one is on the XY plane. By definition,
the first two points have no uncertainty.
It follows that uncertainty of individual quantities has no absolute meaning. In other words, the discrepancy of the reconstructed quantities from their true values is not a meaningful measure of accuracy if artificial normalizations are involved.
Figure 3: [...] normalization based on three vertices.
Table 1: Reliability of gauge invariants.
              computed value   true value   predicted standard deviation
ratio         1.02             1.00         0.08
angle (deg)   95.1             90.0         17.0
Let us call the description changes due to choosing different gauges (i.e., normalizations) gauge
transformations. Absolute meaning can be given
only to gauge invariants [8], i.e., quantities invariant
to gauge transformations. Typical gauge invariants
for Euclidean reconstruction are ratios of lengths
and angles of lines. Table 1 lists the ratio of two
sides of the polyhedral object and the angle they
make, along with their true values and their predicted standard deviations derived from the covariance matrices
of the vertices.
Fig. 4 shows two real images of a car. Fig. 5
shows its 3-D shape computed from the feature
points marked in these images. We defined a wireframe with triangular meshes from the reconstructed
points and mapped the texture onto it. A fairly accurate 3-D shape is created even though only two
views are used.
10. Concluding Remarks
An algorithm has been presented for 3-D reconstruction from optical flow observed by an uncalibrated camera. We have shown that by incorporating a statistical model of image noise, we can
not only compute a statistically optimal shape but
also evaluate its reliability in quantitative terms, although the accuracy is not as high as that using
the fundamental matrix [1, 7]. We have shown real-image experiments and discussed the effect of the
gauge on the uncertainty description.
Acknowledgments: This work was in part supported by the Ministry of Education, Science, Sports and Culture, Japan under a Grant in Aid for Scientific Research C(2) (No. 11680377). The author thanks Mike Brooks of the University of Adelaide and his colleagues for collaboration in this research. He also thanks Naoya Ohta of Gunma University and Yoshiyuki Shimizu of Sharp Ltd. for their assistance in doing real image experiments.
Figure 4: Real images of a car.
Figure 5: Reconstructed 3-D shape.
References
[1] L. Baumela, L. Agapito, P. Bustos and I. Reid, Motion estimation using the differential epipolar equation, Proc. 15th Int. Conf. Patt. Recogn., September 2000, Barcelona, Spain, Vol. 3, pp. 848-851.
[2] M. J. Brooks, W. Chojnacki and L. Baumela, Determining the egomotion of an uncalibrated camera from instantaneous optical flow, J. Opt. Soc. Am., A, 14-10 (1997), 2670-2677.
[3] K. Kanatani, Group-Theoretical Methods in Image Understanding, Springer, Berlin, 1990.
[4] K. Kanatani, 3-D interpretation of optical flow by renormalization, Int. J. Comput. Vision, 11-3 (1993), 267-282.
[5] K. Kanatani, Geometric Computation for Machine Vision, Oxford University Press, Oxford, 1993.
[6] K. Kanatani, Statistical Optimization for Geometric Computation: Theory and Practice, Elsevier, Amsterdam, 1996.
[7] K. Kanatani, Gauge-based reliability analysis of 3-D reconstruction from two uncalibrated perspective views, Proc. 15th Int. Conf. Patt. Recogn., September 2000, Barcelona, Spain, Vol. 1, pp. 76-79.
[8] K. Kanatani and D. D. Morris, Gauges and gauge transformations in 3D reconstruction from a sequence of images, Proc. 4th Asian Conf. Computer Vision, January 2000, Taipei, Taiwan, pp. 1046-1051.
[9] K. Kanatani, Y. Shimizu, N. Ohta, M. J. Brooks, W. Chojnacki and A. van den Hengel, Fundamental matrix from optical flow: Optimal computation and reliability evaluation, J. Electronic Imaging, 9-2 (2000), 194-202.
[10] D. D. Morris, K. Kanatani and T. Kanade, Uncertainty modeling for optimal structure from motion, IEEE Workshop on Vision Algorithms: Theory and Practice, September 1999, Corfu, Greece, pp. 33-40.
[11] N. Ohta, Image movement detection with reliability indices, IEICE Trans., E74-10 (1991), 3379-3388.
[12] N. Ohta, Optical flow detection using a general noise model, IEICE Trans. Inf. & Syst., E79-D-7 (1996), 951-957.
[13] N. Ohta and K. Kanatani, Optimal structure-from-motion algorithm for optical flow, IEICE Trans. Inf. & Syst., E78-D-12 (1995), 1559-1566.
[14] J. Shi and C. Tomasi, Good features to track, Proc. IEEE Conf. Comput. Vision Patt. Recogn., June 1994, Seattle, WA, U.S.A., pp. 593-600.
[15] A. Singh, An estimation-theoretic framework for image-flow computation, Proc. 3rd Int. Conf. Comput. Vision, December 1990, Osaka, Japan, pp. 168-177.
[16] T. Viéville and O. D. Faugeras, The first order expansion of motion equations in the uncalibrated case, Comput. Vision Image Understanding, 64-1 (1996), 128-146.
```
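The gauge discussion in Section 9 of the paper above can be illustrated with a small numerical check. The following is a minimal sketch, not the author's program: the point coordinates, the function names (`ratio_and_angle`, `centroid_gauge`, `similarity`), and the particular similarity transformation are all hypothetical and chosen only for illustration. It assumes that a gauge transformation for Euclidean reconstruction acts as a similarity (scale, rotation, translation), and shows that the ratio of two side lengths and the angle between them stay the same while the coordinates themselves change.

```python
import math


def sub(a, b):
    """Component-wise difference of two 3-D points."""
    return [a[i] - b[i] for i in range(3)]


def dot(a, b):
    """Inner product (a, b) of two 3-D vectors."""
    return sum(a[i] * b[i] for i in range(3))


def norm(a):
    return math.sqrt(dot(a, a))


def ratio_and_angle(p0, p1, p2):
    """Ratio |p1 - p0| / |p2 - p0| and the angle (degrees) between the two sides at p0."""
    u, v = sub(p1, p0), sub(p2, p0)
    ratio = norm(u) / norm(v)
    cosine = dot(u, v) / (norm(u) * norm(v))
    angle = math.degrees(math.acos(max(-1.0, min(1.0, cosine))))
    return ratio, angle


def centroid_gauge(points):
    """Gauge of Fig. 3(a): origin at the centroid, RMS distance to the points scaled to 1."""
    n = len(points)
    g = [sum(p[i] for p in points) / n for i in range(3)]
    centered = [sub(p, g) for p in points]
    rms = math.sqrt(sum(dot(q, q) for q in centered) / n)
    return [[c / rms for c in q] for q in centered]


def similarity(points, s, theta, t):
    """Hypothetical gauge change: scale by s, rotate by theta about the Z axis, translate by t."""
    c, sn = math.cos(theta), math.sin(theta)
    return [
        [s * (c * x - sn * y) + t[0], s * (sn * x + c * y) + t[1], s * z + t[2]]
        for x, y, z in points
    ]


# Hypothetical reconstructed vertices (illustrative values only).
points = [[0.0, 0.0, 5.0], [1.0, 0.0, 5.2], [0.0, 1.1, 4.9], [1.0, 1.0, 6.0]]

before = ratio_and_angle(*points[:3])
in_centroid_gauge = ratio_and_angle(*centroid_gauge(points)[:3])
after_similarity = ratio_and_angle(*similarity(points, 2.5, 0.7, [1.0, -2.0, 3.0])[:3])

print(before, in_centroid_gauge, after_similarity)
```

The three printed pairs agree up to rounding, which is the sense in which ratios of lengths and angles are gauge invariants. The coordinates of any single vertex differ between the three descriptions, so their individual values (and uncertainties) depend on the chosen normalization.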