A Method to Interpret 3D Motions Using Neural Network

Akira WATANABE  Nobuyuki YAZAWA  Arata MIYAUCHI  Minami MIYAUCHI  

Publication
IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences   Vol.E77-A   No.8   pp.1363-1370
Publication Date: 1994/08/25
Print ISSN: 0916-8508
Type of Manuscript: Special Section PAPER (Special Section on Information Theory and Its Applications)
Keyword: 3D motion, optical flow, neural network, complex-BP, 3DV-BP

Summary: 
In computer vision, interpreting the 3D motion of an object in the physical world is an important task. This study proposes a 3D motion interpretation method that uses a system of three kinds of neural networks. The system estimates the 3D motion of an object by interpreting three optical flow (OF: the motion vector field calculated from images) patterns obtained from different viewpoints of the same object. First, an OF normalization network converts diverse OF patterns into a normalized OF format. Next, a 2D motion interpretation network interprets each normalized OF pattern to obtain the object's motion projected onto an image plane. Finally, a 3D motion interpretation network jointly interprets the three sets of projected motions and derives the object's 3D motion from them. A complex-valued version of the back-propagation algorithm (Complex-BP) is applied to the OF normalization network and the 2D motion interpretation network so that these networks can learn graphical patterns as complex numbers. A 3D vector version of the back-propagation algorithm (3DV-BP) is applied to the 3D motion interpretation network so that it can learn the spatial relationship between the object's 3D motion and the corresponding three OF patterns. Although the system is trained only on basic 3D motions consisting of a single motion component, it can interpret unknown 3D motions composed of several motion components. The generalization capacity of the proposed system was confirmed using diverse test patterns, and its robustness to noise was examined experimentally. The experimental results show that the method has features suitable for application to real images.
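
As a rough illustration of why complex numbers suit graphical patterns such as optical flow, the sketch below encodes each 2D flow vector (u, v) as the complex number u + iv and multiplies it by a single complex weight, which rotates and scales the vector in one operation. This is not the paper's network or the Complex-BP algorithm itself: the function names, the sample flow field, and the chosen weight are all hypothetical and serve only to show the encoding.

```python
import cmath
import math


def flow_to_complex(flow):
    """Encode 2D flow vectors (u, v) as complex numbers u + iv."""
    return [complex(u, v) for u, v in flow]


def apply_complex_weight(z_values, weight):
    """Multiply every flow vector by one complex weight.

    Multiplying by a complex number scales each vector by the weight's
    magnitude and rotates it by the weight's phase angle.
    """
    return [weight * z for z in z_values]


# Hypothetical flow field: three vectors pointing along the +x axis.
flow = [(1.0, 0.0), (2.0, 0.0), (0.5, 0.0)]
z = flow_to_complex(flow)

# Hypothetical weight: magnitude 2, phase 90 degrees.
w = cmath.rect(2.0, math.pi / 2)
rotated = apply_complex_weight(z, w)

for before, after in zip(z, rotated):
    print(f"{before} -> magnitude {abs(after):.2f}, "
          f"angle {math.degrees(cmath.phase(after)):.1f} deg")
```

Each input vector comes out twice as long and turned by 90 degrees, so a single complex weight expresses a geometric transformation that would need a coupled pair of real weights otherwise.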