Single Camera Vehicle Localization Using Feature Scale Tracklets

David WONG  Daisuke DEGUCHI  Ichiro IDE  Hiroshi MURASE  

Publication
IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences, Vol. E100-A, No. 2, pp. 702-713
Publication Date: 2017/02/01
Online ISSN: 1745-1337
Type of Manuscript: PAPER
Category: Vision
Keywords: ego-localization, monocular vision, feature scale



Summary: 
Advances in intelligent vehicle systems have led to modern automobiles being able to aid drivers with tasks such as lane following and automatic braking. Such automated driving tasks increasingly require reliable ego-localization. Although many sensors can be employed for this purpose, the use of a single camera remains one of the most appealing, but also one of the most challenging. GPS localization in urban environments may not be reliable enough for automated driving systems, and various combinations of range sensors and inertial navigation systems are often too complex and expensive for a consumer setup. Therefore, accurate localization with a single camera is a desirable goal. In this paper we propose a method for vehicle localization using images captured from a single vehicle-mounted camera and a pre-constructed database. Image feature points are extracted, but the calculation of camera poses is not required — instead we make use of the feature points' scale. For image feature-based localization methods, matching many features against candidate database images is time-consuming, and database sizes can become large. Therefore, we propose a method that constructs a database from pre-matched features of known scale stability. This limits the number of unused and incorrectly matched features, and allows the database scales to be recorded as "tracklets". These "feature scale tracklets" are used for fast image match voting based on scale comparison with corresponding query image features. This process reduces the number of image-to-image matching iterations that need to be performed while improving localization stability. We also present an analysis of the system's performance using a dataset with high-accuracy ground truth, and demonstrate robust vehicle positioning even in challenging lane change and real traffic situations.
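The scale-comparison voting described in the summary can be illustrated with a minimal sketch. The data structures and function below are hypothetical (the paper's actual database format and voting scheme are not specified here): each database feature carries a tracklet of (frame index, scale) observations, and a matched query feature votes for every database frame whose recorded scale agrees with the observed scale within a relative tolerance.

```python
def vote_position(tracklets, query_scales, tol=0.15):
    """Return the database frame index whose recorded feature scales best
    agree with the observed query feature scales (illustrative only).

    tracklets    : dict feature_id -> list of (frame_idx, scale) pairs
    query_scales : dict feature_id -> scale observed in the query image
    tol          : relative scale tolerance for casting a vote
    """
    votes = {}
    for fid, scale in query_scales.items():
        # Each matched query feature scans its tracklet and votes for
        # every frame where the stored scale is close to the observed one.
        for frame_idx, db_scale in tracklets.get(fid, []):
            if abs(db_scale - scale) <= tol * db_scale:
                votes[frame_idx] = votes.get(frame_idx, 0) + 1
    # The frame with the most votes is the localization hypothesis
    # (None if no feature matched anywhere).
    return max(votes, key=votes.get) if votes else None
```

Because each vote is a single scale comparison per tracklet entry, this avoids full image-to-image descriptor matching against every candidate database image, which is the speed-up the summary attributes to the tracklet representation.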