Design of Deep Networks for Pedestrian Detection
- Pedestrian detection is the task of drawing bounding boxes that tightly enclose the pedestrians in a given image. The mainstream approach has been to design good features that improve detection accuracy and to combine them with a boosting algorithm such as AdaBoost. However, since the major progress in image classification achieved with deep convolutional neural networks (DCNNs), DCNNs have been applied to various visual recognition problems, including pedestrian detection. In this dissertation, we present two novel DCNN architectures for pedestrian detection.
In the first architecture, we propose a guiding network that assists in training a pedestrian detection network. To limit the computational burden, we use a proposal-and-classification strategy: proposals are extracted with AdaBoost-based methods and then classified by a DCNN. The guiding network is adaptively appended to the pedestrian region of the last convolutional layer of the detection network. It helps the convolutional layers learn more discriminative features for pedestrians by focusing on pedestrian(-like) regions. Because the guiding network is used only during training, it does not affect inference speed. With the guiding network, our method achieves new state-of-the-art detection accuracy on the Caltech Pedestrian benchmark and is competitive with state-of-the-art methods on the INRIA and KITTI benchmarks.
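The proposal-and-classification flow and the training-only role of the guiding network can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the dissertation's implementation: `detect_two_stage`, `forward`, the `propose`/`classify` callables, the 0.5 threshold, and the dictionary output are all hypothetical names and choices introduced here.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels


def detect_two_stage(
    image,
    propose: Callable[[object], Iterable[Box]],
    classify: Callable[[object, Box], float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the source
) -> List[Tuple[Box, float]]:
    """Score every proposal with the classifier and keep the confident ones."""
    detections = []
    for box in propose(image):        # stage 1: cheap candidate boxes
        score = classify(image, box)  # stage 2: more expensive per-box score
        if score >= threshold:
            detections.append((box, score))
    return detections


def forward(
    features,
    detection_head: Callable,
    guiding_head: Callable = None,
    training: bool = False,
) -> Dict[str, object]:
    """Run the detection head always; run the guiding head only in training.

    Assumption: the guiding branch is an auxiliary head on shared features.
    Skipping it at inference is what keeps the inference cost unchanged.
    """
    outputs = {"detection": detection_head(features)}
    if training and guiding_head is not None:
        outputs["guide"] = guiding_head(features)
    return outputs
```

In this sketch the classifier is called once per proposal, so a cheaper or more selective proposer directly reduces the work done in the second stage.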
In the second architecture, we propose the Direct Multi-Scale Dual-Stream Network (DMSDSN), a pedestrian detection network that takes a full-size image as input and outputs pedestrians of multiple sizes. DMSDSN has three characteristics. 1) It detects pedestrians without proposal extraction or resampling ('Direct'). 2) It detects pedestrians of various sizes by splitting the network according to pedestrian scale ('Multi-Scale'). 3) It combines two types of features for detection by branching off from two different layers ('Dual-Stream'). DMSDSN is a single network trained end-to-end. Evaluated on the Caltech Pedestrian benchmark, it achieves detection accuracy competitive with state-of-the-art methods, and its detection speed is fast owing to the simplicity of the pipeline.
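The 'Multi-Scale' and 'Dual-Stream' ideas can be illustrated with a small Python sketch. Everything specific here is an assumption made for illustration: the abstract does not state how scales are divided or how the two feature types are combined, so the height ranges in `SCALE_BRANCHES`, the branch names, and the use of concatenation in `dual_stream_features` are hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical height ranges in pixels (lower bound inclusive, upper exclusive).
# The source does not give the actual scale boundaries.
SCALE_BRANCHES: List[Tuple[str, float, float]] = [
    ("small", 0.0, 80.0),
    ("medium", 80.0, 160.0),
    ("large", 160.0, float("inf")),
]


def branch_for_height(height_px: float) -> str:
    """Pick the scale-specific branch responsible for a pedestrian height."""
    for name, low, high in SCALE_BRANCHES:
        if low <= height_px < high:
            return name
    raise ValueError("height must be non-negative")


def dual_stream_features(
    shallow: Sequence[float], deep: Sequence[float]
) -> List[float]:
    """Combine features taken from two different layers.

    Assumption: plain concatenation stands in for whatever fusion the
    network actually uses.
    """
    return list(shallow) + list(deep)
```

Under these assumptions, each branch only has to handle a narrow range of pedestrian sizes, and its input carries information from both an earlier and a later layer.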