Collaborative Monocular SLAM with Crowdsourced Data

Jianzhu Huai, Grzegorz Jóźków, Charles Toth, Dorota A. Grejner‐Brzezinska

January 2018


Abstract

Targeted at operations without adequate global navigation satellite system signals, simultaneous localization and mapping (SLAM) has been widely applied in robotics and navigation. Using data crowdsourced by cameras, collaborative SLAM presents a more appealing solution than single-user SLAM in terms of mapping speed, localization accuracy, and map reuse. To bridge the gap of real-time collaborative SLAM using forward-looking cameras, this paper presents a framework with a client-server structure and the following attributes: (1) multiple users can localize within and extend a map merged from the maps of individual users; (2) the map size grows only when a new area is explored; (3) a robust stepwise pose graph optimization technique is used. These attributes are validated with the real-world KITTI benchmark and with datasets crowdsourced by smartphones. It is shown that even a server hosted on a consumer-grade computer can process messages arriving concurrently from several clients in real time and create compact and accurate maps.

Type

Journal article

Publication

Navigation

Problem

People often have difficulty navigating a place they have never visited before, such as a hospital. Since most people carry a mobile device equipped with cameras and inertial measurement units (IMUs), can the data crowdsourced from these devices be used collectively to keep users from getting lost?

Method

The crowdsourced visual and inertial data are used to create a map of the area, track the motion of a user, and ultimately provide the user with a planned path.

To create a collective map and estimate a user's location in it, a client-server architecture is designed. A client running on a user's device estimates the device's incremental motion from the sensor data, and the server constructs a map of keyframes and landmarks from the collective information sent by the clients. On the server, each client is served by its own handler, which processes the data from that client.
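The one-handler-per-client arrangement described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `KeyframeMessage`, `SharedMap`, `ClientHandler`, and `run_demo` are hypothetical, the message fields are assumed, and landmark IDs are assumed to be already consistent across clients, whereas a real system would have to establish those associations through place recognition and map merging.

```python
import queue
import threading
from dataclasses import dataclass, field


@dataclass
class KeyframeMessage:
    """Hypothetical message a client emits for each new keyframe."""
    client_id: int
    keyframe_id: int
    pose: tuple          # client's estimated pose, here a toy (x, y, yaw)
    landmark_ids: list   # IDs of landmarks observed in this keyframe


@dataclass
class SharedMap:
    """Server-side map of keyframes and landmarks shared by all handlers."""
    keyframes: dict = field(default_factory=dict)
    landmarks: set = field(default_factory=set)
    lock: object = field(default_factory=threading.Lock)

    def insert(self, msg):
        # The lock serializes concurrent updates from different handlers.
        with self.lock:
            self.keyframes[(msg.client_id, msg.keyframe_id)] = msg.pose
            # Set semantics: a landmark that is already mapped adds nothing,
            # so the map grows only when new landmarks are observed.
            self.landmarks.update(msg.landmark_ids)


class ClientHandler(threading.Thread):
    """One handler per client: drains that client's inbox into the map."""

    def __init__(self, shared_map):
        super().__init__()
        self.inbox = queue.Queue()
        self.shared_map = shared_map

    def run(self):
        while True:
            msg = self.inbox.get()
            if msg is None:  # sentinel: the client has disconnected
                break
            self.shared_map.insert(msg)


def run_demo():
    shared_map = SharedMap()
    handlers = {cid: ClientHandler(shared_map) for cid in (0, 1)}
    for handler in handlers.values():
        handler.start()
    # Two clients observe overlapping landmarks (IDs 2 and 3).
    handlers[0].inbox.put(KeyframeMessage(0, 0, (0.0, 0.0, 0.0), [1, 2, 3]))
    handlers[1].inbox.put(KeyframeMessage(1, 0, (5.0, 0.0, 0.0), [2, 3, 4]))
    for handler in handlers.values():
        handler.inbox.put(None)
        handler.join()
    return shared_map
```

Calling `run_demo()` returns a map holding one keyframe per client and the union of the observed landmarks. Giving each client its own queue and thread keeps a slow client from blocking the others, at the cost of needing a lock around the shared map.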

Key results

The architecture is realized in two prototypes based on the ORB-SLAM2 program: one processes monocular camera frames only, and the other processes monocular camera frames together with IMU data. Both prototypes were tested with multi-session data collected by mobile devices, demonstrating map reuse, robust loop closure, and accurate collaborative mapping in real time.

Jianzhu Huai

Associate researcher

My research interest is mobile mapping and robotic exploration.