Spline Fusion

Spline Fusion. This work focuses on the fusion of visual and inertial sensors for robot localization and mapping. In particular, we develop a continuous-time trajectory representation to address this problem. Continuous-time representations are useful when dealing with multiple unsynchronized devices, high-frame-rate sensors, and rolling-shutter cameras. We show that the proposed representation can be used online as well as in batch mode. It also facilitates calibration of the intrinsic parameters and relative poses of multiple sensors.
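As a rough illustration of the continuous-time idea, here is a minimal sketch of evaluating a uniform cubic B-spline at an arbitrary sensor timestamp. The 1-D control points and function names are made up for illustration; the actual system interpolates poses on SE(3) with a cumulative spline.

```python
import numpy as np

# Standard uniform cubic B-spline basis in matrix form.
B = (1.0 / 6.0) * np.array([
    [1,  4,  1, 0],
    [-3, 0,  3, 0],
    [3, -6,  3, 0],
    [-1, 3, -3, 1],
], dtype=float)

def spline_eval(ctrl, t, dt=1.0):
    """Evaluate a uniform cubic B-spline at time t given control points ctrl."""
    i = int(np.floor(t / dt))       # segment index
    u = t / dt - i                  # normalized time within the segment
    U = np.array([1.0, u, u**2, u**3])
    pts = ctrl[i:i + 4]             # four control points govern each segment
    return U @ B @ pts

ctrl = np.array([0.0, 1.0, 4.0, 9.0, 16.0, 25.0])  # toy control points
# Any timestamp in the valid range can be queried, so unsynchronized
# sensors or rolling-shutter row times need no special handling.
p = spline_eval(ctrl, 0.5)
```

Because the trajectory is a smooth function of time, each rolling-shutter row or IMU sample can be evaluated at its own timestamp.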



  • Toyota grant 33643/1/ECNS20952N: Accurate visual-inertial map estimation
  • Motorola grant 34525/1/EENS20681N: Mobile Perception
  • NSF CISE grant 34083/1/CCLS20795F: EAGER: Prototype Dense Motion Capture for Large-Scale Deformable-Scene Tracking


Front End Fusion (Incremental and Adaptive Dense Tracking and Mapping). This method reconstructs a 3-D scene via non-convex optimization, exploiting the highly parallel nature of the problem to compute predicted depth on a GPU. As frames are captured, the objective function is augmented with the new data. At any point in time only the two most recent frames are required for depth estimation, yet the result is at least as good as one computed from hundreds of stored images. This incremental technique is well suited to mobile computing because it maintains a constant memory footprint, and because the optimization is simply updated frame by frame, overall compute time is greatly reduced. In addition to its incremental nature, the optimization adapts to the detected environment to produce depth images at the highest resolution possible.
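The constant-memory idea can be sketched as follows. This is hypothetical Python, not the actual GPU implementation: the class name, the per-hypothesis cost accumulation, and the simple winner-take-all step are all illustrative stand-ins for the real non-convex optimizer.

```python
import numpy as np

class IncrementalDepthCost:
    """Accumulate photometric cost per depth hypothesis, one frame at a time."""

    def __init__(self, n_pixels, n_hypotheses):
        self.cost = np.zeros((n_pixels, n_hypotheses))  # running cost only
        self.count = 0

    def add_frame(self, residuals):
        """residuals: per-pixel, per-hypothesis photometric error of the
        newest frame against the reference; shape (n_pixels, n_hypotheses)."""
        self.cost += np.abs(residuals)  # augment the objective with new data
        self.count += 1                 # memory footprint stays constant

    def depth_index(self):
        """Winner-take-all depth hypothesis per pixel."""
        return np.argmin(self.cost, axis=1)
```

Only the running cost is stored, so folding in the hundredth frame costs the same memory as the second.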


  • CIA grant 33998/1/CCNS21068F: Focused Dense Tracking and Mapping
  • MITRE grant 35499/1/CCNS21211N: Multi-Sensor Monocular Dense 3D Reconstruction

Semantic Scene Understanding and Place Recognition

Semantic Scene Understanding and Place Recognition. The goal of this research is to infer spatial and semantic information from autonomous robot sensor data. We extend segmentation techniques to exploit temporal and 3D information jointly and to aid in unsupervised object discovery. Place recognition is an important, related component of any long-term mapping system because it enables two things: closing loops and building maps over disjoint time frames. Most contemporary approaches employ bag-of-words methods, in which image descriptors are quantized into visual words; the quantization is determined by a lengthy offline training process that typically creates a huge vocabulary, potentially biased toward the sort of images it was trained on. Our work is based on the notion that a single image is a poor representation of a place. By moving away from representing the world as a set of arbitrary images we gain two things: fewer images need to be retained, since each retained image represents a distinct place (crucial for long-term autonomy), and places become farther apart in image space, which improves place-recognition performance.
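The quantization step described above can be sketched minimally. The toy vocabulary, descriptors, and function names here are hypothetical, but the structure (nearest-word assignment, histogram, similarity) is the standard bag-of-words pipeline the text refers to.

```python
import numpy as np

def quantize(descriptors, vocabulary):
    """Assign each descriptor to the index of its nearest visual word."""
    d = np.linalg.norm(descriptors[:, None, :] - vocabulary[None, :, :], axis=2)
    return np.argmin(d, axis=1)

def bow_histogram(descriptors, vocabulary):
    """Represent an image as a normalized histogram of visual words."""
    words = quantize(descriptors, vocabulary)
    h = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return h / max(h.sum(), 1.0)

def similarity(h1, h2):
    """Cosine similarity between two word histograms."""
    return float(h1 @ h2 / (np.linalg.norm(h1) * np.linalg.norm(h2) + 1e-12))
```

Note that everything hinges on the offline-trained vocabulary; this is exactly the dependency our image-free representation seeks to avoid.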



  • Motorola grant 34525/1/EENS20681N: Mobile Perception

Scalable Multi-device SLAM and Augmented Reality

Scalable Multi-device SLAM and Augmented Reality. Ubiquitous and powerful modern cellphones offer an ideal platform for pervasive, large-scale augmented reality and scene understanding. We are developing a scalable system for building and sharing AR-capable mapped environments with heterogeneous cellphones. This system moves beyond privileged frame sub-mapping techniques and produces a unified representation that is both amenable to simultaneous asynchronous access from multiple devices and easily scales to maps of many kilometers.
We leverage our recent results in visual-inertial SLAM work to provide a fully relative framework (based on RSLAM) for sharing accurate maps created with a calibrated cellphone’s camera and integrated IMU. The relative formulation allows for seamless integration of shared maps into onboard AR processing. We demonstrate this capability on live cellphone video and IMU data, showing the possibility of crowdsourced, live AR from multiple devices.


  • NSF MRI grant 35118/2/IXXS20968N: Dense Scene Capture
  • Motorola grant 34525/1/EENS20681N: Mobile Perception


The Parkour Cars. The Parkour Cars project aims to develop high fidelity real-time systems for perception, planning and control of agile vehicles in challenging terrain including jumps and loop-the-loops. The current research is focused on the local planning and control problem. Due to the difficulty of the maneuvers, the planning and control systems must consider the underlying physical model of the vehicle and terrain. This style of simulation-in-the-loop planning enables very accurate prediction and correction of the vehicle state, as well as the ability to learn precise attributes of the underlying physical model.
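Simulation-in-the-loop prediction can be illustrated with a toy forward rollout. This hypothetical kinematic bicycle model stands in for the project's actual physics engine, which models the full vehicle-terrain interaction.

```python
import numpy as np

def rollout(state, controls, dt=0.1, wheelbase=0.3):
    """Forward-simulate a kinematic bicycle model over a control sequence.

    state = (x, y, heading, speed); controls = [(accel, steer), ...].
    The planner scores candidate control sequences by simulating each one
    and comparing the predicted terminal state against the goal.
    """
    x, y, th, v = state
    for a, steer in controls:
        x += v * np.cos(th) * dt
        y += v * np.sin(th) * dt
        th += v / wheelbase * np.tan(steer) * dt
        v += a * dt
    return np.array([x, y, th, v])
```

Because the prediction runs the same model the controller assumes, discrepancies between predicted and observed states can also be used to refine the model's parameters.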



  • NSF CISE grant 34083/1/CCLS20795F: EAGER: Prototype Dense Motion Capture for Large-Scale Deformable-Scene Tracking
  • Toyota grant 33643/1/ECNS20952N: Robust Perception
  • Toyota grant 33643/1/ECNS20952N: Accurate visual-inertial map estimation


Unsupervised place recognition. As a foundation for evolving, plastic maps that can enable long-term autonomy, at Oxford Christopher Mei and I developed a new topo-metric representation of the world based on landmark co-visibility. This approach simplifies data association and improves the performance of unsupervised place recognition. We introduce the concept of dynamic bag-of-words, a novel form of query expansion based on finding cliques in the landmark co-visibility graph. The proposed approach avoids the (often arbitrary) discretization of space from the robot's trajectory that is common to most image-based loop-closure algorithms. Instead, we show that reasoning on sets of co-visible landmarks leads to a simple model that outperforms pose-based or view-based approaches. Using real and simulated imagery, we demonstrate that dynamic bag-of-words query expansion improves precision and recall for appearance-based localization.
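The co-visibility idea can be sketched in a few lines of hypothetical Python. The real method expands queries via cliques in the co-visibility graph; for brevity this sketch approximates that with one-hop neighbours, and all names are illustrative.

```python
from collections import defaultdict

def covisibility_graph(frames):
    """Build a landmark co-visibility graph.

    frames: iterable of sets of landmark ids observed together in one frame.
    Two landmarks are linked iff some frame observed both of them.
    """
    adj = defaultdict(set)
    for obs in frames:
        for a in obs:
            adj[a] |= obs - {a}
    return adj

def expand_query(query, adj):
    """Grow a query landmark set with everything co-visible with a member."""
    expanded = set(query)
    for lm in query:
        expanded |= adj.get(lm, set())
    return expanded
```

Places emerge from the connectivity of the graph rather than from where the trajectory happened to be cut into images.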

  • Publications: IROS 2010 PDF
  • Sponsors: European Commission under grant agreement number FP7-231888- EUROPA and EPSRC under Platform Grant EP/D037077/1.

Never-lost. In collaboration with Ashley Napier at Oxford, this research describes a system for online, constant-time pose estimation for road vehicles. We exploit both the state of the art in vision-based SLAM and the wide availability of overhead imagery of road networks. We show that by formulating the pose estimation problem in a relative sense, we can estimate the vehicle pose in real time and bound its absolute error using overhead image priors. We demonstrate the technique on data gathered from a stereo pair on a vehicle traveling at 40 kph through urban streets. Crucially, our method has no dependence on infrastructure, needs no workspace modification, does not rely on GPS reception, requires only a single stereo pair, and runs on an everyday laptop.

Relative visual-inertial estimation. Detecting moving reference frames using relative visual-inertial estimation, or measuring vertigo. We have recently begun looking into visual-inertial estimation in the relative framework. As its prevalence in nature attests, the combination of inertial and visual sensing is powerful; we hope to use it to detect the presence of moving reference frames, such as on an elevator or in a moving train.

  • Sponsors: European Commission under grant agreement number FP7-231888-EUROPA. Systems Engineering for Autonomous Systems (SEAS) Defence Technology Centre.

Relative SLAM. Visual simultaneous localization and mapping is the problem of estimating the motion of a camera while building a representation of the environment; this is an essential task for robot autonomy. In joint work with Christopher Mei I developed a relative simultaneous localization and mapping engine based on a novel continuous relative representation (essentially a manifold). This stereo-vision-based system runs at 20-40 Hz, even at loop closure. State updates are constant time due to the relative formulation, and loop closures are automatically detected using FABMAP.
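The constant-time relative formulation can be sketched in 2-D. This is hypothetical code, the real system works on SE(3) with stereo landmarks, but it shows the key property: each edge stores only a relative transform, and a global pose is materialized on demand by composing along a path.

```python
import numpy as np

def se2(x, y, theta):
    """Homogeneous 2-D rigid transform (rotation theta, translation x, y)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, x], [s, c, y], [0.0, 0.0, 1.0]])

def compose_path(edges):
    """Chain relative transforms along a path in the relative map.

    Updating one state touches only its edge (constant time); no global
    frame is ever stored, even across loop closures.
    """
    T = np.eye(3)
    for E in edges:
        T = T @ E
    return T

# Three relative edges: forward 1 m, forward 1 m then turn 90 deg, forward 1 m.
edges = [se2(1, 0, 0), se2(1, 0, np.pi / 2), se2(1, 0, 0)]
T = compose_path(edges)
```

A loop closure simply adds one more edge to the graph; nothing previously estimated needs to be re-linearized in a global frame.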




Sliding Window Filter for the Simultaneous Localization and Mapping problem
(book chapter, Tech Report).
We are using this filter to estimate environment structure with bias-corrected long-range stereo from moving platforms; better long-range stereo from a moving vehicle requires a solution to the SLAM problem.
Applications we are working on include hazard detection for safe and precise landing on Mars, long-range obstacle detection for autonomous boats, and moving-object detection from a moving platform for unmanned ground vehicles. Combining the Sliding Window Filter with new methods for recognizing loop closures is an interesting direction.
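The marginalization step at the heart of a sliding window filter can be sketched with the Schur complement on the normal equations. This is a generic sketch, not the chapter's exact implementation: old states are removed rather than simply dropped, so their information is retained in the reduced system.

```python
import numpy as np

def marginalize(H, b, k):
    """Marginalize out the first k variables of the normal equations H x = b.

    Returns the reduced (H, b) whose solution equals the corresponding tail
    of the full solution -- information from the removed states is folded in
    via the Schur complement instead of being discarded.
    """
    Haa, Hab = H[:k, :k], H[:k, k:]
    Hba, Hbb = H[k:, :k], H[k:, k:]
    Haa_inv = np.linalg.inv(Haa)
    H_marg = Hbb - Hba @ Haa_inv @ Hab      # Schur complement
    b_marg = b[k:] - Hba @ Haa_inv @ b[:k]
    return H_marg, b_marg
```

Repeatedly marginalizing the oldest pose as new frames arrive is what keeps the window, and hence the cost per update, bounded.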

New filters for long-range stereo. This work developed two new filters to deal with statistical bias in long-range stereo: the first is a second-order Gauss-Newton filter, and the second is the Iterated Sigma Point Kalman Filter.
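The bias these filters address can be illustrated with a toy simulation using hypothetical numbers: depth z = f·b/d is convex in disparity d, so zero-mean disparity noise inflates the expected depth estimate, and the effect grows at long range where disparities are small.

```python
import numpy as np

rng = np.random.default_rng(1)
f_b = 400.0          # focal length times baseline (pixel-metres, made up)
d_true = 2.0         # small disparity => distant point
sigma = 0.3          # disparity matching noise (pixels, made up)

# Monte Carlo estimate of E[f_b / (d + noise)] versus the true depth.
noise = rng.normal(0.0, sigma, 100_000)
z_est = f_b / (d_true + noise)
bias = z_est.mean() - f_b / d_true   # positive: depths are overestimated
```

A naive filter that ignores this convexity converges to a biased depth; the two filters above account for the nonlinearity (via second-order terms or sigma points, respectively).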



Stereo-based mapping and localization. Stereo-vision mapping and localization combines feature-based methods and dense range data to create high-fidelity models of the environment and to localize the robot against the model. We use least-squares statistical point-estimation techniques to solve the SLAM problem, keeping a keen eye on real-time online implementation strategies. Stereo offers several advantages over classical range-finding devices, such as 3D information, image-intensity information, and distant-object sensing. This information is very useful for data association, allowing the robot to disambiguate different locations.
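A minimal sketch of the least-squares point-estimation step, assuming a made-up linear observation model (the real system estimates landmarks from stereo feature measurements):

```python
import numpy as np

def least_squares_point(H, z):
    """Solve min_p ||H p - z||^2 via the normal equations (H^T H) p = H^T z."""
    return np.linalg.solve(H.T @ H, H.T @ z)

# Toy example: 25 direct, noisy observations of a 2-D landmark position.
rng = np.random.default_rng(2)
p_true = np.array([3.0, -1.0])
H = np.vstack([np.eye(2)] * 25)
z = H @ p_true + rng.normal(0.0, 0.05, 50)
p_hat = least_squares_point(H, z)
```

Stacking observations from many views in this way is the basic building block of the batch SLAM estimators used here.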


Co-founder and ex-CEO of BlueSky Robotics, LLC, which develops very interesting platforms.




Automatic gait learning for an 18-degree-of-freedom hexapod using Locally Weighted Projection Regression. The robot learned the classic tripod gait in the Gazebo dynamic simulator.




Engineer for the NASA Explorer Team (NExT) Micro Robot Explorer project at JPL. The robots in this work have demonstrated a wide range of tasks, including autonomous sensor-network deployment, discovery and repair (DDR), inverted mesh traversal, multi-robot autonomous construction, and micro-gravity mesh traversal. These robots got a lot of press, from the BBC to the Discovery Channel. More


Robotic Sensor Networks. Very small networked mobile robots, swarms, etc.




RoVis & RoVis Networks. Work done at UFL under an NSF REU grant. RoVis is a classic “rocker-bogie” style Mars rover knock-off. The goal of this work is to develop an inexpensive outdoor robotics research platform. RoVis is equipped with a Linux PC-104 stack, a stereo rig, an IMU, and optical encoders for odometry.