OpenDroneMap does an excellent job at post-processing aerial images. Fly the drone, download the images, process, wait a few hours and get your maps and 3D models.
What about real-time?
Real-time reconstruction presents several challenges and requires a different approach. For starters, using only images as input, without an RTK system and accurate IMUs (that is, better GPS and inertial sensors), you just can’t estimate the camera positions quickly enough. Relying only on standard GPS and the on-board camera pitch/yaw/roll readings, alignment problems are evident even on simple datasets. Structure from motion methods are simply not fast enough to correct for that error.
This is where Simultaneous Localization And Mapping (SLAM) comes in. SLAM has been gaining a lot of attention lately, especially since it’s the foundation of the work behind recent state-of-the-art AR algorithms. If you’ve ever wondered how your iPad can track itself in space and take measurements using just a monocular camera, there are SLAM algorithms behind it.
Given a video input and a camera model, we can extract video frames and pass them to one of several FOSS SLAM implementations (ORB_SLAM2, DSO, among others), estimate camera poses and get a sparse point cloud. If we can receive GPS coordinates in real-time, we can fuse that information with the SLAM trajectory to establish a world coordinate system.
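As a rough sketch of what that pipeline could look like (not how OpenDroneMap implements it), the snippet below feeds frames to ORB_SLAM2’s monocular interface and aligns the resulting trajectory to GPS fixes with a similarity transform via Eigen::umeyama. The vocabulary/settings file names, the video source and the GPS feed are placeholders:

```cpp
// Minimal sketch: track video frames with ORB_SLAM2, collect camera centres,
// and align the SLAM trajectory to GPS positions (ENU) with a similarity transform.
#include <System.h>              // ORB_SLAM2
#include <opencv2/opencv.hpp>
#include <Eigen/Geometry>
#include <vector>

int main() {
    // Vocabulary and settings files are placeholders for this sketch
    ORB_SLAM2::System slam("ORBvoc.txt", "camera.yaml",
                           ORB_SLAM2::System::MONOCULAR, false);

    cv::VideoCapture video("flight.mp4");          // or a live stream
    std::vector<Eigen::Vector3d> slam_positions;   // camera centres in SLAM frame
    std::vector<Eigen::Vector3d> gps_positions;    // matching GPS fixes (ENU), assumed available

    cv::Mat frame;
    double t = 0.0;
    while (video.read(frame)) {
        // Tcw is the 4x4 world-to-camera pose estimated by SLAM (empty if tracking is lost)
        cv::Mat Tcw = slam.TrackMonocular(frame, t);
        t += 1.0 / 30.0;
        if (Tcw.empty()) continue;

        // Camera centre in SLAM coordinates: C = -R^T * t
        cv::Mat R = Tcw.rowRange(0, 3).colRange(0, 3);
        cv::Mat tvec = Tcw.rowRange(0, 3).col(3);
        cv::Mat C = -(R.t() * tvec);
        slam_positions.emplace_back(C.at<float>(0), C.at<float>(1), C.at<float>(2));
        // gps_positions.push_back(latest_gps_enu());  // hypothetical GPS feed
    }
    slam.Shutdown();

    // Fuse GPS: estimate the scale + rotation + translation mapping SLAM coords to ENU
    if (slam_positions.size() >= 3 && slam_positions.size() == gps_positions.size()) {
        Eigen::MatrixXd src(3, slam_positions.size()), dst(3, gps_positions.size());
        for (size_t i = 0; i < slam_positions.size(); ++i) {
            src.col(i) = slam_positions[i];
            dst.col(i) = gps_positions[i];
        }
        Eigen::Matrix4d T = Eigen::umeyama(src, dst, true);  // with scaling
        // T now georeferences every SLAM pose and map point
    }
    return 0;
}
```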
With a (filtered) sparse point cloud and camera information, we have two options:
- Image mosaicking assuming a flat plane, which doesn’t give us true orthorectified maps, but works remarkably well for high altitude flights at nadir angles. This is the approach used by programs such as Map2DFusion (a rough sketch of the idea follows this list).
- Volumetric 3D Mapping, which would construct a mesh in real time and then incrementally export orthophotos to be merged.
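To make the flat-plane option concrete, here is a simplified sketch of the plane-induced homography warp it relies on. This is not Map2DFusion’s code; the intrinsics, pose and map scale below are placeholder values:

```cpp
// For a nadir image with intrinsics K and pose [R|t] (CV_64F), the homography
// mapping ground-plane coordinates (Z = 0) into the image is H = K * [r1 r2 t].
// Warping each frame by the inverse of H into a shared raster builds the mosaic.
#include <opencv2/opencv.hpp>

void add_to_mosaic(const cv::Mat &frame, const cv::Mat &K,
                   const cv::Mat &R, const cv::Mat &t,
                   cv::Mat &mosaic) {
    // Columns r1, r2 of R plus the translation t form the plane-induced homography
    cv::Mat H(3, 3, CV_64F);
    R.col(0).copyTo(H.col(0));
    R.col(1).copyTo(H.col(1));
    t.copyTo(H.col(2));
    H = K * H;                                     // ground plane -> image pixels

    // S maps ground metres to mosaic pixels (scale + offset); placeholder values
    cv::Mat S = (cv::Mat_<double>(3, 3) << 10, 0, mosaic.cols / 2.0,
                                            0, 10, mosaic.rows / 2.0,
                                            0,  0, 1);

    // Warp the frame into map coordinates and composite it into the mosaic
    cv::Mat warped, gray;
    cv::Mat M = S * H.inv();
    cv::warpPerspective(frame, warped, M, mosaic.size());
    cv::cvtColor(warped, gray, cv::COLOR_BGR2GRAY);
    warped.copyTo(mosaic, gray > 0);               // naive overwrite; real blending is needed
}
```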
Georeferencing processes similar to those that already exist in OpenDroneMap would take care of generating GeoTIFFs. As a bonus, we could also work toward generating real-time digital surface models (DSMs).
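For illustration only, an orthophoto tile could be written out as a GeoTIFF with GDAL along these lines; OpenDroneMap’s own georeferencing code differs, and the file name, EPSG code and extent below are assumptions:

```cpp
// Write an 8-bit BGR orthophoto tile (OpenCV image) as a georeferenced GeoTIFF.
#include <gdal_priv.h>
#include <ogr_spatialref.h>
#include <opencv2/opencv.hpp>
#include <vector>

void write_geotiff(const cv::Mat &ortho, double origin_x, double origin_y,
                   double gsd /* metres per pixel */) {
    GDALAllRegister();
    GDALDriver *drv = GetGDALDriverManager()->GetDriverByName("GTiff");
    GDALDataset *ds = drv->Create("ortho.tif", ortho.cols, ortho.rows,
                                  3, GDT_Byte, nullptr);

    // Affine transform: pixel (col, row) -> projected coordinates
    double gt[6] = {origin_x, gsd, 0.0, origin_y, 0.0, -gsd};
    ds->SetGeoTransform(gt);

    // Assign a CRS (UTM zone for the flight area; EPSG code assumed here)
    OGRSpatialReference srs;
    srs.importFromEPSG(32633);
    char *wkt = nullptr;
    srs.exportToWkt(&wkt);
    ds->SetProjection(wkt);
    CPLFree(wkt);

    // Write the image bands (OpenCV stores BGR; GeoTIFF bands are R, G, B)
    std::vector<cv::Mat> bands;
    cv::split(ortho, bands);
    for (int b = 0; b < 3; ++b) {
        ds->GetRasterBand(b + 1)->RasterIO(GF_Write, 0, 0, ortho.cols, ortho.rows,
                                           bands[2 - b].data, ortho.cols, ortho.rows,
                                           GDT_Byte, 0, 0);
    }
    GDALClose(ds);
}
```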