MicMac – a free, open-source solution for photogrammetry
 Ewelina Rupnik^{1, 2},
 Mehdi Daakir^{1, 3} and
 Marc Pierrot Deseilligny^{1}
https://doi.org/10.1186/s40965-017-0027-2
© The Author(s) 2017
Received: 20 December 2016
Accepted: 1 May 2017
Published: 5 June 2017
Abstract
This publication familiarizes the reader with MicMac, a free, open-source photogrammetric software for 3D reconstruction. It highlights a brief history of the tool, its organisation and its unique features vis-à-vis other software tools. The essential algorithmic aspects of the structure from motion and dense image matching problems are discussed from the implementation and the user’s viewpoints.
Keywords
Background
Photogrammetry is the art, science, and technology of obtaining geometric information on the 3-dimensional shape and orientation of objects from images and other imaging sensors. It is a cheap measurement methodology, as it can be carried out with practically any digital camera of decent quality. It is instantaneous, as it captures the observed phenomena at once and in a split second, and highly automated, therefore accessible to non-expert users. Thanks to the field of computer vision, photogrammetry has been rejuvenated and today ranks among other competitive remote sensing techniques (e.g. Light Detection and Ranging, LiDAR) [1]. The milestones leading to this progress are the automated detection of interest points [2], the Structure from Motion (SfM) algorithms capable of reconstructing scenes from unordered image collections [3, 4], and the dense image matching techniques delivering surface models of resolution equal to the pixel size on the ground [5, 6].
All this contributes to an ever-growing visibility of photogrammetric tools across various fields of science and engineering, a growing market interest, and subsequently a multitude of photogrammetric/computer vision libraries and software solutions, be they commercial or free/open-source. MicMac, together with Bundler, PMVS, VisualSfM, openMVG, OpenCV and others, belongs to the open-source solutions^{1}. This publication aims at familiarizing the reader with the philosophy behind MicMac, some of its crucial algorithmic aspects, the software architecture and the pool of available tools.
Historical notes
MicMac has been developed at the National Institute of Geographic and Forestry Information (IGN) and the National School of Geographic Sciences (ENSG) since 2003. Initially, the software tools were developed exclusively with IGN’s cartographic production in mind. The independent tools were interfaced in 2005 via an XML framework, allowing the user to freely parametrize the calculations at all processing stages. In 2007, IGN began to freely distribute MicMac under the CECILL-B license, a version of the L-GPL license adapted to French law.
Until 2008, the dense image matching of already oriented images was possible only with IGN’s internal image file format. In the same year the Apero tool was added to the software kernel, offering from then on the possibility to estimate camera exterior and interior orientations with no restriction on the image format. In 2010, the XML interface was replaced by a simplified command line. This evolution contributed to improved accessibility and diffusion, and subsequently to a better visibility of the software in the scientific communities and among the general public.
Since 2010 MicMac has been undergoing a significant evolution due to its involvement in many French and European projects. Besides the contributions to the source code, new software distributions under Windows and Mac OSX, as well as GPU processing for certain tasks, became available.
MicMac vis-à-vis other tools

access to intermediary results in open data formats allowing to interact with the processing chain at any desired stage,

qualitative evaluation of the results via quality indicators (e.g. bundle block adjustment residuals, covariance matrices, parameter’s sensitivity, correlation images),

a wide range of camera calibration models (e.g. the models adapted to consumer-grade cameras, large-frame aerial cameras, cameras with very long focal lengths, fisheye and spherical cameras),

two-dimensional dense image matching for deformation studies,

processing of frame camera and pushbroom sensor images,

processing of scanned analogue images,

architecture adapted to big datasets.
Software organisation
Implementation
The algorithmic aspects
The photogrammetric workflow encompasses the passage from images, through the estimation of the orientation parameters, to a 3D surface model. In other words, it is a passage from a 2D representation of the world captured by a camera, through inference of the position and rotation of that camera at the moment the image was taken, towards a 3D restitution of the lost dimension. While the quality of the final output depends on the skillful engineering of many small processing steps, the estimation of the camera orientation parameters and the matching algorithms constitute the heart of the pipeline. Accordingly, the following sections concentrate on these two aspects, report on the adopted methods and give an overview of what is possible in MicMac.
Structure from motion
Structure from motion recovers both the structure (i.e. the 3D coordinates of the observed scene) and the model. It is well known that this model (i.e. the transformation from 3D to 2D space and vice versa, also referred to as the collinearity equation) is not linear and hence requires linearization. Moreover, there exists no direct algorithm able to compute orientation parameters that are globally consistent across a number of images (generally n>3). To overcome this gap, bootstrap solutions were proposed. Using direct algorithms for a single image, a pair or a triplet of images, the global orientation is deduced sequentially [3] or hierarchically [7], starting from a seed image pair. The so-obtained parameters serve as input to a system of equations composed of the linearized collinearity equations, where their optimal values, in the stochastic sense, are found iteratively. The observations (e.g. tie points) are redundant, therefore the solution is deduced with the least squares method by minimizing an objective function. The typical function is the sum of squared differences between the nominal observations and those predicted from the estimated model, possibly subject to constraints. The latter stage is also known as the bundle block adjustment (BBA).
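The least-squares objective described above can be written compactly. The notation is an assumption introduced here for illustration and does not come from the original text: \(\mathbf{x}_{ij}\) is the observed image position of tie point \(j\) in image \(i\), \(\pi\) is the collinearity (projection) function, \(\mathbf{p}_i\) collects the orientation and calibration parameters of image \(i\), \(\mathbf{X}_j\) is the 3D point, and \(w_{ij}\) is an observation weight.

\[
\min_{\{\mathbf{p}_i\},\{\mathbf{X}_j\}} \; \sum_{i}\sum_{j} w_{ij}\,\bigl\| \mathbf{x}_{ij} - \pi(\mathbf{p}_i, \mathbf{X}_j) \bigr\|^{2}
\]

In each iteration, \(\pi\) is replaced by its first-order expansion around the current estimate, the resulting linear least-squares system is solved for parameter corrections, and the process repeats until the corrections become negligible.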
MicMac implementation.

tie points,

lines,

perspective centers derived from direct georeferencing (i.e. GNSS),

ground control points (GCP),

the lever-arm vector separating the GNSS antenna and the perspective center,

rigid relative orientation between cameras.
On top of that, a broad spectrum of camera calibration models is available. Both approaches are supported, the physical and the phenomenological. The former models any imaging error attributed to a phenomenon of a physical nature, while the latter models solely the effects of the lens imperfections, without investigating their causes. The total distortion is defined as a composition of predefined elementary distortions, such as radial, decentric or a generic polynomial. Typically, the major distortions are modeled by the physical model, while the generic polynomial models remove the less significant remaining systematic errors.
The critical element of the BBA’s mathematical model is the observation weighting, known as the stochastic model. MicMac specifies three different weighting strategies. The first strategy corresponds to the weighting in the classical Gauss-Markov model known in photogrammetry, where the observations are weighted by their true standard deviations known a priori. The second strategy controls the influence of a particular category of observations, so as to avoid solutions that are driven by a single category merely because its observations are abundant. The last strategy addresses robustness. The weight is a function of the observation’s residual in the BBA, therefore giving more credibility to observations that are close to the estimated model and, conversely, limiting the influence of observations with high residuals.
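The three strategies can be pictured as three factors of a single weight. The following is a minimal sketch under stated assumptions, not MicMac's actual formula: the function name, its parameters, the capping rule for a category, and the Cauchy-like attenuation are all hypothetical choices made for illustration.

```python
def observation_weight(residual, sigma, n_obs, n_max, robust_scale):
    """Illustrative weight combining three strategies (all names are hypothetical).

    residual     -- residual of the observation in the current BBA iteration
    sigma        -- a priori standard deviation of the observation
    n_obs        -- number of observations in this observation category
    n_max        -- number of observations the category may "count for" at most
    robust_scale -- residual magnitude at which the robust factor drops to 0.5
    """
    # 1) Gauss-Markov weighting: inverse of the a priori variance.
    w_prior = 1.0 / (sigma ** 2)
    # 2) Category balancing: an abundant category is scaled down so that its
    #    total influence does not exceed that of n_max observations.
    w_balance = min(1.0, n_max / n_obs)
    # 3) Robustness: attenuate observations with large residuals.
    w_robust = 1.0 / (1.0 + (residual / robust_scale) ** 2)
    return w_prior * w_balance * w_robust


if __name__ == "__main__":
    for r in (0.0, 1.0, 5.0):
        print(r, observation_weight(r, sigma=1.0, n_obs=1000, n_max=100, robust_scale=2.0))
```

In an iterative adjustment the robust factor would be recomputed from the residuals of the previous iteration, so that outliers progressively lose influence.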
User’s viewpoint.
There are two principal modules that handle the camera orientation recovery with the simplified interface, Tapas and Campari, and both of them call the parent orientation tool Apero (cf. Fig. 1). Tapas calculates the purely relative orientation of images, using observed tie points as the only input. Since at this point there is no a priori knowledge of the positions and rotations of the cameras, Tapas also entails the initialization step, where it progressively reconstructs the cameras with the help of the direct orientation algorithms and interleaves this with the BBA routine.
Unlike Tapas, Campari is a BBA module adapted to handle heterogeneous observations. Besides the tie points, it works with GCPs and GNSS data, and can impose various constraints on the solution. This module is typically executed once a good relative orientation is established and the cameras are moved to a coordinate system (CS) consistent with that of the auxiliary observations, e.g. the GCPs. The latter is performed with any Bascule tool, and it is a spatial similarity transformation (cf. Fig. 1). Both Campari and all variations of Bascule can be regarded as georeferencing tools.
Recently, MicMac has begun to provide the user with the internal accuracy estimates of the BBA parameters (their standard deviations and correlations) derived from covariance matrices.
Multi-view stereo image matching (MVSM)
Given several overlapping images with known orientation parameters, MVSM is the process of reconstructing a complete 3D model of the scene by finding corresponding pixels (i.e. their disparities or depths, respectively) between image pairs, and triangulating them in space. The generic algorithm is defined as an energy minimization problem that searches for a disparity map minimizing the energy. The minimization can be solved with local, semi-global and global algorithms. Today the gold standard method for producing dense 3D models from images is semi-global matching [6].
Image matching breaks down into three processing stages: computation of the pixel matching cost, cost aggregation and disparity calculation. The matching cost is a measure of dissimilarity, i.e. it describes the unlikelihood that two pixels belong to a unique point in 3D. As the matching can be ambiguous, and the minimum cost insignificant, an additional a priori is imposed on the energy function that penalizes disparity changes at neighboring pixels (the case of local and semi-global algorithms). This aggregation takes place within a window, along a 1D path (locally) or along multiple paths (semi-globally). In the latter case, the cost for a given disparity is accumulated along many paths that end in the pixel in question [9].
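A spatial similarity transformation maps a point X of the relative frame to X' = s·R·X + T, with a scale s, a rotation R and a translation T. The sketch below only illustrates that formula. The function names are hypothetical, the rotation is restricted to the Z axis for brevity, and nothing here reproduces the interface of the Bascule tools.

```python
import math


def rotation_z(angle_rad):
    """Return a 3x3 rotation matrix (list of rows) about the Z axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]


def similarity_transform(point, scale, rotation, translation):
    """Apply X' = scale * R * X + T to a 3D point given as [x, y, z]."""
    rotated = [sum(rotation[r][c] * point[c] for c in range(3)) for r in range(3)]
    return [scale * rotated[r] + translation[r] for r in range(3)]


if __name__ == "__main__":
    # A perspective center in the arbitrary relative frame ...
    center_rel = [1.0, 0.0, 0.0]
    # ... moved to a hypothetical ground frame: scale 2, quarter turn, shift.
    print(similarity_transform(center_rel, 2.0, rotation_z(math.pi / 2), [10.0, 20.0, 30.0]))
```

The seven parameters (one scale, three rotation angles, three translations) can be estimated from at least three non-collinear points known in both frames, which is why GCPs are a natural input for this step.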
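The path-wise accumulation can be illustrated on a single scanline. The sketch below follows the recurrence of semi-global matching [6] with two penalties, here called `p1` (disparity change of one step) and `p2` (any larger change). It is a toy illustration with hypothetical function names, restricted to two opposite 1D paths, and it is not MicMac's implementation.

```python
def aggregate_along_path(costs, p1, p2):
    """Accumulate matching costs along one 1D path.

    costs[i][d] is the matching cost of pixel i at disparity index d.
    """
    n_disp = len(costs[0])
    aggregated = [list(costs[0])]  # the first pixel has no predecessor
    for i in range(1, len(costs)):
        prev = aggregated[-1]
        prev_min = min(prev)
        row = []
        for d in range(n_disp):
            candidates = [prev[d], prev_min + p2]       # same disparity / large jump
            if d > 0:
                candidates.append(prev[d - 1] + p1)     # one step down
            if d < n_disp - 1:
                candidates.append(prev[d + 1] + p1)     # one step up
            # Subtracting prev_min keeps the accumulated values bounded.
            row.append(costs[i][d] + min(candidates) - prev_min)
        aggregated.append(row)
    return aggregated


def semi_global_1d(costs, p1, p2):
    """Sum the forward and backward path costs and pick the best disparity per pixel."""
    forward = aggregate_along_path(costs, p1, p2)
    backward = aggregate_along_path(costs[::-1], p1, p2)[::-1]
    disparities = []
    for f_row, b_row in zip(forward, backward):
        total = [f + b for f, b in zip(f_row, b_row)]
        disparities.append(total.index(min(total)))
    return disparities


if __name__ == "__main__":
    # The middle pixel is ambiguous: its raw minimum (index 2) disagrees with its neighbors.
    costs = [[0, 9, 9], [5, 5, 4], [0, 9, 9]]
    print(semi_global_1d(costs, p1=2, p2=6))
```

In this toy input the raw cost of the middle pixel favors a different disparity than its neighbors, and the accumulated cost pulls it back to a solution consistent with them.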
MicMac implementation
The motivation to distinguish between these two restitution geometries, the ground and the image, is twofold. On the one hand, a Digital Surface Model (DSM) produced from aerial images is normally defined in some reference CS, therefore it is more intuitive to perform the reconstruction directly in the target CS (i.e. ground geometry), where the disparity space \(\varepsilon_{px}\) is the \(\mathcal{Z}\) coordinate (cf. Fig. 3 a). On the other hand, it is known that matching in image space with a master and a set of slave images is more reliable, especially in close-range photogrammetry applications and for small datasets. The disparity space \(\varepsilon_{px}\) in the image geometry is then either the depth along the ray or the respective disparity in image space (i.e. the image geometry; cf. Fig. 3 b and c). It is up to the user which geometry to employ.
In the energy formulation, \(\mathcal{A}(x,y,\mathcal{F}_{px}(x,y))\) is the measure of similarity between pixels, with \(\mathcal{A}=0\) when they are identical; \(\|\nabla(\mathcal{F}_{px})\|^{reg}\) is a norm on the gradient, which is used as a regularity criterion (it penalizes the variation of \(\mathcal{F}_{px}\)); \(\alpha_{1}\) is the regularization on the first component of the disparity and \(\alpha_{2}\) the regularization on the second component (equivalent to matching in the direction transverse to the epipolar line).
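Written out, an energy of this kind takes the following form. This is a sketch consistent with the definitions given here rather than a quotation; the superscripts 1 and 2 on \(\mathcal{F}_{px}\), denoting the two disparity components, are notation assumed for illustration.

\[
E(\mathcal{F}_{px}) = \iint \Bigl[ \mathcal{A}\bigl(x,y,\mathcal{F}_{px}(x,y)\bigr) + \bigl\|\nabla(\mathcal{F}_{px})\bigr\|^{reg} \Bigr] \, dx\,dy
\]

\[
\bigl\|\nabla(\mathcal{F}_{px})\bigr\|^{reg} = \alpha_{1}\,\bigl|\nabla(\mathcal{F}^{1}_{px})\bigr| + \alpha_{2}\,\bigl|\nabla(\mathcal{F}^{2}_{px})\bigr|
\]

The first term rewards photo-consistency, the second rewards smoothness, and the ratio between \(\alpha_{1}\), \(\alpha_{2}\) and the similarity term sets how strongly the surface is regularized in each direction.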

if the desired model is smooth, a convex function F is adequate (it’s better to climb up a given jump by regular steps),

if the desired model has many discontinuities, a concave function F is adequate (it’s better to climb up a given jump in a single step),

when there is no strong prior, the default choice is to have F linear,

if a priori knowledge of the scene slope exists, an allowable maximum scene slope can be imposed,

for 2D matching, non-isotropic smoothing factors can be set.
The actual similarity measure \(\mathcal{A}(x,y,\mathcal{F}_{px}(x,y))\) is calculated from the normalized cross-correlation coefficient (\(1-Cor\)), defined as a function of multiple images rather than a single pair. The coefficient can privilege one image out of a set of images (e.g. a master image), or consider the set as “symmetrical”. Varying cross-correlation window sizes and weighted windows are also possible.
To find the disparity map that minimizes the energy, MicMac by default employs a multi-directional variant of the dynamic programming algorithm. The results along independent directions are aggregated by taking the mean of all results or simply the best result. Optionally, a global minimization by the MinCut/MaxFlow algorithm [10] can be enforced.
In order to limit the disparity search space, speed up the calculation and reduce the noise, MicMac adopts a multi-resolution matching approach where coarse resolution disparity maps serve as predictors of the fine resolution output.
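The \(1-Cor\) cost can be sketched for two windows of pixel intensities as follows. The function names are hypothetical, the windows are flattened to plain lists, and averaging the master-to-slave costs is only one assumed way of privileging a master image; the multi-image coefficient used by MicMac may be defined differently.

```python
import math


def ncc_cost(window_a, window_b):
    """Return 1 - Cor for two equally sized lists of pixel intensities."""
    n = len(window_a)
    mean_a = sum(window_a) / n
    mean_b = sum(window_b) / n
    centered_a = [a - mean_a for a in window_a]
    centered_b = [b - mean_b for b in window_b]
    norm_a = math.sqrt(sum(a * a for a in centered_a))
    norm_b = math.sqrt(sum(b * b for b in centered_b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A textureless window has no defined correlation; treat it as Cor = 0.
        return 1.0
    cor = sum(a * b for a, b in zip(centered_a, centered_b)) / (norm_a * norm_b)
    return 1.0 - cor


def multi_image_cost(master, slaves):
    """Average the master-to-slave costs (an assumed master-privileging scheme)."""
    return sum(ncc_cost(master, s) for s in slaves) / len(slaves)


if __name__ == "__main__":
    master = [10, 20, 30, 40]
    brighter = [2 * v + 5 for v in master]  # same texture, different gain and offset
    print(ncc_cost(master, brighter))        # close to 0: correlation ignores gain/offset
    print(multi_image_cost(master, [master, brighter]))
```

Because the windows are centered and normalized, the cost is insensitive to linear radiometric differences between images, which is the main reason correlation is preferred over raw intensity differences.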
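The coarse-to-fine prediction can be pictured in one dimension. The sketch assumes disparities expressed in pixels of the respective resolution level, so that a coarse value doubles when the resolution doubles; the function name, the `scale` factor and the `margin` are hypothetical parameters, and a depth or Z parametrization would not be rescaled this way.

```python
def predicted_search_ranges(coarse_disparities, scale=2, margin=1):
    """Derive a disparity search interval for every fine-resolution pixel.

    Each coarse pixel covers `scale` fine pixels, and its disparity, multiplied
    by `scale`, becomes the center of a narrow interval of half-width `margin`.
    """
    ranges = []
    for d_coarse in coarse_disparities:
        center = d_coarse * scale
        for _ in range(scale):
            ranges.append((center - margin, center + margin))
    return ranges


if __name__ == "__main__":
    # Two coarse pixels predict the search intervals of four fine pixels.
    print(predicted_search_ranges([3, 5]))
```

Instead of testing every disparity of the full range at the fine level, only the few candidates inside each interval are evaluated, which reduces both the computation and the chance of picking a distant false match.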
User’s viewpoint
DSM creation in MicMac takes place via several tools that are semi- or fully automated. In both cases the parent tool that handles the low-level dense matching is MicMac, which an expert user can access via an XML file.
Besides the classical 1D matching (along an image ray, an image row or the Z-coordinate) for reconstruction of object geometry, MicMac also implements a 2D matching strategy. This strategy is useful in 2D deformation studies between rectified images (cf. Fig. 5; also MM2DPosSism), as an orientation quality estimation tool to assess the remaining y-parallax (MMTestOrient; cf. Fig. 6), or in cases when orientation parameters are unknown or known with poor precision (executed with an XML file). Because the expected disparities pertinent to geometry are of higher frequencies than those along the y-parallax, the regularization of the energy function in the two directions is managed separately; cf. Eq. (2).
Lastly, in the event where all camera optical centers are located on a 3D line, parallel to the surveyed object, and the reconstruction is defined in object space, the matching becomes ambiguous along the epipolar lines. To avoid the ambiguities and the resultant noise, MicMac calculates the matching in ortho-cylindrical coordinates (see RepLocBascule).
Interactive tools
All results produced by MicMac are exclusively in open data formats (tif, xml, ply), which opens up the possibility to interact with the processing chain at any desired stage. For instance, the user can import interior and exterior image orientation data and proceed with the dense matching and orthophoto generation. Or vice versa, the orientation parameters or epipolar images can be generated to proceed with an alternative dense image matcher. The reader is advised to refer to the software documentation [11] for more details on the conversion procedures.
Image measurement tools
In the following, several tools for image measurements and the use of 2D/3D masks are discussed.
Visualization tools
To visualize the intermediary and final results MicMac offers a range of different tools.
The SEL tool visualizes the tie points between a stereo pair. The AperiCloud tool converts the result of the BBA, that is the poses and 3D tie points, to the point cloud ply format.
The Vino tool is a simple image viewer adapted to the visualization of very large images (i.e. a few gigabytes in size), irrespective of their type (i.e. integer, float or >8-bit). Moreover, it lets the user modify the histogram of the image or generate an image crop.
Other tools allow a more intuitive visualization of depth maps. For instance, the to8bits tool converts 32-bit or 16-bit images to 8-bit images, while the GrShade tool computes shading from a depth map image (cf. Fig. 4). Lastly, the Nuage2Ply tool transforms a depth map into a point cloud in ply format (cf. Fig. 7).
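The idea behind turning a depth map into a point cloud can be sketched as a back-projection followed by an ASCII ply export. The sketch assumes a distortion-free pinhole camera with depth measured along the optical axis; the function name and the parameters `focal`, `cx` and `cy` are hypothetical, and the code does not reproduce how Nuage2Ply works internally.

```python
def depth_map_to_ply(depth, focal, cx, cy):
    """Back-project a depth map and return the point cloud as ASCII ply text.

    depth  -- list of rows; each entry is a depth value, or None/<=0 if invalid
    focal  -- focal length in pixels
    cx, cy -- principal point in pixels
    """
    points = []
    for row, line in enumerate(depth):
        for col, z in enumerate(line):
            if z is None or z <= 0:
                continue  # skip pixels without a valid depth
            x = (col - cx) * z / focal
            y = (row - cy) * z / focal
            points.append((x, y, z))
    header = [
        "ply",
        "format ascii 1.0",
        "element vertex %d" % len(points),
        "property float x",
        "property float y",
        "property float z",
        "end_header",
    ]
    body = ["%.6f %.6f %.6f" % p for p in points]
    return "\n".join(header + body) + "\n"


if __name__ == "__main__":
    toy_depth = [[2.0, None], [0.0, 4.0]]
    print(depth_map_to_ply(toy_depth, focal=2.0, cx=0.0, cy=0.0))
```

The resulting text can be saved with a .ply extension and opened in a point cloud viewer that reads ASCII ply files.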
Discussion

http://micmac.ensg.eu, as the software’s reference webpage;

http://forum-micmac.forumprod.com/, for technical support;

https://github.com/micmacIGN/, for the source code.
The high-priority ongoing developments and short-term perspectives concern the effective processing of big datasets. The developments pertain to (a) the tie point extraction, (b) the SfM, as well as (c) the storage of 3D data. Regarding the tie points, very precise tie point detectors and detectors invariant to affine distortions are being developed. Within the SfM, global, structureless methods [12–14] are under development. As a further perspective, adequate methods for storing very large 3D point clouds will be conceived. Lastly, as there have been numerous requests for a GUI, computer programmers willing to contribute are strongly encouraged to contact the team of developers. Both a standalone GUI application and a GUI integrated with other open-source software tools, e.g. QGIS, are possible.
Declarations
Funding

Institut National de l’Information Géographique et Forestière (IGN), main funder since 2003

The French FUI Project “Culture 3D Cloud”

The French ANR Project “MONUMENTUM”

Centre national d’études spatiales (CNES) via the TOSCA programme
Authors’ contributions
While all authors contributed equally to this article, the software itself was in major part developed by MPD. All authors read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Authors’ Affiliations
References
1. Granshaw SI, Fraser CS. Editorial: Computer vision and photogrammetry: Interaction or introspection? Photogrammetric Rec. 2015; 30(149):3–7.
2. Lowe DG. Distinctive image features from scale-invariant keypoints. Int J Comput Vis. 2004; 60(2):91–110. doi:10.1023/B:VISI.0000029664.99615.94.
3. Snavely N, Seitz SM, Szeliski R. Photo tourism: exploring photo collections in 3d. In: ACM Transactions on Graphics (TOG), vol. 25. ACM: 2006. p. 835–46. http://dl.acm.org/citation.cfm?id=1141964.
4. Pierrot-Deseilligny M, Clery I. Apero, an open source bundle adjustment software for automatic calibration and orientation of set of images. 2011; 38(5):269–76.
5. Pierrot-Deseilligny M, Paparoditis N. A multiresolution and optimization-based image matching approach: An application to surface reconstruction from SPOT5-HRS stereo imagery. Arch Photogramm Remote Sens Spat Inf Sci. 2006;36(1/W41).
6. Hirschmuller H. Stereo processing by semiglobal matching and mutual information. IEEE Trans Pattern Anal Mach Intell. 2008; 30(2):328–41.
7. Toldo R, Gherardi R, Farenzena M, Fusiello A. Hierarchical structure-and-motion recovery from uncalibrated images. Comput Vis Image Underst. 2015; 140:127–43.
8. Nocedal J, Wright S. Numerical Optimization: Springer; 2006. http://link.springer.com/book/10.1007/978-0-387-40065-5.
9. Szeliski R. Computer Vision: Algorithms and Applications: Springer; 2010. http://dl.acm.org/citation.cfm?id=1941882.
10. Roy S, Cox IJ. A maximum-flow formulation of the n-camera stereo correspondence problem. In: Computer Vision, 1998. Sixth International Conference On. IEEE: 1998. p. 492–9. http://ieeexplore.ieee.org/abstract/document/710763/.
11. MicMac, Apero, Pastis and Other Beverages in a Nutshell. https://github.com/micmacIGN/Documentation/blob/master/DocMicMac.pdf. Accessed 11 May 2017.
12. Enqvist O, Kahl F, Olsson C. Non-sequential structure from motion. In: Computer Vision Workshops (ICCV Workshops), 2011 IEEE International Conference On. IEEE: 2011. p. 264–71. http://ieeexplore.ieee.org/abstract/document/6130252/.
13. Moulon P, Monasse P, Marlet R. Global fusion of relative motions for robust, accurate and scalable structure from motion. In: Proceedings of the IEEE International Conference on Computer Vision: 2013. p. 3248–255.
14. Reich M, Yang MY, Heipke C. Global robust image rotation from combined weighted averaging. ISPRS J Photogrammetry Remote Sens. 2017.