1. Introduction
Developing 3D digital models of heritage assets, monuments, archaeological excavation sites, or natural landscapes is becoming commonplace in areas such as heritage documentation, virtual reconstruction, visualization, crime scene inspection, project planning, augmented and virtual reality, serious games, and scientific research. Conventional geometry-based modelling approaches using software such as Maya, Blender, or 3D Studio Max typically involve a steep learning curve and require a considerable amount of time and effort. Advancements in hardware (laser scanners, UAVs, etc.), especially for 3D reconstruction of real-world objects, have made it easier for professionals to virtually reconstruct 3D scenes. However, tools like laser scanners and structured lighting systems are often costly. Additionally, this technology has limitations with regard to rendered material properties and environmental conditions, such as strong sunlight [1].
There is a critical need for tools that allow non-expert users to comfortably and efficiently create 3D reconstruction models, especially for visual and documentation purposes. To answer this demand, some commercial as well as free and open source software (FOSS) packages based on image-based modelling (IBM) or photogrammetry have emerged. Image-based 3D reconstruction software creates a 3D point cloud, with camera poses, derived from uncalibrated photographs. The software determines the geometric properties of objects from photographic images. This process requires comparing reference points or matching pixels across a series of photographs. A sufficient number of photographs of adequate quality is needed for the software to process, match and triangulate visual features and then generate a 3D point cloud.
Structure from Motion (SfM) is one of the most common techniques in the image-based modelling approach and is implemented in different software. This technology enables non-experts to quickly and easily capture high-quality models from uncalibrated images captured with cheap setups, without requiring any specialized hardware or carefully designed illumination conditions. Generally, the workflow follows six steps to produce 3D reconstructions/3D models:
- (1) Image acquisition (or adding photos)
- (2) Feature detection, matching, triangulation (or align photos)
- (3) Sparse reconstruction, bundle adjustment (or point cloud generation)
- (4) Dense correspondence matching (or dense cloud generation)
- (5) Mesh/surface generation, and
- (6) Texture generation.
A few software packages also offer cloud/mesh editing within a single package [2]. The whole process of 3D reconstruction can be done with the support of cloud-based computation or on a local PC, depending on the service/application used. This paper does not cover the cloud processing method and focuses only on software/applications that run on a local PC workstation.
Ample studies have been published so far on image-based modelling software, analyzing their performance [3,4,5], accuracy in 3D production [6,7,8], the algorithms used [9], and scalability [1,9,10]. A few of them also discuss workflows [11,12]. However, it is rare to find studies on either best practice or optimized workflows that can be adopted by the general public to create 3D reconstruction models easily and for free or at little cost.
This paper presents a comparison study of four 3D reconstruction FOSS (free and open source software) packages selected on the basis of their price, platform independence, scalability, output format, accuracy, ease of use and installation, and most importantly, their required processing time (suitable for home consumer viewpoints). Popular commercial as well as free and open source software (FOSS) was studied first. Agisoft’s Metashape (version 1.3.3) was selected as the commercial package, and four applications, i.e., VisualSfM (version 0.5.26), Python Photogrammetry Toolbox with the graphical user interface (PPT GUI) (version 0.1), COLMAP (version 3.1) and Regard3D (version 0.9.2), were selected to represent the FOSS group. This article later describes and illustrates their workflows for a clearer understanding of their working methods, including differences at each step of their production pipelines.
Two datasets were used to evaluate the selected FOSS group, and the results were compared with the reconstruction achieved from a commercial software package (used as ground truth data), using CloudCompare software to assess the average deviation of the produced point clouds. This article concludes with a discussion and assessment of their limitations and strengths, especially with regard to their ease of use, workflow, performance, and any required learning curves (from a non-expert end-user’s perspective). Please note that this study mostly used the default settings offered by the selected FOSS to produce the reconstructions; results may vary for customised settings.
2. Related Works
Recently, image-based modelling has become a growing area of interest in academic circles, and a considerable amount of literature has become available on relevant topics. 3D point clouds can only be produced when the relations between photographs are appropriately established. Several articles describe the overall production of 3D models from a sequence of calibrated or uncalibrated photographs [13,14,15], including details of different techniques for achieving a high degree of accuracy [1,6].
Researchers have developed various methods of producing 3D models from photographs. For example, Debevec et al. [16] developed the ‘Façade system’ and captured a 3D model of MIT campus through the ‘MIT City Scanning Project’. From a sequence of street view images, Xiao et al. [17] developed a semi-automatic image-based approach to reconstruct 3D façade models. Brown et al. [18] later presented another technique based on image-based modelling, which recovered camera parameters to develop 3D scene geometry. Snavely et al. [19] introduced Photo Tourism (later named Photosynth by Microsoft), based on the work of Brown, with a more scalable and robust engine. Agarwala [20] proposed another technique based on panoramas of roughly planar scenes to produce 3D. Some researchers have worked on evaluating photogrammetry software [11,14]; however, these studies mostly concentrated on modelling accuracy and performance.
It is rare to find articles that compare workflows to determine best practice, or that explore effective learning methods and uses intended for members of the general public with limited technical knowledge. This significant issue has also been noted by Remondino et al. [21], Deseilligny et al. [7], and De Reu et al. [22]. However, Ben Nausner [23] worked on the project ‘Virtual Nako’ and discussed the workflow used for 3D acquisition and visualisation of a scene that can finally be embedded in Google Earth for public viewing. Sauerbier [24] described the workflow for photogrammetric processing of aerial images to generate a textured 3D model of Tucume, Peru. Koutsoudis et al. [12] presented a ‘versatile’ workflow for making 3D reconstruction models based on a set of open source software. They used eight different software applications to integrate the workflow and claimed this approach could be useful for those who are on a level similar to a “computer graphics enthusiast experienced with 3D graphics”. Similarly, Deseilligny et al. [7] described another workflow for an automated image-based 3D modelling pipeline for the accurate and detailed digitization of heritage artefacts based on open-source software.
3D documentation of cultural heritage is of prime importance for historic preservation, tourism, and educational and spiritual values [25,26]. Image-based reconstruction software is also claimed to be cost-effective compared to traditional laser scanning methods and can provide an automated system with considerable accuracy in 3D model generation [15,27]. However, costs are incurred for acquiring commercial software licenses, and a level of technical skill and knowledge is necessary. Boochs et al. [28] and Kersten and Lindstaedt [29] demonstrated free or low-cost 3D reconstruction methods for archaeological and heritage objects, but the presented methods were targeted at technical users.
Structure from Motion (SfM) based 3D reconstruction software has become widely used in recent years [30]. Articles that benchmark and compare SfM-based reconstruction, however, rarely cover FOSS solutions. A few studies cover VisualSfM [8,11,21,31] and Bundler/PMVS [11,13,21]; however, these studies are rarely tailored to novice or non-expert users to support their involvement in making 3D reconstructions.
Considering the interest of local communities in showcasing their heritage assets, the issue of ownership, and the significant amount of time and money needed to train, appoint and retain 3D reconstruction specialists, finding a suitable FOSS with an optimized workflow would be of benefit to these communities. This paper aims to examine various FOSS-based applications and their related workflows and presents a comprehensive comparison of their modelling accuracy to help general users who have limited relevant technical knowledge and are interested in heritage documentation and visualization.
3. Selection of the Software
A wide variety of 3D modelling programs based on SfM are available, ranging from simple home-brew systems to high-end professional packages [30]. Keeping in mind target groups who have a limited budget and entry-level skills and who require a robust, easy-to-learn, scalable model building environment, the following criteria were used in selecting the FOSS packages:
Free or low cost.
Simple workflow; easy to install, learn and use.
Supports Graphic User Interface (GUI).
Supports close-range photogrammetry and Structure from Motion (SfM).
This selection process, however, excluded free commercial products with limited functions/capabilities, and cloud-based online services. A list of image-based 3D reconstruction software was found on Wikipedia at the time of writing (https://en.wikipedia.org/wiki/Comparison_of_photogrammetry_software, visited 30.09.2018). This page listed ninety-eight (98) existing software packages, including standalone programs and plugins that can build 3D models from photographs. While we are wary of Wikipedia articles being used as reference literature, in this case, we know of no comparable study that lists such a large number of image-based modelling software (for older reviews, please see http://www.pvts.net/pdfs/ortho/photosftw_purch.pdf, visited 02.10.2018).
During the selection process, some commercial software such as 3Dsom, Autodesk ReMake, PhotoModeler, Metashape, Aspect3D, 3DF Zephyr, and RealityCapture were reviewed. As budget is one of the primary concerns, Agisoft’s Metashape was selected due to its low price (education standard version) and accuracy in the production of 3D point clouds [14,32]. On the other hand, based on the selection criteria, four popular free and open source software (FOSS) packages, i.e., VisualSfM, Python Photogrammetry Toolbox (PPT GUI), COLMAP, and Regard3D, were selected as they also support SfM (Structure from Motion) for 3D reconstructions.
Table 1 presents the various basic modelling features, which are offered by the selected software.
We note here that the study is not a comprehensive comparison of output 3D environments. Mostly the default settings have been used while excluding some other crucial aspects. For example, features such as the capacity to handle large datasets, GPU support, multiple image frames captured at different time moments with alternative settings were not considered in this study.
4. Workflow Study
For the convenience of the reader, a basic review of the workflows or production pipelines of the selected applications is presented first. Their typical critical attributes are discussed later in Section 6.
4.1. Metashape/PhotoScan
Agisoft Metashape is a low-cost commercial 3D reconstruction software from Agisoft LLC, Russia. Metashape automatically builds precise textured 3D models by using digital photos (both metric and non-metric) of an object or scene and is available in Standard and Pro versions. This program works on Windows, Mac OS and Linux operating systems on a local PC, and therefore all data remains with the user. In some cases, it is difficult or even impossible to generate a 3D model of the whole object in a single attempt. To overcome this difficulty Metashape offers options for splitting the set of photos into several separate "chunks" within the project. This way, the default processing/steps can be performed on each chunk separately, and then the resulting 3D models can be combined.
To use Metashape, users must capture images or photographs all around the object (to get all possible views, mostly in a circular fashion). Either masked or unmasked photographs can be added to the workflow (step 1), and image alignment is required before computing (step 2) (Figure 1). However, Metashape recommends masking all irrelevant elements in the source photos (such as the background and any accidental foreground) for better reconstruction results. In step 3, Metashape processes the photographs and builds the geometry of the scene (creates a point cloud). Users can edit and clean the point cloud by cropping or removing extra floating objects before creating the mesh. The density of the point cloud (generated in step 3) can vary from normal and medium to ultra-high, and later, the mesh (step 4) can be calculated using any setting (from fine to high pre-set). Unwanted geometry/faces can be edited at this stage before building the texture (step 5). A user can use additional (optional) features to close holes (step 6) and export the model.
The resulting mesh can be textured with minimum user effort by leaving the default setting (which can be generic, average, fill holes, 2048 × 2048, and standard). 3D models can be exported in various formats (OBJ, 3DS, VRML, COLLADA, PLY, FBX, DXF, and PDF) for further editing and rendering.
4.2. VisualSfM
VisualSfM is an academic open-source software solution that supports Linux, Windows and Mac OS. It was developed by Changchang Wu, who combined several of his previous projects (more information can be found on Dr Wu’s website, http://ccwu.me/vsfm, visited 02.10.2017). VisualSfM does not require the input of any camera information; instead, it provides a GUI, which includes SiftGPU and PMVS. SiftGPU finds the camera positions while PMVS creates a point cloud from the matched photos. VisualSfM does not create a complete reconstruction; it provides a point cloud that requires post-processing.
The workflow starts with adding image files (step 1), either by loading a .txt file containing relative image paths (NView Match) or by directly importing multiple images. VisualSfM can automatically determine the parameters of the camera used to acquire the photos. In the next step (step 2), VisualSfM detects features in each image and finds matches. It provides a variety of different algorithms for feature detection, including Scale Invariant Feature Transform (SIFT) and SiftGPU (a GPU implementation of SIFT) [11]. Matches found in the previous step are later converted to points in 3D space (step 3) through bundle adjustment.
This step can be done by going to the SfM menu and selecting Reconstruct Sparse or by using the Compute 3D Reconstruction shortcut (Figure 2). A denser point cloud can be achieved by using the PMVS/CMVS tool (step 4): select Reconstruct Dense or click on the Run Dense Reconstruction shortcut. This command prompts the user to save the output as a *.nvm file, which VisualSfM creates; it then saves the results in a folder titled *.nvm.cmvs and runs CMVS (Clustering Views for Multi-View Stereo). A *.ply file is also automatically saved in the same location.
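The GUI steps above can also be scripted. The following is a minimal sketch only: it assumes that the VisualSfM executable is on the system path under the name `VisualSFM` and that it accepts the batch modes `sfm` (sparse only) and `sfm+pmvs` (sparse followed by CMVS/PMVS), as described in the VisualSfM documentation. The function name and the paths are hypothetical, and the command is printed rather than executed.

```python
def visualsfm_command(image_dir, output_nvm, dense=True, executable="VisualSFM"):
    """Build the argument list for one batch VisualSfM run.

    Assumption: the first argument selects the mode, 'sfm' for sparse
    reconstruction only or 'sfm+pmvs' for sparse plus dense reconstruction.
    """
    mode = "sfm+pmvs" if dense else "sfm"
    return [executable, mode, str(image_dir), str(output_nvm)]


if __name__ == "__main__":
    # Hypothetical folder and file names; pass the list to subprocess.run
    # only after checking it against the installed VisualSfM version.
    print(" ".join(visualsfm_command("photos/frog", "results/frog.nvm")))
```

Building the command as a list keeps it easy to inspect before it is handed to a process launcher.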
4.3. Python Photogrammetry Toolbox (PPT GUI)
The Python Photogrammetry Toolbox (PPT) is free and open-source software that runs on various platforms (Mac OS, Linux, and Windows). The software was initially developed by Pierre Moulon and Alessandro Bezzi. The toolbox is composed of Python scripts that automate the 3D reconstruction process from a set of pictures. The reconstruction process is mainly performed in two parts: camera pose estimation/calibration and dense point cloud computation. Open-source software, namely Bundler for the calibration and CMVS/PMVS for the dense reconstruction, is employed to perform these computationally intensive tasks. PPT GUI provides a 2-step reconstruction workflow. However, before starting step 1, we recommend checking the camera database. The terminal window will prompt for the camera model and sensor (CCD) width if it is not in PPT’s database.
Step 1: Run Bundler performs the camera calibration and computes the 3D camera pose from the set of images. Despite the automation, the user can control the result by choosing two initial parameters: the image size and the feature detector. Step 2: Run CMVS/PMVS, or run PMVS without CMVS, takes the output of the previous step as input and performs the dense 3D point cloud computation. Running CMVS before PMVS is highly recommended but not strictly necessary; it is also possible to use PMVS directly (Figure 3) [33]. The software automatically writes the outputs (*.ply files) to a ‘temp’ directory, the location of which is reported in the terminal window.
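To illustrate why the sensor (CCD) width matters for the calibration step, the sketch below converts a focal length from millimetres to pixels, which is the usual way such a database entry is turned into an initial focal length estimate. This is an illustration under stated assumptions, not PPT's actual code: the function name is hypothetical, the sensor width (22.3 mm) and image width (3456 px) are those of the camera reported in Section 5.1, and the 10 mm focal length is an assumed setting at the wide end of that lens.

```python
def focal_length_in_pixels(focal_mm, sensor_width_mm, image_width_px):
    """Convert a focal length in millimetres to pixels.

    focal_px = focal_mm * image_width_px / sensor_width_mm
    """
    if sensor_width_mm <= 0:
        raise ValueError("sensor width must be positive")
    return focal_mm * image_width_px / sensor_width_mm


if __name__ == "__main__":
    # Assumed example values: 10 mm focal length, 22.3 mm sensor, 3456 px width.
    focal_px = focal_length_in_pixels(10.0, 22.3, 3456)
    print(f"initial focal length estimate: {focal_px:.1f} px")
```

A wrong sensor width scales this estimate proportionally, which is one reason to check the camera database before running the first step.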
4.4. COLMAP
COLMAP is a general-purpose Structure from Motion (SfM) and Multi-View Stereo (MVS) pipeline that supports both graphical and command-line interfaces. The software was developed by Johannes L. Schoenberger and is licensed under the GNU General Public License v3 or later (source: https://colmap.github.io/license.html, visited 03 Oct. 2018). COLMAP offers a single-click Automatic Reconstruction with built-in default settings. This automatic process is faster than the step-by-step process; however, it trades off reconstruction quality. On the other hand, a manual step-by-step process may provide more flexibility in settings and better accuracy in dense reconstruction.
The user needs to run the *.bat file to open the program and then use the File menu to open or create a new project (Figure 4). The user must tell the program where the images are located and where to store the database (step 1). The user starts with feature extraction (step 2) under the Processing tab, followed by ‘feature matching’ (step 3). Then the user needs to reconstruct the camera positions and produce a sparse point cloud. The Start reconstruction command is located under the Reconstruction pull-down menu (step 4). COLMAP displays a 3D view depicting cameras being added to the scene while it simultaneously forms the sparse point cloud. After completing this stage, the sparse cloud can be exported. A bundle adjustment can be run before densification. Dense reconstruction (step 4) contains three steps, i.e., undistortion, stereo, and fusion. The final step (step 5) is meshing. All models can be exported in *.nvm, *.out, *.ply, and *.wrl file formats.
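The same step-by-step sequence can be expressed through COLMAP's command-line interface. The sketch below only builds the command lists and does not run them. The subcommand and option names follow the command-line documentation of recent COLMAP releases and are an assumption for the version 3.1 used in this study, where names may differ; the function name, folder layout, and paths are hypothetical.

```python
def colmap_commands(image_dir, workspace, max_image_size=None, colmap="colmap"):
    """Return the manual COLMAP pipeline as a list of argument lists."""
    database = f"{workspace}/database.db"
    sparse = f"{workspace}/sparse"
    dense = f"{workspace}/dense"
    fused = f"{dense}/fused.ply"

    stereo = [colmap, "patch_match_stereo",
              "--workspace_path", dense, "--workspace_format", "COLMAP"]
    if max_image_size is not None:
        # Assumed option name; limits image size to reduce GPU memory use.
        stereo += ["--PatchMatchStereo.max_image_size", str(max_image_size)]

    return [
        # Feature extraction and matching
        [colmap, "feature_extractor",
         "--database_path", database, "--image_path", image_dir],
        [colmap, "exhaustive_matcher", "--database_path", database],
        # Sparse reconstruction (camera poses and sparse point cloud)
        [colmap, "mapper", "--database_path", database,
         "--image_path", image_dir, "--output_path", sparse],
        # Dense reconstruction: undistortion, stereo, fusion
        [colmap, "image_undistorter", "--image_path", image_dir,
         "--input_path", f"{sparse}/0", "--output_path", dense,
         "--output_type", "COLMAP"],
        stereo,
        [colmap, "stereo_fusion", "--workspace_path", dense,
         "--workspace_format", "COLMAP", "--input_type", "geometric",
         "--output_path", fused],
        # Meshing
        [colmap, "poisson_mesher", "--input_path", fused,
         "--output_path", f"{dense}/meshed-poisson.ply"],
    ]


if __name__ == "__main__":
    for command in colmap_commands("photos/frog", "work/frog", max_image_size=750):
        print(" ".join(command))
```

Keeping each stage as a separate command mirrors the manual GUI process, so a failed stage (for example, stereo running out of GPU memory) can be rerun with different settings without repeating the earlier stages.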
4.5. Regard3D
Regard3D is another free and open source structure from motion program that supports multiple platforms (Windows, OS X, and Linux). Regard3D has a simple and straightforward GUI. The details of the executed tasks are highlighted in the left tree view (Figure 5).
Experimenting with settings is therefore more accessible, since the user only has to click on a completed task to see a list of the arguments used to generate it, as well as the running time of that step. As with the other software applications, the user needs to set a project path and a name to start a project. Photographs must be added (step 1) for the software to compute the matches (step 2). Next is camera registration, in other words, the process of determining each camera’s position and orientation in the scene (step 3), which can be done by selecting the match results and clicking ‘Triangulation’.
Based on this simple sparse point cloud (which consists of points that are unevenly distributed over the scene and generated from the corresponding camera positions), users can “densify” the triangulation result (step 4). From the tree view, it is possible to highlight the results of step 4 and choose ‘Create dense point cloud’. The dense cloud (*.ply, *.pcd) can be exported at the end of this step. Users can also generate a mesh by clicking Create Surface (step 5). Please note that if users select CMVS/PMVS, Poisson reconstruction is the only option offered. On the other hand, two colorization methods are offered: coloured vertices or texture. The user can now export the generated surface as a *.obj file or directly export to MeshLab in *.mtl file format (step 6).
5. Performance Study
This section presents a comparative study of the four pre-selected FOSS based 3D reconstruction applications on the basis of their produced point cloud, computation time, and reconstruction accuracy.
5.1. Dataset and Computation
The primary goal of this article is to evaluate and compare the efficiency, accuracy and constraints of the selected software to provide insight and help in choosing the most appropriate software, in general. To conduct this evaluation, a repository of seven objects (or data sets) was used for a pilot study. Two objects were selected for the final comparison based on image acquisition response time and a satisfactorily dense point cloud (with minimum holes). Two data sets, i.e., 22 photographs of a sculpture (frog) and 50 photographs of a historic building elevation (Kidogo Arthouse, Perth, Western Australia) were used to run the full test. The photographs were captured with a resolution of 3456 × 2304 pixels, in an outdoor setting by a Canon EOS 600D camera (sensor size 22.3 × 14.9 mm, sensor type CMOS, lens 10.0–20.0 mm). The reconstructions were computed on a standalone PC with Intel i7-6700 CPU, 3.4GHz system with 16GB RAM, and Quadro K620 graphic card with 2GB VRAM. The operating system was Windows 7 Enterprise.
5.2. Comparing Methods
To determine the accuracy of the 3D point clouds derived from the different software, the two data sets (sculpture and building façade) were used for comparison. The comparison was made between the point cloud produced by each FOSS application and a reference mesh surface produced by the commercial software (i.e., Metashape, formerly known as PhotoScan).
The idea presented by Schöning and Heidemann [14] was used for the comparison study. Since a ground truth is required for benchmarking the FOSS applications, and as an alternative to LIDAR data and related technology, the point cloud generated by Metashape was used as ground truth (Table 2). This also means that the errors in the Metashape reconstruction were not taken into consideration. This accuracy should suffice for general modelling purposes such as documentation and visualization (but not for scientific analysis). The Metashape reconstructions will be referred to as ‘ground truth’ objects, while the outputs from each FOSS will be referred to as ‘reconstructed’ objects. The comparisons were made with the free and open source software CloudCompare, and the calculated results are summarized in Table 3 and Table 4.
The different point clouds were co-registered manually in CloudCompare. Additionally, we note that the image-based models were not scaled, meaning that they do not reflect the real-world structure’s size. The reconstruction models were scaled in relation to the ground truth and registered using an ‘iterative closest point’ (ICP) algorithm [34] with a target error difference (RMS) of 1.0 × 10⁻²⁰, and a random sampling limit of 60,000 points was applied. Once the models were registered, the minimal distance from every point to any triangular face of the meshed model (i.e., the ground truth) was computed. The normals of the meshes were used in calculating the distance. These distances were visualized using a ‘pseudo schematic colour heat map’ (or heat map), where the range is based on a blue-green-red scheme. The generated heat maps of the given datasets are presented in Table 2 and Table 3. From the distances, the mean and standard deviation of the distance distribution for the whole object were also calculated. A Chi-square distribution was assumed for modelling the distance distribution between the ground truth and the reconstruction. The computation time of each solution was measured and is also presented in Table 2 and Table 3.
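The distance computation described above can be sketched as follows. This is a minimal, brute-force illustration and not CloudCompare's implementation: it assumes the clouds are already scaled and registered, it returns unsigned distances (it does not use the mesh normals), and all function names and sample coordinates are hypothetical. The closest-point routine is the standard region test against the vertices, edges, and interior of a triangle.

```python
import math
import statistics


def _sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])


def _dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def _along(origin, direction, t):
    """Return origin + t * direction."""
    return tuple(o + t * d for o, d in zip(origin, direction))


def closest_point_on_triangle(p, a, b, c):
    """Closest point to p on the (non-degenerate) triangle abc."""
    ab, ac, ap = _sub(b, a), _sub(c, a), _sub(p, a)
    d1, d2 = _dot(ab, ap), _dot(ac, ap)
    if d1 <= 0 and d2 <= 0:
        return a  # closest to vertex a
    bp = _sub(p, b)
    d3, d4 = _dot(ab, bp), _dot(ac, bp)
    if d3 >= 0 and d4 <= d3:
        return b  # closest to vertex b
    vc = d1 * d4 - d3 * d2
    if vc <= 0 and d1 >= 0 and d3 <= 0:
        return _along(a, ab, d1 / (d1 - d3))  # on edge ab
    cp = _sub(p, c)
    d5, d6 = _dot(ab, cp), _dot(ac, cp)
    if d6 >= 0 and d5 <= d6:
        return c  # closest to vertex c
    vb = d5 * d2 - d1 * d6
    if vb <= 0 and d2 >= 0 and d6 <= 0:
        return _along(a, ac, d2 / (d2 - d6))  # on edge ac
    va = d3 * d6 - d5 * d4
    if va <= 0 and d4 - d3 >= 0 and d5 - d6 >= 0:
        t = (d4 - d3) / ((d4 - d3) + (d5 - d6))
        return _along(b, _sub(c, b), t)  # on edge bc
    denom = va + vb + vc
    return _along(_along(a, ab, vb / denom), ac, vc / denom)  # inside the face


def cloud_to_mesh_distances(points, vertices, faces):
    """Unsigned distance from each point to the nearest triangle of the mesh."""
    return [
        min(math.dist(p, closest_point_on_triangle(p, vertices[i], vertices[j], vertices[k]))
            for i, j, k in faces)
        for p in points
    ]


def summarize(distances):
    """Mean and (population) standard deviation of the distances."""
    return statistics.fmean(distances), statistics.pstdev(distances)


if __name__ == "__main__":
    # Hypothetical one-triangle 'ground truth' mesh and two reconstructed points.
    vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    faces = [(0, 1, 2)]
    cloud = [(0.25, 0.25, 1.0), (0.25, 0.25, 3.0)]
    print(summarize(cloud_to_mesh_distances(cloud, vertices, faces)))
```

The brute-force loop over all faces is only practical for small inputs; tools such as CloudCompare rely on spatial indexing to handle clouds with many thousands of points.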
5.3. Result
For the first dataset (frog sculpture, 22 photographs), three out of four FOSS applications managed to compute the data, and these three results were then compared with the ground truth (Table 2). This shows that these three tools yield useful results with this data set. PPT GUI produced a considerable amount of point cloud data, but in three partial sets, which prevented comparing its accuracy with the ground truth data. The heat map and the Chi-square/Gaussian histogram distributions in Table 3 show that the closest match was produced by COLMAP, followed by Regard3D and VisualSfM. While VisualSfM produced a larger deviation distance from the ground truth, it took considerably less computation time. COLMAP generated the closest match with the ground truth; however, it took the longest time and did not produce any textures. Overall, Regard3D seems to be the most well-rounded performer here.
Fifty photographs were used for the second data set, but the acquired mesh developed holes in the reconstruction (Table 4). Except for PPT GUI, which produced the noisiest clouds, the other three software packages produced consistent results and had similar distance distributions with reference to the ground truth. The heat map also indicates only minor deviation between the produced results. COLMAP and Regard3D managed to capture most of the details without much noise, as seen in the histogram. On the other hand, VisualSfM failed to produce the roof of the model. Similar to the previous experiment, COLMAP took three times longer for computation than Regard3D, and did not produce any texture.
6. Discussion
Photogrammetric 3D modelling for heritage documentation is a well-studied topic. This paper has studied the workflows and compared the results of the 3D reconstructions achieved from the four FOSS-based applications. This section summarizes the understanding gained from the study.
6.1. Workflow
Metashape is one of the most popular commercial packages; hence we used it to produce the ground truth data. It has a robust but straightforward pipeline that can produce accurate results [14,32]. Metashape has the capability to split the set of photos and manually remove extra point clouds, and also has some nice extra features such as the ability to automatically close holes and directly export to Sketchfab.
In general, VisualSfM is an excellent tool for taking something static, converting it, and then importing point clouds into a 3D environment, but it requires additional processing and human skill to make a perfect digital environment. Additional tools (PMVS/CMVS) are also required to run VisualSfM; they need to be downloaded from their respective sites and copied to the local folders. This might be tricky for new and non-technical users; however, the rest of the installation process is mostly automatic. If some cameras fail to align correctly, users must go back, shoot (or collect) more photographs, and restart the computation process.
It is also possible to manipulate the initial reconstruction (i.e., the sparse reconstruction) while it is in progress, removing bad points/cameras that have been added in the wrong position/orientation. The software automatically saves the sparse cloud and dense cloud to the user-designated folder, together with the results of the SIFT and matching processes (*.sift and *.mat files). Thus, the user does not need to re-sift or re-match images that have already been analyzed. However, adding new images to improve a previously completed reconstruction takes as much time, because it requires analysing a new data set. VisualSfM does not offer any built-in editing or noise cleaning option; therefore, external software is required.
Python Photogrammetry Toolbox (PPT GUI) presents a relatively simple user interface but requires some time and effort to learn. It is a little confusing that the GUI suggests starting with the numeric sequence, i.e., ‘1. Run Bundler’ as the first step. However, a user must insert the photo path and camera data (if it is not in the built-in database) before running the bundler, which is supposed to be the first step. The bundle adjustment output is automatically saved in the temporary directory (by default). Thus, it is a good idea to copy the OSM-* directory to somewhere safe because it will be lost the next time the computer boots. The main drawback is that the GUI does not provide any visual cues of the generated point cloud or mesh, so users must use an external viewer or editor such as MeshLab to view or edit the output.
COLMAP offers a simple one-step reconstruction, which may be convenient as a quick solution for novice users. The manual or step-by-step process, on the other hand, offers better reconstruction quality with many adjustment options. However, the ‘stereo’ step of the dense reconstruction process may crash the program if the video card memory is inadequate. Reducing memory use during the dense reconstruction process is therefore recommended (source: https://colmap.github.io/faq.html#faq-dense-memory, accessed 23 Dec. 2018). Downsizing the photographs to between 750 and 2000 pixels may also solve this issue.
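As a worked example of the downsizing suggested above, the sketch below computes the new image dimensions when the longest side is limited to a given number of pixels while the aspect ratio is preserved. It assumes that the figures 750 and 2000 refer to the longest image side, in line with how a ‘max_image_size’ limit is normally understood; the function name is hypothetical, and the 3456 × 2304 resolution is that of the photographs described in Section 5.1.

```python
def downsized_dimensions(width, height, max_side):
    """Scale (width, height) so that the longest side is at most max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # already small enough, leave unchanged
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    for limit in (750, 2000):
        new_width, new_height = downsized_dimensions(3456, 2304, limit)
        print(f"limit {limit}: {new_width} x {new_height} pixels")
```

Because the pixel count falls with the square of the scale factor, limiting the longest side to 750 reduces each photograph of this dataset to under 5% of its original number of pixels, which explains the large effect on memory use.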
Additionally, the ‘stereo’ step takes quite a long time to compute. Nevertheless, COLMAP offers a plethora of variables to tweak, and it may be possible to minimise the computation time and still achieve reasonable results. Aside from resolving the memory issue by lowering ‘max_image_size’, the process is mostly straightforward. COLMAP can produce a point cloud movie animation as well. The only drawback to mention here is that the meshing process does not produce a texture; instead, it applies colours to each vertex. This output may appear a little muddy, but a user can export the cameras as a bundle or as *.nvm files, import them into MeshLab, and generate a texture.
Regard3D offers a simple one-step installation process, and the installer file can be downloaded from its official website. Regard3D is completely free to use, with a simple workflow that offers a vertical menu for following the various steps. Additional options are offered on the right side of the GUI. Although it does not offer mesh or point cloud editing, it does offer surface reconstruction and texturing. The only drawback is that Regard3D uses a modified version of openMVG, which means that there is frequently a delay in getting the most recent version. However, this may not be an issue for non-technical users and beginners.
In general, depending on the computation method chosen, image-based modelling software typically follows a six-step approach to produce 3D reconstructions or 3D models (Table 5). VisualSfM and PPT GUI automatically save the point cloud in a default location, whereas COLMAP, Regard3D, and Metashape ask for user input to export the cloud. Regard3D can produce a mesh (using Poisson/FSSR) and can colour vertices or produce textures. COLMAP instead applies colours to each vertex and cannot produce textures.
6.2. Reconstruction
Three of the four software packages were able to compute models from the two datasets, as presented in Table 3 and Table 4. This paper presented a qualitative ranking based on the computation time and heat maps of distances. The heat maps make it clear that, in both cases, COLMAP and Regard3D produced the best results, i.e., few deviations from the ground truth (i.e., the Metashape model). However, PPT GUI produced the ‘noisiest’ clouds and failed to produce a single set of point clouds from the sculpture data set. The pipeline is often fragile and returned several unsuccessful outputs while dealing with a scale factor to run Bundler. Additionally, the lack of documentation and user support is still a significant issue, and this might deter many novice users from selecting PPT GUI.
COLMAP is free software that offers a pleasant GUI and plenty of tweaking options. However, in our experiment, it crashed several times (with both datasets) during the dense reconstruction phase. Dense reconstruction only completed once ‘max_image_size’ in the ‘Stereo’ phase was set to 750. The automatic reconstruction did not crash, but it produced less impressive results. Regard3D, on the other hand, was found to be more balanced in computation, documentation, GUI and tweaking options. Moreover, it is entirely free and can produce surfaces and textures. None of the FOSS applications offers any editing of the point cloud or mesh; the user therefore needs an external application such as MeshLab (free software, which can be downloaded from www.meshlab.net) for further cleaning and editing of the 3D output.
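For readers who prefer to script COLMAP rather than use its GUI, the ‘max_image_size’ workaround could be sketched as follows. This is an illustration under assumptions, not the procedure used in the study, which set the value in the GUI: it assumes the command-line option `--PatchMatchStereo.max_image_size` corresponds to the ‘Stereo’ setting mentioned above, and the helper name `patch_match_stereo_command` and the workspace path `dense` are hypothetical. The function only builds the command; it does not run COLMAP.

```python
import shlex

def patch_match_stereo_command(workspace_path, max_image_size=750,
                               colmap_binary="colmap"):
    """Build (but do not run) a COLMAP dense-stereo command.

    max_image_size=750 is the value reported above; the mapping from the
    GUI field to this command-line option is an assumption.
    """
    return [
        colmap_binary, "patch_match_stereo",
        "--workspace_path", str(workspace_path),
        "--PatchMatchStereo.max_image_size", str(max_image_size),
    ]

# "dense" is a placeholder for an undistorted COLMAP workspace folder.
command = patch_match_stereo_command("dense")
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine where COLMAP is installed. Lowering the value reduces memory use at the cost of detail in the dense cloud.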
Based on this comparative study, Regard3D is our preferred application because of its academic licensing, runtime performance and the quality of the 3D models it produced. COLMAP is our second most recommended application, followed by VisualSfM and PPT GUI. However, with regard to our benchmarking, a few issues need to be raised. First, the ‘ground truth data’ used to conduct the accuracy test was obtained from another software application, Metashape; no laser-scanned (LiDAR) data was used. Furthermore, we used the default settings of the FOSS applications; different settings would affect their output and any subsequent results.
7. Concluding Remarks
Identifying a specific 3D image-based modelling program as the ‘best’ free and open source software (FOSS) solution is a difficult task. The four selected FOSS applications differ in how they handle datasets and in their tweaking options, computation time, usability, GUI, and learning curve. We tested these applications with seven different challenging 3D objects (datasets) as a pilot study; of these, two datasets were chosen for the final test because they produced the most successful results. The workflows of these applications were studied first, and their reconstruction results were then evaluated against ground truth objects on the basis of distance measurement and computation time.
The most promising finding of the study is that all of these programs have very similar basic workflows. PPT GUI seems less suited to beginners in both its GUI and its output. Both COLMAP and Regard3D offer a sophisticated and clean GUI, which could support a wide range of users’ needs. However, both COLMAP and PPT GUI crashed several times during the study. VisualSfM, on the other hand, produced relatively good results, but its GUI is non-intuitive and the learning curve seems a little steep for new users. Considering the process automation, the processing time, the GUI, the density of the point cloud and, not least, the accuracy, we rank the software studied, starting with the best, as Regard3D, COLMAP, VisualSfM and PPT GUI.
This paper set out to investigate FOSS-based 3D reconstruction solutions while keeping in mind the potential benefits for small museums, heritage institutes, interested communities, and local groups who currently lack high-end technological resources and related skills but are interested in developing 3D heritage objects for documentation, visualisation, knowledge sharing and showcasing heritage assets. Regard3D, a free and open source application, is therefore suggested as the most convenient solution because it is easy to install, requires no programming knowledge to use, can produce good results with relatively low computation time, and offers a smooth learning curve. The official website also provides detailed documentation and tutorials. Where the main purpose of a project is not scientific analysis or study, and the project objectives demand only visualization and presentation, we believe this investigation will help non-expert users to understand and select the most suitable software for producing image-based 3D models at low cost.