1. Introduction
The assurance of providing adequate protection of cultural heritage sites, as well as preserving their authenticity, can only be obtained by creating a comprehensive inventory of a given site. This includes the definition of the type, shape, dimensions and geospatial location of the given structure. One of the ways to create such an inventory for a cultural heritage building is based on its photogrammetric documentation in the form of a three-dimensional model.
Generating a 3D model of a historical structure using photogrammetric methods can be troublesome due to the height of the structure. Whether a camera is used or a terrestrial laser scanner, too great a height of the structure will greatly limit the possibility of using terrestrial photogrammetry techniques. In such instances, better results can be obtained by integrating terrestrial imagery data with data acquired from a low altitude flights. Using an Unmanned Aerial Vehicle (UAV) as a platform for the sensor ensures that relatively large scale imagery can be acquired, which warrants high-quality end products.
In recent years UAVs are being increasingly used in architectural photogrammetry, being a fundamental module of management and conservation of national cultural heritage sites [
1]. Together with terrestrial laser scanning [
2], imagery is the main source of information when producing cultural heritage inventory. The purpose of photogrammetric systems based on UAVs, in which the data acquisition module is a video camera, is to acquire data to develop orthophotomaps [
3,
4], digital terrain models [
5,
6,
7], 3D city models [
8], 3D models of buildings [
9] and sculptures [
10]. UAV systems prove particularly useful where the access to an architectural object is difficult, which may be due to the topography or close proximity to other architectural objects. The complexity of the shapes of buildings is also a frequent obstacle during the implementation of photogrammetric techniques. It is necessary to use a relative image orientation obtained from multiple camera positions or a terrestrial laser scanner. In the case of an insufficient number of camera positions, detection of tie points in the images may not be feasible. Then, it is very helpful to use a sequence of video images. Finding similarities in adjacent images is much easier than in images taken from distant camera positions [
11]. In addition, in the case of video data, the redundant number of video images makes it possible to eliminate blurred images [
12,
13], images with a low radiometric quality index [
14] or those, for which the values of the exterior orientation parameters significantly differ within an image sequence which may be a result of flight instability.
In the literature, one can often find descriptions of systems involving the integration of data from the two altitudes: terrestrial and aerial. These methods often relate to the integration of point clouds from terrestrial and aerial laser scanning [
15], point clouds and imagery data [
16] or image data from terrestrial and aerial levels. An example includes the studies conducted by Bolognesi et al. [
17]. The authors have developed a 3D model of a historical architectural structure based on image data acquired using a Canon EOS M high resolution digital camera mounted on UAV platforms and from ground level. In studies conducted by Püschel et al. [
18] a method was developed of documenting an architectural monument, consisting of the integration of terrestrial images with images captured using a UAV. The image data were recorded by a non-metric digital camera in video mode.
The large majority of video image sequence orientation methods are based on Structure from Motion (SFM) algorithms [
19]. SFM relies on determining the 3D point coordinates and the camera projection matrix simultaneously, based on homologous points measured on a large number of images [
20]. It should be mentioned that the Rodriquez matrix method can be used to solve the problem of nonlinearity models of absolute and relative orientation. One of the advantages this method is the lack of gimble effect [
21].
To ensure high accuracy of orientation of the video images, methods of image orientation based on the fundamental equation of photogrammetry, i.e., the collinearity equation should be used [
22].
This paper presents the issues which occur when performing an orientation of two images, of which one was acquired with a tilt angle close to 90°. Such a situation takes place quite often when conducting photogrammetric measurements of buildings based on video imagery. During such measurements, first the side walls of the building are filmed and then the camera is put into aerial orientation in order to acquire imagery from the roof. The aim of this research was to assess the possibility of conducting an orientation procedure of video imagery, in which the external orientation for the first image was typical for aerial photogrammetry whereas the external orientation of the second was typical for terrestrial photogrammetry. This type of orientation (
Figure 1) of video images is a special case of processing two images for photogrammetric documentation of architectural structures. It is closely associated with the problem of integrating aerial and terrestrial imagery data. This issue is especially important, due to the fact that it occurs very often when conducting photogrammetric measurements of buildings. The problem of integrating aerial and terrestrial imagery data stems from using two different coordinate systems (and two rotation matrices): typical for aerial photogrammetry and for terrestrial photogrammetry in order to photogrametrically process both the walls and the roof of a given architectural structure in one unified coordinate system.
2. Proposed Method of Orientation for Terrestrial and Aerial Video Images
Assuming, that the axes of the terrestrial coordinate system are parallel to the building's walls, a rotation occurs about one of the axes of the terrestrial coordinate system X or Y (when the video camera makes the transition from aerial orientation to terrestrial orientation). In this case, the tilt angle (ω or ) for the terrestrial image is close to 90°.
By adopting the rotation matrix and maintaining the order of the rotation ω
κ [
22]:
The rotation matrix coefficient equations can be greatly simplified and take on the form shown in (
Table 1):
By implementing the inverse of the collinearity equations for a stereo made up of video frames with two different orientations (aerial and terrestrial), the following set of equations was created:
After substituting the third and fourth Equations from Equation (2) with the simplified equations (
Table 1), the set of collinearity equations for the aerial (first image) and terrestrial (second image) imagery will look as follows, if
”=90°:
For ω” = 90° the set of equations will look as follows:
where:
X, Y, Z—ground control point coordinates
X0’, Y0’, Z0’, ω’ ’ κ’—exterior orientation parameters of aerial image
X0’’, Y0’’, Z0’’, ω’’, ’’, κ’’—exterior orientation parameters of terrestrial image
x’, y’—image coordinates of a point on the aerial image
x’’, y’’—image coordinates of a point on the terrestrial image
ck, x0, y0—camera's interior orientation parameters
a’11, …, a’33—rotation matrix coefficients for the aerial image
a’’11, …, a’’33—rotation matrix coefficients for the terrestrial image
b’’11, …, b’’33—coefficients of the simplified rotation matrix for the terrestrial image when ” = 90°
c’’11, …, c’’33—coefficients of the simplified rotation matrix for the terrestrial image when ω” = 90°.
The above Equations (3) and (4) can be rewritten in a simplified form:
After transforming Equation (5), new equations are obtained for calculating the ground coordinates X, Y, Z of a point with known image-space coordinates:
In order to determine the ground coordinates XYZ, those equations for which XYZ can be calculated independently from other ground coordinates were selected. When the following are known: the interior orientation parameters of the camera: ck, x0, y0, the exterior orientation parameters of the video frame with the aerial orientation: X0’, Y0’, Z0’, ω’ ’ κ’, and with the terrestrial orientation: X0”, Y0”, Z”, ω”, ”, κ”, it is possible to calculate the ground coordinates of points, whose image coordinates had been measured on the video frames (x’, y’, x”, y”) based on the simplified collinearity equations.
Such a simplification of the collinearity equations is done to limit the amount of intermediate calculations when determining the coordinates of architectural structures in a ground coordinate system. It had therefore been decided, that the video camera's lens errors, such as radial and tangential distortion, which could increase the number of calculations, would not be taken into account in this prototype version of system.
Of course, the situation in which the difference in tilt angles of the camera during video registration is equal to 90° still remains only theoretical. The possibility of processing such a stereoimage was verified in a laboratory experiment. It was feared that because of the low coverage between the images and a large tilt angle, the absolute orientation of video images could be impossible.
3. Test Data Used in the Study
In order to verify these assumptions and check the possibility of conducting an absolute orientation of a pair of images for which the difference between tilt angle is near 90°, research was carried out on the test object in the shape of a 1 m × 2 m × 1 m cuboid. The test object was filmed with a Sony NEX-VG10 E video camera with Sony E 16 mm F2.8 fixed focal length lens [
23] (
Table 2).
A simulated UAV flight was performed to acquire the test object video data from the terrestrial and aerial level. Six video frames were selected from the videos: three with a terrestrial orientation and three with an aerial orientation (
Figure 2).
Based on the acquired image data, the following processes were performed: (1) orientation of video frames from the terrestrial level, creating stereo: 1–2 and 2–3; (2) aerotriangulation of video frames from the UAV level, creating stereo: 10–11 and 11–12; (3) orientation of video frames acquired from terrestrial and UAV levels creating stereo 2–10. For the stereo 2–10 the difference in camera rotation angle about the Y axis was 74°.
Table 3 presents values of angles of video frames' exterior orientation parameters. Values of angles have been calculated based on standard collinearity equations.
Table 4 below shows the base-distance ratio of each of the stereos as well as the spatial resolution in the XY plane (∆XY) of the geometric model and along the Z axis (∆Z) calculated from the following equations:
where:
For all the stereos, a planar resolution (ΔXY) of 1 mm was obtained. Moreover, the resolution (ΔZ) of the 2–10 stereo proved to be the lowest compared to other stereos because of the favourable base-to height ratio. Research conducted on the test object confirmed the validity of the assumption that it is possible to perform the orientation of video images acquired using a video camera mounted on a UAV.
4. Experiment on an Architectural Structure
Positive orientation results of video images obtained for the test object prompted the authors to investigate further. To this end, a photogrammetric flight of an unmanned mini-copter was performed over a building with a sloping roof and rectangular base with the following dimensions: 13 m × 20 m × 6 m (
Figure 3).
4.1. Photogrammetric Flight Planning
In order obtain the video data, a Sony NEX-5N non-metric digital camera and Sony E 16 mm F2.8 lens were used. The camera weight with the lens was 336 g. The following factors were decisive in the choice of altitude: building dimensions, GSD, and terrestrial frame dimensions. The key equipment and flight plan parameters are listed in
Table 5.
4.2. Choosing Targets for the Photogrammetric Network
Due to the low accuracy of the GPU-IMU systems in low-budget UAVs, it is recommended to use control points. In order to select the appropriate targets to act as control points, four groups of targets were designed: crosses, circles, squares and checkerboards. It was assumed that the targets for both aerial and terrestrial level applications should allow for clear identification of their centres, regardless of the distance and angle of imaging. At the testing stage, several possibilities for visualizing the GCP signals were examined. After considering both criteria, the checkerboard targets proved to be the best. A design was created for the placement of photogrammetric terrestrial network points on building walls and around the measured structure (
Figure 4). The signals were made from a white board painted with black paint. The targets placed on the walls of the building were printed on A4 sheets, then laminated with non-reflective foil (
Figure 5). The signals were made in two sizes: 30 cm × 30 cm (horizontal network) and 20 cm × 20 cm (points on the wall).
The terrestrial photogrammetric network consisted of 24 control points and 10 check points. Targets were also placed on the walls of the building—a total of 47 signals on the walls of the building and 12 signals on the roof (
Figure 6). The coordinates of the points were measured using the Topcon Total Station with an error of ±6 mm.
Two different types of UAV flights were conducted. The first flight over the building, with the aerial orientation, was performed along 27 pre-designed POIs (
Figure 7a). The second flight, with the terrestrial orientation, was performed along the building's four walls (
Figure 7b). In this case the average flight altitude was 3 m above the ground level. For each wall, a set of two flights was performed at two different camera tilt angles.
Due to insufficient data (the targets were covered by vegetation), the south east (SE) wall of the building was not included further in this research. Pairs of images had been chosen from the acquired image sequences from both the aerial and terrestrial orientations to create stereopairs (
Figure 8). The overlap area between images of two different orientations (aerial and terrestrial) includes the following features of the building: the basement, fragments of the side walls and the roof's edge.
4.3. Control Point and Tie Point Configurations
Modern UAV systems are all equipped with a GPS/INS receiver, which records information about the location of the camera during acquisition. However, the precision of these systems is sometimes so low, that it is better to determine the exterior orientation parameters of an image using space resection, i.e., based on control point measurements. When performing measurements in field conditions, it is sometimes impossible to place measurement targets where they are most needed as the surface may be inaccessible (with very high structures) or it may simply be forbidden to do so (with cultural heritage structures). In the research work, three configurations of control points and tie points were considered: (a) control points on the facade (b) control points on the facade and around the object (c) control points around the object (
Figure 9).
The exterior orientation parameters were determined from control point measurements, using three different configurations (
Figure 9). When transitioning from the aerial orientation to the terrestrial orientation, the angle which changes by close to 90° will differ between the different walls. Namely, for the north eastern (NE) and south western (SW) walls it will be the
angle, whereas for the north western (NW) wall it will be the ω angle.
Table 6 and
Table 7 present values of angular exterior orientation and differences between the linear exterior orientation parameters. These values have been calculated based on standard collinearity equations.
The stereo orientation of video frames from two different levels was successful. It turned out to be impossible to perform an absolute orientation of aerial and terrestrial imagery with the control point distributed in accordance with configuration III—i.e., when control points are located only on the ground around the building.
This research has shown that for a pair of images with an aerial and terrestrial exterior orientation and having an unfavourable configuration control points, it is impossible to obtain reliable values for the exterior orientation parameters calculated using only control points located around the building (configuration III).
When using the other two control point configurations (I—control points only on the walls and II—control points on the walls and around the building) similar values for the exterior orientation parameters were obtained (differences for the linear parameters of between 0–0.1 m and 0.5°–15° for the angular parameters).
5. Results of the Experiment
A verification was performed, whether it is possible to apply the simplified collinearity equations to determine the X, Y, Z coordinates of the building. Matlab software was used to run an algorithm for calculating the ground coordinates of points (X, Y, Z) based on classic equations sets Equation (2) and the simplified collinearity Equations (3) and (4). The process of determining coordinates used the following input data: known interior and exterior orientation parameters of the camera and the measured image coordinates of the points on the building's walls. The calculated coordinates were compared to theoretical values, which is shown in
Table 8 below.
Figure 10 and
Figure 11 below illustrate the ratios of the RMSE of the point coordinates (X, Y, Z) calculated using a classic set of collinearity equations to the RMSE of the point coordinates (X, Y, Z) calculated using the simplified collinearity equations.
Similar error values were obtained for all measured walls for tilt angles , ω ≈ 85°. This means that for tilt angles close to 90°, the mean errors of determining coordinates based on simplified collinearity equations are close to 10 cm. The closer the tilt angle is to 90°, the RMSE value increases for coordinates calculated using the classic set of equations.
An analysis of the results obtained using the simplified collinearity equations shows, that the worst results were achieved for a = 80° tilt angle. Slightly better results were obtained for the ω angle of the same value (80°). This means that when simplifying a collinearity equation using ω = 90°, the new collinearity equation is simplified by a smaller number of rotation matrix coefficients, than when the same is done with angle . It should be noticed, that when the tilt angle = 90°, 4 out of 5 rotation matrix coefficients are equal to zero (0). However, when the tilt angle ω = 90°, only one of these coefficients is equal to zero (0).
The greatest RMSE m0Z, both for the classic equation set and for the simplified collinearity equations, had been obtained for the Z coordinate. This is due to the fact that the Z coordinate is calculated based on the values of the other two coordinates (X and Y). Slightly higher errors had been observed for configuration II of the distribution of control and tie points.
6. Determining the Relation between the Established and Actual Tilt Angle of the Video Camera
The research described above had shown that for tilt angles and ω close to 90°, the calculated ground coordinates of points on the building using the simplified set of equations are very close to the results obtained using the classic set of equations. This means that when performing a flight over a building using a copter, (in case of using the simplified equations) the tilt angle of the camera (ω) for all video frames should be within the 85°–90° range.
Therefore, for low-cost unmanned aerial copter systems, which do not have a flight stabilization system, it is essential to determine the difference between the established camera tilt angle (ω) defined during flight planning and the actual tilt angle value defined during the image orientation process. The difference between their values can differ for different copter systems. This difference depends on many factors, including the speed with which the platform is moving, it's mass, payload capacity, number of engines, etc. Most importantly, this difference in angles will also be affected by the speed and direction of winds. This difference can be determined empirically by conducting a series of test flights at different speeds.
In the article, an experiment had been conducted to show the change in the difference between the established and actual tilt angles of a camera mounted on a UAV computer system in relation to the speed of the platform. A series of 6 flights were performed at chosen speeds: 1, 3, 5, 7, 9, 11 m/s. The experiment was performed during a windless day.
For the UAV system flying at a speed of 3 m/s, the difference between the established camera tilt angle and the same angle derived from the exterior orientation parameters was equal to 3°. The experiment shows that with an increase in the UAV's speed, the difference between the established and actual tilt angles lessens, therefore making the platform's flight more stable (
Figure 12). During video acquisition of an architectural structure at high flying speeds, the possibility of image blur must be taken into account. The assumption that the actual camera tilt angle
(ω) is equal to the established angle is only possible with high-end copters, equipped with modern stabilization systems.
7. Discussion
Starting from the collinearity equations, which are the basis for photogrammetry, simplified mathematical model was proposed. The use of a simplified collinearity equations can significantly shorten the processing of image data from UAV. It was found that it is possible to perform an absolute orientation of video images acquired using a video camera mounted on a UAV, with a smooth transition from a terrestrial to aerial exterior orientation.
Experiments were conducted on a geometrically simple architectural structure. The video camera was mounted on a UAV copter system not equipped with a high precision GPS/INS measurement system. Therefore the exterior orientation parameters were determined using space resection—by measuring the location of control and tie points in two different distribution configurations. The target used to signalise these points made it possible to identify control points in two planes—on the wall and on the ground. The experiments have proven, that for two out of the three proposed control point distribution configurations—I and II (I—control points on the facade, II—control points on the facade and around the object, III—control points around the object) it is possible to obtain good results when processing video imagery. Slightly better results were obtained for configuration I.
The errors in point ground coordinates calculated using simplified collinearity equations oscillate about a few to a few hundred centimetres. When using the classic set of collinearity equations, the errors in coordinates are greater when the tilt angle (ω) 85°, i.e., when the value of this angle is close to the angle (90°). The opposite is true when the coordinates are calculate using the simplified collinearity equations, where the errors for all coordinates (m0X m0Y, mZ) become smaller.
The proposed method deals with the problem of integrating 3D models of the roof and walls of an architectural structure. The presented algorithm for determining the X, Y, Z coordinates of a structure had been established with the thought of designing an unmanned system, the aim of which would be to determine the coordinates on the surface of a structure in near-real-time. The designed algorithm could be implemented in a device onboard the UAV. Given known exterior orientation parameters, its purpose would be to automatically determine the X, Y, Z coordinates of points on the surface of a building. In the situation that one of the camera tilt angles
or ω would be between 85°–90°, the set of simplified collinearity equations would be used to calculated the X, Y, Z coordinates of the points (
Figure 13).
The proposed methods will shorten the calculation time by over 50%. In the case of a low cost system, the number of equations has a great influence on the computational speed.
8. Conclusions
This paper describes the issue of performing absolute orientation of video images which have two different exterior orientations of the camera mounted on the UAV: terrestrial and aerial. The proposed method can be used to determine the X, Y, Z coordinates of points based on a set of collinearity equations of a pair of images, with the equations for the terrestrial exterior orientation image being greatly simplified, assuming that the camera tilt angle is equal to 90°. The aim of this simplification is to limit the number of intermediate computations when calculating the coordinates of points on the surface of architectural structures in a ground coordinate system in the special case of image orientation mentioned.