RADAR: Resilient Application for Dependable Aided Reporting
Abstract
1. Introduction
2. Related Works
2.1. Reporting and Knowledge Representation
- Concept/Class. It represents a category of (either physical or virtual) real-world objects. For example, it can be a task, a function, an action, a physical object, a notion or an idea, and so on. Concepts can be abstract or concrete, elementary or derived by aggregating other concepts.
- Relationship. It represents a correlation among the concepts/classes.
- Axioms. They are true statements about the domain described by the ontology. They are used to specify the semantics of concepts. They generally specify how the conceptual vocabulary can be used.
- Top-level (foundational) ontology. These are interdisciplinary ontologies, which constitute the basic bricks of multiple domains, i.e., different ontologies could derive from the same top-level ontology.
- Domain ontology. It models specific portions of knowledge by identifying entities of interest and their relationships, which characterize a given domain. In principle, a domain ontology should specialize a top-level ontology.
- Application ontology. It specializes a domain ontology with detailed concepts for very specific application domains.
2.2. Literature Concerning the Running Example
3. Running Example, Approach and Architecture
3.1. Running Example
- Portfolio 1. This report provides summary data based on a few categories: (i) “Outstanding Principal”, which summarizes the theoretical residual debt; (ii) “Accrued Interest”, which summarizes the accrued interest and delinquent installments; (iii) “Unpaid Outstanding Principal”, which summarizes the actual remaining debts. These aggregations are also disaggregated based on loan status (“Performing”, “Delinquent” and “Defaulted”) and related guarantees (“Mortgage”, “Guaranteed” and “Unguaranteed”).
- Portfolio 2. This report provides aggregate data related to advance payments (closure of debts) during a given quarter, together with an immediately preceding quarter.
- Loan by Loan. This report summarizes the status of each single loan. For each loan, apart from the usual identification data of the bank, the customer and the loan, the current status of the loan, the current rate, the amount of the loan secured by mortgage (and so on) are reported.
3.2. Problem Description and Approach
- Technical Complexity. Gathering and processing the possibly large amount of data to aggregate in reports requires integrating many different technologies, such as APIs, streaming, relational databases, reporting tools, and so on. Specialized technicians are needed to deal with this kind of complexity.
- Business Complexity. Many people perceive reporting as a simple task. However, this perception overlooks the complexity of the business activities for which reports must be produced. In fact, national and international regulations and regulatory bodies ask for specific reports, with specific interpretations of the concepts used to define what the reports must contain. Domain experts must contribute their domain knowledge.
- Skill Complexity. Both technical and business complexity can be dealt with only through a wide variety of skills. However, these skills are rarely held by the same person: the scenario calls for many people with different skills in various areas (e.g., technical skills, general and domain-specific business skills, and so on) at different times.
- Transferring and Provisioning Domain Knowledge. Business complexity can be addressed only by transferring knowledge from domain experts to employees. However, these people rarely work together. Consequently, it is necessary to gather and formalize domain knowledge, possibly by associating it with data by exploiting ontologies.
- Accessing Data. Employees in charge of generating reports must access the data needed to generate them. However, data are usually stored within relational databases, and employees are usually not familiar with this kind of tool (the SQL language, in particular). A bridge between data-management systems and employees is necessary, possibly based on domain knowledge.
- Easy Design of Reports. Once data are accessible and comprehensible (in relation to the application domain), high-level tools for generating reports by aggregating data are necessary, to assist employees in a non-tedious and efficient way.
- Conceptual characterization. Semantics of data must be conceptually characterized by means of an ontology. In fact, an ontology can provide the conceptual framework to improve data comprehension.
- Concrete and high-level view of data. A data model is introduced, able to relate real data to ontological concepts, as well as to provide a concrete vision of data. This model, which is called RADAR Data Model, is an object-oriented model in which classes and relationships are called Concrete Classes and Concrete Relationships (respectively). Report designers are provided with the RADAR Schema, which is an instance of the RADAR Data Model for the managed data. Data to put in reports are stored within an internal database called RADAR DB, whose schema is compliant with the RADAR Schema. This way, report designers will directly operate on the RADAR Schema and will be totally unaware of the structure of source data.
- Mapping from RADAR Schema to Source Schema. The RADAR Schema must be mapped into the actual schema of source data, to automatically transfer data from the external sources to the RADAR DB; furthermore, this information will be precious for technicians who will have to maintain the gathering process.
- Defining Update Rules. The RADAR Rule Language is a language defined to complete properties that do not correspond to attributes in the source data. These rules allow for defining how to derive properties that are missing in new data.
3.3. Architecture
- Rectangles with orange borders represent software tools that provide a kind of user interface;
- Rectangles with green border represent software services;
- Rectangles with black border represent storage and archives;
- Blue arrows without labels represent synchronous communications between the connected software tools (the arrow represents the direction of the main data flow);
- Rectangles with light-blue background represent documents or descriptions specifically generated and received by software tools;
- Blue arrows labeled with documents or descriptions represent asynchronous communications between software tools, performed through the generation of documents and descriptions denoted by the label.
3.3.1. Design Layer
- The Knowledge Base persistently stores the overall RADAR Schema, which is the core description for the RADAR Framework. The name is motivated by the fact that the RADAR Framework manages an ontology, concrete classes, RADAR Rules and external tables (see Section 4), which are possibly semantically annotated; all of them constitute the knowledge necessary to actually collect data, interpret them, design and generate reports.
- The Knowledge-Base Manager is a suite of micro-services that are responsible for creating and managing the overall RADAR Schema, stored within the Knowledge Base. The module deploys various descriptions, which are used to set up the Information-System Layer. They are (i) the DB Schema, used to create the RADAR DB, and (ii) the ETL Directives, which pilot the ETL (Extract, Transform and Load) process that transfers data from external sources into the RADAR DB. The Knowledge-Base Manager is presented in detail in Section 5.1.
- A Source Ontology (or Reference Ontology) is loaded by the Knowledge-Base Manager, and saved into the Knowledge Base, to constitute the basis for defining the RADAR Schema.
- The Report Designer provides analysts with a user-friendly interface, able to browse the Knowledge Base so far built, define aggregations and design reports. The Report Designer generates Report Definitions and Aggregation Queries, which are stored within the Report Folder; through these documents, report layouts and aggregations necessary to obtain them are persistently saved.
3.3.2. Information-System Layer
- The Source DBs are storage systems, usually relational databases, which store and provide the External Tables in which data to integrate are stored. Notice that the way these tables are populated depends on the specific applications that provide source data: for example, in the case of the running example, external tables store flows about cessions and payments received from external systems.
- The RADAR DB is the database in which all data possibly involved in reporting are collected, when they are transferred from the Source DBs. The transfer is performed periodically by ETL (Extract-Transform-Load) tasks. The data in the RADAR DB will then be used to perform further aggregations, necessary as input for generating reports. The RADAR DB is actually managed by a relational DBMS; nevertheless, to avoid tight coupling with a specific DBMS, the Hibernate [43] bridge is adopted: this way, any relational DBMS can be used to manage the RADAR DB.
- The ETL Service is the micro-service in charge of transferring data from the Source DBs to the RADAR DB, piloted by the ETL Directives provided by the Knowledge-Base Manager in the Design Layer.
- The Rule-Execution Service is a micro-service that acquires the definitions of RADAR Rules (which are an integral part of the RADAR Data Model, see Section 4.3.2) from the Knowledge Base by synchronously calling the Knowledge-Base Manager, to execute them on the RADAR DB when new data are uploaded.
- The Report-Generator Service is a micro-service whose goal is to actually generate reports in xlsx format, based on (i) the data stored in the RADAR DB, (ii) a Report Definition generated by the Report Designer and (iii) the Aggregation Queries necessary to compute the aggregate measures that must be inserted into the final report. Report Definitions and Aggregation Queries are retrieved by accessing directly the Report Folder, in which the Report Designer stores them: this way, changes in report layouts are seamlessly deployed to the Report-Generator Service.
4. RADAR Data Model
- Ontological Layer. This layer defines the Reference Ontology that provides the conceptual framework for managed data.
- Concrete Layer. This layer provides a high-level and uniform view of actual data, in a way suitable for report designers.
- Mapping Layer. This layer maps classes defined in the Concrete Layer to source data, used to feed the RADAR DB, i.e., the internal database of the RADAR Framework.
4.1. Ontological Layer
4.2. Concrete Layer (Core RADAR Data Model)
4.2.1. Concrete Classes
4.2.2. Look-Up and Virtual Relationships
- Each referencing property must be a property in the full set of properties in the referencing class.
- Each target property must be part of the key of the target class.
- The type and the data type of the referencing property and of the target property must be the same.
- All the properties in the key of the target class must be referenced.
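The four constraints above can be checked mechanically. The following is a minimal sketch, not the framework's implementation: the `Property`, `ConcreteClass` and `LookUp` structures, the `check_lookup` function and the choice of `CessionId` as the key of `Cession` are assumptions made for illustration, and only the data type of a property is modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Property:
    name: str
    data_type: str                      # e.g. "string", "decimal"


@dataclass
class ConcreteClass:
    name: str
    properties: dict                    # full set of properties: name -> Property
    key: tuple                          # names of the key properties


@dataclass
class LookUp:
    name: str
    referencing: ConcreteClass
    target: ConcreteClass
    pairs: list                         # (referencing property, target property)


def check_lookup(rel):
    """Return the violated constraints (an empty list if the relationship is valid)."""
    errors = []
    for ref_name, tgt_name in rel.pairs:
        ref = rel.referencing.properties.get(ref_name)
        tgt = rel.target.properties.get(tgt_name)
        if ref is None:
            errors.append(f"{ref_name} is not a property of {rel.referencing.name}")
        if tgt_name not in rel.target.key:
            errors.append(f"{tgt_name} is not in the key of {rel.target.name}")
        if ref and tgt and ref.data_type != tgt.data_type:
            errors.append(f"{ref_name} and {tgt_name} have different data types")
    missing = set(rel.target.key) - {tgt for _, tgt in rel.pairs}
    if missing:
        errors.append(f"key properties not referenced: {sorted(missing)}")
    return errors


# Hypothetical schemas for two classes of the running example.
cession = ConcreteClass(
    "Cession",
    {"CessionId": Property("CessionId", "string"),
     "Guarantee_Category": Property("Guarantee_Category", "string")},
    key=("CessionId",),
)
financing_state = ConcreteClass(
    "Financing_State",
    {"CessionId": Property("CessionId", "string"),
     "Residual_Debt": Property("Residual_Debt", "decimal")},
    key=("CessionId",),
)
related_cession = LookUp("Related_Cession", financing_state, cession,
                         [("CessionId", "CessionId")])
print(check_lookup(related_cession))    # prints []
```

Returning a list of messages, rather than raising on the first violation, lets a design tool report all problems of a relationship at once.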
- The first descriptor in moves from the concrete class.
- The last descriptor in reaches the concrete class.
- For each intermediate descriptor in , its concrete class coincides with the concrete class of the next descriptor (the path is continuous).
- For each descriptor in , it denotes a look-up relationship that associates the and classes either in forward direction ( refers to ) or in backward direction ( refers to ).
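The path constraints listed above can be sketched as a simple check. This is only an illustration under stated assumptions: the `Descriptor` structure, the `check_path` function and the `Related_State` look-up relationship are hypothetical names, not part of the RADAR Data Model as defined in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptor:
    lookup: str     # name of the navigated look-up relationship
    start: str      # concrete class the step moves from
    end: str        # concrete class the step reaches


def check_path(source, target, path, lookups):
    """lookups maps a look-up name to its (referencing class, target class) pair."""
    if not path:
        return False
    # The first step must leave the source class, the last one must reach the target.
    if path[0].start != source or path[-1].end != target:
        return False
    # The path must be continuous.
    for step, nxt in zip(path, path[1:]):
        if step.end != nxt.start:
            return False
    # Each step must navigate an existing look-up relationship, forward or backward.
    for step in path:
        pair = lookups.get(step.lookup)
        forward = pair == (step.start, step.end)
        backward = pair == (step.end, step.start)
        if not (forward or backward):
            return False
    return True


# Related_State is a hypothetical look-up relationship, added to obtain a two-step path.
lookups = {
    "Related_Cession": ("Financing_State", "Cession"),
    "Related_State": ("Moodys_State", "Financing_State"),
}
path = [
    Descriptor("Related_State", "Moodys_State", "Financing_State"),
    Descriptor("Related_Cession", "Financing_State", "Cession"),
]
print(check_path("Moodys_State", "Cession", path, lookups))    # prints True
```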
4.2.3. Concreting Relationships
- All properties inherited by the concrete class must be defined in the ontological class.
- All inherited properties must have the same name in the concrete class.
- Each inherited property must have the same data type in both the ontological class and the concrete class.
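A minimal sketch of how the three constraints above could be verified is reported hereafter. The `check_concreting` function, the property names and the data types are assumptions chosen for illustration; classes are reduced to dictionaries that map property names to data types.

```python
def check_concreting(ontological_props, concrete_props, inherited):
    """Check the inherited properties of a concreting relationship.

    ontological_props and concrete_props map property names to data types;
    inherited is the collection of property names the concrete class inherits.
    """
    errors = []
    for name in inherited:
        if name not in ontological_props:
            errors.append(f"{name} is not defined in the ontological class")
        elif name not in concrete_props:
            errors.append(f"{name} does not keep its name in the concrete class")
        elif ontological_props[name] != concrete_props[name]:
            errors.append(f"{name} has different data types in the two classes")
    return errors


# Illustrative property sets (names and types are assumptions).
loan_or_credit = {"amount": "decimal", "currency": "string"}
cession = {"CessionId": "string", "amount": "decimal", "currency": "string"}
print(check_concreting(loan_or_credit, cession, ["amount", "currency"]))   # prints []
```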
4.3. Mapping Layer
4.3.1. Mapping Relationships
- All the attribute names specified in the mapping must be in the schema of the ”” external table.
- All the property names specified in the mapping must be properties in the ”” concrete class.
- Associated properties and attributes must have the same data type.
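To make the role of a mapping relationship concrete, the sketch below validates a mapping against the three constraints above and then uses it to turn a source row into an instance of a concrete class. All names (`check_mapping`, `to_instance`, the `CESS_ID` and `GUAR_CAT` attributes) are hypothetical; the actual ETL Directives generated by the framework are not described at this level of detail.

```python
def check_mapping(mapping, table_schema, class_props):
    """mapping: property name -> attribute name; schemas: name -> data type."""
    errors = []
    for prop, attr in mapping.items():
        if attr not in table_schema:
            errors.append(f"{attr} is not an attribute of the external table")
        elif prop not in class_props:
            errors.append(f"{prop} is not a property of the concrete class")
        elif table_schema[attr] != class_props[prop]:
            errors.append(f"{attr} and {prop} have different data types")
    return errors


def to_instance(row, mapping):
    """Rename the attributes of a source row to the mapped property names."""
    return {prop: row[attr] for prop, attr in mapping.items()}


# Hypothetical external table and concrete class.
table_schema = {"CESS_ID": "string", "GUAR_CAT": "string"}
class_props = {"CessionId": "string", "Guarantee_Category": "string"}
mapping = {"CessionId": "CESS_ID", "Guarantee_Category": "GUAR_CAT"}

print(check_mapping(mapping, table_schema, class_props))          # prints []
print(to_instance({"CESS_ID": "C-001", "GUAR_CAT": "Mortgage"}, mapping))
```

Properties of the concrete class that are not covered by the mapping are simply absent from the resulting instance; in the RADAR approach, these are the properties that RADAR Rules are meant to complete.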
4.3.2. RADAR Rule Language
Rule: Junk_BCE_Rating
Class: BCE_State
Condition: Not_Paid_Installments > Residual_Debt * 1.3
Action: BCE_Rating = "Junk"

Rule: Default_Moodys_Rating
Class: Moodys_State
Condition: BCE_Rating == "Junk" AND
    Reach(Cession via Related_Cession).Performing_Category == "Default"
Action: Moodys_Rating = "D"

Rule: Performing_Cession
Class: Cession
Action: Performing_Category = "Performing"
4.4. RADAR Schema
- : is the set of ontological classes (see Definition 1);
- is the set of concrete classes (see Definition 2) and subclasses (see Definition 4);
- is the set of look-up relationships between concrete classes (see Definition 6);
- is the set of virtual relationships (on concrete classes) that give a semantic view of transitive relationships obtained by navigating look-up relationships (see Definition 7);
- is the set of concreting relationships between ontological classes in and concrete classes in (see Definition 8);
- is the set of external tables (see Definition 10);
- is the set of mapping relationships, which associate concrete classes in and external tables in ;
- is the set of RADAR Rules (see Definition 12).
- The parent class of a non-root ontological class in must be defined in .
- The parent class of a concrete subclass in must be defined in .
- The concrete classes associated by a look-up relationship in must be defined in .
- The look-up relationships navigated by virtual relationships in must be defined in .
- The ontological class and the concrete class associated by a concreting relationship in must be defined in and in , respectively.
- The concrete class and the external table associated by a mapping relationship in must be defined in and in , respectively.
- Each RADAR Rule must be applied to a concrete class defined in .
- Domain experts provide the reference ontology and define the schema for the Concrete Layer. This way, they provide other users with the unifying and high-level schema for concrete concepts to use to represent data to aggregate into reports.
- Technicians that deal with databases and data flows map concrete classes to external tables, to feed instances of concrete classes (they work on the Mapping Layer).
- Users in charge of designing reports exploit ontological classes and semantic annotations in the Concrete Layer to find out the data to aggregate in reports. These users exploit the Ontological Layer and the Concrete Layer, while they have no access to the Mapping Layer.
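The consistency constraints on the RADAR Schema listed earlier in this section (every referenced class, relationship and table must be defined in the corresponding set) can be sketched as a single referential check. The dictionary-based encoding of the schema, the `check_schema` function and the `CESSION_FLOW` table name are assumptions made for illustration.

```python
def check_schema(schema):
    """Return the references that point outside the schema (empty list if consistent)."""
    errors = []
    onto = schema["ontological"]    # class name -> parent name (None for a root)
    conc = schema["concrete"]       # class name -> parent name (None if not a subclass)
    tables = schema["tables"]       # set of external-table names
    lookups = schema["lookups"]     # look-up name -> (referencing class, target class)

    for kind, classes in (("ontological", onto), ("concrete", conc)):
        for name, parent in classes.items():
            if parent is not None and parent not in classes:
                errors.append(f"{kind} class {name}: unknown parent {parent}")
    for name, ends in lookups.items():
        for cls in ends:
            if cls not in conc:
                errors.append(f"look-up {name}: unknown concrete class {cls}")
    for name, steps in schema["virtual"].items():   # virtual name -> look-up names
        for step in steps:
            if step not in lookups:
                errors.append(f"virtual relationship {name}: unknown look-up {step}")
    for onto_cls, conc_cls in schema["concreting"]:
        if onto_cls not in onto or conc_cls not in conc:
            errors.append(f"concreting ({onto_cls}, {conc_cls}): unknown class")
    for conc_cls, table in schema["mappings"]:
        if conc_cls not in conc or table not in tables:
            errors.append(f"mapping ({conc_cls}, {table}): unknown class or table")
    for rule, cls in schema["rules"].items():        # rule name -> concrete class
        if cls not in conc:
            errors.append(f"rule {rule}: unknown concrete class {cls}")
    return errors


# A small, illustrative schema fragment.
schema = {
    "ontological": {"Loan or Credit": None},
    "concrete": {"Cession": None, "Financing_State": None, "BCE_State": None},
    "tables": {"CESSION_FLOW"},                      # hypothetical table name
    "lookups": {"Related_Cession": ("Financing_State", "Cession")},
    "virtual": {},
    "concreting": [("Loan or Credit", "Cession")],
    "mappings": [("Cession", "CESSION_FLOW")],
    "rules": {"Junk_BCE_Rating": "BCE_State"},
}
print(check_schema(schema))    # prints []
```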
4.5. Discussion
- Ontological classes are necessary to give a clear semantics to data items, to assist during the evolution of the model and to help users in charge of designing reports.
- On the opposite side, external tables describe the source data, as they come to the RADAR DB.
- Concrete classes are the bridge between semantics and raw data: they provide a uniform view of data, because when data are imported from several external tables, they are translated into a homogeneous view with ontological classes that denote their meaning.
- Domain experts who participate in the original design of the system are valuable for defining the Reference Ontology and for guiding the design team towards a proper interpretation of rules and of the application context.
- Database administrators who deal with source data provide their knowledge about incoming data and contribute to mapping them into the concrete classes. Apart from the initial design activities, they are also valuable for adding novel data sources once the system is operational.
- Users in charge of designing reports work directly on the concrete classes, because they do not have to take care of raw source data; in fact, they are not technicians. Furthermore, the domain knowledge associated with concrete classes (the ontology and the annotations), provided by the domain expert(s) involved in the initial design phase, gives report designers a valuable source of information for choosing the concrete classes to query, in relation to the data to be aggregated into the report to produce.
- The formal definition of the RADAR Data Model is given to provide the scientific community with a rigorous formalization of the model. Since the model encompasses concepts such as “ontological class” and “table”, the paper reports specific definitions for them, even though they are well known in the scientific community. The goal of these definitions is not to provide yet another definition; in contrast, their goal is to give a representational structure for them, to be effectively used within the RADAR Data Model. Indeed, Definitions 1 and 10 provide the way such concepts are represented and used in the specific model. Further notice that the adoption of MOF (acronym for “Meta-Object Facility”) [48,49] might be evaluated as well to define the RADAR Data Model; it could be considered in the future, as an engineering approach, to build novel components of the framework and connectors. Nevertheless, the MOF approach can accompany the formal approach followed in this section, to support the practical exchange of models and data.
- When the term “ontology” is used, many people think about RDF (acronym for “Resource Description Framework”) [50,51] and OWL (acronym for “Web Ontology Language”) [52,53]. The reader could wonder why they were not mentioned earlier in the paper. The answer is that, in the RADAR approach, they are possible formats to represent ontologies when the Source Ontology is loaded into the Knowledge-Base Manager. Currently, the framework relies on the JSON-LD format, adopted by the Schema.org ontology. JSON-LD is the most recent format for representing ontologies, and Schema.org is the most recent ontology (based on JSON-LD) that encompasses the financial domain. Nevertheless, if necessary, our framework could be easily extended to import both RDF and OWL ontologies.
- The RADAR Data Model maintains a separation between ontological classes and concrete classes, i.e., concepts and data. This is due to various reasons. (i) The considered application context is characterized by huge volumes of raw data to deal with; while associating raw data (described by concrete classes) with concepts (described by ontological classes) should help retrieve the concrete class to query for aggregating raw data, it is not feasible to express queries on ontological classes, because there is no certain correspondence with raw data (in the application context, raw data to aggregate in reports must be certain; no imprecision is allowed). (ii) RDF and JSON-LD are designed for describing “Linked Data” over the Internet, i.e., documents (resources) are linked to each other based on their content; queries on linked data can retrieve data with a certain degree of imprecision, and efficiency is not a key issue; this is not true for the application context considered in this paper: queries must be done on certain data and efficiency is a key issue.
- Based on the considerations reported in the previous item, it is clear why we had to define the RADAR Rule Language: already existing languages, in particular those that were designed to apply logical inference on the basis of OWL ontologies (to infer intensional data from extensional data), were not applicable. In fact, RADAR Rules ensure an efficient execution (they can be translated into SQL updates) and they are based only on the Concrete Layer, because ontological classes cannot be considered to manage and query raw data.
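To illustrate the remark that RADAR Rules can be translated into SQL updates, the sketch below translates the Junk_BCE_Rating rule of Section 4.3.2 into an UPDATE statement and runs it on an in-memory SQLite database. It is only a sketch under strong assumptions: the `rule_to_sql` function is hypothetical, each concrete class is assumed to be stored as one table with one column per property, conditions that use the Reach operator are not handled, and the rule text is trusted (class name, property name and condition are interpolated as they are).

```python
import sqlite3


def rule_to_sql(concrete_class, condition, prop, value):
    """Translate a simple RADAR Rule into a parameterized SQL UPDATE.

    A rule without a condition (e.g. Performing_Cession) updates all instances.
    """
    where = f" WHERE {condition}" if condition else ""
    return f"UPDATE {concrete_class} SET {prop} = ?{where}", (value,)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE BCE_State ("
    "Not_Paid_Installments REAL, Residual_Debt REAL, BCE_Rating TEXT)"
)
conn.executemany(
    "INSERT INTO BCE_State VALUES (?, ?, NULL)",
    [(1400.0, 1000.0), (100.0, 1000.0)],     # made-up sample instances
)

sql, params = rule_to_sql(
    "BCE_State",
    "Not_Paid_Installments > Residual_Debt * 1.3",
    "BCE_Rating",
    "Junk",
)
conn.execute(sql, params)

ratings = [row[0] for row in
           conn.execute("SELECT BCE_Rating FROM BCE_State ORDER BY rowid")]
print(ratings)    # prints ['Junk', None]
```

Only the first instance satisfies the condition (1400 > 1000 * 1.3), so only its BCE_Rating property is completed; the second instance keeps a missing value.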
5. Creating the Knowledge Base and Designing Reports
5.1. The Knowledge-Base Manager
- The users in charge of uploading the Reference Ontology and designing concrete classes exploit the Class Service. Figure 11 reports a screenshot of the Class Designer, the user interface of the Class Service. Hereafter, the figure is briefly explained.
  - On the left-hand side of the figure, there is the Ontology Browser, which allows the user to select the ontological classes (concepts) from which the new concrete class derives. If the Reference Ontology does not provide the ontological classes that are needed to define concrete classes, the Class Service allows for defining new ontological classes.
  - The dialog window for defining the schema of the new concrete class is depicted on the right-hand side of Figure 11. Specifically, it shows the definition of the Cession concrete class, whose properties are reported in Figure 5. The inherited properties do not appear, because they are shown in a preliminary stage, once the ontological classes are selected.
- Once concrete classes have been created, the users exploit the Relationship Service to create look-up relationships and virtual relationships among the concrete classes. The user interface of the service is called Relationship Designer. Figure 12 shows a screenshot of the user interface: specifically, the figure shows the definition of the Related_Cession relationship; notice how properties in the referring class (i.e., Financing_State), on the left-hand side, are associated with properties in the key of the referred class (i.e., Cession), on the right-hand side.
- Data managers can populate the Mapping Layer to set up the correspondence between external tables and concrete classes; specifically, they exploit the Mapping Service to define mapping relationships between concrete classes and external tables (in the Source DBs). Users are provided with a graphical user interface, which is called Mapping Designer (for the sake of space, no screenshot is reported).
- Data managers and domain experts can define RADAR Rules to complete missing properties in the instances of data. To this end, they exploit the Rule Service and its user interface called Rule Designer (for the sake of space, no screenshot is reported).
- Finally, once the RADAR Schema is complete, the software architect in charge of managing the framework exploits the Deploy Service to actually generate and deploy the two descriptions necessary to set up the Information-System Layer, i.e., the DB Schema and the ETL Directives.
5.2. The Report Designer
- The user can browse the Knowledge Base and consult the RADAR Schema through the Knowledge Browser.
- The user defines the layout of the reports and defines the aggregations she/he is interested in through the functionality named Report-Layout Designer.
- The user stores in a persistent way, through the functionality named Archive Manager, the actual layout of the reports (Report Definitions) together with the queries that specify how to aggregate the data to put into the reports (Aggregation Queries).
- A report consists of several “worksheets”, each of which contains one and only one “table”.
- A “table” is the fundamental complex object in reports. In particular, the user can create two distinct types of tables: “stratification tables” and “detail tables” (also called “stratification reports” and “detail reports”, under the assumption that a table is the simplest form of report); specifically, a “stratification table” is useful to generate reports where instances of concrete classes are aggregated at different stratification levels, while “detail tables” are filled in with detailed values coming from instances of concrete classes. Figure 13 shows an example of a table defined for a report in the running example: the report contains three distinct worksheets (see the bottom of the screenshot), where the shown worksheet is Portfolio1, which is an example of “stratification table”; Figure 14 shows the worksheet called LoanByLoan, which is an example of “detail table”.
- The content of a “cell” is defined through the Knowledge Browser, which lets the user exploit the knowledge provided by the domain experts and saved in the Knowledge Base. It also lets the user define the “cell” type. The “cell” corresponds to the information unit in a table. Several cell types are available, depending on the value they can contain:
  - Empty Cell: the cell has no value.
  - Label Cell: the value is a constant textual label.
  - Arithmetic-Expression Cell: the value of the cell is specified by an expression that can rely on numerical values and class properties, as well as aggregations and cells that were previously defined.
  - Condition Cell: this type of cell is used to specify selection conditions on the instances of the concrete class; the condition holds for the entire line that contains the cell, so that only instances of the concrete class that meet the condition will be used to compute aggregations and cell values. Figure 15 reports a screenshot of the user interface that allows users to define conditions: it refers again to the running example. Specifically, the condition is called Mortgage_Loans and selects those instances of the Financing_State concrete class that refer (through the Related_Cession relationship) to cessions whose Performing_State property has the "PERFORMING" value and whose Guarantee_Category property has the "MORTGAGE" value; notice the presence of the Reach operator (the same introduced for RADAR Rules) that navigates the Related_Cession relationship to reach the instances of the Cession concrete class. In the bottom part of Figure 15, the reader can notice the buttons that allow users to express complex conditions.
  - Aggregation Cell: this type of cell allows users to aggregate values in instances of concrete classes. A specific interface is provided: Figure 16 shows a screenshot, which refers to the running example. Specifically, values of the Paid_Installments property of the Financing_State class are summed, based on the SUM aggregation function. Notice that the aggregation is given a name, to be referred to by other cells. The usual aggregation functions are available, i.e., SUM, COUNT (instance counter), AVG (average), MIN (minimum) and MAX (maximum). If the line in which the aggregation is defined contains a condition cell, the aggregation implicitly works on the instances selected by the condition. Aggregations can be specified in “arithmetic-expression cells” too, by using the AGGR(n, c, f, p) operator, where n is the name of the aggregation, c is the concrete class whose instances are aggregated, f is the aggregation function and p is the property whose values are aggregated.
- When the user defines a report, she/he also specifies all the queries that are required to generate the report (the so-called Aggregation Queries).
- Given a report, the user can define some parameters called Report Parameters. They receive external data to be used to select the data to aggregate: for example, in the running example, the StartDate and EndDate parameters could be defined, to indicate the period of interest; only the amortization plans referring to the specified time period are considered. To this end, when a new parameter is defined, the user interface forces the user to define a condition on this parameter, which selects the instances further considered by tables in the report.
- The user searches for a reference to a key concept that characterizes the report to generate within the knowledge base. For example, since the running example is focused on loans, she/he looks for the word “loan”.
- The result of the search is the ontological class “Loan or Credit”, which is made concrete by various concrete classes (see the RADAR Schema depicted in Figure 2). The user can also navigate the ontology: to this end, the user is provided with a browser similar to the one depicted in Figure 11. Furthermore, annotations associated with class definitions, property definitions and relationship definitions can be consulted, so that the user can fully comprehend how to use them. For example, what exactly is a “performing loan”? Such an annotation, provided by the domain expert who designed the RADAR Schema, can accompany the definition of the Cession class; the user can consult it to learn about it and exploit this knowledge to prepare reports.
- Once the concrete classes to work on have been identified, the user is ready to define the report layout.
- The user creates the worksheet and then inserts an “aggregation table”. This choice is based on the layout required by the regulatory body.
- To comply with requirements, the user defines various groups of aggregate values; these groups are labeled as “1”, “2” and so on, in the A column. Specifically, group “1” summarizes “Performing Loans” (see the label in cell B6), group “2” summarizes “Delinquent Loans” (see the label in cell B11), and so on.
- For each line in a group, the user defines “condition cells”, because only specific instances of the concrete class must be considered for computing aggregations. For example, consider cells B7, B8 and B9: since they are “condition cells”, lines 7, 8 and 9 are derived by selecting those instances of the concrete classes that meet the condition. In the cells, notice the names given to the conditions: for example, cell B7 is associated with the condition called “Mortgage_Loans”; to define it, the user is provided with the dialog box shown in Figure 15. The effect of having defined cell B7 as a “condition cell” is that the next cells on line 7 will be derived only from instances of the Cession concrete class (see Figure 15) that meet the condition (those instances such that the value of the Performing_Category property is "Performing" and the value of the Guarantee_Category property is "Mortgage").
- After having defined the “condition cell” for a line, the user can define aggregations in the other cells of the same line. For example, considering again line 7 (that is the selected line in Figure 13), cell C7 is defined as an “aggregation cell”, whose name is PM_Residual_Debt: it indicates the sum of the residual debts of performing loans guaranteed by a mortgage. To define it, the user is provided with the dialog box shown in Figure 16: instances of the Financing_State class that are related to the instances of the Cession class selected by the B7 condition cell are aggregated, to sum the values of the Residual_Debt property (since the condition in cell B7 selects only instances describing performing loans that are guaranteed by a mortgage, only these selected instances are aggregated by the specified aggregation, which explains the name PM_Residual_Debt). Cells D7 and F7 are aggregation cells too.
- The reference layout provided by the regulatory bodies asks to sum the values of “aggregation cells”; this is the case of cell C6. For this reason, the user defines it as an “arithmetic-expression cell”; it sums the values of the cells C7 (denoted as CELL(7,3)), C8 (denoted as CELL(8,3)) and C9 (denoted as CELL(9,3)).
- In the case of cell E7, the user must define an aggregation within a mathematical expression. The Report Designer allows that in “arithmetic-expression cells”, by means of the AGGR operator. Consider the expression within the cell, which is CELL(7,3)+AGGR(PM_Residual_Debt, Financing_State, SUM, Residual_Debt). The AGGR operator sums the values of the property named Residual_Debt of instances of the Financing_State class; the aggregation is given the PM_Residual_Debt name; then, the resulting aggregate value is added to the C7 cell (denoted as CELL(7,3)).
- The user is asked to define the heading of the report and the structure of each “detail line”, i.e., the line that is repeated as many times as the number of selected instances. In the heading (lines 1 and 2 in Figure 14), only label cells can be used.
- The user defines the line to be actually tied to instances of the selected concrete class. This is the selected line in Figure 14. For each cell, the user ties it to a property of the concrete class (in this case, the Financing_State concrete class): such cells are formally defined as “aggregation cells” for which the aggregation function is not specified; this way, the basic model for cells is kept. For example, cell B4 is defined with name CessionId, as shown in the figure; the source class is the Financing_State class, and the considered property is CessionId. Consequently, all the cells on the line are automatically tied to the same concrete class, because the line is automatically bound to an instance of the concrete class. If necessary, “arithmetic-expression cells” can be defined, to obtain values from other cells on the same line.
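The interplay between condition cells, aggregation cells and arithmetic-expression cells described in this section can be sketched as follows. This is a simplified illustration, not the Report-Generator Service: the `aggr` function, the in-memory representation of instances (with the related cession nested under a hypothetical `cession` entry, standing for the navigation of Related_Cession) and the sample values are all assumptions.

```python
from statistics import mean

# The aggregation functions offered by the Report Designer.
AGGREGATES = {"SUM": sum, "COUNT": len, "AVG": mean, "MIN": min, "MAX": max}


def aggr(instances, function, prop, condition=None):
    """Aggregate the values of prop over the instances selected by the condition."""
    selected = [i for i in instances if condition is None or condition(i)]
    return AGGREGATES[function]([i[prop] for i in selected])


def mortgage_loans(state):
    """Condition cell B7: performing loans guaranteed by a mortgage."""
    cession = state["cession"]
    return (cession["Performing_Category"] == "Performing"
            and cession["Guarantee_Category"] == "Mortgage")


# Made-up instances of the Financing_State concrete class.
financing_states = [
    {"Residual_Debt": 1000.0,
     "cession": {"Performing_Category": "Performing", "Guarantee_Category": "Mortgage"}},
    {"Residual_Debt": 500.0,
     "cession": {"Performing_Category": "Performing", "Guarantee_Category": "Mortgage"}},
    {"Residual_Debt": 700.0,
     "cession": {"Performing_Category": "Delinquent", "Guarantee_Category": "Unguaranteed"}},
]

# Aggregation cell C7 (PM_Residual_Debt), restricted by the condition on its line.
c7 = aggr(financing_states, "SUM", "Residual_Debt", mortgage_loans)
# Arithmetic-expression cell E7: CELL(7,3) + AGGR(PM_Residual_Debt, Financing_State, SUM, Residual_Debt).
e7 = c7 + aggr(financing_states, "SUM", "Residual_Debt", mortgage_loans)
print(c7, e7)    # prints 1500.0 3000.0
```

In this sketch the condition is passed explicitly to each aggregation, whereas in the Report Designer it applies implicitly to every cell on the line that contains the condition cell.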
6. Conclusions
6.1. Summary
- The RADAR Framework is designed to cover the overall process that leads to the generation of reports, from data gathering to final reports.
- The Knowledge Base stores the knowledge about the application domain and data integrated within the framework.
- The RADAR DB is the single storage system where data gathered from external sources are collected; derived from the concept of “Operational Data Store”, it provides a single, uniform view of data, although they are still modeled in an operational fashion.
- The RADAR Data Model provides a high-level yet concrete view of data, which are semantically characterized by the adoption of the Reference Ontology: the Ontological Layer gives the basic semantic framework to data; the Concrete Layer models actual (operational and possibly massive) data by giving them a semantic characterization by means of the Reference Ontology; the Mapping Layer maintains knowledge about provenance of data.
- The Report Designer is used to design the layout of reports and connect them to data stored within the RADAR DB; the user browses the Knowledge Base to retrieve data of interest and specify aggregations.
- The rigid distinction between Design Layer and Information-System Layer allows for easily deploying the RADAR Framework within existing information systems: the computational resources necessary for actually processing data and generating reports are decoupled from those necessary for managing the Knowledge Base, designing the RADAR Schema and reports; this way, analysts and designers interfere with the information system as little as possible.
6.2. Future Work
- The adoption of the micro-service approach has been a good choice, since it allows for decoupling engines and user interfaces. Following this direction, we are going to completely re-engineer all the user interfaces towards a pure web-application approach (which is quite appreciated by users).
- Currently, the RADAR Framework is designed to gather only relational data, both from relational databases and from CSV (Comma-Separated Values) and MS Excel files. However, NoSQL (which stands for Not Only SQL) databases [54] based on the JSON format [55] have become widely used, due to the ability of JSON to represent data with complex structures, also called “Non-First Normal Form” (denoted as NF²) [56,57]. Building on previous work on the management of large JSON data sets [58,59,60,61], we plan to extend the Mapping Layer of the RADAR Data Model, to make the framework able to load JSON data sets from JSON stores and web sources.
- Finally, we plan to extend the Report Designer with a wizard that should drive the user through knowledge browsing, aggregation definition and report-layout definition. This functionality will be added while re-engineering the user interface.
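To make the difference between relational and non-first-normal-form data concrete, the following Python sketch shows a JSON document with a nested array and one possible way to unnest it into flat rows of the kind a relational table holds. This is only an illustration under stated assumptions: the document structure, the field names `Installments`, `Number` and `Amount`, and the `flatten` helper are hypothetical, and the sketch does not describe how the planned extension of the Mapping Layer will work.

```python
import json

# A hypothetical JSON document: the "Installments" field holds a nested array,
# which a flat (first normal form) relational table cannot store directly.
document = json.loads("""
{
  "CessionId": "C1",
  "Installments": [
    {"Number": 1, "Amount": 100.0},
    {"Number": 2, "Amount": 120.0}
  ]
}
""")


def flatten(doc, nested_field):
    """Unnest one array-valued field: emit one flat row per nested element,
    repeating the parent's scalar fields in each row."""
    parent = {key: value for key, value in doc.items() if key != nested_field}
    return [{**parent, **child} for child in doc[nested_field]]


rows = flatten(document, "Installments")
for row in rows:
    print(row)
```

Each resulting row repeats the parent field CessionId, which is the usual price of representing nested structures in flat tables.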
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Lim, E.P.; Chen, H.; Chen, G. Business intelligence and analytics: Research directions. ACM Trans. Manag. Inf. Syst. 2013, 3, 1–10.
- Golfarelli, M.; Rizzi, S.; Cella, I. Beyond data warehousing: What’s next in business intelligence? In Proceedings of the 7th ACM International Workshop on Data Warehousing and OLAP, Washington, DC, USA, 12–13 November 2004; pp. 1–6.
- Sen, A.; Sinha, A.P. A comparison of data warehousing methodologies. Commun. ACM 2005, 48, 79–84.
- Azzini, A.; Cortesi, N.; Topalovic, A.; Psaila, G. Radar, a framework for automated reporting. In Proceedings of the 16th International Conference on Applied Computing 2019, Cagliari, Italy, 7–9 November 2019; pp. 54–62.
- Luhn, H.P. A business intelligence system. IBM J. Res. Dev. 1958, 2, 314–319.
- Rao, R. From unstructured data to actionable intelligence. IT Prof. 2003, 5, 29–35.
- Elena, C. Business intelligence. J. Knowl. Manag. Econ. Inf. Technol. 2011, 1, 1–12.
- Neches, R.; Fikes, R.E.; Finin, T.; Gruber, T.; Patil, R.; Senator, T.; Swartout, W.R. Enabling technology for knowledge sharing. AI Mag. 1991, 12, 36.
- Gruber, T.R. A translation approach to portable ontology specifications. Knowl. Acquis. 1993, 5, 199–220.
- Swartout, B.; Patil, R.; Knight, K.; Russ, T. Toward distributed use of large-scale ontology. In Proceedings of the Tenth Workshop on Knowledge Acquisition for Knowledge-Based Systems, Banff, AB, Canada, 9–11 November 1996; pp. 138–148.
- Şimşek, U.; Kärle, E.; Holzknecht, O.; Fensel, D. Domain specific semantic validation of schema.org annotations. In Proceedings of the International Andrei Ershov Memorial Conference on Perspectives of System Informatics, Moscow, Russia, 27–29 June 2017; pp. 417–429.
- Ye, J.; Stevenson, G.; Dobson, S. A top-level ontology for smart environments. Pervasive Mob. Comput. 2011, 7, 359–378.
- Berners-Lee, T.; Hendler, J.; Lassila, O. The semantic web. Sci. Am. 2001, 284, 28–37.
- Bontcheva, K.; Wilks, Y. Automatic Report Generation from Ontologies: The MIAKT Approach. In Proceedings of the International Conference on Application of Natural Language to Information Systems, Salford, UK, 23–25 June 2004; pp. 324–335.
- Romero, O.; Abelló, A. A framework for multidimensional design of data warehouses from ontologies. Data Knowl. Eng. 2010, 69, 1138–1157.
- Nebot, V.; Berlanga, R.; Pérez, J.M.; Aramburu, M.J.; Pedersen, T.B. Multidimensional Integrated Ontologies: A Framework for Designing Semantic Data Warehouses; Springer: Berlin/Heidelberg, Germany, 2009; Volume 13, pp. 1–36.
- Calvanese, D.; De Giacomo, G.; Lembo, D.; Lenzerini, M.; Poggi, A.; Rosati, R. Ontology-based Database Access. In Proceedings of the Fifteenth Italian Symposium on Advanced Database Systems SEBD, Torre Canne di Fasano, Italy, 17–20 June 2007; pp. 324–331.
- Xiao, G.; Calvanese, D.; Kontchakov, R.; Lembo, D.; Poggi, A.; Rosati, R.; Zakharyaschev, M. Ontology-Based Data Access: A survey; IJCAI Organization: Sydney, Australia, 2018.
- Xiao, G.; Ding, L.; Cogrel, B.; Calvanese, D. Virtual Knowledge Graphs: An Overview of Systems and Use Cases. Data Intell. 2019, 1, 201–223.
- Poggi, A.; Lembo, D.; Calvanese, D.; Giacomo, G.D.; Lenzerini, M.; Rosati, R. Linking Data to Ontologies; Springer: Berlin/Heidelberg, Germany, 2008; Volume 10, pp. 133–173.
- Guerrini, M.; Possemato, T. Linked data: Un nuovo alfabeto del web semantico. Bibl. Oggi Mens. Inf. Aggiorn. Dibatt. 2012, 30, 7–15.
- Sporny, M.; Longley, D.; Kellogg, G.; Lanthaler, M.; Lindström, N. JSON-LD 1.0. W3C Recomm. 2014, 16, 41.
- Sporny, M.; Kellogg, G.; Lanthaler, M.; Group, W.R.W. JSON-LD 1.0: A JSON-based serialization for linked data. W3C Recomm. 2014, 16, 127.
- Pan, J.Z. Resource Description Framework; Springer: Berlin/Heidelberg, Germany, 2009; pp. 71–90.
- Browne, O.; O’Reilly, P.; Hutchinson, M.; Krdzavac, N. Distributed Data and Ontologies: An Integrated Semantic Web Architecture Enabling More Efficient Data Management. J. Assoc. Inf. Sci. Technol. 2019, 70, 575–586.
- Petrova, G.; Tuzovsky, A.; Aksenova, N. Application of the Financial Industry Business Ontology (FIBO) for development of a financial organization. In Conference Series of Journal of Physics, Proceedings of the International Conference on Information Technologies in Business and Industry, Bali, Indonesia, 30–31 January 2016; IOP Publishing: Bristol, UK, 2016.
- Butler, T.; Abi-Lahoud, E. A Mechanism-Based Explanation of the Institutionalization of Semantic Technologies in the Financial Industry. In Creating Value for All Through IT; Bergvall-Kåreborn, B., Nielsen, P.A., Eds.; Springer: Berlin/Heidelberg, Germany, 2014; pp. 277–294.
- Financial Industry Business Ontology. 2021. Available online: http://www.fibo.org/schema (accessed on 15 January 2021).
- The Schema.org Ontology. 2020. Available online: http://schema.org/ (accessed on 15 November 2020).
- Guha, R.V.; Brickley, D.; Macbeth, S. Schema.org: Evolution of structured data on the web. Commun. ACM 2016, 59, 44–51.
- Kärle, E.; Fensel, A.; Toma, I.; Fensel, D. Why are there more hotels in Tyrol than in Austria? Analyzing schema.org usage in the hotel domain. In Information and Communication Technologies in Tourism 2016; Springer: New York, NY, USA, 2016; pp. 99–112.
- Michener, W.K.; Jones, M.B. Ecoinformatics: Supporting ecology as a data-intensive science. Trends Ecol. Evol. 2012, 27, 85–93.
- Madin, J.S.; Bowers, S.; Schildhauer, M.; Krivov, S.; Pennington, D.; Villa, F. An ontology for describing and synthesizing ecological observation data. Ecol. Inform. 2007, 2, 279–296.
- Madin, J.S.; Bowers, S.; Schildhauer, M.P.; Jones, M.B. Advancing ecological research with ontologies. Trends Ecol. Evol. 2008, 23, 159–168.
- Science Environment for Ecological Knowledge. 2004. Available online: http://seek.ecoinformatics.org/ (accessed on 10 October 2020).
- OBO Foundry. Environment Ontology ENVO. 2020. Available online: http://www.obofoundry.org/ontology/envo.html (accessed on 10 October 2020).
- Buttigieg, P.L.; Pafilis, E.; Lewis, S.E.; Schildhauer, M.P.; Walls, R.L.; Mungall, C.J. The environment ontology in 2016: Bridging domains with increased scope, semantic density, and interoperation. J. Biomed. Semant. 2016, 7, 57.
- Kharlamov, E.; Hovland, D.; Skjæveland, M.G.; Bilidas, D.; Jiménez-Ruiz, E.; Xiao, G.; Soylu, A.; Lanti, D.; Rezk, M.; Zheleznyakov, D.; et al. Ontology based data access in Statoil. J. Web Semant. 2017, 44, 3–36.
- Ekaputra, F.; Sabou, M.; Serral Asensio, E.; Kiesling, E.; Biffl, S. Ontology-based data integration in multi-disciplinary engineering environments: A review. Open J. Inf. Syst. 2017, 4, 1–26.
- Mate, S.; Köpcke, F.; Toddenroth, D.; Martin, M.; Prokosch, H.U.; Bürkle, T.; Ganslandt, T. Ontology-based data integration between clinical and research systems. PLoS ONE 2015, 10, e0116656.
- Nadareishvili, I.; Mitra, R.; McLarty, M.; Amundsen, M. Microservice Architecture: Aligning Principles, Practices, and Culture; O’Reilly Media Inc.: Sebastopol, CA, USA, 2016.
- Newman, S. Building Microservices; O’Reilly Media Inc.: Sebastopol, CA, USA, 2015.
- Bauer, C.; King, G. Hibernate in Action; Manning: Greenwich, CT, USA, 2005; Volume 1.
- Date, C. Introduction To Database Systems, 8th ed.; Addison-Wesley: Ithaca, NY, USA, 2003; p. 1024.
- Elmasri, R.; Navathe, S.B. Database Systems; Pearson Education: Boston, MA, USA, 2011; Volume 9.
- de Almeida Cruz, J.; de Azevedo Silva, K. Relational Algebra Teaching Support Tool. J. Inf. Syst. Eng. Manag. 2017, 2, 8.
- Zaniolo, C. A unified semantics for active and deductive databases. In Rules in Database Systems; Springer: New York, NY, USA, 1994; pp. 271–287.
- OMG Available Specification, Meta Object Facility (MOF) 2.0 Core Specification, 2006. Available online: https://www.omg.org/spec/MOF/2.0/PDF (accessed on 15 January 2021).
- Poernomo, I. The Meta-Object Facility typed. In Proceedings of the 2006 ACM Symposium on Applied Computing, Dijon, France, 23–27 April 2006; pp. 1845–1849.
- Miller, E. An introduction to the Resource Description Framework. Bull. Am. Soc. Inf. Sci. Technol. 1998, 25, 15–19.
- Candan, K.S.; Liu, H.; Suvarna, R. Resource Description Framework: Metadata and its applications. ACM SIGKDD Explor. Newsl. 2001, 3, 6–19.
- Heflin, J. An Introduction to the OWL Web Ontology Language; Lehigh University: Bethlehem, PA, USA; National Science Foundation (NSF): Arlington, VA, USA, 2007; p. 7.
- McGuinness, D.L.; Van Harmelen, F. OWL web ontology language overview. W3C Recomm. 2004, 10, 2004.
- Meier, A.; Kaufmann, M. SQL & NoSQL Databases; Springer: New York, NY, USA, 2019.
- Ihrig, C.J. Javascript object notation. In Pro Node.js for Developers; Springer: New York, NY, USA, 2013; pp. 263–270.
- Abiteboul, S.; Bidoit, N. Non first normal form relations: An algebra allowing data restructuring. J. Comput. Syst. Sci. 1986, 33, 361–393.
- Papakonstantinou, Y. Semistructured Models, Queries and Algebras in the Big Data Era: Tutorial Summary. In Proceedings of the 2016 International Conference on Management of Data, San Francisco, CA, USA, 26 June–1 July 2016; pp. 2229–2233.
- Bordogna, G.; Capelli, S.; Ciriello, D.E.; Psaila, G. A cross-analysis framework for multi-source volunteered, crowdsourced, and authoritative geographic information: The case study of volunteered personal traces analysis against transport network data. Geo-Spat. Inf. Sci. 2018, 21, 257–271.
- Marrara, S.; Pelucchi, M.; Psaila, G. Blind Queries Applied to JSON Document Stores. Information 2019, 10, 291.
- Psaila, G.; Fosci, P. J-CO: A Platform-Independent Framework for Managing Geo-Referenced JSON Data Sets. Electronics 2021, 10, 621.
- Fosci, P.; Psaila, G. Towards Flexible Retrieval, Integration and Analysis of JSON Data Sets through Fuzzy Sets: A Case Study. Information 2021, 12, 258.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Azzini, A.; Cortesi, N.; Psaila, G. RADAR: Resilient Application for Dependable Aided Reporting. Information 2021, 12, 463. https://doi.org/10.3390/info12110463