1. Introduction
Technology is an essential tool in today’s information society. In recent years, with the growing emphasis on innovation, its importance has been recognized anew. Technological innovation consists of the creation and application of knowledge, and it is an important factor in sustainable growth [1,2]. In the past, companies ran research and development, management, and marketing as independent functions. As technology has gained importance, however, companies have adopted technology management (TM), which considers technology and management together. Many companies have tried to standardize technologies or manage intellectual property in order to secure a competitive advantage through TM [1].
Intellectual property management (IPM) is a managerial method that strategically utilizes intellectual property, such as patents, trademarks, and copyrights, for corporate management [3]. Among these, a patent discloses information about an invention under the law and, in return, grants the inventor exclusive rights to it. In other words, although a patent has the disadvantage that the invention is disclosed to the public, the inventor can hold exclusive rights to the invention for 20 years. For this reason, many technology-based companies try to acquire exclusive rights through patent applications after completing the research and development (R&D) of a technology. Patents are therefore useful not only for preserving technology rights, but also for identifying the direction of R&D and the intensity of competition in the market [4]. Drawing on these characteristics, researchers have attempted to analyze patents from a macro perspective and to forecast the direction of technological innovation [5,6,7,8,9].
A patent is recognized as a means of creating profit through licensing or technology transfer, as well as of protecting the rights to an invention. Two examples illustrate this. Qualcomm has become a strong player in the telecom market through its large royalty income from Code Division Multiple Access (CDMA) technology. Another example is the recent emergence of patent trolls, formally referred to as non-practicing entities (NPEs) [10,11]. A patent troll does not develop inventions itself; instead, it purchases patents from their holders and retains them. If another company infringes on a patent it has purchased, it tries to gain an economic advantage through litigation [10]. As such, a patent confers economic advantages as well as legal rights.
Corporations, research institutes, and universities put a great deal of effort into transferring their technologies in order to create economic value from their patents. Technology transfer (TT) means that the holder of knowledge, know-how, or technical materials transfers ownership of, or rights to, the technologies to others who require them [12,13]. For technology holders, it is advantageous because selling licenses or technical rights yields financial profit that can be invested in new research and development. For buyers, it is beneficial because it reduces the time and cost required for technology development. In recent years, innovation has increasingly taken place through technology convergence [6,14,15]. From this perspective, TT is a good way to cope with changing patterns of technology.
Many scholars have therefore studied how to promote TT, which benefits both innovation and the economy. Lai and Tsai introduced an evaluation methodology to enable efficient technology transfer, based on the analytic hierarchy process (AHP) and fuzzy logic. They analyzed 20 questionnaires on Taiwan’s machinery industry to demonstrate the effectiveness of the proposed method. Although this work suggested a scientific way to evaluate the effectiveness of TT, it is limited because only survey data were used [16]. To help technology developers plan their R&D process in a multi-technology industry, Park et al. proposed a method of analyzing the possibility of technology transfer using patent citation information [17,18]. In addition, Choi et al. presented a technology transfer prediction model based on a scientific method that uses patent information [13]. They extracted quantitative indexes from the patents of each organization through patent analysis and proposed an integrated technology transfer prediction model using a regression model, social network analysis, and a decision tree algorithm.
When evaluating the quality of a patent, it is important to consider its technological description as well as its quantitative indicators. Nevertheless, previous studies have considered only one of the two: either quantitative indicators or technical material. The goal of this study is to address this limitation by proposing a technology transfer prediction model based on the AdaBoost algorithm that uses both the technology topics and the quantitative indexes of a patent. This paper is organized as follows. In Section 2, we describe the core theories used by the proposed model, and in Section 3 we propose a technology transfer prediction model using the ensemble method. To illustrate the proposed model, we carry out a case study in Section 4. In Section 5, we discuss the results of our experiment and suggest areas for future study.
3. Methodology
In this research, we propose a science-based technology transfer prediction model using the AdaBoost algorithm for sustainable technology management. The proposed model covers both the quantitative indexes and the textual information in the patent literature. Patent data comprise various kinds of information on inventions, such as text, numeric information, equations, and figures. To be analyzed, they therefore need to be converted into structured data. The study proceeds as follows. In the first step, we identify the technology topics by analyzing the titles and abstracts in the patent data. In the next step, we extract quantitative indexes from the patent data. Finally, we merge the technology topics and quantitative indexes and build the technology transfer prediction model using AdaBoost.
Figure 3 illustrates this experimental process.
3.1. Collecting the Data
The aim of our study is to produce a technology transfer prediction model using patent data. The patents used for the experiment belong to the domain of inference and machine learning technologies and were filed in the United States before August 2016. Because artificial intelligence (AI) technology is a fusion of various technologies, such as speech recognition, natural language processing, and machine learning, it is difficult for a single company to conduct research and development on all the sub-technologies in the field. Many enterprises therefore adopt a technology trading strategy. For this reason, we collected patent data in the field of inference and machine learning to build the model. The data were collected from the worldwide intellectual property service (WIPS), a Korean provider of patent information. The collected data contained noisy and redundant patents, which we removed. As a result, a total of 711 valid patents were selected for further analysis. The patent search database that we used reports both the applicant and the current assignee of a patent. If the current assignee differs from the original applicant, we consider the patent to have been transferred. Out of the 711 valid patents, 208 were found to have been transferred.
Table 1 shows a detailed description of the collected data.
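The labeling rule described above can be sketched as follows. This is a minimal illustration, not the procedure actually run against WIPS: the record layout, the organization names, and the case and whitespace normalization are all assumptions introduced for the example.

```python
def label_transfer(applicant: str, current_assignee: str) -> int:
    """Return 1 if the current assignee differs from the original applicant, else 0.

    Assumption: names are compared after lowercasing and collapsing whitespace;
    the source does not say how name variants were reconciled.
    """
    def normalize(name: str) -> str:
        return " ".join(name.lower().split())

    return int(normalize(applicant) != normalize(current_assignee))


# Hypothetical records; the organization names are made up for illustration.
records = [
    {"applicant": "Alpha Labs", "current_assignee": "Alpha Labs"},
    {"applicant": "Alpha Labs", "current_assignee": "Beta Corp"},
    {"applicant": "Gamma University", "current_assignee": "gamma  university"},
]

labels = [label_transfer(r["applicant"], r["current_assignee"]) for r in records]
transfer_rate = sum(labels) / len(labels)
```

Applied to the counts reported in this section, the same ratio gives 208 / 711, or roughly 29% of the valid patents.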
3.2. Technology Clustering Using Topic Model
To enable searchers to locate patents easily, patent examiners classify them using the International Patent Classification (IPC), the cooperative patent classification (CPC), or the file index (FI) according to the contents of the patent specification. However, since the classification codes do not contain detailed technical content, the ability to ascertain technical material based on them is limited. Therefore, in our study, the technologies in the target domain are first classified through topic modeling.
Figure 4 illustrates the process of finding the technology topics in this research. To grasp the technological topics of the collected data, we merged the titles and abstracts of the patent documents into a corpus, that is, a structured collection of texts. In a sentence, each word takes a word class according to its grammatical position relative to the predicate, object, and so on. Stop-words such as “the”, “a”, “by”, “as”, and “is” are necessary for building grammatical sentences but carry no value for the analysis in this study. Because such words provide no special meaning in latent Dirichlet allocation (LDA) and needlessly increase the computational complexity, we remove them during preprocessing. In addition, because not every word in the documents is significant for further analysis, the term frequency-inverse document frequency (TF-IDF) method was used to select significant words for the experiment. The surface form of a word also varies with its position in the sentence. If the forms differ, a computer cannot recognize that the words share the same meaning, which can distort the data. In this study, stemming is therefore used to unify the forms of words. We also eliminate any numbers, punctuation, and symbols that may distort the analysis.
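The preprocessing steps can be sketched in a few lines of standard-library Python. This is an illustrative sketch only: the stop-word list is a small sample, `crude_stem` is a hypothetical stand-in for a real stemmer, the TF-IDF variant (relative term frequency times log inverse document frequency) is one common choice, and stemming is applied before weighting here, which may differ from the order used in the study.

```python
import math
import re
from collections import Counter

# Small illustrative stop-word list (assumption; not the list used in the study).
STOP_WORDS = {"the", "a", "by", "as", "is", "an", "of", "and", "for", "to", "in"}


def crude_stem(word):
    """Very rough suffix stripping; a hypothetical stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Lowercase, keep alphabetic tokens only, drop stop-words, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())  # drops digits, punctuation, symbols
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical title-plus-abstract snippets.
corpus = [
    "A method for training neural networks by learning 3 weights.",
    "The system is trained as a classifier for images.",
]
docs = [preprocess(text) for text in corpus]
weights = tf_idf(docs)
```

In this toy corpus the stem "train" occurs in both documents, so its weight is zero and it would be filtered out as uninformative, while terms unique to one document keep a positive weight.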
The data refined through preprocessing are used to fit the LDA model. To fit the model, we adopted the Gibbs sampling method and varied the number of topics K from 2 to 10 to find the optimal value. The number of iterations was set to 500, 1000, and 2000 so that the topics could be represented well. The Dirichlet parameter α is set to the value estimated from the documents, and the parameter β is set to 0.1. Table 2 shows the most suitable parameters for fitting the LDA models, found through repetition.
There are various opinions on how to determine the optimal number of topics [23,24,30,31,32]. For our purposes, the topic of each document is determined by its topic probability θ.
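One plausible reading of this rule, assumed here rather than stated in the text, is that each patent is assigned the topic with the highest probability in its θ vector. The probabilities below are hypothetical.

```python
def assign_topics(theta):
    """Return the 1-based index of the most probable topic for each document.

    theta is a list of per-document topic probability lists. Taking the
    argmax is an assumption about how the topic is chosen from theta.
    """
    return [max(range(len(row)), key=row.__getitem__) + 1 for row in theta]


# Hypothetical document-topic probabilities for three patents and three topics.
theta = [
    [0.10, 0.60, 0.30],
    [0.70, 0.20, 0.10],
    [0.25, 0.25, 0.50],
]
topics = assign_topics(theta)
```

The resulting topic index is a categorical variable, which is why it has to be recoded before being passed to a classifier.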
3.3. Technology Transfer Prediction Model Using the AdaBoost Algorithm
To implement the technology transfer prediction model, we use the AdaBoost algorithm, a typical ensemble method. Figure 5 illustrates the proposed model, and Table 3 shows the variables used in this study. In our experiment, the quantitative indexes are variables such as citation, period, claim, family patent, family country, patent references, and non-patent references. The technology topic represents the technical description of a patent obtained using LDA. As mentioned in the background, however, it is not appropriate to use categorical independent variables directly as inputs to the AdaBoost algorithm. Therefore, the technology topic variable is converted into dummy variables. Among the variables in the dataset, “Transfer” indicates whether a patent has been transferred from the original holder to another party. As described in Section 3.1, the patent search database reports the applicant and current assignee of a patent, and we consider the patent transferred if the assignee has changed. We used “Transfer” as the output variable in this experiment: it takes the value +1 if the patent has been transferred and −1 otherwise. The parameters used in the AdaBoost algorithm were estimated through iteration.
Table 4 shows the parameters used in this experiment.
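To make the modeling step concrete, the following is a minimal sketch of discrete AdaBoost with decision stumps on dummy-coded inputs and ±1 labels. It is a textbook-style illustration and not the configuration used in the experiment: the choice of stumps as weak learners, the number of rounds, the two quantitative columns, and the toy rows are all assumptions.

```python
import math


def one_hot(topic, n_topics):
    """Dummy-code a 1-based topic index into n_topics binary columns."""
    return [1.0 if topic == k else 0.0 for k in range(1, n_topics + 1)]


def stump_predict(stump, row):
    """A stump votes `polarity` when the feature exceeds the threshold."""
    feature, threshold, polarity = stump
    return polarity if row[feature] > threshold else -polarity


def fit_stump(X, y, w):
    """Pick the (feature, threshold, polarity) with the lowest weighted error."""
    best, best_err = None, float("inf")
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for polarity in (1, -1):
                stump = (feature, threshold, polarity)
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(stump, row) != yi)
                if err < best_err:
                    best, best_err = stump, err
    return best, best_err


def adaboost_fit(X, y, n_rounds=10):
    n = len(X)
    w = [1.0 / n] * n  # start with uniform sample weights
    model = []
    for _ in range(n_rounds):
        stump, err = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        # Increase the weight of misclassified rows, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def adaboost_predict(model, row):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in model)
    return 1 if score >= 0 else -1


# Hypothetical toy rows: (citations, claims, topic index out of 3 topics).
raw = [(12, 20, 1), (0, 5, 2), (9, 18, 1), (1, 7, 3), (15, 25, 3), (2, 6, 2)]
X = [[float(c), float(k)] + one_hot(t, 3) for c, k, t in raw]
y = [1, -1, 1, -1, 1, -1]  # +1 = transferred, -1 = not transferred
model = adaboost_fit(X, y, n_rounds=5)
predictions = [adaboost_predict(model, row) for row in X]
```

The toy rows are separable by a single threshold on the citation count, so the ensemble reproduces the labels exactly; real patent data will not be this clean, and performance should be judged on held-out patents.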
To validate the performance of the proposed model, we consider accuracy, specificity, and sensitivity, which are commonly used to evaluate classification performance. The measures are shown in detail in Table 5.
The accuracy represents the overall performance of the classifier; it considers true positives (TP) and true negatives (TN) together. However, when a classifier fits noisy training data too closely, overfitting may occur, and accuracy alone may then fail to reflect the true performance of the classifier. To overcome this problem, we also evaluate the proposed model by its sensitivity, which measures how well actual positives are identified, and its specificity, which measures how well actual negatives are identified.
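The three measures can be computed directly from the confusion counts. The sketch below assumes the standard definitions, accuracy = (TP + TN) / (TP + TN + FP + FN), sensitivity = TP / (TP + FN), and specificity = TN / (TN + FP), with +1 as the positive (transferred) class; the labels are hypothetical.

```python
def classification_measures(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for labels coded as +1 / -1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == -1 and p == -1)
    fp = sum(1 for t, p in pairs if t == -1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == -1)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }


# Hypothetical labels: three transferred and five non-transferred patents.
y_true = [1, 1, 1, -1, -1, -1, -1, -1]
y_pred = [1, -1, -1, -1, -1, -1, -1, 1]
measures = classification_measures(y_true, y_pred)
```

Here the accuracy is 0.625 even though only one of the three transferred patents is found (sensitivity of about 0.33), which illustrates why accuracy alone can be misleading when the classes are imbalanced.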
4. Experiment Result
To carry out this study, we collected data for the technical field of inference and machine learning according to the conditions described in Section 3.1.
First, to examine the trend of the collected data, we plot patent applications and technology transfers in Figure 6. In Figure 6, “n” denotes the number of patent applications by year, and “tr” denotes the number of technology transfers by year.
As can be seen, patent applications in this technology field increased steadily from 1995 to 2008. In 2009, applications dropped sharply compared with 2008, then increased again until 2013. From 2014 onward, however, applications have decreased again. Technology transfers first appear in the chart in 1997. Although the number of technology transfers is not large, it has gradually increased since 2008 and rose sharply in 2014.
We would like to know the trends for specific technologies, but the patent data do not have technical classification information. Therefore, in this study, a technical classification is performed using LDA, as described above, with the parameters shown in Table 2. Several methods for selecting the optimal number of topics have been proposed [23,24,31,32,33]. We used the method proposed by Cao et al., which we consider the most appropriate for this study [31]. The result is shown in Figure 7.
Using the method described by Cao et al., the measure reached its minimum at k = 5, which we took as the optimal number of topics. Based on this result, we classified the technology topics of the collected data.
Table 6 shows the results. The top ten keywords included in each topic were utilized to define the technology topic.
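The selection of the number of topics can be illustrated with a simplified sketch. It assumes that the criterion amounts to choosing the candidate whose topic-word distributions have the lowest average pairwise cosine similarity; the full procedure of Cao et al. [31] involves more than this, and the topic-word matrices below are hypothetical rather than fitted.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two topic-word vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_topic_similarity(topic_word):
    """Mean cosine similarity over all pairs of topics (lower = better separated)."""
    pairs = list(combinations(topic_word, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Hypothetical topic-word distributions for two candidate numbers of topics.
candidates = {
    2: [[0.5, 0.5, 0.0, 0.0], [0.0, 0.5, 0.5, 0.0]],
    3: [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]],
}
scores = {k: average_topic_similarity(tw) for k, tw in candidates.items()}
best_k = min(scores, key=scores.get)
```

In practice one such score would be computed for each fitted model over the candidate range, and the number of topics with the minimum score would be retained.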
As a result of the technical classification, “natural language understanding” accounted for the largest share, about 22.4%, of the field of inference and machine learning. “Expert system” technology followed with 21.4%, and then signal processing, image processing, and artificial neural network technology.
Table 7 shows the number of technology transfers for the above technology fields.
The overall transfer rate of the collected data was about 29% (208 of 711 patents). Broken down by the LDA-based technology classification, the transfer rate for Topic 3 (natural language understanding) is 38%, which is higher than in the other technology fields.
To generate the technology transfer prediction model proposed in this study, we merged the topic model results with the quantitative patent indexes described in Table 3 and applied the AdaBoost algorithm. To assess the proposed method, we compared its performance with that of the K-nearest neighbor classifier (K-NN), support vector machine (SVM), and neural network algorithms, which are representative classification algorithms. Performance is measured by the accuracy, sensitivity, and specificity described in Section 3. We also compare models that include technology topics with those that do not. The experimental results are shown in Tables 8 and 9 and Figures 8 and 9. In Figures 8 and 9, NT refers to a model that does not include the technology topic, and YT refers to a model that includes it.
We first compared models that include the technology topic with models that do not. The classification performance of the technology transfer prediction model that includes technology information from the patent text is superior to that of the model without it. The sensitivity of the model without the technical content is notably lower than that of the model with it because overfitting occurs in the former. This suggests that technology information is an important factor in predicting technology transfer.
Next, we compared the proposed method with models based on the K-nearest neighbor classifier (K-NN), support vector machine (SVM), and neural network. The K-NN is simple in structure but performs well, and it is therefore used in many classification problems [33,34,35,36]. The support vector machine and the neural network are also well known for their classification performance and are applied in various fields [37,38,39,40]. In the comparison, the accuracy of the models was similar overall. However, in terms of sensitivity and specificity, which reflect the true positives and the true negatives, respectively, there was a significant difference between the proposed model and the other models: the sensitivity and specificity of the other models were lower than those of the proposed model. This seems to be due to overfitting in those models. These results show that the proposed model performs better than the other models in predicting technology transfer. It can therefore be inferred that the proposed model, based on patent data, is suitable for predicting technology transfer.
5. Discussion
The advancement of science and technology has made human life more convenient than ever, but competition has also become very intense. In today’s technology-intensive market environment, companies strive to survive through sustained growth. They do so in a variety of ways, such as forming self-driving car alliances. Technology transfer is likewise a management strategy for maintaining technological competitiveness and sustaining the growth of companies. Previous studies have used surveys or patent data to promote technology transfer, but no systematic model has been suggested.
This study proposes a predictive model of technology transfer based on an ensemble method to support the continuous growth of enterprises and countries. The proposed model can predict the transferability of patents. In the experimental results, the proposed model showed better classification performance than the other models. If companies or research institutes use the predicted results, it is possible to select patents with a high potential for being transferred, which can increase the success rates of the transactions. The capital acquired through technology transfer can be reinvested in the activities for continuous growth of enterprises or research institutes.
In future work, we expect to improve the generalization performance of the model by using various technical data. In addition, it is necessary to develop an additional algorithm that is able to enhance the performance of prediction.
6. Conclusions
In recent years, intellectual property has become an indispensable element for the sustainable growth of a corporation. Many technology-intensive companies have tried to maintain their competitiveness directly through their own research and development. However, because technological development now pursues convergence, it is difficult to secure competitiveness using traditional methods alone. As an alternative, technology transfer is becoming widely used. Technology transfer should be encouraged because it gives not only companies but also the universities and research institutes that developed the technologies opportunities to create new technologies.
Previous work has sought to identify the factors needed to predict technology transfers, and models that consider either quantitative elements or technical content alone have been proposed. However, these studies have not produced a technology transfer prediction model that considers both. In this study, we proposed a methodology for predicting technology transfer to enable more effective technology transfers. LDA was used to capture the technical content of the collected patent data, and its results were used as variables representing patent technologies in the proposed model. Quantitative factors of the patents were also extracted. The technology transfer prediction model, based on both the LDA results and the quantitative patent variables, was then built using the AdaBoost algorithm, a representative ensemble method. The accuracy, sensitivity, and specificity of the proposed model were superior to those of the other methods we compared.
With the proposed model we were able to predict technology transfer, and the following advantages are expected. The quantitative elements of patents differ across technology areas, so a technology transfer prediction that uses only quantitative patent factors is difficult to generalize. Because the proposed model includes information on the technology, it reflects the technology field and can account for the differences in quantitative factors between technical areas. The result of the technology transfer prediction can also be taken into consideration when evaluating the value of a technology.