1 Introduction
Data mining, proposed in the mid-1980s, has developed rapidly. It is a family of automated data analysis techniques based on databases or data warehouses. It can quickly find new knowledge in vast amounts of data that is valuable, meaningful and expressed as specific relationships. This new knowledge can provide managers with effective decision support. But some problems remain. For example, because the database is static, the knowledge gained is also static and does not reflect the dynamic nature of knowledge.
Human history is a history of solving contradictory problems. People have dealt with contradictory problems in the past, deal with them in the present and will deal with them in the future. The tool for solving contradictory problems is transformation. Therefore, research on extension transformations is an important part of solving contradictory problems. In modern society, all trades and professions have accumulated vast amounts of data. The question now is: how can we mine transformation knowledge from these data to help us solve contradictory problems? This question poses a new subject for data mining.
Extension data mining [1, 2] (EDM) is the product of combining Extenics with data mining. It was proposed in 2004. After several years of research and exploration, the object and objective of this study have been clarified. Using the theory and methods of Extenics, EDM mines from databases the knowledge that is related to solving contradictory problems. This knowledge includes extension classification knowledge, conductive knowledge and other knowledge associated with transformation, which are collectively called extension knowledge. At present, we have preliminarily explored some questions of EDM, namely its basic theory, its basic methods and their implementation on computers.
2 The set theory foundation of EDM
Set theory is a mathematical method that describes how the human brain classifies and identifies objective things. At the end of the 19th century, the German mathematician Cantor proposed the Cantor set, which is used to classify things that are definite. The Cantor set uses a characteristic function with values in {0, 1} to characterize whether an object belongs to the set. The Cantor set cannot describe the fuzziness of things or fuzzy things. In 1965, the American researcher Zadeh proposed the concept of the fuzzy set, which characterizes the degree to which something has a certain property by using a function with values in the interval [0, 1]. However, in numerous cases, the degree to which something has a certain property is changeable. During problem solving, things with a certain property change into things without it, and problems with contradictions change into problems without them. In order to describe both the positive and negative aspects under a certain condition, the concept of the extension set was proposed, which describes the degree to which something has a certain property by using a function with values in the interval (-∞, +∞). The extension set also uses the extension domain to describe how something with a certain property can be turned into something without it. In other words, the extension set is a changeable set that describes the variability of things and studies variable classification. EDM mines changeable knowledge by using the extension thinking of the extension set.
The extension set is the set-theoretic foundation for studying variable classification. It has a wide range of applications, such as market segmentation, enterprise customer classification, product classification and customer value research.
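The contrast between the three kinds of set can be illustrated with a minimal Python sketch. It is not the formulation used in the EDM literature: the function names, the trapezoidal fuzzy membership, the simple interval-based dependent degree and the four outcome labels are all assumptions chosen for illustration. The only points taken from the text are the value ranges {0, 1}, [0, 1] and (-∞, +∞), and the idea that a transformation can change whether an object has a property.

```python
def classical_membership(x, low, high):
    """Characteristic function of a classical set: 1 if x lies in [low, high], else 0."""
    return 1 if low <= x <= high else 0


def fuzzy_membership(x, low, high, margin):
    """Illustrative trapezoidal membership in [0, 1] (assumed shape):
    1 inside [low, high], falling linearly to 0 over `margin` outside it."""
    if low <= x <= high:
        return 1.0
    gap = (low - x) if x < low else (x - high)
    return max(0.0, 1.0 - gap / margin)


def dependent_degree(x, low, high):
    """Illustrative real-valued degree in (-inf, +inf) (assumed form):
    positive inside the interval, zero on its boundary, negative outside."""
    distance = abs(x - (low + high) / 2) - (high - low) / 2
    return -distance


def effect_of_transformation(x, transform, low, high):
    """Compare the degree before and after a transformation of x.
    The four labels are hypothetical names, not standard terminology."""
    before = dependent_degree(x, low, high)
    after = dependent_degree(transform(x), low, high)
    if before <= 0 < after:
        return "becomes a member"
    if before > 0 >= after:
        return "ceases to be a member"
    return "stays a member" if before > 0 else "stays a non-member"
```

For example, with a hypothetical "qualified" interval of [60, 100], a score of 50 has classical membership 0 and dependent degree -10.0, while `effect_of_transformation(50, lambda s: s + 15, 60, 100)` reports that the object becomes a member. The classical and fuzzy functions only describe the current state; the last function is the one that reflects the variable classification discussed above.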
3 The Main Contents of EDM
Extension data mining (EDM) focuses on mining extension knowledge based on extension transformations. At present, the objects mined by EDM are mainly relational databases and data warehouses, which typically hold structured data. As research deepens and technology develops, the research objects will gradually be extended to semi-structured and unstructured data, such as text, image, video and Web data.
EDM matches data with elements, and matches the database or data warehouse with the domain of the extension set. In theory, EDM applies extension set theory and extension logic to data mining and thereby forms the basic theory of mining extension knowledge. In method, EDM matches the database or data warehouse with a formal system built on elements as its logical cells, and forms a knowledge representation method that is suitable for data transformation. Using extension reasoning and extension tools such as the correlation function, we can establish extension data mining methods suitable for mining extension knowledge.
Currently, extension data mining research mainly focuses on database-based mining of extension classification knowledge, conductive knowledge and extension clustering knowledge, on extension knowledge mining based on a knowledge base, and on the computer implementation of these methods together with case studies.
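The matching of data with elements can be sketched as follows. This is only an assumed illustration: it represents an element as an (object, characteristic, value) triple, maps one relational row to a list of such triples, and applies a transformation to one characteristic. The class name `BasicElement`, the helper functions and the sample column names are hypothetical and are not taken from the source.

```python
from typing import Any, Callable, Dict, List, NamedTuple


class BasicElement(NamedTuple):
    """An element written as an (object, characteristic, value) triple."""
    obj: str
    characteristic: str
    value: Any


def row_to_elements(row: Dict[str, Any], key: str) -> List[BasicElement]:
    """Turn one relational row into elements, one per non-key column.
    The key column names the object; every other column is a characteristic."""
    obj = str(row[key])
    return [BasicElement(obj, col, val) for col, val in row.items() if col != key]


def apply_transformation(
    elements: List[BasicElement],
    characteristic: str,
    transform: Callable[[Any], Any],
) -> List[BasicElement]:
    """Return new elements in which the value of one characteristic is transformed."""
    return [
        e._replace(value=transform(e.value)) if e.characteristic == characteristic else e
        for e in elements
    ]


# Hypothetical customer row from a relational table.
row = {"customer_id": "C001", "monthly_spend": 120, "visits": 3}
elements = row_to_elements(row, key="customer_id")
transformed = apply_transformation(elements, "visits", lambda v: v + 2)
```

Keeping the original and transformed elements side by side is what allows a mining method to compare an object's classification before and after a transformation, which is the kind of representation "suitable for data transformation" that the paragraph above refers to.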
4 The development prospects of Extension data mining
With the rapid development of information technology, management information systems, the Internet, data mining and knowledge management are accumulating more and more data, information and knowledge. We are moving from the era of data explosion and knowledge explosion into the era of information overload. In such an era, business decision-makers are in greater need of practical knowledge. For example, in an increasingly competitive market, customers have become an important resource. Transformation knowledge can help turn newly registered users and ordinary customers into loyal customers, thereby reducing the cost of customer retention and new customer development. In credit risk analysis, it is necessary not only to identify high-risk customers but also to take measures to stop fraud by customers who have a motive to commit it; classification methods and their extension-related knowledge have great development potential here. In new product development, implication analysis helps to find product trends earlier and to identify the potential needs of customers. In business process optimization, extension data mining can help to identify efficiency bottlenecks and to take transformation measures. In the medical industry, transformation knowledge can help doctors detect fundamental changes in symptoms much earlier and identify the most effective programs to improve treatment. In marketing, transformation knowledge can guide market development. All in all, extension data mining can play a role in classification transformation, in finding the root causes of problems, in identifying potential transformation knowledge, and so on. Therefore, extension data mining has broad application prospects.
At present, some experts in the extension data mining area have studied the data mining problems mentioned above from the standpoints of extension theory, methods, algorithms, applications and technical improvements, using the principles and methods of Extenics, and have obtained some results. With deeper study and stronger research efforts, results of greater application value can be expected.
The knowledge-based economy has greatly accelerated the pace of economic globalization. The ever-changing environment shortens the update cycle of information and knowledge. Innovation and the resolution of contradictory problems have therefore become increasingly important work in all kinds of occupations, and how to mine transformation knowledge has become an important task of data mining research.
At present, the objects studied by extension data mining are limited to relational databases and data warehouses. In fact, when studying text, image, video, Web data and so on, we should also consider the effect of transformations on the data. These also belong to the study area of extension data mining.