Knowledge Discovery in Databases: INLEN

(Michalski, Kaufman, Ternstedt, Bloedorn, Kerschberg, Wnek, Michalewicz*, Jodlowski*, Skowronski*)
* Collaborators at the Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland

This project is concerned with the development of a large-scale multi-type reasoning system, called INLEN, for extracting knowledge from databases. The system assists a user in discovering general patterns or trends, meaningful relationships, and conceptual or numerical regularities or anomalies in large databases. The volume of information in a database is often too vast for a data analyst to detect such patterns or regularities unaided. INLEN integrates symbolic learning and statistical techniques with database and knowledge base technologies. It provides a user with “knowledge generation operators” (KGOs) for discovering rules characterizing sets of data, generating meaningful conceptual classifications, detecting similarities and formulating explanations for the rules, generating rules and equations characterizing data, selecting and/or generating new relevant variables or representative examples, and testing the discovered rules on new data.
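To make the idea of a knowledge generation operator more concrete, the following is a minimal sketch and not INLEN's implementation. It assumes that a KGO can be modeled as a function that takes a table of examples and returns a piece of knowledge, here a simple characteristic rule listing the attribute values observed among all examples of a target class. The function name `characterize`, the attribute names, and the toy data are hypothetical.

```python
from typing import Dict, List, Set

Example = Dict[str, str]


def characterize(examples: List[Example], class_attr: str, target: str) -> Dict[str, Set[str]]:
    """Toy 'characterization' operator (hypothetical, not INLEN's algorithm).

    For every attribute other than class_attr, collect the set of values
    observed among examples of the target class. The result reads as a
    conjunctive rule: each attribute must take one of the observed values.
    """
    members = [e for e in examples if e[class_attr] == target]
    rule: Dict[str, Set[str]] = {}
    for e in members:
        for attr, value in e.items():
            if attr == class_attr:
                continue
            rule.setdefault(attr, set()).add(value)
    return rule


# Hypothetical toy data; the values are illustrative only.
data = [
    {"region": "Europe", "literacy": "high", "growth": "low"},
    {"region": "Europe", "literacy": "high", "growth": "medium"},
    {"region": "Asia", "literacy": "medium", "growth": "high"},
]

rule = characterize(data, class_attr="region", target="Europe")
```

Here `rule` describes the "Europe" examples as having high literacy and low or medium growth. An actual rule-learning operator would also generalize and simplify such descriptions and contrast them with other classes, which this sketch does not attempt.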

A screen in which the user may examine or modify the data set to be learned from. In this data set, each example describes a separate country. The Key field provides the country name (which is not learned from), and the values for the other attributes are presented in spreadsheet form.

A screen displaying rules learned from an example set. In this example, the displayed rules describe the conditions under which eye injuries have occurred on a construction site. The numbers of examples supporting each part of the rules are shown in the columns on the right.
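The per-condition support counts mentioned in this caption can be illustrated with a short sketch. It assumes a rule is a conjunction of conditions, each allowing a set of values for one attribute; the function names, attribute names, and records are hypothetical and are not taken from INLEN or from the data shown on the screen.

```python
from typing import Dict, List, Set

Example = Dict[str, str]
Rule = Dict[str, Set[str]]  # attribute -> allowed values (a conjunction)


def condition_support(rule: Rule, examples: List[Example]) -> Dict[str, int]:
    """Count, for each condition of the rule, the examples that satisfy it."""
    return {
        attr: sum(1 for e in examples if e.get(attr) in allowed)
        for attr, allowed in rule.items()
    }


def rule_support(rule: Rule, examples: List[Example]) -> int:
    """Count the examples that satisfy every condition of the rule."""
    return sum(
        1
        for e in examples
        if all(e.get(attr) in allowed for attr, allowed in rule.items())
    )


# Hypothetical injury records; the values are illustrative only.
records = [
    {"task": "welding", "goggles": "no"},
    {"task": "welding", "goggles": "yes"},
    {"task": "grinding", "goggles": "no"},
    {"task": "painting", "goggles": "no"},
]

rule = {"task": {"welding", "grinding"}, "goggles": {"no"}}
per_condition = condition_support(rule, records)
whole_rule = rule_support(rule, records)
```

Reporting support per condition as well as for the whole rule lets a reader see which conditions cover many examples on their own and which ones narrow the rule most.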

An example screen from the INLEN Advisory Module, in which a user may be assisted in making a decision or can speculate on unknown data. The current question for the user is displayed in the middle of the screen, and INLEN’s current best hypotheses are displayed in the top right.

Selected References

Michalski, R.S. and Kaufman, K.A., “A Measure of Description Quality for Data Mining and its Implementation in the AQ18 Learning System,” International ICSC Symposium on Advances in Intelligent Data Analysis, Rochester, NY, June, 1999.

Kaufman, K.A. and Michalski, R.S., “Learning from Inconsistent and Noisy Data: The AQ18 Approach,” Proceedings of the Eleventh International Symposium on Methodologies for Intelligent Systems (ISMIS-99), Warsaw, June, 1999.

Kaufman, K.A. and Michalski, R.S., “Multistrategy Data Mining via the KGL Metalanguage,” Proceedings of the Seventh Symposium on Intelligent Information Systems (IIS’98), Malbork, Poland, pp. 39-48, June 15-19, 1998.

Kaufman, K.A. and Michalski, R.S., “Discovery Planning: Multistrategy Learning in Data Mining,” Proceedings of the Fourth International Workshop on Multistrategy Learning (MSL’98), Desenzano del Garda, Italy, June 11-13, 1998.

Michalski, R.S. and Kaufman, K.A., “Data Mining and Knowledge Discovery: A Review of Issues and a Multistrategy Approach,” in Michalski, R.S., Bratko, I. and Kubat, M. (Eds.), Machine Learning and Data Mining: Methods and Applications, London: John Wiley & Sons, pp. 71-122, 1998.

Kaufman, K.A., “INLEN: A Methodology and Integrated System for Knowledge Discovery in Databases,” Ph.D. Dissertation, School of Information Technology and Engineering, Reports of the Machine Learning and Inference Laboratory, MLI 97-15, George Mason University, Fairfax, VA, November, 1997.

Kaufman, K.A. and Michalski, R.S., “KGL: A Language for Learning,” Reports of the Machine Learning and Inference Laboratory, MLI 97-3, George Mason University, Fairfax, VA, 1997.

Kaufman, K. and Michalski, R.S., “A Method for Reasoning with Structured and Continuous Attributes in the INLEN-2 Knowledge Discovery System,” Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96), Portland, OR, August, 1996, pp. 232-237.

Kaufman, K. and Michalski, R.S., “A Multistrategy Conceptual Analysis of Economic Data,” in Ein-Dor, P. (Ed.), Artificial Intelligence in Economics and Management: An Edited Proceedings on the Fourth International Workshop, Boston, Kluwer Academic Publishers, 1996, pp. 193-203.

Kaufman, K., “Addressing Knowledge Discovery Problems in a Multistrategy Framework,” Proceedings of the Third International Workshop on Multistrategy Learning (MSL-96), Harpers Ferry, WV, May 23-25, 1996, pp. 305-312.

Ribeiro, J., Kaufman, K. and Kerschberg, L., “Knowledge Discovery from Multiple Databases,” Proceedings of the First International Conference on Knowledge Discovery and Data Mining (KDD-95), Montreal, Canada, August, 1995, pp. 240-245.

Michalski, R.S., Kerschberg, L., Kaufman, K.A. and Ribeiro, J.S., “Mining For Knowledge in Databases: The INLEN Architecture, Initial Implementation and First Results,” Intelligent Information Systems: Integrating Artificial Intelligence and Database Technologies, Vol. 1, No. 1, pp. 85-113, August 1992.

Kaufman, K., Michalski, R.S. and Kerschberg, L., “Knowledge Extraction from Databases: Design Principles of the INLEN System,” Proceedings of the Sixth International Symposium on Methodologies for Intelligent Systems, ISMIS’91, October 16-19, 1991.

Kaufman, K.A., Michalski, R.S. and Kerschberg, L., “Mining for Knowledge in Databases: Goals and General Description of the INLEN System,” Knowledge Discovery in Databases, G. Piatetsky-Shapiro and W.J. Frawley (Eds.), AAAI Press/The MIT Press, Menlo Park, CA, 1991.

Kaufman, K.A., Michalski, R.S. and Kerschberg, L., “Mining for Knowledge in Databases: Goals and General Description of the INLEN System,” Proceedings of IJCAI-89 Workshop on Knowledge Discovery in Databases, Detroit, MI, August 1989.

Kaufman, K., Michalski, R.S., Zytkow, J. and Kerschberg, L., “The INLEN System for Extracting Knowledge from Databases: Goals and General Description,” Reports of the Machine Learning and Inference Laboratory, MLI 89-6, School of Information Technology and Engineering, George Mason University, Fairfax, VA, 1989.

For more references, see the publications section.