Data-Driven Constructive Induction: AQ17-DCI

(Michalski, Bloedorn, Wojtusiak)

Most machine learning programs view the problem of learning an inductive hypothesis as a search for the “best” hypothesis in the given representation space. This works well when a domain or machine learning expert has already formulated the problem using attributes that are relevant and simply related to the target concept.

However, finding a representation that is well suited to the problem is not a trivial task. Several aspects have to be determined:

1) What attributes are relevant to the given task?
2) What values should those attributes take?
3) Are the concept boundaries easily describable in the language and bias of the given learner?

In this project, we have developed a method for automatically answering these questions. In our data-driven constructive induction approach, decisions about how to change the representation space are based on information and heuristics derived from the data. The data-driven approach can both expand the representation space, through attribute construction, and reduce it, through attribute removal and abstraction.
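The three kinds of change named above (construction, removal, abstraction) can be illustrated with a minimal Python sketch. This is not the AQ17-DCI algorithm: the pairwise equality operator, the information-gain criterion, the `min_gain` threshold, the equal-width binning, and all function names are assumptions chosen only to make the idea concrete.

```python
from collections import Counter
from itertools import combinations
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Reduction in class entropy obtained by splitting on one attribute."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def abstract(values, n_bins=3):
    """Abstraction: map numeric values onto equal-width interval indices."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid a zero width for constant columns
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def expand_space(columns):
    """Expansion: add an equality attribute for every pair of attributes."""
    new = dict(columns)
    for a, b in combinations(columns, 2):
        new[f"{a}=={b}"] = [int(x == y) for x, y in zip(columns[a], columns[b])]
    return new


def reduce_space(columns, labels, min_gain=0.1):
    """Reduction: keep only attributes whose information gain reaches min_gain."""
    return {name: vals for name, vals in columns.items()
            if information_gain(vals, labels) >= min_gain}


# Toy data (hypothetical): the class is 1 exactly when x equals y.
columns = {
    "x": [1, 2, 3, 1, 2, 3, 1, 2],
    "y": [1, 2, 1, 3, 2, 3, 2, 1],
    "noise": [0, 0, 0, 0, 0, 0, 0, 0],
}
labels = [1, 1, 0, 0, 1, 1, 0, 0]

expanded = expand_space(columns)
reduced = reduce_space(expanded, labels)
print(sorted(reduced))  # prints ['x==y']
```

In this toy data, neither `x` nor `y` is informative on its own, so both are removed, while the constructed attribute `x==y` separates the classes completely. The choice of operators and of the relevance measure is where a real system would apply its own heuristics and any user-supplied constraints.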

A functional diagram of the DCI method. Changes to the Representation Space are based on Data and on expert advice in the form of constraints provided by the user.

Data-driven constructive induction has been successfully applied to a number of different problems. These range from artificial domains, such as those in the 1st International Machine Learning Competition (Monk’s Problems), to real-world domains, such as predicting the voting patterns of members of the House of Representatives and predicting the size of the Gross National Product (GNP) of countries around the world.
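To see why a changed representation can help on such a problem, consider the first Monk’s problem, whose target concept is commonly stated as (a1 = a2) or (a5 = 1). The sketch below is only an illustration under that assumption, not output of AQ17-DCI; the attribute name `a1_eq_a2` and the two rule functions are hypothetical. It checks that a description with one condition per value of a1 and a description using a single constructed equality attribute cover exactly the same examples.

```python
from itertools import product

# Attribute domains a1..a6 of the Monk's problems (432 instances in total).
DOMAINS = [(1, 2, 3), (1, 2, 3), (1, 2), (1, 2, 3), (1, 2, 3, 4), (1, 2)]


def monk1(a):
    """Assumed target concept of the first Monk's problem."""
    return a[0] == a[1] or a[4] == 1


def rules_original(a):
    """Description in the original attributes: four conjunctive rules."""
    return ((a[0] == 1 and a[1] == 1) or
            (a[0] == 2 and a[1] == 2) or
            (a[0] == 3 and a[1] == 3) or
            a[4] == 1)


def rules_constructed(a):
    """Description after adding the constructed attribute: two rules."""
    a1_eq_a2 = int(a[0] == a[1])  # hypothetical constructed attribute
    return a1_eq_a2 == 1 or a[4] == 1


instances = list(product(*DOMAINS))
positives = [a for a in instances if monk1(a)]
print(len(instances), len(positives))  # prints 432 216
```

Both rule sets agree with the target concept on every instance, but the second needs half as many rules, and the gap grows with the number of values the compared attributes can take.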

Selected References

Wojtusiak, J., “Data-driven Constructive Induction in the Learnable Evolution Model,” Proceedings of the 16th International Conference Intelligent Information Systems, Zakopane, Poland, June 16-18, 2008.

Wojtusiak, J., “Handling Constrained Optimization Problems and Using Constructive Induction to Improve Representation Spaces in Learnable Evolution Model,” Ph.D. Dissertation, College of Science, Reports of the Machine Learning and Inference Laboratory, MLI 0-3, George Mason University, Fairfax, VA, November, 2007.

Bloedorn, E. and Michalski, R.S., “Data-Driven Constructive Induction: A Methodology and Its Applications”, Special issue on Feature Transformation and Subset Selection, IEEE Expert, Huan Liu and Hiroshi Motoda (Eds.), 1997.

Bloedorn, E. and Michalski, R.S., “The AQ17-DCI System for Data-Driven Constructive Induction and Its Application to the Analysis of World Economics,” Proceedings of the Ninth International Symposium on Methodologies for Intelligent Systems (ISMIS-96), Zakopane, Poland, June 10-13, 1996.

Bloedorn, E. and Michalski, R.S., “Data-Driven Constructive Induction in AQ17-PRE: A Method and Experiments”, Proceedings of the IEEE International Conference on Tools for AI, San Jose, CA, Nov. 1991. p. 30-27.

Bloedorn, E. and Michalski, R.S., “Constructive Induction from Data in AQ17-DCI: Further Experiments”, Reports of the Machine Learning and Inference Laboratory, MLI91-12, George Mason University, Fairfax, VA, 1991.

For more references, see publications section.