experimental software developed by the machine learning and inference laboratory

The MLI Laboratory has developed many experimental programs for machine learning, data mining and knowledge discovery, machine inference, knowledge visualization and related tasks. Among its major programs are: ABACUS, AQ11, AQ15c, AQ16 (POSEIDON), AQ17-DCI, AQ18, AQ19, AQ21, CLUSTER, EMERALD-AQ, EMERALD-SUN, iAQ, INDUCE1..4, INLEN, ISHED, VINLEN, KV1 and KV2, LEM1, LEM2, LEM3, RT, SPARC/E and SPARC/G. These programs are briefly described below.

Some programs can already be downloaded directly from this website; they are placed at the beginning of the list. As time permits, we will try to make other programs downloadable too.

Current work focuses on the implementation of the AQ21, LEM3, and ISHED systems.

Older programs were developed on different platforms, and may not work on PCs. To obtain them, please contact Dr. Janusz Wojtusiak. Although we try to maintain source code and executables for everything developed in MLI, some older programs, particularly those developed before 2000, may not be available. We will try to locate them, but it may take time.

The figure above is the opening screen of the 1999 JAVA Version of the EMERALD system developed in the Machine Learning and Inference Laboratory. The EMERALD system (Experimental Machine Example-based Reasoning and Learning Disciple) integrates five modules (“robots”) each displaying a capability for some form of learning and discovery. An earlier version of EMERALD was presented at a national exhibit “Robots and Beyond: Age of Intelligent Machines” which toured eight major U.S. museums of science.

downloadable software

AQ21 can be viewed as a laboratory for performing experiments in machine learning and knowledge discovery. It is the newest member of the AQ family of systems developed over the years in the Machine Learning and Inference Laboratory. It consists of a learning module that generates general inductive hypotheses from data, and a testing module that applies the learned hypotheses to testing data. The learned hypotheses are in the form of rulesets in attributional calculus, a logic system that combines elements of propositional, predicate and multi-valued logic. AQ21 can generate many different types of descriptions, such as complete and consistent generalizations, approximate theories, strong patterns, discriminant or characteristic descriptions, or descriptions with exceptions. The descriptions are optimized according to task-dependent criteria. The testing module, which is fully integrated with the learning module, includes a variety of methods for testing the learned hypotheses. AQ21 runs on Linux and Windows platforms. To download AQ21 click here. The AQ21 version available here was created in 2004. To obtain the most recent version, please contact Dr. Janusz Wojtusiak.

The iAQ program demonstrates Natural Induction, that is, the ability of a computer program to learn knowledge from data in forms that are natural to people and therefore easy to understand and interpret.

In iAQ, discovered rules are expressed verbally and also as natural language text. The program has an entertaining introduction, accompanied by music. iAQ can be run on any Windows XP system. Because the program produces voice and sound output, speakers must be attached to the computer to run it. An important new feature of iAQ (not yet fully completed) is the “Your data” option, which allows the user to apply the learning program to the user’s own data. To learn about the AQ learning methodology, click on “Publications” in the main menu, and then search for the numerous papers that have “AQ” in the title.

To download iAQ click here.

The LEM3 system implements a novel, non-Darwinian methodology for evolutionary computation, called the Learnable Evolution Model or LEM. LEM employs a learning program to guide the evolutionary computation. Instead of conventional random mutations and recombinations, LEM employs hypothesis formation and generation operators to create new populations of individuals. Initial experiments have shown that LEM can very significantly speed up evolutionary computation in terms of the number of fitness function evaluations, and can be particularly useful for very complex optimization and design applications in which such evaluation is not trivial. LEM3 uses the AQ21 learning module for hypothesis formulation to guide the evolution process. LEM3 runs on Linux and Windows platforms. To download LEM3 click here.
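
The learn-then-generate loop described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the LEM3 implementation: the learner is a simple stand-in that describes the high-fitness group by per-variable intervals (LEM3 itself uses AQ21 rules), and the objective function, parameter values, and all function names are hypothetical.

```python
import random

def fitness(x):
    # Hypothetical objective: negative sphere function, best value 0 at the origin.
    return -sum(v * v for v in x)

def learn_hypothesis(high_group):
    # Stand-in learner (assumption): describe the high-fitness group by the
    # interval each variable occupies. LEM3 uses AQ21 rules instead.
    dims = len(high_group[0])
    return [(min(ind[d] for ind in high_group), max(ind[d] for ind in high_group))
            for d in range(dims)]

def instantiate(hypothesis, rng):
    # Generate a new individual that satisfies the learned description.
    return [rng.uniform(lo, hi) for lo, hi in hypothesis]

def lem(dims=3, pop_size=30, generations=25, top_fraction=0.3, seed=0):
    rng = random.Random(seed)
    population = [[rng.uniform(-5.0, 5.0) for _ in range(dims)]
                  for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        high_group = ranked[:max(2, int(top_fraction * pop_size))]
        hypothesis = learn_hypothesis(high_group)
        # Keep the best individual; create the rest by instantiating the hypothesis
        # rather than by random mutation or recombination.
        population = [ranked[0]] + [instantiate(hypothesis, rng)
                                    for _ in range(pop_size - 1)]
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history
```

Calling `lem()` returns the best individual found and the best fitness per generation. Because the best individual is always carried over, the recorded fitness never decreases from one generation to the next.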

To learn about the LEM methodology, click on “Publications” in the main menu, and search for papers that have “Learnable Evolution Model” in the title. The LEM3 version available here was created in 2004. To obtain the most recent version, please contact Dr. Janusz Wojtusiak.

older software

ABACUS 2 is a program for integrated quantitative and qualitative discovery. Specifically, given data consisting of numeric and possibly also symbolic characterizations of some phenomenon (an object, a process, a system), ABACUS will generate mathematical equations characterizing this phenomenon and qualitative conditions under which these equations apply. These equations can then be used for predicting the behavior of this system or process. For example, given data characterizing an electric circuit (voltage, current, resistance, and any other relevant or irrelevant properties), ABACUS will generate Ohm’s Law.
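
As a rough illustration of the quantitative part of this kind of discovery, the sketch below searches numeric data for relations of the form a = b * c. It is not the ABACUS algorithm: the search space is deliberately tiny, no qualitative conditions are derived, and the function name and the circuit measurements are made up for the example.

```python
from itertools import permutations
import math

def find_product_laws(rows, rel_tol=1e-6):
    """Return (a, b, c) name triples for which a == b * c holds in every row."""
    names = sorted(rows[0])
    laws = []
    for a, b, c in permutations(names, 3):
        if b > c:  # b*c and c*b are the same candidate; keep one ordering
            continue
        if all(math.isclose(r[a], r[b] * r[c], rel_tol=rel_tol) for r in rows):
            laws.append((a, b, c))
    return laws

# Hypothetical circuit measurements; "temperature" is an irrelevant attribute.
data = [
    {"voltage": 6.0, "current": 2.0, "resistance": 3.0, "temperature": 21.0},
    {"voltage": 10.0, "current": 2.5, "resistance": 4.0, "temperature": 19.5},
    {"voltage": 1.5, "current": 0.5, "resistance": 3.0, "temperature": 25.0},
]
laws = find_product_laws(data)
print(laws)  # [('voltage', 'current', 'resistance')]
```

On this data the only relation that holds in every row is voltage = current * resistance, and the irrelevant temperature attribute is ignored.
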

AQ Family: All of the programs in the AQ family learn general decision rules from examples of decision classes. Here are the standard features of the “base” version of the AQ program:

The learned decision rules are optimized according to user-defined criteria or a default optimality criterion. The criteria refer to the syntactic simplicity of the rules (measured by the number of rules, the number of conditions in the rules, the simplicity of the conditions, or a combination of these factors), and/or the evaluation cost of the rules (the cost of measuring the attributes involved in the rules). The programs allow the user to generate different types of descriptions (“rulesets”), such as discriminant (that discriminate among given decision classes), or characteristic (that specify common features of the objects in the individual classes). The programs can also generate rulesets that have different relations among the rules: intersecting (rules of different classes may logically intersect over areas not covering training examples), disjoint (rules of different classes are logically disjoint) or ordered (rules for each class are totally ordered and must be executed in the given order when applied to a given object). Learned rules are evaluated either by a strict match or by a flexible match. Individual versions of AQ programs have some additional features above the “base” version of the program.
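
The difference between strict and flexible matching can be illustrated with a small sketch. The scoring scheme used here for the flexible match (the fraction of satisfied conditions) is an assumption chosen for simplicity, not necessarily the measure used by any AQ program, and the rule representation, class names, and function names are hypothetical.

```python
def strict_match(rule, example):
    # A rule matches strictly only if every one of its conditions is satisfied.
    return all(example.get(attr) in allowed for attr, allowed in rule.items())

def flexible_match(rule, example):
    # Assumed scoring scheme: fraction of satisfied conditions, in [0, 1].
    satisfied = sum(example.get(attr) in allowed for attr, allowed in rule.items())
    return satisfied / len(rule)

def classify(ruleset, example):
    # Pick the class whose best rule has the highest degree of match.
    scores = {cls: max(flexible_match(rule, example) for rule in rules)
              for cls, rules in ruleset.items()}
    return max(scores, key=scores.get), scores

# Each rule is a conjunction of conditions: attribute -> set of allowed values.
ruleset = {
    "sphere_like": [{"shape": {"round"}, "color": {"red", "green"}}],
    "box_like": [{"shape": {"square"}, "size": {"large"}, "color": {"blue"}}],
}
example = {"shape": "round", "color": "blue", "size": "small"}

label, scores = classify(ruleset, example)
```

Here no rule matches the example strictly, but the flexible match still assigns it to `sphere_like`, whose rule has one of its two conditions satisfied, compared with one of three for `box_like`.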

AQ15c: a plain version of the AQ learning program (implemented in ANSI C). This version is available for SunOS 4.1, MacOS 7.5 and DOS 6.x.

AQ16 (POSEIDON): Plain AQ with mechanisms for optimizing rules by modifying them. There are two mechanisms: TRUNC, which truncates insignificant rules (a form of ruleset specialization), and TRUNC/SG, which modifies rule conditions and truncates insignificant rules (performing both specialization and generalization of rules). Rules are evaluated either by a strict match or by a flexible match. This version is oriented toward learning concepts from noisy data or learning “flexible” concepts that lack a precise definition. The program applies some simple forms of “two-tiered” concept representation. A two-tiered representation consists of a base concept representation (BCR) that captures typical concept properties, and an inferential concept representation that captures non-typical, variable, or exceptional concept properties (see MLI papers on two-tiered concept representation). This version is available for SunOS 4.1.
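
The idea of truncating insignificant rules can be sketched as below. The significance measure used (a rule's share of the examples covered by the ruleset, compared against a threshold) is an assumption made for illustration. The rule strings, coverage counts, and function name are likewise hypothetical and are not taken from AQ16.

```python
def truncate_rules(rules, min_share=0.1):
    """Drop rules whose share of the ruleset's total coverage is below min_share."""
    total = sum(rule["covered"] for rule in rules)
    if total == 0:
        return []
    return [rule for rule in rules if rule["covered"] / total >= min_share]

# Illustrative ruleset for one class; "covered" is the number of training
# examples each rule covers (made-up values).
rules = [
    {"condition": "[color = red] & [size = large]", "covered": 45},
    {"condition": "[shape = round]", "covered": 52},
    {"condition": "[color = blue] & [weight > 7]", "covered": 3},
]
kept = truncate_rules(rules)
```

With these counts the third rule accounts for only 3% of the coverage and is removed. The examples it covered are then no longer matched strictly, which is one reason a flexible match is useful alongside truncation.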

AQ17-DCI: AQ program with Data-driven constructive induction capabilities. These capabilities allow the program to automatically modify the representation of the problem, e.g. adding or removing attributes or removing attribute-values. This version is available for SunOS 4.1.
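
A minimal sketch of the data-driven idea, assuming numeric attributes and two classes: candidate attributes are built from pairs of existing ones and kept only if they separate the classes. The operators tried, the separation test, and all names are illustrative assumptions and do not describe how AQ17-DCI works internally.

```python
from itertools import combinations

def separates(values, labels):
    # True if the value ranges of the two classes do not overlap.
    pos = [v for v, y in zip(values, labels) if y]
    neg = [v for v, y in zip(values, labels) if not y]
    return max(pos) < min(neg) or max(neg) < min(pos)

def construct_attributes(rows, labels):
    # Try the difference and product of every attribute pair and keep the
    # constructed attributes that separate the two classes.
    names = list(rows[0])
    new_columns = {}
    for a, b in combinations(names, 2):
        for op_name, op in (("-", lambda x, y: x - y), ("*", lambda x, y: x * y)):
            values = [op(row[a], row[b]) for row in rows]
            if separates(values, labels):
                new_columns[f"{a}{op_name}{b}"] = values
    return new_columns

# Hypothetical data: the class depends on whether width exceeds height,
# which neither original attribute captures on its own.
rows = [
    {"width": 4, "height": 2},
    {"width": 2, "height": 1},
    {"width": 3, "height": 5},
    {"width": 1, "height": 2},
]
labels = [True, True, False, False]
new_columns = construct_attributes(rows, labels)
```

On this data only the constructed attribute `width-height` separates the classes, so it is the one added to the representation.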

AQ17-HCI: AQ program with Hypothesis-driven constructive induction capabilities. These capabilities allow the program to automatically modify the representation of the problem, e.g. adding or removing attributes.

AQ18 (STAR): An environment for symbolic learning that integrates a large set of modules such as ruleset learning, decision structure learning, constructive induction, ruleset testing and knowledge visualization.

CLUSTER creates meaningful categories and classifications of given entities, and formulates descriptions of these created categories. Each class description is given in conjunctive form involving selected object attributes. CLUSTER has been applied to varied practical problems, including classifying Spanish folk songs and microcomputers, and reconstructing soybean disease categories.

EMERALD-SUN: Integrated Learning Systems for Research and Education. An earlier version of this system, called ILLIAN, was presented at eight major U.S. museums of science.

INDUCE learns structural descriptions of groups of objects, and determines important distinctions between the groups.

SPARC/G: Predicts possible future objects or events by discovering rules characterizing the sequence of objects or events observed so far. Individual objects or events are described by sets of multitype attributes.

SPARC/E: Discovers rules for predicting sequences in Eleusis, a card game that models scientific discovery.

DIAV: Diagrammatic Visualization of learning algorithms and discrete knowledge transmutations.

Knowledge Visualizer: a software system similar to DIAV but developed using Java 1.0.2 and suitable for large problems.

MIST: Software system supporting an application of symbolic learning in computer vision. The current system applies the AQ learning methodology to determine rules that characterize different objects in a visual scene.