Fusion Rule Technology

An Overview

Fusion rule technology is being developed to merge structured reports. Syntactically, a structured report is a data structure containing a number of grammatically simple phrases together with a tag (giving semantic information) for each phrase. Each tagged phrase is a textentry. The set of tags in a structured report is meant to parameterize a stereotypical situation, so a particular structured report is an instance of that situation. For example, news reports on corporate acquisitions can be represented as structured reports using tags including buyer, seller, acquisition, value, and date. Each phrase in a structured report is very simple, such as a proper noun, a date, a number with a unit of measure, or a word or phrase from a prescribed lexicon. For an application, the prescribed lexicon delineates the types of states, actions, and attributes that could be conveyed by the structured report.
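
As a rough illustration of this data structure, the following Java sketch models a textentry as a tag paired with a phrase, and a structured report as a list of textentries. The type names (TextEntry, StructuredReport, StructuredReportDemo) and all the phrases are assumptions made for this example; they are not taken from the fusion rule software. It assumes Java 16 or later for records.

```java
import java.util.List;
import java.util.stream.Collectors;

// A textentry: one grammatically simple phrase plus the tag that gives its meaning.
record TextEntry(String tag, String phrase) {}

// A structured report: an ordered collection of textentries.
record StructuredReport(List<TextEntry> entries) {

    // Returns every phrase recorded under the given tag, in report order.
    List<String> phrasesFor(String tag) {
        return entries.stream()
                .filter(entry -> entry.tag().equals(tag))
                .map(TextEntry::phrase)
                .collect(Collectors.toList());
    }
}

class StructuredReportDemo {

    // Builds one acquisition report; every phrase here is a made-up placeholder.
    static StructuredReport sample() {
        return new StructuredReport(List.of(
                new TextEntry("buyer", "Alpha Corp"),
                new TextEntry("seller", "Beta Holdings"),
                new TextEntry("acquisition", "Gamma Ltd"),
                new TextEntry("value", "10 million GBP"),
                new TextEntry("date", "1 March 2004")));
    }

    public static void main(String[] args) {
        StructuredReport report = sample();
        System.out.println(report.phrasesFor("buyer"));
    }
}
```

Keeping the tag separate from the phrase is what lets a merging step treat a date differently from a value or a company name.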

In order to merge structured reports, we need to take account of their content. Different kinds of content need to be merged in different ways. In our approach to merging structured reports we draw on domain knowledge to help produce merged reports. The approach is based on fusion rules defined in an XML file. Each rule has the form X IMPLIES Y: if X is true in the knowledgebase, then Y is an instruction to be carried out in the process of building the output merged report.

To merge a set of structured reports, we start with the background knowledge and the information in the input reports, and apply the fusion rules to this information. Given a set of structured reports and a set of fusion rules, we attempt to ground each fusion rule with textentries from the structured reports. We then check whether all the conditions of each ground fusion rule are implied by the background knowledge. If they are, the ground actions of the rule are added to the actionlist (a list of actions that specify how the merged report should be constructed).
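
The grounding and checking steps can be sketched in Java as follows. This is a minimal sketch under strong assumptions, not the actual fusion engine: a report is reduced to a map from tag to phrase, only two reports are merged, the background knowledge is a plain set of ground facts (so "implied by" is simplified to "listed in the set"), and the names FusionRule, GroundingSketch, sameCompany, and addTextentry are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.BiFunction;

// Hypothetical rule shape: X and Y are bound to the phrases that two reports
// give for the same tag. Conditions and the action are templates over X and Y.
record FusionRule(String tag,
                  List<BiFunction<String, String, String>> conditions,
                  BiFunction<String, String, String> action) {}

class GroundingSketch {

    // Grounds each rule against two reports (tag -> phrase) and collects the
    // ground actions of every rule whose ground conditions all hold.
    static List<String> buildActionList(List<FusionRule> rules,
                                        Map<String, String> reportA,
                                        Map<String, String> reportB,
                                        Set<String> backgroundFacts) {
        List<String> actionList = new ArrayList<>();
        for (FusionRule rule : rules) {
            String x = reportA.get(rule.tag());
            String y = reportB.get(rule.tag());
            if (x == null || y == null) {
                continue; // no textentry to ground the rule with
            }
            boolean allHold = true;
            for (BiFunction<String, String, String> condition : rule.conditions()) {
                // Simplification: "implied by" is reduced to "listed as a fact".
                if (!backgroundFacts.contains(condition.apply(x, y))) {
                    allHold = false;
                    break;
                }
            }
            if (allHold) {
                actionList.add(rule.action().apply(x, y));
            }
        }
        return actionList;
    }

    // Example run with made-up reports and predicate names.
    static List<String> demo(Set<String> backgroundFacts) {
        BiFunction<String, String, String> sameCompany =
                (x, y) -> "sameCompany(" + x + ", " + y + ")";
        BiFunction<String, String, String> keepFirst =
                (x, y) -> "addTextentry(buyer, " + x + ")";
        FusionRule sameBuyer = new FusionRule("buyer", List.of(sameCompany), keepFirst);
        return buildActionList(
                List.of(sameBuyer),
                Map.of("buyer", "Alpha Corp"),
                Map.of("buyer", "Alpha Corporation"),
                backgroundFacts);
    }

    public static void main(String[] args) {
        System.out.println(demo(Set.of("sameCompany(Alpha Corp, Alpha Corporation)")));
    }
}
```

When the background facts do not include the ground condition, the rule does not fire and the actionlist stays empty. A real knowledgebase would derive the condition by inference rather than by lookup.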

The basic architecture for fusion rule technology is based on three key modules implemented in Java:

A fusion engine that executes a rulefile (an XML file containing the fusion rules marked up in FusionRuleML). It grounds each fusion rule with textentries from the structured reports and checks whether all the conditions of each ground fusion rule are implied by the background knowledge; if they are, the ground actions of the rule are added to the actionlist.
An action engine that executes the actionlist to build a merged report.
A knowledge manager that queries Prolog knowledgebases and SQL databases.
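
One way to picture how the three modules fit together is the Java sketch below. The interface names mirror the module names in the text, but their methods, the Action record, and the PipelineSketch class are assumptions made for illustration; this overview does not describe the real class design. The knowledge manager is stubbed with an in-memory set of facts instead of a Prolog or SQL back end, and the fusion engine applies a single stand-in rule.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// One instruction for the action engine: place this phrase under this tag.
record Action(String tag, String phrase) {}

// Hypothetical module boundaries; the actual Java classes may differ.
interface KnowledgeManager {
    boolean holds(String groundCondition); // would query Prolog or SQL
}

interface FusionEngine {
    List<Action> run(List<Map<String, String>> reports, KnowledgeManager knowledge);
}

interface ActionEngine {
    Map<String, String> execute(List<Action> actionList);
}

class PipelineSketch {

    // The fusion engine produces the actionlist; the action engine builds the report.
    static Map<String, String> merge(List<Map<String, String>> reports,
                                     KnowledgeManager knowledge,
                                     FusionEngine fusionEngine,
                                     ActionEngine actionEngine) {
        return actionEngine.execute(fusionEngine.run(reports, knowledge));
    }

    // Toy wiring with made-up facts and report content.
    static Map<String, String> demo() {
        Set<String> facts = Set.of("reliable(date)");
        KnowledgeManager knowledge = facts::contains;

        // Stand-in rule: copy a textentry from the first report when the
        // knowledge manager says its tag is reliable.
        FusionEngine fusionEngine = (inputs, km) -> {
            List<Action> actionList = new ArrayList<>();
            inputs.get(0).forEach((tag, phrase) -> {
                if (km.holds("reliable(" + tag + ")")) {
                    actionList.add(new Action(tag, phrase));
                }
            });
            return actionList;
        };

        // Executes each action by writing its phrase into the merged report.
        ActionEngine actionEngine = actions -> {
            Map<String, String> merged = new LinkedHashMap<>();
            for (Action action : actions) {
                merged.put(action.tag(), action.phrase());
            }
            return merged;
        };

        List<Map<String, String>> reports =
                List.of(Map.of("date", "1 March 2004", "value", "10 million GBP"));
        return merge(reports, knowledge, fusionEngine, actionEngine);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Separating the actionlist from its execution means the rules only decide what should go into the merged report, while the action engine decides how the report is physically built.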

A fusion system is a system for merging structured reports for a particular domain. It incorporates the fusion rule technology modules for executing fusion rules and for executing the resulting actions. It also incorporates a set of fusion rules that has been defined for the application domain together with an appropriate background knowledgebase.

Benefits of the fusion rules approach to merging

Addressing semantic heterogeneity in information integration calls for greater use of more sophisticated knowledge representation and reasoning. Fusion Rule Technology offers a software platform for context-dependent merging of heterogeneous structured information in a way that minimizes conflict and reduces redundancy. Compared with other approaches to information integration, fusion rules offer the following benefits:

We get an actual merged report with a specific structure, which is important for some applications, whereas with many information integration approaches the aim is to answer queries from diverse sources.
We can undertake a deeper logical analysis of the inconsistencies arising between input reports and the background knowledge.
We can do context-dependent merging and so select the aggregations to use depending on the nature of the input reports.
We use deeper background knowledge about the nature of the information to be merged and the sources of that information.
Information to be merged can include information representing uncertainty about sources, uncertainty about results, noise, etc., and this information can be handled and merged using established theories for handling uncertain information, including probability theory and Dempster-Shafer theory.
The merged information can be annotated with meta-information about source, quality, degree of agreement between sources, and type of merging used.
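
To make the reference to Dempster-Shafer theory concrete, the Java sketch below implements Dempster's rule of combination, one standard operation from that theory for merging the beliefs of two sources. The overview does not say which operations the fusion rules actually invoke, so this is offered only as an example of the kind of aggregation meant. The class name DempsterSketch, the hypotheses "completed" and "pending", and the mass values are all made up.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class DempsterSketch {

    // Dempster's rule: multiply masses pairwise, assign each product to the
    // intersection of the two focal sets, and renormalise by 1 - conflict.
    static Map<Set<String>, Double> combine(Map<Set<String>, Double> m1,
                                            Map<Set<String>, Double> m2) {
        Map<Set<String>, Double> combined = new HashMap<>();
        double conflict = 0.0;
        for (Map.Entry<Set<String>, Double> a : m1.entrySet()) {
            for (Map.Entry<Set<String>, Double> b : m2.entrySet()) {
                Set<String> common = new HashSet<>(a.getKey());
                common.retainAll(b.getKey());
                double product = a.getValue() * b.getValue();
                if (common.isEmpty()) {
                    conflict += product; // mass on contradictory pairs
                } else {
                    combined.merge(common, product, Double::sum);
                }
            }
        }
        double norm = 1.0 - conflict;
        if (norm < 1e-12) {
            throw new IllegalArgumentException("Sources are in total conflict");
        }
        combined.replaceAll((focalSet, mass) -> mass / norm);
        return combined;
    }

    // Two made-up sources reporting on whether an acquisition has completed.
    static Map<Set<String>, Double> demo() {
        Set<String> either = Set.of("completed", "pending");
        Map<Set<String>, Double> sourceA = Map.of(Set.of("completed"), 0.6, either, 0.4);
        Map<Set<String>, Double> sourceB = Map.of(Set.of("pending"), 0.5, either, 0.5);
        return combine(sourceA, sourceB);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this example the conflicting mass is 0.6 × 0.5 = 0.3, so the remaining masses are divided by 0.7. The combined mass on "completed" is therefore 0.3 / 0.7, and the masses on "pending" and on the undecided set are each 0.2 / 0.7.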

Overall, Fusion Rule Technology provides a principled, context-sensitive approach to handling the range of uncertainty and inconsistency that arises when merging reports. By taking account of semantic heterogeneity, it can produce aggregated output that is less conflicting, better integrated, and more informative than the input. In a sense, it creates new knowledge.

Contact a.hunter@cs.ucl.ac.uk or +44 20 7679 7295.
