We show how the additional information provided by the decomposition primitives can be incorporated into a semantic-based dependency analysis technique. The resulting analysis reveals parallelism at compile time that is very difficult, if not impossible, to discover by traditional syntactic analysis techniques. Simulation and implementation results demonstrate scalable and broadly available parallelism.
Intuition suggests that languages based on the production system model admit a high degree of parallelism [Gup86]. Efforts to exploit parallel processing to increase production system performance have been ongoing for over a decade [KM92,Mir91,SG92]. However, the maximum speedup achieved by actual implementation rarely exceeds tenfold and has never done so over a general suite of applications no matter how many processors are used.
Decomposition abstraction is the process by which programmers specify
decomposition strategies for the exploitation of parallelism embodied by
an application. Most parallel programming languages provide constructs
for decomposition. This is almost a must because parallelizing compilers
for sequential languages can not completely resolve name ambiguity and
dependencies at compile-time. Numerous examples of language constructs
for parallel decomposition can be
found in both imperative and declarative parallel languages.
The PARTITION statement in Refined Fortran [KKK89], the DECOMPOSITION
and related statements in Fortran D [HKKKT91], the Ada tasking mechanism
[MF86], the process type in Concurrent C [GR86], the pcall
and future in Multilisp [Hal85], and the notions of blackboard
and theory in Shared Prolog [BC91], just to name a few, are
all examples of language mechanisms or constructs for data or function
decomposition to facilitate the expression of parallelism.
The need for decomposition abstraction has not been well recognized by the research community of production system languages [IS85,KMB91,KM91,Mir91,MKB90,Sch91]. In a pilot paper [WB93], we analyzed several commonly used benchmark programs and pointed out that the rather limited success in the past was primarily due to the failure to properly exploit the parallelism inherent in the application. Contrary to the conclusions drawn by previous work that the true performance gain from parallelism is quite limited [GFN89], we showed that massive and scalable speedup was indeed achievable.
In this research we develop methods of decomposition abstraction suitable for rule-based programming. The methods are declarative. Further, the methods capture intrinsic properties more closely related to the application than to issues of parallel computing. Thus we say that the abstraction mechanisms capture, normally unexpressed, semantic properties of the application. After detailing the semantics of these primitives we develop a semantic-based analysis technique for detecting, at compile-time, rule interference such that, at run-time, multiple instantiations of rules may be executed concurrently. Simulation results demonstrate that decomposition abstraction provides us with a key to scalable and massive parallelism in production systems.
The main results in this research can be summarized as follows.