# The Details about the Unification Algorithm implemented in Javascript

For details about the Unification Algorithm itself, please scroll down to the chapter "The Unification Algorithm".# Why Javascript?

Often, when I talk with people about my implementation of the Unification Algorithm and I tell them that I use Javascript as the programming language to implement an example, I get a rather sceptic reaction. This doesn't surprise me. Javascript is a very misunderstood programming language.

Some people know Javascript from flashy and irritating web sites. Other people even think that Javascript is like Java. Of course the similarity of the names doesn't help. Javascript is very different from Java.

# What is Javascript? What does it offer?

- An interpreted language (scripting language)
- Syntax is similar to C or Java
- Functional (Values can be functions and higher order functions are possible)
- Functions are closures (Some local variables are kept between invocations)
- Lambdas (anonymous functions) are possible
- Nested functions are possible
- Eager evaluation
- Object based (objects are function closures)
- Objects/Function closures are like dynamic records
- Prototype-based Object Orientation
- Dynamic typing
- Weak or no subtyping
- Very dynamic
- Sometimes a bit confusing
- Supports Perl-like Regular Expressions

# Some links about Javascript

- Wikipedia Javascript
- The World's Most Misunderstood Programming Language
- Classical Inheritance in Javascript
- Private Static Members in Javascript
- jslint: The Javascript Verifier
- Lambda calculus to Javascript source-to-source filter
- Curried JavaScript functions
- JScript is a Functional Language

# The Unification Algorithm

This text is a summary from the exercise text describing the Unification Algorithm, partially expressed differently in a form which I understand better.

The unification algorithm is used in Prolog to infer queries from a
knowledge base using substitutions or for type inference in programming
languages. This requires finding substitutes that make different
logical expressions look identical. This process is called **unification**
and is a key component of all first-order inference algorithms. The
Unify algorithm takes two rules or
sentences and returns a **unifier** for them if one exists:

Unify(

p, q) =θwhere Subst(θ, p) = Subst(θ, q).

An **unifier** is a set of n substitutions {
*var*_1/*val*_1, ...,
*var_n*/*val_n* }
where each variable *var_i* in an expression is
to be replaced by the value *val_i*.
For example the substitution *x/Jane* applied to the expression
*Knows(John, x)* yields the expression *Knows(John, Jane)*.
The unifier has the property that the two rules *p* and *q*
become equal if all substitutions in the set are applied to them.

For some expressions there is more than one unifier. The different
unifiers have different results of the substitutions. We look for
the **most general unifier**, which yields the "most general"
equal form of the substituted rules. I think the most general
form is the form which keeps as many
variables "open" as possible.
For example, the same variable appearing in both rules at the
same place is left as is.

The algorithm consists of the functions Unify and Unify-Var. An occur check in Unify-Var is needed: if a variable is contained in an expression, the unification must fail, because no consistent unifier can be constructed in this case.

In the algorithm the expressions *p* and *q* are
both recursively explored. If one of the expressions is a
variable,
proceed to the function
Unify-Var, which finds and adds a
substitution for the variable. Compound rules and rule lists are
broken up and unified for each
element separately. If *p* and *q* are equal,
however, just return the substitution accumulated so far.

Failures in unification come from rule or argument lists of inequal lengths, or from the occur check, or from two different constants.

# The Rule object explained more in detail

For the Unification Algorithm I defined some "helper classes". One class is "Rule". A Rule object is both a single rule and a rule list (because the Unification Algorithm accepts both single rules and rule lists).

**Some definitions**

- Leaf rule
- A rule without arguments. A variable or a constant. Examples:
*John*or*x*. - Constant
- A leaf rule which is never to be substituted.
- Variable
- A leaf rule which can be substituted.
- Compound rule
- A rule with arguments. Example:
*Knows(x, John)*. - Rule name
- The name of a compound rule, constant or variable. Rule lists don't have a name. Variables have a name of only one letter.
- Rule list
- Zero or more rules. Separated by comma. Arguments of a compound
rule are extracted as a rule list by
`Rule.args()`. - Nil rule
- An empty rule list.
`Rule.head()`and`Rule.tail()`don't work for nil rules (for leaf rules neither)! - Variable names for compound rules
- Not only leafs can be variables. The names of compound rules
can be variables, too. For example
*f(Jane, John)*can be substituted to*Knows(Jane, John)*by {*f*/*Knows*}.

Here some part of the implementation in Javascript (with comments to explain some specialities in Javascript) is shown below. I didn't include everything to keep the example short.

// Objects are function closures. Function arguments are never // typed explicitly in Javascript. function Rule(op, list) { // 'this' is the reference to the closure. // Return the length of the rule/argument list. this.length = function() { return list.length; } // Return the first element of the argument list. this.head = function() { return list[0]; } // Return the rest rule list without the first argument. // slice() is an array method to return a sub-array. // new Rule() is needed because a Rule object must be returned. this.tail = function() { return new Rule("", list.slice(1, list.length)); } // add, join, op, args etc. are omitted. // Define an empty array as default for the parameter 'list' if (!list) list = new Array(); // Is the parameter 'list' a Rule list, extract the array instead if (is_list(list)) list = list.list; // Define an empty name as default for the rule name if (!op) op = ""; // Set the public members name and list this.name = op; this.list = list; } // Rule.prototype is the prototype object of Rule. Functions and // members defined in the prototype are available to all Rule instances. // Pretty-print a rule (using [] to delimit lists if show_list is true) Rule.prototype.dump = function(show_list) { ... } // Non-method functions which work on Rule instances // Non-method, because is_leaf(rule) and others returns false reliably // if rule is undefined, but rule.is_leaf() throws an exception. // Is 'rule' a Rule object? // First test rule, then rule.constructor, then compare it with the // constructor function Rule(). Each Javascript object instance has // the special member 'constructor' which points to the constructor // function. Very useful to find the "type" of an object instance. function is_rule(rule) { return rule && rule.constructor && rule.constructor == Rule; } // A Rule is leaf when it has a name and an empty argument list // rule.length() is the method defined inside the Rule object. // Empty strings evaluate to false. function is_leaf(rule) { return is_rule(rule) && rule.length() > 0 && rule.name; }

To create a rule Knows(John,Jane), one invokes:

var rule = new Rule("Knows", [new Rule("John"), new Rule("Jane")]);

To create a rule list John, Jane, Knows(John,Jane), one invokes:

var list = new Rule("", [ new Rule("John"), new Rule("Jane"), new Rule("Knows", [new Rule("John"), new Rule("Jane")]) ]);

We see that it is easier to use a parser. I wrote a parser. With the parser the rule list is defined more compactly:

var list = parse("John, Jane, Knows(John, Jane)");

The parser is a recursive-descent parser and uses regular
expressions to scan the text. It is very compact (25 lines of code).
The inner function parse(leaf) consumes the string variable (a
closure member of the outer function parse(string)) and terminates
on an empty string or too many closing parentheses. In a preprocessing
phase first white-space is found with the regular expression
`/\s/g` and removed, and
then bad characters and illegal separator strings are catched
with the regular expression
`/.{0,5}([^A-Za-z(),]|[,(][,()]).{0,15}/`. The leading
five and trailing fifteen characters are used for the error
message to show the context. The parser is not perfect (its syntax
is a bit too lax).

# The implementation of Unify in Javascript

The Unification Algorithm itself renders in Javascript in a compact and rather easily understood form. I omitted logging and internal error handling.

function unify(x, y, subst) { if (!subst) subst = new Substitutions(); if (subst == failure) return failure; if (is_leaf_equal(x, y)) return subst; if (is_nil(x) && is_nil(y)) return subst; if (is_variable(x)) return unify_var(x, y, subst); if (is_variable(y)) return unify_var(y, x, subst); // Split up compound rules or rule lists and unify each part separately // For lists, check additionally that they aren't empty (nil) if (is_compound(x) && is_compound(y)) return unify(x.args(), y.args(), unify(x.op(), y.op(), subst)); if (is_list(x) && is_list(y) && !(is_nil(x) || is_nil(y))) return unify(x.tail(), y.tail(), unify(x.head(), y.head(), subst)); return failure; } function unify_var(variable, expression, subst) { var vs = variable.dump(); var es = expression.dump(); if (subst.subst[vs]) return unify(subst.subst[vs], expression, subst); if (subst.subst[es]) return unify(variable, subst.subst[es], subst); if (occurs(variable, expression)) return failure; // Substitution vs/expression found and add it subst.subst[vs] = expression; return subst; }