A Prolog Grammar written in Prolog

Presentation of WIP

Falco Nogatz
University of Würzburg

http://falco.io/prolog-grammar

M.Sc.

Falco Nogatz

Lehrstuhl für Informatik I

falco.nogatz@uni-wuerzburg.de
Raum E32
Termin nach Absprache

More feasible is a source-to-source compiler similar to JavaScript code minification. (...) Writing such a conversion has been discussed several times, but so far never found the funding and interest to really do it.
Jan Wielemaker at the SWI-Prolog mailing list
25 Jan 2018

(Even more)

Motivational
Use-Cases

Falco Nogatz and Daniel Haumann: Ein Style Linter für Prolog (Praktikum WS 2016/17)
Jona Kalkus: An Interactive Visualisation for Definite Clause Grammars (Master Thesis 2017)
Covington, Michael A., et al. "Coding guidelines for Prolog." Theory and Practice of Logic Programming 12.6 (2012): 889-927.
PLLint Demo (Mockup)
Seipel, Dietmar, Falco Nogatz, and Salvador Abreu. "Domain-specific languages in Prolog for declarative expert knowledge in rules and ontologies." Computer Languages, Systems & Structures 51 (2018): 102-117.

On the Why and How

A Grammar Formalism

  • Interactive Prolog demos with syntax highlighting
  • Enforce consistent code style
  • Support for multiple Prolog dialects and extensions
  • Source-to-source Transformations
    (Prolog → Prolog, Prolog → Any)

with Prolog and DCGs

  • Eat your own dog food
  • Prolog's DCGs are very powerful
  • Proven to be useful for graph transformations
  • Automatically derive currently (most-) used styles
  • Automatically derive operator definitions

Related Work

ISO-Prolog Standard

ISO/IEC 13211-1 (1995)

6.4.3 Variables

variable token (* 6.4.3 *)
  = anonymous variable (* 6.4.3 *)
  | named variable (* 6.4.3 *) ;
anonymous variable (* 6.4.3 *) 
  = variable indicator char (* 6.4.3 *) ;
named variable (* 6.4.3 *)
  = variable indicator char (* 6.4.3 *),
    alphanumeric char (* 6.5.2 *),
    { alphanumeric char (* 6.5.2 *) }
  | capital letter char (* 6.5.2 *),
    { alphanumeric char (* 6.5.2 *) } ;
variable indicator char (* 6.4.3 *)
  = underscore char (* 6.5.2 *) ;

Prolog: the standard – reference manual

Deransart, P., Ed-Dbali, A., & Cervoni, L. (1996).

9. Prolog Syntax (p. 221ff)


variable_token(V)             -->  anonymous_variable(V) | named_variable(V)

anonymous_variable(['_'])     -->  "_" underscore

named_variable(['_', A | S])  -->  "_", alpa_numeric_char(A),
                                   alpha_num_seq_char(S)

named_variable([C | S])       -->  capital_letter_char(C), alpha_num_seq_char(S)

Tau Prolog (2018)

José A. Riaza Valverde, Miguel Riaza Valverde, José M. García-García

Tau Prolog (2018)

José A. Riaza Valverde, Miguel Riaza Valverde, José M. García-García

Lexer written in pure JavaScript

  var tokenize = (function() {
    var rules = {
      whitespace: /^\s*(?:(?:%.*)|(?:\/\*(?:\n|\r|.)*?\*\/)|(?:\s+))\s*/,
      variable: /^(?:[A-Z_][a-zA-Z0-9_]*)/,
      point: /^\./,
      compound: /^((,|;|[a-z][0-9a-zA-Z_]*|[#\$\&\*\+-\.\/\:<=>\?@\^\~\\]+|'(?:[^']*?(?:\\(?:x?\d+)?\\)*(?:'')*(?:\\')*)*')\()/,
      atom: /^(,|;|[a-z][0-9a-zA-Z_]*|[#\$\&\*\+-\.\/\:<=>\?@\^\~\\]+|'(?:[^']*?(?:\\(?:x?\d+)?\\)*(?:'')*(?:\\')*)*')/,
      number: /^(?:0o[0-7]+|0x[0-9a-f]+|0b[01]+|0'(?:''|\\[abfnrtv\\'"`]|\\x?\d+\\|.)|\d+(?:\.\d+(?:e[+-]?\d+)?)?)/i,
      string: /^(?:"([^"]|""|\\")*"|`([^`]|``|\\`)*`)/,
      l_brace: /^(?:\[)/,
      r_brace: /^(?:\])/,
      l_bracket: /^(?:\{)/,
      r_bracket: /^(?:\})/,
      bar: /^(?:\|)/,
      l_paren: /^(?:\()/,
      r_paren: /^(?:\))/,
      cut: /^(?:\!)/,
      error: /^./
    };

    ...
  })();

CHR.js (2015)

Falco Nogatz

CHR.js (2015)

Falco Nogatz

Using PEG parser generator

Rule
  = name:RuleName?
    rule:RuleWithoutName

RuleWithoutName
  = PropagationRule
  / SimplificationRule
  / SimpagationRule

PropagationRule
  = headConstraints:Constraints __
    "==>" __
    guard:Guard?
    bodyConstraints:Body
A token shall not be followed by characters such that concatenating the characters of the token with these characters forms a valid token as specified by the above Syntax.
ISO Prolog Standard, Section 6.4

Our Approach

  • Write DCG for Prolog as closely as possible to the ISO Standard
  • Use term expansions to automatically create syntax tree
  • Use graph transformations to create abstract syntax tree
  • Use graph transformations to create transpilers and compilers

Recap:

Definite
Clause
Grammars

DCG

Simple Syntax

General Syntax for head//0:

head         -->  body.

Translates into:

head(In, Out) :-  body(In, Out).

Usage with phrase/3:

?- phrase(DCGBody,In,Rest).

DCG

Example: as

as --> [].
as --> [a], as.

Translates into:

as(A, A).
as([a|A], B) :-
  as(A, B).

Example call:

?- phrase(as, Ls, []).
Ls = [] ;
Ls = [a] ;
Ls = [a, a] ;
...

DCG

Semicontext, right-hand context, pushback list

first                 -->  body.
second, [push]        -->  body.

Translates into:

first(In, Out)         :-  body(In, Out).
second(In, [push|Out]) :-  body(In, Out).

Usage with phrase/3:

?- phrase(second,In,[push|Rest]).

DCG

Example: Use semicontext to count as

as, [0] --> [].
as, [N] --> [a], as, [M], { N is M+1 }.

Usage with phrase/3:

?- phrase(as,[a,a,a],[Count|Rest]).
Count = 0,
Rest = [a, a, a] ;
Count = 1,
Rest = [a, a] ;
Count = 2,
Rest = [a] ;
Count = 3,
Rest = [] .

Current Status

ISO Prolog as a DCG

Currently ~ 900 LOC

ISO Prolog as a DCG

Currently ~ 900 LOC

/* 6.4.3 Variables */
variable_token -->                  % 6.4.3
    anonymous_variable              % 6.4.3
  | named_variable.                 % 6.4.3
anonymous_variable -->              % 6.4.3
    variable_indicator_char.        % 6.4.3
named_variable -->                  % 6.4.3
    variable_indicator_char         % 6.4.3
  , alphanumeric_char               % 6.5.2
  , *alphanumeric_char.             % 6.5.2
named_variable -->                  % 6.4.3
    capital_letter_char             % 6.5.2
  , *alphanumeric_char.             % 6.5.2
variable_indicator_char -->         % 6.4.3
    underscore_char.                % 6.5.2

ISO Prolog as a DCG

Quantifiers ? and *

%% op `?` to denote optional occurence
:- op(800, fy, ?).
?_ -->
  [].
?X -->
  X.

%% op `*` to denote any number of occurences
:- op(800, fy, *).
*_ -->
  [].
*X --> 
  X,
  *X.

ISO Prolog as a DCG

Example Usage with phrase/3

?- phrase(variable_token, ['A'], []).
true.

?- phrase(variable_token, ['a'], []).
false.

?- phrase(variable_token, ['_','a'], Rest).
Rest = [a] ;
Rest = [] .

ISO Prolog as a DCG

Conditions

%% "A token shall not be followed by characters such that
%%   concatenating the characters of the token with these
%%   characters forms a valid token as specified by the above
%%   Syntax." (6.4)
token(In, Out) :-
  token_(In, Out),
  (
    % empty rest list
    Out = []
  ;
    Out = [_One_More_Element|Rests],
    % consuming one element does not succeed
    \+ token_(In, Rests)
  ).

Create Syntax Trees
Using Term Expansions

Currently ~ 120 LOC

Create Syntax Trees Using Term Expansions

Currently ~ 120 LOC

term_expansion((H --> [T]), Expanded) :-
  Res =.. [H, T],   % Res = H(T)
  dcg_translate_rule((H, [Res] --> [T]), Expanded).

Example:

underscore_char --> ['_'].

is expanded to

underscore_char, [underscore_char('_')] --> ['_'].

Create Syntax Trees Using Term Expansions

Example Usage with phrase/3

?- phrase(variable_token, ['A'], [T]).
T = variable_token(named_variable([capital_letter_char('A')]))
variable_token -->                  % 6.4.3
    anonymous_variable              % 6.4.3
  | named_variable.                 % 6.4.3
anonymous_variable -->              % 6.4.3
    variable_indicator_char.        % 6.4.3
named_variable -->                  % 6.4.3
    variable_indicator_char         % 6.4.3
  , alphanumeric_char               % 6.5.2
  , *alphanumeric_char.             % 6.5.2
named_variable -->                  % 6.4.3
    capital_letter_char             % 6.5.2
  , *alphanumeric_char.             % 6.5.2
variable_indicator_char -->         % 6.4.3
    underscore_char.                % 6.5.2

Create Syntax Trees Using Term Expansions

Example Usage with phrase/3 (cont.)

?- phrase(variable_token, ['A'], [T]).
T = variable_token(named_variable([capital_letter_char('A')]))

?- phrase(variable_token, ['a'], [T]).
false.

?- phrase(variable_token, ['_','a'], [T|Rest]).
T = variable_token(anonymous_variable(
      variable_indicator_char(underscore_char('_')))),
Rest = [a] ;
T = variable_token(named_variable([
      variable_indicator_char(underscore_char('_')), 
      alphanumeric_char(alpha_char(letter_char(small_letter_char(a))))
    ])),
Rest = [] .

Operator Definitions

As CLP(FD)

Operator Definitions

As CLP(FD)

fy prefix operator:
term(P_o, Operator_Table) -->
    op(Op)
  , term(P_i)
  , {
      member(op(Op, Spec, Prec), Operator_Table),
      Spec = fy,
      P_i  #=< Prec,  % note the # for library(clpfd)
      Prec #=< P_o
    }.

Test Framework

library(tap)

Test Anything Protocol

SWI-Prolog pack by Michael Hendricks

swipl -q -g main -t halt -s test/test.pl
swipl -q -g 'main,halt(0)' -t 'halt(1)' -s test/test.pl
TAP version 13
1..24
ok 1 - [integer_token +] 123
ok 2 - [integer_token +] 00123
ok 3 - [integer_token -] 12 34
ok 4 - [integer_token -] 12\n34
ok 5 - [integer_token -] a
ok 6 - [integer_token -] 1.2
ok 7 - [float_number_token +] 1.2
ok 8 - [float_number_token -] 1. 2
ok 9 - [float_number_token -] 1.2.3
ok 10 - [float_number_token +] 1.2e3
ok 11 - [float_number_token +] 1.2E3
...

library(tap)

library(tap) specifies tests as Prolog rules, get expanded using term expansion

%% test/parser/number.pl

integer_token('123'):
  integer_token(
    integer_constant(
      [ decimal_digit_char('1'),
        decimal_digit_char('2'),
        decimal_digit_char('3') ])).

% integers are allowed to start with zeroes
integer_token('00123'): true.

% no space in between
integer_token('12 34'): fail.

Conclusion and
Future Work

Current Contributions

  • Implementation of ISO Standard as DCGs
  • Term Expansions to get (verbose) syntax tree
  • CLP for operator definitions
  • Example application in a modern development approach: test-driven using TAP

Next Steps

  • Graph transformation to build AST
  • Run style tests for various Prolog source codes
  • Integration in IDE, CI, etc.
  • Source-to-source transformations over AST
    (as transpiler Prolog → Prolog)
  • Source-to-source transformations over AST
    (as compiler Prolog → Any)

Falco Nogatz

University of Würzburg
Chair for Computer Science I

falco.nogatz@uni-wuerzburg.de
http://uni-wuerzburg.de/?nogatz