# Scope Analysis and Translation: Translating From Egg to JavaScript

# Goals

Write a translator from the Egg (opens new window) language to the JavaScript language.

Reuse your parser to create the parse trees. Add traversal functions of the ASTs to generate the JS code.

# On the Preservation of Semantics

None of these quotes will serve as an excuse for your Egg to JS translator not to be faithful to the original Egg interpreter

Negative Quotes

The original is unfaithful to the translation.

  • Jorge Luis Borges, book Other Inquisitions

Borges poses the idea that, in some ways, the translation may be truer to the fundamental spirit of the original than the original itself. Equivalent excuse: "The JS semantics of my translator is better than the one in the original Egg"

You can also try to say that, according to Robert Frost, it is because there is too much poetry in the original Egg:

Poetry is what gets lost in translation

  • Robert Frost

Instead you have to follow Gunter Grass:

Positive Quotes

Translation is that which transforms everything so that nothing changes

  • Gunter Grass

# Etimology

The word 'translation' comes, etymologically, from the Latin for 'bearing across'. Having been borne across the world, we are translated men. It is normally supposed that something always gets lost in translation; I cling, obstinately to the notion that something can also be gained

  • Salman Rushdie, Imaginary Homelands: Essays and Criticism 1981-1991

Here there are a few ethimological entries:

  • traductĭo, -ōnis (latin): hacer pasar de un lugar a otro, to move from one place to another
  • translatio (classical latin): traslación, translation
  • interpretatio: interpretación, interpretation
  • übersetzen (german): pasar a la otra orilla, cross over to the other shore, ex. Wir haben über den Fluss übergesetzt

# The Compiler

Add a -j --js option to your compiler executable, or add a new egg2js.js executable to your bin directory to do the translation

# Translating Arithmetic Expressions

When the program is given input with Egg expressions of type apply like these:

$ cat examples/times.egg
print(
 +(3,
    *(4,
      5
    )
  ),
  -(9,3)
)
1
2
3
4
5
6
7
8
9

should output an examples/times.js file with output similar to this:

➜  egg2js-solution git:(master) git remote -v
origin  git@github.com:ULL-ESIT-PL-2122/egg2js-solution.git (fetch)
origin  git@github.com:ULL-ESIT-PL-2122/egg2js-solution.git (push)
➜  egg2js-solution git:(master) git -P lg -n 1
6e62885 - (HEAD -> master, origin/master) test/testfiles/generatingJS (hace 2 minutos Casiano Rodriguez-Leon)
1
2
3
4
5
➜  egg2js-solution git:(master) bin/egg.js test/testfiles/generatingJS/times.egg -J
const path = require('path');
const runtimeSupport = require(path.join('/Users/casianorodriguezleon/campus-virtual/2122/pl2122/practicas-alumnos/egg2js/egg2js-solution/lib/eggInterpreter', "..", "generateJS", "runtimeSupport"));
runtimeSupport.print((3 + (4 * 5)), (9 - 3))
1
2
3
4

Observe how the JS expression is fully parenthesized

➜  egg2js-solution git:(master) bin/egg.js test/testfiles/generatingJS/times.egg -j
➜  egg2js-solution git:(master) node test/testfiles/generatingJS/times.egg.js 
23 6
1
2
3

Here is another example but with an arithmetic string expressions instead:

➜  egg2js-solution git:(master) ✗ cat ex/string.egg 
print(+(
    "hello",
    +(
        " ",
        "world"
    )
))
1
2
3
4
5
6
7
8
➜  egg2js-solution git:(master) ✗ bin/egg.js -J ex/string.egg
const path = require('path');
const runtimeSupport = require(path.join('/Users/casianorodriguezleon/campus-virtual/2122/pl2122/practicas-alumnos/egg2js/egg2js-solution/lib/eggInterpreter', "..", "generateJS", "runtimeSupport"));
runtimeSupport.print(('hello' + (' ' + 'world')))
1
2
3
4
➜  egg2js-solution git:(master) ✗ bin/egg.js -j ex/string.egg
➜  egg2js-solution git:(master) ✗ node ex/string.egg.js 
hello world
1
2
3

# Translating Value nodes

The generateJS for the Value class is almost trivial:

class Value extends Ast {
  constructor(value) { ... }
  evaluate() { ... }

  generateJS() {
    if (typeof this.value === 'number') {
      return this.value;
    }
    return `'${this.value}'`;
  }
}
1
2
3
4
5
6
7
8
9
10
11

# Translating Word nodes

Observe how we concatenate the prefix string '$' to all the source variables in order to avoid clashes with any variables we will need for our translation algorithm.

For instance Egg.print! What if the egg source code has a variable with name Egg? This way we are on safe ground.

class Word extends Ast {
  constructor(name) { ... }
  evaluate(env) { ... }

  generateJS(scope) {
      let eggName = '$'+this.name;
      setAsUsed(scope, eggName);
      return eggName;
  }
}
1
2
3
4
5
6
7
8
9
10

The utility function setAsUsed is imported from '../utils/scope.js'. It saves in the current symbol table that the variable is being used in the current scope.

Later, when the scope is closed we will check that all used variables were declared.

# Adding generateJS Methods to the AST nodes

An approach that I have followed when doing this practice is to add generateJS methods to each of the different types of AST nodes that are responsible for generating the JS code corresponding to that type of node. These methods always receive as a parameter the symbol table for the current scope.

Inside the file ast.js we wrote:

const { setAsUsed } = require('../utils/scope.js');
const {specialForms} = require('../eggInterpreter/specialForms.js');
const {generateJSForms} = require('../generateJS/generateJSForms.js');

class Ast { ... } // Abstract class 
1
2
3
4
5

# Translating Apply nodes

class Apply extends Ast {
  constructor(tree, args = []) { ... }
  evaluate(env) { ... }

  generateJS(scope) {
    if (this.operator.type === 'word') {
      if (generateJSForms[this.operator.name]) {
        return generateJSForms[this.operator.name](this.args, scope);
      } 
      else {
        let opTranslation = this.operator.generateJS(scope);
        if (opTranslation && scope[opTranslation].declared) {
          let argsTranslated = this.args.map(arg => arg.generateJS(scope));
          return `${opTranslation}(${argsTranslated})`; 
        } 
        else if (opTranslation && !scope[opTranslation].declared){ 
          if (this.operator.name in specialForms) {
            let errorMsg = `Translation of "${this.operator.name}" not implemented yet.\n`;
            console.error(errorMsg);
            process.exit(1);
          }
          console.warn(`Warning: Non declared symbol "${this.operator.name}" used as function.\n`);
          let argsTranslated = this.args.map(arg => arg.generateJS(scope));
          return `${opTranslation}(${argsTranslated})`; 
        }
        else {
          let errMsg = `Fatal error.\n`+
              `AST=${JSON.stringify(this)}.\nscope=${JSON.stringify(scope)}.\n`;
          console.error(errMsg);
          process.exit(0);
        }
      }
    } else if (this.operator.type == 'apply') {
      let argsTranslated = this.args.map(arg => arg.generateJS(scope));
      return `${this.operator.generateJS(scope)}(${argsTranslated})`;
    }
  }
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37

# Not Implemented Constructs

Here is an example that activates lines 18-20 since while is in specialForms but its translator has not being implemented yet:

➜  egg2js-solution git:(master) ✗ cat ex/not-implemented.egg          
do(
    def(i, 1),
    while(<(i, 4),
      do(
        print(i),
        =(i, +(i,1))  
      )
    )
)
1
2
3
4
5
6
7
8
9
10

When executes produces:

➜  egg2js-solution git:(master) ✗ bin/egg.js -J ex/not-implemented.egg
Translation of "while" not implemented yet.
1
2

# Translating applys onto applys

Lines 33-35 of the method take care of the case when this.operator.type is apply.

Note that in Egg the operator of an apply can itself be an apply as in this example with the expression f(2)(4):

➜  egg2js-solution git:(master) ✗ cat examples/generatingJS/funfun.egg
do(
  def(f, fun(x, fun(y, +(x,y)))),
  print(f(2)(4)) # 6
)
1
2
3
4
5

It is therefore necessary to correctly translate the operator:

➜  egg2js-solution git:(master) ✗ bin/egg.js examples/generatingJS/funfun.egg -J
1
➜  egg2js-solution git:(master) bin/egg.js -J examples/generatingJS/funfun.egg
const path = require('path');
const runtimeSupport = require(path.join('/Users/casianorodriguezleon/campus-virtual/2122/pl2122/practicas-alumnos/egg2js/egg2js-solution/lib/eggInterpreter', "..", "generateJS", "runtimeSupport"));
var $f;
(() => {
  $f = function($x) {
    return function($y) {
      return ($x + $y)
    }
  };
  return runtimeSupport.print($f(2)(4))
})()
1
2
3
4
5
6
7
8
9
10
11
12

# Strategy Pattern Again

To make it easier to generate JS code you may find it useful to follow the Strategy Pattern and have a module that exports a generateJS map/hash whose keys are the same as in specialForms and topEnv and the values are the corresponding JS code generation functions. In this way we avoid as far as possible violating the OPEN/CLOSED principle:


let generateJSForms = Object.create(null);

['+', '-', '*', '/', '==', '<', '>', '&&', '||' ].forEach(op => {
  generateJSForms[op] = function(args, scope) {
    ...
  }
});

generateJSForms["print"] = function(args, scope) {
  ...
};

generateJSForms["do"] = function(args, scope) {
  ...
};

generateJSForms["fun"] = generateJSForms["->"] = function(args, scope) {
  ...
};

generateJSForms["="] = generateJSForms["set"] = function(args, scope) {
  ... 
};

generateJSForms[':='] = 
generateJSForms['def'] = 
generateJSForms['define'] = function(args, scope) {
  ...
};

module.exports = { generateJSForms };
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32

# Compiling the input program: compileToJS

Some scope analysis is needed. Again, we can use a hash list to keep track of the declarations and uses of the symbols in the input program. Such data structure is called a Symbol Table. Our symbol tables are objects with the following structure:

{
  key is the variable: { declared: true if declared, used: true if used }
}
1
2
3

Here is the entry function compileToJS that initiates the translation:

function compileToJS(fileName) {
  let topScope = Object.create(null);
  Object.keys(topEnv).forEach((key) => {
    topScope[key] = { declared: true};
  });

  let scope = Object.create(topScope);

  const template = (declarations, jsCode) => declarations.length?
                      `var ${declarations.join(',')}; ${jsCode}`: 
                      jsCode;

  try {
    let program = fs.readFileSync(fileName, 'utf8');
    let tree = parse(program);
    let jscode = json2Ast[tree.type](tree).generateJS(scope);
    checkDeclarationsIn(scope, "In global program");
    return template(Object.keys(scope), jscode);
  }
  catch (err) {
    console.log(err.message);
    process.exit(0);
  }
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24

# Checking Declarations: checkDeclarationsIn

Since we are using JS inheritance chain the search in the symbol table is automatically made by JS. Awfully simple!


function checkDeclarationsIn(localScope, message) {
  Object.keys(localScope).forEach((key) => {
    if (!localScope[key].declared) {
      //console.warn
      throw new TypeError(`Variable "${key.slice(1)}" is not declared. ${message? message : ""}`);
    }
  })
}
1
2
3
4
5
6
7
8
9

The following image illustrates the search on a C compiler in the scope analysis phase for a non declared variable i = 2:

Here is an example that illustrates the typical info to store in a symbol table and how the AST variable use nodes have references to its variable declarations stored in the symbol table:

# References

# Runtime Library Support

You may find it useful to write a runtime-support.js library with functions that support running the translated JS programs. Something like that:

➜  egg2js-solution git:(master) ✗ cat lib/generateJS/runtimeSupport.js 
runtimeSupport = {
  /**
   * print function emulating egg print behavior
   * @param  {...any} args 
   */
  print: (...args) => {
    console.log(...args);
    return args;
  }
  /* ... */
};

module.exports = runtimeSupport;
1
2
3
4
5
6
7
8
9
10
11
12
13
14

To reach the runtime support library in an independent manner, you can use the fact that wherever it is installed the path relative to the caller is going to be the same:

const jsBeautify = require('js-beautify');
const jsBeautifyConfig = {
  indent_size: 2,
  space_in_empty_paren: true,
  end_with_newline: true,
  preserve_newlines: false
}

const compileToJsAndBeautify = (eggFile) => {  
  let compiledJS = `
  const path = require('path');
  const runtimeSupport = require(
    path.join(
      '${__dirname}', 
      "..", 
      "generateJS", 
      "runtimeSupport"
    )
  );
  
  ${compileToJS(eggFile)}
  `;
  
  let compiledJSBW = jsBeautify(compiledJS, jsBeautifyConfig);
  return compiledJSBW;
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26

# Translating a do

The do in Egg returns the last expression evaluated, whereas a JS block is not an expression. So I can't translate do(...) directly into a JS { ... } block.

# Translation Scheme: Using Anonymous Functions

Observe how is this translation that we have made of a do:

➜  egg2js-solution git:(master) ✗ cat examples/generatingJS/do.egg
print(
  do(
    def(a,1),
    =(a,9),
    def(b, +(a,1))
  )
)                                                                                           
1
2
3
4
5
6
7
8

can be translated as:

➜  egg2js-solution git:(master) bin/egg.js examples/generatingJS/do.egg -J
const path = require('path');
const runtimeSupport = require(path.join('/Users/casianorodriguezleon/campus-virtual/2122/pl2122/practicas-alumnos/egg2js/egg2js-solution/lib/eggInterpreter', "..", "generateJS", "runtimeSupport"));
var $a, $b;
runtimeSupport.print((() => {
  $a = 1;
  $a = 9
  return $b = ($a + 1);
})())
1
2
3
4
5
6
7
8
9

See how the scope is created using an anonymous function (() => { ... })() that is executed on the fly so that it returns the last expression evaluated.

We have not made use of a direct translation of a do by a compound statement

{ ... }
1

and we have taken this pains to respect the semantics of Egg.

In general, make sure that any JS program resulting from the translation of an Egg program produces the same results as when the Egg program is parsed.

Here is the hash generateJSForms["do"] entry for this do translation strategy:

generateJSForms["do"] = function(args, scope) { 
  let argsTranslated = args.map(arg => arg.generateJS(scope));
  const lastOne = argsTranslated.pop();

  const template = (expressions, lastOne) => {
    return `(()=>{
      ${expressions.join("\n")}
      return ${lastOne}
    })()`;
  }

  return template(argsTranslated, lastOne);
};
1
2
3
4
5
6
7
8
9
10
11
12
13

# Another translation Scheme for do

But, is the case with do, we can have more than one translation scheme for the same source construct. For instance, for the tanslation of the do we can follow this alternative approach:

generateJSForms["do"] = function(args, scope) {
  debugger;
  
  let argsTranslated = args.map(arg => arg.generateJS(scope));

  const template = (expressions) => {
    return `(${expressions.join(", ")})`;
  }

  let temp = template(argsTranslated);

  return temp;
};
1
2
3
4
5
6
7
8
9
10
11
12
13

That for the same input example do.egg:

➜  egg2js-solution git:(master) cat ex/do.egg         
print(
  do(
    def(a,1),
    =(a,9),
    def(b, +(a,1))
  )
)
1
2
3
4
5
6
7
8

Produces this translation:

➜  egg2js-solution git:(master) bin/egg.js -J ex/do.egg
const path = require('path');
const runtimeSupport = require(path.join('/Users/casianorodriguezleon/campus-virtual/2122/pl2122/practicas-alumnos/egg2js/egg2js-solution/lib/eggInterpreter', "..", "generateJS", "runtimeSupport"));
var $a, $b;
runtimeSupport.print(($a = 1, $a = 9, $b = ($a + 1)))
1
2
3
4
5

Take notice of the importance of the nested parenthesis (($a =1, ...)). If we were using a single parenthesis instead, the meaning will be quite different!

That produces the same result than the interpreted egg program:

➜  egg2js-solution git:(master) bin/egg.js -xj ex/do.egg
10
1
2

# Translating a def

Here you have a translation strategy for the def:

generateJSForms[':='] = 
generateJSForms['def'] = 
generateJSForms['define'] = function(args, scope) {
  if (args.length != 2)  throw new Error('define only accepts two arguments');
  let [variable, expr] = args.map(arg => arg.generateJS(scope));
  setAsDeclared(scope, variable);
  return `${variable} = ${expr};`;
};
1
2
3
4
5
6
7
8

And here is an example of use:

➜  egg2js-solution git:(master) ✗ cat ex/declared-twice.egg
do(
    def(a,4),
    def(a,5),
    def(b, 9),
    print(+(a,b))
)
1
2
3
4
5
6
7

That when compiled produces:

➜  egg2js-solution git:(master) ✗ bin/egg.js -J ex/declared-twice.egg
const path = require('path');
const runtimeSupport = require(path.join('/Users/casianorodriguezleon/campus-virtual/2122/pl2122/practicas-alumnos/egg2js/egg2js-solution/lib/eggInterpreter', "..", "generateJS", "runtimeSupport"));
var $a, $b;
(() => {
  $a = 4;
  $a = 5;
  $b = 9;
  return runtimeSupport.print(($a + $b))
})()
1
2
3
4
5
6
7
8
9
10

If we run the resulting js we get:

➜  egg2js-solution git:(master) ✗ bin/egg.js -Jj ex/declared-twice.egg
➜  egg2js-solution git:(master) ✗ node ex/declared-twice.egg.js 
14
1
2
3

# A more complex example: Managing Scopes

When variables and functions are declared and new scopes are created like in this example (assume that in addition to the functions the do has its own scope):

➜  egg2js-solution git:(master) ✗ cat examples/generatingJS/hello-scope.egg
print("computed value = ", 
  do(
    def(x,4),
    def(inc, fun(w, do(
        def(y, 999),
        +(w,1)
      ) # do
    ) # fun
    ),# def
    def(z,-1),
    set(x, inc(x))
  )
)
1
2
3
4
5
6
7
8
9
10
11
12
13
14

The translation should produce the equivalent JavaScript code:

➜  egg2js-solution git:(master) ✗ bin/egg.js examples/generatingJS/hello-scope.egg -J
1
➜  egg2js-solution git:(develop) ✗ bin/egg.js examples/generatingJS/hello-scope.egg -J
const path = require('path');
const runtimeSupport = require(path.join('/Users/casianorodriguezleon/campus-virtual/2122/pl2122/practicas-alumnos/egg2js/egg2js-solution/lib/eggInterpreter', "..", "generateJS", "runtimeSupport"));
var $x, $inc, $z;
runtimeSupport.print('computed value = ', (() => {
  $x = 4;
  $inc = function($w) {
    var $y;
    return (() => {
      $y = 999;
      return ($w + 1)
    })()
  };
  $z = -1;
  return $x = $inc($x)
})())
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16

Notice how we prefix source variables with "$" so that statements like def(x,4) become:

  var $x, $inc, $z;
  $x = 4;
1
2

this is done so that translated variables don't collide with auxiliary variables that we might need to introduce to support the translation.

# Visual Appearance of the Generated Code

You can use some module like this:

to improve the visual appearance of the exit code

# Exercise: Translation Scheme for While

Write a translation scheme for the Egg while loop. Assume the Egg while loop always returns false.

Try your proposal with this example:

➜  egg2js-solution git:(master) cat examples/generatingJS/while.egg
do(
    def(a,1),
    while(<(a,4),
      do(
        print(a),
        =(a, +(a,1))
      )
    )
)
1
2
3
4
5
6
7
8
9
10

Execution:

➜  egg2js-solution git:(master) bin/egg.js -xj ex/while.egg       
1
2
3
1
2
3
4

# Translating Property nodes

As exercise do the generateJS translation function for the Property class.

class Property extends Ast { ... }

module.exports = {Value, Word, Apply, Property };
1
2
3

# References

Last Updated: a year ago