\chapter{Background}

\section{Evolution of the JavaScript programming language}

Technical Committee 39 (abbreviated as TC39) is a group within Ecma International whose main goal is to develop ECMA-262, the language standard for ECMAScript (JavaScript), and other related standards. These related standards include ECMA-402, the internationalization API of ECMA-262; ECMA-404, the standard for JSON; and ECMA-414, the ECMAScript specification suite standard. The members of the committee are representatives from companies, academic institutions, and various other organizations from all across the world interested in developing the ECMAScript language. The delegates are experts in JavaScript engines, tooling surrounding JavaScript, and other parts of the JavaScript ecosystem.


\subsection{ECMA-262 Proposals}

We now explain what a proposal is and how proposals are developed in TC39 for the ECMA-262 language standard.


A \emph{proposal} is a suggested change to the ECMA-262 language standard. Each proposal has to solve some problem with the current version of ECMAScript. Such problems can come in many forms and can apply to any part of the language. Examples include a feature that is not present in the language, inconsistent parts of the language, and common patterns that could be simplified. The proposal development process is defined in the TC39 Process Document.


The \textbf{TC39 Process Document}~\cite{TC39Process} describes each stage a proposal has to go through to be accepted into the ECMA-262 language standard. 

The purpose of \emph{stage 0} of the process is to allow for exploration and ideation around which parts of the current version of ECMAScript can be improved, and then define a problem space for the committee to focus on improving. 

\emph{Stage 1} is the stage at which the committee starts developing a suggested proposal. There are several requirements to enter this stage. A champion, that is, a delegate of the committee responsible for the proposal, has to be identified. A rough outline of the problem must be provided, and a general shape of a solution must be given. There must have been discussion around key algorithms, abstractions, and semantics of the proposal. Potential implementation challenges and cross-cutting concerns must have been explored. The final requirement is for all parts of the proposal to be captured in a public repository. Once all these requirements are met, a proposal is accepted into stage 1. During this stage, the committee works on the design of a solution and resolves any cross-cutting concerns discovered previously.

At \emph{stage 2} a preferred solution has been identified. The requirements for a proposal to enter this stage are as follows: all high-level APIs and syntax must be described in the proposal document, illustrative examples must be created, and an initial specification text must be drafted. During this stage, the committee refines the identified solution, decides on minor details, and creates experimental implementations.

At \emph{stage 2.7} the proposal is approved in principle and has to be tested and validated. To enter this stage, the major sections of the proposal must be complete. The specification text should be finished, and all reviewers of the specification must have approved it. Once a proposal has entered this stage, testing and validation are performed. This is done through the prototype implementations created in stage 2.

Once a proposal has been sufficiently tested and verified, it is moved to \emph{stage 3}. During this stage, the proposal is implemented in at least two major JavaScript engines. The proposal should be tested for web compatibility issues, and integration issues in the major JavaScript engines. 

At \emph{stage 4} the proposal is completed and included in the next revision of ECMA-262.

\section{Abstract Syntax Trees}


An \emph{abstract syntax tree} (AST) is a tree representation of source code. ASTs remove syntactic details while maintaining the \emph{structure} of the program. Every node of such a tree represents a grammatical construct of the language the source code is written in, such as a statement, an expression, or a declaration.

ASTs are important for manipulating source code; they are used by almost any tool that has to represent source code in some way to perform operations on it~\cite{AST3}. ASTs are favored over raw text due to their structured nature, especially in tools such as compilers, interpreters, and code transformation tools. ASTs are produced by language parsers. For JavaScript, one of the popular libraries used for parsing is \emph{Babel}~\cite{Babel}.

Babel is a JavaScript toolchain whose main use is converting ECMAScript 2015 and newer into older versions of JavaScript. This conversion is done to increase the compatibility of JavaScript with older environments such as older browsers.

Babel has a suite of libraries for working with JavaScript source code, and each library relies on Babel's AST definition~\cite{BabelAST}. The AST specification Babel uses tries to stay as close to the ECMAScript standard as possible~\cite{BabelSpecCompliant}, which has made it a recommended parser for proposal transpiler implementations~\cite{TC39RecommendBabel}. A simple example of source code parsed into a Babel AST is shown in Figure~\ref{ex:srcToAST}.

\begin{figure}[H]
\noindent\begin{minipage}{.30\textwidth}
\begin{lstlisting}[language={JavaScript}]
let name = f(100);
\end{lstlisting}
\end{minipage}\hfil
\noindent\begin{minipage}{.65\textwidth}
\begin{center}
\begin{tikzpicture}[
    squarednode/.style={rectangle, draw=red!60, fill=black!5, very thick, minimum size=2mm}, node distance=10mm and 5mm
]
\node[squarednode] (VarDecl)        {VariableDeclaration};
\node[squarednode] (VarDeclarator)  [below=of VarDecl] {VariableDeclarator};
\node[squarednode] (id)             [below left= 10mm and -10mm of VarDeclarator] {Identifier: name};
\node[squarednode] (callExpr)       [below right= 10mm and -10mm of VarDeclarator] {CallExpression};
\node[squarednode] (cid)            [below left= 10mm and -10mm of callExpr] {Identifier: f};
\node[squarednode] (arg)            [below right= 10mm and -15mm of callExpr] {NumericLiteral: 100};

\draw[] (VarDecl.south) -- (VarDeclarator.north);
\draw[] (VarDeclarator.south) -- (id.north);
\draw[] (VarDeclarator.south) -- (callExpr.north);
\draw[] (callExpr.south) -- (cid.north);
\draw[] (callExpr.south) -- (arg.north);
\end{tikzpicture}
\end{center}
\end{minipage}\hfil
\caption{\label{ex:srcToAST} Example of source code parsed to Babel AST}
\end{figure}
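
The tree in Figure~\ref{ex:srcToAST} can also be written down as a plain JavaScript object. The following listing is a minimal hand-written sketch, not actual parser output: the node types and field names (\texttt{declarations}, \texttt{id}, \texttt{init}, \texttt{callee}, \texttt{arguments}) follow Babel's AST definition, while source locations and other metadata that a parser would attach are omitted. The helper \texttt{collectTypes} is a hypothetical function introduced only for illustration.

\begin{lstlisting}[language={JavaScript}]
// Hand-written AST for: let name = f(100);
// Source locations and other parser metadata are omitted.
const ast = {
  type: "VariableDeclaration",
  kind: "let",
  declarations: [{
    type: "VariableDeclarator",
    id: { type: "Identifier", name: "name" },
    init: {
      type: "CallExpression",
      callee: { type: "Identifier", name: "f" },
      arguments: [{ type: "NumericLiteral", value: 100 }],
    },
  }],
};

// Hypothetical helper: list node types in depth-first pre-order.
function collectTypes(node, out = []) {
  if (Array.isArray(node)) {
    node.forEach((child) => collectTypes(child, out));
  } else if (node !== null && typeof node === "object") {
    if (typeof node.type === "string") out.push(node.type);
    Object.values(node).forEach((child) => collectTypes(child, out));
  }
  return out;
}
\end{lstlisting}

Calling \texttt{collectTypes(ast)} visits the nodes in the same top-down, left-to-right order in which they appear in the figure.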


To compile newer versions of JavaScript into older versions, Babel uses a plugin system that allows a myriad of features to be enabled or disabled. This makes the parser versatile enough to fit different ways of working with JavaScript source code. Because of this, Babel can parse experimental JavaScript features. These features are usually proposals that are under development by TC39, and the development of these plugins is part of the proposal deliberation process. This allows for experimentation as early as stage 1 of the proposal development process. Some examples of proposals that were first supported by Babel's plugin system are ``Do Expression''~\cite{Proposal:DoProposal} and ``Pipeline''~\cite{Pipeline}. These proposals are currently at stage 1 and stage 2, respectively.


\section{Source Code Querying}

Source code querying is the action of searching source code to extract some information or to find specific sections of code. Source code querying comes in many forms, the simplest of which is text search. Since source code is primarily text, one can apply plain text search techniques to perform a query, or take a more complex approach using regular expressions (e.g., with tools like \texttt{grep}). Neither of these methods allows for queries based on the structure of the code; both rely solely on its surface syntax. AST-based queries can be written in terms of both syntax and structure, and are generally more powerful than plain text-based queries. Another technique for code querying is based on the semantics of the code.
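
The difference between text-based and AST-based queries can be illustrated with a minimal sketch. It assumes a hand-written tree in the style of Babel's AST rather than actual parser output, and \texttt{countCalls} is a hypothetical helper introduced only for illustration. The regular expression also matches the call that is merely mentioned in a comment, whereas the structural query counts only real call expressions.

\begin{lstlisting}[language={JavaScript}]
const source = "let name = f(100); // never call f(0) here";

// Text-based query: count occurrences of "f(" in the raw text.
const textMatches = (source.match(/\bf\(/g) || []).length;

// Hand-written AST of the same source (assumption: Babel-style nodes).
// The comment is not part of the tree.
const ast = {
  type: "VariableDeclaration",
  kind: "let",
  declarations: [{
    type: "VariableDeclarator",
    id: { type: "Identifier", name: "name" },
    init: {
      type: "CallExpression",
      callee: { type: "Identifier", name: "f" },
      arguments: [{ type: "NumericLiteral", value: 100 }],
    },
  }],
};

// Hypothetical helper: count calls to a function with the given name.
function countCalls(node, name) {
  if (Array.isArray(node)) {
    return node.reduce((sum, child) => sum + countCalls(child, name), 0);
  }
  if (node === null || typeof node !== "object") return 0;
  const isMatch = node.type === "CallExpression" &&
    node.callee.type === "Identifier" &&
    node.callee.name === name;
  return (isMatch ? 1 : 0) + Object.values(node).reduce(
    (sum, child) => sum + countCalls(child, name), 0);
}

// Structure-based query: count CallExpression nodes that call f.
const astMatches = countCalls(ast, "f");
\end{lstlisting}

Here \texttt{textMatches} is 2, because the text search cannot distinguish code from comments, while \texttt{astMatches} is 1.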
 
The primary use cases for source code querying are code understanding, analysis, code navigation, and enforcement of styles, among others. These are all important tasks for developers writing programs, and the tools that support them all rely on some form of source code query. Integrated Development Environments (IDEs) are one example: they are created for writing source code, and therefore rely on querying the source code for many of their features. One example of code querying in an IDE is JetBrains' structural search and replace~\cite{StructuralSearchAndReplaceJetbrains}, where queries are defined based on code structure to find and replace sections of a program.

\section{Domain-Specific Languages}

Domain-specific languages (DSLs) are software languages specialized to a specific narrow domain~\cite{Kleppe}. The difference between DSLs and General Purpose Languages (GPLs) is best described as a spectrum, with DSLs on one end and GPLs on the other~\cite{DSLSANDHOW}. DSLs allow domain experts to get involved in the development process, as a domain expert is expected to be capable of reading and writing DSL code. A DSL allows very concise and expressive code to be written that is specifically designed for the domain. Using a DSL might result in faster development because of this expressiveness within the domain, and this specificity to a domain might also increase correctness. However, DSLs also have some disadvantages. The restrictiveness of a DSL might become a hindrance if it is not well designed to represent the domain. DSLs might also have a learning curve, making the knowledge required to use them a hindrance. Developing the DSL itself might be a further hindrance, as it requires knowledge of both the domain and software language engineering.


\section{Language Workbenches}

A language workbench is a tool created to facilitate the development of a computer language, such as a DSL. 


Language workbenches support generating tooling for languages, as most modern computer languages are backed by some form of tooling. One such tool is a language parser, which is generated from the language definition within the language workbench. Another tool commonly generated by a language workbench is a language server for integrated development environments. Such language servers provide functionality such as syntax highlighting, code navigation, and error highlighting, among others.

A language is defined in a language workbench using a grammar definition. This grammar is a formal specification of the language that describes how each language construct is composed and the structure of the language. This allows the language workbench to determine what is a valid sentence of the language. The grammar is used to create the AST of the language, which is the basis for all the tools generated by the language workbench. Many such language workbenches exist, such as Langium~\cite{Langium}, Xtext~\cite{Xtext}, JetBrains MPS, and Racket.