\chapter{Background}

\section{Evolution of the JavaScript programming language}

Technical Committee 39 (abbreviated as TC39) is a group within Ecma International whose main goal is to develop ECMA-262, the language standard for ECMAScript (JavaScript), and other related standards. These related standards include ECMA-402, the internationalization API of ECMA-262; ECMA-404, the standard for JSON; and ECMA-414, the ECMAScript specification suite standard. The members of the committee are representatives from companies, academic institutions, and various other organizations from all across the world interested in developing the ECMAScript language. The delegates are experts in JavaScript engines, tooling surrounding JavaScript, and other parts of the JavaScript ecosystem.


\subsection{ECMA-262 Proposals}

We now explain what a proposal is and how proposals are developed in TC39 for the ECMA-262 language standard.


A \emph{proposal} is a suggested change to the ECMA-262 language standard. A proposed change has to solve some problem with the current version of ECMAScript. Such problems can come in many forms and can apply to any part of the language. Examples include a feature that is missing from the language, inconsistent parts of the language, and common patterns that could be simplified. The proposal development process is defined in the TC39 Process Document.


\textbf{The TC39 Process Document~\cite{TC39Process}} describes each stage a proposal has to go through to be accepted into the ECMA-262 language standard. 

The purpose of \emph{stage 0} of the process is to allow for exploration and ideation around which parts of the current version of ECMAScript can be improved, and then define a problem space for the committee to focus on improving. 

\emph{Stage 1} is the stage at which the committee starts development of a suggested proposal. There are several requirements to enter this stage. A champion, a delegate of the committee responsible for the proposal, has to be identified. A rough outline of the problem must be provided, and the general shape of a solution must be given. There must have been discussion around key algorithms, abstractions and semantics of the proposal. Potential implementation challenges and cross-cutting concerns must have been explored. The final requirement is for all parts of the proposal to be captured in a public repository. Once all these requirements are met, a proposal is accepted into stage 1. During this stage, the committee works on the design of a solution and resolves any cross-cutting concerns discovered previously.

At \emph{stage 2} a preferred solution has been identified. The requirements for a proposal to enter this stage are as follows: all high-level APIs and syntax must be described in the proposal document, illustrative examples must be created, and an initial specification text must be drafted. During this stage, the work consists of refining the identified solution, deciding on minor details, and creating experimental implementations.

At \emph{stage 2.7} the proposal is approved in principle, and has to be tested and validated. To enter this stage, the major sections of the proposal must be complete: the specification text should be finished, and all reviewers of the specification must have approved it. Once a proposal has entered this stage, testing and validation are performed using the prototype implementations created in stage 2.

Once a proposal has been sufficiently tested and verified, it is moved to \emph{stage 3}. During this stage, the proposal is implemented in at least two major JavaScript engines. The proposal should be tested for web compatibility issues, and integration issues in the major JavaScript engines. 

At \emph{stage 4} the proposal is completed and included in the next revision of ECMA-262.

\section{AST and Babel}

\subsection*{Abstract Syntax Tree}

An abstract syntax tree (AST) is a tree representation of source code. Every node of the tree represents a construct from the source code. ASTs remove syntactic details that are present in the source code while maintaining the structure of the program in the tree. Common node types include statements, expressions, and declarations, and every node type represents a grammatical construct of the language the AST was built from.

ASTs are important for working with source code: they are used by almost any tool that has to represent source code in some way to perform operations on it~\cite{AST3}. This is because the structure is simpler to work with than raw text, especially for tools like compilers, interpreters, or code transformation tools.

ASTs are built by language parsers. A language parser takes the raw source code of a language, and parses the code into an AST while maintaining its structure but discarding irrelevant information. A simple example of how JavaScript is parsed into an AST can be seen in Figure~\ref{ex:srcToAST}. 
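
As a minimal sketch of this idea, the following listing writes out by hand an AST for the statement \texttt{let name = f(100);} as a plain JavaScript object and visits its nodes. The node types and field names follow Babel's AST format, but the object is constructed manually rather than produced by a parser, and the helper \texttt{collectTypes} is a hypothetical name introduced only for this illustration.

\begin{lstlisting}[language={JavaScript}]
// Hand-written AST for: let name = f(100);
// Node types and fields follow Babel's AST format. Source positions and
// other metadata that a real parser would attach are omitted.
const ast = {
  type: "VariableDeclaration",
  kind: "let",
  declarations: [
    {
      type: "VariableDeclarator",
      id: { type: "Identifier", name: "name" },
      init: {
        type: "CallExpression",
        callee: { type: "Identifier", name: "f" },
        arguments: [{ type: "NumericLiteral", value: 100 }],
      },
    },
  ],
};

// Hypothetical helper: visit every node depth-first and collect its type.
function collectTypes(node, types = []) {
  if (Array.isArray(node)) {
    node.forEach((child) => collectTypes(child, types));
  } else if (node !== null && typeof node === "object") {
    if (typeof node.type === "string") types.push(node.type);
    Object.values(node).forEach((child) => collectTypes(child, types));
  }
  return types;
}

console.log(collectTypes(ast).join(", "));
\end{lstlisting}

The keyword, the parentheses, the semicolon and the whitespace of the source text do not appear as separate nodes. Only the constructs of the program and their nesting remain.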

\subsection*{Babel}

Babel is a JavaScript toolchain whose main use is converting ECMAScript 2015 and newer into older versions of JavaScript. The conversion is done to increase compatibility with older environments, such as older browsers.

Babel has a suite of libraries for working with JavaScript source code, each of which relies on Babel's AST definition~\cite{BabelAST}. The AST specification Babel uses tries to stay as true to the ECMAScript standard as possible~\cite{BabelSpecCompliant}, which has made it a recommended parser to use for proposal transpiler implementations~\cite{TC39RecommendBabel}. An example of what source code parsed into an AST with Babel looks like can be seen in Figure~\ref{ex:srcToAST}.

\begin{figure}[H]
\noindent\begin{minipage}{.30\textwidth}
\begin{lstlisting}[language={JavaScript}]
let name = f(100);
\end{lstlisting}
\end{minipage}\hfil
\noindent\begin{minipage}{.65\textwidth}
\begin{center}
\begin{tikzpicture}[
    squarednode/.style={rectangle, draw=red!60, fill=black!5, very thick, minimum size=2mm}, node distance=10mm and 5mm
]
\node[squarednode] (VarDecl)        {VariableDeclaration};
\node[squarednode] (VarDeclarator)  [below=of VarDecl] {VariableDeclarator};
\node[squarednode] (id)             [below left= 10mm and -10mm of VarDeclarator] {Identifier: name};
\node[squarednode] (callExpr)       [below right= 10mm and -10mm of VarDeclarator] {CallExpression};
\node[squarednode] (cid)            [below left= 10mm and -10mm of callExpr] {Identifier: f};
\node[squarednode] (arg)            [below right= 10mm and -15mm of callExpr] {NumericLiteral: 100};

\draw[] (VarDecl.south) -- (VarDeclarator.north);
\draw[] (VarDeclarator.south) -- (id.north);
\draw[] (VarDeclarator.south) -- (callExpr.north);
\draw[] (callExpr.south) -- (cid.north);
\draw[] (callExpr.south) -- (arg.north);
\end{tikzpicture}
\end{center}
\end{minipage}\hfil
\caption{\label{ex:srcToAST} Example of source code parsed to Babel AST}
\end{figure}


Babel's mission is to transpile newer versions of JavaScript into older versions that are more compatible with different environments. It has a rich plugin system that allows a myriad of features, including several language constructs, to be enabled or disabled. This makes the parser versatile enough to fit different ways of working with JavaScript source code.

One of Babel's more prominent features is \texttt{@babel/parser}~\cite{BabelParser} with plugins. This library allows parsing of experimental JavaScript features. These features are usually proposals that are under development by TC39, and the development of these plugins is a part of the proposal deliberation process. This allows for experimentation as early as stage 1 of the proposal development process. Some examples of proposals that were first supported by Babel's plugin system are ``Do Expression''~\cite{Proposal:DoProposal} and ``Pipeline''~\cite{Pipeline}. Both proposals are currently in early stages of development, with ``Do Expression'' at stage 1 and ``Pipeline'' at stage 2.


\section{Source Code Querying}

Source code querying is the action of searching source code to extract some information or find specific sections of code. It can be done using several different techniques, and it is a core part of many tools developers use. The primary use cases for source code querying are code understanding, analysis, code navigation, and enforcement of styles, among others. All of these are important when writing programs, and they all rely on some form of source code query.

Source code querying comes in many forms, the simplest of which is text search. Since source code is primarily text, one can apply text search techniques to perform a query. This can be a regular string search, like CTRL+F in the browser, or a more complex approach using regular expressions with tools like grep. Neither of these methods allows queries based on the structure of the code; both rely solely on its textual syntax. AST-based queries allow queries to be written based on both syntax and structure, and are generally more powerful than regular text-based queries. Another technique for code querying is creating queries based on the semantics of code. Recently, querying based on semantics has become more feasible by using large language models to perform the queries.
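
The difference between text-based and AST-based queries can be sketched as follows. The example is an illustration under stated assumptions: the AST is written by hand in Babel's node format rather than produced by a parser, and \texttt{findCalls} is a hypothetical helper that is not part of any library. Both queries look for calls to a function named \texttt{f}.

\begin{lstlisting}[language={JavaScript}]
// Source under query: the second "f(" occurs inside a string literal.
const source = 'let a = f(1); let s = "f(2)";';

// Text-based query: a regular expression over the raw text.
const textMatches = source.match(/\bf\(/g) || [];

// Hand-written AST for the same source (Babel node format, metadata omitted).
const program = {
  type: "Program",
  body: [
    {
      type: "VariableDeclaration",
      kind: "let",
      declarations: [{
        type: "VariableDeclarator",
        id: { type: "Identifier", name: "a" },
        init: {
          type: "CallExpression",
          callee: { type: "Identifier", name: "f" },
          arguments: [{ type: "NumericLiteral", value: 1 }],
        },
      }],
    },
    {
      type: "VariableDeclaration",
      kind: "let",
      declarations: [{
        type: "VariableDeclarator",
        id: { type: "Identifier", name: "s" },
        init: { type: "StringLiteral", value: "f(2)" },
      }],
    },
  ],
};

// AST-based query (hypothetical helper): collect every CallExpression
// whose callee is an identifier with the given name.
function findCalls(node, name, found = []) {
  if (Array.isArray(node)) {
    for (const child of node) findCalls(child, name, found);
  } else if (node !== null && typeof node === "object") {
    if (node.type === "CallExpression" &&
        node.callee.type === "Identifier" &&
        node.callee.name === name) {
      found.push(node);
    }
    for (const child of Object.values(node)) findCalls(child, name, found);
  }
  return found;
}

console.log(textMatches.length);              // text search: 2 matches
console.log(findCalls(program, "f").length);  // AST query: 1 call
\end{lstlisting}

The regular expression also matches the text inside the string literal, because it has no notion of what a string is. The AST query only reports the actual call, since the string is represented as a single \texttt{StringLiteral} node.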

Source code querying is used in many areas of software development. One of the more prevalent areas is Integrated Development Environments (IDEs). These tools are created for writing source code, and therefore rely on querying that source code for many of their features, such as code navigation, static code analysis, and complex code searching. One example of code querying in an IDE is JetBrains' structural search and replace~\cite{StructuralSearchAndReplaceJetbrains}, where we define queries based on code structure to find and replace sections of our program.

\section{Domain Specific Languages}

Domain specific languages (DSLs) are computer languages specialized to a specific domain. In contrast, general purpose languages (GPLs) such as Python, C++ or JavaScript are not designed with a specific task in mind, but have a more general feature set that allows them to be used in a wide array of applications. What counts as a domain for a DSL is not simple to define, as there is no general way to determine the exact point at which a DSL becomes a GPL and vice versa. The difference is better described as a spectrum, with DSLs at one end and GPLs at the other~\cite{DSLSANDHOW}.

DSLs have some clear advantages over GPLs when applied to a specific domain. A DSL allows very concise and expressive code to be written that is specifically designed for the application, whereas a GPL might require specific implementations to suit the domain. This expressiveness within the domain might result in faster development, and the specificity to the domain might also increase correctness. However, DSLs also have clear disadvantages. The restrictiveness of a DSL might become a hindrance if it is not well designed for the domain. DSLs might also have a learning curve, so the knowledge required to use them can be a barrier. Developing a DSL is itself demanding, as it requires knowledge of both the domain and language design.


\section{Language Workbenches}

A language workbench is a tool created to facilitate the development of a computer language, such as a DSL. Language workbenches also create tooling for languages defined within them, and help with the language development process in general. 

Language workbenches support generating tooling for languages, as most modern computer languages are backed by some form of tooling. This tooling comes in the form of language parsing, language servers for integrated development environments, along with other tooling for using the language created within the language workbench. 

A language is defined in a language workbench using a grammar definition. This grammar is a formal specification of the language that describes how each language construct is composed and the structure of the language. This allows the language workbench to determine what is a valid sentence of the language. The grammar is also used to create the AST of the language, which is the basis for all the tools generated by the language workbench. Many language workbenches exist, such as Langium~\cite{Langium}, Xtext~\cite{Xtext}, JetBrains MPS, and Racket.
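
To make the role of the grammar concrete, the following sketch hand-writes the kind of parser that a language workbench would otherwise derive from a grammar definition. The toy grammar, the node names \texttt{Sum} and \texttt{Number}, and the function \texttt{parseSum} are assumptions made for this illustration and are not taken from any particular workbench.

\begin{lstlisting}[language={JavaScript}]
// Hypothetical toy grammar, written in an EBNF-like notation:
//   Sum    ::= Number ("+" Number)*
//   Number ::= [0-9]+
function parseSum(input) {
  // Split the input into numbers, plus signs, and any other
  // non-whitespace character (which the grammar will reject).
  const tokens = input.match(/\d+|\+|\S/g) || [];
  let pos = 0;

  // Number ::= [0-9]+
  function parseNumber() {
    const token = tokens[pos];
    if (token === undefined || !/^\d+$/.test(token)) {
      throw new SyntaxError("Expected a number at token " + pos);
    }
    pos += 1;
    return { type: "Number", value: Number(token) };
  }

  // Sum ::= Number ("+" Number)*
  const terms = [parseNumber()];
  while (tokens[pos] === "+") {
    pos += 1;
    terms.push(parseNumber());
  }
  if (pos < tokens.length) {
    throw new SyntaxError("Unexpected token " + tokens[pos]);
  }
  return { type: "Sum", terms };
}

// A valid sentence yields an AST with one Number node per term.
console.log(JSON.stringify(parseSum("1 + 20 + 3")));
\end{lstlisting}

An input such as \texttt{1 +} is not a valid sentence of this grammar, so \texttt{parseSum} throws a \texttt{SyntaxError} instead of returning an AST. A language workbench automates this step and builds further tooling on top of the resulting AST.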