Finished feedback so far on chapter 3

Rolf Martin Glomsrud 2024-05-28 20:47:25 +02:00
parent 1235d0f917
commit d66a638d0b
8 changed files with 89 additions and 31 deletions

Binary file not shown.


@ -320,6 +320,8 @@ let x = do {
\end{lstlisting}
\end{minipage}\hfil
The current version of JavaScript enables the use of arrow functions with no arguments to achieve behavior similar to "Do Expression". The main difference in this case is that the final statement/expression will implicitly return its Completion Record~\cite[6.2.4]{ecma262}.
\noindent\begin{minipage}{.45\textwidth}
\begin{lstlisting}[language={JavaScript}]
// Current status quo
@ -341,13 +343,14 @@ let x = do {
\end{lstlisting}
\end{minipage}\hfil
This example is very similar, as it uses an unnamed function~\cite[15.2]{ecma262} which is invoked immediately to produce similar behavior to the "Do Expression" proposal.
\subsection{Await to Promise}
We now discuss an imaginary proposal that was used as a running example during the development of this thesis. This proposal is purely an example of a JavaScript transformation. The transformation it is meant to display is turning code that uses \texttt{await}~\cite[27.7.5.3]{ecma262} into code that uses a promise~\cite[27.2]{ecma262}.
To perform this transformation, we define an equivalent way of expressing an \texttt{await} expression as a promise. The equivalent way of expressing \texttt{await} with a promise, is removing \texttt{await} from the expression, this expression now will return a promise, which has a function \texttt{then()}, this function is executed when the promise resolves. We pass an arrow function as argument to \texttt{then()}, and append each following statement in the current scope~\cite[8.2]{ecma262} inside the block of that arrow function. This will result in equivalent behavior to using \texttt{await}.
To perform this transformation, we define an equivalent way of expressing an \texttt{await} expression as a promise. This means removing \texttt{await}, so that the expression now returns a promise. The promise has a function \texttt{then()}, whose callback is executed when the promise resolves. We pass an arrow function as argument to \texttt{then}, and move each following statement in the current scope~\cite[8.2]{ecma262} into the block of that arrow function. This results in behavior equivalent to using \texttt{await}.
\noindent\begin{minipage}{.45\textwidth}
\begin{lstlisting}[language={JavaScript}]
@ -366,7 +369,7 @@ async function a(){
async function a(){
let b = 9000;
return asyncFunction()
.then((something) => {
.then(async (something) => {
let c = something + 100;
return c;
})
@ -374,6 +377,8 @@ async function a(){
\end{lstlisting}
\end{minipage}\hfil
Transforming using this imaginary proposal results in returning the expression present at the first \texttt{await} expression, with a deferred function \texttt{then} that will execute once the expression is completed. The function \texttt{then} takes a callback in the form of a lambda function with a single argument. This argument shares a name with the initial \texttt{VariableDeclaration}. This is needed because we have to transfer all statements that occur after the original \texttt{await} expression into the body of the callback function. The callback function also has to be async, in case any of the statements placed into it contain \texttt{await}. This results in behavior equivalent to the original code.
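The equivalence described above can be sketched as a small runnable example. This is only an illustration: \texttt{asyncFunction} is a hypothetical stand-in for any function returning a promise, and the names \texttt{withAwait} and \texttt{withThen} are introduced here and are not part of the proposal.

```javascript
// Hypothetical stand-in for any promise-returning function.
function asyncFunction() {
  return Promise.resolve(1);
}

// Original form: the statements after `await` run once the promise resolves.
async function withAwait() {
  let something = await asyncFunction();
  let c = something + 100;
  return c;
}

// Transformed form: the awaited expression is returned directly, and the
// following statements are moved into an async callback passed to then().
async function withThen() {
  return asyncFunction().then(async (something) => {
    let c = something + 100;
    return c;
  });
}

// Both forms resolve to the same value.
Promise.all([withAwait(), withThen()]).then(([x, y]) => {
  console.log(x, y); // prints 101 101
});
```

Note that the callback argument keeps the name \texttt{something}, so the moved statements can refer to it unchanged.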
\section{Searching user code for applicable snippets}
In order to identify snippets of code in the user's code where a proposal is applicable, we need some way to define patterns of code to use as a query. To do this, we have designed and implemented a domain-specific language that allows matching parts of code that are applicable to some proposal, and transforming those parts to use the features of that proposal.
@ -381,7 +386,9 @@ In order to identify snippets of code in the user's code where a proposal is app
\subsection{Structure of \DSL}
\label{sec:DSLStructure}
\paragraph*{Proposal Definition}
In this section, we describe the structure of \DSL. We describe every section of the language, why each section is needed and what it is used for.
\paragraph*{Proposal definition.}
\DSL is designed to mimic the examples already provided in proposal descriptions~\cite{TC39Process}. These examples can be seen in each of the proposals described in Section \ref{sec:proposals}. The idea is to allow a similar kind of notation to the examples in order to define the transformations.
@ -391,7 +398,7 @@ The first part of \DSL is defining the proposal, this is done by creating a name
proposal Pipeline_Proposal {}
\end{lstlisting}
\paragraph*{Case}
\paragraph*{Case definition.}
Each proposal will have one or more definitions of a template for code to identify in the user's codebase, and its corresponding transformation definition. These are grouped together in order to have a simple way of identifying the corresponding cases of matching and transformation. This section of the proposal is defined by the keyword \textit{case} and a block that contains its related fields. A proposal definition in \DSL should contain at least one \texttt{case} definition. This allows for matching many different code snippets and showcasing more of the proposal than a single concept it has to offer.
\begin{lstlisting}
@ -409,6 +416,7 @@ applicable to {
"let a = 0;"
}
\end{lstlisting}
This \texttt{applicable to} template will create matches on any \texttt{VariableDeclaration} that is initialized to the value 0 and is stored in an \texttt{Identifier} with name \texttt{a}.
\paragraph*{Defining the transformation}
@ -421,6 +429,7 @@ transform to{
}"
}
\end{lstlisting}
This transformation definition will change any code matched by its corresponding matching definition into exactly what is defined. This means that, for every match produced, this code will be inserted in its place.
\paragraph*{Full definition of \DSL}
@ -446,20 +455,44 @@ proposal PROPOSAL_NAME {
}
}
\end{lstlisting}
\subsection{\DSL}
This full example of \DSL has two \texttt{case} sections. Each \texttt{case} is applied one at a time to the user's code. The first case will try to find any \texttt{VariableDeclaration} statements, where the identifier is \texttt{b}, and the right side expression is a \texttt{Literal} with value 100. The second \texttt{case} will change any empty \texttt{console.log} expression, into a \texttt{console.dir} expression.
\subsection{How a match and transformation is performed}
\label{sec:DSL_DEF}
Showcasing a proposal using a user's code requires some way of identifying applicable code sections to that proposal. To do this, we have designed a DSL called \DSL, JavaScript Template Query Language.
To perform matching and transformation of the user's code, we first need some way of identifying applicable user code. These applicable code sections then have to be transformed and inserted back into the full user code definition.
\subsection*{Identifying applicable code}
In order to identify sections of code a proposal is applicable to, we use \emph{templates}, which are snippets of JavaScript. These templates are used to identify and match applicable sections of a users code. A matching section for a template is one that produces an exactly equal AST structure, where each node of the AST sections has the same information contained within it. This means that templates are matched exactly against the users code, this does not really provide some way of querying the code and performing context based transformations, so for that we use \textit{wildcards} within the template.
To identify sections of code a proposal is applicable to, we use \emph{templates}, which are snippets of JavaScript. These templates are used to identify and match applicable sections of a user's code. A matching section for a template is one that produces an exactly equal AST structure, where each node of the AST section has the same information contained within it. This means that templates are matched exactly against the user's code. Exact matching alone does not provide a way of querying the code and performing context-based transformations, so for that we use \textit{wildcards} within the template.
Wildcards are interspliced into the template inside a block denoted by \texttt{$<$$<$ $>$$>$}. Each wildcard starts with an identifier, which is a way of referring to that wildcard in the definition of the transformation template later. This allows for transferring the context of parts matched to a wildcard into the transformed output, like identifiers, parts of statements, or even entire statements, can be transferred from the original user code into the transformation template. A wildcard also contains a type expression. A type expression is a way of defining exactly the types of AST nodes a wildcard will produce a match against. These type expressions use Boolean logic together with the AST node-types from BabelJS~\cite{Babel} to create a very versatile of defining exactly what nodes a wildcard can match against.
Wildcards are interspersed in the template inside a block denoted by \texttt{<< >>}. Each wildcard starts with an identifier, which is a way of referring to that wildcard in the definition of the transformation template later. This allows for transferring the context of parts matched to a wildcard into the transformed output: identifiers, parts of statements, or even entire statements can be transferred from the original user code into the transformation template. A wildcard also contains a type expression. A type expression is a way of defining exactly the types of AST nodes a wildcard will produce a match against. These type expressions use Boolean logic together with the AST node types from BabelJS~\cite{Babel} to create a very versatile way of defining exactly what nodes a wildcard can match against.
\subsubsection*{Wildcard type expressions}
Wildcard expressions are used to match AST node types based on Boolean logic. This means an expression can be as simple as \texttt{VariableDeclaration}: this will match only against a node of type \texttt{VariableDeclaration}. Every type used in these expressions are compared against the AST node types from Babel~\cite{Babel}, meaning every AST node type is supported. We also include the types \texttt{Statement} for matching against a statement, and \texttt{Expression} for matching any expression. The expressions also support binary and unary operators, an example \texttt{Statement \&\& !ReturnStatement} will match any statement which is not of type \texttt{ReturnStatement}. The expressions support the following operators, \texttt{\&\&} is logical AND, this means both parts of the expression have to evaluate to true, \texttt{||} means logical OR, so either side of expression can be true for the entire expression to be true, \texttt{!} is the only unary expression, and is logical NOT, so \texttt{!Statement} is any node that is NOT a Statement. The wildcards support matching multiple sibling nodes, this is done by using \texttt{(expr)+}, this is only valid at the top level of the expression. This is useful for matching against a series of one or more Statements, while not wanting to match an entire \texttt{BlockStatement}, this is written as \texttt{(Statement \&\& !ReturnStatement)+}.
Wildcard expressions are used to match AST node types based on Boolean logic. This Boolean logic is based on comparison of Babel AST node types~\cite{BabelAST}. We do this because we need an accurate and expressive way of defining specifically what kinds of AST nodes a wildcard can be matched against. This means a type expression can be as simple as \texttt{VariableDeclaration}: this will match only against a node of type \texttt{VariableDeclaration}. We also include the special types \texttt{Statement}, for matching against any statement, and \texttt{Expression}, for matching against any expression.
This example will allow any \texttt{CallExpression} to match against this wildcard named \texttt{expr}.
\begin{lstlisting}
<< expr: CallExpression >>
\end{lstlisting}
To make this more expressive, the type expressions support binary and unary operators. We support the following operators: \texttt{\&\&} is logical conjunction, \texttt{||} is logical disjunction, and \texttt{!} is logical negation. This makes it possible to build complex type expressions that state precisely which nodes are allowed to match against a specific wildcard.
In the first example, on line 1, we want to prevent the wildcard from matching any node of type \texttt{VariableDeclaration}, while still allowing any other \texttt{Statement}. The example on line 2 wants to avoid loop-specific statements. We express this by allowing any \texttt{Statement}, but negating the expression containing the types of loop-specific statements.
\begin{lstlisting}
<< notVariableDeclaration: Statement && !VariableDeclaration >>
<< noLoopSpecificStatements: Statement && !(BreakStatement || ContinueStatement) >>
\end{lstlisting}
The wildcards support matching subsequent sibling nodes of the code against a single wildcard. We achieve this behavior by using a Kleene plus at the top level of the expression. A Kleene plus means one or more, so when using this token we allow for one or more matches in order. This is useful for matching against a series of one or more specific nodes: the matching algorithm will continue to match until the type expression no longer evaluates to true.
In the example below, we allow the wildcard to match multiple nodes with the Kleene plus \texttt{+}. This wildcard will continue to match as long as each node is a \texttt{Statement} and at the same time is not a \texttt{ReturnStatement}.
\begin{lstlisting}
<< statementsNoReturn : (Statement && !ReturnStatement)+ >>
\end{lstlisting}
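To make the semantics of type expressions concrete, the following sketch evaluates them as predicates over AST nodes. It is an illustration under stated assumptions and not the implementation used by the tool: the combinators \texttt{type}, \texttt{and}, \texttt{or} and \texttt{not}, the helper \texttt{matchPlus}, and the small set \texttt{STATEMENT\_TYPES} are all hypothetical names introduced here, and nodes are reduced to plain objects with a \texttt{type} field.

```javascript
// Hypothetical subset of Babel statement node types, for illustration only.
const STATEMENT_TYPES = new Set([
  "VariableDeclaration", "ExpressionStatement", "ReturnStatement",
  "BreakStatement", "ContinueStatement", "IfStatement",
]);

// Each combinator returns a predicate over an AST node ({ type: string }).
const type = (name) => (node) =>
  name === "Statement" ? STATEMENT_TYPES.has(node.type) : node.type === name;
const and = (a, b) => (node) => a(node) && b(node);
const or = (a, b) => (node) => a(node) || b(node);
const not = (a) => (node) => !a(node);

// Kleene plus: consume consecutive siblings from `start` while the predicate
// holds. An empty result means the wildcard did not match at all.
function matchPlus(siblings, start, predicate) {
  const matched = [];
  for (let i = start; i < siblings.length && predicate(siblings[i]); i++) {
    matched.push(siblings[i]);
  }
  return matched;
}

// Corresponds to (Statement && !ReturnStatement)+
const statementsNoReturn = and(type("Statement"), not(type("ReturnStatement")));

const body = [
  { type: "VariableDeclaration" },
  { type: "ExpressionStatement" },
  { type: "ReturnStatement" },
];
const matched = matchPlus(body, 0, statementsNoReturn);
console.log(matched.map((n) => n.type));
// prints [ 'VariableDeclaration', 'ExpressionStatement' ]
```

Matching stops at the \texttt{ReturnStatement}, because the type expression no longer evaluates to true for that node.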
\begin{lstlisting}
@ -471,9 +504,9 @@ A wildcard section is defined on the right hand side of an assignment statement.
\subsection{Transforming}
When matching sections of the users code has been found, we need some way of defining how to transform those sections to showcase a proposal. This is done in an \texttt{transform to} block, this template describes the general structure of the newly transformed code.
When matching sections of the user's code have been found, we need some way of defining how to transform those sections to showcase a proposal. This is done using the \texttt{transform to} template. This template describes the general structure of the newly transformed code, with context from the user's code carried over by using wildcards.
A transformation template is used to define how the matches will be transformed after applicable code has been found. The transformation is a general template of the code once the match is replaced in the original AST. However, without transferring over the context from the match, this would be a template search and replace. Thus, in order to transfer the context from the match, wildcards are defined in this template as well. These wildcards use the same block notation found in the \texttt{applicable to} template, however they do not need to contain the types, as those are not needed in the transformation. The only required field of the wildcard is the identifier defined in \texttt{applicable to}. This is done in order to know which wildcard match we are taking the context from, and where to place it in the transformation template.
A transformation template defines how the matches will be transformed after applicable code has been found. The transformation is a general template of the code once the match is replaced in the original AST. However, without transferring over the context from the match, this would be a template search and replace. Thus, in order to transfer the context from the match, wildcards are defined in this template as well. These wildcards use the same block notation found in the \texttt{applicable to} template, however they do not need to contain the types, as those are not needed in the transformation. The only required field of the wildcard is the identifier defined in \texttt{applicable to}. This is done in order to know which wildcard match we are taking the context from, and where to place it in the transformation template.
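The role of wildcard references in a transformation template can be sketched at the string level. This is a simplification made for illustration: the tool itself operates on ASTs, and the helper \texttt{fillTemplate} together with the captured source strings below are hypothetical.

```javascript
// Hypothetical helper: replaces each << name >> reference in a transform
// template with the source text captured for that wildcard during matching.
function fillTemplate(transformTemplate, captured) {
  return transformTemplate.replace(
    /<<\s*([_a-zA-Z]\w*)\s*>>/g,
    (whole, name) => {
      if (!Object.prototype.hasOwnProperty.call(captured, name)) {
        // Mirrors the rule that only declared wildcards may be referenced.
        throw new Error(`Undeclared wildcard: ${name}`);
      }
      return captured[name];
    }
  );
}

// Assumed captures from a match against a call such as console.log(x).
const output = fillTemplate(
  "<< someFunctionParam >> |> << someFunctionIdent >>(%)",
  { someFunctionIdent: "console.log", someFunctionParam: "x" }
);
console.log(output); // prints x |> console.log(%)
```

Only the identifier is needed inside each reference, since the type expression has already done its work during matching.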
@ -493,29 +526,45 @@ transform to {
\subsection{Using \DSL}
\DSL is designed to be used in tandem with proposal development, this means the users of \DSL will most likely be contributors to TC39~\cite{TC39} or member of TC39.
\DSL is designed to be used at the proposal development stage. This means the users of \DSL will most likely be TC39~\cite{TC39} delegates or other relevant stakeholders.
\DSL is designed to closely mimic the style of the examples required in the TC39 process~\cite{TC39Process}. We chose to design it this way to specifically make this tool fit the use-case of the committee. The idea behind this project is to gather early user feedback on syntactic proposals, which means the main users of this kind of tool are the committee members themselves.
\DSL is just written using text, most modern Domain-specific languages have some form of tooling in order to make the process of using the DSL simpler and more intuitive. \DSL has an extension built for Visual Studio Code, see Figure \ref{fig:ExtensionExample}, this extension supports many common features of language servers, it supports auto completion, it will produce errors if fields are defined wrong or missing parameters. The extension performs validation of the wildcards, such as checking for wildcards with missing type parameters,wrong expression definitions, or usage of undeclared wildcards, a demonstration of this can be seen in \ref{fig:ExtensionError}.
\DSL is written as plain text, and most domain-specific languages have some form of tooling to make the process of using the DSL simpler and more intuitive. \DSL has an extension built for Visual Studio Code, see Figure \ref{fig:ExtensionExample}. This extension supports many common features of language servers: it supports auto completion, and it will produce errors if fields are defined incorrectly or are missing parameters.
\begin{figure}[H]
\includegraphics[width=\textwidth]{figures/ExtensionExample.png}
\begin{center}
\includegraphics[width=\textwidth/2]{figures/ExtensionExample.png}
\caption{\label{fig:ExtensionExample} Writing \DSL in Visual Studio Code with extension}
\end{center}
\end{figure}
The language server included with this extension performs validation of the wildcards. This allows verification of wildcard declarations in \texttt{applicable to}, see Figure \ref{fig:NoTypes}. If a wildcard is declared with no types, an error will be reported.
\begin{figure}[H]
\includegraphics[width=\textwidth]{figures/ErrorExampleExtension.png}
\caption{\label{fig:ExtensionError} Errors of wildcards}
\begin{center}
\includegraphics[width=\textwidth/2]{figures/EmptyType.png}
\caption{\label{fig:NoTypes} Error displayed when declaring a wildcard with no types.}
\end{center}
\end{figure}
The extension automatically uses wildcard declarations in \texttt{applicable to} to verify all wildcards referenced in \texttt{transform to} are declared. If an undeclared wildcard is used, an error will be reported and the name of the undeclared wildcard will be displayed, see Figure \ref{fig:UndeclaredWildcard}.
\begin{figure}[H]
\begin{center}
\includegraphics[width=\textwidth/2]{figures/EmptyType.png}
\caption{\label{fig:UndeclaredWildcard} Error displayed with usage of undeclared wildcard.}
\end{center}
\end{figure}
\section{Using the \DSL with syntactic proposals}
This section contains the definitions of the proposals used to evaluate the tool created in this thesis. These definitions do not have to cover every single case where the proposal might be applicable, as they just have to be general enough to create some amount of examples that will give a representative number of matches when the transformations are applied to some relatively long user code.
This section contains the definitions of the proposals used to evaluate the tool created in this thesis. These definitions do not have to cover every single case where the proposal might be applicable, as they just have to be general enough to create some number of examples that will give a representative number of matches when the transformations are applied to some relatively long user code. This is because this tool will be used to gather feedback from users on proposals during development. Because of this use case, it does not matter whether we catch every single applicable code snippet, just that we find enough to perform a "showcase" of the proposal to the user. The most important thing is that the transformation is correct, as incorrect transformations will lead to bad feedback on the proposal.
\subsection{"Pipeline" Proposal}
The "Pipeline" proposal is the first we define of the proposals presented in Section \ref{sec:proposals}. This is due to the proposal being applicable to function calls, which is used all across JavaScript. This proposal is trying to solve readability when performing deeply nested function calls.
The "Pipeline" proposal is one of the proposals presented in Section \ref{sec:proposals}. This proposal is applicable to call expressions, which are used all across JavaScript. It aims to improve readability when performing deeply nested function calls.
\begin{lstlisting}[language={JavaScript}, caption={Example of "Pipeline" proposal definition in \DSL}, label={def:pipeline}]
@ -542,12 +591,12 @@ proposal Pipeline {
}
\end{lstlisting}
In the Listing \ref{def:pipeline}, the first pair definition \texttt{SingleArgument} will apply to any \textit{CallExpression} with a single argument. We do not expressively write a \texttt{CallExpression} inside a wildcard, as we have defined the structure of a \texttt{CallExpression}. The first wildcard \texttt{someFunctionIdent}, has the types of \texttt{Identifier}, to match against single identifiers, and \texttt{MemberExpression}, to match against functions who are members of objects, i.e. \texttt{console.log}. In the transformation template, we define the structure of a function call using the pipe operator, but the wildcards change order, so the argument passed as argument \texttt{someFunctionParam} is placed on the left side of the pipe operator, and the \texttt{CallExpression} is on the right, with the topic token as the argument. This case will produce a match against all function calls with a single argument, and transform them to use the pipe operator. The main difference of the second case \texttt{TwoArgument}, is it matches against functions with exactly two arguments, and uses the first argument as the left side of the pipe operator, while the second argument remains in the function call.
In Listing \ref{def:pipeline}, the first pair definition \texttt{SingleArgument} will apply to any \texttt{CallExpression} with a single argument. We do not explicitly write a \texttt{CallExpression} inside a wildcard, as we have defined the structure of a \texttt{CallExpression} in the template itself. The first wildcard, \texttt{someFunctionIdent}, has the types \texttt{Identifier}, to match against single identifiers, and \texttt{MemberExpression}, to match against functions that are members of objects, e.g. \texttt{console.log}. In the transformation template, we define the structure of a function call using the pipe operator, but the wildcards change order: the argument matched by \texttt{someFunctionParam} is placed on the left side of the pipe operator, and the \texttt{CallExpression} is on the right, with the topic token as the argument. This case will produce a match against all function calls with a single argument, and transform them to use the pipe operator. The main difference in the second case, \texttt{TwoArgument}, is that it matches against functions with exactly two arguments, and uses the first argument as the left side of the pipe operator, while the second argument remains in the function call.
\subsection{"Do Expressions" Proposal}
The "Do Expressions" proposal~\cite{Proposal:DoProposal} can be specified in our tool. Due to the nature of the proposal, it is not as applicable as the "Pipeline" proposal, as it does not re-define a style that is used quite as frequently as call expressions. This means the amount of transformed code snippets this specification in \DSL will be able to perform is expected to be lower. This is due to the Do Proposal introducing an entirely new way to write expression-oriented code in JavaScript. If the user running this tool has not used the current way of writing in an expression-oriented style in JavaScript, \DSL is limited in the amount of transformations it can perform. Nevertheless, if the user has been using an expression-oriented style, \DSL will transform parts of the code.
The "Do Expressions" proposal~\cite{Proposal:DoProposal} can be specified in our DSL. Due to the nature of the proposal, it is not as applicable as the "Pipeline" proposal, as it does not re-define a style that is used quite as frequently as call expressions. This means the number of transformed code snippets this specification in \DSL will be able to produce is expected to be lower. This is due to the "Do Expression" proposal introducing an entirely new way to write expression-oriented code in JavaScript. If the user running this tool has not used the current way of writing in an expression-oriented style in JavaScript, \DSL is limited in the number of transformations it can perform. Nevertheless, if the user has been using an expression-oriented style, \DSL will transform parts of the code.
\begin{lstlisting}[language={JavaScript}, caption={Definition of Do Proposal in \DSL}, label={def:doExpression}]
proposal DoExpression {


@ -75,7 +75,7 @@ Contained within the \texttt{Model} rule, is one or more proposals. Each proposa
The \texttt{Case} rule is created to contain a single transformation. Each case starts with the keyword \texttt{case}, followed by a name for the current case, then a block for that case's fields. Cases are designed in this way to separate different transformation definitions within a proposal. Each case contains a single definition used to match against user code, and a definition used to transform a match.
The rule, \texttt{AplicableTo}, is designed to hold a single template used for matching. It starts with the keywords \texttt{applicable} and \texttt{to}, followed by a block designed to hold the matching template definition. The template is defined as the terminal \texttt{STRING}, and is parsed as a raw string for characters by Langium\cite{Langium}.
The rule, \texttt{AplicableTo}, is designed to hold a single template used for matching. It starts with the keywords \texttt{applicable} and \texttt{to}, followed by a block designed to hold the matching template definition. The template is defined as the terminal \texttt{STRING}, and is parsed as a raw string for characters by Langium~\cite{Langium}.
The rule, \texttt{TransformTo}, is created to contain a single template used for transforming a match. It starts with the keywords \texttt{transform} and \texttt{to}, followed by a block that holds the transformation definition. This transformation definition is declared with the terminal \texttt{STRING}, and is parsed as a string of characters, same as the template in \texttt{applicable to}.
@ -111,7 +111,7 @@ terminal ID: /[_a-zA-Z][\w_]*/;
terminal STRING: /"[^"]*"|'[^']*'/;
\end{lstlisting}
In the case of \DSL, we are not implementing a programming language meant to be executed. We are using Langium in order to generate an AST that will be used as a markup language, similar to YAML, JSON or TOML\cite{TOML}. The main reason for using Langium in such an unconventional way is Langium provides support for Visual Studio Code integration, and it solves the issue of parsing the definition of each proposal manually. However with only the grammar we cannot actually verify the wildcards placed in \texttt{apl\_to\_code} and \texttt{transform\_to\_code} are correctly written. This is done by using a feature of Langium called \texttt{Validator}.
In the case of \DSL, we are not implementing a programming language meant to be executed. We are using Langium in order to generate an AST that will be used as a markup language, similar to YAML, JSON or TOML~\cite{TOML}. The main reason for using Langium in such an unconventional way is that Langium provides support for Visual Studio Code integration, and it solves the issue of parsing the definition of each proposal manually. However, with only the grammar we cannot actually verify that the wildcards placed in \texttt{apl\_to\_code} and \texttt{transform\_to\_code} are correctly written. This is done by using a feature of Langium called \texttt{Validator}.
\subsection*{Langium Validator}
@ -146,7 +146,7 @@ export class JstqlValidator {
\subsection*{Using Langium as a parser}
Langium\cite{Langium} is designed to automatically generate extensive tool support for the language specified using its grammar. However, in our case we have to parse the \DSL definition using Langium, and then extract the Abstract syntax tree generated in order to use the information it contains.
Langium~\cite{Langium} is designed to automatically generate extensive tool support for the language specified using its grammar. However, in our case we have to parse the \DSL definition using Langium, and then extract the Abstract syntax tree generated in order to use the information it contains.
To use the parser generated by Langium, we created a custom function \texttt{parseDSLtoAST}, which takes a string as an input (the raw \DSL code), and outputs the pure AST using the format described in the grammar, see Listing \ref{sec:DSL_DEF}. This function is exposed as a custom API for our tool to interface with. This also means our tool is dependent on the implementation of the Langium parser to function with \DSL. The implementation of \DSLSH is entirely independent.
@ -162,7 +162,7 @@ Langium has support for creating a generator to output an artifact, which is som
\subsection*{Extracting wildcards from \DSL}
In order to allow the use of Babel\cite{Babel}, the wildcards present in the \texttt{applicable to} blocks and \texttt{transform to} blocks have to be parsed and replaced with some valid JavaScript. This is done by using a pre-parser that extracts the information from the wildcards and inserts an \texttt{Identifier} in their place.
In order to allow the use of Babel~\cite{Babel}, the wildcards present in the \texttt{applicable to} blocks and \texttt{transform to} blocks have to be parsed and replaced with some valid JavaScript. This is done by using a pre-parser that extracts the information from the wildcards and inserts an \texttt{Identifier} in their place.
To extract the wildcards from the template, we look at each character in the template. If a start token of a wildcard is discovered, which is denoted by \texttt{<<}, everything after that until the closing token, which is denoted by \texttt{>>}, is then treated as an internal DSL variable and will be stored by the tool. A variable \texttt{flag} is used (line 5,10 \ref{lst:extractWildcard}), when the value of flag is false, we know we are currently not inside a wildcard block, this allows us to just pass the character through to the variable \texttt{cleanedJS} (line 196 \ref{lst:extractWildcard}). When \texttt{flag} is true, we know we are currently inside a wildcard block and we collect every character of the wildcard block into \texttt{temp}. Once we hit the end of the wildcard block, when we have consumed the entirety of the wildcard, the contents of the \texttt{temp} variable is passed to a tokenizer, then the tokens are parsed by a recursive descent parser (line 10-21 \ref{lst:extractWildcard}).
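The character-by-character extraction described above can be sketched as follows. This is a minimal sketch and not the listing referenced in the text: the function name \texttt{extractWildcards} is hypothetical, the tokenizer and recursive descent parser are replaced by a simple split on the first colon, and inserting the wildcard's own name as the placeholder \texttt{Identifier} is an assumption.

```javascript
// Sketch of a pre-parser that strips << ... >> blocks from a template and
// leaves a plain identifier in their place, so the result is valid JavaScript.
function extractWildcards(template) {
  let cleanedJS = "";
  let temp = "";
  let flag = false; // true while we are inside a << ... >> block
  const wildcards = [];

  for (let i = 0; i < template.length; i++) {
    if (!flag && template.startsWith("<<", i)) {
      flag = true;
      temp = "";
      i++; // skip the second '<'
    } else if (flag && template.startsWith(">>", i)) {
      // Simplification: split on the first ':' instead of tokenizing and
      // parsing the type expression.
      const separator = temp.indexOf(":");
      const name = (separator === -1 ? temp : temp.slice(0, separator)).trim();
      const typeExpr = separator === -1 ? "" : temp.slice(separator + 1).trim();
      wildcards.push({ name, typeExpr });
      cleanedJS += name; // assumed placeholder identifier
      flag = false;
      i++; // skip the second '>'
    } else if (flag) {
      temp += template[i]; // collect the wildcard contents
    } else {
      cleanedJS += template[i]; // pass ordinary characters through
    }
  }
  return { cleanedJS, wildcards };
}

const result = extractWildcards("let a = << expr: CallExpression >>;");
console.log(result.cleanedJS); // prints let a = expr;
console.log(result.wildcards); // [ { name: 'expr', typeExpr: 'CallExpression' } ]
```

The cleaned template can then be handed to a JavaScript parser, while the stored wildcard information is kept for the matching step.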
@ -262,7 +262,7 @@ The reason this is preferred is it allows us to avoid having to extract the wild
Allowing the tool to perform transformations of code requires generating an Abstract Syntax Tree from the user's code, the \texttt{applicable to} template, and the \texttt{transform to} template. This means parsing JavaScript into an AST, for which we use Babel~\cite{Babel}.
The most important reason for choosing to use Babel for the purpose of generating the AST's used for transformation is due to the JavaScript community surrounding Babel. As this tool is dealing with proposals before they are part of JavaScript, a parser that supports early proposals for JavaScript is required. Babel works closely with TC39 to support experimental syntax\cite{BabelProposalSupport} through its plugin system, which allows the parsing of code not yet part of the language.
The most important reason for choosing Babel to generate the ASTs used for transformation is the JavaScript community surrounding Babel. As this tool deals with proposals before they are part of JavaScript, a parser that supports early proposals for JavaScript is required. Babel works closely with TC39 to support experimental syntax~\cite{BabelProposalSupport} through its plugin system, which allows parsing code that is not yet part of the language.
\subsection*{Custom Tree Structure}
@ -288,7 +288,7 @@ export class TreeNode<T> {
Placing the AST generated by Babel into this structure means utilizing the library \texttt{@babel/traverse}~\cite{BabelTraverse}. Babel Traverse uses the visitor pattern~\cite{VisitorPattern} to allow for traversal of the AST. While this method does not suit traversing multiple trees at the same time, it allows for very simple traversal of the tree in order to place it into our simple tree structure.
\cite{BabelTraverse}{Babel Traverse} uses the~\cite{VisitorPattern}{visitor pattern} to visit each node of the AST in a \textit{depth first} manner, the idea of this pattern is one implements a \textit{visitor} for each of the nodes in the AST and when a specific node is visited, that visitor is then used. In the case of transferring the AST into our simple tree structure we simply have to use the same visitor for all nodes, and place that node into the tree.
\texttt{@babel/traverse}~\cite{BabelTraverse} uses the visitor pattern~\cite{VisitorPattern} to visit each node of the AST in a \textit{depth-first} manner. The idea of this pattern is that one implements a \textit{visitor} for each kind of node in the AST, and when a node of that kind is visited, the corresponding visitor is used. To transfer the AST into our simple tree structure, we simply use the same visitor for all nodes and place each visited node into the tree.
Visiting a node using the \texttt{enter()} function means we went from the parent to that child node, and it should be added as a child node of the parent. The node is automatically added to its parent's list of child nodes by the constructor of \texttt{TreeNode}. Whenever we leave a node, the function \texttt{exit()} is called. This means we are moving back up the tree, and we have to update which node was the \textit{last} in order to generate the correct tree structure.
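The \texttt{enter()}/\texttt{exit()} bookkeeping can be illustrated without Babel. The sketch below is an assumption-laden stand-in: \texttt{SimpleAstNode}, \texttt{walk} and \texttt{buildTree} are hypothetical names, \texttt{walk} is a hand-written depth-first traversal rather than \texttt{@babel/traverse}, and the \texttt{TreeNode} class is a simplified version whose constructor registers the node with its parent, as described above.

```typescript
// Simplified stand-in for the tool's tree node: the constructor adds the
// new node to its parent's list of children.
class TreeNode<T> {
  children: TreeNode<T>[] = [];
  constructor(public parent: TreeNode<T> | null, public element: T) {
    if (parent !== null) parent.children.push(this);
  }
}

// Hypothetical minimal AST shape, used only for this illustration.
interface SimpleAstNode {
  type: string;
  children?: SimpleAstNode[];
}

interface VisitorSketch {
  enter(node: SimpleAstNode): void;
  exit(node: SimpleAstNode): void;
}

// Hand-written depth-first traversal that calls enter before the children
// and exit after them, mimicking the visitor callbacks.
function walk(node: SimpleAstNode, visitor: VisitorSketch): void {
  visitor.enter(node);
  for (const child of node.children ?? []) walk(child, visitor);
  visitor.exit(node);
}

function buildTree(ast: SimpleAstNode): TreeNode<SimpleAstNode> {
  const state: {
    root: TreeNode<SimpleAstNode> | null;
    last: TreeNode<SimpleAstNode> | null;
  } = { root: null, last: null };

  walk(ast, {
    enter(node) {
      // Moving down: the new node becomes a child of the last entered node.
      const created = new TreeNode<SimpleAstNode>(state.last, node);
      if (state.root === null) state.root = created;
      state.last = created;
    },
    exit() {
      // Moving back up: the parent becomes the last node again.
      if (state.last !== null) state.last = state.last.parent;
    },
  });

  if (state.root === null) throw new Error("traversal visited no nodes");
  return state.root;
}
```

Using one shared \texttt{enter}/\texttt{exit} pair for every node type is what keeps the conversion simple: the visitor never needs to know which kind of node it is copying.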
@ -354,12 +354,12 @@ traverse(ast, {
Each line in Listing~\ref{lst:outline} is a step in the full algorithm for transforming user code based on a proposal specification in our tool. These steps work as follows:
\begin{description}
\item [Line 1:] Extract the wildcards from the template definitions and replace them with identifiers.
\item [Line 3:] Parse all source code into a Babel AST using \texttt{@babel/parser}\cite{BabelParser}
\item [Line 3:] Parse all source code into a Babel AST using \texttt{@babel/parser}~\cite{BabelParser}
\item [Line 5:] Convert the Babel AST into our own tree structure for simpler traversal of multiple trees at the same time
\item [Lines 7-12:] Based on the \texttt{applicable to} template, decide what matching function to use, and find all matching sections of the user code.
\item [Lines 14-17:] Move all matched wildcard nodes into an instance of the \texttt{transform to} template.
\item [Lines 20-26:] Insert all transformations from the previous step into the original user AST.
\item [Line 29:] Generate source code from the user AST using \texttt{@babel/generate}\cite{BabelGenerate}.
\item [Line 29:] Generate source code from the user AST using \texttt{@babel/generator}~\cite{BabelGenerate}.
\end{description}
\section{Matching}
@ -402,7 +402,7 @@ The larger and more complex the \texttt{applicable to} template is, the fewer ma
To determine whether we are matching against a template that is only a single expression or statement, we verify that the program body of the template has a length of one. If it does, we can use the single length traversal algorithm.
There is a special case for if the template is a single expression, as the first node of the AST generated by \texttt{@babel/generate}\cite{BabelGenerate} will be of type \texttt{ExpressionStatement}, the reason for this is Babel will treat free floating expressions as a statement. This will miss many applicable parts of the users code, because expressions within other statements are not wrapped in an \texttt{ExpressionStatement}. This will give a template that is incompatible with a lot of otherwise applicable expressions. This means the statement has to be removed, and the search has to be done with the expression as the top node of the template. If the node in the body of the template is a statement, no removal has to be done, as a statement can be used directly.
There is a special case when the template is a single expression: the first node of the AST generated by \texttt{@babel/parser}~\cite{BabelParser} will be of type \texttt{ExpressionStatement}, because Babel treats free-floating expressions as statements. Matching with this node as the root would miss many applicable parts of the user's code, because expressions within other statements are not wrapped in an \texttt{ExpressionStatement}. The \texttt{ExpressionStatement} therefore has to be removed, and the search has to be done with the expression as the top node of the template. If the node in the body of the template is a statement, no removal is needed, as a statement can be used directly.
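The unwrapping step can be sketched on plain objects that mimic the relevant part of a Babel AST (a \texttt{Program} with a \texttt{body} array, and an \texttt{ExpressionStatement} with an \texttt{expression} field). The type names and the function \texttt{singleTemplateRoot} are hypothetical and only serve this illustration.

```typescript
// Hypothetical, heavily reduced node shapes for illustration only.
interface NodeSketch {
  type: string;
  expression?: NodeSketch; // present on ExpressionStatement nodes
}

interface ProgramSketch {
  type: "Program";
  body: NodeSketch[];
}

// Returns the node to use as the root of a single expression/statement
// template, or null when the template has zero or several statements and
// must be handled by the multi-statement algorithm instead.
function singleTemplateRoot(program: ProgramSketch): NodeSketch | null {
  if (program.body.length !== 1) return null;
  const only = program.body[0];
  if (only === undefined) return null;
  if (only.type === "ExpressionStatement" && only.expression !== undefined) {
    // Strip the wrapper so the template can also match expressions that
    // are nested inside other statements.
    return only.expression;
  }
  return only; // a real statement is used directly
}
```

With this, a template such as \texttt{a + b} would be matched with its \texttt{BinaryExpression} as root, while a template consisting of a single variable declaration keeps its \texttt{VariableDeclaration} node as root.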
\paragraph{Discovering Matches Recursively}
@ -488,7 +488,7 @@ return [currentPair, Match];
\subsection{Matching multiple Statements}
Using multiple statements in the template of \texttt{applicable to} means the tree of \texttt{applicable to} as multiple root nodes, to perform a match with this kind of template, we use a sliding window\cite{SlidingWindow} with size equal to the amount statements in the template. This window is applied at every \textit{BlockStatement} and \texttt{Program} of the code AST, as that is the only placement statements can reside in JavaScript~\cite[14]{ecma262}.
Using multiple statements in the \texttt{applicable to} template means the tree of \texttt{applicable to} has multiple root nodes. To perform a match with this kind of template, we use a sliding window~\cite{SlidingWindow} with size equal to the number of statements in the template. This window is applied at every \texttt{BlockStatement} and \texttt{Program} of the code AST, as these are the only places statements can reside in JavaScript~\cite[14]{ecma262}.
The initial step of this algorithm is to search through the AST for nodes that contain a list of \textit{Statements}. The tree is searched depth-first, and at every level of the AST we check the type of the node. Once a node of type \texttt{BlockStatement} or \texttt{Program} is discovered, we start trying to match the statements.
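The sliding window over one statement list can be sketched generically. The function \texttt{slidingWindowStarts} and its \texttt{matches} callback are hypothetical: in the tool, the per-statement comparison would be the recursive matching of a code statement against a template statement, whereas here it is passed in as a parameter so the sketch stays self-contained.

```typescript
// Slides a window of template.length statements over one statement list
// (the body of a Program or BlockStatement) and returns every start index
// at which all statements in the window match the template pairwise.
function slidingWindowStarts<S>(
  statements: S[],
  template: S[],
  matches: (code: S, tmpl: S) => boolean
): number[] {
  const starts: number[] = [];
  const size = template.length;
  if (size === 0) return starts;

  for (let start = 0; start + size <= statements.length; start++) {
    // The index start + k is always in bounds because of the loop condition.
    const windowMatches = template.every((tmpl, k) =>
      matches(statements[start + k] as S, tmpl)
    );
    if (windowMatches) starts.push(start);
  }
  return starts;
}
```

This sketch reports every window that matches, including overlapping ones. How overlapping matches are resolved is a separate decision that is not covered here.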
@ -545,7 +545,7 @@ The transformations are performed by inserting the matched wildcards from the \t
First we have to extract every node that was matched against the wildcards in the match. To do this we recursively search through the match until we encounter an \texttt{Identifier} that shares a name with a wildcard.
To insert all nodes matched against wildcards, we use \texttt{@babel/traverse}\cite{BabelTraverse}, and traverse the AST of the \texttt{transform to} template. We use custom visitors for \textit{Identifier} and \textit{ExpressionStatement} with an \texttt{Identifier} as expression. Each visitor checks if the identifier is a registered wildcard, if it is, we perform a replacement of the \texttt{Identifier} with the node/s the wildcard was matched with.
To insert all nodes matched against wildcards, we use \texttt{@babel/traverse}~\cite{BabelTraverse} to traverse the AST of the \texttt{transform to} template. We use custom visitors for \texttt{Identifier} and for \texttt{ExpressionStatement} with an \texttt{Identifier} as its expression. Each visitor checks whether the identifier is a registered wildcard. If it is, we replace the \texttt{Identifier} with the node or nodes the wildcard was matched with.
\subsubsection*{Inserting the template into the AST}

BIN
figures/EmptyType.png Normal file

Binary file not shown.


@ -375,4 +375,13 @@
urldate = {2024-05-27},
note = {[Online; accessed 27. May 2024]},
url = {https://github.com/babel/proposals}
}
@misc{BabelAST,
title = {{babel/packages/babel-parser/ast/spec.md at main {$\cdot$} babel/babel}},
journal = {GitHub},
year = {2024},
month = may,
urldate = {2024-05-28},
note = {[Online; accessed 28. May 2024]},
url = {https://github.com/babel/babel/blob/main/packages/babel-parser/ast/spec.md}
}