\chapter{Collecting User Feedback for Syntactic Proposals}
The goal of this project is to utilize users' familiarity with their own code to gain early and worthwhile user feedback on new
syntactic proposals for ECMAScript.
\section{The core idea}
When a user of ECMAScript wants to suggest a change to the language, the idea of the change has to be described in a proposal. A proposal is a general way of describing a change and its requirements; it consists of a language specification, a motivation for the idea, and general discussion around the proposed change. A proposal ideally also needs backing from the community of users of ECMAScript, which means the proposal has to be presented to users in some way. This is currently done through many channels, such as polyfills, code examples, and beta features of the main JavaScript engines; this thesis, however, showcases proposals to users through a different avenue.
Users of ECMAScript are familiar with code they themselves have written. They know how their own code works and why they might have written it a certain way. This project aims to utilize this pre-existing knowledge to showcase new proposals for ECMAScript. This approach allows users to focus on what the proposal actually entails, instead of focusing on the examples written by the proposal authors.
Further in this chapter, we will be discussing the current version and a future version of ECMAScript. What we are referring to in this case is the set of problems a proposal is trying to solve: if that proposal is allowed into ECMAScript as part of the language, there will be a future way of solving said problems. The current way is the status quo, where the proposal is not part of ECMAScript, and the future version is when the proposal is part of ECMAScript and we are utilizing its new features.
The program will allow users to preview proposals well before they are part of the language. This way the committee can get useful feedback from users of the language earlier in the proposal process. Using the users' familiarity will ideally allow for a more efficient process for developing ECMAScript.
\subsection{Applying a proposal}
The way this project uses the pre-existing knowledge a user has of their own code is to use that code as the basis for showcasing a proposal's features. Using the user's own code as the basis requires the following steps to automatically produce examples that showcase the proposal inside the context of the user's own code.
The idea is to identify where the features and additions of a proposal could have been used. This means identifying parts of the user's program that use the pre-existing ECMAScript features the proposal interacts with and tries to improve. This identifies all the different places in the user's program the proposal can be applied. This step is called \textit{matching} in the following chapters.
Once we have matched all the parts of the program the proposal could be applied to, the user's code has to be transformed to use the proposal; this means changing the code to use a possible future version of JavaScript. This step also includes keeping the context and functionality of the user's program the same, so variables and other context-related concepts have to be transferred over to the transformed code.
The output of the previous step is a set of code pairs, where one element is a part of the user's original code and the other is the transformed code. The transformed code is ideally a perfect replacement for the original user code if the proposal becomes part of ECMAScript. These pairs are used as examples and presented to the user side by side, so the user can see their original code together with the transformed code. This allows for a direct comparison and makes it easier for the user to understand the proposal.
The steps outlined in this section require some way of defining the matching and transforming of code. This has to be done precisely and accurately to avoid examples that are wrong, since an imprecise definition of the proposal might lead to transformed code that is not a direct replacement for the code it was based upon. For this we suggest two different methods: a definition written in a custom DSL \DSL, and a definition written in a self-hosted way using only ECMAScript itself as the definition language. Read more about this in SECTION HERE.
\section{Applicable proposals}
\label{sec:proposals}
A proposal for ECMAScript is a suggested change to the language. In the case of ECMAScript this comes in the form of an addition to the language, as ECMAScript does not allow breaking changes. There are many different kinds of proposals; this project focuses exclusively on syntactic proposals.
\subsection{Syntactic Proposals}
A syntactic proposal is a proposal that contains only changes to the syntax of a language. This means the proposal contains either no, or very limited, changes to functionality, and no changes to semantics. This limits the scope of proposals this project is applicable to, but it also focuses on some of the most challenging proposals, where the users of the language might have the strongest opinions.
\subsection{Simple example of a syntactic proposal}
Consider an imaginary proposal \exProp. This proposal describes adding an optional keyword for declaring numerical variables if the expression of the declaration is a numerical literal.
This proposal will look something like this:
\begin{lstlisting}[language={JavaScript}, caption={Example of imaginary proposal \exProp}, label={ex:proposal}]
// Original code
let x = 100;
let b = "Some String";
let c = 200;
// Code after application of proposal
int x = 100;
let b = "Some String";
let c = 200;
\end{lstlisting}
Note that in Listing \ref{ex:proposal} the change is optional: it is not applied to the declaration of \textit{c}, but it is applied to the declaration of \textit{x}. Since the change is optional to use, and is essentially just \textit{syntax sugar}, this proposal does not make any changes to functionality or semantics, and can therefore be categorized as a syntactic proposal.
\iffalse
\subsection{Discard Bindings~\cite{Proposal:DiscardBindings}}
The proposal \discardBindings is classified as a Syntactic Proposal, as it contains no change to the semantics of ECMAScript. This proposal is created to allow for discarding objects when using the feature of unpacking objects/arrays on the left side of an assignment. The whole idea of this proposal is to avoid declaring unused temporary variables.
Unpacking when doing an assignment refers to assigning internal fields of an object/array directly in the assignment rather than using a temporary variable. See \ref{ex:unpackingObject} for an example of unpacking an object and \ref{ex:unpackingArr} for an example of unpacking an array.
\begin{lstlisting}[language={JavaScript}, caption={Example of unpacking Object}, label={ex:unpackingObject}]
// previous
let temp = { a:1, b:2, c:3, d:4 };
let a = temp.a;
let b = temp.b;
// unpacking
let {a, b, ...rest} = { a:1, b:2, c:3, d:4 };
rest; // { c:3, d:4 }
\end{lstlisting}
\begin{lstlisting}[language={JavaScript}, caption={Example of unpacking Array}, label={ex:unpackingArr}]
// previous
let tempArr = [ 0, 2, 3, 4 ];
let a = tempArr[0]; // 0
let b = tempArr[1] // 2
//unpacking
let [a, b, _1, _2] = [ 0, 2, 3, 4 ]; // a = 0, b = 2, _1 = 3, _2 = 4
\end{lstlisting}
In ECMAScript's current form, it is required to assign every part of an unpacked object/array to some identifier. The current status quo is to use \_ as a sign that the binding is meant to be discarded. This proposal suggests a specific keyword, \textit{void}, to be used as a signifier that whatever is at that location should be discarded.
This feature is present in other languages, such as Rust wildcards, Python wildcards, and C\# using statements and discards. In most of these other languages, the concept of a discard is a single \_. In ECMAScript the \_ token is a valid identifier; therefore, this proposal suggests the use of the keyword \textit{void}, which is already reserved and is used as a unary operator that evaluates an expression and discards its resulting value.
This proposal allows for the \textit{void} keyword to be used in a variety of contexts. Some simpler than others but all following the same pattern of allowing discarding of bindings to an identifier. It is allowed anywhere the \textit{BindingPattern}, \textit{LexicalBinding} or \textit{DestructuringAssignmentTarget} features are used in ECMAScript. This means it can be applied to unpacking of objects/arrays, in callback parameters and class methods.
\begin{lstlisting}[language={JavaScript}, caption={Example discard binding with variable discard}]
using void = new UniqueLock(mutex);
// Not allowed on top level of var/let/const declarations
const void = bar(); // Illegal
\end{lstlisting}
\begin{lstlisting}[language={JavaScript}, caption={Example Object binding and assignment pattern}]
let {b:void, ...rest} = {a:1, b:2, c:3, d:4}
rest; // {a:1, c:3, d:4};
\end{lstlisting}
\begin{lstlisting}[language={JavaScript}, caption={
Example Array binding and assignment pattern. It is not clear to the reader that in line 8 we are consuming 2 or 3 elements of the iterator. In the example on line 13 we see that is it more explicit how many elements of the iterator is consumed
}]
function* gen() {
for (let i = 0; i < Number.MAX_SAFE_INTEGER; i++) {
console.log(i);
yield i;
}
}
const iter = gen();
const [a, , ] = iter;
// prints:
// 0
// 1
const [a, void] = iter; // author intends to consume two elements
// vs.
const [a, void, void] = iter; // author intends to consume three elements
\end{lstlisting}
\begin{lstlisting}[language={JavaScript}, caption={Example discard binding with function parameters. This avoids needlessly naming parameters of a callback function that will remain unused.}]
// project an array values into an array of indices
const indices = array.map((void, i) => i);
// passing a callback to `Map.prototype.forEach` that only cares about
// keys
map.forEach((void, key) => { });
// watching a specific known file for events
fs.watchFile(fileName, (void, kind) => { });
// ignoring unused parameters in an overridden method
class Logger {
log(timestamp, message) {
console.log(`${timestamp}: ${message}`);
}
}
class CustomLogger extends Logger {
log(void, message) {
// this logger doesn't use the timestamp...
}
}
// Can also be utilized for more trivial examples where _ becomes
// cumbersome due to multiple discarded parameters.
doWork((_, a, _1, _2, b) => {});
// vs.
doWork((void, a, void, void, b) => {
});
\end{lstlisting}
The grammar of this proposal is precisely specified in the specification found in the \href{https://github.com/tc39/proposal-discard-binding?tab=readme-ov-file#object-binding-and-assignment-patterns}{proposal definition} on github.
\begin{lstlisting}[language={JavaScript}, caption={Grammar of Discard Binding}]
var [void] = x; // via: BindingPattern :: `void`
var {x:void}; // via: BindingPattern :: `void`
let [void] = x; // via: BindingPattern :: `void`
let {x:void}; // via: BindingPattern :: `void`
const [void] = x; // via: BindingPattern :: `void`
const {x:void} = x; // via: BindingPattern :: `void`
function f(void) {} // via: BindingPattern :: `void`
function f([void]) {} // via: BindingPattern :: `void`
function f({x:void}) {} // via: BindingPattern :: `void`
((void) => {}); // via: BindingPattern :: `void`
(([void]) => {}); // via: BindingPattern :: `void`
(({x:void}) => {}); // via: BindingPattern :: `void`
using void = x; // via: LexicalBinding : `void` Initializer
await using void = x; // via: LexicalBinding : `void` Initializer
[void] = x; // via: DestructuringAssignmentTarget : `void`
({x:void} = x); // via: DestructuringAssignmentTarget : `void`
\end{lstlisting}
\fi
\subsection{"Pipeline" Proposal}
The "Pipeline" proposal~\cite{Pipeline} is a syntactic proposal which focuses on solving problems related to nesting of function calls and other expressions that take an expression as an argument.
This proposal aims to solve two problems with performing consecutive operations on a value. In \emph{ECMAScript} there are currently two main styles of achieving this functionality: nesting calls and chaining calls, each of which comes with its own set of challenges when used.
Nesting calls is mainly an issue related to function calls with one or more arguments. When doing many calls in sequence the result will be a \textit{deeply nested} call expression.
Using nested calls has some specific challenges related to readability. The order of calls is from right to left, which is the opposite of the natural reading direction many users of ECMAScript are used to day to day. This makes it difficult to work out which call happens in which order. When using functions with multiple arguments in the middle of the nested call, it is not intuitive to see which call the arguments belong to. These issues are the main challenges this proposal is trying to solve. There are currently ways to improve the readability of nested calls, as they can be simplified by using temporary variables (see the sketch after the listings below). While this does introduce its own set of issues, it provides some way of mitigating the readability problem. Another positive side of nested calls is that they do not require a specific design to be used, so a library developer does not have to design their library around this specific call style.
\noindent\begin{minipage}{.45\textwidth}
\begin{lstlisting}[language={JavaScript}]
// Deeply nested call with single arguments
f1(f2(f3(f4(v))));
\end{lstlisting}
\end{minipage}\hfil
\noindent\begin{minipage}{.45\textwidth}
\begin{lstlisting}[language={JavaScript}]
// Deeply nested call with multi argument functions
f1(v5, f2(f3(v3, f4(v1, v2)), v4), v6);
\end{lstlisting}
\end{minipage}\hfil
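As a rough sketch of the mitigation mentioned above, the multi-argument example can be flattened with temporary variables (the intermediate names \texttt{inner} and \texttt{partial} are invented for illustration):
\begin{lstlisting}[language={JavaScript}]
// Hypothetical flattening of the nested multi-argument call
// using temporary variables; the intermediate names are made up.
const inner = f4(v1, v2);
const partial = f2(f3(v3, inner), v4);
f1(v5, partial, v6);
\end{lstlisting}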
Chaining solves some of these issues: it allows for a more natural left-to-right reading direction when identifying the sequence of calls, arguments are naturally grouped together with their respective function call, and it provides a way of untangling deep nesting. However, executing consecutive operations using chaining has its own set of challenges. To use chaining, the API of the code being called has to be designed to allow for chaining. This is not always the case, and making use of chaining when it has not been specifically designed for can be very difficult. There are also concepts in JavaScript that are not supported when using chaining, such as arithmetic operations, literals, \texttt{await} expressions, \texttt{yield} expressions and so on. All of these concepts would "break the chain", and one would have to fall back on temporary variables, as sketched after the listing below.
\begin{lstlisting}[language={JavaScript}]
// Chaining calls
function1().function2().function3();
// Chaining calls with multiple arguments
function1(value1).function2().function3(value2).function4();
\end{lstlisting}
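As a sketch of how such constructs break a chain, consider the following hypothetical snippet; \texttt{fetchUser}, \texttt{id} and the methods used are made-up names:
\begin{lstlisting}[language={JavaScript}]
// An await expression cannot appear in the middle of a chain,
// and neither can the arithmetic at the end, so temporary
// variables are needed.
const user = await fetchUser(id);
const orderCount = user.getOrders().length + 1;
\end{lstlisting}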
The "Pipeline" proposal aims to combine the benefits of these two styles without the challenges each method faces.
~
The main benefit of the proposal is to allow for a similar style to chaining when chaining has not been specifically designed to be applicable. The essential idea is to use syntactic sugar to change the writing order of the calls without influencing the API of the functions. Doing so will allow each call to come in the direction of left to right, while still maintaining the modularity of deeply nested function calls.
The proposal introduces a \emph{pipe operator}, which takes the result of an expression on the left and pipes it into an expression on the right. The result is piped to the location of the topic token. The specifics of the exact token used as the topic token and exactly which operator will be used as the pipe operator might be subject to change, and are currently under discussion~\cite{PipelineBikeshedding}.
The code snippets below showcase the machinery of the proposal.
\noindent\begin{minipage}{.45\textwidth}
\begin{lstlisting}[language={JavaScript}]
// Status quo
var loc = Object.keys(grunt.config( "uglify.all" ))[0];
\end{lstlisting}
\end{minipage}\hfil
\noindent\begin{minipage}{.45\textwidth}
\begin{lstlisting}[language={JavaScript}]
// With pipes
var loc = grunt.config('uglify.all') |> Object.keys(%)[0];
\end{lstlisting}
\end{minipage}\hfil
The ordering of function calls becomes more intuitive, making the order of execution easier to follow.
\noindent\begin{minipage}{.45\textwidth}
\begin{lstlisting}[language={JavaScript}]
// Status quo
const json = await npmFetch.json(
npa(pkgs[0]).escapedName, opts);
\end{lstlisting}
\end{minipage}\hfil
\noindent\begin{minipage}{.45\textwidth}
\begin{lstlisting}[language={JavaScript}]
// With pipes
const json = pkgs[0] |> npa(%).escapedName |> await npmFetch.json(%, opts);
\end{lstlisting}
\end{minipage}\hfil
Seeing which argument is passed to which function call is simpler when using pipes.
\noindent\begin{minipage}{.45\textwidth}
\begin{lstlisting}[language={JavaScript}]
// Status quo
return filter(obj, negate(cb(predicate)), context);
\end{lstlisting}
\end{minipage}\hfil
\noindent\begin{minipage}{.45\textwidth}
\begin{lstlisting}[language={JavaScript}]
// With pipes
return cb(predicate) |> _.negate(%) |> _.filter(obj, %, context);
\end{lstlisting}
\end{minipage}\hfil
Pipes can be used with functions of any number of arguments, as long as a single topic token is used.
\noindent\begin{minipage}{.45\textwidth}
\begin{lstlisting}[language={JavaScript}]
// Status quo
return xf['@@transducer/result'](obj[methodName](bind(xf['@@transducer/step'], xf), acc));
\end{lstlisting}
\end{minipage}\hfil
\noindent\begin{minipage}{.45\textwidth}
\begin{lstlisting}[language={JavaScript}]
// With pipes
return xf
|> bind(%['@@transducer/step'], %)
|> obj[methodName](%, acc)
|> xf['@@transducer/result'](%);
\end{lstlisting}
\end{minipage}\hfil
Complex call expressions are unraveled with pipes.
The pipe operator is present in many other languages, such as F\#~\cite{FPipeOperator} and Julia~\cite{JuliaPipe}. The main difference between the Julia and F\# pipe operators and this proposal is that in those languages the result of the left-side expression has to be piped into a function taking a single argument, whereas the proposal uses a topic reference instead of requiring such a function.
\subsection{"Do Expression"}
The "Do Expression"~\cite{Proposal:DoProposal} proposal, is a proposal meant to bring a style of \textit{expression oriented programming}~\cite{ExpressionOriented} to ECMAScript. Expression oriented programming is a concept taken from functional programming which allows for combining expressions in a very free manner, resulting in a highly malleable programming experience.
The motivation of the "Do Expression" proposal is to allow for local scoping of a code block that is treated as an expression. Thus, complex code requiring multiple statements can be confined inside its own scope~\cite[8.2]{ecma262}, and the resulting value is returned from the block implicitly as an expression, similarly to how unnamed functions or arrow functions are currently used. To achieve this behavior in the current stable version of ECMAScript, one needs to use an immediately invoked unnamed function~\cite[15.2]{ecma262} or arrow function~\cite[15.3]{ecma262}.
The code block of a \texttt{do} expression has one major difference from these equivalent functions: it allows for an implicit return, where the value of the final statement of the block becomes the resulting value of the entire \texttt{do} expression. The local scoping of this feature allows for a cleaner environment in the parent scope of the \texttt{do} expression: temporary variables and other assignments that are used only once can be enclosed inside the limited scope of the \texttt{do} block, keeping the parent scope where the \texttt{do} block is defined cleaner.
\noindent\begin{minipage}{.45\textwidth}
\begin{lstlisting}[language={JavaScript}]
// Current status quo
let x = () => {
let tmp = f();
return tmp + tmp + 1;
};
\end{lstlisting}
\end{minipage}\hfil
\noindent\begin{minipage}{.45\textwidth}
\begin{lstlisting}[language={JavaScript}]
// With do expression
let x = do {
let tmp = f();
tmp + tmp + 1;
};
\end{lstlisting}
\end{minipage}\hfil
The current version of JavaScript enables the use of arrow functions with no arguments to achieve behavior similar to "Do Expression". The main difference is that with a \texttt{do} expression the final statement/expression implicitly provides its Completion Record~\cite[6.2.4]{ecma262} as the resulting value, whereas the arrow function needs an explicit \texttt{return}.
\noindent\begin{minipage}{.45\textwidth}
\begin{lstlisting}[language={JavaScript}]
// Current status quo
let x = function(){
let tmp = f();
let a = g() + tmp;
return a - 1;
}();
\end{lstlisting}
\end{minipage}\hfil
\noindent\begin{minipage}{.45\textwidth}
\begin{lstlisting}[language={JavaScript}]
// With do expression
let x = do {
let tmp = f();
let a = g() + tmp;
a - 1;
};
\end{lstlisting}
\end{minipage}\hfil
This example is very similar, as it uses an unnamed function~\cite[15.2]{ecma262} which is invoked immediately to produce similar behavior to the "Do Expression" proposal.
\subsection{Await to Promise}
We now discuss an imaginary proposal that was used as a running example during the development of this thesis. This proposal is simply a pure JavaScript transformation example. The transformation it is meant to display is transforming code using \texttt{await}~\cite[27.7.5.3]{ecma262} into code which uses a promise~\cite[27.2]{ecma262} directly.
To perform this transformation, we define an equivalent way of expressing an \texttt{await} expression as a promise. We remove \texttt{await}; the awaited expression then evaluates to a promise, which has a method \texttt{then()} that is executed when the promise resolves. We pass an arrow function as the argument to \texttt{then}, and append each statement that follows in the current scope~\cite[8.2]{ecma262} to the body of that arrow function. This results in behavior equivalent to using \texttt{await}.
\noindent\begin{minipage}{.45\textwidth}
\begin{lstlisting}[language={JavaScript}]
// Code containing await
async function a(){
let b = 9000;
let something = await asyncFunction();
let c = something + 100;
return c + 1;
}
\end{lstlisting}
\end{minipage}\hfil
\noindent\begin{minipage}{.45\textwidth}
\begin{lstlisting}[language={JavaScript}]
// Re-written using promises
async function a(){
let b = 9000;
return asyncFunction()
.then(async (something) => {
let c = something + 100;
        return c + 1;
})
}
\end{lstlisting}
\end{minipage}\hfil
Transforming with this imaginary proposal results in returning the expression at the first \texttt{await} expression, with a deferred function \texttt{then} that executes once that expression resolves. The function \texttt{then} takes as callback an arrow function with a single parameter. This parameter shares its name with the initial \texttt{VariableDeclaration}, because we have to transfer all statements that occur after the original \texttt{await} expression into the body of the callback function. This callback function also has to be \texttt{async}, in case any of the statements placed into it contains \texttt{await}. This results in behavior equivalent to the original code.
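To illustrate why the callback has to be \texttt{async}, consider a sketch where a second \texttt{await} is part of the statements that are moved into the callback; \texttt{asyncFunction} and \texttt{otherAsyncFunction} are hypothetical names:
\begin{lstlisting}[language={JavaScript}]
// Hypothetical input containing two await expressions
async function a(){
    let something = await asyncFunction();
    let other = await otherAsyncFunction(something);
    return other + 1;
}
// Sketch of the transformed output: the callback passed to then()
// must be async, since it still contains an await expression.
async function a(){
    return asyncFunction()
        .then(async (something) => {
            let other = await otherAsyncFunction(something);
            return other + 1;
        });
}
\end{lstlisting}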
\section{Searching user code for applicable snippets}
To identify snippets of code in the user's code where a proposal is applicable, we need some way to define patterns of code to use as a query. To do this, we have designed and implemented a domain-specific language that allows matching parts of code that are applicable to some proposal, and transforming those parts to use the features of that proposal.
\subsection{Structure of \DSL}
\label{sec:DSLStructure}
In this section, we describe the structure of \DSL. We describe every section of the language, why each section is needed and what it is used for.
\paragraph*{Proposal definition.}
\DSL is designed to mimic the examples already provided in proposal descriptions~\cite{TC39Process}. These examples can be seen in each of the proposals described in Section \ref{sec:proposals}. The idea is to allow a similar kind of notation to the examples in order to define the transformations.
The first part of \DSL is defining the proposal. This is done by creating a named block containing all definitions of templates used for matching, alongside their respective transformations. This section contains everything relating to a specific proposal and is meant for easy proposal identification by tooling.
\begin{lstlisting}
proposal Pipeline_Proposal {}
\end{lstlisting}
\paragraph*{Case definition.}
Each proposal will have one or more definitions of a template for code to identify in the user's codebase, together with its corresponding transformation definition. These are grouped together to have a simple way of identifying the corresponding pairs of matching and transformation. This section of the proposal is defined by the keyword \textit{case} and a block that contains its related fields. A proposal definition in \DSL should contain at least one \texttt{case} definition. This allows for matching many different code snippets and showcasing more than a single concept the proposal has to offer.
\begin{lstlisting}
case case_name {
}
\end{lstlisting}
\paragraph*{Template used for matching.}
To define the template used for matching, we have a section introduced by the keyword \textit{applicable to}. This section contains the template, written in JavaScript with DSL-specific wildcards embedded inside it. This template is used to identify the parts of the user's code to which a proposal is applicable.
\begin{lstlisting}
applicable to {
"let a = 0;"
}
\end{lstlisting}
This \texttt{applicable to} template will match any \texttt{VariableDeclaration} that is initialized to the value 0 and stored in an \texttt{Identifier} with the name \texttt{a}.
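As a small sketch, given the following hypothetical user code, only the first declaration would be matched by this template:
\begin{lstlisting}[language={JavaScript}]
let a = 0;     // matched: declaration of identifier a initialized to the literal 0
let b = 0;     // not matched: the identifier is not a
let c = a + 1; // not matched: the initializer is not the literal 0
\end{lstlisting}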
\paragraph*{Defining the transformation.}
To define the transformation that is applied to a specific matched code snippet, the keyword \textit{transform to} is used. This section is similar to the template section; however, it uses the DSL identifiers defined in \texttt{applicable to} to transfer the context of the matched user code. This allows us to keep the parts of the user's code that are important to the original context it was written in.
\begin{lstlisting}
transform to{
"() => {
let b = 100;
}"
}
\end{lstlisting}
This transformation definition will change any code matched by its corresponding matching definition into exactly what is defined here: for any match produced, this code is inserted in its place.
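Continuing the small sketch above, applying this \texttt{case} to a match would replace the matched declaration with the template verbatim:
\begin{lstlisting}[language={JavaScript}]
// Before the transformation (matched by the applicable to template)
let a = 0;
// After the transformation the match is replaced by the transform to template
() => {
    let b = 100;
};
\end{lstlisting}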
\paragraph*{Full definition of \DSL.}
Combining all these parts of the \DSL structure, a full proposal definition in \DSL looks as follows.
\begin{lstlisting}[caption={\DSL definition of a proposal}]
proposal PROPOSAL_NAME {
case CASE_NAME_1 {
applicable to {
"let b = 100;"
}
transform to {
"() => {};"
}
}
case CASE_NAME_2 {
applicable to {
"console.log();"
}
transform to {
"console.dir();"
}
}
}
\end{lstlisting}
This full example of \DSL has two \texttt{case} sections. Each \texttt{case} is applied one at a time to the user's code. The first case will try to find any \texttt{VariableDeclaration} statements, where the identifier is \texttt{b}, and the right side expression is a \texttt{Literal} with value 100. The second \texttt{case} will change any empty \texttt{console.log} expression, into a \texttt{console.dir} expression.
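As a sketch, applying both cases to a small piece of hypothetical user code would produce the following pair:
\begin{lstlisting}[language={JavaScript}]
// Original user code
let b = 100;
console.log();
// After applying CASE_NAME_1 and CASE_NAME_2
() => {};
console.dir();
\end{lstlisting}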
\subsection{How a match and transformation is performed}
\label{sec:DSL_DEF}
To perform matching and transformation of the user's code, we first have to have some way of identifying applicable user code. These applicable code sections then have to be transformed and inserted back into the full user code.
\subsection*{Identifying applicable code}
To identify sections of code a proposal is applicable to, we use \emph{templates}, which are snippets of JavaScript. These templates are used to identify and match applicable sections of a user's code. A matching section for a template is one that produces an exactly equal AST structure, where each node of the matched AST section contains the same information. This means templates are matched exactly against the user's code, which does not by itself provide a way of querying the code and performing context-based transformations; for that we use \textit{wildcards} within the template.
Wildcards are interspersed into the template inside a block denoted by \texttt{<< >>}. Each wildcard starts with an identifier, which is a way of referring to that wildcard in the definition of the transformation template later. This allows for transferring the context of the parts matched to a wildcard into the transformed output: identifiers, parts of statements, or even entire statements can be transferred from the original user code into the transformation template. A wildcard also contains a type expression. A type expression defines exactly which types of AST nodes a wildcard will match against. These type expressions use Boolean logic together with the AST node types from BabelJS~\cite{Babel} to create a versatile way of defining exactly what nodes a wildcard can match against.
\subsubsection*{Wildcard type expressions}
Wildcard type expressions are used to match AST node types based on Boolean logic. This Boolean logic is based on comparison with Babel AST node types~\cite{BabelAST}. We do this because we need an accurate and expressive way of defining specifically what kinds of AST nodes a wildcard can be matched against. This means a type expression can be as simple as \texttt{VariableDeclaration}: this will match only against a node of type \texttt{VariableDeclaration}. We also have the special types \texttt{Statement}, for matching against any statement, and \texttt{Expression}, for matching against any expression.
The example below allows any \texttt{CallExpression} to match against the wildcard named \texttt{expr}.
\begin{lstlisting}
<< expr: CallExpression >>
\end{lstlisting}
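As a sketch, the following nodes would or would not satisfy this type expression; the identifiers are made up:
\begin{lstlisting}[language={JavaScript}]
foo();            // matches: a CallExpression
console.log("x"); // matches: also a CallExpression
x + 1;            // does not match: a BinaryExpression, not a CallExpression
\end{lstlisting}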
To make this more expressive, type expressions support binary and unary operators. We support the following operators: \texttt{\&\&} is logical conjunction, \texttt{||} is logical disjunction, and \texttt{!} is logical negation. This makes it possible to build complex type expressions that state precisely which nodes are allowed to match against a specific wildcard.
In the first example on line 1 below, we want to limit the wildcard to not match against any nodes of type \texttt{VariableDeclaration}, while still allowing any other \texttt{Statement}. The example on line 2 wants to avoid loop-specific statements. We express this by allowing any \texttt{Statement}, but negating the expression containing the types of loop-specific statements.
\begin{lstlisting}
<< notVariableDeclaration: Statement && !VariableDeclaration >>
<< noLoopSpecificStatements: Statement && !(BreakStatement || ContinueStatement) >>
\end{lstlisting}
The wildcards support matching subsequent sibling nodes of the code against a single wildcard. We achieve this behavior by using a Kleene plus at the top level of the expression. A Kleene plus means one or more, so we allow for one or more matches in order when using this token. This is useful for matching against a series of one or more specific nodes; the matching algorithm will continue to match until the type expression no longer evaluates to true.
In the example below, we allow the wildcard to match multiple nodes with the Kleene plus \texttt{+}. This wildcard will continue to match nodes as long as each node is a \texttt{Statement} and at the same time is not a \texttt{ReturnStatement}.
\begin{lstlisting}
<< statementsNoReturn : (Statement && !ReturnStatement)+ >>
\end{lstlisting}
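For example, given the following hypothetical function body, the wildcard above would consume the first two statements and stop before the \texttt{return}; \texttt{compute} is a made-up name:
\begin{lstlisting}[language={JavaScript}]
function example() {
    let temp = compute(); // matched by statementsNoReturn
    temp = temp * 2;      // matched by statementsNoReturn
    return temp;          // not matched: a ReturnStatement
}
\end{lstlisting}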
In the example below, a wildcard is defined on the right-hand side of a variable declaration. This wildcard will match against one or more AST nodes classified as a \texttt{CallExpression} or an \texttt{Identifier}.
\begin{lstlisting}
let variableName = << expr1: ((CallExpression || Identifier) && !ReturnStatement)+ >>;
\end{lstlisting}
\subsection{Transforming}
When matching sections of the user's code have been found, we need some way of defining how to transform those sections to showcase a proposal. This is done using the \texttt{transform to} template. This template describes the general structure of the newly transformed code, with context from the user's code transferred via wildcards.
A transformation template defines how the matches will be transformed after applicable code has been found. The transformation is a general template of the code that replaces the match in the original AST. However, without transferring over the context from the match, this would be a simple search and replace. Thus, to transfer the context from the match, wildcards are defined in this template as well. These wildcards use the same block notation found in the \texttt{applicable to} template, but they do not need to contain the types, as those are not needed in the transformation. The only required field of the wildcard is the identifier defined in \texttt{applicable to}. This tells us which wildcard match we are taking the context from, and where to place it in the transformation template.
The following example transforms a variable declaration from using \texttt{let} to using \texttt{const}.
\begin{lstlisting}[language={JavaScript}]
// Example applicable to template
applicable to {
let <<variableName: Identifier>> = <<expr1: Expression>>;
}
// Example of transform to template
transform to {
const <<variableName>> = <<expr1>>;
}
\end{lstlisting}
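Assuming a user declaration such as the sketch below, the parts matched by \texttt{variableName} and \texttt{expr1} are carried over into the transformed output; \texttt{getInitialValue} is a hypothetical function:
\begin{lstlisting}[language={JavaScript}]
// Before: matched by the applicable to template
let counter = getInitialValue();
// After: variableName and expr1 are re-inserted into the transform to template
const counter = getInitialValue();
\end{lstlisting}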
\subsection{Using \DSL}
\DSL is designed to be used at the proposal development stage; this means the users of \DSL will most likely be TC39~\cite{TC39} delegates or other relevant stakeholders.
\DSL is designed to closely mimic the style of the examples required in the TC39 process~\cite{TC39Process}. We chose this design specifically to make the tool fit the use case of the committee. The idea behind this project is to gather early user feedback on syntactic proposals, which means the main users of this kind of tool are the committee members themselves.
\DSL is written as plain text, and most domain-specific languages have some form of tooling to make the process of using the DSL simpler and more intuitive. \DSL has an extension built for Visual Studio Code, see Figure \ref{fig:ExtensionExample}. This extension supports many common features of language servers: it supports auto completion, and it will produce errors if fields are defined incorrectly or are missing parameters.
\begin{figure}[H]
\begin{center}
\includegraphics[width=\textwidth/2]{figures/ExtensionExample.png}
\caption{\label{fig:ExtensionExample} Writing \DSL in Visual Studio Code with extension}
\end{center}
\end{figure}
The language server included with this extension performs validation of the wildcards. This allows verification of wildcard declarations in \texttt{applicable to}, see Figure \ref{fig:NoTypes}. If a wildcard is declared with no types, an error will be reported.
\begin{figure}[H]
\begin{center}
\includegraphics[width=\textwidth/2]{figures/EmptyType.png}
\caption{\label{fig:NoTypes} Error displayed when declaring a wildcard with no types.}
\end{center}
\end{figure}
The extension automatically uses wildcard declarations in \texttt{applicable to} to verify all wildcards referenced in \texttt{transform to} are declared. If an undeclared wildcard is used, an error will be reported and the name of the undeclared wildcard will be displayed, see Figure \ref{fig:UndeclaredWildcard}.
\begin{figure}[H]
\begin{center}
\includegraphics[width=\textwidth/2]{figures/UndeclaredRef.png}
\caption{\label{fig:UndeclaredWildcard} Error displayed with usage of undeclared wildcard.}
\end{center}
\end{figure}
\section{Using the \DSL with syntactic proposals}
This section contains the definitions of the proposals used to evaluate the tool created in this thesis. These definitions do not have to cover every single case where the proposal might be applicable; they just have to be general enough to give a representative number of matches when the transformations are applied to some relatively long user code. This is because the tool will be used to gather feedback from users on proposals during development. Because of this use case, it does not matter that we catch every single applicable code snippet, just that we find enough to perform a "showcase" of the proposal to the user. The most important thing is that the transformation is correct, as incorrect transformations will lead to bad feedback on the proposal.
\subsection{"Pipeline" Proposal}
The "Pipeline" proposal is one of the proposals presented in Section \ref{sec:proposals}. This proposal is applicable to call expressions, which are used all across JavaScript. This proposal is trying to solve readability when performing deeply nested function calls.
\begin{lstlisting}[language={JavaScript}, caption={Example of "Pipeline" proposal definition in \DSL}, label={def:pipeline}]
proposal Pipeline {
case SingleArgument {
applicable to {
"<<someFunctionIdent:Identifier || MemberExpression>>(<<someFunctionParam: Expression>>);"
}
transform to {
"<<someFunctionParam>> |> <<someFunctionIdent>>(%);"
}
}
case TwoArgument{
applicable to {
"<<someFunctionIdent: Identifier || MemberExpression>>(<<someFunctionParam: Expression>>, <<moreFunctionParam: Expression>>)"
}
transform to {
"<<someFunctionParam>> |> <<someFunctionIdent>>(%, <<moreFunctionParam>>)"
}
}
}
\end{lstlisting}
In Listing \ref{def:pipeline}, the first case definition \texttt{SingleArgument} applies to any \texttt{CallExpression} with a single argument. We do not explicitly write a \texttt{CallExpression} inside a wildcard, as we have written out the structure of a \texttt{CallExpression} in the template itself. The first wildcard, \texttt{someFunctionIdent}, has the types \texttt{Identifier}, to match against single identifiers, and \texttt{MemberExpression}, to match against functions that are members of objects, e.g.\ \texttt{console.log}. In the transformation template, we define the structure of a function call using the pipe operator, but the wildcards change order: the expression matched by \texttt{someFunctionParam} is placed on the left side of the pipe operator, and the \texttt{CallExpression} is on the right, with the topic token as the argument. This case will produce a match against all function calls with a single argument, and transform them to use the pipe operator. The main difference of the second case, \texttt{TwoArgument}, is that it matches against functions with exactly two arguments, and uses the first argument as the left side of the pipe operator, while the second argument remains in the function call.
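As a sketch of the two cases applied to hypothetical user code, where \texttt{parseInt} and \texttt{Math.max} stand in for arbitrary functions:
\begin{lstlisting}[language={JavaScript}]
// SingleArgument: a call with one argument
parseInt(userInput);
// is transformed into
userInput |> parseInt(%);

// TwoArgument: a call with exactly two arguments
Math.max(limit, value);
// is transformed into
limit |> Math.max(%, value);
\end{lstlisting}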
\subsection{"Do Expressions" Proposal}
The "Do Expressions" proposal~\cite{Proposal:DoProposal} can be specified in our DSL. Due to the nature of the proposal, it is not as applicable as the "Pipeline" proposal, as it does not re-define a style that is used quite as frequently as call expressions. This means the amount of transformed code snippets this specification in \DSL will be able to perform is expected to be lower. This is due to the "Do Expression" proposal introducing an entirely new way to write expression-oriented code in JavaScript. If the user running this tool has not used the current way of writing in an expression-oriented style in JavaScript, \DSL is limited in the amount of transformations it can perform. Nevertheless, if the user has been using an expression-oriented style, \DSL will transform parts of the code.
\begin{lstlisting}[language={JavaScript}, caption={Definition of Do Proposal in \DSL}, label={def:doExpression}]
proposal DoExpression {
case arrowFunction {
applicable to {
"() => {
<<statements: (Statement && !ReturnStatement)+>>
return <<returnVal : Expression>>;
}
"
}
transform to {
"(do {
<<statements>>
<<returnVal>>
})"
}
}
case immediatelyInvokedAnonymousFunction {
applicable to {
"(function(){
<<statements: (Statement && !ReturnStatement)+>>
return <<returnVal : Expression>>;
})();"
}
transform to {
"(do {
<<statements>>
<<returnVal>>
})"
}
}
}
\end{lstlisting}
In Listing \ref{def:doExpression}, the specification of the "Do Expression" proposal in \DSL can be seen. It has two cases. The first case, \texttt{arrowFunction}, applies to a code snippet using an arrow function~\cite[15.3]{ecma262} with a return value. The wildcards of this template are \texttt{statements}, which matches against one or more statements that are not of type \texttt{ReturnStatement}, and \texttt{returnVal}, which matches against any expression. The reason we limit the one-or-more match is that we cannot match the final statement of the block to this wildcard, as that has to be matched against the return statement in the template. The reason for extracting the expression from the \texttt{return} statement is to use it in the implicit return of the \texttt{do} block. In the transformation template, we replace the arrow function with a \texttt{do} expression. This \texttt{do} expression has to be defined inside parentheses, as a free-floating \texttt{do} expression is not allowed due to ambiguous parsing against a \texttt{do {} while()} statement. We then insert the statements matched against the \texttt{statements} wildcard into the block of the \texttt{do} expression, and the final statement of the block is the expression matched against the \texttt{returnVal} wildcard. This produces an equivalent transformation of an arrow function into a \texttt{do} expression. The second case, \texttt{immediatelyInvokedAnonymousFunction}, follows the same principle as the first case, but is applied to immediately invoked anonymous functions, and produces exactly the same output after the transformation, because immediately invoked anonymous functions are equivalent to arrow functions in this setting.
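A sketch of the \texttt{arrowFunction} case applied to hypothetical user code, following the definition above (\texttt{f} and \texttt{g} are made-up functions):
\begin{lstlisting}[language={JavaScript}]
// Before: an arrow function matched by the arrowFunction case
let x = () => {
    let temp = f();       // bound to the statements wildcard
    let sum = temp + g(); // bound to the statements wildcard
    return sum;           // sum is bound to the returnVal wildcard
};
// After: the statements and the return expression are placed inside a do expression
let x = (do {
    let temp = f();
    let sum = temp + g();
    sum;
});
\end{lstlisting}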
\subsection{"Await to Promise" imaginary proposal}
The imaginary proposal "Await to Promise" is created to transform code snippets from using \texttt{await} to using a promise with equivalent functionality.
This proposal was created to evaluate the tool, as it is quite difficult to define the applicable code in the current template form. The definition is designed to create matches in code using \texttt{await}, and to highlight how \texttt{await} could be written using a promise instead. It also highlights some of the issues with the current design of \DSL that will be described in Future Work.
\begin{lstlisting}[language={JavaScript}, caption={Definition of Await to Promise evaluation proposal in \DSL}, label={def:awaitToPromise}]
proposal awaitToPromise{
case single{
applicable to {
"let <<ident:Identifier>> = await <<awaitedExpr: Expression>>;
<<statements: (Statement && !ReturnStatement && !ContinueStatement &&!BreakStatement)+>>
return <<returnExpr: Expression>>
"
}
transform to{
"return <<awaitedExpr>>.then(async <<ident>> => {
<<statements>>
return <<returnExpr>>
});"
}
}
}
\end{lstlisting}
The specification of "Await to Promise" in \DSL is created to match asynchronous code inside a function. It is limited to match asynchronous functions containing a single await statement, and that await statement has to be stored in a \texttt{VariableDeclaration}. The second wildcard \texttt{statements}, is designed to match all statements following the \texttt{await} statement up to the return statement. This is done to move the statements into the callback function of \texttt{then()} in the transformation. We include \texttt{\!ReturnStatement} because we do not want to consume the return as it would then be removed from the functions scope and into the callback function of \texttt{then()}. We also have to avoid matching where there exists loop specific statements such as \texttt{ContinueStatement} or \texttt{BreakStatement}.
The transformation definition has to use an async function in \texttt{.then()}, as there might be more await expressions contained within \texttt{statements}.
\section{\DSLSH}
In this thesis, we also created an alternative way of defining proposals and their respective transformations, using JavaScript as its own meta-language for the definitions. The reason for creating a way of defining proposals in JavaScript is that it limits the number of dependencies of the tool, since we no longer rely on \DSL, and it allows for more exploration in the future work of this project.
\DSLSH is less of an actual language and more of a program API at the moment. It allows for defining proposals purely as JavaScript objects, which is meant to allow a more modular way of using this idea. In \DSLSH you define a \textit{prelude}, which is simply a list of variable declarations where each declaration binds a wildcard name to its type expression, given as a string. This means we do not need to perform wildcard extraction when parsing the templates used for matching and transformation.
\noindent\begin{minipage}{.45\textwidth}
\begin{lstlisting}
// Definition in JSTQL
proposal a{
case {
applicable to {
<<a:Expression>>
}
transform to {
() => <<a>>
}
}
}
\end{lstlisting}
\end{minipage}\hfil
\noindent\begin{minipage}{.45\textwidth}
\begin{lstlisting}[language={JavaScript}]
// Equivalent definition in JSTQL-SH
{
prelude: 'let a = "Expression"',
applicableTo: "a;",
transformTo: "() => a;"
}
\end{lstlisting}
\end{minipage}\hfil