\chapter{Collecting User Feedback for Syntactic Proposals}
The goal of this project is to utilize users' familiarity with their own code to gain early and worthwhile feedback on new
syntactic proposals for EcmaScript.
\section{The core idea}
\textbf{THIS IS TOO ABRUPT OF AN INTRODUCTION, MORE GENERAL ALMOST REPEAT OF BACKGROUND GOES HERE}
\textbf{CURRENT VERSION vs FUTURE VERSION instead of old way}
\textbf{DO NOT DISCUSS TOOL HERE}
Users of EcmaScript are familiar with the code they have written themselves. They know how their own code works and why they wrote it a certain way. This project aims to utilize this pre-existing knowledge to showcase new proposals for EcmaScript. Showcasing proposals this way allows users to focus on what the proposal actually entails, instead of on understanding examples written by the proposal author.
Further in this chapter, we will discuss the \textit{old} and \textit{new} way of programming in EcmaScript. Every proposal targets a set of problems; if the proposal is accepted into EcmaScript, there will be a \textit{new} way of solving those problems. The \textit{old} way is the status quo, where the proposal is not part of EcmaScript, and the \textit{new} way is where the proposal is part of EcmaScript and its features are being used.
The program will allow users to preview proposals well before they are part of the language. This way, the committee can get feedback from users of the language earlier in the proposal process, which will ideally make the process of adding proposals to EcmaScript more efficient.
\subsection{Applying a proposal}
This project uses the pre-existing knowledge a user has of their own code by using that code as the base for showcasing a proposal's features. Doing so requires the following steps in order to automatically produce examples that showcase the proposal in the context of the user's own code.
First, the tool has to identify where the features and additions of a proposal could have been used. This means identifying the parts of the user's program that use pre-existing EcmaScript features the proposal interacts with and tries to improve upon. The result is every place in the user's program where the proposal can be applied. This step is called \textit{matching} in the following chapters.
Once the tool has matched all the parts of the program where the proposal could be applied, the user's code has to be transformed to use the feature(s) the proposal introduces. This step also has to keep the context and functionality of the user's program the same, so variables and other context-related concepts have to be transferred over to the transformed code.
The output of the previous step is a set of code pairs, where the first element is a part of the user's original code and the second is the transformed code. Ideally, the transformed code is a perfect replacement for the original user code if the proposal becomes part of EcmaScript. These pairs are presented to the user as examples, side by side, so the user can see their original code together with the transformed code. This allows for a direct comparison and makes it easier for the user to understand the proposal.
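The three steps above (matching, transforming and pairing) can be sketched in plain JavaScript. This is only a minimal illustration and not the implementation of the tool: the record format for declarations and the helper names \texttt{matches}, \texttt{transform}, \texttt{print} and \texttt{collectPairs} are hypothetical, and the rule being applied (rewriting numeric \texttt{let} declarations to a made-up \texttt{int} keyword) is chosen purely for brevity.
\begin{lstlisting}[language={JavaScript}, caption={Sketch of matching, transforming and pairing (hypothetical helpers)}]
// Hypothetical, simplified stand-in for parsed user code:
// one record per variable declaration.
const userCode = [
    { kind: "let", name: "x", init: 100 },
    { kind: "let", name: "b", init: "Some String" },
];

// Step 1 (matching): find declarations the made-up rule applies to.
function matches(decl) {
    return decl.kind === "let" && typeof decl.init === "number";
}

// Step 2 (transforming): keep name and initializer, change only the keyword.
function transform(decl) {
    return { ...decl, kind: "int" };
}

// Turn a record back into source text.
function print(decl) {
    return `${decl.kind} ${decl.name} = ${JSON.stringify(decl.init)};`;
}

// Step 3 (pairing): original and transformed code, side by side.
function collectPairs(decls) {
    return decls.filter(matches).map((decl) => ({
        original: print(decl),
        transformed: print(transform(decl)),
    }));
}

const pairs = collectPairs(userCode);
console.log(pairs);
// [ { original: 'let x = 100;', transformed: 'int x = 100;' } ]
\end{lstlisting}
Only the declaration of \texttt{x} produces a pair, since the string declaration does not match the rule.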
The steps outlined in this section require some way of defining the matching and transformation of code. These definitions have to be precise in order to avoid bugs; an imprecise definition of the proposal might lead to transformed code that is not a direct replacement for the code it was based upon. For this we suggest two different methods: a definition written in a custom DSL, \DSL, and a self-hosted definition that uses only EcmaScript as the definition language. Read more about this in SECTION HERE.
\section{Applicable proposals}
\label{sec:proposals}
A proposal for EcmaScript is a suggested change to the language. In the case of EcmaScript, this comes in the form of an addition to the language, as EcmaScript does not allow for breaking changes. There are many different kinds of proposals; this project focuses exclusively on syntactic proposals.
\subsection{Syntactic Proposals}
A syntactic proposal is a proposal that contains only changes to the syntax of a language. This means the proposal contains either no or very limited changes to functionality, and no changes to semantics. This limits the scope of proposals this project is applicable to, but it also focuses the project on some of the most challenging proposals, where the users of the language might have the strongest opinions.
\subsection{Simple example of a syntactic proposal}
Consider an imaginary proposal \exProp. This proposal describes adding an optional keyword for declaring numerical variables if the expression of the declaration is a numerical literal.
This proposal will look something like this:
\begin{lstlisting}[language={JavaScript}, caption={Example of imaginary proposal \exProp}, label={ex:proposal}]
// Original code
let x = 100;
let b = "Some String";
let c = 200;
// Code after application of proposal
int x = 100;
let b = "Some String";
let c = 200;
\end{lstlisting}
Note that in \ref{ex:proposal} the change is optional: it is applied to the declaration of \textit{x}, but not to the declaration of \textit{c}. Since the change is optional to use, and is essentially just \textit{syntactic sugar}, this proposal does not make any changes to functionality or semantics, and can therefore be categorized as a syntactic proposal.
\subsection{\cite[Discard Bindings]{Proposal:DiscardBindings}}
The proposal \discardBindings is classified as a syntactic proposal, as it contains no change to the semantics of EcmaScript. This proposal allows bindings to be discarded when unpacking objects/arrays on the left side of an assignment. The whole idea of this proposal is to avoid declaring unused temporary variables.
Unpacking when doing an assignment refers to assigning internal fields of an object/array directly in the assignment rather than using a temporary variable. See \ref{ex:unpackingObject} for an example of unpacking an object and \ref{ex:unpackingArr} for an example of unpacking an array.
\begin{lstlisting}[language={JavaScript}, caption={Example of unpacking Object}, label={ex:unpackingObject}]
// previous
let temp = { a:1, b:2, c:3, d:4 };
let a = temp.a;
let b = temp.b;
// unpacking
let {a, b, ...rest} = { a:1, b:2, c:3, d:4 };
rest; // { c:3, d:4 }
\end{lstlisting}
\begin{lstlisting}[language={JavaScript}, caption={Example of unpacking Array}, label={ex:unpackingArr}]
// previous
let tempArr = [ 0, 2, 3, 4 ];
let a = tempArr[0]; // 0
let b = tempArr[1]; // 2
//unpacking
let [a, b, _1, _2] = [ 0, 2, 3, 4 ]; // a = 0, b = 2, _1 = 3, _2 = 4
\end{lstlisting}
In EcmaScript's current form, unpacking generally requires binding every part of the object/array to some identifier, even the parts that are never used. The current convention is to use \_ as a sign that a binding is meant to be discarded. This proposal suggests using the keyword \textit{void} to signify that whatever is at that location should be discarded.
This feature is present in other languages, such as Rust wildcards, Python wildcards and C\# using statement and discards. In most of these other languages, a discard is written as a single \_. In EcmaScript the \_ token is a valid identifier, therefore this proposal suggests the use of the keyword \textit{void}. This keyword is already reserved in EcmaScript, where it is used as a unary operator that evaluates its operand and returns \textit{undefined}.
This proposal allows the \textit{void} keyword to be used in a variety of contexts, some simpler than others, but all following the same pattern of discarding a binding instead of naming it with an identifier. It is allowed anywhere the \textit{BindingPattern}, \textit{LexicalBinding} or \textit{DestructuringAssignmentTarget} features are used in EcmaScript. This means it can be applied to the unpacking of objects/arrays, in callback parameters and in class methods.
\begin{lstlisting}[language={JavaScript}, caption={Example discard binding with variable discard}]
using void = new UniqueLock(mutex);
// Not allowed on top level of var/let/const declarations
const void = bar(); // Illegal
\end{lstlisting}
\begin{lstlisting}[language={JavaScript}, caption={Example Object binding and assignment pattern}]
let {b:void, ...rest} = {a:1, b:2, c:3, d:4}
rest; // {a:1, c:3, d:4};
\end{lstlisting}
\begin{lstlisting}[language={JavaScript}, caption={
Example Array binding and assignment pattern. With the elision syntax in the first array pattern, it is not clear to the reader whether two or three elements of the iterator are consumed. The patterns using \textit{void} make the number of consumed elements explicit
}]
function* gen() {
    for (let i = 0; i < Number.MAX_SAFE_INTEGER; i++) {
        console.log(i);
        yield i;
    }
}
const iter = gen();
const [a, , ] = iter;
// prints:
// 0
// 1
const [a, void] = iter; // author intends to consume two elements
// vs.
const [a, void, void] = iter; // author intends to consume three elements
\end{lstlisting}
\begin{lstlisting}[language={JavaScript}, caption={Example discard binding with function parameters. This avoids needlessly naming parameters of a callback function that will remain unused.}]
// project an array values into an array of indices
const indices = array.map((void, i) => i);
// passing a callback to `Map.prototype.forEach` that only cares about
// keys
map.forEach((void, key) => { });
// watching a specific known file for events
fs.watchFile(fileName, (void, kind) => { });
// ignoring unused parameters in an overridden method
class Logger {
    log(timestamp, message) {
        console.log(`${timestamp}: ${message}`);
    }
}
class CustomLogger extends Logger {
    log(void, message) {
        // this logger doesn't use the timestamp...
    }
}
// Can also be utilized for more trivial examples where _ becomes
// cumbersome due to multiple discarded parameters.
doWork((_, a, _1, _2, b) => {});
// vs.
doWork((void, a, void, void, b) => {
});
\end{lstlisting}
The grammar of this proposal is precisely specified in the specification found in the \href{https://github.com/tc39/proposal-discard-binding?tab=readme-ov-file#object-binding-and-assignment-patterns}{proposal definition} on github.
\begin{lstlisting}[language={JavaScript}, caption={Grammar of Discard Binding}]
var [void] = x; // via: BindingPattern :: `void`
var {x:void}; // via: BindingPattern :: `void`
let [void] = x; // via: BindingPattern :: `void`
let {x:void}; // via: BindingPattern :: `void`
const [void] = x; // via: BindingPattern :: `void`
const {x:void} = x; // via: BindingPattern :: `void`
function f(void) {} // via: BindingPattern :: `void`
function f([void]) {} // via: BindingPattern :: `void`
function f({x:void}) {} // via: BindingPattern :: `void`
((void) => {}); // via: BindingPattern :: `void`
(([void]) => {}); // via: BindingPattern :: `void`
(({x:void}) => {}); // via: BindingPattern :: `void`
using void = x; // via: LexicalBinding : `void` Initializer
await using void = x; // via: LexicalBinding : `void` Initializer
[void] = x; // via: DestructuringAssignmentTarget : `void`
({x:void} = x); // via: DestructuringAssignmentTarget : `void`
\end{lstlisting}
\subsection{Pipeline Proposal}
The pipeline proposal is a syntactic proposal with no change to the functionality of EcmaScript. It focuses solely on solving problems related to the nesting of function calls and other expressions that allow for a topic reference.
The pipeline proposal aims to solve two problems with performing consecutive operations on a value. In EcmaScript there are currently two main styles of achieving this: nesting calls and chaining calls. Each comes with its own set of challenges.
Nesting calls is mainly an issue for function calls with one or more arguments. When doing many calls in sequence, the result is a \textit{deeply nested} call expression, see \ref{ex:deeplyNestedCall}.
Challenges with nested calls
\begin{itemize}
\item The order of calls goes from right to left, which is the opposite of the natural reading direction users of EcmaScript are used to
\item When introducing functions with multiple arguments in the middle of the nested call, it is not intuitive to see which call each argument belongs to.
\end{itemize}
Benefits of nested calls
\begin{itemize}
\item Does not require special design thought to be used
\end{itemize}
\begin{lstlisting}[language={JavaScript}, caption={Example of deeply nested call}, label={ex:deeplyNestedCall}]
// Deeply nested call with single arguments
function1(function2(function3(function4(value))));
// Deeply nested call with multi argument functions
function1(function2(function3(value2, function4)), value1);
\end{lstlisting}
Chaining solves some of the issues relating to nesting, as it allows for a more natural left-to-right reading direction when identifying the sequence of calls. However, performing consecutive operations using chaining has its own set of challenges.
\subsection{Description of Pipeline proposal}
Challenges with chaining calls
\begin{itemize}
\item APIs have to be specifically designed with chaining in mind
\item Might not even be possible due to external libraries
\item Does not support other concepts such as arithmetic operations, array/object literals, await, yield, etc...
\end{itemize}
Benefits of chaining calls
\begin{itemize}
\item More natural direction of call order
\item Arguments of functions are grouped with function name
\item Untangles deep nesting
\end{itemize}
\begin{lstlisting}[language={JavaScript}, caption={Example of chaining calls}, label={ex:chainingCall}]
// Chaining calls
function1().function2().function3();
// Chaining calls with multiple arguments
function1(value1).function2().function3(value2).function4();
\end{lstlisting}
The pipeline proposal aims to combine the benefits of these two styles without the challenges each of them faces.
The main benefit of the pipeline operator is that it allows for a style similar to chaining even when the API has not been specifically designed for chaining. It uses syntactic sugar to change the order in which the calls are written, without influencing the API of the functions.
\begin{lstlisting}[language={JavaScript}, caption={Example from jquery}, label= {ex:pipeline}]
// Status quo
var minLoc = Object.keys( grunt.config( "uglify.all.files" ) )[ 0 ];
// With pipes
var minLoc = grunt.config('uglify.all.files') |> Object.keys(%)[0];
\end{lstlisting}
\begin{lstlisting}[language={JavaScript}, caption={Example from unpublish}, label= {ex:pipeline}]
// Status quo
const json = await npmFetch.json(npa(pkgs[0]).escapedName, opts);
// With pipes
const json = pkgs[0] |> npa(%).escapedName |> await npmFetch.json(%, opts);
\end{lstlisting}
\begin{lstlisting}[language={JavaScript}, caption={Example from underscore.js}, label= {ex:pipeline}]
// Status quo
return filter(obj, negate(cb(predicate)), context);
// With pipes
return cb(predicate) |> _.negate(%) |> _.filter(obj, %, context);
\end{lstlisting}
\begin{lstlisting}[language={JavaScript}, caption={Example from ramda.js}, label= {ex:pipeline}]
// Status quo
return xf['@@transducer/result'](obj[methodName](bind(xf['@@transducer/step'], xf), acc));
// With pipes
return xf
|> bind(%['@@transducer/step'], %)
|> obj[methodName](%, acc)
|> xf['@@transducer/result'](%);
\end{lstlisting}
\subsection{Do proposal}
The \cite[Do Proposal]{Proposal:DoProposal} is a proposal meant to bring \textit{expression-oriented} programming to EcmaScript. Expression-oriented programming is a concept taken from functional programming, which allows expressions to be combined in a very free manner, giving a highly malleable programming experience.
The motivation of the do expression proposal is to create a feature that allows for local scoping of a code block that is treated as an expression. This allows complex code requiring multiple statements to be confined inside its own scope, with the resulting value returned from the block as an expression, similar to how an unnamed function is used currently. The current status quo for achieving this behavior is to use an unnamed function or an arrow function and invoke it immediately; both are equivalent to a do expression.
The code block of a do expression has one major difference from these equivalent functions: the value of the final statement in the block is implicitly returned, without an explicit \texttt{return}.
The local scoping of this feature allows for a cleaner environment in the parent scope of the do expression. Temporary variables and other assignments that are used only once can be enclosed inside the limited scope of the do block, instead of cluttering the parent scope where the do block is defined.
\begin{lstlisting}[language={JavaScript}, caption={Example of do expression}, label={ex:doExpression}]
// Current status quo: immediately invoked arrow function
let x = (() => {
    let tmp = f();
    return tmp + tmp + 1;
})();
// Using an immediately invoked unnamed function
let x = function(){
    let tmp = f();
    return tmp + tmp + 1;
}();
// Using do expression
let x = do {
    let tmp = f();
    tmp + tmp + 1
}
\end{lstlisting}
This proposal has some limitations on its usage. Due to the implicit return of the final statement, a do expression cannot end with an \texttt{if} without an \texttt{else}, or with a loop.
\subsection{Await to Promise}
This section covers an imaginary proposal that was used to evaluate the program developed in this thesis. It is less of a proposal and more of a pure JavaScript transformation example. What this proposal wants to achieve is rewriting code that uses \texttt{await} to use promises instead.
In order to do this, an equivalent way of writing code containing \texttt{await} in the syntax of promises had to be identified. In this case, the equivalent is to take the rest of the scope in which \texttt{await} was written and place it inside the callback passed to \texttt{then}, as in \texttt{.then(() => \{\})}.
\begin{lstlisting}[language={JavaScript}, caption={Example of await to promises}, label={ex:awaitToPromise}]
// Code containing await
async function a(){
    let something = await asyncFunction();
    let c = something + 100;
    return c + 1;
}
// Re-written using promises
function a(){
    return asyncFunction().then((something) => {
        let c = something + 100;
        return c + 1;
    });
}
\end{lstlisting}
In the example \ref*{ex:awaitToPromise} we change \texttt{a} from an async function to a regular function, but it still returns a promise, which ensures that everything using the function \texttt{a} still gets the expected value.
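To make the equivalence between the two styles concrete, the following sketch runs both variants and compares their results. The function \texttt{asyncFunction} is a hypothetical stand-in that resolves to a fixed value, and the names \texttt{withAwait} and \texttt{withPromise} are introduced only for this illustration.
\begin{lstlisting}[language={JavaScript}, caption={Sketch comparing the await and promise variants (hypothetical names)}]
// Hypothetical asynchronous function, used only for this illustration.
function asyncFunction() {
    return Promise.resolve(1);
}

// Variant using await.
async function withAwait() {
    let something = await asyncFunction();
    let c = something + 100;
    return c + 1;
}

// Variant using promises: the rest of the scope moves into then().
function withPromise() {
    return asyncFunction().then((something) => {
        let c = something + 100;
        return c + 1;
    });
}

// Both calls return a promise, so callers can treat them the same way.
const results = Promise.all([withAwait(), withPromise()]);
results.then(([fromAwait, fromPromise]) => {
    console.log(fromAwait === fromPromise, fromAwait); // true 102
});
\end{lstlisting}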
\section{Searching user code for applicable snippets}
In order to identify snippets of code in the user's codebase where a proposal is applicable, we need some way to define patterns of code to which the proposal can be applied. To do this, a DSL titled \DSL is used.
\subsection{\DSL}
\label{sec:DSL_DEF}
To make use of the user's code, we have to identify the snippets of that code a proposal is applicable to. For this purpose we have designed a DSL called \DSL{} (JavaScript Template Query Language). A definition in this DSL contains everything needed to identify and transform user code in order to showcase a proposal.
\subsection{Matching}
In order to identify snippets of code a proposal is applicable to, we use templates of JavaScript. These templates allow for \textit{wildcard} sections, which can match against specific AST nodes. The \textit{wildcard} sections are also used to transfer the context of the matched code into the transformation.
A template containing none of these \textit{wildcards} is matched exactly. The match is then essentially a direct code search for snippets where the AST of the user's code matches the template exactly.
A \textit{wildcard} is written inside a block denoted by \texttt{<< WILDCARD >>}. Each wildcard has to have a DSL identifier, which is the way of referring to that wildcard in the definition of the transformation, and a wildcard type.
Each wildcard has to have some form of type. These types can be node types inherited from Babel's AST definition. This means that if you want a wildcard to match any \textit{CallExpression}, that wildcard should be of type CallExpression. To allow multiple node types to match against a single wildcard, \DSL allows sum types for wildcards, so that a single wildcard definition accepts several AST node types.
The wildcard type can also be a custom type with special functionality. One example of this is \texttt{anyRest}, which allows the matcher to match the wildcard against multiple expressions/statements stored as a list within an AST node. As an example, this type could match against any number of statements within a code block.
The type definition is thus also used to define specific behavior that the program using this DSL should perform. In \ref{def:pipeline}, the custom type \texttt{anyRest} is used to accept any number of child nodes found together with the wildcard, which makes it feasible to match against any number of function arguments, for example.
\begin{lstlisting}[caption={Example of a wildcard}, label={ex:wildcard}]
let variableName = << expr1: CallExpression | Identifier >>;
\end{lstlisting}
In \ref{ex:wildcard} a wildcard section is defined on the right-hand side of a variable declaration. This wildcard will match against any AST node classified as a CallExpression or an Identifier.
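How such a typed wildcard could be checked against an AST can be sketched as follows. This is a minimal illustration under stated assumptions, not the matcher of the tool: the AST fragments are written by hand and only loosely modelled on Babel's node shapes, and the \texttt{Wildcard} marker node as well as the functions \texttt{matchNode} and \texttt{match} are hypothetical.
\begin{lstlisting}[language={JavaScript}, caption={Sketch of matching a typed wildcard against hand-written AST nodes (hypothetical helpers)}]
// Hand-written AST fragment for: let variableName = <init>;
const declaration = (init) => ({
    type: "VariableDeclaration",
    kind: "let",
    declarations: [{
        type: "VariableDeclarator",
        id: { type: "Identifier", name: "variableName" },
        init,
    }],
});

// Template for: let variableName = << expr1: CallExpression | Identifier >>;
// The "Wildcard" node is a hypothetical marker, not a Babel node type.
const template = declaration({
    type: "Wildcard",
    name: "expr1",
    types: ["CallExpression", "Identifier"],
});

// Recursively compare a template with a node, recording wildcard captures.
// Only keys present in the template are compared, so extra fields on real
// AST nodes (such as source locations) would be ignored.
function matchNode(tmpl, node, captures) {
    if (tmpl && tmpl.type === "Wildcard") {
        const isNode = node !== null && typeof node === "object";
        if (!isNode || !tmpl.types.includes(node.type)) return false;
        captures[tmpl.name] = node;
        return true;
    }
    if (Array.isArray(tmpl)) {
        return Array.isArray(node) && tmpl.length === node.length &&
            tmpl.every((child, i) => matchNode(child, node[i], captures));
    }
    if (tmpl !== null && typeof tmpl === "object") {
        if (node === null || typeof node !== "object" || Array.isArray(node)) {
            return false;
        }
        return Object.keys(tmpl).every(
            (key) => matchNode(tmpl[key], node[key], captures));
    }
    return tmpl === node; // primitives must be identical
}

// Returns the captured nodes by wildcard name, or null if nothing matched.
function match(tmpl, node) {
    const captures = {};
    return matchNode(tmpl, node, captures) ? captures : null;
}

const callDecl = declaration({
    type: "CallExpression",
    callee: { type: "Identifier", name: "foo" },
    arguments: [],
});
const literalDecl = declaration({ type: "NumericLiteral", value: 42 });

console.log(match(template, callDecl) !== null);    // true
console.log(match(template, literalDecl) !== null); // false
\end{lstlisting}
The declaration initialized with a call matches and captures the call under \texttt{expr1}, while the declaration initialized with a numeric literal is rejected because its node type is not part of the sum type.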
\subsection{\DSL custom matching types}
\texttt{anyNExprs} is a custom DSL matching type. It allows the matcher to match a specific section of the JavaScript template against any number of elements stored within a list on the AST node it is currently trying to match. Using this type allows any number of expressions to be transferred from the match into the transformed code. This custom type is used in \ref{def:pipeline}.
\texttt{anyNStatements} is a custom DSL matching type. It allows the matcher to match against any number of statements within a section of JavaScript. This custom type is used in \ref{def:doExpression}.
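The behavior of such list-matching types can be sketched on a plain list of elements. The sketch below is an assumption about how a rest-style wildcard might work, not the behavior of the tool: the function \texttt{matchList} and the \texttt{rest} flag are hypothetical, elements are represented as strings for brevity, and a rest wildcard is assumed to accept zero or more elements (requiring at least one would be a one-line change).
\begin{lstlisting}[language={JavaScript}, caption={Sketch of a rest-style wildcard matching a list (hypothetical helper)}]
// Match a list of wildcard descriptions against a list of elements.
// A wildcard with rest: true captures all remaining elements.
function matchList(wildcards, elements) {
    const captures = {};
    for (let i = 0; i < wildcards.length; i++) {
        const wildcard = wildcards[i];
        if (wildcard.rest) {
            // Assumption: a rest wildcard is only allowed in last position.
            if (i !== wildcards.length - 1) {
                throw new Error("rest wildcard must be last");
            }
            captures[wildcard.name] = elements.slice(i);
            return captures;
        }
        if (i >= elements.length) return null; // too few elements
        captures[wildcard.name] = elements[i];
    }
    // Without a rest wildcard the lengths have to agree exactly.
    return elements.length === wildcards.length ? captures : null;
}

const wildcards = [
    { name: "firstFunctionParam" },
    { name: "restOfFunctionParams", rest: true },
];

console.log(matchList(wildcards, ["a", "b", "c"]));
// { firstFunctionParam: 'a', restOfFunctionParams: [ 'b', 'c' ] }
console.log(matchList(wildcards, []));
// null
\end{lstlisting}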
\subsection{Transforming}
Once a matching template has been defined, a definition of the transformation has to be created. This transformation has to transfer over the code matched by a wildcard, which means a way to refer to the wildcard is needed. We do this in a manner very similar to defining the wildcard: since a DSL identifier was already given in the definition of the matching, all that is needed is to refer to that identifier. This is done with a similar block \texttt{<< >>} containing the identifier.
\begin{lstlisting}[caption={
The wildcard in \ref{ex:wildcard} has the identifier expr1, and we refer to the same identifier in this example. The only transformation happening here is rewriting let to const.
}]
const variableName = <<expr1>>;
\end{lstlisting}
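Filling a transformation template with the code captured during matching can be sketched at the level of source text. This is a simplification made for illustration: the function \texttt{applyTransform} is hypothetical, captures are assumed to be available as source strings, and nothing here is claimed about how the tool itself performs the substitution.
\begin{lstlisting}[language={JavaScript}, caption={Sketch of substituting captured code into a transformation template (hypothetical helper)}]
// Replace every << identifier >> block with the captured source text.
function applyTransform(template, captures) {
    return template.replace(/<<\s*(\w+)\s*>>/g, (block, name) => {
        if (!(name in captures)) {
            throw new Error(`Unknown wildcard: ${name}`);
        }
        return captures[name];
    });
}

// Hypothetical capture produced by a previous matching step.
const captures = { expr1: "fetchUser(id)" };

const transformed = applyTransform("const variableName = <<expr1>>;", captures);
console.log(transformed); // const variableName = fetchUser(id);
\end{lstlisting}
Referring to an identifier that was never defined in the matching template is treated as an error in this sketch, since the transformation would otherwise silently drop user code.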
\subsection{Structure of \DSL}
\label{sec:DSLStructure}
\DSL is designed to mimic the examples already provided by a proposal champion in the proposal's README. These examples can be seen in each of the proposals described in \ref{sec:proposals}.
\subsubsection*{Define proposal}
The first part of \DSL is defining the proposal. This is done by creating a named block containing all definitions of templates used for matching, alongside their respective transformations. This block contains everything relating to a specific proposal and is meant to make proposals easy for tooling to identify.
\begin{lstlisting}[caption={Example of section containing the pipeline proposal}]
proposal Pipeline_Proposal{
}
\end{lstlisting}
\subsubsection*{Defining a pair of template and transformation}
Each proposal has one or more definitions of a template for code to identify in the user's codebase, each with its corresponding transformation definition. These are grouped together so that corresponding pairs are easy to identify. This section of the proposal is defined by the keyword \textit{pair} and a block containing its related fields. Having several pairs allows for matching many different code snippets and for showcasing more than a single concept the proposal has to offer.
\begin{lstlisting}[caption={Example of pair section}]
pair PAIR_NAME {
}
\end{lstlisting}
\subsubsection*{Template used for matching}
To define the template used for matching, we have another section defined by the keyword \textit{applicable to}. This section contains the template, written in JavaScript with specific DSL keywords inside it.
\begin{lstlisting}[caption={Example of applicable to section}]
applicable to {
}
\end{lstlisting}
\subsubsection*{Defining the transformation}
To define the transformation that is applied to a specific matched code snippet, the keyword \textit{transform to} is used. This section is similar to the template section, but it uses the specific DSL keywords to transfer the context of the matched user code. This allows us to keep the parts of the user's code that are important to the original context it was written in.
\begin{lstlisting}[caption={Example of transform to section}]
transform to{
}
\end{lstlisting}
\subsubsection*{All sections together}
Taking all these parts of the \DSL structure together, defining a proposal in \DSL looks as follows.
\begin{lstlisting}[caption={\DSL definition of a proposal}]
proposal PROPOSAL_NAME {
    pair PAIR_NAME {
        applicable to {
        }
        transform to {
        }
    }
    pair PAIR_NAME {
        applicable to .....
    }
    pair ....
}
\end{lstlisting}
\section{Using the \DSL with an actual syntactic proposal}
This section shows some examples of how a \DSL definition of each of the proposals discussed in \ref{sec:proposals} might look. These definitions do not have to cover every single case where the proposal might be applicable; they just have to be general enough to produce some examples on any reasonably long piece of code a user might use this tool with.
\subsection{Pipeline Proposal}
The Pipeline Proposal is the easiest to define of the proposals presented in \ref*{sec:proposals}. This is because the proposal is applicable to a very wide array of expressions, and the main problem it tries to solve is the deep nesting of function calls.
\begin{lstlisting}[language={JavaScript}, caption={Example of Pipeline Proposal definition in \DSL}, label={def:pipeline}]
proposal Pipeline{
    pair SingleArgument {
        applicable to {
            <<someFunctionIdent>>(<<someFunctionParam: Expression | Identifier>>);
        }
        transform to {
            <<someFunctionParam>> |> <<someFunctionIdent>>(%);
        }
    }
    pair MultiArgument {
        applicable to {
            <<someFunctionIdent>>(
                <<firstFunctionParam : Expression | Identifier>>,
                <<restOfFunctionParams: anyRest>>
            );
        }
        transform to {
            <<firstFunctionParam>> |> <<someFunctionIdent>>(%, <<restOfFunctionParams>>);
        }
    }
}
\end{lstlisting}
The first pair definition, \texttt{SingleArgument}, applies to any \textit{CallExpression} with a single argument, and it is applied to each call in a deeply nested chain of \textit{CallExpression}s. The second pair definition, \texttt{MultiArgument}, applies to any \textit{CallExpression} with two or more arguments. This is because we use the custom \DSL type \texttt{anyRest}, which matches against any number of elements in an array stored on an AST node.
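The effect of applying the \texttt{SingleArgument} pair to every level of a deeply nested call can be sketched as follows. The AST is written by hand and only loosely follows Babel's node shapes, and the helpers \texttt{id}, \texttt{call}, \texttt{print} and \texttt{toPipeline} are hypothetical; how the tool traverses and prints the AST is not specified here, so this is an assumption-laden illustration only.
\begin{lstlisting}[language={JavaScript}, caption={Sketch of rewriting nested single-argument calls into a pipeline (hypothetical helpers)}]
// Helpers for building a small hand-written AST.
const id = (name) => ({ type: "Identifier", name });
const call = (callee, ...args) => ({
    type: "CallExpression",
    callee: id(callee),
    arguments: args,
});

// function1(function2(function3(function4(value))))
const nested = call("function1",
    call("function2",
        call("function3",
            call("function4", id("value")))));

// Print a node back to source text without transforming it.
function print(node) {
    if (node.type === "Identifier") return node.name;
    if (node.type === "CallExpression") {
        return `${node.callee.name}(${node.arguments.map(print).join(", ")})`;
    }
    throw new Error(`Unsupported node type: ${node.type}`);
}

// Apply the single-argument rewrite at every level, innermost call first.
function toPipeline(node) {
    if (node.type === "CallExpression" && node.arguments.length === 1) {
        return `${toPipeline(node.arguments[0])} |> ${node.callee.name}(%)`;
    }
    return print(node);
}

console.log(toPipeline(nested));
// value |> function4(%) |> function3(%) |> function2(%) |> function1(%)
\end{lstlisting}
The innermost argument ends up first in the pipeline, which restores the left-to-right reading order of the calls.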
\subsection{Do Proposal}
The \cite[Do Proposal]{Proposal:DoProposal} can also be defined with this tool. This definition will never catch all the applicable sections of the user's code, and is very limited in where it can discover that the proposal is applicable. This is because the Do Proposal introduces an entirely new way of writing JavaScript (expression-oriented programming). If the user running this tool has not used the current status quo way of doing expression-oriented programming in JavaScript, \DSL will probably not find any applicable snippets in the user's code. However, in a reasonably large codebase, some examples will probably be discovered.
\begin{lstlisting}[language={JavaScript}, caption={Definition of Do Proposal in \DSL}, label={def:doExpression}]
proposal DoExpression{
    pair arrowFunction{
        applicable to {
            () => {
                <<blockStatements: anyNStatements>>
                return << returnExpr: Expr >>
            }
        }
        transform to {
            do {
                << blockStatements >>
                << returnExpr >>
            }
        }
    }
    pair immediatelyInvokedUnnamedFunction {
        applicable to {
            function(){
                <<blockStatements: anyNStatements>>
                return << returnExpr: Expr >>
            }();
        }
        transform to {
            do {
                << blockStatements >>
                << returnExpr >>
            }
        }
    }
}
\end{lstlisting}