Final version push
parent 5b7db8ca86
commit 91847e5b5d
15 changed files with 1057 additions and 479 deletions
BIN build/report.pdf (binary file not shown)
229 chapter/ch2.tex
@ -1,42 +1,135 @@
\chapter{Background}

Below we give an overview of
the evolution process of the ECMAScript programming language,
abstract syntax trees,
source code querying,
domain-specific languages,
and language workbenches.
These are instrumental to the implementation
of the tool described in this thesis.

\section{Evolution of the JavaScript programming language}
\emph{Technical Committee 39} (TC39) is a technical committee within
Ecma International,
whose main goal is to develop the language standard
for the ECMAScript programming language
(informally known as JavaScript);
this standard is known as ECMA-262~\cite{ecma262}.
Apart from this standard,
the committee is also responsible for maintaining
related standards:
an internationalization API (ECMA-402),
the standard for JSON (ECMA-404),
and
the ECMAScript specification suite (ECMA-414).
The members of the committee are representatives of companies,
academic institutions,
and other organizations
interested in developing and maintaining the ECMAScript language.
The delegates include
experts in JavaScript engines,
tooling surrounding JavaScript,
and other areas of the JavaScript ecosystem.

\paragraph{ECMA-262 Proposals}
We now explain what a proposal is,
and how proposals are developed in TC39 for the ECMA-262 language standard.

A \emph{proposal} is a suggested change to the ECMA-262 language standard.
These additions to the standard have to solve some form of problem with the current version of ECMAScript.
Such problems can come in many forms,
and can apply to any part of the language.
Examples include: a feature that is not present in the language,
inconsistent parts of the language,
simplification of common patterns, and so on.
The proposal development process is defined in the \emph{TC39 Process Document}~\cite{TC39Process},
which describes each stage a proposal has to go through
in order to be accepted into the ECMA-262 language standard.

The purpose of \emph{stage 0} of the process is to allow
for exploration and ideation around which parts of the current version
of ECMAScript can be improved,
and then to define a problem space for the committee to focus on improving.

At \emph{stage 1}, the committee will start development
of a proposal.
In order for a proposal to enter this stage, several requirements have to be fulfilled.
First, a champion---a delegate of the committee who will be responsible for the advancement of the proposal---has to be identified.
In addition, a rough outline of the problem must be provided,
and a general shape of a solution must be given.
There must have been a discussion around key algorithms,
abstractions and semantics of the proposal.
Exploration of potential implementation challenges and cross-cutting concerns must have been done.
The final requirement is for all parts of the proposal
to be captured in a public repository.
Once all these requirements are met,
a proposal is accepted into stage 1.
During this stage, the committee will work on the design of a solution,
and resolve any cross-cutting concerns discovered previously.

At \emph{stage 2}, a preferred solution has been identified.
Requirements for a proposal to enter this stage are as follows:
all high level APIs and syntax must be described in the proposal document,
illustrative examples have to be worked out,
and an initial specification text must be drafted.
During this stage,
the following areas of the proposal are explored:
refining the identified solution,
deciding on minor details,
and creating experimental implementations.

At \emph{stage 2.7},
the proposal is principally approved,
and has to be tested and validated.
To enter this stage,
the major sections of the proposal must be complete.
The specification text should be finished,
and all reviewers of the specification must have approved it.
Once a proposal has entered this stage,
testing and validation will be performed.
This is done through the prototype implementations created in stage 2.

Once a proposal has been sufficiently tested and verified,
it is moved to \emph{stage 3}.
During this stage,
the proposal should be implemented in at least two major JavaScript engines.
The proposal should be tested for web compatibility issues,
and integration issues in the major JavaScript engines.

At \emph{stage 4}, the proposal is completed and will be included in the next revision of the ECMA-262 standard.

\section{Abstract Syntax Trees}
\label{sec:backgroundAST}
An \emph{abstract syntax tree} (AST)
is a tree representation of source code.
Every node of such a tree represents a construct from the source code.
ASTs remove syntactic details while maintaining the
\emph{structure} of the program.
Each node represents a construct of the programming language,
such as a statement, an expression, or a declaration.
Thus, every node type represents a grammatical construct in the language the AST was built from.

ASTs are important for manipulating source code;
they are used by various tools
that need to represent source code in some way to perform operations with it~\cite{AST3}.
Using ASTs is favored over raw text due to their structured nature;
this especially manifests when considering tools like compilers,
interpreters, or code transformation tools.
ASTs are produced by language \emph{parsers}.
For JavaScript, one of the popular libraries used for parsing is \emph{Babel}~\cite{Babel}.
Babel is a JavaScript toolchain,
and its main usage is converting source code written in ECMAScript 2015 or newer into older versions of JavaScript.
This conversion is done to increase the compatibility of JavaScript
in older execution environments.
Babel has a suite of libraries used to work with JavaScript source code.
Each library relies on Babel's AST definition~\cite{BabelAST}.
The AST specification Babel uses tries to stay
as close as possible to the ECMAScript standard~\cite{BabelSpecCompliant}.
This fact has made Babel a recommended parser to use
for proposal transpiler implementations~\cite{TC39RecommendBabel}.
A simple example of source code parsed into an AST with Babel
can be seen in Figure~\ref{ex:srcToAST}.

\begin{figure}[H]
\noindent\begin{minipage}{.30\textwidth}
@ -64,29 +157,95 @@ let name = f(100);
\end{tikzpicture}
\end{center}
\end{minipage}\hfil
\caption{\label{ex:srcToAST} Example of source code parsed to Babel AST.}
\end{figure}

To achieve compilation of newer versions into older versions,
Babel uses a \emph{plugin} system that allows
a myriad of features to be enabled or disabled.
This makes the parser versatile to fit different ways
of working with JavaScript source code.
Because of this,
Babel allows parsing of experimental JavaScript features.
These features are usually proposals that are under development by TC39,
and the development of these plugins is a part of the proposal deliberation process.
This allows for experimentation as early as \emph{stage 1} of the proposal development process.
Some examples of proposals that were first supported by Babel's plugin system
are ``Do Expression''~\cite{Proposal:DoProposal}
and ``Pipeline''~\cite{Pipeline}.
These proposals are currently at \emph{stage 1} and \emph{stage 2}, respectively.

In this project,
we will use Babel to parse JavaScript into abstract syntax trees.
This choice was made because of Babel's support for very early-stage proposals.

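As a brief illustration, the following is a minimal sketch (our own, assuming the \texttt{@babel/parser} package) of parsing source code that uses an experimental feature into a Babel AST by enabling the corresponding proposal plugin:
\begin{lstlisting}[language={JavaScript}]
// Minimal sketch: parse code using the stage 1 "Do Expression"
// proposal by enabling Babel's "doExpressions" plugin.
const parser = require("@babel/parser");

const ast = parser.parse("let x = do { if (cond()) { 1 } else { 2 } };", {
    sourceType: "module",
    plugins: ["doExpressions"],
});

// The root File node contains a Program whose first statement
// is the variable declaration above.
console.log(ast.program.body[0].type); // "VariableDeclaration"
\end{lstlisting}
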
\section{Source Code Querying}

Source code querying is the action of searching source code
to extract some information or find specific sections of code.
Source code querying comes in many forms,
the simplest of which is text search.
Since source code is primarily text,
one can apply text search techniques to perform a query,
or a more complex approach using regular expressions
(e.g., tools like \texttt{grep}).
Neither of these methods allows for queries based on the structure of the code;
they rely solely on its syntax.
AST-based queries allow queries to be written
based on both syntax and structure,
and are generally more powerful than regular text-based queries.
Another technique for code querying is based on the semantics of code.

The primary use cases for source code querying are
code understanding, analysis, code navigation, and enforcement of styles,
among others.
All of these are important tasks developers perform when writing programs,
and they all rely on some form of source code queries.
One class of such tools is Integrated Development Environments (IDEs),
as these tools are created to write source code,
and, therefore,
rely on querying the source code for many of their features.
One such example of code querying being used in an IDE is JetBrains IntelliJ \emph{structural search and replace}~\cite{StructuralSearchAndReplaceJetbrains}, where queries are defined based on code structure
to find and replace sections of a program.

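To make the contrast with purely textual search concrete, the following is a small sketch of an AST-based query (our own illustration, assuming the \texttt{@babel/parser} and \texttt{@babel/traverse} packages): it finds calls to a function named \texttt{f} while ignoring mere references to \texttt{f}, a distinction a regular expression cannot reliably make.
\begin{lstlisting}[language={JavaScript}]
// Sketch of an AST-based query: find all call expressions
// whose callee is the identifier "f".
const parser = require("@babel/parser");
const traverse = require("@babel/traverse").default;

const ast = parser.parse("let name = f(100); g(f); let h = f;");

traverse(ast, {
    CallExpression(path) {
        const callee = path.node.callee;
        if (callee.type === "Identifier" && callee.name === "f") {
            // Only f(100) matches; g(f) and "let h = f" do not.
            console.log("call to f at offset", path.node.start);
        }
    },
});
\end{lstlisting}
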
\section{Domain-Specific Languages}

Domain-specific languages (DSLs) are software languages
specialized to a specific narrow domain~\cite{Kleppe}.
DSLs allow domain experts to get involved in the software development process,
as it is expected that a domain expert would have the capabilities to read and write DSL code.
A domain-specific language allows for very concise and expressive code
to be written that is specifically designed for the domain.
Using a DSL might result in faster development because of this expressiveness within the domain;
this specificity to a domain might also increase correctness.
However, there are also some disadvantages to DSLs:
the restrictiveness of a DSL might become a hindrance
if it is not well designed to represent the domain.
Domain-specific languages might also have a learning curve,
which makes these languages less accessible for the target users.
Developing a domain-specific language
is a non-trivial process~\cite{MarkusDSL},
as implementing a DSL requires both knowledge of the domain and knowledge of software language engineering.

\section{Language Workbenches}
A \emph{language workbench}~\cite{LanguageWorkbenchMartinFowler}
is an integrated development environment
created to facilitate the development of a software language,
such as a domain-specific language.
The goal of a language workbench is to give increased productivity during development,
and to enhance the design and evolution of software languages~\cite{LanguageWorkbenchMartinFowler}.

Commonly, language workbenches generate tooling for a software language.
One such tool is a language parser that is generated from the language definition within the language workbench.
Another such tool commonly generated by a language workbench is an integrated development environment;
such IDEs provide functionality like syntax highlighting,
code navigation, and error highlighting, among others.

A language is defined in a language workbench using a grammar definition. This grammar is a formal specification of the language that describes how each language construct is composed and the structure of the language. This allows the language workbench to determine what is a valid sentence of the language. This grammar is used to create the AST of the language, which is the basis for all the tools generated by the language workbench. Many such language workbenches exist, such as Langium~\cite{Langium}, Xtext~\cite{Xtext}, JetBrains MPS, and Racket.

%When working with a language workbench,
%one manipulates an abstract representation of the language one is
%creating~\cite{LanguageWorkbenchMartinFowler}. This abstract representation is
%projected into an editable form, this editable form is how we define a software
%language within a language workbench.

chapter/ch3.tex
@ -1,28 +1,38 @@
\chapter{A domain-specific language for matching and transforming source code}
\label{cha:3}
The tool that we implement in this thesis should allow
previewing how an ECMAScript proposal could affect a user's codebase.
We only focus on proposals that introduce new syntactic forms that merely abstract certain use patterns that could be otherwise written in JavaScript---but in a verbose or less idiomatic manner. We call these \emph{syntactic proposals}.
The idea is to identify code fragments in a user's codebase
to which a proposal can be \emph{applied}.
An application of a proposal can be thought of as a process of identifying the user's code that ``matches''
a proposal's problem space,
and then replacing that code with a \emph{semantically equivalent} code
that uses the functionality introduced in the proposal.

The goal for this project is to utilize users' familiarity with their own code to gain early and worthwhile feedback on a certain kind of ECMAScript proposals.
Thus,
it will be possible to identify all the places
in the user's codebase which can be affected by a proposal---and to show the user how the modified version of the \emph{user's own} code will look.
This way a user will be able to give very specific feedback to the TC39 committee. Importantly, the fact that a user is familiar with their codebase
could potentially allow for more useful feedback and thus a more efficient process of developing and evolving the ECMAScript programming language.

\section{The core idea}
Implementing the idea outlined here requires some way of \emph{matching}
and
\emph{transforming} code.
A proposal should thus have a precise specification, where
the matching and the transformation can be defined.
For this purpose,
we have designed and implemented a domain-specific language,
which will be introduced in this chapter.

When a user of ECMAScript wants to suggest a change to the language, the idea of the change has to be described in a proposal. A proposal is a general way of describing a change and its requirements; this is done through a language specification, motivation for the idea, and general discussion around the proposed change. A proposal ideally also needs backing from the community of ECMAScript users, which means the proposal has to be presented to users in some way. This is currently done through many channels, such as polyfills, code examples, and beta features of the main JavaScript engines; this thesis, however, showcases proposals to users through a different avenue.

Users of ECMAScript have a familiarity with code they themselves have written. This means they have knowledge of how their own code works and why they might have written it a certain way. This project aims to utilize this pre-existing knowledge to showcase new proposals for ECMAScript. This approach will allow users to focus on what the proposal actually entails, instead of focusing on the examples written by the proposal authors.

Further in this chapter, we will be discussing the current version and a future version of ECMAScript. What we are referring to is the set of problems a proposal is trying to solve: if the proposal is accepted into ECMAScript as part of the language, there will be a future way of solving those problems. The current way is the status quo, in which the proposal is not part of ECMAScript; in the future version, the proposal is part of ECMAScript and we are utilizing its new features.

The program will allow users to preview proposals long before they are part of the language. This way, the committee can get useful feedback from users of the language earlier in the proposal process. Using the users' familiarity will ideally allow for a more efficient process of developing ECMAScript.

\subsection{Applying a proposal}

The way this project will use the pre-existing knowledge a user has of their own code is to use that code as the basis for showcasing a proposal's features. Using the user's own code as the basis requires the following steps to automatically implement the examples that showcase the proposal inside the context of the user's own code.

The idea is to identify where the features and additions of a proposal could have been used. This means identifying parts of the user's program that use pre-existing ECMAScript features that the proposal is interacting with and trying to improve. This will then identify all the different places in the user's program where the proposal can be applied. This step is called \textit{matching} in the following chapters.

Once we have matched all the parts of the program the proposal could be applied to, the user's code has to be transformed to use the proposal; this means changing the code to use a possible future version of JavaScript. This step also includes keeping the context and functionality of the user's program the same, so variables and other context-related concepts have to be transferred over to the transformed code.

The output of the previous step is then a set of code pairs, where one element is a part of the user's original code, and the second is the transformed code. The transformed code is then ideally a perfect replacement for the original user code if the proposal becomes part of ECMAScript. These pairs are used as examples to present to the user, shown together so the user can see their original code next to the transformed code, as illustrated below. This allows for a direct comparison and an easier time for the user to understand the proposal.

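As an illustrative example of such a pair (our own, using the ``Pipeline'' proposal specified later in this chapter), the user's original code is shown next to its transformed counterpart:
\begin{lstlisting}[language={JavaScript}]
// Original user code:
console.log(f(100));

// Transformed code, using the "Pipeline" proposal:
f(100) |> console.log(%);
\end{lstlisting}
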
The steps outlined in this section require some way of defining the matching and transformation of code. This has to be done very precisely and accurately to avoid examples that are wrong. An imprecise definition of the proposal might lead to transformed code not being a direct replacement for the code it was based upon. For this we suggest two different methods: a definition written in a custom DSL, \DSL{}, and a definition written in a self-hosted way, using only ECMAScript itself as the definition language.

\section{Applicable proposals}
\label{sec:proposals}
@ -552,6 +562,7 @@ In this section, we present specifications of the proposals described in Section
This is because this tool is designed to be used by TC39 to gather feedback from users on proposals during development. This use case means the specifications should be defined in a way that showcases the proposal. This also means it is important that the transformation is correct, as incorrect transformations might lead to bad feedback on a proposal.

\newpage
\subsection{``Pipeline'' Proposal}

This proposal is applicable to call expressions, and is aimed at improving code readability when performing deeply nested function calls.

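As a small illustration (our own example), a deeply nested call reads inside-out, while the proposal's pipe operator lets the same computation be read left to right:
\begin{lstlisting}[language={JavaScript}]
// Deeply nested calls must be read inside-out:
const result = a(b(c(x)));

// With the "Pipeline" proposal, the steps read in order,
// where % refers to the value of the previous step:
const result2 = x |> c(%) |> b(%) |> a(%);
\end{lstlisting}
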
@ -649,21 +660,28 @@ proposal awaitToPromise{
}
\end{lstlisting}

The specification of ``Await To Promise'' in \DSL{} is specified to match asynchronous code inside a function. It is limited to matching asynchronous functions containing a single await statement, where that await statement has to be stored in a \texttt{VariableDeclaration}. The second wildcard \texttt{statements} is designed to match all statements following the \texttt{await} statement up to the return statement. This is done to move the statements into the callback function of \texttt{then()} in the transformation. We include \texttt{!ReturnStatement} because we do not want to consume the return statement, as it would then be removed from the function's scope and moved into the callback function of \texttt{then()}. We also have to avoid matching code that contains loop-specific statements such as \texttt{ContinueStatement} or \texttt{BreakStatement}.

The transformation definition has to use an \texttt{async} arrow function as the argument for \texttt{then}, as there might be more await expressions contained within \texttt{statements}.

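For illustration, the following sketch (our own example; the function names are invented, and details such as whether the enclosing function keeps its \texttt{async} modifier are not shown in the excerpt above) shows the kind of code this specification matches and the shape of its transformation:
\begin{lstlisting}[language={JavaScript}]
// Code matched by "Await To Promise": a single await whose
// result is stored in a variable declaration, followed by
// statements up to the return statement.
async function getName() {
    let user = await fetchUser();
    let name = user.name;
    return name;
}

// Shape of the transformation: the trailing statements are
// moved into the async arrow function passed to then().
function getName() {
    return fetchUser().then(async (user) => {
        let name = user.name;
        return name;
    });
}
\end{lstlisting}
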
\section{\DSL{}-SH}
In this thesis,
we have also explored an alternative way of specifying
syntactic proposals---\DSL{}-SH\footnote{``SH'' stands for ``self-hosted'', inspired by \url{http://udn.realityripple.com/docs/Mozilla/Projects/SpiderMonkey/Internals/self-hosting}.}---which effectively uses JavaScript as a meta-language.

In this approach,
proposal specifications are written as JavaScript objects.
Each specification defines the following keys on the object:
\begin{itemize}
\item \texttt{prelude}, which is a sequence of
JavaScript variable declarations,
which are used to define wildcards.
\item \texttt{applicableTo}, which is the template to perform matching. Unlike the \DSL{} template specification, a corresponding \DSL{}-SH specification is free of wildcards---for that purpose, the variables introduced in the \texttt{prelude} are used.
\item \texttt{transformTo}, which is the template that defines the transformation. Similarly to the previous field, the value of this field is free of any wildcards; instead, a user can refer to the variables defined in \texttt{prelude} that represent wildcards.
\end{itemize}

The example below represents the first case of the ``Pipeline'' proposal specification (see Listing~\ref{def:pipeline} for comparison).
\begin{lstlisting}[language={JavaScript}]
// Equivalent definition of pipeline first case in JSTQL-SH
{
prelude: `
let someFunctionIdent = "Identifier || MemberExpression";
@ -673,3 +691,19 @@ In the example below is the first \texttt{case} of the ``Pipeline'' proposal def
transformTo: "someFunctionParam |> someFunctionIdent(%);"
}
\end{lstlisting}

\DSL{}-SH provides an Application Programming Interface
that exposes a function that
takes a JavaScript object that represents a proposal specification
and a string that represents the user code to which the proposal will be applied.
This function then returns the transformed source code.

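A hypothetical usage sketch follows; the function name \texttt{applyProposal}, the module name, and the \texttt{prelude}/\texttt{applicableTo} contents beyond the excerpt above are our own illustration---the source only states that such a function takes a specification object and the user's code:
\begin{lstlisting}[language={JavaScript}]
// Hypothetical sketch of using the JSTQL-SH API.
const { applyProposal } = require("jstql-sh"); // assumed module name

const pipelineFirstCase = {
    prelude: `let someFunctionIdent = "Identifier || MemberExpression";
let someFunctionParam = "Expression";`, // assumed second wildcard
    applicableTo: "someFunctionIdent(someFunctionParam);", // assumed template
    transformTo: "someFunctionParam |> someFunctionIdent(%);",
};

const userCode = "console.log(f(100));";
console.log(applyProposal(pipelineFirstCase, userCode));
// Expected shape of the output: "f(100) |> console.log(%);"
\end{lstlisting}
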
The main benefit of this alternative approach is
that an extraction of wildcards from templates is not required.
This means that a template becomes valid JavaScript code.
Using only JavaScript to define proposal specifications could lower
the barrier for use of our tool by the TC39 delegates and members of the JavaScript community,
as being able to experiment with a new proposal now becomes a matter
of using a library function.

414 chapter/ch4.tex
@ -1,14 +1,38 @@
\chapter{Implementation}

In this chapter,
the implementation of the tool utilizing the \DSL{} and \DSL{}-SH will be presented.\footnote{The source code for this implementation can be found at \url{https://github.com/polsevev/JSTQL-JS-Transform}.}
It will describe the overall architecture of the tool,
the flow of data throughout,
and how the different stages of transforming user code are completed.

\section{Architecture of the solution}

As was presented in Chapter~\ref{cha:3},
there are two ways to specify a proposal:
either using a custom domain-specific language JSTQL,
or by using the corresponding JavaScript API.
Figure~\ref{fig:architecture} demonstrates the architecture of the implementation of these two approaches.
In the figure, ellipse nodes represent data passed into the tool,
and rectangular nodes represent specific components of the tool.

In the JSTQL approach (the ``left-side'' path in the figure),
the initial step is to parse a proposal specification and then to extract the wildcard \emph{declarations} and \emph{references} from code templates.
A corresponding step in the API-based approach (the ``right-side'' path)
is to build the prelude, where the wildcard definitions are ``extracted'' from JavaScript code.

For both of the approaches, the second step (Section~\ref{sec:WildcardExtraction}) is to parse wildcard type expressions used in the templates' specifications.
After that,
at step 3 (Section~\ref{sec:BabelParse}), Babel is used to parse and build abstract syntax trees for the \texttt{applicable to} templates and the \texttt{transform to} templates in a proposal specification, as well as for the user's code to which the proposal will be applied.
At step 4 (Section~\ref{sec:customTree}),
we process the abstract syntax trees produced by Babel and produce a custom tree data structure for simpler traversal.
At step 5 (Section~\ref{sec:Matching}),
we match the user's AST against the templates in the \texttt{applicable to} blocks.
Once all matches have been found, we incorporate the wildcard matches into the
\texttt{transform to} template at step 6 (Section~\ref{sec:transform}), and insert it back into the user's code.
At this point, the AST of the user's code has been transformed,
and the final step 7 (Section~\ref{sec:generate})
then pretty-prints the transformed AST into JavaScript source code.

\iffalse
\begin{description}
@ -31,12 +55,14 @@ In the architecture diagram of Figure~\ref{fig:architecture}, ellipse nodes show
roundnode/.style={ellipse, draw=red!60, fill=red!5, very thick, minimum size=7mm},
squarednode/.style={rectangle, draw=red!60, fill=red!5, very thick, minimum size=5mm}
]
\node[squarednode] (preParser) {2. Type Expression Parser};
\node[squarednode] (preludebuilder) [above right=of preParser] {1. Prelude Builder};
\node[roundnode] (selfhostedjsoninput) [above=of preludebuilder] {Self-Hosted Object};
\node[squarednode] (extraction) [above left=of preParser] {1.2. Extract wildcards};
\node[squarednode] (langium) [above =of extraction] {1.1. Parse JSTQL code};

\node[roundnode] (jstqlcode) [above=of langium] {JSTQL Code};
\node[squarednode] (babel) [below=of preParser] {3. Babel parsing};
\node[roundnode] (usercode) [left=of babel] {User source code};
\node[squarednode] (treebuilder) [below=of babel] {4. Custom Tree builder};
\node[squarednode] (matcher) [below=of treebuilder] {5. Matcher};
@ -45,7 +71,8 @@ In the architecture diagram of Figure~\ref{fig:architecture}, ellipse nodes show
\draw[->] (jstqlcode.south) -- (langium.north);
\draw[->] (langium.south) -- (extraction.north);
\draw[->] (extraction.south) |- (preParser.west);
\draw[->] (preParser.south) |- (babel.north);
\draw[->] (babel.south) -- (treebuilder.north);
\draw[->] (treebuilder.south) -- (matcher.north);
@ -64,111 +91,152 @@ In the architecture diagram of Figure~\ref{fig:architecture}, ellipse nodes show
\section{Parsing \DSL{} using Langium}

In this section,
we describe the implementation of the parser for \DSL{}.
We start by outlining the language workbench which we used to generate a parser for \DSL{}.

\subsection{Langium}

\emph{Langium}~\cite{Langium} is a language workbench~\cite{LanguageWorkbench}
that can be used to generate parsers for software languages, in addition to producing a tailored Integrated Development Environment for the language.
A parser generated by Langium produces abstract syntax trees which are TypeScript objects.
These objects and their structure are used as definitions for the tool
to do matching and transformation of user code.

To generate a parser,
Langium requires a definition of a grammar.
A grammar is a specification that describes the syntax of valid programs in a language.
The grammar for \DSL{} describes the structure of \DSL{} specifications.
The starting symbol of the grammar represents valid specifications:
\begin{lstlisting}
grammar Jstql

entry Model:
(proposals+=Proposal)*;
\end{lstlisting}

In turn, a proposal's specification includes its name and a specification of at least one \emph{transformation case}.
\begin{lstlisting}
Proposal:
'proposal' name=ID "{"
(case+=Case)+
"}";
\end{lstlisting}

A transformation case specification comprises a code template to match JavaScript code to which the case is applicable,
and a code template that specifies how a match should be transformed.
\begin{lstlisting}
Case:
"case" name=ID "{"
aplTo=ApplicableTo
traTo=TransformTo
"}";
\end{lstlisting}
Case specifications are designed in this way in order to separate different transformation definitions within a single proposal.

An \texttt{applicable to} block specifies a JavaScript code template
with wildcard declarations. This code template is represented in the grammar
using the terminal symbol \texttt{STRING},
and will thus be parsed as a raw string of characters.
\begin{lstlisting}
ApplicableTo:
"applicable" "to" "{"
apl_to_code=STRING
"}";
\end{lstlisting}
The decision to use the \texttt{STRING} terminal,
rather than a designated nonterminal symbol that would represent valid JavaScript programs with wildcards,
is motivated by two reasons:
(i) we separate parsing of the JSTQL specification structure (which is done by Langium) and parsing of JavaScript code (for which we use Babel\footnote{See Sections~\ref{sec:backgroundAST} and \ref{sec:BabelParse}.});
and (ii) we use a custom processor of wildcards to enable reuse of such a processor for both JSTQL and JSTQL-SH\footnote{See Section~\ref{sec:wildcardExtractionAndParsing}.}.

A \texttt{transform to} block is specified in a similar manner.
The listing below also shows the terminal symbols of the grammar:
\begin{lstlisting}
TransformTo:
"transform" "to" "{"
transform_to_code=STRING
"}";
hidden terminal WS: /\s+/;
terminal ID: /[_a-zA-Z][\w_]*/;
terminal STRING: /"[^"]*"|'[^']*'/;
\end{lstlisting}

Notwithstanding the fact that the code templates in \texttt{applicable to} and \texttt{transform to} blocks are treated as strings by Langium---and thus by the Visual Studio Code extension for JSTQL generated by Langium---we perform validation of the wildcard declarations and references, as explained below.

\subsection*{Langium Validator}

A Langium validator allows further checks to be applied to DSL code;
a validator allows for the implementation of specific checks on specific parts of the code.

\DSL{} does not allow empty wildcard type expression definitions in \texttt{applicable to} blocks.
This is not defined within the grammar, and needs to be enforced with a validator. Concretely, we have implemented a specific \texttt{Validator} for the \texttt{Case} rule of the grammar. This means every time anything contained within a \texttt{Case} is updated, Langium will perform the validation step and report any errors. The validator implemented for our tool checks for the following errors: empty wildcard type expressions, undeclared wildcards in \texttt{transform to} blocks, and wildcards used multiple times in \texttt{transform to} blocks.

The validator uses \texttt{Case} as its entry point, as it allows for checking of wildcards in both \texttt{applicable to} and \texttt{transform to}, allowing for a check of whether a wildcard identifier used in \texttt{transform to} exists in the definition of \texttt{applicable to}.
The listing below shows the validator; it performs checks on the \texttt{applicable to} block and \texttt{transform to} block of the \texttt{case}. If any errors are found, it reports them with the function \texttt{accept}.

\begin{lstlisting}[language={JavaScript}]
export class JstqlValidator {
    validateWildcards(case_: Case, accept: ValidationAcceptor): void {
        try {
            // Collect and validate the wildcard declarations
            // found in the applicable to template.
            let validationResultAplTo = validateWildcardAplTo(
                collectWildcard(case_.aplTo.apl_to_code.split(""))
            );
            if (validationResultAplTo.errors.length != 0) {
                accept("error", validationResultAplTo.errors.join("\n"), {
                    node: case_.aplTo,
                    property: "apl_to_code",
                });
            }

            // Validate the wildcard references in the transform to
            // template against the environment built from applicable to.
            let validationResultTraTo = validateWildcardTraTo(
                collectWildcard(case_.traTo.transform_to_code.split("")),
                validationResultAplTo.env
            );

            if (validationResultTraTo.length != 0) {
                accept("error", validationResultTraTo.join("\n"), {
                    node: case_.traTo,
                    property: "transform_to_code",
                });
            }
        } catch (e) {}
    }
}
\end{lstlisting}

\subsection*{Interfacing with Langium}

To use the parser generated by Langium, we created a custom function \texttt{parseDSLtoAST}, which takes a string (the raw \DSL{} code) as input and outputs the pure AST in the format described by the grammar (see Listing \ref{sec:DSL_DEF}). This function is exposed as a custom API for our tool to interface with. This also means our tool depends on the implementation of the Langium parser to function with \DSL{}; the implementation of \DSL{}SH is entirely independent.

When interfacing with the Langium parser to get the Langium-generated AST, the exposed API function is imported into the tool. When this API is executed, the output is in the form of the Langium \texttt{Model}, which follows the same form as the grammar. This is then transformed into an internal object structure used by the tool, called \texttt{TransformRecipe}, which is then passed on to perform the actual transformation.

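A minimal sketch of this interface is shown below; \texttt{parseDSLtoAST} is the exposed API described above, while the import path, the helper \texttt{toTransformRecipe}, and the exact fields of \texttt{TransformRecipe} are hypothetical illustrations rather than the tool's actual definitions.

\begin{lstlisting}[language={JavaScript}]
// Hypothetical glue code between the tool and the Langium parser
import { parseDSLtoAST } from "./langium-api"; // assumed module path

interface TransformRecipe {
    cases: { applicableTo: string; transformTo: string }[];
}

function toTransformRecipe(dslCode: string): TransformRecipe {
    // The Model returned by the parser mirrors the grammar definition
    const model = parseDSLtoAST(dslCode);
    return {
        cases: model.cases.map((c: any) => ({
            applicableTo: c.aplTo.apl_to_code,
            transformTo: c.traTo.transform_to_code,
        })),
    };
}
\end{lstlisting}
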
\section{Wildcard extraction and parsing}
\label{sec:wildcardExtractionAndParsing}

To refer to internal DSL variables defined in the \texttt{applicable to} and \texttt{transform to} blocks of the transformation, we need to extract this information from the template definitions and pass it on to the matcher.

\subsection*{Why not use Langium for wildcard extraction and parsing?}
\label{sec:langParse}
Langium supports creating a generator to output an artifact, which is some transformation applied to the AST built by the Langium parser. This suits the needs of \DSL{} quite well and could be used to extract the wildcards and parse the type expressions; it is also the way the developers of Langium intend this kind of functionality to be implemented. However, the implementation would still be mostly the same, as the parsing of the wildcards still has to be done ``manually'' with a custom parser. Therefore, we decided to keep the parsing of the wildcards separate for this project. Using Langium generators to parse the wildcards would also make \DSL{}SH dependent on Langium, which is not preferred, as it would mean both ways of defining a proposal rely on Langium. Using our own extractor allows for an independent way to define transformations with our tool.

\subsection*{Extracting wildcards from \DSL{}}
\label{sec:WildcardExtraction}

To parse the templates in the \texttt{applicable to} and \texttt{transform to} blocks with Babel~\cite{Babel}, we have to make the templates valid JavaScript. This is done by using a pre-parser that extracts the information contained in the wildcards and inserts an \texttt{Identifier} in their place.

To extract the wildcards from the template, we look at each character in the template. If the opening token of a wildcard, denoted by \texttt{<<}, is encountered, everything up to the closing token, denoted by \texttt{>>}, is treated as a wildcard definition or reference and is parsed using the wildcard parser.

Once the wildcard is parsed, and we know it is valid, we insert an identifier into the JavaScript template where the wildcard resided. This allows for easier identification of wildcards when performing matching and transformation, as we can check whether an identifier in the code is the identifier of a wildcard. It does, however, introduce the problem of \emph{collisions} between the inserted wildcard identifiers and identifiers present in the user's code. In order to avoid this, we prepend and append every identifier inserted in place of a wildcard with the character sequence \texttt{\_\$\$\_}.
|
||||
|
||||
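As an illustration, consider the following hypothetical \texttt{applicable to} template and the valid JavaScript produced by the extraction step (the wildcard names and types are made up for this example):

\begin{lstlisting}[language={JavaScript}]
// Template with two wildcards:
<<callee: Identifier>>(<<arg: Expression>>);

// After extraction, each wildcard is a plain identifier
// wrapped in the collision-avoidance markers:
_$$_callee_$$_(_$$_arg_$$_);
\end{lstlisting}
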
\newpage
Listing~\ref{lst:extractWildcard} shows the function used to extract the wildcard declarations. This function iterates through each character of the \texttt{applicable to} template (line 2). When an opening token for a wildcard is encountered (line 3), we collect each character into a separate variable until the closing token is encountered. This variable is passed to the wildcard parser to create the type expression AST of the wildcard (lines 14-16). We insert the collision avoidance characters into the wildcard identifier (line 18), and insert the identifier into \texttt{cleanedJS} (line 19).
\begin{lstlisting}[language={JavaScript}, caption={Extracting wildcard from template.}, label={lst:extractWildcard}]
export function parseInternal(code: string): InternalParseResult {
    for (let i = 0; i < code.length; i++) {
        // ... (wildcard collection, parsing, and identifier insertion)
    }
}
\end{lstlisting}

\paragraph*{Parsing wildcards}
Once a wildcard has been extracted from the definitions inside \DSL{}, it has to be parsed into a simple AST to be used when matching against the wildcard. This is accomplished by using a tokenizer and a recursive descent parser~\cite{RecursiveDescent}.

Our tokenizer takes the raw stream of characters extracted from the wildcard block within the template and splits it into tokens. Given the straightforward grammar of type expressions, there is no ambiguity between the tokens, making it easy to identify which characters correspond to which token: tokens of length one are accepted directly, and an unexpected character produces an error. The tokenizer also groups tokens by a \textit{token type}, which is later used by the parser to determine which nonterminal to use.

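The sketch below illustrates such a tokenizer. The token-type names, as well as the concrete operator characters (\texttt{||}, \texttt{\&\&}, \texttt{!}), are illustrative assumptions rather than the tool's actual definitions.

\begin{lstlisting}[language={JavaScript}]
type Token = { type: string; value: string };

function tokenize(input: string): Token[] {
    // Illustrative single-character token types
    const single: Record<string, string> = {
        "(": "GroupStart", ")": "GroupEnd",
        "+": "Plus", "!": "UnaryOperator", ":": "Colon",
    };
    const tokens: Token[] = [];
    let i = 0;
    while (i < input.length) {
        const c = input[i];
        if (/\s/.test(c)) { i++; continue; }
        if (single[c]) {
            // Tokens of length one are accepted directly
            tokens.push({ type: single[c], value: c });
            i++;
        } else if ((c === "|" || c === "&") && input[i + 1] === c) {
            tokens.push({ type: "Operator", value: c + c });
            i += 2;
        } else if (/[A-Za-z]/.test(c)) {
            let j = i;
            while (j < input.length && /\w/.test(input[j])) j++;
            tokens.push({ type: "Identifier", value: input.slice(i, j) });
            i = j;
        } else {
            // Unexpected characters produce an error
            throw new Error("Unexpected character: " + c);
        }
    }
    return tokens;
}
\end{lstlisting}
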
A recursive descent parser mimics the grammar of the language the parser is implemented for: we define a function for handling each of the nonterminals. The type expression language is a very simple Boolean expression language, making parsing straightforward.

\begin{lstlisting}[caption={Grammar of type expressions}, label={ex:grammarTypeExpr}]
Wildcard:
    Identifier ":" MultipleMatch

MultipleMatch:
    GroupExpr "+"
    | TypeExpr

TypeExpr:
    BinaryExpr
    | UnaryExpr
    | PrimitiveExpr

BinaryExpr:
    TypeExpr { Operator TypeExpr }*

UnaryExpr:
    UnaryOperator TypeExpr

PrimitiveExpr:
    GroupExpr | Identifier

GroupExpr:
    "(" TypeExpr ")"
\end{lstlisting}

The grammar of the type expressions used by the wildcards can be seen in \figFull[ex:grammarTypeExpr]. It is written in a form similar to Extended Backus-Naur form, with the terminals and nonterminals defined so that the entire grammar is parseable by the recursive descent parser. The parser produces an AST, which is later used to determine when a wildcard can be matched against a specific AST node; the full definition of this AST can be seen in Appendix~\ref{ex:typeExpressionTypes}.

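The sketch below shows how two of the nonterminals might be handled by such a parser. The helper functions, the token types, and the AST shape reuse the illustrative definitions from the tokenizer sketch above; they are not the tool's actual implementation.

\begin{lstlisting}[language={JavaScript}]
type TypeExprAST =
    | { kind: "MultipleMatch"; expr: TypeExprAST }
    | { kind: "Identifier"; name: string }
    | { kind: "Binary"; op: string; left: TypeExprAST; right: TypeExprAST };

// The remaining nonterminals are handled analogously (elided)
declare function parseTypeExpr(tokens: Token[]): TypeExprAST;

function peek(tokens: Token[]): Token | undefined {
    return tokens[0];
}

function consume(tokens: Token[], type: string): Token {
    const token = tokens.shift();
    if (!token || token.type !== type)
        throw new Error("Expected token of type " + type);
    return token;
}

// MultipleMatch: GroupExpr "+" | TypeExpr
function parseMultipleMatch(tokens: Token[]): TypeExprAST {
    if (peek(tokens)?.type === "GroupStart") {
        const group = parseGroupExpr(tokens);
        if (peek(tokens)?.type === "Plus") {
            consume(tokens, "Plus"); // the Kleene plus follows the group
            return { kind: "MultipleMatch", expr: group };
        }
        return group; // a plain group is just a TypeExpr
    }
    return parseTypeExpr(tokens);
}

// GroupExpr: "(" TypeExpr ")"
function parseGroupExpr(tokens: Token[]): TypeExprAST {
    consume(tokens, "GroupStart");
    const expr = parseTypeExpr(tokens);
    consume(tokens, "GroupEnd");
    return expr;
}
\end{lstlisting}
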
\paragraph*{Extracting wildcards from \DSL{}SH}
The self-hosted version \DSL{}SH also requires some form of pre-parsing to prepare the internal DSL environment. This step is relatively minor compared to \DSL{}: the wildcards are only parsed directly, with no insertion into the template.

To use JavaScript as the meta language, we define a \texttt{prelude} on the object used to define the transformation case. This prelude is required to consist of several variable declaration statements, where the variable names are used as the internal DSL variables, and the right-hand side expressions are strings that contain the type expression used to determine a match for that specific wildcard.

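A hypothetical case definition illustrating such a prelude is given below; the surrounding field names are made up for the example, but the prelude itself consists solely of variable declarations whose string values are type expressions.

\begin{lstlisting}[language={JavaScript}]
// Hypothetical self-hosted transformation case
const pipelineCase = {
    prelude: `
        let fn = "Identifier";
        let arg = "Expression";
    `,
    applicableTo: "fn(arg);",
    transformTo: "arg |> fn(%);",
};
\end{lstlisting}
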
We use Babel to generate the AST of the \texttt{prelude} definition; this gives us a JavaScript object structure. Since the structure is strictly defined, we can expect every statement of the program to be a variable declaration; otherwise, we throw an error for an invalid prelude. The string value of each of the variable declarations is then passed to the same parser used for \DSL{} wildcards.

This approach is preferred because it allows us to avoid extracting the wildcards and inserting an \texttt{Identifier} into the template.

\section{Using Babel to parse}
\label{sec:BabelParse}

Allowing the tool to perform transformations of code requires the generation of abstract syntax trees from the user's code, as well as from the \texttt{applicable to} and \texttt{transform to} blocks. This means parsing JavaScript into an AST; to do this, we use Babel~\cite{Babel}.

The reason for choosing Babel is the fact that it supports very early-stage JavaScript language proposals. Babel's maintainers collaborate closely with TC39 in order to provide extensive support for experimental syntax~\cite{BabelProposalSupport} through its plugin system. This allows the parsing of JavaScript code that uses language features which are not yet part of the language standard.

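For example, code using a pipeline-style proposal can be parsed by enabling the corresponding parser plugin; the snippet below follows Babel's documented plugin options, with the code string serving only as an example.

\begin{lstlisting}[language={JavaScript}]
import { parse } from "@babel/parser";

// Enable experimental syntax that is not yet in the standard
const ast = parse("x |> f(%);", {
    plugins: [["pipelineOperator", { proposal: "hack", topicToken: "%" }]],
});
\end{lstlisting}
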
\subsection*{Custom Tree Structure}
\label{sec:customTree}
The AST structure used by Babel does not suit traversing multiple trees at the same time, which is a requirement for matching. Therefore, based on Babel's AST, we produce our own custom tree structure that allows for simple traversal of multiple trees at once.

As can be seen in Figure~\ref{def:TreeStructure}, we use a recursive definition of a \texttt{TreeNode}, where a node's parent either exists or is \texttt{null} (the root), and a node can have any number of child elements. This definition allows for simple traversal both up and down the tree, which means two trees can be traversed at the same time when searching for matches in the user's code.

\begin{lstlisting}[language={JavaScript}, label={def:TreeStructure}, caption={Simple definition of a Tree structure in TypeScript}]
export class TreeNode<T> {
    parent: TreeNode<T> | null;
    element: T;
    children: TreeNode<T>[] = [];

    constructor(parent: TreeNode<T> | null, element: T) {
        this.parent = parent;
        this.element = element;
        // A node registers itself in its parent's list of children
        if (this.parent !== null) this.parent.children.push(this);
    }
}
\end{lstlisting}

To place the AST into our tree structure, we use \texttt{@babel/traverse}~\cite{BabelTraverse} to visit each node of the AST in a \textit{depth-first} manner. With \texttt{@babel/traverse}, one implements a \textit{visitor} for each kind of AST node, and when a specific node is encountered, the corresponding visitor is used to visit it. When transferring the AST into our simple tree structure, we use a generic visitor that applies to every kind of AST node and places that node into the tree.

Visiting a node using the \texttt{enter()} function means we traversed from a parent node to its child node. When we then initialize the \texttt{TreeNode} of the current child, we set the previously visited node as its parent; the constructor of \texttt{TreeNode} automatically adds the new node to that parent's list of children. Whenever we leave a node, the function \texttt{exit()} is called; this means we are moving back up the tree, and we have to update which node was the \textit{last} visited in order to keep track of the correct parent.

The example below shows the algorithm that transforms the Babel AST into our custom tree structure. We start by defining a variable \texttt{last} (line 1), which keeps track of the previous node we visited. When the visitor enters a new node, the function \texttt{enter} is called (line 3). This function creates a new node in our custom tree structure (lines 4-7) and sets its parent to the previous node visited (line 5). If there is no previous node, the new node is the root of the tree, and we store it in \texttt{first} (lines 8-10). Once the new node has been created, we update \texttt{last} to point to it (line 11).

Every time we walk back up the tree, the function \texttt{exit} is called (line 13). Whenever this happens, we have to update \texttt{last} such that it always contains the parent of the next node we visit (lines 14-16).

\begin{lstlisting}[language={JavaScript}]
let last: TreeNode<t.Node> | null = null;
traverse(ast, {
    enter(path: any) {
        let node: TreeNode<t.Node> = new TreeNode<t.Node>(
            last,
            path.node as t.Node
        );
        if (last == null) {
            first = node;
        }
        last = node;
    },
    exit(path: any) {
        if (last && last?.element?.type != "Program") {
            last = last.parent;
        }
    },
});
if (first != null) {
    // ...
}
\end{lstlisting}

One important nuance of the way we place the nodes into the tree is that we still have the same underlying data structure as Babel. Because of this, the nodes can still be used with Babel's APIs, and we can still access every field of each node. Transforming it into a tree only creates an easy way to traverse up and down the tree by references; we perform no changes to the underlying data structure.

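For instance, the wrapped node remains an ordinary Babel node, so Babel's own helpers keep working on it; the following sketch assumes the \texttt{TreeNode} class above is exported from a module of the tool.

\begin{lstlisting}[language={JavaScript}]
import * as t from "@babel/types";
import { TreeNode } from "./tree"; // assumed module path

// The element field is still a plain Babel AST node
function logIfIdentifier(node: TreeNode<t.Node>): void {
    if (t.isIdentifier(node.element)) {
        console.log(node.element.name);
    }
}
\end{lstlisting}
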
\section{Outline of transforming user code}

Below is an outline of every major step performed, and how data is passed through the program.

\begin{algorithm}[H]
\caption{An outline of the steps to perform the transformation.
Here:
$A$ denotes the \texttt{applicable to} template with wildcards extracted,
$B$ denotes the \texttt{transform to} template with wildcards extracted,
$W$ denotes the extracted wildcards,
$C$ denotes the abstract syntax tree of the \texttt{applicable to} template,
$D$ denotes the abstract syntax tree of the \texttt{transform to} template,
$E$ denotes the abstract syntax tree of the user's code,
$F$ denotes the \texttt{applicable to} template in our custom tree structure,
$G$ denotes the \texttt{transform to} template in our custom tree structure,
$H$ denotes the user code in our custom tree structure,
$J$ denotes an array of all the found matches,
$K$ denotes an array that contains all \texttt{transform to} templates with context from the user code inserted,
$L$ denotes the abstract syntax tree of the transformed user code,
and $SourceCode$ is the transformed user code pretty-printed as JavaScript.}
\label{lst:outline}
\begin{algorithmic}[1]
\State $A, B, W \gets extractWildcards()$
\State $C, D, E \gets babel.parse(A, B, UserCode)$
\State $F, G, H \gets Tree(C, D, E)$
\If{$F.length > 1$}
    \State $J \gets multiMatcher(F, H, W)$
\Else
    \State $J \gets singleMatcher(F, H, W)$
\EndIf
\State $K \gets []$ \Comment{Array of transformed code}
\For{\textbf{each} $m$ \textbf{in} $J$}
    \State $K.insert \gets buildTransform(m, G, W)$
\EndFor
\State $L \gets insertTransformations(K)$
\State $SourceCode \gets babel.generate(L)$
\end{algorithmic}
\end{algorithm}

Each part of Algorithm~\ref{lst:outline} is a step in transforming user code based on a proposal specification in our tool.

In the initial step of the algorithm (line 1), the wildcards are extracted from the \texttt{applicable to} and \texttt{transform to} templates and replaced by identifiers. The extracted wildcards are then parsed into ASTs using a parser built into the tool.

We parse the \texttt{applicable to} template, the \texttt{transform to} template, and the user's code into ASTs with Babel (line 2). These ASTs are immediately translated into our custom tree structure (line 3), which ensures simple traversal of multiple trees.

To decide which matching algorithm to apply, the length of the \texttt{applicable to} template is checked (line 4): if it is more than one statement long, we use \texttt{multiMatcher} (line 5); if it is a single statement, we use \texttt{singleMatcher} (line 7). These algorithms find all parts of the user AST that match the \texttt{applicable to} template.

We use these matches to prepare the \texttt{transform to} templates (lines 9-12). The AST nodes from the user code that were matched with a wildcard are inserted at the wildcard references present in the \texttt{transform to} template (line 11). All the transformed \texttt{transform to} templates are stored in a list (lines 9, 11).

Once all transformations are prepared, we traverse the user AST (line 13) and insert the transformations where their corresponding match originated. The final step is to generate JavaScript from the transformed AST (line 14).

\section{Matching}
\label{sec:Matching}

This section discusses how we find matches in the user's code; this is the step described in lines 4-8 of Algorithm~\ref{lst:outline}. We will discuss how individual nodes are compared, how the two traversal algorithms are implemented, and how matches are discovered using these algorithms.

\subsection{Determining if AST nodes match}

To determine if two nodes are a match, we need some method to compare AST nodes of the \texttt{applicable to} template to AST nodes of the user code. This step also has to take into account comparisons with wildcards and pass that information back to the AST matching algorithms.

When comparing two AST nodes in this tool, we use the function \texttt{checkCodeNode}, which returns one of the following values based on what kind of match the two nodes produce.
\begin{description}
\item[NoMatch] The nodes do not match.
\item[Matched] The nodes are a match, and the node of \texttt{applicable to} is not a wildcard.
\item[MatchedWithWildcard] The nodes are a match, and the node of \texttt{applicable to} is a wildcard.
\item[MatchedWithPlussedWildcard] The nodes are a match, and the node of \texttt{applicable to} is a wildcard with the Kleene plus.
\end{description}
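
These comparison results can be thought of as a simple enumeration; the following is a sketch of such a definition, not necessarily the tool's actual one:

\begin{lstlisting}[language={JavaScript}]
// The four possible results of comparing two AST nodes
enum ComparisonResult {
    NoMatch,
    Matched,
    MatchedWithWildcard,
    MatchedWithPlussedWildcard,
}
\end{lstlisting}
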

To compare two AST nodes, we start by comparing their types; if the types are not the same, the result is \texttt{NoMatch}. If the types are the same, further checks are required.

Firstly, we need to determine whether the current AST node of \texttt{applicable to} is a wildcard. To do this, we check if its type is either an \texttt{Identifier} or an \texttt{ExpressionStatement} with an \texttt{Identifier} as its expression. During the wildcard extraction step, we replace each wildcard with an identifier; as a result, an identifier might be placed as a statement, in which case it is wrapped inside an \texttt{ExpressionStatement}. If we encounter either of these two types, we must then check whether the name of the identifier matches the name of a wildcard. If it does, we evaluate the type of the user AST node against the wildcard's type expression.

In the example below, we determine whether the node of \texttt{applicable to} might be a wildcard.
\begin{lstlisting}[language={JavaScript}]
if((aplToNode.type === "ExpressionStatement" &&
    aplToNode.expression.type === "Identifier") ||
    aplToNode.type === "Identifier"){
    // ... check the identifier name against the known wildcards
}
\end{lstlisting}

If we have determined that the node of \texttt{applicable to} is not a wildcard, we compare the two nodes to see if they match. For certain nodes, like \texttt{Identifier}, this involves explicitly checking specific fields, such as comparing the name field; for most nodes, however, comparing their types is sufficient. Based on this comparison, the result will be either \texttt{Matched} or \texttt{NoMatch}.

When comparing an AST node type against a wildcard type expression, we evaluate the type expression relative to the type of the node being compared. This evaluation employs the visitor pattern~\cite{VisitorPattern} to traverse the type expression AST, where each leaf node is checked against the type of the node being compared, yielding a Boolean result. All expressions are subsequently evaluated, with the values passed up through the visitors until the entire expression has been evaluated, producing a final result. If the evaluation result is \texttt{false}, we return \texttt{NoMatch}. If the evaluation result is \texttt{true}, we check whether the wildcard uses a Kleene plus: if it does, we return \texttt{MatchedWithPlussedWildcard}; if it does not, we return \texttt{MatchedWithWildcard}.

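The sketch below illustrates such an evaluation over the illustrative type expression AST introduced earlier; the operator strings are assumptions made for the example.

\begin{lstlisting}[language={JavaScript}]
// Sketch: evaluate a type expression against a node type
function evaluate(expr: TypeExprAST, nodeType: string): boolean {
    switch (expr.kind) {
        case "Identifier":
            // A leaf is an equality check against the node type
            return expr.name === nodeType;
        case "Binary": {
            const left = evaluate(expr.left, nodeType);
            const right = evaluate(expr.right, nodeType);
            return expr.op === "||" ? left || right : left && right;
        }
        case "MultipleMatch":
            return evaluate(expr.expr, nodeType);
        default:
            return false;
    }
}
\end{lstlisting}
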
\subsection{Matching a single Expression/Statement template}
\label{sec:singleMatcher}

In this section, we discuss how matching is performed when the \texttt{applicable to} template is a single expression or statement. This covers line 7 of Algorithm~\ref{lst:outline}.

To determine whether we are currently matching with a template that is only a single expression or statement, we verify that the program body of the template has a length of one; if this is the case, we use the single-statement traversal algorithm.

There is a special case when the template is a single expression: the first node of the AST produced by \texttt{@babel/parser}~\cite{BabelParser} will be of type \texttt{ExpressionStatement}, because Babel treats a free-floating expression as a statement. Matching on this node would miss many applicable parts of the user's code, because expressions within other statements are not wrapped in an \texttt{ExpressionStatement}, making the template incompatible with otherwise applicable expressions. This means the statement has to be removed, and the search has to be done with the expression as the top node of the template.

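A minimal sketch of this unwrapping step is shown below (variable names are illustrative):

\begin{lstlisting}[language={JavaScript}]
// Use the expression itself as the root of the search template
let templateRoot = templateAST.program.body[0];
if (templateRoot.type === "ExpressionStatement") {
    templateRoot = templateRoot.expression;
}
\end{lstlisting}
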
\paragraph{Discovering Matches Recursively}
The matching algorithm used with single expression/statement templates is based on depth-first search to traverse the trees. The algorithm can be split into two steps: the first step is to start a new search on each child of the node currently explored, and the second is to check the current node for a match.

The first step ensures we search for a match at all levels of the code AST. This is done by starting a new search on every child node of the code AST whenever the current node of the \texttt{applicable to} AST is the root node. This guarantees we have explored a potential match at every level of the tree. An added benefit of this approach is that it ensures we have no partial matches, as we store a match only if the search was started with the root node of the \texttt{applicable to} AST. This behaviour can be seen in the listing below.
\begin{lstlisting}[language={JavaScript}]
if(aplTo.element === this.aplToRoot){
    // Start a search from root of aplTo on all child nodes
    // ...
}
\end{lstlisting}

The second step is to compare the nodes of the current search. This means the current code AST node is compared against the current \texttt{applicable to} AST node. Based on this comparison, different steps must be performed; these are described below.

\begin{description}
\item[NoMatch:] If a comparison between the nodes returns a \texttt{NoMatch} result, we perform an early return of undefined, as no match was discovered.
\item[Matched:] The current code node matches against the current node of the template, and we have to perform a search on each of the child nodes.
\item[MatchedWithWildcard:] When a comparison results in a wildcard match, we immediately pair the current code node with the template wildcard and return early. This is possible because, once a wildcard matches, the child nodes are irrelevant and will be placed into the transformation regardless.
\item[MatchedWithPlussedWildcard:] This is a special case of a wildcard match. When a match occurs against a wildcard with a Kleene plus, we do the same as for \texttt{MatchedWithWildcard}, but return a different comparison result, as this necessitates a special traversal of the current node's siblings.
\end{description}

A comparison result of \texttt{Matched} means the two nodes match, but the \texttt{applicable to} node is not a wildcard. In this case, we perform a search on the child nodes of the \texttt{applicable to} AST and the user AST, as shown in Listing~\ref{lst:pseudocodeChildSearch}. This is performed in order, meaning the $n$-th child node of \texttt{applicable to} is checked against the $n$-th child node of the user AST.

When checking the child nodes, we have to handle a special case when the comparison of the child nodes results in \texttt{MatchedWithPlussedWildcard}. If this result is encountered, we have to continue matching the same \texttt{applicable to} node against each subsequent sibling node of the code node, because a wildcard with a Kleene plus can match against multiple sibling nodes.

In Listing~\ref{lst:pseudocodeChildSearch} below, we search the children after a comparison has returned the result \texttt{Matched}. For this, we use a two-pointer technique with \texttt{codeI} and \texttt{aplToI} (lines 1, 2). The search continues until one of the pointers reaches the end of the list of children of its respective node (line 3). If any of the child nodes do not return a match, the entire match is discarded (lines 5-8). We prepare the paired tree by appending the current child search to the parent pair (lines 11, 12). We handle the special case of a Kleene plus (line 15) by continuing the search with the same \texttt{aplToI} pointer while incrementing \texttt{codeI} (lines 16-18, 28). As long as the result is \texttt{MatchedWithPlussedWildcard}, we add the nodes matched with the wildcard to the pair of matches, meaning the pair will contain multiple nodes from the user AST matched against the same wildcard (lines 25, 26). If the result is not \texttt{MatchedWithPlussedWildcard}, we decrement \texttt{codeI}, stop the comparisons against the wildcard, and continue searching the remaining child nodes as normal (lines 20-23). When one of the child lists has been completely searched, we check whether all child nodes of the current code AST parent were matched by verifying that we reached the end of the code AST children (lines 35-37). Once all these searches have completed and we confirm a match, we return the paired tree structure along with the match result (line 38).

\begin{lstlisting}[language={JavaScript}, caption={Pseudocode of child node matching}, label={lst:pseudocodeChildSearch}]
let codeI = 0;
let aplToI = 0;
while (aplToI < aplTo.children.length && codeI < code.children.length){
    let [pairedChild, childResult] = singleMatcher(code.children[codeI], aplTo.children[aplToI]);
    if(childResult === NoMatch){
        // Discard the entire match if any child fails to match
        return [undefined, NoMatch];
    }

    // Add the match to the current Paired Tree structure
    pairedChild.parent = currentPair;
    currentPair.children.push(pairedChild);

    // Special case for Kleene plus wildcard match
    if(childResult === MatchedWithPlussedWildcard){
        codeI += 1;
        while(codeI < code.children.length){
            let [nextChild, plusChildResult] = singleMatcher(code.children[codeI], aplTo.children[aplToI]);

            if(plusChildResult !== MatchedWithPlussedWildcard){
                codeI -= 1;
                break;
            }

            pairedChild.element.codeNode.push(
                ...nextChild.element.codeNode);

            codeI += 1;
        }
    }

    aplToI += 1;
    codeI += 1;
}
if(codeI !== code.children.length){
    return [undefined, NoMatch];
}
return [currentPair, Match];
\end{lstlisting}

\subsection{Matching multiple statements}

Using multiple statements in the \texttt{applicable to} template means the tree of \texttt{applicable to} has multiple root nodes. To perform a match with this kind of template, we use a sliding window~\cite{SlidingWindow} with size equal to the number of statements in the template. This window is applied at every \texttt{BlockStatement} and \texttt{Program} node of the code AST, as those are the only places statements can reside in JavaScript~\cite[14]{ecma262}.

The initial step of this algorithm is to search through the AST for nodes that contain a list of \textit{Statements}. Searching the tree is done by depth-first search; at every level of the AST, we check the type of the node. Once a node of type \texttt{BlockStatement} or \texttt{Program} is encountered, we start trying to match the statements.

\begin{lstlisting}[language={JavaScript}]
multiStatementMatcher(code, aplTo) {
    // ... apply matchMultiHead at every BlockStatement and Program
}
\end{lstlisting}

\texttt{matchMultiHead} uses a sliding window~\cite{SlidingWindow}. The sliding window tries to match every statement of the code AST against its corresponding statement in the \texttt{applicable to} AST. For every statement, a depth-first recursive search similar to the algorithm of Section~\ref{sec:singleMatcher} is applied; however, this search is not applied at all levels, and a candidate has to match fully and immediately. If a match is not found, the current iteration of the sliding window is discarded, and we move the window one step further.

One important case here is that we might not know the width of the sliding window. This is due to wildcards using the Kleene plus, as they can match one or more nodes; such a wildcard might, for instance, match against \texttt{(Statement)+}. Therefore, we use a technique similar to the one described in Section~\ref{sec:singleMatcher}, where we keep two pointers and perform the search based on their values.

\subsection*{Output of the matcher}

Each node of the \texttt{applicable to} template is paired with the node or nodes of the user's code it was matched against, using the interface \texttt{PairedNode}. Since a wildcard with a Kleene plus can match several nodes, the code side of the pair is a list.

\begin{lstlisting}[language={JavaScript}]
interface PairedNode{
    codeNode: t.Node[],
    // ...
}
\end{lstlisting}


Since a match might consist of multiple statements, we use an interface \texttt{Match} that contains separate tree structures of \texttt{PairedNode}s. This allows storing a match with multiple root nodes, and is used by \texttt{matchMultiHead}.

\begin{lstlisting}[language={JavaScript}]
export interface Match {
    // ... one tree of PairedNodes per matched statement
}
\end{lstlisting}

\section{Transforming}
\label{sec:transform}
To perform the transformation and replacement on each of the matches, we take the resulting list of matches, the template from the \texttt{transform to} section of the current case of the proposal, and the Babel AST~\cite{BabelAST} of the original code. All the transformations are applied to the code, and we use \texttt{@babel/generator}~\cite{BabelGenerate} to generate JavaScript code from the transformed AST.

An important note is that we have to transform the leaves of the AST first. If the transformation were applied from top to bottom, it might overwrite transformations made for a previous match: transforming from the top of the tree down, we might end up with \texttt{a(b) |> c(\%)} instead of \texttt{b |> a(\%) |> c(\%)} in the case of the ``Pipeline'' proposal. This is easily solved in our case: since the matcher looks for matches from the top of the tree to the bottom, the matches are always discovered in that order, so when transforming, all that has to be done is to reverse the list of matches to get the ones closest to the leaves first.

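In code, this amounts to something like the following sketch, where \texttt{applyTransformation} stands in for the insertion step described below:

\begin{lstlisting}[language={JavaScript}]
// Matches are discovered top-down; reversing the list applies
// the transformations closest to the leaves first.
for (const match of matches.slice().reverse()) {
    applyTransformation(match);
}
\end{lstlisting}
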
\newpage
\subsubsection{Building the transformation}

Before we can start to insert the \texttt{transform to} section into the user's code AST, we have to insert all nodes matched against a wildcard in \texttt{applicable to} into their reference locations. The function below builds a map from each wildcard name to the user nodes it matched.

\begin{lstlisting}[language={JavaScript}, caption={Extracting wildcard pairs from a match.}, label={lst:extractWildcardFromMatch}]
function extractWildcardPairs(match: Match): Map<string, t.Node[]> {
    // ... collect the nodes paired with each wildcard identifier
}
\end{lstlisting}


Once the full map of all wildcards has been built, we insert the nodes matched with each wildcard into the \texttt{transform to} template. To do this, we traverse the template with \texttt{@babel/traverse}~\cite{BabelTraverse}, as it provides a powerful API for modifying the AST. \texttt{@babel/traverse} allows us to define visitors that are executed when specific types of AST nodes are traversed. In this traversal, we define a visitor for \texttt{Identifier} and a visitor for \texttt{ExpressionStatement}.

When we visit a node with these visitors, we check whether that node's name is a key in the map of wildcards built in Listing~\ref{lst:extractWildcardFromMatch}. If it is, we replace the identifier with the value stored in the map, i.e., the AST nodes from the user's code that were matched against that wildcard. See Listing~\ref{lst:traToTransform}.

\begin{lstlisting}[language={JavaScript}, caption={Traversing \texttt{transform to} AST and inserting user context}, label={lst:traToTransform}]
traverse(transformTo, {
    // ... Identifier and ExpressionStatement visitors replacing
    // wildcard identifiers with the matched user nodes
});
\end{lstlisting}

Due to some wildcards allowing matching of multiple sibling nodes, we have to use \texttt{replaceWithMultiple} when inserting the matched nodes into the template.

\subsubsection*{Inserting the template into the AST}

We have now created the \texttt{transform to} template with the user's context. This has to be inserted into the full AST of the user's code, which requires locating exactly where in the user AST the match originated. To do this efficiently, we use the top node of the user's code stored in the match as the key to a \texttt{Map}; if a node in the user AST exists in that map, we know it was matched and should be replaced.

\begin{lstlisting}[language={JavaScript}]
transformedTransformTo.set(
    // ... key: top node of the match, value: transformed template
);
\end{lstlisting}
|
||||
|
||||
|
||||
To traverse the user AST, we use \texttt{@babel/traverse}~\cite{BabelTraverse}. In this case we cannot use a specific visitor, and therefore we use a generic visitor that applies to every node of the AST. If the current node we are visiting is a key to the map of transformations, we know we have to insert the transformed code. This is done similarly to before where we use \texttt{replaceWithMultiple}.
|
||||
We now traverse the AST generated from the users code with \texttt{@babel/traverse}. In this case we cannot use a specific visitor, and therefore we use a generic visitor that applies to every node type of the AST. If the current node we are visiting is a key to the map of transformations created earlier, we know we have to insert the transformed code. This is done similarly to before where we use \texttt{replaceWithMultiple}.
Some matches have multiple root nodes. This occurs when matching was performed with multiple statements as top nodes. In that case, we have to remove the $n-1$ following sibling nodes. Removal of these sibling nodes can be seen on lines 12--15 of Listing~\ref{lst:insertingIntoUserCode}.
@ -661,11 +793,11 @@ traverse(codeAST, {
});
\end{lstlisting}
A special case occurs when a wildcard with a Kleene plus has matched multiple siblings: we might then have more siblings to remove, and it is not as simple to know exactly how many. Therefore, we iterate over all statements of the match and check whether each statement is still a sibling of the node currently being replaced. This behavior can be seen on lines 20--29 of Listing~\ref{lst:insertingIntoUserCode}.
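
For reference, the core of this traversal can be sketched as follows. This is a simplified illustration of the technique, not the tool's actual implementation: it omits the sibling bookkeeping of Listing~\ref{lst:insertingIntoUserCode}, and \texttt{transformedTransformTo} refers to the map of transformations built earlier.
\begin{lstlisting}[language={JavaScript}]
// Sketch: replace every matched top node with its transformation.
traverse(codeAST, {
    enter(path) {
        if (transformedTransformTo.has(path.node)) {
            // Insert all statements of the transformation at once.
            path.replaceWithMultiple(transformedTransformTo.get(path.node));
        }
    },
});
\end{lstlisting}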
After one full traversal of the user AST, all matches found have been replaced with their respective transformations. All that remains is generating JavaScript from the transformed AST.
\subsection{Generating source code from transformed AST}
\label{sec:generate}
To generate JavaScript from the transformed AST created by this tool, we use a JavaScript library titled \texttt{@babel/generator}~\cite{BabelGenerate}. This library is specifically designed for use with Babel to generate JavaScript from a Babel AST. We generate code from the transformed AST of the user's code, taking care to apply all Babel plugins the current proposal might require.
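
As an illustration, invoking the generator can be sketched as follows; this is a minimal example under the assumption that \texttt{transformedAST} holds the transformed Babel AST, and the \texttt{comments} option is shown for illustration only.
\begin{lstlisting}[language={JavaScript}]
import generate from "@babel/generator";

// Sketch: produce JavaScript source text from the transformed AST.
const output = generate(transformedAST, { comments: true });
console.log(output.code);
\end{lstlisting}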
@ -1,24 +1,19 @@
\chapter{Evaluation}
In this chapter we will present the results of running each of the proposals discussed in this thesis on large-scale JavaScript projects.
\section{Real-life source code}
To evaluate this tool on existing JavaScript codebases, we have collected JavaScript projects from GitHub containing many or large JavaScript files.
Each case study was evaluated by running this tool on every \texttt{.js} file in the repository,
and then collecting the number of matches found in total,
and how many files were successfully searched.
Evaluating whether the transformation was correct is done by manually sampling output files,
and verifying that they pass through \texttt{@babel/generator}~\cite{BabelGenerate}
without errors.
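
The harness we use for these runs can be sketched as follows. This is a minimal illustration only: the names \texttt{transformFile} and \texttt{proposalSpec} are hypothetical stand-ins for our tool's entry point and a parsed proposal specification, not its actual API.
\begin{lstlisting}[language={JavaScript}]
import { readdirSync, readFileSync, statSync } from "fs";
import { join, extname } from "path";

// Recursively yield every .js file in a repository.
function* jsFiles(dir) {
    for (const entry of readdirSync(dir)) {
        const path = join(dir, entry);
        if (statSync(path).isDirectory()) yield* jsFiles(path);
        else if (extname(path) === ".js") yield path;
    }
}

let searchedFiles = 0;
let totalMatches = 0;
for (const file of jsFiles("path/to/repository")) {
    try {
        // Hypothetical entry point of our tool.
        const { matches } = transformFile(readFileSync(file, "utf8"), proposalSpec);
        totalMatches += matches;
        searchedFiles += 1;
    } catch {
        // The file could not be parsed or transformed; skip it.
    }
}
console.log(`${totalMatches} matches in ${searchedFiles} files`);
\end{lstlisting}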
We describe below our results and observations on using our tool on the codebases of various large-scale projects that use JavaScript.
\textbf{Next.js}~\cite{NEXT.JS} is one of the largest projects on the web. It is used with React~\cite{React} to enable features such as server-side rendering and static site generation.
\begin{table}[H]
\begin{center}
@ -30,7 +25,7 @@ The imaginary ``Await to promise'' proposal also has an expected number of match
\hline
``Do Expression'' & 229 & 37 & 3340 \\
\hline
Await to Promise & 8 & 7 & 3340 \\
\hline
\end{tabular}
\end{center}
@ -38,11 +33,8 @@ The imaginary ``Await to promise'' proposal also has an expected number of match
\label{fig:evalNextJS}
\end{table}
\textbf{Three.js}~\cite{ThreeJS} is a library for 3D rendering in JavaScript.
It is written purely in JavaScript and uses the GPU for 3D calculations.
\begin{table}[H]
\begin{center}
@ -62,7 +54,7 @@ The imaginary ``Await to promise'' proposal also has an expected number of match
\label{fig:evalThreeJS}
\end{table}
\textbf{React}~\cite{React} is a graphical user interface library for JavaScript, which facilitates the creation of user interfaces for both web and native platforms. React is based upon splitting a user interface into components to simplify development. It is currently one of the most popular libraries for creating web apps.
\begin{table}[H]
\begin{center}
@ -83,7 +75,7 @@ The imaginary ``Await to promise'' proposal also has an expected number of match
\end{table}
\textbf{Bootstrap}~\cite{Bootstrap} is a front-end framework used for creating responsive and mobile-first websites, and it comes with a variety of built-in components. This library is a good evaluation point for this thesis as it is written in ``vanilla'' JavaScript.
\begin{table}[H]
\begin{center}
@ -104,8 +96,7 @@ The imaginary ``Await to promise'' proposal also has an expected number of match
\end{table}
\textbf{Atom}~\cite{Atom} is a text editor developed in JavaScript.
\begin{table}[H]
\begin{center}
@ -125,4 +116,34 @@ The imaginary ``Await to promise'' proposal also has an expected number of match
\label{fig:evalAtom}
\end{table}
The ``Pipeline'' proposal is applicable to most files: the reason for this is that call expressions are widely used when writing JavaScript code. Our tool found matches in most files that Babel~\cite{BabelParser} managed to parse, and manual inspection confirmed that the transformations were performed correctly.
The ``Do Expression'' proposal is not as ``applicable'' as the ``Pipeline'' proposal: the amount of code its specification in \DSL{} is able to transform is expectedly lower.
This is because the proposal introduces an entirely new way of writing
expression-oriented code in JavaScript.
If the code has not used the current way of writing
expression-oriented code in JavaScript,
\DSL{} is limited in the amount of transformations it can perform.
Nevertheless, our tool is able to identify matches where it is applicable, and manual verification shows that the transformations are correct.
The imaginary ``Await to promise'' proposal also has an ``expected'' number of matches; however, we do not evaluate this proposal further, since it is not an official TC39 proposal.
Our tool demonstrates its capability to perform searches on large codebases,
to identify applicable code for proposals,
and to transform the code.
As can be seen from the tables above,
some of the proposals found zero matches
when evaluated on some of these codebases.
This is due to the fact that the developers of these projects
have not used the language construct the proposal is targeting.
Because of this,
no transformations can be performed.
This is especially apparent with the ``Do Expression'' proposal,
but also with the ``Await to Promise'' imaginary proposal.
This means that our tool's ability to perform transformations
depends on how widespread the adoption of the language construct targeted in a proposal is.
We can hypothesize that the amount of matches reflects the ``impact''
that design decisions made by the TC39 committee might have on established JavaScript projects and codebases.
We give examples of some of the transformations performed on these codebases in Appendix~\ref{appendix:b}.
@ -1,19 +1,34 @@
\chapter{Conclusion and Future Work}
\section{Conclusions}
In this thesis,
we have explored an approach to define transformations of JavaScript code
based on formal specifications of syntactic proposals.
The goal of such transformations is to gather (early) feedback for (contentious) syntactic ECMAScript language proposals discussed by the TC39 committee.
Our tool opens a possibility for the users to ``preview''
proposals on their own codebases:
it can be conjectured that users' familiarity with the code
shall improve the quality of feedback.
The work presented in this thesis is an initial step in developing a language workbench-like tool for supporting the design of widely adopted programming languages. While this thesis adequately implements the machinery of the core of such a tool, future work is required.
A major next step is to \textbf{integrate a feedback gathering mechanism in an IDE}. This shall give users a way to apply proposals to fragments of their code \emph{and} to be able to give feedback on every such application. This could be implemented, for example, using a rating scale (e.g., a Likert scale) to quantify users' preferences. The user would also be able to submit their code (in an obfuscated form) directly to the TC39 committee.
\section{Future Work}
We have also identified several directions on how the
\textbf{expressiveness of \DSL{} can be improved.}
For example, \emph{parameterized specifications}
can be introduced to enable reuse of (parts of) proposal specifications.
Another example is to \emph{support a richer syntax for wildcards}---this would allow for more powerful matching and transformations of the AST structures.
Currently, our tool relies heavily on abstract syntax trees produced by Babel.
While this can be considered as an advantage for the TC39's use case,
introducing \textbf{support for arbitrary JavaScript parsers} can be beneficial.
Ultimately, \textbf{supporting other programming languages} in our tool could help in performing corpus analysis when designing new features for both ECMAScript and those other languages. In addition, this could enable exploring \emph{co-evolution} of programming languages.
@ -1,15 +1,94 @@
\chapter{Introduction}
The development and evolution of the programming language ECMAScript---which
is defined by the ECMA-262 language standard---is
done by the Technical Committee 39 of Ecma International.
The committee has the responsibility to investigate proposals
suggested for addition into the ECMAScript language.
During this process,
proposals go through numerous iterations
of improving the solution space of the problem identified in a proposal.
The community of JavaScript developers
can give feedback on proposals;
this feedback has to be of a certain quality---it is,
therefore,
crucial that the users are confident in their understanding of the proposal,
the suggested solution,
and its potential corner cases.
To aid users in this understanding,
the description of a proposal is expected
to illustrate the solution by presenting several
examples---in the form of ECMAScript code snippets---that
highlight various scenarios for the use of
the functionality suggested in a proposal.
In this thesis,
we suggest a way of demonstrating these scenarios in a user's own codebase.
We conjecture this will lower the barrier of understanding a proposal,
and will allow the user to focus solely on the concepts a proposal introduces.
This thesis discusses a way of defining transformations of code specifically for \emph{syntactic} proposals---these are proposals that do not introduce any new semantics to the language, but merely improve the ergonomics of how the code is written.
The idea is to identify code fragments in a user's codebase
to which a proposal can be applied
(a code fragment is \emph{applicable} to a proposal if it could have been written using the functionality of that proposal),
and then replace that code with an equivalent code
that uses the functionality introduced in a proposal.
The transformed code can then be presented to the user alongside the original, unchanged code, so that the user can give feedback on the proposal based on a concrete application of it.
We developed a domain-specific language called ``JSTQL'' (for ``JavaScript Transformation Query Language'')
for specifying queries and transformations on JavaScript code.
This DSL utilizes JavaScript templates to query user code and
to define how the matched code should be transformed.
To parse both the templates and user code,
we employ the Babel~\cite{Babel} library.
The templates defined in \DSL{} may include variables---referred to as \emph{wildcards}---which are special blocks written inside the template.
These blocks facilitate matching against arbitrary code,
and transforming that code according to the specified transformation;
this allows the transformed code to maintain its \emph{context}.
To specify what kind of code a wildcard will match against,
we use \emph{type expressions},
which are Boolean propositions on the node types as defined in the Babel abstract syntax tree specification.
The evaluation of the transformation tool implemented in this thesis involved specifying the proposals ``Do Expression''~\cite{Proposal:DoProposal}
and ``Pipeline''~\cite{Pipeline} in our DSL.
These specifications were applied to existing large codebases
in order to assess the functionality of the transformations.
The results obtained from this process confirmed the functionality of the tool, and provided insights into how significant of an ``impact''
the design decisions in each proposal might have on existing codebases.
The transformation tool presented in this thesis
is meant to be the initial step in creating a language workbench-like
tool for designing widely adopted programming languages.
We created the core machinery of transforming code based on a proposal specification,
while implementing ways to present this to users and gather feedback on proposals is left up to future work.
@ -1,16 +1,24 @@
\chapter{Related Work}
In this chapter,
we discuss various techniques and languages for code querying,
present approaches to tree manipulation and transformation,
and describe several JavaScript parsers.
We also discuss aspect-oriented programming and model-driven engineering.
\section{Source code query languages}
To allow for simple analysis and refactoring of code, there exist many query languages designed to query source code.
These languages use various techniques to allow for querying code
based on specific paradigms
(such as: logical queries, declarative queries, SQL-like queries, etc.).
\subsection{CodeQL}
\emph{CodeQL}~\cite{CodeQL} is an object-oriented query language, previously known as \emph{.QL}.
CodeQL is used to semantically analyze code to discover vulnerabilities~\cite{CodeQLStuff}.
The language is inspired~\cite{CodeQLStuff} by SQL~\cite{SQL}, Datalog~\cite{Datalog}, the Eindhoven Quantifier Notation~\cite{EindhovenQuantifierNotation}, and the concept of classes as predicates~\cite{Predicates}.
An example~\cite{CodeQLStuff} of how queries are written in CodeQL is as follows.
\begin{lstlisting}
from Class c
where c.declaresMethod("equals") and
@ -18,13 +26,27 @@ where c.declaresMethod("equals") and
c.fromSource()
select c.getPackage(), c
\end{lstlisting}
This query will find all classes that have the method \texttt{equals},
but do not have the method \texttt{hashCode}.
As can be seen from this example, the SQL-like syntax of writing queries in CodeQL is substantially different from \DSL{}, which aims at a more declarative syntax. This makes the writing experience of the two languages very different:
writing CodeQL queries is similar to querying a database, while writing queries in \DSL{} is similar to defining an example of the structure one wishes to search for.
\subsection{PMD XPath}
PMD XPath is a language for Java source code querying.
This language supports querying of all Java constructs~\cite{ProgrammingLanguageEcolutionViaSourceCodeQueryLanguages}.
The reason it has this wide support is that it constructs the entire codebase's AST in XML format, and then performs the query on the corresponding XML.
These queries are performed using XPath expressions that define matching on XML trees.
This makes the query language versatile for static code analysis,
and it is used in the \emph{PMD} static code analysis tool~\cite{PMDAnalyzer}.

An example~\cite{PMDXPathRule} of PMD XPath queries is as follows.
\begin{lstlisting}
//VariableId[@Name = "bill"]
//VariableId[@Name = "bill" and ../../Type[@TypeImage = "short"]]
\end{lstlisting}
These queries can be applied, for example, to the following Java code~\cite{PMDXPath}:
\begin{lstlisting}
public class KeepingItSerious{
    Delegator bill; // FieldDeclaration
@ -34,30 +56,80 @@ public class KeepingItSerious{
    }
}
\end{lstlisting}
If we execute the queries on this code, the first query will match against the field declaration \texttt{Delegator bill} and \texttt{short bill}, while the second query will only return \texttt{short bill}.
The reason the second query limits the search
is that we also specify the type of the declaration.
\DSL{} uses JavaScript code \emph{templates} to specify queries;
this supposedly makes writing such queries simpler for users, as they write plain JavaScript. In its turn, PMD XPath uses XPath expressions to define structural queries, which is quite verbose and requires extended knowledge of the AST being queried.
\subsection{XSL Transformations}
XSLT~\cite{XSLT} is a language for performing transformations of XML documents,
either to other XML documents, or to different formats altogether (such as HTML or plain text).
XSLT is part of the Extensible Stylesheet Language family.
The XSL language is expressed in the form of a stylesheet~\cite[Sect.~1.1]{XSLT},
whose syntax is defined in XML.
This language uses a template-based approach to define matches on specific patterns in the source to find sections to transform.
These transformations are defined by a transformation declaration
that describes how the output of the match should look.
The example XML document below represents a program, where each node \texttt{variable} has an attribute \texttt{name}.
\begin{lstlisting}
<program>
    <variable name="a"/>
    <variable name="b"/>
    <variable name="c"/>
</program>
\end{lstlisting}
To transform the example above, we define a transformation in XSLT, seen below. This transformation contains two templates: the first template matches the node \texttt{program}; it copies the node into the output with \texttt{xsl:copy} and applies the second template to all child nodes. The second template matches the element \texttt{variable}, and defines a transformation that changes the node from \texttt{variable} to \texttt{const}.
\begin{lstlisting}
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/program">
    <xsl:copy>
      <xsl:apply-templates select="variable"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="variable">
    <const name="{@name}"/>
  </xsl:template>
</xsl:stylesheet>
\end{lstlisting}
The result of running the XSLT transformation above on the XML we defined is shown below.
\begin{lstlisting}
<program>
    <const name="a"/>
    <const name="b"/>
    <const name="c"/>
</program>
\end{lstlisting}
Though XSLT defines matching in a manner similar to \DSL{},
its approach to defining transformations is different: \DSL{} allows the user to specify a code fragment interspersed with wildcards,
while XSLT requires specifying a transformation (written in a functional style).
Moreover, \DSL{}'s implementation is tailored for the use by the TC39 committee, while XSLT's expressive power allows specifying arbitrarily complex transformations of tree-like data structures.
\subsection{Jackpot}
\emph{Jackpot}~\cite{Jackpot}
(also known as \emph{Java Declarative Hints Language})
is a query language
that uses declarative patterns to define source code queries:
these queries are used in conjunction with multiple rewrite definitions.
The language is used in the Apache Netbeans~\cite{ApacheNetBeans}
suite of tools to allow for declarative refactoring of code.
The example of a query and transformation below queries the code for variable declarations with initial value of 1,
and then changes them into a declaration with initial value of 0.
\begin{lstlisting}
"change declarations of 1 to declarations of 0":
int $1 = 1;
@ -65,52 +137,142 @@ The example of a query and transformation below, will query the code for variabl
\end{lstlisting}
Jackpot is quite similar to \DSL{},
as both languages define queries
by using similar structure.
In Jackpot,
one defines a \textit{pattern},
and then every match of that pattern can be re-written
to a \textit{fix-pattern}.
Each fix-pattern can have a condition attached to it.
This is quite similar to the \textit{applicable to} and \textit{transform to} sections of \DSL{}.
Jackpot also supports a feature
which is similar to the wildcards in \DSL{}---one can define variables
in the \textit{pattern} definition and transfer them over to the
\textit{fix-pattern} definition.
In contrast to \DSL{}, wildcard type restrictions and notation for matching more than one AST node are not supported in Jackpot.
\section{IntelliJ structural search}
JetBrains IntelliJ-based Integrated Development Environments
have a feature that allows for structural search and replace~\cite{StructuralSearchAndReplaceJetbrains}.
This feature is intended for large code bases
where a developer wishes to perform a search and replace
based on syntax and semantics,
and not a (regular) text-based search and replace.

When doing structural search in IntelliJ-based IDEs,
templates are used to describe the query used in the search.
These templates use variables described with \texttt{\$variable\$};
these allow for transferring context to the structural replace.
In the figure below, we perform a structural search for a method declaration with three parameters of type \texttt{int}, and replace it with a method declaration where all parameters are of type \texttt{double} and the return type is \texttt{double}.
\begin{figure}[H]
\begin{center}
\includegraphics[width={.85\textwidth}]{figures/image.psd.png}
\end{center}
\caption{Example of IntelliJ structural search and replace}
\end{figure}
This tool is interactive,
and every match is showcased in the \emph{Find} tool.
In this tool,
a developer can decide which matches to apply the replace template to.
This allows for error avoidance and a stricter search
that is verified by humans.
If the developer wishes so,
they do not have to verify each match and can replace all matches at once.
IntelliJ structural search and replace and \DSL{} have similarities:
they both are template-based.
In both approaches, templates can contain variables and wildcards
to allow for matching against arbitrary code.
Both tools also support matching multiple code parts against a single variable or a wildcard.
A core difference between the two tools is the variable type system:
when performing a match and transformation in \DSL{},
the types are used extensively to limit the match against the wildcards,
while this limitation is not possible in IntelliJ.
\section{JavaScript parsers}
This section will explore other JavaScript parsers that could have been used in this project.
We will give a brief introduction of each of them,
and discuss why they were not chosen.
\subsection*{Speedy Web Compiler}
Speedy Web Compiler~\cite{SpeedyWebCompiler} (SWC) is a library created for parsing and compiling JavaScript and other dialects (such as JSX and TypeScript).
It is written in Rust and is known for its improved performance.
SWC is used by large organizations creating applications and tooling for the web platform.
Speedy Web Compiler supports various features, such as:
\emph{compilation} (used for TypeScript and other languages that are compiled down to JavaScript),
\emph{bundling} (which takes multiple JavaScript/TypeScript files and bundles them into a single output file, while handling naming collisions),
\emph{minification} (which makes the bundle size of a project smaller),
transformation for use with WebAssembly,
as well as \emph{custom plugins} (to change the specification of the languages parsed by SWC).
Similar to Babel~\cite{Babel}, Speedy Web Compiler is an extensible parser that allows for changing the specification of the parsed program. Its extensions are written in Rust. While it does not have as mature a plugin system as Babel, its focus on speed makes it widely used for large-scale web projects.
SWC was considered for use in this project; however, since SWC only supports proposals once they reach stage 3, it was not possible to use this parser.
\subsection*{Acorn}
Acorn~\cite{AcornJS} is a parser written in JavaScript to parse JavaScript
and related languages.
Acorn focuses on being a small and performant JavaScript parser,
with strong plugin support for extending and redefining
how its internal parser works,
and it has a custom tree traversal library, Acorn Walk.
Babel is originally a fork of Acorn,
and while Babel has since had a full rewrite,
Babel is still heavily based on Acorn~\cite{BabelAcornBased}.
Acorn was considered as a parser for this project;
however, it does not have the same wide community as Babel,
and does not have the same recommendation from TC39 as Babel does~\cite{TC39RecommendBabel}.
Even though it supports plugins and the plugin system is powerful,
there does not exist the same amount of pre-made plugins
for early-stage proposals as Babel has.
\section{Model-to-Model Transformations}
Model-to-Model transformations are an integral part of model-driven engineering (MDE), which is a methodology that focuses on the creation and modification of abstract models rather than focusing on executable code~\cite{MDE}. A model is an abstraction of the representation and behavior of a system. This methodology provides a higher-level approach to developing large software systems.
The process of performing a model-to-model transformation is to convert one model into another,
while preserving or adapting its underlying semantics and structure~\cite{ModelToModelTransformations}.
This is usually done by traversing its structure,
and extracting data and transforming its format
to fit the model it should be transformed into.
This allows a model described within one domain
to be transformed into another automatically.
This is quite similar to what the tool described in this thesis does: we transform the AST of the user's code from one representation (an AST that does not use a proposal) into another (an AST that uses the proposal, containing nodes that were not part of the original AST).
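
As a toy illustration (the object shapes here are invented for this sketch, echoing the XSLT example above), a model-to-model transformation can be as simple as traversing one model and building another while preserving the underlying data:
\begin{lstlisting}[language={JavaScript}]
// Source model: a program containing variable declarations.
const program = {
    variables: [{ name: "a" }, { name: "b" }, { name: "c" }],
};

// Transformation: build a target model of constant declarations,
// preserving the underlying structure (the names).
const transformed = {
    consts: program.variables.map(({ name }) => ({ name })),
};

console.log(transformed); // { consts: [ { name: "a" }, ... ] }
\end{lstlisting}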
\section{Aspect-Oriented Programming}
Aspect-Oriented Programming~\cite{AOP} (AOP)
is a programming paradigm that enables modularity by allowing
for a high degree of separation of concerns,
specifically focusing on cross-cutting concerns.
Cross-cutting concerns are aspects of a software program or a system
that have an effect at multiple levels,
cutting across the main functional requirements.
Such aspects are often related to security,
logging, or error handling,
but could be any concern that is shared across an application.
In AOP,
one creates an \textit{aspect},
which is a module that contains some
cross-cutting concern the developer wants to achieve.
An aspect contains \emph{advices},
which are the specific code fragments executed when certain conditions of the program are met
(for example, a \textit{before advice} is executed before a method executes, an \textit{after advice} is executed after a method regardless of the method's outcome, and an \textit{around advice} surrounds a method execution).
Contained within the aspect is also a \textit{pointcut},
which is the set of criteria determining
when the aspect is meant to be executed
(these can be at specific methods
or when specific constructors are called, and so on).
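
The following sketch illustrates around-style advice in plain JavaScript; it is a minimal illustration of the concept and is not tied to any particular AOP framework.
\begin{lstlisting}[language={JavaScript}]
// Sketch: "around" advice implemented as a function wrapper.
// The "pointcut" here is simply the explicit choice of method.
function around(target, methodName, advice) {
    const original = target[methodName];
    target[methodName] = function (...args) {
        return advice(() => original.apply(this, args), args);
    };
}

const service = { greet: (name) => `hello ${name}` };
around(service, "greet", (proceed, args) => {
    console.log("before", args);  // before advice
    const result = proceed();     // the original method
    console.log("after", result); // after advice
    return result;
});
service.greet("world");
\end{lstlisting}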
One can see a similarity between \DSL{} and aspect-oriented programming:
to define where \textit{pointcuts} are placed, one has to define some structure, and the AOP weaver has to search the code for events triggering the pointcut and run the advice defined within the aspect of that given pointcut.
0
figures/canvas.png
Executable file → Normal file
Before Width: | Height: | Size: 142 KiB After Width: | Height: | Size: 142 KiB
BIN
figures/image.psd.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 236 KiB
|
@ -2,18 +2,27 @@
|
|||
|
||||
\begin{abstract}
|
||||
|
||||
\noindent Lorem ipsum dolor sit amet, his veri singulis necessitatibus ad. Nec insolens periculis ex. Te pro purto eros error, nec alia graeci placerat cu. Hinc volutpat similique no qui, ad labitur mentitum democritum sea. Sale inimicus te eum.
|
||||
|
||||
No eros nemore impedit his, per at salutandi eloquentiam, ea semper euismod meliore sea. Mutat scaevola cotidieque cu mel. Eum an convenire tractatos, ei duo nulla molestie, quis hendrerit et vix. In aliquam intellegam philosophia sea. At quo bonorum adipisci. Eros labitur deleniti ius in, sonet congue ius at, pro suas meis habeo no.
|
||||
|
||||
\noindent Technical Committee 39 (TC39) of Ecma International
|
||||
is the body responsible for the evolution of the ECMAScript programming language, better known as JavaScript. Suggested changes to the language are presented in the form of proposals. To allow JavaScript users to form opinions about a proposal during the extensive design stage, proposal descriptions mention examples that showcase various corner cases. In this thesis, we implement a tool to search a user's codebase and demonstrate what the codebase would look like if the functionality defined by a proposal were a part of the language.
|
||||
We evaluate our tool on two contentious ECMAScript proposals (``Do Expression''
|
||||
and ``Pipeline'') and demonstrate that specifying proposals and transforming user code are feasible.
|
||||
The work presented in this thesis is an initial step in creating a language workbench-like tool to aid in the development and design of widely adopted programming languages.
|
||||
\end{abstract}
|
||||
|
||||
\renewcommand{\abstractname}{Acknowledgements}
|
||||
\begin{abstract}
|
||||
Est suavitate gubergren referrentur an, ex mea dolor eloquentiam, novum ludus suscipit in nec. Ea mea essent prompta constituam, has ut novum prodesset vulputate. Ad noster electram pri, nec sint accusamus dissentias at. Est ad laoreet fierent invidunt, ut per assueverit conclusionemque. An electram efficiendi mea.
|
||||
First of all, I would like to give my deepest appreciation to my supervisor Assoc. Prof. Mikhail Barash for his incredible guidance throughout this thesis. He has given me so much on this journey; his kind words and thought-provoking discussions have been invaluable. Having him as my advisor is what made writing this thesis a positive experience, and I will always be indebted to him for that.
|
||||
|
||||
I would also like to thank Yulia Startsev (TC39), who has been my informal advisor throughout this thesis. She suggested the idea on which this thesis is based, and provided deep technical knowledge when I was stuck on the implementation. The brainstorming sessions during our meetings are what made this thesis a reality.
|
||||
|
||||
I also want to express my thanks to Daniel Svalestad Liland. Without his continued jokes, motivation, and competitive spirit I would not be where I am today.
|
||||
|
||||
My acknowledgements would not be complete without mentioning all my fellow master's students. While they have been distracting at times, the support and community they provided have been incredibly important to me.
|
||||
|
||||
Last but not least, I want to thank my family for always supporting me throughout my studies; this thesis would not have been possible without them.
|
||||
|
||||
\vspace{1cm}
|
||||
\hspace*{\fill}\texttt{Your name}\\
|
||||
\hspace*{\fill}\texttt{Rolf Martin Glomsrud}\\
|
||||
\hspace*{\fill}\today
|
||||
\end{abstract}
|
||||
\setcounter{page}{1}
|
||||
|
|
|
@ -1,29 +1,7 @@
|
|||
\chapter{Examples of transformations performed in Evaluation}
|
||||
\label{appendix:b}
|
||||
\begin{figure}[H]
|
||||
\begin{lstlisting}[language={JavaScript}]
|
||||
async function getCurrentRules() {
|
||||
return fetch(`https://api.github.com/repos/vercel/next.js/branches/canary/protection`, {
|
||||
headers: {
|
||||
Accept: 'application/vnd.github+json',
|
||||
Authorization: `Bearer ${authToken}`,
|
||||
'X-GitHub-Api-Version': '2022-11-28'
|
||||
}
|
||||
}).then(async res => {
|
||||
if (!res.ok) {
|
||||
throw new Error(`Failed to check for rule ${res.status} ${await res.text()}`);
|
||||
}
|
||||
const data = await res.json();
|
||||
return {
|
||||
// Massive JS object
|
||||
};
|
||||
});
|
||||
}
|
||||
\end{lstlisting}
|
||||
\caption*{"Await to Promise" transformation, from \texttt{next.js/test/integration/typescript-hmr/index.test.js}}
|
||||
\end{figure}
|
||||
|
||||
\begin{figure}[H]
|
||||
\begin{figure}
|
||||
\begin{lstlisting}[language={JavaScript}]
|
||||
for (const file of typeFiles) {
|
||||
const content = await fs.readFile(join(styledJsxPath, file), 'utf8')
|
||||
|
@ -39,7 +17,7 @@ await (typesDir |> join(%, file) |> fs.writeFile(%, content));
|
|||
\caption*{``Pipeline'' transformation, taken from \texttt{next.js/packages/next/taskfile.js}}
|
||||
\end{figure}
|
||||
|
||||
\begin{figure}[H]
|
||||
\begin{figure}
|
||||
\begin{lstlisting}[language={JavaScript}]
|
||||
tracks.push( parseKeyframeTrack( jsonTracks[ i ] ).scale( frameTime ) );
|
||||
\end{lstlisting}
|
||||
|
@ -51,7 +29,7 @@ frameTime
|
|||
\caption*{Transformation taken from \texttt{three.js/src/animation/AnimationClip.js}}
|
||||
\end{figure}
|
||||
|
||||
\begin{figure}[H]
|
||||
\begin{figure}
|
||||
\begin{lstlisting}[language={JavaScript}]
|
||||
const logger = createLogger({
|
||||
storagePath: join(__dirname, '.progress-estimator'),
|
||||
|
@ -65,7 +43,7 @@ const logger = {
|
|||
\caption*{``Pipeline'' transformation, taken from \texttt{react/scripts/devtools/utils.js}}
|
||||
\end{figure}
|
||||
|
||||
\begin{figure}[H]
|
||||
\begin{figure}
|
||||
\begin{lstlisting}[language={JavaScript}]
|
||||
if (isElement(content)) {
|
||||
this._putElementInTemplate(getElement(content), templateElement)
|
||||
|
@ -83,7 +61,7 @@ if (content |> isElement(%)) {
|
|||
|
||||
|
||||
|
||||
\begin{figure}[H]
|
||||
\begin{figure}
|
||||
\begin{lstlisting}[language={JavaScript}]
|
||||
if (repo && repo.onDidDestroy) {
|
||||
repo.onDidDestroy(() =>
|
||||
|
@ -99,7 +77,7 @@ if (repo && repo.onDidDestroy) {
|
|||
\caption*{``Pipeline'' transformation, taken from \texttt{atom/src/project.js}}
|
||||
\end{figure}
|
||||
|
||||
\begin{figure}[H]
|
||||
\begin{figure}
|
||||
\begin{lstlisting}[language={JavaScript}]
|
||||
Lensflare.Geometry = do {
|
||||
const geometry = new BufferGeometry();
|
||||
|
@ -141,95 +119,8 @@ Lensflare.Geometry = do {
|
|||
\caption*{``Do expression'' transformation, taken from \texttt{three.js/examples/jsm/objects/Lensflare.js}}
|
||||
\end{figure}
|
||||
|
||||
\begin{figure}[H]
|
||||
\begin{lstlisting}[language={JavaScript}]
|
||||
addHelper: (function () {
|
||||
var geometry = new THREE.SphereGeometry(2, 4, 2);
|
||||
var material = new THREE.MeshBasicMaterial({
|
||||
color: 0xff0000,
|
||||
visible: false,
|
||||
});
|
||||
|
||||
return function (object, helper) {
|
||||
if (helper === undefined) {
|
||||
if (object.isCamera) {
|
||||
helper = new THREE.CameraHelper(object);
|
||||
} else if (object.isPointLight) {
|
||||
helper = new THREE.PointLightHelper(object, 1);
|
||||
} else if (object.isDirectionalLight) {
|
||||
helper = new THREE.DirectionalLightHelper(object, 1);
|
||||
} else if (object.isSpotLight) {
|
||||
helper = new THREE.SpotLightHelper(object);
|
||||
} else if (object.isHemisphereLight) {
|
||||
helper = new THREE.HemisphereLightHelper(object, 1);
|
||||
} else if (object.isSkinnedMesh) {
|
||||
helper = new THREE.SkeletonHelper(object.skeleton.bones[0]);
|
||||
} else if (
|
||||
object.isBone === true &&
|
||||
object.parent &&
|
||||
object.parent.isBone !== true
|
||||
) {
|
||||
helper = new THREE.SkeletonHelper(object);
|
||||
} else {
|
||||
// no helper for this object type
|
||||
return;
|
||||
}
|
||||
|
||||
const picker = new THREE.Mesh(geometry, material);
|
||||
picker.name = "picker";
|
||||
picker.userData.object = object;
|
||||
helper.add(picker);
|
||||
}
|
||||
|
||||
this.sceneHelpers.add(helper);
|
||||
this.helpers[object.id] = helper;
|
||||
|
||||
this.signals.helperAdded.dispatch(helper);
|
||||
};
|
||||
})(),
|
||||
\end{lstlisting}
|
||||
\begin{lstlisting}[language={JavaScript}]
|
||||
addHelper: do {
|
||||
var geometry = new THREE.SphereGeometry(2, 4, 2);
|
||||
var material = new THREE.MeshBasicMaterial({
|
||||
color: 0xff0000,
|
||||
visible: false
|
||||
});
|
||||
function (object, helper) {
|
||||
if (helper === undefined) {
|
||||
if (object.isCamera) {
|
||||
helper = new THREE.CameraHelper(object);
|
||||
} else if (object.isPointLight) {
|
||||
helper = new THREE.PointLightHelper(object, 1);
|
||||
} else if (object.isDirectionalLight) {
|
||||
helper = new THREE.DirectionalLightHelper(object, 1);
|
||||
} else if (object.isSpotLight) {
|
||||
helper = new THREE.SpotLightHelper(object);
|
||||
} else if (object.isHemisphereLight) {
|
||||
helper = new THREE.HemisphereLightHelper(object, 1);
|
||||
} else if (object.isSkinnedMesh) {
|
||||
helper = new THREE.SkeletonHelper(object.skeleton.bones[0]);
|
||||
} else if (object.isBone === true && object.parent && object.parent.isBone !== true) {
|
||||
helper = new THREE.SkeletonHelper(object);
|
||||
} else {
|
||||
// no helper for this object type
|
||||
return;
|
||||
}
|
||||
const picker = new THREE.Mesh(geometry, material);
|
||||
picker.name = 'picker';
|
||||
picker.userData.object = object;
|
||||
helper.add(picker);
|
||||
}
|
||||
this.sceneHelpers.add(helper);
|
||||
this.helpers[object.id] = helper;
|
||||
this.signals.helperAdded.dispatch(helper);
|
||||
}
|
||||
},
|
||||
\end{lstlisting}
|
||||
\caption*{``Do expression'' transformation, taken from \texttt{three.js/editor/js/libs/codemirror/mode/javascript.js}}
|
||||
\end{figure}
|
||||
|
||||
\begin{figure}[H]
|
||||
\begin{figure}
|
||||
\begin{lstlisting}[language={JavaScript}]
|
||||
const panLeft = (function () {
|
||||
const v = new Vector3();
|
||||
|
@ -286,7 +177,7 @@ const panUp = do {
|
|||
\end{figure}
|
||||
|
||||
|
||||
\begin{figure}[H]
|
||||
\begin{figure}
|
||||
\begin{lstlisting}[language={JavaScript}]
|
||||
async loadAsync(url, onProgress) {
|
||||
const scope = this;
|
||||
|
|
|
@ -16,7 +16,7 @@
|
|||
|
||||
\HRule \\[0.5cm]
|
||||
\begin{Huge}
|
||||
\bfseries{Making a template query language for EcmaScript}\\[0.7cm] % Title of your document
|
||||
\bfseries{Implementing\\ Structural Search and Replace\\ for Showcasing ECMAScript\\ Language Proposals}\\[0.7cm] % Title of your document
|
||||
\end{Huge}
|
||||
\HRule \\[0.5cm]
|
||||
|
||||
|
@ -35,7 +35,7 @@
|
|||
% Logo for other faculties here: http://kapd.h.uib.no/profilmanual/99LastNed/99a_lastned.html
|
||||
%----------------------------------------------------------------------------------------
|
||||
|
||||
\centerline{\includegraphics[scale=1.9]{figures/canvasWithFaculty}}
|
||||
\centerline{\includegraphics[scale=1.8]{figures/canvasWithFaculty}}
|
||||
%\centerline{\includegraphics[scale=0.15]{figures/canvas}} %change for your faculty
|
||||
|
||||
%----------------------------------------------------------------------------------------
|
||||
|
|
|
@ -56,6 +56,8 @@
|
|||
language=Golang
|
||||
}
|
||||
|
||||
\usepackage{hyperref}
|
||||
|
||||
\newlength\glsdescwidth
|
||||
\setlength{\glsdescwidth}{1\hsize}
|
||||
|
||||
|
|
194
generators/refs.bib
Executable file → Normal file
194
generators/refs.bib
Executable file → Normal file
|
@ -1,16 +1,8 @@
|
|||
|
||||
|
||||
@misc{Proposal:DiscardBindings,
|
||||
title = {{proposal-discard-binding}},
|
||||
journal = {GitHub},
|
||||
year = {2024},
|
||||
month = apr,
|
||||
note = {[Online; accessed 25. Apr. 2024]},
|
||||
url = {https://github.com/tc39/proposal-discard-binding}
|
||||
}
|
||||
|
||||
@misc{Proposal:DoProposal,
|
||||
title = {{proposal-do-expressions}},
|
||||
title = {{Proposal ``Do Expression''}},
|
||||
journal = {GitHub},
|
||||
year = {2024},
|
||||
month = may,
|
||||
|
@ -28,7 +20,7 @@
|
|||
}
|
||||
|
||||
@misc{Babel,
|
||||
title = {{Babel {$\cdot$} Babel}},
|
||||
title = {{Babel}},
|
||||
year = {2024},
|
||||
month = may,
|
||||
note = {[Online; accessed 10. May 2024]},
|
||||
|
@ -110,26 +102,6 @@
|
|||
|
||||
|
||||
|
||||
@article{AST2,
|
||||
author = {Neamtiu, Iulian and Foster, Jeffrey S. and Hicks, Michael},
|
||||
title = {Understanding source code evolution using abstract syntax tree matching},
|
||||
year = {2005},
|
||||
issue_date = {July 2005},
|
||||
publisher = {Association for Computing Machinery},
|
||||
address = {New York, NY, USA},
|
||||
volume = {30},
|
||||
number = {4},
|
||||
issn = {0163-5948},
|
||||
url = {https://doi.org/10.1145/1082983.1083143},
|
||||
doi = {10.1145/1082983.1083143},
|
||||
abstract = {Mining software repositories at the source code level can provide a greater understanding of how software evolves. We present a tool for quickly comparing the source code of different versions of a C program. The approach is based on partial abstract syntax tree matching, and can track simple changes to global variables, types and functions. These changes can characterize aspects of software evolution useful for answering higher level questions. In particular, we consider how they could be used to inform the design of a dynamic software updating system. We report results based on measurements of various versions of popular open source programs. including BIND, OpenSSH, Apache, Vsftpd and the Linux kernel.},
|
||||
journal = {SIGSOFT Softw. Eng. Notes},
|
||||
month = {may},
|
||||
pages = {1–5},
|
||||
numpages = {5},
|
||||
keywords = {abstract syntax trees, software evolution, source code analysis}
|
||||
}
|
||||
|
||||
|
||||
|
||||
@article{RecursiveDescent,
|
||||
|
@ -153,7 +125,7 @@
|
|||
}
|
||||
|
||||
@misc{SpeedyWebCompiler,
|
||||
title = {{Rust-based platform for the Web {\textendash} SWC}},
|
||||
title = {{Speedy Web Compiler}},
|
||||
year = {2024},
|
||||
month = may,
|
||||
urldate = {2024-05-21},
|
||||
|
@ -162,7 +134,7 @@
|
|||
}
|
||||
|
||||
@misc{Pipeline,
|
||||
title = {{proposal-pipeline-operator}},
|
||||
title = {{``Pipeline'' Operator}},
|
||||
journal = {GitHub},
|
||||
year = {2024},
|
||||
month = may,
|
||||
|
@ -172,7 +144,7 @@
|
|||
}
|
||||
|
||||
@misc{AcornJS,
|
||||
title = {{acorn}},
|
||||
title = {{Acorn}},
|
||||
journal = {GitHub},
|
||||
year = {2024},
|
||||
month = may,
|
||||
|
@ -181,16 +153,6 @@
|
|||
url = {https://github.com/acornjs/acorn}
|
||||
}
|
||||
|
||||
@misc{JQuery,
|
||||
author = {{OpenJS Foundation - openjsf.org}},
|
||||
title = {{jQuery}},
|
||||
year = {2024},
|
||||
month = may,
|
||||
urldate = {2024-05-21},
|
||||
note = {[Online; accessed 21. May 2024]},
|
||||
url = {https://jquery.com}
|
||||
}
|
||||
|
||||
@inproceedings{ProgrammingLanguageEcolutionViaSourceCodeQueryLanguages,
|
||||
author = {Urma, Raoul-Gabriel and Mycroft, Alan},
|
||||
title = {Programming language evolution via source code query languages},
|
||||
|
@ -220,7 +182,7 @@
|
|||
|
||||
@misc{ApacheNetBeans,
|
||||
author = {{Apache NetBeans}},
|
||||
title = {{Welcome to Apache NetBeans}},
|
||||
title = {{Apache NetBeans}},
|
||||
year = {2024},
|
||||
month = feb,
|
||||
urldate = {2024-05-21},
|
||||
|
@ -229,7 +191,7 @@
|
|||
}
|
||||
|
||||
@misc{PMDXPath,
|
||||
title = {{Writing XPath rules {$\vert$} PMD Source Code Analyzer}},
|
||||
title = {{Writing XPath rules, PMD Source Code Analyzer}},
|
||||
year = {2024},
|
||||
month = apr,
|
||||
urldate = {2024-05-30},
|
||||
|
@ -247,7 +209,7 @@
|
|||
}
|
||||
|
||||
@misc{Atom,
|
||||
title = {{atom}},
|
||||
title = {{Atom}},
|
||||
year = {2024},
|
||||
month = may,
|
||||
urldate = {2024-05-23},
|
||||
|
@ -256,7 +218,7 @@
|
|||
}
|
||||
|
||||
@misc{Bootstrap,
|
||||
title = {{bootstrap}},
|
||||
title = {{Bootstrap}},
|
||||
journal = {GitHub},
|
||||
year = {2024},
|
||||
month = may,
|
||||
|
@ -266,7 +228,7 @@
|
|||
}
|
||||
|
||||
@misc{React,
|
||||
title = {{react}},
|
||||
title = {{React}},
|
||||
journal = {GitHub},
|
||||
year = {2024},
|
||||
month = may,
|
||||
|
@ -275,7 +237,7 @@
|
|||
url = {https://github.com/facebook/react}
|
||||
}
|
||||
@misc{NEXT.JS,
|
||||
title = {{next.js}},
|
||||
title = {{Next.js}},
|
||||
journal = {GitHub},
|
||||
year = {2024},
|
||||
month = may,
|
||||
|
@ -285,7 +247,7 @@
|
|||
}
|
||||
|
||||
@misc{ThreeJS,
|
||||
title = {{three.js}},
|
||||
title = {{Three.js}},
|
||||
journal = {GitHub},
|
||||
year = {2024},
|
||||
month = may,
|
||||
|
@ -294,7 +256,7 @@
|
|||
url = {https://github.com/mrdoob/three.js}
|
||||
}
|
||||
@misc{PipelineBikeshedding,
|
||||
title = {{Bikeshedding the Hack topic token {$\cdot$} Issue {\#}91 {$\cdot$} tc39/proposal-pipeline-operator}},
|
||||
title = {{Bikeshedding the Hack topic token}},
|
||||
journal = {GitHub},
|
||||
year = {2024},
|
||||
month = may,
|
||||
|
@ -314,7 +276,7 @@
|
|||
}
|
||||
|
||||
@misc{JuliaPipe,
|
||||
title = {{Functions {$\cdot$} The Julia Language}},
|
||||
title = {{Pipe Operator, The Julia Language}},
|
||||
year = {2024},
|
||||
month = may,
|
||||
urldate = {2024-05-24},
|
||||
|
@ -323,7 +285,7 @@
|
|||
}
|
||||
|
||||
@misc{ecma262,
|
||||
title = {{ECMAScript{\ifmmode\circledR\else\textregistered\fi} 2025 Language Specification}},
|
||||
title = {{ECMA-262 Language Specification}},
|
||||
year = {2024},
|
||||
month = may,
|
||||
urldate = {2024-05-28},
|
||||
|
@ -360,7 +322,7 @@
|
|||
}
|
||||
|
||||
@misc{BabelProposalSupport,
|
||||
title = {{proposals}},
|
||||
title = {{Babel Proposal Support}},
|
||||
journal = {GitHub},
|
||||
year = {2024},
|
||||
month = may,
|
||||
|
@ -369,7 +331,7 @@
|
|||
url = {https://github.com/babel/proposals}
|
||||
}
|
||||
@misc{BabelAST,
|
||||
title = {{babel/packages/babel-parser/ast/spec.md at main {$\cdot$} babel/babel}},
|
||||
title = {{Babel AST Specification}},
|
||||
journal = {GitHub},
|
||||
year = {2024},
|
||||
month = may,
|
||||
|
@ -387,7 +349,7 @@
|
|||
}
|
||||
|
||||
@misc{BabelSpecCompliant,
|
||||
title = {{What is Babel? {$\cdot$} Babel}},
|
||||
title = {{Babel Specification Compliant}},
|
||||
year = {2024},
|
||||
month = may,
|
||||
urldate = {2024-05-29},
|
||||
|
@ -396,7 +358,7 @@
|
|||
}
|
||||
|
||||
@misc{TC39RecommendBabel,
|
||||
title = {{how-we-work/implement.md at main {$\cdot$} tc39/how-we-work}},
|
||||
title = {{TC39 How We Work}},
|
||||
year = {2024},
|
||||
month = may,
|
||||
urldate = {2024-05-29},
|
||||
|
@ -488,7 +450,7 @@
|
|||
doi = {10.1109/SCAM.2007.31}
|
||||
}
|
||||
@misc{BabelAcornBased,
|
||||
title = {{@babel/parser {$\cdot$} Babel}},
|
||||
title = {{Babel Credits}},
|
||||
year = {2024},
|
||||
month = may,
|
||||
urldate = {2024-05-30},
|
||||
|
@ -531,7 +493,7 @@
|
|||
}
|
||||
|
||||
@misc{PipelineHistory,
|
||||
title = {{proposal-pipeline-operator/HISTORY.md at main {$\cdot$} tc39/proposal-pipeline-operator}},
|
||||
title = {{History of the ``Pipeline'' Operator in ECMAScript}},
|
||||
journal = {GitHub},
|
||||
year = {2024},
|
||||
month = may,
|
||||
|
@ -555,4 +517,116 @@
|
|||
urldate = {2024-06-01},
|
||||
note = {[Online; accessed 1. Jun. 2024]},
|
||||
url = {https://www.gnu.org/software/bash/manual/bash.html#Pipelines}
|
||||
}
|
||||
}
|
||||
|
||||
@misc{PMDXPathRule,
|
||||
title = {{XPath rule for PMD Source Code Analyzer}},
|
||||
year = {2024},
|
||||
month = apr,
|
||||
urldate = {2024-06-01},
|
||||
note = {[Online; accessed 1. Jun. 2024]},
|
||||
url = {https://docs.pmd-code.org/latest/pmd_userdocs_extending_your_first_rule.html}
|
||||
}
|
||||
|
||||
@article{AOP,
|
||||
author = {Lopes, Cristina and Kiczales, Gregor and Lamping, John and Mendhekar, Anurag and Maeda, Chris and Loingtier, Jean-marc and Irwin, John},
|
||||
year = {1999},
|
||||
month = {10},
|
||||
pages = {},
|
||||
title = {Aspect-Oriented Programming},
|
||||
volume = {28},
|
||||
journal = {ACM Computing Surveys},
|
||||
doi = {10.1145/242224.242420}
|
||||
}
|
||||
|
||||
@misc{LanguageWorkbenchMartinFowler,
|
||||
title = {{Language Workbenches: The Killer-App for Domain Specific Languages?}},
|
||||
journal = {martinfowler.com},
|
||||
year = {2024},
|
||||
month = may,
|
||||
urldate = {2024-06-01},
|
||||
note = {[Online; accessed 1. Jun. 2024]},
|
||||
url = {https://www.martinfowler.com/articles/languageWorkbench.html#DefiningALanguageWorkbench}
|
||||
}
|
||||
@inproceedings{LanguageWorkbenchMikhail,
|
||||
author = {Barash, Mikhail},
|
||||
title = {Vision: the next 700 language workbenches},
|
||||
year = {2021},
|
||||
isbn = {9781450391115},
|
||||
publisher = {Association for Computing Machinery},
|
||||
address = {New York, NY, USA},
|
||||
url = {https://doi.org/10.1145/3486608.3486907},
|
||||
doi = {10.1145/3486608.3486907},
|
||||
abstract = {Language workbenches (LWBs) are tools to define software languages together with tailored Integrated Development Environments for them. A comprehensive review of language workbenches by Erdweg et al. (Comput. Lang. Syst. Struct. 44, 2015) presented a feature model of functionality of LWBs from the point of view of "languages that can be defined with a LWB, and not the definition mechanism of the LWB itself". This vision paper discusses possible functionality of LWBs with regard to language definition mechanisms. We have identified five groups of such functionality, related to: metadefinitions, metamodifications, metaprocess, LWB itself, and programs written in languages defined in a LWB. We design one of the features ("ability to define dependencies between language concerns") based on our vision.},
|
||||
booktitle = {Proceedings of the 14th ACM SIGPLAN International Conference on Software Language Engineering},
|
||||
pages = {16–21},
|
||||
numpages = {6},
|
||||
keywords = {software languages, metaprogramming, algebraic specifications, Language workbenches},
|
||||
location = {Chicago, IL, USA},
|
||||
series = {SLE 2021}
|
||||
}
|
||||
@book{MarkusDSL,
|
||||
author = {Voelter, Markus and others},
|
||||
title = {{DSL Engineering: Designing, Implementing and Using Domain-Specific Languages}},
|
||||
year = {2013}
|
||||
}
|
||||
|
||||
@misc{PMDAnalyzer,
|
||||
title = {{PMD Source Code Analyzer}},
|
||||
year = {2024},
|
||||
month = jun,
|
||||
urldate = {2024-06-02},
|
||||
note = {[Online; accessed 2. Jun. 2024]},
|
||||
url = {https://pmd.github.io/pmd/index.html}
|
||||
}
|
||||
@book{SQL,
|
||||
author = {Alan Beaulieu},
|
||||
title = {Learning SQL},
|
||||
publisher = {O'Reilly Media, Inc.},
|
||||
year = {2005}
|
||||
}
|
||||
@inproceedings{EindhovenQuantifierNotation,
|
||||
author = {Backhouse, Roland and Michaelis, Diethard},
|
||||
year = {2006},
|
||||
month = {07},
|
||||
pages = {69-81},
|
||||
title = {Exercises in Quantifier Manipulation},
|
||||
isbn = {978-3-540-35631-8},
|
||||
doi = {10.1007/11783596_7}
|
||||
}
|
||||
@article{Datalog,
|
||||
title={What you always wanted to know about Datalog (and never dared to ask)},
|
||||
author={Ceri, Stefano and Gottlob, Georg and Tanca, Letizia and others},
|
||||
journal={IEEE transactions on knowledge and data engineering},
|
||||
volume={1},
|
||||
number={1},
|
||||
pages={146--166},
|
||||
year={1989},
|
||||
publisher={Citeseer}
|
||||
}
|
||||
@InProceedings{Predicates,
|
||||
author="Chambers, Craig",
|
||||
editor="Nierstrasz, Oscar M.",
|
||||
title="Predicate Classes",
|
||||
booktitle="ECOOP' 93 --- Object-Oriented Programming",
|
||||
year="1993",
|
||||
publisher="Springer Berlin Heidelberg",
|
||||
address="Berlin, Heidelberg",
|
||||
pages="268--296",
|
||||
abstract="Predicate classes are a new linguistic construct designed to complement normal classes in object-oriented languages. Like a normal class, a predicate class has a set of superclasses, methods, and instance variables. However, unlike a normal class, an object is automatically an instance of a predicate class whenever it satisfies a predicate expression associated with the predicate class. The predicate expression can test the value or state of the object, thus supporting a form of implicit property-based classification that augments the explicit type-based classification provided by normal classes. By associating methods with predicate classes, method lookup can depend not only on the dynamic class of an argument but also on its dynamic value or state. If an object is modified, the property-based classification of an object can change over time, implementing shifts in major behavior modes of the object. A version of predicate classes has been designed and implemented in the context of the Cecil language.",
|
||||
isbn="978-3-540-47910-9"
|
||||
}
|
||||
|
||||
|
||||
@article{MDE,
|
||||
title={Model-driven engineering},
|
||||
author={Schmidt, Douglas C and others},
|
||||
journal={Computer-IEEE Computer Society-},
|
||||
volume={39},
|
||||
number={2},
|
||||
pages={25},
|
||||
year={2006},
|
||||
publisher={Citeseer}
|
||||
}
|
||||
|
|