
\chapter{Related Work}

In this chapter, we present work related to this thesis: aspect-oriented programming, other query languages for source code, other code querying methods, and other JavaScript parsers.

\section{Aspect-Oriented Programming}

Aspect-oriented programming (AOP) is a programming paradigm that increases modularity by allowing a high degree of separation of concerns, focusing specifically on cross-cutting concerns.

Cross-cutting concerns are aspects of a software program or system that have an effect at multiple levels, cutting across the main functional requirements. Such aspects are often related to security, logging, or error handling, but could be any concern that is shared across an application.

In AOP, one creates an \textit{aspect}, a module that contains some cross-cutting concern the developer wants to implement. This can be logging, error handling, or any other concern not related to the original classes it should be applied to. An aspect contains \textit{advice}, the specific code executed when certain conditions of the program are met. Examples are \textit{before advice}, which is executed before a method executes; \textit{after advice}, which is executed after a method regardless of the method's outcome; and \textit{around advice}, which surrounds a method execution. The aspect also contains a \textit{pointcut}, the set of criteria determining when the advice is meant to be executed, for example at specific methods or when specific constructors are called.
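
The roles of advice and pointcut can be illustrated with a minimal sketch in Python. It is not tied to any particular AOP library, and the names \texttt{logging\_aspect}, \texttt{transfer}, and \texttt{events} are hypothetical. A decorator stands in for the aspect, and applying it to a function is a simplification of a pointcut, which real AOP frameworks typically declare separately from the target code.
\begin{lstlisting}
import functools

events = []  # hypothetical log that the aspect writes to


def logging_aspect(func):
    """Around advice: surrounds `func` with before and after behaviour."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        events.append(f"before {func.__name__}")     # before advice
        try:
            return func(*args, **kwargs)             # the original method
        finally:
            # after advice: runs whether the call returned or raised
            events.append(f"after {func.__name__}")
    return wrapper


# Applying the decorator selects where the advice runs; this is a
# simplified stand-in for a pointcut.
@logging_aspect
def transfer(amount):
    if amount < 0:
        raise ValueError("negative amount")
    return amount
\end{lstlisting}
The body of \texttt{transfer} contains no logging code, yet every call to it is logged, which is the separation of concerns that AOP aims for.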

Aspect-oriented programming is similar to this project in that defining where \textit{pointcuts} apply requires describing some structure of the program. The AOP library then has to search the executing code for events that trigger a pointcut and run the advice defined in the aspect of that pointcut. Essentially, it rewrites the code during execution to add functionality at multiple places in the executing code.

\section{Other source code query languages}

To allow for simple analysis and refactoring of code, there exist many query languages designed to query source code. These languages are based on different paradigms, such as logical queries, declarative queries, or SQL-like queries, but all provide the ability to query code. In this section, we will look at some of these languages and how they relate to \DSL developed in this thesis.

\subsection{CodeQL}

CodeQL~\cite{CodeQL} is an object-oriented query language, previously known as .QL. It is used to analyze code semantically to discover vulnerabilities~\cite{CodeQLStuff}. CodeQL takes inspiration from several areas of computer science~\cite{CodeQLStuff}, such as SQL, Datalog, the Eindhoven Quantifier Notation, and Classes are Predicates.

An example of a CodeQL query is shown below~\cite{CodeQL}. This query finds all classes that declare a method \texttt{equals} but not a method \texttt{hashCode}. The query ranges over a class \texttt{c} and states the properties that class must have in order to be reported as erroneous.
\begin{lstlisting}
from Class c
where c.declaresMethod("equals") and
    not(c.declaresMethod("hashCode")) and
    c.fromSource()
select c.getPackage(), c
\end{lstlisting}

The syntax of CodeQL queries is not similar to \DSL: CodeQL is SQL-like, while \DSL is based on declarative patterns, which makes the writing experience of the two languages very different. Writing a CodeQL query is similar to querying a database, while a query written in \DSL is similar to giving an example of the structure you wish to search for.
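
To illustrate the database-like style of such a query, the following sketch expresses the same selection in plain Python over a small, made-up collection of class facts. The \texttt{ClassInfo} record and its contents are assumptions made for illustration; they are not part of CodeQL.
\begin{lstlisting}
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassInfo:
    """Hypothetical record of facts extracted about one class."""
    package: str
    name: str
    methods: frozenset
    from_source: bool = True


# Made-up "database" of classes found in a code base.
classes = [
    ClassInfo("shop", "Order", frozenset({"equals", "hashCode"})),
    ClassInfo("shop", "Customer", frozenset({"equals"})),
    ClassInfo("lib", "Pair", frozenset({"equals"}), from_source=False),
]

# Mirrors: from Class c where ... select c.getPackage(), c
result = [
    (c.package, c.name)
    for c in classes
    if "equals" in c.methods
    and "hashCode" not in c.methods
    and c.from_source
]
\end{lstlisting}
Only \texttt{Customer} in package \texttt{shop} is selected. As in the CodeQL query, the author describes properties of rows in a data set rather than giving an example of the code to find.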

\subsection{PMD XPath}

PMD XPath is a language for querying Java source code that supports querying all Java constructs~\cite{ProgrammingLanguageEcolutionViaSourceCodeQueryLanguages}. It has this wide support because it constructs the AST of the entire codebase in XML format and then performs the query on the XML. The queries are written as XPath rules that define the matching on the XML. This makes the query language very versatile for static code analysis, and it is used in the PMD static code analysis tool.

\begin{lstlisting}
public class KeepingItSerious{
    Delegator bill; // FieldDeclaration

    public void method(){
        short bill; // LocalVariableDeclaration
    }
}
\end{lstlisting}

Two PMD XPath queries are defined in the example below~\cite{PMDXPath}. If we execute them on the code above, the first matches both the field declaration \texttt{Delegator bill} and the local variable declaration \texttt{short bill}, while the second only returns \texttt{short bill}. The second query limits the search because it also constrains the type of the declaration.
\begin{lstlisting}
//VariableId[@Name = "bill"]
//VariableId[@Name = "bill" and ../../Type[@TypeImage = "short"]]
\end{lstlisting}
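
The idea of rendering the AST as XML and querying it with XPath can be sketched with Python's standard \texttt{xml.etree.ElementTree} module. The XML below is a hand-written, simplified approximation of the AST for the class above, not actual PMD output. \texttt{ElementTree} only supports a subset of XPath, so the \texttt{../../Type} step of the second query is emulated with an explicit parent map.
\begin{lstlisting}
import xml.etree.ElementTree as ET

# Hand-written, simplified stand-in for the XML form of the AST.
AST_XML = """
<ClassDeclaration SimpleName="KeepingItSerious">
  <FieldDeclaration>
    <Type TypeImage="Delegator"/>
    <VariableDeclarator><VariableId Name="bill"/></VariableDeclarator>
  </FieldDeclaration>
  <MethodDeclaration Name="method">
    <LocalVariableDeclaration>
      <Type TypeImage="short"/>
      <VariableDeclarator><VariableId Name="bill"/></VariableDeclarator>
    </LocalVariableDeclaration>
  </MethodDeclaration>
</ClassDeclaration>
"""

root = ET.fromstring(AST_XML)

# ElementTree has no parent pointers, so build a child -> parent map.
parents = {child: parent for parent in root.iter() for child in parent}


def declaration_of(variable_id):
    """Emulates the '../..' step: two levels up from a VariableId."""
    return parents[parents[variable_id]]


# First query: every variable named "bill".
all_bill = root.findall(".//VariableId[@Name='bill']")

# Second query: additionally require a sibling Type of "short".
short_bill = [
    v for v in all_bill
    if declaration_of(v).find("Type[@TypeImage='short']") is not None
]
\end{lstlisting}
Here \texttt{all\_bill} holds two nodes, one below the field declaration and one below the local variable declaration, while \texttt{short\_bill} holds only the latter.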

Comparing this tool to \DSL, we can see it is good at querying code based on structure, which our tool also excels at. The main difference is the manner in which each tool does this. \DSL uses JavaScript templates to perform the query, which makes writing queries simple for users, as the templates are themselves JavaScript. PMD XPath uses XPath to define structural queries, which is quite verbose and requires extensive knowledge of the AST being queried.

\subsection{XSL Transformations}

XSLT~\cite{XSLT} is a language for transforming XML documents into other formats, such as other XML documents, HTML, or even plain text.

XSLT is part of the Extensible Stylesheet Language (XSL) family. A transformation in XSLT is expressed in the form of a stylesheet~\cite[1.1]{XSLT}, whose syntax is defined in XML. The language uses a template-based approach: templates match specific patterns in the source to find the sections to transform. Each transformation is defined by a declaration that describes how the output of the match should look.

XSLT defines matching and transformation very similarly to \DSL and uses the same technique: the transformation declaration describes how the output should look, not exactly how the transformation is performed.

\subsection{Jackpot}

Jackpot~\cite{Jackpot} is a query language created for the Apache NetBeans platform~\cite{ApacheNetBeans}. It has since mostly been renamed to the Java Declarative Hints Language, but we will continue to refer to it as Jackpot in this section. The language uses declarative patterns to define source code queries, and these queries are used in conjunction with multiple rewrite definitions. It is used in the Apache NetBeans suite of tools to allow for declarative refactoring of code.

This is quite similar to the form of \DSL, as both languages define a query using a similar structure. In Jackpot you define a \textit{pattern}, and every match of that pattern can be rewritten to a \textit{fix-pattern}; each fix-pattern can have a condition attached to it. This corresponds to the \textit{applicable to} and \textit{transform to} sections of \DSL. Jackpot also supports something similar to the wildcards in \DSL, as you can define variables in the \textit{pattern} definition and transfer them over to the \textit{fix-pattern} definition. These variables are closely related to wildcards in \DSL, though without type restrictions and without notation for matching more than one AST node.

The example below queries the code for \texttt{int} variable declarations with an initial value of 1 and changes them into declarations with an initial value of 0.
\begin{lstlisting}
"change declarations of 1 to declarations of 0":
    int $1 = 1;
=>  int $1 = 0
\end{lstlisting}
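
The mechanism of binding a variable in a pattern and reusing it in a fix-pattern can be sketched as follows. This is a toy matcher over nested Python tuples, not Jackpot's implementation; the tuple encoding of declarations and the function names are assumptions made for illustration.
\begin{lstlisting}
def match(pattern, node, bindings):
    """Match node against pattern; strings starting with '$' are variables."""
    if isinstance(pattern, str) and pattern.startswith("$"):
        if pattern in bindings:
            return bindings[pattern] == node   # repeated variable must agree
        bindings[pattern] = node               # first use binds the variable
        return True
    if isinstance(pattern, tuple) and isinstance(node, tuple):
        return len(pattern) == len(node) and all(
            match(p, n, bindings) for p, n in zip(pattern, node)
        )
    return pattern == node


def substitute(template, bindings):
    """Build the replacement by filling variables into the fix-pattern."""
    if isinstance(template, str) and template.startswith("$"):
        return bindings[template]
    if isinstance(template, tuple):
        return tuple(substitute(t, bindings) for t in template)
    return template


def rewrite(node, pattern, fix):
    """Replace every subtree matching pattern with the instantiated fix."""
    bindings = {}
    if match(pattern, node, bindings):
        return substitute(fix, bindings)
    if isinstance(node, tuple):
        return tuple(rewrite(child, pattern, fix) for child in node)
    return node


# Toy encoding of:  int $1 = 1;  =>  int $1 = 0
PATTERN = ("decl", "int", "$1", 1)
FIX = ("decl", "int", "$1", 0)

program = (
    "block",
    ("decl", "int", "a", 1),
    ("decl", "int", "b", 2),
    ("decl", "short", "c", 1),
)
result = rewrite(program, PATTERN, FIX)
\end{lstlisting}
Only the declaration of \texttt{a} is rewritten: \texttt{b} has a different initial value and \texttt{c} a different type, so neither matches the pattern.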


\section{JetBrains structural search}

JetBrains integrated development environments have a feature for structural search and replace~\cite{StructuralSearchAndReplaceJetbrains}. This feature is intended for large code bases where a developer wants to perform a search and replace based on syntax and semantics, not just a regular text-based search and replace. A search is applied to specific files of the codebase or to the entire codebase. By default it does not recursively check the entire static structure of the code, but this can be enabled in the user interface of structural search and replace.

When doing structural search in JetBrains IntelliJ IDEA, templates are used to describe the query. These templates use variables written as \texttt{\$variable\$}, which allow context to be transferred to the structural replace.

The tool is interactive: each match is shown in the find tool, and the developer can decide which matches to apply the replace template to. This helps avoid errors and allows a stricter search that is verified by a human. If the developer prefers, they can skip verifying each match and replace everything at once.

When comparing this tool to \DSL and its corresponding program, there are some similarities. Both are template based, meaning a search uses a template to define the query. Both kinds of template contain variables/wildcards in order to match against a free section, and the replacing structure is also a template based upon those same variables. Structural search and replace also offers a way of controlling how variables/wildcards match: one can define the number of nodes a variable matches against, similar to the \texttt{+} operator used in \DSL. A core difference between \DSL and structural search and replace is the variable type system. When performing a match and transformation in \DSL, types are used extensively to limit what the wildcards match, while this limitation is not possible in structural search and replace.


\section{Other JavaScript parsers}

This section will explore other JavaScript parsers that could have been used in this project. We will give a brief introduction of each of them, and discuss why they were not chosen.

\subsection*{Speedy Web Compiler}

Speedy Web Compiler (SWC)~\cite{SpeedyWebCompiler} is a library for fast parsing of JavaScript and dialects such as JSX and TypeScript. It is written in Rust, advertises faster speeds than Babel, and is used by large organizations creating applications and tooling for the web platform.

Similar to Babel~\cite{Babel}, Speedy Web Compiler is an extensible parser that allows for changing the specification of the parsed program. Its extensions are written in Rust. While its plugin system is not as mature as Babel's, its focus on speed makes it widely used for large-scale web projects.

Speedy Web Compiler supports several features out of the box: compilation, used for TypeScript and other languages that are compiled down to JavaScript; bundling, which takes multiple JavaScript/TypeScript files and bundles them into a single output file while handling naming collisions; minification, which makes the bundle size of a project smaller; transformation for use with WebAssembly; and custom plugins that change the specification of the languages parsed by SWC.

Compared to Babel, which is used in this thesis, SWC focuses on speed, as its main selling point is a faster way of developing web projects. This parser was considered for this project, but it had some shortcomings that made us decide it was not a good fit. SWC is written in Rust, whereas this project is targeted at TC39, so we wanted the tooling to be written in JavaScript or one of JavaScript's dialects. SWC also does not have as extensive a library of early-stage proposal plugins as Babel. This by itself is the deal breaker for use in this project, as we rely on support for proposals as early as stage one.

\subsection*{Acorn}

Acorn~\cite{AcornJS} is a parser written in JavaScript for parsing JavaScript and its related languages. Acorn focuses on plugin support, allowing its internal parser to be extended and redefined. It aims to be a small and fast JavaScript parser and has its own tree traversal library, Acorn Walk. Babel was originally a fork of Acorn; while Babel has since had a full rewrite, it is still heavily based on Acorn and Acorn-jsx~\cite{BabelAcornBased}.

Acorn suffers from a similar problem to SWC when considered for use in this project. It does not have as large a community as Babel, and it does not have the same recommendation from TC39 as Babel does~\cite{TC39RecommendBabel}. Even though it supports plugins and its plugin system is powerful, there are not as many pre-made plugins for early-stage proposals as Babel has.


\section{Model-to-Model transformations}