Halfway done with feedback ch4

Rolf Martin Glomsrud 2024-05-27 15:15:16 +02:00
parent 133b2807f2
commit 281893a474
4 changed files with 99 additions and 68 deletions


@@ -0,0 +1 @@
In the chapter on parsing of the wildcards: what do you mean by the exclamation mark of repetition?

Binary file not shown.


@@ -10,18 +10,18 @@ In this tool, there exist two ways to define a proposal, and each prov
In the architecture diagram, circular nodes show data passed into the program sections, and square nodes are specific sections of the program.
\begin{description}
\item[\DSL Code] is the raw text definition of proposals
\item[Self-Hosted Object] is the self-hosted version in \DSLSH format
\item[1. Langium Parser] takes raw \DSL source code and parses it into an AST
\item[2. Pre-parser] extracts the wildcards from the raw template definitions
\item[1. Prelude-builder] translates the JavaScript prelude into an array of wildcard strings
\item[3. Babel] parses the templates and the user's source code into an AST
\item[4. Custom Tree Builder] translates the Babel AST structure into our tree structure
\item[5. Matcher] finds matches with the \texttt{applicable to} template in the user's code
\item[6. Transformer] applies the transformation defined in the \texttt{transform to} template to each match in the user's AST
\item[7. Generator] generates source code from the transformed user AST
\end{description}
\begin{figure}[H]
@@ -30,17 +30,17 @@ In the architecture diagram, circular nodes show data passed into the program se
roundnode/.style={ellipse, draw=red!60, fill=red!5, very thick, minimum size=7mm},
squarednode/.style={rectangle, draw=red!60, fill=red!5, very thick, minimum size=5mm}
]
\node[squarednode] (preParser) {2. Pre-parser};
\node[squarednode] (preludebuilder) [above right=of preParser] {1. Prelude Builder};
\node[roundnode] (selfhostedjsoninput) [above=of preludebuilder] {Self-Hosted Object};
\node[squarednode] (langium) [above left=of preParser] {1. Langium Parser};
\node[roundnode] (jstqlcode) [above=of langium] {JSTQL Code};
\node[squarednode] (babel) [below=of preParser] {3. Babel};
\node[roundnode] (usercode) [left=of babel] {User source code};
\node[squarednode] (treebuilder) [below=of babel] {4. Custom Tree builder};
\node[squarednode] (matcher) [below=of treebuilder] {5. Matcher};
\node[squarednode] (transformer) [below=of matcher] {6. Transformer};
\node[squarednode] (joiner) [below=of transformer] {7. Generator};
\draw[->] (jstqlcode.south) -- (langium.north);
@@ -67,19 +67,21 @@ In this section, the implementation of the parser for \DSL will be described. Th
\subsection{Langium}
Langium \cite{Langium} is a language workbench primarily used to create parsers and Integrated Development Environments for domain-specific languages. These kinds of parsers produce Abstract Syntax Trees that are later used to create interpreters or other tooling. In this project, we use Langium to generate an AST definition in the form of TypeScript objects. These objects and their structure are used as definitions for the tool to do matching and transformation of user code.
In order to generate this parser, Langium requires the definition of a grammar. A grammar is a specification that describes a valid program. The \DSL grammar describes the structure of \DSL, such as \texttt{proposals}, \texttt{cases}, \texttt{applicable to}, and \texttt{transform to}. A grammar in Langium starts by describing the \texttt{Model}. The model is the top entry of the grammar; it is where all valid top-level statements are described.
Contained within the \texttt{Model} rule are one or more proposals. Each proposal is defined with the rule \texttt{Proposal}, and starts with the keyword \texttt{proposal}, followed by a name and a code block. This rule is designed to contain every definition of a transformation related to a specific proposal. To hold every transformation definition, a proposal definition contains one or more cases.
The \texttt{Case} rule is created to contain a single transformation. Each case starts with the keyword \texttt{case}, followed by a name for the current case, then a block for that case's fields. Cases are designed in this way to separate different transformation definitions within a proposal. Each case contains a single definition used to match against user code, and a definition used to transform a match.
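To make this concrete, the sketch below shows what a complete proposal definition could look like. The proposal name, case name, and transformation are purely illustrative, and the wildcards use the \texttt{<<name: type expression>>} syntax described in \ref{sec:DSL_DEF}.
\begin{lstlisting}
proposal ExampleProposal {
    case callToApply {
        applicable to {
            "<<fn: Identifier>>(<<arg: Identifier || Literal>>);"
        }
        transform to {
            "<<fn>>.call(undefined, <<arg>>);"
        }
    }
}
\end{lstlisting}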
The \texttt{ApplicableTo} rule is designed to hold a single template used for matching. It starts with the keywords \texttt{applicable} and \texttt{to}, followed by a block designed to hold the matching template definition. The template is defined as the terminal \texttt{STRING}, and is parsed as a raw string of characters by Langium \cite{Langium}.
The \texttt{TransformTo} rule is created to contain a single template used for transforming a match. It starts with the keywords \texttt{transform} and \texttt{to}, followed by a block that holds the transformation definition. This transformation definition is declared with the terminal \texttt{STRING}, and is parsed as a string of characters, same as the template in \texttt{applicable to}.
In order to define exactly what characters/tokens are legal in a specific definition, Langium uses terminals defined using regular expressions; these allow only a very specific character set to be legal in specific keys of the AST produced by the generated parser. In the definitions of \texttt{Proposal} and \texttt{Case}, the terminal \texttt{ID} is used; this terminal allows only words, which must begin with a letter of the alphabet or an underscore. In \texttt{ApplicableTo} and \texttt{TransformTo}, the terminal \texttt{STRING} is used; this terminal is meant to allow any valid JavaScript code as well as the custom DSL language described in \ref{sec:DSL_DEF}. These terminal definitions allow Langium to determine exactly which characters are legal in each location.
\begin{lstlisting}[caption={Definition of \DSL in Langium.}, label={def:JSTQLLangium}]
grammar Jstql
entry Model:
@@ -93,14 +95,14 @@ Proposal:
Case:
    "case" name=ID "{"
        aplTo=ApplicableTo
        traTo=TransformTo
    "}";
ApplicableTo:
    "applicable" "to" "{"
        apl_to_code=STRING
    "}";
TransformTo:
    "transform" "to" "{"
        transform_to_code=STRING
    "}";
@@ -109,16 +111,16 @@ terminal ID: /[_a-zA-Z][\w_]*/;
terminal STRING: /"[^"]*"|'[^']*'/;
\end{lstlisting}
In the case of \DSL, we are not implementing a programming language meant to be executed. We are using Langium to generate an AST that will be used as a markup language, similar to YAML, JSON or TOML \cite{TOML}. The main reason for using Langium in such an unconventional way is that it provides support for Visual Studio Code integration, and it spares us from parsing the definition of each proposal manually. However, with only the grammar, we cannot verify that the wildcards placed in \texttt{apl\_to\_code} and \texttt{transform\_to\_code} are correctly written. This is done using a feature of Langium called a \texttt{Validator}.
\subsection*{Langium Validator}
A Langium validator allows for further checks on \DSL code; it enables the implementation of specific checks on specific parts of the grammar.
\DSL does not allow empty typed wildcard definitions in \texttt{applicable to} blocks; this means a wildcard cannot be untyped or allow any AST type to match against it. This cannot be verified within the grammar, as there the code is simply defined as a \texttt{STRING} terminal, so further checks have to be implemented in code. To do this, we implement a specific \texttt{Validator} on the \texttt{Case} definition of the grammar. Every time anything contained within a \texttt{Case} is updated, the language server created with Langium performs the validation step and reports any errors.
The validator uses \texttt{Case} as its entry point, as this allows for checking wildcards in both \texttt{applicable to} and \texttt{transform to}, and in particular for checking whether a wildcard identifier used in \texttt{transform to} exists in the definition of \texttt{applicable to}.
\begin{lstlisting}[language={JavaScript}]
export class JstqlValidator {
@@ -144,65 +146,74 @@ export class JstqlValidator {
\subsection*{Using Langium as a parser}
Langium \cite{Langium} is designed to automatically generate extensive tool support for the language specified by its grammar. In our case, however, we have to parse the \DSL definition using Langium and then extract the generated abstract syntax tree in order to use the information it contains.
To use the parser generated by Langium, we created a custom function \texttt{parseDSLtoAST}, which takes a string as input (the raw \DSL code) and outputs the pure AST using the format described in the grammar, see Listing \ref{def:JSTQLLangium}. This function is exposed as a custom API for our tool to interface with. This also means our tool depends on the implementation of the Langium parser to function with \DSL; the implementation of \DSLSH is entirely independent.
When interfacing with the Langium parser to get the Langium-generated AST, the exposed API function is imported into the tool. When this API is executed, the output is in the form of the Langium \texttt{Model}, which follows the same structure as the grammar. This is then transformed into an internal object structure used by the tool, called \texttt{TransformRecipe}, which is passed on to perform the actual transformation.
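A minimal sketch of this interfacing code is shown below; the package name and the field access paths on the Langium \texttt{Model} are assumptions made for illustration, and the real \texttt{TransformRecipe} may carry more information.
\begin{lstlisting}[language={JavaScript}]
// Hypothetical import path for the exposed Langium API.
import { parseDSLtoAST } from "jstql-language";

// Assumed minimal shape of a recipe; field names follow the grammar.
interface TransformRecipe {
    applicableTo: string; // raw template from apl_to_code
    transformTo: string; // raw template from transform_to_code
}

function toTransformRecipes(source: string): TransformRecipe[] {
    // The Model mirrors the grammar: proposals containing cases.
    const model = parseDSLtoAST(source);
    return model.proposals.flatMap((proposal: any) =>
        proposal.cases.map((c: any) => ({
            applicableTo: c.aplTo.apl_to_code,
            transformTo: c.traTo.transform_to_code,
        }))
    );
}
\end{lstlisting}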
\section{Wildcard extraction and parsing}
In order to refer to internal DSL variables defined in the \texttt{applicable to} and \texttt{transform to} blocks of the transformation, we need to extract this information from the template definitions and pass it on to the matcher.
\subsection*{Why not use Langium for wildcard parsing?}
Langium has support for creating a generator to output an artifact, which is some transformation applied to the AST built by the Langium parser. This suits the needs of \DSL quite well and could be used to extract the wildcards from each \texttt{case} and create the \texttt{TransformRecipe}. This is the official way the developers of Langium intend this kind of functionality to be implemented. However, the implementation would still be mostly the same, as the parsing of the wildcards still has to be done ``manually'' in code. Therefore, it was decided for this project to keep the parsing of the wildcards within the tool itself. If we were to use Langium generators to parse the wildcards, \DSLSH would no longer be entirely independent, and the entire tool would rely on Langium. This is not preferred, as it would mean both ways of defining a proposal are reliant on Langium. The reason for using our own extractor is to allow for an independent way to define transformations using our tool.
\subsection*{Extracting wildcards from \DSL}
In order to allow the use of Babel \cite{Babel}, the wildcards present in the \texttt{applicable to} and \texttt{transform to} blocks have to be parsed and replaced with some valid JavaScript. This is done by a pre-parser that extracts the information from the wildcards and inserts an \texttt{Identifier} in their place.
To extract the wildcards from the template, we look at each character in the template. When the start token of a wildcard, denoted by \texttt{<<}, is discovered, everything up to the closing token, denoted by \texttt{>>}, is treated as an internal DSL variable and stored by the tool. A variable \texttt{flag} is used (see Listing \ref{lst:extractWildcard}); while \texttt{flag} is false, we are not inside a wildcard block, and each character is passed through to the variable \texttt{cleanedJS}. While \texttt{flag} is true, we are inside a wildcard block and collect every character of the wildcard into \texttt{temp}. Once we hit the closing token, having consumed the entire wildcard, the contents of \texttt{temp} are passed to a tokenizer, and the resulting tokens are parsed by a recursive descent parser.
Once the wildcard is parsed, and we know it is a valid wildcard, we insert an identifier into the JavaScript template where the wildcard resided. This allows for easier identification of wildcards when performing matching and transformation, as we can check whether an identifier in the code is the identifier of a wildcard. This, however, introduces the problem of collisions between the inserted wildcard identifiers and identifiers present in the user's code. To avoid this, the tool adds \texttt{\_\-\-\_} at the beginning of every identifier inserted in place of a wildcard, as can be seen in Listing \ref{lst:extractWildcard}. This makes it easy to recognize whether an identifier is a wildcard, and avoids collisions where a variable in the user code has the same name as a wildcard inserted into the template.
\begin{lstlisting}[language={JavaScript}, caption={Extracting wildcard from template.}, label={lst:extractWildcard}]
export function parseInternal(code: string): InternalParseResult {
    // Scanning state; the Wildcard type name is assumed from usage.
    let cleanedJS = "";
    let temp = "";
    let flag = false;
    let prelude: Wildcard[] = [];
    for (let i = 0; i < code.length; i++) {
        if (code[i] === "<" && code[i + 1] === "<") {
            // From now on we are inside of the DSL custom block
            flag = true;
            i += 1;
            continue;
        }
        if (flag && code[i] === ">" && code[i + 1] === ">") {
            // We encountered a closing tag
            flag = false;
            try {
                let wildcard = new WildcardParser(
                    new WildcardTokenizer(temp).tokenize()
                ).parse();
                // Insert a collision-avoiding identifier for the wildcard
                cleanedJS += collisionAvoider(wildcard.identifier.name);
                prelude.push(wildcard);
                i += 1;
                temp = "";
                continue;
            } catch (e) {
                // We probably encountered a bitshift operator,
                // append temp to cleanedJS
            }
        }
        if (flag) {
            temp += code[i];
        } else {
            cleanedJS += code[i];
        }
    }
    return { prelude, cleanedJS };
}
\end{lstlisting}
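To illustrate the effect of this extraction, consider the following hypothetical template and its cleaned counterpart; the exact collision-avoidance prefix shown is illustrative.
\begin{lstlisting}[language={JavaScript}]
// Template as written in an applicable to block:
//   console.log(<<arg: Identifier || Literal>>);
// After extraction, the wildcard has been replaced by a prefixed
// identifier, leaving valid JavaScript that Babel can parse:
console.log(__arg);
// The parsed wildcard (identifier "arg" and its type expression)
// is stored in the prelude returned by parseInternal.
\end{lstlisting}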
\paragraph*{Parsing wildcard}
Once a wildcard has been extracted from definitions inside \DSL, it has to be parsed into a simple tree that is used when matching AST nodes against the wildcard. This is accomplished by using a simple tokenizer and a recursive descent parser \cite{RecursiveDescent}.
Our tokenizer takes the raw stream of input characters extracted from the wildcard block within the template and determines which part is which token. Due to the very simple nature of the type expressions, there is no ambiguity in the tokens, so determining which token comes at which position is trivial. We use a switch statement on the current character: if the token is of length one, we accept it and move on to the next character; if the next character is unexpected, we produce an error. The tokenizer also groups tokens by a \textit{token type}, which allows for simpler parsing of the tokens later.
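A minimal sketch of such a tokenizer, covering only the type expression part of a wildcard, is given below; the token type names and the exact character set are assumptions for illustration, not the tool's actual implementation.
\begin{lstlisting}[language={JavaScript}]
type TokenType = "Identifier" | "And" | "Or" | "Not" | "LParen" | "RParen";
interface Token { type: TokenType; value: string; }

function tokenize(input: string): Token[] {
    const tokens: Token[] = [];
    for (let i = 0; i < input.length; i++) {
        const c = input[i];
        switch (c) {
            case " ": break; // skip whitespace
            case "(": tokens.push({ type: "LParen", value: c }); break;
            case ")": tokens.push({ type: "RParen", value: c }); break;
            case "!": tokens.push({ type: "Not", value: c }); break;
            case "&":
            case "|": {
                // Two-character operators "&&" and "||"
                if (input[i + 1] !== c) throw new Error("Unexpected " + c);
                tokens.push({ type: c === "&" ? "And" : "Or", value: c + c });
                i += 1;
                break;
            }
            default: {
                // Identifiers, i.e. AST type names such as "CallExpression"
                if (!/[A-Za-z_]/.test(c)) throw new Error("Unexpected " + c);
                let word = c;
                while (i + 1 < input.length && /\w/.test(input[i + 1])) {
                    word += input[++i];
                }
                tokens.push({ type: "Identifier", value: word });
            }
        }
    }
    return tokens;
}
\end{lstlisting}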
A recursive descent parser is created to closely mimic the grammar of the language it parses: we define a function for handling each of the non-terminals, together with ways to determine which non-terminal each token type results in. The type expression language is a very simple Boolean expression language, which makes parsing straightforward.
\begin{lstlisting}[caption={Grammar of type expressions}, label={ex:grammarTypeExpr}]
Wildcard:
@@ -232,15 +243,15 @@ GroupExpr:
The grammar of the type expressions used by the wildcards can be seen in \figFull[ex:grammarTypeExpr]. The grammar is written in a notation similar to Extended Backus-Naur form, where we define the terminals and non-terminals in a way that makes the entire grammar \textit{solvable} by the recursive descent parser.
Our recursive descent parser produces an AST \cite{AST1,AST2}, which is later used to determine when a wildcard can be matched against a specific AST node; the full definition of this AST can be seen in Appendix \ref{ex:typeExpressionTypes}. We use this AST by traversing it with a visitor pattern \cite{VisitorPattern}, comparing each \texttt{Identifier} against the specific AST node we are currently checking, and evaluating all subsequent expressions to produce a boolean value. If this value is true, the node is matched against the wildcard; if not, we do not have a match.
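As a sketch of how this evaluation can work, the snippet below assumes a simplified shape for the type expression AST and reduces matching an \texttt{Identifier} to comparing its name against the type of the candidate AST node.
\begin{lstlisting}[language={JavaScript}]
// Assumed simplified shape of the type expression AST.
type TypeExpr =
    | { kind: "Identifier"; name: string }
    | { kind: "UnaryExpr"; op: "!"; expr: TypeExpr }
    | { kind: "BinaryExpr"; op: "&&" | "||"; left: TypeExpr; right: TypeExpr };

// Evaluate a type expression against the type of a candidate AST node.
function evaluate(expr: TypeExpr, nodeType: string): boolean {
    switch (expr.kind) {
        case "Identifier":
            return expr.name === nodeType; // e.g. "CallExpression"
        case "UnaryExpr":
            return !evaluate(expr.expr, nodeType);
        case "BinaryExpr":
            return expr.op === "&&"
                ? evaluate(expr.left, nodeType) && evaluate(expr.right, nodeType)
                : evaluate(expr.left, nodeType) || evaluate(expr.right, nodeType);
    }
}

// A wildcard typed "Identifier || MemberExpression" matches an Identifier node:
// evaluate(parsedExpr, "Identifier") === true
\end{lstlisting}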
\paragraph*{Extracting wildcards from \DSLSH}
The self-hosted version \DSLSH also requires some form of pre-parsing in order to prepare the internal DSL environment. Compared to \DSL, this step is relatively minor, as the wildcards are parsed directly with no insertion.
In order to use JavaScript as the meta language, we define a \texttt{prelude} on the object used to define the transformation case. This prelude is required to consist of several variable declaration statements, where the variable names are used as the internal DSL variables, and the right-hand side expressions are strings containing the type expression used to determine a match for that specific wildcard.
We use Babel to generate the AST of the \texttt{prelude} definition, which gives us a JavaScript object structure. Since the structure is very strictly defined, we can expect every \texttt{stmt} of \texttt{stmts} to be a variable declaration, and otherwise throw an error for an invalid prelude. The string value of each variable declaration is then passed to the same parser used for the \DSL wildcards.
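As an illustration, a self-hosted case definition could look like the sketch below; the field names of the surrounding object are assumptions based on the description above.
\begin{lstlisting}[language={JavaScript}]
// Hypothetical self-hosted case definition. The prelude declares the
// wildcards: each variable name is a DSL variable, each string a type
// expression, and the templates refer to the wildcards by plain name.
const example = {
    prelude: `
        let fn = "Identifier || MemberExpression";
        let arg = "Expression";
    `,
    applicableTo: "fn(arg);",
    transformTo: "fn.call(undefined, arg);",
};
\end{lstlisting}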
@@ -251,7 +262,7 @@ The reason this is preferred is it allows us to avoid having to extract the wild
Allowing the tool to perform transformations of code requires the generation of an abstract syntax tree from the user's code and from the \texttt{applicable to} and \texttt{transform to} templates. This means parsing JavaScript into an AST; to do this, we use Babel \cite{Babel}.
The most important reason for choosing Babel to generate the ASTs used for transformation is the JavaScript community surrounding it. As this tool deals with proposals before they are part of JavaScript, a parser that supports early-stage proposals is required. Babel works closely with TC39 to support experimental syntax \cite{BabelProposalSupport} through its plugin system, which allows the parsing of code that is not yet part of the language.
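For instance, with the \texttt{@babel/parser} package, an experimental proposal plugin can be enabled when parsing; the ``do expressions'' plugin below is only an example of the mechanism.
\begin{lstlisting}[language={JavaScript}]
import { parse } from "@babel/parser";

// Enable the "do expressions" proposal plugin so that the
// experimental syntax parses into a Babel AST.
const ast = parse("let x = do { if (cond) { 1 } else { 2 } };", {
    sourceType: "module",
    plugins: ["doExpressions"],
});
\end{lstlisting}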
\subsection*{Custom Tree Structure}


@@ -398,4 +398,23 @@
urldate = {2024-05-26},
note = {[Online; accessed 26. May 2024]},
url = {https://tc39.es}
}
@misc{TOML,
title = {{TOML: Tom's Obvious Minimal Language}},
year = {2024},
month = may,
urldate = {2024-05-27},
note = {[Online; accessed 27. May 2024]},
url = {https://toml.io/en}
}
@misc{BabelProposalSupport,
title = {{proposals}},
howpublished = {GitHub},
year = {2024},
month = may,
urldate = {2024-05-27},
note = {[Online; accessed 27. May 2024]},
url = {https://github.com/babel/proposals}
}