a

2024-05-21 16:25:45 +02:00 · 2024-05-21 16:25:45 +02:00 · 741cdcd658
commit 741cdcd658
parent 1b7370b37f
4 changed files with 50 additions and 23 deletions
--- a/build/report.pdf
+++ b/build/report.pdf
--- a/chapter/ch4.tex
+++ b/chapter/ch4.tex
@@ -148,41 +148,34 @@ Langium has support for creating a generator for generating an artifact, this ac

 In order to allow the use of \cite[Babel]{Babel}, the wildcards present in the blocks of \texttt{applicable to} and \texttt{transform to} have to be parsed and replaced with some valid JavaScript. This is done by using a pre-parser that extracts the information from the wildcards and inserts an \texttt{Identifier} in their place. 

- To pre-parse the text, we look at each and every character in the code section, when a start token of a wildcard is discovered, which is denoted by \texttt{<<}, everything after that until the closing token, which is denoted by \texttt{>>}, is then treated as an internal DSL variable and will be stored by the tool. A variable \texttt{flag} is used, so when the value of flag is false, we know we are currently not inside a wildcard block, this allows us to just pass the character through to the variable \texttt{cleanedJS}. When \texttt{flag} is true, we know we are currently inside a wildcard block and we collect every character of the wildcard block into \texttt{temp}. Once we hit the end of the wildcard block, when we have consumed the entirety of the wildcard, it is then passed to a tokenizer, then to a recursive descent parser. 
+ To pre-parse the text, we examine every character in the code section. When the start token of a wildcard, \texttt{<<}, is discovered, everything up to the closing token, \texttt{>>}, is treated as an internal DSL variable and stored by the tool. A variable \texttt{flag} tracks this state: while \texttt{flag} is false, we are not inside a wildcard block, so the character is passed straight through to the variable \texttt{cleanedJS}. While \texttt{flag} is true, we are inside a wildcard block and collect every character of the block into \texttt{temp}. Once the end of the wildcard block is reached and the entire wildcard has been consumed, it is passed to a tokenizer and then to a recursive descent parser.
+ 
+Once the wildcard has been parsed and we know it is valid, we insert an identifier into the JavaScript template where the wildcard would reside. This simplifies identifying wildcards during matching and transformation, since we can check whether an Identifier in the code is the same as the identifier of a wildcard. This does, however, introduce the problem of collisions between the inserted wildcard identifiers and identifiers present in the user's code. To avoid this, the tool adds \texttt{\_\-\-\_} at the beginning of every identifier inserted in place of a wildcard, which avoids collisions where a variable in the user's code has the same name as a wildcard inserted into the template. 

 \begin{lstlisting}[language={JavaScript}]
 export function parseInternal(code: string): InternalParseResult {
-    let cleanedJS = "";
-    let temp = "";
-    let flag = false;
-    let prelude: InternalDSLVariable = {};
-
-    for (let i = 0; i < code.length; i++) {
-        if (code[i] === "<" && code[i + 1] === "<") {
+     for (char of code) {
+        if (char === "<" && nextChar === "<") {
            // From now on we are inside of the DSL custom block
-            flag = true;
-            i += 1;
-            continue;
+            maybeInsideWildcard = true;
        }

        if (flag && code[i] === ">" && code[i + 1] === ">") {
            // We encountered a closing tag
            flag = false;

-            let { identifier, types } = parseInternalString(temp);
-
-            cleanedJS += identifier;
+            try{
+            let { identifier, types } = parseWildcard(temp);
+            // Add the new Identifier with collision avoiding characters
+            cleanedJS += collisionAvoider(identifier);

            prelude[identifier] = types;
-            i += 1;
-            temp = "";
            continue;
-        }

-        if (flag) {
-            temp += code[i];
-        } else {
-            cleanedJS += code[i];
+            }catch{
+                // Maybe encountered bitshift operator or other error
+            }
+
        }
    }
    return { prelude, cleanedJS };
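
Review note on the pre-parser hunk above: the new listing reads as simplified pseudocode (for example, `char`, `nextChar` and `maybeInsideWildcard` are never declared, while `flag` and `code[i]` are still used). The following is a minimal runnable sketch of the same idea, not the thesis implementation. Everything in it is an assumption made for illustration: the wildcard syntax `name: TypeA | TypeB`, the regex-based `parseWildcard` that stands in for the tokenizer and recursive descent parser, the shape of the result types, and the reading of `\_\-\-\_` as a two-underscore prefix.

```typescript
// Hypothetical result shape; the names mirror the listing, the fields are assumed.
type WildcardTypes = string[];
interface InternalParseResult {
    prelude: Record<string, WildcardTypes>;
    cleanedJS: string;
}

// Stand-in for the tokenizer + recursive descent parser.
// Assumed wildcard syntax: "name: TypeA | TypeB". Throws on anything else.
function parseWildcard(body: string): { identifier: string; types: WildcardTypes } {
    const match = /^\s*([A-Za-z_$][\w$]*)\s*:\s*(.+?)\s*$/.exec(body);
    if (match === null) {
        throw new Error(`Invalid wildcard: ${body}`);
    }
    const types = match[2].split("|").map((t) => t.trim());
    if (types.some((t) => t.length === 0)) {
        throw new Error(`Empty type in wildcard: ${body}`);
    }
    return { identifier: match[1], types };
}

// Assumed prefix: "\_\-\-\_" in the LaTeX source is read here as two underscores.
function collisionAvoider(identifier: string): string {
    return "__" + identifier;
}

function parseInternal(code: string): InternalParseResult {
    const prelude: Record<string, WildcardTypes> = {};
    let cleanedJS = "";
    let i = 0;
    while (i < code.length) {
        if (code.slice(i, i + 2) === "<<") {
            const end = code.indexOf(">>", i + 2);
            if (end !== -1) {
                try {
                    const { identifier, types } = parseWildcard(code.slice(i + 2, end));
                    prelude[identifier] = types;
                    // Replace the whole wildcard with a prefixed identifier.
                    cleanedJS += collisionAvoider(identifier);
                    i = end + 2;
                    continue;
                } catch {
                    // Not a valid wildcard (for example a bit-shift operator):
                    // fall through and copy the character unchanged.
                }
            }
        }
        cleanedJS += code[i];
        i += 1;
    }
    return { prelude, cleanedJS };
}
```

Under these assumptions, a span between `<<` and `>>` that fails to parse as a wildcard is copied through unchanged, which is one way to realise the "maybe encountered bitshift operator" case mentioned in the catch block of the listing.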
--- a/chapter/related_work.tex
+++ b/chapter/related_work.tex
@@ -21,8 +21,14 @@ Browse-By-Query is a language created for Java that analyses Java Bytecode files

 \subsection*{.QL}

-.QL is an object-oriented query language. It can be used to query with a similar style to SQL queries, and is used in the Semmle
-\subsection*{JQuery}
+.QL is an object-oriented query language. It supports querying a wide array of data structures, code being one of them \cite{ProgrammingLanguageEcolutionViaSourceCodeQueryLanguages}. The language has a commercial implementation, \textit{SemmleCode}, which comes with a full editor and various pre-defined code transformations that might be useful for the end developer. 
+
+
+\subsection*{PMD XPath}
+
+PMD is the most versatile of the query languages for Java source code explored in this section \cite{ProgrammingLanguageEcolutionViaSourceCodeQueryLanguages}. PMD supports querying all Java constructs; it has this wide support because it constructs the entire codebase in XML format. The language was built for static code analysis and is therefore well suited to performing queries on static code. It is mostly used as a tool for code editors to enforce programming styles. 
+
+

 \section*{JetBrains structural search}

--- a/generators/refs.bib
+++ b/generators/refs.bib
@@ -187,4 +187,32 @@
  urldate = {2024-05-21},
  note    = {[Online; accessed 21. May 2024]},
  url     = {https://github.com/acornjs/acorn}
+}
+
+@misc{JQuery,
+  author  = {{OpenJS Foundation - openjsf.org}},
+  title   = {{jQuery}},
+  year    = {2024},
+  month   = may,
+  urldate = {2024-05-21},
+  note    = {[Online; accessed 21. May 2024]},
+  url     = {https://jquery.com}
+}
+
+@inproceedings{ProgrammingLanguageEcolutionViaSourceCodeQueryLanguages,
+  author    = {Urma, Raoul-Gabriel and Mycroft, Alan},
+  title     = {Programming language evolution via source code query languages},
+  year      = {2012},
+  isbn      = {9781450316316},
+  publisher = {Association for Computing Machinery},
+  address   = {New York, NY, USA},
+  url       = {https://doi.org/10.1145/2414721.2414728},
+  doi       = {10.1145/2414721.2414728},
+  abstract  = {Programming languages evolve just like programs. Language features are added and removed, for example when programs using them are shown to be error-prone. When language features are modified, deprecated, removed or even deemed unsuitable for the project at hand, it is necessary to analyse programs to identify occurrences to refactor. Source code query languages in principle provide a good way to perform this analysis by exploring codebases. Such languages are often used to identify code to refactor, bugs to fix or simply to understand a system better. This paper evaluates seven Java source code query languages: Java Tools Language, Browse-By-Query, SOUL, JQuery, .QL, Jackpot and PMD as to their power at expressing queries required by several use cases (such as code idioms to be refactored).},
+  booktitle = {Proceedings of the ACM 4th Annual Workshop on Evaluation and Usability of Programming Languages and Tools},
+  pages     = {35–38},
+  numpages  = {4},
+  keywords  = {program analysis, query languages, source code},
+  location  = {Tucson, Arizona, USA},
+  series    = {PLATEAU '12}
 }