“[…] a set of Swift bindings for the libSyntax library. It allows for Swift tools to parse, inspect, generate, and transform Swift source code.”
Well, that seems to be exactly what I needed for CoherentSwift 🎉
It is used by the Swift compiler in its very first task and is responsible for generating an Abstract Syntax Tree (AST), which is then handed down to the later steps. I suggest you get familiar with how this process works by reading a little about the compiler architecture.
Among the tools that already use SwiftSyntax we find code formatters such as SwiftRewriter and swift-format, and others like Piranha, Uber’s own tool to refactor code related to stale flags. These tools have one thing in common: they modify your code.
I really suggest you take a look at it.
As seen above, most of the tools using SwiftSyntax modify code, which in my view is the biggest strength of the lib. These tools care about one snippet of code at a time: they go incrementally through pieces of TokenSyntax and act differently depending on their tokenKind. This is done through SyntaxRewriter.
This couldn’t be more straightforward! Let’s see…
- Visit a token;
- Given its kind, do something with it;
- Return the modified token.
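The three steps above can be sketched as follows. This is a minimal example, assuming SwiftSyntax 510 or later together with the SwiftParser module (releases from around the time of this post spell some of these APIs differently). The RenameFooRewriter class and the foo/bar names are hypothetical and only there for illustration.

```swift
import SwiftSyntax
import SwiftParser

/// Hypothetical rewriter: renames every identifier token `foo` to `bar`.
final class RenameFooRewriter: SyntaxRewriter {
    // Step 1: the rewriter hands us every token it visits.
    override func visit(_ token: TokenSyntax) -> TokenSyntax {
        // Step 2: given its kind, decide whether to act on it.
        guard case .identifier("foo") = token.tokenKind else {
            return token // not interesting, return it untouched
        }
        // Step 3: return the modified token (its trivia is kept as is).
        return token.with(\.tokenKind, .identifier("bar"))
    }
}

let source = "let foo = 1\nprint(foo)\n"
let tree = Parser.parse(source: source)
let rewritten = RenameFooRewriter().rewrite(tree)
print(rewritten.description)
```

Everything that is not an identifier named foo passes through unchanged, which is why a rewriter like this can be run safely over a whole file.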
SyntaxFactory is a set of high-level APIs for creating code. Anything, literally anything. It isn’t necessarily convenient, though, as you have to create the syntax piece by piece. To create a static property such as static let shared = Something(), we’d make at least 5 calls to the APIs.
It’s at best exhausting to let code write code.
You can check the available APIs by going through all the 5525(!) lines of SyntaxFactory’s code.
But… Hm… I don’t want to edit or write code 🤔
With CoherentSwift I want neither to create nor to edit code, and here SwiftSyntax is extremely helpful, just not as much as it could be.
Similar to Rewriter and Factory, we have SyntaxVisitor. It allows us to go through TokenSyntax (a.k.a. walk all nodes of the tree), and by overriding its visit(_:) methods we can parse and analyse a given TokenSyntax. The visit methods return a SyntaxVisitorContinueKind, an enumeration.
See this basically as a Bool: for every visit, should the visitor move forward and also pay a visit to the node’s children, or should it stop right there?
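To make the Bool analogy concrete, here is a small sketch, again assuming SwiftSyntax 510 or later with SwiftParser. The enumeration has two cases, .visitChildren and .skipChildren. The OuterClassVisitor class is a hypothetical example, not part of CoherentSwift.

```swift
import SwiftSyntax
import SwiftParser

/// Hypothetical visitor: records class names but never looks inside a class.
final class OuterClassVisitor: SyntaxVisitor {
    private(set) var names: [String] = []

    override func visit(_ node: ClassDeclSyntax) -> SyntaxVisitorContinueKind {
        names.append(node.name.text)
        // The "false" answer: stop here, so nested declarations are never visited.
        // Returning .visitChildren instead would also reach `Inner`.
        return .skipChildren
    }
}

let outerVisitor = OuterClassVisitor(viewMode: .sourceAccurate)
outerVisitor.walk(Parser.parse(source: "class Outer { class Inner {} }"))
print(outerVisitor.names) // ["Outer"]
```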
The AST has a syntax for everything, and the visitor has at least double that number of methods for them. Before going through a few specific examples, let’s cover the generic visit(_:) method.
This couldn’t be more generic: it goes through every node. If you want to parse them all in the same way, great; if you want to deal with each TokenKind differently, a switch statement is your friend.
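A sketch of that switch, assuming SwiftSyntax 510 or later with SwiftParser, where keywords are grouped under a single .keyword case. The TokenCounter class and its counters are hypothetical.

```swift
import SwiftSyntax
import SwiftParser

/// Hypothetical visitor: counts tokens by kind using the generic token visit.
final class TokenCounter: SyntaxVisitor {
    private(set) var identifiers = 0
    private(set) var keywords = 0
    private(set) var others = 0

    override func visit(_ token: TokenSyntax) -> SyntaxVisitorContinueKind {
        // One entry point for every token; the switch decides what to do.
        switch token.tokenKind {
        case .identifier:
            identifiers += 1
        case .keyword:
            keywords += 1
        default:
            others += 1 // punctuation, literals, the end-of-file token, ...
        }
        return .visitChildren
    }
}

let counter = TokenCounter(viewMode: .sourceAccurate)
counter.walk(Parser.parse(source: "let answer = 42"))
print(counter.identifiers, counter.keywords)
```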
However, we also have visit(_:) methods for specific syntax nodes. If we want to parse entire classes, we override the one that takes a class declaration.
This gives us easy access to the class syntax, with immediate properties such as
members and more.
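As a rough illustration, assuming SwiftSyntax 510 or later with SwiftParser: in those releases the class name is exposed as name and the member list as memberBlock.members, while older releases use other spellings. The ClassMemberVisitor class is hypothetical.

```swift
import SwiftSyntax
import SwiftParser

/// Hypothetical visitor: counts the immediate members of every class.
final class ClassMemberVisitor: SyntaxVisitor {
    private(set) var memberCounts: [String: Int] = [:]

    override func visit(_ node: ClassDeclSyntax) -> SyntaxVisitorContinueKind {
        // Only the direct members of the class, not their own children.
        memberCounts[node.name.text] = node.memberBlock.members.count
        return .visitChildren
    }
}

let classSource = """
class Cache {
    var items = 0
    func clear() {}
}
"""
let classVisitor = ClassMemberVisitor(viewMode: .sourceAccurate)
classVisitor.walk(Parser.parse(source: classSource))
print(classVisitor.memberCounts) // ["Cache": 2]
```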
But remember, all of this is abstract: these nodes contain more and more abstract syntaxes within them, and at some point we also reach the individual tokens.
Overriding multiple visits
I find it useful to override multiple visit(_:) methods, since I want to process each kind of node in a completely different way.
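A sketch of what overriding several visits can look like, assuming SwiftSyntax 510 or later with SwiftParser. The DeclarationCollector class and the fields it fills are hypothetical; CoherentSwift’s real visitor may be organised quite differently.

```swift
import SwiftSyntax
import SwiftParser

/// Hypothetical visitor: each kind of declaration is handled in its own override.
final class DeclarationCollector: SyntaxVisitor {
    private(set) var types: [String] = []
    private(set) var methods: [String] = []
    private(set) var staticProperties: [String] = []

    override func visit(_ node: StructDeclSyntax) -> SyntaxVisitorContinueKind {
        types.append(node.name.text)
        return .visitChildren // keep going to reach the members
    }

    override func visit(_ node: FunctionDeclSyntax) -> SyntaxVisitorContinueKind {
        methods.append(node.name.text)
        return .skipChildren // local declarations in the body are ignored here
    }

    override func visit(_ node: VariableDeclSyntax) -> SyntaxVisitorContinueKind {
        let isStatic = node.modifiers.contains { $0.name.tokenKind == .keyword(.static) }
        if isStatic {
            for binding in node.bindings {
                // Only simple `name = value` patterns are handled in this sketch.
                if let pattern = binding.pattern.as(IdentifierPatternSyntax.self) {
                    staticProperties.append(pattern.identifier.text)
                }
            }
        }
        return .skipChildren
    }
}

let settingsSource = """
struct Settings {
    static let shared = Settings()
    var volume = 5
    private func reset() {}
}
"""
let collector = DeclarationCollector(viewMode: .sourceAccurate)
collector.walk(Parser.parse(source: settingsSource))
print(collector.types, collector.methods, collector.staticProperties)
```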
I also want to keep a record of this tree so that I can measure the cohesion of the code. For that, I need to know a few specific things:
- Which high-level definitions (Struct, Class) I find in a file
- Which properties are members of these definitions
- Which methods are members of these definitions
- Which properties are members of these methods
- Which methods are private
- Which properties are static
- Which high-level definitions have extensions
This is enough for CoherentSwift to measure the cohesion for the given code.
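One possible shape for such a record, in plain Swift with no SwiftSyntax involved. CSMethod is the name used later in this post; CSProperty, CSDefinition and every field name below are assumptions made for illustration only, not CoherentSwift’s actual model.

```swift
/// Hypothetical record of a property.
struct CSProperty {
    let name: String
    let isStatic: Bool
}

/// Record of a method and the properties found inside it (fields are assumed).
struct CSMethod {
    let name: String
    let isPrivate: Bool
    var properties: [CSProperty] = []
}

/// Hypothetical record of a high-level definition found in a file.
struct CSDefinition {
    enum Kind { case structure, classType }

    let name: String
    let kind: Kind
    var properties: [CSProperty] = []
    var methods: [CSMethod] = []
    var hasExtensions = false
}

// Filling the record by hand; in practice the visitor would do this.
var definition = CSDefinition(name: "Settings", kind: .structure)
definition.properties.append(CSProperty(name: "shared", isStatic: true))
definition.properties.append(CSProperty(name: "volume", isStatic: false))
definition.methods.append(
    CSMethod(name: "reset", isPrivate: true,
             properties: [CSProperty(name: "volume", isStatic: false)])
)

let privateMethods = definition.methods.filter(\.isPrivate).map(\.name)
let staticProperties = definition.properties.filter(\.isStatic).map(\.name)
print(privateMethods, staticProperties)
```

Each item of the list above maps to one field, so the questions can be answered with simple filters once the tree has been walked.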
When cohesion is high, it means that the methods and variables of the class are co-dependent and hang together as a logical whole.
- Clean Code pg. 140
Struggles with Parsing
In my recent experience, it’s extremely easy to go through the high-level tokens nested within one another, but going down the rabbit hole trying to map an entire class at once turned out to be very error-prone. For this reason, I’m following two approaches at the moment:
Map immediate members of a definition within the same visit.
For this purpose, I’ve created a factory that gives back the expected properties for a given node.
Expect another visit to a different syntax (e.g. for a method).
Here, the factory also processes high-level tokens looking for properties, and returns our very own CSMethod after assigning the properties it found.
The downside of the latter approach is that, with all the abstract syntaxes, climbing up the tree is as error-prone as climbing down it, so I had to keep track of the currentDefinition being processed, if any.
If you’ve seen enough of visit(_:), how about visitPost(_:)?
You read that right: for every visit there is a visitPost. A post is called after a visit has been paid to the given syntax and all its descendants, and therefore it doesn’t return any value. It’s very useful for post-processing.
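One use for visitPost is resetting the currentDefinition mentioned earlier once a definition has been fully walked. The sketch below assumes SwiftSyntax 510 or later with SwiftParser. DefinitionTracker and methodsByDefinition are hypothetical names, and nested definitions are deliberately not handled (a stack would be needed for that).

```swift
import SwiftSyntax
import SwiftParser

/// Hypothetical visitor: attributes methods to the class currently being walked.
final class DefinitionTracker: SyntaxVisitor {
    private var currentDefinition: String?
    private(set) var methodsByDefinition: [String: [String]] = [:]

    override func visit(_ node: ClassDeclSyntax) -> SyntaxVisitorContinueKind {
        currentDefinition = node.name.text // entering the class
        return .visitChildren
    }

    override func visitPost(_ node: ClassDeclSyntax) {
        // Called after the class and all its descendants have been visited.
        currentDefinition = nil
    }

    override func visit(_ node: FunctionDeclSyntax) -> SyntaxVisitorContinueKind {
        if let owner = currentDefinition {
            methodsByDefinition[owner, default: []].append(node.name.text)
        }
        return .skipChildren
    }
}

let playerSource = """
class Player {
    func play() {}
}
func helper() {}
"""
let tracker = DefinitionTracker(viewMode: .sourceAccurate)
tracker.walk(Parser.parse(source: playerSource))
print(tracker.methodsByDefinition) // ["Player": ["play"]]
```

The free function helper is visited after visitPost has cleared currentDefinition, so it is not attributed to Player.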
SwiftSyntax is very powerful, though not always easy, depending on what you want to achieve. It has certainly increased the accuracy of CoherentSwift’s measurement, as part of the upcoming 0.5.0 release.