Razor parser engine for the razor syntax highlighter
Here’s my Razor Parser class that I use in my Razor Syntax Highlighter. I’m going to post later about how I integrate this into Visual Studio using the Visual Studio SDK.
public IList<Token> Parse(ITextBuffer buffer) {
List<Token> tokens = new List<Token>();
CodeLanguageService languageService = CodeLanguageService.GetServiceByExtension(".cshtml");
InlinePageParser parser = new InlinePageParser(languageService.CreateCodeParser(), new HtmlMarkupParser());
consumer = new RazorParserConsumer();
consumer.OnSpanStart += delegate(object s, ParsedSpanEventArgs span) {
switch (span.span.Type) {
case SpanType.Code:
tokens.AddRange(RazorCSharpCodeParser.Parse(span.span.Start.AbsoluteIndex, span.span.Content).ToArray());
if (span.span.Content.Trim().Length != 0) {
tokens.Add(new Token(TokenType.CodeBlock, new Span(span.span.Start.AbsoluteIndex, span.span.Length)));
}
break;
case SpanType.MetaCode:
case SpanType.Transition:
tokens.Add(new Token(TokenType.CodeStart, new Span(span.span.Start.AbsoluteIndex, span.span.Length)));
break;
case SpanType.Markup:
break;
}
};
MemoryStream ms = new MemoryStream(System.Text.ASCIIEncoding.ASCII.GetBytes(buffer.CurrentSnapshot.GetText()));
using (System.IO.StreamReader reader = new System.IO.StreamReader(ms)) {
parser.Parse(reader, consumer);
}
return tokens;
}
The ITextBuffer parameter of the Parse() method is pass from Visual Studio using the EditorClassifier project. This gives us access to the text content within the editor.
There are 4 different types of blocks returned to us via the InlinePageParser: Code, Markup, MetaCode, Transitions.
Transitions
Simply escaping from html into the code world. In the case of razor that’s one of several sets of characters @, @:, @<, and <text>
MetaCode
These are characters that define a block of code. These include () and {}. for example
@{
View.Title = "Your Title Here";
LayoutPage = "~/Views/Shared/&#95;Layout.cshtml";
}
or
@("The title of this page is " + View.Title)
Code
This specifies that the block of text is actually code and not markup or a transition. I’ve just used basic keyword matching and a bit of magic to add syntax highlighting. It was very basic and not intended to be a full source highlighter.
Markup
We don’t care about markup here in that we’re letting the Visual Studio default html classifiers here syntax highlight that.
Token Types
We’re only interested in a few token types to highlight cshtml to the point where it’s basically usable.
public enum TokenType {
CodeBlock,
CodeStart,
Unknown,
WhiteSpace,
CSharpObject,
CSharpKeyword,
CSharpString,
CSharpComment
}
This gives us our basic Content Types. 1. CodeBlock gives us the background color for the code. 2. CodeStart is the transition colors. 3. Unknown/Whitespace are currently unused. (Whitespace was used in a previous implementation) 4. CSharp[x] is just some very basic c# color coding
This is the heart of the Razor Syntax Highlighter. It works reasonably well. I’m sure there are lots of improvements to be made regarding the c# highlighter and the highlighter’s overall performance.
In my next post I’ll explain how I integrate the parser with the editor classifier though I’m sure several of you will already have figured it out and will be working on your own soon.