String matching algorithms for syntax highlighting
Posted: Tue Sep 28, 2010 10:01 am
Hi Folks ,
I am implementing my own C compiler and the corresponding IDE for my own little C compiler. It aims to be an easily retargetable C compiler. Currently I prototyped a small compiler that supports C expressions , if statement and while statement . Only data types currently supported are char , int . it also supports single dimensional arrears ( that's all for the time being ) . It uses fasm as assembler. The project is far from complete . A small driver program invokes the gnu C pre processor and the compiler and fasm to produce the final executable.
I am writing an IDE and linker for my compiler as well . Currently I am looking for algorithms to implement efficient string pattern matching without much complexity , for my IDE. My IDE more or less looks like the Turbo C++ IDE created by Borland ( Screenshot attached ). Now I would like to implement syntax highlighting . Here are my lines of thought
--Thomas
I am implementing my own C compiler and the corresponding IDE for my own little C compiler. It aims to be an easily retargetable C compiler. Currently I prototyped a small compiler that supports C expressions , if statement and while statement . Only data types currently supported are char , int . it also supports single dimensional arrears ( that's all for the time being ) . It uses fasm as assembler. The project is far from complete . A small driver program invokes the gnu C pre processor and the compiler and fasm to produce the final executable.
I am writing an IDE and linker for my compiler as well . Currently I am looking for algorithms to implement efficient string pattern matching without much complexity , for my IDE. My IDE more or less looks like the Turbo C++ IDE created by Borland ( Screenshot attached ). Now I would like to implement syntax highlighting . Here are my lines of thought
- Pattern matching using an NFA :- create the NFA for the string to be matched and find the match using 2 stack nfa simulation algorithmn or by using thomson's construction1)
- Brute force pattern matching :- Very easy to implement , but performance penalty is huge.
- Boyer-Moore algorithm -- (found by goggling ) : See : Boyer-Moore algorithm
--Thomas