Sunday 8 February 2015

Lexical Analyzer in C

After failing to write a lexical analyzer in C by simply reading the source file character by character, I decided to follow a different approach to make the task simpler:

1. Trim the lines and store the result in a temporary file.
2. Take that temporary file as input and open another temporary file for output. Look for string/character constants and comments, since they can enclose any character sequence. If one is found, store it in a global data structure and write a key in its place in the new temporary file; otherwise, copy the text unchanged to the new temporary file.
After step 2 we have a temporary file that contains no string/character constants (keys stand in their place) and no comments.
3. Use this temporary file as input and extract the preprocessor directives (store them in a data structure and replace each with a key), writing the output to another temporary file.
4. Now the file contains only identifiers/keywords/numbers/operators/delimiters and the keys we substituted.
(If a preprocessor directive contains a string constant, that constant has already been extracted and replaced with a key. So, while printing the directive, you need a few more lines of code to replace the key with the string constant.)
5. Now read character by character and split into tokens (when you find a key, print the details stored under that key).

I followed the above steps to avoid confusion.
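Step 2 is the heart of the approach, so here is a minimal sketch of it. This is not the original program. It works on an in-memory string instead of temporary files, the names `saved`, `save_literal` and `replace_literals` are made up for illustration, and the key format `@N@` is an assumption (chosen because `@` does not otherwise appear in C code outside literals and comments). The sketch also assumes comments are simply dropped (each block comment becomes one space) rather than stored.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SAVED 256

/* Hypothetical global table holding the extracted constants. */
static char *saved[MAX_SAVED];
static int saved_count = 0;

/* Copy src[start..end) into the table; return its index, or -1 on failure. */
static int save_literal(const char *src, size_t start, size_t end)
{
    size_t len = end - start;
    char *copy;

    if (saved_count >= MAX_SAVED) return -1;
    copy = malloc(len + 1);
    if (copy == NULL) return -1;
    memcpy(copy, src + start, len);
    copy[len] = '\0';
    saved[saved_count] = copy;
    return saved_count++;
}

/* Copy src to dst (capacity cap), dropping comments and replacing each
 * string/character constant with a key of the assumed form @N@.
 * Returns the number of characters written, excluding the terminator. */
size_t replace_literals(const char *src, char *dst, size_t cap)
{
    size_t i = 0, o = 0;

    assert(src != NULL && dst != NULL);
    if (cap == 0) return 0;

    while (src[i] != '\0' && o + 1 < cap) {
        if (src[i] == '/' && src[i + 1] == '*') {
            /* Block comment: skip to the closing marker or end of input. */
            i += 2;
            while (src[i] != '\0' && !(src[i] == '*' && src[i + 1] == '/'))
                i++;
            if (src[i] != '\0') i += 2;
            dst[o++] = ' ';   /* keep neighbouring tokens apart */
        } else if (src[i] == '/' && src[i + 1] == '/') {
            /* Line comment: skip up to, but not including, the newline. */
            while (src[i] != '\0' && src[i] != '\n') i++;
        } else if (src[i] == '"' || src[i] == '\'') {
            char quote = src[i];
            size_t start = i++;
            int key, n;

            while (src[i] != '\0' && src[i] != quote) {
                if (src[i] == '\\' && src[i + 1] != '\0') i++; /* escaped char */
                i++;
            }
            if (src[i] == quote) i++;

            key = save_literal(src, start, i);
            if (key < 0) break;                 /* table full or out of memory */
            n = snprintf(dst + o, cap - o, "@%d@", key);
            if (n < 0 || (size_t)n >= cap - o) break;  /* key did not fit */
            o += (size_t)n;
        } else {
            dst[o++] = src[i++];
        }
    }
    dst[o] = '\0';
    return o;
}
```

Handling quotes and comments in the same loop matters: a `//` inside a string is not a comment, and a quote inside a comment does not start a string. The saved strings are never freed here, which is fine for a short-lived tool but worth fixing in anything larger.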

The program I wrote can also report some errors.
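For illustration, here is one possible shape for the final pass in step 5, including one kind of error a lexer can report: a character it does not recognize. Which errors the original program reports is not stated here, so treat this as a sketch under assumptions. The `Token` type, `next_token`, `key_length` and the `@N@` key format are all invented for the example, operators are treated as single characters (so `==` comes out as two tokens), and numbers are plain digit runs.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef enum {
    TOK_IDENT, TOK_NUMBER, TOK_KEY, TOK_PUNCT, TOK_ERROR, TOK_END
} TokenKind;

typedef struct {
    TokenKind kind;
    char text[64];   /* longer tokens are truncated in this sketch */
} Token;

/* Single-character operators and delimiters accepted by the sketch. */
static const char PUNCT_CHARS[] = "+-*/%=<>!&|^~?:;,.(){}[]#";

/* Length of a key of the assumed form @digits@ starting at s, or 0. */
static size_t key_length(const char *s)
{
    size_t j = 1;

    if (s[0] != '@' || !isdigit((unsigned char)s[1])) return 0;
    while (isdigit((unsigned char)s[j])) j++;
    return s[j] == '@' ? j + 1 : 0;
}

/* Read one token starting at *pos and advance *pos past it. */
Token next_token(const char *src, size_t *pos)
{
    Token t = { TOK_END, "" };
    size_t i, len = 0, n;
    unsigned char c;

    assert(src != NULL && pos != NULL);
    i = *pos;
    while (isspace((unsigned char)src[i])) i++;
    if (src[i] == '\0') { *pos = i; return t; }

    c = (unsigned char)src[i];
    if (isalpha(c) || c == '_') {
        t.kind = TOK_IDENT;   /* keywords would be looked up from here */
        while (isalnum((unsigned char)src[i + len]) || src[i + len] == '_')
            len++;
    } else if (isdigit(c)) {
        t.kind = TOK_NUMBER;
        while (isdigit((unsigned char)src[i + len])) len++;
    } else if ((len = key_length(src + i)) > 0) {
        t.kind = TOK_KEY;     /* stands for a stored constant or directive */
    } else if (strchr(PUNCT_CHARS, c) != NULL) {
        t.kind = TOK_PUNCT;
        len = 1;
    } else {
        t.kind = TOK_ERROR;   /* unrecognized character */
        len = 1;
    }

    n = len < sizeof t.text - 1 ? len : sizeof t.text - 1;
    memcpy(t.text, src + i, n);
    t.text[n] = '\0';
    *pos = i + len;
    return t;
}
```

Calling `next_token` in a loop until it returns `TOK_END` yields the token stream. Because string constants and comments are already gone by this point, this pass never has to worry about quotes or comment markers.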
  
This program takes file names as arguments.
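As a rough sketch of step 1 applied to one named file, the code below trims each line of a file into a temporary file created with the standard `tmpfile()`. The function names `trim_line` and `trim_to_temp` are hypothetical, and the 1024-byte line buffer is an arbitrary choice: longer lines would be split into pieces, which a real tool should handle.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Remove leading and trailing whitespace from line in place; returns line. */
char *trim_line(char *line)
{
    size_t start = 0, end;

    assert(line != NULL);
    end = strlen(line);
    while (isspace((unsigned char)line[start])) start++;
    while (end > start && isspace((unsigned char)line[end - 1])) end--;
    memmove(line, line + start, end - start);
    line[end - start] = '\0';
    return line;
}

/* Trim every line of the file at path into a temporary file.
 * Returns the temporary file rewound to its start, or NULL on failure. */
FILE *trim_to_temp(const char *path)
{
    char line[1024];
    FILE *tmp;
    FILE *in = fopen(path, "r");

    if (in == NULL) return NULL;
    tmp = tmpfile();
    if (tmp != NULL) {
        while (fgets(line, sizeof line, in) != NULL) {
            fputs(trim_line(line), tmp);
            fputc('\n', tmp);   /* trim_line removed the original newline */
        }
        rewind(tmp);
    }
    fclose(in);
    return tmp;
}
```

A `main` function would then loop over `argv[1]` through `argv[argc - 1]`, call `trim_to_temp` on each name, and print a message for any file that returns `NULL`.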

If you follow these steps, you can write the program easily, even if it is large.

Let me know if you have any other approach. 
