Super Simple Lexical Analyzer
The scanner implementation
This project splits the character stream of a source program into tokens, which a compiler uses in its later stages. The token kinds include operators, identifiers, delimiters, numbers, boolean values, and keywords.
First, the code creates an array for each type of token, such as @id, @key, and @op.
Second, it opens a sample source file with open, reads it line by line with @line=, and removes the \r at the end of each line with chop(@line). After that, it calls open again to create a new text file that stores the tokens separated from the original code.
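The file handling in this step might look like the sketch below. The helper names `read_source_lines` and `write_tokens` are hypothetical and do not appear in the project. The sketch also swaps `chop` for a substitution, on the assumption that a final line without a trailing newline should not lose its last character.

```perl
use strict;
use warnings;

# Read a source file and return its lines with the line ending removed.
# Handles both "\n" and "\r\n" endings.
sub read_source_lines {
    my ($path) = @_;
    open(my $in, '<', $path) or die "Cannot open $path: $!";
    my @line = <$in>;
    close($in);
    s/\r?\n\z// for @line;    # strip the line ending in place
    return @line;
}

# Write one token per line to a new text file.
sub write_tokens {
    my ($path, @tokens) = @_;
    open(my $out, '>', $path) or die "Cannot create $path: $!";
    print {$out} "$_\n" for @tokens;
    close($out);
}
```

A caller would read the sample with `my @line = read_source_lines($source_path);` and, once scanning is done, save the result with `write_tokens($token_path, @tokens);`, where both paths are placeholders.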
Third, a while statement loops over the lines. Each character of a line is stored in an array @char and compared with the token kinds mentioned above, to decide whether it starts a new token or belongs to the current one. For example, the scanner must decide whether the second “=” in “==” is a separate token.
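The lookahead decision for “==” can be sketched as follows. This is a minimal illustration rather than the project's own code: `split_chars` is a hypothetical helper, the set of two-character operators is taken from the operator lists in the script, and treating `_` and `.` as word characters is an assumption.

```perl
use strict;
use warnings;

# Two-character operators that must not be split into two tokens.
my %two_char = map { $_ => 1 } qw(== != <= >= && ||);

# Split one line into tokens, character by character.
sub split_chars {
    my ($line) = @_;
    my @char = split //, $line;
    my @tokens;
    my $i = 0;
    while ($i < @char) {
        my $c = $char[$i];
        if ($c =~ /\s/) {
            $i++;                                # whitespace only separates tokens
        } elsif ($c =~ /[A-Za-z0-9_.]/) {
            # Collect a run of word characters: identifier, keyword or number.
            my $word = '';
            $word .= $char[$i++] while $i < @char && $char[$i] =~ /[A-Za-z0-9_.]/;
            push @tokens, $word;
        } else {
            # Look one character ahead: "==" stays one token, a lone "=" is its own token.
            my $pair = $i + 1 < @char ? $c . $char[$i + 1] : '';
            if ($two_char{$pair}) {
                push @tokens, $pair;
                $i += 2;
            } else {
                push @tokens, $c;
                $i++;
            }
        }
    }
    return @tokens;
}
```

With this sketch, `split_chars('a==b')` returns the three tokens `a`, `==` and `b`, so the second “=” is not added as a separate token.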
    #!/usr/bin/perl
    use strict;
    use warnings;

    # Token tables
    my @keyword             = qw(if else while do break);
    my @basic               = qw(int char bool float);
    my @boolean             = qw(true false);
    my @delimiter           = ('{', '}', '[', ']', ';', ',', '(', ')');
    my @arithmetic_operator = qw(+ - * /);
    my @logical_operator    = qw(|| &&);
    my @comparison_operator = qw(== != < > <= >=);
    my $assignment_operator = "=";
    my @unary_operator      = qw(! -);

    my $line  = 1;
    my $input = "";

    print "Input the grammar :\n\n";
    print "Line 0 : #begin\n";
    while ($input ne "#end") {
        print "Line $line : ";
        $input = <STDIN>;
        last unless defined $input;    # stop at end of input
        chomp($input);

        # Tokens must be separated by whitespace; split ' ' also skips leading blanks.
        foreach my $token (split ' ', $input) {
            if ($token =~ /^[A-Za-z]/) {
                # Words: reserved word, basic type, boolean value or identifier
                if (grep { $_ eq $token } @keyword) {
                    print "\t< " . uc($token) . " , reserved word >\n";
                } elsif (grep { $_ eq $token } @basic) {
                    print "\t< " . uc($token) . " , basic type >\n";
                } elsif (grep { $_ eq $token } @boolean) {
                    print "\t< " . uc($token) . " , boolean value >\n";
                } else {
                    print "\t< ID , identifier name $token >\n";
                }
            } elsif ($token =~ /[0-9]/) {
                # Numbers: a "." marks a float value
                if ($token =~ /\./) {
                    print "\t< NUMBER , float value $token >\n";
                } else {
                    print "\t< NUMBER , integer value $token >\n";
                }
            } else {
                # Symbols: a token may match more than one table (for example "-")
                print "\t< $token >\n"                       if grep { $_ eq $token } @delimiter;
                print "\t< $token , arithmetic operator >\n" if grep { $_ eq $token } @arithmetic_operator;
                print "\t< $token , logical operator >\n"    if grep { $_ eq $token } @logical_operator;
                print "\t< $token , comparison operator >\n" if grep { $_ eq $token } @comparison_operator;
                print "\t< $token , unary operator >\n"      if grep { $_ eq $token } @unary_operator;
                print "\t< $token , assignment operator >\n" if $token eq $assignment_operator;
                print "\n----------------------------------" if $token eq "#end";
            }
        }
        $line++;
    }