AMBuildScript
author David Anderson <dvander@alliedmods.net>
Sat Nov 17 18:41:01 2012 -0800 (2012-11-17)
changeset 195 c1ba166f3fc4
parent 106 8534dbfb8c1a
child 251 ca99a81745fe
permissions -rw-r--r--
Massive SemA/BC refactoring to support better type systems. Read on for more.

The previous compiler had a simple pipeline, best outlined as:
(1) Parsing and name binding -> AST
(2) Storage allocation -> AST
(3) Semantic Analysis (SemA) -> AST
(4) Bytecode Compilation (BC) -> Code

This pipeline, unfortunately, had two problems. First, name binding cannot be performed in one pass if we wish to have multi-file support in the form of modules. Name binding must be two-pass.

Second, because SemA was only capable of annotating AST nodes with a single type, most coercion work had to be duplicated in the BC phase, as coercion was not explicit in the SemA output. This made introducing new type rules extremely difficult, and all but precluded concepts like operator overloading (or overloading at all).

This patch rewrites most of Keima's backend. Of interest are the following changes:
(1) CompileContext has been refactored around future multi-file support.
(2) Parsing no longer performs any name binding.
(3) The grammar for type-and-name has been changed from:
(Label | Identifier)? Identifier
to:
Type? Identifier
where:
Type ::= OldType | NewType
OldType ::= Label
NewType ::= Identifier (. NewType)*


Although we do not implement the full production for Type yet, this
distinction is important. A type identifier is now affixed to the AST
as an Expression (right now, always a NameProxy), so it can fully
participate in name binding.

(4) Immediately after parsing, the AST goes through a NamePopulation phase.
This phase creates Scope objects for every scope which declares a name,
and if those names are declaring types or functions, a Symbol is created
and entered into the scope. Symbols entirely replace the old BoundName
classes.

(5) After NamePopulation comes NameBinding. This phase performs three steps:
(a) Links Scope objects together, to form a tree.
(b) Binds any free name to an existing name in the scope chain.
(c) Creates and registers any Symbols for names which have not yet been
added to the scope (for example, local variables).

Accordingly, all "allocation" and "binding" concepts have been removed
from Scopes, which are now lightweight container classes.
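
A minimal sketch of how the two phases in (4) and (5) could fit together, written in Python for brevity. Every name here (Symbol, Scope, populate, bind, and the tuple encoding of AST nodes) is hypothetical and is not taken from the compiler's sources; the point is only that entering types and functions first lets a later pass resolve forward references.

```python
# Hypothetical, simplified model of NamePopulation and NameBinding.
# AST nodes are encoded as tuples: ('decl', name, kind) or ('use', name).

class Symbol(object):
    def __init__(self, name, kind):
        self.name = name
        self.kind = kind  # 'type', 'function', or 'variable'


class Scope(object):
    """Lightweight container: a name table plus a link to the parent scope."""

    def __init__(self):
        self.parent = None
        self.names = {}

    def lookup(self, name):
        # Walk the scope chain from innermost to outermost.
        scope = self
        while scope is not None:
            if name in scope.names:
                return scope.names[name]
            scope = scope.parent
        return None


def populate(scope, nodes):
    """Pass 1: enter only type and function declarations into the scope."""
    for node in nodes:
        if node[0] == 'decl' and node[2] in ('type', 'function'):
            scope.names[node[1]] = Symbol(node[1], node[2])


def bind(scope, parent, nodes):
    """Pass 2: link the scope, register remaining names, bind free names."""
    scope.parent = parent
    bound = {}
    for node in nodes:
        if node[0] == 'decl':
            # Names not entered by pass 1 (for example, local variables).
            if node[1] not in scope.names:
                scope.names[node[1]] = Symbol(node[1], node[2])
        else:
            # An unresolved name is recorded as None.
            bound[node[1]] = scope.lookup(node[1])
    return bound


# 'helper' is declared after 'main', yet a use inside main's body resolves,
# because pass 1 entered both functions before any binding took place.
global_scope = Scope()
populate(global_scope, [('decl', 'main', 'function'),
                        ('decl', 'helper', 'function')])
body_scope = Scope()
bound = bind(body_scope, global_scope,
             [('use', 'helper'), ('decl', 'x', 'variable'), ('use', 'x')])
```
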

(6) SemA now generates bytecode for each statement in the AST. Expression
compilation is performed via a separate mechanism called HIR (High-level
Intermediate Representation). To analyze an expression, SemA will walk
the AST, and produce a HIR object for each node. HIR is typed, and SemA
may insert coercion nodes, or even expand an AST node into multiple HIR
nodes. The result of evaluating an expression in SemA is therefore the
root of a HIR tree, which SemA then sends to the HIRTranslator, which
performs bytecode generation.

As before, SemA still performs full semantic analysis. However, now it
also produces each function's bytecode.

This split allows us to decompose potentially complex semantics into
a more fine-grained, AST-like structure, which can have very simple
code-generation logic. For example, (1.0 + 5) might look like:

HAdd(HFloat(1.0), HConvert(HInteger(5), <Float>))
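
To illustrate how such a tree might come about, here is a small Python sketch in which analysis of an addition inserts an explicit conversion node. The class names mirror the example above, but the constructors, the string type tags, the coerce helper, and the tuple encoding of the AST are assumptions made for illustration only.

```python
# Hypothetical sketch: typed HIR nodes with explicit coercion.

class HIR(object):
    def __init__(self, type_):
        self.type = type_  # 'Int' or 'Float' in this sketch


class HInteger(HIR):
    def __init__(self, value):
        HIR.__init__(self, 'Int')
        self.value = value

    def __repr__(self):
        return 'HInteger(%r)' % self.value


class HFloat(HIR):
    def __init__(self, value):
        HIR.__init__(self, 'Float')
        self.value = value

    def __repr__(self):
        return 'HFloat(%r)' % self.value


class HConvert(HIR):
    def __init__(self, operand, to_type):
        HIR.__init__(self, to_type)
        self.operand = operand

    def __repr__(self):
        return 'HConvert(%r, <%s>)' % (self.operand, self.type)


class HAdd(HIR):
    def __init__(self, left, right):
        # Operands must already agree; coercion is explicit in the tree.
        assert left.type == right.type
        HIR.__init__(self, left.type)
        self.left = left
        self.right = right

    def __repr__(self):
        return 'HAdd(%r, %r)' % (self.left, self.right)


def coerce(node, to_type):
    """Return node unchanged, or wrap it in an explicit conversion."""
    if node.type == to_type:
        return node
    if node.type == 'Int' and to_type == 'Float':
        return HConvert(node, 'Float')
    raise TypeError('cannot convert %s to %s' % (node.type, to_type))


def analyze(ast):
    """ast is a float literal, an int literal, or a tuple ('+', lhs, rhs)."""
    if isinstance(ast, float):
        return HFloat(ast)
    if isinstance(ast, int):
        return HInteger(ast)
    op, lhs, rhs = ast
    if op != '+':
        raise ValueError('unsupported operator: %s' % op)
    left, right = analyze(lhs), analyze(rhs)
    result = 'Float' if 'Float' in (left.type, right.type) else 'Int'
    return HAdd(coerce(left, result), coerce(right, result))


print(analyze(('+', 1.0, 5)))
# prints: HAdd(HFloat(1.0), HConvert(HInteger(5), <Float>))
```

Because the conversion is a node of its own, a translator walking this tree never has to rediscover that the right operand needs widening.
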

As part of this decomposition, many opcodes have been removed. Rather than
using typed jumps, we now expand jumps into longer tests. For example:
jge.f <label>
becomes:
ge.f
cvt.f2b
jt <label>

This simplifies the pipeline, and JITs should be able to melt the added
work away.
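
The expansion can be sketched as a simple rewrite over textual instructions. This Python snippet is only an illustration: the function name is hypothetical, and generalizing the conversion opcode to cvt.<type>2b for suffixes other than f is an assumption, since the message only shows the float case.

```python
def expand_typed_jump(instruction):
    """Rewrite 'j<cmp>.<type> <label>' as a compare, a conversion, and 'jt'.

    Any instruction that is not a typed jump is returned unchanged.
    """
    parts = instruction.split()
    if len(parts) != 2:
        return [instruction]
    opcode, label = parts
    if not opcode.startswith('j') or '.' not in opcode:
        return [instruction]
    compare, type_suffix = opcode[1:].split('.', 1)
    return [
        '%s.%s' % (compare, type_suffix),  # typed comparison, e.g. ge.f
        'cvt.%s2b' % type_suffix,          # convert the result to a boolean
        'jt %s' % label,                   # untyped jump-if-true
    ]


print(expand_typed_jump('jge.f <label>'))
# prints: ['ge.f', 'cvt.f2b', 'jt <label>']
```
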

HIR is not used at a statement level, and HIR does not have any concept
of control-flow.

(7) The old BytecodeCompiler has been removed, as its work is now split
between SemA and HIRTranslator.


In addition, some minor refactorings have taken place:

(1) Opcodes.tbl no longer hardcodes numbers (aaah).
(2) Type has been split into smaller, typed structs.
(3) BytecodeEmitter no longer relies on Pools or RootScopes.
(4) SemA is now responsible for variable/storage allocation.
(5) Native tables are now global, rather than per-module. As such, native
declarations now result in a Native object (which still requires a
CALLNATIVE opcode), which is an index into the global table. Natives
must be bound globally. This is in preparation for module support.
(6) Publics are no longer tracked, but rather registered via a global
callback upon module load. This callback can be set by embedders. This
is in preparation for multi-file support.
(7) The BytecodeEmitter now performs Symbol allocations itself, and it
also generates Code objects itself, which removes a good deal of
complexity.

Finally, a test harness has been added. This harness is a Python script which finds *.test files, recursively, in the tests folder. For each file it runs the corresponding .sp. The contents of the .test file must match the combined stdout and stderr output.
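
A rough sketch of what such a harness could look like follows. It is not the script added by this changeset: the function names, the way the shell executable is passed in, and the exact byte-for-byte comparison are all assumptions.

```python
import os
import subprocess


def find_tests(root):
    """Yield the path of every *.test file under root, recursively."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            if filename.endswith('.test'):
                yield os.path.join(dirpath, filename)


def run_test(shell, test_path):
    """Run the .sp file next to test_path and compare its output.

    'shell' is the path to an executable that takes a script path as its
    only argument (an assumption about how scripts are run). stderr is
    merged into stdout so the comparison covers both streams.
    """
    script = os.path.splitext(test_path)[0] + '.sp'
    proc = subprocess.Popen([shell, script],
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    output, _ = proc.communicate()
    with open(test_path, 'rb') as f:
        expected = f.read()
    return output == expected


def run_all(shell, root):
    """Return the list of .test files whose expected output did not match."""
    return [path for path in find_tests(root) if not run_test(shell, path)]
```

A caller would pass the path of the built shell binary and the tests folder to run_all, and treat a non-empty result as failure.
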

# vim: set ts=2 sw=2 tw=99 noet ft=python:
#
# Copyright (C) 2004-2012 David Anderson
#
# This file is part of SourcePawn.
#
# SourcePawn is free software: you can redistribute it and/or modify it under
# the terms of the GNU General Public License as published by the Free
# Software Foundation, either version 3 of the License, or (at your option)
# any later version.
#
# SourcePawn is distributed in the hope that it will be useful, but WITHOUT ANY
# WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
# FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License along with
# SourcePawn. If not, see http://www.gnu.org/licenses/.
#

import os
import sys
from ambuild.command import SymlinkCommand

class KE:
	def __init__(self):
		self.compiler = Cpp.Compiler()

		if AMBuild.mode == 'config':
			#Detect compilers
			self.compiler.DetectAll(AMBuild)

			#Set up defines
			cxx = self.compiler.cxx
			if isinstance(cxx, Cpp.CompatGCC):
				if isinstance(cxx, Cpp.GCC):
					self.vendor = 'gcc'
				elif isinstance(cxx, Cpp.Clang):
					self.vendor = 'clang'
				self.compiler.AddToListVar('CFLAGS', '-pipe')
				self.compiler.AddToListVar('CFLAGS', '-fno-strict-aliasing')
				if (self.vendor == 'gcc' and cxx.majorVersion >= 4) or self.vendor == 'clang':
					self.compiler.AddToListVar('CFLAGS', '-fvisibility=hidden')
					self.compiler.AddToListVar('CXXFLAGS', '-fvisibility-inlines-hidden')
				self.compiler.AddToListVar('CFLAGS', '-Wall')
				self.compiler.AddToListVar('CFLAGS', '-Werror')
				self.compiler.AddToListVar('CFLAGS', '-Wno-switch')

				if AMBuild.options.arch == 'x86':
					self.compiler.AddToListVar('CFLAGS', '-m32')
					self.compiler.AddToListVar('POSTLINKFLAGS', '-m32')
				elif AMBuild.options.arch == 'x64':
					self.compiler.AddToListVar('CFLAGS', '-m64')
					self.compiler.AddToListVar('POSTLINKFLAGS', '-m64')
				elif AMBuild.options.arch:
					raise Exception('Unknown architecture: {0}'.format(AMBuild.options.arch))

				# Disable some stuff we don't use, that gives us better binary
				# compatibility on Linux.
				self.compiler.AddToListVar('CXXFLAGS', '-fno-exceptions')
				self.compiler.AddToListVar('CXXFLAGS', '-fno-rtti')
				self.compiler.AddToListVar('CXXFLAGS', '-fno-threadsafe-statics')

				# We don't really care about these.
				self.compiler.AddToListVar('CXXFLAGS', '-Wno-non-virtual-dtor')
				self.compiler.AddToListVar('CXXFLAGS', '-Wno-overloaded-virtual')

				# The compiler is too aggressive about what should be a valid offsetof.
				self.compiler.AddToListVar('CXXFLAGS', '-Wno-invalid-offsetof')

				# This would otherwise forbid ((x = y) == NULL) in clang apparently.
				if self.vendor == 'clang':
					self.compiler.AddToListVar('CXXFLAGS', '-Wno-null-arithmetic')

				if (self.vendor == 'gcc' and cxx.majorVersion >= 4 and cxx.minorVersion >= 7) or \
						(self.vendor == 'clang' and cxx.majorVersion >= 3):
					self.compiler.AddToListVar('CXXFLAGS', '-Wno-delete-non-virtual-dtor')

				self.compiler.AddToListVar('POSTLINKFLAGS', '-lm')
			elif isinstance(cxx, Cpp.MSVC):
				self.vendor = 'msvc'
				if AMBuild.options.debug == '1':
					self.compiler.AddToListVar('CFLAGS', '/MTd')
					self.compiler.AddToListVar('POSTLINKFLAGS', '/NODEFAULTLIB:libcmt')
				else:
					self.compiler.AddToListVar('CFLAGS', '/MT')
				self.compiler.AddToListVar('CDEFINES', '_CRT_SECURE_NO_DEPRECATE')
				self.compiler.AddToListVar('CDEFINES', '_CRT_SECURE_NO_WARNINGS')
				self.compiler.AddToListVar('CDEFINES', '_CRT_NONSTDC_NO_DEPRECATE')
				self.compiler.AddToListVar('CXXFLAGS', '/EHsc')
				self.compiler.AddToListVar('CXXFLAGS', '/GR-')
				self.compiler.AddToListVar('CFLAGS', '/W3')
				self.compiler.AddToListVar('CFLAGS', '/nologo')
				self.compiler.AddToListVar('CFLAGS', '/Zi')
				self.compiler.AddToListVar('CXXFLAGS', '/TP')
				self.compiler.AddToListVar('POSTLINKFLAGS', '/DEBUG')

				self.compiler.AddToListVar('POSTLINKFLAGS', 'kernel32.lib')
				self.compiler.AddToListVar('POSTLINKFLAGS', 'user32.lib')
				self.compiler.AddToListVar('POSTLINKFLAGS', 'gdi32.lib')
				self.compiler.AddToListVar('POSTLINKFLAGS', 'winspool.lib')
				self.compiler.AddToListVar('POSTLINKFLAGS', 'comdlg32.lib')
				self.compiler.AddToListVar('POSTLINKFLAGS', 'advapi32.lib')
				self.compiler.AddToListVar('POSTLINKFLAGS', 'shell32.lib')
				self.compiler.AddToListVar('POSTLINKFLAGS', 'ole32.lib')
				self.compiler.AddToListVar('POSTLINKFLAGS', 'oleaut32.lib')
				self.compiler.AddToListVar('POSTLINKFLAGS', 'uuid.lib')
				self.compiler.AddToListVar('POSTLINKFLAGS', 'odbc32.lib')
				self.compiler.AddToListVar('POSTLINKFLAGS', 'odbccp32.lib')

			#Optimization
			if AMBuild.options.opt == '1':
				self.compiler.AddToListVar('CDEFINES', 'NDEBUG')
				if self.vendor == 'gcc' or self.vendor == 'clang':
					self.compiler.AddToListVar('CFLAGS', '-O3')
				elif self.vendor == 'msvc':
					self.compiler.AddToListVar('CFLAGS', '/Ox')
					self.compiler.AddToListVar('POSTLINKFLAGS', '/OPT:ICF')
					self.compiler.AddToListVar('POSTLINKFLAGS', '/OPT:REF')

			#Debugging
			if AMBuild.options.debug == '1':
				if self.vendor == 'gcc' or self.vendor == 'clang':
					self.compiler.AddToListVar('CFLAGS', '-g3')
				elif self.vendor == 'msvc':
					self.compiler.AddToListVar('CFLAGS', '/Od')
					self.compiler.AddToListVar('CFLAGS', '/RTC1')

			#Platform-specifics
			if AMBuild.target['platform'] == 'linux':
				self.compiler.AddToListVar('CDEFINES', '_LINUX')
				if self.vendor == 'gcc':
					self.compiler.AddToListVar('POSTLINKFLAGS', '-static-libgcc')
				if self.vendor == 'clang':
					self.compiler.AddToListVar('POSTLINKFLAGS', '-lgcc_eh')
			elif AMBuild.target['platform'] == 'darwin':
				self.compiler.AddToListVar('POSTLINKFLAGS', '-mmacosx-version-min=10.5')
				if AMBuild.options.arch == 'x86':
					self.compiler.AddToListVar('POSTLINKFLAGS', ['-arch', 'i386'])
				elif AMBuild.options.arch == 'x64':
					self.compiler.AddToListVar('POSTLINKFLAGS', ['-arch', 'x86_64'])
				self.compiler.AddToListVar('POSTLINKFLAGS', '-lstdc++')

				# For OS X dylib versioning
				import re
				with open(os.path.join(AMBuild.sourceFolder, 'product.version'), 'r') as productFile:
					productContents = productFile.read()
				# Raw string, so the regex escapes are not treated as string escapes.
				m = re.match(r'(\d+)\.(\d+)\.(\d+).*', productContents)
				if m is None:
					self.version = '1.0.0'
				else:
					major, minor, release = m.groups()
					self.version = '{0}.{1}.{2}'.format(major, minor, release)
				AMBuild.cache.CacheVariable('version', self.version)

			#Finish up
			self.compiler.ToConfig(AMBuild, 'compiler')
			AMBuild.cache.CacheVariable('vendor', self.vendor)
			self.targetMap = { }
			AMBuild.cache.CacheVariable('targetMap', self.targetMap)
		else:
			self.compiler.FromConfig(AMBuild, 'compiler')
			self.targetMap = AMBuild.cache['targetMap']

	def DefaultCompiler(self):
		return self.compiler.Clone()

	def JobMatters(self, jobname):
		file = sys._getframe().f_code.co_filename
		if AMBuild.mode == 'config':
			self.targetMap[jobname] = file
			return True
		if len(AMBuild.args) == 0:
			return True
		if not jobname in AMBuild.args:
			return False
		# The job was explicitly requested; return True rather than falling
		# through and implicitly returning None.
		return True

ke = KE()
globals = {
	'KE': ke
}

FileList = [
		['src', 'AMBuild.library'],
		['src', 'AMBuild.shell'],
		['AMBuild.dist']
	]

for parts in FileList:
	AMBuild.Include(os.path.join(*parts), globals)