Description of Fast Link option for KCL Author: Bill Schelter When we refer to times of function calls, without other qualification, we will be referring to the simplest possible function of no args returning nil: (defun foo () nil). This provides a good general indication of the timing of all functions. The original KCL function calling system, distinguishes between functions defined in the same file, proclaimed functions, as well as having different calling mechanisms for different safety levels. Some disadvantages were that calling across files always took at least 50mu, in spite of proclamations or safety. Function calls inside a file either were fast (10 mu (or 3mu for proclaimed)) at safety 0 but incapable of being traced or redefined, or else as slow as cross file compilation. We wished to have a scheme which would allow tracing and redefinition, of all calls, as well very fast calling. In order to do this we set up links in the calls, and these are modified at the first call to the function, if the function is compiled. Recompiling tracing, or redefining, undoes the link. (use-fast-links t) turns this feature on, and it is on by default. An argument of nil turns it off, so that all calls go through the function symbol. Some timings on the fast link compiling provided in this version of kcl. FILEA: (proclaim '(optimize (safety 0))) (proclaim '(function blue() t)) (proclaim '(function blue1 (t) t)) (proclaim '(function blue2 (t t) t)) (proclaim '(function blue-same-file() t)) (defun test-blue (n) (sloop for i below n do (blue))) (defun test-blue1 (n) (sloop for i below n do (blue1 nil))) (defun test-blue2 (n) (sloop for i below n do (blue2 nil nil))) (defun test-blue-same-file (n) (sloop for i below n do (blue-same-file))) FILEB: (defun blue () nil) (defun blue1 (x)x nil) (defun blue2 (x y) x y Compile and load FILEA then FILEB. Timings: We timed the invocation of blue,blue1, and blue2 by executing the loops in fileA. We subtracted the time for one empty loop iteration (2.7mu). Call New Old (blue) 3.03 60.5 (blue1 x) 4.1 62.2 (blue2 x y) 5.1 64.3 (blue-same-file) 3.03 2.73 As can be seen all calls of blue are substantially speeded up, except for the calls in the same file, which are slightly slowed down. There is however the advantage, that the calls in the same file can now be traced or redefined. Also it is conceivable that the program might want to change a definition dynamically. It is no longer necessary to recompile the whole file. They are handled in exactly the same manner as the non local calls. Since most software projects consist of more than one file, and since it is customary to move key routines to a basic files at the beginning of the system, we feel the importance of having fast calls across files is important. For example in MAXIMA, there are 380 calls to ptimes, with naturally the large majority being in files other than the basic definition. It is useful if the other calls can be made faster too. Also when debugging some chunk of MAXIMA code, it is useful to be able to trace ptimes, without having to load in new definitions and recompile. Disadvantages: The link table data takes up approximately 10 words, independent of the number of calls in a file to that function. Space: I made a file with (defun try (a b) a b (foos a b)(foos a b)(foos a b)(foos a b)(foos a b) (foos a b)(foos a b)(foos a b)(foos a b)(foos a b) (foos a b)(foos a b)(foos a b)(foos a b)(foos a b) (foos a b)(foos a b)(foos a b)(foos a b)(foos a b) (foos a b)(foos a b)(foos a b)(foos a b)(foos a b) ) I compared the size with various settings of *fast-link-compile* and with proclaiming foos. DIFF means the size above the case with all calls to FOOS removed. text data bss dec DIFF FLC proclaimed Case SAMEFILE 1076 0 28 1104 836 nil nil I nil 1308 0 32 1340 892 nil nil Ia t 1296 4 28 1328 1060 t nil II nil 1436 4 32 1472 1056 t nil IIa t 684 4 28 716 448 t t III nil 244 0 24 268 0 t ; calls removed. IV nil 384 0 32 416 0 nil ;cals removed V t The reason II is bigger than I is that the vs_top and vs_base settings are being performed in the file, in exactly the same manner as if the definition for foos were in the file. FLC=nil with definition of foos in the same file would also be higher. Should probably have a type of proclamation which would favor the case I call in cases where speed is irrelevant. But then why not go with III.. Appendix: Notes: 1)Empty loop takes 2.70 seconds for 1,000,000 iterations. 2)blue-same-file or blue >(time (test-blue 1000000)) real time : 5.750 secs run time : 5.733 secs NIL >(trace blue) (BLUE) >(test-blue 2) 1> (BLUE) <1 (BLUE NIL) 1> (BLUE) <1 (BLUE NIL) NIL >(trace blue-same-file) (BLUE-SAME-FILE) >(test-blue-same-file 2) 1> (BLUE-SAME-FILE) <1 (BLUE-SAME-FILE NIL) 1> (BLUE-SAME-FILE) <1 (BLUE-SAME-FILE NIL) NIL