@chapter Loading Object Code

We will outline some of the features of the object loader, by William Schelter.

When you do @code{(load "foo.o")}, the output from the C compiler must be loaded into static space in the running KCL, and references to external symbols must be resolved. Originally KCL used the loader of the underlying operating system, calling it in a subshell to produce yet another file which had the correct references to externals. This file was then read into KCL, and the data vector (a Lisp-readable vector at the end of the object file) was read in as well. Unfortunately some operating systems (such as System V) do not supply a loader capable of doing this relocation, and in any event it is fairly slow. There was also no possibility of incrementally adding new external C symbols to an already running Lisp and then having future files refer to them. For example, you might have a function @code{search1} written in C which you wished to access directly in subsequently loaded files. This was not possible, since the loader only knew about the addresses of the external symbols in the original saved image.

The new scheme builds a list of the external symbols into a table called @code{c_table}. This table is built by examining the current image; it is built automatically with the first call to @code{load}, and subsequent calls simply use it. An additional benefit is that it is easy to add new symbols to the table. For example, if you have a file @file{try.c} which looks like

@example
init_code()
@{ add_symbols(joe,&joe,pete,&pete,NULL);
@}

joe(x)
object x;
@{ ... @}

pete()
@{ ... @}
@end example

then @code{joe} and @code{pete} will be added to the symbol table of the current KCL. You may refer to them as external variables in subsequent files, and these files will load correctly, referencing these variables. It is an error to apply @code{add_symbols} twice to the same variable.

Loading of files has also sped up considerably: a small file containing only a few small functions can be loaded in less than .05 seconds.
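For illustration, here is a sketch of a file compiled and loaded after @file{try.c} that refers to those symbols. The file name, the @code{extern} declarations, and the use of the @code{object} type are assumptions modeled on the example above, not a fixed KCL interface.

@example
/* later.c -- a sketch of a file loaded after try.c; joe and pete
   are resolved through c_table when this file is loaded.  "object"
   is the Lisp-object type used in try.c above. */
extern object joe();
extern object pete();

object
use_them(x)
object x;
@{
  pete();
  return joe(x);
@}
@end example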
@chapter Metering and Profiling

KCL utilities have been added, by W. Schelter, to allow one to determine the percentage of time spent in individual functions.

Usage involves deciding which block of code one wishes to profile, that is to say which address range, and then allocating an appropriately sized @code{*profile-array*}. For example, in the Sun version, if you have loaded a few object files and wish to meter all of KCL plus the files you loaded, you could allocate a 1 megabyte array. This would give roughly a two-to-one reduction relative to the code address range. Note that the loader prints out the address at which code is loaded, and the function @code{si::function-start (fun)} returns the start address of a compiled function.

In the above example, after loading the file @file{lsp/profile.o} you could do @code{(si:set-up-profile 1000000)}. This allocates the 1 megabyte array and also reads in the C symbol table, if this has not already been done. It also obtains the addresses of all compiled function objects currently in the image and keeps them in a table, called @code{combined_table} at the C level. The function @code{si:set-up-combined (size-of-table)} sets up a combined table for the Lisp and C functions; it is called by @code{si:set-up-profile} with a default size-of-table of 6000.

To turn profiling on you do @code{(si::prof 0 90)}. This starts metering all addresses in the range from 0 (the first argument) to 1,000,000 * (256/90), where 90 is the second argument. To display the data collected so far, invoke @code{si::display-profile} with no arguments. To clear the profile array, run @code{(si::clear)}. A call of @code{(si::prof 500000 256)} would profile code in the address range from 500,000 to 1,500,000. You may switch the profiler off by specifying a 0 mapping, i.e. @code{(si::prof 0 0)}; it can then be restarted by supplying a nonzero second argument. Of course, if you start up again with a scale different from the previous one without clearing the profile array, you will get gibberish. The argument list of the last call to @code{si::prof} is stored in the variable @code{si::*current-profile*}.

Unless one is using a one-to-one mapping of the profile array to the code, there is a possibility of quantization errors. There is also the possibility of overflowing a slot in the profile array if the mapping is very coarse, or if the interval being measured is very long.

Here is a sample of the output:

@example
  0.08% (    9): _eql
 15.26% ( 1822): _equal
  0.01% (    1): _Fquote
  0.01% (    1): SET
  0.04% (    5): _parse_key
  0.01% (    1): _Fcond
  ...
  0.50% (   60): RELIEVE-HYPS1
  0.03% (    4): REMAINDER
  0.01% (    1): REMOVE-*2*IFS
  0.03% (    3): REMOVE-TRIVIAL-EQUATIONS
  4.35% (  520): REWRITE
  0.47% (   56): REWRITE-CAR-V&C-APPLY$
  ...
@end example

The first column is the percentage of total time spent with the program counter in the range starting at this function and extending up to the next named function. The second column is the actual number of times a profile interrupt landed in this section of the code. Note that the default display is by address and, as mentioned before, one should beware of overlaps when the mapping is coarse. Functions for which there were no ticks are not displayed. We did not sort the output, since we wished to leave it in address order: it is possible (because of roundoff, if the second argument to @code{si::prof} is small) that some calls are credited to the adjacent function, and this is spotted more easily when the order is by address. It is trivial to sort the table by ticks in GNU Emacs using the command @code{sort-columns}: set point at the beginning of the column in the first line, and the mark at the end of the column in the last line.

Unfortunately the System V loader likes to separate the original C functions of KCL from those incrementally loaded by about 2 megabytes. This makes it awkward to meter both ranges simultaneously without using a very large profile array. It would probably be reasonable to rewrite the basic interrupt routine to handle such an address configuration, but this has not yet been done. Of course, you can always make two runs and combine the information for the two ranges.
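Putting these pieces together, a typical profiling session might look like the following sketch. The function names and arguments are those described above; @code{run-my-test} is a hypothetical stand-in for whatever form you want to measure, and the scale of 90 assumes the 1 megabyte array of the earlier example.

@example
(load "lsp/profile.o")       ; load the profiling utilities
(si:set-up-profile 1000000)  ; allocate *profile-array* and build the tables
(si::prof 0 90)              ; meter addresses 0 .. 1,000,000 * (256/90)
(run-my-test)                ; the computation to be measured (hypothetical)
(si::prof 0 0)               ; switch the profiler off
(si::display-profile)        ; show percentages and tick counts by address
(si::clear)                  ; clear the array before the next run
@end example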