ftp.nice.ch/pub/next/tools/performance/TimeMon.2.2.s.tar.gz#/TimeMon-2.2/Percentages.rtf


/*
  Percentages.rtf

	Description of the Percentages class.

  Copyright 1991 Scott Hess.  Permission to use, copy, modify, and
  distribute this software and its documentation for any purpose
  and without fee is hereby granted, provided that this copyright
  notice appear in all copies.  The copyright notice need not appear
  on binary-only distributions - just in source code.
  
  Scott Hess makes no representations about the suitability of this
  software for any purpose.  It is provided "as is" without express
  or implied warranty.
*/

_cp_time

The kernel's _cp_time structure records the number of ticks which have gone to each of sys, user, nice, and idle time.  Adding these gives the total number of ticks since the kernel came up.

Retrieving these values is a bother.  First, you need to nlist the kernel (/mach) to find out where _cp_time resides.  nlist is a function that looks in an executable's symbol table and returns such information.  Once that is known, we must read the /dev/kmem device at that address to find the actual info.

The actual format of _cp_time is trivial - four long integers, one for each of the values.  See <sys/dk.h> for more information on that.
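As a rough sketch of how two readings of those four counters might be turned into percentages: the function below takes an earlier and a later snapshot and divides each counter's change by the total change.  The index order (user, nice, sys, idle) is an assumption modeled on the CP_* constants in BSD-style <sys/dk.h>, and the function name cp_time_fractions is made up for illustration; neither is taken from the TimeMon source.

```c
/* Assumed layout of _cp_time: four long tick counters.  The index order
   is an assumption; check <sys/dk.h> on the target system. */
enum { CP_USER = 0, CP_NICE = 1, CP_SYS = 2, CP_IDLE = 3, CPUSTATES = 4 };

/* Hypothetical helper: convert two successive snapshots of the counters
   into fractions (0.0 to 1.0) of the ticks elapsed between them.
   Returns 1 on success, 0 if no ticks elapsed (out is then all zeros). */
int cp_time_fractions(const long prev[CPUSTATES],
                      const long curr[CPUSTATES],
                      float out[CPUSTATES])
{
    long delta[CPUSTATES];
    long total = 0;
    int i;

    for (i = 0; i < CPUSTATES; i++) {
        delta[i] = curr[i] - prev[i];   /* ticks spent in this state */
        total += delta[i];              /* ticks elapsed overall */
        out[i] = 0.0f;
    }
    if (total <= 0)
        return 0;

    for (i = 0; i < CPUSTATES; i++)
        out[i] = (float)delta[i] / (float)total;
    return 1;
}
```

Working from differences rather than the raw counters matters because the raw values count from boot, so they would only give the average since the kernel came up.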

[NOTE:  Previously, I used direct access to /dev/kmem to get this stuff.  Almost exactly 24 hours after posting TimeMon2.1, I came across code demonstrating some kernel calls that did not require this low-level access.  Sigh.  The call is the table() call, and it is undocumented.  Because of this, I chose to retain the /dev/kmem code as a backup for it.  See the README file for more information. ]

Drawing

This is the fun part.  Apart from drawing, the things TimeMon does take negligible time.  With drawing thrown in, TimeMon can take large amounts of time.  This is a problem because that time can affect the very output of TimeMon, which is unwanted (TimeMon should not impinge on the system's overall times).

There are a couple of ways I try to balance the need to draw against drawing too much.  One method is to allow the user to lengthen the update interval.  This will cause TimeMon to check the kernel less often, thereby reducing the number of times it must draw, thus reducing the total drawing time (or drawing time per unit time).

The other big optimization I apply exploits the mechanics of drawing the view.  For a complete redraw, I first draw the rectangle which is the view, then each of the layers from outside in.  This reduces complexity in the drawing code - by letting interior layers overlay the outside ones, I can use the same basic code to draw everything.

The problem (or advantage) is that the outer layers do not change as fast as the inner layers.  So, with some smarts, redrawing of the outer layers can be avoided in some cases.  I implement this in the lpcents and updateFlags variables.  lpcents holds the pcents values which were used to draw the view the last time it was updated (for any of the layers).  In essence, it caches what I drew last.  updateFlags indicates which layers need a redraw.  In the step method (which is the method that changes pcents), the new pcents values are tested against the lpcents values.  If any pcents value differs by .01 or more from the corresponding lpcents value, then that layer and all layers inside it are marked for redraw - note that drawing a layer will overwrite all layers inside it.
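The comparison described above might be sketched as follows.  This is only an illustration under assumptions: three layers with index 0 as the outermost, four values per layer, and updateFlags modeled as a bitmask with one bit per layer.  The names LAYERS, STATES, and layers_needing_redraw are invented here; the real step method may be organized quite differently.

```c
#define LAYERS 3                 /* assumed layer count; 0 is outermost */
#define STATES 4                 /* assumed values per layer */
#define REDRAW_THRESHOLD 0.01f   /* the .01 difference from the text */

/* Hypothetical helper: compare the current values (pcents) with the
   values last drawn (lpcents).  Returns a bitmask in which bit i set
   means layer i needs a redraw.  The outermost changed layer and every
   layer inside it are marked, since drawing a layer overwrites the
   layers inside it. */
unsigned layers_needing_redraw(float pcents[LAYERS][STATES],
                               float lpcents[LAYERS][STATES])
{
    const unsigned all = (1u << LAYERS) - 1u;
    int layer, s;

    for (layer = 0; layer < LAYERS; layer++) {
        for (s = 0; s < STATES; s++) {
            float d = pcents[layer][s] - lpcents[layer][s];
            if (d < 0.0f)
                d = -d;
            if (d >= REDRAW_THRESHOLD) {
                /* Keep bits layer..LAYERS-1, clear the outer ones. */
                return all & ~((1u << layer) - 1u);
            }
        }
    }
    return 0u;                   /* nothing moved enough to redraw */
}
```

Scanning from the outside in lets the loop stop at the first changed layer, because everything inside it is redrawn regardless.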

This optimization allows another user-controlled variable to impinge on drawing time - the magnitude associated with layers can be adjusted.  Since higher magnitudes will lessen the change any single step will make to the layer, this can reduce the amount of drawing done by the program.
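The text does not give the formula by which a magnitude damps a layer, so the following is purely an assumed model: each step moves the layer's value toward the new sample by 1/magnitude of the gap.  The name smooth_step is hypothetical.  It is here only to show why a large magnitude can keep a step under the .01 redraw threshold.

```c
/* Hypothetical smoothing step (assumed formula, not from the TimeMon
   source): blend a new sample into a layer's running value.  A larger
   magnitude means each step changes the value less. */
float smooth_step(float current, float sample, int magnitude)
{
    if (magnitude < 1)
        magnitude = 1;           /* magnitude 1 just tracks the sample */
    return current + (sample - current) / (float)magnitude;
}
```

Under this model, with a magnitude of 128 even a full swing from 0 to 1 moves the layer by under .01 in one step, so that step alone would not trigger a redraw of the layer.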

Getting a little insane

The above optimizations reduced the amount of CPU time taken by TimeMon to something like 10% of where it started.  But that was not all there was, by far!  In the optimized version, I still used multiple pswraps to do the drawing, and while they were fairly good, they still left me wanting some more speed.

Well, to tell the truth, I thought things were fast enough as-is.  But, of course, I was wrong.  After a discussion with Scott Byer (byer@adobe.com), I was jolted into some real performance measurements.  So I spent Saturday night at home, profiling . . . the results caused me to change some things.

One was the DPSUserPath stuff in the -update routine to draw the circles around each layer of the display.  This worked fairly well given that only one or two of the circles were ever drawn.  The better solution, though, was to just draw the whole thing in one swoop using a user path.  Since the path doesn't ever change, I also used the dps_ucache operator to cause things to be cached on the windowserver.  It turns out that this was faster than drawing even one of the circles in the normal manner.  In fact, drawing the circles with a single pswrap was comparable to drawing a single one of them, falling about midway between drawing the circles separately and using a user path (note that when drawing them separately, we'll only average about 2 drawn on every iteration).

The other speedup revolves around reducing the amount of communication with the windowserver and the number of lookups that the windowserver has to do.  Basically, I upload a couple of procedures to the windowserver, using the PostScript bind operator to bind the tokens to their executable code, and then the pswraps just call these routines with a couple of parameters.  Since I've found in other projects that control structures (if, ifelse, for, etc.) are quite slow on the windowserver, I separated the routine into the two major cases.  Since there is seldom a nice section to the display, one routine draws the nice value, one doesn't, and I use an if statement on the client side to differentiate.

Safety net

Well, given all of the above, you still never really know.  So, I cause TimeMon to run niced to a low priority relative to what most programs run at.  What this means is that if the system is very busy, TimeMon will only get to run when there's a spare cycle or two.
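One way a program can nice itself is sketched below, using the POSIX getpriority and setpriority calls.  Whether TimeMon uses these calls, the nice() call, or something else is not stated here, and the function name lower_own_priority is made up for the example.

```c
#include <errno.h>
#include <sys/resource.h>

/* Hypothetical helper: raise the calling process's nice value by
   `increment`, which lowers its scheduling priority.  On success,
   stores the resulting nice value in *result and returns 0.  Returns
   -1 with errno set if a priority call fails. */
int lower_own_priority(int increment, int *result)
{
    int current;

    if (increment < 0)
        increment = 0;           /* this sketch never raises priority */

    /* getpriority can legitimately return -1, so check errno too. */
    errno = 0;
    current = getpriority(PRIO_PROCESS, 0);
    if (current == -1 && errno != 0)
        return -1;

    /* The system clamps values past its maximum nice value. */
    if (setpriority(PRIO_PROCESS, 0, current + increment) != 0)
        return -1;

    errno = 0;
    current = getpriority(PRIO_PROCESS, 0);
    if (current == -1 && errno != 0)
        return -1;

    *result = current;
    return 0;
}
```

Calling something like lower_own_priority(10, &now) early in startup would be enough; an unprivileged process is normally allowed to lower its own priority, but not to raise it back.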
