From rubini@unipv.it Thu Apr 7 16:35:48 EDT 1994
Article: 7532 of comp.os.linux.development
Path: bigblue.oit.unc.edu!concert!gatech!howland.reston.ans.net!math.ohio-state.edu!jussieu.fr!univ-lyon1.fr!ghost.dsi.unimi.it!mirage.unipv.it!rubini
From: rubini@unipv.it (Alessandro Rubini)
Newsgroups: comp.os.linux.development
Subject: A boost in performance when performance weakens (patch)
Date: 7 Apr 94 11:59:57 GMT
Organization: Pavia University
Lines: 120
Message-ID:
NNTP-Posting-Host: ipvvis.unipv.it
Keywords: performance, kernel, swap

I posted this on the mailing list, but the list seems to be dead :-(

This is a _very_ tiny patch to decrease swapping when the system is
heavily loaded, with a small performance penalty under light load.

----------------------------

I noticed that all the docs state that the good thing about the elevator
algorithm is that it favours reads. Though this is surely good behaviour
if you have plenty of RAM, it is undoubtedly a loss when the system is
heavily loaded. I have only 4 megs of RAM and I'm keen to get
performance out of my box, since I sometimes need X. So I made a simple
patch to /usr/src/linux/drivers/block/blk.h in order to favour writes.

What follows is the patch itself ('<' to '>') and some timings I took on
my system (no net, no other activity).

The net result is less swapping, at the cost of some file pages being
fetched more than once. The performance of the system _dramatically_
increases as the load rises. (Well, it decreases, but less than before.)

I myself have definitely switched to the new approach, as I usually have
one emacs, one compilation (or TeX) and two or three shells running, and
performance in the interactive tools is terrible due to heavy swapping.
I can't provide reliable tests on multi-terminal sessions, though. Maybe
something can be done about that: first of all, someone should run some
tests on machines with plenty of memory (which I can't).
Then, if there's interest, I can try to add a flag in the kernel to
switch between the two approaches on a heuristic basis (i.e. two or more
processes in the 'swapping' state).

Please keep me informed...

My data follow:

================ THE PATCH

*** blk.h.orig  Thu Mar 31 12:03:43 1994
--- blk.h       Thu Mar 31 12:04:03 1994
***************
*** 44,50 ****
   * are much more time-critical than writes.
   */
  #define IN_ORDER(s1,s2) \
! ((s1)->cmd < (s2)->cmd || ((s1)->cmd == (s2)->cmd && \
  ((s1)->dev < (s2)->dev || (((s1)->dev == (s2)->dev && \
  (s1)->sector < (s2)->sector)))))
--- 44,50 ----
   * are much more time-critical than writes.
   */
  #define IN_ORDER(s1,s2) \
! ((s1)->cmd > (s2)->cmd || ((s1)->cmd == (s2)->cmd && \
  ((s1)->dev < (s2)->dev || (((s1)->dev == (s2)->dev && \
  (s1)->sector < (s2)->sector)))))

================ THE TIMINGS - raw data, with a little hand editing.

startx, up to when the disk stops (2M available mem due to kernel profiling)

                        pag-in  pag-ot  swp-in  swp-ot   cpu-u   cpu-s    idle
old-before:   01:00:36   29088     722     653    1220    3933    5164   77592
old-after:    01:12:18   50058    1003   14366   15739    4612   13452  138851
new-before:   01:20:09    3261     234      59     506     475     964    9711
new-after:    01:28:19   24305     465    8374    9483    1069    6538   52252
==difference     -3:22     +74     +50   -5398   -5542     -85   -2714  -18718   30%

startx, up to when the disk stops (3.5M available, but hphoon run)

                        pag-in  pag-ot  swp-in  swp-ot   cpu-u   cpu-s    idle
old-before:   03:44:41  002674   00207   00026   00235   00447   00955   03911
old-after:    03:47:55  011746   00418   02017   03167   01191   03811   19298
new-before:   03:53:36  002651   00217   00008   00198   00480   00896   03722
new-after:    03:56:36  012074   00434   01778   02883   01206   03692   17938
==difference     -0:14    +351      +6    -221    -247     -18     -60   -1171    6%

some other timings, not postprocessed.
un-tarring a gzipped distribution:
Mar 31 02:56:26 pg=(015012,003303) sw=(01736,02512) 002154u 005505s (050443)
Mar 31 02:56:45 pg=(016136,004952) sw=(01739,02513) 002722u 006114s (051115)
Mar 31 03:26:15 pg=(012049,000483) sw=(01684,02450) 001123u 003758s (030805)
Mar 31 03:26:32 pg=(013205,001747) sw=(01687,02450) 001699u 004338s (031395)
  0:19 -> 0:17 (11% gain, not precise, though)

a compilation:
Mar 31 02:58:59 pg=(016304,005014) sw=(01744,02513) 002793u 006473s (064090)
Mar 31 03:12:27 pg=(059729,013856) sw=(03845,04975) 037530u 019961s (096712)
Mar 31 03:29:05 pg=(013271,002143) sw=(01692,02450) 001726u 004811s (046149)
Mar 31 03:41:17 pg=(050705,010677) sw=(03573,04670) 036245u 018014s (071771)
  13:28 -> 12:12 (9.5% gained)

a TeX run:
Mar 31 03:58:51 pg=(012503,000466) sw=(02012,02883) 001307u 004073s (030596)
Mar 31 04:01:03 pg=(013764,000920) sw=(02020,02883) 003247u 014806s (031204)
Mar 31 03:49:22 pg=(012215,000456) sw=(02274,03167) 001296u 004207s (027356)
Mar 31 03:51:37 pg=(013547,000912) sw=(02282,03167) 003272u 015213s (027920)
  2:12 -> 2:15 (2% loss for a non-swapping run)

================ THE TIMING SCRIPT (just in case)

echo `date | awk '{print $2 " " $3 " " $4}'` \
     `fgrep page /proc/stat | awk '{printf "pg=(%06d,%06d)",$2,$3}'` \
     `fgrep swap /proc/stat | awk '{printf "sw=(%05d,%05d)",$2,$3}'` \
     `head -1 /proc/stat | awk '{printf "%06du %06ds (%06d)",$2,$4,$5}'`

-- 
=========================
alessandro rubini rubini@ipvvis.unipv.it
=========================