|STAT 5.4 DATA MANIPULATION & ANALYSIS PROGRAMS FOR UNIX and MSDOS

|STAT is a set of over 20 data manipulation and analysis programs
developed by Gary Perlman at the University of California, San Diego
and at the Wang Institute of Graduate Studies. The programs follow the
UNIX philosophy that individual programs should be tools that do one
task well and produce output suitable for input via pipes to other
programs. Interactive use is supported by the command line interpreter,
which also provides a programming language for complex analyses.
Functions built into many statistical packages (e.g., graphics and
editing) are not re-invented in |STAT, which delegates such
responsibility to standard tools. Typical usage involves a pipeline of
transformations of data followed by input to an analysis program,
summarized schematically by:

    INPUT DATA | TRANSFORM | ANALYSIS | OUTPUT RESULTS

Data Manipulation Programs:
    abut        join data files beside each other
    colex       column extraction/formatting
    dm          conditional data extraction/transformation
    dsort       multiple key data sorting filter
    linex       line extraction
    maketrix    create matrix format file from free-format input
    perm        permute line order randomly, numerically, alphabetically
    probdist    probability distribution functions
    ranksort    convert data to ranks
    repeat      repeat strings or lines in files
    reverse     reverse lines, columns, or characters
    series      generate an additive series of numbers
    transpose   transpose matrix format input
    validata    verify data file consistency

Data Analysis Programs:
    anova       multi-factor analysis of variance
    calc        interactive algebraic modeling calculator
    contab      contingency tables and chi-square
    desc        descriptions, histograms, frequency tables
    dprime      signal detection d' and beta calculations
    features    tabulate features of items
    oneway      one-way anova/t-test with error-bar plots
    pair        paired data statistics, regression, scatterplots
    rankind     rank order analysis for independent conditions
    rankrel     rank order analysis for related conditions
    regress     multiple linear regression and correlation
    stats       simple summary statistics
    ts          time series analysis and plots

Package Features:
    simple input formats (free format field oriented)
    flexible data manipulation
    several simple lineprinter plotting options
    data validation (range and type checking)
    consistent option conventions with online help
    runs on any UNIX System (V6, V7, 2.8BSD, 4BSD, System V, etc.)
    runs on MSDOS 2.0 and 3.0 with 96K (IBM, Wang, AT&T, Epson, etc.)
    usually less than a few seconds per analysis
    liberal copyright (but can't be distributed for gain)

Notes:
    UNIX is a trademark of AT&T Bell Laboratories.
    MSDOS is a trademark of Microsoft.
    |STAT is NOT a product of any company or organization.

Distribution Conditions:

CAREFULLY READ THE FOLLOWING CONDITIONS. IF YOU DO NOT FIND THEM
ACCEPTABLE, YOU SHOULD NOT USE |STAT.

|STAT IS PROVIDED "AS IS" AND WITHOUT ANY WARRANTY EXPRESS OR IMPLIED.
THE USER ASSUMES ALL RISKS OF USING |STAT. THERE IS NO CLAIM OF THE
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. |STAT MAY NOT BE
SUITED TO YOUR NEEDS. |STAT MAY NOT RUN ON YOUR PARTICULAR HARDWARE OR
SOFTWARE CONFIGURATION. THE AVAILABILITY OF AND PROGRAMS IN |STAT MAY
CHANGE WITHOUT NOTICE. NEITHER MANUFACTURER NOR DISTRIBUTOR BEARS
RESPONSIBILITY FOR ANY MISHAP OR ECONOMIC LOSS RESULTING FROM THE USE
OF |STAT EVEN IF THE PROGRAMS PROVE TO BE DEFECTIVE.

|STAT IS NOT INTENDED FOR CONSUMER USE. CASUAL USE BY USERS NOT TRAINED
IN STATISTICS, OR BY USERS NOT SUPERVISED BY PERSONS TRAINED IN
STATISTICS, MUST BE AVOIDED. USERS MUST BE TRAINED AT THEIR OWN EXPENSE
TO LEARN TO USE THE PROGRAMS.

DATA ANALYSIS PROGRAMS MAKE MANY ASSUMPTIONS ABOUT DATA; THESE
ASSUMPTIONS AFFECT THE VALIDITY OF CONCLUSIONS MADE BASED ON THE
PROGRAMS. REFERENCES TO APPROPRIATE STATISTICAL SOURCES ARE MADE IN THE
|STAT HANDBOOK AND IN THE MANUAL ENTRIES FOR SPECIFIC PROGRAMS.

THE PROGRAMS HAVE NOT BEEN VALIDATED FOR LARGE DATASETS, HIGHLY VARIABLE
DATA, NOR VERY LARGE NUMBERS.

YOU MAY MAKE COPIES OF ANY TANGIBLE FORMS OF |STAT, PROVIDED THAT THERE
IS NO MATERIAL GAIN INVOLVED, AND PROVIDED THAT THE INFORMATION IN THIS
NOTICE ACCOMPANIES EVERY COPY. YOU MAY DISTRIBUTE COPIES OF |STAT,
PROVIDED THAT MASS DISTRIBUTION (SUCH AS ELECTRONIC BULLETIN BOARDS) IS
NOT USED.

YOU MAY NOT MODIFY THE SOURCE CODE FOR ANY PURPOSES OTHER THAN GETTING
THE PROGRAMS TO WORK ON YOUR SYSTEM. ANY COSTS IN COMPILING OR PORTING
|STAT TO YOUR SYSTEM ARE YOURS ALONE, AND NOT ANY OTHER PARTY'S. YOU MAY
NOT DISTRIBUTE ANY MODIFIED SOURCE CODE OR DOCUMENTATION TO USERS AT ANY
SITES OTHER THAN YOUR OWN.

Ordering Information 8/30/89:

Carefully read the instructions below. Orders not following them may be
returned or even discarded. All prices include delivery and should be
prepaid to G. Perlman. Checks must be in US funds, drawn on a US bank.
Orders that demand any terms or conditions other than those in this
notice may be returned or discarded. Orders must include a delivery
mailing label acceptable to the post office, and international orders
must include the country name on the label.

UNIX Version of |STAT: $20/$30
    Contents: Programs (C language) & Online Manual Entries
    Format:   half inch 9 track mag tape, 1600 bpi tar format
    Format:   1/4 inch cartridge tape (this version costs $30)

MSDOS Version of |STAT: $15
    Contents: Preformatted Manuals and Executables
    Format:   2S/2D DOS 5.25 inch floppy diskettes
    Format:   1.2 Mbyte HD DOS 5.25 inch floppy diskette

MSDOS Source: $10
    Contents: C Source Code, Turbo C Project Files, Preformatted Manuals
    Format:   1.2 Mbyte HD DOS 5.25 inch or 3.5 inch floppy

Handbook: $10
    Contents: Examples, Reference Materials, CALC & DM Manuals, Manual Entries
    Format:   Typeset Manual (over 100 pages)

Gary Perlman
Department of Computer and Information Science    perlman@cis.ohio-state.edu
The Ohio State University                         614-292-2566
2036 Neil Avenue Mall
Columbus, OH 43210-1277