
Mumps (the language)

Kevin O'Kane okane at cs.uni.edu
Mon Jun 24 04:29:20 EST 2002


Home Page: http://www.cs.uni.edu/~okane 

The Mumps language originated in the mid-60's at the Massachusetts General Hospital. 
The acronym stands for "Massachusetts General Hospital Utility Multi-Programming System". 
While it has been used in a number of areas, its primary application is in medicine.
Although most implementations are proprietary, consolidated into the hands of a small 
number of vendors, an open source version of the language has been developed which is 
distributed freely under the GNU GPL and LGPL licenses.

Mumps is potentially attractive for bioinformatics applications because:

- It supports a hierarchical data base facility.  Mumps data sets are not only organized 
  according to traditional sequential and direct access models, but also as hierarchical trees 
  whose data nodes are addressed as many-level path descriptions in a manner that is easy for 
  a programmer to master in a relatively short time.  

- The data base can also be viewed as string-indexed, many-dimensional matrices of effectively 
  unlimited size.  

- The underlying data base processor, the Berkeley DB, can be configured for data bases up to 
  256 terabytes in size.

- Mumps has flexible and powerful string manipulation facilities. Its built-in string manipulation 
  operators and functions, which include the Perl Compatible Regular Expression Library, permit
  complex string manipulation and pattern matching operations.

- This version of Mumps, unlike all others, is a compiler that translates Mumps code to C.
  Mumps subroutines can be constructed so that they can be called by any other program that obeys
  the C calling conventions.  Each Mumps subroutine is fully functional and requires no additional
  interpreters, main programs, language processors, etc., other than ordinary link libraries. 
  Similarly, Mumps programs and subroutines can call any other system facility that uses a C 
  calling structure.  These features are unique to this version of Mumps and make it possible 
  to exploit Mumps' features in non-Mumps contexts.

- The data base can operate in standalone or client-server mode.  In standalone mode, multiple
  programs simultaneously access the same data base files.  In client-server mode, Mumps client 
  programs or functions access local or remote Mumps data bases through TCP/IP or UDP protocols.  
  TCP/IP connections have the option of using OpenSSL encryption.  All of these are compile-time 
  switch options and require no program modifications to use.

- Mumps programs can be used with the open source Gtk-based Glade "drag and drop" GUI builder.  
  This permits rapid deployment of user-friendly GUIs (see references below for examples).

- Mumps routines can be used to easily construct CGI executable scripts for data base access.  Mumps 
  programs can be called directly by the web server and have built-in facilities to parse the QUERY_STRING
  web server environment variable to instantiate program variables and data (see references).

- Direct SQL commands can access PostgreSQL RDBMS data bases (can be modified for MySQL) with
  the results archived to native tables (matrices) or trees.
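
To make the tree and string-indexed matrix views in the list above a little more concrete, here
is a minimal sketch in Python (not Mumps, and not a description of how this compiler or the
Berkeley DB actually stores data).  It assumes a toy in-memory store in which a node is addressed
by a path of string subscripts and sibling subscripts can be visited in sorted order, loosely
analogous to stepping through subscripts with the standard Mumps $ORDER function.  The class name
Tree, its methods and the sample subscripts are all hypothetical, and the ordering used is plain
Python string order rather than Mumps collation.

```python
from bisect import bisect_right, insort


class Tree:
    """Toy in-memory stand-in for a path-addressed, string-indexed store."""

    def __init__(self):
        self.value = None      # data held at this node, if any
        self.children = {}     # subscript -> child Tree
        self.keys = []         # child subscripts, kept in sorted order

    def set(self, path, value):
        """Store `value` at the node addressed by `path`, creating nodes as needed."""
        node = self
        for key in path:
            if key not in node.children:
                node.children[key] = Tree()
                insort(node.keys, key)
            node = node.children[key]
        node.value = value

    def get(self, path):
        """Return the value stored at `path`, or None if the node does not exist."""
        node = self
        for key in path:
            node = node.children.get(key)
            if node is None:
                return None
        return node.value

    def next_key(self, path, key=""):
        """Return the subscript that follows `key` below `path`, or "" at the end."""
        node = self
        for k in path:
            node = node.children.get(k)
            if node is None:
                return ""
        i = bisect_right(node.keys, key)
        return node.keys[i] if i < len(node.keys) else ""


# Illustrative data only: the same store read as a tree or as a sparse matrix.
db = Tree()
db.set(("seq", "seqA", "length"), 1200)
db.set(("seq", "seqA", "organism"), "example organism")
db.set(("seq", "seqB", "length"), 850)

print(db.get(("seq", "seqA", "length")))

# Visit every subscript under ("seq",) in sorted order.
key = db.next_key(("seq",))
while key:
    print(key, db.get(("seq", key, "length")))
    key = db.next_key(("seq",), key)
```

The point of the sketch is only that a single addressing scheme serves both views: a full path
names one cell of a sparse, many-dimensional matrix, while a partial path names a subtree whose
children can be enumerated.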

Initial testing has been done using Mumps in connection with the NCBI BLAST software
(ftp.ncbi.nih.gov/blast/demo).  In the test, data were moved directly from the "doblast" 
example output routines to a Mumps tree-structured data base and subsequently accessed 
without problems.  There appear to be no compatibility issues involved with using Mumps
with the NCBI Toolkit.   The prototype code is given in the references below.  It 
demonstrates, albeit somewhat trivially, an easy way to organize sequence matching data
hierarchically.
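
As a rough illustration of what organizing sequence matching data hierarchically can look like,
the following Python sketch (again not Mumps, and not the prototype code referred to above) files
a few match records under query id and then subject id.  The record layout, the identifiers and
the numbers are hypothetical placeholders, not BLAST output, and best_hit is a helper invented
for this example.

```python
from collections import defaultdict

# Hypothetical (query, subject, score, e-value) records; illustrative values only.
hits = [
    ("query1", "subjA", 52.0, 1e-10),
    ("query1", "subjB", 35.5, 2e-4),
    ("query2", "subjA", 41.0, 3e-7),
]

# Two-level tree: query id -> subject id -> fields of the match.
tree = defaultdict(dict)
for query, subject, score, evalue in hits:
    tree[query][subject] = {"score": score, "evalue": evalue}


def best_hit(tree, query):
    """Return the subject id with the highest score recorded for `query`."""
    subjects = tree[query]
    return max(subjects, key=lambda s: subjects[s]["score"])


# Walk the top level in sorted order and report the best match for each query.
for query in sorted(tree):
    print(query, best_hit(tree, query))
```

Once the matches are filed this way, questions such as "all subjects matched by one query" or
"the best match per query" become a walk over one level of the tree rather than a scan of a flat
report.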

We would be very interested in suggestions on how we might extend this work to make it more 
useful for bioinformatics applications, as well as suggestions for demonstration projects.

As noted, all the software is open source and GNU GPL/LGPL. The main web page for this work, which 
includes coding examples, manuals and so forth, is:

        http://www.cs.uni.edu/~okane

The direct link to the documentation is:

        http://www.cs.uni.edu/~okane/source/compiler.html

The link to the BLAST example is:

        http://www.cs.uni.edu/~okane/source/compiler.html#blast

The source code is at:

        http://www.cs.uni.edu/~okane/source

The main development and testing vehicle is Linux.

-- 
Kevin C. O'Kane
Department of Computer Science 
University of Northern Iowa
Cedar Falls, IA 50614-0507
http://www.cs.uni.edu/~okane
okane at cs.uni.edu



