Copyright 1982, 1985, 1992 by William R. Pearson. All rights reserved.

The MAPC, MAPL, or DIGED programs and documentation may not be sold or incorporated into a commercial product, in whole or in part, without written consent of William R. Pearson.

Revised May, 1992

I have had several requests lately for my mapping programs for inferring restriction maps from fragment length data, so I have recompiled them under UNIX. The notes below refer to the MS-DOS version, which is essentially identical to the UNIX version. The source code has been changed to move from Microsoft Fortran 3.3 to SunOS f77. Workstations have become a lot faster in the past 10 years, so it may be quite reasonable to attempt mapping problems that are larger than those outlined below.

August, 1985

This disk contains the programs and source code for the restriction mapping programs MAPC (circular molecules) and MAPL (linear molecules) described in Nucleic Acids Research (1982) 10:217-227. In addition, an improved data entry program, DIGED, is included. These programs are the same as the original distribution for the DEC-10/20 and VAX computers.

The program files are:

    mapc, mapl    map circular or linear molecules. mapc will work on
                  linear molecules, but mapl is faster.
    diged         improved data entry program that allows editing of
                  digest data

In addition to the programs, I have included several test data files: ct1.dat, ct2.dat, lt1.dat, lt2.dat. CT1 and CT2 are tests for circular molecules; LT1 and LT2 are tests for linear molecules. CT1 and LT1 are artificial tests that should be solved very rapidly with errors < 10^-6; CT2 and LT2 are real data. The results of these tests are shown in CT1.RES, etc.

Running the programs.

To run the mapping programs, you must first create a data file using diged and then run mapc or mapl.
For example, to map the test file ct2.dat, you would type:

    % mapc                                ; run the program
    Type data filename ct2.dat            ; test file name
    Error, Efact = 0.02, 2.00             ; be sure to use decimal points and
                                          ; separate the values with a comma.
    Using ERR= .0200 EFACT= 2.00
    pGT55 glutathione-s-transferase 16-Jan-81
    ERROR = .0200 EFACT = 2.000
    936 Digestions calculated in 0 sec

Running the mapping programs is simple; just remember to enter the Error and Efact values as real numbers with decimal points, and separate them with a comma. If you forget a decimal point, a very low value for the error or efact may be used. If you forget to enter one of the numbers, the program will wait until it is typed or a / is typed.

These mapping programs are fast and efficient, but they have their limits. Enzymes which make one to five cuts can be mapped quickly, but more than 5 fragments take a lot more time. It is impossible to accurately map more than 8 unknown fragments with this program, and it would take forever. [This may no longer be true.] Once you get a feel for the program with simple digests (sub-cloned phage fragments or lots of six-cutters), you can try more complicated problems.

When trying to solve complex problems (more than 5 fragments in 2 or more digests), start with very low error limits (<0.01) and raise the limits after successive failures. If the error value is too low, the search will fail quickly and you can try a larger value. If the error value is too high, it may take several hours to get an answer.

DIGED

In the NAR paper, an example of data entry using the program DIGFIL was shown. Because DIGFIL required retyping all of the data to make the smallest change, I have written a new program, DIGED. In addition to the data entry functions of DIGFIL, DIGED allows you to add enzymes, change fragment sizes, and reorder data from already existing files. The data entry portions of DIGED are exactly like DIGFIL. The new sections should be self-explanatory.
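
The advice under "Running the programs" for complex problems (start with a very low error limit and raise it after each failed search) can be sketched in Python as below. This is only an illustration and not part of the distributed programs: the starting value, the doubling factor, the ceiling, and the try_fit callable (a stand-in for one run of mapc or mapl) are all assumptions.

```python
from typing import Callable, Iterable, Iterator, Optional


def error_schedule(start: float = 0.005, factor: float = 2.0,
                   ceiling: float = 0.05) -> Iterator[float]:
    """Yield increasing error limits: start, start*factor, ... up to ceiling.

    The defaults are illustrative assumptions, not values from the programs.
    """
    err = start
    while err <= ceiling:
        yield err
        err *= factor


def first_successful_error(try_fit: Callable[[float], bool],
                           limits: Iterable[float]) -> Optional[float]:
    """Return the first error limit for which try_fit reports a map, else None.

    try_fit is a hypothetical stand-in for one run of mapc or mapl.
    """
    for err in limits:
        if try_fit(err):
            return err
    return None


if __name__ == "__main__":
    # Pretend the data can only be fitted once the limit reaches 0.02.
    found = first_successful_error(lambda err: err >= 0.02, error_schedule())
    print(found)
```

Stepping upward from a low limit keeps the failed runs short, since a limit that is too low fails quickly while a limit that is too high may run for hours.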
Before you can map restriction data, they must be entered into a data file. Most people have much more difficulty with this process than with the actual mapping, because several kinds of data must be entered that one does not usually consider explicitly when mapping by hand.

Again, when entering values for the size of fragments, be sure to use decimal points. And when entering the integer IBEG and IEND parameters, do not use decimal points. In addition, f77 requires that lists with an arbitrary number of values (such as restriction fragment sizes) end with a slash (/). For example, to enter restriction fragment sizes, you would type:

    BAM1: 14.7, 8.4, 4.5, 3.2, /

If you forget the /, the program will wait for you to type it.

Most new users of the mapping programs are confused by some of the parameters required to map circular molecules. One of the most common mistakes is to forget to enter -1. when asked for the restriction digest XOFF (-1. UNKNOWN). XOFF is the coordinate of any known restriction site for the given enzyme in the molecule being mapped. If no site is known, -1. MUST BE USED. The entry cannot be left blank, or the program assumes that there is a known restriction site at 0.0.

Circular molecules also require the investigator to specify a NEW FRAGment in the double digest data which is not present in either single digest. This information is also essential and cannot be left blank. I have left the choice of the NEW FRAG to the user so that you may specify a fragment clearly different from those in either single digest. In addition, the size of this fragment should be accurately known.

You should also note that the order of restriction digests is important, since the program tries to fit the first two digests and then includes the third, fourth, etc. The program can go much faster if the digests with few restriction fragments are tested before digests with a large number of fragments.
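
The ordering rule above (digests with few fragments before digests with many) can be sketched in Python as follows. This is a hypothetical illustration, not code from the distributed programs; the enzyme names ENZ_A and ENZ_B and their fragment sizes are made up, and only the BAM1 sizes come from the entry example earlier in these notes.

```python
from typing import Dict, List, Tuple


def order_digests(digests: Dict[str, List[float]]) -> List[Tuple[str, List[float]]]:
    """Return (enzyme, fragments) pairs sorted by fragment count, fewest first.

    Python's sort is stable, so digests with equal counts keep their
    original relative order.
    """
    return sorted(digests.items(), key=lambda item: len(item[1]))


if __name__ == "__main__":
    digests = {
        "ENZ_A": [10.2, 7.6, 6.0, 4.5, 2.5],  # made-up sizes
        "BAM1": [14.7, 8.4, 4.5, 3.2],        # sizes from the entry example
        "ENZ_B": [21.0, 9.8],                 # made-up sizes
    }
    for name, fragments in order_digests(digests):
        print(name, len(fragments))
```

The sketch ranks digests by fragment count only; judging which data are more reliable is left to the user.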
In addition, the best restriction fragment data should be tested before poorer data. DIGED offers an option to reorder the enzyme digest data for efficient fitting.

Recompiling the programs:

I have included all of the source files required to recompile the mapping and data entry programs, so that you can modify (and hopefully improve) them. The programs would be much more useful if they took into consideration which fragments were not cut in a digestion, and knew that certain sites were not present in the vector. To rebuild the programs, type:

    make all

This program is a direct translation from earlier large machine versions, but the UNIX version has not been extensively tested. I would appreciate hearing about any bugs you might find.

William R. Pearson
Department of Biochemistry
Box 440 Jordan Hall
University of Virginia
Charlottesville, VA 22908
wrp@virginia.EDU