R factors and B factors

R. Bryan Sutton sutton at laplace.csb.yale.edu
Tue Oct 27 16:51:05 EST 1998

R. Bryan Sutton
Department of Biophysics and Molecular Biology
Howard Hughes Medical Institute
Yale University
266 Whitney Ave. BASS 434A
New Haven, CT 06520

email: sutton at laplace.csb.yale.edu

Hi Alex,

You asked for it...

I thought that I'd relate to you some of the newer concepts behind
macromolecular refinement.  There seem to be some fundamental
prejudices about protein structure refinement that need to be
addressed.  To address your first question:

>1. What's generally the acceptable R-factors for a structure, with the
>data of 2.5 A; 3.0 A; and 3.5 A.

As far as the traditional working R is concerned, its value depends on a lot of things.  You can't really
make hard-and-fast rules about the absolute number.  While it is true that
you don't want to report ridiculously high values, the old rule that
anything in the high teens to low 20's is good is not a reasonable answer.  Macromolecular structures have
probably been severely overfit for years.  The assumption was that our data were measured perfectly and
the model was complete; this is clearly not the case.  Also, we simply don't have enough observations
to justify the number of parameters we are fitting in the model.  Newer refinement programs such as CNS and
Refmac attempt to remedy these problems by using target functions (the residual function, for example)
that take the errors in the data and the errors in the model into account.  This is known as
Maximum Likelihood refinement.  With this type of target function, your R-factor
will be higher, since it is harder (but not impossible) to overfit.  In the coming
years, the R-factors of refined protein models will probably go up... at least they ought to go up.
This type of refinement strategy keeps you more honest.  So, to answer your question, the acceptable
R-factor is the lowest one you can reach while the model still stays honest to your measurements.
It may be that you cannot refine below 30%; in such a case... them's the breaks...
Oh yeah... about sigma cutoffs:
Poorly measured data are better than no data at all.  So it is statistically a good idea to use ALL data (0 sigma and higher)
during refinement.  Weak data shouldn't be discriminated against... they're just as valid.  Again, the R-factor
will suffer a bit... but it's probably a better indicator of your refinement progress anyway.

Another, more important indicator of your model quality is the R-free value, which is computed from a
small test set of reflections that is never used in refinement.  In the newer refinement packages,
this value should be slightly higher (~2% - 5% higher) than the working R.  Actually, the working R
is "almost" meaningless.  Real experimental data have shown that
changes in R-free are more consistent with the actual progress of refinement than changes in the
traditional working R... although the working R looks better in journals.  Basically, there IS no
acceptable R-factor that depends only on resolution.  It's a case-by-case thing.
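
In case it helps to see the bookkeeping, here is a minimal sketch of how the working R and R-free are
computed from structure-factor amplitudes, using the usual definition R = sum(| |Fobs| - k|Fcalc| |) / sum(|Fobs|).
The function names, the single overall scale factor k taken from the working set, and the toy amplitudes
are my own assumptions for illustration; CNS and Refmac do their own (more careful) scaling.

```python
def r_factor(f_obs, f_calc, k=1.0):
    """Conventional R = sum(| |Fobs| - k*|Fcalc| |) / sum(|Fobs|)."""
    return sum(abs(o - k * c) for o, c in zip(f_obs, f_calc)) / sum(f_obs)


def r_work_and_free(f_obs, f_calc, is_free):
    """Split reflections by the test-set flag and return (R_work, R_free).

    Assumption: one overall scale factor k, derived from the working set
    only, is applied to both sets.
    """
    work = [(o, c) for o, c, free in zip(f_obs, f_calc, is_free) if not free]
    test = [(o, c) for o, c, free in zip(f_obs, f_calc, is_free) if free]
    k = sum(o for o, _ in work) / sum(c for _, c in work)
    r_work = r_factor([o for o, _ in work], [c for _, c in work], k)
    r_free = r_factor([o for o, _ in test], [c for _, c in test], k)
    return r_work, r_free


# Toy amplitudes (made up); the last two reflections are flagged as the test set.
f_obs = [100.0, 80.0, 60.0, 40.0, 20.0, 50.0]
f_calc = [90.0, 85.0, 55.0, 45.0, 25.0, 40.0]
is_free = [False, False, False, False, True, True]

r_work, r_free = r_work_and_free(f_obs, f_calc, is_free)
print(f"R_work = {r_work:.3f}, R_free = {r_free:.3f}")
```

In a real refinement the test set is typically a few percent of the reflections, flagged once and kept out
of the target function from then on, so the gap between the two numbers tells you something about overfitting.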

>2. Is it true, that the higher resolution your data has, the lower
>R-factors you should get? Or just the way around?

The higher the resolution, the more parameters you are justified in using in your refinement.
So, conceivably, you should get better models and, therefore, better R-factors.  But again,
it depends on your crystal and your data.  I suppose it's possible to have 1.5 A data with a
high final R-factor.  Again, it depends on YOUR situation.  There is no good, statistically valid
rule of thumb anymore.
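
To put a rough number on why resolution buys you parameters: the count of reciprocal-lattice points
inside the resolution sphere is about (4/3)*pi*V/d^3, so the number of observations grows as 1/d^3.
The sketch below is only an estimate, assuming complete data and a non-centrosymmetric space group;
the function name and the example cell volume are hypothetical.

```python
import math


def approx_unique_reflections(cell_volume_A3, d_min_A, n_pointgroup_ops=1):
    """Rough count of unique reflections to resolution d_min (in Angstrom).

    Lattice points inside a sphere of radius 1/d_min: (4/3) * pi * V / d_min^3.
    Friedel's law halves that, and the rotational symmetry of the point
    group (n_pointgroup_ops operators) divides it further.
    """
    n_points = (4.0 / 3.0) * math.pi * cell_volume_A3 / d_min_A ** 3
    return n_points / (2.0 * n_pointgroup_ops)


# Hypothetical cell volume of 200,000 A^3 with point group 222 (4 operators).
for d_min in (3.5, 3.0, 2.5, 1.5):
    n = approx_unique_reflections(200_000.0, d_min, n_pointgroup_ops=4)
    print(f"{d_min:.1f} A: about {n:,.0f} unique reflections")
```

Going from 3.0 A to 1.5 A gives you roughly eight times as many observations for the same crystal,
which is what justifies a more detailed model.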

>3. What's the significance of the atomic B-factors when you have a low
>resolution data, for example, 3.0 A; or 3.5 A.

The value of the atomic B-factor depends on the quality of your data.  You can have a
crystal that diffracts to high resolution but also has high (>100 A^2) overall B-factors.
It depends on your crystal.  If you're asking whether you have the right to
apply individual B-factor refinement to lower resolution data, the answer is... probably not.
It's statistically safer to use grouped B-factor refinement at this resolution.  However, if
your R-free value decreases using atomic B-factor refinement, you may be justified... although
you may have to fight with a referee a bit.
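
Here is a back-of-the-envelope way to see the difference between the two B-factor schemes, counting
x, y, z for every atom plus either one isotropic B per atom (individual) or one or two B's per residue
(grouped).  The function names and the example numbers are hypothetical, and the count ignores
geometric restraints, which in practice act as extra observations.

```python
def refinement_parameters(n_atoms, n_residues, b_groups_per_residue=None):
    """Count refined parameters: x, y, z per atom plus the B-factors.

    b_groups_per_residue=None means individual isotropic B (one per atom);
    otherwise that many B-factors are refined per residue (grouped B).
    """
    n_xyz = 3 * n_atoms
    if b_groups_per_residue is None:
        n_b = n_atoms
    else:
        n_b = b_groups_per_residue * n_residues
    return n_xyz + n_b


def observations_per_parameter(n_reflections, n_parameters):
    """Ratio of unique reflections to refined parameters."""
    return n_reflections / n_parameters


# Hypothetical low-resolution case: 2000 atoms, 250 residues, 8000 reflections.
n_atoms, n_residues, n_reflections = 2000, 250, 8000
individual = refinement_parameters(n_atoms, n_residues)
grouped = refinement_parameters(n_atoms, n_residues, b_groups_per_residue=2)

print("individual B:", observations_per_parameter(n_reflections, individual))
print("grouped B:   ", observations_per_parameter(n_reflections, grouped))
```

Grouped B-factors take fewer parameters, so the same data are spread over less model; whether the
individual scheme is still justified is exactly what the R-free test above is for.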
