************ k2d.read.me ******************
Documentation file for the k2d program
for protein secondary structure prediction.
Last update: 14/4/96
*******************************************
1.- FILES SUPPLIED
k2d.read.me - This file
k2d.zip - A zip-compressed file containing
  * weights.dat - a file containing the weights of 100 trainings
  * k2d.c - the k2d program written in C
  * k2d.exe - the executable k2d program for PC
  * gd.dat - CD sample (glyceraldehyde-3-phosphate dehydrogenase)
k2d.SUN.tar.Z - A compressed tar file containing
  * weights.dat - a file containing the weights of 100 trainings
  * k2d.c - the k2d program written in C
  * k2d - the executable k2d program for SUN
  * gd.dat - CD sample (glyceraldehyde-3-phosphate dehydrogenase)
2.- HOW TO USE k2d?
- Place the executable program and the weights.dat file in the same
directory.
- Generate a file with the CD spectrum of your sample. It must contain 41 CD
values ranging from 200 nm to 241 nm. You can also experiment with
the example supplied (gd.dat).
The CD values must be given in deg cm^2 dmol^-1 multiplied by 0.001.
- Run the k2d program following the instructions on the screen.
- The program generates two files:
* The output CD file has three columns.
The 1st has the wavelength values, the 2nd is the CD spectrum of the
sample, and the 3rd is the CD spectrum of the winning neuron,
averaged over the 100 sets of training weights contained in
weights.dat.
* The percentage file gives the predicted alpha, beta and
random coil values. Additionally, it gives the square of
the Euclidean distance between the real CD spectrum and that of the
winning neuron and, according to this distance, an estimate of
the mean error in the prediction of the three secondary structure
values. If the distance is too large, the prediction may not be
reliable, and the program cannot give an error estimate. In this
case the predicted values should not be taken into account.
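Two of the quantities mentioned above, the scaling of the input values
and the square of the Euclidean distance between two spectra, can be
illustrated with a minimal C sketch. The names K2D_POINTS, scale_cd_value
and squared_distance are hypothetical and are not taken from k2d.c; the
sketch only shows the arithmetic and does not claim to reproduce how k2d
itself combines the 100 trainings.

```c
#include <assert.h>
#include <stddef.h>

/* Number of CD values the input file must contain (from section 2). */
#define K2D_POINTS 41

/* Convert a value in deg cm^2 dmol^-1 to the scaled value expected in
   the input file, i.e. multiply it by 0.001. */
double scale_cd_value(double deg_cm2_per_dmol)
{
    return deg_cm2_per_dmol * 0.001;
}

/* Square of the Euclidean distance between two spectra of n points:
   the sum over i of (a[i] - b[i])^2. */
double squared_distance(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    size_t i;

    assert(a != NULL && b != NULL);
    for (i = 0; i < n; i++) {
        double d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}
```

For two spectra read from the 2nd and 3rd columns of the output CD file,
the call would be squared_distance(sample, winner, K2D_POINTS).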
3.- AN EXAMPLE
- Run the k2d program.
- Use the 'gd.dat' file as input file.
- Generate a CD file and a percentage file.
- You can display the CD file using your favourite graphics tool
to see how the computed spectrum mimics the sample spectrum.
- The resulting percentage values are 0.30, 0.12, 0.58. The square
of the distance between the two spectra is 32.20. According to this
distance, the program has given a maximum mean error of 0.080.
This means that the sum of the errors in the prediction of the
alpha, beta and random coil values divided by three is expected
to be less than 0.08.
Since the secondary structure percentage values of
glyceraldehyde-3-phosphate dehydrogenase are 0.30, 0.22 and 0.48,
the sum of the absolute errors in the three predicted values is
0.00+0.10+0.10=0.20. Since 0.20/3=0.067 < 0.080, the error is
below the maximum error given by the program.
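The check in this example can be reproduced with a few lines of C. This
is only a sketch of the arithmetic; mean_abs_error is a hypothetical
helper and is not part of k2d.c.

```c
#include <assert.h>
#include <stddef.h>

/* Mean absolute error over the three secondary structure values
   (alpha, beta, random coil): the sum of |predicted - reference|
   divided by three. */
double mean_abs_error(const double predicted[3], const double reference[3])
{
    double sum = 0.0;
    int i;

    assert(predicted != NULL && reference != NULL);
    for (i = 0; i < 3; i++) {
        double d = predicted[i] - reference[i];
        sum += (d < 0.0) ? -d : d;   /* absolute value of the error */
    }
    return sum / 3.0;
}
```

With predicted = {0.30, 0.12, 0.58} and reference = {0.30, 0.22, 0.48},
the function returns about 0.067, which is below the 0.080 reported by
the program for this sample.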
4.- THE ALGORITHM
- The algorithm has been published in:
M.A. Andrade, P. Chacon, J.J. Merelo and F. Moran. (1993)
"Evaluation of secondary structure of proteins from UV circular
dichroism spectra using an unsupervised learning neural network".
Protein Engineering. 6: 383-390
J.J. Merelo, M.A. Andrade, A. Prieto and F. Moran. (1994)
"Proteinotopic Feature Maps". Neurocomputing. 6: 443-454
5.- SEND A MAIL
- If you get the program, please let us know your e-mail address by
sending a mail to andrade@embl-heidelberg.de
We will inform you about future versions of the program.