Sequence formats

The server can handle several file formats: plain text, FASTA, NBRF/PIR and Swiss-Prot format.

Plain text format

A sequence in one-letter code (lower or upper case), e.g. entered by copy and paste. This format can be used only in single sequence mode (see below).

Example:

    ACFGHIKLMPQRTYVVFGHKLPPASSCVFGHKLMNVVVVDEQVREWTYPLLLASW
    ERTYMCDK

FASTA format

A sequence in FASTA format begins with a single-line description, followed by lines of sequence data. The description line is distinguished from the sequence data by a greater-than (">") symbol in the first column. It is recommended that all lines be shorter than 80 characters. Sequences are expected to be represented in the standard IUB/IUPAC amino acid codes.

Example:

    >MYPAT This is the name of my pat protein. In one line!
    TYPLKLPPASSCVFGHKLMNVVVVDEQVREWTYPLKLPPASSCVFGHKLMNVVVV
    DEQVREWTYPL
    >MYPATWO This is the name of my other pat protein. In one line!
    ACFGHIKLMPQRTYVVFGHKLPPASSCVFGHKLMNVVVVDEQVREWTYPLLLASW
    ERTYMCDKTYPLKLPPAS

NBRF/PIR format

A sequence in NBRF/PIR format begins with a two-line description, followed by lines of sequence data. The first description line is distinguished from the sequence data by the starting symbols ">P1;". The second line contains the comments. The sequence must end with an asterisk ("*").

Example:

    >P1;MYPAT
    This is some comment in the second line.
    TYPLKLPPASSCVFGHKLMNVVVVDEQVREWTYPLKLPPASSCVFGHKLMNVVVV
    DEQVREWTYPL*
    >P1;MYPATWO
    Hello it is just a comment.
    ACFGHIKLMPQRTYVVFGHKLPPASSCVFGHKLMNVVVVDEQVREWTYPLLLASW
    ERTYMCDKTYPLKLPPAS*

Sequence type

The server can handle the submitted sequences in two different ways: as single sequences or as homologous sequences.

Single sequences

The server sends back a topology prediction for each sequence. Each prediction uses only the information from that single sequence.

Homologous sequences

The server sends back only one topology prediction.
All the submitted sequences are used as homologues in the prediction. The prediction time grows linearly with the number of submitted sequences.

Prediction type

The server can run in two different modes: reliable or fast.

Reliable mode

In this mode the server performs the Baum-Welch iteration for the submitted sequence(s), i.e. it optimizes the model to search for the best topology. The results are therefore more reliable, but this mode is more time-consuming.

Fast mode

In this mode the server skips the Baum-Welch iteration for the submitted sequence(s) and only runs the Viterbi algorithm, using parameters derived from the topology information of well-known transmembrane proteins. The results are therefore generated very quickly, but the prediction accuracy is lower. This option is useful for genome screening.

Localization of sequence part(s)

If the localization of some parts of the query protein is known, this option allows you to lock this part or these parts. The prediction is then made with these restrictions applied as conditional probabilities. The syntax of the localization is begin_pos-end_pos-type, where begin_pos and end_pos are the sequence positions of the segment to be localized, and type is a one-letter code for the part of the cell in which the segment is located:

    I: inside
    i: inside-tail
    H: transmembrane-helix
    O: outside
    o: outside-tail

Examples:

    123-145-I

generates a prediction in which the segment between residues 123 and 145 is in the cytosol (inside).

    10-45-O 156-E-I

generates a prediction in which the segment between residues 10 and 45 is outside AND the segment from residue 156 to the C terminus is inside.

Output format

The output can be generated as an HTML file or as a simple text file. The latter option is useful for further processing of the result(s).

Results in one line

This option produces a short output in which the prediction for each protein is on one line.
The line begins with the >HP: symbol, followed by the length and the name of the submitted protein, the orientation of the first residue relative to the membrane (IN or OUT), the number of predicted transmembrane helices, and then the begin and end positions of the predicted transmembrane helices.
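
For additional processing of the text output, the localization syntax and the one-line result described above can be handled with a short script. The following is only a minimal sketch in Python, not part of the server. It assumes that the fields of the one-line result are separated by whitespace and that the protein name contains no spaces; neither detail is stated above. The function names parse_constraints and parse_one_line, the record types, and the sample result line are hypothetical and serve only as illustration.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class Constraint(NamedTuple):
    begin: int
    end: Optional[int]  # None means "to the C terminus" (written as E)
    region: str         # one of I, i, H, O, o


class Prediction(NamedTuple):
    length: int
    name: str
    orientation: str               # "IN" or "OUT"
    helices: List[Tuple[int, int]]  # (begin, end) of each predicted helix


def parse_constraints(text: str) -> List[Constraint]:
    """Parse whitespace-separated begin_pos-end_pos-type items."""
    constraints = []
    for item in text.split():
        match = re.fullmatch(r"(\d+)-(\d+|E)-([IiHOo])", item)
        if match is None:
            raise ValueError(f"malformed localization item: {item!r}")
        begin = int(match.group(1))
        end = None if match.group(2) == "E" else int(match.group(2))
        if end is not None and end < begin:
            raise ValueError(f"end before begin in item: {item!r}")
        constraints.append(Constraint(begin, end, match.group(3)))
    return constraints


def parse_one_line(line: str) -> Prediction:
    """Parse one '>HP:' result line (assumed layout, see lead-in)."""
    prefix = ">HP:"
    if not line.startswith(prefix):
        raise ValueError("line does not start with >HP:")
    fields = line[len(prefix):].split()
    if len(fields) < 4:
        raise ValueError("too few fields in result line")
    length, name, orientation = int(fields[0]), fields[1], fields[2]
    if orientation not in ("IN", "OUT"):
        raise ValueError(f"unexpected orientation: {orientation!r}")
    count = int(fields[3])
    positions = [int(value) for value in fields[4:]]
    if len(positions) != 2 * count:
        raise ValueError("helix count does not match the listed positions")
    # Pair up consecutive begin/end positions.
    helices = list(zip(positions[0::2], positions[1::2]))
    return Prediction(length, name, orientation, helices)


if __name__ == "__main__":
    # Localization example taken from the text above.
    print(parse_constraints("10-45-O 156-E-I"))
    # Hypothetical result line, made up for illustration only.
    print(parse_one_line(">HP: 120 MYPAT IN 2 10 30 50 70"))
```

If the server output separates the fields differently, or if protein names can contain spaces, only the field splitting in parse_one_line needs to change.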