Primary Structure Section

The primary structure section of a PDB file contains the sequence of residues in each chain of the macromolecule. Embedded in these records are chain identifiers and sequence numbers that allow other records to link into the sequence.

DBREF

Overview

The DBREF record provides cross-reference links between PDB sequences and the corresponding database entry or entries.

Record Format

COLUMNS       DATA TYPE     FIELD         DEFINITION
----------------------------------------------------------------------------
 1 -  6       Record name   "DBREF "
 8 - 11       IDcode        idCode        ID code of this entry.
13            Character     chainID       Chain identifier.
15 - 18       Integer       seqBegin      Initial sequence number
                                          of the PDB sequence segment.
19            AChar         insertBegin   Initial insertion code
                                          of the PDB sequence segment.
21 - 24       Integer       seqEnd        Ending sequence number
                                          of the PDB sequence segment.
25            AChar         insertEnd     Ending insertion code
                                          of the PDB sequence segment.
27 - 32       LString       database      Sequence database name.
34 - 41       LString       dbAccession   Sequence database accession code.
43 - 54       LString       dbIdCode      Sequence database
                                          identification code.
56 - 60       Integer       dbseqBegin    Initial sequence number of the
                                          database segment.
61            AChar         idbnsBeg      Insertion code of initial residue
                                          of the segment, if PDB is the
                                          reference.
63 - 67       Integer       dbseqEnd      Ending sequence number of the
                                          database segment.
68            AChar         dbinsEnd      Insertion code of the ending
                                          residue of the segment, if PDB is
                                          the reference.

Details

Database name                         database
                                      (code in columns 27 - 32)
----------------------------------------------------------------
GenBank                               GB
Protein Data Bank                     PDB
Protein Identification Resource       PIR
SWISS-PROT                            SWS
TREMBL                                TREMBL
UNIPROT                               UNP

The sequence database entry found during PDB's search is compared to that provided by the depositor, and any differences are resolved or annotated. In most cases, only one reference to a sequence database will be provided. PDB does not guarantee that all possible references to the listed databases will be provided.

Relationships to Other Record Types

DBREF represents the sequence as found in SEQRES records.

Example

         1         2         3         4         5         6         7
1234567890123456789012345678901234567890123456789012345678901234567890
DBREF  2J83 A   61   322  UNP    Q8TL28   Q8TL28_METAC    61    322
DBREF  2J83 B   61   322  UNP    Q8TL28   Q8TL28_METAC    61    322
DBREF  1ABC B    1B   36  PDB    1ABC     1ABC             1B    36
DBREF  3AKY      3   220  SWS    P07170   KAD1_YEAST       5    222
DBREF  1HAN      2   288  GB     397884   X66122           1    287
DBREF  3HSV A    1    92  SWS    P22121   HSF_KLULA      193    284
DBREF  3HSV B    1    92  SWS    P22121   HSF_KLULA      193    284
DBREF  1ARL      1   307  SWS    P00730   CBPA_BOVIN     111    417

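Because DBREF is a fixed-column record, it can be read by slicing each field at the column positions given in the Record Format table. The following is a minimal sketch under that assumption, not an official parser; the names DbRef, col, and parse_dbref are hypothetical, and the sample line is the 1ABC example re-spaced to match the column table.

```python
from typing import NamedTuple


class DbRef(NamedTuple):
    """One DBREF record; field names are illustrative, not part of the spec."""
    id_code: str
    chain_id: str
    seq_begin: int
    insert_begin: str
    seq_end: int
    insert_end: str
    database: str
    db_accession: str
    db_id_code: str
    db_seq_begin: int
    db_ins_begin: str
    db_seq_end: int
    db_ins_end: str


def col(line: str, first: int, last: int) -> str:
    """Return 1-based inclusive columns first..last with padding stripped."""
    return line[first - 1:last].strip()


def parse_dbref(line: str) -> DbRef:
    """Slice a DBREF line at the columns listed in the Record Format table."""
    if not line.startswith("DBREF "):
        raise ValueError("not a DBREF record")
    line = line.rstrip("\n").ljust(70)  # pad short lines to the full width
    return DbRef(
        id_code=col(line, 8, 11),
        chain_id=col(line, 13, 13),       # empty string if the chain is blank
        seq_begin=int(col(line, 15, 18)),
        insert_begin=col(line, 19, 19),
        seq_end=int(col(line, 21, 24)),
        insert_end=col(line, 25, 25),
        database=col(line, 27, 32),
        db_accession=col(line, 34, 41),
        db_id_code=col(line, 43, 54),
        db_seq_begin=int(col(line, 56, 60)),
        db_ins_begin=col(line, 61, 61),
        db_seq_end=int(col(line, 63, 67)),
        db_ins_end=col(line, 68, 68),
    )


# Sample record with insertion codes, spaced to the column positions above.
EXAMPLE = "DBREF  1ABC B    1B   36  PDB    1ABC     1ABC             1B    36"
ref = parse_dbref(EXAMPLE)
print(ref.chain_id, ref.seq_begin, ref.insert_begin, ref.seq_end, ref.database)
```

Slicing by column rather than splitting on whitespace matters here: a blank chain identifier or an insertion code adjacent to a sequence number would otherwise shift or merge fields.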
SEQADV

Overview

The SEQADV record identifies conflicts between sequence information in the SEQRES records of the PDB entry and the sequence database entry given on DBREF. Please note that these records were designed to identify differences and not errors. No assumption is made as to which database contains the correct data. PDB may include REMARK records in the entry that reflect the depositor's view of which database has the correct sequence.

Record Format

COLUMNS       DATA TYPE       FIELD       DEFINITION
--------------------------------------------------------------------------------
 1 -  6       Record name     "SEQADV"
 8 - 11       IDcode          idCode      ID code of this entry.
13 - 15       Residue name    resName     Name of the PDB residue in conflict.
17            Character       chainID     PDB chain identifier.
19 - 22       Integer         seqNum      PDB sequence number.
23            AChar           iCode       PDB insertion code.
25 - 28       LString         database
30 - 38       LString         dbIdCode    Sequence database accession number.
40 - 42       Residue name    dbRes       Sequence database residue name.
44 - 48       Integer         dbSeq       Sequence database sequence number.
50 - 70       LString         conflict    Conflict comment.

Details

Cloning artifact
Conflict
Engineered
Disordered
Variant
Insertion
Deletion
Microheterogeneity
D-configuration

SEQADV records are automatically generated by the PDB.

Relationships to Other Record Types

SEQADV refers to the sequence as found in the SEQRES records, and to the sequence database reference found on DBREF. REMARK 999 contains text that explains discrepancies when the explanation is too lengthy to fit in SEQADV.

Example

         1         2         3         4         5         6         7
1234567890123456789012345678901234567890123456789012345678901234567890
SEQADV 2J83 ALA A  269  UNP  Q8TL28    CYS   269 ENGINEERED MUTATION
SEQADV 2J83 ALA B  269  UNP  Q8TL28    CYS   269 ENGINEERED MUTATION
SEQADV 3ABC MET A   -1  SWS  P10725              CLONING ARTIFACT
SEQADV 3ABC GLY A   50  SWS  P10725    VAL    50 ENGINEERED

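SEQADV can be read the same way, with one extra consideration visible in the example: the database residue name and sequence number may be blank, as in the cloning-artifact record. The sketch below is one possible way to handle that, assuming the column positions in the Record Format table; SeqAdv and parse_seqadv are hypothetical names, and the sample lines are re-spaced to those columns.

```python
from typing import NamedTuple, Optional


class SeqAdv(NamedTuple):
    """One SEQADV record; field names are illustrative, not part of the spec."""
    id_code: str
    res_name: str
    chain_id: str
    seq_num: int
    i_code: str
    database: str
    db_id_code: str
    db_res: Optional[str]   # None when columns 40-42 are blank
    db_seq: Optional[int]   # None when columns 44-48 are blank
    conflict: str


def parse_seqadv(line: str) -> SeqAdv:
    """Slice a SEQADV line at the columns listed in the Record Format table."""
    if not line.startswith("SEQADV"):
        raise ValueError("not a SEQADV record")
    line = line.rstrip("\n").ljust(70)
    db_res = line[39:42].strip()
    db_seq = line[43:48].strip()
    return SeqAdv(
        id_code=line[7:11].strip(),
        res_name=line[12:15].strip(),
        chain_id=line[16].strip(),
        seq_num=int(line[18:22]),        # int() accepts padding and a sign
        i_code=line[22].strip(),
        database=line[24:28].strip(),
        db_id_code=line[29:38].strip(),
        db_res=db_res or None,
        db_seq=int(db_seq) if db_seq else None,
        conflict=line[49:70].strip(),
    )


# Sample records spaced to the column positions above.
MUTATION = "SEQADV 2J83 ALA A  269  UNP  Q8TL28    CYS   269 ENGINEERED MUTATION"
ARTIFACT = "SEQADV 3ABC MET A   -1  SWS  P10725              CLONING ARTIFACT"

for record in (parse_seqadv(MUTATION), parse_seqadv(ARTIFACT)):
    print(record.res_name, record.seq_num, record.db_res, record.conflict)
```

Returning None for the blank database fields keeps "no counterpart in the database sequence" distinct from a real residue, which a caller would need when listing substitutions.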
SEQRES

Overview

SEQRES records contain the amino acid or nucleic acid sequence of residues in each chain of the macromolecule that was studied.

Record Format

COLUMNS       DATA TYPE       FIELD       DEFINITION
-------------------------------------------------------------------------------
 1 -  6       Record name     "SEQRES"
 9 - 10       Integer         serNum      Serial number of the SEQRES record
                                          for the current chain. Starts at 1
                                          and increments by one each line.
                                          Reset to 1 for each chain.
12            Character       chainID     Chain identifier. This may be any
                                          single legal character, including a
                                          blank which is used if there is
                                          only one chain.
14 - 17       Integer         numRes      Number of residues in the chain.
                                          This value is repeated on every
                                          record.
20 - 22       Residue name    resName     Residue name.
24 - 26       Residue name    resName     Residue name.
28 - 30       Residue name    resName     Residue name.
32 - 34       Residue name    resName     Residue name.
36 - 38       Residue name    resName     Residue name.
40 - 42       Residue name    resName     Residue name.
44 - 46       Residue name    resName     Residue name.
48 - 50       Residue name    resName     Residue name.
52 - 54       Residue name    resName     Residue name.
56 - 58       Residue name    resName     Residue name.
60 - 62       Residue name    resName     Residue name.
64 - 66       Residue name    resName     Residue name.
68 - 70       Residue name    resName     Residue name.

Verification/Validation/Value Authority Control

The residues presented on the SEQRES records must agree with those found in the ATOM records. The SEQRES records are checked by PDB using the sequence databases and information provided by the depositor. SEQRES is compared to the ATOM records during processing, and both are checked against the sequence database. All discrepancies are either resolved or annotated in the entry.

Example

         1         2         3         4         5         6         7
1234567890123456789012345678901234567890123456789012345678901234567890
SEQRES   1 A   21  GLY ILE VAL GLU GLN CYS CYS THR SER ILE CYS SER LEU
SEQRES   2 A   21  TYR GLN LEU GLU ASN TYR CYS ASN
SEQRES   1 B   30  PHE VAL ASN GLN HIS LEU CYS GLY SER HIS LEU VAL GLU
SEQRES   2 B   30  ALA LEU TYR LEU VAL CYS GLY GLU ARG GLY PHE PHE TYR
SEQRES   3 B   30  THR PRO LYS ALA
SEQRES   1 C   21  GLY ILE VAL GLU GLN CYS CYS THR SER ILE CYS SER LEU
SEQRES   2 C   21  TYR GLN LEU GLU ASN TYR CYS ASN
SEQRES   1 D   30  PHE VAL ASN GLN HIS LEU CYS GLY SER HIS LEU VAL GLU
SEQRES   2 D   30  ALA LEU TYR LEU VAL CYS GLY GLU ARG GLY PHE PHE TYR
SEQRES   3 D   30  THR PRO LYS ALA

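The record format implies three internal consistency checks that a reader can apply while assembling each chain: serNum starts at 1 and increments by one per line, numRes is identical on every record of a chain, and the number of residue names collected equals numRes. The sketch below illustrates those checks only; it is not the validation PDB performs against ATOM records or sequence databases, and read_seqres is a hypothetical name.

```python
def read_seqres(lines):
    """Collect residue names per chain from SEQRES lines, checking the
    serNum and numRes rules stated in the Record Format table."""
    chains = {}       # chainID -> list of residue names, in order
    declared = {}     # chainID -> numRes declared on the records
    last_serial = {}  # chainID -> last serNum seen
    for line in lines:
        if not line.startswith("SEQRES"):
            continue
        line = line.rstrip("\n").ljust(70)
        ser_num = int(line[8:10])        # columns 9-10
        chain_id = line[11]              # column 12, may be a blank
        num_res = int(line[13:17])       # columns 14-17
        residues = line[19:70].split()   # columns 20-70, up to 13 names
        if chain_id not in chains:
            chains[chain_id] = []
            declared[chain_id] = num_res
            last_serial[chain_id] = 0
        if ser_num != last_serial[chain_id] + 1:
            raise ValueError(f"chain {chain_id!r}: unexpected serNum {ser_num}")
        if num_res != declared[chain_id]:
            raise ValueError(f"chain {chain_id!r}: numRes changed to {num_res}")
        last_serial[chain_id] = ser_num
        chains[chain_id].extend(residues)
    for chain_id, residues in chains.items():
        if len(residues) != declared[chain_id]:
            raise ValueError(
                f"chain {chain_id!r}: {len(residues)} residues read, "
                f"{declared[chain_id]} declared"
            )
    return chains


# Chains A and B from the example above, spaced to the column positions.
SAMPLE = [
    "SEQRES   1 A   21  GLY ILE VAL GLU GLN CYS CYS THR SER ILE CYS SER LEU",
    "SEQRES   2 A   21  TYR GLN LEU GLU ASN TYR CYS ASN",
    "SEQRES   1 B   30  PHE VAL ASN GLN HIS LEU CYS GLY SER HIS LEU VAL GLU",
    "SEQRES   2 B   30  ALA LEU TYR LEU VAL CYS GLY GLU ARG GLY PHE PHE TYR",
    "SEQRES   3 B   30  THR PRO LYS ALA",
]
sequences = read_seqres(SAMPLE)
for chain_id, residues in sequences.items():
    print(chain_id, len(residues), residues[0], residues[-1])
```

Splitting columns 20-70 on whitespace is safe for this field because residue names contain no internal spaces; the header fields are still sliced by column so that a blank chain identifier is preserved.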
Known Problems

Polysaccharides do not lend themselves to being represented in SEQRES. There is no mechanism provided to describe sequence runs when the exact ordering of the sequence is not known. For cyclic peptides, PDB arbitrarily assigns a residue as the N-terminus. No distinction is made between ribo- and deoxyribonucleotides in the SEQRES records. These residues are identified with the same residue name (i.e., A, C, G, T, U).

MODRES

Overview

The MODRES record provides descriptions of modifications (e.g., chemical or post-translational) to protein and nucleic acid residues. It includes a mapping between the residue names given in a PDB entry and the standard residues.

Record Format

COLUMNS       DATA TYPE       FIELD       DEFINITION
--------------------------------------------------------------------------
 1 -  6       Record name     "MODRES"
 8 - 11       IDcode          idCode      ID code of this entry.
13 - 15       Residue name    resName     Residue name used in this entry.
17            Character       chainID     Chain identifier.
19 - 22       Integer         seqNum      Sequence number.
23            AChar           iCode       Insertion code.
25 - 27       Residue name    stdRes      Standard residue name.
30 - 70       String          comment     Description of the residue
                                          modification.

Details
Glycosylation site
Post-translational modification
Designed chemical modification
Phosphorylation site
Blocked N-terminus
Aminated C-terminus
D-configuration
Reduced peptide bond

MODRES is generated by the PDB.

Relationships to Other Record Types

MODRES maps ATOM and HETATM records to the standard residue names. SEQADV, HET, and FORMUL may also appear.

Example

         1         2         3         4         5         6         7
1234567890123456789012345678901234567890123456789012345678901234567890
MODRES 1ABC ASN A   22A ASN  GLYCOSYLATION SITE
MODRES 2ABC TTQ A   50A TRP  POST-TRANSLATIONAL MODIFICATION
MODRES 3ABC DAL A   32  ALA  POST-TRANSLATIONAL MODIFICATION,D-ALANINE
MODRES 3ABC DAL B   32  ALA  POST-TRANSLATIONAL MODIFICATION,D-ALANINE

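Since MODRES maps a residue name used in the entry to a standard residue name at a specific position, one natural use is a lookup table keyed by chain, sequence number, and insertion code. The sketch below assumes the column positions in the Record Format table; modres_table is a hypothetical name, and the key layout is a design choice for illustration rather than anything the format prescribes.

```python
def modres_table(lines):
    """Build {(chainID, seqNum, iCode): (resName, stdRes, comment)}
    from MODRES lines, slicing at the documented columns."""
    table = {}
    for line in lines:
        if not line.startswith("MODRES"):
            continue
        line = line.rstrip("\n").ljust(70)
        res_name = line[12:15].strip()   # columns 13-15
        chain_id = line[16]              # column 17
        seq_num = int(line[18:22])       # columns 19-22
        i_code = line[22].strip()        # column 23, "" when blank
        std_res = line[24:27].strip()    # columns 25-27
        comment = line[29:70].strip()    # columns 30-70
        table[(chain_id, seq_num, i_code)] = (res_name, std_res, comment)
    return table


# Sample records spaced to the column positions above.
SAMPLE = [
    "MODRES 1ABC ASN A   22A ASN  GLYCOSYLATION SITE",
    "MODRES 3ABC DAL A   32  ALA  POST-TRANSLATIONAL MODIFICATION,D-ALANINE",
]
table = modres_table(SAMPLE)
res_name, std_res, comment = table[("A", 32, "")]
print(res_name, "->", std_res, "|", comment)
```

Including the insertion code in the key keeps residues such as 22 and 22A distinct, which a key built from chain and sequence number alone would not.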
© 2007 wwPDB