We use X-ray crystallography, nuclear magnetic resonance (NMR), and electron microscopy. Although structure determination is the essential part of our approach, we emphasize that structures themselves are not a goal but a starting point for further study. "Structures at work" is an important keyword of our research. Protein structures inspire the design of new biochemical experiments, and the feedback from those experiments produces new targets for structural biology. The tight coupling of structure determination and biological experiments is essential for successful structural studies.
Interactions of proteins with their ligands can be divided into two categories: strong interactions and weak interactions. The strict specificity and tight binding of strong interactions are based on shape and electrostatic complementarity at the molecular interface, and the structure of the protein-ligand complex clearly describes the details of the interaction. In contrast, weak interactions are characterized by molecular recognition with nonstrict specificity and weak affinity. Note that nonstrict specificity is not the same as non-specificity: a protein with nonstrict specificity can bind to a set of ligands that share no apparent structural similarity. Strong interactions attract most of the attention, but weak interactions are also very important in various biological phenomena.
The mitochondrion is one of the organelles in animal and plant cells, and is involved in various important biological processes, including ATP production and apoptosis. The mitochondrion is enclosed by two biomembranes, the outer membrane and the inner membrane. Most mitochondrial proteins are synthesized in the cytosol and must be imported into mitochondria. What is the molecular basis for the sorting of these proteins into mitochondria?
Mitochondrial matrix proteins are synthesized as precursor proteins with cleavable N-terminal signal sequences. The mitochondrial signal sequence is termed a presequence. Presequences function as a tag for delivery to mitochondria and are cleaved off after import. Curiously, no consensus sequence is found among the presequences of 1000 mitochondrial preproteins.
Protein import into mitochondria is mediated by protein assemblies, TOM and TIM, in the mitochondrial outer and inner membranes, respectively. A TOM subunit, Tom20, functions as a general protein import receptor. The cytosolic domain of Tom20 can bind to 1000 different presequences of preproteins. The recognition of presequences by Tom20 is a good example of a weak interaction.
We monitored the chemical shift perturbation of the NMR signals of five different 15N-labeled presequence peptides upon the addition of the cytosolic domain of Tom20. The perturbed segments occupy different positions in the presequences, either near the N-terminus or at the C-terminus. Now we are ready to answer the question, "Why is no consensus sequence found among mitochondrial presequences?" A presequence is composed of short amino acid sequences that are recognized by several proteins, and their organization (position, order, and overlap) is unique to each presequence. Thus, simple alignment of presequences cannot reveal any consensus sequence without a deep understanding of the cryptogram embedded in mitochondrial presequences.
The NMR analyses revealed a common five-residue pattern for Tom20 binding in different presequences. To refine this common amino acid motif for recognition by Tom20, we introduced a new peptide library approach: we prepared a mixture of ALDH presequence variants, tethered these peptides to Tom20 in a competitive manner through an intermolecular disulfide bond, and determined the relative affinities by MALDI-TOF mass spectrometry. We successfully deduced a refined, common motif for recognition by Tom20. The five-residue consensus is φχχφφ, where φ is a hydrophobic amino acid and χ is any amino acid. This consensus can represent a huge variety of amino acid sequences.
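The φχχφφ consensus can be written as a simple pattern search. The following Python sketch is only an illustration, not the analysis used in our work: the set of residues treated as hydrophobic (φ) and the toy peptide in the usage note are assumptions chosen for the example.

```python
import re

# Assumption: the residues counted as "hydrophobic" (phi) are chosen here
# for illustration only; the text does not list them explicitly.
HYDROPHOBIC = "AILMFVWY"
ANY_RESIDUE = "ACDEFGHIKLMNPQRSTVWY"  # chi: any of the 20 standard residues

# phi-chi-chi-phi-phi, wrapped in a lookahead so overlapping hits are found.
MOTIF = re.compile(
    f"(?=([{HYDROPHOBIC}][{ANY_RESIDUE}]{{2}}[{HYDROPHOBIC}]{{2}}))"
)


def find_tom20_motifs(sequence):
    """Return (1-based start position, matched 5-mer) for every motif hit."""
    return [(m.start() + 1, m.group(1)) for m in MOTIF.finditer(sequence.upper())]


def count_motif_sequences():
    """Number of distinct 5-mers that satisfy the motif under the assumed sets."""
    return len(HYDROPHOBIC) ** 3 * len(ANY_RESIDUE) ** 2
```

With the hypothetical peptide `"GPRLSRLLSGA"`, `find_tom20_motifs` reports a single hit, `LSRLL`, starting at position 4. Under the assumed residue sets, `count_motif_sequences()` gives 8³ × 20² = 204,800 distinct five-residue sequences, which illustrates how degenerate the consensus is.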
We determined the three-dimensional structure of Tom20 in a complex with an 11-residue peptide derived from rat aldehyde dehydrogenase (ALDH), which contained LSRLL as the Tom20-binding consensus. The cytosolic domain of Tom20 forms an all-α-helical structure with a groove that accommodates the presequence peptide. The bound presequence forms an amphiphilic helix, with the hydrophobic leucine side chains aligned on one side to interact with a hydrophobic patch in the Tom20 groove.
The NMR structure of the Tom20-pALDH complex was the first structure to reveal the recognition of a signal sequence in an α-helical conformation by its receptor. Pictures of the NMR structure appear in many textbooks, including "Molecular Biology of the Cell".
Our first attempt to cocrystallize the cytosolic domain of Tom20 with the ALDH presequence failed, probably due to the weak affinity of the presequence peptide for Tom20. Therefore, we tethered the presequence peptide to Tom20 via an intermolecular disulfide bond.
We successfully obtained three crystal forms suitable for X-ray data collection. The three-dimensional structures of the complex of Tom20 and the ALDH presequence peptide were determined at 2 Å resolution. To our surprise, Tom20 was equipped with only two hydrophobic sites for the recognition of the three hydrophobic side chains in the Tom20 consensus motif.
PubMed: 17948058, 21591667
The comparison of the crystal structures implied that a dynamic equilibrium exists between two (or more) bound states of the presequence peptide on Tom20 in solution. In accord with this model, an NMR relaxation study revealed motion on the sub-millisecond time scale at the interface between Tom20 and the presequence peptide. We propose a dynamic, multiple-mode recognition mechanism that explains the structural basis of the broadly selective specificity of Tom20 towards diverse mitochondrial presequences.
Contacts with neighboring molecules in protein crystals inevitably restrict the internal motions of intrinsically flexible proteins. The resultant clear electron densities permit model building, yielding crystallographic snapshot structures. Although these still images are informative, they could provide biased pictures of the protein motions. If the mobile parts are located at a site lacking direct contacts in rationally designed crystals, then the amplitude of the movements can be analyzed experimentally. We call this special space the crystal contact-free space (CCFS).
We propose a fusion protein method to create a CCFS in protein crystals. We selected maltose-binding protein (MBP) as the fusion partner to construct a rigid CCFS scaffold. We successfully used α-helical spacers to firmly fuse the C-terminal α-helix of MBP to the N-terminal α-helix of the target Tom20 protein. For ligands with weak affinity, the problem of partial ligand occupancy must be considered. The pALDH presequence peptide was therefore tethered to Tom20 to ensure full occupancy of the presequence in the binding site. We added a cysteine residue at the C-terminus of pALDH to form an intermolecular disulfide bond with the single cysteine residue in the fusion protein.
We collected X-ray diffraction data to 2 Å resolution. Conventional model building fails when large-amplitude motions exist. Here, the mobile presequence appears as smeared electron density in the Fo-Fc difference electron density map after suitable processing (i.e., low-pass filtering and FreeR-averaging) of the X-ray diffraction data. The moving presequence peptide is thus visualized as an L-shaped smeared electron density in the binding site of Tom20.
The smeared electron density in the difference electron density map corresponded to the partially overlapping volume among the multiple poses of the presequence helix.
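The idea that an averaged map is strongest where multiple poses overlap can be illustrated with a one-dimensional toy model. The Python sketch below is purely conceptual and is not our data processing: the grid, the pose positions, the equal populations, and the thresholds are all hypothetical.

```python
def pose_occupancy(start, length, grid_size):
    """Binary occupancy of one pose: 1.0 where the helix sits, 0.0 elsewhere."""
    return [1.0 if start <= i < start + length else 0.0 for i in range(grid_size)]


def averaged_density(poses, grid_size):
    """Average the occupancies of several poses, assumed equally populated."""
    maps = [pose_occupancy(start, length, grid_size) for start, length in poses]
    return [sum(column) / len(maps) for column in zip(*maps)]


def visible_region(density, threshold):
    """Grid points whose averaged density reaches the contour threshold."""
    return [i for i, value in enumerate(density) if value >= threshold]


# Three hypothetical poses of a 6-point "helix", shifted by 2 grid points each.
poses = [(2, 6), (4, 6), (6, 6)]
density = averaged_density(poses, grid_size=14)
```

In this toy model, the three poses together span grid points 2 to 11, but `visible_region(density, 0.6)` keeps only points 4 to 9, and `visible_region(density, 0.9)` keeps only points 6 and 7, where all three poses overlap. Raising the contour level therefore shrinks the visible density towards the volume shared by the poses.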
Our current working hypothesis is that "a rapid equilibrium of multiple states with partial recognition" is the molecular basis for the promiscuous binding of the Tom20 receptor to diverse mitochondrial presequences with nearly equal affinities. We expect that better diffraction measurements and data processing will improve the signal-to-noise ratio of the Fo-Fc difference electron density map, and will experimentally reveal the spatial distribution of the moving α-helical presequence peptide in the near future.
Protein glycosylation is one of the most important protein modifications. The transfer of oligosaccharide chains to asparagine residues in proteins occurs in the consensus sequence of Asn-X-Thr/Ser, where X can be any amino acid residue except for Pro. Asn-glycosylation is widespread not only in eukaryotes but also in archaea and some eubacteria.
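The sequon rule, Asn-X-Thr/Ser with X being any residue except Pro, translates directly into a pattern search. The following Python sketch is a minimal illustration; the function name and the toy sequence in the usage note are hypothetical. Finding a sequon only marks a potential site, not an actually glycosylated one.

```python
import re

# X: any of the 20 standard residues except proline.
NON_PROLINE = "ACDEFGHIKLMNQRSTVWY"

# Asn-X-Ser/Thr, wrapped in a lookahead so overlapping sequons are all found.
SEQUON = re.compile(f"(?=(N[{NON_PROLINE}][ST]))")


def find_sequons(sequence):
    """Return (1-based position of the Asn, matched tripeptide) for each sequon."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(sequence.upper())]
```

For the hypothetical sequence `"MNGTANPSQNLSNNT"`, `find_sequons` reports `NGT` at position 2, `NLS` at position 10, and `NNT` at position 13, while `NPS` at position 6 is skipped because X is Pro.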
Oligosaccharyltransferase (OST) is an enzyme that catalyzes the transfer of the oligosaccharide from a lipid donor to the side chain of the Asn residue in the sequon. The OST enzyme is a membrane-associated multisubunit protein complex in eukaryotes. Glycosylation actually occurs at about 60% of sequons. The conformation of the sequon in the bound state is thought to determine whether glycosylation occurs. We think that the rules governing glycosylation occupancy will become clear once the weak-interaction nature of sequon recognition is revealed.
The catalytic subunit of the OST enzyme is referred to as STT3 in Eukarya, AglB in Archaea, and PglB in Eubacteria. We determined the 2.7 Å resolution crystal structure of the C-terminal soluble domain of Pyrococcus furiosus AglB. This is the first 3D structure of the STT3/AglB/PglB proteins.
PubMed: 18046457, 17768359
We then determined the crystal structures of the C-terminal soluble domains of other AglB proteins and a PglB protein. The superimposition of the five crystal structures revealed the unusual plasticity of a segment in the C-terminal globular domain of the OST enzymes.
No structure of the eukaryotic STT3 proteins has been reported yet.
PubMed: 23815857, 23177926, 22559858, 20007322
We developed a new assay method for the oligosaccharyl transfer activity. The peptide substrate is a synthetic peptide that contains an N-terminal fluorescent dye for detection. The produced glycopeptide is separated from the unreacted peptide by SDS-PAGE. The addition of a C-terminal biotin tag enables the efficient purification of the glycopeptide product.
We determined the crystal structures of the full-length Archaeoglobus fulgidus AglB. The comparison with the eubacterial PglB structure determined by another group revealed the structural conservation of the catalytic core and the membrane-spanning region. The N-terminal transmembrane region consists of 13 TM helices, and contains the active site, which consists of two conserved acidic residues and a metal ion for the activation of the carboxamide group of the Asn residue. The C-terminal globular domain contains a binding site for the Ser and Thr residues in the sequon. This Ser/Thr binding pocket might be a dynamic structure, because the plastic segment identified by the structural comparison is involved in the formation of the pocket.
A peptide carrying an Asn-X-Thr sequence was tethered to the AglB protein through a disulfide bond. We determined the crystal structure of the AglB-peptide complex. Interestingly, the Asn residue fixed on the enzyme can accept the oligosaccharide chain. This unique reaction system showed that Gln can be glycosylated in place of Asn.
The N-glycan structures in Archaea exhibit great variety in their monosaccharide compositions, linkages, and branching patterns. We determined the chemical structures of the N-glycans from Pyrococcus furiosus, Archaeoglobus fulgidus, and Pyrobaculum calidifontis by sugar analysis, MS, and NMR. Oligosaccharide chains attached to structurally defined peptides were produced by an in vitro oligosaccharide-transfer reaction, using membrane fractions that contained AglB and lipid-linked oligosaccharides, the oligosaccharide donor for OST.
For better sensitivity in NMR measurements, 13C-glucose was added to the culture medium for stable isotope labeling of the lipid-linked oligosaccharides in the case of A. fulgidus glycan.
PubMed: 24562177, 26093517
The oligosaccharide donor for the OST enzyme is a lipid-linked oligosaccharide (LLO), in which an oligosaccharide chain is preassembled on a lipid-phospho carrier. We determined the archaeal LLO structures from the phylum Euryarchaeota (Pyrococcus furiosus and Archaeoglobus fulgidus) and the phylum Crenarchaeota (Pyrobaculum calidifontis and Sulfolobus solfataricus) by LC-MS/MS analysis. We found that the euryarchaeal LLOs are dolichol-monophosphate-oligosaccharides, whereas the crenarchaeal LLOs are dolichol-diphosphate-oligosaccharides. This novel finding provides an insight into the evolution of the N-glycosylation system.
DNA replication forks are arrested by various internal and external threats. In bacteria, the PriA protein is a sensor protein that recognizes the arrested forks. We found that PriA specifically recognizes the 3'-termini of arrested nascent DNA chains. Fluorescence correlation spectroscopy analyses showed that the N-terminal domain of E. coli PriA has almost the same affinity for oligonucleotides with any of the four 3'-terminal nucleotides, A, C, G, and T. We determined the crystal structures of the N-terminal domain (105 aa) of PriA in complexes with the oligonucleotides ApA, ApC, ApG, ApT, and CpCpC.
We built a hypothetical model of the complex between the N-terminal domain of PriA and an arrested fork-like DNA structure. One aspartate residue (Asp17) makes intimate contacts with each of the four bases in a manner that neither discriminates among them nor disturbs the base pairing, thereby achieving the non-selective recognition of the 3'-end base of dsDNA.
(Collaboration with Prof. Hisao Masai, Tokyo Metropolitan Institute of Medical Science)