Native structure of the RhopH complex, a key determinant of malaria parasite nutrient acquisition | PNAS

2021-12-14 12:55:15 By : Mr. jack chen


Edited by Robert M. Stroud, University of California, San Francisco, and approved June 4, 2021 (received for review January 13, 2021)

Plasmodium parasites invade and replicate within human red blood cells, which lack a nucleus and have extremely low metabolic activity. To survive, the parasites have evolved ways to alter the permeability of the red blood cell membrane, enabling them to import nutrients and export waste. Here, we present the native structure of the ternary RhopH protein complex, which plays a key role in this process. We used an endogenous structural proteomics approach enabled by cryo-electron microscopy to determine the structure of this essential complex from a heterogeneous mixture of proteins enriched directly from parasite cell lysates. The native structure of the RhopH complex in a soluble, trafficking state helps to clarify the long-standing question of how parasite transmembrane proteins are trafficked to the red blood cell membrane.

The RhopH complex is implicated in the ability of malaria parasites to invade host red blood cells and to establish new permeability pathways in them, but its mechanisms remain poorly understood. Here, we enrich the endogenous RhopH complex, comprising RhopH2, CLAG3.1, and RhopH3, in a native soluble form directly from parasite cell lysates and determine its structure using cryo-electron microscopy (cryo-EM), mass spectrometry, and the cryoID program. CLAG3.1 is positioned between RhopH2 and RhopH3, both of which share extensive binding interfaces with CLAG3.1 but make minimal contact with each other. The forces stabilizing the individual subunits include 13 intramolecular disulfide bonds. Notably, residues 1210 to 1223 of CLAG3.1, previously predicted to constitute a transmembrane helix, are embedded within a helical bundle formed by residues 979 to 1289 near the C terminus of CLAG3.1. Because this putative transmembrane helix is buried in the core of the RhopH complex and largely shielded from solvent, its insertion into the red blood cell membrane would likely require large conformational rearrangements. Given the unusually high disulfide content of the complex, such rearrangements may be triggered by the cleavage of allosteric disulfide bonds, potentially upon interaction at the red blood cell membrane. This first direct observation of an exported P. falciparum transmembrane protein in a soluble, trafficking state, with atomic details of the buried putative membrane-insertion helix, offers insight into the assembly and trafficking of RhopH and other parasite-derived complexes to the red blood cell membrane. Our study demonstrates the potential of the endogenous structural proteomics approach to elucidate the molecular mechanisms of hard-to-isolate complexes in their native, functional forms.

Nearly half of the world's population is at risk of contracting malaria. To control the disease, we rely heavily on artemisinin-based therapies to limit its impact to 200 million cases and 500,000 deaths per year (1). The recent discovery of the de novo emergence of artemisinin-resistant malaria parasites in Africa has heightened the urgency of defining and targeting the molecular mechanisms the parasite needs to survive in human red blood cells, to facilitate the development of new therapies with novel modes of action (1–3). To survive within human red blood cells, Plasmodium falciparum deploys hundreds of effector proteins. These proteins extensively remodel the normally quiescent host red blood cell and establish new infrastructure for nutrient uptake, waste efflux, and immune evasion to support the active growth and replication of the parasite (4, 5). The members of the high-molecular-weight rhoptry protein complex, known as the RhopH complex and comprising CLAG3, RhopH2, and RhopH3, have been shown to play a key role in this process by contributing, directly or indirectly, to the activity of the plasmodial surface anion channel (PSAC), a novel ion channel that appears on the surface of Plasmodium-infected red blood cells and has been shown to mediate the increased permeability of the red blood cell plasma membrane to various solutes (6–18).

The mechanistic role of the RhopH complex in PSAC activity remains poorly understood. How the components of the RhopH complex are trafficked from the parasite to the red blood cell membrane is also unclear, although it has been hypothesized that they may exist in a soluble form or bind to other proteins that protect them as they pass through the host cell cytosol (9). An alternative hypothesis holds that they traverse the cytosol through a series of parasite-derived membranous compartments known as the exomembrane system (19, 20). A high-resolution structure of the membrane-bound or soluble form of the RhopH complex would help to answer these open questions and provide a much-needed structural framework for rationalizing the large body of seemingly contradictory phenotypic and biochemical data surrounding this intriguing complex.

To address how the members of the RhopH complex are delivered to the red blood cell membrane, we sought to determine whether a soluble complex containing one or more RhopH components is present in soluble parasite lysate and, if so, to determine its structure. Unfortunately, the extreme difficulty of producing correctly folded and assembled P. falciparum protein complexes by reconstitution or coexpression in heterologous systems has so far hampered efforts to obtain a high-resolution structure of the RhopH complex. Furthermore, uncertainty about the exact composition of the putative soluble RhopH complex precludes the use of heterologous systems to reconstitute it. To meet such challenges, we recently established an endogenous structural proteomics approach that combines cryo-electron microscopy (cryo-EM), mass spectrometry, and cryoID, a program we developed to identify proteins in sub-4.0-Å cryo-EM density maps, to identify and determine the structures of unknown protein complexes (Figure 1) (21). Here, we used this approach to obtain a near-atomic-resolution structure of the unmodified RhopH complex in a soluble state, enriched directly from P. falciparum parasite lysate. The structure accounts for a large body of existing biochemical data, provides insight into the dual role of the RhopH complex in host red blood cell invasion and the establishment of new permeability pathways, and sheds light on the long-standing question of how parasite transmembrane proteins are trafficked to the red blood cell membrane. This work also demonstrates the power of the endogenous structural proteomics approach to yield insight into the molecular mechanisms of hard-to-isolate complexes from challenging native sources where traditional structural biology approaches have failed.

Figure 1. Endogenous structural proteomics workflow using cryoID. (A) Illustration of the workflow. P. falciparum parasites and parasitophorous vacuoles are released from P. falciparum-infected red blood cells by saponin treatment, and the resulting parasite and vacuole pellet is lysed and fractionated on a sucrose gradient. Fractions are evaluated by SDS-PAGE, mass spectrometry, and negative-stain EM. Cryo-EM imaging and analysis of the selected fraction yielded a cryo-EM density map at 3.7 Å resolution. cryoID was used to identify the proteins in the cryo-EM map, which were then modeled de novo to produce the final atomic structure of the RhopH complex. (B) Silver-stained SDS-PAGE of the selected lysate fraction. Arrows indicate bands consistent with the molecular weights of the RhopH complex components. (C and D) Representative negative-stain (C) and cryo-EM (D) micrographs of the selected fraction. (Scale bar, 20 nm.) Arrows indicate particles consistent with the 2D class averages of the RhopH complex. (E) Representative 2D class averages corresponding to the multiple protein complexes present in a single cryo-EM micrograph dataset. The class averages that gave rise to the RhopH complex structure presented here are boxed in green. (Scale bar, 10 nm.)

Following the endogenous structural proteomics workflow (21) (Figure 1A), we used mass spectrometry to identify fractions of P. falciparum parasite lysate that looked promising by negative-stain EM and contained one or more RhopH complex components (Figure 1 B and C and Dataset S1). Analysis of a cryo-EM dataset collected from one of these fractions in RELION (22) and cryoSPARC (23) yielded many promising two-dimensional (2D) class averages (Figure 1 D and E and SI Appendix, Figure 1). We identified a subset of these class averages that, after ab initio reconstruction and nonuniform refinement in cryoSPARC, yielded a near-atomic-resolution cryo-EM density map of an asymmetric protein complex at an overall resolution of 3.7 Å (Figure 2 and SI Appendix, Figure 1).

Figure 2. Cryo-EM density map of the soluble RhopH complex. (A) Cryo-EM density map of the RhopH complex viewed from multiple angles. RhopH2, CLAG3.1, and RhopH3 are colored navy, gold, and green, respectively. Unmodeled regions are shown in gray. (B) Atomic model of the soluble RhopH complex, colored as in A, with the query segments used for protein identification by cryoID colored pink. (C and D) Local resolution estimate of the soluble RhopH complex map, calculated using ResMap (57), colored by resolution and shown as a full surface (C) and a central slice (D). (E–G) Detailed views of the query sequences identified by cryoID for RhopH2 (E), CLAG3.1 (F), and RhopH3 (G), with the corresponding cryo-EM density shown as mesh.

After further focused classification and refinement in RELION to improve local resolution throughout the map, we used cryoID to identify all three components of the complex as RhopH2, CLAG3.1, and RhopH3, the three members of the RhopH complex (Figure 2 B and E–G and SI Appendix, Figure 1 and Movies 1–3). Clear side-chain density throughout the map allowed us to build de novo atomic models of all three components (Figure 2B and SI Appendix, Figure 2), revealing a tripartite protein complex containing a single copy of each of the three proteins, with a total calculated mass of 435 kDa assuming no posttranslational truncation.

Although the CLAG3.1 and CLAG3.2 isoforms share 96% protein sequence identity, they differ in a known hypervariable region (HVR) (F1094 to G1142) (10), in which CLAG3.1 of the NF54 parasite contains two additional amino acids (aa) (T1140 to H1141) that are absent from CLAG3.2 of the NF54 parasite. Fortunately, the density in this region is strong, allowing us to determine unambiguously that the isoform in our map is CLAG3.1 (SI Appendix, Figure 3), consistent with previous findings indicating that CLAG3.1 is preferentially expressed in culture (24). The overall map quality also allowed us to model most of the visible density of the complex, the main exception being a small auxiliary domain near the C terminus of RhopH2 that accounts for approximately 6.9% of the total volume of our complex at a lower density threshold (σ = 0.0355) (SI Appendix, Figure 4). Although the density here is not of sufficient quality to register residues, we can discern that this region consists mainly of several twisted β-strands forming a β-sheet-rich core.

CLAG3.1 forms the core of the complex and sits between RhopH2 and RhopH3 (Figure 2 A and B and Movie 4). Both RhopH2 and RhopH3 share extensive binding interfaces with CLAG3.1 but make minimal contact with each other (Movie 4). Buried and exposed surface areas were calculated using PDBePISA (25). Of the two interfaces, CLAG3.1-RhopH3 is the more extensive, with a buried surface area of 5,416 Å² (11% to 15% of the total exposed surface area of CLAG3.1 or RhopH3), compared with 3,410 Å² (7% to 7.5% of the total exposed surface area of CLAG3.1 or RhopH2) for the CLAG3.1-RhopH2 interface. All three protein components contain an unusually high abundance of cysteine in their primary sequences (13%, 10.5%, and 15.6% in RhopH2, CLAG3.1, and RhopH3, respectively), which is particularly notable given the relatively low cysteine content of the P. falciparum proteome (~1.7%). Our CLAG3.1 structure contains three disulfide bonds, whereas RhopH2 and RhopH3 each contain five (SI Appendix, Figure 5).
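As a rough consistency check on the figures above, the buried areas and their quoted percentages imply a range for each partner's total exposed surface area, and the per-subunit disulfide counts should add up to the 13 bonds mentioned in the abstract. The sketch below is illustrative only: the function name is hypothetical, the inputs are copied from the text as reported, and nothing is recomputed from atomic coordinates or from PDBePISA.

```python
# Back-of-envelope check of the interface and disulfide figures quoted above.
# All inputs are copied from the text; nothing is recomputed from coordinates.

def implied_total_area(buried_area, pct_low, pct_high):
    """Return (min, max) total exposed area, in square angstroms, implied by
    a buried area that makes up pct_low to pct_high percent of that total."""
    return buried_area * 100.0 / pct_high, buried_area * 100.0 / pct_low

# CLAG3.1-RhopH3 interface: 5,416 A^2 buried, 11% to 15% of either partner.
clag_rhoph3 = implied_total_area(5416, 11, 15)
# CLAG3.1-RhopH2 interface: 3,410 A^2 buried, 7% to 7.5% of either partner.
clag_rhoph2 = implied_total_area(3410, 7, 7.5)

# Disulfide bonds per subunit, as stated in the text.
disulfides = {"CLAG3.1": 3, "RhopH2": 5, "RhopH3": 5}
total_disulfides = sum(disulfides.values())

print(f"CLAG3.1-RhopH3 partners: {clag_rhoph3[0]:,.0f} to {clag_rhoph3[1]:,.0f} A^2")
print(f"CLAG3.1-RhopH2 partners: {clag_rhoph2[0]:,.0f} to {clag_rhoph2[1]:,.0f} A^2")
print(f"Total disulfide bonds: {total_disulfides}")
```

The two implied ranges overlap, as they should if CLAG3.1 is the partner common to both interfaces, and the disulfide counts sum to 13.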

RhopH3 is an 897-aa protein whose visible structure can be divided into three domains (Figure 3): an elongated N-terminal domain encoded by RhopH3 exons 1 to 3 (corresponding to aa 1 to 378), composed of a set of staggered β-sheets flanked by α-helices; a globular, α-helix-rich "middle domain" encoded by RhopH3 exons 4 to 6 (aa 379 to 675); and an extended C-terminal domain encoded by RhopH3 exon 7, characterized by a long loop segment extending toward RhopH2 (Figure 3C). Because of disordered density, the first 25 N-terminal residues and the residues after aa 738 were not modeled, although a diffuse, weak "pigtail" density is observed extending from aa 738 into the cleft of the RhopH complex between the main bodies of RhopH2 and RhopH3 (SI Appendix, Figure 4E). The N-terminal and C-terminal domains both lie near the RhopH2-CLAG3.1 interface, in contrast to the middle domain, which is located at the distal end of CLAG3.1 (Figure 3 A and C). This arrangement explains previous studies showing that exons 1 to 3 and exon 7 contribute to the interaction with CLAG3 and RhopH2, whereas RhopH3 exons 4 to 6 do not appear to play a role in CLAG3-RhopH2 binding and are instead involved in trafficking to the rhoptries (9, 12).

Figure 3. Structural details of RhopH3. (A) RhopH3 (ribbon diagram) with RhopH2 and CLAG3.1 (space-filling surface representations in blue and orange, respectively), showing the interfaces between RhopH3 and the other components of the RhopH complex. (B) Front and back views of the RhopH3 domain arrangement. (C) Multiple views of the N-terminal, middle, and C-terminal domains of RhopH3, corresponding to RhopH3 exons 1 to 3, 4 to 6, and 7 and colored mint, green, and forest green, respectively.

The abundance of cysteine in the RhopH complex noted above plays a particularly important role in the structure and organization of RhopH3. Setting aside the unmodeled auxiliary domain of RhopH2, RhopH3 is unique among the components of the RhopH complex in containing the only β-sheets in the main body of the complex; RhopH2 and CLAG3.1, by contrast, are predominantly α-helical. The three β-sheets in the N-terminal domain of RhopH3 form a staggered β-sheet motif, organized mainly by a 20-residue continuous loop (aa 225 to 245) that contributes a single β-strand to each of the three β-sheets, effectively tying the three sheets together (SI Appendix, Figure 6). Notably, this 20-residue loop contains two cysteine residues, Cys231 and Cys244, each of which participates in a disulfide bond (with Cys157, from an adjacent loop that also contributes β-strands to the staggered β-sheets, and with Cys253 of RhopH3, respectively) in a manner that pins this structurally critical, β-strand-containing loop in place (SI Appendix, Figures 5 and 6). We therefore speculate that these cysteine residues are essential for the correct folding and assembly of RhopH3.

Of the three components of the RhopH complex, RhopH2 is the least studied. We modeled 972 of the 1,378 residues of RhopH2, including five disulfide bonds. Our structure shows that RhopH2 is an elongated globular protein composed mainly of α-helices and loops, although the small unmodeled domain presumed to extend from the C terminus of RhopH2 is resolved well enough to show that it is rich in β-sheets. In the ternary RhopH complex, RhopH2 forms the smaller of the two lobes and interfaces only with CLAG3.1, through residues 385 to 830.

The 1,209 residues of CLAG3.1 in our structure can likewise be divided into three domains (Figure 4A). Residues 52 to 675 form a globular N-terminal domain, which shares an interface with the RhopH3 "middle domain" (Figure 4). Residues 688 to 978 form a squid-shaped "middle" domain, whose head shares an interface with RhopH2 and whose legs wrap around the third CLAG3.1 domain, a helical bundle comprising residues 979 to 1289 (Figure 4A).

Figure 4. Structural details of CLAG3.1. (A) Atomic model of CLAG3.1 shown as a ribbon diagram. (B and C) Detailed views of the CLAG3.1 helical bundle and HVR from the top (B) and side (C). For clarity, the front helix has been removed in C. The helical bundle is shown in cornflower blue, and the HVR is shown in cyan. For CLAG3.1 residues N1197 to A1222, hydrophobic residues, charged residues, polar residues, and A1211 are shown in yellow, pink, and green, respectively. The CLAG3.1-specific residues T1140 to H1141 are shown in orange. (D) View rotated 90° from C to better show the HVR. (E and F) Space-filling cutaway views of the helical bundle and HVR from the top (E) and side (F), with residues N1197 to A1222 shown as a ribbon diagram. (G) Detailed view of CLAG3.1 residues N1197 to A1222.

CLAG3 has been shown to modulate PSAC activity (10, 26), and it has been speculated that CLAG3 may be the main or sole component of the ion-conducting pore of PSAC (10). On the basis of membrane protein prediction software and helical wheel analysis, residues 1203 to 1223 of CLAG3.2 were hypothesized to constitute a transmembrane domain, with F1200 to S1217 predicted to form an amphipathic helix that oligomerizes with the same helix from multiple CLAG3.2 monomers to form the transmembrane pore of PSAC, in a manner similar to oligomeric pore-forming bacterial toxins (11, 27). However, recent CLAG3.1/3.2 knockdown and complete knockout studies have shown that the CLAG3 isoforms cannot be the sole component of the PSAC transmembrane pore (8, 28).

In our structure, CLAG3.1 residues 1210 to 1223, previously predicted to form a transmembrane helix, are embedded in the middle of the helical bundle formed by residues 979 to 1289 near the C terminus of CLAG3.1 (Figure 4 B–G). They are therefore buried in the core of the RhopH complex and largely shielded from solvent by the helical bundle (Figure 4 B–G). Examination of the residues in the structure corresponding to the putative transmembrane helix (aa 1200 to 1223) shows that F1200 to S1217 do form an α-helix, but it is not amphipathic over the full length of the sequence as previously predicted (Figure 4G) (11). The amphipathic portion of the helix (N1199 to Y1210) spans three turns (11 residues) and is approximately 15 Å long (Figure 4G), 4 to 7 residues shorter than the typical minimum length of a transmembrane helix. Because biological membranes are typically 20 to 40 Å thick, the predicted transmembrane helix may not traverse the membrane completely, or may have to interact with another helix to form a coiled coil in order to do so. The three-turn polar face at the N-terminal end of the helix, comprising residues N1199, E1203, N1206, and possibly Y1210, is solvent exposed and contributes to the outward, solvent-facing surface of the helical bundle (Movies 5 and 6). All other residues in the N1199 to A1222 helix are buried.
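The length argument above can be approximated with textbook α-helix geometry. The sketch below assumes an ideal helix (1.5 Å rise per residue, 3.6 residues per turn) crossing the membrane perpendicular to its plane, with no tilt or kinks; these idealized values and the function names are assumptions for illustration, not measurements from the structure.

```python
import math

# Ideal alpha-helix geometry (textbook values, assumed here for illustration).
RISE_PER_RESIDUE = 1.5    # angstroms advanced along the helix axis per residue
RESIDUES_PER_TURN = 3.6

def helix_length(n_residues):
    """Axial length in angstroms of an ideal alpha-helix of n_residues."""
    return n_residues * RISE_PER_RESIDUE

def residues_to_span(thickness):
    """Smallest whole number of residues whose ideal helix is at least
    `thickness` angstroms long (perpendicular crossing, no tilt)."""
    return math.ceil(thickness / RISE_PER_RESIDUE)

amphipathic_residues = 11  # amphipathic stretch reported in the text
print(f"{amphipathic_residues} residues: {helix_length(amphipathic_residues):.1f} A, "
      f"{amphipathic_residues / RESIDUES_PER_TURN:.1f} turns")

# Membrane thickness range quoted in the text: 20 to 40 angstroms.
for thickness in (20, 30, 40):
    print(f"{thickness} A membrane: at least {residues_to_span(thickness)} residues")
```

Under these assumptions an 11-residue stretch is about three turns long and falls short of even the thinnest membrane in the quoted range, which is in line with the conclusion drawn in the text.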

Residues Y1093 to G1142 constitute the HVR, which was previously shown to be exposed on the surface of infected red blood cells in membrane-associated CLAG3.1/3.2 (10) (SI Appendix, Figure 3). In our structure, this region forms the ends of two helices in the helical bundle and the long loop between them, which extends from the middle of the CLAG3.1 C-terminal helical bundle up through the interface between CLAG3.1 and the RhopH3 N-terminal domain (Figure 4 and Movie 7). Twenty residues in the middle of the loop are unstructured in our density map. Interestingly, most of the HVR visible in our structure is buried deep within the complex, between the CLAG3.1 helical bundle and the RhopH3 N-terminal domain (Figure 4 B–G and Movies 5–7).

High-resolution structures provide an invaluable framework for interpreting the large body of biochemical and genetic data on protein complexes involved in the pathogenesis of malaria parasites (29). Unfortunately, malaria parasites are notoriously difficult subjects for structural study (30). Many P. falciparum proteins are extremely challenging to overexpress recombinantly. Multiprotein complexes pose a particular challenge because correct folding and assembly are difficult to reproduce by reconstitution or coexpression in heterologous systems. Using our endogenous structural proteomics approach (21), we bypassed these obstacles to determine the near-atomic structure of the native RhopH complex enriched directly from parasite-infected red blood cells. A recent paper (31) reported an atomic structure of the RhopH complex obtained from protein purified using an engineered tag. The two structures are largely consistent with each other, but our native complex contains a different CLAG isoform (CLAG3.1 versus CLAG3.2), and we were able to model an additional 459 amino acid residues in the RhopH2 subunit of the native complex. Our discovery of this long-hypothesized but previously unobserved soluble state of the native RhopH complex is the first direct observation of a P. falciparum transmembrane protein in a soluble, trafficking state. It represents an important step toward resolving the long-standing question of how parasite effectors, many of which are integral membrane proteins (4, 5), are trafficked to their final sites of action in the red blood cell membrane.

The structure offers insight into how this trafficking works, revealing that the previously predicted transmembrane and extracellular elements are buried in the middle of the complex, likely protecting these elements from the aqueous cytosolic environment during trafficking. Taken together with all the data available so far (Movie 8) (8–19, 24, 26, 28, 32, 33), this finding points to a model in which CLAG3, RhopH2, and RhopH3 associate early in the secretory pathway (8–10, 12), dissociate only for translocation across the parasitophorous vacuolar membrane through PTEX (9, 34), and reassemble into a soluble complex in the red blood cell cytosol to complete the journey to the red blood cell surface (9). This suggests that the RhopH complex shares with pathogenic pore-forming proteins the ability to convert from a soluble form to a transmembrane form upon reaching its target membrane (27, 35, 36), and that this conversion may represent a general strategy used by the parasite to deliver protein complexes to the red blood cell membrane.

The ability to convert, upon reaching the target membrane, from a soluble form that traverses the cytosol or extracellular environment to an integral transmembrane form is a unifying key feature shared by pore-forming proteins in biological systems from bacteria to vertebrates (36). It is therefore tempting to regard the RhopH complex as a pore-forming protein complex. The use of pore-forming proteins to permeabilize and lyse host cell membranes indiscriminately during invasion and egress is a well-studied strategy commonly used by many pathogens, including P. falciparum itself (27, 36–38). By contrast, pore-forming proteins that selectively alter the permeability of the host cell to specific ions or solutes without disrupting the membrane are rare and poorly understood. This strategy is uncommon in bacteria but is essential to the function of several enveloped viruses, including influenza and HIV (36). In this respect, the way P. falciparum uses the RhopH/PSAC complex to modulate the permeability of the host red blood cell membrane for nutrient acquisition more closely resembles the way viruses use viroporins to establish ion-specific permeability across host cell membranes, which sets the RhopH complex apart from the pore-forming proteins that malaria parasites more commonly use to breach host cell membranes during invasion and egress (38). Like most viroporins (35), the vast majority of the RhopH complex is located inside the cell, with only the HVR exposed on the surface of the red blood cell membrane (10). The transition of viroporins from soluble to membrane bound is usually triggered by a change in pH or by interaction with lipids or proteins in the target membrane, which may hint at a possible mechanism for the insertion of CLAG3.1 elements into the red blood cell membrane. Given the abundance of cysteines and disulfide bonds in the RhopH complex, we speculate that the large conformational change the complex must undergo at the membrane to expose its putative transmembrane and extracellular elements and insert them into the red blood cell membrane is triggered or regulated by allosteric disulfide bonds.

First discovered more than a decade ago, the formation and cleavage of allosteric disulfide bonds is now firmly established as a widely used strategy for controlling and regulating protein structure and function in biological systems (39, 40). Indeed, when we examined the 13 disulfide bonds in the structure, we found that three exhibit the well-defined strained geometries that are the hallmark of all allosteric disulfide bonds characterized to date (SI Appendix, Table S1) (39): one in RhopH2 and two in RhopH3 (SI Appendix, Figure 5). These bonds may therefore be cleaved upon interaction with the red blood cell membrane, triggering a conformational change that extrudes the putative transmembrane and extracellular elements from the complex and drives them through the membrane. Although no parasite-derived protein disulfide isomerase (PDI) has been found in the P. falciparum exportome, human red blood cells do carry active PDI at the plasma membrane (41, 42), which could cleave the allosteric disulfides and trigger a conformational change upon contact with the RhopH complex at the membrane. Indeed, previous studies have shown that several inhibitors of human PDI appear to be active against P. falciparum, although their IC50 values vary (43). Alternatively, a change in redox potential in the cytosol could trigger cleavage of the disulfide bonds, bringing about the exposure and extensive conformational changes required for the CLAG3 helical bundle to interact with the red blood cell membrane (Figure 5 B and C).

Figure 5. Proposed model for insertion of the CLAG3.1 transmembrane domain and HVR into the red blood cell membrane. (A–C) Schematic illustration of how conformational changes in the RhopH complex could drive the insertion of the extracellular and putative transmembrane elements into the red blood cell plasma membrane.

Although surface-exposed membrane proteins with important functions in pathogenesis are usually popular targets for drug development, such proteins have so far proven to be difficult targets for P. falciparum drug and vaccine development because of the various mechanisms the parasite uses to evade immune detection. If the use of soluble forms is a general strategy employed by the parasite to deliver proteins to the red blood cell membrane, then high-resolution structures of the soluble trafficking forms of these essential membrane protein complexes offer an opportunity to approach therapeutic development from a new angle: blocking the assembly or trafficking of a complex before it has the chance to insert into the red blood cell membrane. The approach presented here for overcoming the obstacles to high-resolution structural studies of malaria parasites thus opens the door to the development of new antimalarial therapies with novel modes of action, and it can also be applied to hard-to-isolate complexes in other organisms and systems to answer long-standing questions.

Plasmodium falciparum cultures were prepared as previously described (21, 29). Red blood cells were then lysed with cold phosphate-buffered saline (PBS) containing 0.0125% saponin (Sigma, saponin content ≥10%) and an EDTA-free protease inhibitor mixture (Roche or Pierce). The released P. falciparum parasites were washed in cold PBS containing the EDTA-free protease inhibitor mixture. The washed cell pellets were flash frozen in liquid nitrogen and stored at -80°C.

Frozen parasite pellets were resuspended in lysis buffer (25 mM Hepes pH 7.4, 150 mM KCl, 10 mM MgCl2, 10% glycerol) and lysed using a glass Dounce tissue homogenizer. The soluble lysate was separated from the membrane fraction by centrifugation at 100,000 × g for 1 h and then fractionated on a sucrose gradient as previously described (21). The presence and relative abundance of target proteins in the resulting fractions were evaluated as previously described (21). Briefly, fractions were assessed by silver-stained sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE), tryptic digest liquid chromatography-mass spectrometry, and negative-stain EM for the abundance of potential target particles. We bypassed the traditional use of gel filtration chromatography to evaluate sample quality and instead used negative-stain EM to directly visualize and assess the abundance of intact particles that yield promising 2D class averages with distinctive features. To this end, a small dataset of approximately 100,000 particles was collected for each fraction, and 2D class averages generated in RELION (22, 44) were used to identify the fractions containing promising particles (Figure 1E).

Mass spectrometry analysis was performed as previously described (21). The proteomics data have been deposited in the ProteomeXchange Consortium (http://proteomecentral.protomexchange.org) via the MassIVE partner repository.

Cryo-EM grids of the selected parasite lysate fraction were prepared and imaged as previously described (21), yielding a final dataset of 19,449 movies.

Frames in each movie were aligned, gain reference corrected, and dose weighted using MotionCor2 (45) to generate a micrograph (SI Appendix, Figure 1). Aligned micrographs without dose weighting were also generated and used for contrast transfer function estimation in CTFFIND4 (46) and automatic particle picking in Gautomatch (47).

A total of 2,900,167 particles were extracted from 19,449 micrographs. Two to three rounds of reference-free 2D classification in RELION were used to remove junk particles. A total of 858,049 particles belonging to 2D class averages with clear secondary structure features were further classified in cryoSPARC (23). A subset of 2D class averages, corresponding to 108,872 particles, was identified as likely originating from the same asymmetric volume. These 108,872 particles were subjected to unsupervised single-class ab initio three-dimensional reconstruction, followed by single-class homogeneous refinement with C1 symmetry in cryoSPARC. The resulting reconstruction was then subjected to nonuniform refinement with C1 symmetry in cryoSPARC, yielding a density map with a final overall resolution of 3.72 Å (Figure 2 C and D and SI Appendix, Figure 7). The 108,872 particles were then subjected to focused classification and refinement in RELION to achieve sufficient local resolution throughout the map for cryoID to identify all three proteins in the map.
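To put the particle selection above in proportion, the following sketch tabulates how much of the dataset survives each step. It uses only the counts quoted in the text; the stage labels and the `retention` helper are hypothetical names introduced for illustration and do not correspond to any RELION or cryoSPARC function.

```python
# Particle counts at each processing stage, copied from the text.
STAGES = [
    ("extracted from micrographs", 2_900_167),
    ("kept after 2D classification", 858_049),
    ("final RhopH subset", 108_872),
]
N_MICROGRAPHS = 19_449

def retention(stages):
    """Return rows of (label, count, fraction of first stage,
    fraction of the immediately preceding stage)."""
    first = stages[0][1]
    previous = first
    rows = []
    for label, count in stages:
        rows.append((label, count, count / first, count / previous))
        previous = count
    return rows

rows = retention(STAGES)
for label, count, of_first, of_previous in rows:
    print(f"{label:30s} {count:>9,d}  {of_first:6.1%} of extracted  "
          f"{of_previous:6.1%} of previous stage")

print(f"mean particles per micrograph: {STAGES[0][1] / N_MICROGRAPHS:.0f}")
```

The small fraction of extracted particles that ends up in the final map reflects the heterogeneous nature of the sample: the RhopH complex is one of several assemblies present in the same lysate fraction.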

We ran the cryoID Get_queries subroutine on the initial 3.7-Å-resolution density map obtained from refinement and postprocessing in RELION, using the default symmetry (Any) and a high-resolution limit of 3.6 Å as input parameters (21). We then manually inspected the resulting query models, corrected residues misassigned by Get_queries, and extended the queries at both ends where the density allowed (Figure 2E). This yielded the following degenerate sequences, which were then used for the search:

Using this set of query sequences, cryoID identified two candidates for the protein in this region of the density map from a candidate pool of 1,170 proteins identified in the sucrose gradient fraction by mass spectrometry (SI Appendix, Table S2). The two candidates, CLAG3.1 and CLAG3.2 from Plasmodium falciparum (O77309/O77310), share 96% protein sequence identity. We confirmed the identification by manually building a de novo atomic model into the rest of the map (Figure 2B). Focused classification and refinement were used to improve the local resolution around the region corresponding to residues 1110 to 1150, one of the few segments in which the sequences of CLAG3.1 and CLAG3.2 differ. The improved resolution was sufficient for us to confidently determine that the sequence in this region matches CLAG3.1 rather than CLAG3.2, based on the presence of two residues, T1140 and H1141, that are found in CLAG3.1 but not in CLAG3.2. This confirmed that the protein in the complex is CLAG3.1 (SI Appendix, Figure 3).
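
Distinguishing two paralogs that share 96% identity comes down to locating the few positions where they differ and then checking the density at those positions. The sketch below illustrates the first half of that reasoning for two pre-aligned sequences. The function name `compare_aligned` and the toy fragments are hypothetical; they are not the real CLAG3.1 and CLAG3.2 sequences, and no alignment step is performed.

```python
def compare_aligned(seq_a, seq_b, start=1):
    """Compare two pre-aligned, equal-length sequences.

    Returns (fraction identity, differences), where differences is a list of
    (residue number, residue in seq_a, residue in seq_b). `start` is the
    residue number assigned to the first character.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    diffs = [
        (start + i, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]
    identity = 1 - len(diffs) / len(seq_a)
    return identity, diffs


if __name__ == "__main__":
    # Hypothetical toy fragments, NOT the real paralog sequences.
    paralog_1 = "MKLVTHESGA"
    paralog_2 = "MKLVNQESGA"
    identity, diffs = compare_aligned(paralog_1, paralog_2, start=101)
    print(f"identity: {identity:.0%}")
    for position, res_1, res_2 in diffs:
        print(f"position {position}: {res_1} vs {res_2}")
```

Only the positions returned in `diffs` can discriminate between the two candidates, which is why local resolution in a divergent segment matters more than the overall resolution of the map.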

We completed backbone tracing of the remaining regions of the density map that did not correspond to CLAG3.1, which allowed us to determine that the map likely contained two other distinct polypeptide chains. We then used Get_queries to generate queries from these two remaining regions of the density map, resulting in two additional sets of queries (Figure 2 F and G). We manually inspected the query models, corrected residues incorrectly assigned by Get_queries, and extended the queries at both ends where the density allowed. This produced the following degenerate sequences, which were then used for searching:

Using these two sets of query sequences, cryoID identified the proteins in the two regions of the density map from the candidate pool of 1,170 proteins identified in the sucrose gradient fraction by mass spectrometry, namely RhopH2 (query set 2) and RhopH3 (query set 3) from Plasmodium falciparum (C0H571 and Q8I395) (SI Appendix, Tables S3 and S4). We then confirmed the identification by manually building de novo atomic models in the corresponding two regions of the map (Figure 2B).
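
The general idea of searching a candidate pool with a degenerate sequence can be sketched as follows. This is a simplified illustration under stated assumptions, not cryoID's actual alphabet, scoring, or implementation: each query position is modeled as a set of allowed residues, the query is slid along each candidate without gaps, and candidates are ranked by their best-matching window. The residue groupings, candidate names, and sequences are all hypothetical.

```python
def best_window_score(query, sequence):
    """Slide a degenerate query along a sequence without gaps.

    `query` is a list of sets; each set holds the residues allowed at that
    position. Returns (best fraction of matching positions, 0-based offset),
    or (0.0, None) if the sequence is shorter than the query.
    """
    if not query:
        raise ValueError("query must not be empty")
    n = len(query)
    best = (0.0, None)
    for offset in range(len(sequence) - n + 1):
        window = sequence[offset:offset + n]
        matches = sum(res in allowed for res, allowed in zip(window, query))
        score = matches / n
        if best[1] is None or score > best[0]:
            best = (score, offset)
    return best


def rank_candidates(query, pool):
    """Rank candidates by best window score, highest first."""
    scored = [(name, *best_window_score(query, seq)) for name, seq in pool.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical residue groupings and toy candidate pool.
AROMATIC = set("FWY")
ALIPHATIC = set("AVLI")
EXAMPLE_QUERY = [AROMATIC, ALIPHATIC, {"G"}, ALIPHATIC, {"P"}]
EXAMPLE_POOL = {
    "candidate_A": "MSTFLGVPKK",
    "candidate_B": "MSTFLGKPKK",
    "candidate_C": "MKKKKSSSTT",
}

if __name__ == "__main__":
    for name, score, offset in rank_candidates(EXAMPLE_QUERY, EXAMPLE_POOL):
        print(f"{name}: best score {score:.2f} at offset {offset}")
```

Because degenerate positions accept several residues, closely related proteins can score almost equally well under a scheme like this. That is consistent with two near-identical paralogs emerging as joint candidates for a single region, which then have to be separated by inspecting the density at divergent residues.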

UCSF Chimera (48) and COOT (49) were used for map interpretation. Protein sequences of Plasmodium falciparum were obtained from the National Center for Biotechnology Information (50) and PlasmoDB (51) protein databases. cryoID (21) was used to determine the initial sequence registration of all three proteins in the map during model building. PHYRE2 (52) secondary structure predictions were also used as a guide in subsequent model building. Using the best focused classification map for each local region, each residue in the three proteins was manually traced and built de novo in COOT. Rotamers were selected manually, residue by residue, using the select_rotamer option in COOT and judged by their fit to the map.

The map resolution outside the core region of the complex was not high enough to allow the use of real_space_refine_zone in COOT. Therefore, manual refinement targeting both protein geometry and fit to the density map was performed using COOT's regularize_zone and fixed_atoms_for_refinement functions. Finally, the phenix.real_space_refine program in PHENIX (53) was used for iterative cycles of automated refinement of the resulting composite model, followed by further manual refinement against the map visually judged to have the best overall combination of features for each local region, to arrive at the final structure.

All figures and movies were made using UCSF Chimera, PyMOL (54), and ResMap (55). MolProbity (56) was used to validate the stereochemistry of the final model.

The atomic model and cryo-EM density map have been deposited in the Protein Data Bank and the Electron Microscopy Data Bank under accession numbers 7MRW and EMD-23959, respectively.

This research was supported in part by NIH grants (R01GM071940, AI094386, and DE025567 to Z.H.Z.; 1DP5OD029613 to C.-M.H.; and K99/R00 HL133453 to J.R.B.). C.-M.H. acknowledges funding from the Ruth L. Kirschstein National Research Service Award (AI007323). We thank the University of California, Los Angeles (UCLA) Proteomics Research Center for assistance with mass spectrometry, and we acknowledge the use of resources at the UCLA Electron Imaging Center for Nanomachines, supported by NIH (S10RR23057, S10OD018111, and U24GM116792) and NSF (DBI-1338135 and DMR-1548924).

Author contributions: C.-M.H. and Z.H.Z. designed research; C.-M.H. performed research; D.E.G. and J.R.B. contributed new reagents/analytic tools; C.-M.H., J.J., M.L., and X.L. analyzed data; and C.-M.H. and Z.H.Z. wrote the paper.

The authors declare no competing interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.2100514118/-/DCSupplemental.

This open access article is distributed under Creative Commons Attribution-NonCommercial-NoDerivatives License 4.0 (CC BY-NC-ND).

Copyright © 2021 National Academy of Sciences. Online ISSN 1091-6490. PNAS is a partner of CHORUS, COPE, CrossRef, ORCID and Research4Life.