
2D Classification in cryoSPARC

Overview

2D classification sorts particles into classes of similar particles. This can assist in visually identifying and removing classes that appear to be "junk" prior to refinement, as the inclusion of "junk" particles can detract from achieving the highest resolution refined structure.

To perform a 2D classification experiment in cryoSPARC, select the dataset of interest from the Datasets page.

Navigate to the Experiments page and choose New Experiment > 2D Classification. Click the Setup button on the left-hand sidebar to set parameters.

Setting parameters

In general, a successful 2D Classification involves adjusting two parameters over a few iterations: the Number of 2D Classes and the Initial Classification Uncertainty Factor. (As always, cryoSPARC populates default parameters which can be adjusted by the user.)

For a typical dataset of hundreds of thousands of particles, the Number of 2D Classes is usually set between 50 and 200, and occasionally as high as 300. In general, as the Number of 2D Classes increases, "junk" classes become easier to find, because the "junk" separates into its own classes and the "good" classes become visually more obvious. With too few classes, "junk" particles may be hidden within what otherwise looks like a "good" class.

The Initial Classification Uncertainty Factor (ICUF) captures the user's knowledge of how similar the "junk" and good particles in a dataset look. An ICUF of 1 reflects that "junk" particles look very different from good particles within the same dataset. A larger ICUF, on the other hand, means that the "junk" may look very similar to good particles, and therefore the algorithm should at first be more uncertain about assigning particles to classes. Modifying this parameter instructs the optimization algorithm to search for 2D classes that are more similar (ICUF large) or less similar (ICUF small) to each other.

Recommended workflow for initial 2D Classification experiment

We recommend starting with Number of 2D Classes = 100 and leaving the default value for ICUF. (If your dataset is very small, 100 classes may be too many. In general, we use a rule of thumb of at least 100 images per class.)
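The rule of thumb above can be sketched as a small helper. This is only an illustration: the function name `suggest_num_classes` and its arguments are hypothetical and are not part of cryoSPARC.

```python
def suggest_num_classes(n_particles: int,
                        requested: int = 100,
                        min_per_class: int = 100) -> int:
    """Cap the requested class count so that, on average, each class
    would hold at least `min_per_class` particles."""
    if n_particles < min_per_class:
        return 1  # too few particles to justify more than one class
    return max(1, min(requested, n_particles // min_per_class))


print(suggest_num_classes(250_000))  # 100: the recommended starting value fits
print(suggest_num_classes(4_000))    # 40: a small dataset gets fewer classes
```

The cap uses the average number of particles per class, so individual classes can still end up with fewer particles than the rule of thumb suggests.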

To launch the 2D Classification experiment, click the Launch button on the left-hand sidebar, then click Enqueue. Once it is complete, examine the 2D classes visually. The class comprising the most particles is in the top left-hand corner, and particle counts decrease from left to right along each row and from the top row to the bottom. The exact number of particles is shown at the top of each class. The numbers at the bottom are the estimated resolution of the 2D class average and the effective sample size for that class.

You may be interested in the Effective number of assigned classes plot and the Probability of best class plot below. The plots indicate the level of uncertainty about classification of particles.
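To make the two quantities concrete, here is a minimal sketch of how per-particle uncertainty could be summarized from a vector of class probabilities. This post does not give cryoSPARC's exact formula; the inverse participation ratio used for the effective number of classes is an assumption chosen for illustration, and both function names are hypothetical.

```python
from typing import Sequence


def best_class_probability(posterior: Sequence[float]) -> float:
    """Probability assigned to the most likely class for one particle."""
    return max(posterior)


def effective_num_classes(posterior: Sequence[float]) -> float:
    """Assumed definition: 1 / sum(p_k^2).
    Equals 1 when all probability sits on one class and K when the
    probability is spread evenly over K classes."""
    return 1.0 / sum(p * p for p in posterior)


confident = [0.97, 0.01, 0.01, 0.01]   # particle clearly belongs to one class
uncertain = [0.25, 0.25, 0.25, 0.25]   # particle fits four classes equally well

for name, post in (("confident", confident), ("uncertain", uncertain)):
    print(name, best_class_probability(post), effective_num_classes(post))
```

Under this assumed definition, a dataset that classifies cleanly would show most particles with a best-class probability near 1 and an effective number of classes near 1.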

On examining the results, if the classes do not look "good" or if there are artefacts, streaks, etc, in the images, then you may wish to:

  • Use a different noise model. Prepare a fresh 2D Classification experiment and, on the Parameters page, click Show Advanced Params for 2D Classification. Set 'Use white noise model' (class2D_sigma_use_white) to 'false'. This will use a coloured noise model, which can help in tricky cases.

  • Marginalize over poses. Set 'Force Max over poses/shifts' (class2D_force_max) to 'false'. The algorithm will then marginalize over poses, which can help achieve better 2D classes, especially for very small molecules.

  • Use clamp-solvent. If the classes appear to have a lot of unwanted artefacts in the background, you can use a special optimization method to ensure that all classes will have a blank background. Set 'Use clamp-solvent to solve 2D classes' (class2D_clamp) to 'true'.
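For record keeping, the three adjustments above can be written down as parameter overrides. The parameter names and values come from the list above; the remedy labels and the `build_overrides` helper are hypothetical, and this sketch does not show how cryoSPARC itself accepts parameters (in this workflow they are set through the Parameters page).

```python
# Advanced parameters named above, with the values suggested for troubleshooting.
# The dictionary keys ("coloured_noise", ...) are illustrative labels only.
TROUBLESHOOTING_OVERRIDES = {
    "coloured_noise": {"class2D_sigma_use_white": False},
    "marginalize_poses": {"class2D_force_max": False},
    "clamp_solvent": {"class2D_clamp": True},
}


def build_overrides(*remedies: str) -> dict:
    """Merge the overrides for the chosen remedies into one parameter dict."""
    params = {}
    for name in remedies:
        params.update(TROUBLESHOOTING_OVERRIDES[name])
    return params


print(build_overrides("coloured_noise", "clamp_solvent"))
```

Keeping a note like this alongside each experiment makes it easier to tell later which change actually improved the classes.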

If you can spot several "good" classes and some obvious "junk" classes, then you can:

  • Perform another 2D Classification experiment on the entire dataset using different values for ICUF and Number of 2D Classes; or
  • Select the particles that belong to the "good" classes and perform another 2D Classification experiment only on those. In that case you can increase the ICUF, as you have presumably removed the "obvious junk" and are now looking for "non-obvious junk".

Select 2D Classes

To select and exclude 2D classes before proceeding to another 2D Classification experiment, or before proceeding to perform an Ab-initio Reconstruction experiment, select the particles of interest from the 2D Classification experiment you performed previously, then choose New Experiment > Select 2D Classes. Launch and Enqueue the experiment.

Almost immediately, you will be able to see the 2D classes on the Launch page.

Hover over a class image to see an enlarged version.

Click on classes to select them. Alternatively, you can use the 'Select All' or 'Select None' buttons at the bottom. Selected classes are highlighted in red.

Click Done. Additional questions will populate at the bottom of the outputs, after the plots.

If you select 'Include all particles belonging to selected classes', the experiment will complete immediately and the particles belonging to the selected classes will be available for selection from the Experiments page, to be used in further 2D Classification experiments, an Ab-initio Reconstruction experiment, or a Refinement experiment with an existing initial reference.

If you select 'Include particles if classification probability is above a threshold', then you will be prompted to select from a list of thresholds before the experiment completes. As above, the particles belonging to the classes you selected will be available for selection from the Experiments page.
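The difference between the two options can be sketched as follows. The `Particle` record and the `select_particles` function are hypothetical stand-ins, not cryoSPARC data structures, and keeping particles exactly at the threshold is an assumption about the boundary case.

```python
from typing import Iterable, List, NamedTuple, Optional, Set


class Particle(NamedTuple):
    particle_id: int
    best_class: int         # index of the most probable 2D class
    best_class_prob: float  # classification probability for that class


def select_particles(particles: Iterable[Particle],
                     selected_classes: Set[int],
                     threshold: Optional[float] = None) -> List[int]:
    """Return IDs of particles in the selected classes.
    With threshold=None every particle in a selected class is kept;
    otherwise only particles at or above the threshold are kept."""
    keep = []
    for p in particles:
        if p.best_class not in selected_classes:
            continue
        if threshold is not None and p.best_class_prob < threshold:
            continue
        keep.append(p.particle_id)
    return keep


particles = [
    Particle(1, 0, 0.99),
    Particle(2, 0, 0.60),
    Particle(3, 5, 0.95),
    Particle(4, 2, 0.92),
]
print(select_particles(particles, {0, 2}))       # [1, 2, 4]
print(select_particles(particles, {0, 2}, 0.9))  # [1, 4]
```

A threshold removes particles that were only weakly assigned to a selected class, at the cost of a smaller particle set.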

Union/Intersect Particles

This experiment is useful for keeping and/or rejecting particles following several rounds of 2D Classification. For example, for Dataset A, if you performed three 2D Classification experiments using different parameters and found different "good" classes, you may want to take the Union of all of the "good" particles from the three experiments and use that as the starting point for an Ab-initio Reconstruction or a Refinement with an initial model.

If you performed three 2D Classification experiments on Dataset A and found that the "junk" classes were slightly different in each experiment, you may want to take the Intersection of all of the "good" classes to ensure you are including only those particles that were classified as "good" all three times, while excluding any particles that have been classified as "junk" even once.
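In set terms, the two strategies look like this. The particle IDs and function names are made up for illustration, and the sketch assumes all three runs classified the same Dataset A, so the same ID always refers to the same particle.

```python
from typing import Set


def union_particles(*runs: Set[int]) -> Set[int]:
    """Particles judged "good" in at least one run."""
    return set().union(*runs)


def intersect_particles(*runs: Set[int]) -> Set[int]:
    """Particles judged "good" in every run."""
    if not runs:
        return set()
    return set.intersection(*(set(r) for r in runs))


# "Good" particle IDs from three 2D Classification runs (hypothetical values).
run_1 = {1, 2, 3, 4}
run_2 = {2, 3, 4, 5}
run_3 = {3, 4, 6}

print(sorted(union_particles(run_1, run_2, run_3)))      # [1, 2, 3, 4, 5, 6]
print(sorted(intersect_particles(run_1, run_2, run_3)))  # [3, 4]
```

The union is the more permissive choice and keeps the most particles; the intersection is the stricter choice and drops any particle that was rejected even once.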

To start a Union/Intersect Particles experiment, select multiple sets of particles (all particles or selected particles from previous 2D classifications), then choose New Experiment > Union/Intersect Particles. Click Launch and Enqueue.

You will be prompted to select a threshold - either the default value of 90%, or a custom threshold - for each set of particles you used as an input, unless that set of particles does not have associated classification probabilities. In that case, all the particles from the selected set will be used.

Once complete, you will be able to view the result on the Experiments page.

Note: You can also use the Union/Intersect Particles experiment to work on particles resulting from any other experiment type.

2D Classification speed benchmark (1 x NVIDIA GPU, cryoSPARC v0.5)

| Dataset | Particles | Box size (pixels) | Classes | Runtime |
|---|---|---|---|---|
| 80S Ribosome | 105K | 360^2 | 200 | 2.6 hours |
| V0 complex | 437K | 200^2 | 200 | 4.1 hours |
| AP1 complex | 145K | 256^2 | 200 | 2.6 hours |
| Hemoglobin | 537K | 100^2 | 100 | 1.4 hours |
| TRPV1 | 218K | 192^2 | 200 | 2.8 hours |
| ATPase | 197K | 256^2 | 100 | 1.3 hours |
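To compare the benchmark rows on a common scale, the runtimes can be converted to an approximate throughput in particles per hour. The figures are taken from the table above; the particle counts are rounded to the nearest thousand there, so the results are approximate, and the function name is hypothetical.

```python
# (dataset, particles, box edge in pixels, classes, runtime in hours)
BENCHMARKS = [
    ("80S Ribosome", 105_000, 360, 200, 2.6),
    ("V0 complex",   437_000, 200, 200, 4.1),
    ("AP1 complex",  145_000, 256, 200, 2.6),
    ("Hemoglobin",   537_000, 100, 100, 1.4),
    ("TRPV1",        218_000, 192, 200, 2.8),
    ("ATPase",       197_000, 256, 100, 1.3),
]


def particles_per_hour(particles: int, runtime_hours: float) -> float:
    """Approximate classification throughput for one benchmark row."""
    return particles / runtime_hours


# Print rows from highest to lowest throughput.
for name, n, box, classes, hours in sorted(
        BENCHMARKS, key=lambda row: particles_per_hour(row[1], row[4]),
        reverse=True):
    print(f"{name:<13} box {box:>3}^2  {classes:>3} classes  "
          f"{particles_per_hour(n, hours):>9,.0f} particles/hour")
```

In these numbers the smallest box size (Hemoglobin, 100^2) gives the highest throughput and the largest (80S Ribosome, 360^2) the lowest, although the rows also differ in the number of classes.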

Questions or feedback? We would love to hear from you at [feedback@structura.bio](mailto:feedback@structura.bio).
