"I fell into a burning ring of fire..." (Johnny Cash, 1963)
One of the challenges of working with high parameter data is how complicated it is to explore and present. It is easy to get burned by bad data presentation, which can frustrate audiences of a paper or talk or, more dangerously, lead to misinterpretation of the data.
The goal for any high parameter graphic should be to completely present the data, allowing viewers to appreciate fine subsets (or at least recognize that they exist), while still providing a quick and intuitive read-out. The ability to emphasize or de-emphasize markers is also important, and often a fixed underlying structure makes it easier to compare samples or markers in a high dimensional data visualization.
It is also critically important that the graphic doesn't mislead people. This seems like an obvious statement, but in practice, it can present a challenge. Often, there is uncertainty in how cells are grouped, but the graphic doesn't provide any hint of this. Some graphics don't account for the fact that even among cells lacking expression of a particular marker ("the negative population"), there is a distribution of autofluorescence. When these graphics shade events by expression level, but don't pin all events in the negative population at zero, they are misleading.
We face a real challenge with 30-parameter flow cytometry data, because existing tools don't meet our needs. Displaying 15 bivariate (two-parameter) plots can create a "rage of fire" inside many of us ... it is too much to look at, and information about co-expression of three or more markers is lost. Recent tools for mass cytometry can be applied, but they often require a bioinformatician's expertise, and some suffer from the limitations described above. I participated in the development of various tools while I was in the Roederer lab (e.g., making a primitive precursor that inspired SPICE/Pestle; https://niaid.github.io/spice/); although these are useful, they work better in the 8-12 color space.
So, a few weeks before a talk at the International Society for Advancement of Cytometry (ISAC) meeting (where I had to show 30-parameter data), I developed the Ring of Fire plot.
Here's a primer on how to make these plots:
1) Export your FCS file into a text file. Genepattern has a tool to do this (http://software.broadinstitute.org/cancer/software/genepattern/flow-cytometry-data-preprocessing); there are other tools available within flow cytometry data analysis software as well.
2) I used JMP software from this point on, but these steps can be replicated with R-based algorithms, or other statistical software packages.
3) Choose a set of markers with which to group cells. These could be markers that are commonly used to classify cell populations (e.g., markers of naive and memory T-cells).
4) Cluster your data, based on expression of these markers, using whatever algorithm you prefer. I used hierarchical clustering, because it is simple and many different software packages do it. Keep in mind that you may have to down-sample your data, depending on the size of the FCS file and available computing power. There are a number of ways to down-sample data, so consider carefully which to use. FlowJo has tools for down-sampling an FCS file; other flow analysis packages may as well.
5) Assign values to each event that indicate to which cluster the event belongs. It is likely that you will have to jitter these values, so that events do not all sit right on top of each other. You can assign x and y values to each event and cluster, spacing them in such a way that they are arrayed in a ring around a central point. JMP does this automatically under constellation plot in the hierarchical clustering platform. You now have something, in cartoon form, like this:
6) Now that you have a ring, you need flames. Select a marker (or markers) that you'd like to display in the third dimension. What makes sense here? Any marker for which expression levels tell you something meaningful. For example, T-cell literature shows that polyfunctional cells (making IFNg, IL2, and TNF simultaneously) express very high levels of IFNg. Alternatively, you can choose a marker that you want to emphasize, or one for which you'd like to know if expression is restricted to a particular cluster. Or, you can choose any marker you want to highlight, and compare it to other markers by making multiple ring of fire plots. Here's an example of how plots can be compared for expression in the "flaming" dimension. (This is unpublished data, so I've intentionally left it small and blurry.) Let's consider the first plot, with the maroon dots. We have a cluster of cells uniquely capable of expressing high levels of the marker, compared to other clusters of cells. This high maroon expressor population doesn't express the blue or green markers. The blue marker in particular is restricted to only a few clusters. In sum, you can compare and contrast expression of many markers across the underlying, fixed structure.
7) The flames are made simply by plotting the constellation plot values on the x and y axes, and then overlaying the flame marker in the vertical (z) dimension.
8) Importantly, the events in the flames are colored by whether they are positive or negative for expression of the marker by standard flow cytometry gating. This ensures that any distribution in autofluorescence or spreading error is not falsely considered real expression.
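For readers who aren't using JMP, steps 5, 7, and 8 can be roughed out in a few lines of Python. This is only a sketch under some loud assumptions, not the procedure JMP follows: the function name ring_of_fire_coords and its parameters are made up for illustration, the clusters are simply spaced evenly around a circle (a stand-in for JMP's constellation plot layout), cluster assignments are assumed to already exist from whatever clustering you ran in step 4, and the positive/negative gate is reduced to a single threshold on the flame marker. The example values are invented.

```python
import math
import random


def ring_of_fire_coords(cluster_ids, flame_values, gate_threshold,
                        radius=1.0, jitter=0.05, seed=0):
    """Return one (x, y, z, is_positive) tuple per event.

    cluster_ids    -- cluster label for each event (from any clustering method)
    flame_values   -- expression of the "flame" marker for each event
    gate_threshold -- hypothetical single-value gate; events above it are positive
    """
    if len(cluster_ids) != len(flame_values):
        raise ValueError("cluster_ids and flame_values must have the same length")

    rng = random.Random(seed)
    clusters = sorted(set(cluster_ids))

    # Assumption: space the cluster centres evenly around a circle.
    angle_of = {c: 2 * math.pi * i / len(clusters)
                for i, c in enumerate(clusters)}

    points = []
    for cid, value in zip(cluster_ids, flame_values):
        theta = angle_of[cid]
        # Jitter x and y so events in a cluster don't sit on top of each other.
        x = radius * math.cos(theta) + rng.gauss(0, jitter)
        y = radius * math.sin(theta) + rng.gauss(0, jitter)
        # z is the flame marker; colour is decided by the gate, not by z itself.
        points.append((x, y, value, value > gate_threshold))
    return points


# Invented example: three clusters, two events each.
cluster_ids = [0, 0, 1, 1, 2, 2]
flame_values = [120.0, 80.0, 5200.0, 95.0, 40.0, 7100.0]

for x, y, z, positive in ring_of_fire_coords(cluster_ids, flame_values,
                                             gate_threshold=1000.0):
    print(f"{x:+.2f} {y:+.2f} {z:8.1f} {'pos' if positive else 'neg'}")
```

The resulting tuples can be handed to any 3D scatter tool: x and y give the ring, z gives the flames, and the positive/negative flag picks the color. Because the color comes from the gate rather than from the raw expression value, spread within the negative population is not drawn as if it were real expression.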
Good luck! Please credit me if you use this method to make your own Ring of Fire plots! (The method is unpublished, beyond this blog.)
Here are some other "Rings of Fire," for your enjoyment:
Johnny Cash (via YouTube):
(Cut and paste these links into your browser)
A geologic feature:
A type of eclipse: