Visualizing exome coverage depth
The coverage of a base pair in a sequencing run is the number of times that it was sequenced. We usually estimate coverage in an intuitive way: by comparing the total amount of measurement to the total amount of stuff to be measured. Dividing the total length of the reads by the length of the target gives the coverage depth, which is often expressed simply as a multiplier: “40x,” “20x,” and so on.
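As a rough illustration of that arithmetic, here is a minimal Python sketch. The function name `mean_coverage` and the run parameters (20 million reads of 100 bp against a 50 Mb target) are made up for the example and do not come from any particular sequencing run or tool.

```python
def mean_coverage(total_read_bases: int, target_length: int) -> float:
    """Average coverage depth: total sequenced bases divided by target size."""
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    return total_read_bases / target_length


# Hypothetical run: 20 million reads of 100 bp each, 50 Mb target.
total_bases = 20_000_000 * 100
depth = mean_coverage(total_bases, 50_000_000)
print(f"{depth:.0f}x")  # prints "40x"
```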
It should be clear why we want good coverage depth: in genomics as elsewhere, redundancy makes for accuracy. But this suggests that we also want good coverage depth everywhere. Because coverage is calculated by dividing the total output length by the target length, the multiplier is an average taken over the whole target: there is no guarantee that every part of your target was covered even approximately to that depth, and indeed it is common for some parts of the target to be covered much more deeply than others.
We might naturally wonder, then, not only about the overall coverage multiplier but also about the portion of the target that is above some threshold. Suppose your overall coverage is 50x and you need to have 30x coverage to be sufficiently confident in your conclusion. Although the 50x overall coverage is promising, you will also be curious how much of your target was covered to at least 30x depth.
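To make the difference between the average and the thresholded view concrete, here is a small Python sketch. The ten per-base depths are invented toy values, and `fraction_at_or_above` is a hypothetical helper, not part of the Exome Coverage pipeline; a real target would have millions of positions.

```python
def fraction_at_or_above(depths, threshold):
    """Fraction of target positions whose depth is at least `threshold`."""
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(1 for d in depths if d >= threshold)
    return covered / len(depths)


# Toy per-base depths for a ten-base target (illustrative values only).
depths = [90, 85, 80, 75, 70, 40, 30, 20, 10, 0]

mean_depth = sum(depths) / len(depths)
print(f"mean coverage: {mean_depth:.0f}x")  # mean coverage: 50x
print(f"fraction at >= 30x: {fraction_at_or_above(depths, 30):.0%}")  # fraction at >= 30x: 70%
```

In this toy case the overall coverage is 50x, yet only 70% of the target reaches the 30x threshold, which is the kind of gap the average alone hides.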
The Seven Bridges Platform now includes a pipeline, Exome Coverage, that lets you easily examine and visualize the actual coverage of your sequencing run. It lives here. You can use it to compare your results to the specifications that your sequencing provider might give you. For example, Invitrogen TargetSeq presents its coverage similarly here. Here you can see how it’s used: