Ok, you’ve made your figure so now it’s time to write the legend (also known as a caption)… Easy, right? Well yes and no.
I think everyone learns very quickly that figure legends are there to tell the reader what they need to know to understand and interpret the data you have assembled in your figure. However, figure legends are, surprisingly, an area where lots of writers miss the mark or struggle at first. Once you have done a few, you end up banging them out without any issue.
Usually the problem new writers struggle with is getting the balance right between what methods, results and discussion are needed in the legend versus what is covered elsewhere in your manuscript.
Important! Your figure+legend should be able to stand alone. A reader shouldn’t need to read any other part of your document to understand what you need them to!
Important! Each figure should work on its own, e.g. fig 2 should not rely on fig 1.
Some of the following is a little subjective so, as with everything else in your writing, a good place to start is by looking at your supervisor’s most recent paper and seeing the style that they use, especially if you can find data similar to what you are presenting.
What goes into a figure legend?
I’m starting here as in my experience most students’ first drafts don’t contain enough info.
1. Title statement (clause)
Not all journals allow this but if you are writing a thesis or project report this is a good place to start. Your title is a chance for you to tell the reader what you want them to think about your data. What does your figure demonstrate? Note this is different from saying what data it contains (you’re going to do that shortly). You can go with a descriptive title if you are setting up the model but I consider the title as the only place where you actually have the opportunity to interpret the data so you are missing an opportunity if you don’t use your title effectively.
Note: don’t use a full sentence, just a clause. Definitely do not ask a question!
Some examples (from Journal of Biological Chemistry):
- March-I E3 ligase activity is not required for its ubiquitination.
- Bovine PERK luminal domain (bPERK-LD) can directly interact with the denatured model proteins and suppress the heat or chemical induced protein aggregations.
- NANOG antagonize OTX2 to regulate neural patterning in hESCs.
2. Description of how the data was acquired
This is the bit that new writers usually miss! For each panel, first describe how the data was generated and where it came from. This isn’t the methods section, so all you need here is just enough information to interpret the images/graphs/whatever. Usually we are talking about details that are relevant but are not obvious from the figure panel itself.
For example, if you are showing pictures of cells that you have processed for indirect immunofluorescence microscopy, then your figure legend might say which cell type was used, which substrate the cells were on (glass/plastic etc) and for how long, which antibodies were used, and whether the images were epifluorescent, TIRF, confocal or whatever. You may not need all of these details; it depends how you have labelled the fig and which are relevant to the data interpretation.
This is true for graphs too! It’s not enough to just say “graph showing….” you need to say a little bit about where those numbers came from. What did you measure? How was it processed?
Important! Your figure+legend should be able to stand alone. You shouldn’t need to read the methods to have a solid grasp of what you are trying to show (I know that I am repeating myself… this is important!)
3. Description of what you are actually showing
This is the bit that usually new writers do OK at! Say what you have presented… representative* images of…. scatter graph of…..
Describe what is where: “left and middle panels are single channel images with antibodies….etc…. right panels are merged images”.
Describe colour schemes (eg pseudocolouring), and any acronyms that you didn’t describe in part 2.
Describe what you have plotted in your graphs: mean/median/mode, s.d., sem**, interquartile range, what your boxes and whiskers represent, etc. Also remember to indicate what you have normalised to (e.g. for ddCt, which reference transcripts you used). Your reader needs this information to understand what decisions you have made.
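If it helps to see how these quantities differ, here is a minimal sketch using only Python's standard library. The function name `summarise` and the numbers are made up for illustration; they are not from any real experiment. The point is that s.d., sem and interquartile range are different things, so the legend has to say which one your error bars or boxes show.

```python
import math
import statistics


def summarise(values):
    """Return the summary statistics a legend might need to name."""
    n = len(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    # Quartile cut points; statistics.quantiles defaults to the "exclusive" method
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "n": n,
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sd": sd,
        "sem": sd / math.sqrt(n),  # standard error of the mean
        "iqr": q3 - q1,
    }


# Hypothetical measurements, invented for illustration only
cell_areas = [10.0, 12.0, 14.0, 16.0, 18.0]
stats = summarise(cell_areas)
print(stats)
```

Note that the sem is always smaller than the s.d. for n > 1, which is exactly why the reader needs to be told which one you plotted. Different software also computes quartiles slightly differently, so treat the interquartile range here as one convention among several.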
Scale bars or absolute magnification? In an all-print era (i.e. not now!) you knew what size your figure would be printed at, so you could state the magnification of the image and it was useful for interpretation. Now, your work will likely be viewed primarily on a screen, so you don’t know what size monitor your reader is using or how zoomed in they will be. Therefore I would always plump for defining your scale bar in your legend so that, irrespective of how the image is viewed, the dimensions are correct.
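As a rough sketch of why a scale bar survives resizing: the bar is drawn as a fixed number of image pixels, so it scales along with the image. The function name and the calibration value below are assumptions for illustration; use the pixel size reported by your own microscope software.

```python
def scale_bar_pixels(bar_length_um, pixel_size_um):
    """Number of image pixels spanned by a scale bar of the given physical length."""
    if pixel_size_um <= 0:
        raise ValueError("pixel size must be positive")
    return round(bar_length_um / pixel_size_um)


# Hypothetical calibration: each pixel covers 0.25 um, and we want a 20 um bar
bar_length_um = 20
bar_px = scale_bar_pixels(bar_length_um, pixel_size_um=0.25)

# The legend then states the physical length, not a magnification
legend_text = f"Scale bar: {bar_length_um} µm."
print(bar_px, legend_text)
```

However much the reader zooms, the bar still spans the same part of the image, so the length stated in the legend stays correct.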
*Important! Whenever I read “representative” image, I automatically start thinking about how the writer decided what was representative. If there is no quantification, I immediately question why they consider the pic they have chosen to represent the population. If it is quantified, I expect the image to represent the average of the population, so don’t pick your most extreme example!
**Important 2! If you have distilled your data into mean/median rather than choosing to show the data set in a more complete way, then the justification for that choice, and the proof that it is a reasonable one, should be clear. There is a very strong drive to move away from (for example) plotting bar charts with mean and error bars for small data sets where it is hard to assume normality. Choosing to hide variability in your data set is never a good idea!! #banbarcharts
4. Statistical/analysis/population information
This is important. Don’t forget to indicate the number of values you plotted and/or the number of technical and biological repeats you performed to generate your data. The experimental n numbers should be explicit in your legend if they are not already clear from your figure.
Indicate what statistical tests you have performed, thresholds you used to reject the null hypothesis, which groups were compared, how you accounted for multiple comparisons etc. I would even consider including how I tested for normality if I used a parametric test.
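To make “how you accounted for multiple comparisons” concrete, here is a minimal sketch of one common correction, the Holm-Bonferroni method. This is just one option, chosen here as an example; the p-values are invented, and in practice you would normally use the routine built into your statistics package and name that method in the legend.

```python
def holm_adjust(p_values):
    """Holm-Bonferroni adjusted p-values, returned in the original order."""
    m = len(p_values)
    # Indices of the p-values, from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted values must never decrease as the raw p-values increase
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values from three pairwise comparisons
raw = [0.01, 0.04, 0.03]
adj = holm_adjust(raw)
print(adj)
```

Whichever method you use, the legend should name it along with the threshold, so the reader knows whether a reported p-value is raw or adjusted.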
What doesn’t go into a figure legend
Methods. Beyond those which are absolutely required to interpret the data, you don’t need methods. Usually this means you don’t need things like concentrations or dilutions. This sounds like exactly the opposite advice to point 2 above, and that’s where the challenge is. Ask yourself: do I need this information to carry out the experiment, or do I need it to interpret the figure? If it is the former, it goes in the methods.
Results. The job of the figure legend is to describe the figure; you don’t need to describe any of the actual findings. That stuff goes in the results section (guide here). If you find yourself writing “mean of X…” then stop and delete!
Discussion. With the exception of the fig title, a figure legend should not include any discussion of what the data means. Again, you have other sections of your writing for this purpose (you know, that bit called the discussion! guide here). To be clear, if you feel that you need to add something on to your figure legend to explain what it is you are showing it is a very clear sign that your figure isn’t as good as it should be!*
*This is probably the area where other PIs might disagree with me, so don’t be too surprised if they add some comments to your fig legend that are interpretative rather than factual statements.
OK, you’ve read this far, now go write a draft! Once you have drafted it, give it and your figure to a friend to make sure it makes sense. Remember, each fig+legend should be able to stand alone.