Experimental Design – Measurements and Data Acquisition


Key Point: What data do you need to test your hypothesis?

Don’t forget that you are designing your experiment to test a specific hypothesis. It is the hypothesis that defines the data you should collect, not the other way round.

There are a bunch of things to think about here. Some will be more obvious than others. I have expanded briefly on each point lower down this page and added links to more complete descriptions where I think they will be helpful.

  • What will I actually measure? – qualitative or quantitative? direct or indirect measurements? How will the data be collected?
  • Do I need a pilot study?
  • What experimental variables are required? timings? doses? 
  • What controls do you need? – experimental, biological, calibrators, sub-groups, confounders, reverse causation. 

What will I actually measure?

For many experiments the answer to this is pretty clear-cut, but if it isn’t, you need to be absolutely explicit about what data you will collect and how you will interpret them.

Direct or indirect? If you can actually measure what you want to measure, great! If not, will your indirect measurements be good enough to generate an unambiguous test of your hypothesis, or will you need to include extra controls? Note that most antibody-based techniques are indirect.

Qualitative or quantitative? When I started writing this I was purely thinking about quantitative studies, but qualitative studies require good design too! Of course they do! Both types of research have important roles to play. Usually qualitative research is more exploratory and/or descriptive, whereas quantitative research is looking for numerical data (more complete description here). Often this means that the qualitative study is used as an exploratory study to establish the hypothesis that a quantitative study will ultimately test.

Key point: if your overall study employs mixed methods, make sure you fully design the quantitative aspects as discrete questions so that you don’t end up with ambiguous findings.

In bench science, if you have no intention of quantifying things, you are doing descriptive research, and you likely won’t be able to make any concrete claims about your data. You see this most frequently in imaging-based studies where the researcher is just describing what they see. But often you can do better than that! If you are using an image to assert something, you should look for a way to quantify what you are asserting. Ask yourself – how can I prove that my “representative” image is truly representative of the population? Whatever proof you are using is most likely the data that your experiment should be designed to obtain.

Do I need a pilot study?

Probably yes.

Pilot studies are small-scale versions of your main studies. You use them to get everything working consistently. You need to know the specifics of how your experiment will work and to be clear that your measurement systems are OK.

For design purposes your pilot studies provide two key indications that you will use later: how big a difference you expect between your populations, and how variable your data are. These two numbers are key to determining how many samples you will need. Knowing the variability will also help you decide how many technical repeats you need, and might end up meaning you need fewer (saving you time and money).
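To see how those two numbers turn into a sample size, here is a minimal sketch in Python using a common normal-approximation formula for comparing the means of two groups. The function name samples_per_group and the defaults (5% two-sided significance threshold, 80% power) are assumptions for illustration, not something prescribed on this page; the right calculation for your design may well differ.

```python
import math
from statistics import NormalDist


def samples_per_group(difference, sd, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided comparison of two means.

    difference: expected difference between group means (from the pilot)
    sd:         expected standard deviation within a group (from the pilot)
    alpha, power: assumed defaults, chosen here only for illustration
    """
    if difference == 0 or sd <= 0:
        raise ValueError("difference must be non-zero and sd must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided threshold
    z_power = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_power) * sd / difference) ** 2
    return math.ceil(n)


# Hypothetical pilot results: difference of 1 unit, standard deviation of 1 unit
print(samples_per_group(difference=1.0, sd=1.0))  # 16
# Same variability, but only half the expected difference
print(samples_per_group(difference=0.5, sd=1.0))  # 63
```

Halving the expected difference roughly quadruples the samples needed, while halving the variability cuts them to about a quarter. Treat the output as a ballpark figure: the normal approximation slightly underestimates what a t-test needs when sample sizes are small.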

Working in lab-based research? Be aware that more time is spent optimising protocols than generating final data. A fully optimised experiment that consistently performs well will need fewer repeats and give you tighter data, so don’t ever think of this as wasted time!

Pilot experiments might not always be possible. Sometimes you will have to rely on publications where people have performed equivalent experiments while asking a different question.

What experimental variables are required?

I know I’ve said this before, but answering one question well is better than answering three badly.

Be aware that as you add more and more comparison groups you are reducing the statistical power of your experiments, and you will likely need to increase the sample size dramatically to be confident that any differences you observe have not occurred by chance.

With those caveats in mind, think about exactly what you want to compare. How many groups? How many doses? How many time points? It will be tempting to do everything at once, but if you don’t need the extra groups to test your hypothesis, seriously consider what the extra comparisons add.

If you really need to compare many things, make sure to read up on multiple comparison testing before you start. Here is a handy page on multiple comparisons to get you started.

When you use p < 0.05 as your (unadjusted) threshold, you accept a 1 in 20 chance of a false positive for each comparison where there is really no difference. If you do 20 different tests (and don’t compensate for the extra tests), you are more likely than not to get at least one false positive. There are statistical techniques to adjust your p value to account for multiple comparisons, but the practical upshot is that your sample sizes will likely need to be larger.
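The arithmetic behind that warning can be sketched in a few lines of Python. This assumes the tests are independent and that there is truly no effect in any of them; the function names are hypothetical, and Bonferroni is just the simplest of several available corrections.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive across n independent tests
    when there is truly no effect in any of them."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test threshold that keeps the overall false positive rate at alpha."""
    return alpha / n_tests


def bonferroni_adjust(p_values):
    """Bonferroni-adjusted p values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


print(round(familywise_error_rate(20), 2))   # 0.64
print(round(familywise_error_rate(1), 2))    # 0.05
print(bonferroni_threshold(20))              # per-test threshold for 20 tests
print(bonferroni_adjust([0.01, 0.04, 0.5]))  # each p multiplied by 3, capped at 1
```

For 20 tests at 0.05 the chance of at least one false positive comes out at about 64%, and the Bonferroni threshold for each individual test drops to 0.0025. A stricter per-test threshold is what pushes the required sample size up.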

What controls do you need?

You can’t prove a positive without a negative. You can’t prove a negative without a positive. But controls do more than that. Can you control for off-target effects, non-specific signal or confounding variables?

Using experimental controls can also help you to know if your experiment has worked or not, and if not, where the problem is. Controls can also be used to calibrate the system.

It’s a big topic and one that you should take the time to absorb fully, so I have migrated this to a separate page: click here.

If you already know everything you need to know about controls, you can jump to independence and sample size calculations here.