Malin Edvardsson May 3, ’22 < 19 min

Save time in the lab with Design of Experiments

Surface Science Blog

If you spend a lot of time planning, executing, and evaluating experiments, time efficiency is probably important to you. Design of Experiments, DoE, is a method that can help you save both time and resources while giving you results that are as accurate as possible. We were curious to learn more about this method, and had the privilege to talk to Prof. Marina Axelson-Fisk, Professor in Mathematical Statistics at Chalmers University of Technology. In the interview, Prof. Axelson-Fisk shared her knowledge on how to use DoE to plan your work efficiently and to make the most of the time you spend in the lab.

What is design of experiments?

In brief, design of experiments is a way of making the most of your experiments by planning ahead from the beginning to the end, Prof Axelson-Fisk says. It covers everything from data collection and sample sizes to defining the population, getting an idea of the measurement errors, and choosing what analysis methods to use. With the DoE method, you plan every step of the way before you even begin to collect your data. You can say that it is an umbrella strategy that can help you find answers in an efficient way, Prof Axelson-Fisk says.

Why is Design of Experiments needed?

Design of Experiments helps you save time and resources. It makes the analysis more efficient and accurate, and it also makes the data collection more efficient, Prof Axelson-Fisk says. Additionally, it is an efficient strategy to help you find the answers that you are looking for. If you do not take care, you may actually miss true phenomena. For example, when you measure something to see a relation between variables, there may be other variables that influence your measurements, and which hide the effects that you are looking for. So, your hypothesis may be true, but you cannot see it because you have not been careful when planning the experiment, Prof Axelson-Fisk says.

The importance of selecting which factors to include

Before you start, you must figure out what factors you believe influence the result, Prof Axelson-Fisk says. If there are external factors other than those you are interested in, and you do not include them, you may not see the effect of the variable that you are interested in, she explains. Say, for example, that you want to know where the differences in blood pressure between humans come from, and whether blood pressure is affected by the weight of the individual. If you build a model that contains only weight plus an error term, which then absorbs everything else, you may not see the effect of weight because the variation of the error is so big. However, if you include other factors, like gender, age, etc., so that you separate the sources of variation, then you may see that weight is significant. If you disregard all the other factors, your effect may drown in the large variation they cause, so to speak, Prof Axelson-Fisk says.
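The blood pressure example can be sketched with simulated data. The numbers below (effect sizes, noise level, sample size) are invented purely for illustration and are not from the interview. The sketch fits one model with weight only and one that also includes age and gender, and compares how much unexplained variation is left in each.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols(rows, y):
    """Least-squares fit via the normal equations; returns (coefficients, residual variance)."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(xtx, xty)
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]
    return beta, sum(e * e for e in resid) / (len(y) - p)

rng = random.Random(1)
n = 200
weight = [rng.gauss(75, 12) for _ in range(n)]   # kg
age = [rng.uniform(20, 80) for _ in range(n)]    # years
sex = [rng.choice([0, 1]) for _ in range(n)]     # coded 0/1

# Hypothetical data-generating model, invented for this illustration only.
bp = [90 + 0.3 * w + 0.5 * a + 6 * s + rng.gauss(0, 5)
      for w, a, s in zip(weight, age, sex)]

# Model 1: weight only; age and gender end up in the error term.
beta_small, var_small = ols([[1, w] for w in weight], bp)
# Model 2: weight, age, and gender as separate sources of variation.
beta_full, var_full = ols([[1, w, a, s] for w, a, s in zip(weight, age, sex)], bp)

print(f"residual variance, weight only:        {var_small:.1f}")
print(f"residual variance, weight + age + sex: {var_full:.1f}")
```

In the weight-only model, the variation caused by age and gender lands in the error term, so the residual variance is much larger and the weight effect is correspondingly harder to detect.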

How many factors should be included?

The common answer from a statistician is always “it depends”, Prof Axelson-Fisk says with a smile. How many factors to take into account depends on how large a sample you take, how big the variation is, and how precise you need your results to be. If you take a large enough sample, then you can include as many factors as you want. Typically, you would have two or three factors in the same model. Another option is a so-called factorial design, which is common when, for example, you want to optimize a process and figure out the optimal settings of various factors. As a first approach, it is common to include a large number of factors, but only at two levels, i.e., two different settings of each factor, to identify which factors influence your process the most. The next step is then to determine the optimal settings of these factors. But in short, the number of factors to include depends on how large a sample you can afford and how big the variation is in the process, Prof Axelson-Fisk explains.
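To get a feel for how a two-level screening design grows with the number of factors, here is a minimal sketch that lists every combination of settings. The factor names and levels are hypothetical placeholders, not values from the interview.

```python
from itertools import product

def two_level_design(factors):
    """Return every combination of the two settings of each factor (a full 2^k design)."""
    names = list(factors)
    return [dict(zip(names, combo))
            for combo in product(*(factors[name] for name in names))]

# Hypothetical factors, each with a low and a high setting.
factors = {
    "pH": (5.0, 8.0),
    "salt_mM": (10, 150),
    "temperature_C": (20, 37),
}

design = two_level_design(factors)
print(f"{len(factors)} factors at two levels -> {len(design)} runs")
for run in design:
    print(run)
```

Each additional factor doubles the number of runs in a full design like this one, which is one reason the number of factors is tied to how large a sample you can afford.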

The definition of ‘sample’

The term ‘sample’ has a very specific meaning for a statistician, Prof Axelson-Fisk says. The goal is to draw a conclusion about the population as a whole with respect to some feature. Say, for example, that you want to know what the variation of blood pressure is in the human population, and how much of it depends on weight. Usually, you cannot measure the entire population. If you could, then you would not need statistics, because you would already have the answer. In situations where you cannot measure the entire population, you must draw a smaller sample. This sample should then be as good a representation of the entire population as possible.

In statistics, ‘sample’ means that the observations are independent and come from the same distribution, i.e., we draw individuals randomly, independently of each other, and from the same population. It is very important that the sample mirrors the different sources of variation in the population. If there are different subpopulations, resulting in different distributions of the feature of interest, this needs to be taken into account in the sample. For instance, if the basic levels of blood pressure differ between the genders, the sample needs to include the genders in the same proportions as in the population. If you have inhomogeneities in your population, they must be represented. Basically, the sample must be a good representation of the variation of the population as a whole. Otherwise, your conclusions will only be true for that specific sample and cannot be extrapolated to the population, Prof Axelson-Fisk says.
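One way to make a sample mirror the subpopulation proportions is to draw randomly within each subpopulation, in proportion to its size. The sketch below assumes a made-up population of 1000 labelled individuals; the function name and the data are illustrative only.

```python
import random
from collections import Counter

def stratified_sample(population, stratum_of, n, seed=0):
    """Draw about n items so that each stratum keeps its share of the population.

    Rounding the per-stratum counts can make the total differ slightly from n.
    """
    rng = random.Random(seed)
    strata = {}
    for item in population:
        strata.setdefault(stratum_of(item), []).append(item)
    sample = []
    for members in strata.values():
        k = round(n * len(members) / len(population))
        sample.extend(rng.sample(members, k))  # random draw without replacement
    return sample

# Hypothetical population: 600 individuals labelled "F" and 400 labelled "M".
population = [("F", i) for i in range(600)] + [("M", i) for i in range(400)]

sample = stratified_sample(population, stratum_of=lambda person: person[0], n=50)
counts = Counter(person[0] for person in sample)
print(counts)
```

With these numbers the sample contains 30 and 20 individuals from the two groups, the same 60/40 split as in the population.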

Who benefits from using Design of Experiments?

Everyone who is doing experiments and who wants to draw proper conclusions from these experiments benefits from using design of experiments, Prof Axelson-Fisk says. In its simplest form, it means that you think your experiment through: what you want to achieve with it and the best way to go about it. Note that there is a difference between exploratory and explanatory analysis. If you are new to a field, or if you are going in a new direction, then perhaps you do not know what hypothesis to test or what relations between variables there are. In this situation, you may collect data without any preconditions and just look for patterns. You do not do any statistical inference on that data, but rather use descriptive statistics to identify patterns and trends. But once you have found some interesting patterns, you cannot do the hypothesis testing on the same data set, since you do not know if the pattern you found is only a random event in your data or if it holds for the whole population. So, you may begin with an exploratory analysis to find potential patterns, hypotheses, or relations. Then, you can do an explanatory analysis. You draw a new sample from the population and test those hypotheses on it, to see if the pattern is in fact present in the population at large and not just a chance feature of the first data set, Prof Axelson-Fisk says.

Design of experiments - step by step

In brief, you typically begin by having some hypothesis that you would like to test, Prof Axelson-Fisk says. You want to see if there is a relation between something that you measure and some variables. You decide to test this, so you want to draw a sample. Now, you must determine which population you want to test this on, and what the population looks like. Are there any inhomogeneities? Is it difficult to sample from the entire population? And so on. There is a whole theory on how to select a representative sample.

How large a sample you need depends on how big the variation is in the population, and how large an error margin you can allow. For instance, if you want to claim that a medicine has an effect, you may want to be much more certain than if you test something less critical. The more certain you want to be, the larger the sample you should have. So, based on how big a margin you can allow and how big the variance is, you decide on the sample size. There are of course methods for computing this as well, Prof Axelson-Fisk says.

Next, depending on what question you are asking, you must choose an analysis method and a model. Along with the model come conditions that must be fulfilled. For example, common conditions are that the data must be normally distributed, that it should be independently sampled, that the variance must be homogeneous over your sample, and so on, so you need to make sure that your data fulfills the conditions. If it does not, you must either rethink how you sample your data, check whether your data can be transformed to meet the conditions, or choose another method. Then, given the method you have chosen, there are lots of tools to help you do the analysis and draw the conclusions. These are the main steps, Prof Axelson-Fisk says.
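As an illustration of how the sample size follows from the variation and the allowed error margin, here is a sketch using one common textbook formula for estimating a mean, n = (z·σ/E)². It assumes normally distributed data and a standard deviation that is known in advance; the interview does not specify which method to use, and the numbers are made up.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, margin, confidence=0.95):
    """Smallest n for which the confidence interval half-width is at most `margin`.

    Assumes a known standard deviation `sigma` and a normal approximation.
    """
    z = NormalDist().inv_cdf((1 + confidence) / 2)  # two-sided critical value
    return ceil((z * sigma / margin) ** 2)

# Hypothetical numbers: standard deviation 10 units, allowed margin of 2 units.
print(sample_size_for_mean(sigma=10, margin=2))                   # 97
print(sample_size_for_mean(sigma=10, margin=2, confidence=0.99))  # 166
```

Demanding more certainty (99 % instead of 95 %) or a smaller margin increases the required sample size, and so does a larger variation in the population.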

Choosing analysis method

Depending on what you want to achieve, there are various types of statistical analysis methods to use. For instance, if you want to compare two populations to see if there is a difference between them, for example between groups of individuals or between methods, then you do a so-called two-sample t-test, Prof Axelson-Fisk says. If you have more than two populations, subpopulations, or groups that you want to compare, then you use analysis of variance (ANOVA). Analysis of variance is an extension of the two-sample t-test to more than two populations. If you want to explicitly determine the relationship between your measurements and some factors, then you may use, for example, linear regression or other forms of regression. In regression, you fit a model to your data. That model can then be used to test if the factors have an influence on the response, or to predict what response you would get if you made new measurements of your factors, Prof Axelson-Fisk explains.
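To show what a two-sample comparison computes, the sketch below calculates the test statistic for Welch's variant of the two-sample t-test, which does not assume equal variances in the two groups. The measurements are invented, and the choice of Welch's variant is an assumption made for this example, not something stated in the interview.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Return (t statistic, approximate degrees of freedom) for two independent samples."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical measurements from two methods.
method_a = [1.0, 2.0, 3.0, 4.0, 5.0]
method_b = [2.0, 3.0, 4.0, 5.0, 6.0]

t, df = welch_t(method_a, method_b)
print(f"t = {t:.2f}, df = {df:.1f}")
```

Turning the statistic into a p-value requires the t distribution, which the Python standard library does not provide. In practice a statistics package does the whole test in one call, for instance `scipy.stats.ttest_ind(a, b, equal_var=False)` in SciPy.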

Which method should you use to optimize, for example, molecular adsorption to a surface?

In a situation where you would like to optimize the surface uptake of a molecule as a function of, for example, pH and salt concentration, you may use factorial design, where you split your design into factors, Prof Axelson-Fisk says. You have your measurements, which we call the response, and you have your different factors, in this case pH and salt concentration, that you believe influence the response. If you want to optimize the molecule uptake at the surface, i.e., identify where you get the highest or the lowest response, then you want to find the optimal settings of the two factors. Typically, you determine levels, i.e., fixed values, of these factors. The most common, and most efficient, factorial designs have two levels of each factor, but you could have more. You then randomize the experiment, so that you run the experiment on each combination of the factor settings in random order. If you have pH and salt concentration as factors, then you choose two pH levels and two salt levels. In the lab you will then have four different combinations of factor settings, and you measure your response under these four conditions, Prof Axelson-Fisk says.
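The two-factor, two-level design with a randomized run order can be sketched as follows. The pH and salt levels are hypothetical placeholders; in a real experiment you would pick them from what you know about your system.

```python
import random
from itertools import product

ph_levels = (6.0, 8.0)      # hypothetical low and high pH
salt_levels = (10, 150)     # hypothetical low and high salt concentration (mM)

# All four combinations of the two factors.
runs = list(product(ph_levels, salt_levels))

# Randomize the order in which the combinations are run in the lab.
rng = random.Random(42)     # fixed seed so the run plan can be reproduced
rng.shuffle(runs)

for number, (ph, salt) in enumerate(runs, start=1):
    print(f"run {number}: pH {ph}, salt {salt} mM")
```

Running the combinations in random order helps keep unrelated influences that drift over time, such as temperature or sensor aging, from being mistaken for an effect of pH or salt.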

How to select the factor levels

If you do not have any prior knowledge of the system, say you want to maximize the molecular uptake and you do not know if you need a high salt concentration or a low one, then you must guess which levels to choose, but they should be reasonably far apart, Prof Axelson-Fisk says. When you begin your experiment, you may for example see that a high salt concentration results in a larger molecular uptake than a low concentration. Then you know that you should go for the higher one. Next, you may choose two new levels, or you may use the so-called response surface, a plot of the result as a function of the factors, to guide you toward the direction in which the molecular uptake increases. Now, you will try new levels of your factors, and maybe eventually you will identify the range of values within which the molecular uptake has its maximum. Then you can take smaller steps, i.e., choose levels more densely, to pinpoint the actual optimum, Prof Axelson-Fisk says.
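A simple way to read the direction from a two-level experiment is to compute the main effect of each factor, i.e., the average response at the high level minus the average response at the low level. The sketch below uses coded levels (-1 for low, +1 for high) and invented uptake values; it is an illustration of the idea, not a procedure taken from the interview.

```python
def factorial_effects(results):
    """Main effects and interaction for a 2x2 design.

    `results` maps (a, b), with each factor coded as -1 (low) or +1 (high),
    to the measured response.
    """
    effect_a = (results[(1, -1)] + results[(1, 1)]
                - results[(-1, -1)] - results[(-1, 1)]) / 2
    effect_b = (results[(-1, 1)] + results[(1, 1)]
                - results[(-1, -1)] - results[(1, -1)]) / 2
    interaction = (results[(1, 1)] + results[(-1, -1)]
                   - results[(1, -1)] - results[(-1, 1)]) / 2
    return effect_a, effect_b, interaction

# Hypothetical uptake values; factor a is pH, factor b is salt concentration.
uptake = {(-1, -1): 10.0, (1, -1): 14.0, (-1, 1): 18.0, (1, 1): 24.0}

ph_effect, salt_effect, interaction = factorial_effects(uptake)
print(f"pH effect: {ph_effect}, salt effect: {salt_effect}, interaction: {interaction}")
```

With these made-up numbers both effects are positive (5.0 for pH and 9.0 for salt), which would suggest placing the next set of levels at higher pH and higher salt concentration, with salt having the larger influence.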

Science on surfaces - tips and tricks on design of experiments, DoE

Listen to the interview with Prof. Axelson-Fisk to learn more about how to use design of experiments to efficiently plan your work and to make the most of the time spent in the lab. In the conversation we also talk about what challenges and difficulties may arise when using DoE, and what pitfalls to look out for.

About the author

Malin Edvardsson

Malin graduated in engineering physics in 2006, where her research focused on the QCM-D technology. Since then, she has been scrutinizing the how’s and why’s of the world in general, and the world of QCM-D in particular.
