Handling Experimental Data

We have an ethical obligation to check our data and make sure that they are accurate before publishing them in the literature. This doesn’t mean that we must be 100% certain before we communicate our findings, but it does mean that we should first have carefully eliminated the sources of error we can identify and ruled out any other reasonable hypotheses.

All data – even negative results – must be reported. Data should never be “edited” so that they fit our hypotheses, no matter how confident we may be about the validity of our hypotheses. A suspect data point (note: not data points) should only be removed if it legitimately meets the statistical criteria for an outlier. Report the results of all of your experiments to your advisor, no matter how attractive or unattractive you feel the results may be.
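As an illustration of what meeting the statistical requirements for an outlier can look like, here is a minimal sketch of Dixon's Q test, which is often used for small data sets. The text above does not name a particular test, so the choice of the Q test is an assumption, as are the function name `dixon_q_test` and the example measurements. The critical values are the commonly tabulated ones for the 95% confidence level and should be checked against the table your field uses. The test examines only the single most extreme value, which fits the rule that one suspect point, not several, may be considered for removal.

```python
# Commonly tabulated critical Q values at the 95% confidence level,
# keyed by the number of observations n (assumed table; verify before use).
Q_CRIT_95 = {3: 0.970, 4: 0.829, 5: 0.710, 6: 0.625,
             7: 0.568, 8: 0.526, 9: 0.493, 10: 0.466}


def dixon_q_test(values):
    """Apply Dixon's Q test to the single most extreme value.

    Returns (suspect, q, reject), where reject is True only if the
    calculated Q exceeds the critical value for this sample size.
    """
    n = len(values)
    if n not in Q_CRIT_95:
        raise ValueError("this sketch only covers 3 to 10 observations")

    data = sorted(values)
    data_range = data[-1] - data[0]
    if data_range == 0:
        # All values are identical, so there is nothing to test.
        return None, 0.0, False

    # Gap between each extreme value and its nearest neighbor.
    gap_low = data[1] - data[0]
    gap_high = data[-1] - data[-2]

    # The suspect point is the extreme value with the larger gap.
    if gap_high >= gap_low:
        suspect, gap = data[-1], gap_high
    else:
        suspect, gap = data[0], gap_low

    q = gap / data_range
    return suspect, q, q > Q_CRIT_95[n]


# Hypothetical replicate measurements, for illustration only.
measurements = [0.189, 0.167, 0.187, 0.183, 0.186,
                0.182, 0.181, 0.184, 0.181, 0.177]
suspect, q, reject = dixon_q_test(measurements)
print(f"suspect={suspect}, Q={q:.3f}, reject={reject}")
```

In this example the lowest value looks suspicious, yet its Q falls just short of the critical value for ten observations, so the point stays in the data set. Whatever the outcome, the test applied and its result belong in your notebook and in your report to your advisor.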

Approach your work with a healthy sense of skepticism. Be critical of your results and of your interpretation. Be careful not to jump too quickly to conclusions; doing so is a form of investigator bias. A good way to make sure that you aren’t wearing blinders is to present your data to your advisor and/or other members of your research group. If they don’t see what you see in the data, it may be that the trend you see really isn’t there.

When using a new, unfamiliar method of data analysis, always exercise due caution. This is particularly important today, as we have access on our computer desktops to some very sophisticated methods of data analysis. It is often too easy to use these programs without fully understanding the underlying methods, their assumptions, and their limitations. Don’t use methods that you don’t understand and cannot defend, and don’t use something simply because your advisor tells you to. Ignorance is not a valid excuse for misusing statistical methods of data analysis. If you don’t understand what you are doing, ask your advisor and/or consult a statistician, and learn the background on the method. Then you will be able to apply it with confidence and skill to the analysis of your data.