Statistical analysis is a powerful technique that enables a researcher to draw meaningful conclusions from a study in which data are collected through observation, survey or experimentation. The success of a medical study, however, depends to a great extent on the proper statistical analysis of the data emanating from that study. This fact is often ignored by medical researchers. As a consequence, some very interesting medical studies may be rendered useless by insufficient or improper statistical analysis.
In the field of medicine and health, one encounters research problems ranging from simple to complicated. Examples include monitoring the weight of a group of infants receiving a specific diet and testing whether their average weight differs from the general average weight for that age; comparing the efficacy of a new medicine with that of an existing one; assessing the effectiveness of different doses of a medication; comparing several treatments simultaneously and choosing the best one; estimating the effect of personal factors on a particular disease status, e.g. diabetes; categorizing a person as healthy or not on the basis of their response variables; diagnosing a cancer subtype and assessing the changes in the health of patients after repeated applications of chemotherapy; prescribing an appropriate diet to patients with multi-disease syndrome; predicting the survival time of HIV-infected patients; and many other issues.
In the course of solving these types of problems, researchers collect data from the subjects involved in the study. It is important to note that, in the statistical sense, such problems can be categorized as univariate or multivariate, cross-sectional or longitudinal, and case-control or cohort design problems. When the data are collected from all members of the population, it may be sufficient to describe and summarize the data using numerical and graphical descriptive statistical methods. Under most circumstances, however, it is not feasible to investigate the entire population, and information is therefore collected from only a sample of members representing the population. Descriptive analysis in this case can be used only to describe the features of the sample data. Conclusions about the population should not be drawn at this stage; rather, inferential statistical procedures should be applied to draw conclusions about the population on the basis of the sample data. Population parameters can be estimated using point estimation and confidence interval estimation methods. To test hypotheses regarding the parameters, one can choose a parametric or a non-parametric technique, depending on the problem under investigation and the type of data collected. Each method is based on certain assumptions. It is of the utmost importance to verify that all the assumptions are met before the selected inferential method is applied to analyse the data. After the statistical analysis has been performed, the findings should be interpreted precisely and the probabilities of specific eventualities clearly stated. It is also crucial to highlight the limitations of the study undertaken.
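As an illustration of the estimation and testing steps described above, the following is a minimal Python sketch based on the infant-weight problem. Everything specific in it is an assumption made for illustration: the weights, the reference value of 7.5 kg, and the function names are hypothetical and do not come from any study. The confidence interval uses a normal approximation, which is only reasonable for larger samples (a t quantile would be more appropriate for a sample this small), and the sign test is shown as one simple non-parametric option, not as the recommended method for any particular problem.

```python
import math
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Return the point estimate of the mean and a normal-approximation
    confidence interval (assumes a reasonably large random sample)."""
    n = len(sample)
    point = mean(sample)
    standard_error = stdev(sample) / math.sqrt(n)
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return point, (point - z * standard_error, point + z * standard_error)


def sign_test_p_value(sample, reference):
    """Two-sided exact sign test of H0: population median == reference.
    Observations equal to the reference are dropped."""
    above = sum(x > reference for x in sample)
    below = sum(x < reference for x in sample)
    n = above + below
    k = min(above, below)
    # Under H0 the count of values above the reference is Binomial(n, 0.5).
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical weights (kg) of 12 infants on a specific diet.
weights_kg = [7.9, 8.4, 7.2, 8.1, 8.8, 7.6, 8.3, 9.0, 7.8, 8.5, 8.2, 8.6]
# Hypothetical general average weight (kg) for that age.
reference_kg = 7.5

estimate, (lower, upper) = mean_confidence_interval(weights_kg)
p_value = sign_test_p_value(weights_kg, reference_kg)

print(f"Point estimate of mean weight: {estimate:.2f} kg")
print(f"Approximate 95% confidence interval: ({lower:.2f}, {upper:.2f}) kg")
print(f"Sign test p-value against {reference_kg} kg: {p_value:.4f}")
```

The sketch describes only the sample and the uncertainty attached to it; whether a conclusion about the population is justified still depends on how the sample was drawn and on whether the assumptions of the chosen method hold.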
To summarize, in data-related medical studies it is reasonable to convert the medical problem into a statistical problem; collect data through a relevant experimental design or questionnaire; choose the most appropriate statistical method; apply it properly, that is, with careful attention to the underlying theory and requirements of the selected method; and, at the end, adequately interpret the outcome of the statistical analysis while indicating the level of uncertainty involved. Proper and complete statistical analysis will eventually lead to reliable and valid conclusions. In health care and clinical trial studies involving human subjects, it is highly desirable that proper treatment therapies be used. If decisions are based on improper statistical analyses, the consequences can be disastrous.
The statistical analysis kit contains many tools. A brief orientation is provided in Figure 1. Most of these methods are available in statistical software packages such as SPSS, SAS, Minitab and Epi-info.
Figure 1: Statistical analysis kit
This figure is, however, not exhaustive. Many other testing procedures and user-defined, model-based techniques can be developed and programmed in languages such as R and S-Plus to address typical problems. These packages assist users by automating the calculations involved in applying a statistical method. The onus nevertheless remains on the researcher to choose the appropriate method for a given problem and to interpret properly the output that the software provides after processing the data.
Dr. Vandna Jowaheer