STAT6000: STATISTICS FOR PUBLIC HEALTH
Assignment: Identifying and Interpreting Statistics in Research Articles
Paper 1: Alcohol and other drug use at school leavers’ celebrations (Lam et al., 2014).
- In statistics, the null hypothesis is a general statement proposing that there is no significant difference, or no association, between specified population characteristics. It is the default position, the opposite of the alternative hypothesis, and is often denoted H0. The null hypothesis is the one that researchers test, usually with the aim of rejecting it in favour of an alternative hypothesis that they think explains the phenomenon (Ganeshpurkar et al., 2018).
In contrast to the null hypothesis, the alternative hypothesis is often denoted Ha. It states that an effect or association exists, that is, a new theory is preferred over the old one (the null hypothesis). It is consistent with the research hypothesis, as it is built from previous studies and the literature review (Emmert-Streib & Dehmer, 2019).
In the paper by Lam et al. (2014), two hypotheses were tested. For the first, the null and alternative hypotheses are as follows:
Null Hypothesis (H0): There is no significant difference in alcohol and other drug (AOD) use between an average day at the celebrations of school leavers and the last event.
Alternative Hypothesis (Ha): There is a significant difference in alcohol and other drug (AOD) use between an average day at the celebrations of school leavers and the last event.
Logistic regression analysis was carried out to evaluate the effects of six factors on the likelihood of experiencing each of the 17 reported adverse impacts. Independent variables are the variables used as inputs in a regression analysis; within the model they are not determined by the other variables. They are also known as explanatory variables or regressors. A dependent variable, on the other hand, is the output variable, whose value depends on the independent variables. It is therefore also known as the explained variable or regressand. The explanatory variables were chosen to assess whether safety strategies can reduce the experience of several adverse outcomes. In this model, the explanatory variables are:
- Alcohol use on an average celebration day
- PBSS (Protective Behaviour Strategy Score)
- Gender
- Other drug use
- Accommodation location
- Survey modality
These are all categorical variables.
For the second hypothesis of the paper, the null and alternative hypotheses are:
Null Hypothesis (H0): There is no significant difference in the number of drinks consumed per hour between an average day at the celebrations of school leavers and the last event.
Alternative Hypothesis (Ha): There is a significant difference in the number of drinks consumed per hour between an average day at the celebrations of school leavers and the last event.
The independent variables for this hypothesis are the same as for the first. The dependent variables are also the same, that is, the likelihood of each of the following adverse effects related to AOD use:
- Hangover
- Vomiting
- Physically aggressive
- Accident/ injury
- Stole private/ public property
- Blackout
- Emotional outburst
- Any sexual risk/ problem
- Unprotected sex
- Any legal problem
- Inability to pay for things
- Act of vandalism
- Sexual situation they were not happy about
- Heated argument
- Removed from island/ accommodation
- Regretted sexual encounter
- Arrested for intoxicated behaviour
- In the paper, a convenience sampling method is used. Measurement was done using self-reported measures, in which participants are asked to report directly on their own beliefs, attitudes or intentions. The most common self-report measure used in the paper is the Likert scale. The advantages of self-report methodology are:
- This method is very simple and easy to use.
- The cost involved is very low.
- The self report data can be collected in several ways according to the need of researchers.
- This method is considered an appropriate measure, particularly for adolescent drug use.
There are also disadvantages of using this methodology:
- Deliberate misreporting is a problem with this method. In any self-administered survey there is a possibility of under-reporting and over-reporting, which can skew results.
- Recall effects for past behaviour can lead to incomplete or inaccurate reports.
- The order of items can influence how people respond.
- The results can be seriously compromised if respondents are intentionally dishonest.
- In the study, a sizeable sample of adolescents was surveyed. Seventeen- to eighteen-year-old Australians were asked to complete pre- and post-event surveys.
The first survey sampled 541 young people who planned to take part in the event. Of them, 56 percent were female (91 percent were 17 years old and 9 percent were 18 years old). Of these respondents, 87 percent attended an independent school.
The post-celebration survey was completed by 405 young people, of whom 50 percent were female (94 percent were 17 years old and 6 percent were 18 years old). Of these respondents, 92 percent attended an independent school.
The prevalence of negative consequences is also reported in the paper. The most prevalent negative consequence among young people was hangover (67.42 percent), followed by blackout (57.95 percent).
- Inferential statistics refers to drawing conclusions or inferences from data. In inferential statistics, we make inferences and predictions from the sample data and infer properties of the population. The main inferential techniques that are easy to apply and widely used to interpret data are hypothesis testing, one-sample tests of difference, confidence intervals, regression analysis, t tests and Pearson correlation (Alacaci, 2004).
In this paper, logistic regression analysis is used to analyse the effect of the explanatory variables on the explained variable. Logistic regression is used because the dependent variable is binary (Nagelkerke et al., 1995): it is the likelihood of experiencing each of the 17 reported adverse effects. The Wilcoxon signed-rank test was also used. It is a paired difference test, that is, a hypothesis test used to compare two related samples. In this paper, Wilcoxon signed-rank tests are carried out a number of times to compare alcohol and other drug use between an average day at the celebrations of school leavers and the last event.
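To illustrate how a paired difference test of this kind works, the following is a minimal sketch that computes the Wilcoxon signed-rank statistic by hand. The drink counts are invented for illustration and are not data from Lam et al. (2014), and the function name `wilcoxon_signed_rank` is hypothetical. In practice a statistical package would be used to obtain the p-value as well.

```python
def wilcoxon_signed_rank(x, y):
    """Return (W_plus, W_minus, n) for paired samples x and y.

    Pairs with a zero difference are dropped, and tied absolute
    differences receive the average of the ranks they span.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        # Find the end of the run of equal absolute differences.
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        avg_rank = ((i + 1) + (j + 1)) / 2  # average of 1-based ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus, len(ordered)


# Hypothetical drinks per respondent (NOT data from the paper).
average_day = [8, 10, 6, 12, 9, 7, 11, 5]
last_event = [5, 7, 6, 8, 10, 4, 6, 3]

w_plus, w_minus, n = wilcoxon_signed_rank(average_day, last_event)
print(f"W+ = {w_plus}, W- = {w_minus}, non-zero pairs = {n}")
```

A large gap between the positive and negative rank sums, as in this made-up sample, is what would lead to rejecting the null hypothesis of no difference between the two occasions.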
- The odds ratio expresses the strength of association between two events, say event X and event Y. For engagement in unprotected sex, the odds ratio was 10.92. This means that, compared with respondents whose protective behaviour strategy score was 14-46 (the comparison/reference group), respondents whose score was 66-84 had 10.92 times the odds of reporting unprotected sex.
- Since the paper uses convenience sampling and self-reported survey measures, the sample cannot be considered representative of all those attending the celebrations. Although the results are similar to those of many Australian Schoolies studies, the estimates in this study are comparatively higher.
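As a worked illustration of what an odds ratio measures, the sketch below computes one from a 2x2 table. The counts are hypothetical and are not taken from Lam et al. (2014); the function names are assumptions made for this example. It also computes the risk ratio from the same counts, to show that an odds ratio compares odds rather than probabilities.

```python
def odds_ratio(group_cases, group_noncases, reference_cases, reference_noncases):
    """Odds of the outcome in the group divided by odds in the reference group."""
    odds_group = group_cases / group_noncases
    odds_reference = reference_cases / reference_noncases
    return odds_group / odds_reference


def risk_ratio(group_cases, group_noncases, reference_cases, reference_noncases):
    """Probability of the outcome in the group divided by that in the reference group."""
    risk_group = group_cases / (group_cases + group_noncases)
    risk_reference = reference_cases / (reference_cases + reference_noncases)
    return risk_group / risk_reference


# Hypothetical counts: 30 of 100 in the comparison group and 10 of 100 in the
# reference group report the outcome (NOT data from the paper).
or_value = odds_ratio(30, 70, 10, 90)
rr_value = risk_ratio(30, 70, 10, 90)
print(f"odds ratio = {or_value:.2f}, risk ratio = {rr_value:.2f}")
```

With these made-up counts the odds ratio is larger than the risk ratio, which is why an odds ratio is best read as a ratio of odds.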
Paper 2: The rising tide of diabetes mellitus in a Chinese population: a population-based household survey on 121,895 persons (Wong et al., 2013).
- In the paper by Wong et al. (2013), the authors study the self-reported prevalence of diabetes mellitus in the population in specific years from 2001 to 2008. The paper evaluates the factors associated with diabetes. Three factors are considered: age, sex and household income.
The aim can be stated in terms of null and alternative hypotheses, as the main aim is to find the factors significantly associated with diabetes.
- Null Hypothesis (H0): There is no significant association between diabetes prevalence and increasing age.
Alternative Hypothesis (Ha): There is a significant association between diabetes prevalence and increasing age.
- Null Hypothesis (H0): There is no significant association between diabetes prevalence and gender.
Alternative Hypothesis (Ha): There is a significant association between diabetes prevalence and gender.
- Null Hypothesis (H0): There is no significant association between diabetes prevalence and household income.
Alternative Hypothesis (Ha): There is a significant association between diabetes prevalence and household income.
- The main aim of the sampling was to gather household data on the population of Hong Kong. In the survey, the largest age group is 35 to 44 years (18.2%), followed by 45-54 years (16.4%). The smallest is the group aged more than 75 years (5.1%). Males make up 49.8% of the surveyed population and females 50.2%. The largest share of respondents has a monthly household income between 10,000 and 24,999 (42.4%). Only 10.4 percent of the surveyed population has a monthly household income of more than 50,000.
- The main inferential statistic used in this study was binary logistic regression. Logistic regression was used because the main aim of the paper was to evaluate the factors associated with the prevalence of diabetes. It made it straightforward to determine the impact of age, sex and household income on diabetes prevalence. The outcome variable in the regression was self-reported diabetes, a binary variable. The independent or explanatory variables were age, sex and household income.
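The general form of a binary logistic regression model can be sketched as follows. The coefficients and variable names below are hypothetical, chosen only to show how the model turns explanatory variables into a probability of the binary outcome; they are not estimates from Wong et al. (2013).

```python
import math


def diabetes_probability(intercept, coefficients, indicators):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum of b_i * x_i)))."""
    log_odds = intercept + sum(
        coefficients[name] * indicators.get(name, 0) for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical coefficients on the log-odds scale (NOT from the paper).
coefficients = {"age_over_65": 2.0, "male": 0.1, "low_income": 0.8}
intercept = -4.0

# Reference respondent: all indicator variables equal to 0.
p_reference = diabetes_probability(intercept, coefficients, {})
# Respondent who is over 65 and in the low income group.
p_older_low_income = diabetes_probability(
    intercept, coefficients, {"age_over_65": 1, "low_income": 1}
)
print(f"reference: {p_reference:.3f}, older and low income: {p_older_low_income:.3f}")
```

Because the outcome is modelled on the log-odds scale, the predicted value always lies between 0 and 1, which is what makes this model suitable for a binary outcome such as self-reported diabetes.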
- The researchers found that, after adjusting for age and sex, the prevalence of diabetes increased progressively over the years in the two lowest income groups. The analysis showed a significant association between household income and the prevalence of diabetes: respondents in the lowest income group reported a higher prevalence of diabetes than those in the highest income group.
- When the regression was run with patient demographics as independent variables, several insights emerged. The adjusted odds ratio is 32.2 for respondents aged 40-65 years and 120.08 for those older than 65 years. The adjusted odds ratio for people with a monthly household income of less than 9,999 is 2.19. Odds ratios greater than one indicate that the odds of diabetes rise with increasing age and are higher for people with lower income. The respondents' sex, by contrast, has no significant association with diabetes (Wallot & Leonardi, 2018).
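The link between an adjusted odds ratio and the underlying regression coefficient can be sketched as follows: in logistic regression the odds ratio is the exponential of the coefficient, so the coefficient is the natural logarithm of the odds ratio. The odds ratios are the ones quoted above; the dictionary keys and the function name are labels assumed for this example.

```python
import math

# Adjusted odds ratios as quoted in the text above.
adjusted_odds_ratios = {
    "age 40-65 years": 32.2,
    "age over 65 years": 120.08,
    "monthly income below 9,999": 2.19,
}


def log_odds_coefficient(odds_ratio):
    """Return the logistic regression coefficient implied by an odds ratio."""
    return math.log(odds_ratio)


implied_coefficients = {
    name: log_odds_coefficient(value)
    for name, value in adjusted_odds_ratios.items()
}

for name, beta in implied_coefficients.items():
    # A positive coefficient corresponds to an odds ratio above 1.
    direction = "higher" if beta > 0 else "not higher"
    print(f"{name}: OR = {adjusted_odds_ratios[name]}, "
          f"coefficient = {beta:.3f}, odds {direction} than reference")
```

All three implied coefficients are positive, which matches the reading that each of these groups has higher odds of reporting diabetes than its reference group.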
- Several limitations of the study make it difficult to fully trust the results. The major problem is that self-reported information is used to determine the prevalence of diabetes, and a study in China suggested that three out of four diabetic patients remain undiagnosed. Self-reported diabetes was not validated. Also, the coefficient of determination of the regression analysis was 20%, meaning that only 20 percent of the variability in diabetes prevalence is explained by age, sex and household income. Other factors that affect diabetes prevalence, such as body mass index, lifestyle factors and family history, should therefore have been included in the study.
References
Alacaci, C. (2004). Inferential statistics: Understanding expert knowledge and its implications for statistics education. Journal of Statistics Education, 12(2).
Emmert-Streib, F., & Dehmer, M. (2019). Understanding Statistical Hypothesis Testing: The Logic of Statistical Inference. Machine Learning and Knowledge Extraction, 1(3), 945-961.
Ganeshpurkar, A., Maheshwari, R., Tekade, M., & Tekade, R. K. (2018). Concepts of Hypothesis Testing and Types of Errors. In Dosage Form Design Parameters (pp. 257-280). Academic Press.
Lam, T., Liang, W., Chikritzhs, T., & Allsop, S. (2014). Alcohol and other drug use at school leavers’ celebrations. Journal of Public Health, 36(3), 408-416. Retrieved from: http://jpubhealth.oxfordjournals.org/content/early/2013/08/26/pubmed.fdt087.full.pdf+html
Nagelkerke, N. J., Moses, S., Plummer, F. A., Brunham, R. C., & Fish, D. (1995). Logistic regression in case-control studies: The effect of using independent as dependent variables. Statistics in Medicine, 14(8), 769-775.
Wallot, S., & Leonardi, G. (2018). Deriving inferential statistics from recurrence plots: A recurrence-based test of differences between sample distributions and its comparison to the two-sample Kolmogorov-Smirnov test. Chaos: An Interdisciplinary Journal of Nonlinear Science, 28(8), 085712.
Wong, M. C. S., Leung, M. C. M., Tsang, C. S. H., & Griffiths, S. M. (2013). The rising tide of diabetes mellitus in a Chinese population: A population-based household survey on 121,895 persons. International Journal of Public Health, 58(2), 269-276. Retrieved from: http://dx.doi.org.ezproxy.laureate.net.au/10.1007/s00038-012-0364-y