Effect sizing in the statistical sport data processing

PhD, Associate Professor I.E. Barnikova¹
Dr.Hab., Professor A.V. Samsonova¹
PhD, Professor L.L. Tsipin¹
¹Lesgaft National State University of Physical Education, Sport and Health, St. Petersburg

Keywords: effect size, statistical processing of sport data, effect size for related samples, Student's t-test.

Background. One of the key requirements of international research publications today is the calculation of effect size, which is regarded as one of the main indicators of the practical significance of study findings. International research journals and publishers, including those specializing in physical education and sport, increasingly reject study data supported by the p-value alone. It is now a fairly standard requirement that a study report specify the effect size to confirm the practical benefits of the findings, together with confidence intervals and effective ranges of the applied criteria [7, 11]. National research publications on physical education and sport, by contrast, seldom if ever report effect sizes [9, 14] and rely on p-values when rating practical validity, despite the opinions [5, 12, 10] that a statistically significant difference is not always a practically significant one.

Objective of the study was to illustrate with examples how study findings can be substantiated by effect size calculations, and to analyze the pros and cons of this statistical index.

Results and discussion. The term effect size refers to a group of indices that rate the magnitude or degree of change in the phenomenon under study [1]. The effect size calculation procedure is generally determined by the data type, research hypothesis, experimental design and significance test applied [12, 13]. In this study we analyze the effect size calculation procedure for testing a statistical hypothesis with Student's t-test for paired data sets, i.e. related samples measured prior to and after an experiment.

An analysis of the research literature shows that the research community prefers Cohen's d for rating the effect size when a statistical hypothesis is tested with Student's t-test for independent samples, such as an experimental group (EG) and a control group (CG) [7, 8, 14]. The effect size for related samples is denoted $d_z$ [4, 13] and is computed using the following equation:

$$d_z = \frac{\bar{d}}{s_d}, \qquad (1)$$

where $d_z$ is the effect size associated with Student's t-test for related samples; $\bar{d}$ is the arithmetic mean of the differences between the paired values, e.g. the pre- and post-experimental ones; and $s_d$ is the standard deviation of these differences. When the value of the Student's t statistic is known, the following equation may be used:

$$d_z = \frac{t}{\sqrt{n}}, \qquad (2)$$

where $t$ is the value of the Student's t statistic and $n$ is the sample size (the number of pairs). Table 1 illustrates the case.
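
Why equations (1) and (2) agree follows from the definition of the paired t statistic. A short derivation, writing $\bar{d}$ for the mean of the paired differences, $s_d$ for their standard deviation and $d_z$ for the effect size (this notation is assumed here for illustration):

```latex
t = \frac{\bar{d}}{s_d / \sqrt{n}}
  = \frac{\bar{d}}{s_d}\,\sqrt{n}
  = d_z \sqrt{n}
\quad\Longrightarrow\quad
d_z = \frac{t}{\sqrt{n}}
```

Equation (2) is therefore convenient when a report gives only the t value and the sample size but not the raw differences.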

Table 1. Body mass (kg) of 18-20-year-old women before and after the experiment (n = 10)

Pre-experimental    59.1  62.3  58.6  60.2  63.4  78.6  55.4  64.9  65.0  63.2
Post-experimental   58.0  62.3  58.0  59.1  60.2  68.3  57.0  60.8  62.0  60.7
Difference, d        1.1   0.0   0.6   1.1   3.2  10.3  -1.6   4.1   3.0   2.5

Row d gives the differences between the pre- and post-experimental values. Their arithmetic mean is $\bar{d} = 2.43$ kg, their standard deviation is $s_d = 3.24$ kg, and the Student's t statistic is t = 2.369. The difference between the two data sets is statistically significant (p < 0.05), as can be verified with any statistical package.
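
The summary statistics for Table 1 can be reproduced with a short script. This is a minimal sketch in Python using only the standard library. The variable names are illustrative, and the sample standard deviation (n - 1 in the denominator) is assumed, since it is the form consistent with the reported t = 2.369:

```python
import math
import statistics

# Body mass (kg) from Table 1.
pre = [59.1, 62.3, 58.6, 60.2, 63.4, 78.6, 55.4, 64.9, 65.0, 63.2]
post = [58.0, 62.3, 58.0, 59.1, 60.2, 68.3, 57.0, 60.8, 62.0, 60.7]

# Paired differences (pre minus post), as in row d of Table 1.
diffs = [a - b for a, b in zip(pre, post)]

n = len(diffs)
mean_diff = statistics.mean(diffs)
sd_diff = statistics.stdev(diffs)  # sample SD, n - 1 in the denominator

# Paired t statistic: mean difference over its standard error.
t_stat = mean_diff / (sd_diff / math.sqrt(n))

print(f"mean difference   = {mean_diff:.2f} kg")
print(f"SD of differences = {sd_diff:.2f} kg")
print(f"t                 = {t_stat:.3f}")
```

With nine degrees of freedom the two-sided critical value at the 0.05 level is about 2.262, so the computed t is consistent with p < 0.05.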

Computing the effect size with equation (1), i.e. dividing the mean difference by the standard deviation of the differences, gives

$$d_z = \frac{\bar{d}}{s_d} = \frac{2.43}{3.24} = 0.75.$$

Equation (2) gives the same result:

$$d_z = \frac{t}{\sqrt{n}} = \frac{2.369}{\sqrt{10}} = 0.75.$$

Let us now rate the effect size. J. Cohen [4] offered the following guiding benchmarks for effect sizes of mean differences: a small effect at $d = 0.2$, a medium effect at $d = 0.5$ and a large effect at $d = 0.8$. In the above case the effect size may be rated medium. Regrettably, J. Cohen [4] offered no finer gradation of effect size ranges, since his benchmarks were developed for psychological and social test data only, whereas some study reports [2] demonstrate effect sizes above one ($d = 1.57$). A few foreign expert analyses have shown that effect size values for sport data may differ from Cohen's benchmarks [9, 3].
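
The two routes to the effect size and its rating against Cohen's benchmarks can be sketched in Python. The function names below are hypothetical, and treating the benchmarks 0.2, 0.5 and 0.8 as lower bounds of the "small", "medium" and "large" bands is an assumption made here for illustration, not a rule stated by Cohen [4]:

```python
import math
import statistics

# Paired differences (kg) from row d of Table 1.
diffs = [1.1, 0.0, 0.6, 1.1, 3.2, 10.3, -1.6, 4.1, 3.0, 2.5]
n = len(diffs)


def dz_from_differences(d):
    """Equation (1): mean difference divided by the sample SD of the differences."""
    return statistics.mean(d) / statistics.stdev(d)


def dz_from_t(t, n):
    """Equation (2): t statistic divided by the square root of the sample size."""
    return t / math.sqrt(n)


def cohen_label(d):
    """Hypothetical helper: map |d| onto Cohen's benchmark labels,
    assuming each benchmark is the lower bound of its band."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"


dz1 = dz_from_differences(diffs)
dz2 = dz_from_t(2.369, n)  # t value reported in the text

print(f"equation (1): {dz1:.2f}")
print(f"equation (2): {dz2:.2f}")
print(f"rating: {cohen_label(dz1)}")
```

For the Table 1 data both equations give approximately 0.75, which falls in the medium band. A value such as 1.57 is labelled only "large" under this scheme, which illustrates why a finer, sport-specific scale is called for.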

Conclusion. The proposed effect size calculation procedure for related samples, based on the widely used Student's t-test for statistical hypothesis testing, is simple and practical for data processing purposes. The data and analysis presented demonstrate the benefits of the effect size formula for substantiating study findings. It should be emphasized, however, that the physical education and sport research community needs a dependable sport-specific effect size rating scale, since each of the existing scales remains imperfect.

References

  1. Barnikova I.E. Ispolzovanie informatsionnykh tekhnologiy dlya otsenki razmera effekta v biomekhanicheskikh issledovaniyakh [Information technology to assess the effect size in biomechanical research]. Works of the Biomechanics Department of P.F. Lesgaft University. No. XI. St. Petersburg: R-KOPI, 2017. pp. 6-11.
  2. Samsonova A.V., Borisevich M.A., Barnikova I.E. Izmenenie mekhanicheskikh svoystv skeletnyih myishts pod vliyaniem fizicheskoy nagruzki [Changes in mechanical properties of skeletal muscles under physical activity]. Uchenye zapiski universiteta im. P.F. Lesgafta. 2017. No. 2 (144). pp. 221-224.
  3. Bernards J.R., Sato K., Haff G.G., Bazyler C.D. Current Research and Statistical Practices in Sport Science and a Need for Change. Sports. 2017. Vol. 5(4). P. 87.
  4. Cohen J. Statistical Power Analysis for the Behavioral Sciences. New York: Lawrence Erlbaum Associates, 1988. 568 p.
  5. Cohen J. The Earth Is Round (p < .05). American Psychologist. 1994. Vol. 49(12). pp. 997-1003.
  6. Ferguson C.J. An Effect Size Primer: A Guide for Clinicians and Researchers. Professional Psychology: Research & Practice. 2009. No. 40. pp. 532-538.
  7. Fritz C.O., Morris P.E., Richler J.J. Effect Size Estimates: Current Use, Calculations, and Interpretation. Journal of Experimental Psychology: General. 2012. Vol. 141(1). pp. 2-18.
  8. Fröhlich M., Emrich E., Pieter A., Stark R. Outcome effects and effects sizes in sport sciences. International Journal of Sports Science and Engineering. 2009. Vol. 3(3). pp. 175-179.
  9. Goodman S.N. Aligning statistical and scientific reasoning. Science. 2016. Vol. 352(6290). pp. 1180-1181.
  10. Kelley K., Preacher K.J. On effect size. Psychological Methods. 2012. Vol. 17. pp. 137-152.
  11. Kline R.B. Beyond significance testing: Reforming data analysis methods in behavioral research. Washington, DC: American Psychological Association, 2004. 325 p.
  12. Lakens D. Calculating and reporting effect sizes to facilitate cumulative science: a practical primer for t-tests and ANOVAs. Frontiers in Psychology. 2013. No. 4. P. 863.
  13. Tomczak M., Tomczak E. The need to report effect size estimates revisited. An overview of some recommended measures of effect size. Trends in Sport Sciences. 2014. Vol. 1. pp. 19-25.

Corresponding author: l_tsipin@mail.ru

Abstract

Objective of the study was to illustrate with examples how study findings can be substantiated by effect size calculations, and to analyze the pros and cons of this statistical index. The study further analyzes the effect size calculation method for testing statistical hypotheses with Student's t-test applied to pre- versus post-experimental data sets.

The proposed effect size calculation procedure for related samples, based on the widely used Student's t-test for statistical hypothesis testing, is simple and practical for data processing purposes. The data and analysis presented demonstrate the benefits of the effect size formula for substantiating study findings. It should be emphasized, however, that the physical education and sport research community needs a dependable sport-specific effect size rating scale, since each of the existing scales remains imperfect.