Examiner experience moderates reliability of human lower extremity muscle ultrasound measurement – a double blinded measurement error study

Authors

  • Konstantin Warneke Department for Movement Science and Exercise Physiology, Friedrich Schiller University Jena, Jena, Germany; Institute of Human Movement Science, Sport and Health, University of Graz, Graz, Austria; Institute of Psychology, Leuphana University Lüneburg, Lüneburg, Germany
  • Stanislav D. Siegel Department for Movement Science and Exercise Physiology, Friedrich Schiller University Jena, Jena, Germany
  • Jonas Drabow Department for Movement Science and Exercise Physiology, Friedrich Schiller University Jena, Jena, Germany
  • Lars H. Lohmann Department for Movement Science and Exercise Physiology, Friedrich Schiller University Jena, Jena, Germany
  • Daniel Jochum Department of Health Science and Technology, ETH Zürich, Zürich, Switzerland
  • Sandro R. Freitas Neuromuscular Research Lab, Faculty of Human Kinetics, University of Lisbon, Lisbon, Portugal
  • José Afonso Centre of Research, Education, Innovation, and Intervention in Sport (CIFI 2 D), Faculty of Sport, University of Porto, Porto, Portugal
  • Andreas Konrad Institute of Human Movement Science, Sport and Health, University of Graz, Graz, Austria

Keywords:

Sonography, Agreement, Intraclass correlation coefficient, Pennation angle, Muscle thickness

Abstract

Structural muscle properties are critical in health and athletic settings. Magnetic resonance imaging is considered the gold-standard assessment procedure under static conditions because of its reliability and objectivity, but practical limitations, including cost and accessibility, have led to the increasing use of ultrasound as an alternative for assessing skeletal muscle morphological parameters. However, ultrasound measurements are sensitive to evaluation conditions and assessor experience, and the influence of experience has not yet been sufficiently explored. Therefore, this study investigated the influence of assessor experience on the reliability of ultrasound measurements. A double-blind design was used, involving an experienced assessor (> 12,000 images acquired over several years) and multiple inexperienced assessors (< 100 images) who collected data from 39 recreationally active participants. Muscle architecture was measured in the leg muscles on two consecutive days, generating 1,248 ultrasound images. Relative and absolute reliability were analyzed using intraclass correlation coefficients (ICCs), standard error of measurement, minimal detectable change, mean absolute error (MAE), mean absolute percentage error (MAPE) and Bland-Altman analyses. Relative reliability for muscle thickness was good to excellent at all measurement sites and time points (ICC = 0.76–0.98) irrespective of assessor experience, except for the inter-day comparison of the gastrocnemius lateralis by the inexperienced assessors (ICC = 0.58). For the pennation angle, reliability ranged from insufficient to excellent (ICC = 0.18–0.94), and experience contributed greatly to better results. The random error of the inexperienced assessors was reflected in two to three times higher MAEs/MAPEs and limits of agreement in the Bland-Altman analyses.
The findings emphasize the importance of experience and standardization in achieving reliable ultrasound data, particularly for (a) sensitive parameters like the pennation angle and/or (b) inter-day, intra-subject comparisons.
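
To make the reliability indices named in the abstract concrete, the following is a minimal Python sketch of how they could be computed for two repeated measurement series. It is not the authors' analysis code. Several choices are assumptions, because the abstract does not specify them: the ICC form (here ICC(2,1), two-way random effects, absolute agreement, single measurement), the standard error of measurement computed as SD × sqrt(1 − ICC) with the SD taken over all scores, the minimal detectable change at the 95% level, and the MAPE expressed relative to the mean of each measurement pair. The function names `icc_2_1` and `reliability_summary` and the data values are hypothetical.

```python
import numpy as np


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    `scores` has one row per participant and one column per session or rater.
    Assumption: the abstract does not state which ICC form was used.
    """
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * np.sum((x.mean(axis=1) - grand) ** 2)  # between participants
    ss_cols = n * np.sum((x.mean(axis=0) - grand) ** 2)  # between sessions/raters
    ss_err = np.sum((x - grand) ** 2) - ss_rows - ss_cols  # residual
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def reliability_summary(first, second):
    """Relative and absolute reliability indices for two measurement series."""
    a = np.asarray(first, dtype=float)
    b = np.asarray(second, dtype=float)
    diff = a - b
    icc = icc_2_1(np.column_stack([a, b]))
    # Assumed SEM definition: SD of all scores times sqrt(1 - ICC).
    sd_all = np.std(np.concatenate([a, b]), ddof=1)
    sem = sd_all * np.sqrt(max(0.0, 1.0 - icc))
    mdc95 = 1.96 * np.sqrt(2.0) * sem  # minimal detectable change, 95% level
    mae = np.mean(np.abs(diff))
    # Assumed MAPE reference: the mean of each measurement pair.
    mape = 100.0 * np.mean(np.abs(diff) / ((a + b) / 2.0))
    bias = diff.mean()  # systematic error (Bland-Altman mean difference)
    half_width = 1.96 * diff.std(ddof=1)  # random error component
    return {
        "icc": icc,
        "sem": sem,
        "mdc95": mdc95,
        "mae": mae,
        "mape": mape,
        "bias": bias,
        "loa": (bias - half_width, bias + half_width),
    }


# Made-up muscle thickness values in mm for five participants on two days.
day1 = [20.0, 22.0, 25.0, 30.0, 18.0]
day2 = [21.0, 21.0, 26.0, 29.0, 19.0]
summary = reliability_summary(day1, day2)
for name, value in summary.items():
    print(name, value)
```

In this sketch the ICC expresses relative reliability, whereas the MAE, MAPE and the limits of agreement quantify the absolute error in the unit of measurement or in percent, which is where the abstract reports the two- to three-fold differences between experience levels.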

Published

2025-03-26

How to Cite

1. Warneke K, Siegel SD, Drabow J, et al. Examiner experience moderates reliability of human lower extremity muscle ultrasound measurement – a double blinded measurement error study. Ultrasound J. 2025;17(1):20. Accessed January 30, 2026. https://www.mattioli1885journals.com/index.php/theultrasoundjournal/article/view/18134