Arthroscopic classification of intra-articular hip pathology demonstrates at best moderate interrater reliability
PURPOSE: The purpose of this study was to report several novel classification systems for intra-articular lesions observed during hip arthroscopy and to quantify the interrater reliability of both these novel systems and existing classifications of intra-articular lesions when tested by a group of high-volume hip arthroscopists.
METHODS: Five hip arthroscopists discussed shortcomings of current classification systems and developed several novel grading systems, with particular effort made to capture factors important to the treatment and outcomes of hip arthroscopy for labral injury. A video learning module describing the classifications was then developed from the video archive of surgeries performed by the senior author, and the module was reviewed by study participants. Following review of the module, a pilot study was completed using five randomly selected videos, after which participating surgeons met once more to discuss points of disagreement and to seek clarification. The final video collection for testing reliability comprised 29 videos selected with the intent of capturing all sublevels of each classification scheme. Study participants recorded their assessments using each classification scheme, and interrater reliability was calculated by a study participant not involved in grading.
RESULTS: The average kappa coefficients for the classification schemes ranged from 0.38 to 0.54, with the interrater reliability of all classification schemes except labral degeneration qualifying as moderate. The percentage of cases with absolute agreement ranged from 17.2% to 51.7% across the classification systems.
CONCLUSIONS: Even among a group of high-volume hip arthroscopists who engaged in several discussions about the proposed classification schemes, grades were found to have at best moderate interrater reliability. Moderate interrater reliability is demonstrated for novel grading systems for describing labral tear complexity, labral bruising, labral size, and extent of synovitis, and fair reliability is demonstrated for labral degeneration. Further development and refinement of multifactorial grading systems for describing labral injury are indicated. Evaluating the multifactorial nature of intra-articular lesions in the hip is an important part of intraoperative decision-making, and defining reliable classifications for intra-articular lesions is a critical first step toward developing generalizable criteria for guiding the type of treatment.
LEVEL OF EVIDENCE: Level III.
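
The reliability statistics reported above can be illustrated with a minimal sketch. The abstract does not state which kappa variant was used or how "absolute agreement" was defined, so the code below rests on two assumptions: the "average kappa" is the mean of unweighted Cohen's kappa over all rater pairs, and a case shows absolute agreement only when every rater assigns the same grade. The function names are hypothetical, and the qualitative labels follow the commonly used Landis and Koch bands (0.21 to 0.40 fair, 0.41 to 0.60 moderate), which is consistent with the abstract's wording but is not confirmed as the authors' scale.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two raters grading the same cases."""
    n = len(a)
    # Observed proportion of cases on which the two raters agree.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance, from each rater's marginal grade counts.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[g] * count_b[g] for g in count_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical grade; treat as full agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def mean_pairwise_kappa(ratings):
    """Average Cohen's kappa over all rater pairs (assumed meaning of 'average kappa').

    `ratings` holds one list of grades per rater, all in the same case order.
    """
    pairs = list(combinations(ratings, 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


def percent_absolute_agreement(ratings):
    """Percentage of cases on which every rater gave the same grade (assumed definition)."""
    cases = list(zip(*ratings))
    unanimous = sum(len(set(case)) == 1 for case in cases)
    return 100.0 * unanimous / len(cases)


def landis_koch_label(kappa):
    """Map a kappa value to the Landis and Koch descriptive bands."""
    if kappa < 0:
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"), (0.80, "substantial")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


if __name__ == "__main__":
    # Made-up grades for three raters on six cases, for illustration only.
    example = [
        [1, 2, 2, 3, 1, 2],
        [1, 2, 3, 3, 1, 1],
        [1, 3, 2, 3, 2, 2],
    ]
    kappa = mean_pairwise_kappa(example)
    print(round(kappa, 2), landis_koch_label(kappa))
    print(round(percent_absolute_agreement(example), 1))
```

Under these assumptions, the reported range of 0.38 to 0.54 spans the fair and moderate bands, and with 29 videos the reported 17.2% and 51.7% would correspond to 5 and 15 cases with unanimous grades.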