Do We Agree on Hematuria? Evaluating the Drinks Rating System

By: Justin Ingram, BS, Columbia University Irving Medical Center, New York, New York; Benjamin Muller, MD, Columbia University Irving Medical Center, New York, New York; Nina Mikkilineni, MD, Columbia University Irving Medical Center, New York, New York; Eleanor Hayes-Larson, PhD, MPH, Mailman School of Public Health, New York, New York; Gen Li, PhD, Mailman School of Public Health, New York, New York; Christopher Anderson, MD, MPH, Columbia University Irving Medical Center, New York, New York | Posted on: 09 Mar 2023

Introduction

Figure 1. Twelve deidentified hematuric samples from inpatient and outpatient adult urology patients at Columbia University Medical Center, with mean severity scores for each corresponding sample.

Figure 2. Word cloud of written responses to describe appearance of hematuria. Word size reflects frequency of use.

Gross hematuria (GH) is a common urological symptom of an array of underlying diseases, from urinary tract infection to bladder cancer.¹ As a result, GH is one of the most frequent reasons for a urological consult, second only to urinary calculi and lower urinary tract symptoms.² Despite this, providers have no widely agreed-upon classification system for GH, which makes communication about the appearance and severity of hematuria challenging. Investigators have attempted to develop methods to describe hematuria, yet the drinks rating system remains commonly used in practice, particularly among non-urological practitioners.³⁻⁶ Nonetheless, this system has never been meaningfully validated, so its true utility is unknown. This study aims to assess the performance of the drinks rating system among providers at an academic medical center.

Methods

Twelve hematuric urine samples were collected from adult urology patients (Figure 1). A survey using pictures and videos of the urine samples was distributed to various providers. Each survey included 8 randomly chosen samples. Participants were asked to provide a free-text description of the appearance of each urine sample before choosing a match from a predetermined list of 10 drink options: lemonade, amber, pink lemonade, rosé wine, fruit punch, cranberry juice, tea, red wine, cola, and tomato juice. Participants were also asked to rate the severity of hematuria on a 6-point scale from “no hematuria” to “severe” and to indicate whether the sample contained clots. We used the intraclass correlation coefficient (ICC) to estimate the reliability of the drink descriptor and hematuria severity ratings across providers. We stratified ICC by level of training and specialty. The protocol was approved by the Columbia University Institutional Review Board.
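To make the reliability measure concrete, the following is a minimal sketch of one common ICC form, ICC(2,1) (two-way random effects, absolute agreement, single rater), computed from a samples-by-raters matrix. It is an illustration only: the article does not state which ICC form or software was used, the function name `icc_2_1` and the example scores are hypothetical, and the sketch assumes every rater scored every sample, whereas each survey in this study showed only 8 of the 12 samples.

```python
from typing import Sequence


def icc_2_1(ratings: Sequence[Sequence[float]]) -> float:
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings[i][j] is the score rater j gave to sample i.
    Assumes a complete matrix (no missing ratings); this is a
    simplification and not necessarily the authors' method.
    """
    n = len(ratings)          # number of samples (rows)
    k = len(ratings[0])       # number of raters (columns)
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete matrix with >= 2 samples and >= 2 raters")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(ratings[i][j] for i in range(n)) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between samples
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols                    # residual

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


if __name__ == "__main__":
    # Hypothetical severity scores (0 = no hematuria ... 5 = severe);
    # rows are samples, columns are raters. Not data from the study.
    example = [
        [0, 0, 1],
        [2, 3, 2],
        [4, 5, 5],
        [1, 1, 2],
    ]
    print(f"ICC(2,1) = {icc_2_1(example):.2f}")
```

With incomplete designs such as the one used here, a mixed-effects model or a statistics package that handles missing ratings would be the more appropriate tool; the closed-form version above is meant only to show what the coefficient measures.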

Results

Figure 3. Selected intraclass correlation coefficient values for hematuria severity and categorical drink descriptor. IM indicates internal medicine physicians.

We received 105 survey responses for a response rate of 14.1%. Almost 80% of the respondents were physicians (n = 83). The majority of nonphysician participants were registered nurses (RNs; n = 18). Over half of the respondents reported internal medicine as their primary field (n = 54), and more than 35% were 30-40 years old.

Each sample was rated by an average of 67 providers using the ratings shown in the Table. Across all ratings, the most common was mild (22.4%) and the least common was severe (12.5%). One hundred ninety-eight unique terms were provided across all samples (Figure 2). When describing samples in their own words, respondents commonly used the words “red,” “pink,” and “hematuria.” The 10 most commonly used words are provided in the Table. A mean severity score was calculated for each sample and ranged from 0.0 (no hematuria) to 4.6 (severe); results are shown in Figure 1.

Selected ICC results are provided graphically in Figure 3. Overall inter-rater reliability was good for severity (ICC 0.75, 95% CI 0.57-0.91) and moderate for the drinks rating system (ICC 0.62, 95% CI 0.42-0.85).

Table. Survey responses

Response	Times used, No. (%)
Drink descriptors
Lemonade	83 (10.4)
Amber	73 (9.1)
Pink lemonade	111 (13.9)
Rosé wine	92 (11.5)
Fruit punch	126 (15.8)
Cranberry juice	68 (8.5)
Tea	75 (9.4)
Red wine	57 (7.1)
Cola	26 (3.3)
Tomato juice	89 (11.1)
Severity ratings
No hematuria	154 (19.2)
Mild	180 (22.4)
Mild-moderate	111 (13.8)
Moderate	134 (16.7)
Moderate-severe	124 (15.4)
Severe	100 (12.5)
Free-text word frequency
Red	152 (8.2)
Pink	131 (7.1)
Hematuria	94 (5.1)
Light	89 (4.8)
Clear	83 (4.5)
Bloody	81 (4.4)
Urine	79 (4.3)
Yellow	77 (4.2)
Dark	74 (4.0)
Clots	72 (3.9)

Discussion

Although GH is one of the most common reasons for urological consultation, there remains little standardization in how it is described. A commonly used but unvalidated method for characterizing GH is to compare it to familiar drinks. Our study aimed to assess agreement between providers in describing the appearance of hematuria using the drinks rating system and in rating its subjective severity.

Our observations support the need for a standardized hematuria scale, evidenced by the sheer number of words used to describe hematuria in the free-text portion of the survey (Figure 2). This highlights the lack of consensus regarding an accurate way to describe hematuria.

When respondents estimated the severity of GH, the ICCs were relatively high (overall ICC 0.75, 95% CI 0.57-0.91), whereas the ICCs for the drink descriptor were almost entirely in the low-moderate to moderate range. Our findings therefore suggest that conveying perceived severity is a more reliable way of communicating the degree of hematuria than the drinks rating system. Although practitioners generally tended to agree on the clinical significance of hematuria, we observed differences among types of providers. Nonphysician providers rated a given sample as more severe twice as often as physicians did, despite both groups having similar internal reliability.

Systematic assessment of hematuria has received limited attention in the literature. Studies have previously explored GH assessment by color or severity but rarely both. One study created a point system based on the degree to which urine in catheter tubing obscured the Roman numeral “II” in increasingly large font with a corresponding blood concentration.³ No assessment of reliability was performed. Other studies have used color swatch cards for comparison to hematuria.⁴⁻⁶ Good reliability was found for a 6-color swatch card assessment.⁴ With regard to severity, one study used a 5-point visual severity scale created by asking urologists how they would adjust continuous bladder irrigation if presented with a urine sample.⁸ The resulting scale proved highly reliable among study participants, but because the study was small (n = 43) and over 50% of its participants were urological practitioners, generalizability to non-urological providers remained limited.⁸

Our study has some limitations. Because of the relatively small number of respondents, the 95% CIs for all ICCs were wide. Despite these wide intervals, the broad conclusion remains the same: practitioners agree more on the severity of hematuria than they do on how to describe it using the drinks rating system. Another limitation lies in the drinks rating system itself, specifically its large number of possible choices. In addition, the participants in this study were all providers at a single academic center. However, because most consults occur within an institution, within-institution reliability is of greater relevance than reliability across institutions.

Our results suggest there is an opportunity to develop a standardized scale for describing hematuria, which may help streamline communication about hematuria among health care providers across disciplines. While approaches such as swatch cards may be useful and reliable, a severity score requires no device or product and therefore has fewer barriers to implementation.

Conclusion

We found that practitioners tended to agree more on the subjective severity of hematuria than on a description using a drinks rating scale. This suggests that there are opportunities to standardize communication about hematuria and that a simple severity scale may effectively convey acuity during urological consultation.

1. Khadra MH, Pickard RS, Charlton M, Powell PH, Neal DE. A prospective analysis of 1,930 patients with hematuria to evaluate current diagnostic practice. J Urol. 2000;163(2):524-527.
2. Stoffel JT, Moinzadeh A, Hansen M. Identification of common themes from after-hour telephone calls made to urology residents. Urology. 2003;62(4):618-621.
3. Sakuma S, Fujita R, Komiya H. A novel method for evaluating and expressing the degree of macroscopic hematuria. Int Urol Nephrol. 2006;38(2):203-205.
4. Hageman N, Aronsen T, Tiselius HG. A simple device (Hemostick) for the standardized description of macroscopic haematuria: our initial experience. Scand J Urol Nephrol. 2006;40(2):149-154.
5. Lee JY, Chang JS, Koo KC, Lee SW, Choi YD, Cho KS. Hematuria grading scale: a new tool for gross hematuria. Urology. 2013;82(2):284-289.
6. Wong LM, Chum JM, Maddy P, Chan ST, Travis D, Lawrentschuk N. Creation and validation of a visual macroscopic hematuria scale for optimal communication and an objective hematuria index. J Urol. 2010;184(1):231-236.
7. Koo TK, Li MY. A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J Chiropr Med. 2016;15(2):155-163.
8. Stout TE, Borofsky M, Soubra A. A visual scale for improving communication when describing gross hematuria. Urology. 2021;148:32-36.