Field site and CRA/CRC screening
Kaiser Permanente Hawaii (KPH) is a fully integrated health plan that encourages all members between the ages of 50 and 75 (except those who have had a previous CRA or CRC, who are followed by colonoscopy) to participate in annual screening for CRA/CRC2; approximately 75% of these members provide a FIT for screening each year. Thus, members with previous CRA or CRC were excluded from the current study, but those with previous FIT screening (regardless of FIT result) were eligible. Colonoscopy is not used for routine screening. The first stage of screening involves self-collection, at home, of feces using a licensed, commercially available, bar-coded FIT device (Polymedco OC-Auto Micro FOB Test, Cortland Manor, NY). The device typically collects 10–50 mg of feces into a sealed vial containing 2 mL of proprietary solution. Within 3 days, it is sent at room temperature in a prepaid mailer via the United States Postal Service to the KPH (Moanalua Medical Center) reference laboratory where, the next working day, the immunochemical tests for human hemoglobin are carried out with a system-dedicated robotic instrument (Polymedco Auto Sensor Diana) integrated into the laboratory information system. The positive threshold with the Polymedco process and equipment is 100 ng hemoglobin/mL (feces or diluent), which gives an analytical sensitivity of 96.11% and a specificity of 99.33% (accessdata.fda.gov/cdrh_docs/reviews/K041408.pdf). Both positive (FIT+) and negative results flow electronically from the instrument to KPH’s electronic medical record (EMR). Patients with a FIT+ result are referred to KPH Gastroenterology for second-stage colonoscopy screening, which is usually completed within a few weeks. Descriptive colonoscopy results are added to the EMR on the same day. Macroscopic and histopathological diagnoses of biopsies and excised lesions are added within a week. The KPH primary care provider coordinates follow-up and clinical management, if needed.
All study methods and procedures were performed in accordance with applicable guidelines and regulations.
During the six-month interval from March to August 2014, KPH tested 18,061 FIT devices (18,001 patients), of which 941 (5.21%) were FIT+ (in 937 patients). Among the FIT+ patients, colonoscopy revealed that 10 had CRC (ICD-9 153) and 171 had CRA (ICD-9 211.3), including 111 with tubular adenoma, 9 with tubulovillous adenoma, and 15 with both tubular and tubulovillous adenoma. In 2014, the prevalence of type 2 diabetes was 19% and the prevalence of obesity [body mass index (BMI) > 30] was 22% in KPH members over age 50.
Outcome and covariate classification
The current project employed a hierarchy of histopathologic diagnoses: first, CRC (excluding in-situ) if present; else high-risk CRA (at least one adenoma with diameter ≥ 1 cm or with villous histology); else low-risk CRA (< 1 cm diameter and no villous histology); else all other (including miscellaneous and benign). BMI, calculated from weight and height (kg/m²), was classified as healthy (18.5–24.99), overweight (25–29.99), or obese (≥ 30). Race and ethnicity (self-declared), Charlson morbidity index, and medical diagnoses were ascertained from KPH EMR data. Races were grouped into four categories: White, Asian, multiple, and other/ambiguous race. Charlson morbidity index was classified into three categories: = 0; ≥ 1 and ≤ 3; ≥ 4.
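The classification rules described above can be illustrated with a minimal Python sketch. It is not the study's code: the `Adenoma` record and all function names are hypothetical, the handling of BMI values below 18.5 (returned as `None`) is an assumption because the text lists no category for them, and BMI values between 24.99 and 25 (or 29.99 and 30) are assigned by the conventional cut-offs of 25 and 30.

```python
from typing import Iterable, NamedTuple, Optional


class Adenoma(NamedTuple):
    """Hypothetical record for one adenoma found at colonoscopy."""
    diameter_cm: float
    villous: bool  # True if any villous histology was reported


def classify_outcome(has_invasive_crc: bool, adenomas: Iterable[Adenoma]) -> str:
    """Apply the diagnostic hierarchy: CRC, else high-risk CRA, else low-risk CRA, else other."""
    if has_invasive_crc:  # in-situ carcinoma is excluded from the CRC category
        return "CRC"
    adenomas = list(adenomas)
    # High risk: at least one adenoma >= 1 cm in diameter or with villous histology.
    if any(a.diameter_cm >= 1.0 or a.villous for a in adenomas):
        return "high-risk CRA"
    if adenomas:
        return "low-risk CRA"
    return "other"


def classify_bmi(weight_kg: float, height_m: float) -> Optional[str]:
    """Classify BMI (kg/m^2); None for BMI < 18.5 (assumption: not defined in the text)."""
    bmi = weight_kg / height_m ** 2
    if bmi < 18.5:
        return None
    if bmi < 25:
        return "healthy"
    if bmi < 30:
        return "overweight"
    return "obese"


def classify_charlson(score: int) -> str:
    """Group the Charlson morbidity index into 0, 1-3, or >= 4."""
    if score == 0:
        return "0"
    if score <= 3:
        return "1-3"
    return ">=4"
```

For example, a patient without CRC who has one 0.5 cm tubular adenoma and one 1.2 cm tubular adenoma would be assigned to the high-risk CRA group, because a single qualifying adenoma is sufficient.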
Using the KPH electronic pharmacy records, we assessed oral and parenteral medications prescribed to our study participants during the 365 days prior to FIT collection. The medications included antibiotics (specifically, cephalosporins, fluoroquinolones, macrolides, penicillins, tetracyclines, and aminoglycosides), cardiovascular drugs (specifically, statins, fibrates, beta blockers, calcium channel blockers, and other antihypertensives), corticosteroids, and proton pump inhibitors. Antibiotics and cardiovascular drugs were categorized by cumulative length of prescription (none = 0, < median = 1, ≥ median = 2). Corticosteroids and proton pump inhibitors were categorized as none vs any.
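The three-level exposure coding can be sketched as follows. This is an illustration only: the function names are hypothetical, and the sketch assumes that the median is computed among participants with any prescription in the drug class (the text does not state the reference group for the median).

```python
from statistics import median
from typing import List, Sequence


def categorize_exposure(days: Sequence[float]) -> List[int]:
    """Code cumulative prescription days as 0 (none), 1 (< median), or 2 (>= median).

    Assumption: the median is taken over participants with days > 0 only.
    """
    exposed = [d for d in days if d > 0]
    if not exposed:
        return [0] * len(days)
    cutoff = median(exposed)
    return [0 if d <= 0 else (1 if d < cutoff else 2) for d in days]


def any_use(days: Sequence[float]) -> List[int]:
    """Code a drug class as none (0) vs any (1), as for corticosteroids and PPIs."""
    return [int(d > 0) for d in days]
```

With cumulative days of `[0, 5, 10, 30, 0]`, the median among users is 10, so `categorize_exposure` gives `[0, 1, 2, 2, 0]`.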
Collection and shipping of specimens
Over the course of six months, all FIT+ devices, plus four randomly selected FIT-negative devices per week (N = 96 total FIT-negatives), plus 24 blank FIT devices (run through the detection system with no feces) were immediately frozen at −20 °C, then shipped on dry ice by overnight courier in approximately equal-sized batches to the National Cancer Institute (NCI) repository in accordance with International Air Transport Association (IATA) regulations. KPH staff excluded no specimens, as they knew only the FIT result, not the indication for testing or other information. The NCI repository organized the specimens into 24 batches, each including a blank FIT, an Artificial Colony specimen, and a Robogut A specimen26.
Processing of specimens; generation and editing of 16S rRNA sequence data
At the Institute for Genome Sciences, University of Maryland School of Medicine, the entire 2 mL of proprietary solution plus feces was aspirated from each FIT device. This was mixed with 350 µL of lysis buffer composed of 0.05 M potassium phosphate buffer containing 50 µL lysozyme (10 mg/mL), 6 µL of mutanolysin (25,000 U/mL; Sigma-Aldrich) and 3 µL of lysostaphin (4000 U/mL in 98sodium acetate; Sigma-Aldrich, St. Louis, MO). The mixture was incubated for 1 h at 37 °C, after which 10 µL proteinase K (20 mg/mL), 100 µL 10% SDS, and 20 µL RNase A (20 mg/mL) were added. This mixture was incubated for 1 h at 55 °C. To further lyse microbial cells, Lysing Matrix B 2 mL beads (MP Biomedicals, Santa Ana, CA) were added, after which mechanical disruption (bead beating) was performed on the mixture using a FastPrep instrument (MP Biomedicals, Solon, OH) set at 6.0 m/s for 30 s. The lysate was processed using the QIAsymphony SP Pathogen complex 400 protocol (Qiagen, Gaithersburg, MD) according to the manufacturer’s recommendations. DNA was eluted in 100 µL of storage buffer [QIAsymphony reagent buffer AVE (0.04% sodium azide), Qiagen], pH 8.0. PCR inhibitors were removed from extracted DNA using the Zymo-Spin IV Spin Filter column (Zymo Research, Irvine, CA) according to the manufacturer’s recommendations. DNA was quantitated by Quant-iT PicoGreen (Molecular Probes, Inc., Eugene, OR) in a SpectraMax M5 microplate reader (Molecular Devices, Sunnyvale, CA).
A region of approximately 469 bp encompassing the V3 and V4 hypervariable regions of the 16S rRNA gene was targeted for sequencing. This region provides extensive information for the taxonomic classification of microbial communities from specimens associated with human microbiome studies and has been used by the Human Microbiome Project27. Dual barcode fusion primers 319F (5′-ACTCCTACGGGAGGCAGCAG-3′) and 806R (5′-GGACTACHVGGGTWTCTAAT-3′) were used to amplify the V3–V4 region of bacterial 16S rRNA genes. Amplicons were pooled at equimolar concentration and sequenced on an Illumina MiSeq instrument using the 300 bp paired-end protocol. Sequenced reads were processed using the following steps: (1) removal of the primer sequence, (2) truncation of reads not having an average quality of 20 over a 30 bp sliding window, based on the phred algorithm28,29 as previously implemented30,31, (3) removal of trimmed reads that were less than 75% of their original length, and (4) removal of the companion (paired) reads of those rejected for being less than 75% of their original length. Quantitative Insights Into Microbial Ecology (QIIME, version 1.6.0)32 was used for all other sequence processing steps, including quality trimming and demultiplexing. Quality trimming in QIIME was performed using the following criteria: (1) truncation of the sequence before 3 consecutive low-quality bases, followed by re-evaluation of its length, (2) no ambiguous base calls, (3) a minimum sequence length of 150 bp after trimming, and (4) removal of sequences with less than 60% identity to a pre-built Greengenes database of 16S rRNA gene sequences (October 2012 version)33. Additional data processing included denoising by clustering similar sequences with less than 3% dissimilarity using USEARCH34 and de novo detection and removal of chimeras in UCHIME v5.135. Paired reads were stitched together with an “N” separating the two reads and treated as a single sequence in the analysis.
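The sliding-window truncation and the 75% length filter for read pairs can be sketched in Python as below. This is a simplified illustration rather than the pipeline actually used: the function names are hypothetical, and the choice to cut the read at the start of the first failing window is an assumption, since the text does not specify the exact cut position.

```python
from typing import List


def truncate_by_window(quals: List[int], window: int = 30, min_mean: float = 20.0) -> int:
    """Return the number of leading bases to keep.

    The read is cut at the start of the first window whose mean phred
    quality falls below min_mean (assumed cut-point convention).
    """
    n = len(quals)
    if n < window:
        # Read shorter than one window: keep it only if its overall mean passes.
        return n if n > 0 and sum(quals) / n >= min_mean else 0
    window_sum = sum(quals[:window])
    for start in range(n - window + 1):
        if start > 0:
            # Slide the window by one base: add the new base, drop the old one.
            window_sum += quals[start + window - 1] - quals[start - 1]
        if window_sum / window < min_mean:
            return start
    return n


def keep_read(original_len: int, kept_len: int, min_fraction: float = 0.75) -> bool:
    """A trimmed read is retained only if it keeps >= 75% of its original length."""
    return kept_len >= min_fraction * original_len


def keep_pair(fwd_quals: List[int], rev_quals: List[int]) -> bool:
    """A pair is retained only if both mates pass the length filter."""
    return all(
        keep_read(len(q), truncate_by_window(q)) for q in (fwd_quals, rev_quals)
    )
```

Requiring both mates to pass mirrors steps (3) and (4): once one read of a pair is rejected, its companion is discarded as well.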
Raw sequence files were demultiplexed by running the command split_libraries_fastq.py (QIIME 1.9.1)32 to extract forward and reverse reads. Then, the DADA2 pipeline was used to generate an OTU table and the associated phylogenetic tree36. After quality filtering of sequences and processing using error correction models, amplicon sequence variants (ASV) were identified. Chimeric sequences were also removed. Across all samples, including QC samples, there was an average of 15,960 reads per sample, and 22,065 sequence features were identified. These sequence features were then aligned against the SILVA v128 database to obtain taxonomy information37. Data have been deposited in the Sequence Read Archive (SRA) under BioProject ID PRJNA673212: http://www.ncbi.nlm.nih.gov/bioproject/673212.
To calculate the alpha and beta diversities of the microbiome, we first rarefied all samples to 10,000 reads so that all samples were comparable. Samples with sequencing depth less than 10,000 were removed. Results at other rarefaction depths (5000 and 1000 reads) were quantitatively similar. Microbiome alpha diversities (number of species, Chao1, Shannon index, and phylogenetic diversity) and beta diversities (Bray-Curtis dissimilarity, weighted UniFrac distance, and unweighted UniFrac distance) were calculated using the phyloseq package in R (3.6.2)38. Associations between microbiome alpha diversities and disease diagnosis, and between demographic variables and disease diagnosis, were assessed using linear regression models or Fisher’s exact test for continuous or categorical outcomes, respectively. MiRKAT39 was used to assess the association between microbiome beta diversities and disease diagnosis. All statistical analyses were conducted in R (3.6.2).
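To make the rarefaction step and two of the diversity measures concrete, the following Python sketch shows the underlying calculations. It is not the phyloseq code used in the study: the function names and the dictionary-based count format are hypothetical, the random seed is arbitrary, and the Shannon index is computed with the natural logarithm (an assumption about the log base).

```python
import math
import random
from collections import Counter
from typing import Dict, Optional


def rarefy(counts: Dict[str, int], depth: int, seed: int = 0) -> Optional[Dict[str, int]]:
    """Subsample reads without replacement to a fixed depth.

    Returns None for samples with fewer than `depth` reads, which are dropped.
    """
    if sum(counts.values()) < depth:
        return None
    # Expand counts into one entry per read, then draw `depth` reads.
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    drawn = random.Random(seed).sample(pool, depth)
    return dict(Counter(drawn))


def shannon(counts: Dict[str, int]) -> float:
    """Shannon index H = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values() if n > 0)


def bray_curtis(a: Dict[str, int], b: Dict[str, int]) -> float:
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / (sum a_i + sum b_i)."""
    taxa = set(a) | set(b)
    numerator = sum(abs(a.get(t, 0) - b.get(t, 0)) for t in taxa)
    return numerator / (sum(a.values()) + sum(b.values()))
```

Rarefying before computing diversity keeps every retained sample at the same total read count, so differences in these indices are not driven by sequencing depth alone.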
For analyses of individual taxa, we focused on the genus level. We excluded rare amplicon sequence variants (ASV, present in
Ethical approval and consent to participate
Prior to implementation, the project was reviewed and approved by KPH’s Institutional Review Board. Only authorized KPH personnel had access to personally identifiable data. For statistical comparisons with fecal microbiome profiles, KPH provided the NCI with a limited dataset including demographic data (gender, age, race/ethnicity); the length of KPH membership; FIT result; date of FIT; diagnosis and date of colonoscopy; types of gastrointestinal surgery and dates; most recent height and weight; presence/absence of type 2 diabetes; drug prescriptions within 365 days of the FIT; history of Crohn’s disease or ulcerative colitis; and other common major clinical diagnoses. Because the project analyzed coded data and samples previously collected for clinical care, the NIH Office of Human Research Subjects Protection determined (#12694) that the proposed project was not research on human subjects as defined by 45 CFR 46 and was therefore exempt from review by the NIH Institutional Review Board.