Strand-Specific Bulk RNA-Seq Library Preparation from Total RNA Using Poly(A) Selection and dUTP Second-Strand Marking (HEK293T)

Experimental Design

Three biological replicates per condition (control vs. treated), each from a separately cultured HEK293T flask passaged ≤20 times. One technical replicate (an independent library prepared from pooled reference RNA) is included in every batch to estimate technical variance. Input is fixed at 500 ng total RNA (RIN ≥ 8.0). Libraries are processed in randomized plate order to avoid positional batch effects, and the operator is blinded to condition via barcoded tube labels. A single fragmentation condition (94°C, 8 min) targets 200-300 bp inserts. Twelve unique dual-index (UDI) barcodes allow all 6 samples plus controls to be pooled on one lane.
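The randomized plate order described above can be made reproducible by drawing it from a seeded random number generator and recording the seed in the lab notebook. The following is a minimal sketch only: the sample names, the choice of row A, and the seed value are hypothetical placeholders, not part of the protocol.

```python
import random


def randomized_layout(sample_ids, wells, seed):
    """Assign samples to wells in random order; a fixed seed makes the layout reproducible."""
    if len(sample_ids) > len(wells):
        raise ValueError("more samples than available wells")
    rng = random.Random(seed)
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    # zip stops at the shorter sequence, so unused wells are simply left empty
    return dict(zip(wells, shuffled))


# Hypothetical sample names: 3 control, 3 treated, plus the controls named in this protocol
samples = [f"ctrl_{i}" for i in (1, 2, 3)] + [f"trt_{i}" for i in (1, 2, 3)]
samples += ["pooled_ref", "UHRR", "NTC"]
wells = [f"A{col}" for col in range(1, 13)]  # assumption: first row of a 96-well plate

layout = randomized_layout(samples, wells, seed=20240101)
for well, sample in layout.items():
    print(well, sample)
```

To keep the operator blinded, the printed mapping would be held by a second person and only the barcoded tube labels handed over.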

Safety Notes

BSL-1 for HEK293T culture where institutional policy allows (some biosafety committees assign human cell lines to BSL-2); use BSL-2 practices if cells are transfected with viral vectors. TRIzol and phenol-chloroform are corrosive and toxic: work in a chemical fume hood, wear nitrile gloves, lab coat, and goggles, and collect phenol waste in the designated halogenated/organic waste stream. Guanidinium thiocyanate-containing solutions (TRIzol and many column lysis buffers) must never contact bleach, which can release toxic gases including hydrogen cyanide. Ethanol is flammable. Dispose of biological waste per the institutional biosafety protocol. To control RNase contamination, use dedicated reagents and clean gloves.

Controls

Positive control: Universal Human Reference RNA (UHRR) processed in parallel to confirm library construction and provide a cross-batch anchor. Negative/no-template control: water through the full workflow to detect adapter-dimer contamination and reagent carryover (should yield no quantifiable Qubit signal and no Bioanalyzer peak). ERCC spike-in mix (1 µL of 1:100 dilution per 500 ng) as an internal quantitative control for dynamic range and strand specificity. A no-RT control on one sample confirms absence of gDNA contamination.

Expected Results

Final library yield 10-50 nM in 20 µL. Bioanalyzer shows a single, roughly symmetric library peak at ~320-420 bp (200-300 bp insert plus ~120 bp of adapter sequence) and <2% adapter dimer (~120 bp). After sequencing (PE150), >90% of reads pass filter, duplication is <25%, rRNA contamination <5%, exonic mapping rate >80%, and strand specificity >95% (RSeQC). ERCC log2(observed) vs. log2(expected) R² > 0.92 across the dynamic range.
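The ERCC criterion above is the squared Pearson correlation between log2 expected concentration and log2 observed abundance. A minimal sketch of that calculation follows. The pseudocount of 1 and the example numbers are assumptions for illustration; real expected concentrations come from the spike-in manufacturer's table, and undetected spike-ins are often excluded before fitting.

```python
import math


def ercc_r_squared(expected_conc, observed_counts, pseudocount=1.0):
    """R^2 of the least-squares line through log2(expected) vs log2(observed + pseudocount)."""
    if len(expected_conc) != len(observed_counts) or len(expected_conc) < 3:
        raise ValueError("need at least 3 paired values")
    x = [math.log2(e) for e in expected_conc]
    y = [math.log2(o + pseudocount) for o in observed_counts]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # For simple linear regression, R^2 equals the squared Pearson correlation
    return (sxy * sxy) / (sxx * syy)


# Illustrative values only (not real ERCC concentrations or counts)
expected = [1, 4, 16, 64, 256, 1024]
observed = [3, 10, 50, 180, 800, 3100]
r2 = ercc_r_squared(expected, observed)
print(f"ERCC R^2 = {r2:.3f}; passes 0.92 threshold: {r2 > 0.92}")
```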

Materials & Reagents

HEK293T total RNA, extracted by guanidinium-phenol (TRIzol) or column kit, DNase-treated, RIN ≥ 8.0, 50-100 ng/µL
Oligo-d(T)25 magnetic beads (e.g., NEB poly(A) capture module), 2x bead binding buffer
RNA fragmentation/priming buffer (containing random hexamers)
First-strand synthesis: reverse transcriptase (RNase H-) + Actinomycin D 5 ng/µL (inhibits DNA-templated synthesis by the RT, suppressing spurious second-strand products during first-strand synthesis)
Second-strand mix containing dUTP in place of dTTP (dNTP/dUTP mix: 10 mM dA/dC/dG, 10 mM dUTP)
USER enzyme (UDG + Endonuclease VIII) for uracil excision
End-repair / dA-tailing enzyme mix
T4 DNA ligase + Illumina TruSeq-style methylated, forked adapters
AMPure XP SPRI beads
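KAPA HiFi HotStart polymerase (standard, uracil-intolerant version; do NOT substitute a uracil-tolerant polymerase, which could amplify any residual dUTP-marked second strand after USER treatment)
Nuclease-free water, freshly prepared 80% ethanol, 10 mM Tris-HCl pH 8.0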
Qubit RNA HS and dsDNA HS assay kits; Agilent Bioanalyzer/TapeStation HS reagents

Objective

To construct directional (strand-specific) Illumina-compatible RNA-seq libraries from 500 ng of high-quality HEK293T total RNA via poly(A) capture, fragmentation, dUTP-marked second-strand synthesis, adapter ligation, and PCR enrichment. The protocol targets a final library insert size of 200-300 bp and a yield sufficient for 30-40 million paired-end reads per sample, enabling quantification of protein-coding and lncRNA transcripts with correct strand orientation preserved (>95% of reads assigned to the expected strand).

Procedure

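Verify RIN ≥ 8.0 on TapeStation, then dilute 500 ng total RNA to 50 µL in nuclease-free water, including 1 µL of 1:100 ERCC spike-in mix (see Controls).
Add 50 µL oligo-d(T) beads in 2x binding buffer; heat at 65°C for 5 min to denature secondary structure, then cool to room temperature for 5 min to anneal poly(A) tails to the beads.
Place on magnet and wash twice with bead wash buffer to remove rRNA and other non-polyadenylated RNA.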
Elute mRNA in 11.5 µL fragmentation/priming buffer; fragment 94°C 8 min, then immediately place on ice.
First-strand synthesis: add RT mix with Actinomycin D; 25°C 10 min, 42°C 15 min, 70°C 15 min.
Second-strand synthesis with dUTP mix: 16°C 60 min. Purify with 1.8x AMPure XP, elute 50 µL.
End-repair + dA-tailing: 20°C 30 min, 65°C 30 min.
Ligate forked UDI adapters (T4 ligase) at 20°C for 15 min; perform two successive 0.9x SPRI cleanups to remove adapter dimers.
USER digestion: 37°C 15 min to destroy the dUTP-marked second strand.
PCR enrichment with KAPA HiFi: 98°C 45 s; 12-14 cycles of [98°C 15 s, 60°C 30 s, 72°C 30 s]; 72°C 1 min.
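Dual-sided SPRI (0.6x to bind and discard large fragments, keeping the supernatant; then bring to 0.8x total to bind the target) to select 320-420 bp final fragments; elute in 20 µL 10 mM Tris-HCl pH 8.0.
QC: Qubit dsDNA HS for concentration; Bioanalyzer for size (expected library peak ~320-420 bp, i.e. 200-300 bp insert + ~120 bp adapters).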
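The Qubit reading (ng/µL) and the Bioanalyzer mean fragment size from the QC step can be combined into a molar concentration for comparison with the yield range given under Expected Results. The sketch below uses the common approximation of 660 g/mol per base pair of double-stranded DNA; the example concentration and size are illustrative, not measured values.

```python
def library_molarity_nm(conc_ng_per_ul, mean_fragment_bp, g_per_mol_per_bp=660.0):
    """Convert a dsDNA mass concentration to nM.

    nM = (ng/uL * 1e6) / (660 g/mol/bp * mean fragment length in bp)
    """
    if conc_ng_per_ul < 0 or mean_fragment_bp <= 0:
        raise ValueError("concentration must be >= 0 and fragment size > 0")
    return conc_ng_per_ul * 1e6 / (g_per_mol_per_bp * mean_fragment_bp)


# Illustrative reading: 4.0 ng/uL at a mean library size of 370 bp
molarity = library_molarity_nm(4.0, 370)
print(f"{molarity:.1f} nM")
```

With these example inputs the result is about 16 nM, inside the 10-50 nM range the protocol expects.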

Variables

Independent variable: experimental condition (control vs. treated HEK293T). Dependent variables: per-gene read counts / normalized expression (TPM), library yield (nM), insert-size distribution, and percent strand specificity. Controlled variables: input mass (500 ng), RIN threshold (≥8.0), fragmentation time (8 min), PCR cycle number (held constant within batch), adapter concentration, passage number (≤20), and bead lot.

Hypothesis

If poly(A)+ mRNA is selected, chemically fragmented, and the second strand is synthesized with dUTP followed by USER/UDG digestion prior to PCR, then only the first-strand-derived cDNA will amplify, yielding libraries in which read strandedness faithfully reports transcript orientation (expected >95% strand specificity by RSeQC infer_experiment.py), distinguishing sense from antisense transcription at overlapping loci.
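The >95% figure in the hypothesis can be read directly from the text report that RSeQC infer_experiment.py prints for paired-end data. The sketch below parses that report and computes the fraction of strand-assignable reads that follow the reverse pattern expected for dUTP libraries. Normalizing by assignable reads rather than all reads is a choice made here, and the report numbers are invented for illustration.

```python
import re

# Matches lines such as: Fraction of reads explained by "1+-,1-+,2++,2--": 0.9650
PATTERN = re.compile(r'Fraction of reads explained by "([^"]+)":\s*([0-9.]+)')

FORWARD = "1++,1--,2+-,2-+"  # read 1 on the transcript strand
REVERSE = "1+-,1-+,2++,2--"  # read 1 on the opposite strand (expected for dUTP libraries)


def dutp_strand_specificity(report_text):
    """Fraction of strand-assignable read pairs that match the reverse (dUTP) pattern."""
    fractions = {m.group(1): float(m.group(2)) for m in PATTERN.finditer(report_text)}
    forward = fractions.get(FORWARD, 0.0)
    reverse = fractions.get(REVERSE, 0.0)
    if forward + reverse == 0:
        raise ValueError("no strand fractions found in report")
    return reverse / (forward + reverse)


# Invented example report in the paired-end output layout
example_report = '''This is PairEnd Data
Fraction of reads failed to determine: 0.0150
Fraction of reads explained by "1++,1--,2+-,2-+": 0.0200
Fraction of reads explained by "1+-,1-+,2++,2--": 0.9650
'''
specificity = dutp_strand_specificity(example_report)
print(f"strand specificity = {specificity:.3f}; meets 95% target: {specificity > 0.95}")
```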

Data Analysis

Demultiplex with bcl2fastq/BCL Convert. Trim adapters with fastp. Align to GRCh38 with STAR (two-pass) or quantify with Salmon (--libType ISR for dUTP). QC with FastQC, RSeQC (infer_experiment.py, read_distribution.py, geneBody_coverage.py), and Picard CollectRnaSeqMetrics. Generate gene-level counts via featureCounts (-s 2 for reverse strandedness) or tximport from Salmon. Normalize with DESeq2 median-of-ratios or edgeR TMM; report TPM for visualization.
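Strandedness settings are easy to mismatch between tools, so it can help to build the command lines in one place. The sketch below only assembles and prints the Salmon and featureCounts invocations with the strand options named above; it does not run them. All file and directory names are hypothetical, the thread count is arbitrary, and `--countReadPairs` is assumed to be available (it is needed for fragment counting in Subread 2.0.2 and later; older releases count fragments with `-p` alone).

```python
import shlex


def salmon_quant_cmd(index_dir, r1, r2, out_dir, threads=8):
    """Salmon quantification with library type ISR (inward, stranded, read 1 reverse)."""
    return ["salmon", "quant", "-i", index_dir, "-l", "ISR",
            "-1", r1, "-2", r2, "-p", str(threads), "-o", out_dir]


def featurecounts_cmd(gtf, bam, out_file, threads=8):
    """featureCounts for paired-end, reversely stranded libraries (-s 2)."""
    return ["featureCounts", "-p", "--countReadPairs", "-s", "2",
            "-T", str(threads), "-a", gtf, "-o", out_file, bam]


# Hypothetical paths, for illustration only
salmon = salmon_quant_cmd("salmon_index", "ctrl_1_R1.fastq.gz", "ctrl_1_R2.fastq.gz", "quant/ctrl_1")
fcounts = featurecounts_cmd("genes.gtf", "ctrl_1.bam", "counts/ctrl_1.txt")
print(shlex.join(salmon))
print(shlex.join(fcounts))
```

Keeping the commands as argument lists means they can later be passed to subprocess.run without shell quoting problems.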

Troubleshooting

1. Low library yield / no peak: check input RIN and poly(A) capture efficiency; increase PCR by 1-2 cycles, but avoid over-amplification (raises duplication).
2. Strong adapter-dimer peak (~120 bp): titrate adapter down (use 0.5x for low input) and repeat a 0.8x SPRI cleanup.
3. Poor strand specificity (<90%): verify that Actinomycin D was added and that dUTP (not dTTP) was used in the second-strand mix; confirm the USER step occurred before PCR and that a uracil-tolerant polymerase was not substituted.
4. High rRNA (>10%): poly(A) selection was inefficient. Repeat bead binding with fresh oligo-d(T) beads, make sure both washes were completed, and recheck RNA integrity.
5. Broad/large insert size: lengthen fragmentation time (shorter fragmentation gives longer fragments) or increase the first SPRI ratio in the dual-sided selection to exclude more large fragments.
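When adjusting the first SPRI ratio for size selection, the bead volumes for the two additions are a frequent source of error. The sketch below works out both volumes under the usual convention that each ratio refers to the original sample volume and that the full supernatant is carried into the second step; both points are assumptions to check against the bead manufacturer's guidance.

```python
def double_sided_spri_volumes(sample_ul, first_ratio, final_ratio):
    """Return (first_bead_ul, second_bead_ul) for a dual-sided SPRI selection.

    First addition binds large fragments (beads discarded, supernatant kept).
    Second addition raises the total bead ratio to final_ratio to bind the target.
    """
    if sample_ul <= 0 or not 0 < first_ratio < final_ratio:
        raise ValueError("need sample_ul > 0 and 0 < first_ratio < final_ratio")
    first_bead_ul = first_ratio * sample_ul
    second_bead_ul = (final_ratio - first_ratio) * sample_ul
    return first_bead_ul, second_bead_ul


# 0.6x then 0.8x on a 50 uL PCR reaction, as in the Procedure
first, second = double_sided_spri_volumes(50, 0.6, 0.8)
print(f"first addition: {first:.1f} uL beads; second addition: {second:.1f} uL beads")
```

For a 50 µL sample this gives 30 µL of beads for the first cut and a further 10 µL for the second, not a fresh 40 µL.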

Statistical Analysis

Differential expression with the DESeq2 negative-binomial Wald test, n = 3 biological replicates per group, Benjamini-Hochberg FDR correction at α = 0.05, with independent filtering and apeglm log2FC shrinkage. Power: the achievable power depends strongly on biological CV. With 3 replicates, ~2-fold changes in moderately expressed genes (baseMean > 50) are detectable at roughly 80% power only when CV is low (about 0.2, as is common for cell-line replicates); at CV 0.4 a normal-approximation estimate calls for about six replicates per group, so the observed CV should be checked. Report adjusted p-values and effect sizes; flag genes with |log2FC| > 1 and padj < 0.05.
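Power statements for a design like this one are sensitive to the assumed biological CV. As a rough check, the sketch below applies a widely used normal approximation for negative-binomial counts, n = 2 (z_{1-α/2} + z_power)² (1/μ + CV²) / (ln FC)², where μ is the mean count per gene. This formula is an assumption introduced here for illustration, not the method behind the figure quoted above, and it uses an unadjusted α, so it is optimistic relative to FDR-controlled testing.

```python
import math
from statistics import NormalDist


def replicates_per_group(fold_change, cv, mean_count, alpha=0.05, power=0.80):
    """Approximate replicates per group needed to detect a fold change (normal approximation)."""
    if fold_change <= 0 or fold_change == 1 or mean_count <= 0:
        raise ValueError("fold_change must be positive and != 1; mean_count must be > 0")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided test
    z_power = std_normal.inv_cdf(power)
    variance_term = 1.0 / mean_count + cv ** 2   # Poisson noise + biological variation
    return 2 * (z_alpha + z_power) ** 2 * variance_term / math.log(fold_change) ** 2


for cv in (0.2, 0.4):
    n = replicates_per_group(fold_change=2, cv=cv, mean_count=50)
    print(f"CV {cv}: about {math.ceil(n)} replicates per group")
```

Under these assumptions the required number changes roughly threefold between CV 0.2 and CV 0.4, which is a reason to estimate CV from pilot or prior data before fixing the replicate number.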
