CRISPRon-BE(v1.0): CRISPR-Cas9 base editing guide design

Help


The CRISPRon-BE webserver is dedicated to the design of guide RNAs (gRNAs) for CRISPR/Cas9-derived base editing experiments. The webserver currently supports predictions for two types of base editors: adenine base editors (ABE, converting A:T base pairs to G:C base pairs) and cytosine base editors (CBE, converting C:G base pairs to T:A base pairs). The ABE model (CRISPRon-ABE) is trained on data from ABE7.10, while the CBE model (CRISPRon-CBE) is trained on data from BE4.

The CRISPRon-BE model can predict the nucleotide conversion efficiency induced by designed gRNAs using base editors, as well as the detailed frequency for each potential edited product. If you are interested in inducing a specific nucleotide conversion or correcting a point mutation at a particular genomic region, simply input your DNA sequence along with the specific nucleotide of interest into our CRISPRon-BE webpage. In approximately one minute, you will receive all potential designed gRNAs for the given DNA sequence and their corresponding edited products.

The webserver accepts three kinds of input: a custom input sequence, a gene name, or a genomic range. In the first case, the sequence can be any DNA sequence, whether or not it maps to the target genome; in the latter two cases, the target genome must be specified.

Format for INPUT-1: Targeting a custom input sequence


Paste a target DNA sequence in plain or FASTA format, as shown in the examples below, and press submit. If your sequence is from one of the organisms listed in the drop-down menu, it is recommended to select that organism in the form, even if your input sequence doesn't map perfectly to the target genome.
Example of a sequence in plain format:

AGGCAACCTGAGGACTTGTCTGAGGATGGGGCCGCAA

Example of a sequence in FASTA format:

>myseq_id
AGGCAACCTGAGGACTTGTCTGAGGATGGGGCCGCAA
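
The two accepted formats can be reduced to the same sequence string before submission. The following is a minimal Python sketch of that normalization; the function name `read_sequence` is hypothetical and not part of CRISPRon-BE, and the webserver's own parsing may differ.

```python
def read_sequence(text):
    """Return (identifier, sequence) from plain or single-record FASTA text.

    Hypothetical helper for illustration; not part of CRISPRon-BE.
    """
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty input")
    seq_id = None
    if lines[0].startswith(">"):
        # FASTA header: the identifier is the first word after '>'
        header = lines[0][1:].split()
        seq_id = header[0] if header else ""
        lines = lines[1:]
    # Sequence lines may be wrapped; join them into one string
    return seq_id, "".join(lines)


plain = "AGGCAACCTGAGGACTTGTCTGAGGATGGGGCCGCAA"
fasta = ">myseq_id\nAGGCAACCTGAGGACTTGTCTGAGGATGGGGCCGCAA\n"
print(read_sequence(plain))  # (None, 'AGGCAACCTGAGGACTTGTCTGAGGATGGGGCCGCAA')
print(read_sequence(fasta))  # ('myseq_id', 'AGGCAACCTGAGGACTTGTCTGAGGATGGGGCCGCAA')
```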

Hint: how to get a DNA sequence. A possible way to obtain the DNA sequence of a given region or gene is to search for it in the UCSC genome browser and then use View->DNA. If the region contains a mutation in your subject, edit the sequence accordingly before submitting it to CRISPRon-BE (see "Examples of input/output" below).

Input criteria. The input must contain only the five canonical bases 'A,a', 'C,c', 'G,g', 'T,t' and 'U,u', or unknown bases 'N,n'. Targets that include unknown bases are omitted from the output. Predictions are made for a possible target site only if its full 30 nt context, made of 4 nt + target (20 nt) + PAM (NGG) + 3 nt, is present in the given sequence.
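
As an illustration of these criteria, the sketch below checks the allowed characters and lists every 30 nt context (4 nt + 20 nt target + NGG + 3 nt) on the given strand. It is not the webserver's implementation: the names `check_input` and `candidate_contexts` are hypothetical, only the input strand is scanned, and how the server treats 'U' internally is not specified here, so the sketch leaves it unchanged.

```python
ALLOWED = set("ACGTUNacgtun")
CONTEXT_LEN = 30  # 4 nt + 20 nt target + 3 nt PAM + 3 nt


def check_input(seq):
    """Return seq in uppercase, or raise ValueError on unsupported characters."""
    bad = sorted(set(seq) - ALLOWED)
    if bad:
        raise ValueError(f"unsupported characters: {bad}")
    return seq.upper()


def candidate_contexts(seq):
    """Yield (start, context) for each 30-nt context with an NGG PAM.

    start is the 0-based offset of the context in seq. Contexts containing
    an unknown base (N) are skipped, mirroring the rule stated above.
    """
    seq = check_input(seq)
    for start in range(len(seq) - CONTEXT_LEN + 1):
        context = seq[start:start + CONTEXT_LEN]
        # The PAM occupies context[24:27]; NGG means context[25:27] == "GG"
        if context[25:27] == "GG" and "N" not in context:
            yield start, context


seq = "AGGCAACCTGAGGACTTGTCTGAGGATGGGGCCGCAA"
for start, context in candidate_contexts(seq):
    # columns: context offset, 20-nt target, PAM
    print(start, context[4:24], context[24:27])
```

A sequence shorter than 30 nt yields no candidates, because no complete context fits inside it.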

Format for INPUT-2: Targeting a gene


First, select one of the listed genomes in the drop-down menu. Then select a gene by typing its name in the form (a drop-down menu will appear after 3 characters) and press submit. In this mode, only exons (coding exons or UTRs) are considered as targets, together with the 300 nt sequence upstream and downstream of each transcript. If you wish to target introns, you must use one of the other input options. If either the genome or the gene is not listed, use the "Format for INPUT-1" and manually enter your target sequence as explained above. Input example:

SLC10A4

Format for INPUT-3: Targeting a genomic region


First, select one of the listed genomes in the drop down menu. If the genome you are interested in is not listed, please use the "Format for INPUT-1" described above. Then, type in a genomic region in the dedicated form, and press submit. The genomic region should be specified in the following format: chrN:start-end, where chrN is the identifier of the chromosome, and start/end mark the genomic range, as in the example below.

chr5:73,164,226-73,170,298
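
A region string in this format can be split into its parts as in the minimal sketch below. The function `parse_region` is a hypothetical helper, not part of CRISPRon-BE; it assumes that thousands separators are optional and that coordinates are inclusive, as the worked example further down suggests (chrX:32342195-32342231 for a 37 nt sequence).

```python
import re

# chrN:start-end, with optional thousands separators in the coordinates
REGION_RE = re.compile(r"^(chr[\w.]+):([\d,]+)-([\d,]+)$")


def parse_region(region):
    """Split 'chrN:start-end' into (chrom, start, end) with integer coordinates."""
    match = REGION_RE.match(region.strip())
    if match is None:
        raise ValueError(f"not in chrN:start-end format: {region!r}")
    chrom, start, end = match.groups()
    start = int(start.replace(",", ""))
    end = int(end.replace(",", ""))
    if start > end:
        raise ValueError("start must not exceed end")
    return chrom, start, end


chrom, start, end = parse_region("chr5:73,164,226-73,170,298")
print(chrom, start, end)   # chr5 73164226 73170298
print(end - start + 1)     # region length with inclusive coordinates: 6073
```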

Examples of CRISPRon-BE for gRNA design


Here we use one example to show how to use CRISPRon-BE for gRNA design.

The SNP NM_004006.3(DMD):c.5804G>A (p.Gly1935Asp) is recorded in the ClinVar database. It is a point mutation at chromosome X: 32342218 that changes “C” to “T” on the annotated (+) strand of the reference genome (hg38), which corresponds to a change from “G” to “A” on the (-) strand. To correct this SNP using a base editor, you can design a gRNA on the annotated (-) strand. This allows the nucleotide "A" in the patient to be corrected to "G" by ABE.

1. Prepare the target DNA sequence.

CRISPRon-BE requires a DNA sequence containing a PAM motif ("NGG") for prediction. In this case, the target nucleotide “A” is on the complementary strand, so we extract the sequence from the complementary (-) strand for gRNA design. ABE and CBE edit most efficiently at protospacer positions 3 to 10, counted from the PAM-distal end (the editing window, where most base pair conversions occur), so only SNPs located within this window can be corrected efficiently. Therefore, we extract 13 nt upstream and 23 nt downstream of the target nucleotide, 37 nt in total, for gRNA design.

For the target nucleotide located in the complementary (-) strand (5’ to 3’), the sequence from chrX:32342195-32342231 is extracted as follows:

AGGCAAGCTGAGGGCTTGTCTGAGGATGGGGCCGCAA
Next, manually change the 14th nucleotide from “G” to “A” to reflect the mutation:
AGGCAAGCTGAGGACTTGTCTGAGGATGGGGCCGCAA
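
The coordinate arithmetic and the manual substitution can be checked with a short Python sketch. The helper names `minus_strand_window` and `substitute` are hypothetical; the sketch assumes inclusive reference coordinates and does not fetch any sequence itself.

```python
def minus_strand_window(pos, upstream=13, downstream=23):
    """Return (start, end) on the reference for a window read on the (-) strand.

    On the (-) strand, upstream (5') lies at higher reference coordinates,
    so the window spans pos - downstream .. pos + upstream.
    """
    return pos - downstream, pos + upstream


def substitute(seq, position, ref, alt):
    """Replace the base at a 1-based position, checking the expected base first."""
    index = position - 1
    if seq[index] != ref:
        raise ValueError(f"expected {ref} at position {position}, found {seq[index]}")
    return seq[:index] + alt + seq[index + 1:]


start, end = minus_strand_window(32342218)
print(f"chrX:{start}-{end}")  # chrX:32342195-32342231

reference = "AGGCAAGCTGAGGGCTTGTCTGAGGATGGGGCCGCAA"
mutated = substitute(reference, 14, "G", "A")
print(mutated)  # AGGCAAGCTGAGGACTTGTCTGAGGATGGGGCCGCAA
```

Checking the expected reference base before substituting guards against an off-by-one error in the position.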

For convenience, CRISPRon-BE searches for potential gRNAs on both strands. If you instead input the DNA sequence "TTGCGGCCCCATCCTCAGACAAGTCCTCAGCTTGCCT", which is the reverse complement of the input sequence above, you will obtain the same designed gRNA sequences. In IGV, however, the rectangles representing the editing windows will appear in different colors and positions, because the gRNAs are then located on the opposite strand relative to the input sequence.
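
The relationship between the two input sequences can be verified with a few lines of Python. This is only a convenience sketch; `reverse_complement` is an illustrative helper, not a CRISPRon-BE function.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


forward = "AGGCAAGCTGAGGACTTGTCTGAGGATGGGGCCGCAA"
print(reverse_complement(forward))  # TTGCGGCCCCATCCTCAGACAAGTCCTCAGCTTGCCT
```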


2. Predict designed gRNA editing efficiency and outcome frequencies.

After pasting the sequence with the mutation into CRISPRon-BE and choosing the suitable base editor (ABE, CBE or both), we can generate potential designed gRNAs along with their predicted outcomes. Here we choose "both" editors as an example to show the results. In IGV, there are three different gRNAs capable of correcting the SNP. Each rectangle in IGV represents the editing window of a potential gRNA. For instance, there are two "A"s in the editing window of the third gRNA. To correct the mutation, we target the second "A" (14th nucleotide) within the editing window, while ensuring high editing efficiency specifically at this position and avoiding unintended editing of the first "A" (11th nucleotide) in the editing window. Clicking a rectangle (gRNA) in IGV shows brief information about that gRNA: the base editor type (BE_type), the strand relative to the input sequence (BE_strand), the sequence in the editing window (editing_window), the gRNA editing efficiency (total_gRNA_efficiency), the outcome sequence with the highest predicted frequency (main_outcome), its predicted frequency (outcome_frequency), and its location in the input sequence (Location).
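
To see why exactly three gRNAs can place the 14th nucleotide inside their editing window, the candidates can be enumerated with a minimal sketch. It is not the CRISPRon-BE code: `grnas_covering` is a hypothetical function that scans only the input strand, requires the full 30 nt context (4 nt + 20 nt + NGG + 3 nt), and takes the editing window as protospacer positions 3 to 10.

```python
def grnas_covering(seq, target_pos, window=(3, 10)):
    """List gRNAs on the input strand whose editing window contains target_pos.

    seq         input sequence (uppercase A/C/G/T)
    target_pos  1-based position of the nucleotide of interest in seq
    window      1-based protospacer positions delimiting the editing window
    """
    hits = []
    for ctx_start in range(len(seq) - 30 + 1):
        proto_start = ctx_start + 4              # 0-based start of the protospacer
        pam = seq[proto_start + 20:proto_start + 23]
        if pam[1:] != "GG":
            continue
        pos_in_proto = target_pos - proto_start  # 1-based position in protospacer
        if window[0] <= pos_in_proto <= window[1]:
            protospacer = seq[proto_start:proto_start + 20]
            edit_window = protospacer[window[0] - 1:window[1]]
            hits.append((protospacer, pam, pos_in_proto, edit_window))
    return hits


seq = "AGGCAAGCTGAGGACTTGTCTGAGGATGGGGCCGCAA"
for protospacer, pam, pos, edit_window in grnas_covering(seq, 14):
    print(protospacer, pam, pos, edit_window)
# GCTGAGGACTTGTCTGAGGA TGG 8 TGAGGACT
# CTGAGGACTTGTCTGAGGAT GGG 7 GAGGACTT
# TGAGGACTTGTCTGAGGATG GGG 6 AGGACTTG
```

The third column is the position of the target "A" within each protospacer; the last column is the editing window, which contains two "A"s in all three cases.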


[Figure: output1]

The table below provides detailed predictions of gRNA editing efficiency and outcome frequency. Hovering over a column in the table shows a description of that column. Only outcomes with frequency >= 1.0% are shown in the table; download the results to see the rest.


[Figure: output2]

If you are interested in the edited result from one specific gRNA, you can click on the gRNA ID or filter the table using the gRNA sequence or gRNA ID.

In this example, three gRNA sequences are available: “GCTGAGGACTTGTCTGAGGA”, “CTGAGGACTTGTCTGAGGAT” and “TGAGGACTTGTCTGAGGATG”. Their predicted gRNA editing efficiencies are 21.36, 52.51 and 40.93, respectively. Based solely on editing efficiency, the second gRNA (52.51) is the best choice, while the third gRNA (40.93) is also acceptable.

Because base editors can cause bystander edits, we also evaluated, for each gRNA, the predicted frequency of the outcome that corrects the mutation without any bystander edit. These outcome frequencies are 10.10, 34.24 and 40.46, respectively. The third gRNA stands out as the best option, with the highest outcome frequency and minimal bystander edits (all other edited outcomes have frequencies below 1). In contrast, the second gRNA shows stronger bystander effects: both "A"s in its editing window are predicted to be edited simultaneously with a frequency of 11.78.
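
The notion of a bystander outcome can be made concrete by listing the candidate products of a gRNA. The sketch below is illustrative only: `candidate_outcomes` is a hypothetical function, and it makes the simplifying assumption that only bases inside the editing window (protospacer positions 3 to 10) are converted, A to G for ABE and C to T for CBE. It enumerates products and does not predict their frequencies.

```python
from itertools import combinations


def candidate_outcomes(protospacer, editor="ABE", window=(3, 10)):
    """Enumerate products with one or more editable window bases converted."""
    source, product = {"ABE": ("A", "G"), "CBE": ("C", "T")}[editor]
    editable = [i for i in range(window[0] - 1, window[1])
                if protospacer[i] == source]
    outcomes = []
    for k in range(1, len(editable) + 1):
        for subset in combinations(editable, k):
            bases = list(protospacer)
            for i in subset:
                bases[i] = product
            # report the 1-based edited positions alongside the product
            outcomes.append(([i + 1 for i in subset], "".join(bases)))
    return outcomes


for positions, edited in candidate_outcomes("TGAGGACTTGTCTGAGGATG"):
    print(positions, edited)
# [3] TGGGGACTTGTCTGAGGATG
# [6] TGAGGGCTTGTCTGAGGATG
# [3, 6] TGGGGGCTTGTCTGAGGATG
```

For the third gRNA, protospacer position 6 is the 14th nucleotide of the input sequence, so the product edited only at position 6 is the desired correction; the other two products involve a bystander edit at position 3.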

In this example, we recommend using the third gRNA "TGAGGACTTGTCTGAGGATG" to correct this mutation. It offers both high gRNA editing efficiency and minimal bystander effects, making it the optimal choice.
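
The selection logic of this example can be summarized in a few lines of Python, using only the values quoted above. The ratio printed at the end is an assumption made for illustration: it treats the total gRNA efficiency and the outcome frequency as being on the same percentage scale, so that their quotient approximates the share of editing that yields exactly the desired product.

```python
grnas = [
    # (gRNA sequence, total gRNA efficiency, frequency of the clean correction)
    ("GCTGAGGACTTGTCTGAGGA", 21.36, 10.10),
    ("CTGAGGACTTGTCTGAGGAT", 52.51, 34.24),
    ("TGAGGACTTGTCTGAGGATG", 40.93, 40.46),
]

best_by_efficiency = max(grnas, key=lambda g: g[1])
best_by_clean_outcome = max(grnas, key=lambda g: g[2])
print(best_by_efficiency[0])     # CTGAGGACTTGTCTGAGGAT
print(best_by_clean_outcome[0])  # TGAGGACTTGTCTGAGGATG

for seq, efficiency, clean in grnas:
    # fraction of the predicted editing that gives only the desired edit
    print(seq, f"{clean / efficiency:.2f}")
# GCTGAGGACTTGTCTGAGGA 0.47
# CTGAGGACTTGTCTGAGGAT 0.65
# TGAGGACTTGTCTGAGGATG 0.99
```

Ranking by the clean outcome frequency rather than by total efficiency is what moves the third gRNA ahead of the second.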

The example above demonstrates how to use CRISPRon-BE to design gRNAs for correcting mutations in the genome; the same procedure applies to any DNA sequence given as input. CRISPRon-BE also supports designing gRNAs that induce mutations in the reference genome. Users can enter a genomic location or a gene name and obtain results similar to those from a DNA sequence extracted from the reference genome. In this scenario, both strands are searched and the designed gRNAs are shown in different colors in IGV for clarity. If you search for a coding gene, the resulting protein sequence for the canonical CDS is shown rather than the translation in all six reading frames.
[Figure: output3]


Citing


If you are using the results of the webserver in your publication, please cite:

Deep learning models simultaneously trained on multiple datasets improve base-editing activity prediction

Sun Y, Qu K, Corsi GI, Anthon C, Pan X, Xiang X, Jensen LJ, Lin L, Luo Y*, Gorodkin J* submitted
[ Paper | Webserver | Software ]

Feedback


We greatly appreciate your comments. Please use the feedback form, or e-mail us with your questions and comments.