CRISPRon(v1.0): CRISPR-Cas9 guide efficiency prediction

Help


The CRISPRon webserver is dedicated to the prediction of on-target cleavage efficiency by CRISPR-Cas9 at target sites designed from a given DNA sequence. Using the webserver, you can find possible CRISPR-Cas9 targets in an input sequence and obtain the matching gRNA and its predicted Cas9 cleavage efficiency (indel frequency obtained at approx. 3 days after gRNA delivery). For details about the method used to generate predictions, please see the About page.

This webserver is for on-target Cas9 cleavage efficiency predictions only. We suggest processing your selected gRNAs with the CRISPRoff webserver as well and selecting those that have not only high predicted on-target efficiency but also low predicted off-target potential. For more information about CRISPR off-targets, please visit the CRISPRoff webserver.

The webserver can be used with three different kinds of input: a genomic range, a gene name, or a custom input sequence. For the first two cases a target genome must be specified, while for the last one the given sequence may or may not map to a target genome.

Format for INPUT-1: Targeting a genomic region


First, select one of the listed genomes in the drop-down menu. If the genome you are interested in is not listed, please use the "Format for INPUT-3" described below. Then, type a genomic region in the dedicated form, and press submit. The genomic region should be specified in the following format: chrN:start-end, where chrN is the identifier of the chromosome, while start and end are the coordinates that delimit the region, as in the example below.

chr5:73,164,226-73,170,298
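
If you prepare regions programmatically, it can help to check the format before submitting. The following is a minimal Python sketch, not the webserver's own parser: the function name parse_region and the decision to accept coordinates with or without thousands separators are assumptions made for illustration.

```python
import re

# Pattern for "chrN:start-end"; commas inside the coordinates are optional.
REGION_RE = re.compile(r"^(?P<chrom>[^:\s]+):(?P<start>[\d,]+)-(?P<end>[\d,]+)$")


def parse_region(text):
    """Split 'chrN:start-end' into (chrom, start, end) with integer coordinates."""
    match = REGION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not in chrN:start-end format: {text!r}")
    # Drop thousands separators before converting to integers.
    start = int(match.group("start").replace(",", ""))
    end = int(match.group("end").replace(",", ""))
    if start > end:
        raise ValueError(f"start {start} is greater than end {end}")
    return match.group("chrom"), start, end


chrom, start, end = parse_region("chr5:73,164,226-73,170,298")
print(chrom, start, end)
```

The sketch only checks the syntax of the region; whether the chromosome name exists in the selected genome is still decided by the webserver.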

Format for INPUT-2: Targeting a gene


First, select one of the listed genomes in the drop-down menu. Then, select a gene in the form by typing its name (a drop-down menu will appear after 3 characters), and press submit. In this mode, only exons (coding exons or UTRs) will be considered as targets. If you wish to target introns, you must use one of the other input options. If either the genome or the gene is not listed, you should use the "Format for INPUT-3" and manually enter your target sequence as explained below. In addition to targets in exons, targets in an additional 300 nt sequence upstream and downstream of each transcript are also shown in the output.

SLC10A4

Format for INPUT-3: Targeting a custom input sequence


Paste a target DNA sequence in plain or in fasta format, as shown in the examples below, and press submit. If your sequence is from one of the organisms listed in the drop-down menu, it is recommended to select that organism in the form, even if your input sequence doesn't map perfectly to the target genome.
Example of sequence in plain format:

ATCGTTGCGTACGGTACGTCCTGACGTAGGGCACGCTCGATCGAGTTCGGACCTGTAGGGATCGAGGCTTGTACGGACC
TCACGATCGATCCCGATCGGAATGC

Example of sequence in fasta format:

>myseq_id
ATCGTTGCGTACGGTACGTCCTGACGTAGGGCACGCTCGATCGAGTTCGGACCTGTAGGGATCGAGGCTTGTACGGACC
TCACGATCGATCCCGATCGGAATGC
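
Both formats above carry the same sequence; the fasta version only adds a header line. As a minimal Python sketch of how the two can be read into one string (the function name read_sequence is hypothetical, and a single record per input is assumed):

```python
def read_sequence(text):
    """Return (identifier, sequence) from plain or single-record fasta text."""
    # Ignore blank lines and surrounding whitespace.
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    identifier = None
    if lines and lines[0].startswith(">"):
        # fasta header: keep the first word after '>' as the identifier.
        header = lines[0][1:].split()
        identifier = header[0] if header else None
        lines = lines[1:]
    # Sequence lines may be wrapped; join them into one string.
    return identifier, "".join(lines)


plain = "ATCGTTGCGTACGG\nTACGTCCTGACG\n"
fasta = ">myseq_id\nATCGTTGCGTACGG\nTACGTCCTGACG\n"
print(read_sequence(plain))
print(read_sequence(fasta))
```

In plain format the identifier is returned as None, since there is no header to take it from.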
Hint: how to get a DNA sequence. A possible way to obtain the DNA sequence of a given region or gene is to search for it in the UCSC genome browser and then use View->DNA. If the region contains a mutation in your subject, edit the sequence accordingly before submitting it to CRISPRon (see "Examples of input/output" below).

Input criteria. The input must contain only the five canonical bases 'A,a', 'C,c', 'G,g', 'T,t' and 'U,u', or unknown bases 'N,n'. Targets that include unknown bases are omitted from the output. Predictions are made for a possible target site only if its full 30 nt context is present in the given sequence: 4 nt upstream + target (20 nt) + PAM (NGG, 3 nt) + 3 nt downstream.
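
The input criteria can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the webserver's code: the function names are hypothetical, 'U' is converted to 'T', and a site is skipped when any of its 30 nt contains 'N' (the text above only states that targets including unknown bases are omitted).

```python
ALLOWED = set("ACGTUN")
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")
CONTEXT = 30  # 4 nt upstream + 20 nt target + 3 nt PAM + 3 nt downstream


def normalize(seq):
    """Upper-case the input, reject unsupported characters, map U to T."""
    seq = seq.upper()
    bad = set(seq) - ALLOWED
    if bad:
        raise ValueError(f"unsupported characters: {sorted(bad)}")
    return seq.replace("U", "T")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def candidate_sites(seq):
    """Yield (strand, context_start, target, pam) for every complete 30 nt context.

    context_start is 0-based and relative to the scanned strand, so for the
    '-' strand it refers to the reverse-complemented sequence.
    """
    seq = normalize(seq)
    for strand, scanned in (("+", seq), ("-", reverse_complement(seq))):
        for i in range(len(scanned) - CONTEXT + 1):
            window = scanned[i:i + CONTEXT]
            if "N" in window:
                continue  # assumption: skip contexts with unknown bases
            if window[25:27] == "GG":  # PAM is window[24:27], pattern NGG
                yield strand, i, window[4:24], window[24:27]


example = "AAAA" + "ACGTACGTACGTACGTACGT" + "TGG" + "AAA"
for site in candidate_sites(example):
    print(site)
```

A sequence shorter than 30 nt yields no sites, which matches the requirement that the full context be present.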

Email and custom job name


It is not necessary that you provide an email address or a custom job name. If you do provide an email address we will inform you by email when the computation for your job finishes. The email will include the optional job name (if specified, "crispron" otherwise) and a link that you can use to access the results for your job. The results are stored on the server for 14 days; after that, your results and your email address are deleted from the server.

Examples of input/output


Here we show the input-output for a search done using a custom sequence mapping in part to the human genome assembly hg38. The output of the other input types is similar, but extra care is needed when the input sequence is different from the reference assembly.

Input

For this example we input a custom sequence, which consists of a portion of the IGF1R gene carrying the rs1409058783 G>A mutation at position chr15:98707586. The DNA sequence of the region spanning 100 nt on either side of the mutation was retrieved from the UCSC genome browser and manually edited as follows:

Wild-type sequence:
CTGTATTATTGTTTGGAAAATAGTTTAAAAATTATTTCCTTCTAACTGAGACGTTTACCCTCTTGTCTCCCTTCAGTCT
GCGGGCCAGGCATCGACATCCGCAACGACTATCAGCAGCTGAAGCGCCTGGAGAACTGCACGGTGATCGAGGGCTACC
TCCACATCCTGCTCATCTCCAAGGCCGAGGACTACCGCAGCTAC
Sequence carrying rs1409058783 G>A mutation:
CTGTATTATTGTTTGGAAAATAGTTTAAAAATTATTTCCTTCTAACTGAGACGTTTACCCTCTTGTCTCCCTTCAGTCT
GCGGGCCAGGCATCGACATCCACAACGACTATCAGCAGCTGAAGCGCCTGGAGAACTGCACGGTGATCGAGGGCTACC
TCCACATCCTGCTCATCTCCAAGGCCGAGGACTACCGCAGCTAC
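
The manual edit shown above can also be done with a short script. The sketch below is only an illustration: the helper apply_snv is hypothetical, and the 0-based index 100 follows from the 100 nt of flanking sequence to the left of the mutation.

```python
def apply_snv(seq, index, ref, alt):
    """Replace the base at 0-based `index` after checking it equals `ref`."""
    if seq[index].upper() != ref.upper():
        raise ValueError(f"expected {ref} at index {index}, found {seq[index]}")
    return seq[:index] + alt + seq[index + 1:]


# Wild-type sequence from the example, with the line breaks removed.
wild_type = (
    "CTGTATTATTGTTTGGAAAATAGTTTAAAAATTATTTCCTTCTAACTGAGACGTTTACCCTCTTGTCTCCCTTCAGTCT"
    "GCGGGCCAGGCATCGACATCCGCAACGACTATCAGCAGCTGAAGCGCCTGGAGAACTGCACGGTGATCGAGGGCTACC"
    "TCCACATCCTGCTCATCTCCAAGGCCGAGGACTACCGCAGCTAC"
)

# 100 nt of left flank means the variant sits at 0-based index 100.
mutant = apply_snv(wild_type, 100, "G", "A")
print(mutant)
```

Checking the reference base before replacing it guards against off-by-one errors in the position, which are easy to make when converting genomic coordinates to string indices.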

As explained above, the input is provided to the webserver by selecting the hg38 genome and pasting the target sequence in the appropriate field.

>hg38_dna
CTGTATTATTGTTTGGAAAATAGTTTAAAAATTATTTCCTTCTAACTGAGACGTTTACCCTCTTGTCTCCCTTCAGTCT
GCGGGCCAGGCATCGACATCCACAACGACTATCAGCAGCTGAAGCGCCTGGAGAACTGCACGGTGATCGAGGGCTACCT
CCACATCCTGCTCATCTCCAAGGCCGAGGACTACCGCAGCTAC

Output

The output consists of an interactive view implemented in the IGV (Integrative Genomics Viewer) browser and a table of results, in which details are reported for each predicted target with efficiency >= 50% and repeat masked nucleotides <= 20%. The results can also be downloaded as a zip folder by clicking on "Download results". The folder will include the results table in csv format and bed files of the predicted targets. Note: you can open the csv table in Excel; just remember to save it as xlsx (Excel file) if you edit it with colors, font changes, filters etc. Instructions on how to navigate through the results in the IGV interactive view and in the table view are given below. To see the results in the UCSC genome browser instead, click on "View in UCSC".

The IGV interactive view


An example of the IGV browser view is displayed in the figure below, where the genome view is at the top and the region view is at the bottom. The genome view is relative to where your custom sequence maps in the genome, while the region view is relative to your custom sequence. You can select one of the two views by clicking on the tabs located in the top left margin of the IGV view.

If the given sequence matches a region in the specified genome, then the matching region is shown in the genome view. Note that this view shows only the reference sequence and does not include any mutations present in the sequence given as input. Mutations are instead visible in the region view, whose coordinates are relative to the input sequence. In the image below, the mutated nucleotide is highlighted with a black dotted box in both views. The region view does not contain annotations (transcripts, repeat masked nucleotides, etc.).

Target regions, which represent potential gRNAs, are shown as horizontal bars. The efficiency is reflected in the color of the bar, which follows a color scale from blue to light green. The directionality is shown by small arrows within the bars. Additional information can be obtained by clicking on a bar. Each target has a unique identifier of the form m_xx or p_xx, where m and p signal the strand (+ or -) and xx is the start position of the target within the given region.

CRISPRon output view

The table view


The table view shows the details of all good targets, which are those targets with efficiency >= 50% and repeat masked nucleotides <= 20% (a table including all possible targets can be obtained in csv format by clicking on "Download results"). The table can be sorted by clicking on the header of a column. The direction of the sorting will then be indicated by the triangles next to the column's name. By default, the table is sorted in descending order based on the efficiency column (eff.(%)), hence the targets in the given region at which Cas9-gRNA cleavage is predicted to be most efficient will be at the top.
You can filter the table by typing a filter in the first row.

Click on the id of a target/gRNA to jump to its location on the IGV view!



CRISPRon output table
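
The same "good target" selection can be reproduced on the downloaded csv table, which includes all possible targets. The sketch below uses Python's standard csv module; the column names eff and repeat_masked_pct and the demo rows are made up for illustration, so check the header of your downloaded file and adjust the names accordingly.

```python
import csv
import io


def good_targets(rows, eff_col="eff", repeat_col="repeat_masked_pct",
                 min_eff=50.0, max_repeat=20.0):
    """Keep rows passing both thresholds, sorted by descending efficiency."""
    kept = [
        row for row in rows
        if float(row[eff_col]) >= min_eff and float(row[repeat_col]) <= max_repeat
    ]
    return sorted(kept, key=lambda row: float(row[eff_col]), reverse=True)


# Made-up demo table; the real file comes from "Download results".
demo_csv = """id,eff,repeat_masked_pct
p_12,62.5,0
m_40,48.0,0
p_77,71.3,35
m_90,80.1,10
"""

rows = list(csv.DictReader(io.StringIO(demo_csv)))
for row in good_targets(rows):
    print(row["id"], row["eff"])
```

To use it on a real file, replace io.StringIO(demo_csv) with an open file handle and pass the actual column names through eff_col and repeat_col.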

Citing CRISPRon


If you use results from CRISPRon in your publication, please cite:

Enhancing CRISPR-Cas9 gRNA efficiency prediction by data integration and deep learning

Xiang X, Corsi GI, Anthon C, Qu K, Pan X, Liang X, Han P, Dong Z, Liu L, Zhong J, Ma T, Wang J, Zhang X, Jiang H, Xu F, Liu X, Xu X, Wang J, Yang H, Bolund L, Church GM, Lin L, Gorodkin J*, Luo Y* Nature Communications 2021, 12(1)
[ Paper | Webserver | Software ]

CRISPRon is trained on data reported in the publication above and in a 2019 paper by Kim et al.; please cite that as well.

SpCas9 activity prediction by DeepSpCas9, a deep learning-based model with high generalization performance

Kim HK, Kim Y, Lee S, Min S, Bae JY, Choi JW, Park J, Jung D, Yoon S, Kim HH Sci Adv. 2019 Nov 6;5(11):eaax9249. eCollection 2019 Nov
[ PubMed | Paper ]

Finally, if you use the interaction energies reported by the CRISPRoff tool, please cite:

CRISPR-Cas9 off-targeting assessment with nucleic acid duplex energy parameters

Alkan F, Wenzel A, Anthon C, Havgaard JH, Gorodkin J Genome Biol. 2018 Oct 26;19(1):177
[ PubMed | Paper | Webserver | Software ]

Feedback


We greatly appreciate your comments. You can open the Feedback form in a new tab or, alternatively, e-mail us with your questions and comments.