BiQ Analyzer

This website provides the BiQ Analyzer (Bock et al. 2005) software tool for analyzing classical small-scale bisulfite sequencing data (typically based on Sanger sequencing). The BiQ Analyzer HT and HiMod software tool (Lutsik et al. 2011; Becker et al. 2014) for analyzing medium-scale amplicon sequencing data (typically using MiSeq) is available from a separate website.

About BiQ Analyzer

BiQ Analyzer is a software tool for the easy visualization, and quality control, of DNA methylation data obtained by small-scale bisulfite sequencing (typically based on Sanger sequencing). It is intended to be used by anyone who works with DNA methylation data from bisulfite sequencing, from occasional users to experts. BiQ Analyzer includes the following features:

End-to-end support of the analysis process: from raw sequence files to a comprehensive documentation and visualization.
Automatic generation of publication-quality lollipop diagrams.
Integrated 1-click multiple sequence alignment.
Automated CpG highlighting.

For detailed information on the methodology of BiQ Analyzer please refer to our journal article.

Download & Install BiQ Analyzer

You will need to first download and install a copy of Java for your platform. We recommend using the installer where available as this will make the set up process easier. BiQ Analyzer has been tested to run with Java 1.8.0_411 on both macOS (14.3) and Windows (11).

Once you have installed Java then you can download BiQ Analyzer for your platform.

macOS

You can use the macOS Disk Image (recommended) if you have an Apple Silicon Mac or the Cross Platform Java Files if you have an Intel Mac. If you use the macOS Disk Image then you will have to download the demonstration dataset separately.

Windows

You can use the Cross Platform Java Files.

Linux

You can use the Cross Platform Java Files.

Using BiQ Analyzer

We recommend that you download the example data so that you can try things out as you read through the documentation. If you encounter any issues whilst using the software, please check the frequently asked questions on this page first; if they do not resolve the problem, e-mail support@bocklab.org.

Step 1 - Starting BiQ Analyzer

macOS (Disk Image)

The easiest way to run BiQ Analyzer on macOS is to download the corresponding disk image. This will only work if you have an Apple Silicon Mac. If you have an older Intel Mac, follow the instructions for the 'Cross Platform Java Files' below. Otherwise, mount the disk image and drag BiQ Analyzer to your Applications folder as indicated.

Then, because of macOS security features, you should right-click on the BiQ Analyzer in your Applications folder and choose Open. You may have to repeat the right-click > Open step if the option to open the application does not appear.

Alternatively, after attempting to open the application once, you can open System Settings, scroll down to Privacy & Security on the left, then to Security on the right, and choose ‘Open Anyway’.

You can now skip ahead to first launch.

Cross Platform (Java) for Windows, Linux & macOS

Once you have installed Java, you will need to check that it can be launched from your Terminal emulator of choice. This is usually Terminal.app on macOS, and cmd.exe on Windows. You can open the terminal on macOS by typing ‘terminal’ into Spotlight and the command prompt on Windows by typing ‘command prompt’ into the start menu.

Once this is open you can check that Java is installed by running java --version (Java 8 does not recognise this option; use java -version there instead). If the output is something like:

java 22.0.1 2024-04-16
Java(TM) SE Runtime Environment (build 22.0.1+8-16)
Java HotSpot(TM) 64-Bit Server VM (build 22.0.1+8-16, mixed mode, sharing)

then you can run BiQ Analyzer by double-clicking on BiQ_Analyzer.bat (Windows) or running BiQ_Analyzer.sh (macOS/Linux). You can then skip ahead to first launch.

If instead you saw an error message, then you can find the path to Java in one of several ways:

macOS

On macOS run: mdfind -name 'java' | grep '/bin/java$'.

Linux

On Linux run: find / -name java 2>/dev/null | grep '/bin/java$'.

Windows

On Windows run: for %i in (java.exe) do @echo. %~$PATH:i

This will result in either a list, or a single entry, such as:

/Library/Java/JavaVirtualMachines/jdk1.8.0_401.jdk/Contents/Home/jre/bin/java
/Library/Java/JavaVirtualMachines/jdk-11.0.22.jdk/Contents/Home/bin/java

You can then edit BiQ_Analyzer.sh (Linux/macOS) or BiQ_Analyzer.bat (Windows) in your chosen text editor and replace java with the full path listed above.

After you’ve edited the corresponding file you can launch the jar file by double-clicking BiQ_Analyzer.bat on Windows, or by running BiQ_Analyzer.sh in Terminal.app on macOS or in a terminal on Linux. As a result of macOS security features you may need to run xattr -c BiQ_Analyzer.sh and chmod +x BiQ_Analyzer.sh first.
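The manual lookup described above can also be sketched in Python. This is only an illustration, not part of BiQ Analyzer: the function names find_java and java_version_text are hypothetical, and the sketch only inspects the PATH, so it will not find Java installations that are not on the PATH.

```python
import shutil
import subprocess


def find_java(which=shutil.which):
    """Return the full path of the `java` launcher on the PATH, or None.

    `which` is injectable so the lookup can be exercised without Java.
    """
    return which("java")


def java_version_text(java_path):
    """Run `java -version` and return what it prints.

    The single-dash form works on Java 8 and later; it writes to stderr,
    so stderr is preferred and stdout is only a fallback.
    """
    result = subprocess.run(
        [java_path, "-version"], capture_output=True, text=True, check=False
    )
    return (result.stderr or result.stdout).strip()
```

If find_java() returns a path, that is the value you would put in place of java in the launch script; if it returns None, fall back to the platform-specific search commands above.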

Step 2 - First Launch

On first launch BiQ Analyzer will prompt you to complete some configuration information. This, for example, is where you enter the location of the clustalw2 executable if you wish to perform alignments using the ClustalW software instead of the built-in aligner. You do not have to fill in all the fields and can simply choose OK and then No when asked whether you wish to complete more information. You can also return to this dialog at any time by clicking the “Config…” button in the bottom left hand corner of the BiQ Analyzer window.

Step 3 - Example Analysis: Genomic Sequence

After launching BiQ Analyzer you will be greeted by the main screen. Important things to note are: the status indicator on the top left, the text box for the genomic sequence at the top, some parameter settings on the top right, an empty space for the input sequences on the left and an empty space for the multiple sequence alignment on the right. The message box at the very bottom of the window will provide you with guidance at each step of the analysis.

To continue, copy and paste the following genomic sequence into the ‘Genomic Sequence’ box. Then click “Next”.

CCCGGGATCGCTCTCCCAGCAGGTGAAGCCTCGCCATGGACCCTCCCCGTCGGGGCCCCG CGCTGCCCCGCCCGCCCCCAGCCGCTGGCCAAGGCCGCGGTCGCGCAGGCGCAGTGCCG CGTCCCGCCGCCGCCCCGCCCTGCCCGTCGCTGCGGAAGGCGCCGCGCGCAGCAACGCG CACTTCCTCTCCAGGAATCCGCGGAGGGAGCGCAGGCTCGAAGAGCTCCTGGACG

Keep in mind for your own analysis that the genomic sequence must not contain primers and must be unconverted.
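To make "unconverted" concrete, the following sketch shows what bisulfite conversion does to a genomic sequence in silico. It is an illustration under simple assumptions (plus strand, C to T conversion, no sequencing errors) and not BiQ Analyzer's own code; the function names are hypothetical.

```python
def clean_sequence(raw):
    """Remove all whitespace (spaces, line breaks) and upper-case the sequence."""
    return "".join(raw.split()).upper()


def cpg_positions(genomic):
    """Return the 0-based index of every C that is directly followed by a G."""
    return [i for i in range(len(genomic) - 1) if genomic[i:i + 2] == "CG"]


def bisulfite_convert(genomic, methylated=frozenset()):
    """Turn every C into T, except Cs whose index is listed in `methylated`."""
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(genomic)
    )


genomic = clean_sequence("ccgg atcg\n")
print(cpg_positions(genomic))                      # [1, 6]
print(bisulfite_convert(genomic, methylated={1}))  # TCGGATTG
```

The genomic sequence you paste into BiQ Analyzer corresponds to the input of bisulfite_convert, while your sequenced reads correspond to its output.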

Step 4 - Example Analysis: Raw Bisulfite Sequence Files

Now select the raw sequence files that were bisulfite converted and sequenced. These files must be in FASTA format and should have the same orientation as the genomic sequence. It is helpful if they also don’t contain any primers. Both the orientation and primers can be adjusted later.

Each file must contain exactly one sequence; it is not possible to import multiple sequences from one file. Furthermore, all sequence files must be located in the same directory. You can download the sample sequences required for this walk-through here. Select them all now by pressing Ctrl+A (Windows) or Command+A (macOS). Alternatively, click the first file and then hold Shift or Ctrl while clicking to select further files. Once you’ve done this, click “Next”. At this point a multiple sequence alignment will be computed. This may take some time, so please wait for this step to complete.
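Before importing your own data, you may want to check the one-sequence-per-file requirement up front. The sketch below is one way to do so; it is not part of BiQ Analyzer, the function names are hypothetical, and the "*.fasta" file pattern is an assumption that you should adapt to your own file extension.

```python
from pathlib import Path


def count_fasta_records(text):
    """Count FASTA header lines, i.e. lines that start with '>'."""
    return sum(1 for line in text.splitlines() if line.startswith(">"))


def files_with_wrong_record_count(directory, pattern="*.fasta"):
    """Return the names of files in `directory` that do not hold exactly one record."""
    bad = []
    for path in sorted(Path(directory).glob(pattern)):
        if count_fasta_records(path.read_text()) != 1:
            bad.append(path.name)
    return bad
```

An empty list from files_with_wrong_record_count means every matching file holds exactly one sequence.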

Step 5 - Example Analysis: Quality Control (Alignment)

After it is computed, the multiple sequence alignment between the genomic sequence and all selected bisulfite converted sequences will appear. All CpGs and unconverted Cs are highlighted; an explanation of the color coding can be found here. We must now work on quality control. In the first step, you should look closely at the alignment and check whether any of the sequences are reversed compared to the genomic sequence. The program will help you with an automatic analysis, highlighting sequences or groups of sequences that don’t agree well with the genomic sequence. However, it cannot assess whether the reverse complement fits better without calculating a new alignment (which would take too long), so be sure to check the program’s suggestion. If you believe a sequence is reversed, choose to include only the reverse sequence in the relevant drop-down box on the left hand side. You can press “recalculate” to re-compute the sequence alignment. You can also use this as an opportunity to remove any remaining primers from the sequences. You can do so by directly editing the text boxes on the left (your modifications will not be written back to the original files).

When you are happy with the alignment press the “Next” button to proceed.
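If you are unsure whether a read is reversed, a quick check outside the program can help. The sketch below computes the reverse complement and applies a crude base-composition heuristic. This heuristic is not the method BiQ Analyzer uses (the program relies on sequence identity to the genomic sequence); it merely exploits the fact that, with C to T conversion, a read in the correct orientation keeps its Gs but loses most of its Cs. The function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Complement every base, then reverse the sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def looks_reversed(read):
    """Crude hint only: True if the read has fewer Gs than Cs."""
    read = read.upper()
    return read.count("G") < read.count("C")


forward_read = "TTGGATTGTG"              # C->T converted, same orientation as genome
print(looks_reversed(forward_read))      # False
print(reverse_complement(forward_read))  # CACAATCCAA
print(looks_reversed("CACAATCCAA"))      # True
```

Heavily methylated or error-rich reads can defeat this heuristic, so always confirm the result in the alignment itself.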

Step 6 - Example Analysis: Quality Control (Sequencing Errors)

In this quality control step, the aim is to remove all sequences with an unacceptably low conversion rate or with a high number of sequencing errors from the alignment. This step is necessary to ensure high data quality. In the example data we see that sequences [5] and [7] have conversion rates below 90% (this is the default cutoff, which can be changed in the program’s configuration file), and the program suggests excluding these two sequences. One aspect may lead to confusion here: why does sequence [5] fall below the threshold even though its conversion rate is displayed as exactly 90%? This is because its actual conversion rate is slightly below 90% and is rounded to 90% for display, whereas the cutoff is applied to the exact value. Furthermore, the program suggests excluding sequence [11] because of a high error rate.

In this example, we will accept the program’s suggestion to exclude sequences [7] and [11], but for sequence [5] we decide to relax our conditions a little and include it even if it has a conversion rate slightly below 90%. This is done by changing the choice box below the corresponding sequence on the left side from the suggestion “Exclude” to the previous selection “Include as is”. After that, we press the “Recalculate” button to see the results of our decisions.

When you’re ready, press the Next button to continue.
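The rounding effect described for sequence [5] can be reproduced with a small sketch. The conversion rate is computed here as the share of non-CpG Cs in the genomic sequence that appear as T in the read; this definition and the function names are assumptions for illustration, and BiQ Analyzer's exact formula may differ.

```python
def conversion_rate(genomic_aln, read_aln):
    """Share of non-CpG genomic Cs that the read shows as T.

    Both arguments are rows of the same alignment ('-' marks a gap).
    Positions where the read shows neither C nor T are skipped.
    """
    converted = unconverted = 0
    for i, base in enumerate(genomic_aln):
        if base != "C":
            continue
        # Look at the next non-gap genomic base to decide whether this is a CpG.
        following = next((b for b in genomic_aln[i + 1:] if b != "-"), "")
        if following == "G":
            continue
        if read_aln[i] == "T":
            converted += 1
        elif read_aln[i] == "C":
            unconverted += 1
    total = converted + unconverted
    return converted / total if total else float("nan")


def passes_cutoff(rate, cutoff=0.90):
    """Compare the exact rate, not the rounded display value, with the cutoff."""
    return rate >= cutoff


rate = 43 / 48                # 43 of 48 non-CpG Cs converted
print(round(rate * 100))      # 90
print(passes_cutoff(rate))    # False
```

A read with 43 of 48 non-CpG Cs converted is therefore displayed as 90% but still falls below a 90% cutoff.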

Step 7 - Example Analysis: Quality Control (Clones)

Now the program asks us to inspect the multiple sequence alignment for clones, i.e. sequences that are likely to come from the same chromosome of the same cell. Such sequences are a potential threat to all statistical analyses that are based on their methylation data. The program treats as clones all sequences that agree in all C positions of the genomic sequence. It then suggests excluding all but one sequence from each group of clones.

In this case, the suggestion of the program looks reasonable so we directly proceed to the next step by pressing the “Next” button.
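The clone criterion described above (agreement in all C positions of the genomic sequence) can be sketched as follows. This is a simplified illustration that assumes the reads are already aligned to the genomic sequence row; the function names are hypothetical and this is not BiQ Analyzer's own code.

```python
from collections import defaultdict


def c_pattern(genomic_aln, read_aln):
    """Collect the read bases at every position where the genomic row has a C."""
    return "".join(r for g, r in zip(genomic_aln, read_aln) if g == "C")


def clone_groups(genomic_aln, reads):
    """Group read names whose C-position patterns are identical.

    `reads` maps a read name to its aligned row. Only groups with more
    than one member are returned.
    """
    groups = defaultdict(list)
    for name, row in reads.items():
        groups[c_pattern(genomic_aln, row)].append(name)
    return [names for names in groups.values() if len(names) > 1]


reads = {"r1": "ACGTT", "r2": "ATGTT", "r3": "ACGAT"}
print(clone_groups("ACGTC", reads))  # [['r1', 'r3']]
```

Here r1 and r3 differ at a non-C position but agree at both genomic Cs, so they are grouped as likely clones.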

Step 8 - Example Analysis: Quality Control (Final Checks)

Now, after all program-supported quality control steps are completed, the program asks the user to manually validate the alignment. In many cases, the best idea is to print out the current alignment by pressing the “Print (via browser)” button: then the program will load the current alignment including all highlighting into a web browser, from where you can print it. If you find any additional errors, you can again directly edit the sequence text boxes on the left or change the choice boxes to exclude doubtful sequences. However, in our test case the alignment looks okay, hence we directly proceed to the next step by pressing the “Next” button. The data are now ready to export. Click ‘Next’ a further time to begin the export process.

Step 9 - Example Analysis: Exporting Data

The program now displays an experiment documentation questionnaire, which should help the user to properly document their results. The upper part with the small text boxes is standardized and you should fill it out in all cases. The “Details” box should be used to give very specific details on the genome position of the analyzed sequences, in order to make it easy to find the location later on. Finally, the text box “Free comment” allows you to add any additional information that you might find useful when coming back to this analysis.

In order to facilitate filling out this questionnaire, you can use the “Cycle through history” button, which provides a history function for all that you have entered into the text boxes since you installed the program. Finally, after filling out the questionnaire, we continue by pressing the “Save data” button. The program is now ready to export the results of the analysis into a single documentation HTML file that contains: the questionnaire, some basic data about the experiment, the genomic sequence unconverted and converted, the final multiple sequence alignment, and a lollipop-style diagram of the methylation patterns. We enter a name for that file and press the “Open” button to save the file (be careful: when you select an existing file here, it will be overwritten without further warning).

In addition to the HTML file, the program also offers to save the raw methylation data to a plain text file (as tab-separated values). This is only important if you want to further analyze your methylation data in a statistics package that does not properly support copy & paste. More often than not, you can skip this, hence we press the “Cancel” button.

You can now press ‘Next’ again to open the output file in your web browser or you can open it directly from the location you saved it in.

Step 10 - Example Analysis: Statistical Analysis

When we finished the analysis, BiQ Analyzer copied the generated methylation data to the system clipboard. Hence, in order to carry out a statistical analysis of that data, we can go directly to a spreadsheet program and paste our methylation data. When we do so, each row corresponds to one bisulfite-treated sequence and each column to a CpG dinucleotide in the genomic sequence. A ‘1’ represents a methylated C, a ‘0’ represents an unmethylated C, and an ‘x’ represents a non-CpG or ambiguous position. From that data, you can easily calculate average methylation, medians, variances, and so on.
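If you prefer a script to a spreadsheet, the same calculation can be sketched in Python. The sketch assumes that the pasted text contains only the tab-separated 1/0/x cells described above, without sequence names or header rows; adapt the parsing if your export differs. The function names are hypothetical.

```python
def parse_methylation_table(text):
    """Split tab-separated rows into lists of '1', '0' or 'x' cells."""
    return [line.split("\t") for line in text.splitlines() if line.strip()]


def mean_methylation_per_cpg(rows):
    """Average the 1/0 calls in each column; 'x' cells are ignored.

    Returns None for a column without any informative call.
    """
    means = []
    for column in zip(*rows):
        calls = [int(c) for c in column if c in ("0", "1")]
        means.append(sum(calls) / len(calls) if calls else None)
    return means


table = "1\t0\tx\n1\t1\tx\n0\t0\t1\n"
rows = parse_methylation_table(table)
print(mean_methylation_per_cpg(rows))
```

Each value in the result is the fraction of informative reads that are methylated at that CpG.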

Step 11 - Example Analysis: Reset

Back in BiQ Analyzer we press “Restart” in order to return to the main screen - from where we could start a new analysis.

Now that you are familiar with BiQ Analyzer, we can discuss the meaning of the three choice boxes at the top right, which we ignored on the first go. The “Conversion” box allows you to work on the reverse complement strand of the DNA, where a bisulfite conversion is “logically G->A” instead of “C->T”. Apart from this, everything stays the same. The “Suggestions” box allows you to specify whether or not you want to have the program’s suggestions as the default or whether you prefer to make all selections manually.
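Why conversion on the reverse complement strand is "logically G->A" can be checked with a short sketch. It assumes a fully unmethylated sequence for simplicity, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complement every base, then reverse the sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def c_to_t(seq):
    """Bisulfite conversion of a fully unmethylated strand."""
    return seq.replace("C", "T")


def g_to_a(seq):
    """The same conversion as seen from the opposite strand."""
    return seq.replace("G", "A")


plus = "ACGGTCA"
# Convert the minus strand, then read the result back on the plus strand.
minus_converted = c_to_t(reverse_complement(plus))
print(reverse_complement(minus_converted))  # ACAATCA
print(g_to_a(plus))                         # ACAATCA
```

Both lines print the same sequence: a C to T change on the minus strand appears as a G to A change when viewed on the plus strand.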

Frequently Asked Questions

Q: The program does not recognise the CpGs in the genomic sequence. Why?

A: Are you sure that your genomic sequence has the right orientation (5’ -> 3’) and that you adjusted the “Conversion” choice box according to whether you are working on the plus strand (logical C to T conversion) or on the minus strand (logical G to A conversion)?

Q: Can I use ClustalW instead of the built-in aligner for my alignments?

A: If you intend to use a local copy of ClustalW for your alignments, you will need to download the correct version of ClustalW for your operating system. We provide a build of ClustalW for Apple Silicon Macs (i.e., those with an Apple M3/M2/M1 processor). Note the location you install or extract the files to and add it to the BiQ Analyzer configuration by clicking “Config…” in the bottom left of the BiQ Analyzer window. Note for macOS users: as with BiQ Analyzer itself, you must also allow clustalw2 to run, either by right-clicking the clustalw2 file and choosing Open, or by double-clicking it and then choosing ‘Open Anyway’ in System Settings.

Q: The program doesn’t ignore my primers, what can I do?

A: You must remove the primers manually. You can do so in the sequence text boxes on the left of the BiQ Analyzer window: just edit the sequences in the boxes to remove the primers. These modifications will not be written back to the original sequence files.

Q: The multiple sequence alignment looks very messy: there is a line break after the sequence name and each sequence occupies two lines. What shall I do?

A: This can happen if you have sequence names of unusual length. Just select a smaller font size in the selection box in the center of the BiQ Analyzer window and everything should look alright again. Alternatively, there may be a problem with font availability. In that case, please adjust the “Alignment font” entry in the configuration editor to a non-proportional font that is available on your system (e.g. “Courier New”).

Q: For my sequences the “Reverse Complement” suggestions of the BiQ Analyzer are always wrong. Why?

A: The program applies a simple sequence identity clustering in order to determine which sequences don’t fit the genomic sequence. For the biggest cluster of those, it suggests inverting the sequences. This works very well when your sequences are sufficiently similar to the genomic sequence. But if you have strong deviations from the genomic sequence (e.g. due to many sequencing errors), this method is less successful. Unfortunately, the alternative (calculating another alignment with the inverted sequences and checking whether this improves the fit) is time-consuming. Therefore, the best way to cope with this is to manually exclude highly erroneous sequences right after the first multiple sequence alignment and then press the “Recalculate” button to see if the situation improves.

Q: The multiple sequence alignment is very bad, even after reverse complement has been tried. What’s wrong?

A: Maybe you selected G to A conversion instead of a C to T conversion or the other way round.

Q: When I copy the methylation information from the HTML output file and paste them into MS Excel, all information goes into the same cell. What shall I do?

A: This is because Excel tries to interpret the data as HTML instead of text. Use “Edit->Paste Special” and select “Text”, then each field will go into a separate cell.

Q: What do all the colors in the multiple sequence alignment mean?

A: Please review the guide to the color codes.

Q: How can I change the minimum conversion rate and the minimum sequence identity that the program employs for quality control?

A: These values are defined in the configuration file of BiQ Analyzer and can be changed using the configuration editor. You can open the configuration editor by clicking the “Config…” button in BiQ Analyzer’s main window.

Q: How can I change the default data path of the program?

A: This value is defined in the configuration file of BiQ Analyzer and can be changed using the configuration editor. You can open the configuration editor by clicking the “Config…” button in BiQ Analyzer’s main window.

References & License

BiQ Analyzer was developed in a cooperation between the Max Planck Institute for Informatics and Saarland University, both located in Saarbrücken, Germany.

BiQ Analyzer should be cited as follows:

Bock, C., S. Reither, T. Mikeska, M. Paulsen, J. Walter and T. Lengauer (2005). “BiQ Analyzer: visualization and quality control for DNA methylation data from bisulfite sequencing.” Bioinformatics 21(21): 4067-8.

BiQ Analyzer is made available for non-commercial use under the terms of the BiQ Analyzer End User License Agreement.

If you have any questions, suggestions, criticism, etc., please feel free to contact us at support@bocklab.org.

Looking for the original BiQ Analyzer web pages at the Max Planck Institute for Informatics? Visit them in the Internet Archive.
