Creating a new track
Data tracks take input data tiles and display them within a browser. Creating a new track type takes several steps. For this tutorial, we'll create a new track which displays a box-plot.
Define a viewconfig section describing an instance of the new track
For HiGlass to display a new track, the track must be declared in the viewconfig, as with all other tracks. In production, viewconfigs can be generated by exporting the view. During development, they are loaded either from app/index.html (when viewing http://127.0.0.1:8080 after running npm start) or from app/scripts/testViewConfs.js when running the tests (npm run tests).
When creating a new track we recommend adding a test case to test/HiGlassComponentTest.jsx to ensure that it is created and functions properly. If this proves too troublesome, it's also possible to add the config for the new track in app/index.html.
We'll be creating a new type of track called a horizontal-boxplot, so add the following section to the "top" view of the testViewConfig in app/index.html. The uid should be unique to this instance of the track, so we just give it some random string (xxyxx). The height specifies how high this track should be. The tilesetUid specifies the uid of the data source on the server. In this case we'll use data that exists on our public server (higlass.io).
{
'uid': 'xxyxx',
type:'horizontal-boxplot',
height: 40,
tilesetUid: 'F2vbUeqhS86XkxuO1j2rPA',
server: "http://higlass.io/api/v1"
}
When index.html is loaded, it will create a HiGlass component using that viewconfig. When it gets to the "top" section and sees the definition of that track, it will try to render it. Since the track type doesn't exist yet, it won't be able to. To tell HiGlass how to render a horizontal-boxplot track, we have to create a class that can render it and associate it with the horizontal-boxplot track type.
Creating a class to render the new track type
To create the new track type, we'll use the horizontal-line track as a template. This track is defined in the app/scripts/HorizontalLine1DPixiTrack.js file. To begin, we'll copy this file:
cp app/scripts/HorizontalLine1DPixiTrack.js app/scripts/HorizontalBoxplotTrack.js
There's a lot of boilerplate in the track code, but the important parts are in drawTile. In particular, the loop which iterates over tileValues does the actual drawing using the graphics.lineTo and graphics.moveTo function calls. These need to be changed to draw rectangles instead of lines. See the PIXI.js documentation for the drawRect function, which needs to be called instead.
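As a rough sketch of that change, the loop could compute one rectangle per tile value and pass it to drawRect. The helper name drawBoxes, its parameters and the plain-function scales are assumptions made for illustration; in the real track this logic would live inside drawTile and use the track's own scales. The graphics argument only needs to expose drawRect(x, y, width, height), as PIXI.Graphics does.

```javascript
// Hypothetical helper (not part of HiGlass): draws one rectangle per tile value.
// graphics:    object exposing drawRect(x, y, width, height), e.g. a PIXI.Graphics
// tileValues:  array of values in one tile
// tileX:       data-space start of the tile; tileWidth: its data-space width
// xScale:      function mapping data-space x to pixels (assumed)
// valueScale:  function mapping a value to a y pixel (assumed)
// trackHeight: height of the track in pixels
function drawBoxes(graphics, tileValues, tileX, tileWidth, xScale, valueScale, trackHeight) {
  const binWidth = tileWidth / tileValues.length;
  const rects = [];
  for (let i = 0; i < tileValues.length; i++) {
    const xStart = xScale(tileX + i * binWidth);
    const xEnd = xScale(tileX + (i + 1) * binWidth);
    const yTop = valueScale(tileValues[i]);
    // The rectangle spans from the value's y position down to the track bottom.
    const rect = { x: xStart, y: yTop, width: xEnd - xStart, height: trackHeight - yTop };
    graphics.drawRect(rect.x, rect.y, rect.width, rect.height);
    rects.push(rect);
  }
  return rects;
}
```

Returning the rectangles is not needed for drawing, but it makes the geometry easy to check in a test without a real PIXI stage.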
When polishing the track, the exportSVG method should also be implemented so that the view can be exported to SVG. This can be done after the new track is created and tested.
Associating a track type with a track rendering class
Now that we have a class which renders this track type, we need to associate it with the track name (horizontal-boxplot) which was used in the viewconfig. This resolution is done in app/scripts/TrackRenderer.jsx. The easiest thing to do is to copy an existing case and plug in the newly created track names:
case 'horizontal-boxplot':
return new HorizontalBoxplotTrack(this.currentProps.pixiStage,
track.server,
track.tilesetUid,
handleTilesetInfoReceived,
track.options,
() => this.currentProps.onNewTilesLoaded(track.uid));
We also need to import HorizontalBoxplotTrack from its JavaScript file at the top of TrackRenderer.jsx:
import {HorizontalBoxplotTrack} from './HorizontalBoxplotTrack.js';
This should be enough to get the track to display if it's already specified in the viewconf. To make it discoverable and configurable, we need to add it to the list of known track types in app/scripts/config.js.
Making the track discoverable and configurable
To be able to add a track using the "Add Track" dialog (accessed using the plus sign icon in HiGlass), HiGlass needs to know what types of data it is capable of displaying. This is specified in app/scripts/config.js. For our new box-plot track, we'll just copy the config for horizontal-line and change it slightly:
{
type: 'horizontal-boxplot',
datatype: ['vector'],
local: false,
orientation: '1d-horizontal',
thumbnail: null,
availableOptions: [ 'labelPosition', 'labelColor', 'labelTextOpacity', 'labelBackgroundOpacity', 'axisPositionHorizontal', 'valueScaling' ],
defaultOptions: {
labelColor: 'black',
labelPosition: 'topLeft',
axisPositionHorizontal: 'right',
valueScaling: 'linear'
}
}
This tells HiGlass that whenever it encounters a tileset containing data of the type vector, it can display it using a horizontal-boxplot track. It also tells HiGlass that the track has a horizontal orientation, which means it can only be added as a top or bottom track. The thumbnail option is null because we haven't specified a thumbnail for this track. The available options mean that we can set the position, color and opacity of the label (dataset name), as well as the axis position and the type of scaling (e.g. log or linear) of the data. We also provide some default values for these options in case they're not specified.
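To illustrate how entries like this can drive the "Add Track" dialog, here is a minimal sketch of a lookup that lists the track types able to show a given datatype at a given position. The function name compatibleTrackTypes and the orientation-to-position table are assumptions for illustration; only the rule that horizontal tracks go on the top or bottom comes from the text above.

```javascript
// Assumed mapping from orientation to the view positions it allows.
// Only the '1d-horizontal' entry is stated in the text; the others are guesses.
const POSITIONS_BY_ORIENTATION = {
  '1d-horizontal': ['top', 'bottom'],
  '1d-vertical': ['left', 'right'],
  '2d': ['center'],
};

// Hypothetical helper: which track types can show `datatype` at `position`?
function compatibleTrackTypes(trackTypes, datatype, position) {
  return trackTypes
    .filter((t) => t.datatype.includes(datatype))
    .filter((t) => (POSITIONS_BY_ORIENTATION[t.orientation] || []).includes(position))
    .map((t) => t.type);
}

// Cut-down config entries in the same shape as the one above.
const exampleTrackTypes = [
  { type: 'horizontal-line', datatype: ['vector'], orientation: '1d-horizontal' },
  { type: 'horizontal-boxplot', datatype: ['vector'], orientation: '1d-horizontal' },
  { type: 'heatmap', datatype: ['matrix'], orientation: '2d' },
];

console.log(compatibleTrackTypes(exampleTrackTypes, 'vector', 'top'));
```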
That's it. The horizontal-boxplot track should now be ready to use. What follows on this page are some scattered thoughts on the nitty-gritty topics of scales and tiles. They can be ignored unless a more thorough understanding of the track operations is desired.
Advanced Track Topics (under construction)
Scales
Zoomed scales:
Horizontal tracks: this._xScale()
Vertical tracks: this._yScale()
2D tracks: this._xScale() and this._yScale()
Original scales:
this._refXScale()
this._refYScale()
To draw the data, a track needs various scales. The HorizontalLine1DPixiTrack, for example, requires a valueScale that maps tile values to y positions within the track. This scale can be calculated in a number of different ways, but the simplest is to just use the maxVisibleValue() function of the track. This returns the maximum value present in the dense fields of all the visible tiles.
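A minimal sketch of such a scale follows. Both functions are standalone stand-ins written for illustration, not the track's actual methods, and the choice of 0 as the lower end of the value range is an assumption.

```javascript
// Hypothetical stand-in for the track's maxVisibleValue(): the largest value
// found in the `dense` arrays of the given tiles.
function maxVisibleValue(visibleTiles) {
  let max = -Infinity;
  for (const tile of visibleTiles) {
    for (const value of tile.tileData.dense) {
      if (value > max) max = value; // NaN never compares greater, so it is skipped
    }
  }
  return max;
}

// Linear value scale: 0 maps to the bottom of the track (y = trackHeight),
// maxValue maps to the top (y = 0).
function makeValueScale(maxValue, trackHeight) {
  return (value) => trackHeight - (value / maxValue) * trackHeight;
}
```

The y axis is inverted because pixel coordinates grow downwards while larger values should be drawn higher up.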
Other scaling methods may include quantile scaling, log scaling, etc.
Custom tracks may require bespoke scaling methods. When drawing intervals, for example, we may want to calculate the maximum number of intervals that will be drawn on top of each other at any point. Then, for each interval, we will want to calculate its y position.
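One way to sketch the interval case is a greedy first-fit layout: intervals are visited by start position and each one is placed in the first row that is free. This is an illustrative approach, not necessarily how any existing HiGlass track does it, and the names layoutIntervals and intervalY are hypothetical. Intervals are treated as half-open, so one may begin exactly where another ends.

```javascript
// Assign each half-open interval [start, end) to the first row where it fits.
// Returns the row index per interval (in input order) and the number of rows used.
function layoutIntervals(intervals) {
  const order = intervals
    .map((interval, index) => ({ start: interval[0], end: interval[1], index }))
    .sort((a, b) => a.start - b.start || a.end - b.end);
  const rowEnds = []; // rowEnds[r] = end of the last interval placed in row r
  const rows = new Array(intervals.length);
  for (const { start, end, index } of order) {
    let row = rowEnds.findIndex((rowEnd) => rowEnd <= start);
    if (row === -1) {
      row = rowEnds.length; // no free row: open a new one
    }
    rowEnds[row] = end;
    rows[index] = row;
  }
  return { rows, rowCount: rowEnds.length };
}

// y position of an interval, given its row and a fixed row height in pixels.
function intervalY(row, rowHeight) {
  return row * rowHeight;
}
```

Because a new row is only opened when every existing row is still occupied at that start position, rowCount equals the maximum number of intervals that overlap at any point.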
If the track will rely on translations and zooms to move and rescale the content, it needs to set pMain = this.pMobile in its constructor and draw using the reference scales (this._refXScale and this._refYScale).
Implement the initTile function
This function is called when the tile is initially created. It is especially useful for tracks that require heavy initial rendering and lighter transformations for zooming and panning. The HeatmapTiledPixiTrack, for example, creates the heatmap sprite and renders it in the initTile function. It omits the drawTile function because it wouldn't do anything and relies on the zoomed function to alter the graphic's translation and scale factor to change the view.
Implement the drawTile(tile) method:
The tile structure has a tileData member which contains the data retrieved from the server. The following is an example of a tile object and its fields:
tile = {
  graphics: <Pixi.Graphics>,
  remoteId: "uid.4.3",
  tileId: "uid.4.3",
  tileData: {
    discrete: [[0,1,0],[0,3,0]],
    tileId: "uuid.0.0",
    tilePos: [3],
    zoomLevel: 4
  }
}
The tile object can also contain information that is relevant to its rendering. If it is meant to be displayed as text, then it can contain additional PIXI.Text objects which are simply rescaled when the tile is redrawn.
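The tile id in the example appears to encode the tileset uid, the zoom level and the tile position. The sketch below splits such an id and computes the data-space extent of a 1D tile. The function names are hypothetical, and the extent calculation assumes that the full width of the tileset is halved at every zoom level.

```javascript
// Split a tile id such as "uid.4.3" into its parts.
// Assumes the tileset uid itself contains no dots.
function parseTileId(tileId) {
  const [tilesetUid, zoom, ...positions] = tileId.split('.');
  return {
    tilesetUid,
    zoomLevel: Number(zoom),
    tilePos: positions.map(Number),
  };
}

// Data-space extent covered by a 1D tile, assuming the full width of the
// tileset (maxWidth) is halved at every zoom level.
function tileExtent(zoomLevel, tileX, maxWidth) {
  const tileWidth = maxWidth / 2 ** zoomLevel;
  return [tileX * tileWidth, (tileX + 1) * tileWidth];
}
```

A 2D tile id would simply carry two position components, which parseTileId returns as a two-element tilePos array.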
Implement a drawing method
There are several ways to draw the visible data:
- Draw each individual tile:
Example: HeatmapTiledPlot: Each tile can be drawn completely independently of every other one.
- Adjacent tiles required:
Example: HorizontalLine1DPixiTrack.js: To connect the lines between adjacent tiles, we need a draw method that looks at the adjacent tiles.
- Draw all the tiles at once
Example: CNVIntervalTrack: We need to have all of the intervals that are visible ready so that we can create a layout where all the elements are considered and there's no overlaps.
Debugging notes
- LeftTrackModifier switches out pBase, so removing it requires removing its pBase from the stage, rather than the original track's.
The chromosome axis shows the current position within a given chromosome. When zoomed out far enough, the numbers disappear showing only the chromosome names.
Because different genome builds have different chromosomes, the chromosome axis requires a list of chromosome sizes to function properly.
Example configuration
{
"chromInfoPath": "//s3.amazonaws.com/pkerp/data/hg19/chromSizes.tsv",
"type": "vertical-chromosome-labels",
"position": "top",
"name": "Chromosome Labels (hg19)"
}
The chromosome grid can be overlaid on top of heatmaps to show where the 1-based chromosome boundaries are.
Example configuration
{
'type': '2d-chromosome-grid',
'position': 'center',
'chromInfoPath': "//s3.amazonaws.com/pkerp/data/hg19/chromSizes.tsv",
}
The property chromInfoPath should be a link to a file which contains the chromosome sizes:
chr1 249250621
chr2 243199373
chr3 198022430
...
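As an illustration of how such a file can be used, the sketch below parses the text into chromosome entries and adds the cumulative start of each chromosome along a concatenated genome axis. The function names and the returned structure are assumptions, not the HiGlass implementation.

```javascript
// Parse "name<whitespace>size" lines and add the cumulative start of each chromosome.
function parseChromSizes(text) {
  const chroms = [];
  let offset = 0;
  for (const line of text.split('\n')) {
    const [name, sizeText] = line.trim().split(/\s+/);
    const size = Number(sizeText);
    if (!name || !Number.isFinite(size)) continue; // skip blank or malformed lines
    chroms.push({ name, size, offset });
    offset += size;
  }
  return { chroms, totalLength: offset };
}

// Find the chromosome containing a 0-based absolute genome position.
function chromAt(chroms, absPos) {
  return chroms.find((c) => absPos >= c.offset && absPos < c.offset + c.size);
}
```

With the three chromosomes listed above, chr2 would start at offset 249250621 and chr3 at 492449994.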
Overview and Terminology
HiGlass (HG) is organized into multiple levels of display.
Website: The website that you are currently looking at.
App: The page that contains a fullscreen view of HG.
Container / Component: The part of the website that shows the actual HG views and tracks.
View: A collection of tracks that share common axes. Views can be linked by zoom.
Track: A region which contains plotted data. This can range from lines, gene annotations and axes on 1D tracks to heatmaps and annotations on 2D tracks.
Series: A set of data which is plotted in a track. Multiple series can be displayed in a single track. Each series requires a track in order to be displayed. A track cannot exist without any series.
Usage
Adding a new track
The most elementary task in HiGlass is adding a track. This can be done by clicking on the '+' sign of the view header and selecting where to add the new track. The list of available tracks is pulled from our server and filtered according to which datatypes can be displayed in this location. 2D data (heatmaps from cooler files) may be displayed either in the center, or on the edge tracks, where only the region near the diagonal will be shown. 1D data (e.g. from bigWig files) is limited to the edge tracks.
Each dataset can often be displayed in multiple ways. A 2D dataset can be displayed as a heatmap or as a rendering of the tiles which it is retrieved as. 1D datasets can [currently] be displayed as lines or as tile outlines. Gene annotations can be displayed as exon-intron plots or as tile ids.
Resizing a view:
Individual views can be resized but they always expand to fill the available vertical space.
Adding and removing views:
New views can be created by copying existing views. They can then be edited to display different data.
Replacing tracks
A common task is to replace an existing track with a new track. This can be accomplished by either first closing the original track and then adding a new one in its place or simply selecting Replace track from a track's configuration menu:
Changing a heatmap's colormap
The colormap for contact matrices can be changed from the 'track config' menu at the upper right corner of the heatmap. From there you can select from a number of preset color maps (afmhot, hot, jet...):
These presets roughly correspond to some of the examples available from matplotlib and are defined in the file app/scripts/config.js.
Custom color map values can also be set by selecting 'Custom ...':
Adding track labels
Track labels display the name of the dataset being displayed. Note that at the moment having multiple series will lead to overlapping labels. This will be fixed in future releases.
Adding horizontal heatmap tracks
Click the 'plus' icon in the upper right corner:
Pick the location where you want it displayed:
Pick a dataset:
Pick the way you want to display it (for a heatmap in the 'top' position, there's currently only one option: 'horizontal-heatmap')
After clicking 'submit', you should see the new dataset:
You can then flip it by changing its configuration using the little 'cog' icon that appears in the upper right corner when you hover over the track:
And selecting 'Rao et al...' -> 'Configure series' -> 'Flip heatmap' -> 'Yes'
To yield an upside down view:
To create new track types, there are a number of methods that need to be implemented:
Required
draw()
Render this track to the SVG or Pixi canvas.
Optional
These methods are not strictly required but may cause problems with certain functionality if not implemented.
exportSVG()
This method should return a string version of the SVG representation of this track. This is required for exporting and is not rendered.
Example (from HeatmapTiledPixiTrack.js):
exportSVG() {
let svg = '<g>'
for (let tile of this.visibleAndFetchedTiles()) {
//console.log('sprite:', tile.canvas.toDataURL());
let rotation = tile.sprite.rotation * 180 / Math.PI;
svg += `<g
transform="translate(${tile.sprite.x}, ${tile.sprite.y})rotate(${rotation})scale(${tile.sprite.scale.x},${tile.sprite.scale.y})"
>`;
svg += '<image xlink:href="' + tile.canvas.toDataURL() + '"/>';
svg += "</g>";
}
svg += '</g>';
return svg;
}
The HiGlass project consists of the following components:
- The higlass client: A javascript application that can render tiled data. It can be instantiated independently.
- The higlass server: A django server which reads specialized file formats and responds to tile requests
- The higlass website: The website for our hosted public-facing higlass instance
- The higlass docker container: A docker container which integrates the previous three repositories into a fully functioning self contained higlass application
Example custom tracks
- Labelled Annotations
- GeoJSON
- Multivec
- Time Interval Track
View compositions can be shared through the JSON configuration files defining them. Configuration files can be exported through the view config menu, either as a JSON file or as a hyperlink. To restore the composition, links can be clicked and JSON files can be dragged onto the HiGlass client to load their contents.
Below is an example JSON config file. It contains separate sections for each view along with a host of other information defining how the views in HiGlass are laid out and how they're linked to each other. Within each view are sections for the tracks that it contains. Track definitions point to the dataset which they render and contain additional styling information.
{
"editable": true,
"zoomFixed": false,
"trackSourceServers": [
"higlass.io/api/v1"
],
"exportViewUrl": "higlass.io/api/v1/viewconfs/",
"views": [
{
"uid": "aa",
"initialXDomain": [
0,
3000000000
],
"autocompleteSource": "higlass.io/api/v1/suggest/?d=OHJakQICQD6gTD7skx4EWA&",
"genomePositionSearchBoxVisible": true,
"chromInfoPath": "//s3.amazonaws.com/pkerp/data/hg19/chromSizes.tsv",
"tracks": {
"top": [
{
"type": "horizontal-gene-annotations",
"height": 60,
"tilesetUid": "OHJakQICQD6gTD7skx4EWA",
"server": "higlass.io/api/v1",
"position": "top",
"uid": "OHJakQICQD6gTD7skx4EWA",
"name": "Gene Annotations"
}
,
{
"chromInfoPath": "//s3.amazonaws.com/pkerp/data/hg19/chromSizes.tsv",
"type": "horizontal-chromosome-labels",
"position": "top",
"name": "Chromosome Labels (hg19)"
}
],
"left": [
{
"type": "vertical-gene-annotations",
"width": 60,
"tilesetUid": "OHJakQICQD6gTD7skx4EWA",
"server": "higlass.io/api/v1",
"position": "left",
"name": "Gene Annotations",
"options": {
"labelPosition": "bottomRight"
}
}
,
{
"chromInfoPath": "//s3.amazonaws.com/pkerp/data/hg19/chromSizes.tsv",
"type": "vertical-chromosome-labels",
"position": "top",
"name": "Chromosome Labels (hg19)"
}
],
"center": [
{
"uid": "c1",
"type": "combined",
"height": 200,
"contents": [
{
"server": "higlass.io/api/v1",
"tilesetUid": "CQMd6V_cRw6iCI_-Unl3PQ",
"type": "heatmap",
"position": "center",
"options": {
"colorRange": [
"#FFFFFF",
"#F8E71C",
"#F5A623",
"#D0021B"
],
"maxZoom": null
}
}
,
{
"type": "2d-chromosome-grid",
"position": "center",
"chromInfoPath": "//s3.amazonaws.com/pkerp/data/hg19/chromSizes.tsv"
}
],
"position": "center"
}
],
"right": [],
"bottom": []
}
}
],
"zoomLocks": {
"locksByViewUid": {},
"zoomLocksDict": {}
}
}
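To make the nesting concrete, the following sketch walks a parsed viewconfig object and lists every track, descending into the contents of "combined" tracks. The function name listTracks and the shape of the returned records are chosen for illustration only.

```javascript
// Collect every track in a viewconfig, descending into "combined" tracks.
function listTracks(viewConfig) {
  const found = [];
  for (const view of viewConfig.views) {
    for (const [position, tracks] of Object.entries(view.tracks)) {
      const visit = (track) => {
        found.push({ view: view.uid, position, type: track.type });
        (track.contents || []).forEach(visit);
      };
      tracks.forEach(visit);
    }
  }
  return found;
}
```

Applied to the example above, this would list the gene annotation and chromosome label tracks in the top and left positions, followed by the combined center track and the heatmap and chromosome grid it contains.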
The HiGlass server is capable of loading tile data from different file types. While they may physically store the data in different formats, they share the capability of being queried for data at a given zoom level and location.
Multires cooler files
Multires cooler files are HDF5 files which store multiple contact matrices binned at different resolutions. Each individual contact matrix is stored using the standard cooler format.
Regular cooler files can be turned into multires files using the cooler coarsegrain command. See the Processing and importing data section of the wiki for more information about the format.
Hitile files
Hitile files store 1D genomic data at multiple resolutions using the HDF5 format. They are created using the clodius package. See the BigWig section of the processing and importing data section of the wiki for information about creating hitile files.
Contents
At the root level, attributes define metadata about the file. This is perhaps best explained with a chunk of code:
import math
import h5py
f = h5py.File('file.hitile')
d = f['meta']
d.attrs['zoom-step'] = zoom_step # store every nth aggregation (zoom) level (default: 8)
d.attrs['max-length'] = assembly_size # the size of the genome assembly (default: hg19)
d.attrs['assembly'] = assembly # the name of the genome assembly (default: hg19)
d.attrs['chrom-names'] = bwf.chroms().keys() # the chromosome names in the assembly (default ['chr1', 'chr2',...])
d.attrs['chrom-sizes'] = bwf.chroms().values() # the sizes of the chromosomes (e.g. [249250621, ...])
d.attrs['chrom-order'] = chrom_order # the order in which the chromosomes are stored (default ['chr1'..., 'chrX', 'chrY', 'chrM'])
d.attrs['tile-size'] = tile_size # the size of each individual tile (default: 1024)
d.attrs['max-zoom'] = max_zoom = math.ceil(math.log(d.attrs['max-length'] / tile_size) / math.log(2))
# the maximum zoom level (default: 22)
d.attrs['max-width'] = tile_size * 2 ** max_zoom
# the maximum width of a tileset with this tile size and maximum zoom
Internally, the data is stored at each zoom-step'th zoom level as one long array.
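The max-zoom and max-width formulas from the listing above can be worked through in a few lines. This is a sketch on the client-side language rather than clodius code. The assembly length used in the usage note is an illustrative round number, and basesPerBin assumes that each tile holds tile-size values and that tiles halve in width at every zoom level.

```javascript
// Maximum zoom level and total tileset width, following the formulas in the listing above.
function zoomInfo(maxLength, tileSize) {
  const maxZoom = Math.ceil(Math.log2(maxLength / tileSize));
  const maxWidth = tileSize * 2 ** maxZoom;
  return { maxZoom, maxWidth };
}

// Base pairs summarized by one data point at a given zoom level, assuming
// each tile holds tileSize values and tiles halve in width per zoom level.
function basesPerBin(zoomLevel, maxZoom) {
  return 2 ** (maxZoom - zoomLevel);
}
```

For an assembly of 3,100,000,000 bp and a tile size of 1024, zoomInfo gives a maximum zoom of 22, matching the default noted in the listing, and a maximum width of 4,294,967,296.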
Size
Because HDF5 compresses data when storing it, hitile files end up being smaller than their bigWig counterparts.
File | BigWig size | HiTile size | Conversion time (seconds) |
---|---|---|---|
wgEncodeSydhTfbsA549CtcfbIggrabSig | 595M | 166M | 480 |
E116-H3K4me2.fc.signal | 203M | 175M | 455 |
E004-H3K79me1.fc.signal | 710M | 465M | 577 |
Each gene can have multiple isoforms (combinations of exons and introns). These isoforms can overlap:
chr4 115519557 115599381 UGT8 25 + NM_001128174 7368 protein-coding UDP glycosyltransferase 8 115544036 115597444 115519557,115544034,115585150,115586835,115589240,115597080, 115520130,115544858,115585293,115586912,115589460,115599381,
chr4 115519557 115599381 UGT8 25 + NM_001322112 7368 protein-coding UDP glycosyltransferase 8 115544036 115597444 115519557,115540578,115544034,115585150,115586835,115589240,115597080, 115520130,115540681,115544858,115585293,115586912,115589460,115599381,
chr4 115519557 115599381 UGT8 25 + NM_001322113 7368 protein-coding UDP glycosyltransferase 8 115544036 115597444 115519557,115544034,115585150,115586835,115589240,115597080, 115520213,115544858,115585293,115586912,115589460,115599381,
chr4 115520440 115599381 UGT8 25 + NM_001322114 7368 protein-coding UDP glycosyltransferase 8 115544036 115597444 115520440,115544034,115585150,115586835,115589240,115597080, 115520942,115544858,115585293,115586912,115589460,115599381,
chr4 115543522 115599381 UGT8 25 + NM_003360 7368 protein-coding UDP glycosyltransferase 8 115544036 115597444 115543522,115585150,115586835,115589240,115597080, 115544858,115585293,115586912,115589460,115599381,
or they can be located in distant regions (even on different chromosomes):
chr1 367658 368597 OR4F16 2 + NM_001005277 81399 protein-coding olfactory receptor family 4 subfamily F member 16 367658 368597 367658, 368597,
chr1 621095 622034 OR4F16 2 - NM_001005277 81399 protein-coding olfactory receptor family 4 subfamily F member 16 621095 622034 621095, 622034,
chr5 180794287 180795226 OR4F16 2 + NM_001005277 81399 protein-coding olfactory receptor family 4 subfamily F member 16 180794287 180795226 180794287, 180795226,
We want to display an overview of all known exons but we don't want our genes to extend across chromosomes. To resolve this, we show all overlapping sets of exons as single entities. Genes with annotations that are far away from each other and don't overlap will be displayed separately:
7368 115519557 115599381 UGT8 25 + union_7368 7368 protein-coding UDP glycosyltransferase 8 115544036 115597444 115519557,115519557,115520440,115540578,115543522,115544034,115585150,115586835,115589240,115597080 115520130,115520213,115520942,115540681,115544858,115544858,115585293,115586912,115589460,115599381
81399 367658 368597 OR4F16 2 + union_81399 81399 protein-coding olfactory receptor family 4 subfamily F member 16 367658 368597 367658 368597
81399 621095 622034 OR4F16 2 - union_81399 81399 protein-coding olfactory receptor family 4 subfamily F member 16 621095 622034 621095 622034
81399 180794287 180795226 OR4F16 2 + union_81399 81399 protein-coding olfactory receptor family 4 subfamily F member 16 180794287 180795226 180794287 180795226
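The merging step can be sketched as follows. This is not the clodius implementation: the function name, the input shape and the rule that touching transcripts count as overlapping are assumptions. It does follow the pattern visible in the UGT8 row above, where each distinct exon (start, end) pair appears once in sorted order.

```javascript
// Merge the transcripts of ONE gene into groups of overlapping transcripts.
// Each transcript: { chrom, start, end, exons: [[exonStart, exonEnd], ...] }.
function unionTranscripts(transcripts) {
  const sorted = [...transcripts].sort(
    (a, b) => a.chrom.localeCompare(b.chrom) || a.start - b.start
  );
  const groups = [];
  for (const t of sorted) {
    let group = groups[groups.length - 1];
    const overlaps = group && group.chrom === t.chrom && t.start <= group.end;
    if (!overlaps) {
      // Different chromosome or no overlap: start a separate entity.
      group = { chrom: t.chrom, start: t.start, end: t.end, exons: new Map() };
      groups.push(group);
    }
    group.end = Math.max(group.end, t.end);
    for (const [s, e] of t.exons) {
      group.exons.set(`${s}-${e}`, [s, e]); // identical exons are stored once
    }
  }
  return groups.map((g) => ({
    chrom: g.chrom,
    start: g.start,
    end: g.end,
    exons: [...g.exons.values()].sort((a, b) => a[0] - b[0] || a[1] - b[1]),
  }));
}
```

Called once per gene, this yields one entry for a gene like UGT8 whose isoforms all overlap, and separate entries for a gene like OR4F16 whose annotations lie far apart or on different chromosomes.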
Links
- The developer documentation (clodius) describes how we generate our gene annotations starting from GenBank and UCSC Genome Browser data.
Gene annotations show where genes are located on a given genome. When zoomed in, the full exon-intron structure is shown. Because genes can be transcribed into numerous isoforms, we calculate an overlapping union of the known exons for each gene and display that. More details can be found on the Displaying gene annotations wiki page.
Available options
plusStrandColor and minusStrandColor
The colors for the annotations on each strand can be changed via the + Strand Color and - Strand Color configuration options in the Configure Series menu.
Annotations are tiled and require a server and tilesetUid. The labelPosition option can either be omitted or set to hidden if no label is desired.
Example configuration
{
"type": "vertical-gene-annotations",
"width": 60,
"tilesetUid": "OHJakQICQD6gTD7skx4EWA",
"server": "/api/v1",
"position": "left",
"name": "Gene Annotations",
"options": {
"labelPosition": "bottomRight"
}
}
HiGlass is a web application for displaying genomic contact matrices.
Demo
An online demo can be found at higlass.io
Running locally
HiGlass can also be run locally as a docker container. The higlass-docker repository contains detailed information about how to set it up and run it.
The simple example below stops any running higlass containers, removes them, pulls the latest version and runs it.
docker stop higlass-container;
docker rm higlass-container;
docker pull higlass/higlass-docker:v0.6.1 # higher versions are experimental and may or may not work
docker run --detach \
--publish 8989:80 \
--volume ~/hg-data:/data \
--volume ~/tmp:/tmp \
--name higlass-container \
higlass/higlass-docker:v0.6.1
The higlass website should now be visible at http://localhost:8989. Take a look at the documentation for adding a new track to see how to display data.
For security reasons, an instance created this way will not be accessible from hosts other than "localhost". To make it accessible to other hosts, please specify a hostname using the SITE_URL environment variable:
docker run --detach \
--publish 8989:80 \
--volume ~/hg-data:/data \
--volume ~/tmp:/tmp \
--name higlass-container \
-e SITE_URL=my.higlass.org \
higlass/higlass-docker:v0.6.1
To use the admin interface for managing the available datasets, a superuser needs to be created:
docker exec -it higlass-container higlass-server/manage.py createsuperuser
Once a username and password are created, the admin interface can be accessed at http://localhost:8989/admin.
Processing and importing data
Large datasets need to be converted to multiple resolutions so that they can be tiled and displayed using higlass. Unfortunately, due to the variety of data types available there are different procedures for different starting file types.
Cooler files
Cooler files store genome contact matrices as HDF5 files. Typical cooler files store data at one resolution. To support zooming, they need to be converted to multi-resolution cooler files. Starting with the highest resolution you would like to visualize in a file called matrix.cool:
pip install cooler
cooler zoomify --balance matrix.cool
This command will aggregate the contact matrix in matrix.cool to produce multiple normalized zoom levels, storing the resulting contact matrices in matrix.multi.cool. This can then be loaded into higlass:
docker exec higlass-container python higlass-server/manage.py \
ingest_tileset \
--filename /tmp/matrix.multi.cool \
--datatype matrix \
--filetype cooler
Creating cooler files from contacts
If a cooler file doesn't already exist, it can be created from a list of contacts (positions of pairs of genomic loci) and a set of chromosome sizes. Here's an example of a tab-delimited contact list or "pairs file":
chr1 124478180 - chr1 121966441 +
chr1 124478180 - chr1 121760032 +
...
It can be aggregated into a multi-resolution cooler using the following commands:
CHROMSIZES_FILE=hg19.chrom.sizes
BINSIZE=1000
CONTACTS_FILE=contacts.tsv
cooler cload pairs -c1 1 -p1 2 -c2 4 -p2 5 \
$CHROMSIZES_FILE:$BINSIZE \
$CONTACTS_FILE.sorted \
out.cool
cooler zoomify out.cool
Note that the order of the chromosomes in the chromosome sizes file should match the coordinate system used in HiGlass.
BigWig Files
BigWig files need to be processed using the clodius package before they can be displayed in higlass:
pip install clodius
clodius aggregate bigwig file.bigwig
The default bigwig aggregation will assume that the chromosome sizes are from hg19. To aggregate for a different assembly, use the --assembly option (e.g. --assembly mm9). It is also possible to pass in a set of chromosome sizes with the --chromsizes-filename option. Even though chromosome sizes are stored in the bigWig file, the conversion script requires an ordering as provided by the chromsizes-filename option to produce the hitile file.
This will convert file.bigwig into a file that higlass can read. If no filename is specified using the --output-file option, the original extension is replaced with .hitile. This hitile file can then be loaded into higlass:
docker exec higlass-container python higlass-server/manage.py \
ingest_tileset \
--filename /tmp/file.hitile \
--filetype hitile \
--datatype vector \
--name "Some 1D genomic data"
bedGraph files
Data can be imported from text files which have a bedGraph-like format:
chrom start end eigU eigT eigN GC
chr1 3000000 3020000 -0.30001076078261446 -0.28139497528740076 -0.4257141574669923 0.39005
chr1 3020000 3040000 -0.6506417814728713 -0.04220806911621135 -0.7562304803612467 0.3995
chr1 3040000 3060000 -0.5962263338769729 -0.58579839698137 -0.5406451925771123 0.38845
These files need to be aggregated and converted to hitile files using clodius:
pip install clodius
clodius aggregate bedgraph file.tsv --output-file file.hitile --assembly hg19
The columns containing the chromosome name (--chromosome-col), the starting position (--from-pos-col), the ending position (--to-pos-col) and the values (--value-col) can be specified as 1-based parameters. They default to 1, 2, 3 and 4, respectively. The genome assembly defaults to hg19 but can be changed using the --assembly parameter.
Note: The entries in the bed-like file must be sorted so that the order of the chromosomes matches the order defined in the negspy package (e.g. hg19/chromOrder.txt). For assemblies such as hg19 and mm9 this defaults to a semantic ordering (e.g. chr1, chr2, chr3... chrX, chrY, chrM).
Bedpe-like files
2D annotations often have two start and end points:
chr10 74160000 74720000 chr10 74165000 74725000
chr12 120920000 121640000 chr12 120925000 121645000
chr15 86360000 88840000 chr15 86365000 88845000
These can be aggregated using clodius:
clodius aggregate bedpe \
--assembly hg19 \
--chr1-col 1 --from1-col 2 --to1-col 3 \
--chr2-col 4 --from2-col 5 --to2-col 6 \
--output-file domains.txt.multires \
domains.txt
Once created, they can be entered into higlass using docker:
docker exec higlass-container python higlass-server/manage.py \
ingest_tileset \
--filename /tmp/domains.txt.multires.db \
--filetype bed2ddb \
--datatype 2d-rectangle-domains
Gene annotation files
Gene annotation files store information about exons, introns and gene names. They are sqlite3 db files with a schema that is compatible with higlass-server. Creating these files first requires a bed-like list of gene annotations:
chr5 176022802 176037131 GPRIN1 7 - union_114787 114787 protein-coding G protein regulated inducer of neurite outgrowth 1 176023808 176026835 176022802,176036999 176026878,176037131
chr8 56015016 56438710 XKR4 8 + union_114786 114786 protein-coding XK, Kell blood group complex subunit-related family, member 4 56015048 56436786 56015016,56270237,56435839 56015854,56270437,56438710
These can be generated from publicly available data as described in the clodius wiki. This bed-like file then needs to be aggregated for multiple resolutions and converted to an sqlite3 db file using clodius:
pip install clodius
clodius aggregate bedfile \
--max-per-tile 20 --importance-column 5 \
--assembly hg19 \
--output-file gene-annotations.beddb \
gene-annotations.bed
Once created, the gene annotations file can be loaded into higlass:
docker exec higlass-container python higlass-server/manage.py \
ingest_tileset \
--filename /tmp/gene-annotations.beddb \
--filetype beddb \
--datatype gene-annotation \
--coordSystem hg19 \
--name "Gene Annotations (hg19)"
The horizontal heatmap track is similar to the regular heatmap but it is rotated 45 degrees so that the diagonal lies along the x-axis. While it shows 2D data, this view is technically a 1D track and can be added to the top, left, right, or center track regions.
In the top and bottom configurations, the default is for the diagonal to face down. In the left and right configurations, the default is for the diagonal to face right. This default can be changed by selecting yes on the Flip heatmap option.
Example config
{
"tilesetUid": "CQMd6V_cRw6iCI_-Unl3PQ",
"server": "http://higlass.io/api/v1",
"position": "center",
"type": "horizontal-heatmap",
"height": 120,
"options": {
"maxZoom": null,
"labelPosition": "bottomRight",
"colorRange": [
"#FFFFFF","#F8E71C","#F5A623","#D0021B"
],
}
}
Options
Horizontal and vertical heatmaps have the same options as regular heatmaps. See the Heatmap section above for more information.
Horizontal line tracks display 1D tiled data as a line.
Value scaling
The values in a line are scaled according to the minimum and maximum values in the currently visible tiles (the so-called "visible values"). If the default linear scaling is selected, values are scaled linearly from the minimum to the maximum visible value. If log scaling is selected, then to avoid taking the logarithm of values equal to 0, a pseudocount equal to the median of the visible values is added to each value, and values are scaled from log(median_value) to log(max_value + median_value).
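The log scaling rule can be sketched as below. The function names are hypothetical, the median of an even-length array is taken as the mean of its two middle values, and mapping the scaled range onto pixel rows of a track of a given height is an assumption about how the result would be used.

```javascript
// Median of a numeric array (mean of the two middle values for even lengths).
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Returns a function mapping a value to a y pixel using log scaling with a
// median pseudocount: log(median) maps to the bottom, log(max + median) to the top.
function makeLogValueScale(visibleValues, trackHeight) {
  const pseudocount = median(visibleValues);
  if (!(pseudocount > 0)) {
    throw new Error('log scaling needs a positive median');
  }
  const maxValue = visibleValues.reduce((m, v) => Math.max(m, v), -Infinity);
  const low = Math.log(pseudocount);
  const high = Math.log(maxValue + pseudocount);
  return (value) => {
    const t = (Math.log(value + pseudocount) - low) / (high - low);
    return trackHeight * (1 - t);
  };
}
```

A value of 0 lands exactly on the bottom of the track, because log(0 + median) equals the lower bound of the range. The sketch rejects a median of 0, since the lower bound would then be undefined.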
Configurable options
Label position
The label position indicates where the name of the track will be drawn. The example on the left has been labelled as "wgEncodeSydhTfbsGm12878Rad21IggrabSig.hitile". The available values are topLeft, topRight, bottomLeft, bottomRight, and hidden.
Axis position
Line tracks can be adorned with an axis using the axisPositionHorizontal option. The default value is right, but it can be set to null or hidden if no axis is desired. For vertical line tracks, use the axisPositionVertical option, whose available values are top, bottom and hidden.
Stroke color
The stroke color determines how to color the drawn line. It can be configured using hex or word colors in the config file or selected from the presets shown in the track config menu.
Example configuration (horizontal line)
{
"server": "http://higlass.io/api/v1",
"tilesetUid": "b6qFe7fOSnaX-YkP2kzN1w",
"type": "horizontal-line",
"options": {
"labelPosition": "topLeft",
"axisPositionHorizontal": "left",
"lineStrokeColor": "blue"
}
}
Example configuration (vertical line)
{
"server": "http://higlass.io/api/v1",
"tilesetUid": "b6qFe7fOSnaX-YkP2kzN1w",
"type": "vertical-line",
"options": {
"axisPositionVertical": "right"
}
}
The rectangular heatmap is one of the central plot types in HiGlass. It depicts matrices by coloring each cell according to its value. Data is pulled in remotely from a server and rendered client-side. This configuration gives us the opportunity to dynamically change how the data is displayed by changing its scaling and color mapping.
Value scaling
Values in rectangular (and horizontal and vertical) heatmaps are scaled logarithmically. Some cells may, however, have values of 0 which make logarithmic scaling impossible. To get around this, we add the minimum non-zero value in the visible area to each value as a pseudocount. The colors used to display these values are then scaled from log(min_value) to log(min_value + max_value), where min_value is the minimum non-zero value in all of the currently visible tiles and max_value is the maximum value in all of the currently visible tiles.
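As a rough sketch of the pseudocount scaling described above (not the actual HiGlass code), the position of a cell value within the color range could be computed like this. The function name heatmap_color_position is hypothetical, and the sketch assumes non-negative cell values with at least one non-zero value visible.

```python
import math


def heatmap_color_position(value, visible_values):
    """Map a cell value to a position in [0, 1] along the color range.

    Hypothetical helper for illustration; assumes non-negative values and
    at least one non-zero value among the visible tiles.
    """
    min_value = min(v for v in visible_values if v > 0)  # minimum non-zero value
    max_value = max(visible_values)

    lo = math.log(min_value)              # bottom of the color range
    hi = math.log(min_value + max_value)  # top of the color range
    if hi == lo:
        return 0.0

    # The pseudocount (min_value) is added before taking the logarithm,
    # so a cell with value 0 lands exactly at the bottom of the range.
    return (math.log(value + min_value) - lo) / (hi - lo)


visible = [0, 1, 4, 9]
print(heatmap_color_position(0, visible))  # 0.0
print(heatmap_color_position(9, visible))  # 1.0
```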
Color map
The color map of the heatmap can be changed through the track configuration options menu. The presets roughly correspond to some of the examples available from matplotlib and are defined in the file app/scripts/config.js. The colors are spaced and interpolated evenly over the range of visible values.
Histogram-based color selection is planned for future releases.
Example configuration
{
"server": "/api/v1",
"tilesetUid": "CQMd6V_cRw6iCI_-Unl3PQ",
"type": "heatmap",
"position": "center",
"options": {
"colorRange": [ "#FFFFFF", "#F8E71C", "#F5A623", "#D0021B"],
"maxZoom": null
}
}
Custom color maps can be defined by selecting the Custom option. Up to five different colors can be selected. The cell values in the matrix will be interpolated evenly over the range of colors. More information on the color interpolation can be found in the documentation of d3's continuous scales.
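To make "interpolated evenly over the range of colors" concrete, here is a minimal sketch that places the colors of a colorRange at evenly spaced stops and blends linearly in RGB between neighbouring stops. The function names are hypothetical and plain linear RGB blending is an assumption; the interpolation d3 actually performs may differ in detail.

```python
def hex_to_rgb(color):
    """Convert a '#RRGGBB' string to an (r, g, b) tuple of integers."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def interpolate_color(t, color_range):
    """Return the RGB color at position t in [0, 1] along color_range.

    Hypothetical helper: the colors are treated as evenly spaced stops and
    neighbouring stops are blended linearly. Requires at least two colors.
    """
    t = min(max(t, 0.0), 1.0)
    stops = [hex_to_rgb(c) for c in color_range]
    position = t * (len(stops) - 1)
    index = min(int(position), len(stops) - 2)  # index of the lower stop
    fraction = position - index
    lower, upper = stops[index], stops[index + 1]
    return tuple(round(a + (b - a) * fraction) for a, b in zip(lower, upper))


color_range = ["#FFFFFF", "#F8E71C", "#F5A623", "#D0021B"]
print(interpolate_color(0.0, color_range))  # (255, 255, 255)
print(interpolate_color(1.0, color_range))  # (208, 2, 27)
```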
Label position
Selecting a position ('top left', 'top right', 'bottom left' and 'bottom right') from the Label position configuration menu will place a label with information about the track in that position. Currently, the track label shows the following information:
- Dataset name: The name of the dataset being displayed. This is either the name that was supplied when the file was uploaded, the name of the uploaded file (if no name was explicitly provided) or the name of the track.
- Current data resolution: While HiGlass provides smooth zooming, the data is stored and served at discrete resolutions. The current data resolution shows the resolution of the data being served at the current zoom level.
The regular axis track shows absolute positions along a given axis. It does not distinguish between chromosomes and is thus included primarily for debugging purposes.
Example configuration
{
"type": "top-axis",
"position": "top",
"name": "Top Axis"
}
Tiles in HiGlass
HiGlass only requests small chunks of data corresponding to the visible region from the server. As seen on the left, any HiGlass view is composed of a number of "tiles" which are pieced together to form the visible region on the screen. Tiles are identified by their zoom level, x position and y position (shown as z/x/y in the figure).
Tiles can be classified according to the dimensionality of the data they contain (1D or 2D) and according to the structure of the data (sparse or dense). Tiles containing dense data (e.g. for matrices or lines) store the data as an array. The position of each data point can be determined from the tile's position and the index of the data point within the dense array. Dense tiles are transferred between the server and client as base64-encoded strings.
Sparse tiles are more flexible in the type of data they contain and they contain position information for each data point. This makes it possible to display features such as gene annotations that are present in some locations and not others. Each dense 2D tile contains the data for a 256x256 pixel region. Due to the free zooming, when a tile is displayed, it can take up an area smaller or larger than 256x256 pixels. Dense 1D tiles contain 1024 data points.
All tile data is compressed on the server and extracted on the client to minimize the amount of data which needs to be transferred.
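The z/x/y identification of tiles can be illustrated with a small sketch that lists the tiles covering a visible region. It assumes that each zoom level halves the tile width, so that zoom level z splits the full data extent into 2 ** z tiles per axis; the function names and the max_width parameter are hypothetical and not part of the HiGlass API.

```python
import math


def tile_positions(zoom, start, end, max_width):
    """Return the tile indices along one axis that overlap [start, end).

    Assumes (for illustration) that zoom level `zoom` divides the full
    extent `max_width` into 2 ** zoom equally sized tiles.
    """
    tile_width = max_width / 2 ** zoom
    first = max(0, math.floor(start / tile_width))
    last = min(2 ** zoom - 1, math.ceil(end / tile_width) - 1)
    return list(range(first, last + 1))


def visible_tiles_2d(zoom, x_range, y_range, max_width):
    """Return (z, x, y) identifiers for every tile covering the visible area."""
    xs = tile_positions(zoom, x_range[0], x_range[1], max_width)
    ys = tile_positions(zoom, y_range[0], y_range[1], max_width)
    return [(zoom, x, y) for x in xs for y in ys]


# A 1024-unit-wide dataset at zoom level 2 has tiles that are 256 units wide.
print(visible_tiles_2d(2, (100, 600), (0, 256), max_width=1024))
# [(2, 0, 0), (2, 1, 0), (2, 2, 0)]
```

Only the tiles returned for the current zoom level and visible region would need to be requested from the server.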
Motivation
To display large datasets, HiGlass relies on aggregation and tiling to fetch only the visible region at any given time. A high-resolution (1Kb) genomic contact map will be a matrix of roughly 3M rows and 3M columns. While the sparsity of the matrix implies that the majority of the cells in the matrix will be unpopulated, these cells still need to be rendered. Assuming one pixel per column, a monitor would need to be approximately two and a half football fields long to display it in its entirety. To fit the entire matrix in a single monitor, we need to aggregate data so that we are displaying multiple rows and columns on each pixel.
Aggregation
Aggregation is the process of reducing larger datasets to smaller ones for the purposes of displaying more data than can fit on the screen at once. While there are a multitude of ways to aggregate large datasets, we make significant use of summation and prioritization for numerical and categorical data, respectively. The following sections will provide examples for how we use aggregation to reduce the size of larger datasets.
Summation
Aggregation by summation is simply aggregation of adjacent elements by summing their values. This can be generalized to any n-dimensional data set, but we only employ it for matrices and vectors.
Contact matrices
In the case of contact matrices, this is easily accomplished by summing adjacent cells. Consider the following matrix:
1 | 2 | 3 | 4 |
5 | 6 | 7 | 8 |
9 | 10 | 11 | 12 |
13 | 14 | 15 | 16 |
We can perform a round of aggregation, summing each block of four cells into one:
14 | 22 |
46 | 54 |
From this state, we could perform one more round of aggregation and end up with a matrix with just one entry which is simply the sum of all the values of our original matrix.
136 |
This procedure is an example of how we can lower the resolution of our matrix so that it can be displayed using fewer pixels.
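The rounds of aggregation shown above can be reproduced with a few lines of code. This is a minimal sketch using plain nested lists; the function name aggregate_matrix is hypothetical and the sketch assumes a matrix with even dimensions.

```python
def aggregate_matrix(matrix):
    """Sum each 2x2 block of cells into one cell.

    Illustrative helper; assumes the number of rows and columns is even.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    return [
        [
            matrix[r][c] + matrix[r][c + 1]
            + matrix[r + 1][c] + matrix[r + 1][c + 1]
            for c in range(0, n_cols, 2)
        ]
        for r in range(0, n_rows, 2)
    ]


original = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
once = aggregate_matrix(original)
twice = aggregate_matrix(once)
print(once)   # [[14, 22], [46, 54]]
print(twice)  # [[136]]
```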
Vectors
Similar to matrices, vectors can also be reduced by summation. The vector
1 | 2 | 3 | 4 |
reduces to
3 | 7 |
and can be further reduced to
10 |
While summation is neither the only way nor, perhaps, the best way to reduce large matrices and vectors, we find that it not only serves our purposes but meshes smoothly with the notion of binning in the creation of contact matrices. A contact matrix that starts out at 1Kb resolution becomes a matrix at 2Kb resolution after one round of aggregation. After 14 rounds of aggregation, an entire 1Kb resolution human contact matrix can fit into a 256x256 pixel image (3.2e9 < 256 * 2 ** 14 * 1000). The resolution of the data at this level of aggregation is 1Kbp * 2 ** 14 = 16384 Kbp per pixel (or bin).
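Both the vector reduction and the count of 14 rounds can be checked with a short sketch. The function names are hypothetical; the genome size of 3.2e9 base pairs, the 1Kb starting resolution and the 256-pixel tile width are the figures used in the text.

```python
def aggregate_vector(vector):
    """Sum adjacent pairs of entries; assumes an even number of entries."""
    return [vector[i] + vector[i + 1] for i in range(0, len(vector), 2)]


def rounds_needed(genome_size, base_resolution, tile_size=256):
    """Count the aggregation rounds needed before the whole genome fits
    into tile_size bins, doubling the bin size on every round."""
    rounds = 0
    while tile_size * base_resolution * 2 ** rounds < genome_size:
        rounds += 1
    return rounds


print(aggregate_vector([1, 2, 3, 4]))                    # [3, 7]
print(aggregate_vector(aggregate_vector([1, 2, 3, 4])))  # [10]

rounds = rounds_needed(genome_size=3_200_000_000, base_resolution=1000)
print(rounds)              # 14
print(1000 * 2 ** rounds)  # 16384000 bp, i.e. 16384 Kbp per bin
```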
Prioritization
Another method of aggregating data is by picking out entries from a dataset according to some importance function. This is commonly found in maps. Showing every village on an overview of the world would be useless because all of the labels would overlap. Showing every village when only a county is displayed, however, makes more sense. As the size of the area increases, labels are selectively hidden to show features with a higher priority (often population or area, in the case of maps). The same holds true for gene annotations and other genomic features.
Gene Annotations
Displaying every gene label in the genome on one monitor is impossible because the labels would overlap. By prioritizing some labels over others and selectively hiding those with lower priority, we can maintain a nearly constant number of non-overlapping labels at any resolution. To do this, we first declare that we will attempt to display no more than 100 gene labels in any 1024 pixel region. We then aggregate adjacent regions by taking the 100 most 'important' entries from the union of the genes in the two regions. This can be illustrated using a simple example where we begin with a list of prioritized regions:
region start | region end | gene | priority |
---|---|---|---|
1 | 1 | A | 9 |
2 | 2 | B | 2 |
3 | 3 | C | 6 |
4 | 4 | D | 13 |
and aggregate them such that no single region has more than one gene in it:
region start | region end | gene | priority |
---|---|---|---|
1 | 2 | A | 9 |
3 | 4 | D | 13 |
And once more for completeness:
region start | region end | gene | priority |
---|---|---|---|
1 | 4 | D | 13 |
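The example tables above can be reproduced with a small sketch of aggregation by prioritization. The data layout (a region as a start, an end and a list of gene/priority pairs) and the function name aggregate_by_priority are assumptions made for illustration; they are not the format HiGlass uses internally.

```python
def aggregate_by_priority(regions, max_entries=1):
    """Merge adjacent pairs of regions, keeping the highest-priority genes.

    Each region is (start, end, [(gene, priority), ...]). After merging two
    neighbours, only the `max_entries` most important genes are kept.
    """
    merged = []
    for i in range(0, len(regions), 2):
        pair = regions[i:i + 2]
        start, end = pair[0][0], pair[-1][1]
        genes = [gene for _, _, entries in pair for gene in entries]
        genes.sort(key=lambda gene: gene[1], reverse=True)
        merged.append((start, end, genes[:max_entries]))
    return merged


regions = [
    (1, 1, [("A", 9)]),
    (2, 2, [("B", 2)]),
    (3, 3, [("C", 6)]),
    (4, 4, [("D", 13)]),
]
level_1 = aggregate_by_priority(regions)
level_2 = aggregate_by_priority(level_1)
print(level_1)  # [(1, 2, [('A', 9)]), (3, 4, [('D', 13)])]
print(level_2)  # [(1, 4, [('D', 13)])]
```

Setting max_entries to 100 would correspond to the limit of 100 gene labels per region mentioned in the text.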
Track configuration
Each track can be configured to a certain extent. All configuration and track-related operations can be accessed through the track configuration menu. This menu only appears on mouseover, so if you don't see it, move the mouse out of and back onto the track. An example of the menu can be seen in the screenshot on the right.
Track information
Many tracks can display information about themselves. This can be enabled by selecting Label Position for a particular series in a track and picking a location (such as Top Left).
This will display some information about the track in one of its corners:
Zoom limiting
A common use case is to limit the resolution of the data which is visible. While this may result in a more coarse-grain image, it can also preserve features that are only visible under a more coarse-grain aggregation:
This option can be found under the Configure Series -> Zoom limit menu:
Axes
Axes can be added to horizontal-line and vertical-line track types.
Selecting left or right (top or bottom on vertical-line tracks) places the axis in the selected position:
Track operations are actions that affect how tracks are displayed and how they interact with one another. To access the track operations menu, click on the cog icon that appears when the mouse is over a track.
HiGlass can display data in a variety of different track types.
- Rectangular heatmap
- Horizontal and vertical heatmaps
- Line
- Chromosome grid
- Chromosome axis
- Viewport projection
- Gene annotations
Views
Views are visible units with their own x and y scales. Every track within a unit shares the same view-wide x and y scales. 2D tracks in the middle use both the view-wide x and y scales. Horizontal tracks use the view-wide x scale and ignore the y. They can define their own y scale (as in Line tracks) that scales values in the track or they can simply display information without a y scale (gene annotations). Vertical tracks ignore the view-wide x scale and can define their own.
All tracks share the same scaling factor.
Adding new views
New views are created by clicking the copy view icon on the right side of the view header. The newly created view will be a copy of the view on which the icon was clicked. When created, it will try to place itself at the nearest available position, moving left to right, top to bottom.
Closing views
Views are closed by clicking the close view icon. The vertical space which is occupied by a view can then be compacted by views below it.
Cross-view operations
Cross-view operations involve transferring or linking parameters (such as scaling factor and location) between two separate views. They are always initiated from the view settings menu and always involve the selection of a target view. When a cross-view operation such as taking the zoom level is initiated, a target view selector appears as a green overlay. Hovering over different views moves the target view selector to that view. Clicking on a target performs the operation between the source view (the one which initiated the operation) and the target view (which was selected).
View synchronization
While scales between views are generally independent, it is possible to synchronize the axes of one view with another.
Take zoom from
Taking the zoom from a different view sets the scaling factor of this view to that of the target view. Both views remain centered on the same point that they were centered on before the operation.
Take location from
Taking the location from a different view sets the center of this view (along both the view-wide x and y axes) to the center of the target view.
Take location and zoom from
Taking the location and zoom from a different view centers and zooms this view on the same location (i.e. the same center point) as the target view.
View linking
Having independent view-wide x and y scales is useful for displaying different regions in different views, but there are situations when we may want to link (lock) scales between views. This operation maintains a constant difference between either the center points or scale factors of two views. This constant difference is equal to the difference in the parameter (center point or scale factor) at the time of linking.
It is entirely possible to link more than two views. The pairwise differences in parameters are maintained between all of the members of the zoom group.
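The idea of maintaining a constant difference can be sketched as follows. This is an illustration of the principle only, not the HiGlass implementation: the function name propagate_change is hypothetical, and the linked parameter is modelled as a single number per view (for example a center coordinate or a zoom level).

```python
def propagate_change(linked_values, changed_view, new_value):
    """Update every linked view after one of them changes.

    linked_values maps a view id to its parameter (center point or zoom
    level). Shifting all members by the same amount keeps every pairwise
    difference equal to what it was at the time of linking.
    """
    delta = new_value - linked_values[changed_view]
    return {view: value + delta for view, value in linked_values.items()}


# Three views linked by location; their centers differ by fixed offsets.
centers = {"view1": 100, "view2": 250, "view3": 400}
updated = propagate_change(centers, "view1", 130)
print(updated)  # {'view1': 130, 'view2': 280, 'view3': 430}
print(updated["view2"] - updated["view1"])  # 150, unchanged from before
```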
Linking views by zoom level (scale factor)
Views linked by zoom level maintain a constant zoom separation. When one view is zoomed, the other linked views follow. The locations remain free and panning is unconstrained between views.
Linking views by location (center point)
Views linked by location maintain a constant separation of their center points. They may be scaled independently of each other, but the difference in center point location remains constant. Note that zooming often modifies the center point, so zooming operations may appear to move both views. This is simply a byproduct of the ability to zoom into points away from the center.
Linking views by zoom and location
Views linked by zoom level and location always maintain a constant separation between both parameters. Zooming or panning in one view zooms or pans the other linked views.
Unlinking
Any parameter linking can be removed from the view settings menu.
Syncing and linking at same time
A common operation is taking the zoom level and location from a different view and linking their zooms and locations. This is useful when one wishes to compare identical locations in multiple samples. While this operation can be accomplished by first taking the zoom and location and then linking the zoom and location, we've also included a convenience menu option which performs both operations with one action.
Searching for a gene or genomic coordinate
It's possible to search for a particular locus using the genome position search box:
If the genome position search box isn't visible, it can be enabled by toggling it in the view config menu:
The viewport projection track shows the bounds of one view on another. It is useful when showing the same dataset at two different resolutions. To instantiate it, select the settings of the target view (the one whose bounds we want to draw) and click 'Show this viewport on'. This will then let you select another track (2D) onto which to display the bounds of the target view.
Available options
- projectionFillColor: The color with which to fill the viewport projection box. Can be specified as a name (e.g. "red"), a hex value (e.g. #f00) or an rgb string (e.g. rgb(255,0,0)).
- projectionStrokeColor: The color of the stroke for the projection box.
- projectionFillOpacity: The opacity of the fill of the projection (0 to 1).
- projectionStrokeOpacity: The opacity of the stroke of the projection (0 to 1).
Example configuration
{
"uid": "FI58zIkYQKe2S--8x6Iwfg",
"type": "viewport-projection-center",
"fromViewUid": "A4tM32baS9qnYB0HCAiuTg",
"name": "Viewport Projection",
"options": {
"projectionFillColor": "#777",
"projectionStrokeColor": "#777",
"projectionFillOpacity": 0.3,
"projectionStrokeOpacity": 0.3
}
}
- Adding a new track
- Resizing a view
- Adding and removing views
- Replacing tracks
- Changing a heatmap's colormap
- Adding track labels
- Adding new views
- Closing views
- View synchronization
- View linking
- Syncing and linking
- Exporting and sharing