1
Package
Author
Maintainer
Description
biocViews
downloads as of 5/4
2
AlphaBeta
Yadollah Shahryary Dizaji [cre, aut], Frank Johannes [aut], Rashmi Hazarika [aut]
Yadollah Shahryary Dizaji <shahryary@gmail.com>
AlphaBeta is a computational method for estimating epimutation rates and spectra from high-throughput DNA methylation data in plants. The method has been specifically designed to: 1. analyze 'germline' epimutations in the context of multi-generational mutation accumulation lines (MA-lines). 2. analyze 'somatic' epimutations in the context of plant development and aging.
Epigenetics, FunctionalGenomics, Genetics,
MathematicalBiology
131
3
ALPS
Venu Thatikonda, Natalie Jäger
Venu Thatikonda <thatikonda92@gmail.com>
The package provides analysis and publication quality visualization routines for genome-wide epigenomics data such as histone modification or transcription factor ChIP-seq, ATAC-seq, DNase-seq etc. The functions in the package can be used with any type of data that can be represented with bigwig files at any resolution. The goal of the ALPS is to provide analysis tools for most downstream analysis without leaving the R environment and most tools in the package require a minimal input that can be prepared with basic R, unix or excel skills.
Epigenetics, Sequencing, ChIPSeq, ATACSeq, Visualization,
Transcription, HistoneModification
159
4
AnVIL
Martin Morgan [aut, cre], Nitesh Turaga [aut], BJ Stubbs [ctb], Vincent Carey [ctb], Marcel Ramos [ctb], Sweta Gopaulakrishnan [ctb], Valerie Obenchain [ctb]
Martin Morgan <mtmorgan.bioc@gmail.com>
The AnVIL is a cloud computing resource developed in part by the National Human Genome Research Institute. The AnVIL package provides end-user and developer functionality. For the end-user, AnVIL provides fast binary package installation, utilities for working with Terra / AnVIL table and data resources, and convenient functions for file movement to and from Google cloud storage. For developers, AnVIL provides programmatic access to the Terra, Leonardo, Dockstore, and Gen3 RESTful programming interfaces, including helper functions to transform JSON responses to formats more amenable to manipulation in R.
Infrastructure
3
5
APAlyzer
Ruijia Wang [cre, aut] (<https://orcid.org/0000-0002-4211-5207>), Bin Tian [aut]
Ruijia Wang <rjwang.bioinfo@gmail.com>
Perform 3'UTR APA, Intronic APA and gene expression analysis using RNA-seq data.
Sequencing, RNASeq, DifferentialExpression, GeneExpression,
GeneRegulation, Annotation, DataImport, Software
136
6
ASpediaFI
Doyeong Yu, Kyubin Lee, Daejin Hyung, Soo Young Cho, Charny Park
Doyeong Yu <parklab.bi@gmail.com>
This package provides functionalities for a systematic and integrative analysis of alternative splicing events and their functional interactions.
AlternativeSplicing, Annotation, Coverage, GeneExpression,
GeneSetEnrichment, GraphAndNetwork, KEGG, Network,
NetworkInference, Pathways, Reactome, Transcription,
Sequencing, Visualization
127
7
Autotuner
Craig McLean
Craig McLean <craigmclean23@gmail.com>
This package is designed to help facilitate data processing in untargeted metabolomics. To do this, the algorithm contained within the package performs statistical inference on raw data to come up with the best set of parameters to process the raw data.
MassSpectrometry, Metabolomics
159
8
AWFisher
Zhiguang Huo
Zhiguang Huo <zhuo@ufl.edu>
Implementation of the adaptively weighted Fisher's method, including fast p-value computing, variability index, and meta-pattern.
StatisticalMethod, Software
124
9
basilisk
Aaron Lun [aut, cre, cph], Vince Carey [ctb]
Aaron Lun <infinite.monkeys.with.keyboards@gmail.com>
Installs a self-contained Python instance that is managed by the R installation. This aims to provide a consistent Python version that can be used reliably by Bioconductor packages. Module versions are also controlled to guarantee consistent behavior on different user systems.
Infrastructure
25
10
basilisk.utils
Aaron Lun [aut, cre, cph]
Aaron Lun <infinite.monkeys.with.keyboards@gmail.com>
Implements utilities for installation of the basilisk package, primarily to avoid re-writing the same R code in both the configure script (for centrally administered R installations) and in the lazy installation mechanism (for distributed binaries). It is highly unlikely that developers - or, heaven forbid, end-users! - will need to interact with this package directly; they should be using the basilisk package instead.
Infrastructure
35
11
BgeeCall
Julien Wollbrett [aut, cre], Julien Roux [aut], Sara Fonseca Costa [ctb], Marc Robinson Rechavi [ctb], Frederic Bastian [aut]
Julien Wollbrett <julien.wollbrett@unil.ch>
Reference intergenic regions are generated by the Bgee RNA-Seq pipeline. These intergenic regions are used to generate all Bgee RNA-Seq present/absent expression calls. BgeeCall now allows users to generate present/absent calls for any RNA-Seq library as long as reference intergenic sequences have been generated for the corresponding species. The threshold of present/absent expression is no longer arbitrarily defined but is calculated based on the expression of all RNA-Seq libraries integrated in Bgee.
Software, GeneExpression, RNASeq
36
12
biobtreeR
Tamer Gur
Tamer Gur <tgur@ebi.ac.uk>
The biobtreeR package provides an interface to the [biobtree](https://github.com/tamerh/biobtree) tool, which covers a large set of bioinformatics datasets and provides search and chain-mapping functionality.
Annotation
50
13
BiocDockerManager
Maintainer [aut, cre], Nitesh Turaga [aut]
Maintainer <maintainer@bioconductor.org>
The package works analogously to BiocManager but for Docker images. Use the BiocDockerManager package to install and manage Docker images provided by the Bioconductor project. It is a convenient package to install images, update images, and find which Bioconductor-based Docker images are available.
Software, Infrastructure, ThirdPartyClient
10
14
BiocSet
Kayla Morrell [aut, cre], Martin Morgan [aut], Kevin Rue-Albrecht [ctb], Lluís Revilla Sancho [ctb]
Kayla Morrell <kayla.morrell@roswellpark.org>
BiocSet displays different biological sets in a triple tibble format. These three tibbles are `element`, `set`, and `elementset`. The user has the ability to activate one of these three tibbles to perform common functions from the dplyr package. Mapping functionality and accessing web references for elements/sets are also available in BiocSet.
GeneExpression, GO, KEGG, Software
151
15
BioTIP
Zhezhen Wang, Andrew Goldstein, Biniam Feleke, Qier An, Antonio Feliciano and Xinan Yang
Zhezhen Wang <zhezhen@uchicago.edu>, X Holly Yang < xyang2@uchicago> and Yuxi (Jennifer) Sun <ysun11@uchicago.edu>
Adopting tipping-point theory to transcriptome profiles to unravel disease regulatory trajectory.
Sequencing, RNASeq, GeneExpression, Transcription, Software
107
16
biscuiteer
Tim Triche, Jr. [aut, cre], Wanding Zhou [aut], Ben Johnson [aut], Jacob Morrison [aut], Lyong Heo [aut]
Jacob Morrison <jacob.morrison@vai.org>
A test harness for bsseq loading of Biscuit output, summarization of WGBS data over defined regions and in mappable samples, with or without imputation, dropping of mostly-NA rows, age estimates, etc.
DataImport, MethylSeq, DNAMethylation
136
17
blacksheepr
MacIntosh Cornwell [aut], RugglesLab [cre]
RugglesLab <ruggleslab@gmail.com>
Blacksheep is a tool designed for outlier analysis in the context of pairwise comparisons in an effort to find distinguishing characteristics from two groups. This tool was designed to be applied for biological applications such as phosphoproteomics or transcriptomics, but it can be used for any data that can be represented by a 2D table, and has two sub populations within the table to compare.
Sequencing, RNASeq, GeneExpression, Transcription,
DifferentialExpression, Transcriptomics
113
18
brainflowprobes
Amanda Price [aut, cre] (<https://orcid.org/0000-0001-7352-3732>), Leonardo Collado-Torres [ctb] (<https://orcid.org/0000-0003-2140-308X>)
Amanda Price <amanda.joy.price@gmail.com>
Use these functions to characterize genomic regions for BrainFlow target probe design.
Coverage, Visualization, ExperimentalDesign,
Transcriptomics, FlowCytometry, GeneTarget
103
19
brendaDb
Yi Zhou [aut, cre] (<https://orcid.org/0000-0003-0969-3993>)
Yi Zhou <yi.zhou@uga.edu>
R interface for importing and analyzing enzyme information from the BRENDA database.
ThirdPartyClient, Annotation, DataImport
121
20
BRGenomics
Mike DeBerardine [aut, cre]
Mike DeBerardine <mike.deberardine@gmail.com>
This package provides useful and efficient utilities for the analysis of high-resolution genomic data using standard Bioconductor methods and classes. BRGenomics is feature-rich and simplifies a number of post-alignment processing steps and data handling. Emphasis is on efficient analysis of multiple datasets, with support for normalization and blacklisting. Included are functions for: spike-in normalizing data; generating basepair-resolution readcounts and coverage data (e.g. for heatmaps); importing and processing bam files (e.g. for conversion to bigWig files); generating metaplots/metaprofiles (bootstrapped mean profiles) with confidence intervals; conveniently calling DESeq2 without using sample-blind estimates of genewise dispersion; among other features.
Software, DataImport, Sequencing, Coverage, RNASeq, ATACSeq,
ChIPSeq, Transcription, GeneRegulation, GeneExpression,
Normalization
9
21
BUSpaRse
Lambda Moses [aut, cre] (<https://orcid.org/0000-0002-7092-9427>), Lior Pachter [aut, ths] (<https://orcid.org/0000-0002-9164-6231>)
Lambda Moses <dlu2@caltech.edu>
The kallisto | bustools pipeline is a fast and modular set of tools to convert single cell RNA-seq reads in fastq files into gene count or transcript compatibility counts (TCC) matrices for downstream analysis. Central to this pipeline is the barcode, UMI, and set (BUS) file format. This package serves the following purposes: First, this package allows users to manipulate BUS format files as data frames in R and then convert them into gene count or TCC matrices. Furthermore, since R and Rcpp code is easier to handle than pure C++ code, users are encouraged to tweak the source code of this package to experiment with new uses of the BUS format and different ways to convert the BUS file into a gene count matrix. Second, this package can conveniently generate files required to generate gene count matrices for spliced and unspliced transcripts for RNA velocity. Here biotypes can be filtered and scaffolds and haplotypes can be removed, and the filtered transcriptome can be extracted and written to disk. Third, this package implements utility functions to get transcripts and associated genes required to convert BUS files to gene count matrices, to write the transcript to gene information in the format required by bustools, and to read output of bustools into R as sparse matrices.
SingleCell, RNASeq, WorkflowStep
479
22
calm
Kun Liang [aut, cre]
Kun Liang <kun.liang@uwaterloo.ca>
Statistical methods for multiple testing with covariate information. Traditional multiple testing methods only consider a list of test statistics, such as p-values. Our methods incorporate the auxiliary information, such as the lengths of gene coding regions or the minor allele frequencies of SNPs, to improve power.
Bayesian, DifferentialExpression, GeneExpression,
Regression, Microarray, Sequencing, RNASeq, MultipleComparison,
Genetics, ImmunoOncology, Metabolomics, Proteomics,
Transcriptomics
101
23
CARNIVAL
Enio Gjerga <enio.gjerga@gmail.com> Panuwat Trairatphisan Anika Liu Alberto Valdeolivas Nikolas Peschke
Enio Gjerga <enio.gjerga@gmail.com>
An upgraded causal reasoning tool from Melas et al in R with updated assignments of TFs' weights from PROGENy scores. Optimization parameters can be freely adjusted and multiple solutions can be obtained and aggregated.
Transcriptomics, GeneExpression, Network
6
24
cBioPortalData
Levi Waldron [aut], Marcel Ramos [aut, cre]
Marcel Ramos <marcel.ramos at roswellpark.org>
The cBioPortalData package takes compressed resources from repositories such as cBioPortal and assembles a MultiAssayExperiment object with Bioconductor classes.
Infrastructure, Software, ThirdPartyClient
82
25
ceRNAnetsim
Selcen Ari Yuka [aut, cre] (<https://orcid.org/0000-0002-0028-2453>), Alper Yilmaz [aut] (<https://orcid.org/0000-0002-8827-4887>)
Selcen Ari Yuka <selcenarii@gmail.com>
This package simulates regulation of ceRNA (competing endogenous RNA) expression levels after an expression level change in one or more miRNAs/mRNAs. The methodology adopted by the package has the potential to incorporate any ceRNA (circRNA, lincRNA, etc.) into a miRNA:target interaction network. The package basically distributes miRNA expression over available ceRNAs, where each ceRNA attracts miRNAs in proportion to its amount. In addition, the package can use multiple parameters that modify the miRNA effect on its target (seed type, binding energy, binding location, etc.). The functions handle the given dataset as a graph object, and the processes progress via edge and node variables.
NetworkInference, SystemsBiology, Network, GraphAndNetwork,
Transcriptomics
47
26
CeTF
Carlos Alberto Oliveira de Biagi Junior [aut, cre], Ricardo Perecin Nociti [aut], Breno Osvaldo Funicheli [aut], João Paulo Bianchi Ximenez [ctb], Patrícia de Cássia Ruy [ctb], Marcelo Gomes de Paula [ctb], Rafael dos Santos Bezerra [ctb], Wilson Araújo da Silva Junior [aut, ths]
Carlos Alberto Oliveira de Biagi Junior <cbiagijr@gmail.com>
This package provides the necessary functions for performing the Partial Correlation coefficient with Information Theory (PCIT) (Reverter and Chan 2008) and Regulatory Impact Factors (RIF) (Reverter et al. 2010) algorithms. The PCIT algorithm identifies meaningful correlations to define edges in a weighted network and can be applied to any correlation-based network including but not limited to gene co-expression networks, while the RIF algorithm identifies critical Transcription Factors (TF) from gene expression data. When combined, these two algorithms provide a very relevant layer of information for gene expression studies (Microarray, RNA-seq and single-cell RNA-seq data).
Sequencing, RNASeq, Microarray, GeneExpression,
Transcription, Normalization, DifferentialExpression,
SingleCell, Network, Regression, ChIPSeq, ImmunoOncology,
Coverage
67
27
circRNAprofiler
Simona Aufiero
Simona Aufiero <simo.aufiero@gmail.com>
R-based computational framework for a comprehensive in silico analysis of circRNAs. This computational framework allows users to combine and analyze circRNAs previously detected by multiple publicly available annotation-based circRNA detection tools. It covers different aspects of circRNA analysis, from differential expression analysis, evolutionary conservation, and biogenesis to functional analysis.
Annotation, StructuralPrediction, FunctionalPrediction,
GenePrediction, GenomeAssembly, DifferentialExpression
134
28
CiteFuse
Yingxin Lin [aut, cre], Hani Kim [aut]
Yingxin Lin <yingxin.lin@sydney.edu.au>
The CiteFuse package implements a suite of methods and tools for CITE-seq data from pre-processing to integrative analytics, including doublet detection, network-based modality integration, cell type clustering, differential RNA and protein expression analysis, ADT evaluation, ligand-receptor interaction analysis, and interactive web-based visualisation of the analyses.
SingleCell, GeneExpression
3
29
cliqueMS
Oriol Senan Campos [aut, cre], Antoni Aguilar-Mogas [aut], Jordi Capellades [aut], Miriam Navarro [aut], Oscar Yanes [aut], Roger Guimera [aut], Marta Sales-Pardo [aut]
Oriol Senan Campos <oriol.senan@praenoscere.com>
Annotates data from liquid chromatography coupled to mass spectrometry (LC/MS) metabolomics experiments. Based on a network algorithm (O. Senan, A. Aguilar-Mogas, M. Navarro, O. Yanes, R. Guimerà and M. Sales-Pardo, Bioinformatics, 35(20), 2019), 'CliqueMS' builds a weighted similarity network where nodes are features and edges are weighted according to the similarity of these features. Then it searches for the most plausible division of the similarity network into cliques (fully connected components). Finally, it annotates metabolites within each clique, obtaining for each annotated metabolite the neutral mass and its features, corresponding to isotopes, ionization adducts and fragmentation adducts of that metabolite.
Metabolomics, MassSpectrometry, Network, NetworkInference
183
30
clustifyr
Rui Fu [aut, cre], Kent Riemondy [aut], RNA Bioscience Initiative [fnd], Austin Gillen [ctb], Chengzhe Tian [ctb], Jay Hesselberth [ctb], Yue Hao [ctb], Michelle Daya [ctb]
Rui Fu <raysinensis@gmail.com>
Package designed to aid in classifying cells from single-cell RNA sequencing data using external reference data (e.g., bulk RNA-seq, scRNA-seq, microarray, gene lists). A variety of correlation based methods and gene list enrichment methods are provided to assist cell type assignment.
SingleCell, Annotation, Sequencing, Microarray
32
31
cmapR
Ted Natoli [aut, cre] (<https://orcid.org/0000-0002-0953-0206>)
Ted Natoli <ted.e.natoli@gmail.com>
The Connectivity Map (CMap) is a massive resource of perturbational gene expression profiles built by researchers at the Broad Institute and funded by the NIH Library of Integrated Network-Based Cellular Signatures (LINCS) program. Please visit https://clue.io for more information. The cmapR package implements methods to parse, manipulate, and write common (CMap) data objects, such as annotated matrices and collections of gene sets.
DataImport, DataRepresentation, GeneExpression
35
32
CNVfilteR
Jose Marcos Moreno-Cabrera <jmoreno@igtp.cat> and Bernat Gel <bgel@igtp.cat>
Jose Marcos Moreno-Cabrera <jmoreno@igtp.cat>
CNVfilteR identifies those CNVs that can be discarded by using the single nucleotide variant (SNV) calls that are usually obtained in common NGS pipelines.
CopyNumberVariation, Sequencing, DNASeq, Visualization,
DataImport
125
33
combi
Stijn Hawinkel <stijn.hawinkel@ugent.be>
Joris Meys <joris.meys@ugent.be>
Combine quasi-likelihood estimation, compositional regression models and latent variable models for integrative visualization of several omics datasets. Both unconstrained and constrained integration is available, the results are shown as interpretable multiplots.
Metagenomics, DimensionReduction, Microbiome, Visualization,
Metabolomics
26
34
CoreGx
Petr Smirnov [aut], Ian Smith [aut], Christopher Eeles [aut], Benjamin Haibe-Kains [aut, cre]
Benjamin Haibe-Kains <benjamin.haibe.kains@utoronto.ca>
A collection of functions and classes which serve as the foundation for our lab's suite of R packages, such as 'PharmacoGx' and 'RadioGx'. This package was created to abstract shared functionality from other lab package releases to increase ease of maintainability and reduce code repetition in current and future 'Gx' suite programs. Major features include a 'CoreSet' class, from which 'RadioSet' and 'PharmaSet' are derived, along with get and set methods for each respective slot. Additional functions related to fitting and plotting dose response curves, quantifying statistical correlation and calculating area under the curve (AUC) or survival fraction (SF) are included. For more details please see the included documentation, as well as: Smirnov, P., Safikhani, Z., El-Hachem, N., Wang, D., She, A., Olsen, C., Freeman, M., Selby, H., Gendoo, D., Grossman, P., Beck, A., Aerts, H., Lupien, M., Goldenberg, A. (2015) <doi:10.1093/bioinformatics/btv723>. Manem, V., Labie, M., Smirnov, P., Kofia, V., Freeman, M., Koritzinksy, M., Abazeed, M., Haibe-Kains, B., Bratman, S. (2018) <doi:10.1101/449793>.
Software, Pharmacogenomics, Classification, Survival
27
35
CrossICC
Yu Sun <suny226@mail2.sysu.edu.cn>, Qi Zhao <zhaoqi@sysucc.org.cn>
Yu Sun <suny226@mail2.sysu.edu.cn>
CrossICC uses an iterative strategy to derive the optimal gene set and cluster number from a consensus similarity matrix generated by consensus clustering. It can handle multiple cross-platform datasets, so it requires no between-dataset normalization. This package also provides abundant functions for visualization and for identifying subtypes of cancer. In particular, many cancer-related analysis methods are embedded to facilitate the clinical translation of the identified cancer subtypes.
Software, GeneExpression, DifferentialExpression, GUI,
GeneSetEnrichment, Classification, Clustering,
FeatureExtraction, Survival, Microarray, RNASeq, BatchEffect,
Normalization, Preprocessing, Visualization
121
36
CSSQ
Ashwath Kumar [aut, cre] (<https://orcid.org/0000-0001-9106-6715>)
Ashwath Kumar <akumar301@gatech.edu>
This package is designed to perform statistical analysis to identify statistically significant differentially bound regions between multiple groups of ChIP-seq datasets.
ChIPSeq, DifferentialPeakCalling, Sequencing, Normalization
49
37
ctgGEM
Mark Block and Carrie Minette
USD Biomedical Engineering <bicbioeng@gmail.com>
Cell Tree Generator for Gene Expression Matrices (ctgGEM) streamlines the building of cell-state hierarchies from single-cell gene expression data across multiple existing tools for improved comparability and reproducibility. It supports pseudotemporal ordering algorithms and visualization tools from monocle, cellTree, TSCAN, sincell, and destiny, and provides a unified output format for integration with downstream data analysis workflows and Cytoscape.
GeneExpression, Visualization, Sequencing, SingleCell,
Clustering, RNASeq, ImmunoOncology, DifferentialExpression,
MultipleComparison, QualityControl, DataImport
22
38
cytomapper
Nils Eling [aut, cre] (<https://orcid.org/0000-0002-4711-1176>), Nicolas Damond [aut] (<https://orcid.org/0000-0003-3027-8989>)
Nils Eling <nils.eling@dqbm.uzh.ch>
Highly multiplexed imaging cytometry acquires the single-cell expression of selected proteins in a spatially-resolved fashion. These measurements can be visualized across multiple length-scales. First, pixel-level intensities represent the spatial distributions of feature expression with highest resolution. Second, after segmentation, expression values or cell-level metadata (e.g. cell-type information) can be visualized on segmented cell areas. This package contains functions for the visualization of multiplexed read-outs and cell-level information obtained by multiplexed imaging cytometry. The main functions of this package allow 1. the visualization of pixel-level information across multiple channels and 2. the display of cell-level information (expression and/or metadata) on segmentation masks.
ImmunoOncology, Software, SingleCell, OneChannel,
TwoChannel, MultipleComparison, Normalization, DataImport
13
39
DAMEfinder
Stephany Orjuela [aut, cre] (<https://orcid.org/0000-0002-1508-461X>), Dania Machlab [aut], Mark Robinson [aut]
Stephany Orjuela <sorjuelal@gmail.com>
'DAMEfinder' offers functionality for taking methtuple or bismark outputs to calculate ASM scores and compute DAMEs. It also offers nice visualization of methyl-circle plots.
DNAMethylation, DifferentialMethylation, Coverage
41
40
dearseq
Denis Agniel [aut], Boris P. Hejblum [aut, cre], Marine Gauthier [aut]
Boris P. Hejblum <boris.hejblum@u-bordeaux.fr>
Differential Expression Analysis RNA-seq data with variance component score test accounting for data heteroscedasticity through precision weights. Perform both gene-wise and gene set analyses, and can deal with repeated or longitudinal data. Methods are detailed in: Agniel D & Hejblum BP (2017) Variance component score test for time-course gene set analysis of longitudinal RNA-seq data, Biostatistics, 18(4):589-604. and Gauthier M, Agniel D, Thiébaut R & Hejblum BP (2019). dearseq: a variance component score test for RNA-Seq differential analysis that effectively controls the false discovery rate, *bioRxiv* 635714.
BiomedicalInformatics, CellBiology, DifferentialExpression,
DNASeq, GeneExpression, Genetics, GeneSetEnrichment,
ImmunoOncology, KEGG, Regression, RNASeq, Sequencing,
SystemsBiology, TimeCourse, Transcription, Transcriptomics
54
41
debCAM
Lulu Chen <luluchen@vt.edu>
Lulu Chen <luluchen@vt.edu>
An R package for fully unsupervised deconvolution of complex tissues. It provides basic functions to perform unsupervised deconvolution on mixture expression profiles by Convex Analysis of Mixtures (CAM) and some auxiliary functions to help understand the subpopulation-specific results. It also implements functions to perform supervised deconvolution based on prior knowledge of molecular markers, S matrix or A matrix. Combining molecular markers from CAM and from prior knowledge can achieve semi-supervised deconvolution of mixtures.
Software, CellBiology, GeneExpression
126
42
deltaCaptureC
Michael Shapiro [aut, cre] (<https://orcid.org/0000-0002-2769-9320>)
Michael Shapiro <sifka@earthlink.net>
This package discovers meso-scale chromatin remodelling from 3C data. 3C data is local in nature. It gives interaction counts between restriction enzyme digestion fragments and a preferred 'viewpoint' region. By binning this data and using permutation testing, this package can test whether there are statistically significant changes in the interaction counts between the data from two cell types or two treatments.
BiologicalQuestion, StatisticalMethod
104
43
DEWSeq
Sudeep Sahadevan <sahadeva@embl.de>, Thomas Schwarzl <schwarzl@embl.de>
Hentze bioinformatics team <biohentze@embl.de>
Differential expression analysis of windows for next-generation sequencing data like eCLIP or iCLIP data.
Sequencing, GeneRegulation, FunctionalGenomics,
DifferentialExpression
129
44
DIAlignR
Shubham Gupta <shubham.1637@gmail.com>, Hannes Rost <hannes.rost@utoronto.ca>
Shubham Gupta <shubham.1637@gmail.com>
To obtain unbiased proteome coverage from a biological sample, the mass spectrometer is operated in Data Independent Acquisition (DIA) mode. Alignment of these DIA runs establishes consistency and yields fewer missing values in the complete data matrix. This package implements a dynamic-programming approach with an affine gap penalty for pairwise alignment of analytes. A hybrid approach of global alignment (through MS2 features) and local alignment (with MS2 chromatograms) is implemented in this tool.
MassSpectrometry, Metabolomics, Proteomics, Alignment,
Software
29
45
distinct
Simone Tiberi [aut, cre], Mark D. Robinson [aut]
Simone Tiberi <simone.tiberi@uzh.ch>
distinct is a statistical method to perform differential testing between two or more groups of distributions; differential testing is performed via hierarchical non-parametric permutation tests on the cumulative distribution functions (cdfs) of each sample. While most methods for differential expression target differences in the mean abundance between conditions, distinct, by comparing full cdfs, identifies both differential patterns involving changes in the mean and more subtle variations that do not involve the mean (e.g., unimodal vs. bi-modal distributions with the same mean). distinct is a general and flexible tool: due to its fully non-parametric nature, which makes no assumptions on how the data was generated, it can be applied to a variety of datasets. It is particularly suitable for differential state analyses on single cell data (i.e., differential analyses within sub-populations of cells), such as single cell RNA sequencing (scRNA-seq) and high-dimensional flow or mass cytometry (HDCyto) data. To use distinct one needs data from two or more groups of samples (i.e., experimental conditions), with at least 2 samples (i.e., biological replicates) per group.
Genetics, RNASeq, Sequencing, DifferentialExpression,
GeneExpression, MultipleComparison, Software, Transcription,
StatisticalMethod, Visualization, SingleCell, FlowCytometry,
GeneTarget
4
46
dittoSeq
Daniel Bunis
Daniel Bunis <daniel.bunis@ucsf.edu>
A universal, user friendly, single-cell and bulk RNA sequencing visualization toolkit that allows highly customizable creation of color blindness friendly, publication-quality figures. dittoSeq accepts both SingleCellExperiment (SCE) and Seurat objects, as well as the import and usage, via conversion to an SCE, of SummarizedExperiment or DGEList bulk data. Visualizations include dimensionality reduction plots, heatmaps, scatterplots, percent composition or expression across groups, and more. Customizations range from size and title adjustments to automatic generation of annotations for heatmaps, overlay of trajectory analysis onto any dimensionality reduction plot, hidden data overlay upon cursor hovering via ggplotly conversion, and many more. All with simple, discrete inputs. Color blindness friendliness is powered by legend adjustments (enlarged keys), and by allowing the use of shapes or letter-overlay in addition to the carefully selected dittoColors().
Software, Visualization, RNASeq, SingleCell, GeneExpression,
Transcriptomics, DataImport
40
47
DMCFB
Farhad Shokoohi
Farhad Shokoohi <shokoohi@icloud.com>
DMCFB is a pipeline for identifying differentially methylated cytosines using a Bayesian functional regression model in bisulfite sequencing data. By using a functional regression data model, it tries to capture position-specific, group-specific and other covariates-specific methylation patterns as well as spatial correlation patterns and unknown underlying models of methylation data. It is robust and flexible with respect to the true underlying models and inclusion of any covariates, and the missing values are imputed using spatial correlation between positions and samples. A Bayesian approach is adopted for estimation and inference in the proposed method.
DifferentialMethylation, Sequencing, Coverage, Bayesian,
Regression
115
48
dpeak
Dongjun Chung, Carter Allen
Dongjun Chung <dongjun.chung@gmail.com>
dPeak is a statistical framework for the high resolution identification of protein-DNA interaction sites using PET and SET ChIP-Seq and ChIP-exo data. It provides a computationally efficient and user-friendly interface to process ChIP-seq and ChIP-exo data, implement exploratory analysis, fit the dPeak model, and export a list of predicted binding sites for downstream analysis.
ChIPSeq, Genetics, Sequencing, Software, Transcription
8
49
Dune
Hector Roux de Bezieux [aut, cre] (<https://orcid.org/0000-0002-1489-8339>), Kelly Street [aut]
Hector Roux de Bezieux <hector.rouxdebezieux@berkeley.edu>
Given a set of clustering labels, Dune merges pairs of clusters to increase mean ARI between labels, improving replicability.
Clustering, GeneExpression, RNASeq, Software, SingleCell,
Transcriptomics, Visualization
30
50
easyreporting
Dario Righelli [cre, aut]
Dario Righelli <dario.righelli@gmail.com>
An S4 class for facilitating the automated creation of rmarkdown files inside other packages/software, even without knowing rmarkdown language. Best if implemented in functions as recursive style programming.
ReportWriting
57
51
eisaR
Michael Stadler [aut, cre], Dimos Gaidatzis [aut], Lukas Burger [aut], Charlotte Soneson [aut]
Michael Stadler <michael.stadler@fmi.ch>
Exon-intron split analysis (EISA) uses ordinary RNA-seq data to measure changes in mature RNA and pre-mRNA reads across different experimental conditions to quantify transcriptional and post-transcriptional regulation of gene expression. For details see Gaidatzis et al., Nat Biotechnol 2015. doi: 10.1038/nbt.3269. eisaR implements the major steps of EISA in R.
Transcription, GeneExpression, GeneRegulation,
FunctionalGenomics, Transcriptomics, Regression, RNASeq
47
52
EnMCB
Xin Yu
Xin Yu <whirlsyu@gmail.com>
Creation of the correlated blocks using DNA methylation profiles. A stacked ensemble of machine learning models, which combined the support vector machine and elastic-net regression model, can be constructed to predict disease progression.
Normalization, DNAMethylation, MethylationArray,
SupportVectorMachine
57
53
EpiTxDb
Felix G.M. Ernst [aut, cre] (<https://orcid.org/0000-0001-5064-0928>)
Felix G.M. Ernst <felix.gm.ernst@outlook.com>
EpiTxDb facilitates the storage of epitranscriptomic information. More specifically, it can keep track of modification identity, position, the enzyme for introducing it on the RNA, a specifier which determines the position on the RNA to be modified and the literature references each modification is associated with.
Software, Epitranscriptomics
31
54
exomePeak2
Zhen Wei
Zhen Wei <zhen.wei10@icloud.com>
exomePeak2 provides bias-aware quantification and peak detection on Methylated RNA immunoprecipitation sequencing data (MeRIP-Seq). MeRIP-Seq is a commonly applied sequencing technology to measure the transcriptome-wide location and abundance of RNA modification sites under a given cellular condition. However, the quantification and peak calling in MeRIP-Seq are sensitive to PCR amplification bias, which is prevalent in next generation sequencing (NGS) techniques. In addition, the RNA-Seq based count data exhibits biological variation in small read counts. exomePeak2 collectively addresses these challenges by introducing a rich set of robust data science models tailored for MeRIP-Seq. With exomePeak2, users can perform peak calling, modification site quantification, and differential analysis with a straightforward one-step function. Alternatively, users can define personalized methods for their own analysis through multi-step functions and diagnostic plots.
Sequencing, MethylSeq, RNASeq, ExomeSeq, Coverage,
Normalization, Preprocessing, ImmunoOncology,
DifferentialExpression
15
55
ExploreModelMatrix
Charlotte Soneson [aut, cre] (<https://orcid.org/0000-0003-3833-2169>), Federico Marini [aut] (<https://orcid.org/0000-0003-3252-7758>), Michael Love [aut] (<https://orcid.org/0000-0001-8401-0545>), Florian Geier [aut] (<https://orcid.org/0000-0002-9076-9264>), Michael Stadler [aut] (<https://orcid.org/0000-0002-2269-4934>)
Charlotte Soneson <charlottesoneson@gmail.com>
Given a sample data table and a design formula, generate an interactive application to explore the resulting design matrix.
ExperimentalDesign, Regression, DifferentialExpression
8
56
fcoex
Tiago Lubiana [aut, cre], Helder Nakaya [aut, ths]
Tiago Lubiana <tiago.lubiana.alves@usp.br>
The fcoex package implements an easy-to-use interface to co-expression analysis based on the FCBF (Fast Correlation-Based Filter) algorithm. It was implemented specifically to deal with single-cell data. The modules found can be used to redefine cell populations, unravel novel gene associations and predict gene function by guilt-by-association. The package structure is adapted from the CEMiTool package, relying on visualizations and code designed and written by CEMiTool's authors.
GeneExpression, Transcriptomics, GraphAndNetwork,
mRNAMicroarray, RNASeq, Network, NetworkEnrichment, Pathways,
ImmunoOncology, SingleCell
127
57
fcScan
Abdallah El-Kurdi <ak161@aub.edu.lb> Ghiwa khalil <gk39@aub.edu.lb> Georges Khazen <gkhazen@lau.edu.lb> Pierre Khoueiry <pk17@aub.edu.lb>
Pierre Khoueiry <pk17@aub.edu.lb> Abdallah El-Kurdi <ak161@aub.edu.lb>
This package is used to detect combination of genomic coordinates falling within a user defined window size along with user defined overlap between identified neighboring clusters. It can be used for genomic data where the clusters are built on a specific chromosome or specific strand. Clustering can be performed with a greedy option allowing thus the presence of additional sites within the allowed window size.
GenomeAnnotation, Clustering
115
58
flowSpecs
Jakob Theorell [aut, cre]
Jakob Theorell <jakob.theorell@ndcn.ox.ac.uk>
This package is intended to fill the role of conventional cytometry pre-processing software, for spectral decomposition, transformation, visualization and cleanup, and to aid further downstream analyses, such as with DepecheR, by enabling transformation of flowFrames and flowSets to dataframes. Functions for flowCore-compliant automatic 1D-gating/filtering are in the pipeline. The package name has been chosen both as it will deal with spectral cytometry and as it will hopefully give the user a nice pair of spectacles through which to view their data.
Software,CellBasedAssays,DataRepresentation,ImmunoOncology,
FlowCytometry,SingleCell,Visualization,Normalization,DataImport
141
59
flowSpy
Yuting Dai [aut, cre]
Yuting Dai <forlynna@sjtu.edu.cn>
A trajectory inference and visualization toolkit for flow and mass cytometry data. flowSpy offers a complete analysis workflow for flow and mass cytometry data. flowSpy can be a valuable tool for applications ranging from clustering and dimensionality reduction to trajectory reconstruction and pseudotime estimation for flow and mass cytometry data.
CellBiology, Clustering, Visualization, Software,
CellBasedAssays, FlowCytometry, NetworkInference, Network
135
60
FRASER
Christian Mertes <mertes@in.tum.de>, Ines Scheller <scheller@in.tum.de>, Prof. Julien Gagneur <gagneur@in.tum.de>
Christian Mertes <mertes@in.tum.de>
Detection of rare aberrant splicing events in transcriptome profiles. The workflow aims to assist the diagnostics in the field of rare diseases where RNA-seq is performed to identify aberrant splicing defects.
RNASeq, AlternativeSplicing, Sequencing, Software, Genetics,
Coverage
7
61
frenchFISH
Adam Berman, Geoff Macintyre
Adam Berman <agb61@cam.ac.uk>
FrenchFISH comprises a nuclear volume correction method coupled with two types of Poisson models: either a Poisson model for improved manual spot counting without the need for control probes; or a homogeneous Poisson Point Process model for automated spot counting.
Software, BiomedicalInformatics, CellBiology, Genetics,
HiddenMarkovModel, Preprocessing
20
62
GCSConnection
Jiefei Wang
Jiefei Wang <szwjf08@gmail.com>
Create R 'connection' objects to Google Cloud Storage buckets using the Google REST interface. Both read and write connections are supported. The package also provides functions to view and manage files on Google Cloud.
Infrastructure
25
63
GCSscore
Guy M. Harris & Shahroze Abbas & Michael F. Miles
Guy M. Harris <harrisgm@vcu.edu>
For differential expression analysis of 3'IVT and WT-style microarrays from Affymetrix/Thermo-Fisher. Based on S-score algorithm originally described by Zhang et al 2002.
DifferentialExpression, Microarray, OneChannel,
ProprietaryPlatforms, DataImport
114
64
gemini
Mahdi Zamanighomi [aut], Sidharth Jain [aut, cre]
Sidharth Jain <sidharthsjain@gmail.com>
GEMINI uses log-fold changes to model sample-dependent and independent effects, and uses a variational Bayes approach to infer these effects. The inferred effects are used to score and identify genetic interactions, such as lethality and recovery. More details can be found in Zamanighomi et al. 2019 (in press).
Software, CRISPR, Bayesian, DataImport
115
65
GeneTonic
Federico Marini [aut, cre] (<https://orcid.org/0000-0003-3252-7758>)
Federico Marini <marinif@uni-mainz.de>
This package provides a Shiny application that aims to combine at different levels the existing pieces of the transcriptome data and results, in a way that makes it easier to generate insightful observations and hypotheses - combining the benefits of interactivity and reproducibility, e.g. by capturing the features and gene sets of interest highlighted during the live session, and creating an HTML report as an artifact where text, code, and output coexist.
GUI, GeneExpression, Software, Transcription,
Transcriptomics, Visualization, DifferentialExpression,
Pathways, ReportWriting, GeneSetEnrichment, Annotation,
Pathways, GO
34
66
GenomicOZone
Hua Zhong, Mingzhou Song
Hua Zhong <zh9118@gmail.com>, Mingzhou Song <joemsong@cs.nmsu.edu>
The package clusters gene activity along chromosomes into zones, detects differential zones as outstanding, and visualizes maps of outstanding zones across the genome. The method guarantees cluster optimality, runtime linear in sample size, and reproducibility. It enables new characterization of effects due to genome reorganization, structural variation, and epigenome alteration.
Software, GeneExpression, Transcription,
DifferentialExpression, FunctionalPrediction, GeneRegulation,
BiomedicalInformatics, CellBiology, FunctionalGenomics,
Genetics, SystemsBiology, Transcriptomics, Clustering,
Regression, RNASeq, Annotation, Visualization, Sequencing,
Coverage, DifferentialMethylation, GenomicVariation,
StructuralVariation
115
67
GGPA
Dongjun Chung, Hang J. Kim, Carter Allen
Dongjun Chung <dongjun.chung@gmail.com>
Genome-wide association studies (GWAS) are a widely used tool for identification of genetic variants associated with phenotypes and diseases, though complex diseases featuring many genetic variants with small effects present difficulties for these traditional studies. By leveraging pleiotropy, the statistical power of a single GWAS can be increased. This package provides functions for fitting graph-GPA, a statistical framework to prioritize GWAS results by integrating pleiotropy. The 'GGPA' package provides a user-friendly interface to fit graph-GPA models, implement association mapping, and generate a phenotype graph.
Software, StatisticalMethod, Classification,
GenomeWideAssociation, SNP, Genetics, Clustering,
MultipleComparison, Preprocessing, GeneExpression,
DifferentialExpression
3
68
glmGamPoi
Constantin Ahlmann-Eltze [aut, cre] (<https://orcid.org/0000-0002-3762-068X>), Michael Love [ctb]
Constantin Ahlmann-Eltze <artjom31415@googlemail.com>
Fit linear models to overdispersed count data. The package can estimate the overdispersion and fit repeated models for matrix input. It is designed to handle large input datasets as they typically occur in single cell RNA-seq experiments.
Regression, RNASeq, Software, SingleCell
8
69
GmicR
Richard Virgen-Slane
Richard Virgen-Slane <RVS.BioTools@gmail.com>
This package uses Bayesian network learning to detect relationships between gene modules detected by WGCNA and immune cell signatures defined by xCell. It is a hypothesis-generating tool.
Software, SystemsBiology, GraphAndNetwork, Network,
NetworkInference, GUI, ImmunoOncology, GeneExpression,
QualityControl, Bayesian, Clustering
144
70
gmoviz
Kathleen Zeglinski [cre, aut], Arthur Hsu [aut], Monther Alhamdoosh [aut] (<https://orcid.org/0000-0002-2411-1325>), Constantinos Koutsakis [aut]
Kathleen Zeglinski <kathleen.zeglinski@csl.com.au>
Genetically modified organisms (GMOs) and cell lines are widely used models in all kinds of biological research. As part of characterising these models, DNA sequencing technology and bioinformatics analyses are used systematically to study their genomes. Therefore, large volumes of data are generated and various algorithms are applied to analyse this data, which introduces a challenge on representing all findings in an informative and concise manner. `gmoviz` provides users with an easy way to visualise and facilitate the explanation of complex genomic editing events on a larger, biologically-relevant scale.
Visualization, Sequencing, GeneticVariability,
GenomicVariation, Coverage
31
71
GPA
Dongjun Chung, Emma Kortemeier, Carter Allen
Dongjun Chung <dongjun.chung@gmail.com>
This package provides functions for fitting GPA, a statistical framework to prioritize GWAS results by integrating pleiotropy information and annotation data. In addition, it also includes ShinyGPA, an interactive visualization toolkit to investigate pleiotropic architecture.
Software, StatisticalMethod, Classification,
GenomeWideAssociation, SNP, Genetics, Clustering,
MultipleComparison, Preprocessing, GeneExpression,
DifferentialExpression
5
72
gramm4R
Mengci Li, Dandan Liang, Tianlu Chen and Wei Jia
Tianlu Chen <chentianlu@sjtu.edu.cn>
Generalized Correlation Analysis for Metabolome and Microbiome (GRaMM), for inter-correlation pairs discovery among metabolome and microbiome.
GraphAndNetwork,Microbiome
132
73
gscreend
Katharina Imkeller [cre, aut], Wolfgang Huber [aut]
Katharina Imkeller <k.imkeller@dkfz.de>
Package for the analysis of pooled genetic screens (e.g. CRISPR-KO). The analysis of such screens is based on the comparison of gRNA abundances before and after a cell proliferation phase. The gscreend package takes gRNA counts as input and allows detection of genes whose knockout decreases or increases cell proliferation.
Software, StatisticalMethod, PooledScreens, CRISPR
119
74
HCAExplorer
Daniel Van Twisk [aut], Martin Morgan [aut], Bioconductor Package Maintainer [cre]
Bioconductor Package Maintainer <maintainer@bioconductor.org>
Search, browse, reference, and download resources from the Human Cell Atlas data portal. Development of this package is supported through funds from the Chan / Zuckerberg initiative.
DataImport, Sequencing
121
75
HiLDA
Zhi Yang [aut, cre], Yuichi Shiraishi [ctb]
Zhi Yang <zyang895@gmail.com>
A package built under the Bayesian framework of applying hierarchical latent Dirichlet allocation to statistically test whether the mutational exposures of mutational signatures (Shiraishi-model signatures) are different between two groups.
Software, SomaticMutation, Sequencing, StatisticalMethod,
Bayesian
109
76
HIPPO
Tae Kim [aut, cre], Mengjie Chen [aut]
Tae Kim <tk382@uchicago.edu>
HIPPO selects features and clusters cells simultaneously for single-cell UMI (scRNA-seq) data. It has a novel feature selection method using zero inflation instead of gene variance, and is computationally faster than other existing methods since it only relies on PCA+Kmeans rather than graph-clustering or consensus clustering.
Sequencing, SingleCell, GeneExpression,
DifferentialExpression, Clustering
14
77
idr2d
Konstantin Krismer [aut, cre, cph] (<https://orcid.org/0000-0001-8994-3416>), David Gifford [ths, cph] (<https://orcid.org/0000-0003-1709-4034>)
Konstantin Krismer <krismer@mit.edu>
A tool to measure reproducibility between genomic experiments that produce two-dimensional peaks (interactions between peaks), such as ChIA-PET, HiChIP, and HiC. idr2d is an extension of the original idr package, which is intended for (one-dimensional) ChIP-seq peaks.
DNA3DStructure, GeneRegulation, PeakDetection, Epigenetics,
FunctionalGenomics, Classification, HiC
130
78
IgGeneUsage
Simo Kitanovski [aut, cre]
Simo Kitanovski <simo.kitanovski@uni-due.de>
Decoding the properties of immune repertoires is key in understanding the response of adaptive immunity to challenges such as viral infection. One important task in immune repertoire profiling is the detection of biases in Ig gene usage between biological conditions. IgGeneUsage is a computational tool for the analysis of differential gene usage in immune repertoires. It employs Bayesian hierarchical models to fit complex gene usage data from immune repertoire sequencing experiments and quantifies Ig gene usage biases as probabilities.
DifferentialExpression, Regression, Genetics, Bayesian
120
79
iSEEu
Kevin Rue-Albrecht [aut, cre] (<https://orcid.org/0000-0003-3899-3872>), Charlotte Soneson [aut] (<https://orcid.org/0000-0003-3833-2169>), Federico Marini [aut] (<https://orcid.org/0000-0003-3252-7758>), Aaron Lun [aut] (<https://orcid.org/0000-0002-3564-4813>), Michael Stadler [ctb]
Kevin Rue-Albrecht <kevinrue67@gmail.com>
iSEEu (the iSEE universe) contains diverse functionality to extend the usage of the iSEE package, including additional classes for the panels, or modes allowing easy configuration of iSEE applications.
ImmunoOncology, Visualization, GUI, DimensionReduction,
FeatureExtraction, Clustering, Transcription, GeneExpression,
Transcriptomics, SingleCell, CellBasedAssays
19
80
KnowSeq
Daniel Castillo-Secilla, Juan Manuel Galvez, Francisco Carrillo-Perez, Marta Verona-Almeida, Francisco Manuel Ortuno, Luis Javier Herrera and Ignacio Rojas.
Daniel Castillo-Secilla <cased@ugr.es>
KnowSeq proposes a novel methodology that comprises the most relevant steps in transcriptomic gene expression analysis. KnowSeq is intended to serve as an integrative tool that allows users to process and extract relevant biomarkers, as well as to assess them through machine learning approaches. The final objective of KnowSeq is biological knowledge extraction from the biomarkers (Gene Ontology enrichment, pathway listing and visualization, and evidence related to the addressed disease). Although the package allows analyzing all the data manually, the main strength of KnowSeq is the possibility of generating an automatic and intelligent HTML report that collects all the involved steps in one document. It is important to highlight that the pipeline is totally modular and flexible, hence it can be started from any of the different steps. KnowSeq is intended to serve as a novel tool to help experts in the field acquire robust knowledge and conclusions for the data and diseases under study.
GeneExpression, DifferentialExpression, GeneSetEnrichment,
DataImport, Classification, FeatureExtraction, Sequencing,
RNASeq, BatchEffect, Normalization, Preprocessing,
QualityControl, Genetics, Transcriptomics, Microarray,
Alignment, Pathways, SystemsBiology, GO, ImmunoOncology
129
81
LACE
Daniele Ramazzotti [aut], Fabrizio Angaroni [aut], Davide Maspero [cre, aut], Alex Graudenzi [aut]
Davide Maspero <d.maspero@campus.unimib.it>
LACE is an algorithmic framework that processes single-cell somatic mutation profiles from cancer samples collected at different time points and in distinct experimental settings, to produce longitudinal models of cancer evolution. The approach solves a Boolean Matrix Factorization problem with phylogenetic constraints, by maximizing a weighted likelihood function computed on multiple time points.
BiomedicalInformatics, SingleCell, SomaticMutation
3
82
LinkHD
Laura M. Zingaretti [aut, cre]
"Laura M Zingaretti" <m.lau.zingaretti@gmail.com>
Here we present Link-HD, an approach to integrate heterogeneous datasets, as a generalization of STATIS-ACT (“Structuration des Tableaux A Trois Indices de la Statistique–Analyse Conjointe de Tableaux”), a family of methods to join and compare information from multiple subspaces. However, STATIS-ACT has some drawbacks since it only allows continuous data and it is unable to establish relationships between samples and features. In order to tackle these constraints, we incorporate multiple distance options and a linear regression based Biplot model in order to establish relationships between observations and variables and perform variable selection.
Classification,MultipleComparison,Regression,Software
121
83
lionessR
Marieke Lydia Kuijjer [aut] (<https://orcid.org/0000-0001-6280-3130>), Ping-Han Hsieh [cre] (<https://orcid.org/0000-0003-3054-1409>)
Ping-Han Hsieh <dn070017@gmail.com>
LIONESS, or Linear Interpolation to Obtain Network Estimates for Single Samples, can be used to reconstruct single-sample networks (https://arxiv.org/abs/1505.06440). This code implements the LIONESS equation in the lioness function in R to reconstruct single-sample networks. The default network reconstruction method we use is based on Pearson correlation. However, lionessR can run on any network reconstruction algorithm that returns a complete, weighted adjacency matrix. lionessR works for both unipartite and bipartite networks.
Network, NetworkInference, GeneExpression
123
84
Maaslin2
Himel Mallick [aut], Ali Rahnavard [aut], Lauren McIver [aut, cre]
Lauren McIver <lauren.j.mciver@gmail.com>
MaAsLin2 is a comprehensive R package for efficiently determining multivariable association between clinical metadata and microbial meta'omic features. MaAsLin2 relies on general linear models to accommodate most modern epidemiological study designs, including cross-sectional and longitudinal, and offers a variety of data exploration, normalization, and transformation methods. MaAsLin2 is the next generation of MaAsLin.
Metagenomics, Software, Microbiome, Normalization
323
85
MACSQuantifyR
Raphaël Bonnet [aut, cre], Marielle Nebout [dtc], Giulia Biondani [dtc], Jean-François Peyron [aut, ths], Inserm [fnd]
Raphaël Bonnet <raphael.bonnet@univ-cotedazur.fr>
Automatically process the metadata of the MACSQuantify FACS sorter. It runs multiple modules: i) imports the raw file with graphical selection of duplicates in the well plate, ii) computes statistics on the data, and iii) can compute a combination index.
DataImport, Preprocessing, Normalization, FlowCytometry,
DataRepresentation, GUI
106
86
MatrixGenerics
Constantin Ahlmann-Eltze [aut] (<https://orcid.org/0000-0002-3762-068X>), Peter Hickey [aut, cre] (<https://orcid.org/0000-0002-8153-6258>)
Peter Hickey <peter.hickey@gmail.com>
S4 generic functions modeled after the 'matrixStats' API for alternative matrix implementations. Packages with alternative matrix implementations can depend on this package and implement the generic functions that are defined here for a useful set of row and column summary statistics. Other package developers can import this package and handle different matrix implementations without worrying about incompatibilities.
Infrastructure, Software
12
87
MBQN
Ariane Schad [aut, cre] (<https://orcid.org/0000-0002-1921-8902>), Clemens Kreutz [aut, ctb] (<https://orcid.org/0000-0002-8796-5766>), Eva Brombacher [aut, ctb] (<https://orcid.org/0000-0002-5488-0985>)
Ariane Schad <ariane.schad@fdm.uni-freiburg.de>
Modified quantile normalization for omics or other matrix-like data distorted in location and scale.
Normalization, Preprocessing, Proteomics, Software
67
88
MEAT
Sarah Voisin [aut, cre] (<https://orcid.org/0000-0002-4074-7083>), Steve Horvath [ctb] (<https://orcid.org/0000-0002-4110-3589>)
Sarah Voisin <sarah.voisin.aeris@gmail.com>
This package estimates epigenetic age in skeletal muscle, using DNA methylation data generated with Illumina Infinium technology (HM27, HM450 and HMEPIC).
Epigenetics, DNAMethylation, Microarray, Normalization,
BiomedicalInformatics, MethylationArray, Preprocessing
26
89
MEB
Yan Zhou, Jiadi Zhu
Jiadi Zhu <2160090406@email.szu.edu.cn>, Yan Zhou <zhouy1016@szu.edu.cn>
Identifying differentially expressed genes between the same or different species is an urgent demand for biological and medical research. For RNA-seq data, systematic technical effects and different sequencing depths are usually encountered when conducting experiments. Normalization is regarded as an essential step in the discovery of biologically important changes in expression. The present methods usually involve normalization of the data with a scaling factor, followed by detection of significant genes. However, more than one scaling factor may exist because of the complexity of real data. Consequently, methods that normalize data by a single scaling factor may deliver suboptimal performance or may not even work. The development of modern machine learning techniques has provided a new perspective regarding discrimination between differentially expressed (DE) and non-DE genes. However, in reality, the non-DE genes comprise only a small set and may contain housekeeping genes (in same species) or conserved orthologous genes (in different species). Therefore, the process of detecting DE genes can be formulated as a one-class classification problem, where only non-DE genes are observed, while DE genes are completely absent from the training data. We transform the problem to an outlier detection problem by treating DE genes as outliers, and we propose a normalization-invariant minimum enclosing ball (NIMEB) method to construct a smallest possible ball to contain the known non-DE genes in a feature space. The genes outside the minimum enclosing ball can then be naturally considered to be DE genes. Compared with the existing methods, the proposed NIMEB method does not require data normalization, which is particularly attractive when the RNA-seq data include more than one scaling factor. Furthermore, the NIMEB method could be easily extended to different species without normalization.
DifferentialExpression, GeneExpression, Normalization,
Classification, Sequencing
99
90
metaseqR2
Panagiotis Moulos [aut, cre]
Panagiotis Moulos <moulos@fleming.gr>
Provides an interface to several normalization and statistical testing packages for RNA-Seq gene expression data. Additionally, it creates several diagnostic plots, performs meta-analysis by combining the results of several statistical tests and reports the results in an interactive way.
Software, GeneExpression, DifferentialExpression,
WorkflowStep, Preprocessing, QualityControl, Normalization,
ReportWriting, RNASeq, Transcription, Sequencing,
Transcriptomics, Bayesian, Clustering, CellBiology,
BiomedicalInformatics, FunctionalGenomics, SystemsBiology,
ImmunoOncology, AlternativeSplicing, DifferentialSplicing,
MultipleComparison, TimeCourse, DataImport, ATACSeq,
Epigenetics, Regression, ProprietaryPlatforms,
GeneSetEnrichment, BatchEffect, ChIPSeq
3
91
MetaVolcanoR
Cesar Prada [aut, cre], Diogenes Lima [aut], Helder Nakaya [aut, ths]
Cesar Prada <cesar.prada@usp.br>
MetaVolcanoR combines differential gene expression results. It implements three strategies to summarize differential gene expression from different studies: i) a Random Effects Model (REM) approach, ii) a p-value combining approach, and iii) a vote-counting approach. In all cases, MetaVolcano exploits the Volcano plot reasoning to visualize the gene expression meta-analysis results.
GeneExpression, DifferentialExpression, Transcriptomics,
mRNAMicroarray, RNASeq
164
92
MethCP
Boying Gong [aut, cre]
Boying Gong <jorothy_gong@berkeley.edu>
MethCP is a differentially methylated region (DMR) detecting method for whole-genome bisulfite sequencing (WGBS) data, which is applicable for a wide range of experimental designs beyond the two-group comparisons, such as time-course data. MethCP identifies DMRs based on change point detection, which naturally segments the genome and provides region-level differential analysis.
DifferentialMethylation, Sequencing, WholeGenome, TimeCourse
100
93
methrix
Anand Mayakonda [aut, cre], Reka Toth [aut], Maximilian Schönung [ctb], Pavlo Lutsik [ctb], Joschka Hey [ctb]
Anand Mayakonda <anand_mt@hotmail.com>
Bedgraph files generated by bisulfite pipelines often come in various flavors. A critical downstream step requires summarization of these files into methylation/coverage matrices. Methrix performs this data aggregation step and also provides many other useful downstream functions.
DNAMethylation, Sequencing, Coverage
127
94
methylCC
Stephanie C. Hicks [aut, cre] (<https://orcid.org/0000-0002-7858-0231>), Rafael Irizarry [aut] (<https://orcid.org/0000-0002-3944-4309>)
Stephanie C. Hicks <shicks19@jhu.edu>
A tool to estimate the cell composition of DNA methylation whole blood samples measured on any platform technology (microarray and sequencing).
Microarray, Sequencing, DNAMethylation, MethylationArray,
MethylSeq, WholeGenome
126
95
methylSig
Yongseok Park [aut], Raymond G. Cavalcante [aut, cre]
Raymond G. Cavalcante <rcavalca@umich.edu>
MethylSig is a package for testing for differentially methylated cytosines (DMCs) or regions (DMRs) in whole-genome bisulfite sequencing (WGBS) or reduced representation bisulfite sequencing (RRBS) experiments. MethylSig uses a beta binomial model to test for significant differences between groups of samples. Several options exist for either site-specific or sliding window tests, and variance estimation.
DNAMethylation, DifferentialMethylation, Epigenetics,
Regression, MethylSeq
7
96
microbiomeDASim
Justin Williams, Hector Corrada Bravo, Jennifer Tom, Joseph Nathaniel Paulson
Justin Williams <williazo@ucla.edu>
A toolkit for simulating differential microbiome data designed for longitudinal analyses. Several functional forms may be specified for the mean trend. Observations are drawn from a multivariate normal model. The objective of this package is to be able to simulate data in order to accurately compare different longitudinal methods for differential abundance.
Microbiome, Visualization, Software
108
97
MicrobiotaProcess
Shuangbin Xu [aut, cre] (<https://orcid.org/0000-0003-3513-5362>), Guangchuang Yu [aut, ctb] (<https://orcid.org/0000-0002-6485-8781>)
Shuangbin Xu <xshuangbin@163.com>
MicrobiotaProcess is an R package for analysis, visualization and biomarker discovery of microbial datasets. It supports calculating alpha index and provides functions to visualize rarefaction curves. It also supports visualizing the abundance of taxonomy of samples, and provides functions to perform PCA, PCoA and hierarchical cluster analysis. In addition, MicrobiotaProcess provides a method for the biomarker discovery of metagenome or other datasets.
Visualization, Microbiome, Software, MultipleComparison,
FeatureExtraction
14
98
mitch
Mark Ziemann [aut, cre, cph], Antony Kaspi [aut, cph]
Mark Ziemann <mark.ziemann@gmail.com>
mitch is an R package for multi-contrast enrichment analysis. At its heart, it uses a rank-MANOVA based statistical approach to detect sets of genes that exhibit enrichment in the multidimensional space as compared to the background. The rank-MANOVA concept dates to work by Cox and Mann (https://doi.org/10.1186/1471-2105-13-S16-S12). mitch is useful for pathway analysis of profiling studies with one, two or more contrasts, or in studies with multiple omics profiling, for example proteomic, transcriptomic, epigenomic analysis of the same samples. mitch is perfectly suited for pathway level differential analysis of scRNA-seq data. The main strengths of mitch are that it can import datasets easily from many upstream tools and has advanced plotting features to visualise these enrichments.
GeneExpression, GeneSetEnrichment, SingleCell,
Transcriptomics, Epigenetics, Proteomics,
DifferentialExpression, Reactome
53
99
MMAPPR2
Kyle Johnsen [aut], Nathaniel Jenkins [aut], Jonathon Hill [cre]
Jonathon Hill <jhill@byu.edu>
MMAPPR2 maps mutations resulting from pooled RNA-seq data from the F2 cross of forward genetic screens. Its predecessor is described in a paper published in Genome Research (Hill et al. 2013). MMAPPR2 accepts aligned BAM files as well as a reference genome as input, identifies loci of high sequence disparity between the control and mutant RNA sequences, predicts variant effects using Ensembl's Variant Effect Predictor, and outputs a ranked list of candidate mutations.
RNASeq, PooledScreens, DNASeq, VariantDetection
72
100
MMUPHin
Siyuan Ma
Siyuan MA <siyuanma@g.harvard.edu>
MMUPHin is an R package for meta-analysis tasks of microbiome cohorts. It has function interfaces for: a) covariate-controlled batch- and cohort effect adjustment, b) meta-analysis differential abundance testing, c) meta-analysis unsupervised discrete structure (clustering) discovery, and d) meta-analysis unsupervised continuous structure discovery.
Metagenomics, Microbiome, BatchEffect
126