Property Value
is nif:broaderContext of
nif:broaderContext
is schema:hasPart of
schema:isPartOf
nif:isString
  • Drawing upon open-source general-purpose machine learning algorithms and libraries, we have developed a software package, IDEPI (IDentify EPItopes), for learning genotype-to-phenotype predictive models from sequences with known phenotypes. IDEPI can apply learned models to classify sequences of unknown phenotype and can also identify specific sequence features that contribute to a particular phenotype. The cross-platform Python source code (released under the GPL 3.0 license), documentation, issue tracking, and a pre-configured virtual machine for IDEPI can be found at https://github.com/veg/idepi. To provide a unified solution for these and similar problems, we designed IDEPI, a domain-specific and extensible software library for supervised learning of models that relate genotype to phenotype for HIV-1 and other organisms. IDEPI makes use of open-source libraries for machine learning (scikit-learn, scikit-learn.org/), sequence alignment (HMMER, hmmer.janelia.org/), sequence manipulation (BioPython, biopython.org), and parallelization (joblib, pythonhosted.org/joblib), and provides a programming interface that allows users to engineer sequence features and select machine learning algorithms appropriate for their application. A virtual machine for Oracle's VirtualBox has also been built to provide easy access to IDEPI for users unfamiliar with the intricacies of Python package management; it is available from the main package distribution page (http://github.com/veg/idepi/). The complete source code tree, example files, and documentation for IDEPI are also available; for the most current version, visit the package distribution page at https://github.com/veg/idepi.
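The general idea of engineering sequence features and finding those associated with a phenotype can be illustrated with a minimal, standard-library-only sketch. This is not IDEPI's programming interface: the function names `site_residue_features` and `rank_features`, the toy sequences, and the simple frequency-difference score are all assumptions made for illustration, standing in for the alignment and learning steps that the text attributes to HMMER and scikit-learn.

```python
def site_residue_features(aligned_seqs):
    """Encode equal-length aligned sequences as binary (position, residue) features.

    Hypothetical helper for illustration; only variable alignment columns are kept.
    """
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    # A column is variable if more than one residue is observed in it.
    variable = [i for i in range(length) if len({s[i] for s in aligned_seqs}) > 1]
    names = sorted({(i, s[i]) for s in aligned_seqs for i in variable})
    rows = [[1 if s[i] == r else 0 for (i, r) in names] for s in aligned_seqs]
    return names, rows


def rank_features(names, rows, labels):
    """Rank features by |freq among positives - freq among negatives|.

    A deliberately simple score (an assumption), not a trained classifier.
    """
    pos = [row for row, y in zip(rows, labels) if y]
    neg = [row for row, y in zip(rows, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one sequence of each phenotype")
    scores = []
    for j, name in enumerate(names):
        p = sum(r[j] for r in pos) / len(pos)
        q = sum(r[j] for r in neg) / len(neg)
        scores.append((abs(p - q), name))
    return sorted(scores, reverse=True)


# Toy, made-up data: four aligned sequences with a binary phenotype.
seqs = ["ACDK", "ACDR", "ACEK", "ACER"]
labels = [1, 1, 0, 0]

names, rows = site_residue_features(seqs)
ranked = rank_features(names, rows, labels)
for score, (pos_idx, residue) in ranked:
    print(f"position {pos_idx} residue {residue}: score {score:.2f}")
```

In this toy data the phenotype tracks the residue at position 2 (0-based), so the two features at that position score highest, while position 3 carries no signal. A real analysis would replace the scoring step with a cross-validated classifier and a proper alignment.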
rdf:type