khmer v2.1 release notes ======================== We are pleased to announce release version 2.1 of khmer! This release includes several new features, bug fixes, and internal changes. `CHANGELOG.md `__ includes a complete description of changes made since the previous release. A concise summary of the most relevant changes is provided below. The latest version of the khmer documentation is available at https://khmer.readthedocs.org/en/v2.1/. Items of note ------------- New features ~~~~~~~~~~~~ - New storage class using half a byte per entry. Exposed as SmallCounttable and SmallCountgraph. - New Counttable, SmallCounttable, and Nodetable classes to support non-reversible hashing functionality and k > 32. - Support for human-friendly memory requests (2G) in addition to the previous style of requests (2000000000 or 2e9). - Support for variable-coverage trimming in the ``filter-abund-single.py`` script. - khmer package version now included in ``.info`` files. - Several simple examples of the Python API and the C++ API in ``examples/python-api`` and ``examples/c++-api``, respectively. - Support for assembling directly from k-mer graphs, and a new JunctionCountAssembler class. - New sandbox script ``extract-compact-dbg.py`` for computing a compact de Bruijn graph from sequence data. Bug fixes ~~~~~~~~~ - Streaming gzip-compressed output from scripts now works correctly. - The ``load-graph.py`` script now calculates required graph space correctly. - The ``broken_paired_reader`` no longer drops short reads when ``require_paired`` is set. Other changes ~~~~~~~~~~~~~ - Command-line options ``-x`` and ``-N`` now suppressed by default in script help messages. - Renamed core data structures: CountingHash --> Countgraph, Hashbits --> Nodegraph. - Replaced the ``IParser`` and ``FastxParser`` classes with a single ``ReadParser`` class. Different input formats are supported by templating ``ReadParser`` with a reader class. - Renamed ``consume_fasta`` and related functions to ``consume_seqfile``, with support for reading sequences from additional formats pending. Known issues: ------------- - ``load-graph.py`` in multithreaded mode will find slightly different number of unique kmers. This is being investigated in https://github.com/dib-lab/khmer/issues/1248 - Any scripts that consume FASTA or FASTQ data are unresponsive to attempts to terminate the program with ``ctrl-c``. Eventually khmer should handle this properly, but for now it should be possible to halt a script using ``ctrl-\``. This is being tracked in https://github.com/dib-lab/khmer/issues/1618. Contributors ------------ ‡@aaliyari, @ctb, ‡@caitsydney, @camillescott, @standage, @kdmurray91, @ljcohen, @luizirber, @mr-c, ‡@NBKingsley, ‡@ryneches, ‡@rcs333, ‡@satta, ‡@shannonekj, ‡@betatim ‡ Indicates new contributors