Add a second transformer

A couple of sections back (in Add a new Artifact Class), I noted that adding transformers was an obscure step for a lot of new plugin developers. Let’s circle back to that now with the goal of developing a better understanding of the role of transformers in QIIME 2, and also to simplify the code for generating usage examples that we just wrote.

Take a minute to review the helper function we defined in our _examples.py file, and try to describe in a sentence or two what that code is doing. Here it is again, for reference:

def _create_seq_artifact(seq: skbio.DNA):
    ff = SingleRecordDNAFASTAFormat()
    seq.write(str(ff.path))
    return qiime2.Artifact.import_data("SingleDNASequence", ff)
Here’s my description of what this is doing, but come up with your own before looking at this.

This code is transforming (or converting) an skbio.DNA object into a q2_dwq2.SingleRecordDNAFASTAFormat object, and then importing that format into a QIIME 2 artifact.

Transformers in QIIME 2 are designed to handle conversions between objects behind the scenes, so that users never have to think about this, and developers can think about it as infrequently as possible. In this section, we’ll do a small refactor of the code we wrote in the previous section.

tl;dr

The code that I wrote for this section can be found here: caporaso-lab/q2-dwq2.

Define a transformer from skbio.DNA to q2_dwq2.SingleRecordDNAFASTAFormat

The first transformer that we wrote transforms our q2_dwq2.SingleRecordDNAFASTAFormat object to an skbio.DNA object, so that we can view artifacts of class SingleDNASequence as skbio.DNA objects when we work with them. For developers, skbio.DNA objects are easier to create and use than q2_dwq2.SingleRecordDNAFASTAFormat objects, because they have convenient APIs. Once we have a helper function for creating q2_dwq2.SingleRecordDNAFASTAFormat objects from skbio.DNA objects, like the _create_seq_artifact function we wrote, q2_dwq2.SingleRecordDNAFASTAFormat objects are also trivial to create. It still tends to be more convenient to create and use them via an skbio.DNA object, though, since we then don’t have to deal directly with reading and writing files. QIIME 2 enables us to define and register functions that convert between object types as transformers, which makes them available in any deployment where the plugin that defines and registers them is installed.

We can adapt the code from our _create_seq_artifact function into a new transformer in our _transformers.py file as follows:

@plugin.register_transformer
def _2(seq: DNA) -> SingleRecordDNAFASTAFormat:
    ff = SingleRecordDNAFASTAFormat()
    seq.write(str(ff.path))
    return ff

If you don’t recall exactly what this is doing, review the text that described this when we defined _create_seq_artifact. The only difference here is that we’re returning the SingleRecordDNAFASTAFormat, whereas in _create_seq_artifact we also imported it into a qiime2.Artifact.

This new transformer enables us to adapt our factory functions in _examples.py to look like the following:

def seq1_factory():
    seq = skbio.DNA("AACCGGTTGGCCAA", metadata={"id": "seq1"})
    return qiime2.Artifact.import_data(
        "SingleDNASequence", seq, view_type=skbio.DNA)


def seq2_factory():
    seq = skbio.DNA("AACCGCTGGCGAA", metadata={"id": "seq2"})
    return qiime2.Artifact.import_data(
        "SingleDNASequence", seq, view_type=skbio.DNA)

With this code, we’re still importing to a SingleDNASequence artifact class, but this time we’re doing it directly from an skbio.DNA view type. Under the hood, QIIME 2 checks to see if any transformers are registered that transform an skbio.DNA to a q2_dwq2.SingleRecordDNAFASTADirectoryFormat (the format we associated with our artifact class). It finds a transformer from skbio.DNA to q2_dwq2.SingleRecordDNAFASTAFormat, and a transformer from q2_dwq2.SingleRecordDNAFASTAFormat to q2_dwq2.SingleRecordDNAFASTADirectoryFormat, so it applies that chain of transformers to import the skbio.DNA object that we provided into the SingleDNASequence artifact class. Cool! 😎
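The chained lookup described above can be pictured with a small, self-contained sketch. This is only a toy model under stated assumptions: the classes, the `TRANSFORMERS` dictionary, the `sequence.fasta` file name, and the one-intermediate search rule are all hypothetical, and the real framework's lookup rules may differ.

```python
class ToyDNA:
    """Stand-in for skbio.DNA."""
    def __init__(self, sequence, seq_id):
        self.sequence = sequence
        self.seq_id = seq_id


class ToyFileFormat:
    """Stand-in for a single-file format: holds the file's text."""
    def __init__(self, text):
        self.text = text


class ToyDirectoryFormat:
    """Stand-in for a directory format: maps file names to file text."""
    def __init__(self, files):
        self.files = files


def dna_to_file(seq):
    return ToyFileFormat(f">{seq.seq_id}\n{seq.sequence}\n")


def file_to_directory(ff):
    # "sequence.fasta" is a made-up file name for this sketch.
    return ToyDirectoryFormat({"sequence.fasta": ff.text})


# Hypothetical registry: (from type, to type) -> transformer function.
TRANSFORMERS = {
    (ToyDNA, ToyFileFormat): dna_to_file,
    (ToyFileFormat, ToyDirectoryFormat): file_to_directory,
}


def find_chain(from_type, to_type):
    """Return the list of functions that converts from_type to to_type."""
    direct = TRANSFORMERS.get((from_type, to_type))
    if direct is not None:
        return [direct]
    # No direct transformer: look for one intermediate type that bridges.
    for (src, mid), first in TRANSFORMERS.items():
        second = TRANSFORMERS.get((mid, to_type))
        if src is from_type and second is not None:
            return [first, second]
    raise LookupError(
        f"no transformer from {from_type.__name__} to {to_type.__name__}")


def transform(obj, to_type):
    """Apply each step of the chain in order."""
    for step in find_chain(type(obj), to_type):
        obj = step(obj)
    return obj
```

Calling `transform(ToyDNA("AACCGGTT", "seq1"), ToyDirectoryFormat)` in this sketch runs `dna_to_file` and then `file_to_directory`, which mirrors the two-step path from the in-memory object to the directory format.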

At this point, we can delete the _create_seq_artifact function from _examples.py as we have centralized the functionality for performing the transformation that it did, and we moved the import step into the factories.

Add unit tests of the new transformer

As always, before this new code is ready for use, we need to write some unit tests. Here are the tests that I wrote in test_transformers.py:

    def test_DNA_to_single_record_fasta_simple1(self):
        in_ = DNA('ACCGGTGGAACCGGTAACACCCAC',
                  metadata={'id': 'example-sequence-1', 'description': ''})
        tx = self.get_transformer(DNA, SingleRecordDNAFASTAFormat)

        observed = tx(in_)
        # confirm "round-trip" of DNA -> SingleRecordDNAFASTAFormat -> DNA
        # results in an observed sequence that is the same as the starting
        # sequence
        self.assertEqual(observed.view(DNA), in_)

    def test_DNA_to_single_record_fasta_simple2(self):
        in_ = DNA('ACCGGTAACCGGTTAACACCCAC',
                  metadata={'id': 'example-sequence-2', 'description': ''})
        tx = self.get_transformer(DNA, SingleRecordDNAFASTAFormat)

        observed = tx(in_)
        self.assertEqual(observed.view(DNA), in_)

Review those to make sure that you understand them, and then copy/paste those into your plugin or write your own. Run make test to confirm that everything is working as expected.
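If the round-trip idea in these tests is new to you, the following standard-library sketch shows the same pattern without QIIME 2 or scikit-bio. `ToyDNA`, `write_fasta`, and `read_fasta` are hypothetical stand-ins, and the reader here is written on the assumption that a FASTA reader fills in both an `id` and a `description`, even when the description is empty.

```python
import os
import tempfile
import unittest
from dataclasses import dataclass, field


@dataclass
class ToyDNA:
    """Stand-in for skbio.DNA: a sequence plus a metadata dictionary."""
    sequence: str
    metadata: dict = field(default_factory=dict)


def write_fasta(seq, path):
    header = seq.metadata.get("id", "")
    description = seq.metadata.get("description", "")
    if description:
        header = f"{header} {description}"
    with open(path, "w") as fh:
        fh.write(f">{header}\n{seq.sequence}\n")


def read_fasta(path):
    with open(path) as fh:
        header, sequence = fh.read().splitlines()
    id_, _, description = header[1:].partition(" ")
    # Both keys are always set, even when the description is empty.
    return ToyDNA(sequence, {"id": id_, "description": description})


def round_trip(seq):
    """Write seq to a temporary FASTA file and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "seq.fasta")
        write_fasta(seq, path)
        return read_fasta(path)


class RoundTripTests(unittest.TestCase):
    def test_round_trip_preserves_sequence_and_metadata(self):
        in_ = ToyDNA("ACCGGTGGAACCGGTAACACCCAC",
                     {"id": "example-sequence-1", "description": ""})
        self.assertEqual(round_trip(in_), in_)

    def test_missing_description_breaks_equality(self):
        # The reader adds an empty description, so the objects differ.
        in_ = ToyDNA("ACGT", {"id": "s1"})
        self.assertNotEqual(round_trip(in_), in_)
```

The second test hints at why an input object in a round-trip test may need an explicit empty `description`: if the reader adds that key, the object that comes back only compares equal to an input that already had it. Save the sketch to a file and run it with `python -m unittest` to try it.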