Having worked on it on and off for a while, I’ve finally uploaded my DNA compression software, pufferfish, to GitHub. It’s still a little way off being fully ready, but at the moment it offers both compression and decompression, with output formatting and line numbering where appropriate.
The focus was very much on making super lightweight software (the compiled binary is only about 200 KB) with robust error handling and file input. Compressed files use the .pfsh extension, although this is not a prerequisite for decompression. I even made a logo (perhaps not the most pressing detail, but by chance we had a clay pufferfish sitting in our front room, which is bizarre!).
The next stage is to build Huffman encoding into the compression scheme I’ve already constructed, adding a second layer of compression. Beyond that, I’d like to add a client/server mode for direct DNA transmission: by creating a direct link between you and your recipient, you avoid the potentially massive overhead of a third-party server. In this day and age, where computers are online 24/7, it seems totally reasonable that a system could have pufferfish running in listen mode, ready to accept direct DNA transmission. I may have to futz with the associated security/authentication a bit to make this secure, but given that it would be a super simple setup transmitting non-binary data, I suspect the security risk would be minimal, especially if you used some basic public/private key cryptography to guard against man-in-the-middle attacks.
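For anyone unfamiliar with Huffman encoding, here is a rough Python sketch of the general idea: frequent symbols get short bit strings and rare ones get longer bit strings. This is not pufferfish's code and says nothing about how the existing compression scheme works. The function names (`huffman_codes`, `encode`, `decode`) and the example sequence are made up purely for illustration, and the "bits" are kept as text strings for readability rather than packed into bytes.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Return a dict mapping each symbol in text to its Huffman bit string."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}

    # Heap entries are (weight, tiebreak, {symbol: code_so_far}).
    # The unique tiebreak integer stops Python from ever comparing the dicts.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    next_id = len(heap)

    while len(heap) > 1:
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1

    return heap[0][2]


def encode(text, codes):
    """Concatenate the bit string for every symbol in text."""
    return "".join(codes[s] for s in text)


def decode(bits, codes):
    """Walk the bit string, emitting a symbol whenever a full code is matched."""
    inverse = {c: s for s, c in codes.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)


if __name__ == "__main__":
    seq = "AAAAAAAACCCCGGTN"  # hypothetical, deliberately skewed sequence
    codes = huffman_codes(seq)
    bits = encode(seq, codes)
    print(codes)
    print(len(bits), "bits vs", 8 * len(seq), "bits as plain ASCII")
    assert decode(bits, codes) == seq
```

The payoff depends on how skewed the symbol frequencies are: with four equally common bases, Huffman coding gives every base a 2-bit code and gains nothing over a fixed 2-bit encoding, so it helps most when some symbols (or longer patterns treated as symbols) are much more common than others.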
After that, I have a few other things I’d like to implement, but that’s for the future…