Tuesday, July 8, 2014

Being safer in dynamic languages

Jakub Arnold writes about Phantom Types in Haskell and compares the safety provided to a solution using Ruby. This isn't an "us vs. them" blog post. I'm not going to argue that dynamic languages are as "safe" as modern typed languages. But neither will I concede that only a modern typed language will suffice. Your mileage may vary. If we meet for beers, I'm sure we could have an interesting discussion. At minimum, we'd have a good beer and talk about something else.


Experienced programmers in either approach will be more successful than inexperienced programmers in any approach. Here is one way to be "safer" in a dynamic language, using the same example as the original post.

The original post lists Ruby like so:

    def send_message(message, recipient)
      if message.encrypted
        # send logic
      else
        raise ArgumentError, "Can’t send a plain text message"
      end
    end

But a "safer" solution would look more object-oriented (I'll use Smalltalk):

    Tube>>send: aMessage to: aRecipient
        "Send aMessage to aRecipient where aMessage can represent itself as an encrypted string."
        | encryptedMessage |
        encryptedMessage := aMessage encrypted.
        "... send logic ..."

...where objects have the ability to encrypt themselves...

    Object>>encrypted
        "By default an object answers its string representation, encrypted."
        ^self asString encrypted

    String>>encrypted
        "Answer the string encrypted using some agreed-upon algorithm."
        ^self tinyEncrypt

    TinyEncryptedString>>encrypted
        "Encrypted strings just answer themselves."
        ^self


Oh, the monkey patching! First of all, Smalltalk has much better tools than Ruby. Smalltalkers don't have to denigrate the practice as "monkey patching". It's just "programming with objects". Not a utility class in sight and no confusion about where any of the code originates.

As an aside, you may not want to encrypt large objects as strings in memory. (Or self-referencing objects, without a serialization that can handle them.) This is making a point about programming in general, not serialization or encryption.

But mostly you don't want to sprinkle "if"s and unnecessary error handling throughout your system. If you want an encrypted object, just ask the object to provide its own encrypted representation. Implement the default case, the special cases, and the case where the object is already encrypted.
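For readers who think in Ruby, the three cases might be sketched roughly like this. It is only a sketch under assumptions: `EncryptedString` and `#encrypted` are names I made up to mirror the Smalltalk above, and ROT13 is a toy placeholder marking where a real cipher would go, not encryption.

```ruby
# Hypothetical wrapper for text that has already been encrypted.
class EncryptedString
  attr_reader :ciphertext

  def initialize(ciphertext)
    @ciphertext = ciphertext
  end

  # Already-encrypted case: answer self, so asking twice is a no-op.
  def encrypted
    self
  end
end

class Object
  # Default case: encrypt the object's string representation.
  def encrypted
    to_s.encrypted
  end
end

class String
  # Special case: a string encrypts itself.
  # ROT13 is a toy stand-in for "some agreed-upon algorithm".
  def encrypted
    EncryptedString.new(tr("A-Za-z", "N-ZA-Mn-za-m"))
  end
end

# The sender just asks for the encrypted form: no conditional, no error branch.
def send_message(message, recipient)
  payload = message.encrypted
  "to #{recipient}: #{payload.ciphertext}"
end
```

With these definitions, `send_message("hello", "bob")` and `send_message("hello".encrypted, "bob")` take the same path through the sender; only the object decides whether any work is needed.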

You might decide certain objects are just not suitable for encryption. For example the tubes themselves, or windows, or processes, etc. Those objects might signal an exception. Standard Smalltalk defines a signalling mechanism similar to Common Lisp's condition system, where certain exceptions might be resumable, etc. This is another safety mechanism of good dynamic languages. Maybe the topic of a future blog post. Or you can read Practical Common Lisp's really good description now.
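A rough Ruby sketch of an object that refuses encryption could look like the following. The names `NotEncryptableError`, `Tube`, and `encrypted_or_nil` are hypothetical, chosen only for illustration. Note that Ruby's `rescue` unwinds the stack, so this shows the "signal an exception" half only, not the resumable behavior of Smalltalk or Common Lisp.

```ruby
# Hypothetical error for objects that refuse to be encrypted.
class NotEncryptableError < StandardError; end

# Hypothetical channel class, standing in for the "tube" in this post.
class Tube
  # Tubes are not suitable for encryption, so they signal instead of answering.
  def encrypted
    raise NotEncryptableError, "#{self.class} cannot be encrypted"
  end
end

# The caller picks a recovery policy in one place
# instead of checking types before every send.
def encrypted_or_nil(object)
  object.encrypted
rescue NotEncryptableError
  nil
end
```

The refusal lives with the object that knows it cannot be encrypted, and the handling lives wherever the application wants it, which keeps the sending code free of checks.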

2 comments:

  1. I've just tried to post a super long comment but it seems to have disappeared when sending :( So this one will be shorter.

    I'm the original author of the article you're referring to, so here's my reaction.

    I don't think your approach applies in general. Someone already pointed out a similar thing in the comments and here's my response https://lobste.rs/s/5ekbap/using_phantom_types_in_haskell_for_extra_safety/comments/rhvlxy#c_rhvlxy, but let me give you another example.

    I guess my general problem with this is that you're more tightly coupling things together. In your example to be able to send a message, you need to also encrypt it (I guess we could say that it could've been encrypted before and it's just a NOOP bounds check, but let's say the encryption can happen only once).

    Also if the encryption can fail, you'd be delegating the failure handling to the code that "sends" the message out, instead of the code that "encrypts" it. Yes we could say that the error handling can happen inside the "encrypt", but in case it's fatal and needs to propagate it would go through the "sending" code. If the encryption would need to happen in another part of the system, you'd still have to do the conditional when sending the message.

    Imagine you're working with credit cards that need to be "verified" before you create a "transaction". The verification process is complicated and involves talking to a bank and making a void transaction. It also depends on many other factors, so you can't just tell a "credit card" to verify itself right before sending the transaction. How would you go about solving this?

    Another simple example is something like an API, which has two endpoints that return a "thing" in different states, let's say one that returns "encrypted" and one that returns "plaintext" messages. You'd have no way of solving this other than using a conditional, since the process of getting the "thing" itself is outside of the scope of your program, all you need to do is verify that you're passing around the right one.

  2. Hi Jakub -- thanks for taking the time to comment. I'm sorry your first attempt disappeared!

    You have a number of good follow-up aspects to the original post. I could probably write three or four full blog posts addressing them adequately.

    I'll respond to your lobsters comment directly on that site.

    Going through your comment above, here are some initial thoughts:

    1. "you're more tightly coupling things together"

    I read this as more tightly coupling the time of encryption of an object and the time of sending it as a message. As you hinted, using my code above, if one wanted always to do the encryption ahead of time, the application code could simply make sure to pass in an instance of (in this example) TinyEncryptedString, and then the message to encrypt would indeed be a no-op. The application would choose ahead of time when to perform the encryption.

    Certainly there are other ways of structuring the relationships of messages, encryption, and communication channels. The specific requirements of a scenario might lead to different classes and interactions among them.

    2. Failure handling -- I would encourage you to read the PCL reference in my post about the Common Lisp condition system (or read about similar mechanisms in Smalltalk). These systems provide much more control than the typical throw/catch/finally mechanisms in most other languages.

    3. Verifying credit cards or other integrated verification scenarios -- yes, this is a significantly different scenario than the in-the-moment encryption scenario I was addressing. I'd rather not write that out in detail in this comment. That may be an interesting blog post in its own right. Integration can be a simple HTTP request or it can be quite elaborate when it requires multiple coordinated steps, potentially with rollback or non-local compensating actions in response to local failures (and vice versa).

    4. An API with two end points, one returning an encrypted object, the other not -- this is another interesting integration scenario that is beyond the original scope of these posts. Integration boundaries in general require, at some level of the receiving system, some kind of parsing and contract enforcement. Here again any solutions I would propose would be fairly different than what I addressed originally, and would be somewhat specific to the scenario(s) that had to use one or both of the given API endpoints. There are ways to move the "if"s to the lower level "middleware" code and to be more object-oriented at the application level. The approach would not be significantly different from a function-oriented solution.

    Cheers,
    Patrick

