The way I deal with this in Scow (my Raft implementation) is that Scow is parameterized over a transport. The transport has an API that Scow uses, and every function in it is expected to possibly fail (Scow has well-defined ways of handling failure). A user can then wrap the transport in any policy they want. So if you want to retry on failure, you can wrap a TCP transport in a retrying transport. And if you want to time out after so long, you can wrap that in a timeout transport, etc.
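To make the wrapping idea concrete, here is a rough sketch in Python. It is not Scow's actual API: `Transport`, `TransportError`, `FlakyTransport` (a stand-in for a TCP transport that fails a few times) and `RetryingTransport` are all hypothetical names made up for illustration.

```python
from typing import Protocol


class TransportError(Exception):
    """Hypothetical error type: any transport call may raise this."""


class Transport(Protocol):
    """Hypothetical transport interface; every call is allowed to fail."""

    def send(self, node: str, message: bytes) -> None: ...


class FlakyTransport:
    """Stand-in for a TCP transport: the first `failures` sends fail."""

    def __init__(self, failures: int) -> None:
        self.failures = failures
        self.delivered = []  # (node, message) pairs that got through

    def send(self, node: str, message: bytes) -> None:
        if self.failures > 0:
            self.failures -= 1
            raise TransportError("connection reset")
        self.delivered.append((node, message))


class RetryingTransport:
    """Wraps any transport and retries a failed send up to `attempts` times."""

    def __init__(self, inner: Transport, attempts: int) -> None:
        self.inner = inner
        self.attempts = attempts

    def send(self, node: str, message: bytes) -> None:
        last_error = None
        for _ in range(self.attempts):
            try:
                self.inner.send(node, message)
                return
            except TransportError as err:
                last_error = err
        # Still a failure the caller has to handle, just a less frequent one.
        raise TransportError(
            f"gave up after {self.attempts} attempts"
        ) from last_error


tcp = FlakyTransport(failures=2)
transport = RetryingTransport(tcp, attempts=3)
transport.send("node-b", b"append_entries")  # succeeds on the third try
```

The point of the shape is that the wrapper exposes the same interface as the thing it wraps, so the consensus code never needs to know which policies are stacked underneath it.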
I’ve found it quite nice since it makes testing very easy. I have an in-memory transport and I can wrap that in a transport that generates random partitions in interesting ways.
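A minimal sketch of that testing setup, again in Python and again with made-up names (`InMemoryTransport`, `PartitioningTransport`, `repartition`, `heal`); the real Scow test transports may look quite different. The partition here is a simple random two-way split, which is only one of many possible fault models.

```python
import random


class TransportError(Exception):
    """Hypothetical error type: any transport call may raise this."""


class InMemoryTransport:
    """Delivers messages by appending to per-node inbox lists."""

    def __init__(self) -> None:
        self.inboxes = {}  # destination node -> list of (sender, message)

    def send(self, src: str, dst: str, message: bytes) -> None:
        self.inboxes.setdefault(dst, []).append((src, message))


class PartitioningTransport:
    """Wraps a transport and fails sends that cross a random partition."""

    def __init__(self, inner, nodes, rng: random.Random) -> None:
        self.inner = inner
        self.nodes = list(nodes)
        self.rng = rng
        self.side = {node: 0 for node in self.nodes}  # everyone connected

    def repartition(self) -> None:
        """Assign each node to one of two sides at random."""
        for node in self.nodes:
            self.side[node] = self.rng.randrange(2)

    def heal(self) -> None:
        """Put every node back on the same side."""
        for node in self.nodes:
            self.side[node] = 0

    def send(self, src: str, dst: str, message: bytes) -> None:
        if self.side[src] != self.side[dst]:
            raise TransportError(f"{src} cannot reach {dst}: partitioned")
        self.inner.send(src, dst, message)


memory = InMemoryTransport()
# A seeded RNG makes a failing test run reproducible.
net = PartitioningTransport(memory, ["a", "b", "c"], random.Random(42))
net.repartition()
for dst in ("b", "c"):
    try:
        net.send("a", dst, b"heartbeat")
    except TransportError as err:
        print(err)
```

Because the partition logic lives in a wrapper, the same in-memory transport can be reused under other fault models (drops, delays, reordering) without touching the code under test.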