I am surprised that TRUNCATE ran slower than DELETE FROM. I would have
thought TRUNCATE ran faster because it would not have to keep count of
how many records were deleted, etc. (I remember replacing DELETE with
TRUNCATE to speed up a query many years ago, lamenting that I lost the
count of rows deleted.) However, I just found this interesting
StackOverflow answer that could
help explain it. I’m guessing that since these are tests, the data set
is not very big, and that’s why DELETE is faster.
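For what it’s worth, the “lost count” part is easy to see from client code. Here’s a tiny sketch using Python’s bundled sqlite3 as a stand-in for Postgres (SQLite has no TRUNCATE, but the DELETE side reports its row count the same way):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO t (id) VALUES (?)", [(i,) for i in range(5)])

# DELETE tells you how many rows it removed via the cursor's rowcount;
# a TRUNCATE (in Postgres) gives you no such count back.
cur = conn.execute("DELETE FROM t")
print(cur.rowcount)  # 5 -- the number you lose when switching to TRUNCATE
```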
In the past a colleague wrote some code that inferred the dependencies
from the DB and deleted from the tables accordingly. That way there
should be little pain in maintaining it as you add more constraints, since
it is decoupled from how the schema is stored on disk. Not sure whether
this would make a difference in this case, but it seems nicer than
toggling the constraints off and on again.
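The shape of that approach can be sketched in a few lines. This is my own reconstruction, not my colleague’s code, using sqlite3’s foreign-key metadata (on Postgres you’d read `information_schema` or `pg_constraint` instead): read the FK graph, then clear child tables before the parents they reference.

```python
import sqlite3

def delete_all_in_fk_order(conn):
    """Clear every table, children before parents, by reading FK metadata."""
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type='table' AND name NOT LIKE 'sqlite_%'")]
    # parents[t] = tables that t references (t must be cleared before them)
    parents = {t: {row[2] for row in conn.execute(f"PRAGMA foreign_key_list({t})")}
               for t in tables}
    cleared = set()
    while len(cleared) < len(tables):
        before = len(cleared)
        for t in tables:
            if t in cleared:
                continue
            # t is safe to clear once every table referencing it is cleared
            children = {c for c, refs in parents.items() if t in refs and c != t}
            if children <= cleared:
                conn.execute(f"DELETE FROM {t}")
                cleared.add(t)
        if len(cleared) == before:
            raise RuntimeError("circular FK dependency; fall back to "
                               "disabling constraints for those tables")
    conn.commit()

# Usage: with FK enforcement on, the ordering is what makes this succeed.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY);
    CREATE TABLE post (
        id INTEGER PRIMARY KEY,
        author_id INTEGER REFERENCES author(id)
    );
""")
conn.execute("INSERT INTO author (id) VALUES (1)")
conn.execute("INSERT INTO post (id, author_id) VALUES (1, 1)")
delete_all_in_fk_order(conn)  # deletes from post first, then author
```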
Another alternative could be using something like the H2 in-memory
DB and
its Pg compatibility
mode to
approximate an in-memory Postgres DB. It may be faster to just create
the schema afresh in one of these than to delete data. This can also
simplify running independent tests in parallel.
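If it helps, here’s the shape of that idea in Python’s stdlib, with sqlite3’s `:memory:` mode standing in for H2 (H2 itself is a Java library): every test gets a brand-new database with the schema already applied, so nothing needs cleaning and nothing is shared between tests.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
"""

def fresh_db():
    # Each call returns a completely independent in-memory database,
    # so tests never see each other's state and can run in parallel.
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

def test_insert_is_visible():
    conn = fresh_db()
    conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))
    assert conn.execute("SELECT COUNT(*) FROM users").fetchone()[0] == 1

def test_starts_empty():
    conn = fresh_db()  # unaffected by any other test's inserts
    assert conn.execute("SELECT COUNT(*) FROM users").fetchone()[0] == 0
```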
Lastly, another technique is to write the tests such that they don’t
require a “clean DB”. For example: don’t insert three objects and count
on their primary keys coming back as 1, 2 and 3. (This example is
intentionally super simplistic for illustration purposes. I didn’t get
the impression the author was doing this.) It may take more effort to
write tests this way, but it can pay dividends if you don’t have to
bring your DB back to a known state for every test.
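To make the (again, deliberately simplistic) example concrete, sketched with sqlite3: capture whatever keys the DB actually generated and assert against those, instead of hard-coding 1, 2, 3, which only holds on a freshly cleaned table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE widget (id INTEGER PRIMARY KEY, name TEXT)")

# Brittle version would assume ids come back as 1, 2, 3.
# Robust version: record the key the DB actually handed back.
ids = []
for name in ("a", "b", "c"):
    cur = conn.execute("INSERT INTO widget (name) VALUES (?)", (name,))
    ids.append(cur.lastrowid)

# Assert against the captured ids, not hard-coded values, so the test
# passes no matter what was already in the table.
placeholders = ",".join("?" * len(ids))
rows = conn.execute(
    f"SELECT name FROM widget WHERE id IN ({placeholders})", ids).fetchall()
assert {r[0] for r in rows} == {"a", "b", "c"}
```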
Lastly, another technique is to write the tests such that they don’t require a “clean DB”. For example: don’t insert three objects and count on their primary keys coming back as 1, 2 and 3.
It would be really nice to do this but I doubt we have the discipline :(
I’ll check out the in-memory DB, that sounds interesting! We also might try running tests against multiple databases in parallel.
More evidence that as a test suite grows, it begins to asymptotically test everything other than the intended program. :)