Very interesting and well written. Btw, there’s a typo:
I’m certainly no mathematician so won’t be tackling a proof […]
Perhaps a missing “I”?
This was an interesting exploration and I’m looking forward to seeing the code. I know you’re no mathematician, but the biggest thing that stood out to me was the lack of rigor surrounding the assumption that a minimal string is indeed a palindrome.
The actual algorithm used is very elegant and directly follows the assumptions, though. Very nifty. Also, I am curious, with a multi-month execution time, how do you avoid overflowing the function call stack when your program is fundamentally recursive?
Thanks for your feedback! I’ve fixed the typo.
the biggest thing that stood out to me was the lack of rigor surrounding the assumption that a minimal string is indeed a palindrome.
Indeed. I actually think the more likely outcome is the minimal superpermutation isn’t a palindrome. It’s probably 872 characters long and we’ve already found it!
But… I’d still attribute a small chance to this possibility and so I think it’s worth exploring, especially if that means we can use this assumption to speed up the search algorithm.
with a multi-month execution time, how do you avoid overflowing the function call stack when your program is fundamentally recursive?
Great question. The search tree is actually fairly shallow. Each recursive call appends a single character to the string, and since the palindrome assumption means only half of the string needs to be built, the call stack depth shouldn't exceed half the length of the minimal superpermutation: about 872/2 = 436 frames.
The search tree has a branching factor of N-1, so a naive search that doesn't prune branches would explore approximately 5^436 strings. It's still a huge search space, but only a tiny fraction of it is ever held in memory.
I didn't mention it, but this is a huge advantage of depth-first search: it only needs to hold a single stack frame in memory for each level of the tree. A breadth-first search, on the other hand, would quickly run out of memory.
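To make that concrete, here's a rough sketch of the idea for n = 3 (my own illustration, not the real search, which is far more heavily optimised). Each recursive call adds one character, so the stack depth is bounded by the length budget, not by the size of the search space:

```python
from itertools import permutations

def contains_all_perms(s, n):
    """True if s contains every permutation of the first n symbols as a substring."""
    symbols = "123456789"[:n]
    return all("".join(p) in s for p in permutations(symbols))

def dfs(s, n, max_len):
    """Depth-first search: each call appends one character, so the call
    stack never grows deeper than max_len."""
    if contains_all_perms(s, n):
        return s
    if len(s) >= max_len:
        return None  # prune: already at the length budget
    for c in "123456789"[:n]:
        if c != s[-1]:  # branching factor n-1: repeating a symbol is never useful
            found = dfs(s + c, n, max_len)
            if found is not None:
                return found
    return None

# For n = 3 the minimal superpermutation is 9 characters long,
# so this returns some 9-character superpermutation starting with "1".
print(dfs("1", 3, 9))
```

At any moment the memory in use is just the current path from the root, while a breadth-first search would have to keep every string at the current depth alive at once.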
Thanks again for your kind words.
No problem, thanks for the response. I can see this problem having utility in applied mathematics as well.
Not to discourage the more elaborate solutions, but a comment on the simple solution discussed early on, of just incrementally increasing the size of the integers being searched over:
Wasteful in a sense, but not that wasteful, especially if you increment in some reasonable step size, like, say, 8 bits at a time. For example, if you first search in an int8, then increment to int16 and restart the search, you do re-visit the int8 part of the space, but that's only 0.39% of the int16 search space. If you increase again to int24 and restart, you'll now have wastefully triple-searched 0.0015% of the search space and double-searched 0.39% of it. That's… not that much waste. This is essentially the same reason that iterative-deepening search isn't that wasteful.
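The arithmetic above can be checked in a couple of lines (illustrative only): the re-visited fraction is just the ratio of the two space sizes.

```python
# Fraction of the wider search space that a restart re-visits after
# widening the integer width by some number of bits.
def revisit_fraction(old_bits, new_bits):
    return 2 ** old_bits / 2 ** new_bits

print(f"{revisit_fraction(8, 16):.2%}")   # int8 space inside int16: 0.39%
print(f"{revisit_fraction(16, 24):.2%}")  # int16 space inside int24: 0.39%
print(f"{revisit_fraction(8, 24):.4%}")   # int8 space inside int24: 0.0015%
```

Each 8-bit widening shrinks the previously searched region to 1/256 of the new space, which is why the cumulative waste stays negligible.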
This is very true! I do this a lot today manually and it works pretty well. Often the simple solutions are best.