A few years ago, I realized you can easily solve NYT puzzles just by taking a photo of the blank grid, identifying the grid shape, and looking it up in a database.
For whatever reason, my employer (the world’s largest Brazilian themed bookstore) didn’t think that was a fun party trick to include in their garbage telephone. Oh well.
What database is that? I’d love to hear more about this.
https://www.xwordinfo.com/Grids
The linked gist is an attempt at solving cryptic crosswords, not American-style crosswords. They are very different.
You would think, but surprisingly, the linked Guardian crossword is American in cluing (but British in layout).
I wonder if a better ChatGPT API would allow for token filtering to match the givens. Like, if you have H????, the answer might be HATER, or it might be HOBBY, but it’s not BLART. So any token that doesn’t start with H can be rejected. ChatGPT has a temperature parameter which affects whether it picks the most likely token or one further down the list. We actually want the most likely token that matches the givens. Or, actually, a list of the N most likely that match the givens.
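To make that concrete: you can’t do per-step filtering through the hosted API (it only exposes a coarse logit_bias), but with a local model you can mask disallowed tokens at every generation step. A minimal sketch, assuming GPT-2 as a stand-in and a made-up prompt; it brute-forces the vocabulary so it’s slow, and a real solver would also need backtracking when the constraint hits a dead end:

    import regex  # third-party module; supports partial matches, unlike stdlib re
    import torch
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        LogitsProcessor,
        LogitsProcessorList,
    )

    class PatternProcessor(LogitsProcessor):
        """Mask any next token that could not extend to a match of the pattern."""

        def __init__(self, tokenizer, pattern, prompt_len):
            self.tok = tokenizer
            # "H????" -> H followed by exactly four letters
            self.pat = regex.compile(
                "".join("[A-Za-z]" if c == "?" else c for c in pattern)
            )
            self.prompt_len = prompt_len

        def __call__(self, input_ids, scores):
            so_far = self.tok.decode(input_ids[0, self.prompt_len:]).lstrip()
            allowed = torch.full_like(scores, float("-inf"))
            if self.pat.fullmatch(so_far):
                allowed[0, self.tok.eos_token_id] = 0.0  # answer complete: stop
                return allowed
            for tid in range(scores.shape[-1]):  # brute force: slow but clear
                text = (so_far + self.tok.decode([tid])).lstrip()
                # partial=True keeps tokens that are still a viable prefix
                if self.pat.fullmatch(text, partial=True):
                    allowed[0, tid] = scores[0, tid]
            return allowed

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    prompt = 'Crossword clue: "one who despises", 5 letters. Answer:'
    ids = tok(prompt, return_tensors="pt").input_ids
    out = model.generate(
        ids,
        max_new_tokens=6,
        logits_processor=LogitsProcessorList(
            [PatternProcessor(tok, "H????", ids.shape[1])]
        ),
    )
    print(tok.decode(out[0, ids.shape[1]:]))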
This is a task that token-based LLMs are fundamentally bad at.
A workaround may be to ask if HATER satisfies the clue, and how certain it is about that. Then do another inference asking if HOBBY satisfies the clue, and how certain it is.
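As a sketch of that, with a local GPT-2 standing in for ChatGPT (the prompt wording is made up, and comparing the next-token probabilities of "yes" vs "no" is just one way to read off a confidence):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    def yes_confidence(clue: str, candidate: str) -> float:
        # Hypothetical prompt wording; a real prompt would need tuning.
        prompt = (
            f'Crossword clue: "{clue}". Is {candidate} a correct answer? '
            "Answer yes or no. Answer:"
        )
        ids = tok(prompt, return_tensors="pt").input_ids
        with torch.no_grad():
            logits = model(ids).logits[0, -1]  # distribution over the next token
        probs = torch.softmax(logits, dim=-1)
        yes_id = tok(" yes").input_ids[0]
        no_id = tok(" no").input_ids[0]
        # Confidence = P(yes) relative to P(yes) + P(no).
        return (probs[yes_id] / (probs[yes_id] + probs[no_id])).item()

    for cand in ("HATER", "HOBBY", "BLART"):
        print(cand, round(yes_confidence("one who despises", cand), 3))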
I think I understand your idea now. That is a really good idea too; we could use reLLM to implement this. My free $18 have expired and I don’t want to give them money for ethical reasons, so I won’t be experimenting with this further.
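For anyone who does want to experiment: as I recall the reLLM README, usage looks roughly like this (the complete_re signature is from memory, so treat it as approximate):

    import regex
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from rellm import complete_re  # pip install rellm

    model = AutoModelForCausalLM.from_pretrained("gpt2")
    tokenizer = AutoTokenizer.from_pretrained("gpt2")

    prompt = 'Crossword clue: "one who despises", 5 letters. Answer: '
    pattern = regex.compile(r"H[A-Z]{4}")  # the H???? constraint

    output = complete_re(
        tokenizer=tokenizer,
        model=model,
        prompt=prompt,
        pattern=pattern,
        max_new_tokens=8,
    )
    print(output)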
My interest now is whether byte-based LLMs can solve this task better, but I don’t know if there are any around that I can try out.
I have never done a crossword before this project.
I initially started working on a cryptic crossword and quickly realized it would never work. It had a whole system of dropping letters and other conventions I didn’t understand.
So I switched to a simpler crossword puzzle.