“Sadly” I have disabled my browser history, so I can’t check what I usually search for, but if I had to guess, it’s mostly error messages. I have been trying not to search online, and instead to get better at finding the information I need in official documentation (man pages, info documents, PDFs and source code). I can’t say whether that makes me better or worse, since it might take longer in the short term, but I tend to understand more in the long term.
To each their own. For me it makes no sense to give up something that made my work life so much easier and more enjoyable. Scouring tons of documentation just to find X is not what I like to spend my time on.
can anyone think of a quick and browser-local way to capture this information for oneself? pretty sure my habits align almost exactly and want to try to reproduce.
also maybe identify some classes of queries that i can redirect to e.g. Dash and interact less with Google.
Your browser history is stored in a local SQLite database. Shut down the browser (so it releases its lock on the .db file) and use sqlite3 to query the database.
Here’s a (good enough) gist to start from: Playing around with Chrome’s history.
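A minimal sketch of that query in Python, in case it helps. It assumes Chrome's History file, which as far as I know has a `urls` table with `url`, `title`, `visit_count` and `last_visit_time` columns; other browsers use different file names and schemas. The function name `recent_urls` is made up for illustration, and the path you pass in depends on your OS and profile.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def recent_urls(history_db, limit=20):
    """Return (url, title, visit_count) tuples, most recently visited first.

    Assumes a Chrome-style schema with a `urls` table; adjust the query
    for other browsers.
    """
    # Open read-only so the query can never modify the browser's file.
    uri = Path(history_db).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        rows = conn.execute(
            "SELECT url, title, visit_count FROM urls "
            "ORDER BY last_visit_time DESC LIMIT ?",
            (limit,),
        )
        return rows.fetchall()
```

Call it with the path to your profile's History file, e.g. `recent_urls("/path/to/History")`. If you'd rather not close the browser, pointing it at a copy of the file should also get around the lock.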
See if you can get your browser history as a list of URLs. You could filter them down to known search pages and extract the queries from the query strings.
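Roughly what that filtering could look like in Python, as a sketch. The host list and the `q` parameter name are my assumptions about which search pages you care about, and `extract_queries` is just an illustrative name; extend the table for whatever you actually use.

```python
from urllib.parse import parse_qs, urlsplit

# Assumed search hosts, mapped to the query-string parameter that
# carries the search terms. Extend this for other search pages.
SEARCH_HOSTS = {
    "www.google.com": "q",
    "duckduckgo.com": "q",
    "www.bing.com": "q",
}


def extract_queries(urls):
    """Return the search terms found in URLs that point at known search pages."""
    queries = []
    for url in urls:
        parts = urlsplit(url)
        param = SEARCH_HOSTS.get(parts.hostname or "")
        if param is None:
            continue  # not a search page we know about
        values = parse_qs(parts.query).get(param)
        if values:
            queries.append(values[0])  # parse_qs already decodes '+' and %XX
    return queries
```

Feed it the URL column from your history and you get back plain query strings that you can count or bucket by hand.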
This external link from the example-centric programming article on the wiki also talks about this.
This feels about right to me, but I wonder to what extent the numbers of searches of each type are skewed by the huge variance in difficulty of getting good results for different queries.
For example, when I search for API documentation, the official docs are very often what I’m looking for and they’re very often in the top 2-3 results if I search for “<language/library name> <function name>”. One search and I’m done.
But for troubleshooting? Oh boy. Paste the error message into the search bar. Click on all the links on the first page to find they are all unrelated problems that happened to cause a similar error message. Add some keywords to try to narrow it down. Repeat half a dozen times until I either hit the magic combination of words or conclude that nobody else has ever posted a question or a bug report about whatever is causing the problem for me.
Looking at the raw numbers, you’d conclude that I search for error message help far more than I search for API docs, but in reality it’s just that one of those is much more efficient to search for than the other.
Same with “how to do X” questions: when I’m asking that about some technology that I’m already somewhat proficient with, it often takes me several attempts to get past the beginner questions that happen to have a lot of keyword overlap with what I’m trying to do. (Of course, when I’m searching about technology I’m not already familiar with, the beginner questions are sometimes just what I want.)
That’s a very valid point! To properly take into account how many searches it takes you to find what you’re looking for, we should have classified whole search sessions instead of individual searches.
In this case, we did cluster the searches into sessions to look at query refinement strategies, but we couldn’t make the assumption that the category of the session was the same as the category of an individual query in that session.
That would have been interesting to look into! Our dataset was too small for any real statistical analysis, though… I did report the numbers, but the meatier discussion was around the qualitative stuff.
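For anyone who wants to try the session idea on their own history: one simple way to cluster searches is to start a new session whenever there's a long enough pause between queries. This is only a sketch under that assumption, not necessarily how the study grouped its sessions; the 10-minute gap and the name `group_into_sessions` are arbitrary choices for illustration.

```python
from datetime import datetime, timedelta


def group_into_sessions(searches, max_gap=timedelta(minutes=10)):
    """Group (timestamp, query) pairs into sessions split on inactivity.

    A new session starts whenever the pause since the previous search
    exceeds max_gap (an assumed threshold, tune it to taste).
    """
    sessions = []
    last_time = None
    for when, query in sorted(searches):
        if last_time is None or when - last_time > max_gap:
            sessions.append([])  # pause was long enough: open a new session
        sessions[-1].append(query)
        last_time = when
    return sessions
```

Counting sessions instead of individual queries would then keep a six-attempt troubleshooting hunt from outweighing a one-shot API lookup.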