Randomly opening a dictionary and then randomly pointing at a word, and repeating this a few times, is one way for an artist to get inspiration.
I wonder how safe it is to use such a method to generate a passphrase.
An old Chinese proverb says: do not invent your own crypto.
Diceware is much better crafted than you may imagine. It’s not just some random idea someone had while contemplating life in a loo. It solves some real problems and avoids pitfalls.
What are the problems with the proposed method? First of all: what is your RNG or CSPRNG? Is it your brain? Your hand? Then you have already lost. If you’re just grabbing a book and opening it at a “random page”, your generator is already biased. You have a much greater chance of picking a page near the middle than near either end of the book. It may be even worse when it comes to selecting the word on a page. Are you, instead, using an actual RNG or CSPRNG? Is it unbiased? If not, how are you dealing with that? Are its values mutually independent?
Even if you have a good [pseudo]randomness source, how do you map its output to the page number and word number? It isn’t a trivial task and if you do it wrong, you skew your distribution.
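To make the mapping pitfall concrete, here is a minimal Python sketch. It assumes a hypothetical 100-page book and a source that yields one byte (0..255) at a time; the function names are made up for illustration. Reducing the byte modulo 100 favours the low page numbers, while rejection sampling throws away the values that cause the skew.

```python
import secrets
from collections import Counter

PAGES = 100  # hypothetical book length, chosen only for illustration


def biased_pick(byte_value: int, n: int) -> int:
    """Naive mapping: reduce a raw byte (0..255) modulo n."""
    return byte_value % n


def unbiased_pick(n: int) -> int:
    """Rejection sampling: discard bytes at or above the largest
    multiple of n that fits in 256, then reduce modulo n."""
    if not 1 <= n <= 256:
        raise ValueError("n must be between 1 and 256")
    limit = 256 - (256 % n)
    while True:
        b = secrets.randbits(8)
        if b < limit:
            return b % n


# Enumerate every possible byte to see the skew of the naive mapping exactly.
counts = Counter(biased_pick(b, PAGES) for b in range(256))
print(counts[0], counts[99])  # pages 0..55 are hit 3 times, pages 56..99 only 2
```

In real code you would not write this yourself: Python’s `secrets.randbelow(n)` already returns an unbiased value in the range 0..n-1. The sketch only shows why the naive version is wrong.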
A dictionary may contain long words. While you may imagine that is good, because “longer is better”, it gives you only a tiny advantage, because the extra space a long word takes carries little entropy. English text carries less than 3 bits per letter, and it tends to be worse for longer words. Still, no loss, yes? Wrong. Unfortunately, many services limit the length of the password you may use. It is also harder to build muscle memory for typing long words.
I believe a cryptographer could point out a few other mistakes as well. The reason I explained this is not to inspire anyone to “fix” the proposed algorithm. My goal is the opposite: to discourage people from undertaking such tasks. There are many gotchas, it is easy to introduce a vulnerability, and you don’t even get any testing or review of your method. Better to trust people who have spent half of their lives studying cryptography.
How does Diceware deal with the above problems? It eliminates the human factor. It uses a randomness source that for all practical purposes is an actual RNG. An RNG that is even better than what is typically used for private key generation! The tiny bias it has is acceptable, considering the great advantage of using dice. The set of possible values is chosen in a way that ensures no bias is introduced when mapping the output of the RNG to those values (yes, it avoids the issue altogether). It is clear, transparent and obvious at each stage: nothing up my sleeve. It can be used by anyone. Finally, the words are short, so the output is compact. After some time, entering such a passphrase is just a series of 4–5 taps on the keyboard. APPRECIATE WHAT ARNOLD REINHOLD DID, because he did a truly good job. :)
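Why the mapping is bias-free can be sketched in a few lines of Python. This assumes the standard Diceware setup of five dice per word and a list of 6^5 = 7776 words; the function name `rolls_to_index` is hypothetical and no real word list is loaded. Every sequence of five rolls corresponds to exactly one list entry, so nothing has to be rounded, reduced or thrown away.

```python
import math
from itertools import product

DICE_PER_WORD = 5
WORDLIST_SIZE = 6 ** DICE_PER_WORD  # 7776 entries, one per possible roll sequence


def rolls_to_index(rolls) -> int:
    """Read five die faces (1..6) as a base-6 number in 0..7775."""
    rolls = list(rolls)
    if len(rolls) != DICE_PER_WORD or any(r not in range(1, 7) for r in rolls):
        raise ValueError("expected five rolls, each between 1 and 6")
    index = 0
    for r in rolls:
        index = index * 6 + (r - 1)
    return index


# Every roll sequence maps to a distinct index, and every index is reachable.
all_indices = {rolls_to_index(r) for r in product(range(1, 7), repeat=DICE_PER_WORD)}
print(len(all_indices) == WORDLIST_SIZE)

# Each word contributes log2(7776) bits, roughly 12.9, if the dice are fair.
bits_per_word = math.log2(WORDLIST_SIZE)
print(round(bits_per_word, 2), round(6 * bits_per_word, 1))
```

The second figure printed is the entropy of a six-word passphrase under the same fair-dice assumption.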