Several years ago, a group of researchers at OpenAI, one of the world's leading artificial intelligence research labs, noticed a surprising phenomenon while training a neural network. Such models typically learn general features and relationships from their training data gradually, a process called generalization. After a certain amount of training, a neural network is thought to reach its peak performance; beyond that point, it starts memorizing its training data instead and performing worse on data it has never seen before, a failure known as overfitting. After accidentally leaving their model to train for far longer than usual, however, the team found that it suddenly had a lightbulb moment, showing near-perfect accuracy on test data it had never encountered.
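
The effect is easiest to see on a small, fully specified task. Below is a minimal sketch of such an experiment, assuming a toy modular-addition problem of the kind reported in the grokking literature; the model size, optimizer settings, and train/test split are illustrative assumptions, not the OpenAI team's actual setup. In published runs, the telltale signature is that training accuracy saturates early while test accuracy lingers near chance for many more steps before abruptly jumping.

```python
# A sketch of a grokking-style run on modular addition: the task, model
# size, and hyperparameters below are illustrative assumptions, not the
# original experiment's configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
P = 97  # modulus: the network must learn to predict (a + b) mod P

# Enumerate every (a, b) pair, then hold out half as unseen test data.
pairs = torch.cartesian_prod(torch.arange(P), torch.arange(P))
labels = (pairs[:, 0] + pairs[:, 1]) % P
perm = torch.randperm(len(pairs))
train_idx, test_idx = perm[: len(perm) // 2], perm[len(perm) // 2 :]

def encode(batch):
    # One-hot encode each operand and concatenate the two encodings.
    return torch.cat([F.one_hot(batch[:, 0], P).float(),
                      F.one_hot(batch[:, 1], P).float()], dim=1)

model = nn.Sequential(nn.Linear(2 * P, 256), nn.ReLU(), nn.Linear(256, P))
# Regularization appears to matter: grokking is usually reported with
# weight decay, e.g. via AdamW.
opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1.0)

for step in range(50_000):  # keep going long after training accuracy saturates
    opt.zero_grad()
    loss = F.cross_entropy(model(encode(pairs[train_idx])), labels[train_idx])
    loss.backward()
    opt.step()
    if step % 1_000 == 0:
        with torch.no_grad():
            train_acc = (model(encode(pairs[train_idx])).argmax(1)
                         == labels[train_idx]).float().mean().item()
            test_acc = (model(encode(pairs[test_idx])).argmax(1)
                        == labels[test_idx]).float().mean().item()
        print(f"step {step:6d}  train acc {train_acc:.3f}  test acc {test_acc:.3f}")
```

Watching the logged accuracies side by side is the whole point of the sketch: a memorizing network scores well only on the training half, while a network that has grokked the underlying arithmetic scores well on both.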

The OpenAI researchers called the phenomenon ‘grokking,’ a term coined by science-fiction writer Robert A. Heinlein in his 1961 novel Stranger in a Strange Land to describe an understanding of something so profound that you essentially merge with it.

Now other teams are trying to better understand it.