Posted on 13 Jun 2022 by Edward Betts
Codespell is a spell checker specifically designed for finding misspellings in source code.
I've been using it to correct spelling mistakes in GitHub repos since 2016.
Most spell checkers use a list of valid words and highlight any word in a document that is not in the list. This method doesn't work for source code: code contains abbreviations and words joined together without spaces, so a spell checker will generate too many false positives.
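To make the false-positive problem concrete, here is a minimal Python sketch of a word-list checker run against a line of code. The word list, the function name and the sample line are all made up for illustration; real spell checkers use far larger lists and smarter tokenizing.

```python
import re

# Tiny stand-in for a real list of valid words (an assumption for illustration).
VALID_WORDS = {"the", "return", "user", "name", "if", "is", "not", "none", "def", "get"}


def unknown_words(line):
    """Return every alphabetic token in `line` that is not in VALID_WORDS."""
    tokens = re.findall(r"[A-Za-z]+", line)
    return [t for t in tokens if t.lower() not in VALID_WORDS]


# A made-up line of code: nothing in it is actually misspelled.
code_line = "def get_username(usr_cfg): return usr_cfg.uname"
flagged = unknown_words(code_line)
print(flagged)
# prints ['username', 'usr', 'cfg', 'usr', 'cfg', 'uname']
```

Every flagged token is a joined word or an abbreviation, not a spelling mistake.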
Codespell uses a different approach: instead of a list of valid words, it has a dictionary of common misspellings.
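The idea can be sketched in a few lines of Python. This is not codespell's actual implementation, just an illustration of looking words up in a misspelling dictionary; the three dictionary entries, the function name and the sample text are assumptions.

```python
import re

# Hypothetical three-entry excerpt; the real codespell dictionary is far larger.
MISSPELLINGS = {
    "sensitve": "sensitive",
    "recieve": "receive",
    "teh": "the",
}


def find_misspellings(text):
    """Return (line number, misspelled word, suggested fix) for each known misspelling."""
    results = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for word in re.findall(r"[A-Za-z]+", line):
            fix = MISSPELLINGS.get(word.lower())
            if fix:
                results.append((lineno, word, fix))
    return results


# Made-up sample: abbreviations on line 1, one real misspelling on line 2.
sample = "usr_cfg = load_cfg()\n# case sensitve match\n"
hits = find_misspellings(sample)
for lineno, word, fix in hits:
    print(f"{lineno}: {word} ==> {fix}")
# prints 2: sensitve ==> sensitive
```

Abbreviations like usr and cfg are ignored because they are not in the dictionary, so only the real mistake is reported.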
Currently the codespell dictionary includes 34,466 known misspellings. I've contributed 300 misspellings to the dictionary.
Whenever I find an interesting open source project I run codespell to check for spelling mistakes. Most projects have spelling mistakes and I can send a pull request to fix them.
In 2019 Microsoft made the Windows calculator open source and uploaded it to GitHub. I used codespell to find some spelling mistakes, sent them a pull request and they accepted it.
A great source for GitHub repos to spell check is Hacker News. Let's have a look.
Hacker News has a link to forum software called Flarum. I can use codespell to look for spelling mistakes. When I'm looking for errors in a GitHub repo I don't fork the project until I know there is a spelling mistake to fix.
edward@x1c9 ~/spelling> git clone git@github.com:flarum/flarum.git
Cloning into 'flarum'...
remote: Enumerating objects: 1338, done.
remote: Counting objects: 100% (42/42), done.
remote: Compressing objects: 100% (23/23), done.
remote: Total 1338 (delta 21), reused 36 (delta 19), pack-reused 1296
Receiving objects: 100% (1338/1338), 725.02 KiB | 1.09 MiB/s, done.
Resolving deltas: 100% (720/720), done.
edward@x1c9 ~/spelling> cd flarum/
edward@x1c9 ~/spelling/flarum (master)> codespell -q3
./public/web.config:13: sensitve ==> sensitive
edward@x1c9 ~/spelling/flarum (master)> gh repo fork
✓ Created fork EdwardBetts/flarum
? Would you like to add a remote for the fork? Yes
✓ Added remote origin
edward@x1c9 ~/spelling/flarum (master)> git checkout -b spelling
Switched to a new branch 'spelling'
edward@x1c9 ~/spelling/flarum (spelling)> codespell -q3
./public/web.config:13: sensitve ==> sensitive
edward@x1c9 ~/spelling/flarum (spelling)> codespell -q3 -w
FIXED: ./public/web.config
edward@x1c9 ~/spelling/flarum (spelling)> git commit -am "Correct spelling mistakes"
[spelling bbb04c7] Correct spelling mistakes
 1 file changed, 1 insertion(+), 1 deletion(-)
edward@x1c9 ~/spelling/flarum (spelling)> git push -u origin
Enumerating objects: 7, done.
Counting objects: 100% (7/7), done.
Delta compression using up to 8 threads
Compressing objects: 100% (4/4), done.
Writing objects: 100% (4/4), 360 bytes | 360.00 KiB/s, done.
Total 4 (delta 3), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (3/3), completed with 3 local objects.
remote:
remote: Create a pull request for 'spelling' on GitHub by visiting:
remote:      https://github.com/EdwardBetts/flarum/pull/new/spelling
remote:
To github.com:EdwardBetts/flarum.git
 * [new branch]      spelling -> spelling
branch 'spelling' set up to track 'origin/spelling'.
edward@x1c9 ~/spelling/flarum (spelling)> gh pr create
Creating pull request for EdwardBetts:spelling into master in flarum/flarum
? Title Correct spelling mistakes
? Choose a template Open a blank pull request
? Body <Received>
? What's next? Submit
https://github.com/flarum/flarum/pull/81
edward@x1c9 ~/spelling/flarum (spelling)>
That worked. I found one spelling mistake: the word "sensitive" was misspelled as "sensitve". I forked the repo, fixed the spelling mistake and submitted the fix as a pull request.
The maintainer of Flarum accepted my pull request.
Fixing spelling mistakes in Bootstrap helped me unlock the Mars 2020 Contributor achievement on GitHub.
Why not try running codespell on your own codebase? You'll probably find some spelling mistakes to fix.