Codespell is a spell checker specifically designed for finding misspellings in source code.
I've been using it to correct spelling mistakes in GitHub repos since 2016.
Most spell checkers use a list of valid words and highlight any word in a document that is not on the list. This method doesn't work for source code: code contains abbreviations and words joined together without spaces, so a spell checker generates too many false positives.
Codespell uses a different approach: instead of a list of valid words, it has a dictionary of common misspellings.
Currently the codespell dictionary includes 34,466 known misspellings. I've contributed 300 misspellings to the dictionary.
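
To make the difference between the two approaches concrete, here is a minimal Python sketch. It is not codespell's actual implementation: the two tiny word lists, the sample line, and the function names are all made up for illustration.

```python
import re

# Hypothetical miniature word lists, for illustration only.
VALID_WORDS = {"the", "check", "is", "case", "sensitive", "this"}
MISSPELLINGS = {"sensitve": "sensitive", "teh": "the"}

WORD_RE = re.compile(r"[A-Za-z]+")


def flag_unknown_words(text):
    """Traditional approach: report every word missing from the valid-word list."""
    return [w for w in WORD_RE.findall(text) if w.lower() not in VALID_WORDS]


def flag_known_misspellings(text):
    """Misspelling-dictionary approach: report only words known to be wrong."""
    return [
        (w, MISSPELLINGS[w.lower()])
        for w in WORD_RE.findall(text)
        if w.lower() in MISSPELLINGS
    ]


# A made-up line of source code with one real typo in the comment.
line = "strcmp(cfgPath, argv) // this check is case sensitve"

print(flag_unknown_words(line))       # identifiers are flagged along with the typo
print(flag_known_misspellings(line))  # only the typo is flagged
```

With the valid-word list, identifiers such as `strcmp` and `cfgPath` are reported alongside the real typo. With the misspelling dictionary, only `sensitve` is reported, together with its suggested correction.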
Whenever I find an interesting open source project I run codespell to check for spelling mistakes. Most projects have spelling mistakes and I can send a pull request to fix them.
In 2019 Microsoft made the Windows calculator open source and uploaded it to GitHub. I used codespell to find some spelling mistakes and sent them a pull request, which they accepted.
A great source for GitHub repos to spell check is Hacker News. Let's have a look.
[!flarum hacker news]
Hacker News has a link to forum software called Flarum. I can use codespell to look for spelling mistakes. When I'm looking for errors in a GitHub repo, I don't fork the project until I know there is a spelling mistake to fix.

    edward@x1c9 ~/spelling> git clone git@github.com:flarum/flarum.git
    Cloning into 'flarum'...
    remote: Enumerating objects: 1338, done.
    remote: Counting objects: 100% (42/42), done.
    remote: Compressing objects: 100% (23/23), done.
    remote: Total 1338 (delta 21), reused 36 (delta 19), pack-reused 1296
    Receiving objects: 100% (1338/1338), 725.02 KiB | 1.09 MiB/s, done.
    Resolving deltas: 100% (720/720), done.
    edward@x1c9 ~/spelling> cd flarum/
    edward@x1c9 ~/spelling/flarum (master)> codespell -q3
    ./public/web.config:13: sensitve ==> sensitive
    edward@x1c9 ~/spelling/flarum (master)> gh repo fork
    ✓ Created fork EdwardBetts/flarum
    ? Would you like to add a remote for the fork? Yes
    ✓ Added remote origin
    edward@x1c9 ~/spelling/flarum (master)> git checkout -b spelling
    Switched to a new branch 'spelling'
    edward@x1c9 ~/spelling/flarum (spelling)> codespell -q3
    ./public/web.config:13: sensitve ==> sensitive
    edward@x1c9 ~/spelling/flarum (spelling)> codespell -q3 -w
    FIXED: ./public/web.config
    edward@x1c9 ~/spelling/flarum (spelling)> git commit -am "Correct spelling mistakes"
    [spelling bbb04c7] Correct spelling mistakes
    1 file changed, 1 insertion(+), 1 deletion(-)
    edward@x1c9 ~/spelling/flarum (spelling)> git push -u origin
    Enumerating objects: 7, done.
    Counting objects: 100% (7/7), done.
    Delta compression using up to 8 threads
    Compressing objects: 100% (4/4), done.
    Writing objects: 100% (4/4), 360 bytes | 360.00 KiB/s, done.
    Total 4 (delta 3), reused 0 (delta 0), pack-reused 0
    remote: Resolving deltas: 100% (3/3), completed with 3 local objects.
    remote:
    remote: Create a pull request for 'spelling' on GitHub by visiting:
    remote: https://github.com/EdwardBetts/flarum/pull/new/spelling
    remote:
    To github.com:EdwardBetts/flarum.git
    * [new branch] spelling -> spelling
    branch 'spelling' set up to track 'origin/spelling'.
    edward@x1c9 ~/spelling/flarum (spelling)> gh pr create
    Creating pull request for EdwardBetts:spelling into master in flarum/flarum
    ? Title Correct spelling mistakes
    ? Choose a template Open a blank pull request
    ? Body <Received>
    ? What's next? Submit
    https://github.com/flarum/flarum/pull/81
    edward@x1c9 ~/spelling/flarum (spelling)>

That worked. I found one spelling mistake: the word "sensitive" was spelled wrong. I forked the repo, fixed the spelling mistake, and submitted the fix as a pull request.
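
If you check a lot of repos this way, the "is there anything worth forking for?" step can be scripted. Below is a rough Python sketch that parses report lines shaped like the one in the session above (`path:line: wrong ==> right`). The regular expression and the `parse_report` helper are my own illustration, not part of codespell, and they assume the output format stays as shown.

```python
import re
from collections import defaultdict

# Matches report lines shaped like "./public/web.config:13: sensitve ==> sensitive".
# The pattern is inferred from the session output; treat it as an assumption.
REPORT_RE = re.compile(
    r"^(?P<path>.+?):(?P<line>\d+): (?P<wrong>\S+) ==> (?P<fix>.+)$"
)


def parse_report(output):
    """Group (line number, misspelling, suggestion) tuples by file path."""
    findings = defaultdict(list)
    for raw in output.splitlines():
        match = REPORT_RE.match(raw.strip())
        if match:
            findings[match["path"]].append(
                (int(match["line"]), match["wrong"], match["fix"])
            )
    return dict(findings)


# The single report line from the Flarum check.
sample = "./public/web.config:13: sensitve ==> sensitive\n"
report = parse_report(sample)

# Only bother forking when at least one misspelling was reported.
worth_forking = bool(report)
print(report, worth_forking)
```

Lines that don't match the pattern, such as `FIXED: ./public/web.config`, are skipped, so an empty result means there is nothing to fix and no reason to fork.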
![flarum pull request](../../img/fixing-spelling-in-github-repos-using-codespell/flarum_pull_request.png)
The maintainer of Flarum accepted my pull request.
Fixing spelling mistakes in Bootstrap helped me unlock the Mars 2020 Contributor achievement on GitHub.
![github mars badge](../../img/fixing-spelling-in-github-repos-using-codespell/github_mars_badge.png)
Why not try running codespell on your own codebase? You'll probably find some spelling mistakes to fix.