Installation

If you are upgrading from older versions of LinkChecker you should also read the upgrading documentation stored in Upgrading.

A LinkChecker release is installed from pre-built distribution packages. Building those packages requires hatchling and hatch-vcs; compiling the application translations additionally requires polib. Once the sdist or wheel has been built, polib can be removed. pip-run may be useful for this.

There are several steps to resolve problems with detecting the character encoding of checked HTML pages:

  1. Ensure the web server, if used, is not returning an incorrect charset in the Content-Type header.

  2. If possible, add a meta element with the correct charset to the HTML page.

  3. Check that chardet is not installed. Requests >= 2.26 will install charset-normalizer. Beautiful Soup has its own encoding detector but will use, in order of preference, cchardet, chardet or charset-normalizer (Beautiful Soup >= 4.11).

You might find that one of the other three detectors works better for your pages. There may already be a system copy of a detector such as chardet installed; installing LinkChecker in a Python venv (pipx does this for you) gives control over which packages are used.
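To see which of these detectors a given Python environment can import, a small shell sketch such as the following may help. It assumes a python3 on PATH; detector_status is a name made up for this example, and the module names are the import names of the three packages.

```shell
# Report which optional character-encoding detectors an interpreter can import.
# $1: Python interpreter to query (assumption: defaults to python3 on PATH).
detector_status() {
    py="${1:-python3}"
    for mod in cchardet chardet charset_normalizer; do
        # find_spec returns None when the top-level module is absent.
        if "$py" -c "import importlib.util, sys; sys.exit(0 if importlib.util.find_spec('$mod') else 1)" 2>/dev/null; then
            echo "$mod: installed"
        else
            echo "$mod: not installed"
        fi
    done
}

detector_status
```

For a pipx installation, pass the interpreter of the LinkChecker venv as the first argument instead; its location depends on your pipx configuration.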

Setup with pipx

pipx can be used to install LinkChecker on the local system.

To install the latest release from PyPI:

pipx install linkchecker

There is no need to wait for releases: every update to LinkChecker gets a unique version number and is subjected to the test suite. You can install the latest source from the LinkChecker GitHub repository:

pipx install https://github.com/linkchecker/linkchecker/archive/master.tar.gz

If pipx warns you to run pipx ensurepath, do that and follow the subsequent hint to open a new terminal.
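If you want to check this yourself before or after running pipx ensurepath, the following sketch tests whether the pipx application directory is on PATH. It assumes the pipx default of ~/.local/bin unless PIPX_BIN_DIR is set; path_contains is a helper name invented for this example.

```shell
# Succeed if directory $1 is an entry in the colon-separated list $2.
path_contains() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumption: pipx places launchers in ~/.local/bin unless PIPX_BIN_DIR is set.
bin_dir="${PIPX_BIN_DIR:-$HOME/.local/bin}"

if path_contains "$bin_dir" "$PATH"; then
    echo "$bin_dir is on PATH"
else
    echo "$bin_dir is not on PATH; run: pipx ensurepath"
fi
```

Wrapping both values in colons makes the match apply to whole PATH entries only, so /usr/local does not match /usr/local/bin.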

Alternatively, to install the latest LinkChecker with application translations, create a wheel as described in Building and Installing a Wheel.

Setup for Windows

Installing within the Windows Subsystem for Linux (WSL) is the preferred option: https://docs.microsoft.com/en-us/windows/python/beginners

Setup for macOS

pipx can be installed with brew install pipx (untested): https://pipx.pypa.io/stable/#on-macos

Setup for GNU/Linux

On major Linux distributions (Debian, Gentoo, Fedora, Ubuntu), the linkchecker package is available for installation. Alternatively, pipx can be used to install the latest LinkChecker.

You may wish to install your distribution’s copies of LinkChecker’s dependencies before using pipx to install LinkChecker. For example, on Debian/Ubuntu:

apt install python3-bs4 python3-dnspython python3-requests

If those packages are too old, pipx will install newer versions.

To install LinkChecker with pipx using the dependencies from your distribution:

pipx install --system-site-packages linkchecker

Building and Installing a Wheel

Clone the LinkChecker repository:

git clone https://github.com/linkchecker/linkchecker.git
cd linkchecker

Install hatchling, either from your distribution (e.g. the python3-hatch-vcs package) or using pipx:

pipx install --include-deps hatch-vcs

To enable application translations:

pipx inject hatch-vcs polib

Build the distribution wheel:

hatchling build

Now install the application from the wheel:

pipx install dist/linkchecker-<version>-py3-none-any.whl
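Because the version in the wheel's file name changes with every build, it can be convenient to look the file up rather than type it. The sketch below is one way to do that, assuming the build wrote its output to dist/ in the current directory; latest_wheel is a hypothetical helper, and the -nt file test is supported by bash and dash but is not strictly POSIX.

```shell
# Print the most recently modified LinkChecker wheel in directory $1
# (default: dist). Prints nothing if no wheel is found.
latest_wheel() {
    newest=""
    for f in "${1:-dist}"/linkchecker-*-py3-none-any.whl; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$f" ] || continue
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest="$f"
        fi
    done
    if [ -n "$newest" ]; then
        echo "$newest"
    fi
}

wheel="$(latest_wheel)"
if [ -n "$wheel" ]; then
    echo "Install with: pipx install $wheel"
else
    echo "No wheel found in dist/; run the build step first"
fi
```

The script only prints the install command so that you can review the chosen file before running it.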

Dependencies

  1. Build time only, Python hatchling package from https://pypi.org/project/hatchling/

  2. Build time only, Python hatch-vcs package from https://pypi.org/project/hatch-vcs/

  3. Python Requests package from https://pypi.org/project/requests/

  4. Python Beautiful Soup package from https://pypi.org/project/beautifulsoup4/

  5. Python dnspython package from https://pypi.org/project/dnspython/

  6. Optional, build time only, for translations: Python polib package from https://pypi.org/project/polib/

  7. Optional, for bash-completion: Python argcomplete package from https://pypi.org/project/argcomplete/

  8. Optional, for displaying country codes: Python GeoIP package from https://pypi.org/project/GeoIP/

  9. Optional, for reading PDF files: Python pdfminer.six package from https://pypi.org/project/pdfminer.six/

  10. Optional, used for virus checking: ClamAV from https://www.clamav.net/

  11. Optional, to run the WSGI web interface: Apache from https://httpd.apache.org/ and mod_wsgi from https://pypi.org/project/mod-wsgi/

Note for developers: if you want to regenerate the po/linkchecker.pot template from the source files, you will need xgettext with Python support. This is available in gettext >= 0.12.
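A quick way to check the installed gettext version against this requirement is sketched below. It assumes a sort that supports -V (version sort), as GNU coreutils does, and that the first line of xgettext --version ends with the version number; version_ge is a helper name chosen for this example. It checks the version only, not Python support itself.

```shell
# Succeed if dotted version $1 is greater than or equal to $2.
# Assumption: sort supports -V (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v xgettext >/dev/null 2>&1; then
    # Assumption: the version is the last word on the first output line.
    found="$(xgettext --version | head -n 1 | awk '{print $NF}')"
    if version_ge "$found" 0.12; then
        echo "xgettext $found is new enough"
    else
        echo "xgettext $found is older than 0.12"
    fi
else
    echo "xgettext not found; install gettext"
fi
```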

WSGI web interface

The included WSGI script can run LinkChecker with a nice graphical web interface. You can use and adjust the example HTML files in the lconline directory to run the script.

  1. Note that running LinkChecker requires CPU and memory resources. Allowing a WSGI script to execute such a program for possibly a large number of users might deplete those resources. Be sure to only allow access from trusted sites to this script.

  2. Copy the script lc.wsgi into the WSGI directory.

  3. Adjust the “action=…” parameter in lconline/lc_cgi.html to point to your WSGI script.

  4. If you use Apache, copy config/linkchecker.apache2.conf into your Apache configuration directory (e.g. /etc/apache2/conf.d) and enable it.

  5. Load the lconline/index.html file, enter a URL and click the check button.

  6. If something goes wrong, check the following:

    1. look in the error log of your web server

    2. be sure that you have enabled WSGI support in your web server, for example by installing mod_wsgi for Apache

    3. be sure that you have enabled the negotiation and versioning modules for Apache:

      a2enmod version
      a2enmod negotiation
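The module checks above can be sketched as a small script that reads Apache's list of loaded modules and prints any that are missing. It assumes a Debian-style apache2ctl whose -M option lists modules under the names wsgi_module, negotiation_module and version_module; missing_modules is a hypothetical helper, and other systems may need apachectl or httpd instead.

```shell
# Read a module listing (as printed by `apache2ctl -M`) on stdin and print
# each required module that does not appear in it.
missing_modules() {
    loaded="$(cat)"
    for mod in wsgi_module negotiation_module version_module; do
        case "$loaded" in
            *"$mod"*) ;;          # module is listed, nothing to report
            *) echo "$mod" ;;     # module is missing
        esac
    done
}

if command -v apache2ctl >/dev/null 2>&1; then
    # Ignore a non-zero exit so a configuration warning does not abort the check.
    { apache2ctl -M 2>/dev/null || true; } | missing_modules
else
    echo "apache2ctl not found; adjust the command for your system"
fi
```

No output from the check means all three modules were found in the listing.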