LinkChecker 10.2.1.post29+g054a583b7

Check websites for broken links

Introduction

LinkChecker is a free, GPL-licensed website validator. It checks links in web documents or full websites, and requires Python 3.8 or later.

Visit the project on GitHub.

Installation

$ pip3 install linkchecker

The version in the pip repository may be outdated. To find out how to get the latest code, along with platform-specific information and other advice, see the installation document.

Basic usage

To check a URL like http://www.example.org/myhomepage/ it is enough to execute:

$ linkchecker http://www.example.org/myhomepage/

This check recursively validates all pages whose URLs start with http://www.example.org/myhomepage/. Additionally, all external links pointing outside www.example.org are checked but not recursed into.
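
The scoping rule described above can be illustrated with a small Python sketch. This is not LinkChecker's implementation: the function name classify_link and the plain string-prefix comparison are assumptions made purely for illustration.

```python
def classify_link(url: str, start_url: str) -> str:
    """Return "recurse" for URLs under start_url, else "check-only"."""
    # Hypothetical rule: a simple string-prefix test stands in for
    # whatever matching LinkChecker actually performs.
    if url.startswith(start_url):
        return "recurse"
    return "check-only"


start = "http://www.example.org/myhomepage/"
for link in (
    "http://www.example.org/myhomepage/about.html",
    "http://www.example.org/other/",
    "http://www.example.com/",
):
    print(link, "->", classify_link(link, start))
```

In this sketch only the first link would be followed further; the other two would be checked once and not crawled.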

Find out more from the manual pages linkchecker and linkcheckerrc.

Features

  • recursive and multithreaded checking and site crawling

  • output in colored or normal text, HTML, SQL, CSV, XML or a sitemap graph in different formats

  • support for HTTP/1.1, HTTPS, FTP, mailto:, news:, nntp:, Telnet and local file links

  • restriction of link checking with regular expression filters for URLs

  • proxy support

  • username/password authorization for HTTP, FTP and Telnet

  • honors robots.txt exclusion protocol

  • Cookie support

  • HTML5 support

  • Plugin support allowing custom page checks

  • Different interfaces: command line and web interface
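
As an illustration of the regular expression filtering listed among the features, the following Python sketch skips URLs that match any of a set of patterns. The patterns and the helper name is_ignored are hypothetical and do not reflect LinkChecker's internals or configuration syntax.

```python
import re

# Hypothetical ignore patterns, for illustration only.
IGNORE_PATTERNS = [re.compile(r"/secret"), re.compile(r"^mailto:")]


def is_ignored(url: str) -> bool:
    """Return True if any ignore pattern is found in the URL."""
    return any(p.search(url) for p in IGNORE_PATTERNS)


urls = [
    "http://www.example.org/index.html",
    "http://www.example.org/secret/plans.html",
    "mailto:admin@example.org",
]
# Keep only the URLs that no pattern excludes.
to_check = [u for u in urls if not is_ignored(u)]
print(to_check)
```

How LinkChecker itself accepts such filters is described in the linkchecker and linkcheckerrc manual pages.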

Screenshots

[Screenshot: command-line interface]

[Screenshot: WSGI web interface]

Test suite status

LinkChecker has extensive unit tests to ensure code quality. GitHub Actions is used for continuous build and test integration.


© Copyright 2000-2016 Bastian Kleineidam, 2010-2023 LinkChecker Authors.
