linkcheck.checker.fileurl

Handle local file: links.

Functions

get_files(dirname)

Get iterator of entries in directory.

get_nt_filename(path)

Return case-sensitive filename for NT path.

get_os_filename(path)

Return filesystem path for given URL path.

is_absolute_path(path)

Check if given path is absolute.

prepare_urlpath_for_nt(path)

URLs like file://server/path/ result in a path named /server/path.

Classes

AnchorCheckFileUrl(base_url, ...[, ...])

File URL link for AnchorCheck plugin.

FileUrl(base_url, recursion_level, aggregate)

Url link with file scheme.

class linkcheck.checker.fileurl.AnchorCheckFileUrl(base_url, recursion_level, aggregate, parent_url=None, base_ref=None, line=-1, column=-1, page=-1, name='', url_encoding=None, extern=None)[source]

Bases: FileUrl

File URL link for AnchorCheck plugin.

Initialize check data, and store given variables.

Parameters:
  • base_url – unquoted and possibly unnormed url

  • recursion_level – recursion level at which the base url is checked

  • aggregate – aggregate instance

  • parent_url – quoted and normed url of parent or None

  • base_ref – quoted and normed url of <base href=""> or None

  • line – line number of url in parent content

  • column – column number of url in parent content

  • page – page number of url in parent content

  • name – name of url or empty

  • url_encoding – encoding of URL or None

  • extern – None or (is_extern, is_strict)

build_url()[source]

Calls UrlBase.build_url() and adds a trailing slash to directories.

check_connection()[source]

Try to open the local file. On Windows (NT) systems, the case sensitivity of the path is also checked.

reset()[source]

Reset all variables to default values.

set_content_type()[source]

Set URL content type, or an empty string if content type could not be found.

class linkcheck.checker.fileurl.FileUrl(base_url, recursion_level, aggregate, parent_url=None, base_ref=None, line=-1, column=-1, page=-1, name='', url_encoding=None, extern=None)[source]

Bases: UrlBase

Url link with file scheme.

Initialize check data, and store given variables.

Parameters:
  • base_url – unquoted and possibly unnormed url

  • recursion_level – recursion level at which the base url is checked

  • aggregate – aggregate instance

  • parent_url – quoted and normed url of parent or None

  • base_ref – quoted and normed url of <base href=""> or None

  • line – line number of url in parent content

  • column – column number of url in parent content

  • page – page number of url in parent content

  • name – name of url or empty

  • url_encoding – encoding of URL or None

  • extern – None or (is_extern, is_strict)

add_size_info()[source]

Get size of file content and modification time from filename path.
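As a rough illustration (not LinkChecker's actual code), both values can be read with a single os.stat call. The helper name size_and_mtime is hypothetical.

```python
import os
from datetime import datetime, timezone


def size_and_mtime(filename):
    """Return (size in bytes, modification time as a UTC datetime)."""
    st = os.stat(filename)
    # st_mtime is a POSIX timestamp; convert it to a timezone-aware datetime.
    return st.st_size, datetime.fromtimestamp(st.st_mtime, tz=timezone.utc)
```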

add_url(url, line=0, column=0, page=0, name='', base=None)[source]

If a local webroot directory is configured, absolute URLs are rewritten to point into it. The URL data is then queued for checking.
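The following sketch shows one plausible way such a rewrite could work. It assumes that "absolute" means a URL starting with a slash and that the webroot is a directory path. The function name apply_webroot is hypothetical and is not part of LinkChecker.

```python
def apply_webroot(url, webroot):
    """Map a URL starting with "/" into the webroot directory (assumed rule)."""
    if webroot and url.startswith("/"):
        # Join with exactly one slash, whether or not webroot ends with one.
        return webroot.rstrip("/") + "/" + url.lstrip("/")
    # Relative URLs, or no configured webroot: leave the URL unchanged.
    return url
```

For example, with a webroot of /var/www/, the URL /img/logo.png would map to /var/www/img/logo.png, while img/logo.png would stay as it is.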

build_base_url()[source]

The URL is normed according to the platform:
  • the base URL is made an absolute file:// URL

  • under Windows the drive specifier is normed
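Making a path into an absolute file:// URL can be sketched with pathlib from the standard library. This is only an illustration of the idea, not the method's implementation; to_file_url is a hypothetical name, and the Windows drive handling is not covered.

```python
from pathlib import Path


def to_file_url(path):
    """Return an absolute file:// URL for a possibly relative filesystem path."""
    # resolve() makes the path absolute; as_uri() requires an absolute path.
    return Path(path).resolve().as_uri()
```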

build_url()[source]

Calls super.build_url() and adds a trailing slash to directories.
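The trailing-slash step might look roughly like the sketch below. The helper with_directory_slash and its two-argument form are assumptions made for illustration; the real method works on the instance's own state.

```python
import os


def with_directory_slash(url, filename):
    """Append "/" to url when filename is a directory and the slash is missing."""
    if os.path.isdir(filename) and not url.endswith("/"):
        return url + "/"
    return url
```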

check_case_sensitivity()[source]

Check that the URL and the Windows path name match in case. A mismatch can cause problems when such files are copied to web servers that are case sensitive.
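A case check of this kind can be sketched by comparing the requested name with the directory listing. The function below is a simplified, hypothetical helper (case_mismatch) that only looks at the last path component; it does not reproduce how LinkChecker obtains the on-disk name.

```python
import os


def case_mismatch(path):
    """Return the on-disk spelling of the last path component if it differs
    from the requested one only by case, else None."""
    dirname, wanted = os.path.split(path)
    entries = os.listdir(dirname or ".")
    if wanted in entries:
        return None  # an exactly matching entry exists
    for actual in entries:
        if actual.lower() == wanted.lower():
            return actual
    return None
```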

check_connection()[source]

Try to open the local file. On Windows (NT) systems, the case sensitivity of the path is also checked.

get_intern_pattern(url=None)[source]

Get pattern for intern URL matching.

Returns:

non-empty regex pattern or None

Return type:

string or None

get_os_filename()[source]

Construct os specific file path out of the file:// URL.

Returns:

file name

Return type:

string

get_temp_filename()[source]

Get filename for content to parse.

init(base_ref, base_url, parent_url, recursion_level, aggregate, line, column, page, name, url_encoding, extern)[source]

Initialize the scheme.

is_directory()[source]

Check if file is a directory.

Returns:

True iff file is a directory

Return type:

bool

is_parseable()[source]

Check if content is parseable for recursion.

Returns:

True if content is parseable

Return type:

bool

read_content()[source]

Return file content, or in case of directories a dummy HTML file with links to the files.
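To make the directory case concrete, here is a minimal sketch of generating such a dummy HTML page from a list of entry names. The function name directory_index_html and the exact markup are assumptions; LinkChecker's generated page may look different.

```python
from html import escape
from urllib.parse import quote


def directory_index_html(names):
    """Build a minimal HTML page that links to each directory entry."""
    items = "\n".join(
        # quote() percent-encodes the href; escape() protects the link text.
        '<a href="%s">%s</a><br>' % (quote(name), escape(name))
        for name in names
    )
    return "<html><body>\n%s\n</body></html>" % items
```

Parsing a page like this lets the checker recurse into a directory the same way it follows links in an ordinary HTML file.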

set_content_type()[source]

Set URL content type, or an empty string if content type could not be found.
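The "empty string if not found" behaviour can be illustrated with the standard mimetypes module. This sketch guesses from the file name only, and the helper name guess_content_type is hypothetical; the actual method may use other information as well.

```python
import mimetypes


def guess_content_type(filename):
    """Return the guessed MIME type for filename, or "" if unknown."""
    mime, _encoding = mimetypes.guess_type(filename)
    return mime or ""
```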

linkcheck.checker.fileurl.get_files(dirname)[source]

Get iterator of entries in directory. Only allows regular files and directories, no symlinks.
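A filter like this could be written with os.scandir, as in the sketch below. It is an illustration under the stated rule (regular files and directories, no symlinks), not the function's source; iter_regular_entries is a hypothetical name, and it yields plain entry names.

```python
import os


def iter_regular_entries(dirname):
    """Yield names of regular files and directories in dirname, skipping symlinks."""
    with os.scandir(dirname) as entries:
        for entry in entries:
            if entry.is_symlink():
                continue
            if entry.is_file(follow_symlinks=False) or entry.is_dir(follow_symlinks=False):
                yield entry.name
```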

linkcheck.checker.fileurl.get_nt_filename(path)[source]

Return case-sensitive filename for NT path.

linkcheck.checker.fileurl.get_os_filename(path)[source]

Return filesystem path for given URL path.
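The core of such a conversion is available in the standard library as urllib.request.url2pathname. The wrapper below (url_path_to_filename, a hypothetical name) only shows that call and leaves out any platform-specific preparation the real function may do.

```python
from urllib.request import url2pathname


def url_path_to_filename(url_path):
    """Convert the path part of a file: URL into a local filesystem path."""
    # url2pathname unquotes percent-escapes and applies platform path rules.
    return url2pathname(url_path)
```

On a POSIX system, a URL path such as /tmp/a%20b.txt would become /tmp/a b.txt; the result on Windows uses backslashes and drive letters instead.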

linkcheck.checker.fileurl.is_absolute_path(path)[source]

Check if given path is absolute. On Windows absolute paths start with a drive letter. On all other systems absolute paths start with a slash.
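The rule above can be sketched as follows. To keep the example testable on any system, the platform is passed in as a flag; the name looks_absolute and the windows parameter are assumptions made for illustration.

```python
import re


def looks_absolute(path, windows):
    """Apply the rule described above; `windows` selects the platform rule."""
    if windows:
        # A drive letter followed by a colon, e.g. C:\docs or c:/docs.
        return re.match(r"[a-zA-Z]:", path) is not None
    return path.startswith("/")
```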

linkcheck.checker.fileurl.prepare_urlpath_for_nt(path)[source]

URLs like file://server/path/ result in a path named /server/path. However, urllib.url2pathname (urllib.request.url2pathname in Python 3) expects ////server/path.
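The prefix adjustment described here amounts to normalising the leading slashes. The sketch below shows only that step; to_unc_url_path is a hypothetical name, and it assumes the input is a server-style path rather than one with a drive specifier, which would need separate handling.

```python
def to_unc_url_path(path):
    """Rewrite "/server/path" as "////server/path"."""
    # Strip any existing leading slashes, then add exactly four.
    return "////" + path.lstrip("/")
```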