URL

The URL class represents a parsed URL and provides methods to inspect and manipulate its components according to the WHATWG URL Standard.

Overview

from pywhatwgurl import URL

url = URL("https://example.com:8080/path?q=python#section")
print(url.hostname)  # "example.com"
print(url.search_params["q"])  # "python"

Constructor

Parse url relative to base.

Parameters:

Name  Type           Description                   Default
url   str            The URL string to parse.      required
base  Optional[str]  An optional base URL string.  None

Raises:

Type        Description
ValueError  If url or base cannot be parsed.

Source code in pywhatwgurl/url.py
def __init__(self, url: str, base: Optional[str] = None) -> None:
    parsed_base: Optional[URLRecord] = None
    if base is not None:
        parsed_base = _basic_url_parse(base)
        if parsed_base is None:
            raise ValueError(f"Invalid base URL: {base}")

    parsed_url = _basic_url_parse(url, base=parsed_base)
    if parsed_url is None:
        raise ValueError(f"Invalid URL: {url}")

    self._url = parsed_url
    self._query = URLSearchParamsImpl()
    self._query._list = _parse_urlencoded_string(parsed_url.query or "")
    self._query._url = self
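
The control flow above (validate the base first, then parse url against it, raising ValueError at the first failure) can be sketched with the standard library alone. This is a rough analogy, not the library's parser: urllib.parse does not follow the WHATWG URL Standard, and resolve_or_raise is a hypothetical helper name used only for illustration.

```python
from typing import Optional
from urllib.parse import urljoin, urlsplit


def resolve_or_raise(url: str, base: Optional[str] = None) -> str:
    """Hypothetical helper mirroring the constructor's two-step flow."""
    if base is not None:
        # Step 1: the base must itself be an absolute URL (it needs a scheme).
        if not urlsplit(base).scheme:
            raise ValueError(f"Invalid base URL: {base}")
        # Step 2: resolve the (possibly relative) url against the base.
        url = urljoin(base, url)
    # Without a base, or after resolution, the result must be absolute.
    if not urlsplit(url).scheme:
        raise ValueError(f"Invalid URL: {url}")
    return url


resolved = resolve_or_raise("../a?x=1", "https://example.com/docs/guide/")
print(resolved)  # https://example.com/docs/a?x=1
```

A relative input such as "/path" with no base raises ValueError in this sketch, which matches the failure mode documented for the constructor.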

Static Methods

These methods parse or validate a URL without raising an exception on invalid input.

parse

Parse url relative to base, returning None instead of raising if parsing fails.

Parameters:

Name  Type           Description                   Default
url   str            The URL string to parse.      required
base  Optional[str]  An optional base URL string.  None

Returns:

Type             Description
Optional['URL']  A new URL object, or None if parsing fails.

Source code in pywhatwgurl/url.py
@classmethod
def parse(cls, url: str, base: Optional[str] = None) -> Optional["URLImpl"]:
    try:
        return cls(url, base)
    except ValueError:
        return None

can_parse

Return True if url (relative to base) can be parsed, False otherwise.

Parameters:

Name  Type           Description                   Default
url   str            The URL string to check.      required
base  Optional[str]  An optional base URL string.  None

Returns:

Type  Description
bool  True if the URL is valid, False otherwise.

Source code in pywhatwgurl/url.py
@classmethod
def can_parse(cls, url: str, base: Optional[str] = None) -> bool:
    return cls.parse(url, base) is not None
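
The layering shown in the two listings (a raising constructor, a parse that converts the exception to None, and a can_parse built on parse) can be reproduced on a small stand-in class. MiniURL below is hypothetical and uses a deliberately crude validity check based on urllib.parse; it only illustrates the pattern, not the library's parsing rules.

```python
from typing import List, Optional
from urllib.parse import urlsplit


class MiniURL:
    """Hypothetical stand-in: only the error-handling pattern matters here."""

    def __init__(self, url: str) -> None:
        # Crude validity check (assumption): an absolute URL needs a scheme.
        if not urlsplit(url).scheme:
            raise ValueError(f"Invalid URL: {url}")
        self.href = url

    @classmethod
    def parse(cls, url: str) -> Optional["MiniURL"]:
        # Convert the constructor's exception into a None result.
        try:
            return cls(url)
        except ValueError:
            return None

    @classmethod
    def can_parse(cls, url: str) -> bool:
        # Validation is just "did parse succeed?".
        return cls.parse(url) is not None


candidates: List[str] = ["https://example.com/", "no-scheme", "mailto:a@example.com"]
valid = [c for c in candidates if MiniURL.can_parse(c)]
print(valid)  # ['https://example.com/', 'mailto:a@example.com']
```

A None-returning parse suits call sites that treat invalid input as an ordinary case, such as filtering user-supplied links, while the constructor suits code where an invalid URL is a genuine error.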

URL Components

All component properties are readable and writable, except origin and search_params, which are read-only.

WHATWG Spec Compliance

These properties correspond directly to the URL class interface in the WHATWG URL Standard.

href

Full URL string.

origin

The URL's origin.

protocol

URL protocol scheme, ending with ':'.

username

URL username.

password

URL password.

host

URL host: the hostname, followed by ':' and the port when the port is not the scheme's default.

hostname

URL hostname.

port

URL port.

pathname

URL pathname.

search

URL query string, starting with '?'.

search_params

URLSearchParams object representing the query string.

hash

URL fragment identifier, starting with '#'.
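
To make the delimiter conventions above concrete (protocol ends with ':', search starts with '?', hash starts with '#'), here is a minimal sketch that derives similarly shaped values from the standard library's urlsplit. The function name whatwg_style_components is hypothetical, and urllib.parse is not WHATWG-compliant, so this only illustrates the naming and delimiter rules. In particular, it does not drop default ports.

```python
from typing import Dict
from urllib.parse import urlsplit


def whatwg_style_components(url: str) -> Dict[str, str]:
    """Hypothetical helper: map urlsplit() fields to WHATWG-style names."""
    parts = urlsplit(url)
    return {
        # The scheme is reported with its trailing colon.
        "protocol": parts.scheme + ":",
        "hostname": parts.hostname or "",
        # Reported as a string here; empty when no port is given.
        "port": "" if parts.port is None else str(parts.port),
        "pathname": parts.path,
        # Query and fragment keep their leading delimiter, or are empty.
        "search": "?" + parts.query if parts.query else "",
        "hash": "#" + parts.fragment if parts.fragment else "",
    }


components = whatwg_style_components("https://example.com:8080/path?q=python#section")
for name, value in components.items():
    print(f"{name}: {value!r}")
```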

Serialization Methods

Pythonic Usage

Prefer str(url) over url.to_string(). The to_string and to_json methods exist for WHATWG spec compliance.

to_string

Return the href.

Returns:

Type  Description
str   The full URL string.

Source code in pywhatwgurl/interfaces.py
def to_string(self) -> str:
    """Return the href.

    Returns:
        The full URL string.
    """
    return self.href

to_json

Return the href, for JSON serialization.

Note

This method is primarily for WHATWG Spec compliance.

Returns:

Type  Description
str   The full URL string.

Source code in pywhatwgurl/interfaces.py
def to_json(self) -> str:
    """Return the href, for JSON serialization.

    Note:
        This method is primarily for WHATWG Spec compliance.

    Returns:
        The full URL string.
    """
    return self.href
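
Python's json module does not call a to_json method on its own, so an object exposing one has to be wired in through the default hook of json.dumps. The sketch below shows one way to do that. JsonURL is a hypothetical stand-in that exposes only href and to_json, and encode_with_to_json is an illustrative helper name.

```python
import json


class JsonURL:
    """Hypothetical stand-in exposing only href and to_json."""

    def __init__(self, href: str) -> None:
        self.href = href

    def to_json(self) -> str:
        return self.href


def encode_with_to_json(obj):
    # json.dumps calls this hook for objects it cannot serialize natively.
    if hasattr(obj, "to_json"):
        return obj.to_json()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


payload = json.dumps({"link": JsonURL("https://example.com/")}, default=encode_with_to_json)
print(payload)  # {"link": "https://example.com/"}
```

When only a single URL needs to be embedded, passing str(url) directly is simpler than installing a hook.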

Special Methods

Method    Usage         Description
__str__   str(url)      Returns the full URL string
__repr__  repr(url)     Returns URL('...') representation
__eq__    url1 == url2  Compares URLs by href
__hash__  -             None (URLs are mutable, not hashable)
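
The behavior in the table can be reproduced on a small stand-in class to show its practical consequences: equality follows href, and setting __hash__ to None means instances cannot be placed in sets or used as dict keys. HrefURL is hypothetical and is not the library's implementation.

```python
class HrefURL:
    """Hypothetical stand-in mirroring the special methods listed above."""

    def __init__(self, href: str) -> None:
        self.href = href

    def __str__(self) -> str:
        return self.href

    def __repr__(self) -> str:
        return f"URL({self.href!r})"

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, HrefURL):
            return NotImplemented
        return self.href == other.href

    # Mutable objects compared by value should not be hashable.
    __hash__ = None


a = HrefURL("https://example.com/")
b = HrefURL("https://example.com/")
print(a == b)   # True
print(repr(a))  # URL('https://example.com/')

try:
    {a}  # building a set requires hashing
except TypeError as exc:
    print(f"unhashable: {exc}")

# A common workaround: key collections by the serialized string instead.
seen = {str(a), str(b)}
print(len(seen))  # 1
```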