scaffold_kit.utils package¶

Submodules¶

scaffold_kit.utils.ignore_parser module¶

Parses and applies .gitignore-style ignore rules to filesystem paths.

This module provides two classes for handling file exclusion patterns. The IgnoreRule class interprets a single pattern line and converts it to a regular expression. The IgnoreParser class reads and manages a collection of these rules, using them to filter lists of files or to check individual paths.

Demo:

To run the module’s demonstration code, use the following command:

$ uv run python -m scaffold_kit.utils.ignore_parser

class scaffold_kit.utils.ignore_parser.IgnoreParser(base_path: str | Path | None = None)[source]¶

Bases: object

Parses and applies ignore rules to filesystem paths.

This class provides methods to load rules from files or strings, and to apply those rules to filter lists of paths or check individual paths for ignored status.

add_lines(lines: Iterator[str]) → None[source]¶

Parses and adds ignore rules from an iterable of lines.

Parameters: lines – An iterable of strings, such as from an open file or str.splitlines().
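The line handling can be pictured with a minimal sketch. It assumes the usual .gitignore conventions (blank lines and lines starting with ‘#’ are skipped), which this page does not confirm for add_lines; the helper name iter_patterns is hypothetical, and stripping all surrounding whitespace is a simplification.

```python
from typing import Iterable, Iterator


def iter_patterns(lines: Iterable[str]) -> Iterator[str]:
    """Hypothetical helper: yield only the lines that look like patterns."""
    for line in lines:
        stripped = line.strip()  # simplification: drops leading spaces too
        if not stripped or stripped.startswith("#"):
            # Assumed convention: blank lines and comments carry no rule.
            continue
        yield stripped


raw = "# build output\n\ndist/\n  *.pyc\n"
print(list(iter_patterns(raw.splitlines())))  # ['dist/', '*.pyc']
```

Each yielded string would then be handed to something like add_rule.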

add_rule(pattern: str) → None[source]¶

Adds a single ignore rule.

Parameters: pattern – The pattern string to add.

explain(path: str | Path, is_dir: bool = False) → list[str][source]¶

Returns a list of all rules that apply to a path.

This method is useful for debugging as it shows every rule that matches the given path and its resulting decision.

Parameters:

path – The path string or Path object to explain.
is_dir – Optional flag to indicate if the path is a directory.

Returns:

A list of strings, each explaining a matching rule and its outcome.

filter(paths: list[str | Path]) → list[str][source]¶

Returns only the paths that are not ignored by the rules.

Parameters: paths – A list of path strings or Path objects.
Returns: A new list containing only the paths that are not ignored.

classmethod from_file(file_path: str | Path, base_path: str | Path | None = None) → IgnoreParser[source]¶

Loads ignore rules from a file.

Parameters:

file_path – The path to the ignore file (e.g., ‘.gitignore’).
base_path – The base path for relative rules. Defaults to the directory of the file_path.

Returns:

A new IgnoreParser instance with the loaded rules.

classmethod from_string(rules: str, base_path: str | Path | None = None) → IgnoreParser[source]¶

Loads ignore rules from a string.

Parameters:

rules – A string containing newline-separated ignore patterns.
base_path – The base path for relative rules.

Returns:

A new IgnoreParser instance with the loaded rules.

matches(path: str | Path, is_dir: bool = False) → bool[source]¶

Checks if a path is ignored by the rules.

The last matching rule decides the result: a regular rule marks the path as ignored, while a negated rule that matches later re-includes it.

Parameters:

path – The path string or Path object to check.
is_dir – Optional flag to indicate if the path is a directory.

Returns:

True if the path is ignored, False otherwise.
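The "last matching rule wins" behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the class's actual code: Rule and is_ignored are hypothetical names, and the two regexes are hand-written stand-ins for what a pattern conversion might produce.

```python
import re
from typing import NamedTuple


class Rule(NamedTuple):
    """Hypothetical stand-in for IgnoreRule: a regex plus a negation flag."""
    regex: re.Pattern
    negated: bool


def is_ignored(path: str, rules: list[Rule]) -> bool:
    """Return the decision of the last rule whose regex matches the path."""
    ignored = False  # no matching rule means the path is kept
    for rule in rules:
        if rule.regex.fullmatch(path):
            # A later match always overrides an earlier one.
            ignored = not rule.negated
    return ignored


rules = [
    Rule(re.compile(r".*\.log"), negated=False),          # like "*.log"
    Rule(re.compile(r"(?:.*/)?keep\.log"), negated=True),  # like "!keep.log"
]
print(is_ignored("debug.log", rules))  # True
print(is_ignored("keep.log", rules))   # False
print(is_ignored("README.md", rules))  # False
```

Because the loop never stops early, rule order matters: placing the negated rule first would leave keep.log ignored.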

class scaffold_kit.utils.ignore_parser.IgnoreRule(pattern: str, regex: Pattern[str], negated: bool = False, dir_only: bool = False)[source]¶

Bases: object

Represents a single ignore pattern and its regex equivalent.

This class handles the conversion of a .gitignore-style pattern string into a compiled regular expression, managing pattern nuances like negation and directory-only rules.

classmethod from_pattern(pattern: str) → IgnoreRule[source]¶

Creates an IgnoreRule from a raw ignore pattern string.

This factory method parses the input string to determine its properties (negation, directory-only) before converting it to a regex.

Parameters: pattern – The raw ignore pattern (e.g., ‘logs/’, ‘!.gitkeep’).
Returns: A new IgnoreRule instance.
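As a rough sketch of the property detection mentioned above, the following assumes that a leading ‘!’ marks negation and a trailing ‘/’ marks a directory-only rule, as in .gitignore files. The function name split_flags is hypothetical and the real factory method may handle more cases, such as escaped characters.

```python
def split_flags(pattern: str) -> tuple[str, bool, bool]:
    """Hypothetical helper: return (core_pattern, negated, dir_only)."""
    negated = pattern.startswith("!")
    if negated:
        pattern = pattern[1:]  # drop the negation marker
    dir_only = pattern.endswith("/")
    if dir_only:
        pattern = pattern.rstrip("/")  # drop the directory marker
    return pattern, negated, dir_only


print(split_flags("logs/"))      # ('logs', False, True)
print(split_flags("!.gitkeep"))  # ('.gitkeep', True, False)
```

The remaining core pattern is what would be converted to a regex.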

matches(path: str, is_dir: bool = False) → bool[source]¶

Checks if the given path matches this ignore rule.

This method uses the rule’s compiled regex to check for a match and applies additional logic for directory-only rules.

Parameters:

path – The path string to check.
is_dir – True if the path represents a directory.

Returns:

True if the path matches the rule, False otherwise.
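One plausible reading of the "additional logic for directory-only rules" is a guard that rejects non-directories before the regex is consulted. The sketch below makes that assumption explicit; rule_matches is a hypothetical free function and the regex is a hand-written stand-in for the pattern ‘logs/’.

```python
import re


def rule_matches(regex: re.Pattern, path: str, is_dir: bool, dir_only: bool) -> bool:
    """Hypothetical helper: regex check plus a directory-only guard."""
    if dir_only and not is_dir:
        # Assumed behaviour: a pattern such as 'logs/' never matches a plain file.
        return False
    return regex.fullmatch(path) is not None


logs = re.compile(r"(?:.*/)?logs")  # assumed regex for the pattern 'logs/'
print(rule_matches(logs, "app/logs", is_dir=True, dir_only=True))   # True
print(rule_matches(logs, "app/logs", is_dir=False, dir_only=True))  # False
```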

scaffold_kit.utils.pattern_processor module¶

Converts glob-like patterns to regular expressions.

This module provides classes for processing .gitignore-style glob patterns and converting them into equivalent regular expressions. It uses a handler-based, “strategy” pattern to process different types of characters (e.g., wildcards, character classes, literals) and handles complex rules like recursive wildcards and root-anchored patterns.

Demo:

To run the module’s demonstration code, use the following command:

$ uv run python -m scaffold_kit.utils.pattern_processor

class scaffold_kit.utils.pattern_processor.CharacterClassHandler[source]¶

Bases: CharacterHandler

Handles ‘[…]’ character classes.

Captures the entire character class including its content and closing bracket.

can_handle(char: str) → bool[source]¶

Checks if the character is a ‘[’.

Parameters: char – The single character to check.
Returns: True if the character opens a character class, False otherwise.

handle(text: str, position: int) → Tuple[str, int][source]¶

Extracts the entire character class from the text.

Parameters:

text – The full text being processed.
position – Current position in the text.

Returns:

A tuple containing the regex string for the character class and the new position in the text after processing.
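A minimal sketch of how such an extraction might work is shown below. The function name read_char_class is hypothetical, and two behaviours are assumptions rather than documented facts: an unterminated ‘[’ is treated as a literal, and the glob negation ‘[!…]’ is rewritten to the regex form ‘[^…]’. Edge cases such as an empty class ‘[]’ are not handled.

```python
def read_char_class(text: str, position: int) -> tuple[str, int]:
    """Hypothetical helper: consume a '[...]' class starting at text[position]."""
    end = text.find("]", position + 1)
    if end == -1:
        # Assumption: without a closing bracket, '[' is just a literal.
        return r"\[", position + 1
    body = text[position + 1:end]
    if body.startswith("!"):
        # Assumption: glob negation '[!...]' maps to regex negation '[^...]'.
        body = "^" + body[1:]
    return f"[{body}]", end + 1


print(read_char_class("[abc].txt", 0))  # ('[abc]', 5)
print(read_char_class("x[!0-9]", 1))    # ('[^0-9]', 7)
```

Returning the new position lets the caller skip the whole class in one step instead of visiting each character inside it.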

class scaffold_kit.utils.pattern_processor.CharacterHandler[source]¶

Bases: ABC

Abstract base class for character handlers.

Character handlers define the logic for converting a specific type of pattern character into its regex equivalent.

abstractmethod can_handle(char: str) → bool[source]¶

Checks if this handler can process the given character.

Parameters: char – The single character to check.
Returns: True if the handler can process the character, False otherwise.

abstractmethod handle(text: str, position: int) → Tuple[str, int][source]¶

Handles the character at the given position.

Parameters:

text – The full text being processed.
position – Current position in the text.

Returns:

A tuple containing the replacement string for the character(s) and the new position in the text after processing.

class scaffold_kit.utils.pattern_processor.GlobProcessor[source]¶

Bases: object

Processes glob patterns using the strategy pattern.

This class iterates through a glob string, applying the appropriate CharacterHandler to each character to build a regex string part.

convert_glob_part(part: str) → str[source]¶

Converts a single glob part to regex using character handlers.

Parameters: part – A single string part of a glob pattern (e.g., ‘path’, ‘*’, ‘**’).
Returns: The regex equivalent of the glob part.
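The handler dispatch can be sketched as a small strategy loop. The class and function names here (Handler, Star, Question, Literal, convert_part) are hypothetical stand-ins; only the translations ‘*’ → ‘[^/]*’ and ‘?’ → ‘[^/]’ are taken from this page, and the ordering of handlers is an assumption.

```python
import re
from abc import ABC, abstractmethod


class Handler(ABC):
    """Hypothetical stand-in for CharacterHandler."""

    @abstractmethod
    def can_handle(self, char: str) -> bool: ...

    @abstractmethod
    def handle(self, text: str, position: int) -> tuple[str, int]: ...


class Star(Handler):
    def can_handle(self, char: str) -> bool:
        return char == "*"

    def handle(self, text: str, position: int) -> tuple[str, int]:
        return "[^/]*", position + 1


class Question(Handler):
    def can_handle(self, char: str) -> bool:
        return char == "?"

    def handle(self, text: str, position: int) -> tuple[str, int]:
        return "[^/]", position + 1


class Literal(Handler):
    def can_handle(self, char: str) -> bool:
        return True  # fallback: accepts everything

    def handle(self, text: str, position: int) -> tuple[str, int]:
        return re.escape(text[position]), position + 1


# Order matters: the catch-all literal handler must come last.
HANDLERS: list[Handler] = [Star(), Question(), Literal()]


def convert_part(part: str) -> str:
    """Walk the part once, letting the first willing handler emit regex."""
    pieces: list[str] = []
    position = 0
    while position < len(part):
        handler = next(h for h in HANDLERS if h.can_handle(part[position]))
        piece, position = handler.handle(part, position)
        pieces.append(piece)
    return "".join(pieces)


print(convert_part("*.py"))       # [^/]*\.py
print(convert_part("file?.txt"))  # file[^/]\.txt
```

Adding support for a new construct then means adding one handler to the list rather than editing the loop.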

class scaffold_kit.utils.pattern_processor.LiteralCharHandler[source]¶

Bases: CharacterHandler

Handles literal characters (default handler).

Converts a literal character to a regex-escaped string.

can_handle(char: str) → bool[source]¶

Reports that this handler accepts the character.

As the fallback handler, it accepts every character.

Parameters: char – The single character to check.
Returns: Always True.

handle(text: str, position: int) → Tuple[str, int][source]¶

Escapes a single literal character for regex.

Parameters:

text – The full text being processed.
position – Current position in the text.

Returns:

A tuple of the escaped character and the new position.

class scaffold_kit.utils.pattern_processor.PatternProcessor[source]¶

Bases: object

Main class for converting glob patterns to regex.

This class orchestrates the entire conversion process, handling normalization, splitting, and joining of the regex parts.

pattern_to_regex(pattern: str) → str[source]¶

Converts a .gitignore-style glob pattern to a regex.

Parameters: pattern – The glob pattern string to convert.
Returns: The complete, anchored regular expression string.
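To illustrate the split, convert, and join flow, here is a simplified sketch. The names glob_to_regex and segment_regex are hypothetical, character classes are omitted, and the anchoring rule is reduced to one assumption: a leading ‘/’ pins the pattern to the base path, while any other pattern may match at any depth. The actual method may differ in all of these details.

```python
import re


def segment_regex(segment: str) -> str:
    """Translate one slash-free segment ('*' and '?' only, in this sketch)."""
    out = []
    for ch in segment:
        if ch == "*":
            out.append("[^/]*")
        elif ch == "?":
            out.append("[^/]")
        else:
            out.append(re.escape(ch))
    return "".join(out)


def glob_to_regex(pattern: str) -> str:
    """Hypothetical helper: build an anchored regex from a glob pattern."""
    anchored = pattern.startswith("/")
    segments = [s for s in pattern.split("/") if s]
    # Unanchored patterns may be preceded by any number of directories.
    regex = "" if anchored else "(?:.*/)?"
    for index, segment in enumerate(segments):
        last = index == len(segments) - 1
        if segment == "**":
            # '**' spans zero or more directory levels.
            regex += ".*" if last else "(?:.*/)?"
        else:
            regex += segment_regex(segment) + ("" if last else "/")
    return f"^{regex}$"


print(glob_to_regex("/build"))        # ^build$
print(glob_to_regex("*.log"))         # ^(?:.*/)?[^/]*\.log$
print(glob_to_regex("docs/**/*.md"))  # ^(?:.*/)?docs/(?:.*/)?[^/]*\.md$
```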

class scaffold_kit.utils.pattern_processor.SingleCharHandler[source]¶

Bases: CharacterHandler

Handles ‘?’ single character wildcards.

Converts a single ‘?’ glob character into its regex equivalent.

can_handle(char: str) → bool[source]¶

Checks if the character is a ‘?’.

Parameters: char – The single character to check.
Returns: True if the character is a single-char wildcard, False otherwise.

handle(text: str, position: int) → Tuple[str, int][source]¶

Converts ‘?’ to ‘[^/]’.

Parameters:

text – The full text being processed.
position – Current position in the text.

Returns:

A tuple of the replacement regex and the new position.

class scaffold_kit.utils.pattern_processor.WildcardHandler[source]¶

Bases: CharacterHandler

Handles ‘*’ wildcard characters.

Converts a single ‘*’ glob character into its regex equivalent.

can_handle(char: str) → bool[source]¶

Checks if the character is a ‘*’.

Parameters: char – The single character to check.
Returns: True if the character is a wildcard, False otherwise.

handle(text: str, position: int) → Tuple[str, int][source]¶

Converts ‘*’ to ‘[^/]*’.

Parameters:

text – The full text being processed.
position – Current position in the text.

Returns:

A tuple of the replacement regex and the new position.
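The choice of ‘[^/]*’ rather than ‘.*’ keeps a single ‘*’ from crossing directory boundaries. The short comparison below uses hand-written regexes, not output produced by this module, to show the difference; the same reasoning applies to the ‘?’ → ‘[^/]’ mapping of SingleCharHandler.

```python
import re

star = re.compile(r"src/[^/]*\.py")  # 'src/*.py' with '*' as '[^/]*'
greedy = re.compile(r"src/.*\.py")   # naive translation, for comparison
one = re.compile(r"v[^/]\.txt")      # 'v?.txt' with '?' as '[^/]'

print(bool(star.fullmatch("src/app.py")))        # True
print(bool(star.fullmatch("src/pkg/app.py")))    # False
print(bool(greedy.fullmatch("src/pkg/app.py")))  # True
print(bool(one.fullmatch("v1.txt")))             # True
print(bool(one.fullmatch("v/.txt")))             # False
```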

scaffold_kit.utils.string_utils module¶

A set of utilities for string manipulation.

This module provides functions for transliterating unicode characters and creating URL-friendly “slugs” from text.

Demo:

To run the module’s demonstration code, use the following command:

$ uv run python -m scaffold_kit.utils.string_utils

scaffold_kit.utils.string_utils.DIACRITICS_MAP: dict[str, str] = {'À': 'A', 'Á': 'A', 'Ã': 'A', 'Ä': 'Ae', 'Å': 'A', 'Ç': 'C', 'È': 'E', 'É': 'E', 'Ë': 'E', 'Ì': 'I', 'Í': 'I', 'Î': 'I', 'Ï': 'I', 'Ñ': 'N', 'Ò': 'O', 'Ó': 'O', 'Ô': 'O', 'Õ': 'O', 'Ö': 'Oe', 'Ù': 'U', 'Ú': 'U', 'Û': 'U', 'Ü': 'Ue', 'Ý': 'Y', 'à': 'a', 'á': 'a', 'ã': 'a', 'ä': 'ae', 'å': 'a', 'ç': 'c', 'è': 'e', 'é': 'e', 'ë': 'e', 'ì': 'i', 'í': 'i', 'î': 'i', 'ï': 'i', 'ò': 'o', 'ó': 'o', 'ô': 'o', 'õ': 'o', 'ö': 'oe', 'ù': 'u', 'ú': 'u', 'û': 'u', 'ü': 'ue', 'ý': 'y', 'ÿ': 'y', 'Ā': 'A', 'ā': 'a', 'Ă': 'A', 'ă': 'a', 'Ą': 'A', 'ą': 'a', 'Ć': 'C', 'ć': 'c', 'Ĉ': 'C', 'ĉ': 'c', 'Č': 'C', 'č': 'c', 'Ď': 'D', 'ď': 'd', 'Đ': 'D', 'đ': 'd', 'Ē': 'E', 'Ĕ': 'E', 'ĕ': 'e', 'ė': 'e', 'Ę': 'E', 'ę': 'e', 'Ě': 'E', 'ě': 'e', 'Ĝ': 'G', 'ĝ': 'g', 'Ğ': 'G', 'ğ': 'g', 'Ġ': 'G', 'ġ': 'g', 'Ģ': 'G', 'ģ': 'g', 'Ĥ': 'H', 'ĥ': 'h', 'Ħ': 'H', 'ħ': 'h', 'ĩ': 'i', 'Ī': 'I', 'ī': 'i', 'Į': 'I', 'İ': 'I', 'Ĵ': 'J', 'ĵ': 'j', 'Ķ': 'K', 'ķ': 'k', 'Ĺ': 'L', 'ĺ': 'l', 'Ļ': 'L', 'ļ': 'l', 'Ľ': 'L', 'ľ': 'l', 'Ŀ': 'L', 'Ņ': 'N', 'ņ': 'n', 'Ň': 'N', 'ň': 'n', 'Ō': 'O', 'ō': 'o', 'Ŏ': 'O', 'ŏ': 'o', 'Ő': 'O', 'ő': 'o', 'Ū': 'U', 'ū': 'u', 'Ů': 'U', 'ů': 'u', 'Ű': 'U', 'ű': 'u', 'Ų': 'U', 'Ŵ': 'W', 'ŵ': 'w', 'Ŷ': 'Y', 'ŷ': 'y', 'Ÿ': 'Y', 'ź': 'z', 'Ż': 'Z', 'ż': 'z', 'Ž': 'Z', 'ž': 'z', 'Ẽ': 'E', 'ẽ': 'e'}¶: Constant signifying diacritics map.

scaffold_kit.utils.string_utils.LIGATURES_MAP: dict[str, str] = {'Æ': 'Ae', 'ß': 'ss', 'æ': 'ae', 'Ĳ': 'Ij', 'ĳ': 'ij', 'Œ': 'Oe', 'œ': 'oe', 'Ʒ': 'Ez', 'ʒ': 'ezh', 'ﬀ': 'ff', 'ﬁ': 'fi', 'ﬂ': 'fl', 'ﬃ': 'ffi', 'ﬄ': 'ffl', 'ﬅ': 'ft', 'ﬆ': 'st'}¶: Constant signifying ligatures map.

scaffold_kit.utils.string_utils.TRANSLITERATE_MAP = {'À': 'A', 'Á': 'A', 'Ã': 'A', 'Ä': 'Ae', 'Å': 'A', 'Æ': 'Ae', 'Ç': 'C', 'È': 'E', 'É': 'E', 'Ë': 'E', 'Ì': 'I', 'Í': 'I', 'Î': 'I', 'Ï': 'I', 'Ñ': 'N', 'Ò': 'O', 'Ó': 'O', 'Ô': 'O', 'Õ': 'O', 'Ö': 'Oe', 'Ù': 'U', 'Ú': 'U', 'Û': 'U', 'Ü': 'Ue', 'Ý': 'Y', 'ß': 'ss', 'à': 'a', 'á': 'a', 'ã': 'a', 'ä': 'ae', 'å': 'a', 'æ': 'ae', 'ç': 'c', 'è': 'e', 'é': 'e', 'ë': 'e', 'ì': 'i', 'í': 'i', 'î': 'i', 'ï': 'i', 'ò': 'o', 'ó': 'o', 'ô': 'o', 'õ': 'o', 'ö': 'oe', 'ù': 'u', 'ú': 'u', 'û': 'u', 'ü': 'ue', 'ý': 'y', 'ÿ': 'y', 'Ā': 'A', 'ā': 'a', 'Ă': 'A', 'ă': 'a', 'Ą': 'A', 'ą': 'a', 'Ć': 'C', 'ć': 'c', 'Ĉ': 'C', 'ĉ': 'c', 'Č': 'C', 'č': 'c', 'Ď': 'D', 'ď': 'd', 'Đ': 'D', 'đ': 'd', 'Ē': 'E', 'Ĕ': 'E', 'ĕ': 'e', 'ė': 'e', 'Ę': 'E', 'ę': 'e', 'Ě': 'E', 'ě': 'e', 'Ĝ': 'G', 'ĝ': 'g', 'Ğ': 'G', 'ğ': 'g', 'Ġ': 'G', 'ġ': 'g', 'Ģ': 'G', 'ģ': 'g', 'Ĥ': 'H', 'ĥ': 'h', 'Ħ': 'H', 'ħ': 'h', 'ĩ': 'i', 'Ī': 'I', 'ī': 'i', 'Į': 'I', 'İ': 'I', 'Ĳ': 'Ij', 'ĳ': 'ij', 'Ĵ': 'J', 'ĵ': 'j', 'Ķ': 'K', 'ķ': 'k', 'Ĺ': 'L', 'ĺ': 'l', 'Ļ': 'L', 'ļ': 'l', 'Ľ': 'L', 'ľ': 'l', 'Ŀ': 'L', 'Ņ': 'N', 'ņ': 'n', 'Ň': 'N', 'ň': 'n', 'Ō': 'O', 'ō': 'o', 'Ŏ': 'O', 'ŏ': 'o', 'Ő': 'O', 'ő': 'o', 'Œ': 'Oe', 'œ': 'oe', 'Ū': 'U', 'ū': 'u', 'Ů': 'U', 'ů': 'u', 'Ű': 'U', 'ű': 'u', 'Ų': 'U', 'Ŵ': 'W', 'ŵ': 'w', 'Ŷ': 'Y', 'ŷ': 'y', 'Ÿ': 'Y', 'ź': 'z', 'Ż': 'Z', 'ż': 'z', 'Ž': 'Z', 'ž': 'z', 'Ʒ': 'Ez', 'ʒ': 'ezh', 'Ẽ': 'E', 'ẽ': 'e', 'ﬀ': 'ff', 'ﬁ': 'fi', 'ﬂ': 'fl', 'ﬃ': 'ffi', 'ﬄ': 'ffl', 'ﬅ': 'ft', 'ﬆ': 'st'}¶: Constant signifying transliterate map (diacritics and ligatures merged).
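A merged map of this shape can be built and applied with standard dictionary and string operations. The sketch below uses tiny hypothetical subsets (DIACRITICS, LIGATURES) rather than the real constants, and str.translate is one possible way to apply the map, not necessarily the one this module uses.

```python
# Tiny hypothetical subsets of the two source maps.
DIACRITICS = {"é": "e", "Ö": "Oe", "ñ": "n"}
LIGATURES = {"æ": "ae", "ß": "ss", "ﬁ": "fi"}

# Merge both maps; on a key collision the right-hand map would win.
MERGED = {**DIACRITICS, **LIGATURES}

# str.maketrans accepts single-character keys mapped to replacement strings.
TABLE = str.maketrans(MERGED)

print("Öl, ﬁn, señor".translate(TABLE))  # Oel, fin, senor
```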

scaffold_kit.utils.string_utils.slugify(text: str) → str[source]¶

Converts a given string into a URL-safe, ASCII-only slug.

This function removes or transliterates diacritics, ligatures, and other non-ASCII characters while normalising whitespace and punctuation into hyphens. The result contains only lowercase letters ([a-z]), digits ([0-9]) and hyphens, making it suitable for use in URLs, file names or keys.

Parameters: text – The original, possibly Unicode string that needs to be slugified.
Returns: A hyphen-separated ASCII slug derived from text. If text is empty or the transformation leads to an empty string, the returned slug is also empty (“”).
Raises: None – all standard exceptions are caught internally.

Examples

Basic usage:

>>> slugify("Café crème à la française")
'cafe-creme-a-la-francaise'

Complex input with punctuation and mixed spaces:

>>> slugify("  ¡Hola! ¿Qué tal?  ")
'hola-que-tal'

Strings that are already clean ASCII stay the same, except for case:

>>> slugify("Valid-slug-already-given")
'valid-slug-already-given'

Empty or symbol-only input results in an empty string:

>>> slugify("!!!!!  ???")
''
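The behaviour shown in these examples can be approximated with the standard library alone. The sketch below is not this module's implementation: slugify_sketch is a hypothetical name, and it skips the explicit transliteration map, so characters without a Unicode decomposition (for example ‘ß’) are dropped here rather than replaced.

```python
import re
import unicodedata


def slugify_sketch(text: str) -> str:
    """Hypothetical approximation of slugify using only the stdlib."""
    # Decompose accented letters, then drop everything outside ASCII.
    ascii_text = (
        unicodedata.normalize("NFKD", text)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse each run of non-alphanumerics into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_text.lower())
    # Remove hyphens left over at either end.
    return slug.strip("-")


print(slugify_sketch("Café crème à la française"))  # cafe-creme-a-la-francaise
print(slugify_sketch("  ¡Hola! ¿Qué tal?  "))       # hola-que-tal
```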

scaffold_kit.utils.string_utils.transliterate(text: str) → str[source]¶

Transliterates Unicode characters to their closest ASCII replacements.

This function replaces diacritics, ligatures, and stylistic variants with base ASCII letters, e.g., ‘ñ’ → ‘n’, ‘æ’ → ‘ae’, ‘ß’ → ‘ss’. Any remaining non-ASCII characters are removed by a second decomposing and encoding pass.

Parameters: text – Any string containing Unicode characters.
Returns: A plain ASCII string where every non-ASCII glyph has been converted or dropped, resulting in lossy, ASCII-only output.
Raises: None – all standard exceptions are caught internally.

Examples

Handling diacritics:

>>> transliterate("François Café")
'Francois Cafe'

Mixed scripts and special characters:

>>> transliterate("Straße – café naïf")
'Strasse  cafe naif'

Ligatures and stylistic variants:

>>> transliterate("Encyclopædia & ﬂuffy œuf")
'Encyclopaedia & fluffy oeuf'

Emojis and math symbols get stripped:

>>> transliterate("α ≤ ½ 😊")
'  '  # whitespace only: the symbols are dropped, the ASCII spaces between them remain
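The two-pass approach described for this function can be sketched as follows. The names SAMPLE_MAP and transliterate_sketch are hypothetical, the map is a tiny subset, and the use of NFD normalization for the second pass is an assumption; the module may use a different normalization form.

```python
import unicodedata

# Tiny hypothetical subset of the transliteration map.
SAMPLE_MAP = {"ß": "ss", "æ": "ae", "œ": "oe", "ﬂ": "fl"}


def transliterate_sketch(text: str) -> str:
    """Hypothetical two-pass transliteration: map lookup, then decomposition."""
    # Pass 1: explicit replacements for characters that do not decompose usefully.
    replaced = "".join(SAMPLE_MAP.get(ch, ch) for ch in text)
    # Pass 2: split base letters from combining marks, then drop non-ASCII.
    decomposed = unicodedata.normalize("NFD", replaced)
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(transliterate_sketch("François Café"))              # Francois Cafe
print(transliterate_sketch("Encyclopædia & ﬂuffy œuf"))  # Encyclopaedia & fluffy oeuf
```

Running the map lookup first matters: ‘ß’ and ‘æ’ have no canonical decomposition, so the second pass alone would simply delete them.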
