scaffold_kit.utils package
Submodules
scaffold_kit.utils.ignore_parser module
Parses and applies .gitignore-style ignore rules to filesystem paths.
This module provides two classes for handling file exclusion patterns. IgnoreRule interprets a single pattern line and converts it to a regular expression. IgnoreParser reads and manages a collection of these rules and uses them to filter lists of paths or to check individual paths.
- Demo:
To run the module’s demonstration code, use the following command:
$ uv run python -m scaffold_kit.utils.ignore_parser
- class scaffold_kit.utils.ignore_parser.IgnoreParser(base_path: str | Path | None = None)
Bases: object
Parses and applies ignore rules to filesystem paths.
This class provides methods to load rules from files or strings, and to apply those rules to filter lists of paths or check individual paths for ignored status.
- add_lines(lines: Iterator[str]) → None
Parses and adds ignore rules from an iterable of lines.
- Parameters:
lines – An iterable of strings, such as from an open file or str.splitlines().
- add_rule(pattern: str) → None
Adds a single ignore rule.
- Parameters:
pattern – The pattern string to add.
- explain(path: str | Path, is_dir: bool = False) → list[str]
Returns a list of all rules that apply to a path.
This method is useful for debugging as it shows every rule that matches the given path and its resulting decision.
- Parameters:
path – The path string or Path object to explain.
is_dir – Optional flag to indicate if the path is a directory.
- Returns:
A list of strings, each explaining a matching rule and its outcome.
- filter(paths: list[str | Path]) → list[str]
Returns only the paths that are not ignored by the rules.
- Parameters:
paths – A list of path strings or Path objects.
- Returns:
A new list containing only the paths that are not ignored.
- classmethod from_file(file_path: str | Path, base_path: str | Path | None = None) → IgnoreParser
Loads ignore rules from a file.
- Parameters:
file_path – The path to the ignore file (e.g., ‘.gitignore’).
base_path – The base path for relative rules. Defaults to the directory of the file_path.
- Returns:
A new IgnoreParser instance with the loaded rules.
- classmethod from_string(rules: str, base_path: str | Path | None = None) → IgnoreParser
Loads ignore rules from a string.
- Parameters:
rules – A string containing newline-separated ignore patterns.
base_path – The base path for relative rules.
- Returns:
A new IgnoreParser instance with the loaded rules.
- matches(path: str | Path, is_dir: bool = False) → bool
Checks if a path is ignored by the rules.
The last matching rule decides the result, so a negated rule (for example ‘!.gitkeep’) that comes after a regular rule overrides it.
- Parameters:
path – The path string or Path object to check.
is_dir – Optional flag to indicate if the path is a directory.
- Returns:
True if the path is ignored, False otherwise.
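The "last matching rule wins" behaviour described for matches() can be illustrated with a small standalone sketch. It is not the module's implementation: the name is_ignored_sketch is hypothetical, and the standard-library fnmatch module stands in for the regex conversion done by IgnoreRule (note that fnmatch lets ‘*’ match ‘/’, which .gitignore does not).

```python
from fnmatch import fnmatchcase


def is_ignored_sketch(path: str, patterns: list[str]) -> bool:
    """Return the verdict of the last pattern that matches `path`.

    Hypothetical helper for illustration only; fnmatch is a stand-in
    for the real pattern-to-regex conversion.
    """
    ignored = False
    for raw in patterns:
        negated = raw.startswith("!")
        pattern = raw[1:] if negated else raw
        if fnmatchcase(path, pattern):
            # A later match always replaces the earlier verdict.
            ignored = not negated
    return ignored


rules = ["*.log", "!keep.log"]
print(is_ignored_sketch("debug.log", rules))  # True
print(is_ignored_sketch("keep.log", rules))   # False
print(is_ignored_sketch("notes.txt", rules))  # False
```

Because only the final match counts, reversing the two rules would make keep.log ignored again, which is why rule order matters in an ignore file.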
- class scaffold_kit.utils.ignore_parser.IgnoreRule(pattern: str, regex: Pattern[str], negated: bool = False, dir_only: bool = False)
Bases: object
Represents a single ignore pattern and its regex equivalent.
This class handles the conversion of a .gitignore-style pattern string into a compiled regular expression, managing pattern nuances like negation and directory-only rules.
- classmethod from_pattern(pattern: str) → IgnoreRule
Creates an IgnoreRule from a raw ignore pattern string.
This factory method parses the input string to determine its properties (negation, directory-only) before converting it to a regex.
- Parameters:
pattern – The raw ignore pattern (e.g., ‘logs/’, ‘!.gitkeep’).
- Returns:
A new IgnoreRule instance.
- matches(path: str, is_dir: bool = False) → bool
Checks if the given path matches this ignore rule.
This method uses the rule’s compiled regex to check for a match and applies additional logic for directory-only rules.
- Parameters:
path – The path string to check.
is_dir – True if the path represents a directory.
- Returns:
True if the path matches the rule, False otherwise.
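As a rough illustration of the two steps described for IgnoreRule (detecting negation and directory-only markers, then applying the directory-only check when matching), the sketch below uses hypothetical names (ParsedPattern, parse_pattern, rule_matches) and fnmatch instead of a compiled regex. It is a simplification under those assumptions, not the class's actual code.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class ParsedPattern:
    """Hypothetical stand-in for the properties an IgnoreRule records."""
    body: str
    negated: bool
    dir_only: bool


def parse_pattern(raw: str) -> ParsedPattern:
    """Split a raw pattern into its body and its two flags."""
    text = raw.strip()
    negated = text.startswith("!")   # a leading '!' marks a negated rule
    if negated:
        text = text[1:]
    dir_only = text.endswith("/")    # a trailing '/' marks a directory-only rule
    if dir_only:
        text = text.rstrip("/")
    return ParsedPattern(body=text, negated=negated, dir_only=dir_only)


def rule_matches(rule: ParsedPattern, path: str, is_dir: bool = False) -> bool:
    """Directory-only rules never match plain files."""
    if rule.dir_only and not is_dir:
        return False
    return fnmatchcase(path, rule.body)


print(parse_pattern("logs/"))      # ParsedPattern(body='logs', negated=False, dir_only=True)
print(parse_pattern("!.gitkeep"))  # ParsedPattern(body='.gitkeep', negated=True, dir_only=False)
```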
scaffold_kit.utils.pattern_processor module
Converts glob-like patterns to regular expressions.
This module provides classes for processing .gitignore-style glob patterns and converting them into equivalent regular expressions. It follows the strategy pattern: a separate handler processes each type of character (e.g., wildcards, character classes, literals). It also handles more complex rules such as recursive wildcards and root-anchored patterns.
- Demo:
To run the module’s demonstration code, use the following command:
$ uv run python -m scaffold_kit.utils.pattern_processor
- class scaffold_kit.utils.pattern_processor.CharacterClassHandler
Bases: CharacterHandler
Handles ‘[…]’ character classes.
Captures the entire character class including its content and closing bracket.
- can_handle(char: str) → bool
Checks if the character is a ‘[’.
- Parameters:
char – The single character to check.
- Returns:
True if the character opens a character class, False otherwise.
- handle(text: str, position: int) → Tuple[str, int]
Extracts the entire character class from the text.
- Parameters:
text – The full text being processed.
position – Current position in the text.
- Returns:
A tuple containing the regex string for the character class and the new position in the text after processing.
- Return type:
Tuple[str, int]
- class scaffold_kit.utils.pattern_processor.CharacterHandler
Bases: ABC
Abstract base class for character handlers.
Character handlers define the logic for converting a specific type of pattern character into its regex equivalent.
- abstractmethod can_handle(char: str) → bool
Checks if this handler can process the given character.
- Parameters:
char – The single character to check.
- Returns:
True if the handler can process the character, False otherwise.
- abstractmethod handle(text: str, position: int) → Tuple[str, int]
Handles the character at the given position.
- Parameters:
text – The full text being processed.
position – Current position in the text.
- Returns:
A tuple containing the replacement string for the character(s) and the new position in the text after processing.
- Return type:
Tuple[str, int]
- class scaffold_kit.utils.pattern_processor.GlobProcessor
Bases: object
Processes glob patterns using the strategy pattern.
This class iterates through a glob string, applying the appropriate CharacterHandler to each character to build a regex string part.
- class scaffold_kit.utils.pattern_processor.LiteralCharHandler
Bases: CharacterHandler
Handles literal characters (default handler).
Converts a literal character to a regex-escaped string.
- class scaffold_kit.utils.pattern_processor.PatternProcessor
Bases: object
Main class for converting glob patterns to regex.
This class orchestrates the entire conversion process, handling normalization, splitting, and joining of the regex parts.
- class scaffold_kit.utils.pattern_processor.SingleCharHandler
Bases: CharacterHandler
Handles ‘?’ single character wildcards.
Converts a single ‘?’ glob character into its regex equivalent.
- class scaffold_kit.utils.pattern_processor.WildcardHandler
Bases: CharacterHandler
Handles ‘*’ wildcard characters.
Converts a single ‘*’ glob character into its regex equivalent.
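The handler classes above share the can_handle/handle interface, and a processor picks the first handler that accepts the current character. The sketch below shows one way such a loop could look. All class and function names in it are hypothetical, the regex fragments chosen for ‘*’ and ‘?’ are assumptions, and it ignores recursive wildcards, root anchoring, and edge cases such as ‘[!…]’ or an empty class.

```python
import re
from abc import ABC, abstractmethod


class Handler(ABC):
    """Hypothetical mirror of the CharacterHandler interface."""

    @abstractmethod
    def can_handle(self, char: str) -> bool: ...

    @abstractmethod
    def handle(self, text: str, position: int) -> tuple[str, int]: ...


class Star(Handler):
    def can_handle(self, char: str) -> bool:
        return char == "*"

    def handle(self, text: str, position: int) -> tuple[str, int]:
        return "[^/]*", position + 1      # assumed: '*' stops at '/'


class Question(Handler):
    def can_handle(self, char: str) -> bool:
        return char == "?"

    def handle(self, text: str, position: int) -> tuple[str, int]:
        return "[^/]", position + 1       # assumed: '?' is one non-'/' char


class CharClass(Handler):
    def can_handle(self, char: str) -> bool:
        return char == "["

    def handle(self, text: str, position: int) -> tuple[str, int]:
        end = text.find("]", position + 1)
        if end == -1:                     # unterminated: treat '[' literally
            return re.escape("["), position + 1
        return text[position:end + 1], end + 1


class Literal(Handler):
    def can_handle(self, char: str) -> bool:
        return True                       # default handler, must come last

    def handle(self, text: str, position: int) -> tuple[str, int]:
        return re.escape(text[position]), position + 1


HANDLERS: list[Handler] = [Star(), Question(), CharClass(), Literal()]


def glob_to_regex(glob: str) -> str:
    """Walk the glob, letting the first willing handler consume input."""
    parts: list[str] = []
    position = 0
    while position < len(glob):
        handler = next(h for h in HANDLERS if h.can_handle(glob[position]))
        part, position = handler.handle(glob, position)
        parts.append(part)
    return "".join(parts)


print(glob_to_regex("*.py"))  # [^/]*\.py
```

Returning the new position from handle() lets a handler such as the character-class one consume several characters at once, while the others advance by one.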
scaffold_kit.utils.string_utils module
A set of utilities for string manipulation.
This module provides functions for transliterating unicode characters and creating URL-friendly “slugs” from text.
- Demo:
To run the module’s demonstration code, use the following command:
$ uv run python -m scaffold_kit.utils.string_utils
- scaffold_kit.utils.string_utils.DIACRITICS_MAP: dict[str, str] = {'À': 'A', 'Á': 'A', 'Ã': 'A', 'Ä': 'Ae', 'Å': 'A', 'Ç': 'C', 'È': 'E', 'É': 'E', 'Ë': 'E', 'Ì': 'I', 'Í': 'I', 'Î': 'I', 'Ï': 'I', 'Ñ': 'N', 'Ò': 'O', 'Ó': 'O', 'Ô': 'O', 'Õ': 'O', 'Ö': 'Oe', 'Ù': 'U', 'Ú': 'U', 'Û': 'U', 'Ü': 'Ue', 'Ý': 'Y', 'à': 'a', 'á': 'a', 'ã': 'a', 'ä': 'ae', 'å': 'a', 'ç': 'c', 'è': 'e', 'é': 'e', 'ë': 'e', 'ì': 'i', 'í': 'i', 'î': 'i', 'ï': 'i', 'ò': 'o', 'ó': 'o', 'ô': 'o', 'õ': 'o', 'ö': 'oe', 'ù': 'u', 'ú': 'u', 'û': 'u', 'ü': 'ue', 'ý': 'y', 'ÿ': 'y', 'Ā': 'A', 'ā': 'a', 'Ă': 'A', 'ă': 'a', 'Ą': 'A', 'ą': 'a', 'Ć': 'C', 'ć': 'c', 'Ĉ': 'C', 'ĉ': 'c', 'Č': 'C', 'č': 'c', 'Ď': 'D', 'ď': 'd', 'Đ': 'D', 'đ': 'd', 'Ē': 'E', 'Ĕ': 'E', 'ĕ': 'e', 'ė': 'e', 'Ę': 'E', 'ę': 'e', 'Ě': 'E', 'ě': 'e', 'Ĝ': 'G', 'ĝ': 'g', 'Ğ': 'G', 'ğ': 'g', 'Ġ': 'G', 'ġ': 'g', 'Ģ': 'G', 'ģ': 'g', 'Ĥ': 'H', 'ĥ': 'h', 'Ħ': 'H', 'ħ': 'h', 'ĩ': 'i', 'Ī': 'I', 'ī': 'i', 'Į': 'I', 'İ': 'I', 'Ĵ': 'J', 'ĵ': 'j', 'Ķ': 'K', 'ķ': 'k', 'Ĺ': 'L', 'ĺ': 'l', 'Ļ': 'L', 'ļ': 'l', 'Ľ': 'L', 'ľ': 'l', 'Ŀ': 'L', 'Ņ': 'N', 'ņ': 'n', 'Ň': 'N', 'ň': 'n', 'Ō': 'O', 'ō': 'o', 'Ŏ': 'O', 'ŏ': 'o', 'Ő': 'O', 'ő': 'o', 'Ū': 'U', 'ū': 'u', 'Ů': 'U', 'ů': 'u', 'Ű': 'U', 'ű': 'u', 'Ų': 'U', 'Ŵ': 'W', 'ŵ': 'w', 'Ŷ': 'Y', 'ŷ': 'y', 'Ÿ': 'Y', 'ź': 'z', 'Ż': 'Z', 'ż': 'z', 'Ž': 'Z', 'ž': 'z', 'Ẽ': 'E', 'ẽ': 'e'}
Maps characters with diacritics to their ASCII replacements.
- scaffold_kit.utils.string_utils.LIGATURES_MAP: dict[str, str] = {'Æ': 'Ae', 'ß': 'ss', 'æ': 'ae', 'Ĳ': 'Ij', 'ĳ': 'ij', 'Œ': 'Oe', 'œ': 'oe', 'Ʒ': 'Ez', 'ʒ': 'ezh', 'ﬀ': 'ff', 'ﬁ': 'fi', 'ﬂ': 'fl', 'ﬃ': 'ffi', 'ﬄ': 'ffl', 'ﬅ': 'ft', 'ﬆ': 'st'}
Maps ligatures and similar special letters to their ASCII replacements.
- scaffold_kit.utils.string_utils.TRANSLITERATE_MAP = {'À': 'A', 'Á': 'A', 'Ã': 'A', 'Ä': 'Ae', 'Å': 'A', 'Æ': 'Ae', 'Ç': 'C', 'È': 'E', 'É': 'E', 'Ë': 'E', 'Ì': 'I', 'Í': 'I', 'Î': 'I', 'Ï': 'I', 'Ñ': 'N', 'Ò': 'O', 'Ó': 'O', 'Ô': 'O', 'Õ': 'O', 'Ö': 'Oe', 'Ù': 'U', 'Ú': 'U', 'Û': 'U', 'Ü': 'Ue', 'Ý': 'Y', 'ß': 'ss', 'à': 'a', 'á': 'a', 'ã': 'a', 'ä': 'ae', 'å': 'a', 'æ': 'ae', 'ç': 'c', 'è': 'e', 'é': 'e', 'ë': 'e', 'ì': 'i', 'í': 'i', 'î': 'i', 'ï': 'i', 'ò': 'o', 'ó': 'o', 'ô': 'o', 'õ': 'o', 'ö': 'oe', 'ù': 'u', 'ú': 'u', 'û': 'u', 'ü': 'ue', 'ý': 'y', 'ÿ': 'y', 'Ā': 'A', 'ā': 'a', 'Ă': 'A', 'ă': 'a', 'Ą': 'A', 'ą': 'a', 'Ć': 'C', 'ć': 'c', 'Ĉ': 'C', 'ĉ': 'c', 'Č': 'C', 'č': 'c', 'Ď': 'D', 'ď': 'd', 'Đ': 'D', 'đ': 'd', 'Ē': 'E', 'Ĕ': 'E', 'ĕ': 'e', 'ė': 'e', 'Ę': 'E', 'ę': 'e', 'Ě': 'E', 'ě': 'e', 'Ĝ': 'G', 'ĝ': 'g', 'Ğ': 'G', 'ğ': 'g', 'Ġ': 'G', 'ġ': 'g', 'Ģ': 'G', 'ģ': 'g', 'Ĥ': 'H', 'ĥ': 'h', 'Ħ': 'H', 'ħ': 'h', 'ĩ': 'i', 'Ī': 'I', 'ī': 'i', 'Į': 'I', 'İ': 'I', 'Ĳ': 'Ij', 'ĳ': 'ij', 'Ĵ': 'J', 'ĵ': 'j', 'Ķ': 'K', 'ķ': 'k', 'Ĺ': 'L', 'ĺ': 'l', 'Ļ': 'L', 'ļ': 'l', 'Ľ': 'L', 'ľ': 'l', 'Ŀ': 'L', 'Ņ': 'N', 'ņ': 'n', 'Ň': 'N', 'ň': 'n', 'Ō': 'O', 'ō': 'o', 'Ŏ': 'O', 'ŏ': 'o', 'Ő': 'O', 'ő': 'o', 'Œ': 'Oe', 'œ': 'oe', 'Ū': 'U', 'ū': 'u', 'Ů': 'U', 'ů': 'u', 'Ű': 'U', 'ű': 'u', 'Ų': 'U', 'Ŵ': 'W', 'ŵ': 'w', 'Ŷ': 'Y', 'ŷ': 'y', 'Ÿ': 'Y', 'ź': 'z', 'Ż': 'Z', 'ż': 'z', 'Ž': 'Z', 'ž': 'z', 'Ʒ': 'Ez', 'ʒ': 'ezh', 'Ẽ': 'E', 'ẽ': 'e', 'ﬀ': 'ff', 'ﬁ': 'fi', 'ﬂ': 'fl', 'ﬃ': 'ffi', 'ﬄ': 'ffl', 'ﬅ': 'ft', 'ﬆ': 'st'}
Combined transliteration map (DIACRITICS_MAP and LIGATURES_MAP merged).
- scaffold_kit.utils.string_utils.slugify(text: str) → str
Converts a given string into a URL-safe, ASCII-only slug.
This function removes or transliterates diacritics, ligatures, and other non-ASCII characters while normalising whitespace and punctuation into hyphens. The result contains only lowercase letters ([a-z]), digits ([0-9]) and hyphens, making it suitable for use in URLs, file names or keys.
- Parameters:
text – The original, possibly Unicode string that needs to be slugified.
- Returns:
A hyphen-separated ASCII slug derived from text. If text is empty or the transformation leads to an empty string, the returned slug is also empty (“”).
- Raises:
None – all standard exceptions are caught internally.
Examples
Basic usage:
>>> slugify("Café crème à la française")
'cafe-creme-a-la-francaise'
Complex input with punctuation and mixed spaces:
>>> slugify(" ¡Hola! ¿Qué tal? ")
'hola-que-tal'
Strings that are already clean ASCII remain the same, except for case:
>>> slugify("Valid-slug-already-given")
'valid-slug-already-given'
Empty or symbol-only input results in an empty string:
>>> slugify("!!!!! ???")
''
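The steps described for slugify (reduce to ASCII, lowercase, collapse everything else into hyphens) can be approximated with the standard library alone. The sketch below is an assumption-laden stand-in named slugify_sketch: it relies on Unicode NFKD decomposition instead of the module's mapping tables, so characters such as ‘ß’ or ‘æ’ are dropped here rather than expanded.

```python
import re
import unicodedata


def slugify_sketch(text: str) -> str:
    """Approximate slug: ASCII-only, lowercase, hyphen-separated.

    Illustrative only; does not use TRANSLITERATE_MAP.
    """
    # Split accented letters into base letter + combining mark.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop everything that is not ASCII (marks, symbols, emoji).
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Collapse each run of non-alphanumerics into one hyphen, then trim.
    return re.sub(r"[^a-z0-9]+", "-", ascii_only.lower()).strip("-")


print(slugify_sketch("Café crème à la française"))  # cafe-creme-a-la-francaise
print(slugify_sketch(" ¡Hola! ¿Qué tal? "))          # hola-que-tal
```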
- scaffold_kit.utils.string_utils.transliterate(text: str) → str
Transliterates Unicode characters to their closest ASCII replacements.
This function replaces diacritics, ligatures, and stylistic variants with base ASCII letters, e.g., ‘ñ’ → ‘n’, ‘æ’ → ‘ae’, ‘ß’ → ‘ss’. All remaining non-ASCII characters are removed by a second decomposing and encoding pass.
- Parameters:
text – Any string containing Unicode characters.
- Returns:
A plain ASCII string where every non-ASCII glyph has been converted or dropped, resulting in lossy but URL-safe output.
- Raises:
None – all standard exceptions are caught internally.
Examples
Handling diacritics:
>>> transliterate("François Café")
'Francois Cafe'
Mixed scripts and special characters:
>>> transliterate("Straße – café naïf")
'Strasse cafe naif '
Ligatures and stylistic variants:
>>> transliterate("Encyclopædia & fluffy œuf")
'Encyclopaedia & fluffy oeuf'
Emojis and math get stripped:
>>> transliterate("α ≤ ½ 😊")  # only whitespace remains; every other character is non-ASCII
' '
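The two passes described for transliterate (table lookup first, then decomposition and ASCII encoding to drop whatever is left) might look roughly like the sketch below. SAMPLE_MAP is a tiny hypothetical excerpt standing in for TRANSLITERATE_MAP, and transliterate_sketch is not the module's function.

```python
import unicodedata

# Hypothetical four-entry excerpt; the real table is much larger.
SAMPLE_MAP: dict[str, str] = {"ß": "ss", "æ": "ae", "œ": "oe", "ñ": "n"}


def transliterate_sketch(text: str, table: dict[str, str] = SAMPLE_MAP) -> str:
    """Two-pass ASCII conversion, for illustration only."""
    # Pass 1: explicit replacements for characters that do not decompose
    # into an ASCII base letter (ligatures, 'ß', and similar).
    replaced = "".join(table.get(char, char) for char in text)
    # Pass 2: decompose accents, then drop any remaining non-ASCII code points.
    decomposed = unicodedata.normalize("NFKD", replaced)
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(transliterate_sketch("François Café"))  # Francois Cafe
print(transliterate_sketch("Encyclopædia"))   # Encyclopaedia
print(transliterate_sketch("Straße"))         # Strasse
```

Doing the table lookup first matters for characters like ‘ß’ and ‘æ’, which have no ASCII decomposition and would otherwise be silently removed by the encoding pass.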