Overview¶
Code generators for immutable structured data, including algebraic data types, and functions to destructure them.
Structured Data provides three public modules: structured_data.adt, structured_data.match, and structured_data.data.
The adt module provides base classes and an annotation type for converting a class into an algebraic data type.
The match module provides a Pattern class that can be used to build match structures, and a Matchable class that wraps a value and attempts to apply match structures to it.
If the match succeeds, the bindings can be extracted and used.
It includes some special support for adt subclasses.
The match architecture allows you to pull values out of a nested structure:
from structured_data import match
structure = (match.pat.a, match.pat.b[match.pat.c, match.pat.d], 5)
my_value = (('abc', 'xyz'), ('def', 'ghi'), 5)
matchable = match.Matchable(my_value)
if matchable(structure):
    # The format of the matches is not final.
    print(matchable['a'])  # ('abc', 'xyz')
    print(matchable['b'])  # ('def', 'ghi')
    print(matchable['c'])  # 'def'
    print(matchable['d'])  # 'ghi'
The subscript operator allows binding both the outside and the inside of a structure.
Indexing a Matchable is forwarded to a matches attribute, which is None if the last match was not successful, and otherwise contains an instance of a custom mapping type, which allows building the matched values back up into simple structures.
The Sum base class exists to create classes that do not necessarily have a single fixed format, but do have a fixed set of possible formats.
This lowers the maintenance burden of writing functions that operate on values of a Sum class, because the full list of cases to handle is directly in the class definition.
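Here are implementations of algebraic data types that are common in other languages:
import typing
from structured_data import adt

# Type variables for the generic parameters used below.
T = typing.TypeVar("T")
E = typing.TypeVar("E")
R = typing.TypeVar("R")

class Maybe(adt.Sum, typing.Generic[T]):
    Just: adt.Ctor[T]
    Nothing: adt.Ctor

class Either(adt.Sum, typing.Generic[E, R]):
    Left: adt.Ctor[E]
    Right: adt.Ctor[R]
The data module provides classes based on these examples.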
- Free software: MIT license
How Can I Help?¶
Currently, this project has somewhat high quality metrics, though some of them have been higher. I am highly skeptical of this, because I’ve repeatedly given in to the temptation to code to the metrics. I can’t trust the metrics, and I know the code well enough that I can’t trust my own judgment to figure out which bits need to be improved and how. I need someone to review the code and identify problem spots based on what doesn’t make sense to them. The issues are open.
Should I Use This?¶
Until there’s a major version out, probably not.
There are several alternatives in the standard library that may be better suited to particular use-cases:
- The namedtuple factory creates tuple classes with a single structure; the typing.NamedTuple class offers the ability to include type information. The interface is slightly awkward, and the values expose their tuple-nature easily. (NOTE: In Python 3.8, the fast access to namedtuple members means that they bypass user-defined __getitem__ methods, thereby allowing factory consumers to customize indexing without breaking attribute access. It looks like it does still rely on iteration behavior for various convenience methods.)
- The enum module provides base classes to create finite enumerations. Unlike NamedTuple, the ability to convert values into an underlying type must be opted into in the class definition.
- The dataclasses module provides a class decorator that converts a class into one with a single structure, similar to a namedtuple, but with more customization: instances are mutable by default, and it’s possible to generate implementations of common protocols.
- The Structured Data adt decorator is inspired by the design of dataclasses. (A previous attempt used metaclasses inspired by the enum module, and was a nightmare.) Unlike enum, it doesn’t require all instances to be defined up front; instead each class defines constructors using a sequence of types, which ultimately determines the number of arguments the constructor takes. Unlike namedtuple and dataclasses, it allows instances to have multiple shapes with their own type signatures. Unlike using regular classes, the set of shapes is specified up front.
- If you want multiple shapes, and don’t want to specify them ahead of time, your best bet is probably a normal tree of classes, where the leaf classes are dataclasses.
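As a rough illustration of the standard-library options listed above, the following sketch uses only namedtuple-style and dataclass-style classes. The names Point, Shape, Circle, Rect, and area are hypothetical and are not part of Structured Data. It shows a NamedTuple exposing its tuple nature, and the "normal tree of classes with dataclass leaves" approach, where nothing in the class definitions lists every case.

```python
import dataclasses
import math
import typing


class Point(typing.NamedTuple):
    """A single-structure value; it is still a tuple underneath."""

    x: int
    y: int


class Shape:
    """Hypothetical root of an open-ended tree of shapes."""


@dataclasses.dataclass(frozen=True)
class Circle(Shape):
    radius: float


@dataclasses.dataclass(frozen=True)
class Rect(Shape):
    width: float
    height: float


def area(shape: Shape) -> float:
    # The set of subclasses is open, so the final raise is the only
    # guard against a shape that this function forgot to handle.
    if isinstance(shape, Circle):
        return math.pi * shape.radius ** 2
    if isinstance(shape, Rect):
        return shape.width * shape.height
    raise TypeError(f"unhandled shape: {shape!r}")


x, y = Point(1, 2)  # unpacks like any tuple
rect_area = area(Rect(2.0, 3.0))
```

With a fixed set of constructors declared up front, as the Sum base class is described as doing, the list of cases to handle would instead be visible in the class definition itself.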
Installation¶
pip install structured-data
Documentation¶
Usage¶
To use Structured Data in a project:
import structured_data
Structured Data provides several related facilities.
- To define algebraic data types, see the structured_data.adt module.
- To perform destructuring matches of data, see the structured_data.match module.
Reference¶
structured_data.adt¶
Base classes for defining algebraic data types.
This module provides three public members, which are used together.
Given a structure, possibly a choice of different structures, that you’d like to associate with a type:
- First, create a class that subclasses the Sum class.
- Then, for each possible structure, add an attribute annotation to the class with the desired name of the constructor, and a type of Ctor, with the types within the constructor as arguments.
To look inside an ADT instance, use the functions from the structured_data.match module.
Putting it together:
>>> from structured_data import match
>>> from structured_data.adt import Ctor, Sum
>>> class Example(Sum):
...     FirstConstructor: Ctor[int, str]
...     SecondConstructor: Ctor[bytes]
...     ThirdConstructor: Ctor
...     def __iter__(self):
...         matchable = match.Matchable(self)
...         if matchable(Example.FirstConstructor(match.pat.count, match.pat.string)):
...             count, string = matchable[match.pat.count, match.pat.string]
...             for _ in range(count):
...                 yield string
...         elif matchable(Example.SecondConstructor(match.pat.bytes)):
...             bytes_ = matchable[match.pat.bytes]
...             for byte in bytes_:
...                 yield chr(byte)
...         elif matchable(Example.ThirdConstructor()):
...             yield "Third"
...             yield "Constructor"
>>> list(Example.FirstConstructor(5, "abc"))
['abc', 'abc', 'abc', 'abc', 'abc']
>>> list(Example.SecondConstructor(b"abc"))
['a', 'b', 'c']
>>> list(Example.ThirdConstructor())
['Third', 'Constructor']
class structured_data.adt.Ctor
    Marker class for adt constructors.
    To use, index with a sequence of types, and annotate a variable in an adt-decorated class with it.
class structured_data.adt.Product
    Base class of classes with typed fields.
    Examines PEP 526 __annotations__ to determine fields.
    If repr is true, a __repr__() method is added to the class. If order is true, rich comparison dunder methods are added.
    The Product class examines the class to find annotations. Annotations with a value of “None” are discarded. Fields may have default values, and can be set to inspect.empty to indicate “no default”.
    The subclass is subclassable. The implementation was designed with a focus on flexibility over ideals of purity, and therefore provides various optional facilities that conflict with, for example, Liskov substitutability. For the purposes of matching, each class is considered distinct.
class structured_data.adt.Sum
    Base class of classes with disjoint constructors.
    Examines PEP 526 __annotations__ to determine subclasses.
    If repr is true, a __repr__() method is added to the class. If order is true, rich comparison dunder methods are added.
    The Sum class examines the class to find Ctor annotations. A Ctor annotation is the adt.Ctor class itself, or the result of indexing the class, either with a single type hint, or a tuple of type hints. All other annotations are ignored.
    The subclass is not subclassable, but has subclasses at each of the names that had Ctor annotations. Each subclass takes a fixed number of arguments, corresponding to the type hints given to its annotation, if any.
structured_data.data¶
Example types showing simple usage of adt.Sum.
structured_data.match¶
Utilities for destructuring values using matchables and match targets.
Given a value to destructure, called value:
- Construct a matchable: matchable = Matchable(value)
- The matchable is initially falsy, but it will become truthy if it is passed a match target that matches value: assert matchable(some_pattern_that_matches) (Matchable returns itself from the call, so you can put the calls in an if-elif block, and only make a given call at most once.)
- When the matchable is truthy, it can be indexed to access bindings created by the target.
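The calling convention described in these steps can be imitated in a few lines of plain Python. The sketch below is a toy stand-in, not the library's implementation: ToyMatchable, Name, and describe are hypothetical names, and the matching rule (tuples matched positionally, Name placeholders binding whatever they meet, everything else compared with ==) is an assumption made only to show why returning self from the call fits an if-elif chain.

```python
class Name:
    """Hypothetical placeholder that binds whatever value it meets."""

    def __init__(self, name):
        self.name = name


def _bind(target, value, bindings):
    """Return True if value fits target, recording bindings as it goes."""
    if isinstance(target, Name):
        bindings[target.name] = value
        return True
    if isinstance(target, tuple):
        return (
            isinstance(value, tuple)
            and len(target) == len(value)
            and all(_bind(t, v, bindings) for t, v in zip(target, value))
        )
    return target == value


class ToyMatchable:
    """Toy stand-in for the protocol: call to match, truthy on success."""

    def __init__(self, value):
        self.value = value
        self.matches = None  # stays None until a match succeeds

    def __call__(self, target):
        bindings = {}
        matched = _bind(target, self.value, bindings)
        self.matches = bindings if matched else None
        return self  # returning self lets the call sit in an if/elif test

    def __bool__(self):
        return self.matches is not None

    def __getitem__(self, key):
        if self.matches is None:
            raise ValueError("the last match did not succeed")
        return self.matches[key]


def describe(value):
    matchable = ToyMatchable(value)
    if matchable(("point", Name("x"), Name("y"))):
        return f"point at {matchable['x']}, {matchable['y']}"
    elif matchable(("circle", Name("r"))):
        return f"circle of radius {matchable['r']}"
    return "unknown"
```

Because each call both performs the match and returns the same object, every target is tried at most once and the bindings of the branch that succeeded are available inside that branch.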
class structured_data.match.AttrPattern
    A matcher that destructures an object using attribute access.
    The AttrPattern constructor takes keyword arguments. Each name-value pair is the name of an attribute, and a matcher to apply to that attribute.
    Attributes are checked in the order they were passed.
    destructure(value) → Union[Tuple[()], Tuple[Any, Any]]
        Return a tuple of sub-values to check.
        If self is empty, return no values from self or the target.
        Special-case matching against another AttrPattern as follows: confirm that the target isn’t smaller than self, then extract the first match from the target’s match_dict, and return the smaller value, and the first match’s value. (This works as desired when value is self, but all other cases where isinstance(value, AttrPattern) are unspecified.)
        By default, it takes the first match from the match_dict, and returns the original value, and the result of calling getattr with the target and the match’s key.
    match_dict
        Return the dict of matches to check.
class structured_data.match.Bind
    A wrapper that adds additional bindings to a successful match.
    The Bind constructor takes a single required argument, and any number of keyword arguments. The required argument is a matcher. When matching, if the match succeeds, the Bind instance adds bindings corresponding to its keyword arguments.
    First, the matcher is checked, then the bindings are added in the order they were passed.
    bindings
        Return the bindings to add to the match.
    destructure(value)
        Return a list of sub-values to check.
        If value is self, return all of the bindings, and the structure.
        Otherwise, return the corresponding bound values, followed by the original value.
    structure
        Return the structure to match against.
class structured_data.match.DictPattern
    A matcher that destructures a dictionary by key.
    The DictPattern constructor takes a required argument, a dictionary where the keys are keys to check, and the values are matchers to apply. It also takes an optional keyword argument, “exhaustive”, which defaults to False. If “exhaustive” is True, then the match requires that the matched dictionary has no keys not in the DictPattern. Otherwise, “extra” keys are ignored.
    Keys are checked in iteration order.
    destructure(value) → Union[Tuple[()], Tuple[Any, Any]]
        Return a tuple of sub-values to check.
        If self is exhaustive and the lengths don’t match, fail.
        If self is empty, return no values from self or the target.
        Special-case matching against another DictPattern as follows: confirm that the target isn’t smaller than self, then extract the first match from the target’s match_dict, and return the smaller value, and the first match’s value. Note that the returned DictPattern is never exhaustive; the exhaustiveness check is accomplished by asserting that the lengths start out the same, and that every key in self is present in value. (This works as desired when value is self, but all other cases where isinstance(value, DictPattern) are unspecified.)
        By default, it takes the first match from the match_dict, and returns the original value, and the result of indexing the target with the match’s key.
    exhaustive
        Return whether the target must have exactly the same keys as self.
    exhaustive_length_must_match(value: Sized)
        If the match is exhaustive and the lengths differ, fail.
    match_dict
        Return the dict of matches to check.
class structured_data.match.MatchDict
    A MutableMapping that allows for retrieval into structures.
    The actual keys in the mapping must be string values. Most of the mapping methods will only operate on or yield string keys. The exception is subscription: the “key” in subscription can be a structure made of tuples and dicts. For example, md["a", "b"] == (md["a"], md["b"]), and md[{1: "a"}] == {1: md["a"]}. The typical use of this will be to extract many match values at once, as in a, b, c = md["a", "b", "c"].
    The behavior of most of the pre-defined MutableMapping methods is currently neither tested nor guaranteed.
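The structured subscription described for MatchDict can be sketched with the standard library alone. The class below is a hypothetical illustration named StructuredLookup, not the library's MatchDict: it assumes that tuple and dict "keys" are rebuilt recursively with each leaf replaced by the stored value, which is consistent with the two identities quoted in the description.

```python
import collections.abc


class StructuredLookup(collections.abc.MutableMapping):
    """Toy mapping with string keys and structured subscription."""

    def __init__(self):
        self._data = {}

    def __getitem__(self, key):
        # A tuple or dict "key" is rebuilt with each leaf looked up in turn.
        if isinstance(key, tuple):
            return tuple(self[sub_key] for sub_key in key)
        if isinstance(key, dict):
            return {name: self[sub_key] for name, sub_key in key.items()}
        return self._data[key]

    def __setitem__(self, key, value):
        if not isinstance(key, str):
            raise TypeError("stored keys must be strings")
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


md = StructuredLookup()
md.update(a=1, b=2, c=3)
a, b, c = md["a", "b", "c"]          # pull several values out at once
nested = md["a", ("b", "c")]          # structures can nest
kwargs = md[{"x": "a", "y": "b"}]     # dict in, dict out, handy for **kwargs
```

Only subscription accepts structures in this sketch; iteration and length still report the plain string keys.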
class structured_data.match.Matchable(value: Any)
    Given a value, attempt to match against a target.
    The truthiness of Matchable values depends on whether they have bindings associated with them. They are truthy exactly when they have bindings.
    Matchable values provide two basic forms of syntactic sugar. m_able(target) is equivalent to m_able.match(target), and m_able[k] will return m_able.matches[k] if the Matchable is truthy, and raise a ValueError otherwise.
class structured_data.match.Pattern
    A matcher that binds a value to a name.
    A Pattern can be indexed with another matcher to produce an AsPattern. When matched with a value, an AsPattern both binds the value to the name, and uses the matcher to match the value, thereby constraining it to have a particular shape, and possibly introducing further bindings.
    name
        Return the name of the matcher.
class structured_data.match.Property(func=None, fset=None, fdel=None, doc=None, *args, **kwargs)
    Decorator with value-based dispatch. Acts as a property.
structured_data.match.decorate_in_order(*args)
    Apply decorators in the order they’re passed to the function.
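One plausible reading of "in the order they're passed" is that the first decorator is applied first, which makes it the innermost wrapper. The sketch below shows that reading with functools.reduce. The names apply_in_order, shout, and tag are hypothetical, and this is an assumption about the behavior rather than the library's source.

```python
import functools


def apply_in_order(*decorators):
    """Build a decorator that applies the given decorators first to last."""

    def decorator(func):
        # reduce feeds the result of each decorator into the next one,
        # so the first argument ends up as the innermost wrapper.
        return functools.reduce(lambda acc, dec: dec(acc), decorators, func)

    return decorator


def shout(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs).upper()

    return wrapper


def tag(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs) + " (tagged)"

    return wrapper


@apply_in_order(shout, tag)
def greet(name):
    return f"hello, {name}"


greeting = greet("world")
```

Under this reading, apply_in_order(shout, tag) behaves like writing @tag above @shout: the text is upper-cased first and the lowercase suffix is appended afterwards.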
Contributing¶
Contributions are welcome, and they are greatly appreciated! Every little bit helps, and credit will always be given.
Bug reports¶
When reporting a bug please include:
- Your operating system name and version.
- Any details about your local setup that might be helpful in troubleshooting.
- Detailed steps to reproduce the bug.
Documentation improvements¶
Structured Data could always use more documentation, whether as part of the official Structured Data docs, in docstrings, or even on the web in blog posts, articles, and such.
Feature requests and feedback¶
The best way to send feedback is to file an issue at https://github.com/mwchase/python-structured-data/issues.
If you are proposing a feature:
- Explain in detail how it would work.
- Keep the scope as narrow as possible, to make it easier to implement.
- Remember that this is a volunteer-driven project, and that code contributions are welcome :)
Development¶
To set up python-structured-data for local development:
Fork python-structured-data (look for the “Fork” button).
Clone your fork locally:
git clone git@github.com:your_name_here/python-structured-data.git
Create a branch for local development:
git checkout -b name-of-your-bugfix-or-feature
Now you can make your changes locally.
When you’re done making changes, run all the checks, doc builder and spell checker with one tox command:
tox
Commit your changes and push your branch to GitHub:
git add .
git commit -m "Your detailed description of your changes."
git push origin name-of-your-bugfix-or-feature
Submit a pull request through the GitHub website.
Pull Request Guidelines¶
If you need some code review or feedback while you’re developing the code just make the pull request.
For merging, you should:
- Include passing tests (run tox) [1].
- Update documentation when there’s new API, functionality etc.
- Add a note to CHANGELOG.rst about the changes.
- Add yourself to AUTHORS.rst.
[1] If you don’t have all the necessary Python versions available locally you can rely on Travis: it will run the tests for each change you add in the pull request. It will be slower though …
Tips¶
To run a subset of tests:
tox -e envname -- pytest -k test_myfeature
To run all the test environments in parallel (you need to pip install detox):
detox
Authors¶
- Max Woerner Chase - https://mwchase.neocities.org
Changelog¶
Unreleased¶
0.13.0 (2019-09-29)¶
Added¶
- match.function and match.Property decorators for Haskell-style function definitions.
Fixed¶
- Accessing data descriptors on Sum and Product instances.
0.12.0 (2019-09-03)¶
Added¶
- Product base class
Changed¶
- Improved documentation of some match constructors.
- Exposed MatchDict type, so it gets documented.
- Converted the adt decorator to a Sum base class.
Removed¶
- Guard type removed in favor of user-defined validation functions.
0.11.0 (2019-03-23)¶
Changed¶
- Consider all overrides of checked dunder methods, not just those in the decorated class.
0.10.1 (2019-03-22)¶
Added¶
- A non-ergonomic but simple wrapper class for use by the typing plugin. It’s not available to runtime code.
0.10.0 (2019-03-21)¶
Changed¶
- Actually, the facade was working, I was just confused. Restored the facade.
0.6.1 (2019-03-18)¶
Added¶
- Bind class for attaching extra data to a match structure.
- PEP 561 support.
Changed¶
- As-patterns are now formed with indexing instead of the @ operator.
- AttrPattern and DictPattern now take keyword arguments instead of a dict argument, and form new versions of themselves with an alter method.
- Actually. Change DictPattern back, stop trying to keep these things in synch.
0.6.0 (2018-07-27)¶
Added¶
- AttrPattern and DictPattern classes that take a dict argument and perform destructuring match against arbitrary objects, and mappings, respectively.
Changed¶
- Added special handling for matching AsPatterns against different AsPatterns. This is subject to change, as it’s definitely an edge case.
0.5.0 (2018-07-22)¶
Added¶
- Matchable class is now callable and indexable. Calling is forwarded to the match method, and indexing forwards to the matches attribute, if it exists, and raises an error otherwise.
- Matchable class now has custom coercion to bool: False if the last match attempt failed, True otherwise.
Changed¶
- Renamed enum to adt to avoid confusion.
- Renamed ValueMatcher to Matchable.
- Matchable.match now returns the Matchable instance, which can then be coerced to bool, or indexed directly.
0.4.0 (2018-07-21)¶
Added¶
- Mapping class especially for match values. It’s capable of quickly and concisely pulling out groups of variables, but it also properly supports extracting just a single value.
- Mapping class can now index from a dict to a dict, in order to support **kwargs unpacking.
Fixed¶
- A bug (not present in any released version) that caused the empty tuple target to accept any tuple value. This is included partly because this was just such a weird bug.
Removed¶
- Unpublished the MatchFailure exception type, and the desugar function.
0.3.0 (2018-07-15)¶
Added¶
- Simpler way to create match bindings.
- Dependency on the astor library.
- First attempt at populating the annotations and signature of the generated constructors.
- data module containing some generic algebraic data types.
- Attempts at monad implementations for data classes.
Changed¶
- Broke the package into many smaller modules.
- Switched many attributes to use a WeakKeyDictionary instead.
- Moved prewritten methods into a class to avoid defining reserved methods at the module level.
- When assigning equality methods is disabled for a decorated class, the default behavior is now object semantics, rather than failing comparison and hashing with a TypeError.
- The prewritten comparison methods no longer return NotImplemented.
Removed¶
- Ctor metaclass.
0.2.0 (2018-07-13)¶
Added¶
- Explicit __bool__ implementation, to consider all constructor instances as truthy, unless defined otherwise.
- Python 3.7 support.
Changed¶
- Marked the enum constructor base class as private. (EnumConstructor -> _EnumConstructor)
- Switched scope of test coverage to supported versions. (Python 3.7)
Removed¶
- Support for Python 3.6 and earlier.
- Incidental functionality required by supported Python 3.6 versions. (Hooks to enable restricted subclassing.)
0.1.0 (2018-06-10)¶
- First release on PyPI.