Define New Zones Where to Find Examples
There are three different ways in which byexample can be extended:
- define zones where to find examples
- support new languages: how to find them and how to run them
- perform arbitrary actions during the execution
byexample uses the concept of modules: a python file with some extension
classes defined there. Modules can be loaded using --modules <dir>
from the command line.
What extension classes will depend of what you want to extend or customize.
In this how-to we will see how to define new zones where to
find examples.
Check how to support new finders and languages
and how to hook to events with concerns
for a how-to about the first two items.
How to define new zones
byexample searches for examples in particular zones of a file.
What zones will depend of the kind of file.
For Python files it searches them in the docstrings, in Ruby files it searches them in the comments.
By default, if no particular zone is defined byexample searches the
example in the whole file.
But searching in the whole file without any structure could lead to false positives.
You can define new zones for a type of files just creating a ZoneDelimiter
subclass.
Imagine that you want to find examples in a HTML file and you want to ignore
everything except the code between <pre> and </pre> tags.
This is what you need to write:
>>> from byexample import regex as re
>>> from byexample.finder import ZoneDelimiter
>>> class HTMLPreBlockDelimiter(ZoneDelimiter):
... target = '.html'
...
... def zone_regex(self):
... return re.compile(r'<pre>(?P<zone>.*?)</pre>', re.DOTALL | re.UNICODE)
...
... def get_zone(self, match, where):
... return ZoneDelimiter.get_zone(self, match, where)
That’s it.
The target indicates the file extension of the files
that will be delimited by this code. It can be a string with a single extension
file or a list or set of several extensions.
The zone_regex method should return a regular expression to find and capture
the zones.
And optionally, the get_zone can be overridden to post-process the captured
string: use it to remove any spurious string that may had been captured.
Changed in
byexample 10.0.0. Before10.0.0you could return a Python regular expression but from10.0.0and on, you need to return the regular expressions created bybyexample.regex. The module is almost identical to Python’sreso the required changes are minimal.
Concurrency model
Each ZoneDelimiter instance will be created once during the setup of
byexample and then it will be created once per job thread.
By default there is only one job thread but more threads can be added
with the --jobs option.
The instances are independent and therefore thread-safe.
If you want to share data among them you will have to use a
thread-safe structures created by a sharer and store them
in a namespace.
In the concurrency model documentation it is explained and in byexample/modules/progress.py you can see a concrete example.
Changed in
byexample 10.0.0. Before10.0.0you were forced to usemultiprocessingby hand but in10.0.0the concurrency model is hidden so you cannot relay onmultiprocessingbecausebyexamplemay not use processes at all!sharerandnamespaceare objects that hide the details while allowing you to have the same power.
ZoneDelimiter initialization
If you extend ZoneDelimiter and decide to implement your own __init__,
you must ensure that you call ZoneDelimiter’s __init__ method
passing to it all the keyword-only arguments that you received.
Once done that, you can use the self.cfg property to access any
configuration set in byexample including the flags/options set
(self.cfg.options).
In the __init__ you can also change the value of target to something
different. This allows you to change what type of files you are going to
processes based on the configuration or you can disable the zone finder
entirely setting self.target = None.
See Extension initialization for more about this and some troubleshooting.
New in
byexample 11.0.0:self.cfgwas introduced.