Suppose one wishes to expose to Python multi-threaded C++ code that generates output in the form of flat integer or floating point value arrays, and these arrays should appear as Numpy arrays in Python. The two common approaches are (replacing "element" with bool, std::uint16_t, int, float, double, etc.):
- Copying the contents of whatever C++ produces, whether std::vector<element> or element* with length parameter, into Numpy arrays.
- Modifying the C++ code so that it accepts pre-allocated data storage in the form of element* with length parameter, and using the Numpy C API or array.ctypes API to feed the pointer to C++. This avoids a copy operation.
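As a rough sketch of what these two approaches look like on the C++ side (the function names compute_squares and compute_squares_into are hypothetical, chosen only for illustration), the first returns a vector that a binding layer would then have to copy into a Numpy array, while the second writes into storage the caller allocated, such as a Numpy array's data pointer:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Approach 1: C++ owns the result. A binding layer would have to copy
// these values into a freshly allocated Numpy array.
// (compute_squares is a hypothetical name used for illustration.)
std::vector<double> compute_squares(std::size_t n)
{
    std::vector<double> out(n);
    for (std::size_t i = 0; i < n; ++i)
        out[i] = static_cast<double>(i) * static_cast<double>(i);
    return out;
}

// Approach 2: the caller pre-allocates storage (for example, the data
// pointer of a Numpy array) and passes element* plus a length.
// No copy is needed afterwards, but the C++ API has to change.
void compute_squares_into(double* out, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        out[i] = static_cast<double>(i) * static_cast<double>(i);
}
```

Both produce the same values; the difference is only in who owns the memory and whether a copy is needed on the way to Python.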
With pybind11, there is a better way: 1) keep any std::vector<> that may be exposed to Python in an std::shared_ptr<std::vector<>>, 2) expose the concrete std::vector<element> types used with std::shared_ptr<std::vector<>> as the associated "holder" type for each, with an appropriate .def_buffer call, and 3) in response to requests from Python, lazily retrieve (causing instantiation of) the Python object wrapping the vector requested, feed this to Numpy, and cache and return the resulting Numpy array.
This arrangement may sound complicated, but it is by far the most natural and flexible of all approaches: without resorting to manual Python reference counting or needing to acquire the GIL, a vector exposed in this manner is not freed until both the last outstanding Python reference and the last outstanding C++ reference are gone. This is awesome.
Let's break down the rather dense instructions presented above.
1
Keep any std::vector<> that may be exposed to Python in an std::shared_ptr<std::vector<>>.
There is not much to this. struct Foo { std::vector<int> v; }; changes to struct Foo { std::shared_ptr<std::vector<int>> v; };, and any v. changes to v->.
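A minimal sketch of that change, and of why it matters: once the vector lives in a std::shared_ptr, a second owner keeps the data alive after the Foo itself is gone. Here a plain std::shared_ptr stands in for the Python-side wrapper that pybind11 would hold, and the helper fill_and_release is a hypothetical name.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Foo
{
    // Was: std::vector<int> v;
    std::shared_ptr<std::vector<int>> v = std::make_shared<std::vector<int>>();
};

// Fills a Foo, takes a second owner of its vector, then lets the Foo die.
std::shared_ptr<std::vector<int>> fill_and_release()
{
    Foo foo;
    foo.v->push_back(1);   // was: foo.v.push_back(1);
    foo.v->push_back(2);
    std::shared_ptr<std::vector<int>> other_owner = foo.v;
    return other_owner;    // foo is destroyed here; the vector survives
}
```

The returned pointer is the only remaining owner, and the vector is freed when it, too, goes away.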
2
Expose the concrete std::vector<element> types used with std::shared_ptr<std::vector<>> as the associated "holder" type for each, with an appropriate .def_buffer call.
py::class_<std::vector<std::uint64_t>, std::shared_ptr<std::vector<std::uint64_t>>>(m, "_HistogramBuffer")
    .def_buffer([](std::vector<std::uint64_t>& v) {
        return py::buffer_info(
            v.data(),
            sizeof(std::uint64_t),
            py::format_descriptor<std::uint64_t>::format(),
            1,
            { v.size() },
            { sizeof(std::uint64_t) });
    });
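To make the arguments of that call concrete, here is a plain-C++ stand-in for the information it hands over: pointer, item size, number of dimensions, shape, and strides in bytes. BufferDescription, describe, and element_at are hypothetical names invented for this sketch, not pybind11 types, and the format string is left out. The point is that the description refers to the vector's own memory rather than to a copy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the fields passed to py::buffer_info.
struct BufferDescription
{
    void* ptr;                         // start of the data
    std::size_t itemsize;              // bytes per element
    std::size_t ndim;                  // number of dimensions
    std::vector<std::size_t> shape;    // extent of each dimension
    std::vector<std::size_t> strides;  // bytes between neighbouring elements, per dimension
};

// Describe a 1-D vector in place, without copying it.
BufferDescription describe(std::vector<std::uint64_t>& v)
{
    return BufferDescription{
        v.data(), sizeof(std::uint64_t), 1, { v.size() }, { sizeof(std::uint64_t) } };
}

// Read element i the way a buffer consumer would: by byte offset from ptr.
std::uint64_t element_at(const BufferDescription& b, std::size_t i)
{
    const unsigned char* base = static_cast<const unsigned char*>(b.ptr);
    return *reinterpret_cast<const std::uint64_t*>(base + i * b.strides[0]);
}
```

Because the description only points at the vector's storage, a later write to the vector is visible through it, which is the behaviour one wants from a Numpy view.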
3
In response to requests from Python, lazily retrieve (causing instantiation of) the Python object wrapping the vector requested, feed this to Numpy, and cache and return the resulting Numpy array. The work is done in get_histogram_py(), defined at the end of the listing; the rest is provided as minimal context, so that you have some chance of figuring out what I'm talking about :)
template<typename T>
struct StatsBase
{
    static void expose_via_pybind11(py::module& m);
    StatsBase();
    StatsBase(const StatsBase&) = delete;
    StatsBase& operator = (const StatsBase&) = delete;
    virtual ~StatsBase() = default;
    std::tuple<T, T> extrema;
    std::size_t max_bin;
    std::shared_ptr<std::vector<std::uint64_t>> histogram;
    // A numpy array that is a read-only view of histogram. Lazily created in response to get_histogram_py calls.
    std::shared_ptr<py::object> histogram_py;
    py::object& get_histogram_py();
};
template<typename T>
void StatsBase<T>::expose_via_pybind11(py::module& m)
{
    std::string s = std::string("_StatsBase_") + component_type_names[std::type_index(typeid(T))];
    py::class_<StatsBase<T>, std::shared_ptr<StatsBase<T>>>(m, s.c_str())
        .def_readonly("extrema", &StatsBase<T>::extrema)
        .def_readonly("max_bin", &StatsBase<T>::max_bin)
        .def_readonly("histogram_buff", &StatsBase<T>::histogram)
        .def_property_readonly("histogram", [](StatsBase<T>& v){return v.get_histogram_py();});
}
template<typename T>
StatsBase<T>::StatsBase()
  : extrema(0, 0),
    max_bin(0),
    histogram(new std::vector<std::uint64_t>(bin_count<T>(), 0)),
    histogram_py(nullptr)
{
}
template<typename T>
py::object& StatsBase<T>::get_histogram_py()
{
    if(!histogram_py)
    {
        py::object buffer_obj = py::cast(histogram);
        histogram_py.reset(new py::object(PyArray_FromAny(buffer_obj.ptr(), nullptr, 1, 1, 0, nullptr), false), &safe_py_deleter);
    }
    return *histogram_py;
}
StatsBase<T>::get_histogram_py() is a bit complex; let's break it down:
If the StatsBase<T> instance in question does not already have a non-null histogram_py pointer...
py::object buffer_obj = py::cast(histogram);
Get a Python object wrapping our std::shared_ptr<std::vector<std::uint64_t>> instance. The wrapper is an instance of the class we registered with pybind11 in step 2, so it exposes a buffer protocol interface that Numpy understands.
histogram_py.reset(new py::object(PyArray_FromAny(buffer_obj.ptr(), nullptr, 1, 1, 0, nullptr), false), &safe_py_deleter);
Use the PyArray_FromAny call to make a Numpy array that is a view of our vector, and keep the resulting PyObject* in a pybind11 py::object, which will decrement its refcount appropriately when destroyed. Store this in an std::shared_ptr with a GIL-safe deleter to avoid crashing when a C++ background thread holds the last reference to a StatsBase instance that was accessed from Python but whose Python references are all gone.
return *histogram_py;
Return a C++ reference to the py::object representing the Numpy array.
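The listing does not show safe_py_deleter, so the following is only a sketch of the general pattern under stated assumptions: a lazily created, cached object held in a std::shared_ptr whose custom deleter takes a lock before destroying it. A std::mutex named fake_gil stands in for the GIL, and Resource, locked_deleter, and Holder are hypothetical names; in real pybind11 code the deleter would presumably hold a py::gil_scoped_acquire while deleting the py::object.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

std::mutex fake_gil;        // stand-in for the Python GIL (assumption)
int creations = 0;          // how often the cached object was built
int locked_deletions = 0;   // deletions performed while holding the lock

struct Resource { int value; };   // stand-in for the cached py::object

// Custom deleter: take the lock first, then destroy the object.
void locked_deleter(Resource* r)
{
    std::lock_guard<std::mutex> hold(fake_gil);
    delete r;
    ++locked_deletions;
}

struct Holder
{
    std::shared_ptr<Resource> cached;   // null until first requested

    // Build the object on first use, then keep returning the same one.
    Resource& get()
    {
        if (!cached)
        {
            ++creations;
            cached.reset(new Resource{42}, &locked_deleter);
        }
        return *cached;
    }
};
```

Whichever thread drops the last reference to the Holder ends up running the deleter, which is why the deleter itself, rather than the caller, has to take the lock.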
This example is from real-world code (it may be necessary to look in the new_ndimage_statistics branch, but I expect to merge this into master within the next few days). Apologies for not making a minimal example. If you'd like one or have any questions, please ask!
I tried storing an 8 GiB VirtualBox disk image on BTRFS. Well, it copied over successfully, but minutes into 'pacman -Syu', the Linux instance in the VM began reporting copious IDE errors. Suspecting that BTRFS copy-on-write was the issue, I moved the disk image back to a ZFS volume. This took inordinately long - it definitely was a COW issue. BTRFS went absolutely crazy as the VM wrote here, there and everywhere to its virtual disk, requiring a competent copy-on-write implementation - which BTRFS does not have.
That VM, again on ZFS, is again working flawlessly. ZFS is a copy-on-write filesystem and works. BTRFS is a copy-on-write filesystem and does not work.
We're how many years into BTRFS being officially "stable"? And it blows chunks the instant you attempt to, say, modify a file a lot? That doesn't seem right. Perhaps I'm the only one, and I'm doing something wrong? Nope. BTRFS just plain sucks.
The thing I did wrong with BTRFS was using BTRFS. Apparently, I could disable BTRFS's copy-on-write support for my VM disk image files. But then my VM disk image files would have no FS-level data checksums or snapshot capability. If that's what I wanted, I'd keep my VM images on XFS or EXT4. It's not, and BTRFS is apparently little better than EXT4 with some additional features that don't work, so ZFS it is.