Binary Extension

Note

This content was last updated May 19, 2020. Much of the content is tested automatically to keep from getting stale, but some of the console code blocks are not. As a result, this material may be out of date. If anything does not seem correct — or even if the explanation is insufficient — please file an issue.

The bezier Python package has optional speedups that wrap the libbezier library. These are incorporated into the Python interface via Cython as a binary extension. See C ABI (libbezier) for more information on building and installing libbezier.

Extra (Binary) Dependencies

When the bezier Python package is installed via pip, it will likely be installed from a Python wheel. The wheels uploaded to PyPI are pre-built, with the Fortran code compiled by GNU Fortran (gfortran). As a result, libbezier will depend on libgfortran. This can be problematic due to version conflicts, ABI incompatibility, a desire to use a different Fortran compiler (e.g. Intel’s ifort) and a host of other reasons.

Some of the standard tooling for distributing wheels tries to address this. On Linux and macOS, the tools address it by placing a copy of libgfortran (and potentially its dependencies) in the built wheel. (On Windows, there is no standard tooling beyond that provided by distutils and setuptools.) This means that libraries that depend on libbezier may also need to link against these local copies of dependencies.

Linux

The command line tool auditwheel adds a bezier.libs directory to site-packages (i.e. it is next to bezier) with a modified libbezier and all of its dependencies (e.g. libgfortran):

>>> libs_directory
'.../site-packages/bezier.libs'
>>> print_tree(libs_directory)
bezier.libs/
  libbezier-28a97ca3.so.2020.5.19
  libgfortran-2e0d59d6.so.5.0.0
  libquadmath-2d0c479f.so.0.0.0
  libz-eb09ad1d.so.1.2.3
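The print_tree helper used above is not shown in this document. A minimal sketch that produces the same one-level listing could look like the following. The name tree_lines and the exact formatting are assumptions made for illustration, not the helper actually used to generate the output.

```python
import os


def tree_lines(directory):
    """Return the directory name followed by its sorted, indented entries."""
    # Hypothetical helper: only one level deep, which is all the listing needs.
    name = os.path.basename(os.path.normpath(directory))
    lines = [name + "/"]
    for entry in sorted(os.listdir(directory)):
        # Mark subdirectories with a trailing slash, like the root.
        is_dir = os.path.isdir(os.path.join(directory, entry))
        lines.append("  " + entry + ("/" if is_dir else ""))
    return lines


def print_tree(directory):
    """Print the listing built by ``tree_lines`` (assumed behavior)."""
    print("\n".join(tree_lines(directory)))
```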

The bezier._speedup module depends on this local copy of libbezier:

$ readelf -d _speedup.cpython-38-x86_64-linux-gnu.so

Dynamic section at offset 0x444000 contains 27 entries:
  Tag        Type                         Name/Value
 0x000000000000000f (RPATH)              Library rpath: [$ORIGIN/../bezier.libs]
 0x0000000000000001 (NEEDED)             Shared library: [libbezier-28a97ca3.so.2020.5.19]
 0x0000000000000001 (NEEDED)             Shared library: [libpthread.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000c (INIT)               0x9d08
...

and the local copy of libbezier depends on the other dependencies in bezier.libs/ (both directly and indirectly):

$ readelf -d ../bezier.libs/libbezier-28a97ca3.so.2020.5.19

Dynamic section at offset 0x44dd8 contains 28 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libgfortran-2e0d59d6.so.5.0.0]
 0x0000000000000001 (NEEDED)             Shared library: [libm.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libgcc_s.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000e (SONAME)             Library soname: [libbezier-28a97ca3.so.2020.5.19]
 0x000000000000000c (INIT)               0x2be8
...
$ readelf -d ../bezier.libs/libgfortran-2e0d59d6.so.5.0.0

Dynamic section at offset 0x207db8 contains 31 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libquadmath-2d0c479f.so.0.0.0]
 0x0000000000000001 (NEEDED)             Shared library: [libz-eb09ad1d.so.1.2.3]
 0x0000000000000001 (NEEDED)             Shared library: [libm.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libgcc_s.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000e (SONAME)             Library soname: [libgfortran-2e0d59d6.so.5.0.0]
 0x000000000000000c (INIT)               0x19a78
...

Note

The runtime path (RPATH) uses $ORIGIN to specify a path relative to the directory where the extension module (.so file) is.
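To make the $ORIGIN substitution concrete, here is a small sketch of how such an RPATH entry expands. The function name resolve_rpath and the example install location are hypothetical, and the real dynamic loader applies more rules than this.

```python
import posixpath


def resolve_rpath(rpath, module_path):
    """Expand ``$ORIGIN`` in an RPATH entry relative to a module's directory."""
    # ``$ORIGIN`` stands for the directory containing the loaded object.
    origin = posixpath.dirname(posixpath.abspath(module_path))
    expanded = rpath.replace("${ORIGIN}", origin).replace("$ORIGIN", origin)
    # Collapse ``..`` components so the result is easy to read.
    return posixpath.normpath(expanded)


# Hypothetical install location, for illustration only.
module = "/opt/venv/site-packages/bezier/_speedup.cpython-38-x86_64-linux-gnu.so"
print(resolve_rpath("$ORIGIN/../bezier.libs", module))
```

With the entry shown in the readelf output, the search directory ends up being the bezier.libs directory next to the bezier package.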

macOS

The command line tool delocate adds a bezier/.dylibs directory with copies of libbezier, libgfortran, libquadmath and libgcc_s:

>>> dylibs_directory
'.../site-packages/bezier/.dylibs'
>>> print_tree(dylibs_directory)
.dylibs/
  libbezier.2020.5.19.dylib
  libgcc_s.1.dylib
  libgfortran.5.dylib
  libquadmath.0.dylib

The bezier._speedup module depends on the local copy of libbezier:

$ otool -L _speedup.cpython-38-darwin.so
_speedup.cpython-38-darwin.so:
        @loader_path/.dylibs/libbezier.2020.5.19.dylib (...)
        /usr/lib/libSystem.B.dylib (...)

Though the Python extension module (.so file) only depends directly on libbezier, it indirectly depends on libgfortran, libquadmath and libgcc_s:

$ otool -L .dylibs/libbezier.2020.5.19.dylib
.dylibs/libbezier.2020.5.19.dylib:
    /DLC/bezier/libbezier.2020.5.19.dylib (...)
    @loader_path/libgfortran.5.dylib (...)
    /usr/lib/libSystem.B.dylib (...)
    @loader_path/libgcc_s.1.dylib (...)
    @loader_path/libquadmath.0.dylib (...)

Note

To allow the package to be relocatable, the libbezier dependency is relative to the @loader_path (i.e. the path where the Python extension module is loaded) instead of being an absolute path within the file system.

Notice also that delocate uses the nonexistent root /DLC for the install_name of libbezier to avoid accidentally pointing to an existing file on the target system.
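As a rough illustration, the otool -L listings above can be split into relocatable entries and absolute ones. The helper names below are hypothetical, and the sketch assumes the output has the shape shown in this document: one header line, then one library per line followed by parenthesized version info.

```python
def parse_otool_libraries(output):
    """Return the library paths listed in ``otool -L`` style output."""
    paths = []
    # The first line names the inspected file, so skip it.
    for line in output.splitlines()[1:]:
        line = line.strip()
        if not line:
            continue
        # Drop the trailing parenthesized version information.
        paths.append(line.split(" (", 1)[0])
    return paths


def is_relocatable(path):
    """Check whether an install name is relative to the loading binary."""
    return path.startswith(("@loader_path/", "@rpath/", "@executable_path/"))


SAMPLE = """\
.dylibs/libbezier.2020.5.19.dylib:
    /DLC/bezier/libbezier.2020.5.19.dylib (...)
    @loader_path/libgfortran.5.dylib (...)
    /usr/lib/libSystem.B.dylib (...)
"""

for path in parse_otool_libraries(SAMPLE):
    print(path, is_relocatable(path))
```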

Windows

A single Windows shared library (DLL) is provided: bezier.dll. The Python extension module (.pyd file) depends directly on this library:

> dumpbin /dependents _speedup.cp38-win_amd64.pyd
Microsoft (R) COFF/PE Dumper Version ...
Copyright (C) Microsoft Corporation.  All rights reserved.


Dump of file _speedup.cp38-win_amd64.pyd

File Type: DLL

  Image has the following dependencies:

    bezier-e5dbb97a.dll
    python38.dll
    KERNEL32.dll
    VCRUNTIME140.dll
    api-ms-win-crt-stdio-l1-1-0.dll
    api-ms-win-crt-heap-l1-1-0.dll
    api-ms-win-crt-runtime-l1-1-0.dll
...

For built wheels, the dependency will be renamed from bezier.dll to a unique name containing the first 8 characters of the SHA256 hash of the DLL file (to avoid a name collision) and placed in a directory within the bezier package: for example extra-dll/bezier-e5dbb97a.dll.
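The renaming scheme can be sketched in a few lines. This is an illustration rather than the code used to build the wheels: the function name unique_dll_name is hypothetical, and it assumes the "first 8 characters" refers to the hexadecimal digest.

```python
import hashlib
import os


def unique_dll_name(dll_path):
    """Return a name such as ``bezier-<8 hex chars>.dll`` for a DLL file."""
    with open(dll_path, "rb") as file_obj:
        digest = hashlib.sha256(file_obj.read()).hexdigest()
    base, extension = os.path.splitext(os.path.basename(dll_path))
    # Assumption: the hash prefix is inserted between the stem and extension.
    return "{}-{}{}".format(base, digest[:8], extension)
```

Because the name is derived from the file contents, two different builds of bezier.dll get different names and can coexist on the DLL search path.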

In order to ensure this DLL can be found, the bezier.__config__ module adds the extra-dll directory to the DLL search path on import. (%PATH% is used on Windows as part of the DLL search path. For Python versions starting with 3.8, modifying os.environ["PATH"] after Python startup no longer works; instead the os.add_dll_directory() function achieves the same goal in a more official capacity.)
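A minimal sketch of that import-time logic follows. It is not the actual contents of bezier.__config__; the function name and the fallback behavior are assumptions based on the description above.

```python
import os


def add_dll_directory(extra_dll_dir):
    """Make ``extra_dll_dir`` searchable for DLLs; does nothing off Windows."""
    if os.name != "nt":
        return None
    extra_dll_dir = os.path.abspath(extra_dll_dir)
    if hasattr(os, "add_dll_directory"):
        # Python 3.8+: the returned handle can be closed to undo the change.
        return os.add_dll_directory(extra_dll_dir)
    # Older Pythons: prepend the directory to %PATH% instead.
    current = os.environ.get("PATH", "")
    parts = [extra_dll_dir] + ([current] if current else [])
    os.environ["PATH"] = os.pathsep.join(parts)
    return None
```

os.add_dll_directory() requires an absolute path, which is why the sketch normalizes the directory first.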

The libbezier DLL has no external dependencies, but does have a corresponding import library (usr/lib/bezier.lib), which is provided to specify the symbols in the DLL.

On Windows, building Python extensions is a bit more constrained. Each official Python is built with a particular version of MSVC and Python extension modules must be built with the same compiler. This is primarily because the C runtime (provided by Microsoft) changes from Python version to Python version. To see why the same C runtime must be used, consider the following example. If an extension uses malloc from MSVCRT.dll to allocate memory for an object and the Python interpreter tries to free that memory with free from MSVCR90.dll, bad things can happen:

Python’s linked CRT, which is msvcr90.dll for Python 2.7, msvcr100.dll for Python 3.4, and several api-ms-win-crt DLLs (forwarded to ucrtbase.dll) for Python 3.5 … Additionally each CRT uses its own heap for malloc and free (wrapping Windows HeapAlloc and HeapFree), so allocating memory with one and freeing with another is an error.

This problem has been largely fixed in newer versions of Python but is still worth knowing.

Unfortunately, there is no Fortran compiler provided by MSVC. The MinGW-w64 suite of tools is a port of the GNU Compiler Collection (gcc) for Windows. In particular, MinGW includes gfortran. However, mixing the two compiler families (MSVC and MinGW) can be problematic because MinGW uses a fixed version of the C runtime (MSVCRT.dll) and this dependency cannot be easily dropped or changed.

A Windows shared library (DLL) can be created after compiling each of the Fortran submodules:

$ gfortran \
>   -shared \
>   -o bezier.dll \
>   ${OBJ_FILES} \
>   -Wl,--output-def,bezier.def

Note

Invoking gfortran can be done from the Windows command prompt (e.g. it works just fine on AppVeyor), but it is easier to do from a shell that explicitly supports MinGW, such as MSYS2.

By default, the created shared library will depend on gcc libraries provided by MinGW:

> dumpbin /dependents ...\bezier.dll
...
  Image has the following dependencies:

    KERNEL32.dll
    msvcrt.dll
    libgcc_s_seh-1.dll
    libgfortran-3.dll

Unlike Linux and macOS, on Windows relocating and copying any dependencies on MinGW (at either compile, link or run time) is explicitly avoided. By adding the -static flag

$ gfortran \
>   -static \
>   -shared \
>   -o bezier.dll \
>   ${OBJ_FILES} \
>   -Wl,--output-def,bezier.def

all the symbols used from libgfortran or libgcc_s are statically included and the resulting shared library bezier.dll has no dependency on MinGW:

> dumpbin /dependents extra-dll\bezier-e5dbb97a.dll
Microsoft (R) COFF/PE Dumper Version ...
Copyright (C) Microsoft Corporation.  All rights reserved.


Dump of file extra-dll\bezier-e5dbb97a.dll

File Type: DLL

  Image has the following dependencies:

    KERNEL32.dll
    msvcrt.dll
    USER32.dll
...

Note

Although msvcrt.dll is a dependency of bezier.dll, it is not a problem. Any values returned from Fortran (as intent(out)) will have already been allocated by the caller (e.g. the Python process). This won’t necessarily be true for generic Fortran subroutines, but subroutines marked with bind(c) (i.e. marked as part of the C ABI of libbezier) will not be allowed to use allocatable or deferred-shape output variables. Any memory allocated in Fortran will be isolated within the Fortran code.

However, the dependency on msvcrt.dll can still be avoided if desired. The MinGW gfortran default “specs file” can be captured:

$ gfortran -dumpspecs > ${SPECS_FILENAME}

and modified to replace instances of -lmsvcrt with a substitute, e.g. -lmsvcr90. Then gfortran can be invoked with the flag -specs=${SPECS_FILENAME} to use the custom spec. (Some other dependencies may also indirectly depend on msvcrt.dll, such as -lmoldname. Removing dependencies is not an easy process.)
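The substitution itself can be scripted. The sketch below is an assumption about how one might do it, not a tool shipped with bezier. The function name patch_specs and the sample specs fragment are hypothetical. It replaces -lmsvcrt only when it appears as a whole flag, so that longer flags starting with the same text are left alone.

```python
import re


def patch_specs(specs_text, replacement="-lmsvcr90"):
    """Replace whole-token occurrences of ``-lmsvcrt`` in a specs file."""
    # The lookarounds stop the pattern from matching inside a longer flag.
    return re.sub(r"(?<![\w-])-lmsvcrt(?![\w-])", replacement, specs_text)


# Illustrative fragment only; a real specs file is much longer.
sample = "-lmoldname -lmingwex -lmsvcrt"
print(patch_specs(sample))
```

The patched text would then be written back to ${SPECS_FILENAME} before invoking gfortran with -specs.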

From there, an import library must be created from the bezier.def file produced by the linker:

> lib /def:.\bezier.def /out:.\lib\bezier.lib /machine:${ARCH}

Note

lib.exe is used from the same version of MSVC that compiled the target Python. Luckily distutils enables this without difficulty.

Source

For code that depends on libgfortran, it may be problematic to also depend on the local copy distributed with the bezier wheels.

The bezier Python package can be built from source if it is not feasible to link with these libraries, if a different Fortran compiler is required or “just because”.

The Python extension module can be built from source via:

$ # One of
$ BEZIER_INSTALL_PREFIX=.../usr/ python -m pip wheel .
$ BEZIER_INSTALL_PREFIX=.../usr/ python -m pip install .
$ BEZIER_INSTALL_PREFIX=.../usr/ python setup.py build_ext
$ BEZIER_INSTALL_PREFIX=.../usr/ python setup.py build_ext --inplace