OpenCV Configuration and Optimization Notes


The default package for OpenCV on Fedora 20 (f20) is


The performance of algorithms such as CascadeClassifier::detectMultiScale, and of tools such as opencv_traincascade, can be improved by installing additional packages and then enabling them with build flags when rebuilding OpenCV.

Looking through the opencv.spec SRPM file, various enable flags are provided for configuration tweaking and tuning purposes when rebuilding with rpmbuild.

The most relevant for optimization:

--with eigen3
--with sse3

The most relevant for extending capabilities:

--with ffmpeg
--with openni

The default package can be rebuilt with these optimizations using syntax like:

rpmbuild -ba opencv.spec --with ffmpeg --with openni --with eigen3 --with sse3

However, even when using these flags on f20, the output provided by cmake at configuration time doesn’t enthuse. So, rebuild upstream sources without RPM to master the package configuration, and then bring this knowledge back into the RPM package. Old school, yo.

Looking at the upstream source repository, and then rebasing the f20 sources to the latest release of OpenCV (2.4.9) starts off the SRPM hacking. To get a cmake build going, build the opencv sources as specified in the link, to get dependency tracking working.

The file CMakeLists.txt has the build-time configure options.

A list of the most interesting:





Setup, Install Prerequisites.

A couple of these are easy to enable, with dependencies already pre-packaged.

For development, you’ll need the following dependencies:

yum install -y gtk2-devel libtheora-devel libvorbis-devel libraw1394-devel libdc1394-devel jasper-devel libpng-devel libjpeg-devel libtiff-devel libv4l-devel libGL-devel gtkglext-devel OpenEXR-devel zlib-devel python2-devel swig python-sphinx gstreamer-devel gstreamer-plugins-base-devel opencl-headers gstreamer-plugins-bad-free-devel gstreamer-python-devel gstreamer-devel gstreamer-plugins-bad-free-devel-docs gstreamer-plugins-base-devel-docs gstreamer-plugins-ugly-devel-docs libpng12-devel mesa-libGLES-devel

To execute binaries that have been compiled with this optimized version of opencv, one will need to install the OpenCL runtime.


yum install -y openni openni-devel openni-doc


yum install -y ffmpeg ffmpeg-devel


yum install -y tbb tbb-devel tbb-doc


yum install -y eigen3-devel eigen3-doc


To enable WITH_IPP, more elaborate configuration is required. First, install Intel Performance Primitives (aka IPP).

From the User’s Guide: “Note that opencv_traincascade application can use TBB for multi-threading. To use it in multicore mode OpenCV must be built with TBB.”

After IPP is installed, the system must be configured to use it easily. To fixup PATHs, pick one of two options.

One: add the following to LD_LIBRARY_PATH and LD_RUN_PATH:


Two: edit /etc/ld.so.conf.d and add



Furthermore, for OpenCV configuration to find the installed IPP at SRPM build time, the environment variable IPPROOT must be set, as follows:

setenv IPPROOT /opt/intel/ipp

Build SRPM

Build the modified opencv package with the following custom SPEC file. No configuration options are necessary: WITH_IPP, WITH_TBB, WITH_EIGEN are all enabled.

Then, force install it over the default libs as follows:

rpm -Uvh --nodeps opencv-2.4.9-3 etc etc.

Recompile the opencv app in question, and voilà. Optimized. Speedups may vary; I am seeing roughly 2.3x speedups in processing times.

libabigail aka C++ Instrumentation and Analysis


Libabigail is shorthand for the alternative, which just so happens to be a bit of a mouthful: “GNU Application Binary Interface Generic Analysis and Instrumentation Library.”

This is a current compiler/language research topic to provide a serialized XML form of C++11 sources as compiled by GNU g++, and a way of looking at the data produced. This data can be parsed to more accurately determine ABI compatibility, to better understand code additions and changes and how these change the exported interface, to examine and prototype how C++11 language usage determines linkage, etc.

Discussions about this functionality started at the “C++ ABI BOF” at the GNU Tools Cauldron 2012 Prague. This work was created at Red Hat, by Benjamin Kosnik, Jason Merrill, and Dodji Seketeli. Some updates at 2013 Cauldron. See “Cauldron 2013 GCC ABI BOF.”

Development sources are written in mixed C++2003/C++11, hosted in git, based on GCC trunk, and tracking what will be gcc-4.9.0. The branch is administered by Dodji Seketeli.

Please feel free to try it out, but know that the state is experimental and quite raw.

Feedback and assistance are welcome.

Starting from a git working tree as described in GitMirror, add the libabigail repository as follows:

git checkout -b libabigail origin/libabigail

To stay up to date, use:

git pull


How is this expected to be used? First, a libabigail top-level directory is either added to the GCC sources or compiled as a first step and put into some PREFIX directory. The GNU C++ compiler, g++, is configured to use this new library with:

configure .. --with-abigail=$PREFIX

Thus configured, the C++ front end is built, installed, and used as the primary compiler. All sources are compiled with an additional flag, -fdump-abi.

So, this command:

g++ -c -fdump-abi

Creates two files:

  • somefile.o

    The object file


    The XML instrumentation file



Toplevel namespace is abigail.

The interface header files in libabigail:


Doxygen is used to document the sources: try make html to generate, and look in libabigail-build-dir/doc/api/html/index.html to read it.

And then the binary interface is in


Each object file is compiled to a translation_unit. The sum of all translation_units is a corpus.

Compiler-generated files are read as serialized input to a translation_unit and de-serialized. And any modified form is written to an output file in serialized form.

The interface to the C++ intermediate representation is best viewed in the class documentation.
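As a rough illustration of the translation_unit and corpus relationship described above, here is a toy model. It does not use libabigail's real classes: toy_translation_unit, toy_corpus, and removed_symbols are hypothetical names, and an exported interface is reduced to a set of strings. The point is only to show how comparing two corpora could flag symbols that vanished between versions.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for what one object file exports.
struct toy_translation_unit
{
  std::string path;
  std::set<std::string> exported;
};

// A corpus is the sum of its translation units.
struct toy_corpus
{
  std::vector<toy_translation_unit> units;

  // Union of the exported symbols of every unit.
  std::set<std::string> exported() const
  {
    std::set<std::string> all;
    for (const toy_translation_unit& tu : units)
      all.insert(tu.exported.begin(), tu.exported.end());
    return all;
  }
};

// Symbols present in the old corpus but missing from the new one.
inline std::set<std::string>
removed_symbols(const toy_corpus& before, const toy_corpus& after)
{
  const std::set<std::string> a = before.exported();
  const std::set<std::string> b = after.exported();
  std::set<std::string> gone;
  std::set_difference(a.begin(), a.end(), b.begin(), b.end(),
                      std::inserter(gone, gone.end()));
  return gone;
}

inline void toy_corpus_demo()
{
  toy_corpus v1, v2;
  v1.units.push_back({"a.o", {"f()", "g()"}});
  v2.units.push_back({"a.o", {"f()"}});
  const std::set<std::string> gone = removed_symbols(v1, v2);
  assert(gone.size() == 1 && gone.count("g()") == 1);
}
```

Calling toy_corpus_demo() from main exercises the comparison. A real ABI check would of course compare types and layouts, not just names.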

Opinions and Wild Guesses

1. Some formatting tips.

– classes “read” as types, data, member functions. In that order.

– doxygen gives feedback on the state of the doxygen parse in the form of a log, as you run “make html.” Read this log: doxygen is a fuzzy parse. There are formatting things you can do to make it better. Do them. It’s easier to fix up these errors than figure out why the generated HTML is poor.

2. Use of shared_ptr is intriguing.

There are not really a lot of existing usage patterns for std::shared_ptr in libstdc++ (in C++11 , , ). If you look at the page of boost idioms for shared_ptr usage:

One notices that there’s not a lot of use of shared_ptrs in interfaces. Yet in libabigail, that is very common. I’m curious about this style question.

And most usage is up for debate; see this Stack Overflow discussion about using shared_ptrs as function arguments. Should the parameters be const reference or just shared_ptr? And another.

Some interesting thinking from Microsoft on shared_ptr usage.
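To make the by-value versus const-reference question concrete, here is a small sketch. The function names are hypothetical and not taken from libabigail. Passing by value copies the pointer, so the reference count goes up for the duration of the call; passing by const reference leaves it alone.

```cpp
#include <cassert>
#include <memory>

// By value: the callee holds its own reference while it runs.
inline long owners_seen_by_value(std::shared_ptr<int> p)
{ return p.use_count(); }

// By const reference: no copy, so no reference-count traffic.
inline long owners_seen_by_cref(const std::shared_ptr<int>& p)
{ return p.use_count(); }

inline void shared_ptr_demo()
{
  std::shared_ptr<int> sp = std::make_shared<int>(42);
  assert(owners_seen_by_cref(sp) == 1);   // only sp owns the int
  assert(owners_seen_by_value(sp) == 2);  // sp plus the parameter copy
  assert(sp.use_count() == 1);            // the copy is gone after the call
}
```

Which form is right for an interface probably depends on whether the callee is meant to share ownership or merely look at the object.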

3. Use of virtual binary operators is odd.

The old adage is that operators cause havoc in overload resolution. These are binary operators, but the stigma lingers. A vague feeling is not the same as something definite that’s a hard no. It’s more like the pirate code than a strict coding convention or hard rule. I would say that if you ever start to see strange bugs due to overloading, consider making these (non-operator) functions.

Otherwise, do it.
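If the overloading bugs do show up, one possible fallback is sketched below: keep a single non-virtual operator== and have it forward to a named virtual function. The classes node and sized_node are hypothetical, invented for this illustration, and the kind() tag is just one simple way to make the downcast safe.

```cpp
#include <cassert>

struct node
{
  virtual ~node() { }
  virtual int kind() const { return 0; }
  // Named virtual hook: it is overridden like any other member function
  // and stays out of operator overload resolution.
  virtual bool equals(const node& other) const
  { return kind() == other.kind(); }
};

struct sized_node : node
{
  explicit sized_node(int size) : size_(size) { }
  int kind() const override { return 1; }
  bool equals(const node& other) const override
  {
    if (!node::equals(other))
      return false;
    // Safe in this sketch: an equal kind() means other is a sized_node.
    return size_ == static_cast<const sized_node&>(other).size_;
  }
  int size_;
};

// The only operator: non-member, non-virtual, forwards to the hook.
inline bool operator==(const node& l, const node& r)
{ return l.equals(r); }

inline void operator_demo()
{
  sized_node a(4), b(4), c(8);
  node n;
  assert(a == b);
  assert(!(a == c));
  assert(!(n == a));
}
```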

Compiler Feature Testing

SG10 is a working group for C++ users to try and figure out how to port between C++11 and C++14. It’s part of the ISO C++ standardization plan for post-C++11 work.

SG10 first met in Bristol, United Kingdom during the spring of 2013. There have been two teleconferences, and an archived mailing list has been set up for discussion.

The goal is to have some consensus for an approach that vendors can use as C++14 is implemented. In particular, draft recommendations are due prior to the start of the Chicago meeting starting on August 23, 2013.


Below is a summary of the macro interface for relevant languages (C and C++), operating systems (POSIX), and notable implementations (GNU, EDG, Clang, Boost).


6.10.8 Predefined macro names

Mandatory macros

__STDC_VERSION__ (201ymmL)
__TIME__

Environment macros

Conditionally defined:

__STDC_UTF_16__ (char16_t types are UTF-16 encoded)
__STDC_UTF_32__ (char32_t types are UTF-32 encoded)

Conditional feature macros

__STDC_ANALYZABLE__ (1 iff conforms to Annex L)
__STDC_IEC_559__ (1 iff IEC 60559 floating point)
__STDC_IEC_559_COMPLEX__ (1 iff IEC 60559 complex)
__STDC_LIB_EXT1__ (201ymmL Annex K, bounds checking interfaces)
__STDC_NO_ATOMICS__ (1 iff no atomics)
__STDC_NO_COMPLEX__ (1 iff no complex.h types)
__STDC_NO_THREADS__ (1 iff no threads)
__STDC_NO_VLA__ (1 iff no VLA)


See the unistd.h include file.

2.1.3 POSIX Conformance

_POSIX_VERSION (200809L iff all mandatory functions and headers)

Defined with value > 1:

_POSIX_NO_TRUNC (!-1, pathnames smaller than NAME_MAX ok)

Defined with value = 200809L


Defined with a value > 0


Defined optionally (-1 means no, 0 maybe, >0 means yes)


User defined to specify version



As of the post-C++11 draft standard, N3485, the current lay of the land is divided into:

16.8 Predefined macro names [cpp.predefined]


__cplusplus (201103L)
__DATE__ ("Mmm dd yyyy")
__STDC_HOSTED__ (1 iff hosted, 0 iff freestanding)
__TIME__ ("hh:mm:ss")

Conditionally defined:

__STDCPP_STRICT_POINTER_SAFETY__ (1 iff strict pointer safety as per
__STDCPP_THREADS__ (1 iff more than single thread supported)

Some other notable implementation interfaces follow.


From “The C Preprocessor” manual, section 3.7 on Predefined Macros.


__OPTIMIZE__ (iff -On, where n > 0)
__NO_INLINE__ (if -fno-inline, or when not optimizing)
__SIZE_TYPE__ + others (correct underlying type)
__SIZEOF_INT__ + others (size of type)
__EXCEPTIONS (if -fexceptions)
__GXX_WEAK (1 if comdat, weak supported)
__GXX_RTTI (if -frtti)
__STRICT_ANSI__ (no GNU extensions)
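A minimal sketch of how a few of the GNU macros above can be consumed from ordinary code. The constant names built_optimized and built_with_rtti are hypothetical; the macros themselves are the ones listed. The checks are wrapped in a __GNUC__ test on the assumption that other compilers may not provide them.

```cpp
#include <cstddef>
#include <type_traits>

#if defined(__GNUC__)
// __SIZEOF_INT__ mirrors sizeof(int), but is also usable in #if directives.
static_assert(__SIZEOF_INT__ == sizeof(int), "size macro matches the type");
// __SIZE_TYPE__ expands to the underlying type of std::size_t.
static_assert(std::is_same<__SIZE_TYPE__, std::size_t>::value,
              "type macro matches size_t");
#endif

// Fold two of the flag-dependent macros into ordinary constants.
#if defined(__OPTIMIZE__)
constexpr bool built_optimized = true;
#else
constexpr bool built_optimized = false;
#endif

#if defined(__GXX_RTTI)
constexpr bool built_with_rtti = true;
#else
constexpr bool built_with_rtti = false;
#endif
```

Turning the macros into constants once, in one header, keeps the rest of the code free of scattered #ifdef blocks.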










See Feature Checking Macros. Uses a generalized mechanism via “builtin function-like” macros.
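For comparison with the plain object-like macros above, here is a short sketch of the function-like style. The fallback definition of __has_feature follows the pattern suggested in the Clang documentation; HAVE_CLANG_CONSTEXPR and clang_reports_constexpr are hypothetical names used only for this example.

```cpp
// Compatibility shim so the same source still preprocesses on compilers
// that do not provide __has_feature.
#ifndef __has_feature
#  define __has_feature(x) 0
#endif

#if __has_feature(cxx_constexpr)
#  define HAVE_CLANG_CONSTEXPR 1   // hypothetical project macro
#else
#  define HAVE_CLANG_CONSTEXPR 0
#endif

// 1 when the compiler reports C++11 constexpr through __has_feature,
// 0 otherwise (including compilers that lack the mechanism entirely).
static const int clang_reports_constexpr = HAVE_CLANG_CONSTEXPR;
```

Note that a 0 here only means the query mechanism said no or is missing. It does not prove the feature is absent on a non-Clang compiler.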








Proposed Language Predefines

The plan is to start with new language features, and to offer several modular macros that sub-divide the C++2014 feature set, while retaining a relationship with the main versioning macro (__cplusplus).

In addition, there is much interest in a solution that could be used to resolve some of the lingering portability issues with C++2011 (constexpr, variadic templates), and even C++2003 (exceptions, RTTI).

Starting with proposed C++11/14 language features, add predefined macros of the form:

__cpp + language feature

So, for constexpr, the macro becomes:

__cpp_constexpr


The value is determined to be:

1) if C++11 constexpr is not supported, __cpp_constexpr < 201103L
2) if C++11 constexpr is supported, __cpp_constexpr >= 201103L
3) if C++14 constexpr is supported, __cpp_constexpr > 201103L

In the last case, there is a bit of ambiguity. How do you distinguish between a C++11 conformant compiler, an experimental C++14 compiler of a particular vintage, and a C++14 conformant compiler?

One way would be to use the same form used by __cplusplus. This macro value is computed from the year + month of the standard’s adoption by ISO. In a similar manner, pre-standard features could be defined as the year + month that the feature was voted into the working C++ draft.

Take the evolution of constexpr as a useful for-instance. Using __cpp_constexpr, set it to the following values based on different language dialect flags, and compare to the primary C++ macro, __cplusplus.

c++ dialect                                               flag           __cplusplus   __cpp_constexpr
c++98                                                     -std=gnu++98   199711L       199711L
c++11                                                     -std=gnu++11   201103L       201103L
pre-c++14 with N3302/N3470/3469/3471 lib changes          -std=gnu++1y   201300L       201210L
pre-c++14 with above + N3652 (relaxed) language changes   -std=gnu++1y   201300L       201304L
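The year + month encoding used in the table can be checked with a few lines of arithmetic. This is only a sketch; feature_date is a hypothetical helper, not something proposed by SG10.

```cpp
// Encode a year and month the way __cplusplus does: yyyymmL.
constexpr long feature_date(int year, int month)
{ return year * 100L + month; }

// Values taken from the table above.
static_assert(feature_date(1997, 11) == 199711L, "c++98 row");
static_assert(feature_date(2011, 3) == 201103L, "c++11 row");
static_assert(feature_date(2012, 10) == 201210L, "library changes row");
static_assert(feature_date(2013, 4) == 201304L, "relaxed constexpr row");

// A later date always compares greater, so ordering tests stay simple.
static_assert(feature_date(2013, 4) > feature_date(2011, 3), "monotonic");
```

Because the encoding is monotonic, a plain >= comparison is enough to ask "is this feature at least as new as the version I need?"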

Proposed Library Defines

Starting with proposed C++11/14 library features, add macros of the form:

__cpp_lib_ + header name

So, for the C++11 <future> header, the macro becomes:

__cpp_lib_futures


The value is determined to be:

1) if C++11 futures is not supported, __cpp_lib_futures < 201103L
2) if C++11 futures is supported, __cpp_lib_futures >= 201103L
3) if C++14 futures is supported, __cpp_lib_futures > 201103L

This would require library implementors to create a header file with this macro definition. (As opposed to not having the header, or pre-defining this macro, or having the library feature testing macros live in one particular header.)

Example Usage

Guarding for C++11 constexpr:

#if __cpp_constexpr >= 201103L
  constexpr int i = 66;
#endif

Guarding for C++14 relaxed constexpr, given C++11 assumed and a constexpr function incr declared elsewhere:

#if __cpp_constexpr >= 201304L
constexpr int h(int k)
{
  int x = incr(k);
  return x;
}
#endif
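One detail that makes these guards work on compilers that know nothing about the proposal: inside #if, an identifier that is not defined as a macro is replaced by 0. The sketch below leans on that to pick a fallback. APP_CONSTEXPR14, sum_to, and sum_to_demo are hypothetical names for this example only.

```cpp
#include <cassert>

// An undefined __cpp_constexpr evaluates to 0 here, so older compilers
// quietly take the #else branch.
#if __cpp_constexpr >= 201304L
#  define APP_CONSTEXPR14 constexpr   // hypothetical project macro
#else
#  define APP_CONSTEXPR14 inline
#endif

// Uses a local variable and a loop, which C++11 constexpr does not allow.
APP_CONSTEXPR14 int sum_to(int n)
{
  int total = 0;
  for (int i = 1; i <= n; ++i)
    total += i;
  return total;
}

#if __cpp_constexpr >= 201304L
static_assert(sum_to(4) == 10, "evaluated at compile time");
#endif

// The run-time result is the same whichever branch was taken.
inline void sum_to_demo()
{ assert(sum_to(4) == 10); }
```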

Open Questions

1. Macro conventions.

The macro naming convention, the numbers of macros, type, form, etc. are all up for debate.

Some consensus on:

a. Against function-style macros in the committee, but no explicit rationale for this.

b. The prefix with the most support is: __cpp_.

c. Language feature macros should be pre-defined and not tied to a particular header.

2. What about feature testing in older versions of C++?

In the C++11 standard, two new macros were added, proto feature-testing macros. These macros may establish a naming precedent.


If the committee feels like this is not precedent, and that new functionality means new name, then hopefully these will be incorporated into whatever naming scheme is now proposed, and the C++11 forms deprecated.

3. Longest-standing feature-testing portability wart is from 1997, starting with the language features exceptions and run-time type identification.

Solving generalized feature testing in a manner simpatico with both older and newer language features is highly desirable. Some background on GNU issues with this is PR 25191.

4. How do individual feature tests fit in with the global version macro for C++ (__cplusplus)?

Right now, there’s only one real macro, so everything depends on it. But when there are more, how do the multiple feature test macros interact with __cplusplus? Is there a general way to indicate that there is a compiler setting or command line flag that has explicitly disabled parts of the specified language dialect?

No, there is not. Should there be just one, or should a bunch of smaller macros also be checked?

Surveying a couple compilers for standard operating procedures, it seems as if the usual behavior is to treat the command line dialect flag as the base language target, rather than indicating full or strict conformance.

Then, strict language conformance is available via specific command-line flags (-ansi, or -std=c++98), and defines __STRICT_ANSI__ or another equivalent macro.

So, distinguishing between some vendor extensions and strict standard conformance is possible at compile time.

But disabling whole chunks of the regularly-supported language, like specific builtin types or language features, doesn’t distinguish itself in the same manner at compile time.

For instance, in the C++11 dialect, gnu/clang/edg front ends set __cplusplus to 201103L, even when language features required for full conformance, like exceptions or long long integers, are explicitly disabled.
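In practice this means code that wants to survive -fno-exceptions cannot rely on __cplusplus and has to consult a vendor macro. A minimal sketch, using the GNU-style __EXCEPTIONS macro listed earlier; APP_HAS_EXCEPTIONS and checked_divide are hypothetical names, and compilers that never define __EXCEPTIONS will simply take the abort path.

```cpp
#include <cassert>
#include <cstdlib>
#include <stdexcept>

// __EXCEPTIONS is absent under -fno-exceptions even though __cplusplus
// still reports the same dialect.
#if defined(__EXCEPTIONS)
#  define APP_HAS_EXCEPTIONS 1   // hypothetical project macro
#else
#  define APP_HAS_EXCEPTIONS 0
#endif

// Report a failed precondition in whichever way the build allows.
inline int checked_divide(int num, int den)
{
  if (den == 0)
    {
#if APP_HAS_EXCEPTIONS
      throw std::invalid_argument("division by zero");
#else
      std::abort();
#endif
    }
  return num / den;
}

inline void checked_divide_demo()
{ assert(checked_divide(6, 3) == 2); }
```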

The C language has the idea of a pre-defined macro that indicates conformance (__STDC__), and separates out dialect (__STDC_VERSION__). In practice, __STDC__ may indicate conformance + extensions, or explicitly non-conformant behavior. So, not especially useful.

In C++ these macros are explicitly implementation-defined, so even less useful.

POSIX has a runtime test that can be used to determine functionality, i.e. sysconf.
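The difference between the compile-time macro and the run-time query can be sketched as follows. This assumes a POSIX host with unistd.h; posix_at_compile_time, posix_at_run_time, and posix_demo are hypothetical names.

```cpp
#include <cassert>
#include <unistd.h>

// Compile-time answer: what the headers this code was built against promise.
#if defined(_POSIX_VERSION)
constexpr long posix_at_compile_time = _POSIX_VERSION;
#else
constexpr long posix_at_compile_time = -1;
#endif

// Run-time answer: what the system actually running the binary reports.
inline long posix_at_run_time()
{ return sysconf(_SC_VERSION); }

inline void posix_demo()
{
  // The two values usually agree, but a binary moved to another system
  // may see a different run-time value.
  assert(posix_at_compile_time > 0);
  assert(posix_at_run_time() > 0);
}
```

C++ has no standard equivalent of the run-time half, which is part of why the feature-testing discussion centers on macros.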

5. What about multi-vendor setups hosted on a single operating system?

Doxygen vs. generated graphviz class hierarchy visualizations

Here is a breakdown of the generation steps doxygen uses to visualize class hierarchy. Requisite software includes doxygen-1.8.3, run on C++/C++11 files. The compilation and development environment is Fedora 18/x86_64 using GNU C++ version 4.8.

Doxygen Overview

Some Doxygen basics, and internals. The Fedora package is doxygen-1.8.3-3.fc18.x86_64, the command line invocation is: doxygen, which is a C++ binary.

To make the doxygen binary debuggable, check out doxygen in subversion and configure the build with --debug. (On Fedora, some other dependencies are required, like qt-devel. An alias between what the Makefiles expect, ie code and the installed qmake-qt4 needs to be defined).

For this investigation, the subject of most interest is the language parser for C++, breaking in parseInput(). The doxygen parse phase lowdown:

The task of the parser is to convert the input buffer into a tree of entries (basically an abstract syntax tree). An entry is defined in src/entry.h and is a blob of loosely structured information. The most important field is section which specifies the kind of information contained in the entry.

The other area of interest is the output generator for graphviz sources and then generated diagrams. So, breaking in function generateOutput() (see src/doxygen.cpp), step until

  if (Config_getBool("HAVE_DOT"))

This is the part that generates the graphviz source files and then uses dot to create output from the previously-parsed C++ source data. Breaking in function DotManager::run() (see src/dot.cpp) allows stepping through individual graph creation.

To be determined: file name, class name mapping to Doxygen internals.

Graphviz Overview

Some graphviz basics, DOT language reference and users guides, wiki. Of particular interest are the “Node, Edge, and Graph Attributes.”

Usual command-line invocation looks like:

dot -v -Tsvg:cairo -o myfile.svg myfile.gv

And then fonts are in ~/.fonts or /usr/share/fonts, and can be controlled via the following attributes:


These should map to installed fonts, ie

%fc-match "DejaVuSansMono"
DejaVuSansMono.ttf: "DejaVu Sans Mono" "Book"

Doxygen Settings

Parts of the doxygen configuration file that matter, the config settings used, and any commentary.




COLLABORATION_GRAPH=NO (Interesting on a per-class basis only. For larger projects the noise becomes overwhelming.)





TEMPLATE_RELATIONS=NO (Relations between primary templates and template instances is very cluttered, noise value high. Template relationships and class hierarchy relations in non-UML mode are displayed on the same diagram, but use a different visual grammar. Classes inherit base to derived. Throw templates in and they read "as if" from base to primary template to specific instance. This should instead be base to specific instance.)



DOT_IMAGE_FORMAT=SVG (Resolution-independent text, editable, lossless)

INTERACTIVE_SVG=YES (Focus control for big diagrams)

With these settings, a PDF file of the GNU C++11 API runs over three thousand five hundred pages.

Doxygen XML attribute for Graphviz

Legend for doxygen-generated graphviz diagrams.

2) what attributes are needed in XML to represent this?
3) what are the added attributes/markup needed to get longstanding bugs fixed? Or are these solely parse errors?

Generated Diagram Quality

Sample set is GCC-4.8.0 C++ docset, based on a generated output on 2013-03-10.

Representative diagrams for more traditionally-styled C++ code, in the form of OO-style class hierarchies, are found based off of the std::exception and std::ios_base root elements.

  1. std::exception, just the class hierarchy diagram, and the Exceptions Module
  2. std::ios_base, just the class hierarchy diagram, and the IO Module

Starting with the exception diagram, because the lack of templates in this hierarchy is a useful simplifying factor. This diagram is accurate. Layout issues include: names overflowing the bounding boxes (__gnu_pbds::insert_error) and line break issues (same, but others like __gnu_cxx::recursive_init_error). Many of the line connectors and paths to endpoints are infuriatingly erratic. These issues largely look to be the kind of thing that could be tweaked via various dot settings, or related graphviz tool settings (like neato).

Next diagram: io. This hierarchy diagram is largely accurate, but with distracting elements and extraneous information. This is a multi-level class hierarchy with both base classes and base class templates. Starting from the left, reading to the right. The least-derived base is ios_base. A class template for basic_ios derives from it, taking two parameters. In this diagram, two templates derive from ios_base: the primary class template for basic_ios, and a fully-specialized class template for basic_ios instantiated with the integer type char. A couple of things to note, one level in to the diagram:

  • restricting to just primary templates would be useful, ie this diagram without any specializations. Indeed, this is what the diagrams evolve into once level two and above diagrams are cleaved off, ie starting from basic_istream (instead of ios_base) and going to basic_stringstream.
  • there are actually two specializations for this hierarchy, both char and wchar_t. Where’s wchar_t?
  • there are actually typedefs like ios and wios for the char and wchar_t instantiations of the basic_ios class template.
  • basic_ostream char specialization is duplicated, once for basic_ostream<char>, and one for basic_ostream<char, char_traits, char>. Neither of these instances actually exists. There’s a similar phantom template instance for basic_iostream.

Let’s stop here with this diagram. The rest of the hierarchy has similar issues.

Next, let’s examine some template-heavy components and idioms, like policy based design.
Components that use this idiom are found in the class hierarchies for std::allocator, std::unordered_map (and the other unordered containers), __gnu_cxx::vstring, and the policy-based data structures extensions, such as __gnu_pbds::trie.

  1. std::allocator
  2. __gnu_cxx::__versa_string
  3. __gnu_pbds::trie

For the first class, std::allocator, the generated diagram is accurate. Ideally, there would be a visual marker for the allocator void specialization, and note about the grouping of the superset of extension allocators as the base class for std::allocator.

The two extension classes share common issues, and neither has an accurately-generated class hierarchy. Both make use of multiple base classes and policy-based design.

Finally, a pass at some C++11 features. Of note are things like variadic templates (invented for std::tuple and then used elsewhere) and template aliases (used in many parts of the library with policy-based designs, like std::allocator and std::unordered_map).

  1. std::tuple
  2. std::unordered_map

Just making a quick pass here. From the tuple diagram, ponder the implied template relations. This is hard, since making sense of this with a visual grammar would require better grouping between primary template, partial specializations, and full specializations.

And for unordered_map, apparently the complexity of the derivation, plus the templates, plus the use of C++11 features like alias templates aborts the graph. No hierarchy is not an accurate hierarchy.

Doxygen use is not considered harmful: even with these flaws, it is an invaluable tool. Reasonable people may differ, of course.

Sources (C++11, graphviz)

For a given set of sources:

struct base
{
  enum mode : short { in, out, top, bottom };
  typedef long value_type;
};

struct A : public base
{
  int _M_i;
  int _M_n;
};

struct B : public base
{
  value_type _M_v;

  constexpr B(value_type __v = 6) : _M_v(__v) { }
};

struct C : public B
{
  constexpr value_type
  square() { return _M_v * _M_v; }
};

struct D : public A
{
  D(const D& __d) : A() { }

  ~D() { }
};

Next, use doxygen to generate HTML, with HAVE_DOT set to YES and DOT_CLEANUP set to NO in the doxygen configuration file. With this configuration tweak, when doxygen is used to generate HTML, the doxygen-generated graphviz sources used to create the class diagrams are not destroyed. On examination, they produce the following graphic:


And then, look at the generated graphviz for the base class, the root of the diagram:

digraph "base"
{
  // edge and node defaults
  edge [fontname="FreeSans",fontsize="9",labelfontname="FreeSans",labelfontsize="9"];
  node [fontname="FreeSans",fontsize="9",shape=record];

  // actual graph
  Node1 [label="base",height=0.2,width=0.4,color="black", fillcolor="grey75", style="filled" fontcolor="black"];
  Node1 -> Node2 [dir="back",color="midnightblue",fontsize="9",style="solid",fontname="FreeSans"];
  Node2 [label="A",height=0.2,width=0.4,color="black", fillcolor="white", style="filled",URL="$struct_a.html"];
  Node2 -> Node3 [dir="back",color="midnightblue",fontsize="9",style="solid",fontname="FreeSans"];
  Node3 [label="D",height=0.2,width=0.4,color="black", fillcolor="white", style="filled",URL="$struct_d.html"];
  Node1 -> Node4 [dir="back",color="midnightblue",fontsize="9",style="solid",fontname="FreeSans"];
  Node4 [label="B",height=0.2,width=0.4,color="black", fillcolor="white", style="filled",URL="$struct_b.html"];
  Node4 -> Node5 [dir="back",color="midnightblue",fontsize="9",style="solid",fontname="FreeSans"];
  Node5 [label="C",height=0.2,width=0.4,color="black", fillcolor="white", style="filled",URL="$struct_c.html"];
}


Template-fu, Zen Linking

Welcome, weary traveller.

Please, enter the dojo. Have some tea. Sit, and listen to me expound on the state of linking today.

There are a number of new techniques for linking in C++11. Some are not widely known. Some require long nights, on cold drafty mountain tops to fully master.

The new forms:

1. Extern Template

When you want white. Nothing. A truly private implementation, with only the API exported. Use extern template on class specializations to tell the compiler to not implicitly instantiate any symbols when the class is used by user code. For template functions as well.

Smartly done on forward-declarations, after the main class has been defined, making them post-declarations.  Pretty much anything goes: the syntax is the same as the syntax for explicit instantiations. Precisely because the two are a matched pair: with the power to prohibit instantiations comes the responsibility to explicitly instantiate them in some form. Wax on, wax off.

With C++11, extern template is portable. GNU C++ users have used it widely since 2002.
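A small sketch of the matched pair, squeezed into one file for brevity. In a real project the declaration would live in a header and the definition in exactly one source file; widget, widget.h, and widget.cc are hypothetical names.

```cpp
#include <cassert>

// widget.h (hypothetical): the template, visible to users.
template<typename T>
struct widget
{
  T value;
  T doubled() const;
};

template<typename T>
T widget<T>::doubled() const
{ return value + value; }

// Also in the header: an explicit instantiation declaration. Code that
// includes the header will not implicitly instantiate widget<int> members.
extern template struct widget<int>;

// widget.cc (hypothetical): the matching explicit instantiation definition,
// provided once. Same syntax, minus the extern keyword.
template struct widget<int>;

inline void extern_template_demo()
{
  widget<int> w = { 21 };
  assert(w.doubled() == 42);
}
```

Forgetting the second half typically shows up as an undefined reference at link time, which is the "responsibility" side of the bargain.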

2. -fvisibility=hidden

And why it’s different from extern template. There seems to be a lot of confusion out there, about this. And let’s face it, the syntax is atrocious! Absolutely abominable.

GNU extensions, apply as attribute on namespaces.
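For reference, a sketch of the attribute syntax in question, including the namespace form. It assumes a GNU-compatible compiler; the macros APP_EXPORT and APP_HIDDEN and the functions are hypothetical, and on other compilers the macros expand to nothing.

```cpp
#include <cassert>

#if defined(__GNUC__)
#  define APP_EXPORT __attribute__((visibility("default")))   // hypothetical
#  define APP_HIDDEN __attribute__((visibility("hidden")))    // hypothetical
#else
#  define APP_EXPORT
#  define APP_HIDDEN
#endif

// Declarations in this namespace are hidden unless marked otherwise,
// similar in effect to building just this code with -fvisibility=hidden.
namespace detail APP_HIDDEN
{
  inline int helper(int x) { return x * 2; }
}

// The public entry point stays visible to users of a shared library.
APP_EXPORT int api_entry(int x);

int api_entry(int x)
{ return detail::helper(x) + 1; }

inline void visibility_demo()
{ assert(api_entry(3) == 7); }
```

Unlike extern template, which controls what gets instantiated, visibility only controls which symbols a shared object exports. The code is still generated either way.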

3. constexpr


4. Namespace association. Tarsier-style.

But I will not bore you, weary traveller. Sit and enjoy your beverage. There will be plenty of time to talk about new techniques and methods later, after you have rested from your voyage.