PARP Research Group, University of Murcia, Spain


What is QVision?

QVision is an image processing and computer vision toolkit for application development, fast prototyping, and algorithm research.

It is intended for rapidly implementing and testing new ideas without investing too much time and effort in the gory details of video and image input/output programming, graphical user interfaces, debugging, testing, performance measuring, and the like. It also includes implementations of well-known computer vision and image processing algorithms.

Special emphasis is placed on efficiency, which makes the toolkit well suited to real-time applications. It also offers versatile design tools for multithreaded programming, so QVision applications can benefit from multi-core architectures.

The QVision toolkit is currently developed and maintained to work on Linux platforms. Support for other operating systems, such as Windows and Mac OS, is planned for the future, taking advantage of the portability of the underlying Qt library.

The toolkit can be used in two ways. First, as a general-purpose computer vision library. QVision integrates efficient functionality for image processing, numerical computation, versatile video input/output, and most other tasks relevant to computer vision application development. All of it is exposed through an object-oriented and uniformly documented API.

Second, it can be used as an application development and prototyping tool. The toolkit offers a block-oriented approach to application development, built on reusable blocks. It makes it easy to add synchronized, data-sharing processes (threads) to an application and so create multithreaded applications.

QVision as a computer vision library

The toolkit is built in C++ on top of the widely used Qt graphical widget toolkit. The developer can benefit from Qt's extensive functionality (which ranges from generic container data types to file input/output, networking, and graphical widgets), its application development tools (qmake, Qt Designer, etc.), and its portability across different operating systems and platforms.

Usability and performance are the two main goals of QVision. To achieve the first in image processing, QVision includes the well-documented class QVImage, which models images as software objects. Objects of this class can load data from a wide range of image files, and provide functionality for pixel access and basic image operations. Section Basic image processing gives a general idea of the image processing functionality provided by QVision.

To achieve good image processing performance, the toolkit provides a comprehensive set of wrapper functions for the Intel IPP library. These wrappers take QVImage objects as input parameters instead of raw image data buffer pointers and steps. For the full list of these wrapper functions, see the IPP wrapper functions group documentation. Section Advanced image processing covers specific issues about their usage.
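The wrapper idea described above can be sketched in a few lines of standard C++. This is only an illustration under stated assumptions: GrayImage, rawAddConstant and addConstant are hypothetical names invented for this sketch. They are neither the real QVImage interface nor actual IPP functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical image class (NOT the real QVImage): it owns the pixel
// buffer and knows its own geometry, including the row step.
class GrayImage {
public:
    GrayImage(int cols, int rows, int padding = 0)
        : cols_(cols), rows_(rows), step_(cols + padding),
          data_(static_cast<std::size_t>(step_) * rows, 0) {}

    int cols() const { return cols_; }
    int rows() const { return rows_; }
    int step() const { return step_; }  // bytes per row, padding included
    std::uint8_t* data() { return data_.data(); }
    const std::uint8_t* data() const { return data_.data(); }

    std::uint8_t& at(int col, int row) {
        return data_[static_cast<std::size_t>(row) * step_ + col];
    }

private:
    int cols_, rows_, step_;
    std::vector<std::uint8_t> data_;
};

// Hypothetical low-level routine in the style of a C imaging library:
// the caller must pass raw pointers and row steps by hand.
void rawAddConstant(const std::uint8_t* src, int srcStep,
                    std::uint8_t* dst, int dstStep,
                    int cols, int rows, std::uint8_t value) {
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            const int sum = src[r * srcStep + c] + value;
            dst[r * dstStep + c] =
                static_cast<std::uint8_t>(sum > 255 ? 255 : sum);  // saturate
        }
    }
}

// Wrapper: takes image objects and forwards pointers, steps and sizes,
// so the caller cannot mix up the step of one image with another.
void addConstant(const GrayImage& src, std::uint8_t value, GrayImage& dst) {
    assert(src.cols() == dst.cols() && src.rows() == dst.rows());
    rawAddConstant(src.data(), src.step(), dst.data(), dst.step(),
                   src.cols(), src.rows(), value);
}
```

With such a wrapper, a call reduces to addConstant(input, 10, output), even when the two images use different row paddings.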

For image input/output, the toolkit offers functionality to load the content of a PNG or JPG image file into a QVImage object, and vice versa. This is based on Qt's own image load and store functionality.

To perform video input, QVision uses the versatile and widely known MPlayer multimedia player as a back-end application. This media player can open and read from a wide range of video sources: many video file formats and encodings, digital and analog TV, video cameras and webcams, streaming, sets (or sequences) of individual images, etc. When MPlayer is not available, or when image frames must be stored to a video file, the toolkit also has functions to store and read the content of a video file in the YUV4MPEG2 format.
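For orientation, the YUV4MPEG2 container mentioned above is simple enough to sketch by hand: a one-line text header, followed by a FRAME marker line and the raw planes for each frame. The sketch below assumes progressive 4:2:0 planar frames with square pixels. Yuv420Frame, writeY4mHeader, writeY4mFrame and encodeY4m are hypothetical names for this illustration, not the toolkit's own reader or writer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical planar 4:2:0 frame: full-size Y plane, quarter-size U and V.
struct Yuv420Frame {
    int width;
    int height;
    std::vector<std::uint8_t> y, u, v;

    Yuv420Frame(int w, int h, std::uint8_t luma = 16)
        : width(w), height(h),
          y(static_cast<std::size_t>(w) * h, luma),
          u(static_cast<std::size_t>(w / 2) * (h / 2), 128),
          v(static_cast<std::size_t>(w / 2) * (h / 2), 128) {
        assert(w > 0 && h > 0 && w % 2 == 0 && h % 2 == 0);
    }
};

// Stream header: signature, size, frame rate, progressive scan (Ip),
// square pixels (A1:1) and 4:2:0 chroma subsampling.
void writeY4mHeader(std::ostream& out, int width, int height,
                    int fpsNum, int fpsDen) {
    out << "YUV4MPEG2 W" << width << " H" << height
        << " F" << fpsNum << ':' << fpsDen << " Ip A1:1 C420jpeg\n";
}

static void writePlane(std::ostream& out,
                       const std::vector<std::uint8_t>& plane) {
    out.write(reinterpret_cast<const char*>(plane.data()),
              static_cast<std::streamsize>(plane.size()));
}

// Each frame: the marker line "FRAME\n", then the Y, U and V planes.
void writeY4mFrame(std::ostream& out, const Yuv420Frame& frame) {
    out << "FRAME\n";
    writePlane(out, frame.y);
    writePlane(out, frame.u);
    writePlane(out, frame.v);
}

// Convenience helper: encodes a whole sequence into an in-memory buffer.
std::string encodeY4m(const std::vector<Yuv420Frame>& frames,
                      int fpsNum, int fpsDen) {
    assert(!frames.empty());
    std::ostringstream out;
    writeY4mHeader(out, frames.front().width, frames.front().height,
                   fpsNum, fpsDen);
    for (const Yuv420Frame& frame : frames) writeY4mFrame(out, frame);
    return out.str();
}
```

Because the format is uncompressed, files grow quickly. It is best suited to short test sequences and to exchanging frames between tools.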

Section Image and video input/output describes the comprehensive function set provided by QVision to perform image and video input/output. See the documentation of the class QVMPlayerReader for further information about reading video with MPlayer.

Group Math extensions contains the algebraic functionality provided by the toolkit to operate with matrices, vectors, quaternions and tensors. These mathematical entities are modeled with different object classes, in keeping with the object-oriented spirit of QVision. These functions are mostly based on the GNU Scientific Library (GSL), BLAS and LAPACK.

Along with the IPP and the GSL, QVision can interoperate with other libraries and toolkits commonly used in computer vision, such as OpenCV, the CGAL library and CUDA. The toolkit includes functionality to convert native data type objects of those libraries (such as images, matrices, etc.) to QVision data type objects. Thus a QVision application can cleanly use functionality from those libraries.

Section Interoperability explains how to use functionality from other libraries in QVision.

QVision as an application design and prototyping tool

The QVision developer can create applications by coding and composing logically independent data processing blocks.

This approach encourages reusability. New computer vision or image processing algorithms can be coded inside a logical block, and then be shared or reused at no extra programming cost. It also makes it easy to give applications a clean, modular design.

Another advantage of the block-oriented design is that it can be used to create parallel applications. The QVision toolkit can automatically map the execution of each logical block to a different thread, obtaining performance gains when the applications run on multi-core architectures. The developer can create efficient parallel applications by carefully designing the block architecture of the application, connecting the logical blocks with easy-to-use synchronization and data sharing primitives.
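The general idea of running each block on its own thread, with the blocks linked by a synchronized data channel, can be sketched with the C++ standard library alone. This is a minimal sketch and not the QVision API: Channel, sourceBlock, squareBlock and runPipeline are hypothetical names, and the toolkit's own primitives may work quite differently.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical link between two blocks: a thread-safe queue that can be
// closed by the producer once no more data will arrive.
template <typename T>
class Channel {
public:
    void send(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            assert(!closed_);
            queue_.push(std::move(value));
        }
        ready_.notify_one();
    }

    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

    // Blocks until a value is available. Returns false once the channel
    // has been closed and fully drained.
    bool receive(T& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !queue_.empty() || closed_; });
        if (queue_.empty()) return false;
        out = std::move(queue_.front());
        queue_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<T> queue_;
    bool closed_ = false;
};

// Source block: stands in for a video reader, emitting frame numbers.
void sourceBlock(int count, Channel<int>& out) {
    for (int i = 1; i <= count; ++i) out.send(i);
    out.close();
}

// Processing block: stands in for an image processing step.
void squareBlock(Channel<int>& in, Channel<long>& out) {
    int value = 0;
    while (in.receive(value)) out.send(static_cast<long>(value) * value);
    out.close();
}

// Runs each block on its own thread and collects the results on the
// calling thread, which plays the role of an output block.
long runPipeline(int count) {
    Channel<int> frames;
    Channel<long> results;
    std::thread source([&] { sourceBlock(count, frames); });
    std::thread worker([&] { squareBlock(frames, results); });

    long sum = 0;
    long value = 0;
    while (results.receive(value)) sum += value;

    source.join();
    worker.join();
    return sum;
}
```

Since the blocks only communicate through the channels, each one can be written and tested in isolation, which is the property the block-oriented design relies on.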

The toolkit includes a set of classes which can be used to create and include ready-to-use logical blocks in new applications. These include video input/output blocks (for example, QVMPlayerReaderBlock creates image and video input blocks based on the MPlayer application), graphical blocks (which create widgets connected as inputs or outputs to other blocks, such as image displayers or control widgets), and basic image processing blocks (module Blocks for the IPP functions includes a comprehensive set of processing blocks that correspond to the image processing functions from the IPP library).

So, a new QVision block-oriented application can be created just by coding the more complex image or data processing blocks (if necessary), and building the application block structure by creating and connecting these user defined processing blocks and those provided by the toolkit.

Section Introduction to block programming introduces the reader to block programming in QVision.

The graphical user interface blocks provided by the toolkit give the end user direct inspection of, and control over, the algorithms and processes created in the application. This is a snapshot of a typical application developed with the QVision toolkit using some of these GUI blocks:

qvision-screenshot.png

The user can stop, resume and execute step by step each process of the application (through the main window named QVision default GUI for features in the screenshot above), while the application reads input frames from a given (live or recorded) video source. The user can also modify, at run time, the parameter values of the image processing and computer vision algorithms implemented in the application. Both features are useful for debugging and testing the algorithms, especially for research purposes.

Besides the control (or input) graphical blocks, the toolkit also includes a set of output graphical blocks, with which the user can inspect the performance and results of the algorithms. These widgets can display images (windows named Image canvas for MSER regions and Image canvas for Harris in the screenshot above), plot output data values resulting from the processing algorithms, and show CPU time consumption statistics (window named Plot for CPU performance plot of: MSER Block in the screenshot).

All of these graphical widgets are specifically designed to be used at a minimal programming cost: with just a few lines of code, the toolkit automatically creates and manages the whole graphical interface. The programmer or researcher can focus on creating, testing and debugging image processing and computer vision algorithms, without messing with graphical widget programming.

Section Graphical user interface blocks reviews the different input/output graphical blocks provided by QVision.

To ease the development of complex applications, with a moderate or high number of blocks and connections between them, QVision offers a visual development tool named The Designer. By creating an object of the class QVDesignerGUI, the application opens a slate window where every block created by the application, and the connections between them, can be visualized. This is what the designer window might look like for the application shown in the previous snapshot:

qvdesignergui.png

With this design interface the application can be executed normally, visualizing the results of the processing blocks in the different graphical blocks created. If desired, the user can stop the execution and modify the block structure by adding any of the blocks provided by QVision, deleting blocks, or adding and deleting data links between them. For further information about this tool, see the section The Designer GUI.

Finally, block-oriented development also lets the user of a QVision application set, on the command line, initial values for the parameters of the different algorithms contained in the processing blocks. This makes it easy to recreate test conditions for the algorithms implemented by the application. Section Command line parameters reviews this topic in detail.
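As a rough illustration of this mechanism, the sketch below overrides declared parameter defaults from command line arguments. Everything in it is an assumption made for the example: the ParameterSet class, its methods, and the --name=value argument syntax are hypothetical and are not taken from the QVision API. The real syntax is described in the Command line parameters section.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <map>
#include <string>
#include <vector>

// Hypothetical container for the named numeric parameters of a block.
class ParameterSet {
public:
    // Registers a parameter together with its default value.
    void declare(const std::string& name, double defaultValue) {
        assert(!name.empty());
        values_[name] = defaultValue;
    }

    // Throws std::out_of_range if the parameter was never declared.
    double get(const std::string& name) const { return values_.at(name); }

    // Applies arguments of the assumed form --name=value. Unknown names,
    // malformed arguments and non-numeric values are ignored, so the
    // declared defaults stay in place. Returns how many were applied.
    int applyCommandLine(const std::vector<std::string>& args) {
        int applied = 0;
        for (const std::string& arg : args) {
            if (arg.compare(0, 2, "--") != 0) continue;
            const std::size_t eq = arg.find('=');
            if (eq == std::string::npos) continue;
            auto it = values_.find(arg.substr(2, eq - 2));
            if (it == values_.end()) continue;
            try {
                it->second = std::stod(arg.substr(eq + 1));
                ++applied;
            } catch (const std::exception&) {
                // Not a number: keep the current value.
            }
        }
        return applied;
    }

private:
    std::map<std::string, double> values_;
};
```

Running the same binary with the same arguments then reproduces the same initial algorithm settings, which is what makes test conditions easy to recreate.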




QVision framework. PARP research group, copyright 2007, 2008.