Identification and Recognition
- Does your surveillance project need to identify or recognize persons?
- Do you need to identify objects such as license plates?
If so, you will not only need to capture the scene at a sufficient resolution, but also take into account factors such as illumination, camera positioning and motion.
The focus of this tutorial is the cameras in a surveillance project that provide the close-up footage required for identification and/or recognition. We’ll also show you how you can meet these requirements in your project.
The traditional way of defining the resolution requirements for an analog CCTV system was to specify the height of the observed object as a percentage of the vertical image. Different surveillance objectives required different percentages.
For example, detecting the presence of a person in a scene might require that the height of the person occupies 20% of the view. Recognizing a person, however, might require that the person occupies 40%, and identifying a person might require 140% or more (i.e. the person is taller than the image).
Figure 1: The height of the same person occupying 20%, 40% and 140% of the image.
Network video cameras today, however, offer a much wider range of resolutions, and using the percentage requirements from the analog world is no longer practical. Instead, we now use pixels when specifying resolution requirements. The table below shows how Axis defines these requirements.
|Operational requirement||Horizontal pixels/face||Px/cm||Px/inch|
|Identification (Challenging conditions)||80 px/face||5 px/cm||12.5 px/in|
|Identification (Good conditions)||40 px/face||2.5 px/cm||6.3 px/in|
|Recognition||20 px/face||1.25 px/cm||3.2 px/in|
|Detection||4 px/face||0.25 px/cm||0.6 px/in|
Table 1: The Axis definition of the requirements for detection, recognition and identification.
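The relationship between the columns of Table 1 can be sketched in a few lines of Python. This is only an illustration: the face width of 16 cm is an assumption inferred from the table itself (80 px/face corresponds to 5 px/cm), and the function name is hypothetical. The px/inch figures in the table appear to be rounded, so the computed values may differ slightly.

```python
# Assumption: a face is about 16 cm wide, as implied by 80 px/face = 5 px/cm.
FACE_WIDTH_CM = 16.0
CM_PER_INCH = 2.54

# Horizontal pixels across a face for each operational requirement (Table 1).
PX_PER_FACE = {
    "identification (challenging conditions)": 80,
    "identification (good conditions)": 40,
    "recognition": 20,
    "detection": 4,
}


def pixel_density(px_per_face, face_width_cm=FACE_WIDTH_CM):
    """Return (px/cm, px/m, px/inch) for a given pixel count across a face."""
    px_per_cm = px_per_face / face_width_cm
    return px_per_cm, px_per_cm * 100, px_per_cm * CM_PER_INCH


for name, px in PX_PER_FACE.items():
    per_cm, per_m, per_in = pixel_density(px)
    print(f"{name}: {per_cm:.2f} px/cm, {per_m:.0f} px/m, {per_in:.1f} px/in")
```

Working in pixels per meter is convenient later on, because it can be compared directly with the width of the scene.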
For a detailed discussion on the resolution requirements for identification, recognition and detection, see the Perfect pixel count tutorial.
Other criteria will be valid for other objects, such as license plates, where typical recommendations are that the height of the letters and digits should be 15 pixels (corresponding to approx. 200 pixels/m) to ensure legibility.
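As a rough check of the license plate figures, the sketch below relates a minimum pixel count to a pixel density. The function names are hypothetical, and the character height of 7.5 cm is not stated in the recommendation; it is simply what 15 pixels at 200 pixels/m implies.

```python
def implied_object_size_m(min_px, density_px_per_m):
    """Object size implied by a pixel count at a given pixel density."""
    return min_px / density_px_per_m


def required_density_px_per_m(min_px, object_size_m):
    """Pixel density needed for an object of known size to cover min_px pixels."""
    return min_px / object_size_m


# 15 px at 200 px/m implies characters about 0.075 m (7.5 cm) tall.
char_height_m = implied_object_size_m(15, 200)
print(f"Implied character height: {char_height_m * 100:.1f} cm")

# Smaller characters need a proportionally higher pixel density.
print(f"Density for 5 cm characters: {required_density_px_per_m(15, 0.05):.0f} px/m")
```

If the plates in your region use a different character height, the required density changes accordingly.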
When determining the resolution needed in order to use camera footage as evidence in court, it is also important to take into account legal and regulatory requirements.
The resolution of a captured scene is determined by the camera resolution and the size of the scene. Select a camera and lens that will allow the field of view to match the scene width at the desired distance from the camera.
|Camera’s horizontal resolution||Focal length||Maximum distance||Scene width|
|2592 pixels||2.8 – 8 mm||9 m||5.2 m|
|1280 pixels||3.3 – 12 mm||6 m||2.6 m|
|1920 pixels||5.1 – 51 mm||41 m||3.8 m|
|736 pixels||3.3 – 119 mm||50 m||1.5 m|
|1280 pixels||4.4 – 132 mm||67 m||2.6 m|
Table 2: Examples of maximum distances for identification (500 px/m or 80 pixels/face).
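The figures in Table 2 can be approximated with a simple pinhole camera model, in which the scene width relates to the distance as the sensor width relates to the focal length. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the sensor width of 4.8 mm is an assumed value that does not come from the table. Use the sensor dimensions from your camera's datasheet, or the Axis tools mentioned below, for real designs.

```python
def max_scene_width_m(h_res_px, density_px_per_m=500):
    """Widest scene that still meets the required pixel density."""
    return h_res_px / density_px_per_m


def max_distance_m(h_res_px, focal_length_mm, sensor_width_mm,
                   density_px_per_m=500):
    """Largest distance at which the pixel density is still met.

    Pinhole model: scene_width / distance = sensor_width / focal_length.
    """
    scene_width = max_scene_width_m(h_res_px, density_px_per_m)
    return scene_width * focal_length_mm / sensor_width_mm


# Scene widths at 500 px/m for the resolutions listed in Table 2.
for h_res in (2592, 1280, 1920, 736):
    print(f"{h_res} px -> {max_scene_width_m(h_res):.1f} m wide")

# Assumed 4.8 mm wide sensor, 1920 px, longest focal length of 51 mm.
print(f"{max_distance_m(1920, 51, 4.8):.1f} m")
```

The scene width depends only on the horizontal resolution and the required density, while the maximum distance also depends on the lens and sensor.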
Axis Lens Calculators and the Axis Product Selector are useful tools that help you find a suitable camera and focal length. For advanced users, a pixel and distance calculator spreadsheet is also available.
Because the maximum scene width at a given pixel density depends only on the camera's horizontal resolution, cameras with higher resolutions can cover larger areas. For example, if a scene 7 m wide requires five cameras at 4CIF resolution, these could be replaced by two cameras at 1080p HDTV resolution (1920 x 1080 pixels). A camera with higher resolution can also be used to give a better overview, by covering a larger scene while maintaining the required horizontal resolution.
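The camera count in this example can be reproduced with a short calculation. The sketch assumes the identification density of 500 px/m from Table 2 and a 4CIF width of 704 pixels. The function name is hypothetical, and the calculation ignores any overlap between adjacent cameras.

```python
import math


def cameras_needed(scene_width_m, h_res_px, density_px_per_m=500):
    """Number of side-by-side cameras needed to cover a scene width."""
    total_px = scene_width_m * density_px_per_m
    return math.ceil(total_px / h_res_px)


print(cameras_needed(7, 704))   # 4CIF  -> 5
print(cameras_needed(7, 1920))  # 1080p -> 2
```

In practice you would normally allow some overlap between neighboring fields of view, which may increase the count.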
Figure 2: A comparison of various resolutions.
1: 4 CIF (704×576)
2: SVGA (800×600)
3: HDTV 720p (1280×720)
4: HDTV 1080p (1920×1080)
5: 3 MP (2048×1536)
6: 5 MP (2592×1944)
7: 4K (3840×2160)
The greater the depth of field, the larger the area in which persons or objects are in focus. The chances of identification increase with a larger depth of field, which is determined by the iris opening, the focal length and the distance to the camera.
The depth of field increases as the iris opening gets smaller, which means that good lighting can help increase the depth of field. The P-Iris feature in some Axis cameras will adjust the iris to optimize the depth of field for various lighting conditions.
You can learn more about the P-Iris in this white paper:
Using a shorter focal length will also increase the depth of field. Cameras with higher resolutions can capture scenes using shorter focal lengths, whilst still maintaining resolution requirements.
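To see how the iris opening and focal length affect the depth of field, the sketch below uses the standard thin-lens approximation based on the hyperfocal distance. It is an illustration only: the function names are hypothetical, and the circle of confusion of 0.005 mm is an assumed value that should be adapted to the sensor's pixel size.

```python
import math


def hyperfocal_mm(focal_mm, f_number, coc_mm):
    """Hyperfocal distance H = f^2 / (N * c) + f, in millimeters."""
    return focal_mm ** 2 / (f_number * coc_mm) + focal_mm


def depth_of_field_m(focal_mm, f_number, focus_distance_m, coc_mm=0.005):
    """Return (near limit, far limit) of acceptable focus, in meters."""
    s = focus_distance_m * 1000.0  # focus distance in mm
    h = hyperfocal_mm(focal_mm, f_number, coc_mm)
    near = s * (h - focal_mm) / (h + s - 2 * focal_mm)
    # At or beyond the hyperfocal distance everything to infinity is in focus.
    far = math.inf if s >= h else s * (h - focal_mm) / (h - s)
    return near / 1000.0, far / 1000.0


# Same 8 mm lens focused at 5 m, with a wide and a narrower iris opening.
for f_number in (2, 4):
    near, far = depth_of_field_m(8, f_number, 5)
    print(f"f/{f_number}: in focus from {near:.1f} m to {far:.1f} m")
```

A higher f-number (smaller iris opening) or a shorter focal length moves the near limit closer and the far limit further away, which matches the behavior described above.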
Most lenses exhibit some degree of distortion, often in the form of barrel distortion. This is due to the lens magnification being smaller at the edges of the field of view than at the center of the image. The effect is that objects near the edge appear closer to the center as compared to an undistorted image. Objects of the same size will cover fewer pixels when close to the edge, compared to what they would cover if they were closer to the center. This means that objects close to the edge of the field of view need to be closer to the camera in order to fulfill the requirements for minimum resolution.
The effect of barrel distortion is often much more pronounced at short focal lengths, making wide angle lenses less suitable for identification purposes.
Illumination greatly affects the ability to identify persons or objects. Shadows, high contrast and backlit scenes all make identification and recognition more difficult.
At distances of 15-20 m you will need a 50 mm lens to ensure that a face covers approximately 80 pixels on the horizontal plane. However, positive identification cannot always be guaranteed at the 100-150 lux illumination typically found in an office corridor or subway station. Camera features such as WDR Mode and sensors that perform well in low-light situations can help, but the best results will be obtained if these features are combined with additional lighting and the relocation of cameras to avoid backlit situations.
In outdoor surveillance it is important to remember that sunlight varies in intensity and direction over the course of a day. Weather conditions will also affect lighting and reflection. Snow, for example, will intensify the reflected light, while rain and wet tarmac will absorb light. For identification of a human face, balanced illumination in the region of 300-500 lux is recommended. For license plate identification, 150 lux may be sufficient.
Figure 3: Examples of how light conditions affect identification. Image A) has a lux level of 1600 with a favorable light direction. B) has 350 lux, with backlight. C) has 7 lux, with a favorable light direction. D) has a lux level of 1.5.
In low light conditions, camera sensors produce significant amounts of noise that can affect the image, making identification more difficult. There is always a trade-off between noise, shutter speed, and depth of field at any given level of illumination, where better lighting conditions allow you to improve all of these.
Color is often an important factor for identification. To ensure color fidelity, the camera’s white balance should be adjusted to suit the color temperature of the light source(s) used. In outdoor surveillance, the color temperature will change over the course of the day, requiring automatic white balancing to maintain color fidelity.
Camera positioning is critical for successful identification. This is not only to avoid difficult lighting situations, but also to ensure that persons or objects are captured at a favorable angle. A bird's-eye perspective from a camera placed high above the ground will cause some degree of distortion, making it difficult to identify persons or objects.
Figure 4: An image with good lighting, both in intensity and direction. The camera is placed at the same level as the people, and the lens provides both focus and depth of field.
The camera should be firmly fixed in order to minimize blur caused by camera movement. This is of particular importance for PTZ cameras, where maneuvering the camera may cause vibrations that affect image quality. Stability can also be a challenge when the camera is mounted on a tall pole, and uses a zoom lens with a long focal length. In this situation, even small vibrations will translate to large movements in the resulting image.
Your system design must also consider the effects of motion in the scene. For identification purposes, a minimum frame rate of 5-8 frames per second is often recommended. Your surveillance objectives may require higher frame rates, for example to get a clearer picture of a series of events. If the scene being monitored includes persons or objects moving at high speeds, or close to the camera, you will probably want to increase the frame rate to ensure that the camera does not miss any of the action.
Furthermore, in order to capture sharp footage of fast-moving persons or objects, you will need to use fast shutter speeds. Using cameras that support progressive scan eliminates the blur that affects moving objects when using interlaced video.
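The link between shutter speed and sharpness can be estimated by counting how many pixels an object moves across during one exposure. The sketch below is a simplification with hypothetical function names: it assumes the object moves parallel to the image plane and that the pixel density at the object is known, for example 500 px/m for identification.

```python
def motion_blur_px(speed_m_per_s, exposure_s, density_px_per_m=500):
    """Pixels an object travels across the image during one exposure."""
    return speed_m_per_s * exposure_s * density_px_per_m


def max_exposure_s(speed_m_per_s, max_blur_px, density_px_per_m=500):
    """Longest exposure time that keeps the blur within max_blur_px."""
    return max_blur_px / (speed_m_per_s * density_px_per_m)


# A person walking at 1.5 m/s, captured with a 1/30 s exposure.
print(f"Blur: {motion_blur_px(1.5, 1 / 30):.0f} px")

# Exposure needed to keep the blur to about 1 px for the same person.
print(f"Max exposure: 1/{1 / max_exposure_s(1.5, 1):.0f} s")
```

Faster subjects or higher pixel densities call for shorter exposures, which in turn require more light or a larger iris opening.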
Compression can greatly affect the usability of recorded materials for identification and recognition. High compression ratios will introduce blur or pixelation, which makes identification difficult. If the compression algorithm uses a bit rate limit, the compression might increase when there is motion, making otherwise clear footage unusable. With variable bit rates, on the other hand, the compression level remains unchanged, but bandwidth usage will increase when there is motion.
To ensure that identification and recognition goals are met, it is essential to test the installed cameras in realistic conditions. Make sure you use varying levels of lighting, and review the recorded footage in order to verify that you get the required image quality.
Typical problems to be aware of:
- Camera placement or lens selection that distorts facial features
- Difficult lighting conditions that create shaded areas or whiteout effects
- Compression settings that cause image blur or pixelation
- Motion blur caused by slow shutter speeds or low frame rates
- Excessive noise in low-light situations
- Overlay text appearing in a crucial part of the scene
The pixel counter feature available in some Axis cameras lets you draw a rectangle in the image around an area of interest. The camera reports the pixel dimensions of the rectangle, making it easy to verify that the camera installation fulfills requirements.
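Once you have read a width in pixels from the pixel counter, it can be compared with the thresholds in Table 1. The sketch below shows one way to do this. The function name is hypothetical, the measured width is assumed to be entered by hand, and the code does not call any camera API.

```python
# Minimum horizontal pixels across a face (Table 1), highest requirement first.
THRESHOLDS = [
    (80, "identification (challenging conditions)"),
    (40, "identification (good conditions)"),
    (20, "recognition"),
    (4, "detection"),
]


def highest_requirement_met(face_width_px):
    """Return the most demanding requirement met, or None if none is met."""
    for min_px, name in THRESHOLDS:
        if face_width_px >= min_px:
            return name
    return None


# Example: the rectangle drawn around a face measures 52 pixels across.
print(highest_requirement_met(52))  # identification (good conditions)
```

Meeting the pixel threshold is necessary but not sufficient, since lighting, focus and compression also affect whether identification is possible.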
Figure 5: The pixel counter feature in action.
There are also numerous test targets that can be placed in front of cameras to help determine if resolution requirements are met. One such target is available from the link below:
Figure 6: An example of a test target.
For more advanced calibration, a Rotakin device (rotating man) may be used, which simulates object motion and resulting image blur.
The ability to identify or recognize persons or objects depends on a number of factors. Some of the more important factors are:
- Camera resolution and scene size
- Lighting conditions
- Camera position
Surveillance objectives determine the number of pixels a person or object needs to occupy in the captured footage. Axis recommends 80 pixels or more for identification in challenging conditions. For license plates, text should be 15 pixels vertically. Check legal requirements for footage intended as evidence.
The camera resolution determines the maximum size of the captured scene. The more pixels, the larger the scene covered. The camera’s depth of field is important in order to allow identification within a wider range.
Axis Lens Calculator is useful when selecting cameras to fulfill requirements for identification and recognition.
Identification might not be possible in challenging lighting conditions even if resolution is high enough. Highly sensitive sensors and features such as wide dynamic range can help, but also consider better lighting and positioning of the camera to avoid backlit situations.
Camera positioning is important in order to get distortion-free images.
Select an appropriate frame rate and shutter speed depending on the movements of your surveillance subjects.
Test your system under operational conditions to ensure that the installation meets your surveillance objectives. Review recorded footage to ensure that image quality has not been compromised by compression and that the quality is sufficient for your requirements.