Intelligent applications are all the rage, and I for one was really surprised to see how easily and quickly some basic recognition can be developed using the Windows Runtime. It took me just a few minutes to get a webcam to recognize whether one or more faces are in frame in a Universal Windows App (UWA).
In the following XAML we have a simple Button and a CaptureElement. The CaptureElement is the means by which we stream from a webcam onto a page of the UWA. We are then going to use the FaceDetector class from the Windows.Media.FaceAnalysis namespace to count how many faces are in the image. So here goes with the XAML:
    <CaptureElement x:Name="CameraCaptureElement" HorizontalAlignment="Left" Height="600" Margin="0,0,-240,0" VerticalAlignment="Top" Width="600" />
    <Button x:Name="button" Content="Button" HorizontalAlignment="Left" Margin="145,605,0,0" VerticalAlignment="Top" Click="button_Click" />
In the page's OnLoad method I complete the following steps:
- Find all the video capture devices associated with the Windows device.
- Select the front-facing video device.
- Use Windows.Media.Capture.MediaCapture to initialize the camera and assign it to the CaptureElement.
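    // Needs: using System.Linq; using Windows.Devices.Enumeration; using Windows.Media.Capture;
    // The app manifest must declare the webcam capability (and microphone, because the default capture mode includes audio)

    // Find all the video capture devices, and prefer the one that is front facing
    var videoDevices = await DeviceInformation.FindAllAsync(DeviceClass.VideoCapture);
    var frontCamera = videoDevices.FirstOrDefault(item => item.EnclosureLocation != null
        && item.EnclosureLocation.Panel == Windows.Devices.Enumeration.Panel.Front);

    // External webcams may report no enclosure location, so fall back to the first device found
    var camera = frontCamera ?? videoDevices.FirstOrDefault();
    if (camera == null) return; // no camera attached

    // Initialize the selected camera
    MediaCapture mediaCaptureMgr = new MediaCapture();
    await mediaCaptureMgr.InitializeAsync(new MediaCaptureInitializationSettings { VideoDeviceId = camera.Id });

    // Assign the camera to the CaptureElement on the page and start the preview
    CameraCaptureElement.Source = mediaCaptureMgr;
    await mediaCaptureMgr.StartPreviewAsync();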
Then we can use a simple OnClick event to capture an image from the CaptureElement. Note the steps:
- Use the MediaCapture class to capture a photo from the webcam into an in-memory stream.
- Decode the stream into a SoftwareBitmap and ensure that the SoftwareBitmap is in a pixel format the detector supports.
- Use the FaceDetector class to do its thing…
    // Needs: using System.Linq; using Windows.Graphics.Imaging; using Windows.Media.Capture; using Windows.Media.FaceAnalysis;
    //        using Windows.Media.MediaProperties; using Windows.Storage.Streams; using Windows.UI.Popups;

    // Grab a photo from the camera behind the CaptureElement into a stream
    InMemoryRandomAccessStream stream = new InMemoryRandomAccessStream();
    MediaCapture mediaCaptureMgr = CameraCaptureElement.Source;
    await mediaCaptureMgr.CapturePhotoToStreamAsync(ImageEncodingProperties.CreateJpeg(), stream);
    stream.Seek(0); // rewind to the start of the JPEG data before decoding

    // Get the SoftwareBitmap from the stream and convert it to a supported format
    BitmapDecoder decoder = await BitmapDecoder.CreateAsync(stream);
    SoftwareBitmap softwareBitmap = await decoder.GetSoftwareBitmapAsync(BitmapPixelFormat.Bgra8, BitmapAlphaMode.Straight);
    IReadOnlyList<BitmapPixelFormat> supportedBitmapPixelFormats = FaceDetector.GetSupportedBitmapPixelFormats();
    SoftwareBitmap convertedBitmap = SoftwareBitmap.Convert(softwareBitmap, supportedBitmapPixelFormats.First());

    // Detect the number of faces
    FaceDetector faceDetect = await FaceDetector.CreateAsync();
    IList<DetectedFace> faces = await faceDetect.DetectFacesAsync(convertedBitmap);
    await new MessageDialog(string.Format("{0} face(s) detected.", faces.Count)).ShowAsync();
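Each DetectedFace also carries a FaceBox (a BitmapBounds in pixels of the bitmap that was analyzed), so the dialog could report where the faces are as well as how many there are. The following is only a minimal sketch of that idea: the FaceReport class and its Summarize methods are hypothetical helpers of mine, not part of the original app or of WinRT.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Windows.Graphics.Imaging;
using Windows.Media.FaceAnalysis;

// Hypothetical helper (an assumption for illustration, not a WinRT type).
public static class FaceReport
{
    // Builds a short summary from the face bounding boxes: a count line
    // followed by one line per face with its size and top-left corner.
    public static string Summarize(IReadOnlyList<BitmapBounds> boxes)
    {
        var text = new StringBuilder();
        text.Append(boxes.Count == 1 ? "1 face detected." : boxes.Count + " faces detected.");
        for (int i = 0; i < boxes.Count; i++)
        {
            BitmapBounds b = boxes[i];
            text.Append($"\nFace {i + 1}: {b.Width}x{b.Height} at ({b.X}, {b.Y})");
        }
        return text.ToString();
    }

    // Convenience overload for the list returned by DetectFacesAsync.
    public static string Summarize(IList<DetectedFace> faces)
    {
        return Summarize(faces.Select(f => f.FaceBox).ToList());
    }
}
```

With this in place, the last line of the click handler could become `await new MessageDialog(FaceReport.Summarize(faces)).ShowAsync();`. Keeping the formatting in a method that takes plain BitmapBounds values makes it easy to check without a camera attached.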
The native Face Analysis in WinRT is rather elementary (no expression analysis), but there are a couple of other interesting (albeit limited) namespaces in the same vein. Windows.Media.SpeechSynthesis supports synthesized speech (voice) by converting text strings to an audio stream (text-to-speech), and Windows.Media.SpeechRecognition provides speech recognition for command and control within Windows Runtime apps. Next I want to take a look at Microsoft Cognitive Services (cloud).