Build a Hardware-based Face Recognition System for $150 with the Nvidia Jetson Nano and Python

It includes Ubuntu Linux 18.04 with Python 3.6 and OpenCV pre-installed, which saves a lot of time.

Here’s how to get the Jetson Nano software onto your SD card:

1. Download the Jetson Nano Developer Kit SD Card Image from Nvidia.
2. Download Etcher, the program that writes the Jetson software image to your SD card.
3. Run Etcher and use it to write the Jetson Nano Developer Kit SD Card Image that you downloaded to your SD card. This takes about 20 minutes or so.

At this point, you have an SD card loaded with the default Jetson Nano software.

Time to unbox the rest of the hardware!

Plugging Everything In

First, take your Jetson Nano out of the box. All that is inside is the Jetson Nano board and a little paper tray that you can use to prop up the board. There’s no manual, cords, or anything else inside.

The first step is inserting the microSD card. However, the SD card slot is incredibly well hidden. You can find it on the rear side, under the bottom of the heatsink.

Next, you need to plug in your Raspberry Pi v2.x camera module. It connects with a ribbon cable. Find the ribbon cable slot on the Jetson, pop up the connector, insert the cable, and pop it back closed. Make sure the metal contacts on the ribbon cable are facing inwards toward the heatsink.

Now, plug in everything else:

- Plug a mouse and keyboard into the USB ports.
- Plug in a monitor using an HDMI cable.
- Plug an ethernet cable into the network port and make sure the other end is plugged into your router.
- Finally, plug in the MicroUSB power cord.

You’ll end up with something that looks like this:

The Jetson Nano will automatically boot up when you plug in the power cable. You should see a Linux setup screen appear on your monitor.

First Boot and User Account Configuration

The first time the Jetson Nano boots, you have to go through the standard Ubuntu Linux new-user process. You select the type of keyboard you are using, create a user account, and pick a password. When you are done, you’ll see a blank Ubuntu Linux desktop.

At this point, Python 3.6 and OpenCV are already installed. You can open up a terminal window and start running Python programs right now, just like on any other computer. But there are a few more libraries that we need to install before we can run our doorbell camera app.

Installing Required Python Libraries

To build our face recognition system, we need to install several Python libraries. While the Jetson Nano has a lot of great stuff pre-installed, there are some odd omissions. For example, OpenCV is installed with Python bindings, but pip and numpy aren’t installed, and those are required to do anything with OpenCV. Let’s fix that.

From the Jetson Nano desktop, open up a Terminal window and run the following commands. Any time it asks for your password, type in the same password that you entered when you created your user account:

```shell
sudo apt-get update
sudo apt-get install python3-pip cmake libopenblas-dev liblapack-dev libjpeg-dev
```

First, we are updating apt, which is the standard Linux software installation tool that we’ll use to install everything else. Next, we are installing some basic libraries with apt that we will need later to compile numpy and dlib.

Before we go any further, we need to create a swapfile.

The Jetson Nano only has 4GB of RAM, which won’t be enough to compile dlib.

To work around this, we’ll set up a swapfile which lets us use disk space as extra RAM.

Luckily, there is an easy way to set up a swapfile on the Jetson Nano.

Just run these two commands:

```shell
git clone https://github.com/JetsonHacksNano/installSwapfile
./installSwapfile/installSwapfile.sh
```

Note: This shortcut is thanks to the JetsonHacks website. They are great!

At this point, you need to reboot the system to make sure the swapfile is running. If you skip this, the next step will fail. You can reboot from the menu at the top right of the desktop.

When you are logged back in, open up a fresh Terminal window and we can continue.

First, let’s install numpy, a Python library that is used for matrix math calculations:

```shell
pip3 install numpy
```

This command will take about 15 minutes since it has to compile numpy from scratch. Just wait until it finishes, and don’t worry if it seems to freeze for a while.

Now we are ready to install dlib, a deep learning library created by Davis King that does the heavy lifting for the face_recognition library.

However, there is currently a bug in Nvidia’s own CUDA libraries for the Jetson Nano that keeps it from working correctly.

To work around the bug, we’ll have to download dlib, edit a line of code, and re-compile it.

But don’t worry, it’s no big deal.

In Terminal, run these commands:

```shell
wget http://dlib.net/files/dlib-19.17.tar.bz2
tar jxvf dlib-19.17.tar.bz2
```

That will download and uncompress the source code for dlib.

Before we compile it, we need to comment out a line.

Run this command:

```shell
gedit dlib/cuda/cudnn_dlibapi.cpp
```

This will open up the file that we need to edit in a text editor. Search the file for the following line of code (which should be around line 850):

```cpp
forward_algo = forward_best_algo;
```

And comment it out by adding two slashes in front of it, so it looks like this:

```cpp
//forward_algo = forward_best_algo;
```

Now save the file, close the editor, and go back to the Terminal window.

Next, run these commands to compile and install dlib:

```shell
cd dlib
sudo python3 setup.py install
```

This will take around 30–60 minutes to finish, and your Jetson Nano might get hot, but just let it run.

Finally, we need to install the face_recognition Python library. Do that with this command:

```shell
sudo pip3 install face_recognition
```

Now your Jetson Nano is ready to do face recognition with full CUDA GPU acceleration.

On to the fun part!

Running the Face Recognition Doorbell Camera Demo App

The face_recognition library is a Python library I wrote that makes it super simple to do face recognition. It lets you detect faces, turn each detected face into a unique face encoding that represents the face, and then compare face encodings to see if they are likely the same person, all with just a couple of lines of code.

Using that library, I put together a doorbell camera application that can recognize people who walk up to your front door and track each time the person comes back. Here’s what it looks like when you run it:

To get started, let’s download the code.

I’ve posted the full code here with comments, but here’s an easier way to download it onto your Jetson Nano from the command line:

```shell
wget -O doorcam.py tiny.cc/doorcam
```

Then you can run the code and try it out:

```shell
python3 doorcam.py
```

You’ll see a video window pop up on your desktop.

Whenever a new person steps in front of the camera, it will register their face and start tracking how long they have been near your door.

If the same person leaves and comes back more than 5 minutes later, it will register a new visit and track them again.

You can hit ‘q’ on your keyboard at any time to exit.

The app will automatically save information about everyone it sees to a file called known_faces.dat. When you run the program again, it will use that data to remember previous visitors. If you want to clear out the list of known faces, just quit the program and delete that file.

Doorbell Camera Python Code Walkthrough

Want to know how the code works? Let’s step through it.

The code starts off by importing the libraries we are going to be using.

The most important ones are OpenCV (called cv2 in Python), which we’ll use to read images from the camera, and face_recognition, which we’ll use to detect and compare faces.

```python
import face_recognition
import cv2
from datetime import datetime, timedelta
import numpy as np
import platform
import pickle
```

Next, we are going to create some variables to store data about the people who walk in front of our camera. These variables will act as a simple database of known visitors:

```python
known_face_encodings = []
known_face_metadata = []
```

This application is just a demo, so we are storing our known faces in a normal Python list. In a real-world application that deals with more faces, you might want to use a real database instead, but I wanted to keep this demo simple.

Next, we have a function to save and load the known face data. Here’s the save function:

```python
def save_known_faces():
    with open("known_faces.dat", "wb") as face_data_file:
        face_data = [known_face_encodings, known_face_metadata]
        pickle.dump(face_data, face_data_file)
        print("Known faces backed up to disk.")
```

This writes the known faces to disk using Python’s built-in pickle functionality. The data is loaded back the same way, but I didn’t show that here.

I wanted this program to run on a desktop computer or on a Jetson Nano without any changes, so I added a simple function to detect which platform it is currently running on:

```python
def running_on_jetson_nano():
    return platform.machine() == "aarch64"
```

This is needed because the way we access the camera is different on each platform. On a laptop, we can just pass a camera number to OpenCV and it will pull images from the camera. But on the Jetson Nano, we have to use gstreamer to stream images from the camera, which requires some custom code. By detecting the current platform, we can use the correct method of accessing the camera in each case.
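The gstreamer helper itself isn’t shown in this walkthrough. As a sketch of the idea, a function like this builds a pipeline string that pulls frames from the Pi camera module through the Nano’s hardware video path and hands them to OpenCV (the exact pipeline string and parameter defaults in doorcam.py may differ; the widths, heights, and framerate here are illustrative):

```python
def get_jetson_gstreamer_source(capture_width=1280, capture_height=720,
                                display_width=1280, display_height=720,
                                framerate=60, flip_method=0):
    # Build a gstreamer pipeline string: capture from the camera module
    # (nvarguscamerasrc), convert/resize on the GPU (nvvidconv), and
    # deliver BGR frames to OpenCV via appsink.
    return (
        f'nvarguscamerasrc ! video/x-raw(memory:NVMM), '
        f'width=(int){capture_width}, height=(int){capture_height}, '
        f'format=(string)NV12, framerate=(fraction){framerate}/1 ! '
        f'nvvidconv flip-method={flip_method} ! '
        f'video/x-raw, width=(int){display_width}, height=(int){display_height}, '
        f'format=(string)BGRx ! '
        'videoconvert ! video/x-raw, format=(string)BGR ! appsink'
    )
```

This string is what gets passed to cv2.VideoCapture along with cv2.CAP_GSTREAMER in the main loop below.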

That’s the only customization needed to make this program run on the Jetson Nano instead of a normal computer!

Whenever our program detects a new face, we’ll call a function to add it to our known face database:

```python
def register_new_face(face_encoding, face_image):
    known_face_encodings.append(face_encoding)
    known_face_metadata.append({
        "first_seen": datetime.now(),
        "first_seen_this_interaction": datetime.now(),
        "last_seen": datetime.now(),
        "seen_count": 1,
        "seen_frames": 1,
        "face_image": face_image,
    })
```

First, we are storing the face encoding that represents the face in a list. Then, we are storing a matching dictionary of data about the face in a second list. We’ll use this to track the time we first saw the person, how long they’ve been hanging around the camera recently, how many times they have visited our house, and a small image of their face.

We also need a helper function to check if an unknown face is already in our face database or not:

```python
def lookup_known_face(face_encoding):
    metadata = None

    if len(known_face_encodings) == 0:
        return metadata

    face_distances = face_recognition.face_distance(
        known_face_encodings, face_encoding
    )
    best_match_index = np.argmin(face_distances)

    if face_distances[best_match_index] < 0.65:
        metadata = known_face_metadata[best_match_index]
        metadata["last_seen"] = datetime.now()
        metadata["seen_frames"] += 1

        if datetime.now() - metadata["first_seen_this_interaction"] > timedelta(minutes=5):
            metadata["first_seen_this_interaction"] = datetime.now()
            metadata["seen_count"] += 1

    return metadata
```

We are doing a few important things here. First, using the face_recognition library, we check how similar the unknown face is to all previous visitors.

The face_distance() function gives us a numerical measurement of similarity between the unknown face and all known faces. The smaller the number, the more similar the faces.
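Under the hood, face_distance() is just the Euclidean distance between the face encodings. Here is that idea in miniature, with a stand-in implementation and made-up toy vectors (real encodings are 128-dimensional):

```python
import numpy as np

def face_distance(known_encodings, face_to_compare):
    # Mirrors what face_recognition.face_distance() computes: the
    # Euclidean distance between each known encoding and the new one.
    return np.linalg.norm(np.asarray(known_encodings) - face_to_compare, axis=1)

# Toy 3-dimensional "encodings" for illustration only
known = [np.array([0.0, 0.0, 0.0]), np.array([1.0, 1.0, 1.0])]
new_face = np.array([0.1, 0.0, 0.0])

distances = face_distance(known, new_face)
best = np.argmin(distances)
print(best, distances[best] < 0.65)  # the first known face, and it's a match
```

The 0.65 cutoff in lookup_known_face() is exactly this kind of threshold check on the smallest distance.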

If the face is very similar to one of our known visitors, we assume they are a repeat visitor.

In that case, we update their “last seen” time and increment the number of times we have seen them in a frame of video.

Finally, if this person has been seen in front of the camera in the last five minutes, we assume they are still here as part of the same visit.

Otherwise, we assume that this is a new visit to our house, so we’ll reset the time stamp tracking their most recent visit.
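That same-visit rule can be sketched in isolation. Here is_new_visit is a hypothetical helper, not a function from doorcam.py, but it implements the identical five-minute test:

```python
from datetime import datetime, timedelta

def is_new_visit(first_seen_this_interaction, now=None):
    # A visitor counts as starting a *new* visit if their interaction
    # timestamp was last reset more than five minutes ago.
    now = now or datetime.now()
    return now - first_seen_this_interaction > timedelta(minutes=5)

now = datetime(2019, 5, 1, 12, 0, 0)
assert not is_new_visit(now - timedelta(minutes=2), now)  # still the same visit
assert is_new_visit(now - timedelta(minutes=6), now)      # a fresh visit
```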

The rest of the program is the main loop: an endless loop where we fetch a frame of video, look for faces in the image, and process each face we see. It is the heart of the program.

Let’s check it out:

```python
def main_loop():
    if running_on_jetson_nano():
        video_capture = cv2.VideoCapture(
            get_jetson_gstreamer_source(), cv2.CAP_GSTREAMER
        )
    else:
        video_capture = cv2.VideoCapture(0)
```

The first step is to get access to the camera using whichever method is appropriate for our computer hardware. But whether we are running on a normal computer or a Jetson Nano, the video_capture object will let us grab frames of video from the camera.

So let’s start grabbing frames of video:

```python
while True:
    # Grab a single frame of video
    ret, frame = video_capture.read()

    # Resize frame of video to 1/4 size
    small_frame = cv2.resize(frame, (0, 0), fx=0.25, fy=0.25)

    # Convert the image from BGR color to RGB
    rgb_small_frame = small_frame[:, :, ::-1]
```

Each time we grab a frame of video, we’ll also shrink it to 1/4 size. This will make the face recognition process run faster, at the expense of only detecting larger faces in the image. But since we are building a doorbell camera that only recognizes people near the camera, that shouldn’t be a problem.

We also have to deal with the fact that OpenCV pulls images from the camera with each pixel stored as a Blue-Green-Red value instead of the standard order of Red-Green-Blue.

Before we can run face recognition on the image, we need to convert the image format.
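That conversion is the [:, :, ::-1] slice in the loop above: it simply reverses the last (channel) axis of the image array. A tiny numpy demonstration with made-up pixels:

```python
import numpy as np

# A 1x2 "image" in BGR order: one pure-blue pixel, one pure-red pixel
bgr = np.array([[[255, 0, 0],      # blue in BGR
                 [0, 0, 255]]])    # red in BGR

rgb = bgr[:, :, ::-1]  # reverse the channel axis: BGR becomes RGB

print(rgb[0, 0])  # [  0   0 255] - blue, now in RGB order
print(rgb[0, 1])  # [255   0   0] - red, now in RGB order
```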

Now we can detect all the faces in the image and convert each face into a face encoding.

That only takes two lines of code:

```python
face_locations = face_recognition.face_locations(rgb_small_frame)
face_encodings = face_recognition.face_encodings(
    rgb_small_frame, face_locations
)
```

Next, we’ll loop through each detected face and decide if it is someone we have seen in the past or a brand-new visitor:

```python
for face_location, face_encoding in zip(face_locations, face_encodings):
    metadata = lookup_known_face(face_encoding)

    if metadata is not None:
        time_at_door = datetime.now() - metadata['first_seen_this_interaction']
        face_label = f"At door {int(time_at_door.total_seconds())}s"
    else:
        face_label = "New visitor!"

        # Grab the image of the face
        top, right, bottom, left = face_location
        face_image = small_frame[top:bottom, left:right]
        face_image = cv2.resize(face_image, (150, 150))

        # Add the new face to our known face data
        register_new_face(face_encoding, face_image)
```

If we have seen the person before, we’ll retrieve the metadata we’ve stored about their previous visits. If not, we’ll add them to our face database and grab the picture of their face from the video image to add to our database.

If not, we’ll add them to our face database and grab the picture of their face from the video image to add to our database.

Now that we have found all the people and figured out their identities, we can loop over the detected faces again, just to draw boxes around each face and add a label to each face:

```python
for (top, right, bottom, left), face_label in zip(face_locations, face_labels):
    # Scale back up face locations since the frame we detected in
    # was scaled to 1/4 size
    top *= 4
    right *= 4
    bottom *= 4
    left *= 4

    # Draw a box around the face
    cv2.rectangle(
        frame, (left, top), (right, bottom), (0, 0, 255), 2
    )

    # Draw a label with a description below the face
    cv2.rectangle(
        frame, (left, bottom - 35), (right, bottom), (0, 0, 255), cv2.FILLED
    )
    cv2.putText(
        frame, face_label, (left + 6, bottom - 6),
        cv2.FONT_HERSHEY_DUPLEX, 0.8, (255, 255, 255), 1
    )
```

I also wanted a running list of recent visitors drawn across the top of the screen, with the number of times they have visited your house: a graphical list of icons representing each person currently at your door.

To draw that, we need to loop over all known faces and see which ones have been in front of the camera recently. For each recent visitor, we’ll draw their face image on the screen and draw a visit count:

```python
number_of_recent_visitors = 0

for metadata in known_face_metadata:
    # If we have seen this person recently (within the last ten seconds)
    if datetime.now() - metadata["last_seen"] < timedelta(seconds=10):
        # Draw the known face image
        x_position = number_of_recent_visitors * 150
        frame[30:180, x_position:x_position + 150] = metadata["face_image"]
        number_of_recent_visitors += 1

        # Label the image with how many times they have visited
        visits = metadata['seen_count']
        visit_label = f"{visits} visits"
        if visits == 1:
            visit_label = "First visit"

        cv2.putText(
            frame, visit_label, (x_position + 10, 170),
            cv2.FONT_HERSHEY_DUPLEX, 0.6, (255, 255, 255), 1
        )
```

Finally, we can display the current frame of video on the screen with all of our annotations drawn on top of it:

```python
cv2.imshow('Video', frame)
```

And to make sure we don’t lose data if the program crashes, we’ll save our list of known faces to disk every 100 frames:

```python
if len(face_locations) > 0 and number_of_faces_since_save > 100:
    save_known_faces()
    number_of_faces_since_save = 0
else:
    number_of_faces_since_save += 1
```

And that’s it, aside from a line or two of clean-up code to turn off the camera when the program exits.

The start-up code for the program is at the very bottom of the program:

```python
if __name__ == "__main__":
    load_known_faces()
    main_loop()
```

All we are doing is loading the known faces (if any) and then starting the main loop, which reads from the camera forever and displays the results on the screen.

The whole program is only about 200 lines, but it does something pretty interesting — it detects visitors, identifies them and tracks every single time they have come back to your door.

It’s a fun demo, but it could also be really creepy if you abuse it.

Fun fact: This kind of face tracking code is running inside many street and bus station advertisements to track who is looking at ads and for how long.

That might have sounded far-fetched to you before, but you just built the same thing for $150!

Extending the Program

This program is an example of how you can use a small amount of Python 3 code running on a $100 Jetson Nano board to build a powerful system.

If you wanted to turn this into a real doorbell camera system, you could add the ability for the system to send you a text message using Twilio whenever it detects a new person at the door instead of just showing it on your monitor.
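A rough sketch of the Twilio idea might look like this. The credentials and phone numbers are placeholders, the helper names are my own invention, and you’d need to run pip3 install twilio first:

```python
def send_new_visitor_alert(visit_count):
    # Hypothetical helper: text yourself when a face shows up at the door.
    # Imported here so the rest of the demo runs without Twilio installed.
    from twilio.rest import Client

    client = Client("YOUR_ACCOUNT_SID", "YOUR_AUTH_TOKEN")
    client.messages.create(
        body=build_alert_body(visit_count),
        from_="+15550001111",   # your Twilio number (placeholder)
        to="+15552223333",      # your cell number (placeholder)
    )

def build_alert_body(visit_count):
    # Turn the visit counter from the face metadata into a message.
    if visit_count == 1:
        return "Someone new is at your door!"
    return f"A repeat visitor is at your door (visit #{visit_count})."
```

You could call send_new_visitor_alert() from the main loop at the point where register_new_face() runs, so you only get texted about genuinely new faces.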

Or you might try replacing the simple in-memory face database with a real database.
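One way to sketch that database idea is with Python’s built-in sqlite3 module, storing each face encoding as a pickled blob. The table and function names here are illustrative, not part of doorcam.py:

```python
import pickle
import sqlite3

def open_face_db(path=":memory:"):
    # Create (or open) a tiny visitor database.
    db = sqlite3.connect(path)
    db.execute("""
        CREATE TABLE IF NOT EXISTS visitors (
            id INTEGER PRIMARY KEY,
            encoding BLOB NOT NULL,
            seen_count INTEGER NOT NULL DEFAULT 1
        )
    """)
    return db

def add_visitor(db, face_encoding):
    # Persist one encoding; pickle handles the numpy array for us.
    db.execute(
        "INSERT INTO visitors (encoding) VALUES (?)",
        (pickle.dumps(face_encoding),),
    )
    db.commit()

def all_encodings(db):
    # Unpickle every stored encoding so face_distance() can compare them.
    return [pickle.loads(row[0]) for row in db.execute("SELECT encoding FROM visitors")]
```

With an on-disk path instead of :memory:, the visitor list would survive restarts without the periodic pickle-the-whole-list save step.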

You can also try to warp this program into something entirely different.

The pattern of reading a frame of video, looking for something in the image, and then taking an action is the basis of all kinds of computer vision systems.

Try changing the code and see what you can come up with! How about making it play your own custom theme music whenever you get home and walk up to your door? You can check out some of the other face_recognition Python examples to see how you might do something like this.

Learn More about the Nvidia Jetson PlatformIf you want to learn more about building stuff with the Nvidia Jetson hardware platform, there’s a website called JetsonHacks that publishes tips and tutorials.

I recommend checking them out.

I’ve found a few tips there myself.

If you want to learn more about building ML and AI systems with Python in general, check out my other articles and my book on my website.

If you liked this article, sign up for my Machine Learning is Fun! newsletter to find out when I post something new. You can also follow me on Twitter at @ageitgey, email me directly, or find me on LinkedIn.
