A simple idea: How can I identify the breed of a cat? I mean, any cat.
Telling dogs from cats is close to a solved problem: a lot of work has been done and can be found online, with datasets and trained models. However, I haven’t found anything available that can tell the breed of a cat.
Why do I care?
Honestly, identifying a cat’s breed is not as useful as differentiating a cat from a dog. One fundamental thing people have been trying to do is better understand our surroundings. We have seen programs that recognize flowers, natural species, and animals. That is tremendously useful when, say, you set up a remote camera in a rainforest and want to identify which animals pass by.
Unfortunately, no one will be interested in knowing which breed of cat just walked through my neighborhood, unless cats are approaching extinction or someone is obsessed with studying cats.
However, there are still some use cases that might benefit from it. A robot that plays with cats: different breeds have different personalities (see this). But cats usually don’t play with every human, not to mention robots. A camera that analyzes which breeds tend to go outdoors: I suppose it’s interesting, but anyone who has a cat knows cats do whatever they want to do, so they will probably all go out. Finally, a smart feeder! I know you have been waiting for me to say this.
Well, the truth is, there is no such thing as a breed-specific diet. Although some claim otherwise, it turns out to be purely BS. Different cats may well have different dietary needs, depending on age or a specific illness, but there is definitely no food that must be used for a specific breed.
Therefore, a smart feeder does not need to care about the breed of a cat. Like many on the market today, it can be programmed to feed a certain amount at a certain time. The only extra feature I need is something like this.
So, for now at least, I don’t see a use case for an AI that identifies cat breeds.
Interestingly, I stumbled upon a scam (or simply failed) Indiegogo project: Bistro. In short, it was designed for people who have multiple cats: the device recognizes each cat and records how much it eats or drinks. I like the idea, but I don’t like the feeder design (I have some ideas saved for later). The project was abandoned a long time ago; you can see how angry the backers are in the comment section. I like to think the owner failed to implement the software rather than ran a real scam.
That brings me to the real motivation for my idea: why did Bistro fail?
Is it possible to classify cats?
First, let’s look at these two cats:
Can a human being identify the breed?
The left is a Ragdoll; the right is a Siberian. I guess you can argue they are pretty different, with distinguishing features.
What about this one, compared to the two above:
This is a Birman. It has a coat similar to a Ragdoll’s, but a color pattern similar to a Siberian’s.
And we are still talking about purebreds here. Mixed-breed cats will be much harder to call.
Problem 1: Cat breeds could be difficult to differentiate.
The next problem is the total number of cat breeds. According to Wikipedia, there are 71, or 44, or 43 breeds in the world. It is fascinating to learn that there are 3 different cat associations and they don’t agree with each other. Let’s use the largest number: 71.
Problem 2: We have a large number of classes to classify.
In summary, since we have a large number of classes, and the classes resemble each other (meaning the answer is not a clean 0 or 1), a simple multiclass classification will not suffice. Here is a discussion of a large-class classification problem with 5k classes. Not exactly what I have to follow, but interesting to read.
Hypothesis: because of the two problems listed above, we cannot use a simple multiclass classification to identify the breed of cats.
My naive attempt
First, I randomly picked 14 breeds as follows:
Each breed has 6-10 images. I used Google Images to download the cat pictures (please let me know if any image violates a copyright).
The project uses ML.NET with the most basic configuration, although I did try different setting values. The idea is basically this: each label represents a cat breed, and multiple images represent each label. We build a model from all those labels and images. Next, we feed a new image to the model and expect it to tell us which label has the highest probability.
The input for training is in a TSV file, such as this:
As mentioned earlier, each label has a bunch of images mapped to it.
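To make the file layout concrete, here is a minimal sketch of what such a TSV could look like and how its rows group by label. The column order (label first, image path second) and the file names are assumptions for illustration, not copied from the repository's training file. Python is used only to keep the sketch short; the project itself uses ML.NET.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows: one line per image, "<label>\t<image path>".
# The real training file may order or name its columns differently.
SAMPLE_TSV = (
    "Ragdoll\timages/ragdoll_01.jpg\n"
    "Ragdoll\timages/ragdoll_02.jpg\n"
    "Siberian\timages/siberian_01.jpg\n"
    "Birman\timages/birman_01.jpg\n"
)


def group_images_by_label(tsv_text):
    """Return {label: [image paths]}, keeping labels in first-seen order."""
    groups = defaultdict(list)
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        label, path = row
        groups[label].append(path)
    return dict(groups)


groups = group_images_by_label(SAMPLE_TSV)
for label, paths in groups.items():
    print(label, len(paths))
```

Grouping the rows this way is also a quick check that every breed has enough images before training.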
Source code: https://github.com/sowenzhang/CatTracker
If you want to run it, you have to train the model on your local machine and then run the test with the test images.
The result is not very accurate, though. Here is one failed prediction: for this Ragdoll test image, the model predicts a Siberian:
The way to read the numbers inside the brackets: each number is the probability the model assigns to one breed. The order follows the sequence defined in cats.tsv, the training input file. For this particular image, the model thinks it is most likely a Siberian or Himalayan, and then possibly a Ragdoll.
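Reading a raw score vector by eye is error-prone, so it helps to pair each score with its label and sort. The sketch below assumes the labels are listed in the same order in which the model emits its scores. The label list and the numbers are made up for illustration and are not the model's actual output.

```python
def rank_predictions(labels, scores, top_k=3):
    """Pair each score with its label and return the top_k, highest first."""
    if len(labels) != len(scores):
        raise ValueError("need exactly one score per label")
    ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]


# Hypothetical label order and scores, for illustration only.
labels = ["Birman", "Himalayan", "Ragdoll", "Siberian"]
scores = [0.05, 0.30, 0.20, 0.45]

for label, score in rank_predictions(labels, scores):
    print(f"{label}: {score:.2f}")
```

Printing the top few candidates instead of only the winner makes near misses, like a Ragdoll ranked third, easy to spot.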
I guess it is fair. Just by looking at it (as a human), I would say it is a Siberian as well.
There is another problem, which is probably common to any multiclass classification problem. We have N classes for the objects we want to identify. What about an input that does not belong to any of the N classes we care about?
In my testing, I fed the model a picture of a Pomeranian dog, and it thought it was a Himalayan.
Fair enough. I can’t blame the model.
So I need a way to tell the model that there is also a None class.
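One cheap way to approximate a None class without retraining is to reject any prediction whose top score falls below a threshold. This is a sketch of that idea, not what the project does. The 0.6 threshold is an arbitrary assumption that would need tuning, and a model that is confidently wrong would still slip through, so training with an explicit "not a cat" class is the other option.

```python
def predict_or_none(labels, scores, threshold=0.6):
    """Return the best label, or None when the top score is below threshold."""
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    if scores[best_index] < threshold:
        return None
    return labels[best_index]


# Hypothetical labels and scores, for illustration only.
labels = ["Himalayan", "Ragdoll", "Siberian"]
print(predict_or_none(labels, [0.40, 0.35, 0.25]))  # low confidence
print(predict_or_none(labels, [0.10, 0.10, 0.80]))  # high confidence
```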
I spent far more time writing this blog post than writing the code.
The intention of the code is to prove my hypothesis: identifying the breed of a cat is not a trivial problem, and the most basic multiclass image classification cannot achieve it.
That is probably why the project Bistro failed.
Building this quick prototype pointed me to the following potential improvements:
- I need more images per breed, including kittens and adults, as well as different postures, such as lying down, whole body, sitting upright, etc.
- I need a multi-layer neural network. Each breed needs another layer or more of detail, including face, eyes, eye color, ears, mouth, tail, and so on. That should prevent a bulldog from being predicted as a Scottish Fold.
- I need to figure out how to tell the model that an input is definitely not something we are interested in, such as a dog picture.
In this blog post, I presented an idea for identifying the breed of a cat. Even though I can’t think of a real-world use case, I decided to do a proof of concept with a minimum amount of work. The prediction results are obviously poor and incomplete; however, it is a start. It gave me some ideas about what is more important, which is identifying a specific cat. To achieve that, I will need a much more sophisticated model.
The source code of my POC can be found here: https://github.com/sowenzhang/CatTracker
…To be Continued…