
To Get Better, Do the Real thing

I'm not going to lie: I haven't been spending much time working on a machine learning project, due to university work. While that may be a valid reason, it still means my machine learning skills won't get better. So I should soon make it a priority to start training some models again.

Recently I watched a podcast with Lex Fridman and George Hotz. Hotz is a very eccentric figure, to say the least, but a very fascinating person, which made the podcast very enjoyable. On the subject of self-help advice, he said that he can't give good advice, especially for generic questions like, in his own words, "how do I become good at things?" His answer: "just do [the thing] a lot".

When he was asked how to be a better programmer, he just replied: program for 20 years. He says many times that if you want to learn a skill, you have to do it. When talking about self-help, he said those books tend to be useless, as they contain the things people want to hear, not real advice like "work harder".

Please do the real thing

This reminds me of an article by Scott Young, one of my favourite bloggers, titled "Do The Real Thing", which echoes the sentiment above: to get good at something, you want to do the real thing, time and time again. Substitutes don't count. He gave the example of his language learning journey: if he wanted to get better at speaking the foreign language he was learning, then he had to speak it with native speakers. Learning vocab or reading can help, but he still needed to do the activity itself.

The same applies to improving my machine learning skills. Make as many models as you can, and you cannot help but get better, as you are googling things left, right and centre.

You pick up the general process of a deep learning project along the way: getting data, cleaning the data, choosing a model, training the model, testing the model, debugging the model, then publishing the model. All of it is learnt by doing the thing.
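To make one of those steps concrete, here is a minimal sketch of the training step in PyTorch. Everything in it (net, trainloader, the hyperparameters) is an assumption for illustration, not code from a real project:

import torch.nn as nn
import torch.optim as optim

# assumes `net` is a model and `trainloader` yields (inputs, labels) batches
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

for epoch in range(2):                    # loop over the dataset a couple of times
    for inputs, labels in trainloader:
        optimizer.zero_grad()             # reset gradients from the previous step
        outputs = net(inputs)             # forward pass
        loss = criterion(outputs, labels)
        loss.backward()                   # backpropagation
        optimizer.step()                  # update the weights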

Machine learning skills I want to learn

This is why I want to start a new project. But I don’t know what to build for my deep learning project.

I liked the green tea vs oolong tea project. I thought it was very original, and I enjoyed making it, even after the many frustrations of getting the model to work. And I learnt how to use PyTorch, which I will likely be using in a future project.

I may spend more time expanding the green tea vs oolong tea model: converting it into an object detection model, or publishing it so the public can use it, with a service like Streamlit, a custom frontend made with Flask, or a mobile app.

While those options look nice, I want to try something new. So I want to try a new project. Recently I have been thinking of trying something simple, like a cats vs dogs image classifier. The reason I'm thinking of doing this is that I want an excuse to try the new fastai library, as they rebuilt it from the ground up with PyTorch. It will be nice to see what changed, and to get used to fastai again.

Still on the horizon are GANs. I have always found GANs very interesting, but each time I tried to implement one, I failed. So I think my prerequisites are not there yet, and soon I will probably try making a GAN again.

Also, like I mentioned in many previous blog posts, we need to learn how to deliver models to the wider public, not just keep them in our notebooks. I haven't been following my own advice. So I want to spend time using things like Streamlit, or building an API frontend for one of my ML projects, like the sketch below. The production phase of the ML pipeline is, I think, not taught enough in the ML community. So I want to stay true to my word and start learning about production itself: MLOps, and the basic fusion of software engineering and machine learning.
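For a flavour of how little code the first step takes, here is a minimal Streamlit sketch; the model and preprocess calls are hypothetical placeholders for a trained classifier:

import streamlit as st
from PIL import Image

st.title("Bottled tea classifier")
uploaded = st.file_uploader("Upload a photo of a bottle", type=["jpg", "jpeg", "png"])
if uploaded is not None:
    image = Image.open(uploaded)
    st.image(image, caption="Your upload")
    # prediction = model(preprocess(image))  # hypothetical model + preprocessing
    # st.write(prediction)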

Now that I'm thinking about it, one of the best ways to learn those skills is working for a tech company, as you need to publish to the wider public, and the model needs to be effective enough that users get results. But I don't have that luxury yet. My projects will have to count.

Reading helps, but intuition comes from action

Going back to the topic at hand: all these areas I want to learn will need to be learnt by doing them. By getting first-hand experience, you develop an intuition for the topic and can produce tangible things with that knowledge, further cementing your skills. Just reading about a topic gives you a high-level view, which is fine; not every single topic needs to be learnt inside and out. But for the ones where deep understanding can push you towards your goals, doing the boring work of doing the real thing is a must.

  


Image classifier for Oolong tea and Green tea

Developing the Dataset

In this project, I will be making an image classifier. I remember that my previous attempts, a while ago, did not work. To change it up a bit, I will be using the PyTorch framework rather than TensorFlow. As this will be my first time using PyTorch, I will take a tutorial before I begin my project. The project is a classifier that spots the difference between bottled oolong tea and bottled green tea.

The tutorial I used was PyTorch's 60-minute blitz. (It did take me more than 60 minutes to complete, though.) After typing out the tutorial, I got used to using PyTorch, so I started moving on to the project. As this will be an image classifier, I needed to get a whole lot of images into my dataset. I first stumbled upon a Medium article which used an image scraper, but even after a few edits it did not work.

[Screenshot: the image scraper failing to run]

So I moved to using Bing for image search. Bing has an image search API, which makes it easier to collect images compared to Google. I used this article from PyImageSearch. I had a few issues with the API in the beginning, as the endpoints Microsoft gave me did not work with the tutorial code, but after looking around and a few edits I was able to get it working.
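For anyone curious, the core of the request looks roughly like this. This is a hedged sketch of the Bing Image Search v7 API; the endpoint and API_KEY are assumptions, since the endpoint Microsoft gives you may differ per account:

import requests

API_KEY = "YOUR_KEY"  # from the Azure portal
ENDPOINT = "https://api.bing.microsoft.com/v7.0/images/search"  # may differ per account

headers = {"Ocp-Apim-Subscription-Key": API_KEY}
params = {"q": "bottle green tea", "count": 50}
response = requests.get(ENDPOINT, headers=headers, params=params)
response.raise_for_status()
results = response.json()["value"]   # list of image results with their URLs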

[Screenshot: the Bing API download script running]

But looking at the image folder gave me this:

[Screenshot: the image folder, with every download overwritten as "000000"]

After looking through the code, I noticed that the program did not produce new images, but kept overwriting the same file, "000000". This came from not copying the final section of code from the blog post, which updated a counter variable.
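Roughly, the fixed loop looks like this (a hedged sketch loosely following the PyImageSearch structure; results, output_dir and the field name are assumptions):

import os
import requests

total = 0  # the counter the broken version never updated
for result in results:
    image_data = requests.get(result["contentUrl"], timeout=30).content
    path = os.path.join(output_dir, "{}.jpg".format(str(total).zfill(6)))
    with open(path, "wb") as f:
        f.write(image_data)
    total += 1  # without this line, every download overwrites "000000"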

[Screenshot: the download loop with the counter variable restored]

Now that I had the tutorial code working, I could try my own search terms to create the dataset. First I started with green tea, using the term "bottle green tea", which gave me these images:

[Screenshot: downloaded results for "bottle green tea"]

Afterwards, I got oolong tea by using the term "bottle oolong tea".

[Screenshot: downloaded results for "bottle oolong tea"]

Then I had to personally go through the dataset and delete any images that were not relevant to the class. The images I deleted looked like this:

[Image: loose tea leaves rather than a bottled drink]

This is because we want the classifier to work on bottled drinks, so leaves are not relevant, regardless of how tasty they are.

There were also a few blank images. Needless to say, they are not useful for the image classifier.

[Image: a blank download]
[Image: a shelf containing both green tea and oolong tea bottles]

Even though this image has a few green tea bottles, it also has an oolong tea bottle, so it would confuse the model. It's better to simplify the data to images containing only green tea bottles, rather than a whole variety that is not part of the class.

After I did that with both classes, I was ready to move on to creating the model, so I went to Google Colab and imported PyTorch.

As the dataset has fewer than 200 images, I thought it would be a good idea to apply data augmentation. I first found this tutorial, which used PyTorch transformations.
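For reference, a minimal sketch of the kind of augmentation pipeline torchvision provides; the exact transforms and sizes here are my assumptions, not the tutorial's:

from torchvision import transforms

train_transform = transforms.Compose([
    transforms.Resize((224, 224)),        # assumed input size
    transforms.RandomHorizontalFlip(),    # mirror images at random
    transforms.RandomRotation(15),        # small random rotations
    transforms.ToTensor(),                # PIL image -> tensor
])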

When applying the transformations, I fell into a few issues: they did not plot correctly, nor did they recognize my images. But I was able to fix it.

[Screenshot: the augmented images plotting correctly]

The issues stemmed from not slicing the dataset correctly, as ImageFolder (a PyTorch helper class) returns a tuple, not just a list of images.
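A tiny check makes this clear, assuming a dataset built with ImageFolder:

from torchvision.datasets import ImageFolder

dataset = ImageFolder(root='data/train')
img, label = dataset[0]   # each item is an (image, label) tuple, not a bare image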

Developing the model

After that, I started working on developing the model. I used the CNN from the 60-minute blitz tutorial. One of the first errors I dealt with was data not going through the network properly:

shape '[-1, 400]' is invalid for input of size 179776

 

I was able to fix this issue by changing the kernel sizes to 2 x 2 and the feature maps to 64:

self.fc1 = nn.Linear(64 * 2 * 2, 120) 
x = x.view(-1, 64 * 2 * 2)

Straight afterwards I fell into another error:

ValueError: Expected input batch_size (3025) to match target batch_size (4).

 

This was fixed by reshaping the x variable again:

x = x.view(-1, 64 * 55 * 55) 

I found this fix in a forum post.

Then another error 😩.

RuntimeError: size mismatch, m1: [4 x 193600], m2: [256 x 120] at /pytorch/aten/src/TH/generic/THTensorMath.cpp:41

 

This was fixed by changing the linear layer again:

self.fc1 = nn.Linear(64 * 55 * 55, 120)
 

Damn, I did not know one dense layer could give me so many headaches.
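For reference, here is how the pieces fit together once the shapes line up. This is my hedged reconstruction, assuming 224 x 224 input images (which is what makes the flattened size come out to 64 * 55 * 55 = 193600, matching the error above); the overall structure follows the blitz tutorial:

import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, 2)               # 2 x 2 kernels
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 64, 2)              # 64 feature maps
        self.fc1 = nn.Linear(64 * 55 * 55, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 2)                   # two classes: green vs oolong

    def forward(self, x):
        # 224 -> 223 (conv) -> 111 (pool) -> 110 (conv) -> 55 (pool)
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 64 * 55 * 55)                  # flatten to 193600 features
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.fc3(x)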

 

After training, I needed to test the model. I had not made the test folder before making the model (rookie mistake), so I made one quickly afterwards using the first 5 images of each class. This is a bad thing to do: it can contaminate the data and lead to overfitting. But I needed to see if the model was working at the time.

I wanted to plot one of the images in the test folder, so I borrowed the code from the tutorial. This led to an error, but I fixed it by changing the range to one instead of five, because my model only has 2 labels (tensor[0] and tensor[1]), not 4.

When I loaded the model, it threw me an error, but this was fixed by resizing the images in the test folder. After a few runs of the model, I noticed that it did not print the loss, so I edited the code to do so:

if i % 10 == 0:    # print every 10 mini-batches
    print('[%d, %d] loss: %.5f' %
          (epoch + 1, i + 1, running_loss / 10))
    running_loss = 0.0   # reset the running loss for the next window
[Screenshot: training output showing a very high loss]

As we can see, the loss is very high.

When I tested the model on the test folder, it gave me this:

[Screenshot: test accuracy of 50%]

This means it's at best guessing. I later found out it was because the model picked every image as green tea; with 5 images carrying a green tea label, that made it right 50% of the time.

This led me to the world of model debugging: trying to reduce the loss and improve the accuracy.

Debugging the model

I started to get some progress debugging my model when I found this Medium article.

The first point the writer made was to start with a simple model that is known to work with your type of data. I thought I was already doing that, as I was borrowing the model from the PyTorch tutorial, but it did not work. So I opted for a simpler model shape, which I found in a TensorFlow tutorial: only 3 convolutional layers and two dense layers. I had to change the final layer's parameters, as they were giving me errors; it was designed with 10 targets in mind, not 2. Afterwards, I fiddled around with the hyperparameters. With that, I was able to get the accuracy on the test images to 80% 😀.

Accuracy of the network on the 10 test images: 80 %
10
8 
[Screenshot: the 80% accuracy output]
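Here is a hedged sketch of what that simpler shape looks like, ported to PyTorch. The three conv layers and two dense layers follow the TensorFlow tutorial's structure, but the 224 x 224 input size (and therefore the flattened size) is my assumption:

import torch.nn as nn
import torch.nn.functional as F

class SimpleNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 32, 3)
        self.conv2 = nn.Conv2d(32, 64, 3)
        self.conv3 = nn.Conv2d(64, 64, 3)
        self.pool = nn.MaxPool2d(2, 2)
        self.fc1 = nn.Linear(64 * 52 * 52, 64)
        self.fc2 = nn.Linear(64, 2)          # 2 targets instead of the tutorial's 10

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))  # 224 -> 222 -> 111
        x = self.pool(F.relu(self.conv2(x)))  # 111 -> 109 -> 54
        x = F.relu(self.conv3(x))             # 54 -> 52
        x = x.view(-1, 64 * 52 * 52)
        x = F.relu(self.fc1(x))
        return self.fc2(x)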

Testing the new model

As the test dataset was contaminated (I had used images from the training dataset), I wanted to restructure it with new images, to make sure the accuracy was genuine.

To restructure it, I followed this style:

https://stackoverflow.com/a/60333941
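That is, the folders end up in the layout ImageFolder expects, with one subfolder per class (the exact folder names here are my assumption):

data/
├── train/
│   ├── green_tea/
│   └── oolong_tea/
└── test/
    ├── green_tea/
    └── oolong_tea/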

I then called the train and test datasets separately:

train_dataset = ImageFolder(root='data/train')
test_dataset  = ImageFolder(root='data/test')

For the test images, I decided to use Google instead of Bing, as it gives different results. After that, I tested the model on the new test dataset:

Accuracy of the network on the 10 test images: 70 %
10
7

As this was not a significant decrease, the model had learnt something about green tea and oolong tea.

Using the code from the PyTorch tutorial, I wanted to analyse the results even further:

class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net_debug(images)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels).squeeze()
        for i in range(1):           # batch size of 1, so one label per batch
            print(labels)            # debug print
            label = labels[i]
            class_correct[label] += c.item()
            class_total[label] += 1

for i in range(2):                   # 2 classes
    print('Accuracy of %5s : %2d %%' % (
        classes[i], 100 * class_correct[i] / class_total[i]))

Which printed:

Accuracy of Green_tea_test : 80 %
Accuracy of oolong_tea_test : 60 %

Plotting the predictions

While I liked this, I wanted the program to tell me which images it got wrong. So I went to work trying to do so. To do this, I stitched the image data together with the labels in an independent list:

merge_list = []   # assuming the predictions were collected into the three lists below
for i, t, p in zip(img_list, truth_label, predicted_label):
    one_merge_dict = {'image': i, 'truth_label': t, 'predicted_label': p}
    merge_list.append(one_merge_dict)

print(merge_list)

On my first try I got this:

[Plot: a cluttered grid showing every image with its labels]


As we can see, it's very cluttered and shows all the images. To clear it up, I removed the unneeded text.

[Plot: the same grid with the unneeded text removed]

Now I could start separating the correctly labelled images from the wrong ones.

if correct_label != guess_label:
    img = img.permute(1, 2, 0)   # channels-first to channels-last for matplotlib
    ax = plt.subplot(1, 11, j + 1)
    ax.set_title('Sample #{}'.format(j))
    ax.text(15, 250, 'GroundTruth: {}'.format(correct_label))
    ax.text(15, 300, 'Predicted label: {}'.format(guess_label))
    plt.tight_layout()
    ax.imshow(img)
    ax.axis('off')
if j == 10:
    plt.show()
    break

I was able to do this with a small if statement.

Now the program correctly plots only the images with incorrect labels. But the placement of the images is wrong, because the subplot positions still count the correctly labelled images, which the if statement skips.


I corrected it by changing the loop:

i = 0
for j in range(len(merge_list)):
    sample = merge_list[j]
    img = sample['image']
    img = img[0]                       # drop the batch dimension
    correct_label = sample['truth_label']
    guess_label = sample['predicted_label']
    if correct_label != guess_label:
        i += 1                         # only advance the subplot position for wrong guesses
        img = img.permute(1, 2, 0)
        ax = plt.subplot(1, 11, i + 1)
        ax.set_title('Sample #{}'.format(i))
        ax.text(15, 250, 'GroundTruth: {}'.format(correct_label))
        ax.text(15, 300, 'Predicted label: {}'.format(guess_label))
        plt.tight_layout()
        ax.imshow(img)
        ax.axis('off')
    if j == 10:
        plt.show()
        break
[Plot: only the misclassified images, now correctly placed]

I wanted to get rid of the whitespace, so I changed how the figure was plotted:

  

fig = plt.figure(figsize=(15, 15))
ax = plt.subplot(1, 4, i + 1)

[Plot: the misclassified images without the extra whitespace]

Now I have an idea of what the model got wrong. In the first sample, the green tea does not have the traditional green design, so it's understandable that the model got it wrong. The second sample was oolong tea, but the model misclassified it as green tea; my guess is that the bottle has a very light colour tone compared to the golden or orange tones of the oolong bottles in the training data. In the third example, the bottle has the traditional oolong design with an orange colour palette, but the model still misclassified it as green tea. My guess is that the leaf on the bottle affected the model's judgement, leading it to classify the image as green tea.

Now I have finished the project. This is not to say that I won't come back to it, as additions on the implementation side could be made: a mobile app that detects oolong or green tea with your phone's camera, or a simple web app where users upload images of their bottled tea and the model classifies them on the website.


Remote work will get better because of human behaviour, not software

My experience

Right now, I have been doing university lectures for a few months, and with that I have been in a few video calls, like many people during the pandemic. When taking video calls, I'm starting to notice that adapting to remote work is mostly about dealing with human behaviour, not software.

For example, in my video calls almost none of the students use their cameras, and many people use the chat to communicate with each other and the teacher. This is very different from many workplaces: I know companies where all the employees have their cameras on and predominately use the microphone to speak.

The Need for Structured Communication

So even with the same software, people use it very differently. This applies to remote work in general, not just video calls. I had to do some group work, and while it was good, the work was done in an ad-hoc fashion over WhatsApp. This meant I was always on call, ready to read a message in the group chat. One time I had to interrupt my meal to complete some work.

Now I understand the stuff I hear on Cal Newport's podcast about the danger of using instant messengers like Slack and email for work communication, and the need for more structured communication. There is popular software for project management, like Notion and Asana, but it's just a matter of getting people to use them. The status quo is easier in the moment; moving to a new system has a high activation cost. The team will need to answer important questions like: how do you transfer your tasks? How do you get everyone up and running with the software? It's a lot of questions to answer.

New Systems of Working

As we are all new to this, people will eventually work out new systems to get the most out of remote work. Some companies can use remote work to their advantage. Gumroad and other small tech companies use asynchronous communication, which means people don't need to be in the same time or place to send or receive messages.

Sahil Lavingia, the founder of Gumroad, says of asynchronous communication:

All communication is thoughtful. Because nothing is urgent (unless the site is down), comments are made after mindful processing and never in real-time. There's no drama

Because everyone is always effectively "blocked," everyone plans ahead. It also means anyone can disappear for an hour, a day, or a week and not feel like they are holding the company back. Even me!

I think this is a sign of things to come, because remote work is creating new working styles. With this text-based style, no meetings are needed, and plans are well thought through.

Sahil also mentions that:

People build their work around their life, not the other way around. This is especially great for new parents, but everyone benefits from being able to structure their days to maximize their happiness and productivity.

This allows people to spend more time on the things they love, like family or other hobbies. Remote work makes it possible to live without following the traditional 9-5 structure, or the hustle-minded 80-hour workweek. Remote workers can also be more location independent. In many cities around the world, house prices are skyrocketing due to a lack of housing; people buy shoeboxes going for half a million dollars. With that same money, they could buy a whole acreage in Nebraska, giving them more space for their family and themselves, and, depending on their lifestyle, a more enjoyable time.

I remember many times in college wondering why I needed to be in the classroom when a lot of the work could be done at home. I guess a lot of people in the workplace think about the same thing.

With asynchronous communication and remote work, we can allow employees to become time and location independent. Time independence within reason: employees still need to get work done, that goes without saying.

Software is still very helpful

Even software itself can help people transition into remote work more easily. NVIDIA has showcased awesome technology using GANs that creates an artificial version of a person's face from key points, allowing a person to stream video with very little bandwidth. So people in rural areas, and other places with a bad internet connection, can join video calls. With that, they can be fully location independent, as location independence implies having a good internet connection. Choppy video will become a thing of the past.

Like I mentioned earlier, there are a lot of project management tools out there; people just need to use them. People will need to get used to structured communication, and to using video calls. People are quickly designing etiquette on the fly: muting your microphone when you enter the call, messaging before wanting to speak. Some companies use video calls to bring the team closer, with employees talking briefly about their lives once a week in an all-hands meeting, which is also used to share updates with the rest of the team.

The Potential of remote work

With remote work, people can have time to take walks around the neighbourhood or cook lunch for themselves, not just stay in an office all day. Granted, people still need to get used to this, as lots of people, including myself, stay at home all day. Like I mentioned before, it's not a problem of software but of human behaviour.

So employees, on a personal level, will start to get the most out of remote work by improving their overall wellbeing and productivity. During the summertime, I was able to take walks almost every single day, enjoying the local scenery and parks (now that it's winter, that's looking a bit more difficult). On NHK World I watched a programme about people moving to the suburbs, to the town of Atami, which has a nice beachfront; people wanted to move away from the hustle and bustle of Tokyo. I can now imagine that with remote work, during your lunch break you could take walks along the beach and have seafood for lunch.

I think remote work is something that's going to stay, though there are lots of improvements still to be made.


Deep learning is already mainstream

While I was scrolling on Twitter, I saw a tweet showing that the term "deep learning" is plateauing on Google Trends. Then Yann LeCun replied that it's simply because deep learning has become more normal.

This reminds me of a previous blog post that I wrote about how good tech is like good design. Meaning: when technology is good, it embeds itself in our society and becomes invisible. Gmail's Smart Compose feature is 100% deep learning, but we don't think about it as ML. Amazon's recommendations are ML, but we don't think about them that way. In normal discourse we just call them algorithms, which is an accurate term, while abstracting away most of the advanced details.

This makes sense, as only nerds care about the type of algorithm used to recommend films on Netflix (it's recommendation systems). Everybody else will simply say technology, or treat the company as a person: "Netflix sent me this message", "Facebook showed me this message", etc. When aeroplanes started getting popular, we just said we took a flight to Washington DC, rather than saying a mechanical flying device carried us there.

As I write this blog post, I am using Microsoft Word's Read Aloud feature to proofread it, where a robot voice reads my work back to me. The voice has improved tremendously; while it still feels a bit robotic, it does a good job. It's like an editor personally reading my work. I also use Grammarly, and while they don't say it, I'm pretty certain they use machine learning to spot mistakes in your work. These very useful tools that help me improve my writing are driven by machine learning, even though people will simply call them technology.

This is the cycle of all technologies. You have hype. Depending on how good the technology is, it may fail to even reach the mainstream and start the hype cycle again. If it's good, it will still fall below expectations, not because it's bad, but because it failed to meet the sky-high expectations. Afterwards, people start to work out more practical uses for the technology. After a while, the technology gets popular, but much of the hype fades away as people get used to it. So I guess with deep learning, or machine learning, the hype is starting to disappear, but people are finding uses for the technology.

You will have some standouts like GPT-3 and GANs, but most machine learning in the wild right now is a little bit boring. Recommendation systems: think of Netflix and Amazon. Forecasting: using past data to predict future behaviour. It tends to be boring as it is simply showing other data points based on past data, or, in Amazon's case, using AI to help sell more products. Which is no surprise: if you are a for-profit company, you need to make money.

While ML has its limits, I still think it is very popular because it can do so many things: generating images via GANs, classifying images with CNNs, predicting future behaviour using forecasting. I think this is why AI is so popular: if you ask whether you have some type of data, in the internet age the answer is always yes. Like early computers, which changed every industry via automation or communication, machine learning can help with those areas even more.

In the good tech is like good design blog post, I talked about how technology tends to be popular when people stop noticing it. Which is happening now.

In the article I said:

no one calls their company “Excel-based” or “Windows-based”. As it’s [just] a tool.

 

When people started using office software on their computers, it was revolutionary at the time. But people now don't call themselves an "Excel-first company" or an "email-first company". As people got used to these services, using them became a given. Soon, having some type of data science role will be a given, just like having a web developer for your company is a given. This will still mainly be focused on tech companies, but non-tech companies are not far behind: non-tech companies hire web developers and server managers too.


Why learning how to use small datasets is useful for ML

Many times you hear about a large tech company doing something awesome with ML. You think, that's great; how can I do the same? So you copy their open-source code and try it for yourself. Then you notice the result is good, but not as amazing as you first thought. Then you spot that you are training the model with fewer than 200 samples, while the tech company is using 1 million examples. So you conclude that's why the tech company's model performs well.

This brings me to the topic of the blog post: most people doing ML need to get good at getting good results with small datasets. Unless you are working for a FANG company, you won't have millions of data points at your disposal. As we apply the technology to different industries, we must deal with applying models when not much data is available. The lack of data can be for any number of reasons. For example, the industry may not have experience using deep learning, so collecting data is a new process. Or collecting data may be very expensive, due to the extra tools or expertise involved. Either way, I think we need to get used to the fact that we are not Google.

We do not have datacentres full of personal data. Most businesses have fewer than 50 employees and deal with anywhere from a few dozen to a few thousand customers, depending on the business. Non-profits may not even have the resources to collect a lot of data. So just getting insights from the data we have is super useful, compared to trying to work with a new cutting-edge model with all the bells and whistles. Remember, your user only cares about the results, so you can use the simplest model. Heck, if a simple feed-forward neural network works, then go ahead.

We should not worry about having the resources of the tech companies, but about what we can do with the resources we have now. A lot of gains in ML are made simply by throwing ungodly amounts of data at the model and seeing what happens next. We need to do that with fewer than 200 samples. Luckily, people have developed techniques that may help with this, like image augmentation, which edits photos so the model learns about the image in different orientations and gets a general idea of the object regardless of minor changes like size or direction. Soon we may have GANs that help produce more data from a small dataset, which would mean we can train models on larger datasets thanks to GAN-generated data.

While reading about GPT-3 is fun, and it is very likely to lead to some very useful products (just check out Product Hunt), we are not going to have the opportunity to train a 10-billion-parameter behemoth. Don't get me wrong: we don't need to train a large model from scratch for it to be useful. This is what fine-tuning is for. The people using GPT-3 are not training the model from scratch, but using prompts to prime the model for the problems it will be dealing with.

But I think we need to deal with the bread-and-butter issues that people want an AI to solve. Like a simple image classifier, which may be useful for a small business that needs to sort different goods in its store. Or a simple time series analysis to forecast sales into the next season for the shopkeeper. Models from Facebook and Google with 100 layers will not be helpful, and will likely give you grey hairs setting them up. Again, the whole goal is the solution to your customer's problem, not to split the atom.

Like a podcast I heard a while ago said: deep learning is already powerful enough; we just need to get it into the hands of the right people. Hopefully, I can help with that process. To do that, we need to be pragmatists and deal with the constraints most people face when applying ML to their projects. While I'm happy the researchers at OpenAI and the FANG companies take deep learning to its limits, I want the on-the-ground experience of taking that knowledge and improving the world (yes, it sounds very hippy). But most people will not have the resources to spend millions of dollars with a cloud provider to train a model. People may have a budget of a few hundred, or maybe a few thousand, but not a few million. At current cloud computing rates, a budget like that should be more than enough, especially when dealing with small models and small datasets.


The boring work of implementing ML models

As I write blog posts about the potential for AI to help industries, and as hype articles in the media talk about how AI can revolutionise the world or your favourite industry, we forget the day-to-day work needed to turn that future into reality.

What should you do with the data?

Yes, having AI look at medical images is great and can give more accurate predictions than doctors. But how do you get that data? Medical data is hard to obtain, for good reason: patients' medical histories should not be passed around willy-nilly, as they are very sensitive information. This is not just data about your music preferences; it's about people's lives. After you get the data, how should you train on it? A simple 2D medical image may require a convolutional neural network, the default for computer vision. How do you train on the data? You will need labels, so the computer knows what it's looking at. A person with expertise in the field (a doctor) will need to help label the images. Depending on the goal of the model, the doctor will need to point out items in an image (for object detection) or just give the general category of the image (image classification). Now you have trained the model, and if the model is good, if it is correct more often than doctors, then you can think about how to move it to production.

How do you make sure the model is safe for use?

Hospitals are known for terrible bureaucracy and paperwork, depending on your country. So how would you get this AI into the hands that need it most? For example, should the doctor access the model via a web app and upload an image to the website? Or should it be a mobile app, where the doctor can point a camera at the hospital computer and the app gives a result? Only talking to your prospective users will give you the answer. And if the diagnosis is wrong, who comes under fire: the model or the doctor? There are even ethical questions when it comes to some areas of using machine learning.

When Google added machine learning to its data centres to help with energy usage, they added fail-safes just in case the AI did something funky. Many times, humans had to give a final sign-off of approval; when the AI changed something, humans were always in the loop. So, depending on the scale and the activity, safety features may have to be built in, rather than just focusing on your model's accuracy. In tons of areas where machine learning will be used, such safety features won't be needed; the model will simply speed up extracting useful information from the company's data, like spreadsheets or text documents. The main issue there is privacy, because if those documents contain customers' personal data, then ethics and regulations like GDPR get involved. But these are problems you will face even before touching the model.

What should you do with missing or inadequate data?

If the user decides to opt out of giving data in certain areas, how should the company give model recommendations to that user? Maybe it will need to guess using other data from the user. Or maybe it bases guesses on other users similar to the original user. Or maybe it just says the user can't use the service unless they opt in to giving certain data. I don't know; I guess it will depend on the company and the service they are providing.

Let's say you want to use satellite data and/or remote sensing for your project. One major question you need to ask before starting is whether the spatial resolution is enough for your task. If it isn't, you will notice halfway through collecting your data that it's not good enough, as you can't zoom in far enough to get the features you want from the image. This affected me in one of my projects, and I was later forced to use screenshots from Google Earth. If the project has commercial value, then it may make sense to buy higher-resolution images from places like Planet Labs, which launches high-quality satellites into space, allowing for high-resolution images with daily or near-daily updates. These are the things that don't get mentioned in media articles talking about "HoW ReMoTe SenSiNg CaN HeLp YoUr BuSiNeSs". To get cool things to work, you will need to do boring things.

Sending the model to production

I haven't even got to the step-by-step problems of releasing your model to the public. Because if you are going to do so, you need to quickly learn MLOps and basic software engineering, an area I'm hoping to learn soon. Like I said in the medical example: do you want to create a web app or a mobile app? Maybe you want to create an API, as your users are going to be developers. This side of machine learning is not talked about that much. After you release it, how will you update the model, and even the app? Should users give feedback to the model? Or are you going to do it personally by looking at user and error logs? Which cloud provider would you use to release your model? Would you go for the new serverless services or traditional server space? To improve the model, would you collect user data separately, then occasionally train the model on the new data? Or train as you go along? (Note: I don't know if you can do this.)
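To make the API idea concrete, here is a minimal Flask sketch of a prediction endpoint; the model, the preprocess step and the route name are hypothetical placeholders, not code from any project above:

from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    file = request.files["image"]            # the client POSTs an image file
    # tensor = preprocess(file)              # hypothetical preprocessing step
    # label = model(tensor).argmax().item()  # hypothetical trained model
    label = "green_tea"                      # placeholder result
    return jsonify({"label": label})

if __name__ == "__main__":
    app.run()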

As I get more experience, I should be writing in more detail on how to solve these issues, because I think this is an area where resources are lacking. Also, for my own selfish reasons, I want to share my work. Some apps allow you to interact with the model; this is something I want to do, and I could learn from the users of the app. So something like that should be on the horizon. If we want machine learning to be useful, then it's obvious models should be released in some way: an internal app, or a public web app. Models are not useful when stuck in a notebook. They are useful when released to the wider world, and when the model is being tested against reality.
