Technology Tobi Olabode

Will building leverage get easier thanks to better technology?

I've been reading some of the work of Naval Ravikant. On the topic of building wealth, he stresses the need to build leverage, meaning your time is not tied to your output. You can work on a product for 10 hours and the returns can be 10x or 100x the value of that time. This is why Naval urges people to stop swapping time for money. Even doctors and lawyers, who get paid a lot by the hour, won't get as rich that way. The ones who did got there by starting something separate, like a private practice or selling a medical product.

Naval talks about the most recent form of leverage: products with little marginal cost of replication, which means media, books and code. With the internet, the cost of replication is close to zero. So if you write an eBook and sell it on Amazon, you don't need to pay printing costs; it goes straight to the user. Naval explains that this new form of leverage is great because it's permissionless. You can start producing without another person's approval. Things like social media, blogs and podcasts also count in this bucket. All you need is a microphone or a camera and you can start. He mentions that code comes with extra leverage because it can work 24/7. In this context he means you can rent servers from the tech companies, place your code on them, and they can run the service for you around the clock.

Which brings me to the topic at hand. Code is a high-leverage skill because of what you can build with it at little replication cost. And while code is already a high-leverage skill, machine learning may increase that leverage even more. You are teaching a machine to learn a process, and once that process is learnt it can be replicated in many places, compared to normal programming where you make the end product from scratch. Before, you may have had to rely on human leverage, aka human labour[i]. But now you can use AI to complete a task, possibly faster and more accurately. Like Naval said, you have datacentres packed with robots, so you have other robots helping the robots you made. And the machine learning model improves as people use the product, because users keep adding new data to it. As time goes on, the leverage increases as well.

I think the best examples of this are the major tech companies. With Google, each search makes the service better, as they collect data on how the service is being used. As the service gets better, more people are likely to use the search engine because it gives them what they want, expanding their reach even more. Same with Facebook and the data they collect: it makes the service better (which means more money). As you click on certain posts on your timeline you are training the algorithm, which then shows you more posts you're likely to click on, helping you stay on the platform. And as it knows more about you, it can sell that information to advertisers, who display targeted ads on your feed.

So compared to other products, machine learning products can have a strong flywheel effect, where the cost of replication is not just low but the product gets better after each replication. This is where some of these algorithms get powerful. It is also why regulation is likely to step in: the flywheels of these companies are just too strong, and competitors can't compete with them. A competitor will never have enough data coming in to go head to head with them. Don't get me wrong, there are some exceptions, like TikTok, which got a great data flywheel going and used it to grow the product even more.

Let's go back to the eBook example. As soon as the author publishes the book, they are not getting paid by the hour but by copies sold, disconnecting the input from the output. The author can make tons of money by selling lots of copies while the hours put into the book stay the same. That is where the leverage comes from. But imagine that each time a person finishes reading the book, the book improves ever so slightly, so the next person sees an improved book. This is what is happening for the tech companies I mentioned above. That is why it will be hard for a beginner to catch up.

While these systems are democratic, meaning everyone can use them, after a while they become almost an oligopoly due to entrenched powers. This applies at the highest levels, like the tech companies, down to the lower levels, like content creators. A content creator nowadays will find it harder to build an audience than a couple of years ago, because massive creators on the platform take most of the attention. As they pull in lots of the clicks and views, the algorithms tend to be biased in their favour, thanks to their history of generating attention for the platform.

This creates an entrenched class of content creators on a platform. Those creators can use the money they earned from making content to make better content that people are more likely to see (good on them though), and leverage their existing audience to get further reach. Which is great for them, don't get me wrong. But because of the increased leverage, the compounding makes it harder for everyone else to catch up.

This is not a sob story. With the internet, all of us now have the chance to make our own flywheel. And as Naval said, you can "escape competition through authenticity", meaning you can build your own monopoly just by being yourself.


[i] A lot of human labour is still used for labelling data. So human labour is still important for making AI.

Machine Learning Tobi Olabode

To Get Better, Do the Real thing

I'm not going to lie; I haven't been spending much time working on a machine learning project, due to university work. While that may be a valid reason, it still means my machine learning skills won't get better. So I should soon make it a priority to start training some models.

Recently I watched a podcast with Lex Fridman and George Hotz. Hotz is a very eccentric figure, to say the least, but a fascinating person, which made the podcast very enjoyable. On the subject of self-help advice, he said that he can't give good advice, especially for generic questions like, in his own words, "how do I become good at things?" His answer: "just do [the thing] a lot".

When he was asked how to be a better programmer, he just replied: program for 20 years. He says many times that if you want to learn a skill, you have to do it. On self-help, he said those books tend to be useless, as they say the things people want to hear, not real stuff like "work harder".

Please do the real thing

This reminds me of an article by Scott Young, one of my favourite bloggers, titled "Do The Real Thing". It echoes the sentiment above: to get good at something, you want to do the real thing, time and time again. Substitutes don't count. He gave the example of his language learning journey. If he wanted to get better at speaking the foreign language he was learning, then he had to speak it with native speakers. Learning vocab or reading can help, but he still needed to do the activity itself.

The same thing applies to improving my machine learning skills. Make as many models as you can and you cannot help but get better, as you are googling things left, right and centre.

Along the way you pick up the general process of a deep learning project: getting the data, cleaning the data, choosing a model, training it, testing it, debugging it, then publishing it. All of that is learnt by doing the thing.

Machine learning skills I want to learn

This is why I want to start a new project. But I don't know yet what to build for my next deep learning project.

I liked the green tea vs oolong tea project. I thought it was very original and I enjoyed making it, even with the many frustrations of getting the model to work. And I learnt how to use PyTorch, which is something I will likely be using in future projects.

I may spend more time expanding the green tea vs oolong tea model: converting it into an object detection model, publishing it so the public can use it with services like Streamlit, building a custom frontend with Flask, or converting it into a mobile app. Those options look nice.

But I want to try something new, so I want to start a new project. Recently I have been thinking of trying something simple, like a cats vs dogs image classifier. The reason is that I want an excuse to try the new fastai library, as they rebuilt it from the ground up on top of PyTorch. It will be nice to see what changed and to get used to fastai again.

Still on the horizon are GANs. I have always found GANs very interesting, but each time I tried to implement one I failed, so I think my prerequisites are not there yet. Soon I will probably try making a GAN again.

Also, like I mentioned in many previous blog posts, we need to learn how to get models out to the wider public, not just keep them in our notebooks. I haven't been following my own advice. So I want to spend time using things like Streamlit, or putting an API in front of one of my ML projects. The production phase of the ML pipeline is not taught enough in the ML community, I think. So I want to stay true to my word and start learning about production itself: things like MLOps and the basic fusion of software engineering and machine learning.
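To make that concrete, here is roughly what a Streamlit front end for an image classifier could look like. This is a minimal sketch under my own assumptions (a TorchScript model saved as model.pt, 224x224 inputs, two hypothetical class names), not code from any of my projects:

# Minimal Streamlit front end for an image classifier (sketch).
# Assumes a TorchScript model saved as "model.pt" and 224x224 RGB inputs.
import streamlit as st
import torch
from torchvision import transforms
from PIL import Image

CLASSES = ["green tea", "oolong tea"]  # hypothetical labels

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

model = torch.jit.load("model.pt").eval()

st.title("Bottled tea classifier")
uploaded = st.file_uploader("Upload an image", type=["jpg", "jpeg", "png"])
if uploaded is not None:
    image = Image.open(uploaded).convert("RGB")
    st.image(image, caption="Your image")
    with torch.no_grad():
        logits = model(preprocess(image).unsqueeze(0))
    st.write("Prediction:", CLASSES[logits.argmax(1).item()])

You would run it with streamlit run app.py and open the local URL it prints.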

Now that I'm thinking about it, one of the best ways to learn those skills is working for a tech company, where you need to publish to the wider public and the model needs to be effective enough that users get real results. But I don't have that luxury yet. My projects will have to count.

Reading helps but Intuition comes from action

Going back to the topic at hand: all these areas I want to learn will need to be learnt by doing them. With first-hand experience you develop an intuition for the topic and can produce tangible things with that knowledge, further cementing your skills. Just reading about a topic gives you a high-level view, which is fine; you don't need to learn the ins and outs of every single topic. But for the ones where deep understanding can push you towards your goals, doing the boring work of doing the real thing is a must.

  

Personal Project, Machine Learning Tobi Olabode

Image classifier for Oolong tea and Green tea

Developing the Dataset

In this project, I will be making an image classifier. My previous attempts a while ago did not work. To change it up a bit, I will be using the PyTorch framework rather than TensorFlow. As this will be my first time using PyTorch, I will work through a tutorial before I begin. The project is a classifier that spots the difference between bottled oolong tea and bottled green tea.

The tutorial I used was PyTorch's 60 Minute Blitz (it did take me more than 60 minutes to complete though). After typing out the tutorial I got used to PyTorch, so I started moving on to the project. As this will be an image classifier, I needed to get a whole lot of images into my dataset. First I stumbled upon a Medium article which used an image scraper, but even after a few edits it did not work.

image001.png

So I moved to using Bing for image search. Bing has an image search API you can use, which makes it easier to collect images compared to Google. I used this article from PyImageSearch. I had a few issues with the API in the beginning, as the endpoints Microsoft gave me did not match the ones in the tutorial. After looking around and a few edits I was able to get it working.

image003.png
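For reference, the core of such a collection script boils down to a handful of requests calls. This is a hedged sketch against Microsoft's Bing Image Search v7 API, not the PyImageSearch code itself; the endpoint and key header may differ depending on how your Azure resource was created, which is exactly the mismatch I hit:

# Pull image URLs from the Bing Image Search API (sketch).
import requests

API_KEY = "YOUR_KEY"  # placeholder for your Azure subscription key
ENDPOINT = "https://api.bing.microsoft.com/v7.0/images/search"

params = {"q": "bottle green tea", "count": 50, "offset": 0, "imageType": "photo"}
headers = {"Ocp-Apim-Subscription-Key": API_KEY}

response = requests.get(ENDPOINT, headers=headers, params=params)
response.raise_for_status()
results = response.json()

# Each result's "contentUrl" points at the full-size image to download.
for item in results["value"]:
    print(item["contentUrl"])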

But looking at the image folder gave me this:

image005.png

After looking through the code I noticed that the program was not saving new images, but kept overwriting the same file named "000000". This was because I had not copied the final section of code from the blog post, which updated a counter variable.

image007.png

Now that I got the tutorial code to work, I could try my own search terms to create the dataset. First I started with green tea, using the term "bottle green tea", which gave me these images:

image009.png

Afterwards, I collected oolong tea images using the term "bottle oolong tea".

image011.png

Then I had to go through the dataset personally and delete any images that were not relevant to the class. The images I deleted looked like this:

image013.png

This is because we want the classifier to work on bottled drinks, so loose leaves are not relevant, regardless of how tasty they are.

There were a few blank images. Needless to say, they are not useful for the image classifier.

image015.png
image017.png

Even though this image has a few green tea bottles, it also has an oolong tea bottle, which will confuse the model. So it's better to keep images that only contain bottles from a single class, rather than a whole variety.

After I did that with both classes, I was ready to move on to creating the model. So I went to Google Colab and imported PyTorch.

As the dataset has fewer than 200 images, I thought it would be a good idea to apply data augmentation. I first found this tutorial which used PyTorch transforms.

When applying the transformations, I ran into a few issues: the images did not plot correctly, and the code did not recognise my images. But I was able to fix it.

image019.png

The issues stemmed from not indexing the dataset correctly, as ImageFolder (a PyTorch helper class) returns (image, label) tuples, not just a list of images.
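For reference, the setup ends up looking something like this; the folder layout and transform values here are illustrative, not my exact settings:

# Augmentation transforms plus the (image, label) tuples ImageFolder returns (sketch).
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

train_transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(15),
    transforms.ToTensor(),
])

# Expects data/train/green_tea/*.jpg and data/train/oolong_tea/*.jpg
train_dataset = datasets.ImageFolder(root='data/train', transform=train_transform)

image, label = train_dataset[0]   # each item is a (tensor, class index) tuple
print(image.shape, label)         # e.g. torch.Size([3, 224, 224]) 0

train_loader = DataLoader(train_dataset, batch_size=4, shuffle=True)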

Developing the model

After that, I started developing the model. I used the CNN from the 60 Minute Blitz tutorial. One of the first errors I dealt with was data not going through the network properly:

shape '[-1, 400]' is invalid for input of size 179776

 

I was able to fix this issue by changing the kernel sizes to 2 x 2 and the number of feature maps to 64.

self.fc1 = nn.Linear(64 * 2 * 2, 120) 
x = x.view(-1, 64 * 2 * 2)

Straight afterwards I fell into another error:

ValueError: Expected input batch_size (3025) to match target batch_size (4).

 

This was fixed by reshaping the x variable again.

x = x.view(-1, 64 * 55 * 55) 

The fix came from this forum post.

Then another error 😩.

RuntimeError: size mismatch, m1: [4 x 193600], m2: [256 x 120] at /pytorch/aten/src/TH/generic/THTensorMath.cpp:41

 

This was fixed by changing the linear layer again.

self.fc1 = nn.Linear(64 * 55 * 55, 120)
 

Damn, I did not know one dense layer could give me so many headaches.
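For anyone hitting the same wall: the 64 * 55 * 55 number only works out when the input size and the conv/pool arithmetic line up (with 224x224 inputs, two conv layers with 2x2 kernels each followed by a 2x2 max pool give 224 -> 223 -> 111 -> 110 -> 55). An easier route than doing that arithmetic by hand is to push a dummy batch through the convolutional part and read off the size. A sketch, with the layer sizes assumed rather than copied from my code:

# Avoid guessing the flattened size: run a dummy batch through the conv layers.
import torch
import torch.nn as nn

conv_layers = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=2), nn.ReLU(), nn.MaxPool2d(2, 2),
    nn.Conv2d(64, 64, kernel_size=2), nn.ReLU(), nn.MaxPool2d(2, 2),
)

with torch.no_grad():
    dummy = torch.zeros(1, 3, 224, 224)              # one fake RGB image
    flat_features = conv_layers(dummy).flatten(1).shape[1]

print(flat_features)                                 # 64 * 55 * 55 = 193600 for this setup
fc1 = nn.Linear(flat_features, 120)                  # no more shape guessing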

 

After training, I needed to test the model. I had not made a test folder before making the model (rookie mistake). I made one quickly afterwards by using the first 5 images of each class. This is a bad thing to do, as it contaminates the data and leads to overly optimistic results, but I needed to see if the model was working at the time.

I wanted to plot one of the images in the test folder, so I borrowed the code from the tutorial. This led to an error, but I fixed it by changing the range to 1 instead of 5. This was because my model only has 2 labels (tensor[0] and tensor[1]), not 4.

When I loaded the model, it threw me an error, but this was fixed by resizing the images in the test folder. After a few runs of the model, I noticed that it did not print the loss, so I edited the code to do so:

if i % 10 == 0:
    print('[%d, %d] loss: %.5f' %
          (epoch + 1, i + 1, running_loss / 10))
    running_loss = 0.0
image021.png

As we can see the loss is very high.

When I tested the model on the test folder it gave me this:

image023.png

Which means it's guessing at best. I later found this was because it predicted green tea for every image. With 5 of the 10 test images labelled green tea, that made it right 50% of the time.

So this led me into the world of model debugging: trying to reduce the loss and improve accuracy.

Debugging the model

I started to make some progress debugging my model when I found this Medium article.

The first point the writer made was to start with a simple model that is known to work with your type of data. I thought I was already using one, since I was borrowing the model from the PyTorch tutorial, but it did not work. So I opted for a simpler model shape, which I found in a TensorFlow tutorial: only 3 convolutional layers and two dense layers. I had to change the final layer parameters as they were giving me errors, since the original was designed with 10 classes in mind instead of 2. Afterwards, I fiddled around with the hyperparameters. With that, I was able to get the accuracy on the test images to 80% 😀.

Accuracy of the network on the 10 test images: 80 %
10
8 
image025.png
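For reference, a model in that spirit looks roughly like the sketch below: three convolutional blocks, two dense layers and a 2-class output. This is my reconstruction under assumed input sizes, not the exact code I used:

# Simple 3-conv / 2-dense classifier in the spirit of the TensorFlow tutorial (sketch).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleCNN(nn.Module):
    def __init__(self, num_classes=2, input_size=224):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 32, 3)
        self.conv2 = nn.Conv2d(32, 64, 3)
        self.conv3 = nn.Conv2d(64, 64, 3)
        self.pool = nn.MaxPool2d(2, 2)
        # Work out the flattened size once with a dummy tensor.
        with torch.no_grad():
            dummy = torch.zeros(1, 3, input_size, input_size)
            flat = self._features(dummy).shape[1]
        self.fc1 = nn.Linear(flat, 64)
        self.fc2 = nn.Linear(64, num_classes)

    def _features(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = F.relu(self.conv3(x))
        return torch.flatten(x, 1)

    def forward(self, x):
        x = self._features(x)
        x = F.relu(self.fc1(x))
        return self.fc2(x)

model = SimpleCNN()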

Testing the new model

As the test dataset was contaminated, because I had used images from the training dataset, I wanted to rebuild it with new images to make sure the accuracy was genuine.

To restructure it I did it in the following style:

https://stackoverflow.com/a/60333941


Then I loaded the train and test datasets separately:

train_dataset = ImageFolder(root='data/train')
test_dataset  = ImageFolder(root='data/test')

For the test images, I decided to use Google instead of Bing, as it gives different results. After that, I tested the model on the new test dataset.

Accuracy of the network on the 10 test images: 70 %
10
7

As this was not a significant decrease, the model has clearly learnt something about green tea and oolong tea.

Using the code from the PyTorch tutorial, I wanted to analyse it even further:

Accuracy of Green_tea_test : 80 %
Accuracy of oolong_tea_test : 60 %

Plotting the predictions

While I like this, I also wanted the program to tell me which images it got wrong. To do this, I stitched together the image data, truth labels and predicted labels in a separate list of dictionaries:

merge_list = []
for i, t, p in zip(img_list, truth_label, predicted_label):
  one_merge_dict = {'image': i, 'truth_label': t, 'predicted_label': p}
  merge_list.append(one_merge_dict)

print(merge_list)

On my first try I got this:

image029.png


As we can see, it's very cluttered and shows all the images. To clear it up I removed unneeded text.

image031.png

Now I can start separating the right predictions from the wrong ones.

I was able to do this by using a small if statement.

Now the program correctly plots only the images with an incorrect label, but the placement of the images is wrong. This is because the subplot positions still count the correctly classified images, even though the if statement does not plot them.


I corrected it by changing the loop:

image033.png
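The fix boiled down to only advancing the subplot position for misclassified images. A rough sketch of the idea, here done by filtering the list first (merge_list is the list of dictionaries built above; the imshow handling is simplified):

# Plot only the misclassified images, with no gaps in the subplot grid (sketch).
import matplotlib.pyplot as plt

wrong = [d for d in merge_list if d['truth_label'] != d['predicted_label']]

fig = plt.figure(figsize=(15, 15))
for plot_idx, d in enumerate(wrong):
    ax = fig.add_subplot(1, len(wrong), plot_idx + 1)
    ax.imshow(d['image'].permute(1, 2, 0))   # tensor (C, H, W) -> (H, W, C)
    ax.set_title(f"true: {d['truth_label']}, predicted: {d['predicted_label']}")
    ax.axis('off')
plt.show()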

I wanted to get rid of the whitespace, so I decided to change the plotting of images.

  

ax = plt.subplot(1, 4, i + 1)

fig = plt.figure(figsize=(15,15))

image035.png

Now I have an idea of what the model got wrong. In the first sample, the green tea does not have the traditional green design, so it's understandable that the model got it wrong. The second sample was oolong tea but was misclassified as green tea; my guess is that the bottle has a very light colour tone compared to the golden or orange tones of the oolong bottles in the training data. In the third example, the bottle has the traditional oolong design with an orange colour palette, but the model still misclassified it as green tea. I guess the leaf on the bottle affected the model's judgement, leading it to classify the image as green tea.

Now I have finished the project. That is not to say I won't come back to it, as additions could be made on the implementation side: a mobile app that can detect oolong or green tea with your phone's camera, or a simple web app where users can upload images of their bottled tea and the model classifies them on the website.

Technology Tobi Olabode

Remote work will get better because of human behaviour, not software

My experience

I have been doing university lectures remotely for a few months now, and with that I have been in a few video calls, like many people during the pandemic. When taking video calls I'm starting to notice that adapting to remote work is mostly about dealing with human behaviour, not software.

For example, in my video calls almost none of the students use their cameras, and many people use the chat to communicate with each other and the teacher. This is very different from many other places. I know of companies where all the employees have their cameras on and predominantly use the microphone to speak.

The Need for Structured Communication

So even with the same software, people use it very differently. This applies to remote work in general, not just video calls. I had to do some group work recently. While it went well, the work was done in an ad-hoc fashion over WhatsApp. This meant I was always on call, ready to read a message in the group chat. One time I was eating my food and had to interrupt it so I could complete some work.

Now I understand the things I hear about on Cal Newport's podcast: the danger of using instant messengers like Slack and email for work communication, and the need for more structured communication. There is popular software for project management, like Notion and Asana, but it's just a matter of getting people to use it. The status quo is easier in the moment, and moving to a new system has a high activation cost. The team will need to ask important questions like: how do you transfer your tasks? How do you get everyone up and running with the software? There are lots of questions to answer.

New Systems of Working

As we are all new to this, people will eventually work out new systems to get the most out of remote work. Some companies can use remote work to their advantage. Gumroad and other small tech companies use asynchronous communication, which means people don't need to be in the same time or place to send or receive messages.

Sahil Lavingia, the founder of Gumroad, says this about asynchronous communication:

All communication is thoughtful. Because nothing is urgent (unless the site is down), comments are made after mindful processing and never in real-time. There's no drama

Because everyone is always effectively "blocked," everyone plans ahead. It also means anyone can disappear for an hour, a day, or a week and not feel like they are holding the company back. Even me!

I think this is a sign of things to come, because remote work makes new working styles possible. With this text-based style, no meetings are needed and plans are well thought through.

Sahil also mentions that:

People build their work around their life, not the other way around. This is especially great for new parents, but everyone benefits from being able to structure their days to maximize their happiness and productivity.

This allows us to spend more time on things we love, like our family or other hobbies. Remote work makes it possible to live without following the traditional 9-5 structure, or the hustle-minded 80-hour workweek. It also lets people be more location independent. In many cities around the world house prices are skyrocketing due to a lack of housing, and people buy shoeboxes going for half a million dollars. With that same money they could buy a whole acreage in Nebraska, giving them more space for their family and themselves and, depending on their lifestyle, a more enjoyable time.

I remember many times in college wondering why I needed to be in the classroom, since a lot of the work could be done at home. I guess a lot of people in the workplace think the same thing.

With asynchronous communication and remote work, we can allow employees to become time and location independent. The time independence is within reason, of course; employees still need to get work done, that goes without saying.

Software is still very helpful

Even software itself can help people transition into remote work more easily. NVIDIA has showcased awesome technology using GANs that recreates an artificial version of a person's face from a few key points, allowing a video feed to work with very little bandwidth. So people in rural areas and other places with a bad internet connection can join video calls. With that, they can be fully location independent, since location independence normally implies a good internet connection. Choppy video will become a thing of the past.

Like I mentioned earlier, there are a lot of project management tools out there; people just need to use them. People will need to get used to structured communication and to video calls. They are quickly designing etiquette on the fly, like muting your microphone when you enter the call, or messaging before you want to speak. Some companies have a culture of using video calls to get closer to the team, so employees talk briefly about their lives once a week in an all-hands meeting and use it to get updates from the rest of the team.

The Potential of remote work

With remote work, people can have time to take walks around the neighbourhood or cook lunch for themselves, instead of staying in an office all day. Granted, people still need to get used to this, as lots of people, including myself, stay at home all day. Like I mentioned before, it's not a problem of software but of human behaviour.

So employees, on a personal level, will start to get the most out of remote work by improving their overall wellbeing and productivity. During the summer, I was able to take walks almost every single day, enjoying the local scenery and parks (now it's winter that is looking a bit more difficult). On NHK World I watched a programme about people moving to the suburbs, in which the town of Atami had a nice beachfront and people wanted to move away from the hustle and bustle of Tokyo. I can now imagine that with remote work, during your lunch break you could take a walk along the beach and have seafood for lunch.

I think remote work is something that is going to stay, though there are lots of improvements still to be made.

Technology Tobi Olabode

Deep learning is already mainstream

While I was scrolling on Twitter, I saw a tweet showing that the term "deep learning" is plateauing on Google Trends. Then Yann LeCun replied that it's simply because deep learning has become more normal.

This reminds me of a previous blog post I wrote about how good tech is like good design: when technology is good, it embeds itself in our society and becomes invisible. Gmail's Smart Compose feature is 100% deep learning, but we don't think about it as ML. Amazon's recommendations are ML, but we don't think about them that way either. In normal discourse we simply call them algorithms, which is an accurate term, while abstracting away most of the advanced details.

This makes sense, as only nerds care about the type of algorithm used to recommend films on Netflix (it's recommendation systems). Everybody else will simply say technology, or treat the company as a person: "Netflix sent me this message", "Facebook showed me this message", etc. When aeroplanes started getting popular we just said we took a flight to Washington DC, rather than saying a mechanical flying device carried us there.

As I write this blog post, I use Microsoft Word's Read Aloud feature to proofread it, where a robot voice reads my work back to me. The voice has improved tremendously. While it still has some robotic feel to it, it does a good job; it's like an editor personally reading my work. I also use Grammarly. While they do not say it, I'm pretty certain they use machine learning to spot mistakes in your writing. These very useful tools that help me improve my writing are driven by machine learning, even though people will simply call them technology.

This is the cycle of all technologies. First you have hype. Depending on how good the technology is, it may fail to even reach the mainstream and start again in the hype cycle. If it's good, it will still fall below expectations, not because it's bad, but because it failed to meet the sky-high expectations. Afterwards, people start to work out more practical uses of the technology. After a while the technology gets popular, but a lot of the hype fades away as people get used to it. I guess with deep learning and machine learning, the hype is starting to disappear but people are finding real uses for the technology.

You will have some standouts like GPT-3 and GANs, but most machine learning in the wild right now is a little bit boring: recommendation systems (think Netflix and Amazon), and forecasting, using past data to predict future behaviour. It tends to be boring as it's simply producing new data points based on past data, or in Amazon's case using AI to help sell more products. Which is no surprise; if you're a for-profit company, you need to make money.

While ML has its limits, I still think it's very popular because it can do so many things: generating images via GANs, classifying images with CNNs, predicting future behaviour with forecasting. This is why AI is so popular: you can use it if you have some type of data, and in the internet age the answer to that is always yes. Like early computers, which changed every industry through automation or communication, machine learning can help in those areas even more.

In the "good tech is like good design" blog post, I talked about how technology tends to be truly popular when people stop noticing it. Which is what is happening now.

In the article I said:

no one calls their company “Excel-based” or “Windows-based”. As it’s [just] a tool.

 

When people started using office software on their computers, it was revolutionary at the time. But people now don't call themselves an "Excel-first company" or an "email-first company". As people got used to them, using these services became a given. Soon having some type of data science role will be a given too, just like having a web developer for your company is a given. This will still mainly apply to tech companies, but non-tech companies are not far behind; they already hire web developers and server managers.

Machine Learning Tobi Olabode

Why learning how to use small datasets is useful for ML

Many times you hear about a large tech company doing something awesome with ML. You think that's great, and you wonder how you can do the same. So you copy their open-source code and try it for yourself. Then you notice the result is good, but not as amazing as you first thought. Then you spot that you are training the model with fewer than 200 samples while the tech company is using a million examples. So you conclude that's why the tech company's model performs so well.

This brings me to the topic of the blog post: most people doing ML need to get good at getting good results with small datasets. Unless you are working for a FAANG company, you won't have millions of data points at your disposal. As we apply the technology to different industries, we must deal with applying models when not much data is available. The lack of data can have many reasons. The industry may not have experience using deep learning, so collecting data is a new process. Or collecting data may be very expensive due to the extra tools or expertise involved. Either way, I think we need to get used to the fact that we are not Google.

We do not have datacentres full of personal data. Most businesses have fewer than 50 employees and deal with anywhere from a few dozen to a few thousand customers, depending on the business. Non-profits may not even have the resources to collect much data at all. So just getting insights from the data we have is super useful, compared to trying to work with a new cutting-edge model with all the bells and whistles. Remember, your user only cares about the results, so you can use the simplest model that works. Heck, if a simple feed-forward neural network does the job, then go ahead.

We should not worry about having the resources of tech companies, but about what we can do with the resources we have now. A lot of the gains in ML come from throwing ungodly amounts of data at a model and seeing what happens next. We need to manage with fewer than 200 samples. Luckily, people have developed techniques that can help, like image augmentation, which edits photos so the model sees the image in different orientations and gets a general idea of the object regardless of minor changes like size or direction. Soon we may have GANs that help produce more data from a small dataset, meaning we can train models on larger datasets thanks to GAN-generated data.

While reading about GPT-3 is fun, and it is very likely to lead to some very useful products (just check out Product Hunt), most of us are not going to have the opportunity to train a 10-billion-parameter behemoth. Don't get me wrong, we don't need to train a large model from scratch for it to be useful; this is what fine-tuning is for. The people using GPT-3 are not training the model from scratch but using prompts to prime the model for the problems it will be dealing with.
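Fine-tuning is also very doable on a small dataset. Here is a minimal sketch of the generic recipe with a pretrained torchvision model (paths and hyperparameters are placeholders, and this is the general idea rather than anything GPT-3 specific): take a pretrained network, freeze the backbone, and retrain only a new final layer on your own data.

# Fine-tuning sketch: pretrained ResNet backbone, new classification head.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
train_data = datasets.ImageFolder('data/train', transform=transform)
loader = DataLoader(train_data, batch_size=8, shuffle=True)

model = models.resnet18(pretrained=True)   # newer torchvision prefers weights=...
for param in model.parameters():           # freeze the pretrained weights
    param.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, len(train_data.classes))  # new head

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

for epoch in range(5):
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()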

I think we need to deal with the bread-and-butter problems that people want AI to solve. Like a simple image classifier, which may be useful for a small business that needs it to sort out different weights in its store. Or a simple time series analysis to forecast next season's sales for the shopkeeper. Models from Facebook and Google with 100 layers will not be helpful, and will likely give you grey hairs setting them up. Again, the whole goal is to solve your customer's problem, not to split the atom.

Like a podcast I heard a while ago said: deep learning is already powerful enough, we just need to get it into the hands of the right people. Hopefully I can help with that process. To do that we need to be pragmatists and deal with the constraints most people have when applying ML to their projects. While I'm happy the researchers at OpenAI and the FAANG companies take deep learning to its limits, I want to have on-the-ground experience of taking that knowledge and improving the world (yes, it sounds very hippy). Most people will not have the resources to spend millions of dollars on a cloud provider to train a model. People may have a budget of a few hundred or maybe a few thousand, but not a few million. At current cloud computing rates, a budget like that should be more than enough, especially when dealing with small models and small datasets.

Machine Learning Tobi Olabode

The boring work of implementing ML models

As I write blog posts about the potential of AI to help industries, or as some hype article in the media talks about how AI will revolutionise the world or your favourite industry, we forget the day-to-day work needed to turn that future into reality.

What should you do with the data?

Yes, having AI look at medical images is great and can give more accurate predictions than doctors. But how do you get that data? Medical data is hard to obtain, for good reason: patients' medical histories should not be passed around willy-nilly, as they are very sensitive information. This is not just data about your music preferences; it's about people's lives. After you get the data, how should you train on it? A simple 2D medical image may require a convolutional neural network, the default for computer vision. You will need labels so the computer knows what it's looking at, and a person with expertise in the field (a doctor) will need to help label the images. Depending on the goal of the model, the doctor will need to point out items in an image (for object detection) or just give the general category of the image (image classification). Now suppose you have trained the model and it's good, correct more often than doctors. Then you can think about how to move it to production.

How do you make sure the model is safe for use?

Hospitals are known for terrible bureaucracy and paperwork, depending on your country. So how would you get this AI into the hands that need it most? Should the doctor access the model via a web app and upload an image to the website? Or should it be a mobile app, where the doctor points a camera at the hospital computer and the app gives a result? Only talking to your prospective users will give you the answer. And if the diagnosis is wrong, who comes under fire, the model or the doctor? So there are ethical questions too in some areas where machine learning is used.

When Google added machine learning to its data centres to help with energy usage, they added failsafes just in case the AI did something funky. Many times a human had to give a final sign-off of approval; when the AI changed something, humans were always in the loop. So depending on the scale and the activity, safety features may have to be built in, rather than just focusing on the accuracy of your model. In tons of areas where machine learning will be used, such safety features won't be needed; it will simply speed up extracting useful information from the company's data, like spreadsheets or text documents. There the main issue is privacy, because if the documents contain customers' personal data, then ethics and regulations like GDPR get involved. But those are problems you will face even before touching the model.

What should you do with missing or inadequate data?

If the user decides to opt out of giving data in certain areas, how should the company give model recommendations to that user? Maybe it will need to guess using the user's other data. Maybe it bases guesses on other, similar users. Or maybe it just tells the user they can't use the service unless they opt in to giving certain data. I don't know. I guess it will depend on the company and the service they are providing.

Let's say you want to use satellite data and/or remote sensing for your project. One major question you need to ask before starting: is the spatial resolution enough for your purposes? If not, you will notice halfway through collecting your data that it's not good enough, because you can't zoom in far enough to get the features you want from the image. This affected me in one of my projects, so I was later forced to use screenshots from Google Earth. If the project has commercial value, then it may make sense to buy higher-resolution images from companies like Planet Labs, which launch high-quality satellites and offer high-resolution images with daily or near-daily updates. These are the things that don't get mentioned in media articles about "HoW ReMoTe SeNsing CaN HeLp YoUr Business". To get cool things to work, you will need to do boring things.

Sending the model to production

I haven't even got to the step-by-step problems of releasing your model to the public. If you are going to do so, you need to quickly pick up MLOps and basic software engineering, an area I'm hoping to learn soon. Like in the medical example: do you want to create a web app or a mobile app? Maybe you want to create an API, if your users are going to be developers. But this side of machine learning is not talked about much. After you release it, how will you update the model, and the app itself? Should users give feedback to the model, or are you going to do it personally by looking at usage and error logs? Which cloud provider would you use to host your model? Would you go for the new serverless services or traditional server space? To improve the model, would you collect user data separately and occasionally retrain on the new data, or train as you go along? (Note: I don't know if you can do this.)
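As a starting point, the web app/API route can be surprisingly small. Here is a bare-bones sketch of serving a model behind a Flask endpoint; model.pt and the preprocessing are placeholders for whatever you trained:

# Serve an image classifier behind a simple Flask API (sketch).
import io

import torch
from flask import Flask, jsonify, request
from PIL import Image
from torchvision import transforms

app = Flask(__name__)
model = torch.jit.load("model.pt").eval()   # placeholder TorchScript model
preprocess = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])

@app.route("/predict", methods=["POST"])
def predict():
    file = request.files["image"]            # image sent as multipart form data
    image = Image.open(io.BytesIO(file.read())).convert("RGB")
    with torch.no_grad():
        logits = model(preprocess(image).unsqueeze(0))
    return jsonify({"class_index": int(logits.argmax(1).item())})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)

You could then test it with something like curl -F "image=@test.jpg" http://localhost:5000/predict. Everything after that (updates, logging, scaling) is where the real MLOps work begins.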

As I get more experience, I should write in more detail about how to solve these issues, because I think this is an area where resources are lacking. Also, for my own selfish reasons, I want to share my work. Some apps let users interact with the model; this is something I want to build, and I could learn from the users of the app. So something like that should be on the horizon. If we want machine learning to be useful, then it's obvious models should be released in some way, whether an internal app or a public web app. Models are not useful stuck in a notebook. They are useful when released to the wider world and tested against reality.

Technology Tobi Olabode

Battery Innovation

Recently I watched a few videos about Tesla Battery Day, their annual shareholder event, where Elon and his team talked about the improvements they have found in making batteries. This is a big deal, as making batteries cheaper and more efficient is one of the biggest bottlenecks facing Tesla. A lot of their improvements come from redoing the manufacturing process and finding better minerals to use for battery production. They touted a 69% reduction in cost per kilowatt-hour.

In my opinion, the most important thing Tesla announced was a cheaper electric car, priced around $25,000. This should allow many more people in the western world to buy an electric car and move away from gasoline cars, making the transition to cleaner transport faster, because there is still an untapped market for lower-end electric cars. While there are some, like the Nissan Leaf and Toyota Prius, I think Tesla can make a big play here and help more people purchase an electric car, further reducing our need for fossil fuels.

Making a good electric car is no minor feat. The first challenge is on the technological side: manufacturing batteries that are cheap and efficient, good enough to last a long time in a car. The second is on the supply chain side: sourcing materials and running the chemical processes to get them ready for manufacturing and production. This is why Elon mentioned that he wants the company to have an advantage in manufacturing batteries. Any car company that wants to create efficient batteries will have a lot of catching up to do with Tesla.

Also, if western companies don't want to get hammered on human rights issues, they need to find ways to source materials in western countries. Tesla says they will be getting more materials from Nevada, so I guess other companies will have to do the same. I don't know the mining locations for cobalt or other rare earth materials, so I can't give precise locations for mining them, but I guess European companies will do something similar by finding materials within European borders. A lot of the innovation when it comes to batteries is not sexy. It's boring, like finding better mining locations or improving the chemical processes to make the materials more efficient. But this is the innovation that is needed to transition to a carbon-neutral world. There will be some issues if companies want to mine in the west. Companies need to make sure the mining does not cause a ruckus with local homeowners; NIMBYism can derail a whole project.

Mining will also need to be environmentally friendly, to avoid the problems you get from mining something like coal, such as bad local air quality and destruction of the local environment.

Personal Project Tobi Olabode

Predicting Flooding with Python

Getting Rainfall Data and Cleaning

For this project, I will make a model that shows long-term flooding risk in an area. It relates to climate change and machine learning, which I have been writing a lot about recently. The idea was to predict whether an area has a higher risk of flooding in 10 years. The general plan was to get rainfall data, then work out whether the rainfall exceeds the land elevation; if it does, the area counts as flooded.

To get started I had to find rainfall data. Luckily, it was not too hard, but the question was which rainfall data I wanted to use. First I found the national rainfall data (UK), which looked very helpful. But as the analysis will be done on a geographic basis, I decided to use London rainfall data. When I got the rainfall data it looked like this:

image001.png

Some of the columns gave information about soil moisture, which was not relevant to the project, so I had to get rid of them. Also, as this is a geographic analysis, I decided to pick the column closest to the general location I wanted to map. So I picked the Lower Lee rainfall gauge, as I will analyse East London.

To do the data wrangling I used pandas, no surprise there. To start, I had to get rid of the first row in the dataframe, as it works as a second header row. This makes sense as the data was meant for an Excel spreadsheet.

I used this to get rid of the first row:

df = df[1:]

After that, I had to get rid of the locations I was not going to use. So I used pandas' iloc indexer to select a block of columns and drop them from the dataframe.

df = df.drop(df.iloc[:, 1:6], axis=1)
image003.png

After that, I used the dataframe drop function to get rid of the columns by name.

df = df.drop(['Roding', 'Lower Lee.1', 'North Downs South London.1', 'Roding.1'], axis=1)
image005.png

Now, before I show you the other stuff I did: I ran into some errors when trying to analyse or manipulate the contents of the dataframe. To fix these issues I converted the date column into a pandas datetime, with the day parsed first, because pandas defaults to the American date format. Then I changed the Lower Lee column into a float type. This had to be done because the header row I sliced off earlier had turned the columns into non-numeric data types. After all of this I could go back to further analysis.
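In code, those fixes were along these lines (the column names are assumptions, so adjust them to your own headers):

# Convert the date column to datetime (day first, UK style) and the rain column to float.
import pandas as pd

df['Date'] = pd.to_datetime(df['Date'], dayfirst=True)
df['Lower Lee'] = df['Lower Lee'].astype(float)   # was an object dtype after the extra header row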

To make the analysis more manageable, I decided to sum up the rainfall on a monthly basis rather than a daily basis, as daily data means a lot of extra rows, and monthly rainfall makes it easier to see changes at a glance. To do this I had to group the dataframe into monthly data. This is something I was stuck on for a while, but I was able to find the solution.

First I had to create a new dataframe that grouped the datetime column by month; this is why I had to change the datatype earlier. Then I used the aggregate function to sum the values. After that, I used unstack, which pivots the index labels, then reset_index(level=[0,1]) to turn the multi-index back into a single-index dataframe. Finally I dropped the level_0 column and renamed the remaining columns to date and rain.
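That groupby/unstack/reset_index chain is fiddly. An equivalent, simpler route (not the exact chain I used) is to resample the daily readings into calendar months, assuming the columns are called 'Date' and 'Lower Lee':

# Monthly rainfall totals via resample (alternative to the groupby/unstack chain).
import pandas as pd

monthly = (
    df.set_index('Date')['Lower Lee']
      .resample('M')            # group the daily rows into calendar months
      .sum()
      .reset_index()
      .rename(columns={'Date': 'date', 'Lower Lee': 'rain'})
)
print(monthly.head())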

image007.png

Analysing the Data

image009.png

One of the major issues that popped up was the data type of the date column. After tonnes of digging around on Stack Overflow, I found the solution was to convert it to a timestamp and then back into a datetime format. I think this has to do with converting the dataframe into a monthly dataframe, which must have messed up the data type, which is why I had to change it again.

A minor thing I had to adjust was the index, because when I first plotted the graphs the forecast did not show the date, only an increasing number. I went to the tutorial's notebook and saw that her dataframe had the date as the index. So I changed my dataset so the index contains the dates, and when the forecast is plotted the dates are shown on the x-axis.

Now for the analysis. This is a time series analysis, as we are doing forecasting. I found an article which I followed, and used the statsmodels package, which provides models for statistical analysis. First, we did a decomposition, which separates the series into trend, seasonal and residual components.

image011.png

Next, the tutorial asks us to check if the time series is stationary. In the article, it's defined as “A time series is stationary when its statistical properties such as mean, variance, and autocorrelation are constant over time. In other words, the time series is stationary when it is not dependent on time and not have a trend or seasonal effects.”

To check if the data is stationary, we used autocorrelation function and partial autocorrelation function plots.
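The plots come from statsmodels; a sketch, assuming the monthly rainfall sits in the 'Rain' column of the same dataframe used for the pmdarima call further down:

# Autocorrelation and partial autocorrelation plots for the stationarity check.
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

fig, axes = plt.subplots(2, 1, figsize=(12, 8))
plot_acf(new_index_df_new_index['Rain'], lags=24, ax=axes[0])    # autocorrelation
plot_pacf(new_index_df_new_index['Rain'], lags=24, ax=axes[1])   # partial autocorrelation
plt.show()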

image013.png
image015.png

As there is a quick cut-off, the data is stationary. The autocorrelation and partial autocorrelation functions give information about the dependence between time series values.

Next we use another Python package called pmdarima, which will help me decide the parameters of my model.

import pmdarima as pm
 
model = pm.auto_arima(new_index_df_new_index['Rain'], d=1, D=1,
                      m=12, trend='c', seasonal=True, 
                      start_p=0, start_q=0, max_order=6, test='adf',
                      stepwise=True, trace=True)


All of the settings were taken from the tutorial. I will let the tutorial explain the numbers:

“Inside auto_arima function, we will specify d=1 and D=1 as we differentiate once for the trend and once for seasonality, m=12 because we have monthly data, and trend='C' to include constant and seasonal=True to fit a seasonal-ARIMA. Besides, we specify trace=True to print status on the fits. This helps us to determine the best parameters by comparing the AIC scores.”

After that I split the data into train and test sets.

train_x = new_index_df_new_index[:int(0.85*(len(new_index_df_new_index)))]
test_x = new_index_df_new_index[int(0.85*(len(new_index_df_new_index))):]
image017.png

When splitting the data for the first time I used scikit-learn's train_test_split function, but this led to some major errors later when plotting the data, so I'm using the tutorial's method instead.

Then we trained a SARIMAX model based on the parameters produced earlier.

from statsmodels.tsa.statespace.sarimax import SARIMAX

model = SARIMAX(train_x['Rain'],
                order=(2,1,0),seasonal_order=(2,1,0,12))
results = model.fit()
results.summary()
image019.png

Plotting the forecast

Now that we have a trained model, we can start work on forecasting.

forecast_object = results.get_forecast(steps=len(test_x))
mean = forecast_object.predicted_mean
conf_int = forecast_object.conf_int()
dates = mean.index

These variables are used to help us plot the forecast. The forecast is as long as the test dataset. The mean is the average prediction, the confidence interval gives us a range in which the values are likely to lie, and the dates provide an index so we can plot against time.

plt.figure(figsize=(16,8))

df = new_index_df_new_index
plt.plot(df.index, df, label='real')

plt.plot(dates, mean, label='predicted')

plt.fill_between(dates, conf_int.iloc[:,0], conf_int.iloc[:,1],alpha=0.2)

plt.legend()
plt.show()

image021.png

This is an example of an in-sample forecast. Now let's see how we make an out-of-sample forecast.

pred_f = results.get_forecast(steps=60)
pred_ci = pred_f.conf_int()
ax = df.plot(label='Rain', figsize=(14, 7))
pred_f.predicted_mean.plot(ax=ax, label='Forecast')
ax.fill_between(pred_ci.index,
                pred_ci.iloc[:, 0],
                pred_ci.iloc[:, 1], color='k', alpha=.25)
ax.set_xlabel('Date')
ax.set_ylabel('Monthly Rain in lower lee')
plt.legend()
plt.show()

image023.png

This is forecasting 60 months into the future.

 

Now that we have forecast data, I needed to work out which areas can get flooded.

 

Getting Elevation Data

To work out which areas are at risk of flooding I had to find elevation data. After googling around, I found that the UK government provides elevation data for the country, collected using LIDAR. While I was able to download the data, I worked out that I did not have a way to view it in Python, and I might have to pay for and learn a new program called ArcGIS, which is something I did not want to do.

So I found a simpler alternative: the Google Maps Elevation API, where you can get the elevation of an area using coordinates. I was able to access the elevation data using the Python requests package.

import requests
r = requests.get('https://maps.googleapis.com/maps/api/elevation/json?locations=39.7391536,-104.9847034&key={}'.format(key))
r.json()

{'results': [{'elevation': 1608.637939453125,
   'location': {'lat': 39.7391536, 'lng': -104.9847034},
   'resolution': 4.771975994110107}],
 'status': 'OK'}

Now we need to work out when the point will get flooded. Using the rainfall data, we compute the difference between the elevation and the rainfall; if the rain exceeds the elevation, then the place is underwater.

import json
r = requests.get('https://maps.googleapis.com/maps/api/elevation/json?locations=51.528771,0.155324&key={}'.format(key))
r.json()
json_data = r.json()
print(json_data['results'])
elevation = json_data['results'][0]['elevation']
print('elevation: ', elevation )

rainfall_dates = []
for index, values in mean.iteritems():
    print(index)
    rainfall_dates.append(index)

print(rainfall_dates)
for i in mean:
  # print('Date: ', dates_rain)
  print('Predicted Rainfall:', i)
  print('Rainfall vs elevation:', elevation - i)
  print('\n')
Predicted Rainfall: 8.427437412467206
Rainfall vs elevation: -5.012201654639448


Predicted Rainfall: 40.91480530998025
Rainfall vs elevation: -37.499569552152494


Predicted Rainfall: 26.277342698245548
Rainfall vs elevation: -22.86210694041779


Predicted Rainfall: 16.720892909866357
Rainfall vs elevation: -13.305657152038599

As we can see, if a whole month's rainfall dropped all in one day, then the area would get flooded.

diff_rain_ls = []
for f, b in zip(rainfall_dates, mean):
    print('Date:', f)
    print('Predicted Rainfall:', b)
    diff_rain = elevation - b
    diff_rain_ls.append(diff_rain)
    print('Rainfall vs elevation:', elevation - b)
    print('\n')
    # print(f, b)

This lets me match the dates with the rainfall vs elevation difference.

df = pd.DataFrame(list(zip(rainfall_dates, diff_rain_ls)), 
               columns =['Date', 'diff']) 
df.plot(kind='line',x='Date',y='diff')
plt.show()
image025.png

I did the same thing with the 60-month forecast

rainfall_dates_60 = []
for index, values in mean_60.iteritems():
    print(index)
    rainfall_dates_60.append(index)

diff_rain_ls_60 = []
for f, b in zip(rainfall_dates_60, mean_60):
    print('Date:', f)
    print('Predicted Rainfall:', b)
    diff_rain_60 = elevation - b
    diff_rain_ls_60.append(diff_rain_60)
    print('Rainfall vs elevation:', elevation - b)
    print('\n')

In the long term, the forecast says there will be less flooding. This is likely because the data collection is not perfect and the timespan is short.

How the Project Fell Short

While I was able to work out the amount of rainfall needed to flood an area, I did not meet the goal of showing it on a map. I could not work out the LIDAR data from earlier, and the Google Maps packages for Jupyter notebooks did not work. So I only had the coordinates and the rainfall amount.

I wanted to make something like this:

For the reasons I mentioned earlier, I could not do it. The idea was to have the map zoomed in to the local area, showing which properties and land would be underwater.

I think that's the main bottleneck: getting a map of elevation data which can be manipulated in Python. From that, I could create a script that colours areas with a low elevation.

Why you should NOT use this model

While I learnt some things from the project, I do think there are some major issues with how I decided which areas are at risk. Just calculating monthly rainfall and finding the difference from the elevation is arbitrary. What does monthly rainfall tell you when a real flood can pour 10x more rain in a day? This is something I started to notice as I went through the project. Floods in the UK tend to come from flash flooding, where a month's worth of rain pours down in one day, so there will only be some correlation with normal rainfall. There are other data points that real flood mappers use, like simulating the physics of the water to see how it will flow and affect the area (hydrology). Other data points can include temperature and snow. Even the data I did have could have been better: the longest national rainfall dataset went back to the 70s. I think I did a good job picking the local rain gauge from the dataset (Lower Lee), but I wonder if it would have been better to take the average or sum of all the gauges to get a general idea of rainfall across the city.

So, besides the fact that I did not map the flooding, this risk assessment is woefully inaccurate.

If you liked reading this article, please check out my other blog posts:

Failing to implement my first paper

How I created an API that can work out your shipping emissions

Machine Learning Tobi Olabode

Why The Best Model Can Be The Simplest

AI does not need to be in everything

Right now I'm working on a project that should identify areas with a high risk of flooding in the future. To do that I had to create a model that performs a time series analysis and forecasts into the future. Instead of using a deep learning model like an LSTM, I went with a tried and tested model for time series analysis called ARIMA (mainly because I was following a tutorial). This made me think about how all the bells and whistles are often not needed.

As long as you're solving the problem, the tool you're using does not matter. I think this is the problem that people with engineering backgrounds have: they fall in love with the tool, not with solving the problem. Don't get me wrong, I do this all the time. But I think it's important to solve the problem first, rather than having a tool and looking for a problem to solve with it.

I have been doing a lot of this recently in a bid to improve my machine learning skills; I have been finding problems that require me to make models. So there is a time and place for all this. One of the benefits of having a simple model is that things are less likely to break down, as you have fewer dependencies and connections between different bits of your code. It is easier to maintain, as you're not getting a headache trying to comb through your model. The more stuff you add, the harder it is to troubleshoot. For example, if your model is overfitting and you're trying to fix it with 10 layers, it will be harder, as you need to tweak the parameters of all 10 layers, compared to a simpler model with only 3. Also, a simple model takes fewer resources, so if you're running it on your local computer, your machine is less likely to keel over. Which saves you from buying a new computer, which is great.

Making Useful Projects

From Seth Godin, This is Marketing:

It doesn’t make any sense to make a key and then run around looking for a lock to open. The only productive solution is to find a lock and then fashion a key. It's easier to make products and services for the customer you seek to serve than it is to find customers for your products and services.

While he was talking about making products, it relates to this blog post, as solving problems is making products and services, even if we don't call it that.

In my applied technology blog post, I talked about how people can make effective solutions using simple tools. The example I gave was a radiologist who made an ML model to find fractures in X-rays using Google's no-code machine learning tool. This shows we can create useful products with simple tools. I think we should not be too nerdy about tools. Granted, being nerdy is great, but we should always be thinking about the end user first and foremost.

I think lots of developers get stuck in the "build it and they will come" mindset. I'm only recently shedding this mentality, putting my focus on people's needs and wants. Granted, I still want to build cool things, but if we want people to use our products then they need to be useful. The reason machine learning is popular now is that it helps solve problems we could not solve before, and we have the elements to make it effective: powerful computing and lots, and I mean lots, of data. This helped deep learning go mainstream.

I think I mentioned this in another blog post, but this is why I choose projects that I could see a person using, not working out handwritten digits from the MNIST dataset. The whole point of having tools is to let us solve problems we couldn't solve without them. We have hammers to drive in nails, which we couldn't do before. We have paddles to help us travel by boat quicker. And we have computers to help us compute things ridiculously fast. So we should not get distracted by the next shiny toy (tool). That does not mean you should never change and should stick to the tools you already know; if that were the case, we would still be using sticks and rocks.

The Tool does not matter, it’s the solution that matters

From Patrick McKenzie (Patio11)

Software solves business problems.  Software often solves business problems despite being soul-crushingly boring and of minimal technical complexity.

It does not matter to the company that the reporting form is the world’s simplest CRUD app, it only matters that it either saves the company costs or generates additional revenue.

 

Even when companies are using advanced technology and techniques, sometimes the goal is still very simple. Netflix has a great recommendation system so you watch more movies. Facebook employs PhDs in psychology and artificial intelligence so you spend more time on Facebook and look at more ads. Google has crawled the entire internet so that when you search you get the information you want.

Again, it all comes back to the end user. I'm a nerd, so all this stuff interests me and I can spend all day tinkering with new technology. But we must remember that when we build something it should help somebody else. It does not matter how small the audience is. Many people say the audience can just be yourself, which is helpful, because if it's useful for one person then it's probably useful to other people.

While I'm writing this, a lot of it applies to myself, which is partly why I'm writing this blog post: to say, don't make things complicated if they don't need to be. Don't build a rocket ship when you can just get a boat. Don't get me wrong, building things is fun, which is why I have a hobby like this. But it's important not to get too distracted.
