Using assert statements to prevent errors in ML code
In my previous post, I shared what I learnt about unit testing. Testing tends not to be thought about when coding ML models (the exception being production), so I thought it would be an interesting topic to learn about.
I found one unit test to try out because it solves an issue I face a lot when coding up my model.
The unit test checks whether I'm passing the right shape of data into the model, because I make this simple mistake from time to time. It can add hours to your project if you don't know the source of the problem.
After I shared the blog post on Reddit, a Redditor commented: "Why not just use assert?"
That was something that had not crossed my mind. So I jogged my memory by checking what assert does.
Then I started working out how to use it for testing ML code.
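As a quick refresher, here is a minimal sketch of what assert does in plain Python. The function name describe_batch and its check are made up for illustration and are not from the previous post. The statement takes a condition and an optional message; if the condition is false, Python raises an AssertionError carrying that message.

```python
def describe_batch(batch_size):
    # assert <condition>, <optional message>
    # If the condition is False, Python raises AssertionError(message).
    assert batch_size > 0, f"batch_size must be positive, got {batch_size}"
    return f"batch of {batch_size}"


print(describe_batch(32))  # condition holds, so nothing is raised

try:
    describe_batch(0)
except AssertionError as err:
    print(f"AssertionError: {err}")
```

One thing to keep in mind: assert statements are skipped entirely when Python runs with the -O flag, so they suit tests and sanity checks better than validation that must always run.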
One of the most popular blog posts on the internet about ML testing uses assertion statements to test the code.
Most of the time, the assertion statement goes inside a function, and that function is what becomes the unit test.
Assertion Statement for the Wrong Shape
I was able to hack up this simple assertion statement.
def test_shape():
    # images is the batch under test; BATCH_SIZE is defined elsewhere
    assert torch.Size((BATCH_SIZE, 3, 32, 32)) == images.shape
This is even shorter than the unit test I created in the last blog post.
I tried out the new test by dropping the batch dimension, the same thing I did in the last post.
images = images[0,:,:,:]
images.shape
Now calling test_shape() raises an AssertionError.
To make the assertion error clearer, I added info about the shapes of the tensors.
def test_shape():
    expected = torch.Size((BATCH_SIZE, 3, 32, 32))
    assert images.shape == expected, f'Wrong Shape: {images.shape} != {expected}'
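If you want to reuse the check for more than one tensor, one option is to pull it into a small helper. This is only a sketch: assert_shape is a hypothetical name, BATCH_SIZE = 4 is an assumed value, and the batches below are stand-in objects that only carry a .shape attribute so the snippet runs without PyTorch installed. With PyTorch you would pass the real images tensor instead, since torch.Size converts cleanly to a tuple.

```python
from types import SimpleNamespace

BATCH_SIZE = 4  # assumed value, for illustration only


def assert_shape(tensor, expected_shape):
    """Raise AssertionError if tensor.shape differs from expected_shape."""
    actual = tuple(tensor.shape)      # works for torch.Size and plain tuples
    expected = tuple(expected_shape)
    assert actual == expected, f"Wrong shape: {actual} != {expected}"


# Stand-in objects (hypothetical) that mimic a tensor's .shape attribute.
good_batch = SimpleNamespace(shape=(BATCH_SIZE, 3, 32, 32))
bad_batch = SimpleNamespace(shape=(3, 32, 32))  # batch dimension dropped

assert_shape(good_batch, (BATCH_SIZE, 3, 32, 32))  # passes silently

try:
    assert_shape(bad_batch, (BATCH_SIZE, 3, 32, 32))
except AssertionError as err:
    print(err)  # Wrong shape: (3, 32, 32) != (4, 3, 32, 32)
```

Passing the expected shape as an argument means the same helper can check the input batch, the labels, and the model output.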
This is super short. Now, you have something to try out straight away for your current project.
As I spend more time on this, I expect to write more about testing ML code.
An area I want to explore with ML testing is production, because I can imagine testing will be very important for making sure the data is all set up and ready before the model goes into production. (I don't have the experience, so I'm only guessing.)
When I start work on my side projects, I can implement more testing on both the model side and the production side, which would be a very useful skill to have.
-
If you liked this blog post, consider signing up to my mailing list, where I write more stuff like this.