Building a custom Image Classification Model

KamilTomaszewski

Hi @suburban-daredevil

Please try this:

1. Copy the whole sdk/apps/examples/tflmrt_lenet folder to a directory with a new name.
2. Replace all tflmrt_lenet strings in Makefile, Make.defs, Kconfig and tflmrt_lenet_main.c with the new name chosen in step 1. E.g.
   tflmrt_lenet -> new_app
   TFLMRT_LENET -> NEW_APP
3. Change the file name of tflmrt_lenet_main.c to <new name>_main.c.

Best Regards,
Kamil Tomaszewski

suburban-daredevil

Hey @kamiltomaszewski

Thanks for taking the time to respond!

I am in the directory that you have mentioned.

I am attaching a screenshot of the path I am currently in:

7975e949-19cc-4a0a-9123-8753085b4b92-Screenshot from 2022-01-17 21-42-03.png

But in the specified path, there is no such directory as tflmrt_lenet

81b1d752-caa6-48fa-9d78-5b47c6ee82c4-Screenshot from 2022-01-17 21-43-02.png

However, the tflmrt_lenet directory is present in another location. From the look of it, it contains all the necessary files.

f24a5c12-ee24-486c-8b92-7bb527686b56-Screenshot from 2022-01-17 21-47-52.png

dc86c29a-3beb-4d2a-ba0e-51a0dc782e2f-Screenshot from 2022-01-17 21-48-09.png

8f722348-a4ae-4e86-b096-316ed1759221-Screenshot from 2022-01-17 21-50-04.png

Are they both the same? Or is this the one you mentioned?

Thanks

KamilTomaszewski

Hi @suburban-daredevil

I am sorry, my mistake. I meant: examples/tflmrt_lenet.

Best Regards,
Kamil Tomaszewski

suburban-daredevil

Hi @kamiltomaszewski

I was going through the tflmrt_lenet example code and found the following function declaration:

int tflm_runtime_forward(tflm_runtime_t *rt, const void *inputs[],
                         unsigned char input_num);

Here what does the variable "input_num" represent?

Thanks

suburban-daredevil

Hi @KamilTomaszewski

In the code given below

int tflm_runtime_output_shape(tflm_runtime_t *rt, unsigned char output_index, unsigned char dim_index)

What do the variables "output_index" and "dim_index" represent?

Thanks

KamilTomaszewski

Hi @suburban-daredevil

input_num is the number of inputs you pass in the inputs array for your neural network. It is equal to the value returned by tflm_runtime_input_num().
output_index is the index that selects an output. You can check the number of outputs of your neural network using tflm_runtime_output_num().
dim_index is the index that selects a dimension. You can check the number of dimensions of a specific output using tflm_runtime_output_ndim().

Below is a short code that uses these functions to print information about the output of your neural network:

  /* Number of output tensors of the network */
  int output_num = tflm_runtime_output_num(&rt);
  printf("output num: %d\n", output_num);
  for (int i = 0; i < output_num; i++)
  {
    /* i is the output_index; ndim is the number of dimensions of output i */
    int ndim = tflm_runtime_output_ndim(&rt, i);
    printf("output: %d, size: %d\n", i, tflm_runtime_output_size(&rt, i));
    printf("output: %d, dim num: %d\n", i, ndim);
    for (int j = 0; j < ndim; j++)
    {
      /* j is the dim_index */
      printf("output: %d, dim: %d, shape: %d\n", i, j, tflm_runtime_output_shape(&rt, i, j));
    }
  }
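
To relate the per-dimension shapes to each other, here is a minimal, self-contained sketch that multiplies the shape of every dimension to get the number of elements in one output tensor. The helper tensor_element_count and the example shape {1, 10} are hypothetical and not part of the SDK; in an application the shape values would come from tflm_runtime_output_shape().

```c
#include <stddef.h>

/* Hypothetical helper (not part of the Spresense SDK): multiply the size of
 * every dimension to get the number of elements in a tensor. */
static size_t tensor_element_count(const int *shape, int ndim)
{
  size_t count = 1;

  for (int i = 0; i < ndim; i++)
    {
      count *= (size_t)shape[i];
    }

  return count;
}
```

For a classifier output with 2 dimensions and shapes 1 and 10, dim_index would run over 0 and 1, and the helper would return 10 elements.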

Best Regards,
Kamil Tomaszewski

suburban-daredevil

Hi @kamiltomaszewski

Is there any resource / documentation explaining the usage of all the function calls (especially the ones in the runtime.h file) used in the tflmrt_lenet program?

Thanks

KamilTomaszewski

Hi @suburban-daredevil
You can find the TFLMRT documentation here: https://developer.sony.com/develop/spresense/docs/sdk_developer_guide_en.html#_tflm_runtime
tflmrt_lenet example is described here:
https://developer.sony.com/develop/spresense/docs/sdk_tutorials_en.html#_tflmrt_sample_application
You can find the description of the functions in the comments in this file:
https://github.com/sonydevworld/spresense/blob/master/sdk/modules/include/tflmrt/runtime.h

Best Regards,
Kamil Tomaszewski

suburban-daredevil

Hi @kamiltomaszewski

The tflmrt_lenet example was trained on 28x28 grayscale images. What changes have to be made, and in which files (such as app_main.c, pnm_util.c etc.), to accept and process RGB images of a different size (say 96x96)?

Thanks

suburban-daredevil

Hi @KamilTomaszewski

93dd823a-5624-44e2-a595-23bb171d0f6a-Screenshot from 2022-02-24 14-17-03.png

I just wanted to know, to embed our model's C-byte array code in our application, is it sufficient to replace the existing model0.c file's array contents with that of our new model?

I tried replacing the array contents in model0.c with those of my new model and also changed the model length. But every time I run inference, I get the same output for any given input. Am I missing something?

And also model_tflite is the buffer that holds the model.

acc5ea7b-e51e-40d7-9d16-0ccc8c3360c0-Screenshot from 2022-02-24 14-28-13.png

And network is the variable that holds our NN. It is assigned in the else part of the code block below

6652de38-50cc-4c49-ac3e-c9f87943677d-Screenshot from 2022-02-24 14-28-55.png

But I think the network variable doesn't read the built-in model, and hence it is not able to perform the right inference?

Any help is appreciated

Thanks

KamilTomaszewski

Hi @suburban-daredevil,

You need to change #define MNIST_SIZE_PX (28 * 28) to #define MNIST_SIZE_PX (96 * 96 * 3) in the tflmrt_lenet_main.c file and #define MY_BUFSIZ (28 * 28) to #define MY_BUFSIZ (96 * 96 * 3) in the pnm_util.c file. I think that should be enough.
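
The new value follows from width x height x channels. A minimal sketch of that arithmetic, assuming one value per channel per pixel as in the original 28 x 28 grayscale case; image_size_px is a hypothetical helper and not part of the SDK:

```c
#include <stddef.h>

/* Hypothetical helper (not part of the Spresense SDK): number of values in
 * an image, assuming one value per channel per pixel. */
static size_t image_size_px(size_t width, size_t height, size_t channels)
{
  return width * height * channels;
}
```

With this helper, image_size_px(28, 28, 1) gives 784 for the original grayscale input and image_size_px(96, 96, 3) gives 27648 for a 96 x 96 RGB input, which is the value of (96 * 96 * 3).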

Does your model array have __attribute__((aligned))?

Best Regards,
Kamil Tomaszewski

suburban-daredevil

Hi @kamiltomaszewski

I think you are referring to this right?

4e49f22f-9895-427e-a0ee-d1801b404207-Screenshot from 2022-02-25 08-46-35.png

I'm not sure how to check if my new model has __attribute__((aligned))

Thanks

KamilTomaszewski

Hi @suburban-daredevil

Yes, that is right.
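
To check for the attribute, you can look at the declaration of the model array in your source file, or test the address of the array at run time. The following is a minimal sketch, assuming a GCC-compatible compiler. The alignment value 16, the placeholder bytes and the helper is_aligned are assumptions made for illustration only; the attribute discussed above is the plain __attribute__((aligned)) form without a value.

```c
#include <stdint.h>

/* Placeholder bytes standing in for a real model. The alignment of 16 is an
 * assumption for this sketch, not a value taken from the SDK. */
const unsigned char model_tflite[] __attribute__((aligned(16))) = {
  0x1c, 0x00, 0x00, 0x00, 0x54, 0x46, 0x4c, 0x33
};

/* Hypothetical helper: returns 1 if ptr lies on the given power-of-two
 * boundary, 0 otherwise. */
static int is_aligned(const void *ptr, uintptr_t alignment)
{
  return ((uintptr_t)ptr & (alignment - 1)) == 0;
}
```

Calling is_aligned(model_tflite, 16) returns 1 here because the declaration requests that alignment. Without the attribute, an unsigned char array is only guaranteed byte alignment.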

Are you using the model as a C array or as a binary file that you load from the SD card?

suburban-daredevil

Hi @kamiltomaszewski

I have 2 queries

1. Currently I'm loading my model from the SD card, but I want to embed the model in my application itself. Is it enough to replace the contents of the model_tflite[] array with those of my new model, or are there other changes to be made?
2. When running the app from the NuttX prompt, I give the path of the image that should be used for inference. Currently only 28x28 grayscale images are accepted. When I give an image of any other dimensions, it says "pgm image load failed". I have made the change that you suggested above. Are there any other changes to be made?

Thanks

KamilTomaszewski

Hi @suburban-daredevil

It should be enough to replace the contents of the model_tflite array with those of your new model.

Could you check where exactly the pnm_load function returns an error?
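
One thing worth checking is the file type. A binary PGM file (magic number P5) stores one grayscale value per pixel, while RGB data is stored as PPM (magic number P6), so a loader that expects P5 or a fixed 28 x 28 size would reject a 96 x 96 RGB image. Below is a minimal sketch of a header check. parse_pnm_header is a hypothetical helper and not the code in pnm_util.c, which may work differently.

```c
#include <stdio.h>

/* Hypothetical helper, not the pnm_util.c implementation. Parses a binary
 * PNM header of the form "P5\n<width> <height>\n<maxval>\n" from a buffer.
 * Comment lines starting with '#' are not handled in this sketch.
 * Returns 0 on success and -1 on failure. */
static int parse_pnm_header(const char *buf, int *channels,
                            int *width, int *height, int *maxval)
{
  int type;

  if (sscanf(buf, "P%d %d %d %d", &type, width, height, maxval) != 4)
    {
      return -1;
    }

  if (type == 5)
    {
      *channels = 1;  /* PGM: grayscale */
    }
  else if (type == 6)
    {
      *channels = 3;  /* PPM: RGB */
    }
  else
    {
      return -1;
    }

  return (*width > 0 && *height > 0 && *maxval > 0) ? 0 : -1;
}
```

Printing the parsed width, height and channel count at the point of failure would show which of these checks rejects the image.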

Best Regards,
Kamil Tomaszewski

suburban-daredevil

Hi @KamilTomaszewski

When I try to run "build and flash" from VS Code, I get the following error, even though there is a file called project_name.nuttx.spx inside my out directory after building.

4eff2a6f-f913-4ace-84d3-5f6684bb10d0-Screenshot from 2022-03-10 17-19-28.png

Can you help me out with this?

Thanks

KamilTomaszewski

Hi @suburban-daredevil,

I think there is a bug in the latest VS Code release. Could you please try an older release? For example: https://update.code.visualstudio.com/1.63.2/linux-deb-x64/stable

Best Regards,
Kamil Tomaszewski

suburban-daredevil

Hi @KamilTomaszewski

It worked and solved the issue. Thanks for your help

Thanks
