Intro¶
If you have read chapter 4 of Practical Deep Learning for Coders
and you have questions about all the numbers that come with image data,
like what $28*28$ or $784$ means, or why we divide tensors with image data by 255,
then this post is for you.
The usual start of working with images¶
from fastai.vision.all import *
Download the MNIST_SAMPLE image dataset with the FastAI function untar_data.
os.getcwd() + '/images'
returns the path of the notebook directory and appends
the subdirectory images
to it,
so the images will be downloaded into this [notebook dir]/images
directory.
path_to_img_dir = untar_data(URLs.MNIST_SAMPLE, data=Path(os.getcwd() + '/images'))
path_to_img_dir
Path('/home/harley/mnt/pci_ssd/jupyter_notebooks/fastai/images/mnist_sample')
After we have a directory with images, we use a few FastAI commands
and receive the images in nice batches which we can feed to our model.
db_imgs = DataBlock(blocks = (ImageBlock, CategoryBlock),
get_items=get_image_files,
splitter=RandomSplitter(seed=42),
get_y=parent_label)
dls_imgs = db_imgs.dataloaders(path_to_img_dir)
Here is our first batch of images
dls_imgs.show_batch(nrows=3, ncols=3, figsize=(4, 4))
What is going on under the hood in the previous code block?¶
Let’s simulate it on an easy example with two digits from the MNIST image dataset.
Image unpacking and loading¶
After downloading and unpacking the dataset from FastAI or from another source,
we have a directory which contains all the images.
In the MNIST dataset’s case this directory contains
subdirectories named after the digit shown in the images.
Make two lists of paths to images of 3 and 7
paths_to_threes = (path_to_img_dir/'train'/'3').ls()
paths_to_sevens = (path_to_img_dir/'train'/'7').ls()
# printing the first five paths to images of 3
for path in paths_to_threes[:5]: print(str(path)[42:])
/fastai/images/mnist_sample/train/3/43330.png
/fastai/images/mnist_sample/train/3/34239.png
/fastai/images/mnist_sample/train/3/5102.png
/fastai/images/mnist_sample/train/3/40805.png
/fastai/images/mnist_sample/train/3/3171.png
Let’s make two lists of tensors with the 3 and 7 images.
For this we will use:
- Image.open(full_path_to_image) for opening an image
- tensor(...) for converting the image to a tensor
- a list comprehension [...] for packing these tensors into a list
list_tensors_three = [tensor(Image.open(path)) for path in paths_to_threes]
list_tensors_seven = [tensor(Image.open(path)) for path in paths_to_sevens]
The length of each list is the number of images in it.
print(f"Quantity of images of 3 in the three_tensors: {len(list_tensors_three)}")
print(f"Quantity of images of 7 in the seven_tensors: {len(list_tensors_seven)}")
Quantity of images of 3 in the three_tensors: 6131
Quantity of images of 7 in the seven_tensors: 6265
Looking inside the image¶
Each item of these lists is a tensor/array which contains the image data.
list_tensors_three[5]
means the image/tensor at index 5 in our list.
list_tensors_three[5].shape
torch.Size([28, 28])
torch.Size([28, 28])
means that there are $28*28$ pixels/numbers in the tensor,
and each of these numbers is an integer from 0 to 255.
This 0-to-255 number is a greyscale value:
- 0 – black / no color
- 128 – a middle shade of grey
- 255 – white
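As a minimal sketch of this range (on a hand-made tensor, not one of the MNIST images), a uint8 tensor holds exactly these 0–255 greyscale values:

```python
import torch

# a hand-made row of three pixels: black, middle grey, white
pixels = torch.tensor([0, 128, 255], dtype=torch.uint8)

print(pixels.min().item(), pixels.max().item())  # 0 255
```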
If we look at the first 25 rows and the 14 middle columns (from the 10th to the 24th),
we will see an ASCII picture of a 3.
(If we output more than 14 columns, the notebook splits each row into two rows
and we won’t see the picture.)
list_tensors_three[5][0:25, 10:24]
tensor([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
        [0, 0, 78, 207, 254, 206, 254, 230, 144, 42, 0, 0, 0, 0],
        [0, 55, 244, 254, 253, 253, 253, 253, 253, 250, 69, 0, 0, 0],
        [0, 14, 183, 254, 184, 111, 102, 175, 253, 253, 190, 0, 0, 0],
        [0, 0, 5, 11, 4, 0, 0, 56, 253, 253, 199, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 80, 253, 253, 99, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 57, 235, 253, 206, 22, 0, 0, 0],
        [0, 0, 0, 0, 3, 104, 239, 253, 250, 30, 0, 0, 0, 0],
        [0, 33, 45, 60, 181, 253, 253, 200, 65, 0, 0, 0, 0, 0],
        [188, 237, 253, 254, 253, 253, 253, 122, 0, 0, 0, 0, 0, 0],
        [253, 253, 253, 254, 253, 253, 253, 246, 96, 0, 0, 0, 0, 0],
        [111, 111, 111, 112, 139, 234, 255, 254, 216, 12, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 31, 217, 253, 253, 22, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 133, 253, 253, 22, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 133, 253, 253, 22, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 133, 253, 253, 22, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 133, 253, 222, 14, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 53, 239, 253, 112, 0, 0, 0, 0, 0],
        [45, 45, 45, 60, 155, 237, 253, 200, 22, 0, 0, 0, 0, 0],
        [253, 253, 253, 254, 253, 253, 203, 23, 0, 0, 0, 0, 0, 0],
        [253, 253, 253, 240, 143, 52, 16, 0, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]], dtype=torch.uint8)
There is an interesting way of using pandas.DataFrame
to get a much clearer ASCII-art image from this data:
df = pd.DataFrame(list_tensors_three[5])
df.style.set_properties(**{'font-size':'6pt', 'padding': '1px'}).background_gradient('Greys_r')
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 78 | 207 | 254 | 206 | 254 | 230 | 144 | 42 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 55 | 244 | 254 | 253 | 253 | 253 | 253 | 253 | 250 | 69 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 14 | 183 | 254 | 184 | 111 | 102 | 175 | 253 | 253 | 190 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5 | 11 | 4 | 0 | 0 | 56 | 253 | 253 | 199 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 80 | 253 | 253 | 99 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
9 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 57 | 235 | 253 | 206 | 22 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
10 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 104 | 239 | 253 | 250 | 30 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
11 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 33 | 45 | 60 | 181 | 253 | 253 | 200 | 65 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
12 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 125 | 188 | 237 | 253 | 254 | 253 | 253 | 253 | 122 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
13 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 78 | 251 | 253 | 253 | 253 | 254 | 253 | 253 | 253 | 246 | 96 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
14 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 39 | 111 | 111 | 111 | 111 | 112 | 139 | 234 | 255 | 254 | 216 | 12 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
15 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 31 | 217 | 253 | 253 | 22 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 133 | 253 | 253 | 22 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
17 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 133 | 253 | 253 | 22 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
18 | 0 | 0 | 0 | 0 | 15 | 56 | 27 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 133 | 253 | 253 | 22 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
19 | 0 | 0 | 0 | 0 | 67 | 253 | 225 | 127 | 19 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 133 | 253 | 222 | 14 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
20 | 0 | 0 | 0 | 0 | 67 | 253 | 253 | 253 | 112 | 1 | 0 | 0 | 0 | 0 | 0 | 53 | 239 | 253 | 112 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
21 | 0 | 0 | 0 | 0 | 26 | 208 | 253 | 253 | 253 | 158 | 45 | 45 | 45 | 60 | 155 | 237 | 253 | 200 | 22 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
22 | 0 | 0 | 0 | 0 | 0 | 26 | 246 | 253 | 253 | 253 | 253 | 253 | 253 | 254 | 253 | 253 | 203 | 23 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
23 | 0 | 0 | 0 | 0 | 0 | 0 | 41 | 191 | 230 | 253 | 253 | 253 | 253 | 240 | 143 | 52 | 16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
24 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
26 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
27 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
And here is the same image data, but rendered as a regular image:
show_image(list_tensors_three[5], cmap='grey')
<Axes: >
Preparing image data for ML¶
So we have lists of tensors which contain integers from 0 to 255.
For machine learning we need to convert these lists into tensors.
torch.stack
converts a list of tensors into a single stacked tensor (a tensor of tensors).
tensor_threes = torch.stack(list_tensors_three)
tensor_sevens = torch.stack(list_tensors_seven)
print(f"type of list_tensors_three: {type(list_tensors_three)}")
print(f"Type of tensor_threes: {type(tensor_threes)}")
print(f"Shape of tensor_threes: {tensor_threes.shape}")
type of list_tensors_three: <class 'list'>
Type of tensor_threes: <class 'torch.Tensor'>
Shape of tensor_threes: torch.Size([6131, 28, 28])
So now we have tensor_threes
with 6131 elements,
each of which contains a 28*28 image.
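The effect of torch.stack can be sketched on a smaller, hand-made example (not the MNIST data):

```python
import torch

# a Python list of two tiny 2x2 "images"
list_of_imgs = [torch.zeros(2, 2), torch.ones(2, 2)]

# stack turns the list into one tensor, adding a new
# leading dimension that indexes the images
stacked = torch.stack(list_of_imgs)

print(type(stacked))   # <class 'torch.Tensor'>
print(stacked.shape)   # torch.Size([2, 2, 2])
```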
Changing the image data’s range¶
Machine learning models usually work better with numbers from 0 to 1 or from -1 to 1,
so we need to convert our data into this smaller range.
To do this we convert the tensor to float and divide each pixel by 255.
Tensors use broadcasting,
so this division divides not the tensor object as a whole,
but each element/pixel of the tensor.
converted_tensor_threes = tensor_threes.float()/255
converted_tensor_sevens = tensor_sevens.float()/255
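A minimal sketch of this broadcasting, on a hand-made tensor rather than a real image:

```python
import torch

img = torch.tensor([[0, 51, 102],
                    [153, 204, 255]], dtype=torch.uint8)

# the scalar 255 is broadcast: every pixel is divided by it individually,
# and the shape of the tensor is unchanged
scaled = img.float() / 255

print(scaled.min().item(), scaled.max().item())  # 0.0 1.0
```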
What’s interesting: after the division we can still see this image in the DataFrame visualization,
just with different numbers inside.
df = pd.DataFrame(converted_tensor_threes[5])
df.style.set_properties(**{'font-size':'4pt', 'padding': '1px'}).background_gradient('Greys_r')
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
1 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
2 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
3 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
4 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.305882 | 0.811765 | 0.996078 | 0.807843 | 0.996078 | 0.901961 | 0.564706 | 0.164706 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
5 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.215686 | 0.956863 | 0.996078 | 0.992157 | 0.992157 | 0.992157 | 0.992157 | 0.992157 | 0.980392 | 0.270588 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
6 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.054902 | 0.717647 | 0.996078 | 0.721569 | 0.435294 | 0.400000 | 0.686275 | 0.992157 | 0.992157 | 0.745098 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
7 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.019608 | 0.043137 | 0.015686 | 0.000000 | 0.000000 | 0.219608 | 0.992157 | 0.992157 | 0.780392 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
8 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.313726 | 0.992157 | 0.992157 | 0.388235 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
9 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.223529 | 0.921569 | 0.992157 | 0.807843 | 0.086275 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
10 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.011765 | 0.407843 | 0.937255 | 0.992157 | 0.980392 | 0.117647 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
11 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.129412 | 0.176471 | 0.235294 | 0.709804 | 0.992157 | 0.992157 | 0.784314 | 0.254902 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
12 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.490196 | 0.737255 | 0.929412 | 0.992157 | 0.996078 | 0.992157 | 0.992157 | 0.992157 | 0.478431 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
13 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.305882 | 0.984314 | 0.992157 | 0.992157 | 0.992157 | 0.996078 | 0.992157 | 0.992157 | 0.992157 | 0.964706 | 0.376471 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
14 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.152941 | 0.435294 | 0.435294 | 0.435294 | 0.435294 | 0.439216 | 0.545098 | 0.917647 | 1.000000 | 0.996078 | 0.847059 | 0.047059 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
15 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.121569 | 0.850980 | 0.992157 | 0.992157 | 0.086275 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
16 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.521569 | 0.992157 | 0.992157 | 0.086275 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
17 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.521569 | 0.992157 | 0.992157 | 0.086275 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
18 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.058824 | 0.219608 | 0.105882 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.521569 | 0.992157 | 0.992157 | 0.086275 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
19 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.262745 | 0.992157 | 0.882353 | 0.498039 | 0.074510 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.521569 | 0.992157 | 0.870588 | 0.054902 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
20 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.262745 | 0.992157 | 0.992157 | 0.992157 | 0.439216 | 0.003922 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.207843 | 0.937255 | 0.992157 | 0.439216 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
21 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.101961 | 0.815686 | 0.992157 | 0.992157 | 0.992157 | 0.619608 | 0.176471 | 0.176471 | 0.176471 | 0.235294 | 0.607843 | 0.929412 | 0.992157 | 0.784314 | 0.086275 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
22 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.101961 | 0.964706 | 0.992157 | 0.992157 | 0.992157 | 0.992157 | 0.992157 | 0.992157 | 0.996078 | 0.992157 | 0.992157 | 0.796078 | 0.090196 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
23 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.160784 | 0.749020 | 0.901961 | 0.992157 | 0.992157 | 0.992157 | 0.992157 | 0.941176 | 0.560784 | 0.203922 | 0.062745 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
24 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
25 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
26 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
27 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
We can also still see this data rendered as a regular image:
show_image(converted_tensor_threes[5], cmap='grey')
<Axes: >
An observation about image data¶
These range transformations before ML training always seemed strange to me.
I used to think they were some kind of mutilation of the data.
This visualization shows that the transformation doesn’t really change the data:
we changed the magnitude of the numbers, but not the relations/proportions among the pieces of data.
Image editors like the range 0..255.
Machine learning likes the smaller range 0..1.
But the data is the same in both cases.
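We can check this on a hand-made sketch: after dividing by 255, the proportions among the pixels stay the same.

```python
import torch

raw = torch.tensor([50.0, 100.0, 200.0])
scaled = raw / 255

# each pixel's share of the total is the same before and after rescaling
print(torch.allclose(raw / raw.sum(), scaled / scaled.sum()))  # True
```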
Do ML models use information about the sides of an image?¶
Some models work with the information about side sizes, for example CNNs.
In that case we already have tensors with the data in a good format.
print(f"Shape of Converted tensor images ready for CNN: {converted_tensor_threes.shape}")
Shape of Converted tensor images ready for CNN: torch.Size([6131, 28, 28])
Some models throw away this information about side sizes, for example fully connected networks.
In that case we need to flatten our image data.
If you have read chapter 4 of Practical Deep Learning for Coders,
that’s exactly what is going on there: a linear model is used,
so the $28*28$ images need to be flattened into a row of 784 pixels.
We can do this flattening with the view method.
The number 784 is
$width * height = 28 * 28 = 784$
This method slices the image into rows and concatenates these rows from left to right.
flattened_tensor_threes = converted_tensor_threes.view(-1, 28*28)
print(f"Shape of flattened tensor images ready for a linear model: {flattened_tensor_threes.shape}")
Shape of flattened tensor images ready for a linear model: torch.Size([6131, 784])
If the previous explanation of the view
method doesn’t make sense,
let’s see the output of this method on a simpler example.
We create an “image” with sides 3 * 3 and flatten it out:
D2_tensor = torch.tensor([[1,2,3], [4,5,6], [7,8,9]])
print(D2_tensor)
tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])
flat_D2_tensor = D2_tensor.view(-1, 3*3)
print(flat_D2_tensor)
tensor([[1, 2, 3, 4, 5, 6, 7, 8, 9]])
What about color images?¶
All of this was about greyscale images. What about full-color images?
The same principles apply, but with a little trick from the visual domain.
By mixing red, green, and blue we can get any color.
So to store a full-color image we can use three greyscale images and combine them
to achieve any color.
These three mixed images are called channels.
Manual crafting of a color image¶
Let’s create a tensor with the same sides but with an additional dimension.
Earlier we used this dimension as an index of the file:
when we saw Shape of tensor_threes: torch.Size([6131, 28, 28]),
we knew that this is a tensor with 6131 images.
Now we will use this dimension as a channel:
when we see Shape of clr_tensor: torch.Size([3, 28, 28]),
we know that this is an image tensor with 3 channels and 28 * 28 sides:
- R – channel 0 is red
- G – channel 1 is green
- B – channel 2 is blue
It’s still the same dimension from the programming point of view;
we just assign a different meaning to it.
clr_tensor = torch.zeros(3,28,28)
print(f"Shape of clr_tensor: {clr_tensor.shape}")
Shape of clr_tensor: torch.Size([3, 28, 28])
Let’s look at our image
show_image(clr_tensor)
<Axes: >
It’s just black, and that’s OK, because we created the tensor filled with zeroes,
which means black in an image.
Now we will fill the first 5 rows of channel 0 (red) with ones,
so we should get a horizontal red line.
clr_tensor[0][:5]=1
show_image(clr_tensor)
<Axes: >
The next channel, 1, is green; we will put a horizontal line of ones at the bottom.
clr_tensor[1][23:28]=1
show_image(clr_tensor)
<Axes: >
Finally, channel 2 is blue.
We fill it with a vertical line of ones,
and here we can see an interesting effect of color mixing:
- top left corner – red and blue mix into magenta
- bottom left corner – blue and green mix into cyan
clr_tensor[2][:,:5]=1
show_image(clr_tensor)
<Axes: >