
Version 2 (modified by alain, 9 years ago)


GIET_VM / FAT32 File System

This section describes GIET_VM system calls and user libraries. The multi-threaded applications have been designed to analyse the TSAR manycore architecture scalability.

General Principles

This implementation supports only block devices with block_size = 512 bytes.

In FAT32, a cluster is the smallest storage allocation unit on the block device: any file (or directory) occupies at least one cluster, and a cluster cannot be shared by two different files.

This implementation supports a single cluster size: cluster_size = 4 Kbytes (i.e. 8 blocks).
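The two fixed sizes above determine how a byte offset in a file maps to a cluster, a block and a byte. The following is a minimal sketch of that arithmetic, not GIET_VM code: the names file_pos_t, locate_byte and clusters_for_size are hypothetical and only the two sizes come from the text.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_SIZE         512u                         /* bytes per block (from the text)   */
#define CLUSTER_SIZE       4096u                        /* bytes per cluster (from the text) */
#define BLOCKS_PER_CLUSTER (CLUSTER_SIZE / BLOCK_SIZE)  /* 8 blocks per cluster              */

/* Hypothetical helper type: position of one byte inside a file. */
typedef struct {
    uint32_t cluster_id;  /* index of the cluster inside the file   */
    uint32_t block_id;    /* index of the block inside that cluster */
    uint32_t byte_id;     /* index of the byte inside that block    */
} file_pos_t;

/* Split a byte offset in a file into (cluster, block, byte) indexes. */
static file_pos_t locate_byte(uint32_t offset)
{
    file_pos_t pos;
    pos.cluster_id = offset / CLUSTER_SIZE;
    pos.block_id   = (offset % CLUSTER_SIZE) / BLOCK_SIZE;
    pos.byte_id    = offset % BLOCK_SIZE;
    assert(pos.block_id < BLOCKS_PER_CLUSTER);
    return pos;
}

/* Number of clusters occupied by a file of the given size in bytes.
 * A file occupies at least one cluster, even when it is empty. */
static uint32_t clusters_for_size(uint32_t size)
{
    if (size == 0) return 1;
    return (size - 1) / CLUSTER_SIZE + 1;
}
```

For instance, byte offset 5000 falls in cluster_id 1, block 1 of that cluster, byte 392 of that block, and a file of 4097 bytes occupies two clusters.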

The FAT region on the block device is an array of 32-bit words defining, for each file, the linked list of clusters allocated to that file in the DATA region of the block device. Each slot in this array contains a cluster_ptr, that is the index of the next cluster of the file on the block device. The cluster_ptr value cannot be larger than 0x0FFFFFFF (i.e. 256 M), as only the 28 low-order bits are significant. The maximum addressable storage capacity in the DATA region of the block device is therefore (256 M * 4 Kbytes) = 1 Tbyte.
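To make the linked-list structure concrete, here is a minimal sketch that follows a chain in an in-memory copy of the FAT. It is not taken from the GIET_VM sources: fat_lookup and fat32_max_data_bytes are hypothetical names, and the conventions used (slots 0 and 1 reserved, values from 0x0FFFFFF8 marking the end of a chain) are those of standard FAT32, assumed here to apply.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CLUSTER_SIZE   4096u        /* bytes per cluster (from the text)            */
#define FAT32_PTR_MASK 0x0FFFFFFFu  /* only the 28 low-order bits are significant   */
#define FAT32_EOC_MIN  0x0FFFFFF8u  /* standard FAT32: values >= this end a chain   */

/* Return the cluster_ptr of the cluster number cluster_id in the file whose
 * first cluster is first_ptr, or 0 if the chain is shorter than that.
 * fat is an in-memory copy of the FAT containing fat_slots slots. */
static uint32_t fat_lookup(const uint32_t *fat, size_t fat_slots,
                           uint32_t first_ptr, uint32_t cluster_id)
{
    assert(fat_slots <= FAT32_EOC_MIN);  /* so that end-of-chain values are out of range */
    uint32_t ptr = first_ptr & FAT32_PTR_MASK;
    for (;;) {
        /* free slot, reserved slot, end of chain, or corrupted pointer */
        if (ptr < 2 || ptr >= fat_slots) return 0;
        if (cluster_id == 0) return ptr;
        ptr = fat[ptr] & FAT32_PTR_MASK;  /* follow the link to the next cluster */
        cluster_id--;
    }
}

/* Upper bound on the DATA region size: 2^28 clusters of 4 Kbytes = 1 Tbyte. */
static uint64_t fat32_max_data_bytes(void)
{
    return ((uint64_t)FAT32_PTR_MASK + 1) * CLUSTER_SIZE;
}
```

The loop visits one FAT slot per cluster, so reaching the cluster with index N in a file costs N FAT accesses. This is one reason why caching the FAT matters.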

This implementation uses memory caches implemented in the distributed kernel heap. There is one cache per file (called file_cache), and one cache for the FAT itself (called fat_cache). The cache size is not fixed: it is dynamically increased from the kernel heap as required by the read / write accesses to the files or to the FAT itself. The memory allocated to a given file_cache is only released when the file is closed.

In the present implementation, there is no rescue mechanism in case of heap overflow: the system crashes with an error message on the kernel terminal.

Cache Structure

The fat_cache and the file_cache have the same organisation. Each cache contains an integer number of clusters, as the cluster is the smallest unit of data that can be loaded from the block device to the cache. To reduce the access time, this set of clusters is organized as a 64-Tree: each node has one single parent and (up to) 64 children. The leaf nodes are the cluster descriptors. To access a given cluster in a given file, we use the cluster_id (index of the cluster inside the file), which is different from the cluster_ptr (index of the cluster on the block device). This cluster_id is split into pieces of 6 bits, each piece selecting the proper child at a given level in the 64-Tree. The depth (number of levels) of the 64-Tree depends on the file size:

File size                       Number of levels
up to 256 Kbytes                1
from 256 Kbytes to 16 Mbytes    2
from 16 Mbytes to 1 Gbyte       3
larger than 1 Gbyte             4
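The depth values in the table follow from the fan-out: with 4 Kbytes clusters, one level indexes 64 clusters (256 Kbytes), and each additional level multiplies the capacity by 64. The sketch below reproduces this computation and the 6-bit split of the cluster_id. The function names are hypothetical, and the choice of using the most significant piece at the root is an assumption, as the text does not specify the order.

```c
#include <assert.h>
#include <stdint.h>

#define CLUSTER_SIZE 4096u  /* bytes per cluster (from the text)    */
#define FANOUT_BITS  6u     /* 64 children per node = 6 index bits  */
#define FANOUT_MASK  0x3Fu

/* Number of levels of the 64-Tree needed for a file of file_size bytes. */
static unsigned tree_levels(uint64_t file_size)
{
    uint64_t clusters = file_size ? (file_size - 1) / CLUSTER_SIZE + 1 : 1;
    uint64_t capacity = 64;  /* clusters reachable with one level */
    unsigned levels   = 1;
    while (clusters > capacity) {
        capacity <<= FANOUT_BITS;  /* one more level: 64 times more clusters */
        levels++;
    }
    return levels;
}

/* Index of the child to follow at a given level (0 is the root) when looking
 * for cluster_id in a tree of the given depth.
 * ASSUMPTION: the most significant 6-bit piece is used at the root. */
static unsigned child_index(uint32_t cluster_id, unsigned levels, unsigned level)
{
    assert(levels >= 1 && levels <= 4 && level < levels);
    unsigned shift = FANOUT_BITS * (levels - 1 - level);
    return (cluster_id >> shift) & FANOUT_MASK;
}
```

With these definitions, cluster_id 0x1234 in a 3-level tree is reached through child 1 at the root, then child 8, then child 52, since 1 * 4096 + 8 * 64 + 52 = 0x1234.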

Cache Write Policy

For a file_cache, the GIET_VM implements a WRITE-BACK policy: the data are always modified in the cache. In case of a miss, new clusters are allocated to the target file, the cache is updated from the block device, and the data are then modified in the cache, but not on the block device. The modified clusters are written to the block device only when the file is closed, using the dirty flag implemented in each cluster descriptor.
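The write-back behaviour can be illustrated with a deliberately simplified sketch. It is not the GIET_VM implementation: the cache is a flat array rather than a 64-Tree, the block device is replaced by an in-memory array, the cluster_id is used directly as the device cluster index (no FAT lookup), and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define CLUSTER_SIZE 4096u
#define MAX_CLUSTERS 4u  /* tiny file, for illustration only */

/* In-memory stand-in for the block device (hypothetical). */
static uint8_t  device[MAX_CLUSTERS][CLUSTER_SIZE];
static unsigned device_writes;  /* number of clusters written to the device */

typedef struct {
    uint8_t *buffer;  /* NULL while the cluster is not in the cache    */
    int      dirty;   /* set when the cached copy has been modified    */
} cluster_desc_t;

typedef struct {
    cluster_desc_t clusters[MAX_CLUSTERS];
} file_cache_t;

/* Return the descriptor of a cluster, loading it from the device on a miss. */
static cluster_desc_t *cache_get(file_cache_t *cache, uint32_t cluster_id)
{
    assert(cluster_id < MAX_CLUSTERS);
    cluster_desc_t *desc = &cache->clusters[cluster_id];
    if (desc->buffer == NULL) {
        desc->buffer = malloc(CLUSTER_SIZE);
        if (desc->buffer == NULL) abort();  /* no rescue on heap overflow */
        memcpy(desc->buffer, device[cluster_id], CLUSTER_SIZE);
        desc->dirty = 0;
    }
    return desc;
}

/* Write len bytes at the given file offset: only the cache is modified. */
static void cache_write(file_cache_t *cache, uint32_t offset,
                        const void *src, uint32_t len)
{
    const uint8_t *bytes = src;
    while (len > 0) {
        uint32_t in_cluster = offset % CLUSTER_SIZE;
        uint32_t chunk      = CLUSTER_SIZE - in_cluster;
        if (chunk > len) chunk = len;
        cluster_desc_t *desc = cache_get(cache, offset / CLUSTER_SIZE);
        memcpy(desc->buffer + in_cluster, bytes, chunk);
        desc->dirty = 1;
        bytes  += chunk;
        offset += chunk;
        len    -= chunk;
    }
}

/* Close the file: flush the dirty clusters, then release the cache memory. */
static void cache_close(file_cache_t *cache)
{
    for (uint32_t i = 0; i < MAX_CLUSTERS; i++) {
        cluster_desc_t *desc = &cache->clusters[i];
        if (desc->buffer == NULL) continue;
        if (desc->dirty) {
            memcpy(device[i], desc->buffer, CLUSTER_SIZE);
            device_writes++;
        }
        free(desc->buffer);
        desc->buffer = NULL;
        desc->dirty  = 0;
    }
}
```

In this sketch, a write that straddles two clusters marks both descriptors dirty, and the device is touched only by cache_close. The cost of this policy is that data written by an application are lost if the system stops before the file is closed.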

For the fat_cache, the GIET_VM implements a WRITE-THROUGH policy. When the FAT content is modified (i.e. when new clusters are allocated to an existing file, or when a new file or directory is created), the modifications are written to the fat_cache (which must be updated in case of a miss), and are immediately propagated to the block device, for each modified block.
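By contrast, a write-through update of the FAT can be sketched as follows. This is again a simplified illustration with hypothetical names: the fat_cache is a flat array assumed to be already loaded (no miss handling), the FAT region of the device is an in-memory array, and the order of the two updates in fat_append is a choice made for this example, not something stated in the text.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE      512u                /* bytes per block (from the text)  */
#define SLOTS_PER_BLOCK (BLOCK_SIZE / 4u)   /* 128 FAT slots of 32 bits         */
#define FAT_SLOTS       256u                /* tiny FAT, for illustration only  */
#define FAT32_EOC       0x0FFFFFFFu         /* end-of-chain marker              */

/* In-memory stand-in for the FAT region of the block device (hypothetical). */
static uint32_t device_fat[FAT_SLOTS];
static unsigned device_block_writes;  /* number of blocks written to the device */

/* ASSUMPTION: the whole FAT is already present in the cache. */
static uint32_t fat_cache[FAT_SLOTS];

/* Update one FAT slot: the cache is modified first, then the 512-byte block
 * containing the slot is written to the device at once. */
static void fat_cache_set(uint32_t slot, uint32_t cluster_ptr)
{
    assert(slot < FAT_SLOTS);
    fat_cache[slot] = cluster_ptr;
    uint32_t first = slot - slot % SLOTS_PER_BLOCK;  /* first slot of the block */
    memcpy(&device_fat[first], &fat_cache[first], BLOCK_SIZE);
    device_block_writes++;
}

/* Append cluster new_ptr after last_ptr, the current last cluster of a file. */
static void fat_append(uint32_t last_ptr, uint32_t new_ptr)
{
    fat_cache_set(new_ptr, FAT32_EOC);  /* the new cluster ends the chain     */
    fat_cache_set(last_ptr, new_ptr);   /* the old last cluster now links it  */
}
```

Each call to fat_cache_set costs one block write, so the FAT on the device never lags behind the cache. Appending a cluster whose slot lies in a different block than the previous last slot therefore costs two block writes.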

Block Device Drivers