Class SoftIndexFileStore

  • All Implemented Interfaces:
    Lifecycle, AdvancedCacheLoader, AdvancedCacheWriter, AdvancedLoadWriteStore, CacheLoader, CacheWriter, ExternalStore

    public class SoftIndexFileStore
    extends java.lang.Object
    implements AdvancedLoadWriteStore
    Local file-based cache store, optimized for write-through use with strong consistency guarantees (ability to flush disk operations before returning from the store call). * DESIGN: There are three threads operating in the cache-store: - LogAppender: Requests to store entries are passed to the LogAppender thread via queue, then the requestor threads wait until LogAppender notifies them about successful store. LogAppender serializes the writes into append-only file, writes the offset into TemporaryTable and enqueues request to update index into UpdateQueue. The append-only files have limited size, when the file is full, new file is started. - IndexUpdater: Reads the UpdateQueue, applies the operation into B-tree-like structure Index (exact description below) and then removes the entry from TemporaryTable. When the Index is overwriten, the current entry offset is retrieved and IndexUpdater increases the unused space statistics in FileStats. - Compactor: When a limit of unused space in some file is reached (according to FileStats), the Compactor starts reading this file sequentially, querying TemporaryTable or Index for the current entry position and copying the unchanged entries into another file. For the entries that are still valid in the original file, a compare-and-set (file-offset based) request is enqueued into UpdateQueue - therefore this operation cannot interfere with concurrent writes overwriting the entry. Multiple files can be merged into single file during compaction. Structures: - TemporaryTable: keeps the records about current entry location until this is applied to the Index. Each read request goes to the TemporaryTable, if the key is not found here, Index is queried. - UpdateQueue: bounded queue (to prevent grow the TemporaryTable too much) of either forced writes (used for regular stores) or compare-and-set writes (used by Compactor). - FileStats: simple (Concurrent)HashTable with actual file size and amount of unused space for each file. - Index: B+-tree of IndexNodes. The tree is dropped and built a new if the process crashes, it does not need to flush disk operations. On disk it is kept as single random-accessed file, with free blocks list stored in memory. As IndexUpdater may easily become a bottleneck under heavy load, the IndexUpdater thread, UpdateQueue and tree of IndexNodes may be multiplied several times - the Index is divided into Segments. Each segment owns keys according to the hashCode() of the key. Amount of entries in IndexNode is limited by the size it occupies on disk. This size is limited by configurable nodeSize (4096 bytes by default?), only in case that the node contains single pivot (too long) it can be longer. A key_prefix common for all keys in the IndexNode is stored in order to reduce space requirements. For implementation reasons the keys are limited to 32kB - this requirement may be circumvented later. The pivots are not whole keys - it is the shortest part of key that is greater than all left children (but lesser or equal to all right children) - let us call this key_part. The key_parts are sorted in the IndexNode, naturally. On disk it has this format: key_prefix_length(2 bytes), key_prefix, num_parts(2 bytes), ( key_part_length (2 bytes), key_part, left_child_index_node_offset (8 bytes))+, right_child_index_node_offset (8 bytes) In memory, for every child a SoftReference is held. When this reference is empty (but the offset in file is set), any reader may load the reference using double-locking pattern (synchronized over the reference itself). The entry is never loaded by multiple threads in parallel and even may block other threads trying to read this node. For each node in memory a RW-lock is held. When the IndexUpdater thread updates the Index (modifying some IndexNodes), it prepares a copy of these nodes (already stored into index file). Then, in locks only the uppermost node for writing, overwrites the references to new data and unlocks the this node. After that the changed nodes are traversed from top down, write locked and their record in index file is released. Reader threads crawl the tree from top down, locking the parent node (for reading), locking child node and unlocking parent node.
    Radim Vansa <>
    • Constructor Detail

      • SoftIndexFileStore

        public SoftIndexFileStore()
    • Method Detail

      • start

        public void start()
        Description copied from interface: Lifecycle
        Invoked on component start
        Specified by:
        start in interface Lifecycle
      • getIndexLocation

        protected java.nio.file.Path getIndexLocation()
      • isSeqIdOld

        protected boolean isSeqIdOld​(long seqId,
                                     java.lang.Object key,
                                     byte[] serializedKey)
      • startIndex

        protected void startIndex()
      • isIndexLoaded

        protected boolean isIndexLoaded()
      • stop

        public void stop()
        Description copied from interface: Lifecycle
        Invoked on component stop
        Specified by:
        stop in interface Lifecycle
      • destroy

        public void destroy()
        Description copied from interface: ExternalStore
        Method to be used to destroy and clean up any resources associated with this store. This is normally only useful for non shared stores.

        This method will ensure the store is stopped and properly cleans up all resources for it.

        Specified by:
        destroy in interface ExternalStore
      • purge

        public void purge​(java.util.concurrent.Executor threadPool,
                          AdvancedCacheWriter.PurgeListener listener)
        Description copied from interface: AdvancedCacheWriter
        Using the thread in the pool, removed all the expired data from the persistence storage. For each removed entry, the supplied listener is invoked.

        When this method returns all entries will be purged and no tasks will be running due to this loader in the provided executor. If however an exception is thrown there could be tasks still pending or running in the executor.

        Specified by:
        purge in interface AdvancedCacheWriter
      • delete

        public boolean delete​(java.lang.Object key)
        Specified by:
        delete in interface CacheWriter
        true if the entry existed in the persistent store and it was deleted.
      • contains

        public boolean contains​(java.lang.Object key)
        Description copied from interface: CacheLoader
        Returns true if the storage contains an entry associated with the given key.
        Specified by:
        contains in interface CacheLoader
      • debugInfo

        public java.lang.String debugInfo​(java.lang.Object key)
        This method should be called by reflection to get more info about the missing/invalid key (from test tools)
        key -
      • publishKeys

        public org.reactivestreams.Publisher publishKeys​(java.util.function.Predicate filter)
        Description copied from interface: AdvancedCacheLoader
        Publishes all the keys from this store. The given publisher can be used by as many Subscribers as desired. Keys are not retrieved until a given Subscriber requests them from the Subscription.

        Stores will return only non expired keys

        Specified by:
        publishKeys in interface AdvancedCacheLoader
        filter - a filter - null is treated as allowing all entries
        a publisher that will provide the keys from the store
      • entryPublisher

        public org.reactivestreams.Publisher<MarshallableEntry> entryPublisher​(java.util.function.Predicate filter,
                                                                               boolean fetchValue,
                                                                               boolean fetchMetadata)
        Description copied from interface: AdvancedCacheLoader
        Publishes all entries from this store. The given publisher can be used by as many Subscribers as desired. Entries are not retrieved until a given Subscriber requests them from the Subscription.

        If fetchMetadata is true this store must guarantee to not return any expired entries.

        Specified by:
        entryPublisher in interface AdvancedCacheLoader
        filter - a filter - null is treated as allowing all entries
        fetchValue - whether or not to fetch the value from the persistent store. E.g. if the iteration is intended only over the key set, no point fetching the values from the persistent store as well
        fetchMetadata - whether or not to fetch the metadata from the persistent store. E.g. if the iteration is intended only ove the key set, then no point fetching the metadata from the persistent store as well
        a publisher that will provide the entries from the store