
Distributed FileStore

A simple, peer-to-peer file store application written in Go. This repository demonstrates how to:

  • Store files locally in a structured manner.
  • Broadcast file availability to other peers.
  • Fetch missing files from peers on the network.
  • Transfer and stream large files chunk-by-chunk to minimize memory usage and improve reliability.

Table of Contents

  • Overview
  • Features
  • Project Structure
  • Installation
  • Usage
  • How It Works
  • Chunk-Based RPC & Streaming
  • Extending
  • License

Overview

Distributed FileStore (often referred to as filestore in the code) lets you share files across multiple nodes on a peer-to-peer network. When you store a file locally, the application automatically broadcasts the file’s key and content to other connected peers. When you request a file (Get), the application first checks your local storage; if the file isn’t there, it queries the rest of the network and streams it to you in chunks rather than loading it into memory all at once.

Features

  1. Peer-to-Peer Transport: Uses a p2p.Transport interface, which can be backed by custom TCP or other protocols.
  2. Chunked Storage and Retrieval:
    • Store(key, reader, size) writes a file locally chunk by chunk and propagates it to peers.
    • Get(key) retrieves a file using streaming (io.Pipe) so large files never fully load in memory.
  3. Bootstrapping: Configure an initial list of bootstrap nodes to discover and connect with peers.
  4. Extensible Path Transformation: Pluggable logic (PathTransformation) for structuring how files are stored on the local disk.
  5. Graceful Shutdown: Easily stop the server and release resources via Stop().
  6. Improved Timeouts: Supports per-chunk or sliding/inactivity timeouts to avoid hanging when the network stalls.

Project Structure

.
├── server.go              # Core FileServer logic (broadcasting, chunk-based store, retrieval)
├── store.go               # Local Store logic (reading/writing files in chunks)
├── p2p/                   # Package handling network transport & peer connections
│   ├── transport.go       # Defines the Transport interface
│   ├── tcp_transport.go   # TCP-based Transport and Peer implementation
│   └── ...
├── main.go                # Example usage or demonstration code
├── README.md              # Project documentation
└── go.mod                 # Go module file

Key Components:

  • FileServer: Manages peers, coordinates the chunk-based store, listens for incoming requests, handles broadcasting.
  • Store (in store.go): Responsible for reading and writing files on the local disk, including chunk logic for large files.
  • p2p package: Provides an interface and a reference TCP implementation for peer-to-peer communication (connect, accept, dial, decode messages).

Installation

  1. Clone the repository:

    git clone https://github.com/monsefot/filestore.git
    cd filestore
  2. Build:

    make build
  3. Run (basic usage):

    make run

    Or run via go run main.go directly during development.

Usage

Below is an example flow to run two instances of FileStore and store/fetch files between them.

  1. Start the first node (listening on localhost:3000, for example):

    serverConfig := FileServerConfig{
        StorageRoot:        "store_3000",
        PathTransformation: SHA1PathTransformation,
        BootstrapNodes:     []string{},
    }
    
    server := NewFileServer(serverConfig)
    tcpTransportConfig := p2p.TCPTransportConfig{
        ListenAddress: ":3000",
        Handshake:     p2p.NoHandShake,
        Decoder:       p2p.GOBDecoder{},
    }
    tcpTransport := p2p.NewTCPTransport(tcpTransportConfig)
    tcpTransport.Config.OnPeer = server.OnPeer
    
    server.Config.Transport = tcpTransport
    
    if err := server.Start(); err != nil {
        log.Fatal("Failed to start server:", err)
    }
  2. Store a file on the first node:

    content := strings.NewReader("Hello from Node1!")
    err := server.Store("greeting.txt", content, content.Size()) // last param is the file size in bytes (17 here)
    if err != nil {
        log.Fatal("Failed to store file:", err)
    }

    The server splits the file into chunks (based on ChunkSize), saves it locally, and broadcasts those chunks to peers.

  3. Get the file on another node:

    r, err := server.Get("greeting.txt")
    if err != nil {
        log.Fatal("Failed to get file:", err)
    }
    if r == nil {
        log.Fatal("File not found or still searching on the network.")
    }
    
    // Stream the data chunk by chunk
    buf := make([]byte, 4096)
    for {
        n, err := r.Read(buf)
        if n > 0 {
            // Process or write chunk (buf[:n]) somewhere
            fmt.Print(string(buf[:n]))
        }
        if err == io.EOF {
            break
        }
        if err != nil {
            log.Fatal("Error reading chunk:", err)
        }
    }

    If the file isn’t found locally, the node sends a request for greeting.txt. The peer that has the file responds chunk-by-chunk. The receiving node writes each chunk to a PipeWriter, and your r.Read(buf) call blocks until the next chunk arrives.

How It Works

  1. Starting the Server

    • Start() opens a listener and runs a main loop that consumes messages (chunks, requests) from the transport layer.
  2. Storing Files

    • Store(key, reader, size) breaks the file into chunks (e.g., 4 KiB each).
    • Each chunk is written locally and broadcast to connected peers.
  3. Getting Files

    • Get(key) checks local storage. If missing, it sends a FileGetMessage.
    • A node with the file replies chunk by chunk (FileStoreChunkMessage).
    • The requester streams the chunks via an io.Pipe, returning an io.Reader that never loads the entire file into memory.
  4. Timeouts

    • Instead of a single global timeout, the system can use a per-chunk or sliding timeout to avoid hanging forever if the network stalls.
  5. Stopping the Server

    • Stop() closes internal channels and gracefully stops the server’s main loop and any active transport connections.

Chunk-Based RPC & Streaming

  1. Why Chunks?

    • Splitting large files into smaller segments prevents excessive memory usage.
    • The file can be streamed as it arrives, improving responsiveness on slow networks.
  2. Pipelining via io.Pipe

    • A PipeReader / PipeWriter is used to seamlessly stream each arriving chunk to the caller.
    • As soon as the first chunk arrives, the reading side can begin consuming it.
  3. Chunk-Level Timeouts

    • A short “inactivity” timer is reset whenever a new chunk is received.
    • If data flow stops, the request aborts after the timer. This prevents indefinite blocking if a peer disappears.
  4. Final Chunk vs. Total Chunks

    • Each chunk can carry metadata like IsLast or the total number of chunks.
    • Once all chunks arrive, the writer closes the PipeWriter, signaling an EOF to the reading side.
  5. No Full-Memory Allocation

    • Each chunk is processed or written to disk immediately, avoiding large RAM usage even for multi-GB files.

Extending

  • Flexible Path Transformation: Customize how keys map to directory/file paths for more granular or distributed storage structures.
  • Security: Integrate TLS or encryption within the p2p.Transport to secure data over untrusted networks.
  • Retry & Acknowledgments: Implement chunk-level acknowledgments for more robust fault tolerance, retrying dropped chunks if necessary.
  • Metadata/Indexing: Keep an index of which peers hold which files; avoid broadcasting requests if not needed.
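
One way a hash-based path transformation might look, in the spirit of the SHA1PathTransformation mentioned in Usage (the segment width and layout here are assumptions; the repo's exact scheme may differ):

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"path/filepath"
)

// sha1Path maps a key to a nested directory path by hashing it and
// splitting the 40-character hex digest into fixed-width segments,
// which keeps any single directory from accumulating too many entries.
func sha1Path(key string) string {
	sum := sha1.Sum([]byte(key))
	hash := hex.EncodeToString(sum[:]) // 40 hex chars
	const seg = 8
	parts := make([]string, 0, len(hash)/seg)
	for i := 0; i < len(hash); i += seg {
		parts = append(parts, hash[i:i+seg])
	}
	return filepath.Join(parts...) // five 8-char segments
}

func main() {
	fmt.Println(sha1Path("greeting.txt"))
}
```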

License

This project is licensed under the MIT License. See the LICENSE file for details.
