blackms/ExcelMigrationTool

Excel Migration Framework

A framework for migrating Excel data using configurable rules, multimodal analysis, and LLM integration. You can define migration rules by hand, generate them from example files, and supplement both with visual analysis of sheet screenshots.

Features

  • Task-centric approach for Excel migrations
  • Support for multiple LLM providers through LangChain
  • Multimodal analysis capabilities:
    • Direct Excel file processing
    • Screenshot analysis and data extraction
    • Visual structure recognition
    • OCR for text extraction
  • Rule generation from example files
  • Flexible rule types:
    • Direct copy
    • Value transformation
    • Computed fields
    • Aggregations
    • Validation rules
  • Plugin-based rule execution:
    • Extensible formula executors
    • Custom transformations
    • Modular design
    • SOLID principles
  • LLM-powered transformations
  • Configurable via JSON rules
  • Comprehensive logging with loguru
  • SOLID principles and clean architecture
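
To make the rule types above more concrete, here is a minimal sketch of how direct-copy, transformation, computed, and validation rules might be written as JSON and applied to a single row. The schema (`type`, `source`, `target`, `op`, `inputs`, `min`) is purely hypothetical, since this README does not show the actual rules format, and aggregation is left out because it spans several rows.

```python
import json
from typing import Any, Dict, List

# Hypothetical rules document; the real schema used by the framework may differ.
RULES_JSON = """
[
  {"type": "copy", "source": "name", "target": "customer_name"},
  {"type": "transform", "source": "email", "target": "email", "op": "lower"},
  {"type": "computed", "target": "total", "inputs": ["price", "qty"], "op": "multiply"},
  {"type": "validate", "field": "total", "min": 0}
]
"""


def apply_rules(row: Dict[str, Any], rules: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Apply each rule in order and return the resulting target row."""
    out: Dict[str, Any] = {}
    for rule in rules:
        kind = rule["type"]
        if kind == "copy":
            out[rule["target"]] = row[rule["source"]]
        elif kind == "transform" and rule["op"] == "lower":
            out[rule["target"]] = str(row[rule["source"]]).lower()
        elif kind == "computed" and rule["op"] == "multiply":
            a, b = (row[name] for name in rule["inputs"])
            out[rule["target"]] = a * b
        elif kind == "validate":
            # Validation checks a value already written to the target row.
            if out[rule["field"]] < rule["min"]:
                raise ValueError(f"{rule['field']} is below {rule['min']}")
        else:
            raise ValueError(f"Unsupported rule: {rule}")
    return out


rules = json.loads(RULES_JSON)
result = apply_rules(
    {"name": "Ada", "email": "ADA@Example.com", "price": 2.5, "qty": 4}, rules
)
print(result)
```

Keeping each rule as plain data is what makes a JSON rules file possible: the logic lives in the code that interprets the rules, and the file only selects and parameterizes it.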

Installation

# Using pip
pip install excel-migration-framework

# Using poetry
poetry add excel-migration-framework

Quick Start

Basic Usage

# Simple migration with rules
excel-migrate source.xlsx target.xlsx --rules rules.json

# Process specific sheets
excel-migrate source.xlsx target.xlsx \
    --source-sheets "Sheet1" "Sheet2" \
    --target-sheets "Output1" "Output2"

# Generate rules from example files with sheet selection
excel-migrate source.xlsx target.xlsx \
    --example-source example_source.xlsx \
    --example-target example_target.xlsx \
    --example-source-sheets "Template1" \
    --example-target-sheets "Result1"

# Include screenshots with sheet mapping
excel-migrate source.xlsx target.xlsx \
    --screenshots sheet1.png sheet2.png \
    --screenshot-sheet-mapping "sheet1.png:Sheet1" "sheet2.png:Sheet2"
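
The `--screenshot-sheet-mapping` values above pair an image with a sheet name using a colon. As a rough illustration of how such pairs could be parsed, the sketch below uses a hypothetical helper, `parse_screenshot_mapping`, which is not part of the framework's documented API.

```python
from pathlib import Path
from typing import Dict, List


def parse_screenshot_mapping(pairs: List[str]) -> Dict[Path, str]:
    """Turn "image:Sheet" strings into a {image path: sheet name} mapping."""
    mapping: Dict[Path, str] = {}
    for pair in pairs:
        # Split on the last colon so a path containing a drive letter still works.
        image, sep, sheet = pair.rpartition(":")
        if not sep or not image or not sheet:
            raise ValueError(f"Expected 'image:Sheet', got {pair!r}")
        mapping[Path(image)] = sheet
    return mapping


mapping = parse_screenshot_mapping(["sheet1.png:Sheet1", "sheet2.png:Sheet2"])
print(mapping)
```

Splitting on the last colon is safe here because Excel does not allow colons in sheet names.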

Python API

import asyncio
from pathlib import Path

from excel_migration.core.processor import TaskBasedProcessor
from excel_migration.tasks.base import MigrationTask


async def main() -> bool:
    # Create a migration task with sheet selection
    task = MigrationTask(
        source_file=Path("source.xlsx"),
        target_file=Path("target.xlsx"),
        task_type="migrate",
        description="Migrate customer data",
        context={
            "sheet_mapping": {
                "CustomerData": "Processed_Customers",
                "Transactions": "Processed_Transactions"
            }
        },
        screenshots=[Path("sheet1.png")]
    )

    # Process the task (the processor's constructor arguments are elided here)
    processor = TaskBasedProcessor(...)
    return await processor.process(task)


# process() is a coroutine, so it has to run inside an event loop
success = asyncio.run(main())

Plugin System

The framework uses a flexible plugin system for formula execution and value transformations, following SOLID principles:

Formula Executors

from typing import Any, Dict

from excel_migration.plugins.interfaces import FormulaExecutor

class CustomFormulaExecutor(FormulaExecutor):
    """Custom formula executor plugin."""

    formula_type = "CUSTOM"

    def can_execute(self, formula: str) -> bool:
        """Check if this executor can handle the formula."""
        return formula.startswith("CUSTOM(")

    def execute(self, formula: str, values: Dict[str, Any]) -> Any:
        """Execute the custom formula."""
        # Implement custom formula logic here
        raise NotImplementedError

# Register the plugin.
# PluginRegistry ships with the framework; its import path is not shown in this README.
registry = PluginRegistry()
registry.register_formula_executor(CustomFormulaExecutor())
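
The snippet above depends on the framework being installed. To show the dispatch idea on its own, here is a self-contained sketch that uses stand-in classes: `FormulaExecutor` is re-declared locally with only the two methods shown above, `SimpleRegistry` and its `execute_formula` method are hypothetical, and `PERCENT(...)` is a made-up formula. It illustrates the pattern and is not the framework's actual implementation.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class FormulaExecutor(ABC):
    """Stand-in for the framework's interface, reduced to the two methods shown above."""

    @abstractmethod
    def can_execute(self, formula: str) -> bool: ...

    @abstractmethod
    def execute(self, formula: str, values: Dict[str, Any]) -> Any: ...


class PercentExecutor(FormulaExecutor):
    """Handles the made-up formula PERCENT(part, whole)."""

    def can_execute(self, formula: str) -> bool:
        return formula.startswith("PERCENT(") and formula.endswith(")")

    def execute(self, formula: str, values: Dict[str, Any]) -> Any:
        # Strip "PERCENT(" and ")" and look up both argument names in values.
        part, whole = (name.strip() for name in formula[len("PERCENT("):-1].split(","))
        return 100.0 * values[part] / values[whole]


class SimpleRegistry:
    """Hypothetical registry that tries executors in registration order."""

    def __init__(self) -> None:
        self._executors: List[FormulaExecutor] = []

    def register_formula_executor(self, executor: FormulaExecutor) -> None:
        self._executors.append(executor)

    def execute_formula(self, formula: str, values: Dict[str, Any]) -> Any:
        for executor in self._executors:
            if executor.can_execute(formula):
                return executor.execute(formula, values)
        raise LookupError(f"No executor registered for {formula!r}")


registry = SimpleRegistry()
registry.register_formula_executor(PercentExecutor())
percent = registry.execute_formula("PERCENT(paid, total)", {"paid": 30, "total": 120})
print(percent)
```

Because the registry only asks each executor whether it can handle a formula, new formula types can be added without touching existing executors.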

Transformation Handlers

from typing import Any, Dict

from excel_migration.plugins.interfaces import TransformationHandler

class CustomTransformer(TransformationHandler):
    """Custom transformation plugin."""

    transformation_type = "custom_format"

    def can_transform(self, transformation: Dict[str, Any]) -> bool:
        """Check if this handler can process the transformation."""
        return transformation.get("type") == self.transformation_type

    def transform(self, value: Any, params: Dict[str, Any]) -> Any:
        """Transform the value according to parameters."""
        # Implement custom transformation logic here
        raise NotImplementedError

# Register the plugin (registry is the PluginRegistry instance created earlier)
registry.register_transformation_handler(CustomTransformer())
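
As a concrete illustration, a handler with real logic might look like the sketch below. `TransformationHandler` is re-declared as a local stand-in so the example runs without the framework, and the `zero_pad` type, the `width` parameter, and the shape of `spec` are assumptions made for this example.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class TransformationHandler(ABC):
    """Stand-in for the framework's interface, mirroring the methods shown above."""

    transformation_type: str = ""

    def can_transform(self, transformation: Dict[str, Any]) -> bool:
        return transformation.get("type") == self.transformation_type

    @abstractmethod
    def transform(self, value: Any, params: Dict[str, Any]) -> Any: ...


class ZeroPadTransformer(TransformationHandler):
    """Left-pads values with zeros, for example for fixed-width customer IDs."""

    transformation_type = "zero_pad"

    def transform(self, value: Any, params: Dict[str, Any]) -> Any:
        width = int(params.get("width", 0))
        return str(value).zfill(width)


handler = ZeroPadTransformer()
spec = {"type": "zero_pad", "params": {"width": 6}}  # hypothetical rule fragment
if handler.can_transform(spec):
    padded = handler.transform(42, spec["params"])
    print(padded)
```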

Built-in Plugins

The framework includes several built-in plugins:

Formula Executors:

  • DateDiffExecutor: Calculate date differences
  • CountExecutor: Count values or records
  • CountIfExecutor: Conditional counting
  • SumExecutor: Sum numeric values
  • AverageExecutor: Calculate averages

Transformation Handlers:

  • DateTimeTransformer: Format dates and times
  • NumericTransformer: Format numbers
  • BooleanTransformer: Convert to boolean values
  • ConcatenateTransformer: Join multiple values

Task Types

Migration Task

Migrates data from source to target Excel files.

excel-migrate source.xlsx target.xlsx \
    --task-type migrate \
    --source-sheets "Data" \
    --target-sheets "Processed"

Analysis Task

Analyzes Excel files and provides insights.

excel-migrate source.xlsx target.xlsx \
    --task-type analyze \
    --source-sheets "Financial" "Metrics"

Validation Task

Validates data against rules.

excel-migrate source.xlsx target.xlsx \
    --task-type validate \
    --source-sheets "Input" \
    --rules validation_rules.json

Multimodal Analysis

The framework can analyze Excel sheets through multiple approaches:

  1. Direct File Analysis

    • Structure analysis
    • Formula parsing
    • Data type detection
  2. Visual Analysis (from screenshots)

    • Table structure detection
    • Cell boundary recognition
    • Text extraction (OCR)
    • Layout analysis
  3. LLM Integration

    • Natural language understanding
    • Complex pattern recognition
    • Context-aware transformations
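
To give a feel for the data type detection step under Direct File Analysis, here is a small standard-library sketch. It assumes the cell values have already been read into plain Python lists (by whichever Excel reader is in use), and the function names and type labels are illustrative only, not the framework's own.

```python
from datetime import date, datetime
from typing import Any, Dict, List


def classify(value: Any) -> str:
    """Label one cell value; strings starting with "=" are treated as formulas."""
    if value is None or value == "":
        return "empty"
    if isinstance(value, bool):  # bool is a subclass of int, so test it first
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, (date, datetime)):
        return "date"
    if isinstance(value, str) and value.startswith("="):
        return "formula"
    return "text"


def detect_column_types(header: List[str], rows: List[List[Any]]) -> Dict[str, str]:
    """Return the single non-empty type found in each column, or "mixed"."""
    result: Dict[str, str] = {}
    for index, name in enumerate(header):
        kinds = {classify(row[index]) for row in rows} - {"empty"}
        if not kinds:
            result[name] = "empty"
        elif len(kinds) == 1:
            result[name] = kinds.pop()
        else:
            result[name] = "mixed"
    return result


header = ["id", "joined", "total", "note"]
rows = [
    [1, date(2024, 1, 5), "=A2*2", None],
    [2, date(2024, 2, 1), "=A3*2", "late"],
]
types = detect_column_types(header, rows)
print(types)
```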

Rule Generation

Rules can be generated automatically by analyzing example files:

# Generate rules from specific sheets in examples
excel-migrate source.xlsx target.xlsx \
    --example-source example_source.xlsx \
    --example-target example_target.xlsx \
    --example-source-sheets "Template" \
    --example-target-sheets "Final" \
    --output-rules rules.json

The framework will:

  1. Analyze source and target examples
  2. Identify patterns and transformations
  3. Generate appropriate rules
  4. Save rules for future use
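
The four steps above can be pictured with a deliberately naive sketch. It assumes that the example source and target rows are non-empty and aligned one to one, and it only tries a fixed list of candidate transformations. The framework itself may rely on LLM analysis rather than this brute-force matching, and the rule dictionaries produced here use a hypothetical schema.

```python
import json
from typing import Any, Callable, Dict, List

# Candidate transformations tried in order; "copy" is the identity.
CANDIDATES: Dict[str, Callable[[Any], Any]] = {
    "copy": lambda v: v,
    "upper": lambda v: str(v).upper(),
    "strip": lambda v: str(v).strip(),
}


def infer_rules(
    source: List[Dict[str, Any]], target: List[Dict[str, Any]]
) -> List[Dict[str, str]]:
    """For each target column, find a source column and transformation that reproduce it."""
    rules: List[Dict[str, str]] = []
    for t_col in target[0]:
        for s_col in source[0]:
            # A candidate matches only if it reproduces the target in every example row.
            match = next(
                (
                    name
                    for name, fn in CANDIDATES.items()
                    if all(fn(s[s_col]) == t[t_col] for s, t in zip(source, target))
                ),
                None,
            )
            if match:
                rules.append({"type": match, "source": s_col, "target": t_col})
                break
    return rules


example_source = [{"name": " ada ", "country": "it"}, {"name": "bob", "country": "de"}]
example_target = [{"Name": "ada", "Country": "IT"}, {"Name": "bob", "Country": "DE"}]

inferred = infer_rules(example_source, example_target)
print(json.dumps(inferred, indent=2))  # this JSON text is what would be saved for reuse
```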

Configuration

LLM Providers

# Use OpenAI
excel-migrate source.xlsx target.xlsx \
    --llm-provider openai \
    --model gpt-4

# Use Anthropic
excel-migrate source.xlsx target.xlsx \
    --llm-provider anthropic \
    --model claude-2

Logging

# Set log level
excel-migrate source.xlsx target.xlsx --log-level DEBUG

# Log to file
excel-migrate source.xlsx target.xlsx --log-file migration.log

Advanced Features

Custom Rule Types

Create custom rule types by implementing the Rule interface:

from typing import Any, Dict

from excel_migration.core.interfaces import Rule

class CustomRule(Rule):
    async def apply(self, data: Any, context: Dict[str, Any]) -> Any:
        # Implement custom logic here
        raise NotImplementedError
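
For a runnable picture of an asynchronous rule, the sketch below re-declares `Rule` as a local stand-in with only the `apply()` method shown above. The `decimals` context key and the `CurrencyRoundingRule` class are assumptions made for this example.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class Rule(ABC):
    """Stand-in for the framework's Rule interface, reduced to apply()."""

    @abstractmethod
    async def apply(self, data: Any, context: Dict[str, Any]) -> Any: ...


class CurrencyRoundingRule(Rule):
    """Rounds numeric values to the number of decimals given in the context."""

    async def apply(self, data: Any, context: Dict[str, Any]) -> Any:
        decimals = context.get("decimals", 2)  # hypothetical context key
        if isinstance(data, (int, float)) and not isinstance(data, bool):
            return round(data, decimals)
        return data  # leave non-numeric cells untouched


async def main() -> List[Any]:
    rule = CurrencyRoundingRule()
    cells = [19.987, "n/a", 5]
    return [await rule.apply(cell, {"decimals": 2}) for cell in cells]


rounded = asyncio.run(main())
print(rounded)
```

Since `apply()` is a coroutine, a rule can await slow work such as an LLM call without blocking other rules running in the same event loop.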

Event Handling

Subscribe to migration events:

from typing import Any, Dict

from excel_migration.core.interfaces import EventEmitter

def on_cell_processed(data: Dict[str, Any]) -> None:
    print(f"Processed cell: {data}")

emitter = EventEmitter()
emitter.on("cell_processed", on_cell_processed)
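
The subscription pattern above can be tried out without the framework using a small stand-in emitter. `SimpleEventEmitter` is a hypothetical class: its `on()` mirrors the call shown above, while `emit()` and the payload keys are assumptions about how events would be delivered.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, Dict, List

Handler = Callable[[Dict[str, Any]], None]


class SimpleEventEmitter:
    """Stand-in emitter; emit() is an assumed counterpart to on()."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def on(self, event: str, handler: Handler) -> None:
        self._handlers[event].append(handler)

    def emit(self, event: str, data: Dict[str, Any]) -> None:
        # Call handlers in subscription order; events without subscribers are ignored.
        for handler in self._handlers.get(event, []):
            handler(data)


processed: List[str] = []
emitter = SimpleEventEmitter()
emitter.on("cell_processed", lambda data: processed.append(data["cell"]))
emitter.emit("cell_processed", {"cell": "A1", "value": 42})
emitter.emit("sheet_done", {"sheet": "Data"})  # no subscriber, so nothing happens
print(processed)
```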

Caching

Enable caching for better performance:

excel-migrate source.xlsx target.xlsx --cache-dir ./cache
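
How the cache directory is used internally is not documented here. One common approach, sketched below under that caveat, is to key cached results by a hash of the workbook's contents so that an unchanged file skips expensive analysis. The `cached_analysis` helper and the JSON storage format are hypothetical.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Any, Callable, List


def cached_analysis(workbook: Path, cache_dir: Path, analyze: Callable[[Path], Any]) -> Any:
    """Return analyze(workbook), reusing a JSON result stored under the file's content hash."""
    digest = hashlib.sha256(workbook.read_bytes()).hexdigest()
    cache_file = cache_dir / f"{digest}.json"
    if cache_file.exists():
        return json.loads(cache_file.read_text())
    result = analyze(workbook)
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(result))
    return result


calls: List[str] = []


def fake_analyze(path: Path) -> dict:
    """Placeholder for an expensive step such as an LLM or OCR call."""
    calls.append(path.name)
    return {"size": path.stat().st_size}


with tempfile.TemporaryDirectory() as tmp:
    book = Path(tmp) / "source.xlsx"
    book.write_bytes(b"not a real workbook")
    first = cached_analysis(book, Path(tmp) / "cache", fake_analyze)
    second = cached_analysis(book, Path(tmp) / "cache", fake_analyze)

print(first, second, calls)
```

Hashing the contents rather than the file name means an edited workbook gets a fresh analysis even if its path stays the same.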

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Development Setup

# Clone repository
git clone https://github.com/blackms/ExcelMigrationTool.git
cd ExcelMigrationTool

# Install dependencies
poetry install

# Run tests
poetry run pytest

License

This project is licensed under the MIT License - see the LICENSE file for details.
