Optimus Spark is a collection of pre-built, optimized transformation templates for Apache Spark

dropit-dev/optimus-spark


Optimus Spark

Transform your data pipelines with the power of the AllSpark.

Optimus Spark is here to roll out pre-built, battle-tested transformation templates for Apache Spark. No need to wage war against messy datasets and inefficient pipelines—Optimus Spark has your back. It's faster than Bumblebee and more powerful than Megatron (at least when it comes to data transformations).


🚀 Why Choose Optimus Spark?

  • Plug-and-Play Templates: Skip the heavy lifting and let Optimus Spark do the transforming.
  • Highly Customizable: Because every dataset deserves a personal touch.
  • Supercharged Performance: Built to handle big data, whether it's a skirmish or a full-scale war.
  • Detailed Documentation: So easy, even a Decepticon could use it.
  • Community-Driven: Got a better idea? Join the Autobots and contribute.

🔧 Installation

Step 1: Summon Optimus Spark by cloning the repository:

git clone https://github.com/dropit-dev/optimus-spark.git

Step 2: Move into the cloned directory and install dependencies (because even Autobots need fuel):

cd optimus-spark
pip install -r requirements.txt

📚 How to Use

Example: Roll Out a Transformation

from pyspark.sql import SparkSession

from optimus_spark.templates import example_transformation

# Create (or reuse) a Spark session
spark = SparkSession.builder \
    .appName("Optimus Spark Example") \
    .getOrCreate()

# Build a small sample DataFrame
data = [("Sam", 23), ("Mikaela", 28), ("Bumblebee", 7)]
columns = ["Name", "Age"]
df = spark.createDataFrame(data, columns)

# Apply the transformation template
transformed_df = example_transformation(df)

# Show the results
transformed_df.show()

It’s like calling Optimus Prime—except instead of saving the universe, you’re saving your data pipeline. For more examples, check the examples directory.


📦 What's Inside?

  • Data Cleansing Templates: Wipe out duplicates, nulls, and other Decepticon data elements.
  • Schema Transformers: Morph your data structures faster than Optimus morphs into a truck.
  • Aggregations: Group, sum, and dominate.
  • Data Enrichment: Like adding jetpacks to Bumblebee.

Peek inside the templates directory for the full arsenal.


🌐 Contributing

Want to join the Autobots? Here's how you can contribute:

  1. Fork the repository (just like grabbing the Matrix of Leadership).
  2. Create a feature branch (git checkout -b awesome-feature).
  3. Commit your changes (git commit -m "Added more awesomeness").
  4. Push to your branch (git push origin awesome-feature).
  5. Open a pull request (bonus points if it’s witty).

See CONTRIBUTING.md for more.


📝 License

Optimus Spark is licensed under the Apache License 2.0. Use it freely, but remember to give credit to Dropit Shopping & Friends (because we’re the good guys). See the LICENSE file for details.


🤝 Shoutouts

Developed by Dropit Shopping & Friends, with inspiration from the greatest Autobot of all time (and Spark, obviously). Ready to transform and roll out? Let’s do this!


With Optimus Spark, your pipelines won’t just work—they’ll transform.
