Quick Start

Get up and running with Veloxx in just 5 minutes! This guide will walk you through creating your first DataFrame and performing basic operations.

Prerequisites

Make sure you have Rust installed. If not, install it from rustup.rs.

Create a New Project

cargo new veloxx-quickstart
cd veloxx-quickstart

Add Veloxx to Your Project

Add Veloxx to your Cargo.toml:

Cargo.toml
[dependencies]
veloxx = "0.2.4"

To enable optional features, list them in the dependency entry:

Cargo.toml
[dependencies]
veloxx = { version = "0.2.4", features = ["advanced_io", "data_quality", "window_functions"] }

Your First DataFrame

Replace the contents of src/main.rs with:

src/main.rs
use veloxx::dataframe::DataFrame;
use veloxx::series::Series;
use std::collections::BTreeMap;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create a DataFrame from scratch
    let mut columns = BTreeMap::new();
    
    columns.insert(
        "name".to_string(),
        Series::new_string("name", vec![
            Some("Alice".to_string()),
            Some("Bob".to_string()),
            Some("Charlie".to_string()),
            Some("Diana".to_string()),
        ]),
    );
    
    columns.insert(
        "age".to_string(),
        Series::new_i32("age", vec![Some(30), Some(25), Some(35), Some(28)]),
    );
    
    columns.insert(
        "salary".to_string(),
        Series::new_f64("salary", vec![
            Some(75000.0), 
            Some(65000.0), 
            Some(85000.0), 
            Some(72000.0)
        ]),
    );

    columns.insert(
        "department".to_string(),
        Series::new_string("department", vec![
            Some("Engineering".to_string()),
            Some("Marketing".to_string()),
            Some("Engineering".to_string()),
            Some("Sales".to_string()),
        ]),
    );

    let df = DataFrame::new(columns)?;
    println!("📊 Our Employee DataFrame:");
    println!("{}", df);

    Ok(())
}

Run your program:

cargo run

You should see output like:

📊 Our Employee DataFrame:
age            department     name           salary         
--------------- --------------- --------------- --------------- 
30             Engineering    Alice          75000.00       
25             Marketing      Bob            65000.00       
35             Engineering    Charlie        85000.00       
28             Sales          Diana          72000.00       
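
Notice that the columns print in alphabetical order (age, department, name, salary) rather than in the order they were inserted. That is consistent with how the standard library's BTreeMap behaves: it always iterates in sorted key order. The std-only sketch below shows just that property. It does not use Veloxx, and `column_order` is a hypothetical helper written for this illustration.

```rust
use std::collections::BTreeMap;

/// Returns the keys of the map in iteration order.
/// (Hypothetical helper; not part of Veloxx.)
fn column_order(columns: &BTreeMap<String, Vec<i32>>) -> Vec<String> {
    columns.keys().cloned().collect()
}

fn main() {
    let mut columns: BTreeMap<String, Vec<i32>> = BTreeMap::new();
    // Insert in the same order as the quick-start example.
    for name in ["name", "age", "salary", "department"] {
        columns.insert(name.to_string(), Vec::new());
    }
    // BTreeMap iterates in sorted key order, not insertion order.
    println!("{:?}", column_order(&columns));
}
```

Running this prints ["age", "department", "name", "salary"], the same column order as in the output above.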

Basic Operations

Now let's explore some basic operations. Replace the contents of src/main.rs with:

src/main.rs
use veloxx::dataframe::DataFrame;
use veloxx::series::Series;
use veloxx::conditions::Condition;
use veloxx::types::Value;
use std::collections::BTreeMap;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create the DataFrame (same as before)
    let mut columns = BTreeMap::new();
    columns.insert(
        "name".to_string(),
        Series::new_string("name", vec![
            Some("Alice".to_string()),
            Some("Bob".to_string()),
            Some("Charlie".to_string()),
            Some("Diana".to_string()),
        ]),
    );
    columns.insert(
        "age".to_string(),
        Series::new_i32("age", vec![Some(30), Some(25), Some(35), Some(28)]),
    );
    columns.insert(
        "salary".to_string(),
        Series::new_f64("salary", vec![
            Some(75000.0), 
            Some(65000.0), 
            Some(85000.0), 
            Some(72000.0)
        ]),
    );
    columns.insert(
        "department".to_string(),
        Series::new_string("department", vec![
            Some("Engineering".to_string()),
            Some("Marketing".to_string()),
            Some("Engineering".to_string()),
            Some("Sales".to_string()),
        ]),
    );

    let df = DataFrame::new(columns)?;

    // 1. Basic DataFrame info
    println!("📊 DataFrame Info:");
    println!("Rows: {}, Columns: {}", df.row_count(), df.column_count());
    println!("Columns: {:?}\n", df.column_names());

    // 2. Filter employees with salary > 70000
    println!("💰 High Earners (Salary > $70,000):");
    let high_salary_condition = Condition::Gt("salary".to_string(), Value::F64(70000.0));
    let high_earners = df.filter(&high_salary_condition)?;
    println!("{}\n", high_earners);

    // 3. Select specific columns
    println!("👥 Names and Ages Only:");
    let names_ages = df.select_columns(vec!["name".to_string(), "age".to_string()])?;
    println!("{}\n", names_ages);

    // 4. Filter Engineering department
    println!("🔧 Engineering Team:");
    let eng_condition = Condition::Eq(
        "department".to_string(), 
        Value::String("Engineering".to_string())
    );
    let engineering_team = df.filter(&eng_condition)?;
    println!("{}\n", engineering_team);

    // 5. Sort by age; passing `false` as the second argument requests descending order
    println!("📈 Sorted by Age (Oldest First):");
    let sorted_by_age = df.sort(vec!["age".to_string()], false)?;
    println!("{}\n", sorted_by_age);

    // 6. Basic statistics
    println!("📊 Salary Statistics:");
    if let Some(salary_series) = df.get_column("salary") {
        println!("Mean Salary: ${:.2}", salary_series.mean()?);
        println!("Max Salary: ${:.2}", salary_series.max()?);
        println!("Min Salary: ${:.2}", salary_series.min()?);
    }

    Ok(())
}

Run this enhanced example:

cargo run
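
To sanity-check what the program prints, you can work out the expected filter and statistics results by hand. The sketch below does this over plain vectors using only the standard library. It illustrates the expected values, not how Veloxx implements these operations, and the helpers `names_with_salary_above` and `mean` are hypothetical names.

```rust
/// Names of rows whose salary is strictly greater than `threshold`.
/// Rows with a missing salary (None) never match.
fn names_with_salary_above<'a>(
    names: &[&'a str],
    salaries: &[Option<f64>],
    threshold: f64,
) -> Vec<&'a str> {
    let mut matching = Vec::new();
    for (name, salary) in names.iter().zip(salaries.iter()) {
        if let Some(value) = salary {
            if *value > threshold {
                matching.push(*name);
            }
        }
    }
    matching
}

/// Mean of the non-null values, or None if there are none.
fn mean(values: &[Option<f64>]) -> Option<f64> {
    let present: Vec<f64> = values.iter().flatten().copied().collect();
    if present.is_empty() {
        None
    } else {
        Some(present.iter().sum::<f64>() / present.len() as f64)
    }
}

fn main() {
    let names = ["Alice", "Bob", "Charlie", "Diana"];
    let salaries = [Some(75000.0), Some(65000.0), Some(85000.0), Some(72000.0)];

    println!("{:?}", names_with_salary_above(&names, &salaries, 70000.0));
    if let Some(m) = mean(&salaries) {
        println!("{:.2}", m);
    }
}
```

For the sample data, the salary filter should keep Alice, Charlie, and Diana, and the mean salary should be 74250.00, with a maximum of 85000 and a minimum of 65000.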

Working with CSV Files

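Veloxx can load data from CSV files. Create a sample file named employees.csv in the project root, since the relative path used below is resolved against the directory you run cargo run from: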

employees.csv
name,age,salary,department
Alice,30,75000,Engineering
Bob,25,65000,Marketing
Charlie,35,85000,Engineering
Diana,28,72000,Sales
Eve,32,78000,Engineering
Frank,29,68000,Marketing

Then load and process it:

src/main.rs
use veloxx::dataframe::DataFrame;
use veloxx::conditions::Condition;
use veloxx::types::Value;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load DataFrame from CSV
    let df = DataFrame::from_csv("employees.csv")?;
    
    println!("📂 Loaded from CSV:");
    println!("{}\n", df);

    // Group by department and calculate average salary
    println!("📊 Average Salary by Department:");
    let grouped = df.group_by(vec!["department".to_string()])?;
    let avg_salaries = grouped.agg(vec![("salary", "mean")])?;
    println!("{}\n", avg_salaries);

    // Find employees aged 30 or older
    println!("👴 Employees 30 or Older:");
    let condition = Condition::Gte("age".to_string(), Value::I32(30));
    let mature_employees = df.filter(&condition)?;
    println!("{}", mature_employees);

    Ok(())
}
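
If you want to check the grouped averages independently, the same aggregation can be reproduced with the standard library alone. This is a minimal sketch that inlines the sample rows and groups them with a BTreeMap. It says nothing about how Veloxx's group_by and agg work internally, and `mean_salary_by_department` is a hypothetical helper.

```rust
use std::collections::BTreeMap;

// Same rows as employees.csv, inlined so the sketch needs no file.
const CSV: &str = "\
name,age,salary,department
Alice,30,75000,Engineering
Bob,25,65000,Marketing
Charlie,35,85000,Engineering
Diana,28,72000,Sales
Eve,32,78000,Engineering
Frank,29,68000,Marketing";

/// Groups rows by the department column and averages the salary column.
/// Malformed rows are skipped.
fn mean_salary_by_department(csv: &str) -> BTreeMap<String, f64> {
    // department -> (running salary total, row count)
    let mut totals: BTreeMap<String, (f64, u32)> = BTreeMap::new();
    for line in csv.lines().skip(1) {
        let fields: Vec<&str> = line.split(',').collect();
        if fields.len() != 4 {
            continue;
        }
        let salary = match fields[2].trim().parse::<f64>() {
            Ok(value) => value,
            Err(_) => continue,
        };
        let entry = totals
            .entry(fields[3].trim().to_string())
            .or_insert((0.0, 0));
        entry.0 += salary;
        entry.1 += 1;
    }
    totals
        .into_iter()
        .map(|(dept, (sum, count))| (dept, sum / f64::from(count)))
        .collect()
}

fn main() {
    for (dept, mean) in mean_salary_by_department(CSV) {
        println!("{}: {:.2}", dept, mean);
    }
}
```

For the sample file, the averages work out to 79333.33 for Engineering, 66500.00 for Marketing, and 72000.00 for Sales. The age filter in the program above should keep Alice, Charlie, and Eve.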

Next Steps

Congratulations! You've learned the basics of Veloxx. Here's what to explore next:

🚀 Advanced Features

Advanced I/O: Work with JSON, Parquet, and other formats
Data Quality: Validate and clean your data
Window Functions: Perform advanced analytics
Joins: Combine multiple DataFrames

📚 Learning Resources

Complete API Reference: Explore all available methods
Examples Repository: Real-world usage patterns
Performance Guide: Optimize your data processing

🔧 Integration

Python Bindings: Use Veloxx from Python
JavaScript/WASM: Run Veloxx in the browser or Node.js

💡 Common Patterns

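// Chain operations into a data pipeline
// (age_condition can be any Condition, such as the one from the CSV example)
let age_condition = Condition::Gte("age".to_string(), Value::I32(30));
let result = df
    .filter(&age_condition)?
    .select_columns(vec!["name".to_string(), "salary".to_string()])?
    .sort(vec!["salary".to_string()], false)?;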

// Handle missing data
let clean_df = df.drop_nulls()?;
let filled_df = df.fill_nulls(Value::I32(0))?;

// Export results
df.to_csv("output.csv")?;
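
The two missing-data calls above lead to different results, and it helps to see the difference on a tiny example. The sketch below models columns as vectors of Option values using only the standard library. It assumes that dropping nulls removes every row containing a null and that filling replaces each null with the given value; check the API reference for the exact Veloxx semantics. `drop_null_rows` and `fill_null_values` are hypothetical helpers, not Veloxx functions.

```rust
/// Keeps only the rows where both columns have a value.
fn drop_null_rows(ages: &[Option<i32>], bonuses: &[Option<i32>]) -> Vec<(i32, i32)> {
    ages.iter()
        .zip(bonuses.iter())
        .filter_map(|(age, bonus)| match (age, bonus) {
            (Some(a), Some(b)) => Some((*a, *b)),
            _ => None,
        })
        .collect()
}

/// Replaces every missing value with `fill`, keeping the row count unchanged.
fn fill_null_values(values: &[Option<i32>], fill: i32) -> Vec<i32> {
    values.iter().map(|v| v.unwrap_or(fill)).collect()
}

fn main() {
    let ages = [Some(30), None, Some(35)];
    let bonuses = [Some(1), Some(2), None];

    // Dropping shrinks the data: only the first row is complete.
    println!("{:?}", drop_null_rows(&ages, &bonuses));
    // Filling keeps all three rows and substitutes 0 for the gap.
    println!("{:?}", fill_null_values(&ages, 0));
}
```

Dropping is the safer choice when a placeholder would distort later statistics such as the mean, while filling preserves the row count for operations that need aligned columns.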

🤝 Community

GitHub Discussions: Ask questions and share ideas
Issues: Report bugs or request features
Contributing Guide: Help improve Veloxx

Pro Tip

Start small with simple operations and gradually explore more advanced features. The Veloxx API is designed to be intuitive and chainable for building complex data processing pipelines.

Performance Note

Veloxx is optimized for performance with columnar storage and lazy evaluation. For large datasets, consider using features like chunked processing and streaming I/O.
