Incremental Processing

Incremental processing is a powerful feature that allows promptprep to only process files that have changed since the last run. This can significantly improve performance when working with large codebases.

Overview

When you run promptprep with the --incremental option, it will:

  1. Check the modification time of each file

  2. Compare it with the last run timestamp

  3. Only process files that have been modified since the last run

  4. Include unchanged files from the previous output

This approach saves time and resources, especially for large projects where only a few files change between runs.

Basic Usage

To use incremental processing:

promptprep --incremental [other options]

When you use the --incremental option without specifying a timestamp, promptprep will:

  1. Look for a previous output file (specified with -o, --output-file)

  2. Extract the timestamp from that file

  3. Use that timestamp as the reference point

If no previous output file exists or the timestamp can’t be extracted, promptprep will process all files.

Specifying a Timestamp

You can explicitly specify the reference timestamp using the --last-run-timestamp option:

promptprep --incremental --last-run-timestamp 1678886400.0 [other options]

The timestamp should be a Unix timestamp (seconds since January 1, 1970). You can generate a timestamp in various ways:

  • In Python: import time; print(time.time())

  • In Bash: date +%s

  • In JavaScript: Math.floor(Date.now() / 1000)

Example Workflow

Here’s a typical workflow using incremental processing:

  1. Initial run:

    promptprep -d ./my_project -o snapshot.txt
    
  2. Later, after making changes to some files:

    promptprep -d ./my_project --incremental -o updated_snapshot.txt
    
  3. promptprep will: - Detect which files have changed since the creation of snapshot.txt - Only process those changed files - Include unchanged files from snapshot.txt - Save the result to updated_snapshot.txt

Combining with Other Features

Incremental processing works well with other promptprep features:

With Diff Generation

Track changes over time by combining incremental processing with diff generation:

# First run
promptprep -d . -o baseline.txt

# Later, after making changes
promptprep -d . --incremental --diff baseline.txt -o changes.txt

This will show you exactly what changed between runs.

With Metadata

Get statistics about your changes:

promptprep -d . --incremental --metadata -o updated_snapshot.txt

The metadata will include information about how many files were processed incrementally.

With Different Output Formats

Incremental processing works with all output formats:

promptprep -d . --incremental --format markdown -o snapshot.md
promptprep -d . --incremental --format html -o snapshot.html

Advanced Use Cases

Continuous Integration

In a CI/CD pipeline, you can use incremental processing to only analyze files that have changed in a pull request:

# Get the base branch timestamp
BASE_TIMESTAMP=$(git show --format=%at -s origin/main)

# Process only files that changed since the base branch
promptprep -d . --incremental --last-run-timestamp $BASE_TIMESTAMP -o pr_changes.txt

Scheduled Snapshots

Create regular snapshots of your codebase, processing only what has changed:

# In a cron job or scheduled task
TIMESTAMP=$(date +%s)
promptprep -d . --incremental -o "snapshots/snapshot_$TIMESTAMP.txt"

Performance Considerations

Incremental processing can significantly improve performance, but there are some factors to consider:

File Count vs. File Size

The performance benefit depends on:

  • The number of files in your project

  • The size of those files

  • The percentage of files that have changed

For projects with many large files where only a few files change between runs, the performance improvement can be substantial.

Overhead

There is some overhead involved in:

  • Reading the previous output file

  • Extracting unchanged content

  • Checking file modification times

For very small projects, this overhead might outweigh the benefits.

Best Practices

  1. Keep Previous Outputs: Store previous output files if you plan to use incremental processing.

  2. Use with Version Control: Incremental processing works well with version-controlled projects where changes are tracked.

  3. Consider File Patterns: Use with -x, --extensions and -e, --exclude-dirs to focus on relevant files.

  4. Verify Results: Occasionally run a full (non-incremental) process to ensure consistency.

Troubleshooting

If incremental processing isn’t working as expected:

  1. Check Timestamps: Ensure the timestamp is in the correct format (Unix timestamp).

  2. Verify File Modification Times: Some file systems or operations might not update modification times correctly.

  3. Check Previous Output: Make sure the previous output file exists and is readable.

  4. Run with Verbose Output: Add the -v, --verbose flag to see more information about what promptprep is doing.

Limitations

There are some limitations to be aware of:

  1. Deleted Files: If files have been deleted since the last run, they might still appear in the output unless you use the --diff option.

  2. Renamed Files: Renamed files are treated as new files, as promptprep tracks changes based on file paths.

  3. Configuration Changes: If you change configuration options (like excluded directories), you should run a full process instead of an incremental one.