Victor Freire 6f662ba196 checkpoint
2025-12-26 17:55:47 -03:00

SML Package Manager Specification

Overview

A package manager for Standard ML that uses MLton's MLB (ML Basis) path maps as its primary integration mechanism. It downloads dependencies to a global cache (similar to Go's module cache) and generates a path map that MLton uses to resolve dependencies at build time.

Core Principles

  1. Native Integration - Works directly with MLton's existing tooling via path maps
  2. Reproducible Builds - Lockfile ensures consistent dependency resolution
  3. VCS-Based - Dependencies are version-controlled repositories (Git, with Mercurial planned)
  4. Flexible Versioning - Supports semver tags, branches, or direct commit references
  5. Global Cache - Shared package storage across projects to avoid redundant downloads
  6. Optional Vendoring - Support for committing dependencies to repository
  7. Simple and Transparent - Text-based configuration and generated artifacts

Summary

This package manager design provides:

  • Reproducible builds via lockfile with exact commit SHAs
  • Efficient global package cache with bare repositories and worktrees
  • Native MLton integration via path maps
  • Flexible dependency specification: semver tags, branches, or commits
  • Optional vendoring for offline builds
  • Local development support via path dependencies
  • Simple, transparent text-based formats
  • Explicit name conflict resolution via aliases
  • Familiar workflow (init, add, sync, upgrade) for developers used to modern package managers

The use of MLB path maps as the integration mechanism keeps the tool simple while providing powerful dependency management for Standard ML projects. Dependencies can be specified using semantic versioning (when tags are available), or by referencing branches or specific commits directly. Name conflicts are handled explicitly through user-defined aliases, ensuring predictable and stable variable names across dependency changes.

Directory Structure

Global Package Cache

~/.smlpm/
  git/
    db/                              # Bare repositories (all history)
      github.com-abc123/             # Hash of "github.com/user/repo"
      github.com-def456/             # Hash of another repo URL
    checkouts/                       # Working trees (by commit SHA)
      github.com-abc123/             # Matches db/ hash
        a1b2c3d4e5f6.../             # Worktree at commit a1b2c3...
          smlpm.toml
          src/
          lib.mlb
        f6e5d4c3b2a1.../             # Worktree at commit f6e5d4...
          smlpm.toml
          src/
          lib.mlb
  cache/
    downloads/
  config.toml

Structure explanation:

  • git/db/ contains bare Git repositories with all history and tags
  • Repository directories are named using a hash of the Git URL (avoids filesystem path issues)
  • git/checkouts/<repo-hash>/<commit-sha>/ are Git worktrees checked out at specific commits
  • Each commit SHA (full 40-character hash) gets its own worktree directory
  • Multiple projects can share the same worktree if they use the same commit
  • Different projects can use different versions simultaneously
  • Unused worktrees can be cleaned up with smlpm clean

Project Structure

myproject/
  smlpm.toml          # Package manifest
  smlpm.lock          # Lockfile (committed)
  smlpm.pathmap       # Generated path map (gitignored)
  src/
    main.mlb
    main.sml
  vendor/             # Optional vendored dependencies
    github.com/
      user/
        package-name/      # Copy of worktree (no commit SHA needed)
          smlpm.toml
          src/

File Formats

Package Manifest (smlpm.toml)

[package]
name = "my-app"
version = "0.1.0"
mlb = "src/main.mlb"  # Entry point (optional)

[dependencies]
sml-json = { git = "https://github.com/user/sml-json", tag = "^1.2.0" }
sml-http = { git = "https://github.com/user/sml-http", tag = "2.1.4" }

[dev-dependencies]
sml-testing = { git = "https://github.com/user/sml-testing", tag = "^2.0.0" }

Fields:

  • package.name - Package identifier (must follow naming rules: lowercase alphanumeric with single hyphens)
  • package.version - Semantic version (must be valid semver)
  • package.mlb - Entry MLB file (optional, can be inferred)
  • dependencies - Production dependencies
  • dev-dependencies - Development-only dependencies

Package name requirements:

  • Pattern: ^[a-z0-9]+(-[a-z0-9]+)*$
  • Examples: json, sml-json, http-client, lib2d
  • Invalid: JSON, -json, json-, json--lib, json_lib
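
The naming rule can be checked mechanically. The following is a minimal Python sketch; the function name `is_valid_package_name` is hypothetical and not part of smlpm, and only the pattern itself comes from this specification.

```python
import re

# Pattern copied from the specification above.
PACKAGE_NAME_RE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def is_valid_package_name(name: str) -> bool:
    """Return True when `name` is lowercase alphanumeric with single hyphens."""
    # fullmatch also rejects a trailing newline, which `$` alone would allow.
    return PACKAGE_NAME_RE.fullmatch(name) is not None
```

For instance, `is_valid_package_name("sml-json")` is true, while `is_valid_package_name("json--lib")` is false.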

Dependency Structure:

Each dependency is a table with a VCS source (git, or hg in the future) and a version specifier (tag, branch, or rev):

# Git with semver tag constraint (version resolution applies)
sml-json = { git = "https://github.com/user/sml-json", tag = "^1.2.0" }

# Git with exact tag
sml-http = { git = "https://github.com/user/sml-http", tag = "2.1.4" }

# Git with branch reference
experimental = { git = "https://github.com/user/experimental", branch = "main" }

# Git with commit reference
pinned = { git = "https://github.com/user/pinned", rev = "abc123def456..." }

# Path dependency (for local development)
local-lib = { path = "../local-lib" }

With alias (for name conflicts or readability):

alice-json = { git = "https://github.com/alice/json", tag = "1.0.0", as = "alice-json" }
bob-json = { git = "https://github.com/bob/json", tag = "2.0.0", as = "bob-json" }

VCS Fields:

  • git - Git repository URL (required for Git dependencies)
  • hg - Mercurial repository URL (planned for future)

Version Specifier Fields (exactly one required, except for path dependencies):

  • tag - Semver version constraint (supports version ranges like ^1.0.0)
  • branch - Branch name to track (e.g., "main", "develop")
  • rev - Exact commit SHA to pin to

Other Fields:

  • path - Local filesystem path (mutually exclusive with VCS fields)
  • as - Alias for the path map variable name
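
The field rules above could be validated roughly as follows. This is a sketch that assumes the dependency table has already been parsed into a Python dict; the helper name `validate_dependency` and the error wording are illustrative assumptions, not smlpm's implementation.

```python
VCS_FIELDS = ("git", "hg")
SPECIFIER_FIELDS = ("tag", "branch", "rev")


def validate_dependency(name, dep):
    """Return a list of error strings for one dependency table (empty if valid)."""
    errors = []
    vcs = [f for f in VCS_FIELDS if f in dep]
    specifiers = [f for f in SPECIFIER_FIELDS if f in dep]

    if "path" in dep:
        # `path` is mutually exclusive with VCS fields, and path
        # dependencies do not need a version specifier.
        if vcs:
            errors.append(f"{name}: 'path' cannot be combined with '{vcs[0]}'")
        return errors

    if len(vcs) != 1:
        errors.append(f"{name}: expected exactly one VCS source (git or hg)")
    if len(specifiers) != 1:
        errors.append(f"{name}: expected exactly one of tag, branch, or rev")
    return errors
```

Whether a path dependency may also carry a `tag`, `branch`, or `rev` is not stated in the specification, so the sketch leaves that case unchecked.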

Tag Constraints (Semantic Versioning):

The tag field supports semver constraints. Versions follow the MAJOR.MINOR.PATCH format:

  • MAJOR - Incompatible API changes
  • MINOR - Backwards-compatible functionality additions
  • PATCH - Backwards-compatible bug fixes

See semver.org for full specification.

Tag constraint formats:

  • "1.2.3" - Exact version
  • "^1.2.0" - Compatible with 1.2.0 (>=1.2.0, <2.0.0) - allows MINOR and PATCH updates
  • "~1.2.0" - Approximately 1.2.0 (>=1.2.0, <1.3.0) - allows PATCH updates only
  • ">=1.2.0" - Minimum version (any version >=1.2.0)
  • "1.2.x" - Any patch version of 1.2 (equivalent to ~1.2.0)
  • ">1.2.0 <2.0.0" - Range (any version between 1.2.0 and 2.0.0)

Pre-release versions:

  • "1.2.3-alpha", "1.2.3-beta.1", "1.2.3-rc.2"
  • Pre-releases are only matched by exact version or explicit pre-release ranges
  • "^1.2.0" will NOT match "1.2.3-alpha"
  • "1.2.3-alpha" matches only that specific pre-release
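
The constraint formats and the pre-release rule can be illustrated with a small matcher. This is a sketch under stated assumptions: the names `parse`, `bounds`, and `satisfies` are hypothetical, `^` always caps at the next major version (the specification does not define special handling for 0.x), and explicit pre-release ranges are not supported (a pre-release matches only an identical exact constraint).

```python
import operator
import re

SEMVER_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?")
OPS = {">=": operator.ge, "<=": operator.le, ">": operator.gt,
       "<": operator.lt, "==": operator.eq}


def parse(version):
    """Split 'MAJOR.MINOR.PATCH[-pre]' into ((major, minor, patch), pre)."""
    m = SEMVER_RE.fullmatch(version)
    if m is None:
        raise ValueError(f"not a semver version: {version!r}")
    return (int(m[1]), int(m[2]), int(m[3])), m[4]


def bounds(term):
    """Translate one constraint term into (operator, version) comparisons."""
    if term.startswith("^"):
        core, _ = parse(term[1:])
        return [(">=", core), ("<", (core[0] + 1, 0, 0))]
    if term.startswith("~"):
        core, _ = parse(term[1:])
        return [(">=", core), ("<", (core[0], core[1] + 1, 0))]
    if term.endswith(".x"):
        major, minor = (int(p) for p in term[:-2].split("."))
        return [(">=", (major, minor, 0)), ("<", (major, minor + 1, 0))]
    for op in (">=", "<=", ">", "<"):  # two-character operators first
        if term.startswith(op):
            core, _ = parse(term[len(op):])
            return [(op, core)]
    core, _ = parse(term)
    return [("==", core)]


def satisfies(version, constraint):
    """Return True when `version` satisfies every space-separated term."""
    core, pre = parse(version)
    terms = constraint.split()
    if pre is not None or any("-" in t for t in terms):
        # Pre-releases only match the identical exact version.
        return constraint.strip() == version
    return all(OPS[op](core, bound) for t in terms for op, bound in bounds(t))
```

With these definitions, `satisfies("1.5.2", "^1.2.0")` holds, while `satisfies("1.2.3-alpha", "^1.2.0")` does not.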

Dependency Conflict Resolution:

Conflict resolution uses semver compatibility rules for all transitive dependencies, regardless of how the top-level dependency is specified. For example, if package A (fetched via branch = "main") depends on json ^1.2.0, and package B (fetched via tag = "^2.0.0") depends on json ^1.5.0, the resolver will find the highest version satisfying both constraints.

  • If constraints are compatible (e.g., ^1.2.0 and ^1.5.0), the highest compatible version is chosen
  • Incompatible constraints (e.g., ^1.0.0 and ^2.0.0) result in an error
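
As an illustration of the two rules above, the sketch below intersects caret constraints and picks the highest available version inside the intersection. It handles only `^` constraints and release versions, and the function names are hypothetical; a real resolver would need the full constraint grammar.

```python
def parse_core(version):
    """Turn '1.5.0' into the comparable tuple (1, 5, 0)."""
    return tuple(int(part) for part in version.split("."))


def caret_range(constraint):
    """Return (inclusive lower bound, exclusive upper bound) for '^X.Y.Z'."""
    low = parse_core(constraint.lstrip("^"))
    return low, (low[0] + 1, 0, 0)


def resolve(available, constraints):
    """Pick the highest available version satisfying every caret constraint."""
    ranges = [caret_range(c) for c in constraints]
    low = max(r[0] for r in ranges)    # tightest lower bound
    high = min(r[1] for r in ranges)   # tightest upper bound
    candidates = [v for v in available if low <= parse_core(v) < high]
    if not candidates:
        raise ValueError("no version satisfies: " + ", ".join(constraints))
    return max(candidates, key=parse_core)
```

Given the available versions 1.2.0, 1.5.0, 1.7.3, and 2.0.0, the constraints `^1.2.0` and `^1.5.0` resolve to 1.7.3, while `^1.0.0` and `^2.0.0` raise an error because their ranges do not overlap.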

Path Dependencies:

local-lib = { path = "../sml-json" }
http-dev = { path = "/home/user/projects/sml-http" }

Path dependencies point to a local directory containing a package. They are useful for:

  • Developing multiple packages simultaneously
  • Testing changes to a dependency before publishing
  • Working with packages that aren't published yet

Path dependencies are resolved at sync time and their absolute paths are written to the path map. The lockfile records them with a path field instead of git/rev.

Note: Path dependencies should typically not be committed. Use them during development, then switch back to VCS dependencies before committing.

Lockfile (smlpm.lock)

JSON format for precise dependency resolution:

{
  "version": 1,
  "packages": {
    "sml-json": {
      "git": "https://github.com/user/sml-json",
      "tag": "1.2.3",
      "rev": "abc123def456...",
      "integrity": "sha256-abc123def456...",
      "vendored": false,
      "alias": null,
      "variable": "SMLPM_JSON",
      "dependencies": {}
    },
    "alice-json": {
      "git": "https://github.com/alice/json",
      "tag": "2.0.0",
      "rev": "def456abc123...",
      "integrity": "sha256-xyz789...",
      "vendored": false,
      "alias": "alice-json",
      "variable": "SMLPM_ALICE_JSON",
      "dependencies": {}
    },
    "sml-testing": {
      "git": "https://github.com/user/sml-testing",
      "tag": "2.0.0",
      "rev": "789abc456def...",
      "integrity": "sha256-xyz789...",
      "vendored": false,
      "alias": null,
      "variable": "SMLPM_TESTING",
      "dev": true,
      "dependencies": {}
    }
  }
}

Fields:

  • version - Lockfile format version
  • packages - Map of package names to resolved metadata
    • git - Git repository URL (or hg for Mercurial in future)
    • tag - Resolved semver tag (present when dependency uses tag constraint)
    • branch - Branch name (present when dependency uses branch)
    • rev - Git commit SHA (always present for reproducibility)
    • integrity - SHA-256 hash of repository contents at this commit
    • vendored - Whether package is in vendor directory
    • alias - User-specified alias (null if none)
    • variable - Computed path map variable name (e.g., "SMLPM_JSON")
    • dev - Whether this is a dev dependency
    • dependencies - Transitive dependencies
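
Because the lockfile is plain JSON, other tools can read it with a standard JSON parser. A minimal sketch, assuming the format shown above; the function name `locked_revisions` is hypothetical, and entries without a `rev` (such as path dependencies) are skipped.

```python
import json


def locked_revisions(lockfile_text):
    """Map each locked package name to its pinned commit SHA."""
    data = json.loads(lockfile_text)
    if data.get("version") != 1:
        raise ValueError(f"unsupported lockfile version: {data.get('version')!r}")
    return {
        name: meta["rev"]
        for name, meta in data.get("packages", {}).items()
        if "rev" in meta  # path dependencies record `path` instead of `rev`
    }
```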

Generated Path Map (smlpm.pathmap)

Plain text format compatible with MLton's -mlb-path-map:

SML_LIB /usr/lib/mlton/sml
SMLPM_JSON /home/user/.smlpm/git/checkouts/github.com-a1b2c3d4/abc123def456789abcdef0123456789abcdef01
SMLPM_HTTP /home/user/.smlpm/git/checkouts/github.com-e5f6a7b8/def456abc789012def456789012def456789012
SMLPM_TESTING /home/user/.smlpm/git/checkouts/github.com-c9d0e1f2/789abc012def345abc012def345abc012def345

Each package path includes:

  • Hashed repository directory (e.g., github.com-a1b2c3d4)
  • Full 40-character commit SHA subdirectory
  • Multiple projects using the same commit share the same worktree directory
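
Generating the path map amounts to writing one `VARIABLE path` pair per line. The sketch below assumes the variable-to-path mapping has already been computed; the function name `render_pathmap`, the default `SML_LIB` location (taken from the example above), and the alphabetical ordering are illustrative assumptions.

```python
def render_pathmap(entries, sml_lib="/usr/lib/mlton/sml"):
    """Render a path map: one 'VARIABLE path' pair per line."""
    lines = [f"SML_LIB {sml_lib}"]
    for variable in sorted(entries):  # stable order keeps output reproducible
        lines.append(f"{variable} {entries[variable]}")
    return "\n".join(lines) + "\n"
```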

Variable Naming:

  • Prefix: SMLPM_
  • Derived from package name or user-specified alias
  • Converted to uppercase
  • Special characters (-, .) replaced with _

Variable Name Generation Algorithm:

  1. If user specified as alias in manifest → use alias (uppercased)
  2. Otherwise, use the dependency name (the key in the dependencies table)
  3. Generate: SMLPM_<UPPERCASE_NAME>
  4. Check for conflicts with existing variables
  5. If conflict exists and no alias provided:
    • Error and require user to add aliases (recommended)
    • Alternative: Auto-expand to include author name (e.g., SMLPM_ALICE_JSON)
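
The algorithm above can be sketched as follows, using the recommended behavior of raising an error on conflict. The function names `variable_name` and `assign_variables` are hypothetical, and the input is assumed to map each dependency name to its alias (or `None`).

```python
import re


def variable_name(dep_name, alias=None):
    """Steps 1-3: prefer the alias, replace '-' and '.', uppercase, add prefix."""
    base = alias if alias else dep_name
    return "SMLPM_" + re.sub(r"[-.]", "_", base).upper()


def assign_variables(deps):
    """Steps 4-5: map each dependency to its variable, failing on conflicts."""
    owner_of = {}   # variable name -> dependency that claimed it
    assigned = {}   # dependency name -> variable name
    for name, alias in deps.items():
        variable = variable_name(name, alias)
        if variable in owner_of:
            raise ValueError(
                f"'{owner_of[variable]}' and '{name}' would both generate "
                f"variable '{variable}'; add distinct aliases"
            )
        owner_of[variable] = name
        assigned[name] = variable
    return assigned
```

For example, `variable_name("json-v1", "alice-json")` yields `SMLPM_ALICE_JSON`.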

Conflict Detection:

The package manager detects name conflicts when:

  • Two packages would generate the same variable name
  • A user alias conflicts with another package's name or alias
  • A package name conflicts with reserved variables (e.g., SML_LIB)

Conflict Resolution - Recommended Approach:

When a conflict is detected without explicit aliases, error with helpful message:

Error: Multiple packages would generate variable 'SMLPM_JSON':
  - json (from https://github.com/alice/json, tag 1.0.0)
  - json (from https://github.com/bob/json, tag 2.0.0)

To resolve this conflict, use distinct names or add aliases in smlpm.toml:

[dependencies]
alice-json = { git = "https://github.com/alice/json", tag = "1.0.0" }
bob-json = { git = "https://github.com/bob/json", tag = "2.0.0" }

Conflict Resolution Example:

Given conflicting packages with distinct names:

[dependencies]
alice-json = { git = "https://github.com/alice/json", tag = "1.0.0" }
bob-json = { git = "https://github.com/bob/json", tag = "2.0.0" }

Generated path map:

SMLPM_ALICE_JSON /home/user/.smlpm/git/checkouts/github.com-abc123/abc123def456
SMLPM_BOB_JSON /home/user/.smlpm/git/checkouts/github.com-def456/def456abc789

Usage in MLB files:

(* Explicit, unambiguous *)
$(SMLPM_ALICE_JSON)/json.mlb
$(SMLPM_BOB_JSON)/json.mlb

Alternative: Auto-Expansion (Optional)

Some implementations may choose to auto-expand conflicting names:

Warning: Name conflict for 'json', automatically expanded:
  SMLPM_ALICE_JSON -> alice-json
  SMLPM_BOB_JSON -> bob-json

Add explicit aliases in smlpm.toml to customize these names.

This approach is less explicit but more convenient for users.

Priority Order (which location a package's path map entry points to when more than one is available):

  1. Path dependencies (local filesystem paths)
  2. Vendored packages (from ./vendor/)
  3. Global cache (from ~/.smlpm/git/checkouts/)
  4. System libraries (e.g., SML_LIB)

Commands

Global Flags

-v, --version

Print version information and exit.

smlpm --version

Core Commands

smlpm init

Initialize a new package in the current directory.

smlpm init [NAME] [-d DIR]

Options:

  • -d DIR - Directory to initialize (defaults to current directory)

Creates smlpm.toml with a basic structure.

Package name validation: If NAME is provided, it will be used as package.name in smlpm.toml and must follow naming rules:

  • Lowercase alphanumeric with single hyphens between characters
  • Regex: ^[a-z0-9]+(-[a-z0-9]+)*$
  • Examples: my-app, json, http-client

If NAME is omitted, the directory basename is used as the package name.

If the name is invalid, init fails with an error message.

smlpm sync

Synchronize dependencies from smlpm.toml.

smlpm sync [OPTIONS]

Options:

  • --vendor - Install to ./vendor/ instead of global cache
  • --dev - Include dev dependencies (default: true)
  • --production - Exclude dev dependencies

Behavior:

  1. Reads smlpm.toml
  2. Resolves dependency versions
  3. Clones/fetches Git repositories to global cache or vendor directory
  4. Checks out appropriate commits based on version constraints
  5. Validates version consistency - warns if Git tag doesn't match package.version in dependency's smlpm.toml
  6. Updates or creates smlpm.lock
  7. Generates smlpm.pathmap

Version validation: When resolving a dependency by semver tag, smlpm checks if the Git tag matches the package.version field in the dependency's smlpm.toml:

# If github.com/user/json has tag v1.2.3 but smlpm.toml has version = "1.2.2"
smlpm sync
Warning: Package 'github.com/user/json' version mismatch
  Git tag: v1.2.3
  smlpm.toml: 1.2.2
  This may indicate the package manifest wasn't updated before tagging.
  Continuing with Git tag version (v1.2.3)...

The warning is informational only - sync continues using the Git tag as the source of truth.
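
The comparison behind this warning is small enough to sketch. The function name `version_mismatch_warning` is hypothetical, and stripping a leading `v` from the tag is an assumption based on the tag formats this specification accepts.

```python
def version_mismatch_warning(package, tag, manifest_version):
    """Return a warning string when tag and manifest disagree, else None."""
    tag_version = tag[1:] if tag.startswith("v") else tag
    if tag_version == manifest_version:
        return None
    return (
        f"Warning: Package '{package}' version mismatch\n"
        f"  Git tag: {tag}\n"
        f"  smlpm.toml: {manifest_version}\n"
    )
```

A caller would print the returned text and continue with the tag version, since the warning never blocks sync.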

smlpm add

Add a new dependency.

smlpm add <package> [version] [OPTIONS]

Options:

  • --dev - Add as dev dependency

Example:

smlpm add github.com/user/sml-json ^1.2.0
smlpm add github.com/user/sml-testing --dev

Package name validation: After fetching the repository, smlpm validates the package.name field in its smlpm.toml. If invalid:

smlpm add github.com/user/SomeRepo ^1.0.0
# Fetches repository...
# Error: Invalid package name 'JSON-Library' in smlpm.toml.
# Package names must be lowercase alphanumeric with single hyphens.
# Pattern: ^[a-z0-9]+(-[a-z0-9]+)*$

Updates smlpm.toml and runs sync.

smlpm remove

Remove a dependency.

smlpm remove <package>

Updates smlpm.toml and smlpm.lock.

smlpm upgrade

Upgrade dependencies to latest compatible versions.

smlpm upgrade [package]

Updates the lockfile with the newly resolved versions.

Build Integration

smlpm build

Build with automatic path map handling.

smlpm build <mlb-file> [OPTIONS]

Example:

smlpm build src/main.mlb -output myapp

Equivalent to:

mlton -mlb-path-map smlpm.pathmap -output myapp src/main.mlb
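
The translation from `smlpm build` to the MLton invocation can be sketched as building an argument list. The function name `mlton_command` is hypothetical; the sketch assumes any extra options are passed through to MLton unchanged, as in the example above.

```python
def mlton_command(mlb_file, extra_args, pathmap="smlpm.pathmap"):
    """Build the argv list for the equivalent MLton invocation."""
    # MLton expects options before the input file.
    return ["mlton", "-mlb-path-map", pathmap, *extra_args, mlb_file]
```

The resulting list can be handed to `subprocess.run` without going through a shell.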

Vendoring Commands

smlpm vendor

Copy dependencies to ./vendor/ directory.

smlpm vendor [OPTIONS]

Options:

  • --prune - Remove unused vendored packages
  • --verify - Check vendor checksums match lockfile

Behavior:

  1. Copies packages from global cache to ./vendor/
  2. Updates lockfile to mark packages as vendored
  3. Regenerates path map to point to vendor directory

Utility Commands

smlpm list

List installed dependencies.

smlpm list [OPTIONS]

Options:

  • --tree - Show dependency tree
  • --dev - Include dev dependencies

smlpm verify

Verify package integrity.

smlpm verify

Checks that installed packages match lockfile checksums.

smlpm clean

Clean unused worktrees, cache, or vendor directory.

smlpm clean [OPTIONS]

Options:

  • --cache - Clean download cache
  • --vendor - Clean vendor directory

Package Name Conflicts

Problem

Multiple packages with the same dependency name would generate identical path map variables:

[dependencies]
json = { git = "https://github.com/alice/json", tag = "1.0.0" }
json = { git = "https://github.com/bob/json", tag = "2.0.0" }  # TOML error: duplicate key!

This is actually prevented by TOML syntax (duplicate keys are not allowed). The solution is to use distinct dependency names.

Solution: Use Distinct Names or Aliases

Use different dependency names when you need multiple packages that would otherwise conflict:

[dependencies]
alice-json = { git = "https://github.com/alice/json", tag = "1.0.0" }
bob-json = { git = "https://github.com/bob/json", tag = "2.0.0" }

Or use the as field to set a custom variable name:

[dependencies]
json-v1 = { git = "https://github.com/alice/json", tag = "1.0.0", as = "alice-json" }
json-v2 = { git = "https://github.com/bob/json", tag = "2.0.0", as = "bob-json" }

This generates distinct variables:

SMLPM_ALICE_JSON /home/user/.smlpm/git/checkouts/github.com-abc123/abc123def456
SMLPM_BOB_JSON /home/user/.smlpm/git/checkouts/github.com-def456/def456abc789

Usage in MLB Files

With distinct names, references are explicit:

(* src/main.mlb *)
$(SML_LIB)/basis/basis.mlb

(* Two different JSON implementations *)
$(SMLPM_ALICE_JSON)/json.mlb
$(SMLPM_BOB_JSON)/json.mlb

local
   $(SMLPM_ALICE_JSON)/json.mlb
in
   use-alice-json.sml
end

local
   $(SMLPM_BOB_JSON)/json.mlb
in
   use-bob-json.sml
end

Alias Best Practices

  1. Shorten very long package names, keeping the result recognizable:

    vlpn = { git = "https://github.com/org/very-long-package-name", tag = "1.0.0" }
    
  2. Namespace by author when using similar packages:

    alice-json = { git = "https://github.com/alice/json", tag = "1.0.0" }
    bob-json = { git = "https://github.com/bob/json", tag = "2.0.0" }
    
  3. Keep names stable - changing dependency names breaks MLB files

  4. Document names in project README if non-obvious

Reserved Variable Names

Certain variable names are reserved and cannot be used for packages:

System Reserved:

  • SML_LIB - MLton's Standard ML Basis Library
  • MLTON_ROOT - MLton installation directory
  • Any variable starting with SML_ or MLTON_ (reserved for system use)

Package Manager Reserved:

  • SMLPM - Reserved prefix for all package variables
  • Variables without the SMLPM_ prefix are not allowed for packages

Validation: If a user tries to use a reserved name as an alias:

Error: Alias 'sml-lib' would generate reserved variable 'SMLPM_SML_LIB'
Reserved patterns:
  - SML_*
  - MLTON_*

Choose a different alias.

Best Practice: Use descriptive, non-conflicting aliases that clearly identify the package purpose.

Transitive Dependency Conflicts

Name conflicts can also arise from transitive dependencies:

Scenario:

[dependencies]
http-client = { git = "https://github.com/alice/http-client", tag = "1.0.0" }  # depends on json
xml-parser = { git = "https://github.com/bob/xml-parser", tag = "2.0.0" }       # depends on json

Both transitive dependencies (each named json) would generate SMLPM_JSON.

Resolution:

The package manager should:

  1. Detect transitive conflicts during dependency resolution
  2. Report the conflict with the dependency chain:
Error: Transitive dependency name conflict for 'json':
  - json (v1.5.0) from https://github.com/alice/json
    required by: http-client (v1.0.0)
  - json (v2.1.0) from https://github.com/bob/json
    required by: xml-parser (v2.0.0)

To resolve, add the conflicting transitive dependencies as direct dependencies with distinct names:

  [dependencies]
  http-client = { git = "https://github.com/alice/http-client", tag = "1.0.0" }
  xml-parser = { git = "https://github.com/bob/xml-parser", tag = "2.0.0" }
  alice-json = { git = "https://github.com/alice/json", tag = "^1.5.0" }
  bob-json = { git = "https://github.com/bob/json", tag = "^2.1.0" }

Note: Users must explicitly add conflicting transitive dependencies to their direct dependencies with distinct names. This makes the conflict resolution visible and explicit in the project manifest.

Important: The lockfile tracks which variable name each package is assigned to ensure consistency across installations:

{
  "packages": {
    "alice-json": {
      "git": "https://github.com/alice/json",
      "tag": "1.5.0",
      "variable": "SMLPM_ALICE_JSON",
      ...
    },
    "bob-json": {
      "git": "https://github.com/bob/json",
      "tag": "2.1.0",
      "variable": "SMLPM_BOB_JSON",
      ...
    }
  }
}

VCS-Based Package Resolution

Git Repository URLs

The git field specifies the Git repository URL. URLs can be in several formats:

Full URLs:

[dependencies]
json = { git = "https://github.com/user/json", tag = "^1.0.0" }
http = { git = "git@github.com:user/http", tag = "^2.0.0" }
xml = { git = "https://git.example.com/team/xml", tag = "^1.5.0" }

Short URLs (https:// prefix added automatically):

[dependencies]
json = { git = "github.com/user/json", tag = "^1.0.0" }
http = { git = "gitlab.com/org/http", tag = "^2.0.0" }
lib = { git = "git.sr.ht/~user/lib", tag = "^1.0.0" }

If a URL in the git field doesn't start with a scheme or SSH user prefix (https://, http://, ssh://, git@), https:// is automatically prepended.

URL Normalization

Different URL formats pointing to the same repository are normalized to a canonical form:

git@github.com:user/repo.git
https://github.com/user/repo.git
https://github.com/user/repo
github.com/user/repo

All normalize to: github.com/user/repo

Normalization Rules:

  1. Strip protocol prefix (https://, http://, git@, ssh://)
  2. Strip .git suffix
  3. Convert SSH colon to slash (: → /)
  4. Lowercase the domain (DNS is case-insensitive)
  5. Keep path case-sensitive (repo names are case-sensitive)
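
The five rules can be sketched as a single function. The name `normalize_git_url` is hypothetical, and the sketch does not handle explicit port numbers (for example `ssh://host:22/path`), which the specification does not address.

```python
import re


def normalize_git_url(url):
    """Reduce a Git URL to the canonical 'host/path' form."""
    rest = url.strip()
    # Rule 1: strip the protocol prefix and the SSH user prefix.
    for prefix in ("https://", "http://", "ssh://"):
        if rest.startswith(prefix):
            rest = rest[len(prefix):]
            break
    if rest.startswith("git@"):
        rest = rest[len("git@"):]
    # Rule 3: the first ':' or '/' separates host from path.
    parts = re.split(r"[:/]", rest, maxsplit=1)
    host = parts[0].lower()                       # Rule 4: lowercase the domain
    path = parts[1] if len(parts) > 1 else ""     # Rule 5: path case is kept
    if path.endswith(".git"):                     # Rule 2: strip .git suffix
        path = path[: -len(".git")]
    return f"{host}/{path}" if path else host
```

All four example URLs above normalize to `github.com/user/repo` under this sketch.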

Repository Directory Naming

To avoid filesystem issues (path length limits, special characters), repository directories are named using a hash of the normalized URL.

Hash Function:

  • Use first 8 characters of SHA-256 hash of normalized URL
  • Format: <domain>-<hash> (for readability)

Example:

Normalized URL: "github.com/user/json"
SHA-256: sha256("github.com/user/json") = "abc123def456..."
Directory: "github.com-abc123de"
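
A sketch of the directory naming scheme using Python's standard `hashlib`. The function name `repo_dir_name` is hypothetical, and hashing the UTF-8 bytes of the normalized URL is an assumption (the specification does not state the encoding).

```python
import hashlib


def repo_dir_name(normalized_url):
    """Return '<domain>-<first 8 hex characters of SHA-256(normalized URL)>'."""
    domain = normalized_url.split("/", 1)[0]
    digest = hashlib.sha256(normalized_url.encode("utf-8")).hexdigest()
    return f"{domain}-{digest[:8]}"
```

The same normalized URL always produces the same directory name, which is what lets the db/ and checkouts/ trees share a name.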

Storage locations:

~/.smlpm/git/db/github.com-abc123de/           # Bare repo
~/.smlpm/git/checkouts/github.com-abc123de/    # Worktrees

Metadata file (for reverse lookup): Each db directory contains a .smlpm-url file with the normalized URL:

~/.smlpm/git/db/github.com-abc123de/.smlpm-url

Contents: github.com/user/json

This allows tools to map between hashes and URLs.

Version Resolution via Git Tags

When resolving semver version constraints, smlpm:

  1. Clones or fetches the repository using Git
  2. Lists tags matching semver format (e.g., v1.2.3, 1.2.3)
  3. Filters them by constraint (e.g., ^1.0.0 means >=1.0.0, <2.0.0)
  4. Chooses the highest compatible version
  5. Checks out the tag
  6. Reads the package manifest and checks that its version matches the tag
  7. Records the commit SHA in the lockfile for reproducibility

Version validation: After checking out a tag, smlpm compares the Git tag with the package.version field in smlpm.toml:

  • If v1.2.3 tag has version = "1.2.3" → ✓ OK
  • If v1.2.3 tag has version = "1.2.2" → ⚠️ Warning (continues with tag version)
  • If v1.2.3 tag has version = "1.3.0" → ⚠️ Warning (continues with tag version)

The Git tag is always the source of truth for version resolution.

For branch/commit references, no tag resolution or version validation occurs - the specified branch or commit is used directly.

Tag format conventions (when using semver):

Git tags must follow Semantic Versioning (semver) format:

  • Preferred: v1.2.3, v1.2.3-alpha, v1.2.3-beta.1 (with 'v' prefix)
  • Also accepted: 1.2.3, 1.2.3-alpha (without prefix)
  • Must match pattern: [v]MAJOR.MINOR.PATCH[-prerelease][+build]

Examples of valid tags:

  • v1.2.3 - Standard release
  • 1.2.3 - Standard release (no prefix)
  • v2.0.0-alpha - Pre-release
  • v1.5.0-beta.1 - Pre-release with number
  • v1.2.3+build.123 - With build metadata

Examples of invalid tags (ignored by smlpm):

  • release-1.2.3 - Wrong prefix
  • v1.2 - Incomplete version (missing PATCH)
  • 1.2.x - Wildcard not allowed in tags
  • latest - Not a version number
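
The tag pattern can be expressed as a regular expression. The sketch below parses tags and picks the highest release, ignoring invalid tags and pre-releases; the names `parse_tag` and `highest_release` are hypothetical, and the character classes for the pre-release and build parts are a simplification of the semver grammar.

```python
import re

TAG_RE = re.compile(
    r"v?(\d+)\.(\d+)\.(\d+)"      # optional 'v', then MAJOR.MINOR.PATCH
    r"(?:-([0-9A-Za-z.-]+))?"     # optional -prerelease
    r"(?:\+([0-9A-Za-z.-]+))?"    # optional +build metadata
)


def parse_tag(tag):
    """Return ((major, minor, patch), prerelease) or None for invalid tags."""
    m = TAG_RE.fullmatch(tag)
    if m is None:
        return None
    return (int(m[1]), int(m[2]), int(m[3])), m[4]


def highest_release(tags):
    """Return the tag with the highest release version, or None."""
    best = None
    for tag in tags:
        parsed = parse_tag(tag)
        if parsed is None or parsed[1] is not None:
            continue  # skip invalid tags and pre-releases
        if best is None or parsed[0] > best[0]:
            best = (parsed[0], tag)
    return best[1] if best else None
```

Versions are compared numerically, so 1.10.0 ranks above 1.2.3.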

Note: If a package has no tags, it can still be used via branch or commit references.

Git Operations

smlpm uses bare repositories with Git worktrees for efficient multi-version support.

Initial sync:

smlpm sync
# For each dependency:
# 1. Normalize URL and compute hash
# 2. Clone bare repo if it doesn't exist:
#    git clone --bare <url> ~/.smlpm/git/db/<domain>-<hash>/
# 3. Write URL to .smlpm-url file
# 4. Fetch tags:
#    git --git-dir=~/.smlpm/git/db/<domain>-<hash>/ fetch --tags
# 5. Find matching tag for version constraint
# 6. Create worktree at commit SHA:
#    git --git-dir=~/.smlpm/git/db/<domain>-<hash>/ \
#        worktree add ~/.smlpm/git/checkouts/<domain>-<hash>/<commit-sha> <commit-sha>
# 7. Verify commit SHA matches lockfile (if exists)
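
The Git invocations in the initial sync can be sketched as argument lists suitable for `subprocess.run`. The function name `sync_commands` and its parameters are hypothetical; the Git subcommands and options are the ones listed in the steps above.

```python
import os


def sync_commands(url, repo_dir, commit_sha, home=None):
    """Return the Git argv lists for clone, fetch, and worktree creation."""
    home = home or os.path.expanduser("~")
    db = f"{home}/.smlpm/git/db/{repo_dir}"
    checkout = f"{home}/.smlpm/git/checkouts/{repo_dir}/{commit_sha}"
    return [
        ["git", "clone", "--bare", url, db],
        ["git", f"--git-dir={db}", "fetch", "--tags"],
        ["git", f"--git-dir={db}", "worktree", "add", checkout, commit_sha],
    ]
```

Passing argument lists rather than a shell string avoids quoting problems and resolves the home directory explicitly instead of relying on shell tilde expansion.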

Subsequent syncs:

smlpm sync
# For each dependency:
# 1. Check if worktree exists at ~/.smlpm/git/checkouts/<domain>-<hash>/<commit-sha-from-lockfile>
# 2. If not, create it from the existing bare repo
# 3. If the bare repo doesn't exist, clone it first

Upgrading:

smlpm upgrade
# For each dependency:
# 1. git --git-dir=~/.smlpm/git/db/<domain>-<hash>/ fetch --tags
# 2. Find new highest compatible version
# 3. Get commit SHA for new tag
# 4. Create new worktree at new commit SHA (if it doesn't exist)
# 5. Update lockfile with new commit SHA
# 6. Old worktree remains (can be cleaned up later)

Cleanup unused versions:

smlpm clean
# For each package:
# 1. Scan all projects' lockfiles
# 2. List all worktrees for this package
# 3. Remove worktrees not referenced by any project
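
The selection step of the cleanup is a set difference. The sketch below assumes the worktrees on disk and the (repository directory, commit SHA) pairs referenced by lockfiles have already been collected; the function name `stale_worktrees` and both input shapes are hypothetical.

```python
def stale_worktrees(on_disk, referenced):
    """List 'repo_dir/sha' worktrees that no lockfile references.

    on_disk:    dict mapping repo directory name -> set of commit SHAs present
    referenced: set of (repo directory name, commit SHA) pairs from lockfiles
    """
    stale = []
    for repo_dir in sorted(on_disk):
        for sha in sorted(on_disk[repo_dir]):
            if (repo_dir, sha) not in referenced:
                stale.append(f"{repo_dir}/{sha}")
    return stale
```

How smlpm discovers every project's lockfile is not described in this specification, so that part is left out of the sketch.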

Storage Structure

~/.smlpm/git/db/
  github.com-abc123de/        # Hashed repo name
    .smlpm-url                # Contains: "github.com/user/json"
    HEAD
    config
    objects/
    refs/

Each package is stored as a bare repository containing all history and tags. The directory name is a hash of the normalized Git URL.

Working trees are created on-demand by commit SHA:

~/.smlpm/git/checkouts/
  github.com-abc123de/        # Matches db/ hash
    a1b2c3d4e5f6.../          # Worktree for commit a1b2c3...
      smlpm.toml
      json.mlb
      src/
    f6e5d4c3b2a1.../          # Worktree for commit f6e5d4...
      smlpm.toml
      json.mlb
      src/

Benefits:

  • Multiple projects can use different versions simultaneously
  • Each unique commit gets its own worktree directory
  • Worktrees are lightweight (shared objects with bare repo)
  • Deduplication: projects using same version share same worktree
  • Easy cleanup: remove worktrees not referenced by any lockfile

Multiple Projects, Multiple Versions

Scenario: You have two projects using different versions of the same package.

Project A (~/projectA/smlpm.lock):

{
  "packages": {
    "json": {
      "git": "https://github.com/user/json",
      "tag": "1.0.0",
      "rev": "abc123def456789abcdef0123456789abcdef01"
    }
  }
}

Project A path map points to:

SMLPM_JSON /home/user/.smlpm/git/checkouts/github.com-abc123de/abc123def456789abcdef0123456789abcdef01

Project B (~/projectB/smlpm.lock):

{
  "packages": {
    "json": {
      "git": "https://github.com/user/json",
      "tag": "1.2.2",
      "rev": "def456abc789012def456789012def456789012"
    }
  }
}

Project B path map points to:

SMLPM_JSON /home/user/.smlpm/git/checkouts/github.com-abc123de/def456abc789012def456789012def456789012

Both projects work simultaneously because they use different worktree directories. The bare repository (~/.smlpm/git/db/github.com-abc123de/) is shared and contains the history for both versions.

If Project C also uses v1.0.0: It shares the same abc123def... worktree with Project A - no duplication!

Each package is stored as a complete Git repository, allowing:

  • Easy updates via git fetch
  • Local modifications for debugging
  • Full Git history available
  • Works offline once cloned

Private Repositories

Private repositories work via Git's credential system:

SSH keys:

[dependencies]
private-lib = { git = "git@company.internal:team/private-lib", tag = "1.0.0" }

HTTPS with credentials:

[dependencies]
internal-lib = { git = "https://git.company.com/internal/lib", tag = "1.0.0" }

Uses Git's credential helper (no passwords stored in smlpm.toml).

Version Specifier Examples

You can reference packages in multiple ways using the version specifier fields:

[dependencies]
# Semver tag constraint (requires Git tags, version resolution applies)
json = { git = "https://github.com/user/json", tag = "^1.0.0" }

# Specific branch (no tags needed)
experimental = { git = "https://github.com/user/experimental", branch = "develop" }
main-lib = { git = "https://github.com/user/main-lib", branch = "main" }

# Specific commit (no tags needed)
pinned = { git = "https://github.com/user/pinned", rev = "abc123def456..." }

When to use each:

  • tag (tag = "^1.0.0") - For stable, released packages with semantic versioning
  • branch (branch = "main") - For tracking latest development, no tags needed
  • rev (rev = "abc123...") - For precise pinning, maximum reproducibility

Lockfile representation (tag):

{
  "packages": {
    "json": {
      "git": "https://github.com/user/json",
      "tag": "1.2.3",
      "rev": "abc123def456...",
      "integrity": "sha256-..."
    }
  }
}

Lockfile representation (branch):

{
  "packages": {
    "experimental": {
      "git": "https://github.com/user/experimental",
      "branch": "develop",
      "rev": "def456abc789...",
      "integrity": "sha256-..."
    }
  }
}

Lockfile representation (commit):

{
  "packages": {
    "pinned": {
      "git": "https://github.com/user/pinned",
      "rev": "abc123def456...",
      "integrity": "sha256-..."
    }
  }
}

Note: The lockfile always records the exact commit SHA for reproducibility, regardless of how the dependency was specified.

Dependency Resolution

Version Resolution Algorithm

smlpm resolves versions using Semantic Versioning (semver) constraints:

  1. Parse all direct dependencies from smlpm.toml
  2. For each dependency, resolve transitive dependencies
  3. Build dependency graph
  4. Resolve version conflicts using semver compatibility rules:
    • If compatible versions exist, choose highest compatible version
    • If incompatible, report error with conflict details
  5. Validate no circular dependencies
  6. Check for variable name conflicts and validate aliases
  7. Generate flat list of resolved packages with exact versions

Semver Compatibility:

  • ^1.2.0 is compatible with ^1.5.0 - both satisfied by 1.5.x
  • ^1.2.0 is NOT compatible with ^2.0.0 - different major versions
  • ~1.2.0 is compatible with ~1.2.3 - both satisfied by 1.2.3
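
The compatibility rules above amount to checking whether two version ranges overlap. The following Go sketch illustrates that idea. It is not smlpm's actual resolver: the names `Version`, `Range`, `parseRange`, and `constraintsCompatible` are hypothetical, only the `^` and `~` operators are handled, and pre-release tags are ignored.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Version is a plain MAJOR.MINOR.PATCH triple; pre-release tags are out of scope here.
type Version [3]int

// less reports whether a sorts strictly before b.
func (a Version) less(b Version) bool {
	for i := range a {
		if a[i] != b[i] {
			return a[i] < b[i]
		}
	}
	return false
}

// Range is the half-open interval [Min, Max) accepted by one constraint.
type Range struct{ Min, Max Version }

func parseVersion(s string) (Version, error) {
	var v Version
	parts := strings.Split(strings.TrimPrefix(s, "v"), ".")
	if len(parts) != 3 {
		return v, fmt.Errorf("invalid version %q", s)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return v, fmt.Errorf("invalid version %q", s)
		}
		v[i] = n
	}
	return v, nil
}

// parseRange expands "^X.Y.Z" or "~X.Y.Z" into an explicit interval.
func parseRange(c string) (Range, error) {
	if !strings.HasPrefix(c, "^") && !strings.HasPrefix(c, "~") {
		return Range{}, fmt.Errorf("unsupported constraint %q", c)
	}
	v, err := parseVersion(c[1:])
	if err != nil {
		return Range{}, err
	}
	upper := Version{v[0], v[1] + 1, 0} // tilde: patch-level changes only
	if c[0] == '^' {
		// caret: stay below the next bump of the left-most non-zero part
		switch {
		case v[0] > 0:
			upper = Version{v[0] + 1, 0, 0}
		case v[1] == 0:
			upper = Version{0, 0, v[2] + 1}
		}
	}
	return Range{Min: v, Max: upper}, nil
}

// constraintsCompatible reports whether some version satisfies both constraints.
func constraintsCompatible(c1, c2 string) (bool, error) {
	a, err := parseRange(c1)
	if err != nil {
		return false, err
	}
	b, err := parseRange(c2)
	if err != nil {
		return false, err
	}
	lo, hi := a.Min, a.Max
	if lo.less(b.Min) {
		lo = b.Min
	}
	if b.Max.less(hi) {
		hi = b.Max
	}
	return lo.less(hi), nil // non-empty intersection
}

func main() {
	pairs := [][2]string{{"^1.2.0", "^1.5.0"}, {"^1.2.0", "^2.0.0"}, {"~1.2.0", "~1.2.3"}}
	for _, p := range pairs {
		ok, err := constraintsCompatible(p[0], p[1])
		fmt.Println(p[0], p[1], ok, err)
	}
}
```

Once two constraints are known to overlap, a resolver would pick the highest available tag inside the intersection.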

Dependency Resolution Priority

  1. Direct dependencies override transitive
  2. Highest compatible version wins
  3. Dev dependencies are optional (excluded with --production)

Validation and Warnings

Version Mismatch Detection

When resolving dependencies via semver tags, smlpm validates that the package.version field matches the Git tag.

Validation process:

  1. Checkout Git tag (e.g., v1.2.3)
  2. Read smlpm.toml from the checked-out code
  3. Compare tag version with package.version field
  4. If mismatch: emit warning, continue with Git tag version

Example warning:

Warning: Package 'github.com/user/json' version mismatch
  Git tag: v1.2.3
  smlpm.toml: version = "1.2.2"
  
  This may indicate the package manifest wasn't updated before tagging.
  The package will be treated as version 1.2.3 (from Git tag).

Behavior:

  • Warning only - Does not block sync
  • Git tag is authoritative - Version resolution uses the tag, not smlpm.toml
  • Logged - Warning appears in terminal output
  • Non-blocking - Dependency resolution continues normally
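
The check and its non-blocking behavior can be sketched in Go as follows. This is an illustration only: `checkTagVersion` is a hypothetical helper, and the warning text is abbreviated from the example above.

```go
package main

import (
	"fmt"
	"strings"
)

// checkTagVersion compares a Git tag with the manifest's package.version.
// It returns the authoritative version (always derived from the tag) and a
// warning message, which is empty when the two agree.
func checkTagVersion(pkg, tag, manifestVersion string) (version, warning string) {
	version = strings.TrimPrefix(tag, "v")
	if version != manifestVersion {
		warning = fmt.Sprintf(
			"Warning: Package '%s' version mismatch\n  Git tag: %s\n  smlpm.toml: version = %q",
			pkg, tag, manifestVersion)
	}
	return version, warning
}

func main() {
	version, warning := checkTagVersion("github.com/user/json", "v1.2.3", "1.2.2")
	if warning != "" {
		fmt.Println(warning) // reported to the terminal, but resolution continues
	}
	fmt.Println("resolved version:", version)
}
```

Returning the warning as a value rather than an error keeps the mismatch from interrupting a sync.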

Why warn?

  • Helps package maintainers catch mistakes
  • Alerts consumers to potential issues
  • Encourages good versioning practices
  • But doesn't break builds for minor inconsistencies

When validation occurs:

  • During smlpm sync when resolving semver dependencies
  • During smlpm add when adding new dependencies
  • During smlpm upgrade when finding new versions

When validation doesn't occur:

  • Branch references ({ branch = "main" }) - no version field expected
  • Commit references ({ rev = "abc123" }) - no version field expected
  • Tag references ({ tag = "v1.2.3" }) - tag is explicit, no validation needed

Vendoring

Vendoring Behavior

Optional by Default:

  • Default: packages in global cache (~/.smlpm/git/checkouts/)
  • Opt-in: smlpm sync --vendor or smlpm vendor

Path Map Priority:

  1. Path dependencies → use local filesystem path
  2. If vendor/ exists → use vendored packages
  3. Otherwise → use global cache
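
A minimal Go sketch of this priority order is shown below. The `LockedPackage` struct and its field names are assumptions made for illustration, not the real lockfile schema, and the sketch keys the vendored case off a per-package flag rather than probing for a vendor/ directory.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// LockedPackage holds only the fields this sketch needs (hypothetical names).
type LockedPackage struct {
	Path       string // set for path dependencies
	Vendored   bool   // true when the package lives under vendor/
	VendorPath string // e.g. "github.com/user/json"
	CacheKey   string // directory name under the global cache checkouts
	Rev        string // resolved commit SHA
}

// resolveDir picks the directory a path map variable should point to.
func resolveDir(p LockedPackage, projectRoot, cacheRoot string) string {
	switch {
	case p.Path != "": // 1. path dependency: local filesystem path
		if filepath.IsAbs(p.Path) {
			return filepath.Clean(p.Path)
		}
		return filepath.Join(projectRoot, p.Path)
	case p.Vendored: // 2. vendored copy inside the project
		return filepath.Join(projectRoot, "vendor", p.VendorPath)
	default: // 3. worktree in the global cache, keyed by commit
		return filepath.Join(cacheRoot, "git", "checkouts", p.CacheKey, p.Rev)
	}
}

func main() {
	root, cache := "/home/u/my-app", "/home/u/.smlpm"
	fmt.Println(resolveDir(LockedPackage{Path: "../sml-json"}, root, cache))
	fmt.Println(resolveDir(LockedPackage{Vendored: true, VendorPath: "github.com/user/json"}, root, cache))
	fmt.Println(resolveDir(LockedPackage{CacheKey: "github.com-c9d0e1f2", Rev: "a9b8c7d6"}, root, cache))
}
```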

Lockfile Tracking:

  • vendored: true flag in lockfile indicates vendored state
  • path field in lockfile indicates path dependency
  • Allows mixed modes (some packages vendored, path deps, others cached)

Without Vendoring (.gitignore entries):

/vendor/
/smlpm.pathmap

With Vendoring (.gitignore entries):

/smlpm.pathmap

Package Requirements

Package Structure

Each package must provide:

  1. MLB entry point (e.g., lib.mlb, package-name.mlb)
  2. Package manifest (smlpm.toml)
  3. Source files

Optional (for versioned releases):

  4. Semver Git tags for releases (e.g., v1.2.3)

Versioning Requirements

Packages can be referenced in three ways using the version specifier fields:

1. Semver tags (recommended for releases):

[dependencies]
json = { git = "https://github.com/user/json", tag = "^1.0.0" }

Requires Git tags following semver (e.g., v1.2.3, v2.0.0-beta). Semver conflict resolution applies.

2. Branch references:

[dependencies]
json = { git = "https://github.com/user/json", branch = "main" }

Uses the latest commit on the specified branch when resolved; the lockfile records the resolved commit SHA.

3. Commit references:

[dependencies]
json = { git = "https://github.com/user/json", rev = "abc123def456..." }

Pins to a specific commit SHA.

For versioned releases, follow Semantic Versioning:

  1. Git tags should be valid semver (e.g., v1.2.3, v2.0.0-beta)
  2. Version field in smlpm.toml should match the latest tag
  3. Breaking changes require MAJOR version bump
  4. New features require MINOR version bump
  5. Bug fixes require PATCH version bump

Example release workflow:

# 1. Update version in smlpm.toml to match the tag you'll create
vim smlpm.toml  # Set version = "1.2.3"

# 2. Commit changes
git commit -am "Release v1.2.3"

# 3. Tag with semver (matching smlpm.toml version)
git tag v1.2.3

# 4. Push tag
git push origin v1.2.3

Important: Keep package.version in sync with Git tags!

  • Git tag v1.2.3 should have version = "1.2.3" in smlpm.toml (without the 'v')
  • If they don't match, smlpm will show a warning when others use your package
  • The Git tag is always the source of truth for version resolution

For development/unreleased packages:

  • No tags required
  • Reference by branch or commit
  • Users can still use your package
  • package.version can be any value (e.g., "0.0.0-dev")

Package Manifest Example

[package]
name = "sml-json"            # Must follow naming rules: ^[a-z0-9]+(-[a-z0-9]+)*$
version = "1.2.3"            # Must match Git tag (without 'v' prefix)
mlb = "json.mlb"
description = "JSON parsing and serialization for Standard ML"

[dependencies]
sml-lib = { git = "https://github.com/user/sml-lib", tag = "^1.0.0" }

Name validation:

  • ✓ Valid: sml-json, json, http-client, lib2d
  • ✗ Invalid: SML-JSON, -json, json-, json_lib

Package Identification

Packages are identified by their dependency name (the key in the dependencies table) and Git repository URL (the git field).

The package.name field in smlpm.toml is the package's own identifier and must follow package naming rules.

Package Naming Rules

The package.name field in smlpm.toml must follow these rules:

Valid characters:

  • Lowercase letters: a-z
  • Digits: 0-9
  • Hyphens: - (only between other characters)

Rules:

  • Must be lowercase only (no uppercase letters)
  • Must start and end with alphanumeric character
  • Hyphens allowed only between characters (no leading, trailing, or consecutive hyphens)
  • Length: 1-63 characters

Valid package names:

[package]
name = "json"         
name = "sml-json"     
name = "http-client"  
name = "lib2d"        
name = "x"            

Invalid package names:

[package]
name = "JSON"          # uppercase
name = "-json"         # leading hyphen
name = "json-"         # trailing hyphen
name = "json--lib"     # consecutive hyphens
name = "json_lib"      # underscore not allowed
name = "json.lib"      # dot not allowed

Validation regex:

^[a-z0-9]+(-[a-z0-9]+)*$
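
The regex alone does not enforce the 1-63 character limit, so a validator needs a separate length check. A minimal Go sketch, assuming a hypothetical helper named `validatePackageName`:

```go
package main

import (
	"fmt"
	"regexp"
)

// namePattern is the validation regex from the naming rules.
var namePattern = regexp.MustCompile(`^[a-z0-9]+(-[a-z0-9]+)*$`)

// validatePackageName applies the length limit first, then the character rules.
func validatePackageName(name string) error {
	if len(name) < 1 || len(name) > 63 {
		return fmt.Errorf("package name %q must be 1-63 characters long", name)
	}
	if !namePattern.MatchString(name) {
		return fmt.Errorf("package name %q must be lowercase alphanumerics separated by single hyphens", name)
	}
	return nil
}

func main() {
	for _, name := range []string{"sml-json", "lib2d", "JSON", "json--lib", "json_lib"} {
		fmt.Printf("%-10s %v\n", name, validatePackageName(name))
	}
}
```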

Example:

# Repository: github.com/user/MyAwesomeLibrary
# Package name must still follow rules:
[package]
name = "awesome-library"    # Valid: lowercase with hyphen
version = "1.0.0"

These rules ensure:

  • Compatibility with filesystems and URLs
  • Consistency across the ecosystem
  • No case-sensitivity issues
  • Clear, readable package names

Usage Workflow

Starting a New Project

mkdir myproject && cd myproject
smlpm init

# Add dependencies by editing smlpm.toml:
# sml-json = { git = "https://github.com/user/sml-json", tag = "^1.2.0" }
# sml-testing = { git = "https://github.com/user/sml-testing", tag = "^2.0.0" }

# Then sync to fetch and resolve dependencies
smlpm sync

Building

# Option 1: Use smlpm build wrapper
smlpm build src/main.mlb -output myapp

# Option 2: Use MLton directly
mlton -mlb-path-map smlpm.pathmap -output myapp src/main.mlb

Using Dependencies in MLB Files

(* src/main.mlb *)
$(SML_LIB)/basis/basis.mlb
$(SMLPM_SML_JSON)/json.mlb
$(SMLPM_SML_HTTP)/http.mlb

local
   $(SMLPM_SML_TESTING)/testing.mlb
in
   main.sml
end

Local Development

When developing a dependency locally, use the path option to point to a local directory instead of fetching from Git.

Setup:

  1. Edit smlpm.toml to use a path dependency:
[dependencies]
# Change from VCS dependency:
# sml-json = { git = "https://github.com/user/sml-json", tag = "^1.2.0" }

# To path dependency:
sml-json = { path = "../sml-json" }
  2. Sync and build:
smlpm sync
smlpm build src/main.mlb

Example workflow:

# Clone the dependency you want to work on
cd ~/projects
git clone https://github.com/user/sml-json.git

# In your main project, switch to path dependency
cd ~/projects/my-app
# Edit smlpm.toml to use path = "../sml-json"

# Sync and build
smlpm sync
smlpm build src/main.mlb

# Make changes to sml-json, rebuild, test...
# No need to run sync again - just rebuild

# When done, revert smlpm.toml to use VCS dependency
# sml-json = { git = "https://github.com/user/sml-json", tag = "^1.2.0" }
smlpm sync

Notes:

  • Path dependencies should not be committed to version control; switch back to VCS dependencies before committing
  • Both relative and absolute paths are supported
  • The path must contain a valid smlpm package (with smlpm.toml)
  • Changes to the local package are picked up immediately on rebuild (no sync needed)

Vendoring for Production

# Vendor dependencies
smlpm vendor

# Commit to git
git add vendor/ smlpm.lock smlpm.toml
git commit -m "Vendor dependencies"

# CI/CD builds work without network access
mlton -mlb-path-map smlpm.pathmap -output myapp src/main.mlb

Future Considerations

Potential Extensions

  1. Package Registry - Central registry for package discovery
  2. Workspace Support - Monorepo with multiple packages
  3. Build Scripts - Custom build steps (pre/post install)
  4. Platform-Specific Dependencies - Different deps per OS/arch
  5. Alternative Compilers - Support for SML/NJ, Poly/ML via different path map formats

Open Questions

  1. How to handle packages without smlpm.toml? (Legacy/external packages)
  2. Should there be a package naming convention/registry?
  3. How to handle MLB file discovery in packages? (convention vs explicit)
  4. How to handle compiler-specific dependencies?
  5. Name conflict resolution: Error immediately or auto-expand with warning?

Implementation Notes

Path Map Generation Algorithm

For each package in lockfile:
  1. Check if package is a path dependency (has 'path' field)
     → Use the resolved absolute path
  2. Check if package is vendored (vendored: true)
     → Use ./vendor/<normalized-package-path>
  3. Otherwise use global cache worktree
     → Use ~/.smlpm/git/checkouts/<normalized-package-path>/<commit-sha>
  4. Generate variable name:
     a. If package has 'as' alias in manifest
        → Use SMLPM_<UPPERCASE_ALIAS>
     b. Otherwise extract package name from path
        → Use SMLPM_<UPPERCASE_NAME>
  5. Check for variable name conflicts:
     a. If conflict exists and no alias provided
        → Error with helpful message, OR
        → Auto-expand to include author (with warning)
     b. If conflict exists but aliases resolve it
        → Continue
  6. Write variable and path to smlpm.pathmap

Note: The commit SHA in the path ensures each project gets exactly the version specified in its lockfile, even when multiple projects use different versions of the same package.
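
Steps 4 and 5 of the algorithm above can be sketched in Go as follows. The `Dep` struct, `varName`, and `assignVars` are hypothetical names. The sketch assumes hyphens become underscores (as in SMLPM_SML_JSON for sml-json) and takes the "error immediately" option for conflicts, which the spec still lists as an open question.

```go
package main

import (
	"fmt"
	"strings"
)

// Dep is one resolved dependency (hypothetical shape for this sketch).
type Dep struct {
	Name  string // key in the dependencies table
	Alias string // optional 'as' alias from the manifest
	Git   string // repository URL
}

// varName derives the path map variable for a dependency name or alias.
func varName(nameOrAlias string) string {
	return "SMLPM_" + strings.ToUpper(strings.ReplaceAll(nameOrAlias, "-", "_"))
}

// assignVars maps each variable to its repository and rejects collisions
// between different repositories that are not resolved by an alias.
func assignVars(deps []Dep) (map[string]string, error) {
	vars := make(map[string]string, len(deps))
	for _, d := range deps {
		key := d.Name
		if d.Alias != "" {
			key = d.Alias
		}
		v := varName(key)
		if other, taken := vars[v]; taken && other != d.Git {
			return nil, fmt.Errorf("variable %s is claimed by both %s and %s; add an 'as' alias to one of them", v, other, d.Git)
		}
		vars[v] = d.Git
	}
	return vars, nil
}

func main() {
	deps := []Dep{
		{Name: "json", Git: "https://github.com/alice/json"},
		{Name: "json", Alias: "bob-json", Git: "https://github.com/bob/json"},
	}
	vars, err := assignVars(deps)
	fmt.Println(vars, err)
}
```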

Integrity Verification

Use SHA-256 hashes stored in lockfile to verify package integrity:

  1. Download package
  2. Compute SHA-256 hash
  3. Compare with lockfile
  4. Reject on mismatch
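
The four steps above can be sketched in Go as shown below. Two assumptions are made here that the spec does not state: the hash is computed over a single byte stream, and the part after the `sha256-` prefix is standard base64. `integrityOf` and `verifyIntegrity` are hypothetical helper names.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

// integrityOf computes an integrity string for the given package bytes.
// Assumption: "sha256-" followed by the base64-encoded digest.
func integrityOf(data []byte) string {
	sum := sha256.Sum256(data)
	return "sha256-" + base64.StdEncoding.EncodeToString(sum[:])
}

// verifyIntegrity rejects data whose hash differs from the lockfile value.
func verifyIntegrity(data []byte, want string) error {
	if got := integrityOf(data); got != want {
		return fmt.Errorf("integrity mismatch: lockfile has %s, computed %s", want, got)
	}
	return nil
}

func main() {
	archive := []byte("package contents")
	locked := integrityOf(archive) // value that would be stored in the lockfile
	fmt.Println(verifyIntegrity(archive, locked))
	fmt.Println(verifyIntegrity([]byte("tampered"), locked))
}
```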

Transitive Dependency Resolution

Flattening Strategy:

All dependencies (direct and transitive) are flattened into a single dependency list. This means:

  • No nested node_modules-style directories
  • Each package appears exactly once in the lockfile
  • All packages are stored at the same level in the global cache
  • Path map variables reference packages directly

Resolution Algorithm:

  1. Collect all dependencies - Gather direct and transitive dependencies into a flat list
  2. Resolve version conflicts - For each package that appears multiple times with different version requirements:
    • If all requirements are compatible → choose the highest compatible version
    • If requirements are incompatible → error with conflict details
  3. Generate flat dependency list - Create a single list of package@version pairs
  4. Check for name conflicts - Ensure no variable name collisions (see Name Conflicts section)
  5. Write to lockfile - Store flattened resolution

Example:

Given:

[dependencies]
http-client = { git = "https://github.com/alice/http-client", tag = "^1.0.0" }  # depends on json ^1.2.0
xml-parser = { git = "https://github.com/bob/xml-parser", tag = "^2.0.0" }       # depends on json ^1.5.0

Resolution:

  • Both require json with compatible versions (^1.2.0 and ^1.5.0)
  • Choose highest: json 1.5.2 (satisfies both)
  • Flatten to: http-client 1.0.3, xml-parser 2.0.1, json 1.5.2

Lockfile contains all three at the top level:

{
  "packages": {
    "http-client": {"git": "https://github.com/alice/http-client", "tag": "1.0.3", ...},
    "xml-parser": {"git": "https://github.com/bob/xml-parser", "tag": "2.0.1", ...},
    "json": {"git": "https://github.com/user/json", "tag": "1.5.2", ...}
  }
}

Version Conflict Example:

Given:

[dependencies]
old-lib = { git = "https://github.com/alice/old-lib", tag = "1.0.0" }  # depends on json ^1.2.0
new-lib = { git = "https://github.com/bob/new-lib", tag = "2.0.0" }    # depends on json ^2.0.0

Error:

Error: Cannot resolve version conflict for 'json'
  - old-lib@1.0.0 requires json ^1.2.0
  - new-lib@2.0.0 requires json ^2.0.0
  No version satisfies both requirements.

Possible solutions:
  - Update old-lib to a version compatible with json ^2.0.0
  - Use an older version of new-lib compatible with json ^1.2.0

Storage Structure:

Bare repositories (all packages stored once, with hashed names):

~/.smlpm/git/db/
  github.com-a1b2c3de/      # github.com/alice/http-client
  github.com-e5f6a7b8/      # github.com/bob/xml-parser
  github.com-c9d0e1f2/      # github.com/user/json

Working trees by commit SHA:

~/.smlpm/git/checkouts/
  github.com-a1b2c3de/
    f1e2d3c4.../            # Worktree at commit f1e2d3c4
      ...
  github.com-e5f6a7b8/
    b5a6c7d8.../            # Worktree at commit b5a6c7d8
      ...
  github.com-c9d0e1f2/
    a9b8c7d6.../            # Worktree at commit a9b8c7d6
      ...

Each repository is checked out to the resolved commit SHA from the lockfile.

Vendor directory (if using vendoring, also flattened):

myproject/vendor/
  github.com/
    alice/
      http-client/        # Copy of worktree (no commit SHA in path)
        smlpm.toml
        ...
    bob/
      xml-parser/
        smlpm.toml
        ...
    user/
      json/
        smlpm.toml
        ...

This flat structure simplifies dependency management and makes it clear exactly which version of each package is being used.

Security Considerations

  1. Integrity Checks - Always verify package hashes from lockfile
  2. Encrypted Transports Only - Require HTTPS or SSH for package downloads (no plain HTTP)
  3. Lockfile Commitment - Encourage committing lockfile to prevent supply chain attacks
  4. Vendor Auditing - Vendored code can be audited before commit

Compatibility

MLton Versions

Designed for MLton 20210117 and later. Requires MLB path map support.

Other Compilers

Path maps are MLton-specific. Supporting other compilers is a future extension point; smlpm could generate:

  • CM files for SML/NJ
  • Use files for Poly/ML