# Hobbesgram — OS/2 File Archive

**v1.15** — A flat-file PHP file-sharing archive in the style of the original Hobbes OS/2 archive at hobbes.nmsu.edu. No database required.

## Requirements

- PHP 8.1+ (PHP 8.4 recommended)
- Apache 2.4 with `mod_rewrite` enabled
- PHP-FPM (shared-hosting compatible)
- Writable `data/` directory

## Installation

1. Upload all files to your web root (`htdocs/`).
2. Ensure the `data/` directory and all its subdirectories are writable by the web server. On most shared hosts:
   ```
   chmod -R 755 data/
   ```
3. Visit `http://yoursite.com/setup` to create the admin account and seed the default category tree. The setup page is only reachable until the first user exists.
4. Delete `setup.php` and `check.php` from the server once setup is done. `check.php` is a diagnostic tool that reveals server information.
5. Configure your site name, tagline, color theme, and landing page text via **Admin → Settings**, **Admin → CSS**, and **Admin → Landing Page**.
6. *(Optional)* Place files in `data/pool/` via FTP for batch import through the Pool approval interface.
7. *(Optional)* Configure Archive.org S3 credentials in **Admin → Settings** to enable mirroring files to the Internet Archive.

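
If step 2 is in doubt, a throwaway script along these lines can confirm that PHP can write to each data directory. This is a minimal sketch, not the bundled `check.php`: the function name `find_unwritable_dirs()` is hypothetical, and the subdirectory list is copied from the Directory Structure section and assumed to be complete.

```php
<?php
// Hypothetical helper (not part of Hobbesgram): report which data/
// directories are missing or not writable by the PHP process.
function find_unwritable_dirs(string $dataDir, array $subdirs): array
{
    $problems = [];
    // The empty string stands for the data/ directory itself.
    foreach (array_merge([''], $subdirs) as $sub) {
        $path = rtrim($dataDir, '/') . ($sub === '' ? '' : '/' . $sub);
        if (!is_dir($path)) {
            $problems[$path] = 'missing';
        } elseif (!is_writable($path)) {
            $problems[$path] = 'not writable';
        }
    }
    return $problems;
}

// Assumed subdirectory names; adjust to match your installation.
$subdirs = ['settings', 'categories', 'users', 'files', 'uploads',
            'pool', 'invites', 'index', 'sessions', 'merges'];

foreach (find_unwritable_dirs(__DIR__ . '/data', $subdirs) as $path => $issue) {
    echo $path, ': ', $issue, PHP_EOL;
}
```

Run it from the web root and remove it afterwards, for the same reason `check.php` should be deleted.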
## Directory Structure

```
htdocs/
  index.php              Single entry point and URL router
  config.php             Paths, file size limits, allowed extensions, role list,
                         OS/2 UA patterns, default settings, THEME_PRESETS
                         constant (5 built-in color themes)
  check.php              Diagnostic page (delete after setup)
  setup.php              First-run wizard (self-disables after use)
  php.ini                Upload/memory limits for PHP-FPM environments
  .htaccess              mod_rewrite rules; CGIPassAuth for wget/curl auth
  .user.ini              PHP limits override for shared hosting
  README.md              This file
  CHANGELOG.md           Version history
  hobbes.txt             Example hobbes.txt format reference (large batch)
  pmmail.txt             Example pmmail.txt format reference (single entry)

  includes/
    functions.php        Utility functions: roles, CSRF, flash messages,
                         pagination, date formatting, filename safety,
                         OS/2 UA detection, category tree helpers, category
                         path building, file icon map, file_public_meta()
                         for mirror/catalog output
    storage.php          Atomic JSON read/write (temp + rename, no flock).
                         CRUD helpers for users, files, categories, invites,
                         pool listing, session management, download log
                         (dllog_append / dllog_load)
    auth.php             Custom PHP session save handler (no flock),
                         current_user(), require_role(), can() checks,
                         HTTP Basic Auth support (auth_try_basic)
    search.php           Inverted keyword index: index, remove, query,
                         rebuild functions
    markdown.php         Minimal Markdown-to-HTML parser (no infinite loops)

  pages/                 One PHP file per route
    home.php             Landing page (renders Markdown landing content)
    browse.php           Category browser with file listing and DL counts
    file.php             File detail page; Archive.org link; wget hint
    file/edit.php        Edit metadata / move to category (editor+);
                         Rename file on disk (admin only);
                         Delete file from disk (admin only)
    download.php         File streaming with download counter increment;
                         logs wget/curl downloads to dllog
    download_meta.php    Public JSON metadata sidecar for a file
                         (served at /download/{path}.json)
    catalog.php          Public JSON catalog of all approved files
                         (served at /catalog.json)
    mirror.php           Public mirror info page with wget/curl examples
                         and bulk-mirror shell script (/mirror)
    search.php           Keyword search results with DL counts
    upload.php           Web file upload form (contributor+); category
                         field locked when a limit is active
    pool.php             Approval queue: web uploads (inline edit+approve)
                         + FTP single files + FTP folder batch import
                         (editor+)
    login.php            Authentication form
    logout.php           Session teardown
    register.php         Account creation (open or invite-only)
    invite.php           Invite code generation and listing
    profile.php          User dashboard: upload history, invite codes,
                         personal display theme override
    setup.php            First-run account and category seeding
    admin/
      index.php          Admin dashboard (stats, quick links)
      settings.php       Site name, tagline, open registration, global
                         contributor upload category, Archive.org S3
                         credentials
      css.php            Color palette editor; 5 named theme presets;
                         site-default preset selector
      users.php          User list, role changes, account management,
                         Cat. Access column, Limits button
      user_limits.php    Per-user upload category restriction
                         (/admin/user-limits/{username})
      categories.php     Category tree editor (add, rename, nest, delete);
                         rename does not change the slug
      landing.php        Landing page Markdown content editor
      splash.php         Splash screen content editor (non-OS/2 visitors)
      meta_merge.php     Bulk metadata import from hobbes.txt / pmmail.txt
                         or a .zip bundle of .txt files
      bulk_delete.php    Bulk file deletion by category (admin only)
      repair.php         Archive integrity check: orphaned files,
                         empty-browse categories, duplicate categories
      reports.php        Quality reports: duplicate files (size+MD5),
                         same filename in multiple locations, files
                         missing descriptions
      mirror.php         Mirror files to Archive.org (per-file and batch);
                         wget/curl download log viewer

  templates/
    header.php           HTML head, CSS custom properties (--c-*),
                         navigation, category sidebar, flash messages;
                         applies per-user theme override if set
    footer.php           Page footer

  data/                  All persistent state (never served directly)
    .htaccess            Denies all HTTP access to data/ and subdirs
    settings/
      settings.json      Site settings, CSS palette, active theme preset,
                         Archive.org credentials
    categories/
      categories.json    Category tree (id, name, slug, parent, desc)
    users/               One .json file per registered user
    files/               One .json metadata record per uploaded file
    uploads/             Uploaded files organized by category slug path
                         e.g. uploads/multimedia/images/icons/file.zip
    pool/                FTP staging folder for pending imports
    invites/             One .json file per invite code
    index/
      search.json        Inverted search index (keyword -> [file ids])
      dllog.json         wget/curl download log (capped at 5,000 entries)
    sessions/            PHP session files (custom handler, no flock)
    merges/              Temporary meta-merge sessions (auto-expire 2 h)
```

## User Roles

| Role | Description |
|------|-------------|
| **guest** | Visitors with an OS/2 User-Agent string get full browse and download access automatically. No account required. |
| **contributor** | Registered user. Can upload files (pending editor approval) and generate invite codes. |
| **editor** | Can approve or reject uploads, edit any file's metadata (title, description, author, etc.), move files between categories, import from the FTP pool, and run Meta Merge. |
| **admin** | Full access. All editor capabilities plus: user management, site settings, CSS theming, bulk file deletion, single-file deletion, file renaming, quality reports, and Archive.org mirroring. |

## Access Control

- **OS/2 browser detection** is based on the HTTP User-Agent string. Recognized patterns: `OS/2`, `Warp`, `WebExplorer`, `Warpzilla`, `Lynx.*OS`, `SPRY`, `PMX`. All browsers sending one of these strings receive guest browse/download access without a login.
- **Non-OS/2 visitors** see the landing page and splash screen only. They must create an account (via invite or open registration) to browse or download.
- **Open registration** can be toggled in Admin → Settings. When off, new accounts require an invite code.
- **Invite codes** are generated by contributors and above.
  Each code sets the invited user's starting role.
- **CSRF tokens** protect all state-changing POST requests.

## wget / curl Access

Registered users can download files from the command line using HTTP Basic Auth:

```sh
wget --user=USERNAME --password=PASSWORD "https://yoursite.com/download/path/to/file.zip"
curl -u USERNAME:PASSWORD "https://yoursite.com/download/path/to/file.zip" -O
```

The file detail page shows a pre-filled `wget` command for logged-in users.

When credentials are not provided on a download URL, the server returns a `401 Unauthorized` response with a `WWW-Authenticate` header so `wget` and `curl` know to prompt for or accept credentials.

Admin users can view wget/curl download activity in **Admin → Mirror to Archive.org → Download Log tab**.

## File Uploads

**Web uploads** (contributor+): Upload via `/upload`. The file is stored in `data/uploads/` and marked pending. An editor or admin must approve it in the Pool.

When a contributor's account has a staging category assigned (via per-user or global contributor category setting), the upload form shows two separate category fields:

- **Staging Category** — locked; the file lands here while awaiting review.
- **Requested Category** — a full category dropdown the contributor uses to indicate where they want the file placed after approval.

The `requested_category` is stored in the file's metadata. In the Pool, the editor's category dropdown pre-fills with the contributor's requested destination (and shows the staging location as a secondary note). The editor can change the category before approving; the file is moved on disk at approval time as usual.

When no staging category is configured the contributor selects a single category as both the physical location and the requested destination (existing behaviour).

**FTP single file import** (editor+): Drop a file into `data/pool/` via FTP. It appears in `/pool` with a metadata entry form. Supply title, description, author, and category, then click Import.
A companion `.meta.json` file can pre-fill the form:

```json
{
  "title": "...",
  "desc": "...",
  "author": "...",
  "version": "..."
}
```

**FTP folder batch import** (editor+): Drop an entire directory tree into `data/pool/`. The folder and its subdirectories are mapped to a new (or existing) category hierarchy. Each file gets a title derived from its filename; required fields default to "Unknown" and can be edited after import.

## Meta Merge (`/admin/meta-merge`)

Bulk import of metadata from plain-text files. Upload a `.txt` file (or a `.zip` bundle containing multiple `.txt` files) in either supported format. The system parses it, matches entries to existing archive files by category path and filename, and presents a review page before writing any changes.

**Supported formats:**

`hobbes.txt` — one or more blocks separated by dashed lines:

```
----------------------------------------
DIR: pub/multimedia/images/icons
FILE: 1700ico2.zip
DESC: Multi-line description of the file.
----------------------------------------
```

`pmmail.txt` — labelled key: value fields:

```
Archive Filename: pmmail-3-25-00-1993.wpi
Short Description: Email client for OS/2.
Long Description: PMMail is an enhanced TCP/IP email client...
Proposed directory for placement: /pub/os2/apps/internet/mail/reader/pmm
Your name: Neil Waldhauer
Program URL: http://pmmail.os2voice.org/
Operating System/Version: OS/2, ArcaOS and eComStation
Additional requirements: See the readme.
```

Path matching: the `DIR` / *Proposed directory* value is stripped of leading `pub/` or `hobbes/pub/` prefixes, then each path segment is slugified and compared against the category path of each file in the archive.

Merge behavior: by default, only empty or "Unknown" fields are updated. Tick "Overwrite existing fields" to replace all values from the txt.

Unmatched entries offer: filename-only suggestions, manual file-ID entry, or Skip. A before/after diff is shown for every matched entry before anything is written.

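
The path-matching and merge rules above can be sketched as follows. This is an illustration under stated assumptions, not the code in `meta_merge.php`: the function names are hypothetical, the slug rule (lowercase, non-alphanumeric runs become hyphens) is a guess at what "slugified" means here, and empty incoming values are assumed never to overwrite anything.

```php
<?php
// Hypothetical helpers illustrating the Meta Merge matching and merge rules.

// Assumed slug rule: lowercase, collapse non-alphanumeric runs to hyphens.
function slugify_segment(string $segment): string
{
    $slug = preg_replace('/[^a-z0-9]+/', '-', strtolower($segment));
    return trim($slug, '-');
}

// Strip a leading "hobbes/pub/" or "pub/" prefix, then slugify each segment.
function normalize_dir(string $dir): string
{
    $dir = trim(str_replace('\\', '/', $dir), '/');
    $dir = preg_replace('#^(?:hobbes/)?pub/#i', '', $dir);
    $segments = array_filter(explode('/', $dir), fn($s) => $s !== '');
    return implode('/', array_map('slugify_segment', $segments));
}

// Default merge: fill only empty or "Unknown" fields unless $overwrite is set.
// Assumption: an empty incoming value never replaces an existing one.
function merge_fields(array $existing, array $incoming, bool $overwrite = false): array
{
    foreach ($incoming as $key => $value) {
        $current = trim((string) ($existing[$key] ?? ''));
        $isBlank = $current === '' || strcasecmp($current, 'Unknown') === 0;
        if ($value !== '' && ($overwrite || $isBlank)) {
            $existing[$key] = $value;
        }
    }
    return $existing;
}

echo normalize_dir('pub/multimedia/images/icons'), PHP_EOL;
echo normalize_dir('/pub/os2/apps/internet/mail/reader/pmm'), PHP_EOL;
```

The two calls print `multimedia/images/icons` and `os2/apps/internet/mail/reader/pmm`, which would then be compared against each file's category path together with the filename.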
## Bulk Delete (`/admin/bulk-delete`)

Admin-only tool for removing multiple files from a category at once.

- Select a category from the dropdown; the page lists all files in that category (approved and pending).
- Per-page options: 25 / 50 / 100 / All. A warning is shown if "All" is selected and the category contains more than 250 files.
- "Select ALL N files in this category (all pages)" marks every file in the category for deletion regardless of current pagination.
- Deletion removes the physical file, the metadata JSON, and all search index entries. Empty category directories are cleaned up.

Single-file delete is also available to admins from the file edit page (`/file/edit/{id}`) via the Danger Zone section at the bottom of the form.

## File Editing / Recategorisation

Editors and admins can edit any approved file's metadata from the file detail page (Edit Metadata button) or directly at `/file/edit/{id}`.

When the category is changed, the physical file is moved on disk and `stored_name` is updated. If a file with the same name already exists in the target category, the move is blocked and an error is shown.

**Admins** can also rename the physical file on disk from the Danger Zone section of the edit page. The rename validates the extension (must match the original), checks for name collisions in the current directory, and updates both `original_name` and `stored_name` in the metadata.

## Archive Repair (`/admin/repair`)

Scans the archive for integrity problems and reports:

- **Orphaned files** — physical files on disk with no matching metadata record.
- **Empty-browse categories** — categories with no approved files and no sub-categories (candidates for pruning).
- **Duplicate categories** — category names that appear more than once under the same parent.

No changes are made automatically; the report is read-only.

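
As an illustration of the orphaned-files check, the sketch below walks the uploads tree and reports any file that no metadata record claims. It is not the code in `repair.php`: `find_orphans()` is a hypothetical name, and the `category_path` field is an assumption (only `stored_name` is named in this README). Like the report itself, it reads and never modifies anything.

```php
<?php
// Hypothetical sketch: list files under uploads/ with no metadata record.
// Assumes each files/*.json record has "category_path" (illustrative field
// name) and "stored_name".
function find_orphans(string $uploadsDir, string $filesDir): array
{
    // Build the set of relative paths that metadata records account for.
    $known = [];
    foreach (glob(rtrim($filesDir, '/') . '/*.json') ?: [] as $metaFile) {
        $meta = json_decode((string) file_get_contents($metaFile), true);
        if (is_array($meta) && isset($meta['category_path'], $meta['stored_name'])) {
            $key = trim($meta['category_path'], '/') . '/' . $meta['stored_name'];
            $known[ltrim($key, '/')] = true;
        }
    }

    // Walk the uploads tree and collect every file not in that set.
    $orphans = [];
    $root = rtrim($uploadsDir, '/');
    $iter = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iter as $file) {
        if (!$file->isFile()) {
            continue;
        }
        $relative = str_replace('\\', '/', substr($file->getPathname(), strlen($root) + 1));
        if (!isset($known[$relative])) {
            $orphans[] = $relative;
        }
    }
    sort($orphans);
    return $orphans;
}
```

A call such as `find_orphans('data/uploads', 'data/files')` returns paths relative to the uploads directory, for example `multimedia/images/icons/file.zip`.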
## Quality Reports (`/admin/reports`)

Three optional reports for archive hygiene:

- **Duplicate Files** — groups files that share the same size and MD5 hash. Useful for finding accidental re-uploads across different categories.
- **Same Filename** — groups files that share the same filename (case-insensitive) regardless of location. Not necessarily duplicates, but worth reviewing.
- **Missing Descriptions** — lists approved files with no description, or whose description is a placeholder value ("Unknown", "N/A", etc.).

All reports include direct links to each file's Edit Metadata page.

## Archive.org Mirroring (`/admin/mirror`)

Files can be mirrored to the Internet Archive for long-term preservation.

**Setup:** Enter your Archive.org S3 credentials in **Admin → Settings → Archive.org Mirror Credentials**. Get your keys at `archive.org/account/s3.php`.

**Per-file upload:** Click "Upload" next to any file in the Not Yet Mirrored list. The item identifier is set to `hobbesgram-{file_id}` and all available metadata is attached as Archive.org headers.

**Batch upload:** Select a batch size (5–50) and click "Start Batch Upload" to upload the next N unmirrored files in sequence.

Once mirrored, `archiveorg_id` is saved to the file's metadata and a link to the Archive.org item appears on the file detail page.

## CSS Theming

The entire color scheme is controlled from **Admin → CSS**. Colors are stored as CSS custom properties (`--c-*`) applied at render time; no external CSS files are needed.

**Five built-in presets:**

| Preset | Description |
|--------|-------------|
| OS/2 Classic | Grey desktop with navy accents (default) |
| Dark Mode | Dark grey background with blue highlights |
| Green Terminal | Black background with green-on-black text |
| Hobbes OG | Deep blue palette echoing the original hobbes.nmsu.edu |
| Amber | Black background with amber terminal text |

**Site-default preset:** Admins select the active preset in Admin → CSS.
The selection is saved as `active_preset` in `settings.json`.

**Per-user theme:** Registered users can override the site default from their Profile page, choosing any of the five presets or reverting to the site default.

## Atomic Writes and Shared Hosting

All JSON writes use a temp file + atomic `rename()` pattern. `flock()` is never used. This is safe on NFS mounts and shared hosting filesystems where `flock()` can block indefinitely.

The same pattern is used for all `storage.php` writes (settings, users, files, categories, invites), the custom PHP session save handler in `auth.php`, and meta merge session files in `data/merges/`.

The `.htaccess` in `data/` denies direct HTTP access to all data files. If your host does not support `.htaccess`, move the `data/` directory above the web root and update `DATA_DIR` in `config.php`.

## Allowed File Types

| Category | Extensions |
|----------|-----------|
| OS/2 programs | `zip wpi exe cmd bat inf rpm tar gz bz2 lzh arj 7z cab img iso` |
| Media | `jpg jpeg png gif bmp ico wav mp3 mid midi au aiff avi mov mp4 mpeg mpg` |
| Documents | `txt nfo diz doc html htm pdf rtf me` |

Maximum upload size: 200 MB (configurable in `config.php` and `php.ini`).

## Search

Keyword search uses an inverted index stored in `data/index/search.json`. The index is updated automatically when files are approved or their metadata is edited. To rebuild the full index from scratch, use **Admin → Rebuild Search Index**.

Search behavior:

- Queries are tokenized (3+ characters, non-stop words, case-insensitive)
- **Periods within a token are preserved** — `file.txt` is indexed and searched as a single literal term, not split into `file` and `txt`
- **Space-separated terms are OR'd** — `file txt` returns files matching either term
- **Quoted phrases** (`"file manager"`) perform a literal case-insensitive substring match against all metadata fields, bypassing the keyword index
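
The search rules listed above can be sketched roughly as follows. This is not the code in `includes/search.php`: the function names, the stop-word list, and the set of characters allowed inside a token are all assumptions made for illustration.

```php
<?php
// Hypothetical sketch of the query rules; the stop-word list and token
// character class are assumptions, not the archive's own.
const STOP_WORDS = ['the', 'and', 'for', 'with'];

// Split a query into quoted phrases and keyword tokens.
function parse_query(string $query): array
{
    // Quoted phrases are pulled out first; they bypass the keyword index.
    preg_match_all('/"([^"]+)"/', $query, $m);
    $phrases = array_map('strtolower', $m[1]);
    $rest = preg_replace('/"[^"]*"/', ' ', $query);

    $tokens = [];
    foreach (preg_split('/\s+/', strtolower($rest), -1, PREG_SPLIT_NO_EMPTY) as $word) {
        // Keep periods inside a token; trim punctuation at the edges.
        $word = trim(preg_replace('/[^a-z0-9._-]+/', '', $word), '.-_');
        if (strlen($word) >= 3 && !in_array($word, STOP_WORDS, true)) {
            $tokens[] = $word;
        }
    }
    return ['tokens' => array_values(array_unique($tokens)), 'phrases' => $phrases];
}

// OR the tokens together against an inverted index (keyword => [file ids]).
function lookup(array $index, array $tokens): array
{
    $ids = [];
    foreach ($tokens as $token) {
        foreach ($index[$token] ?? [] as $id) {
            $ids[$id] = true;
        }
    }
    return array_keys($ids);
}

$index = ['file.txt' => ['a1'], 'file' => ['b2'], 'txt' => ['c3']];
print_r(lookup($index, parse_query('file.txt')['tokens']));  // matches a1 only
print_r(lookup($index, parse_query('file txt')['tokens']));  // matches b2 and c3
```

Phrases returned by `parse_query()` would be checked separately with a case-insensitive substring test (for example `stripos()`) against each file's metadata fields.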