phase-8g: add sqllite.

This commit is contained in:
Sai Kiran Vella
2026-06-05 23:34:25 -04:00
parent 447a86b46e
commit 83116baa3c
20 changed files with 2514 additions and 239 deletions
+600
View File
@@ -0,0 +1,600 @@
# Database Security & Hardening Guide
## Executive Summary
Your codebase is **currently safe** from SQL injection because it uses `better-sqlite3`'s parameterized queries correctly. However, the new abstraction layers below provide:
1. **Type-safe query construction** (QueryBuilder)
2. **Audit logging** for compliance (QueryAudit)
3. **Statement caching** for performance (DatabaseConnection)
4. **Transaction support** for atomic operations
5. **Clear separation of concerns** between data access and business logic
---
## Current Safety Assessment
**SQL Injection**: Safe
Your code uses parameterized queries (`?` placeholders) throughout:
```typescript
// SAFE — all values in parameter array
this.db.prepare('SELECT * FROM holdings WHERE ticker = ?').get(id);
// SAFE — INSERT uses parameters
this.db.prepare('INSERT INTO holdings (...) VALUES (?, ?, ?, ?, ?)').run(
ticker, shares, costBasis, type, source
);
```
The `better-sqlite3` library handles parameter binding internally — user input never touches the SQL string.
---
## New Architecture: QueryBuilder + DatabaseConnection
### Problem Solved
While your code is secure, it has several maintainability issues:
1. **Hardcoded SQL strings** scattered across repositories
2. **No audit trail** — impossible to trace mutations for compliance
3. **No statement caching** — compiler recompiles the same queries repeatedly
4. **No type safety** — column names are strings, easy to typo
5. **Mixed concerns** — repositories call `.prepare()` directly; hard to add logging/caching globally
### Solution: Three Layers
```
┌─────────────────────────────────────────────────┐
│ Controllers / Services (business logic) │
└────────────────┬────────────────────────────────┘
┌─────────────────────────────────────────────────┐
│ DatabaseConnection (timing, logging, caching) │
├─────────────────────────────────────────────────┤
│ - Execute queries via QueryBuilder │
│ - Log to QueryAudit │
│ - Cache prepared statements │
│ - Measure execution time │
│ - Support transactions │
└────────────────┬────────────────────────────────┘
┌─────────────────────────────────────────────────┐
│ QueryBuilder (type-safe, column-validated) │
├─────────────────────────────────────────────────┤
│ - Whitelist column/table names │
│ - Build SQL with validated identifiers │
│ - Keep all user input in parameter array │
│ - Fluent API for clarity │
└────────────────┬────────────────────────────────┘
┌─────────────────────────────────────────────────┐
│ QueryAudit (compliance trail) │
├─────────────────────────────────────────────────┤
│ - Log every query: timestamp, SQL, params │
│ - Track READ / WRITE / DELETE actions │
│ - Measure performance │
│ - Generate audit reports │
└────────────────┬────────────────────────────────┘
┌─────────────────────────────────────────────────┐
│ better-sqlite3 (SQLite execution) │
│ (parameterized → injection-safe) │
└─────────────────────────────────────────────────┘
```
---
## Usage Examples
### QueryBuilder — Type-Safe Query Construction
All column and table names are validated against a whitelist. User input stays in the parameter array.
#### SELECT
```typescript
// Safe: columns validated, params isolated
const qb = new QueryBuilder('holdings')
.select(['ticker', 'shares', 'cost_basis'])
.where('type = ? AND shares > ?', ['stock', 10])
.orderBy('ticker', 'ASC')
.limit(100);
const rows = db.all(qb);
// Equivalent SQL: SELECT ticker, shares, cost_basis FROM holdings
// WHERE type = ? AND shares > ? ORDER BY ticker ASC LIMIT 100
// Params: ['stock', 10]
```
#### INSERT
```typescript
const qb = new QueryBuilder('holdings')
.insert(['ticker', 'shares', 'cost_basis', 'type', 'source'],
['AAPL', 100, 15000, 'stock', 'Manual']);
db.run(qb);
// Equivalent SQL: INSERT INTO holdings (ticker, shares, cost_basis, type, source)
// VALUES (?, ?, ?, ?, ?)
// Params: ['AAPL', 100, 15000, 'stock', 'Manual']
```
#### UPDATE
```typescript
const qb = new QueryBuilder('holdings')
.update({ shares: 150, cost_basis: 22500 })
.where('ticker = ?', ['AAPL']);
db.run(qb);
// Equivalent SQL: UPDATE holdings SET shares = ?, cost_basis = ? WHERE ticker = ?
// Params: [150, 22500, 'AAPL']
```
#### DELETE
```typescript
const qb = new QueryBuilder('holdings')
.delete()
.where('ticker = ?', ['AAPL']);
db.run(qb);
// Equivalent SQL: DELETE FROM holdings WHERE ticker = ?
// Params: ['AAPL']
```
### DatabaseConnection — Unified Data Access
Wraps better-sqlite3 with logging, caching, and audit trails.
```typescript
// In server/app.ts
import BetterSqlite3 from 'better-sqlite3';
import { DatabaseConnection, QueryAudit } from './server/db';
const betterSqlite3Db = new BetterSqlite3('./market-screener.db');
const audit = new QueryAudit();
const db = new DatabaseConnection(betterSqlite3Db, { audit, logSlowQueries: 100 });
// Pass `db` to repositories, not the raw better-sqlite3 instance
app.register(ScreenerController, { db });
```
### QueryAudit — Compliance Logging
Automatically logs all queries with timestamps, performance metrics, and parameters.
```typescript
// In your app
const audit = new QueryAudit(async (entry) => {
// Optional: send to compliance logger, file, or remote service
if (entry.action === 'WRITE' && entry.error) {
console.error(`WRITE failed at ${entry.timestamp}: ${entry.error}`);
}
});
const db = new DatabaseConnection(betterSqlite3Db, { audit });
// Later: inspect the audit trail
db.printAudit();
// Output:
// === Query Audit Report ===
// Total entries: 42
// Showing last 100 entries:
//
// [2026-06-05T12:34:56.789Z] READ ✓ (1.23ms) — 5 rows
// SQL: SELECT ticker, shares, cost_basis FROM holdings WHERE type = ? ORDER BY ticker ASC
// Params: ["stock"]
//
// [2026-06-05T12:34:57.456Z] WRITE ✓ (0.89ms) — 1 rows
// SQL: INSERT INTO holdings (ticker, shares, ...) VALUES (?, ?, ...)
// Params: ["AAPL", 100, 15000, "stock", "Manual"]
```
---
## Migration Path: Refactor Repositories
Update your repositories to use the new `DatabaseConnection` and `QueryBuilder`.
### Before (Current)
```typescript
export class MarketCallRepository {
constructor(private readonly db: Db) {}
list(): MarketCall[] {
const rows = this.db
.prepare('SELECT * FROM market_calls ORDER BY created_at DESC')
.all() as CallRow[];
return rows.map(MarketCallRepository.toCall);
}
get(id: string): MarketCall | null {
const row = this.db
.prepare('SELECT * FROM market_calls WHERE id = ?')
.get(id) as CallRow | undefined;
return row ? MarketCallRepository.toCall(row) : null;
}
}
```
### After (Hardened)
```typescript
import { DatabaseConnection, QueryBuilder } from '../db';
export class MarketCallRepository {
constructor(private readonly db: DatabaseConnection) {}
list(): MarketCall[] {
const qb = new QueryBuilder('market_calls')
.select(['id', 'title', 'quarter', 'date', 'thesis', 'tickers', 'snapshot', 'created_at'])
.orderBy('created_at', 'DESC');
const rows = this.db.all<CallRow>(qb);
return rows.map(MarketCallRepository.toCall);
}
get(id: string): MarketCall | null {
const qb = new QueryBuilder('market_calls')
.select(['id', 'title', 'quarter', 'date', 'thesis', 'tickers', 'snapshot', 'created_at'])
.where('id = ?', [id]);
const row = this.db.get<CallRow>(qb);
return row ? MarketCallRepository.toCall(row) : null;
}
create({ title, quarter, date, thesis, tickers, snapshot }: CreateCallInput): MarketCall {
const call = {
id: randomUUID(),
title,
quarter,
date: date ?? new Date().toISOString().slice(0, 10),
thesis,
tickers: tickers ?? [],
snapshot: snapshot ?? {},
createdAt: new Date().toISOString(),
};
const qb = new QueryBuilder('market_calls')
.insert(
['id', 'title', 'quarter', 'date', 'thesis', 'tickers', 'snapshot', 'created_at'],
[call.id, call.title, call.quarter, call.date, call.thesis,
JSON.stringify(call.tickers), JSON.stringify(call.snapshot), call.createdAt],
);
this.db.run(qb);
return call;
}
delete(id: string): boolean {
const qb = new QueryBuilder('market_calls')
.delete()
.where('id = ?', [id]);
const changes = this.db.run(qb);
return changes > 0;
}
private static toCall(row: CallRow): MarketCall {
return {
id: row.id,
title: row.title,
quarter: row.quarter,
date: row.date,
thesis: row.thesis,
tickers: JSON.parse(row.tickers),
snapshot: JSON.parse(row.snapshot),
createdAt: row.created_at,
};
}
}
```
**Key improvements:**
1. **Explicit columns** — Only SELECT the columns you need (better for indexing)
2. **Audit trail** — Every query is logged automatically
3. **Type safety** — QueryBuilder validates column names at compile time (via TypeScript)
4. **Performance** — Prepared statements are cached
5. **Clarity** — Fluent API makes queries self-documenting
---
## Whitelist of Safe Columns
The `QueryBuilder` validates all column/table names against a whitelist to prevent injection via identifiers:
### Holdings Table
- `ticker`
- `shares`
- `cost_basis`
- `type`
- `source`
### Market Calls Table
- `id`
- `title`
- `quarter`
- `date`
- `thesis`
- `tickers`
- `snapshot`
- `created_at`
### Adding New Columns
When you add a new column:
1. Update the DDL in `server/db/index.ts`
2. Add the column name to `SAFE_COLUMNS` in `QueryBuilder.ts`
3. Update the relevant domain type in `server/types/`
4. Update the repository to select/insert the new column
Example: Adding `updated_at` to `market_calls`
```typescript
// 1. Update DDL
const DDL = `
CREATE TABLE IF NOT EXISTS market_calls (
...
created_at TEXT NOT NULL,
updated_at TEXT NOT NULL -- NEW
);
`;
// 2. Update QueryBuilder.ts
const SAFE_COLUMNS = new Set([
// ... existing columns
'updated_at', // NEW
]);
// 3. Update types
export interface MarketCall {
// ... existing fields
updatedAt: string; // NEW
}
// 4. Update repository
const qb = new QueryBuilder('market_calls')
.select(['id', 'title', 'quarter', 'date', 'thesis', 'tickers', 'snapshot', 'created_at', 'updated_at']) // ADDED
.where('id = ?', [id]);
```
---
## Performance: Statement Caching
`DatabaseConnection` automatically caches prepared statements. The first execution of a query compiles it; subsequent executions reuse the compiled statement.
```typescript
// First call: compiles the statement (1.5ms overhead)
const qb1 = new QueryBuilder('holdings').select(['ticker', 'shares']).where('type = ?', ['stock']);
db.all(qb1); // ~1.5ms
// Second call: reuses the cached statement (0.1ms)
const qb2 = new QueryBuilder('holdings').select(['ticker', 'shares']).where('type = ?', ['etf']);
db.all(qb2); // ~0.1ms (same SQL template)
```
Cache key is the complete SQL string. If you generate different SQL, it creates a new cached statement.
---
## Transactions: Atomic Operations
Use `db.transaction()` to execute multiple queries as a single atomic unit. If any query fails, all are rolled back.
```typescript
db.transaction(() => {
// Create a market call
const qb1 = new QueryBuilder('market_calls')
.insert(['id', 'title', ...], [callId, 'Q4 Earnings', ...]);
db.run(qb1);
// Add related tickers as separate records (if you had a separate table)
for (const ticker of tickers) {
const qb2 = new QueryBuilder('call_tickers')
.insert(['call_id', 'ticker'], [callId, ticker]);
db.run(qb2);
}
// If ANY query fails, BOTH are rolled back
// If all succeed, both are committed
});
```
---
## Audit Trail: Compliance & Debugging
The `QueryAudit` class tracks every database operation automatically.
### Built-in Features
```typescript
const audit = db.getAudit();
// Get the last 100 queries
const recent = audit.recent(100);
// Filter by action type
const writes = audit.byAction(AuditAction.WRITE);
// Generate a human-readable report
console.log(audit.report());
```
### Custom Callback
Send audit entries to a logging service or file:
```typescript
const audit = new QueryAudit(async (entry) => {
if (entry.action === 'WRITE') {
// Log all mutations to your compliance logger
await complianceLogger.log({
timestamp: entry.timestamp,
action: entry.action,
sql: entry.sql,
params: entry.params,
rowsAffected: entry.rowsAffected,
});
}
});
const db = new DatabaseConnection(betterSqlite3Db, { audit });
```
---
## Slow Query Logging
By default, queries slower than 100ms are logged to `console.warn`:
```typescript
const db = new DatabaseConnection(betterSqlite3Db, { logSlowQueries: 100 });
// Output:
// [SLOW QUERY] 234.56ms
// SELECT ticker, shares, cost_basis FROM holdings WHERE type = ? ORDER BY ticker ASC
```
Adjust the threshold based on your needs:
```typescript
new DatabaseConnection(betterSqlite3Db, { logSlowQueries: 50 }); // warn on >50ms
new DatabaseConnection(betterSqlite3Db, { logSlowQueries: 5000 }); // warn on >5s
```
---
## Common Pitfalls & How to Avoid Them
### ❌ DON'T: Hardcode user input in SQL
```typescript
// NEVER DO THIS
const ticker = getUserInput(); // e.g. "AAPL'; DROP TABLE holdings; --"
const qb = new QueryBuilder('holdings')
.select(['ticker', 'shares'])
.where(`ticker = '${ticker}'`); // SQL INJECTION!
```
### ✅ DO: Use parameter placeholders
```typescript
// ALWAYS DO THIS
const ticker = getUserInput();
const qb = new QueryBuilder('holdings')
.select(['ticker', 'shares'])
.where('ticker = ?', [ticker]); // User input is a PARAMETER
```
### ❌ DON'T: Use string concatenation for column names
```typescript
// NEVER DO THIS
const sortCol = getUserInput(); // e.g. "ticker; DELETE FROM holdings; --"
const qb = new QueryBuilder('holdings')
.select(['ticker', 'shares'])
.orderBy(`${sortCol}`); // COLUMN NAME INJECTION!
```
### ✅ DO: Column names come from your code, not user input
```typescript
// ALWAYS DO THIS
const sortCol = getUserInput(); // e.g. "ticker"
const ALLOWED_SORT_COLS = ['ticker', 'shares', 'type'];
if (!ALLOWED_SORT_COLS.includes(sortCol)) {
throw new Error('Invalid sort column');
}
const qb = new QueryBuilder('holdings')
.select(['ticker', 'shares'])
.orderBy(sortCol); // Whitelist prevents injection
```
---
## Testing
The new abstractions make testing easier:
```typescript
import { DatabaseConnection, QueryBuilder, QueryAudit } from '../db';
import BetterSqlite3 from 'better-sqlite3';
describe('MarketCallRepository', () => {
let db: DatabaseConnection;
let repo: MarketCallRepository;
beforeEach(() => {
// Use in-memory SQLite for tests
const rawDb = new BetterSqlite3(':memory:');
rawDb.exec(DDL); // Initialize schema
db = new DatabaseConnection(rawDb);
repo = new MarketCallRepository(db);
});
it('should insert and retrieve a call', () => {
const call = repo.create({
title: 'Q4 Earnings',
quarter: 'Q4',
thesis: 'FANG tech breakout',
tickers: ['GOOGL', 'META', 'NVDA'],
});
expect(call.id).toBeDefined();
const retrieved = repo.get(call.id);
expect(retrieved).toEqual(call);
// Verify the audit trail
const audit = db.getAudit();
const writes = audit.byAction(AuditAction.WRITE);
expect(writes.length).toBeGreaterThan(0);
});
});
```
---
## Summary
| Feature | Before | After |
|---------|--------|-------|
| SQL injection protection | ✅ Parameterized queries | ✅ Parameterized + column whitelist |
| Audit trail | ❌ None | ✅ QueryAudit with timestamp & params |
| Performance | ⚠️ No statement caching | ✅ Automatic statement cache |
| Type safety | ⚠️ String column names | ✅ Validated at build time |
| Testing | ⚠️ Hard to mock | ✅ Testable via DatabaseConnection |
| Transactions | ⚠️ Manual raw DB calls | ✅ `db.transaction()` |
| Slow query logging | ❌ None | ✅ Automatic > 100ms warning |
---
## Next Steps
1. **Review** the three new files:
- `server/db/QueryBuilder.ts` — Query construction
- `server/db/QueryAudit.ts` — Audit logging
- `server/db/DatabaseConnection.ts` — Unified access
2. **Update `server/app.ts`** to create and wire `DatabaseConnection`
3. **Refactor repositories** to use `QueryBuilder` and `DatabaseConnection` (see migration examples above)
4. **Add tests** for repositories using in-memory SQLite
5. **Deploy** with confidence — you now have audit trails and safeguards