Skip to content
Xophmeister edited this page Apr 13, 2013 · 2 revisions

Database Schema

The database is SQLite-based and inherits some of its structure from its spreadsheet predecessor. As, in my day job, I have to work with a database with no constraints beyond primary keys and nullity checks, I've somewhat gone to town (perhaps excessively) to show how even something as lightweight as SQLite can do things properly.

A Note on Times and Dates

SQLite doesn't have a datetime type, so all temporal data is stored as its respective Unix time (i.e., an integer). When times do not have a date component, we normalise them about the Unix epoch (e.g., so 12:34 is 1970-01-01 12:34, which is Unix time 45240); dates without times are normalised to midnight.

Stations

Stations in the UK, amongst other things, are all given a three character code, which is standardised by the government. For example, London Waterloo is WAT; the coding tends to be semantically similar to the stations' actual names. While it's not necessary to use these standard codes, it makes things easier and, either way, the table forces you to at least enter a three character string in upper case.

CREATE TABLE stations (
  code text primary key not null check (length(code) = 3) check (code = upper(code)),
  name text not null
);

stations.code is used as a foreign key in subsequent tables.

Modelling Train Routes

Train routes are modelled as a (potentially cyclic) directed graph with a star topology. This model serves our purpose.

All routes start from an origin station at a particular time. Train companies change their routes periodically, so an origin must be given a validity period, whose expiry date can be left null if this data is not yet known. Thus, all route origins ("trains") are constrained to be unique with regard to these three data (station, departure time and validity start).

Consequently, the system can't model routes that depart from a station at the same time: As we are confining ourselves to monitoring a small number of set routes, this is of little concern. Moreover, the system does not allow for different routes on different days of the week (e.g., distinguishing between weekday and weekend routes): Again, as the purpose of the system is to log commutes, this isn't a huge problem.

CREATE TABLE trains (
  id integer primary key asc autoincrement,
  stationFrom text not null constraint validOrigin references stations (code),
  departT integer not null,
  validFrom integer not null,
  validTo integer default null,
  constraint validValidity check (validTo is null or validTo > validFrom),
  constraint uniqueOrigin unique (stationFrom, departT, validFrom)
);

The route each train takes is defined by creating waypoint records. Each waypoint must have an arrival time after the train's departure time (constrained with a trigger) and can only be visited once. (Note that the waypoint can be the origin station.) If the train's arrival time crosses midnight, then it must be stored with a date of 1970-01-02.

CREATE TABLE routes (
  train integer not null references trains (id),
  stationTo text not null constraint validWaypoint references stations (code),
  arriveT integer not null,
  constraint uniqueRoute primary key (train, stationTo)
);

CREATE TRIGGER validateWaypoint
before insert on routes for each row
when (select departT from trains where id = new.train) >= new.arriveT
begin
  select raise(abort, 'Train cannot go back in time!');
end;

Journey Log

[Write up]

CREATE TABLE log (
  stationFrom text    not null constraint validStationFrom references stations (code),
  stationTo   text    not null constraint validStationTo   references stations (code),
  scheduleT   integer not null unique,
  arriveT     integer not null,
  excuse      text,
  cancelled   integer not null default 0 check (cancelled in (0,1)),
  
  constraint uniqueLog         primary key (stationFrom, stationTo, scheduleT),
  constraint differentStations check (stationFrom != stationTo),
  constraint journeyTime       check (arriveT > scheduleT)
);

CREATE TRIGGER validateRoute
before insert on log for each row
when (
  select count(*)
  from   (
           select id as train, stationFrom, departT from trains
           union
           select train, stationTo as stationFrom, arriveT as departT from routes
         ) as origin
  join   routes as destination
  on     destination.train     = origin.train
  and    destination.stationTo = new.stationTo
  join   trains
  on     trains.id             = origin.train
  where  origin.stationFrom    = new.stationFrom
  and    origin.departT        = strftime('%s', '1970-01-01 ' || time(new.scheduleT, 'unixepoch'))
  and    new.scheduleT         between trains.validFrom and ifnull(trains.validTo, 10000000000)
) = 0
begin
  select raise(abort, 'Not a valid route!');
end;

Clone this wiki locally