Sequence classification

Besides one-step forecasting, esnfed supports sequence classification: each input is a sequence, possibly of variable length, that maps to a single class. The reservoir turns each sequence into a fixed-length feature vector (the mean or the last extended state, selected with pool="mean" or pool="last"); a ridge readout trained on one-hot targets then classifies by argmax over its outputs.
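The pipeline described above can be sketched in plain NumPy. This is only an illustration and does not use esnfed: the helper names (`reservoir_states`, `pool_features`, `ridge_readout`), the leaky-tanh update, the layout of the extended state as `[1, u_t, x_t]`, and the toy data are all assumptions made for the example.

```python
import numpy as np

def reservoir_states(u_seq, W_in, W, leak):
    """Run a leaky-tanh reservoir over one sequence.

    Returns one extended state per time step, assumed here to be [1, u_t, x_t].
    """
    x = np.zeros(W.shape[0])
    rows = []
    for u in u_seq:
        x = (1.0 - leak) * x + leak * np.tanh(W_in @ u + W @ x)
        rows.append(np.concatenate(([1.0], u, x)))
    return np.asarray(rows)

def pool_features(states, pool="mean"):
    """Collapse a (T, d) state matrix into one d-vector, whatever T is."""
    return states.mean(axis=0) if pool == "mean" else states[-1]

def ridge_readout(F, Y, ridge):
    """Solve (F^T F + ridge * I) W = F^T Y for the readout weights."""
    A = F.T @ F + ridge * np.eye(F.shape[1])
    return np.linalg.solve(A, F.T @ Y)

rng = np.random.default_rng(0)
n_in, n_res, n_classes = 3, 20, 2
W_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))
W = rng.normal(size=(n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # rescale to spectral radius 0.9

# Toy variable-length sequences: class 0 is centred at -1, class 1 at +1
labels = np.array([0, 1] * 10)
sequences = [rng.normal(loc=2 * c - 1, size=(int(rng.integers(5, 15)), n_in))
             for c in labels]

# One fixed-length feature vector per sequence, then a one-hot ridge readout
F = np.stack([pool_features(reservoir_states(s, W_in, W, 0.3)) for s in sequences])
Y = np.eye(n_classes)[labels]
W_out = ridge_readout(F, Y, 1e-3)
pred = np.argmax(F @ W_out, axis=1)  # predicted class per sequence
```

Whatever the sequence length, pooling yields a vector of size 1 + n_in + n_res, which is what lets a single linear readout handle variable-length inputs.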

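Because the readout is again a ridge regression, the exact federated scheme carries over unchanged: each client k sends only its local statistics A_k = F_kᵀF_k and B_k = F_kᵀY_k, where F_k stacks its per-sequence feature vectors and Y_k its one-hot targets. The server sums them into A = FᵀF and B = FᵀY and solves the ridge system once, recovering the pooled classifier exactly.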
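The exactness claim follows from the fact that FᵀF and FᵀY are sums over rows, so they split across clients. A minimal NumPy check, independent of esnfed, is sketched below; the random features, the client sizes, and the convention of adding `ridge * I` to the summed matrix on the server are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_classes, ridge = 8, 3, 1e-3

# Three hypothetical clients with different amounts of local data
clients = []
for n_k in (30, 50, 20):
    F_k = rng.normal(size=(n_k, n_features))                       # local features
    Y_k = np.eye(n_classes)[rng.integers(0, n_classes, size=n_k)]  # one-hot targets
    clients.append((F_k, Y_k))

# Client side: only these two small matrices leave each client
stats = [(F_k.T @ F_k, F_k.T @ Y_k) for F_k, Y_k in clients]

# Server side: sum the statistics and solve the ridge system once
A = sum(A_k for A_k, _ in stats)
B = sum(B_k for _, B_k in stats)
W_fed = np.linalg.solve(A + ridge * np.eye(n_features), B)

# Centralised reference: pool all rows, then solve the same system
F = np.vstack([F_k for F_k, _ in clients])
Y = np.vstack([Y_k for _, Y_k in clients])
W_central = np.linalg.solve(F.T @ F + ridge * np.eye(n_features), F.T @ Y)

max_diff = np.max(np.abs(W_fed - W_central))  # differs only by floating-point rounding
```

Each client uploads an n_features × n_features matrix and an n_features × n_classes matrix, so the communication cost does not grow with the number of local sequences.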

```python
import numpy as np
from esnfed import EchoStateNetwork, topologies, datasets
from esnfed import classification as clf

jv = datasets.load_japanese_vowels()          # 9 speakers, 12-d cepstra
W = topologies.random_reservoir(120, density=0.1, rng=0)
esn = EchoStateNetwork(jv.n_features, jv.n_classes, W,
                       spectral_radius=0.9, leaking_rate=0.3, washout=0, ridge=1e-3)

# centralised
W_out = clf.train_classifier(esn, jv.X_train, jv.y_train, jv.n_classes)
acc = clf.accuracy(jv.y_test, clf.predict_labels(esn, jv.X_test, W_out))

# exact federated: one client per speaker, no raw audio shared
clients = datasets.group_clients(jv.X_train, jv.y_train, jv.groups_train)
W_fed = clf.federated_classifier(esn, clients, jv.n_classes)        # == centralised
```

API

| Function | Purpose |
| --- | --- |
| `reservoir_features(esn, sequences, pool)` | sequence → feature (pool="mean"/"last") |
| `train_classifier(esn, X, y, n_classes)` | centralised ridge classifier |
| `federated_classifier(esn, client_data, n_classes)` | exact federated (sum A_k, B_k) |
| `ensemble_classify(members, X)` | average softmax over heterogeneous members |
| `predict_labels` / `predict_proba` | argmax / class probabilities |
| `accuracy(y_true, y_pred)` | classification accuracy |

Heterogeneous clients

When each client has its own reservoir, the clients' features live in different spaces, so their statistics and readout weights cannot be summed or averaged. Combine predictions instead:

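```python
# client_esns and client_sets are hypothetical names: one reservoir and one
# (X_k, y_k) training pair per client
members = [(esn_k, clf.train_classifier(esn_k, X_k, y_k, n_classes))
           for esn_k, (X_k, y_k) in zip(client_esns, client_sets)]
y_pred = clf.ensemble_classify(members, X_test)
```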
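The combination step can be illustrated without esnfed. The sketch below assumes each member produces a matrix of raw readout scores (one row per test sequence), turns them into probabilities with a softmax, and averages across members; `softmax`, `average_softmax`, and the score values are hypothetical and only approximate what `ensemble_classify` is described as doing.

```python
import numpy as np

def softmax(scores):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = scores - scores.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def average_softmax(member_scores):
    """Average per-member class probabilities, then take the argmax per row."""
    probs = np.mean([softmax(s) for s in member_scores], axis=0)
    return probs, probs.argmax(axis=1)

# Readout scores from two hypothetical members: 2 test sequences, 3 classes
scores_a = np.array([[2.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
scores_b = np.array([[0.0, 3.0, 0.0], [0.0, 0.0, 2.0]])

probs, labels = average_softmax([scores_a, scores_b])
```

Only the score matrices need the same shape; the members can use reservoirs of different sizes or topologies, since nothing upstream of the class scores is shared.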

Validated on real benchmarks

See Benchmark datasets. Measured results (one client per natural group, no raw data shared):

| Dataset | Task | Clients | Federated accuracy | Local-only accuracy |
| --- | --- | --- | --- | --- |
| Japanese Vowels | speaker ID (9-class) | 9 speakers | 0.98 | ~0.11 (chance) |
| HAR Smartphones | activity (6-class) | 21 subjects | 0.92 | 0.66 |

In both cases the federated classifier is numerically identical to centralised training on the pooled data. On Japanese Vowels the per-speaker split is an extreme case of label skew: each client sees a single class, so a locally trained classifier can only ever predict its own speaker and lands at roughly chance level (about 1/9 ≈ 0.11). Federation is essential there. On HAR, a prediction ensemble over heterogeneous per-subject reservoirs reaches ≈0.84, between local-only training (0.66) and the shared-reservoir federated classifier (0.92).