A crash survey of noninvasive neuroengineering

The true potential of noninvasive neuroengineering is still waiting to be unlocked. Unfortunately, no existing implementation of a noninvasive neurotechnology (one that doesn’t break the skin) yet provides the value required of a consumer-facing product. This shouldn’t stop you from pushing forward, but let’s examine the common approaches and where their boundaries and limitations lie.

Common neural signal features
- Signal features comparison
Signal recording
- Recording system comparison
Signal processing overview
- Preprocessing techniques
- Spectral density techniques

I. Common neural signal features

Not designed for interfacing

The brain wasn’t designed for interfacing. Neurons, having only the evolutionary pressure to communicate with each other electrochemically, have no particular reason to interact with our electronics. Fundamentally, neurons don’t leak a lot of information to the exterior extracellular matrix, and whatever does leak is subject to the inverse-square law (or worse), like most physical processes.

This means that most noninvasively detectable effects occur over very large, aggregate spatial and temporal regions. You are always trading off reliability of detection against resolution and transfer bitrate.

Signal features comparison

Signal feature	Common use	Limitations
P300 familiarity signal	familiarity of images	needs repeated exposure, false positives
	typing on a keyboard	accomplished by flashing the letters; slow, at most a few letters per minute achieved
δ-band activity/SCP	cursor movements	very slow cursor modulation/bitrate (tens of seconds)
β-band activity; α/β ratio	alertness, wakefulness, focus	predominantly modulated by eyes-openness; alertness highly variable between individuals
	eyes open or closed	lag of seconds to transition
	focus level	much less accurate than pupil size visualized through a webcam, highly variable between individuals
sensorimotor: μ-band (α-range frequencies over sensorimotor cortex)	arm/hand/finger movement	can detect movement onset; high-variance trial-to-trial, slow (seconds)
	speech	can detect speech onset; distinguishing speech content extremely difficult
sensorimotor: local motor potential	hand/finger position	high-variance trial-to-trial, person-to-person, not well replicated
VEPs	typing on keyboard	Person-to-person variability. In people for whom it works, the highest transfer bitrate. Requires pre-arranged and accurately timed stimuli, usually flashing lights. Can be an epilepsy trigger.
	multiple-choice
Error potentials	mismovement, misspeech, typing correction	Person-to-person variability. Usually in the μ and β-bands. Aggregate effect, noisy/error-prone to use on individual trials.
EMG activation	cursor movement	high SNR, easy to detect, easy to use; but not particularly impressive, and less often considered to be "neural" signals
	keyboard movement
	blinking, facial movements
EKG	heart rate, heart rate variability	high SNR, very easy to detect; circuits need to be very safe
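
Several rows above (P300, error potentials) note that a feature is an aggregate effect that needs repeated exposure and is unreliable on single trials. A minimal sketch of why, using only NumPy and purely synthetic data: the evoked-response shape, noise level, sampling rate, and the helper name `average_epochs` are all illustrative assumptions, not measurements from any device. Averaging stimulus-locked epochs shrinks uncorrelated background noise roughly with the square root of the number of trials, which is exactly the cost in time that limits bitrate.

```python
import numpy as np

def average_epochs(signal, onsets, n_samples):
    """Cut fixed-length epochs starting at each stimulus onset and average them."""
    epochs = np.stack([signal[s:s + n_samples] for s in onsets])
    return epochs.mean(axis=0), epochs

# Synthetic recording; every number here is an assumption for illustration.
rng = np.random.default_rng(0)
fs = 250                       # sampling rate in Hz
n_samples = fs                 # 1 s epochs
t = np.arange(n_samples) / fs

# Hypothetical evoked response: a bump peaking about 300 ms after the stimulus.
template = 2.0 * np.exp(-0.5 * ((t - 0.3) / 0.05) ** 2)

n_trials = 100
onsets = np.arange(n_trials) * n_samples
signal = rng.normal(0.0, 10.0, n_trials * n_samples)   # background activity
for s in onsets:
    signal[s:s + n_samples] += template                 # bury the response in it

erp, epochs = average_epochs(signal, onsets, n_samples)

# Residual noise on one trial versus on the 100-trial average.
single_trial_noise = np.std(epochs[0] - template)
averaged_noise = np.std(erp - template)
print(f"single trial: {single_trial_noise:.2f}, averaged: {averaged_noise:.2f}")
```

With a response this far below the background, a single epoch is essentially unusable, and the average only becomes readable after many repetitions.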

II. Signal recording

The devices that record human neural signals come in many forms, even though they may operate on identical principles. Don’t be confused by the many forms that they purport to take: instead, judge their design and execution by the physical usability of setting up the recording process, the stability and reliability of the resulting recording, and the data value, in terms of spatial and temporal resolution, of the signal they give you.

Ignore the marketing

Some of them, perhaps the most familiar, are marketed as lifestyle or performance quantification, promising to measure some aspects of your exercise or mood. Some market themselves as toys or presentation aids, allowing you to perform “telekinesis” or virtual control. Some actually purport to be EEG, EMG (electromyography), or EOG recording and research tools. Still others claim to record EEG, but tailor their use specifically for educational and demonstration purposes. Whatever their marketed purpose, their significance to you, the neuroengineer, rides primarily upon their design, both in their circuit design and their human-centered usability, to give you the signal features you wish to use.

Recording system comparison

Recording hardware	Common use	Usability	Stability	Data value	Limitations
OpenBCI 32bit + Ultracortex	P300, α, μ, β, δ, VEPs	Moderate-High: Fast set-up, many API choices (some in early development), very flexible	Moderate: somewhat motion-resistant, good wireless stability	High: adjustable spatial resolution, high sampling rate, modifiable with 3D printer	Expensive: 8-channel resolution for $800, and 16-channel can be $1,100 or more
OpenBCI Ganglion + Ultracortex	P300, α, μ, β, δ, VEPs	Moderate-High: Fast set-up, many API choices (some in early development), very flexible	Moderate: somewhat motion-resistant, good wireless stability	Moderate: adjustable spatial resolution, high sampling rate, much more limited channel count	Moderately expensive: 4 channel resolution for $400
OpenBCI Ganglion + own physical hardware	P300, α, μ, β, δ, VEPs, EMG, EKG	Moderate: many API choices (some in early development), very flexible, but you need to design and manufacture your own wearable interface	Moderate: somewhat motion-resistant, good wireless stability. High if used for EMG/EKG.	Moderate: adjustable spatial resolution, high sampling rate, much more limited channel count. High if used for EMG/EKG where fewer channels are needed.	Difficult to interface stably with neural signals; may be a great option for EMG or EKG.
Neurosky Mindwave	α/β	Moderate: Quick to achieve a feed of very limited metrics	Low-moderate: motion-susceptible, variable between individuals	Low: single-channel low bandwidth recordings; neural only	Inexpensive but very limited data
Emotiv EPOC	P300, α, β, μ, VEPs	Low-moderate: Can achieve a feed of very limited metrics	Low-moderate: motion-susceptible, variable between individuals	Moderate: 16-channel recordings; neural only	Limited data usage, pre-set algorithm outputs, EEG access available only on additional subscription, limited APIs
Emotiv Insight	α, β	Moderate: Quick to achieve a feed of very limited metrics	Low-moderate: motion-susceptible, variable between individuals	Low: 4-channel low bandwidth recordings; neural only; does not allow raw EEG data output	Limited data usage, pre-set algorithm outputs, no raw EEG access
Thync	-	-	-	-	-
Versus	-	-	-	-	-
Bitalino	P300, α, β, μ, VEPs, EMG, EKG	Low-Moderate: provides circuitry only, wireless; need own streaming and processing stack	Moderate: when used for EMG/EKG, dependent on your wearable interfacing setup	Moderate: various channels and individual preamps, can record at high bitrates depending on own circuitry	Very versatile and small tools, easy to sew into wearables, but need to do a lot of ground work
FlexVolt Sensor	EMG, EKG	Moderate: provides circuitry, wires and electrodes, and some processing stack, wireless; limited API and existing codebase	Moderate: when used for EMG/EKG, but there are a lot of excess wires	Moderate: 2-8 channel variants at 256 Hz	Limited existing codebase, needs a bit of reverse engineering from the manufacturer's existing OS code, may want to replace existing wiring
Backyard Brains Muscle Spiker	EMG	High: 1 channel but works out of box; a quick demo	High: when used for EMG, existing hardware filters very stable	Low-moderate: can obtain pre-filtered and post-filtered (hardware filter) signal, trade off stability with usability	Very easy to get started and produce a quick demo, but about the same as a very-low-channel (1 ch) Bitalino when you want more complex behavior
Backyard Brains Heart & Brain Spiker	EEG, EKG	High: 1 channel but works out of box; a quick demo	High-moderate: when used for EEG, existing hardware filters very stable, but headband designs less so	Low-moderate: can obtain pre-filtered and post-filtered (hardware filter) signal, trade off stability with usability. May need gel use.	Very easy to get started, has some codebase and ideas, but about the same as a very-low-channel (1 ch) Bitalino when you want more complex behavior

III. Signal processing overview

This is the “glue” between the raw signals and some proxy of usable behavior, and you may need to produce a lot of glue code for differing time resolutions and wearable modalities, especially when creating a hierarchical device that depends on many inputs. A large number of common techniques have been developed over the past several decades, with increasing computational complexity, but they lag up to several decades behind cutting-edge compute and learning techniques.

Underdetermined & underconstrained

While at first glance algorithm design may seem to be a good way to make breakthroughs, the dependence of cutting-edge learning algorithms on the stability and SNR of the signal capture cannot be emphasized enough. Overfitting to the noise and artifacts of particular datasets is extremely easy because useful features have low SNR, and because neural behavior is highly underconstrained and underdetermined.

“The brain has about 10¹⁴ synapses and we only live for about 10⁹ seconds. So we have a lot more parameters than data. This motivates the idea that we must do a lot of unsupervised learning since the perceptual input (including proprioception) is the only place we can get 10⁵ dimensions of constraint per second.” - Geoffrey Hinton

Comparison of feature extraction systems

Preprocessing, noise rejection, spatial filtering

Preprocessing	Common use	Advantage	Limitations
Common average referencing (CAR)	Common noise rejection, power rejection, motion artifact rejection	Very simple to implement, low compute and memory requirements, $${\text{sig}_i}'=\text{sig}_i-\frac{1}{n}\sum\limits_{j=1}^{n}\text{sig}_j$$	Noise ingress/flux must be similar in magnitude; sensors must be facing the same direction in space. Performs better for power rejection than for motion artifacts. Smears signals spatially. Can destroy out-of-phase traveling waves.
Surface Laplacian, minimal/fixed form	Common noise rejection	When implemented as $$\begin{bmatrix} 0&-\frac{1}{4}&0\\ -\frac{1}{4}&1&-\frac{1}{4}\\ 0&-\frac{1}{4}&0 \end{bmatrix}$$ or larger forms, works across curved electrode grids (such as EEG). Lower spatial smearing than CAR. Still very simple to implement.	Localized strong noise much more likely to smear over legitimate signals. Very simplistic and prone to bad assumptions.
Surface Laplacian, spherical spline form	Common noise rejection	$$\nabla_{\text{surf}}^{2} P_{n}(\cos \theta)=-\frac{n(n+1)}{r^2}P_n(\cos \theta)$$, works across curved EEG grids. Lower spatial smearing than CAR. Better assumptions than above minimal form.	Much more annoying to implement than static forms. Requires known geometry or physical model of headform. Slow.
Common spatial pattern (CSP)	Pre-extract features specific to block-design contrast conditions	Implemented as $$\mathbf{w}=\text{argmax}_{\mathbf{w}}\frac{\|\mathbf{w}\mathbf{X}_1\|^2}{\|\mathbf{w}\mathbf{X}_2\|^2}$$ which gives a precomputed spatial weight w that maximizes the variance ratio between conditions. Signal can be very usable and strong after processing.	Requires the signal conditions to be stationary across time; requires a designated and planned training phase with separate recording blocks to obtain w, which can be unreliable.
Principal component analysis (PCA)	Find spatial weights that maximize variance across all timeseries (cannot pre-define blocks). Noise rejection, motion artifact rejection. Improve classification.	Fast, easy to apply. Implemented as $$\mathbf{X}^T\mathbf{X}=\mathbf{W}\mathbf{\Sigma}^2\mathbf{W}^T$$ Commonly used; requires a training phase, but it does not need to be pre-planned well. Top PCs often useful for something. Can be used to reject noise (by reprojecting and excluding smallest PCs).	Top PCs may not mean anything, and can be a mishmash of multiple phenomena. A single phenomenon can be split over several PCs into + and - components. If SNR is low, noise may still dominate variance and therefore PCs. Cannot predefine blocks.
Independent component analysis (ICA)	Noise rejection, motion artifact rejection. Improve classification.	Excellent at power noise rejection and motion artifact rejection. Generally separates non-physiological from physiological signals well.	Very slow compared to all above techniques. Mostly only usable offline. Quite often needs an additional stack to identify useful vs. non-useful components; this is often done only by eye.
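
As a rough illustration of the two simplest rows above, here is a minimal NumPy sketch of common average referencing and PCA reprojection on synthetic data. The 8-channel layout, the 60 Hz line noise, the 10 Hz source, and the function names are all assumptions made for the example. Note that the data here is arranged as channels × samples, so the spatial weights come from the left singular vectors (the transpose of the arrangement implied by the PCA formula in the table).

```python
import numpy as np

def common_average_reference(x):
    """x has shape (n_channels, n_samples); subtract the across-channel mean per sample."""
    return x - x.mean(axis=0, keepdims=True)

def pca_reproject(x, n_keep):
    """Keep the n_keep largest principal components and project back to channel space."""
    mean = x.mean(axis=1, keepdims=True)
    xc = x - mean
    # Left singular vectors of the centered data are the spatial weights.
    u, s, _ = np.linalg.svd(xc, full_matrices=False)
    w = u[:, :n_keep]
    return w @ (w.T @ xc) + mean, s

# Synthetic 8-channel recording (all values are illustrative assumptions).
rng = np.random.default_rng(0)
fs = 250
t = np.arange(1000) / fs
line = 5.0 * np.sin(2 * np.pi * 60 * t)     # power-line noise, identical on every channel
source = np.sin(2 * np.pi * 10 * t)         # local oscillation, present on channel 0 only
x = np.tile(line, (8, 1)) + 0.1 * rng.normal(size=(8, t.size))
x[0] += source

car = common_average_reference(x)

# Line noise dominates channel 0 before CAR and is largely gone afterwards.
leak_before = abs(np.corrcoef(x[0], line)[0, 1])
leak_after = abs(np.corrcoef(car[0], line)[0, 1])

# Spatial smearing: channel 1 never saw the source, but after CAR it carries
# an inverted, attenuated copy of it.
smear = np.corrcoef(car[1], source)[0, 1]

# Reject the smallest components by keeping only the top PC.
denoised, singular_values = pca_reproject(car, n_keep=1)
print(leak_before, leak_after, smear)
```

The top PC lines up with the source here only because the common noise was removed first and the remaining noise is small; as the table warns, with lower SNR the largest components may simply be noise.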

Oscillatory and spectral density

Signal processing	Common use	Advantage	Limitations
Gabor filter / short-time (windowed) FFT	Oscillations, spectral power density, spectral power ratios	Extremely common, with a large existing codebase; fast to implement; often even has dedicated hardware instructions	Feature extraction highly sensitive to Gabor frame size
Multitaper FFT	Oscillations	Precision determination of oscillatory frequency	Slow, latency and low temporal resolution after implementation; needs 3 full oscillations for accurate determination
Wavelet transforms	Oscillations, spectral power density	Relatively precise determination of oscillatory frequency; approaches the Gabor-Heisenberg theoretical limit; faster than FFT if only need a few spectral features at a time	Much slower if need as many spectral features as FFT
AR models	Spectral power density	Similar features extracted as FFT, but depending on noise level not interchangeable with FFT spectral components. Can be faster depending on order.	Not as well documented in neural usage as FFT, except in some cases. Not well-suited for oscillations.
Analytic signal envelope	Spectral power density, instantaneous signal phase	Fast to compute for known spectral bands, even for wide bandwidths. Simple to compute acausally as the absolute value of the Hilbert analytic signal; phase can be extracted precisely as its complex angle. More difficult as a causal filter, but can be approximated as FIR.	Must be approximated as FIR as a causal filter. Not efficient to use if you need many bands, but may be the best way to extract phase.
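
To make the first and last rows concrete, here is a hedged NumPy-only sketch of a short-time (windowed) FFT band-power estimate and an acausal analytic-signal envelope. The band edges (8-13 Hz for α, 13-30 Hz for β), window length, test signals, and function names are assumptions chosen for illustration; the analytic signal is built directly with the FFT rather than with any particular library's Hilbert routine.

```python
import numpy as np

def band_power(x, fs, band, win_len):
    """Mean power in `band` (Hz) over Hann-windowed, half-overlapping FFT segments."""
    window = np.hanning(win_len)
    hop = win_len // 2
    freqs = np.fft.rfftfreq(win_len, d=1.0 / fs)
    mask = (freqs >= band[0]) & (freqs < band[1])
    powers = []
    for start in range(0, len(x) - win_len + 1, hop):
        spectrum = np.fft.rfft(window * x[start:start + win_len])
        powers.append(np.sum(np.abs(spectrum[mask]) ** 2))
    return float(np.mean(powers))

def analytic_signal(x):
    """Acausal analytic signal: zero the negative frequencies, double the positive ones."""
    n = len(x)
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * gain)

fs = 250
t = np.arange(1000) / fs

# Spectral power ratio: a strong 10 Hz and a weaker 20 Hz oscillation.
x = 2.0 * np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 20 * t)
alpha = band_power(x, fs, (8.0, 13.0), win_len=fs)    # 1 s frames, 1 Hz resolution
beta = band_power(x, fs, (13.0, 30.0), win_len=fs)
ratio = alpha / beta

# Envelope and phase: a 10 Hz carrier whose amplitude is modulated at 1 Hz.
modulation = 1.0 + 0.5 * np.sin(2 * np.pi * 1 * t)
am = modulation * np.sin(2 * np.pi * 10 * t)
z = analytic_signal(am)
envelope = np.abs(z)      # instantaneous amplitude
phase = np.angle(z)       # instantaneous phase in radians
print(ratio)
```

The frame length sets both the frequency resolution and the latency of the band-power estimate, which is the sensitivity to frame size noted in the table. The envelope function above sees the whole recording at once, so it is only the offline (acausal) form; a real-time version would need the FIR approximation mentioned in the last row.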