Unsigned to Signed Data types
As of version 0.103.0, SpikeInterface has changed one of its defaults for interacting with Recording objects. We no longer implicitly autocast unsigned dtypes to signed. This means that some users of SpikeInterface will need to add one additional line of code to their scripts to explicitly handle this conversion.
Why does this matter?
For those who want a deeper understanding of dtypes, NumPy provides a great explanation.
For our purposes, it is important to know that many pieces of recording equipment store their electrophysiological data as unsigned integers (e.g., Intan, Maxwell Biosystems, 3Brain Biocam). As with signed integers, these file formats only need to store a gain and an offset in order to convert to real units. Our RecordingExtractors maintain the dtype that the file format uses, which means that some of our RecordingExtractors will have unsigned dtypes.
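To make the role of the gain and offset concrete, here is a minimal NumPy sketch of converting raw unsigned counts to physical units. The sample values, the gain, and the offset are hypothetical and chosen only for illustration; real values come from the file format.

```python
import numpy as np

# Hypothetical raw samples stored as unsigned 16-bit integers
raw = np.array([32768, 32778, 32758], dtype=np.uint16)

# Hypothetical scaling values (assumptions, not read from any real file)
gain = 0.195            # microvolts per count
offset = -32768 * gain  # maps the unsigned midpoint to 0 microvolts

# Convert to float before scaling so no integer arithmetic is involved
scaled_uV = raw.astype(np.float64) * gain + offset
print(scaled_uV)
```

The stored integers stay unsigned on disk; only the scaled values are centered around zero.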
The problem with using unsigned dtypes is that many types of functions (including the ones we use from SciPy) perform poorly with unsigned integers. This is made worse by the fact that these failures are silent (i.e., no error is triggered, but the operation produces nonsensical data). The solution is to convert unsigned integers into signed integers. Previously, we did this automatically under the hood for users who had a Recording object with an unsigned dtype.
We decided, however, that implicitly performing this action was not the best course of action, since:
- explicit is always better than implicit
- some functions would magically change the dtype of the Recording object, which can cause confusion
So from version 0.103.0, users must perform this transformation of their data explicitly. This will help users better understand how they are processing their data during an analysis pipeline, as well as the provenance of their pipeline.
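The kind of silent failure described above can be illustrated with plain NumPy. This is a small sketch with made-up sample values, not code taken from SpikeInterface: subtracting unsigned integers wraps around instead of going negative, and no error or warning is raised.

```python
import numpy as np

# Made-up samples near the uint16 midpoint (32768)
samples = np.array([32770, 32760, 32775], dtype=np.uint16)

# Differences between consecutive samples; the true values are -10 and +15
wrapped = np.diff(samples)
print(wrapped)   # [65526    15] : -10 silently wrapped around to 65526

# Casting to a signed dtype first gives the expected result
correct = np.diff(samples.astype(np.int32))
print(correct)   # [-10  15]
```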
Using unsigned_to_signed
For users who receive an error because their Recording is unsigned, there is one additional step that must be done:
import spikeinterface.extractors as se
import spikeinterface.preprocessing as spre
# Intan is an example of unsigned data
recording = se.read_intan('path/to/my/file.rhd', stream_id='0')
# to get a signed version of our Recording we use the following function
recording_signed = spre.unsigned_to_signed(recording)
# we can now apply any preprocessing functions like normal, e.g.
recording_filtered = spre.bandpass_filter(recording_signed)
Now, with a signed Recording, one can use a SpikeInterface pipeline as usual.
If you are curious whether your Recording is unsigned, you can simply check the repr or use get_dtype():
# the repr automatically displays the dtype
print(recording)
# use method on the Recording object
print(recording.get_dtype())
In either case, if the dtype displayed has a u at the beginning (e.g., uint16), then your recording is unsigned. If it doesn’t have the u (e.g., int16), then it is signed and does not need this preprocessing step.
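The same check can be done programmatically instead of by eye. The sketch below uses NumPy's dtype kind code, where "u" means unsigned integer; the is_unsigned helper is hypothetical and not part of SpikeInterface.

```python
import numpy as np

def is_unsigned(dtype) -> bool:
    """Hypothetical helper: True if dtype is an unsigned integer type."""
    # NumPy kind codes: "u" = unsigned integer, "i" = signed integer, "f" = float
    return np.dtype(dtype).kind == "u"

print(is_unsigned("uint16"))  # True
print(is_unsigned("int16"))   # False
```

With a Recording in hand, this could be called as is_unsigned(recording.get_dtype()).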
Bit depth
One final important piece of information for some users is the concept of bit depth, which is the number of bits used to sample the data. The unsigned_to_signed function accepts a bit_depth argument. This should be used in cases where the ADC bit depth does not match the bit depth of the data type (e.g., if the data is stored as uint16 but the ADC is 12 bits).
Let’s make a concrete example: the Biocam acquisition system from 3Brain uses a 12-bit ADC and stores the data as uint16. This means that the data is stored in a 16-bit unsigned integer format, but the actual data only covers a 12-bit range. Therefore, the “zero” of the data is not at 0, nor at half of the uint16 range (i.e., 2^15), but rather at 2048 (i.e., 2^11).
In this case, setting the bit_depth argument to 12 allows the unsigned_to_signed function to correctly convert the unsigned data to signed data and center it around 0, by subtracting 2048 during the conversion.
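import spikeinterface.extractors as se
import spikeinterface.preprocessing as spre
# Biocam stores 12-bit ADC samples as uint16
recording_unsigned = se.read_biocam('path/to/my/file.brw')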
# we can now convert to signed with the correct bit depth
recording_signed = spre.unsigned_to_signed(recording_unsigned, bit_depth=12)
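The arithmetic behind the bit depth can be sketched in plain NumPy. This is an illustration of the idea under stated assumptions, not SpikeInterface's actual implementation: the shift_to_signed helper is hypothetical, and it assumes unsigned inputs at most 32 bits wide.

```python
import numpy as np

def shift_to_signed(traces, bit_depth=None):
    """Hypothetical helper: center unsigned samples on 0 and return a signed array."""
    if traces.dtype.kind != "u":
        raise ValueError("expected an unsigned integer array")
    n_bits = traces.dtype.itemsize * 8
    if bit_depth is None:
        # Assume the ADC uses the full range of the storage dtype
        bit_depth = n_bits
    # The midpoint of a bit_depth-bit range, e.g. 2**11 = 2048 for 12 bits
    shift = 2 ** (bit_depth - 1)
    # Widen to int64 before subtracting so the subtraction cannot wrap around
    return (traces.astype(np.int64) - shift).astype(f"int{n_bits}")

# 12-bit samples stored as uint16: minimum, midpoint, and maximum
adc_samples = np.array([0, 2048, 4095], dtype=np.uint16)
print(shift_to_signed(adc_samples, bit_depth=12))  # [-2048     0  2047]
```

Without bit_depth=12, the same samples would be shifted by 2^15 = 32768 and the midpoint would land at -30720 rather than 0.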
Additional Notes
Some sorters make use of SpikeInterface preprocessing either within their wrappers or within their own code base. So remember to use the “signed” version of your recording for the rest of your pipeline.
Using unsigned_to_signed in versions earlier than 0.103.0 does not hurt your scripts. This option was available previously along with the implicit option. Adding this into scripts with old versions of SpikeInterface will still work and will “future-proof” your scripts for when you update to a version greater than or equal to 0.103.0.
For additional information on units and scaling in SpikeInterface, see Working with physical units in SpikeInterface recordings.