Unsigned to Signed Data types

As of version 0.103.0, SpikeInterface has changed one of its defaults for interacting with Recording objects: we no longer implicitly cast unsigned dtypes to signed ones. This means that some users of SpikeInterface will need to add one line of code to their scripts to handle this conversion explicitly.

Why does this matter?

For those who want a deeper understanding of dtypes, NumPy provides a great explanation. For our purposes it is important to know that many pieces of recording equipment store their electrophysiological data as unsigned integers (e.g., Intan, Maxwell Biosystems, 3Brain Biocam). As with signed integers, these file formats only need to store a gain and an offset in order to convert to real units. Our RecordingExtractors keep the dtype that the file format uses, which means that some of our RecordingExtractors will have unsigned dtypes.

The problem with unsigned dtypes is that many functions (including the ones we use from SciPy) behave poorly with unsigned integers. An unsigned integer cannot hold a negative value, so an operation whose result should be negative wraps around to a large positive number instead. This is made worse by the fact that these failures are silent (i.e., no error is raised, but the operation produces nonsensical data). The solution is to convert unsigned integers into signed integers. Previously we did this automatically, under the hood, for users that had a Recording object with an unsigned dtype.
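To make the silent failure concrete, here is a minimal NumPy-only sketch. The sample values are made up for illustration, and the manual conversion at the end only mimics the idea of a signed conversion; it is not SpikeInterface's own code.

```python
import numpy as np

# Three made-up samples hovering around the midpoint of the uint16 range.
traces = np.array([32760, 32770, 32768], dtype=np.uint16)

# Differences between consecutive samples, computed directly on uint16.
# The true differences are +10 and -2, but uint16 cannot represent -2,
# so the second value wraps around without raising any error.
unsigned_diff = traces[1:] - traces[:-1]
print(unsigned_diff.tolist())  # [10, 65534]

# Same computation after shifting the data to a signed dtype centered on 0.
signed = (traces.astype(np.int32) - 2**15).astype(np.int16)
signed_diff = signed[1:] - signed[:-1]
print(signed_diff.tolist())  # [10, -2]
```

The wrapped value 65534 looks like a huge positive jump in the signal, which is exactly the kind of nonsensical result that a filter would then propagate.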

We decided, however, that implicitly performing this action was not the best course of action, since:

  1. explicit is always better than implicit

  2. some functions would magically change the dtype of the Recording object, which can cause confusion

So from version 0.103.0 onward, users have to perform this transformation of their data explicitly. This helps users better understand how their data is processed in an analysis pipeline, and it makes the provenance of that data easier to follow.

Using unsigned_to_signed

For users that receive an error because their Recording is unsigned, there is one additional step that must be done:

import spikeinterface.extractors as se
import spikeinterface.preprocessing as spre

# Intan is an example of unsigned data
recording = se.read_intan('path/to/my/file.rhd', stream_id='0')
# to get a signed version of our Recording we use the following function
recording_signed = spre.unsigned_to_signed(recording)
# we can now apply any preprocessing functions like normal, e.g.
recording_filtered = spre.bandpass_filter(recording_signed)

With a signed Recording, the rest of a SpikeInterface pipeline can be used as usual.

If you are curious whether your Recording is unsigned, you can check the repr or use get_dtype():

# the repr automatically displays the dtype
print(recording)
# use method on the Recording object
print(recording.get_dtype())

In either case, if the dtype displayed has a u at the beginning (e.g. uint16) then your recording is unsigned. If it doesn’t have the u (e.g. int16) then it is signed and would not need this preprocessing step.
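The same check can be done programmatically instead of reading the printed dtype. The sketch below uses only NumPy; `is_unsigned` is a hypothetical helper name, not part of SpikeInterface. With a real Recording, one would presumably pass `recording.get_dtype()` as the argument.

```python
import numpy as np


def is_unsigned(dtype) -> bool:
    """Return True if `dtype` is an unsigned integer dtype.

    Hypothetical helper for illustration. NumPy reports the kind of an
    unsigned integer dtype as "u" and of a signed integer dtype as "i".
    """
    return np.dtype(dtype).kind == "u"


print(is_unsigned("uint16"))    # True
print(is_unsigned(np.int16))    # False
print(is_unsigned(np.float32))  # False
```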

Bit depth

One final important piece of information for some users is the concept of bit depth, which is the number of bits the analog-to-digital converter (ADC) uses to sample the data. The unsigned_to_signed function accepts a bit_depth argument, which should be used in cases where the ADC bit depth does not match the bit depth of the data type (e.g., if the data is stored as uint16 but the ADC is 12 bits). Let’s make a concrete example: the Biocam acquisition system from 3Brain uses a 12-bit ADC and stores the data as uint16. This means that the data is stored in a 16-bit unsigned integer format, but the actual values only cover a 12-bit range (0 to 4095). Therefore, the “zero” of the data is not at 0, nor at half of the uint16 range (i.e., 2^15 = 32768), but rather at half of the 12-bit range, 2048 (i.e., 2^11). In this case, setting the bit_depth argument to 12 allows the unsigned_to_signed function to convert the unsigned data to signed data correctly and to center it around 0, by subtracting 2048 during the conversion.

recording_unsigned = se.read_biocam('path/to/my/file.brw')
# we can now convert to signed with the correct bit depth
recording_signed = spre.unsigned_to_signed(recording_unsigned, bit_depth=12)
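The arithmetic behind the bit depth offset can be sketched with plain NumPy. This is an illustration of the idea under stated assumptions, not SpikeInterface's actual implementation: `to_signed_sketch` is a hypothetical function, and the sample values are made up.

```python
import numpy as np


def to_signed_sketch(traces, bit_depth=None):
    """Shift unsigned samples so that the midpoint of the ADC range maps to 0.

    Hypothetical illustration only. Intended for uint8/uint16/uint32 input;
    uint64 input would not fit in the int64 intermediate used below.
    """
    traces = np.asarray(traces)
    itemsize_bits = traces.dtype.itemsize * 8
    if bit_depth is None:
        # Assume the ADC uses the full width of the stored dtype.
        bit_depth = itemsize_bits
    offset = 2 ** (bit_depth - 1)  # 2048 for 12 bits, 32768 for 16 bits
    signed_dtype = np.dtype(f"int{itemsize_bits}")
    # Subtract in a wider signed type, then cast to the matching signed dtype.
    return (traces.astype(np.int64) - offset).astype(signed_dtype)


# Minimum, midpoint, and maximum of a 12-bit ADC stored as uint16.
samples = np.array([0, 2048, 4095], dtype=np.uint16)

print(to_signed_sketch(samples, bit_depth=12).tolist())  # [-2048, 0, 2047]
print(to_signed_sketch(samples).tolist())  # [-32768, -30720, -28673]
```

Without the bit depth, the 12-bit data ends up far from zero (the second line of output), which is why the argument matters for systems whose ADC is narrower than the stored dtype.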

Additional Notes

  1. Some sorters make use of SpikeInterface preprocessing either within their wrappers or within their own code base. So remember to use the “signed” version of your recording for the rest of your pipeline.

  2. Using unsigned_to_signed in versions less than 0.103.0 does not hurt your scripts. This option was available previously along with the implicit option. Adding this into scripts with old versions of SpikeInterface will still work and will “future-proof” your scripts for when you update to a version greater than or equal to 0.103.0.

  3. For additional information on units and scaling in SpikeInterface see Working with physical units in SpikeInterface recordings.