Introduction To NeuroMaker BCI Connect Data Files

Introduction To Our Data

As we have learned through our other NeuroMaker BCI posts, there is an incredible amount of interesting data that we can gather from our Focus 1 headsets. We can observe our attentive state, meditative state and brainwaves ranging from alpha to gamma, all in real time. To collect data for the many experiments that we wish to conduct, we must store this information in a place where we can analyze and interpret the results.

On the bottom right-hand side of your NeuroMaker BCI Connect application, you will notice “start collection” and “stop collection” buttons. This is how we can collect data for our experiments! As soon as you press “start collection”, a timer will begin counting the number of seconds of data being collected. Once you have finished recording the data you need, press “stop collection”. Your counter will then disappear and a .csv file containing the data you recorded will appear in the root folder of your program folder!

We will be using these data files often within our NeuroMaker BCI curriculum. Sometimes they will be used to collect data about how we respond to the different mental states we wish to measure. Other times, they will be used to create data we can use to program different Python applications. Either way, understanding and using this data will be crucial to our journey to becoming BCI practitioners.

What is Inside Our Files

Once you have found the data file in your root folder, you will notice many columns of data containing the information you recorded. From left to right, the columns list the different readings from the NeuroMaker BCI Connect Dashboard. The first thing you will notice is that each column has a different number of data points. For example, over the course of 60 seconds of recording, we should find about 60 cells with Meditation data, 60 cells with Attention data and thousands of values for Raw EEG.
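One way to check this yourself is to count the non-empty cells in each column. The sketch below is a minimal example using Python's built-in csv module. The column names (Attention, Meditation, Raw EEG) and the tiny inline sample are assumptions made for illustration, so check the header row of your own file before relying on them.

```python
import csv
import io

def count_values_per_column(csv_file):
    """Return {column name: number of non-empty cells} for an open CSV file."""
    reader = csv.DictReader(csv_file)
    counts = {name: 0 for name in (reader.fieldnames or [])}
    for row in reader:
        for name in counts:
            value = row.get(name)
            # Shorter columns leave blank (or missing) cells in later rows.
            if value is not None and value.strip() != "":
                counts[name] += 1
    return counts

# Tiny made-up sample, NOT real headset data: one Attention value and one
# Meditation value, but four Raw EEG values.
sample = io.StringIO(
    "Attention,Meditation,Raw EEG\n"
    "51,60,12\n"
    ",,-7\n"
    ",,3\n"
    ",,25\n"
)
print(count_values_per_column(sample))
# prints {'Attention': 1, 'Meditation': 1, 'Raw EEG': 4}
```

To try this on a real recording, open your own file with open(..., newline="") and pass the file object to count_values_per_column in place of the sample.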

This is because the sampling rate is different for each type of reading. Sampling rate is typically measured in samples per second, a unit known as Hertz. Our attention and meditation algorithms produce data at 1 Hertz, meaning we will have one number in a cell for each second we record data. Our Raw EEG, on the other hand, is sampled at 160 Hertz, meaning we will have one hundred and sixty numbers in cells for each second of recording. Finally, the readings for different brainwaves, like Alpha or Gamma waves, are collected at about 2 Hertz, meaning we will have two numbers in cells for each second of data recorded.

All of this information will be recorded in one file per recording. This is why some columns are longer and others are shorter!
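The arithmetic behind these column lengths is simply sampling rate multiplied by recording time. Here is a small Python sketch of that calculation. The dictionary labels are our own names for the readings rather than the headers used in the data file, and the brainwave rate is only approximate.

```python
# Sampling rates in Hertz (samples per second), as described above.
# The brainwave rate is approximate ("about 2 Hertz").
SAMPLING_RATES_HZ = {
    "Attention": 1,
    "Meditation": 1,
    "Brainwaves": 2,
    "Raw EEG": 160,
}

def expected_counts(duration_seconds):
    """Ideal number of data points per column for a recording of this length."""
    return {name: rate * duration_seconds
            for name, rate in SAMPLING_RATES_HZ.items()}

for name, count in expected_counts(60).items():
    print(f"{name}: {count} data points expected in 60 seconds")
# Attention: 60, Meditation: 60, Brainwaves: 120, Raw EEG: 9600
```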

Data File Considerations

The programming within our Focus 1 headbands is designed to provide this data at the sampling rates we discussed above. However, when you make a recording you will most likely not see exactly this amount of data. For example, although our Raw EEG should collect about 160 samples per second, you may see only 100 samples in one second of your recorded data. This is completely normal for a device like this and could be due to the following reasons:

  • The electrode on your headband has slipped slightly during your recording session and was unable to detect a signal
  • Your Wi-Fi signal quality between the headband and the computer was slightly interrupted
  • Your device has been paired with many different computers and needs to calibrate
  • Your Focus 1 headset is low on battery

This situation is called packet loss and is very normal to experience. As aspiring BCI scientists using this kind of technology, we must be aware of these slight changes.
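Packet loss is easiest to compare between recordings when it is written as a percentage of the samples we expected. The function below is a minimal sketch of that calculation; the function name is our own and is not part of the NeuroMaker software.

```python
def packet_loss_percent(expected, received):
    """Percentage of expected samples that never arrived."""
    if expected <= 0:
        raise ValueError("expected must be a positive number of samples")
    return (expected - received) / expected * 100

# The example from the text: 160 samples expected in one second, 100 received.
print(f"{packet_loss_percent(160, 100):.1f}% packet loss")
# prints "37.5% packet loss"
```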

Example of a Good Data File

The images below show segments of a “good” data file. In this case, we will define “good” as having as little packet loss as possible, so that our experiments show meaningful results. Let’s take a full look at the file linked here and then we will break down some important points to keep in mind.

Click HERE to see the DataFile spreadsheet.  

This file was collected to provide experimental data to detect a student’s mental state while completing different tasks.

Within the NeuroMaker BCI curriculum, we can use templates to detect how well you can focus on your homework or other activities by using your EEG headsets! But first we will need to determine if the dataset we received is accurate enough for our purposes.

Within that file, we recorded a little over 2 minutes of data, which ended up being 132 seconds. Let’s break down what different pieces we can look at in a file like this to determine whether this data is usable or not.

Let’s Explore The Data File!

Image 1:  The Original DataFile Saved

The initial .xlsx file saved after clicking “Stop Collection”.   

Image 2: The number of Attention & Meditation data points is equal to the duration of the recording in seconds.

The Attention & Meditation data is sampled at a rate of 1 Hertz.

This data file is from a recording that was a little over 2 minutes long.

The Attention & Meditation data columns each have 132 data points collected, or 132 recorded seconds.

This file shows that the “Stop Collection” button was clicked a bit after the 2 minute time point.

Images 2-4 correspond to the second spreadsheet page.

Image 3: Brainwave data is collected at a 2 Hertz data sampling rate.

The Brainwave data columns have 261 data points. This makes sense.

The sampling rate for the brainwave data is about 2 Hertz, or twice the Attention & Meditation sampling rate.

Ideally, we would expect 264 data points (132 × 2 = 264).

With no packet loss, the ratio of Attention data points to Brainwave data points would be exactly 1:2. In this recording the ratio is 132:261, which is very close to 1:2. The shortfall is 3 data points, or about 1.14%, which is negligible.

Image 4: Raw EEG data has a data sampling rate of 160 Hertz.

The Raw EEG data column has 20722 data points.

The data sampling rate for EEG data is 160 Hertz or 160 samples per second.

Compared with the ideal case of 132 seconds at 160 samples per second, we would expect 21120 data points. The shortfall is 398 data points, or about 1.88%, which is negligible.
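Another way to look at the same numbers is to work out the sampling rate the recording actually achieved. The short Python sketch below uses only the figures quoted above; the variable and function names are our own.

```python
duration_seconds = 132   # length of the example recording
nominal_rate_hz = 160    # Raw EEG sampling rate
received = 20722         # Raw EEG data points found in the file

def effective_rate_hz(received_samples, seconds):
    """Average number of samples that actually arrived per second."""
    return received_samples / seconds

expected = nominal_rate_hz * duration_seconds
missing = expected - received

print(f"Expected {expected} samples, missing {missing}")
# prints "Expected 21120 samples, missing 398"
print(f"Effective rate: {effective_rate_hz(received, duration_seconds):.2f} Hz")
# prints "Effective rate: 156.98 Hz"
```

In other words, this recording delivered roughly 157 of the 160 Raw EEG samples we hoped for in each second.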

You may never see a file that contains 100% of the data we expect. However, keeping packet loss at 5% or less is a good rule of thumb for the experiments that we will be conducting with NeuroMaker BCI.
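This rule of thumb can be turned into a quick check. The sketch below is one possible way to do it in Python, assuming you have already counted the data points in each column. The function name and the dictionary labels are illustrative and do not come from the NeuroMaker software.

```python
MAX_LOSS_PERCENT = 5.0  # rule-of-thumb limit from the text

def check_recording(expected, received, max_loss_percent=MAX_LOSS_PERCENT):
    """Return {column: (loss percent, True if within the limit)}."""
    report = {}
    for name, expected_count in expected.items():
        # A column missing from `received` counts as zero data points.
        loss = (expected_count - received.get(name, 0)) / expected_count * 100
        report[name] = (loss, loss <= max_loss_percent)
    return report

# Counts quoted above for the 132-second example recording.
expected = {"Attention": 132, "Raw EEG": 21120}
received = {"Attention": 132, "Raw EEG": 20722}

for name, (loss, ok) in check_recording(expected, received).items():
    print(f"{name}: {loss:.2f}% loss -> {'usable' if ok else 'record again'}")
# Attention: 0.00% loss -> usable
# Raw EEG: 1.88% loss -> usable
```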

Example of a Bad Data File

Below we have an image of a different data set that recorded 60 seconds of EEG data. Although we cannot see every single cell in the Raw EEG column, we can see the full length of the Attention and Meditation data. We know that our Attention and Meditation sampling rate is 1 Hertz and should therefore expect 60 data points.

Unfortunately, we only see 7 data points out of the 60 expected. This shows severe packet loss: 53 of the 60 expected data points are missing, so about 88.33% of the data was lost.

The students undergoing this experiment may have removed their EEG headbands, used up all the battery in their device or encountered some other technical trouble. Regardless of the reason, 88.33% packet loss is much, much higher than the 5% we can tolerate, so this experiment must be conducted again in order to get the results we need.
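It can also help to know in advance how many data points a recording needs before it passes the 5% rule. The following Python sketch works that out; the function name is our own, and it assumes the sampling rate, the duration and the percentage are all whole numbers.

```python
def minimum_samples(rate_hz, duration_seconds, max_loss_percent=5):
    """Smallest sample count that keeps packet loss at or below the limit."""
    expected = rate_hz * duration_seconds
    # Integer arithmetic, rounding up, to avoid floating-point surprises.
    return (expected * (100 - max_loss_percent) + 99) // 100

needed = minimum_samples(rate_hz=1, duration_seconds=60)
print(needed)       # prints 57
print(7 >= needed)  # prints False: 7 data points is far too few
```

For a 60-second recording, the Attention and Meditation columns therefore need at least 57 data points each, and the same function gives 9120 for Raw EEG at 160 Hertz.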

Final Thoughts

As future scientists and engineers, we must always ask ourselves, “Does this data make sense?” Although you have access to EEG technology capable of collecting and processing complex brain signals, the best tool we have available for making use of our data is our own common sense. Please use the guidelines and pictures above to help you determine whether the data you collect during your experiments can be useful to you!